RISC-V Instruction Encoding

Instructions

Write a C program that reads a string containing a RISC-V instruction from STDIN, parses it to extract its fields, packs those fields into a 32-bit value, and writes the hexadecimal representation of the encoded instruction to STDOUT.

The code snippet below can be used to compare strings, since standard C libraries such as string.h are not available in the simulator. It is similar to string.h's strcmp, but takes the number of characters to compare as a parameter.

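/* Compares the first n_char characters of str1 and str2.
   Returns -1 if str1 sorts first, 1 if str2 sorts first, 0 if they match.
   Unlike strncmp, it does not stop at a terminating '\0'. */
int strcmp_custom(char *str1, char *str2, int n_char){
    for (int i = 0; i < n_char; i++){
        if (str1[i] < str2[i])
            return -1;
        else if (str1[i] > str2[i])
            return 1;
    }
    return 0;
}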
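One way to use this function is to identify the mnemonic at the start of the input line. The sketch below is only an illustration: is_mnemonic is a hypothetical helper, not part of the base code, and it assumes the mnemonic is followed by a single space, as in the examples of this exercise. Including that trailing space in the comparison keeps "add" from matching "addi".

```c
/* Copy of the snippet above so this sketch compiles on its own. */
int strcmp_custom(char *str1, char *str2, int n_char) {
    for (int i = 0; i < n_char; i++) {
        if (str1[i] < str2[i])
            return -1;
        else if (str1[i] > str2[i])
            return 1;
    }
    return 0;
}

/* Hypothetical helper: returns 1 if `line` starts with `mnemonic_sp`,
   where `mnemonic_sp` is the mnemonic plus one trailing space and
   `len` is its length including that space. */
int is_mnemonic(char *line, char *mnemonic_sp, int len) {
    return strcmp_custom(line, mnemonic_sp, len) == 0;
}
```

For example, is_mnemonic(line, "add ", 4) accepts "add x1, x2, x3" but rejects "addi x1, x2, 3", because the fourth character differs.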

The instructions that your program must encode are listed in the table below, along with each instruction's opcode, type, and other fields (e.g. funct3 and funct7) where applicable.

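| Instruction | Inst Syntax | Inst Type | OPCODE | FUNCT3 | FUNCT7 |
|---|---|---|---|---|---|
| lui | lui rd, imm | U | 0110111 | N/A | N/A |
| auipc | auipc rd, imm | U | 0010111 | N/A | N/A |
| jal | jal rd, imm | J | 1101111 | N/A | N/A |
| jalr | jalr rd, imm(rs1) | I | 1100111 | 000 | N/A |
| beq | beq rs1, rs2, imm | B | 1100011 | 000 | N/A |
| bne | bne rs1, rs2, imm | B | 1100011 | 001 | N/A |
| blt | blt rs1, rs2, imm | B | 1100011 | 100 | N/A |
| bge | bge rs1, rs2, imm | B | 1100011 | 101 | N/A |
| bltu | bltu rs1, rs2, imm | B | 1100011 | 110 | N/A |
| bgeu | bgeu rs1, rs2, imm | B | 1100011 | 111 | N/A |
| lb | lb rd, imm(rs1) | I | 0000011 | 000 | N/A |
| lh | lh rd, imm(rs1) | I | 0000011 | 001 | N/A |
| lw | lw rd, imm(rs1) | I | 0000011 | 010 | N/A |
| lbu | lbu rd, imm(rs1) | I | 0000011 | 100 | N/A |
| lhu | lhu rd, imm(rs1) | I | 0000011 | 101 | N/A |
| sb | sb rs2, imm(rs1) | S | 0100011 | 000 | N/A |
| sh | sh rs2, imm(rs1) | S | 0100011 | 001 | N/A |
| sw | sw rs2, imm(rs1) | S | 0100011 | 010 | N/A |
| addi | addi rd, rs1, imm | I | 0010011 | 000 | N/A |
| slti | slti rd, rs1, imm | I | 0010011 | 010 | N/A |
| sltiu | sltiu rd, rs1, imm | I | 0010011 | 011 | N/A |
| xori | xori rd, rs1, imm | I | 0010011 | 100 | N/A |
| ori | ori rd, rs1, imm | I | 0010011 | 110 | N/A |
| andi | andi rd, rs1, imm | I | 0010011 | 111 | N/A |
| slli | slli rd, rs1, imm** | I | 0010011 | 001 | 0000000* |
| srli | srli rd, rs1, imm | I | 0010011 | 101 | 0000000* |
| srai | srai rd, rs1, imm | I | 0010011 | 101 | 0100000* |
| add | add rd, rs1, rs2 | R | 0110011 | 000 | 0000000 |
| sub | sub rd, rs1, rs2 | R | 0110011 | 000 | 0100000 |
| sll | sll rd, rs1, rs2 | R | 0110011 | 001 | 0000000 |
| slt | slt rd, rs1, rs2 | R | 0110011 | 010 | 0000000 |
| sltu | sltu rd, rs1, rs2 | R | 0110011 | 011 | 0000000 |
| xor | xor rd, rs1, rs2 | R | 0110011 | 100 | 0000000 |
| srl | srl rd, rs1, rs2 | R | 0110011 | 101 | 0000000 |
| sra | sra rd, rs1, rs2 | R | 0110011 | 101 | 0100000 |
| or | or rd, rs1, rs2 | R | 0110011 | 110 | 0000000 |
| and | and rd, rs1, rs2 | R | 0110011 | 111 | 0000000 |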

* slli, srli and srai are I-type instructions, but their immediate takes up only 5 bits; the remaining 7 bits of the immediate field are filled with a funct7 value.
** The imm field may also appear as shamt, which stands for shift amount.
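To illustrate how the table's fields fit together, here is a minimal sketch of packing R-type and I-type instructions, including the shift-immediate case from the first footnote. The function names are hypothetical (they are not the pack() function from Exercise 5.1), and the code assumes unsigned int is 32 bits wide, as on RV32.

```c
/* R-type layout: funct7 | rs2 | rs1 | funct3 | rd | opcode */
unsigned int encode_r(unsigned int opcode, unsigned int funct3,
                      unsigned int funct7, unsigned int rd,
                      unsigned int rs1, unsigned int rs2) {
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) |
           (funct3 << 12) | (rd << 7) | opcode;
}

/* I-type layout: imm[11:0] | rs1 | funct3 | rd | opcode.
   Masking with 0xFFF keeps the low 12 bits of a negative immediate. */
unsigned int encode_i(unsigned int opcode, unsigned int funct3,
                      unsigned int rd, unsigned int rs1, int imm) {
    return (((unsigned int)imm & 0xFFF) << 20) | (rs1 << 15) |
           (funct3 << 12) | (rd << 7) | opcode;
}

/* slli/srli/srai: the 5-bit shamt sits where rs2 would be in an
   R-type instruction, and funct7 fills the upper 7 bits. */
unsigned int encode_shift_imm(unsigned int opcode, unsigned int funct3,
                              unsigned int funct7, unsigned int rd,
                              unsigned int rs1, unsigned int shamt) {
    return encode_r(opcode, funct3, funct7, rd, rs1, shamt & 0x1F);
}
```

With these helpers, "and x31, x20, x25" would be encode_r(0x33, 7, 0, 31, 20, 25), where 0x33 is the binary opcode 0110011 from the table.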

Input

  • An RV32I assembly instruction string of at most 40 bytes. There will be no pseudo-instructions, registers will be referenced by their x-names (e.g. x2, x10), and immediate values will be in decimal.
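Since the standard library is not available, operands have to be parsed by hand. The following is a rough sketch of one possible approach, under the input constraints listed above. All three helpers are hypothetical names introduced for illustration; each reads from s starting at index *pos and advances *pos past what it consumed.

```c
/* Skips the separators that appear between operands: spaces, commas
   and the parentheses of the imm(rs1) form. */
void skip_separators(char *s, int *pos) {
    while (s[*pos] == ' ' || s[*pos] == ',' ||
           s[*pos] == '(' || s[*pos] == ')')
        (*pos)++;
}

/* Parses a register written as x0..x31.
   Returns the register number, or -1 if the text is not a register. */
int parse_register(char *s, int *pos) {
    if (s[*pos] != 'x')
        return -1;
    (*pos)++;
    if (s[*pos] < '0' || s[*pos] > '9')
        return -1;
    int value = 0;
    while (s[*pos] >= '0' && s[*pos] <= '9') {
        value = value * 10 + (s[*pos] - '0');
        (*pos)++;
    }
    return value <= 31 ? value : -1;
}

/* Parses a decimal immediate with an optional leading minus sign. */
int parse_immediate(char *s, int *pos) {
    int negative = 0;
    int value = 0;
    if (s[*pos] == '-') {
        negative = 1;
        (*pos)++;
    }
    while (s[*pos] >= '0' && s[*pos] <= '9') {
        value = value * 10 + (s[*pos] - '0');
        (*pos)++;
    }
    return negative ? -value : value;
}
```

For "jalr x1, -32(x9)", starting at index 5 (just after the mnemonic and its space), alternating the parse functions with skip_separators would yield 1, -32 and 9 in that order.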

Output

  • The 32-bit encoded instruction in its big-endian hexadecimal representation (hex_code() from the previous exercise can be used).
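If hex_code() is not at hand, the conversion can be sketched as below. This is an assumption-laden illustration, not the hex_code() from the previous exercise: to_hex is a hypothetical name, it writes a NUL-terminated string rather than printing, and it uses uppercase digits to match the examples in this exercise.

```c
/* Writes "0x" followed by the 8 hexadecimal digits of `value`,
   most significant digit first, into `out` (at least 11 bytes). */
void to_hex(unsigned int value, char out[11]) {
    const char digits[] = "0123456789ABCDEF";
    out[0] = '0';
    out[1] = 'x';
    for (int i = 0; i < 8; i++) {
        /* Take 4 bits at a time, starting from bits 31..28. */
        unsigned int nibble = (value >> (28 - 4 * i)) & 0xF;
        out[2 + i] = digits[nibble];
    }
    out[10] = '\0';
}
```

Whether a trailing newline is expected on STDOUT depends on the checker, so that detail is left out here.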

Examples

| Test Case | Input | Output |
|---|---|---|
| 1 | lb x10, 4(x9) | 0x00448503 |
| 2 | and x31, x20, x25 | 0x019A7FB3 |
| 3 | slti x12, x13, -1 | 0xFFF6A613 |
| 4 | bge x7, x0, 256 | 0x1003D063 |
| 5 | jalr x1, -32(x9) | 0xFE0480E7 |
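Test case 4 is a useful check because the B-type immediate is scattered across the instruction word. The sketch below shows one way to place the bits, treating the immediate as a byte offset whose bit 0 is dropped, which is consistent with the expected output for 256. The name encode_b is hypothetical, and unsigned int is assumed to be 32 bits wide.

```c
/* B-type layout:
   imm[12] | imm[10:5] | rs2 | rs1 | funct3 | imm[4:1] | imm[11] | opcode
   The opcode for all branches in this exercise is 1100011 (0x63). */
unsigned int encode_b(unsigned int funct3, unsigned int rs1,
                      unsigned int rs2, int imm) {
    unsigned int u = (unsigned int)imm;
    return (((u >> 12) & 0x1)  << 31) |  /* imm[12]   -> bit 31      */
           (((u >> 5)  & 0x3F) << 25) |  /* imm[10:5] -> bits 30..25 */
           (rs2 << 20) | (rs1 << 15) | (funct3 << 12) |
           (((u >> 1)  & 0xF)  << 8)  |  /* imm[4:1]  -> bits 11..8  */
           (((u >> 11) & 0x1)  << 7)  |  /* imm[11]   -> bit 7       */
           0x63;
}
```

For "bge x7, x0, 256", the call encode_b(5, 7, 0, 256) sets only bit 28 from the immediate (imm[8]), which gives the leading 0x1 in 0x1003D063.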

Notes and Tips

  • This exercise reuses functions from Exercise 5.1, such as hex_code() and pack(), so it is recommended to complete Exercise 5.1 first.
  • You can use this base code as a starting point, or build your solution from scratch if you want.
  • Refer to the RISC-V Instruction Set Manual to check how each instruction is encoded. In particular, consult the RV32I Base Instruction Set listing in Table 19.2 (Chapter 19). For the encoding of immediates, see Figures 2.3 and 2.4 in Section 2.3.
  • You can test your code using the simulator's assistant from this link.
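As a companion to the manual's immediate-encoding figures, the remaining formats (S, U and J) can be sketched as follows. The function names are hypothetical. For U-type, the sketch assumes the operand is the 20-bit value that goes directly into bits 31..12, which is the usual assembler convention but is not stated in this exercise. unsigned int is assumed to be 32 bits wide.

```c
/* S-type layout: imm[11:5] | rs2 | rs1 | funct3 | imm[4:0] | opcode
   The opcode for stores is 0100011 (0x23). */
unsigned int encode_s(unsigned int funct3, unsigned int rs1,
                      unsigned int rs2, int imm) {
    unsigned int u = (unsigned int)imm;
    return (((u >> 5) & 0x7F) << 25) | (rs2 << 20) | (rs1 << 15) |
           (funct3 << 12) | ((u & 0x1F) << 7) | 0x23;
}

/* U-type layout: imm[31:12] | rd | opcode
   Assumption: `imm` is the 20-bit upper immediate itself. */
unsigned int encode_u(unsigned int opcode, unsigned int rd, int imm) {
    return (((unsigned int)imm & 0xFFFFF) << 12) | (rd << 7) | opcode;
}

/* J-type layout: imm[20] | imm[10:1] | imm[11] | imm[19:12] | rd | opcode
   The opcode for jal is 1101111 (0x6F); bit 0 of the offset is dropped. */
unsigned int encode_j(unsigned int rd, int imm) {
    unsigned int u = (unsigned int)imm;
    return (((u >> 20) & 0x1)   << 31) |
           (((u >> 1)  & 0x3FF) << 21) |
           (((u >> 11) & 0x1)   << 20) |
           (((u >> 12) & 0xFF)  << 12) |
           (rd << 7) | 0x6F;
}
```

Comparing the output of such helpers against the simulator's assistant for a few hand-picked instructions, including negative offsets, is a quick way to catch a misplaced bit.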