RISC-V Machine Boot Code

Last updated on February 24, 2023


When a RISC-V machine powers on, the bootloader built into the hardware or emulator goes to memory address 0x8000_0000 to load the OS.

At this point the machine is running in machine mode, and our boot.S has to do a few things that are only possible at this privilege level. Below is a line-by-line explanation.

Some Definitions

If you want, you can glance at the code first and come back here when you run into something unfamiliar.

Directives

Keywords that begin with a period are called directives. They are defined by the assembler rather than by the RISC-V specification, and they tell the assembler how to assemble the program. Their syntax may differ from assembler to assembler, and a directive does not correspond to any specific machine instruction.

Pseudoinstructions

Technically, pseudoinstructions are not instructions. They resemble ordinary instructions and exist to make programming more convenient. One pseudoinstruction may expand to more than one real instruction when assembled.

For example, li (load immediate) is a common pseudoinstruction.

ABI

ABI stands for application binary interface. Instead of referring to registers by their numeric names, such as x0 and x1, it is highly recommended to use their ABI names, such as zero and ra.

CSR and Zicsr

Control and status registers (CSRs) are used to control and monitor the operation of various hardware components. They belong to the privileged part of the architecture and are accessed with dedicated instructions rather than ordinary loads and stores. These CSR instructions are grouped into their own extension, named Zicsr.

Code

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
.globl _start

.equ STACK_SIZE, 1024
.equ CPU_NUM, 8

.section .text

_start:
csrr t0, mhartid
bnez t0, pend

slli t0, t0, 10
la sp, stacks + STACK_SIZE
add sp, sp, t0

j enter

stacks:
.space STACK_SIZE * CPU_NUM

pend:
wfi
j pend

.end

Branching out Other Harts

Hart is a conventional name for a Hardware Thread.

In line 9, csrr is a pseudoinstruction meaning ‘read CSR’. You can simply read it as ‘read mhartid and save it into t0’. Since mhartid is a CSR, it cannot be read with an ordinary move or load; it needs csrr, which expands to the Zicsr instruction csrrs.

mhartid can be read as ‘machine-mode hart ID’. When the hardware boots, all of its harts start up and run this same assembly code, whereas this snippet is intended to use only one of them.

t0 is the first temporary register, mapping to register x5.

In line 10, bnez is also a pseudoinstruction, standing for ‘branch if not equal to zero’.

So these two lines figure out which hart is running and send it to the pend label unless it is hart 0.

Letting a hart loop endlessly is a common way to park it. In line 22 we also added an instruction called wfi, meaning ‘wait for interrupt’. It can be seen as putting the hart to sleep until an interrupt arrives.

Setting up Stacks

The layout of stacks is quite baffling and took me a lot of time to figure out.

In general, we reserve a block of memory (the stacks label, lines 18-19), point sp at the end of the first hart’s stack (line 13), and then move sp to the end of the current hart’s own stack (line 14).

The stacks occupy STACK_SIZE * CPU_NUM bytes in total. The directive .space reserves that many bytes and fills them with zeros by default, the same as .skip. When line 13 refers to stacks, it refers to the beginning, in other words the lowest address, of this reserved memory. Adding exactly one STACK_SIZE gives the end address of the first hart’s stack.

Always remember these rules when trying to understand this part:

  1. The low address is the start of the stack, whereas the high address is its end.
  2. The stack pointer sp initially points to the end (the high address) of the stack, and the stack grows toward lower addresses.
  3. Hart id starts at 0.

Since we chose a stack size of 1024 bytes, which is 2^10, we can shift the hart ID left by 10 bits and add the result to the end of the first stack to find the hart’s own stack. Line 12 does exactly this. The instruction slli (shift left logical immediate) shifts t0 left by 10 bits, which is the same as multiplying it by 1024. Note that in this snippet only hart 0 gets past line 10, so the offset is always zero here; the shift only matters once other harts are allowed through.

Going to C

In line 16, the program jumps to enter, which is defined in C. Jumping to a C function works just like jumping to an assembly label, since both are compiled to machine code and placed in the text section. When booting, switching to a high-level language as early as possible is helpful.

Reserving the stacks can also be done in C. In xv6-riscv/kernel/start.c, it is written as:

// entry.S needs one stack per CPU.
__attribute__ ((aligned (16))) char stack0[4096 * NCPU];

Here __attribute__ is a GCC extension used to give the compiler additional information. In this case, it ensures that stack0 is aligned on a 16-byte boundary. Each hart gets 4096 bytes, and NCPU is the number of harts.

Loading Address

Note that the address 0x8000_0000 plays a different role from the so-called magic number 0xAA55 on x86. The latter marks the end of a boot sector: the BIOS goes through the storage devices looking for a sector that ends with it and tries to boot from that sector. Here, by contrast, the code simply has to sit at the expected address.

Thus, when linking the objects together, we need to add the flag -Ttext=0x80000000 to make sure the .text section is placed where we want it. According to the ld documentation, -Ttext is shorthand for --section-start=.text.

The address 0x8000_0000 is not standardized, but it is the conventional location, as you can see in the QEMU source code. For example, in qemu/hw/riscv/virt.c, the last entry of the memory map array virt_memmap is VIRT_DRAM, which starts at 0x8000_0000.

References

  1. xv6: a simple, Unix-like teaching operating system, Russ Cox, Frans Kaashoek, Robert Morris, MIT, 5 September 2022, 2.6 Code: starting xv6, the first process and system call (P27 - 28)
  2. xv6-riscv/entry.S at riscv · mit-pdos/xv6-riscv
  3. Writing a Simple Operating System – from Scratch, Nick Blundell, University of Birmingham, 2 December 2010, Chapter 2 Computer Architecture and the Boot Process (P3 - 7)
  4. 55 and AA. What’s special about 55 and AA? Or more… | by Larry K. | Medium
  5. riscv-asm-manual/riscv-asm.md at master · riscv-non-isa/riscv-asm-manual
  6. Pseudo Ops (Using as)
  7. Options (LD)
  8. Documentation for binutils 2.40
  9. [Complete] Step by Step: Learning to Develop an Operating System on RISC-V - Wang Chen (汪辰) - Chapter 7 (Part 1) - Hello RVOS, bilibili
  10. riscv-operating-system-mooc/start.S at main · plctlab/riscv-operating-system-mooc
  11. RISC-V System emulator — QEMU documentation
  12. qemu/virt.c at master · qemu/qemu
  13. Specifications - RISC-V International
  14. RISC-V Instruction Set Manual, Volume I: RISC-V User-Level ISA | Five EmbedDev
  15. Taking control of RISC-V: RISCV OS in Rust
  16. osblog/boot.S at master · sgmarz/osblog

https://lingkang.dev/2023/02/23/boot-risc-v/
Author: Lingkang
Posted on: February 23, 2023