Booting to Rust

A couple of nights ago I was looking over the UEFI spec, and I realized it shouldn't be too hard to write UEFI applications in Rust. It turns out you can, and here I will tell you how.

The thing that surprises me most about UEFI is that it now appears possible to boot your machine without ever writing a single line of assembly language. Booting used to require this tedious process of starting out in 16-bit real mode, then transitioning into 32-bit protected mode and then doing it all over again to get into 64-bit mode. UEFI firmware, on the other hand, will happily load an executable file that you give it and run your code in 64-bit mode from the start. Your startup function receives a pointer to some functions that give you basic console support, as well as an API to access richer features. From a productivity standpoint, this seems like a win, but I also miss the sorcery you used to have to do when you were programming at this level.

Booting to Rust is a lot like writing bindings to any C library, except that the linking process is a bit more involved.

An Overview of the UEFI Boot Process

I was trying to find the simplest way to get my code running. As such, I'm not sure I did this in the most official way, or that my terminology is totally correct, but here's something that worked.

UEFI firmware is built around the idea of applications. These could be things like a shell to help you troubleshoot a broken computer, but most of the apps seem to be bootloaders. As far as I can tell, a bootloader is just an app that transfers control over to an operating system rather than returning control to the firmware.

One of the things that confused me is that all of the example code I could find was written in C. I always thought code that runs this early in the boot process should be written in assembly language. The difference is that now a lot of this early boot code is handled by the firmware. Thus, you just have to run a program that looks kind of like this example from over on the OSDev wiki:

#include <efi.h>
#include <efilib.h>

EFI_STATUS efi_main(EFI_HANDLE ImageHandle, EFI_SYSTEM_TABLE *SystemTable)
{
	SIMPLE_TEXT_OUTPUT_INTERFACE *conout;
	InitializeLib(ImageHandle, SystemTable);
	conout = SystemTable->ConOut;

	uefi_call_wrapper(conout->OutputString, 2, conout, (CHAR16 *)L"Hello World\n\r");

	return EFI_SUCCESS;
}

I'm not super wild about all of the extra code this example uses from an EFI library, but fortunately the spec tells us what we need to know to recreate this. Now we can look at how to do a similar program in Rust.

UEFI in Rust

Much of the UEFI spec looks like reading C header files, since the API that UEFI exposes to applications is basically a C API. This is fantastic, because Rust has always had very good C interoperability. Basically, we just have to transcribe the various structs and function pointers and start writing our application. Here's what the system table struct looks like.

type EFI_HANDLE = *();

struct EFI_TABLE_HEADER {
    Signature  : u64,
    Revision   : u32,
    HeaderSize : u32,
    CRC32      : u32,
    priv Reserved : u32
}

struct EFI_SYSTEM_TABLE {
    Hdr : EFI_TABLE_HEADER,
    FirmwareVendor : *u16,
    FirmwareRevision : u32,
    ConsoleInHandle : EFI_HANDLE,
    ConIn : *EFI_SIMPLE_TEXT_INPUT_PROTOCOL,
    ConsoleOutHandle : EFI_HANDLE,
    ConOut : *EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL,
    // ... other stuff that we're ignoring for now.
}

It's basically a transliteration of the C definition. However, we do start to see the advantages of using a higher level language like Rust. For example, we can mark the Reserved field in EFI_TABLE_HEADER as private to make sure nothing uses this when it's not supposed to.

You'll notice I left out a lot of fields from the structure. For now I'm just trying to get a simple Hello, World program running, which means all I need is text output. For the same reason, I leave the EFI_SIMPLE_TEXT_INPUT_PROTOCOL struct completely unspecified:

struct EFI_SIMPLE_TEXT_INPUT_PROTOCOL;

For EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL, we define just enough to output a string.

struct EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL {
    Reset : EFI_TEXT_RESET,
    OutputString : EFI_TEXT_STRING,
    // ... and more stuff that we're ignoring.
}

type EFI_TEXT_RESET = *();

type EFI_TEXT_STRING = extern fn(*EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL,
                                 *u16);

One thing I like about this code is that all you have to do to access functions provided by the UEFI firmware is declare a function pointer. You can then call this function just like any other, although you must call from an unsafe block.
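
To make that concrete, here's a rough sketch in present-day Rust syntax (the listings in this post use an older dialect of the language). The names TextOutput, mock_output_string, mock_console, and print are made up for illustration, and the mock function stands in for the firmware so the sketch can run on an ordinary machine. I'm also assuming extern "efiapi", which as far as I know is how current Rust spells the UEFI calling convention.

```rust
/// Cut-down stand-in for EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL (hypothetical name).
#[repr(C)]
pub struct TextOutput {
    /// Placeholder for the Reset slot, which this sketch never calls.
    pub reset: *const (),
    /// The slot the firmware would fill in. Calling it is unsafe because
    /// the compiler cannot check what the pointer really points at.
    pub output_string: unsafe extern "efiapi" fn(*mut TextOutput, *const u16) -> usize,
}

/// Mock "firmware" routine. Real firmware would draw the text and return a
/// status code; this one just counts the code units before the NUL.
extern "efiapi" fn mock_output_string(_this: *mut TextOutput, s: *const u16) -> usize {
    let mut len = 0;
    unsafe {
        while *s.add(len) != 0 {
            len += 1;
        }
    }
    len
}

/// Builds a console whose function pointer refers to the mock above.
pub fn mock_console() -> TextOutput {
    TextOutput {
        reset: std::ptr::null(),
        output_string: mock_output_string,
    }
}

/// Calls through the function pointer, roughly the way efi_main does.
pub fn print(out: &mut TextOutput, text: &[u16]) -> usize {
    assert_eq!(text.last(), Some(&0), "text must be NUL-terminated");
    let f = out.output_string;
    unsafe { f(out, text.as_ptr()) }
}
```

The only unsafe part on the calling side is the call itself; everything around it is ordinary Rust.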

Now we have enough to write our UEFI entry point. The skeleton of it looks like this:

pub extern fn efi_main(_ImageHandle : EFI_HANDLE,
                       _SystemTable : *EFI_SYSTEM_TABLE) -> int
{
    loop {}
}

This function just does an infinite loop. The return type is probably not exactly correct, but we don't need to worry about this since our function doesn't return anyway. Building from this, we can fill in the function to print "Hello, World!"

Hello World

Ideally, we'd just do SystemTable.ConOut.OutputString("Hello, World!"). Unfortunately, it's not quite that simple. One problem is that Rust represents strings as UTF-8, while UEFI's console output needs UTF-16 strings. We'll deal with this for now by rather verbosely making an array of u16s where each element is filled with a character cast to u16, such as 'H' as u16. We also need to be sure we set the last element to 0, and we should add both the carriage return ('\r') and linefeed ('\n') characters. Once we have this array, we use transmute to get a *u16 that points to its contents.
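
Typing out one cast per character gets old quickly. As a sketch of one alternative in present-day Rust (not what the code in this post does), a const fn can widen an ASCII string to a NUL-terminated array of u16s at compile time without needing the standard library. The names ascii_to_utf16 and HELLO are my own, and the helper assumes the input is plain ASCII, where each character maps to a single UTF-16 code unit.

```rust
/// Widens an ASCII byte string into a NUL-terminated array of UTF-16 code
/// units. N must be larger than the input so the terminator fits.
pub const fn ascii_to_utf16<const N: usize>(s: &[u8]) -> [u16; N] {
    assert!(N > s.len(), "no room for the NUL terminator");
    // Start with all zeros, so anything past the text is already NUL.
    let mut out = [0u16; N];
    let mut i = 0;
    while i < s.len() {
        assert!(s[i] < 0x80, "only ASCII input is handled");
        out[i] = s[i] as u16;
        i += 1;
    }
    out
}

/// 14 characters (including "\r\n") plus the terminating zero.
pub const HELLO: [u16; 15] = ascii_to_utf16(b"Hello, World\r\n");
```

Because the conversion runs at compile time, a string with a non-ASCII byte or a too-small array is rejected by the compiler rather than discovered at boot.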

We're working with a lot of unsafe pointers, so we'll put the whole body of efi_main in an unsafe block. We'll also have to be rather explicit about where we are dereferencing our pointers.

Here's the code to print out hello world:

pub extern fn efi_main(_ImageHandle : EFI_HANDLE,
                       SystemTable : *EFI_SYSTEM_TABLE) -> int
{
    unsafe {
        let SystemTable = *SystemTable;
        let vendor = SystemTable.FirmwareVendor;
        let conout = SystemTable.ConOut;
        let output = (*conout).OutputString;

        let hello = ['H' as u16,
                     'e' as u16,
                     'l' as u16,
                     'l' as u16,
                     'o' as u16,
                     ',' as u16,
                     ' ' as u16,
                     'W' as u16,
                     'o' as u16,
                     'r' as u16,
                     'l' as u16,
                     'd' as u16,
                     '\r' as u16,
                     '\n' as u16,
                     0u16];
        let (hello_ptr, _) = buf_ptr(hello);

        output(conout, hello_ptr);

        loop {
        }
    }
}

// We also need some helpers to find a pointer to the hello world string.
fn buf_ptr<T>(buf: &[T]) -> (*T, uint) {
    unsafe { transmute(buf) }
}

extern "rust-intrinsic" {
    pub fn transmute<T,U>(val: T) -> U;
}

We had to define transmute as a Rust intrinsic directly, rather than using the version that's in Rust's standard library. The reason is that we can't use Rust's standard library because we don't even have an operating system.

Now that we have our program, we need to figure out how to run it.

Linking

Rust generally assumes you are writing programs that you want to run on your own machine, meaning it generates binaries in the format that the host operating system expects. On Linux, for example, these are ELF files. According to the UEFI spec, we need a PE32+ file, and it needs to be marked as using subsystem 10, meaning it's a UEFI Application. Fortunately, the Linux linker can generate nearly any binary format known to man.

We'll have to tell Rust to only compile our program and not link it, and then we'll write the linker command ourselves. Let's say our program was in a file called boot.rs. You can see my boot.rs here. We compile it using this command:

rustc -c -O --lib boot.rs

This creates a file called boot.o, which we can then link.

My linker didn't support the right output format by default, so I had to install a cross-linker. I was running Gentoo, so I got this by running sudo crossdev -s0 --target x86_64-efi-pe.

Once we have an appropriate linker, we build our PE file as follows:

x86_64-efi-pe-ld --oformat pei-x86-64 --subsystem 10 -pie -e efi_main boot.o -o boot.efi

That's quite a mouthful. Let's break down the options.

x86_64-efi-pe-ld - this is what crossdev called my custom linker.
--oformat pei-x86-64 - Generate a PE32+ file containing 64-bit code.
--subsystem 10 - set the subsystem to 10. This tells the firmware that this is a UEFI application.
-pie - generate a position-independent executable. This is important. The firmware loads your program into whatever memory happens to be free rather than giving it an address space of its own, so it might have to relocate your program if the addresses you requested were already in use.
-e efi_main - set the entry point to efi_main.
boot.o - this is the file we are linking.
-o boot.efi - call the result boot.efi.

I also had to add the #[no_mangle] attribute to efi_main in the Rust code so that Rust would not use a different symbol name in the final object code.

The first time I ran this, the linker complained that __morestack was missing. The easiest way to deal with this is to add the #[no_split_stack] attribute to whatever functions report that symbol as missing, which in our case is efi_main. This attribute prevents LLVM from inserting the stack safety checks.

When all is said and done, file reports that we have the correct kind of executable:

boot.efi: PE32+ executable (EFI application) x86-64 (stripped to external PDB), for MS Windows
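
If you'd rather check the result from code than from file, the two fields we care about are easy to read by hand. This is a sketch in present-day Rust meant to run on the build machine, not in the firmware. The function names are hypothetical, and the offsets are my reading of the PE/COFF layout: the PE header offset is stored at 0x3c, a 20-byte COFF header follows the 4-byte signature, and the subsystem field sits 68 bytes into the optional header.

```rust
/// Optional-header magic that marks a PE32+ image.
pub const PE32_PLUS_MAGIC: u16 = 0x20b;
/// Subsystem value for a UEFI application.
pub const SUBSYSTEM_EFI_APPLICATION: u16 = 10;

/// Reads a little-endian u16, or None if it would run past the end.
fn read_u16(bytes: &[u8], at: usize) -> Option<u16> {
    let b = bytes.get(at..at.checked_add(2)?)?;
    Some(u16::from_le_bytes([b[0], b[1]]))
}

/// Reads a little-endian u32, or None if it would run past the end.
fn read_u32(bytes: &[u8], at: usize) -> Option<u32> {
    let b = bytes.get(at..at.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Returns (optional-header magic, subsystem), or None if the bytes do not
/// look like a PE image.
pub fn pe_magic_and_subsystem(image: &[u8]) -> Option<(u16, u16)> {
    if !image.starts_with(b"MZ") {
        return None;
    }
    // The DOS header stores the file offset of the PE signature at 0x3c.
    let pe = read_u32(image, 0x3c)? as usize;
    if !image.get(pe..)?.starts_with(b"PE\0\0") {
        return None;
    }
    // Skip the 4-byte signature and the 20-byte COFF header.
    let opt = pe.checked_add(24)?;
    let magic = read_u16(image, opt)?;
    let subsystem = read_u16(image, opt.checked_add(68)?)?;
    Some((magic, subsystem))
}

/// True when the image is PE32+ and marked as a UEFI application.
pub fn is_uefi_application(image: &[u8]) -> bool {
    pe_magic_and_subsystem(image) == Some((PE32_PLUS_MAGIC, SUBSYSTEM_EFI_APPLICATION))
}
```

You could feed it the contents of boot.efi, for example from std::fs::read, and expect is_uefi_application to agree with what file printed.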

Calling Conventions

There's one more issue we have to deal with, which is calling conventions. Calling conventions describe rules for where function arguments go when you call a procedure. For example, some calling conventions say all arguments go on the stack, while others pass some parameters through registers. Rust purposefully leaves its calling convention undefined for now, so we don't really know what's going to happen. Let's take a look at some assembly and see what we get.

_efi_main:
0000000000000000	pushq	%rbp
0000000000000001	movq	%rsp, %rbp
0000000000000004	subq	$0x20, %rsp
0000000000000008	movq	0x40(%rsi), %rsi
000000000000000c	movq	0x8(%rsi), %rax
0000000000000010	movabsq	$0x6c006c00650048, %rcx
000000000000001a	movq	%rcx, 0xffffffffffffffe2(%rbp)
000000000000001e	movabsq	$0x570020002c006f, %rcx
0000000000000028	movq	%rcx, 0xffffffffffffffea(%rbp)
000000000000002c	movabsq	$0x64006c0072006f, %rcx
0000000000000036	movq	%rcx, 0xfffffffffffffff2(%rbp)
000000000000003a	movl	$0xa000d, 0xfffffffffffffffa(%rbp)
0000000000000041	leaq	0xffffffffffffffe2(%rbp), %rdx
0000000000000045	movw	$_efi_main, 0xfffffffffffffffe(%rbp)
000000000000004b	callq	*%rax
000000000000004d	nop
000000000000004e	nop
000000000000004f	nop
0000000000000050	jmp	0x50

Let's work backwards through this to see where this code expects its arguments to come from. I'm counting lines from the _efi_main label, which is line 1. We know our function made one procedure call, which seems to be the callq instruction on line 16. Line 16 calls the procedure pointed to by rax, and we see this value is assigned in line 6. This reads the value 8 bytes away from the rsi register, which in turn came from an offset of 64 bytes (0x40) from the original value of rsi. If you work out the field offsets from the system table struct and the console output struct, you can see that Rust assumed the pointer to the system table was in rsi.
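
If you want to check those offsets without counting bytes by hand, here's a sketch in present-day Rust that asks the compiler for them. It assumes a 64-bit target, where pointers are 8 bytes wide, and a toolchain recent enough to have std::mem::offset_of!. The snake_case struct and field names are my own renaming of the definitions from earlier in the post, and I've replaced the function pointers with untyped pointers since only the layout matters here.

```rust
use std::mem::offset_of;

/// Same fields as EFI_TABLE_HEADER: 8 + 4 + 4 + 4 + 4 = 24 bytes.
#[repr(C)]
pub struct TableHeader {
    pub signature: u64,
    pub revision: u32,
    pub header_size: u32,
    pub crc32: u32,
    pub reserved: u32,
}

/// The leading fields of EFI_SYSTEM_TABLE, in the order used in this post.
#[repr(C)]
pub struct SystemTable {
    pub hdr: TableHeader,
    pub firmware_vendor: *const u16,
    pub firmware_revision: u32, // followed by 4 bytes of padding
    pub console_in_handle: *const (),
    pub con_in: *const (),
    pub console_out_handle: *const (),
    pub con_out: *const TextOutput,
}

/// The first two slots of EFI_SIMPLE_TEXT_OUTPUT_PROTOCOL.
#[repr(C)]
pub struct TextOutput {
    pub reset: *const (),
    pub output_string: *const (),
}

/// Byte offset of ConOut inside the system table.
pub const CON_OUT_OFFSET: usize = offset_of!(SystemTable, con_out);
/// Byte offset of OutputString inside the console output struct.
pub const OUTPUT_STRING_OFFSET: usize = offset_of!(TextOutput, output_string);
```

On a 64-bit target this should give 0x40 for CON_OUT_OFFSET and 0x8 for OUTPUT_STRING_OFFSET, matching the two loads in the listing.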

The UEFI spec, however, says that the system table will be in rdx when it calls your entry point. The calling convention described in that document for x86_64 happens to match the convention used by Microsoft Windows. Rust didn't support this convention by default (although it is probably the convention used when you run Rust on Windows), but it was only a few lines of code to add support for it. Once we've made this change, we add "win64" as the calling convention and see what the new assembly code looks like.

_efi_main:
0000000000000000	pushq	%rbp
0000000000000001	movq	%rsp, %rbp
0000000000000004	subq	$0x40, %rsp
0000000000000008	movq	0x40(%rdx), %rcx
000000000000000c	movq	0x8(%rcx), %rax
0000000000000010	movabsq	$0x6c006c00650048, %rdx
000000000000001a	movq	%rdx, 0xffffffffffffffe2(%rbp)
000000000000001e	movabsq	$0x570020002c006f, %rdx
0000000000000028	movq	%rdx, 0xffffffffffffffea(%rbp)
000000000000002c	movabsq	$0x64006c0072006f, %rdx
0000000000000036	movq	%rdx, 0xfffffffffffffff2(%rbp)
000000000000003a	movl	$0xa000d, 0xfffffffffffffffa(%rbp)
0000000000000041	leaq	0xffffffffffffffe2(%rbp), %rdx
0000000000000045	movw	$_efi_main, 0xfffffffffffffffe(%rbp)
000000000000004b	callq	*%rax
000000000000004d	nop
000000000000004e	nop
000000000000004f	nop
0000000000000050	jmp	0x50

Line 5 now reads the system table pointer from rdx, which is what we wanted.
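
For what it's worth, I believe current versions of Rust no longer need a patched compiler for this: extern "efiapi" is built in, and on x86_64 it follows the same Microsoft convention. Here's a sketch of an entry point declared that way in present-day syntax. SystemTable is left as an opaque placeholder, the type aliases and constants are my own names, and the null check is only there so the function has something to do. A real entry point would also need the no_mangle attribute described in the Linking section.

```rust
/// Opaque placeholder for the system table (hypothetical; no fields needed here).
#[repr(C)]
pub struct SystemTable {
    pub _opaque: [u8; 0],
}

/// UEFI handles are untyped pointers.
pub type Handle = *mut std::ffi::c_void;
/// UEFI status codes are pointer-sized unsigned integers.
pub type Status = usize;

/// Status code for success.
pub const SUCCESS: Status = 0;
/// Error codes have the high bit set; 2 is the invalid-parameter code.
pub const INVALID_PARAMETER: Status = (1usize << (usize::BITS - 1)) | 2;

/// Entry point using the UEFI calling convention, so on x86_64 the image
/// handle arrives in rcx and the system table pointer in rdx.
pub extern "efiapi" fn efi_main(_image: Handle, system_table: *const SystemTable) -> Status {
    if system_table.is_null() {
        return INVALID_PARAMETER;
    }
    SUCCESS
}
```

Calling efi_main from ordinary Rust code still works, because the compiler knows which convention the function expects and sets up the registers accordingly.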

Booting

Now the moment of truth.

The easiest way to boot this code is to stick it on a CD image. The EFI spec states that you can create a file called /efi/boot/bootx64.efi and the firmware will try to use this file to boot from removable media. I used mkisofs to build an ISO image with the right format.

In theory I could burn this to a CD and boot an EFI-capable computer off it. I'm not quite brave enough to try this yet, so I used VirtualBox instead. I created a new VM, saying the operating system was an other/unknown 64-bit OS. In the system settings, I had to check the Enable EFI option as well. Then, I pointed the virtual CD-ROM device at my ISO image and booted the machine.

It worked!

Screen shot of virtual machine running Rust Hello, World!

Okay, so it didn't actually work on the first try. It took a while to get the PE file format right, the calling conventions correct, and all the pointers pointing to the right place. Fortunately, once that much is done, adding more functionality shouldn't be hard.

Conclusion

So we've covered how to make a minimally interesting EFI application in Rust. The next step is to fill out the EFI API and start building more interesting applications, such as a full-fledged operating system. The initial working version of this code is available here. Since then, I've cleaned it up and added a few more pieces. You can see the latest version at the location below.

https://github.com/eholk/Boot2Rust