I recently mentioned on the internet that I did work in this direction and a friend of mine asked me to write a blogpost on this. I haven’t blogged in a long time (keeping all the goodness for myself hehe), so here we go. 🦝

To set the scene, let’s assume we want to make an executable binary for x86_64 Linux that’s supposed to be extremely portable. It should work on both Debian and Arch Linux. It should work on systems without glibc like Alpine Linux. It should even work in a FROM scratch Docker container. In a more serious setting you would statically link musl-libc with your Rust program, but today we’re in a silly-goofy mood so we’re going to try to make this work without a libc. And we’re also going to use Rust for this, more specifically the stable release channel of Rust, so this blog post won’t use any nightly-only features that might still change/break. If you’re using a Rust version that was recent at the time of writing or later (>= 1.68.0 according to my computer), you should be able to try this at home just fine™.

This tutorial assumes you have no prior programming experience in any programming language, but it’s going to involve some x86_64 assembly. If you already know what a syscall is, you’ll be just fine. If this is your first exposure to programming you might still be able to follow along, but it might be a wild ride.

If you haven’t already, install rustup (possibly also available in your package manager, who knows?)

# when asked, press enter to confirm default settings
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh

This is going to install everything you need to use Rust on Linux (this tutorial assumes you’re following along on Linux btw). Usually it’s still using a system linker (by calling the cc binary, and errors out if none is present), but instead we’re going to use rustup to install an additional target:

rustup target add x86_64-unknown-none

I don’t know if/how this is made available by Linux distributions, so I recommend following along with rust installed from rustup.

Anyway, we’re creating a new project with cargo, this creates a new directory that we can then change into (you might’ve done this before):

cargo new hack-the-planet
cd hack-the-planet

There’s going to be a file named Cargo.toml, we don’t need to make any changes there, but the one that was auto-generated for me at the time of writing looks like this:

[package]
name = "hack-the-planet"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]

There’s a second file named src/main.rs, it’s going to contain some pre-generated hello world, but we’re going to delete it and create a new, empty file:

rm src/main.rs
touch src/main.rs

Alrighty, leaving this file empty is not valid, but we’re going to walk through the individual steps, so we’re going to try to build with an empty file first. At this point I would like to credit this chapter of a fasterthanli.me series and a blogpost by Philipp Oppermann, this tutorial is merely a 2023 update that makes it work with stable Rust. Let’s run the build:

$ cargo build --release --target x86_64-unknown-none
   Compiling hack-the-planet v0.1.0 (/hack-the-planet)
error[E0463]: can't find crate for `std`
  |
  = note: the `x86_64-unknown-none` target may not support the standard library
  = note: `std` is required by `hack_the_planet` because it does not declare `#![no_std]`

error[E0601]: `main` function not found in crate `hack_the_planet`
  |
  = note: consider adding a `main` function to `src/main.rs`

Some errors have detailed explanations: E0463, E0601.
For more information about an error, try `rustc --explain E0463`.
error: could not compile `hack-the-planet` due to 2 previous errors

Since this doesn’t use a libc (oh right, I forgot to mention this up to this point actually), this also means there’s no std standard library. Usually the standard library of Rust still uses the system libc to do syscalls, but since we specify our libc as none this means std won’t be available (use std::fs::rename won’t work). There are still other functions we can use and import, for example there’s core that’s effectively a second standard library, but much smaller.
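To get a feel for what core still offers, here’s a minimal sketch that formats a message into a fixed-size buffer using only core::fmt, with no heap and no std. The FixedBuf type, its 64 byte capacity and the describe_exit function are made up for illustration, they’re not part of the program we’re building.

```rust
use core::fmt::{self, Write};

/// Hypothetical fixed-size text buffer; lives entirely on the stack.
struct FixedBuf {
    bytes: [u8; 64],
    len: usize,
}

impl FixedBuf {
    fn new() -> Self {
        FixedBuf { bytes: [0; 64], len: 0 }
    }

    /// Returns the text written so far.
    fn as_str(&self) -> &str {
        // Only whole `&str` values are ever copied in, so this is valid UTF-8.
        core::str::from_utf8(&self.bytes[..self.len]).unwrap_or("")
    }
}

impl Write for FixedBuf {
    fn write_str(&mut self, s: &str) -> fmt::Result {
        let end = self.len + s.len();
        if end > self.bytes.len() {
            return Err(fmt::Error); // buffer full, refuse instead of truncating
        }
        self.bytes[self.len..end].copy_from_slice(s.as_bytes());
        self.len = end;
        Ok(())
    }
}

/// Formats an exit status without touching `std`.
fn describe_exit(status: i32) -> FixedBuf {
    let mut buf = FixedBuf::new();
    // The error case is ignored: this message always fits into 64 bytes.
    let _ = write!(buf, "exit status: {}", status);
    buf
}
```

Everything used here is imported from core, so the same code should also compile in a #![no_std] crate.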

To opt-out of the std standard library, we can put #![no_std] into src/main.rs:

#![no_std]

Running the build again:

$ cargo build --release --target x86_64-unknown-none
   Compiling hack-the-planet v0.1.0 (/hack-the-planet)
error[E0601]: `main` function not found in crate `hack_the_planet`
 --> src/main.rs:1:11
  |
1 | #![no_std]
  |           ^ consider adding a `main` function to `src/main.rs`

For more information about this error, try `rustc --explain E0601`.
error: could not compile `hack-the-planet` due to previous error

Rust noticed we didn’t define a main function and suggests we add one. This isn’t what we want though, so we’ll politely decline and inform Rust we don’t have a main and it shouldn’t attempt to call it. We’re adding #![no_main] to our file and src/main.rs now looks like this:

#![no_std]
#![no_main]

Running the build again:

$ cargo build
   Compiling hack-the-planet v0.1.0 (/hack-the-planet)
error: `#[panic_handler]` function required, but not found

error: language item required, but not found: `eh_personality`
  |
  = note: this can occur when a binary crate with `#![no_std]` is compiled for a target where `eh_personality` is defined in the standard library
  = help: you may be able to compile for a target that doesn't need `eh_personality`, specify a target with `--target` or in `.cargo/config`

error: could not compile `hack-the-planet` due to 2 previous errors

Rust is asking us for a panic handler, basically “I’m going to jump to this address if something goes terribly wrong and execute whatever you put there”. (The second error about eh_personality only shows up because this particular run left out --target x86_64-unknown-none and built for the host instead; the x86_64-unknown-none target aborts on panic and doesn’t need it.) Eventually we would put some code there to just exit the program, but for now an infinite loop will do. This is likely going to get stripped away anyway by the compiler if it notices our program has no code-branches leading to a panic and the code is unused. Our src/main.rs now looks like this:

#![no_std]
#![no_main]

use core::panic::PanicInfo;

#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {}
}

Running the build again:

$ cargo build --release --target x86_64-unknown-none
   Compiling hack-the-planet v0.1.0 (/hack-the-planet)
    Finished release [optimized] target(s) in 0.16s

Neat, it worked! What happens if we run it? 👀

$ target/x86_64-unknown-none/release/hack-the-planet
Segmentation fault (core dumped)

Oops. Let’s try to disassemble it:

$ objdump -d target/x86_64-unknown-none/release/hack-the-planet

target/x86_64-unknown-none/release/hack-the-planet:     file format elf64-x86-64

Ok that looks pretty “from scratch” to me. The file contains no cpu instructions. Also note how our infinite loop is not present (as predicted).

Making a basic program and executing it

Ok let’s try to make a valid program that basically just cleanly exits. First let’s try to add some cpu instructions and verify they’re indeed getting executed. Lemme introduce, the ✨ INT 3 ✨ instruction in x86_64 assembly. In binary it’s also known as the 0xCC opcode. It crashes our program in a slightly different way, so if the error message changes, we know it worked. The other tutorials use a #[naked] function for the entry point, but since this feature isn’t stabilized at the time of writing we’re going to use the global_asm! macro. Also don’t worry, I’m not going to introduce every assembly instruction individually. Our program now looks like this:

#![no_std]
#![no_main]

use core::arch::global_asm;
use core::panic::PanicInfo;

#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {}
}

global_asm! {
    ".global _start",
    "_start:",
    "int 3"
}

Running the build again (ok basically from now on the build is always going to be expected to work unless I say otherwise):

$ cargo build --release --target x86_64-unknown-none
   Compiling hack-the-planet v0.1.0 (/hack-the-planet)
    Finished release [optimized] target(s) in 0.11s

Let’s try to disassemble the binary again:

$ objdump -d target/x86_64-unknown-none/release/hack-the-planet

target/x86_64-unknown-none/release/hack-the-planet:     file format elf64-x86-64


Disassembly of section .text:

0000000000001210 <_start>:
    1210:	cc                   	int3

And sure enough, there’s a cc instruction that was identified as int3. Let’s try to run this:

$ target/x86_64-unknown-none/release/hack-the-planet
Trace/breakpoint trap (core dumped)

The error message of the crash is now slightly different because it’s hitting our breakpoint cpu instruction. Funfact btw, if you run this in strace you can see this isn’t making any system calls (aka not talking to the kernel at all, it just crashes):

$ strace -f ./hack-the-planet
execve("./hack-the-planet", ["./hack-the-planet"], 0x74f12430d1d8 /* 39 vars */) = 0
--- SIGTRAP {si_signo=SIGTRAP, si_code=SI_KERNEL, si_addr=NULL} ---
+++ killed by SIGTRAP (core dumped) +++
[1]    2796457 trace trap (core dumped)  strace -f ./hack-the-planet

Let’s try to make a program that does a clean shutdown. To do this we inform the kernel with a system call that we would like to exit. We can get more info on this with man 2 exit and it defines exit like this:

[[noreturn]] void _exit(int status);

On Linux this syscall is actually called _exit and exit is implemented as a libc function, but we don’t care about any of that today, it’s going to do the job just fine. Also note how it takes a single argument of type int. In C-speak this means “signed 32 bit”, i32 in Rust.
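One detail worth knowing about that int: although the argument is a full i32, the parent process (your shell, for example) only gets to see the lowest 8 bits of it. A tiny sketch of that truncation, with visible_exit_status being a name I made up for illustration:

```rust
/// What a parent process observes for a given `status` passed to the
/// exit syscall: only the lowest 8 bits are reported.
fn visible_exit_status(status: i32) -> u8 {
    (status & 0xff) as u8
}
```

So an exit status of 69 arrives as 69, but 256 would arrive as 0 and -1 as 255.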

Next we need to figure out the “syscall number” of this syscall. These numbers are cpu architecture specific for some reason (idk, idc). We’re looking these numbers up with ripgrep in /usr/include/asm/:

$ rg __NR_exit /usr/include/asm
/usr/include/asm/unistd_64.h
64:#define __NR_exit 60
235:#define __NR_exit_group 231

/usr/include/asm/unistd_x32.h
53:#define __NR_exit (__X32_SYSCALL_BIT + 60)
206:#define __NR_exit_group (__X32_SYSCALL_BIT + 231)

/usr/include/asm/unistd_32.h
5:#define __NR_exit 1
253:#define __NR_exit_group 252

Since we’re on x86_64 the correct value is the one in unistd_64.h, 60. Also, on x86_64 the syscall number goes into the rax cpu register, the status argument goes in the rdi register. The return value of the syscall is going to be placed in the rax register after the syscall is done, but for exit the execution is never given back to us. Let’s try to write 60 into the rax register and 69 into the rdi register. To copy into registers we’re going to use the mov destination, source instruction to copy from source to destination. With these registers set up we can use the syscall cpu instruction to hand execution over to the kernel. Don’t worry, there’s only one more assembly instruction coming and for everything else we’re going to use Rust.

Our code now looks like this:

#![no_std]
#![no_main]

use core::arch::global_asm;
use core::panic::PanicInfo;

#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {}
}

global_asm! {
    ".global _start",
    "_start:",
    "mov rax, 60",
    "mov rdi, 69",
    "syscall"
}

Build the binary, run it and print the exit code:

$ cargo build --release --target x86_64-unknown-none
$ target/x86_64-unknown-none/release/hack-the-planet; echo $?
69

Nice. Rust is quite literally putting these cpu instructions into the binary for us, nothing else.

$ objdump -d target/x86_64-unknown-none/release/hack-the-planet

target/x86_64-unknown-none/release/hack-the-planet:     file format elf64-x86-64


Disassembly of section .text:

0000000000001210 <_start>:
    1210:	48 c7 c0 3c 00 00 00 	mov    $0x3c,%rax
    1217:	48 c7 c7 45 00 00 00 	mov    $0x45,%rdi
    121e:	0f 05                	syscall

Running this with strace shows the program does exactly one thing.

$ strace -f ./hack-the-planet
execve("./hack-the-planet", ["./hack-the-planet"], 0x70699fe8c908 /* 39 vars */) = 0
exit(69)                                = ?
+++ exited with 69 +++
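The exit syscall only needs rax and rdi, but syscalls with more arguments keep going through a fixed list of registers. As a quick reference, here’s the x86_64 Linux convention written down as a small lookup table. The constant and function names are just for illustration.

```rust
/// Registers used for syscall arguments 1 to 6 on x86_64 Linux, in order.
const SYSCALL_ARG_REGISTERS: [&str; 6] = ["rdi", "rsi", "rdx", "r10", "r8", "r9"];

/// The syscall number goes in through this register, and the return value
/// comes back in it.
const SYSCALL_NUMBER_REGISTER: &str = "rax";

/// Returns the register for the zero-based argument `index`, if there is one.
fn syscall_arg_register(index: usize) -> Option<&'static str> {
    SYSCALL_ARG_REGISTERS.get(index).copied()
}
```

The status argument of exit is argument number one, which is why it went into rdi.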

Writing Rust

Ok but even though cpu instructions can be fun at times, I’d rather not deal with them most of the time (this might strike you as odd, considering this blog post). Instead let’s try to define a function in Rust and call into that instead. We’re going to define this function as unsafe. Btw, in case it wasn’t obvious, none of this is taking advantage of the safety guarantees of Rust. This tutorial is mostly going to stick to unsafe Rust, but for bigger projects you can attempt to reduce your usage of unsafe to opt back into “normal” safe Rust. We also declare the function with #[no_mangle] so the function name is preserved as main and we can call it from our global_asm entry point. Lastly, when our program is started it’s going to get the stack address passed in one of the cpu registers, this value is expected to be passed to our function as an argument. Our function declares ! as return type, which means it never returns:

#[no_mangle]
unsafe fn main(_stack_top: *const u8) -> ! {
    // TODO: this is missing
}

This won’t compile yet, we need to add our assembly for the exit syscall back in.

#[no_mangle]
unsafe fn main(_stack_top: *const u8) -> ! {
    asm!(
        "syscall",
        in("rax") 60,
        in("rdi") 0,
        options(noreturn)
    );
}

This time we’re using the asm! macro, this is a slightly more declarative approach. We want to run the syscall cpu instruction with 60 in the rax register, and this time we want the rdi register to be zero, to indicate a successful exit. We also use options(noreturn) so Rust knows it should assume execution does not resume after this assembly is executed (the Linux kernel guarantees this). We modify our global_asm! entrypoint to call our new main function, and to copy the stack address from rsp into the register for the first argument rdi because it would otherwise get lost forever:

global_asm! {
    ".global _start",
    "_start:",
    "mov rdi, rsp",
    "call main"
}
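Why bother keeping the stack address around? A rough sketch of what it’s good for: on x86_64 Linux the initial stack starts with argc, followed by the argv pointers and a NULL terminator (this layout comes from the System V ABI, the article itself doesn’t rely on it here). The helpers read_argc and read_arg below are hypothetical names and only meant to illustrate the idea.

```rust
use core::ffi::{c_char, CStr};

/// Reads `argc`, the first machine word at the initial stack pointer.
///
/// Safety: `stack_top` must be the unmodified `rsp` value from process
/// entry (or point to memory with the same layout).
unsafe fn read_argc(stack_top: *const u8) -> usize {
    unsafe { *(stack_top as *const usize) }
}

/// Returns argument `index` as a C string, or `None` if it is out of range.
///
/// Safety: same requirement as `read_argc`.
unsafe fn read_arg<'a>(stack_top: *const u8, index: usize) -> Option<&'a CStr> {
    unsafe {
        if index >= read_argc(stack_top) {
            return None;
        }
        // The argv pointers start one machine word after argc.
        let argv = (stack_top as *const *const c_char).add(1);
        Some(CStr::from_ptr(*argv.add(index)))
    }
}
```

Inside main this could be called as read_arg(_stack_top, 0) to get the program name. Both functions only use core, so they’d fit into a #![no_std] program.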

Our full program now looks like this:

#![no_std]
#![no_main]

use core::arch::asm;
use core::arch::global_asm;
use core::panic::PanicInfo;

#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    loop {}
}

global_asm! {
    ".global _start",
    "_start:",
    "mov rdi, rsp",
    "call main"
}

#[no_mangle]
unsafe fn main(_stack_top: *const u8) -> ! {
    asm!(
        "syscall",
        in("rax") 60,
        in("rdi") 0,
        options(noreturn)
    );
}

After building and disassembling this the Rust compiler is slowly starting to do work for us:

$ cargo build --release --target x86_64-unknown-none
$ objdump -d target/x86_64-unknown-none/release/hack-the-planet

target/x86_64-unknown-none/release/hack-the-planet:     file format elf64-x86-64


Disassembly of section .text:

0000000000001210 <_start>:
    1210:	48 89 e7             	mov    %rsp,%rdi
    1213:	e8 08 00 00 00       	call   1220 <main>
    1218:	cc                   	int3
    1219:	cc                   	int3
    121a:	cc                   	int3
    121b:	cc                   	int3
    121c:	cc                   	int3
    121d:	cc                   	int3
    121e:	cc                   	int3
    121f:	cc                   	int3

0000000000001220 <main>:
    1220:	50                   	push   %rax
    1221:	b8 3c 00 00 00       	mov    $0x3c,%eax
    1226:	31 ff                	xor    %edi,%edi
    1228:	0f 05                	syscall
    122a:	0f 0b                	ud2

The mov and syscall instructions are still the same, but it noticed it can XOR the rdi register with itself to set it to zero. It does so through edi, the lower 32 bits of rdi (on x86_64, writing to a 32 bit register also clears the upper 32 bits of the full register, and the encoding is shorter), that’s why the register is referred to as edi in the disassembly. The same goes for eax in place of rax. You can also see it’s inserting a bunch of 0xCC instructions (for alignment) and Rust puts the opcodes 0x0F 0x0B (the ud2 instruction) at the end of the function to force an “invalid opcode exception” so the program is guaranteed to crash in case the exit syscall doesn’t do it.

This code still executes as expected:

$ strace -f ./hack-the-planet
execve("./hack-the-planet", ["./hack-the-planet"], 0x72dae7e5dc08 /* 39 vars */) = 0
exit(0)                                 = ?
+++ exited with 0 +++

Adding functions

Ok we’re getting closer but we aren’t quite there yet. Let’s try to write an exit function for our assembly that we can then call like a normal function. Remember that it takes a signed 32 bit integer that’s supposed to go into rdi.

unsafe fn exit(status: i32) -> ! {
    asm!(
        "syscall",
        in("rax") 60,
        in("rdi") status,
        options(noreturn)
    );
}

Actually, since this function doesn’t take any raw pointers and any i32 is valid for this syscall we’re going to remove the unsafe marker of this function. When doing this we still need to use unsafe { } within the function for our inline assembly.

fn exit(status: i32) -> ! {
    unsafe {
        asm!(
            "syscall",
            in("rax") 60,
            in("rdi") status,
            options(noreturn)
        );
    }
}

Let’s call this function from our main, and also replace the infinite loop in the panic handler with a call to exit(1):

#![no_std]
#![no_main]

use core::arch::asm;
use core::arch::global_asm;
use core::panic::PanicInfo;

#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    exit(1);
}

global_asm! {
    ".global _start",
    "_start:",
    "mov rdi, rsp",
    "call main"
}

fn exit(status: i32) -> ! {
    unsafe {
        asm!(
            "syscall",
            in("rax") 60,
            in("rdi") status,
            options(noreturn)
        );
    }
}

#[no_mangle]
unsafe fn main(_stack_top: *const u8) -> ! {
    exit(0);
}

Running this still works, but interestingly the generated assembly didn’t change at all:

$ cargo build --release --target x86_64-unknown-none
$ objdump -d target/x86_64-unknown-none/release/hack-the-planet

target/x86_64-unknown-none/release/hack-the-planet:     file format elf64-x86-64


Disassembly of section .text:

0000000000001210 <_start>:
    1210:	48 89 e7             	mov    %rsp,%rdi
    1213:	e8 08 00 00 00       	call   1220 <main>
    1218:	cc                   	int3
    1219:	cc                   	int3
    121a:	cc                   	int3
    121b:	cc                   	int3
    121c:	cc                   	int3
    121d:	cc                   	int3
    121e:	cc                   	int3
    121f:	cc                   	int3

0000000000001220 <main>:
    1220:	50                   	push   %rax
    1221:	b8 3c 00 00 00       	mov    $0x3c,%eax
    1226:	31 ff                	xor    %edi,%edi
    1228:	0f 05                	syscall
    122a:	0f 0b                	ud2

Rust noticed there’s no need to make it a separate function at runtime and instead merged the instructions of the exit function directly into our main. It also noticed the 0 argument in exit(0) means “rdi is supposed to be zero” and uses the XOR optimization mentioned before.

Since main is not calling any unsafe functions anymore we could mark it as safe too, but in the next few functions we’re going to deal with file descriptors and raw pointers, so this is likely the only safe function we’re going to write in this tutorial so let’s just keep the unsafe marker.

Printing text

Ok let’s try to do a quick hello world, to do this we’re going to call the write syscall. Looking it up with man 2 write:

ssize_t write(int fd, const void buf[.count], size_t count);

The write syscall takes 3 arguments and returns a ssize_t, a signed size_t. In Rust this is called isize. In C size_t is an unsigned integer type that can hold any value of sizeof(...) for the given platform, ssize_t can only store half of that because it uses one of the bits as a sign bit, which leaves room for negative values to indicate an error has occurred (the first s means signed, the libc wrapper for write returns -1 in case of an error).

The arguments for write are:

  • the file descriptor to write to. stdout is located on file descriptor 1.
  • a pointer/address to some memory.
  • the number of bytes that should be written, starting at the given address.

Let’s also lookup the syscall number of write:

$ rg __NR_write /usr/include/asm
/usr/include/asm/unistd_64.h
5:#define __NR_write 1
24:#define __NR_writev 20

/usr/include/asm/unistd_32.h
8:#define __NR_write 4
150:#define __NR_writev 146

/usr/include/asm/unistd_x32.h
5:#define __NR_write (__X32_SYSCALL_BIT + 1)
323:#define __NR_writev (__X32_SYSCALL_BIT + 516)

The value we’re looking for is 1. Let’s write our write function (heh).

unsafe fn write(fd: i32, buf: *const u8, count: usize) -> isize {
    let r0;
    asm!(
        "syscall",
        inlateout("rax") 1 => r0,
        in("rdi") fd,
        in("rsi") buf,
        in("rdx") count,
        lateout("rcx") _,
        lateout("r11") _,
        options(nostack, preserves_flags)
    );
    r0
}

Now that’s a lot of stuff at once. Since this syscall is actually going to hand execution back to our program we need to let Rust know which cpu registers the syscall is writing to, so Rust doesn’t attempt to use them to store data (that would be silently overwritten by the syscall). inlateout("rax") 1 => r0 means we’re writing a value to the register and want the result back in variable r0. in("rdi") fd means we want to write the value of fd into the rdi register. lateout("rcx") _ means the Linux kernel may write to that register (so the previous value may get lost), but we don’t want to store the value anywhere (the underscore acts as a dummy variable name).

This doesn’t compile just yet though

$ cargo build --release --target x86_64-unknown-none
   Compiling hack-the-planet v0.1.0 (/hack-the-planet)
error: incompatible types for asm inout argument
  --> src/main.rs:35:26
   |
35 |         inlateout("rax") 1 => r0,
   |                          ^    ^^ type `isize`
   |                          |
   |                          type `i32`
   |
   = note: asm inout arguments must have the same type, unless they are both pointers or integers of the same size

error: could not compile `hack-the-planet` due to previous error

Rust has inferred the type of r0 is isize since that’s what our function returns, but the type of the input value for the register was inferred to be i32. We’re going to select a specific number type to fix this.

unsafe fn write(fd: i32, buf: *const u8, count: usize) -> isize {
    let r0;
    asm!(
        "syscall",
        inlateout("rax") 1isize => r0,
        in("rdi") fd,
        in("rsi") buf,
        in("rdx") count,
        lateout("rcx") _,
        lateout("r11") _,
        options(nostack, preserves_flags)
    );
    r0
}
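One thing to keep in mind about the isize we get back: since we’re talking to the kernel directly, there’s no libc that turns errors into -1 plus errno. On Linux the raw syscall reports an error as a negative errno value in rax (in the range -4095 to -1). Here’s a minimal sketch for splitting the two cases apart. decode_ret and MAX_ERRNO are names I picked for illustration.

```rust
/// Largest errno value the kernel encodes in a syscall return value.
const MAX_ERRNO: isize = 4095;

/// Splits a raw syscall return value into `Ok(result)` or `Err(errno)`.
fn decode_ret(ret: isize) -> Result<usize, i32> {
    if (-MAX_ERRNO..0).contains(&ret) {
        // Negative values in this range are `-errno`.
        Err(-ret as i32)
    } else {
        Ok(ret as usize)
    }
}
```

For example a return value of -9 decodes to errno 9 (EBADF), which is what you’d get when writing to a file descriptor that isn’t open.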

We can now call our new write function like this:

write(1, b"Hello world\n".as_ptr(), 12);

We need to set the number of bytes we want to write explicitly because there’s no concept of null-byte termination in the write system call, it’s quite literally “write the next X bytes, starting from this address”. Our program now looks like this:

#![no_std]
#![no_main]

use core::arch::asm;
use core::arch::global_asm;
use core::panic::PanicInfo;

#[panic_handler]
fn panic(_info: &PanicInfo) -> ! {
    exit(1);
}

global_asm! {
    ".global _start",
    "_start:",
    "mov rdi, rsp",
    "call main"
}

fn exit(status: i32) -> ! {
    unsafe {
        asm!(
            "syscall",
            in("rax") 60,
            in("rdi") status,
            options(noreturn)
        );
    }
}

unsafe fn write(fd: i32, buf: *const u8, count: usize) -> isize {
    let r0;
    asm!(
        "syscall",
        inlateout("rax") 1isize => r0,
        in("rdi") fd,
        in("rsi") buf,
        in("rdx") count,
        lateout("rcx") _,
        lateout("r11") _,
        options(nostack, preserves_flags)
    );
    r0
}

#[no_mangle]
unsafe fn main(_stack_top: *const u8) -> ! {
    write(1, b"Hello world\n".as_ptr(), 12);
    exit(0);
}
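Hard-coding the 12 is a bit error-prone, and write is also allowed to write fewer bytes than we asked for. Here’s a hedged sketch of a helper that takes the length from the slice and retries after short writes. It receives the raw write function as a parameter so it isn’t tied to the inline assembly. write_all and RawWrite are hypothetical names, not something the program above defines.

```rust
/// Signature of a raw `write` wrapper like the one in this article:
/// (fd, buf, count) -> raw return value.
type RawWrite = unsafe fn(i32, *const u8, usize) -> isize;

/// Writes all of `bytes` to `fd`, taking the length from the slice and
/// retrying if only part of the data was accepted.
/// Returns `Err(ret)` with the raw return value if no progress was made.
///
/// Safety: `raw_write` must only read `count` bytes starting at `buf`.
unsafe fn write_all(raw_write: RawWrite, fd: i32, mut bytes: &[u8]) -> Result<(), isize> {
    while !bytes.is_empty() {
        let ret = unsafe { raw_write(fd, bytes.as_ptr(), bytes.len()) };
        if ret <= 0 {
            return Err(ret); // negative errno, or zero bytes written
        }
        // Skip the part that was written and try again with the rest.
        bytes = &bytes[ret as usize..];
    }
    Ok(())
}
```

In our program this could be called as write_all(write, 1, b"Hello world\n") from within main.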

Let’s try to build and disassemble it:

$ cargo build --release --target x86_64-unknown-none
$ objdump -d target/x86_64-unknown-none/release/hack-the-planet

target/x86_64-unknown-none/release/hack-the-planet:     file format elf64-x86-64


Disassembly of section .text:

0000000000001220 <_start>:
    1220:	48 89 e7             	mov    %rsp,%rdi
    1223:	e8 08 00 00 00       	call   1230 <main>
    1228:	cc                   	int3
    1229:	cc                   	int3
    122a:	cc                   	int3
    122b:	cc                   	int3
    122c:	cc                   	int3
    122d:	cc                   	int3
    122e:	cc                   	int3
    122f:	cc                   	int3

0000000000001230 <main>:
    1230:	50                   	push   %rax
    1231:	48 8d 35 d5 ef ff ff 	lea    -0x102b(%rip),%rsi        # 20d <_start-0x1013>
    1238:	b8 01 00 00 00       	mov    $0x1,%eax
    123d:	ba 0c 00 00 00       	mov    $0xc,%edx
    1242:	bf 01 00 00 00       	mov    $0x1,%edi
    1247:	0f 05                	syscall
    1249:	b8 3c 00 00 00       	mov    $0x3c,%eax
    124e:	31 ff                	xor    %edi,%edi
    1250:	0f 05                	syscall
    1252:	0f 0b                	ud2

This time there are 2 syscalls, first write, then exit. For write it’s setting up the 3 arguments in our cpu registers (rdi, rsi, rdx). The lea instruction subtracts 0x102b from the rip register (the instruction pointer) and places the result in the rsi register. This is effectively saying “an address relative to wherever this code was loaded into memory”. The instruction pointer is going to point directly behind the opcodes of the lea instruction, so 0x1238 - 0x102b = 0x20d. This address is also pointed out in the disassembly as a comment.
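If you want to double check that arithmetic, here’s a small sketch that decodes the displacement from the last four opcode bytes of the lea (d5 ef ff ff, little-endian) and applies it to the address of the following instruction. The function names are made up for this example.

```rust
/// Decodes the little-endian 32 bit displacement stored at the end of
/// the instruction.
fn decode_disp32(bytes: [u8; 4]) -> i32 {
    i32::from_le_bytes(bytes)
}

/// Computes the address referenced by a rip-relative operand.
/// `rip` already points at the *next* instruction when the displacement
/// is applied, so the instruction length is added first.
fn rip_relative_target(instr_addr: u64, instr_len: u64, disp: i32) -> u64 {
    let next_instr = instr_addr + instr_len;
    // Sign-extend the displacement to 64 bits, then add with wrap-around.
    next_instr.wrapping_add(disp as i64 as u64)
}
```

With the lea at 0x1231 being 7 bytes long, rip_relative_target(0x1231, 7, decode_disp32([0xd5, 0xef, 0xff, 0xff])) comes out as 0x20d. The same calculation applies to the call in _start: it sits at 0x1223, is 5 bytes long and has a displacement of 8, which lands on 0x1230.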

We don’t see the string in our disassembly but we can convert our 0x20d hex to 525 in decimal and use dd to read 12 bytes from that offset, and sure enough:

$ dd bs=1 skip=525 count=12 if=target/x86_64-unknown-none/release/hack-the-planet
Hello world
12+0 records in
12+0 records out

Executing our binary with strace also shows the new write syscall (and the bytes that are being written mixed up in the output).

$ strace -f ./hack-the-planet
execve("./hack-the-planet", ["./hack-the-planet"], 0x74493abe64a8 /* 39 vars */) = 0
write(1, "Hello world\n", 12Hello world
)           = 12
exit(0)                                 = ?
+++ exited with 0 +++

After running strip on it to remove some symbols the binary is so small, if you open it in a text editor it fits on a screenshot: