DEV Community

Ripan Deuri


From ARM Assembly to Machine Code: A Bare-Metal Primer


Bare-metal programming begins at a boundary where abstractions disappear. There is no operating system, no loader, and no safety net between software and silicon. At this level, understanding how human-readable assembly instructions become raw bytes in memory is not optional—it is foundational.

This article walks through that transformation step by step on ARMv7, showing how an assembler encodes instructions, how PC-relative addressing works, and how the final binary is laid out in memory.


ARMv7 programmer-visible registers

ARMv7 exposes sixteen core registers (R0–R15) to software, along with status registers that record condition flags and control execution.

Registers R0–R12 are general-purpose and are typically used for data manipulation, parameter passing, and temporary storage.

Registers R13–R15 have fixed architectural roles:

  • R13 (SP – Stack Pointer)
    Holds the address of the top of the current stack.

  • R14 (LR – Link Register)
    Stores the return address during subroutine calls.

  • R15 (PC – Program Counter)
    Tracks instruction fetch. In ARM state, reading it yields the address of the current instruction plus 8.

In addition to general registers, ARMv7 defines program status registers:

  • CPSR (Current Program Status Register)
    Contains condition flags (N, Z, C, V) and processor control bits.

  • SPSR (Saved Program Status Register)
    Used in exception modes to preserve the prior CPSR value.

ARMv7 also supports multiple processor modes (User, IRQ, FIQ, Supervisor, etc.). Some registers are banked across modes—most notably stack pointers and link registers—allowing fast exception entry without saving all state.
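To make the CPSR layout a little more concrete, here is a minimal Python sketch that pulls the condition flags and the mode field out of a 32-bit value. The flag positions (N, Z, C, V in bits 31 to 28) and the 5-bit mode field in bits 4 to 0 follow the ARMv7 architecture; the helper names and the sample value `example_cpsr` are made up for illustration.

```python
# Bit positions of the condition flags in the ARMv7 CPSR.
FLAG_BITS = {"N": 31, "Z": 30, "C": 29, "V": 28}


def decode_flags(cpsr: int) -> dict:
    """Return the N, Z, C, V flags of a 32-bit CPSR value as 0 or 1."""
    return {name: (cpsr >> bit) & 1 for name, bit in FLAG_BITS.items()}


def decode_mode(cpsr: int) -> int:
    """Return the 5-bit processor mode field (bits 4..0)."""
    return cpsr & 0x1F


# Hypothetical CPSR value: Z and C set, mode field 0b10011 (Supervisor).
example_cpsr = 0x600000D3
flags = decode_flags(example_cpsr)
mode = decode_mode(example_cpsr)
print(flags, bin(mode))
```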

From assembly source to machine code

An ARM processor does not execute assembly text. It executes 32-bit instruction words fetched from memory. The assembler’s role is to translate readable mnemonics into those instruction words, based on templates defined by the ARM Instruction Set Architecture (ISA).

This discussion assumes the processor is in ARM state (not Thumb), so that:

  • All instructions are 32 bits wide
  • Instructions are word-aligned
  • PC reads as the current instruction address + 8

A minimal startup example

startup.s:

ldr r2, str1
b .
str1: .word 0xDEADBEEF

This program performs three actions:

  1. Loads a 32-bit value into r2
  2. Enters an infinite loop
  3. Places a literal value in memory

How labels and offsets are resolved

The assembler parses the file and assigns section-relative offsets to instructions and data:

Address  Source code              Machine code
0x00     ldr r2, str1             0xE59F2000
0x04     b .                      0xEAFFFFFE
0x08     str1: .word 0xDEADBEEF   0xDEADBEEF

Because everything resides in one section, no linker relocation is required. In more complex programs, the linker completes this step.
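The offset assignment can be pictured as a simple location counter. The Python sketch below is not how a real assembler is written; it only assumes, as this article does, that every instruction and every .word occupies 4 bytes. The function name `assign_offsets` and the tuple layout are hypothetical.

```python
def assign_offsets(statements):
    """Walk statements in order, record label offsets, advance 4 bytes per item.

    Each statement is a (label_or_None, source_text) pair.
    """
    symbols = {}   # label name -> section-relative offset
    layout = []    # (offset, source_text) in program order
    offset = 0
    for label, text in statements:
        if label is not None:
            symbols[label] = offset
        layout.append((offset, text))
        offset += 4  # assumption: every item is one 32-bit word
    return symbols, layout


program = [
    (None, "ldr r2, str1"),
    (None, "b ."),
    ("str1", ".word 0xDEADBEEF"),
]

symbols, layout = assign_offsets(program)
for offset, text in layout:
    print(f"0x{offset:02X}  {text}")
```

Once `str1` is known to live at offset 0x08, the assembler has everything it needs to encode the ldr instruction.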

Encoding ldr r2, str1

ARM uses PC-relative addressing to load nearby constants. The assembler encodes ldr r2, str1 as LDR Rd, [PC, #offset], with Rd = R2.

A defining ARM rule applies here:

When an ARM instruction executes, the PC value equals the address of the current instruction plus 8 bytes.

For the instruction at address 0x00:

  • Target address = 0x08
  • PC during execution = 0x00 + 0x08 = 0x08
  • Required offset = 0x08 - 0x08 = 0

The assembler fills the instruction fields accordingly:

Field               Bits   Value         Meaning
Condition           31–28  1110 (E)      Always
Opcode              27–20  01011001      LDR, immediate
Base register (Rn)  19–16  1111          PC
Destination (Rd)    15–12  0010          R2
Offset              11–0   000000000000  0

Final instruction word: 0xE59F2000
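The field table can be reproduced with a few shifts. The following Python sketch assumes the condition is always 1110 and only handles the PC-relative form; `encode_ldr_literal` is a hypothetical helper name. One detail goes beyond the example above: bit 23 (the U bit inside the 27–20 field) selects whether the 12-bit offset is added or subtracted, so the sketch clears it for targets that lie before PC.

```python
def encode_ldr_literal(rd: int, instr_addr: int, target_addr: int) -> int:
    """Encode LDR Rd, [PC, #offset] (ARM state, condition 'always')."""
    pc = instr_addr + 8              # PC reads as current instruction + 8
    offset = target_addr - pc
    u_bit = 1 if offset >= 0 else 0  # 1 = add offset, 0 = subtract
    imm12 = abs(offset)
    if imm12 > 0xFFF:
        raise ValueError("literal is out of range for a 12-bit offset")
    return (
        (0xE << 28)      # condition: always
        | (0b010 << 25)  # load/store with immediate offset
        | (1 << 24)      # P: offset addressing, no writeback
        | (u_bit << 23)  # U: add or subtract the offset
        | (1 << 20)      # L: load
        | (0xF << 16)    # Rn = PC
        | (rd << 12)     # Rd
        | imm12
    )


word = encode_ldr_literal(rd=2, instr_addr=0x00, target_addr=0x08)
print(f"0x{word:08X}")
```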

Encoding b .

The branch instruction also uses PC-relative addressing.

  • Instruction address = 0x04
  • Target address (.) = 0x04
  • Effective PC = 0x04 + 0x08 = 0x0C
  • Byte offset = 0x04 - 0x0C = -8

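ARM branch offsets are stored in words, so the encoded offset is -8 / 4 = -2.

In 24-bit two’s complement, -2 is 0xFFFFFE. This value fills bits 23–0; the upper byte 0xEA combines the condition field 1110 (always) with the branch opcode 1010.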

Final instruction word: 0xEAFFFFFE

The result is an intentional infinite loop.
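The same arithmetic in Python, as a sketch under the assumptions used throughout this article (ARM state, condition always, plain B without link). The name `encode_branch` is hypothetical.

```python
def encode_branch(instr_addr: int, target_addr: int) -> int:
    """Encode B <target> (ARM state, condition 'always')."""
    pc = instr_addr + 8
    byte_offset = target_addr - pc
    if byte_offset % 4 != 0:
        raise ValueError("branch target must be word-aligned")
    word_offset = byte_offset // 4
    if not -(1 << 23) <= word_offset < (1 << 23):
        raise ValueError("branch target is out of range")
    imm24 = word_offset & 0xFFFFFF   # 24-bit two's complement
    return (0xE << 28) | (0b1010 << 24) | imm24


word = encode_branch(instr_addr=0x04, target_addr=0x04)
print(f"0x{word:08X}")
```

Masking with 0xFFFFFF is what turns -2 into 0xFFFFFE before it is merged with the upper byte.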

Why b . becomes an infinite loop

After reset and initial setup, execution of the example proceeds as follows.

  1. Execution reaches address 0x04
    The CPU completes the ldr instruction at address 0x00 and advances to the next instruction at 0x04, which contains the branch instruction.

  2. PC value during execution
    Due to the ARM pipeline design, when the instruction at 0x04 is executed, the PC holds 0x04 + 0x08 = 0x0C

  3. Instruction decode and offset interpretation
    The machine code 0xEAFFFFFE decodes to offset -2 words. Because ARM instructions are 4 bytes wide, this corresponds to a signed offset of -2 × 4 = -8 bytes.

  4. Branch target calculation
    The CPU computes the branch destination using the PC-relative rule: Target address = PC + offset = 0x0C + (-8) = 0x04

  5. PC update and control flow
    The processor updates the Program Counter to 0x04.
    The next instruction fetch begins from the same branch instruction.

This sequence repeats indefinitely. Each execution of the branch returns control to itself, creating a tight infinite loop without consuming stack space, registers, or memory.
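The steps above can be replayed in a few lines of Python. This is only a toy model of the branch target calculation, not a CPU simulator; `branch_target` is a hypothetical helper that sign-extends the 24-bit field and applies the PC + 8 rule.

```python
def branch_target(instr_addr: int, word: int) -> int:
    """Compute where an ARM-state B instruction at instr_addr jumps to."""
    imm24 = word & 0xFFFFFF
    if imm24 & 0x800000:          # sign bit of the 24-bit field is set
        imm24 -= 1 << 24          # sign-extend to a negative Python int
    pc = instr_addr + 8           # value of PC seen by the instruction
    return (pc + imm24 * 4) & 0xFFFFFFFF


# Follow the branch a few times, starting at the b . instruction.
addr = 0x04
trace = []
for _ in range(3):
    trace.append(addr)
    addr = branch_target(addr, 0xEAFFFFFE)

print([hex(a) for a in trace])
```

Every iteration lands back on 0x04, which is the loop described above.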

Why this pattern is used in bare-metal code

The b . idiom is commonly used in bare-metal programs for intentional halting:

  • when execution reaches an unrecoverable state
  • when waiting for a debugger connection
  • as a placeholder during early bring-up
  • as a deliberate end-of-program marker

Encoding .word 0xDEADBEEF

.word is not an executable instruction. It is an assembler directive (command to the assembler program).

  • str1 defines a label at the current location
  • .word emits 4 bytes at the current location
  • 0xDEADBEEF is written verbatim into the output

The CPU never decodes this as an instruction unless execution jumps into data memory.

How instructions appear in memory

The instruction 0xE59F2000 is a logical 32-bit value.
Memory layout depends on system endianness.

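On a little-endian ARM system, the least-significant byte is stored at the lowest address, so the bytes appear reversed relative to the written hex value: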

Address offset Byte
+0 00
+1 20
+2 9F
+3 E5

The same applies to all instructions and data words.
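Python's standard struct module can show the byte order directly. This is just an illustration on the host machine; the variable names are arbitrary.

```python
import struct

word = 0xE59F2000

little = struct.pack("<I", word)  # little-endian: least-significant byte first
big = struct.pack(">I", word)     # big-endian, shown for contrast

for offset, byte in enumerate(little):
    print(f"+{offset} {byte:02X}")
```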

From source to raw binary

arm-none-eabi-as -o startup.o startup.s
arm-none-eabi-ld -o first-hang.elf startup.o
arm-none-eabi-objcopy -O binary first-hang.elf first-hang.bin

Hex dump of the final binary:

$ hexdump -C first-hang.bin 
00000000  00 20 9f e5 fe ff ff ea  ef be ad de              |. ..........|
0000000c

Interpreted as 32-bit words:

$ xxd -e first-hang.bin 
00000000: e59f2000 eafffffe deadbeef            . ..........

Exactly as predicted by the instruction encoding rules.
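The check can also be done programmatically. The sketch below takes the twelve bytes from the hex dump above (typed in by hand rather than read from first-hang.bin) and unpacks them as three little-endian words.

```python
import struct

# The 12 bytes shown by hexdump -C, copied from the dump above.
dump = bytes.fromhex("00 20 9f e5 fe ff ff ea ef be ad de")

# Interpret them as three little-endian 32-bit words.
words = struct.unpack("<3I", dump)

for offset, value in zip(range(0, len(dump), 4), words):
    print(f"0x{offset:02X}: 0x{value:08X}")
```

To run it against the real file, the hard-coded bytes could be replaced with the result of reading first-hang.bin in binary mode.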


Bare-metal software executes in an environment where nothing is implicit. Reset vectors, startup code, exception handling, and peripheral access all rely on the same fundamentals shown here: instruction encoding, PC semantics, and binary layout.

Understanding how assembly becomes machine code removes a layer of mystery from the system and replaces it with something far more useful: predictability. From this foundation, linker scripts (used in My First Bare-Metal Program: From Reset to hello) and memory maps become logical extensions.

