What is the main difference between x86 and x64 assembly?

x64 widens registers to 64 bits, doubles the general-purpose register count from 8 to 16, and changes the calling conventions so the first several arguments travel in registers instead of on the stack. Pointers and addresses become 8 bytes, and code commonly uses RIP-relative addressing.

Are x86 and x64 calling conventions compatible?

No. 32-bit code typically uses cdecl or stdcall and pushes arguments onto the stack. x64 uses register-based conventions — System V AMD64 on Linux/macOS and Microsoft x64 fastcall on Windows — which differ from each other, so you must know the target OS before reading argument flow.

Can a 64-bit CPU still run 32-bit assembly?

Yes. x86-64 processors include a compatibility mode that executes 32-bit code natively, which is why 32-bit binaries still run on 64-bit operating systems that ship a 32-bit runtime.

How do system calls differ between x86 and x64?

32-bit Linux uses int 0x80 or sysenter with arguments in ebx, ecx, edx, and so on. x64 uses the dedicated syscall instruction with a different argument-register order and a different syscall number table.

x86 vs x64 Assembly: Key Differences

If you reverse engineer binaries, the single most common confusion is mixing up the rules of 32-bit and 64-bit code. They look similar — same mnemonics, same flags register, same broad CPU model — but the conventions underneath differ enough that reading x64 with an x86 mental model will silently mislead you. This guide covers the practical differences that change how you read a disassembly listing. For the full instruction set, keep the x86/x64 ISA reference open in another tab.

Registers: width and count

The headline change is width. The 32-bit general-purpose registers (eax, ebx, ecx, edx, esi, edi, esp, ebp) become 64-bit registers with an r prefix: rax, rbx, and so on. Each register is still addressable at its narrower widths, so the same physical register exposes rax (64-bit), eax (32-bit), ax (16-bit), and al (8-bit).

The second change is count. x86 has 8 general-purpose registers; x64 adds eight more, r8 through r15, for 16 GPRs total. These new registers have width-specific names too: r8 (64-bit), r8d (32-bit), r8w (16-bit), and r8b (8-bit).

One subtle rule trips up almost everyone: writing to a 32-bit sub-register zeroes the upper 32 bits of the full register. Writing to the 16-bit or 8-bit parts does not.

asm

mov rax, 0x1122334455667788
mov eax, 1          ; rax is now 0x0000000000000001 (top half zeroed)
mov ax, 2           ; only the low 16 bits change

That zero-extension behavior is why compilers emit xor eax, eax to clear rax: it needs no REX.W prefix, so it is one byte shorter than xor rax, rax and produces the same result.
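The zero-extension rule is easy to model in a few lines. The sketch below is illustrative only: write_subregister is a hypothetical helper name, and it models writes to the low part of a register (al, ax, eax, rax), not the legacy high-byte registers such as ah.

```python
MASK64 = (1 << 64) - 1

def write_subregister(full: int, value: int, width: int) -> int:
    """Return the 64-bit register value after writing `value` to its low `width` bits."""
    if width not in (8, 16, 32, 64):
        raise ValueError("width must be 8, 16, 32, or 64")
    mask = (1 << width) - 1
    value &= mask
    if width >= 32:
        # A 32-bit write zero-extends into the full register;
        # a 64-bit write simply replaces it.
        return value
    # 8-bit and 16-bit writes leave every other bit untouched.
    return (full & ~mask & MASK64) | value

rax = 0x1122334455667788
print(hex(write_subregister(rax, 1, 32)))  # mov eax, 1 -> 0x1
print(hex(write_subregister(rax, 2, 16)))  # mov ax, 2  -> 0x1122334455660002
```

The second call starts again from the original rax value, so it shows the 16-bit case in isolation rather than continuing from the listing above.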

Pointer and address size

In x86, pointers are 4 bytes and the virtual address space is 32-bit. In x64, pointers are 8 bytes and addresses are 64-bit, though most current hardware implements only 48 bits of virtual address and requires the upper bits to be a sign extension of bit 47 (a canonical address). When you spot stack slots spaced 8 bytes apart, or push/pop moving the stack pointer by 8, you are looking at 64-bit code. Getting this wrong throws off every offset calculation you do by hand. The glossary has entries for the addressing terms used throughout this article.
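When a 64-bit value in a listing might be a pointer, a canonical-address check is a quick sanity test. This is a minimal sketch, assuming 48 implemented virtual-address bits by default; is_canonical is a hypothetical helper, not part of any tool.

```python
def is_canonical(addr: int, va_bits: int = 48) -> bool:
    """True if bits 63 down to (va_bits - 1) of a 64-bit address are all equal."""
    addr &= (1 << 64) - 1
    upper = addr >> (va_bits - 1)             # top implemented bit and everything above it
    all_ones = (1 << (64 - va_bits + 1)) - 1  # the value of `upper` when every bit is set
    return upper == 0 or upper == all_ones

print(is_canonical(0x00007FFFFFFFFFFF))  # True: top of the lower half
print(is_canonical(0xFFFF800000000000))  # True: bottom of the upper half
print(is_canonical(0x0000800000000000))  # False: bit 47 set, upper bits clear
```

Passing va_bits=57 would model processors with 5-level paging; that value is an assumption about the target, so check it against the hardware you are analyzing.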

Calling conventions

This is where reading argument flow goes wrong most often.

32-bit (cdecl / stdcall): arguments are pushed onto the stack, usually right-to-left. The difference between the two is who cleans the stack — the caller (cdecl) or the callee via ret N (stdcall). The return value comes back in eax.

x64 System V AMD64 (Linux, macOS): the first six integer arguments go in rdi, rsi, rdx, rcx, r8, r9. Floating-point arguments use xmm0–xmm7. The return value is in rax.

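x64 Microsoft fastcall (Windows): the first four integer arguments go in rcx, rdx, r8, r9. The caller must also reserve 32 bytes of shadow space on the stack, directly above the return address, where the callee may spill those four registers. The space is reserved even though the arguments themselves arrive in registers, and even when the function takes fewer than four.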

Aspect	x86 (cdecl/stdcall)	x64 System V	x64 Microsoft
Arg passing	Stack (push)	rdi, rsi, rdx, rcx, r8, r9	rcx, rdx, r8, r9
Register args	0	6 integer + 8 SSE	4 (int or float)
Return value	eax	rax	rax
Stack cleanup	caller / callee	caller	caller
Shadow space	none	none	32 bytes
Stack alignment	4 bytes	16 bytes	16 bytes

The takeaway when reversing: before you name function arguments, confirm the OS. A Linux syscall wrapper and a Windows API thunk read completely differently even though both are x64. You can step through both conventions live in the interactive CPU simulator to watch registers change.
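The three conventions can be summarized as a lookup from argument index to location. The sketch below is a simplification under stated assumptions: every argument is an integer or pointer that fits one slot, offsets are measured at function entry (before the prologue runs), and arg_location and the convention labels are hypothetical names chosen for illustration.

```python
SYSV_INT_REGS = ["rdi", "rsi", "rdx", "rcx", "r8", "r9"]
MS_INT_REGS = ["rcx", "rdx", "r8", "r9"]

def arg_location(convention: str, index: int) -> str:
    """Location of the index-th (0-based) integer argument at function entry."""
    if index < 0:
        raise ValueError("index must be non-negative")
    if convention == "cdecl":
        # [esp] holds the return address; 4-byte argument slots follow.
        # stdcall looks the same at entry; only the cleanup differs.
        return f"[esp+{4 + 4 * index:#x}]"
    if convention == "sysv":
        if index < len(SYSV_INT_REGS):
            return SYSV_INT_REGS[index]
        # [rsp] holds the return address; stack arguments follow in 8-byte slots.
        return f"[rsp+{8 + 8 * (index - 6):#x}]"
    if convention == "ms":
        if index < len(MS_INT_REGS):
            return MS_INT_REGS[index]
        # Return address (8 bytes) + shadow space (32 bytes), then 8-byte slots.
        return f"[rsp+{8 + 32 + 8 * (index - 4):#x}]"
    raise ValueError(f"unknown convention: {convention}")

for conv in ("cdecl", "sysv", "ms"):
    print(conv, [arg_location(conv, i) for i in range(5)])
```

The fifth argument is the interesting one: it is still in a register (r8) under System V but sits at [rsp+0x28] under the Microsoft convention, just past the shadow space.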

RIP-relative addressing

32-bit x86 code typically references global data with absolute addresses baked into the instruction. x64 introduces RIP-relative addressing, where the operand is encoded as a signed 32-bit displacement from the instruction pointer (rip), which at that point holds the address of the next instruction.

asm

; x86 — absolute address
mov eax, [0x00403010]

; x64 — relative to the next instruction
mov eax, [rip + 0x2ff0]   ; target = address of next instruction + 0x2ff0 (displacement fixed at link time)

This is the backbone of position-independent code, so PIE executables and most shared libraries are full of it. In a disassembler the destination is usually pre-computed for you, but when patching bytes by hand you must recompute the displacement relative to the end of the instruction, not its start.
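The displacement arithmetic for hand patching can be sketched as follows. The helper names and the load address 0x401000 are hypothetical; the 6-byte length corresponds to the encoding 8B 05 followed by a 4-byte displacement, which is how mov eax, [rip + disp32] is normally assembled.

```python
import struct

MASK64 = (1 << 64) - 1

def rip_target(insn_addr: int, insn_len: int, disp32: int) -> int:
    """Address referenced by a RIP-relative operand."""
    # rip points at the END of the current instruction when the operand is resolved.
    return (insn_addr + insn_len + disp32) & MASK64

def rip_disp_bytes(insn_addr: int, insn_len: int, target: int) -> bytes:
    """Little-endian disp32 bytes that make the instruction reference `target`."""
    disp = target - (insn_addr + insn_len)
    if not -(1 << 31) <= disp < (1 << 31):
        raise ValueError("target is out of range for a signed 32-bit displacement")
    return struct.pack("<i", disp)

# mov eax, [rip + 0x2ff0] placed at a hypothetical address
print(hex(rip_target(0x401000, 6, 0x2FF0)))            # 0x403ff6
print(rip_disp_bytes(0x401000, 6, 0x403FF6).hex())     # f02f0000
print(rip_disp_bytes(0x401000, 6, 0x400FF0).hex())     # eaffffff (backward reference)
```

Passing the full instruction length matters when an immediate follows the displacement: the displacement is still measured from the end of the whole instruction, not from the end of the displacement field.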

The red zone

System V AMD64 defines a 128-byte red zone below rsp that a leaf function (one that calls nothing) may use as scratch space without adjusting the stack pointer. You will see functions read and write [rsp - 8], [rsp - 16], and so on with no preceding sub rsp. That is legal and expected on Linux/macOS. The Windows x64 ABI has no red zone, so accesses below rsp without a prior sub rsp are one more clue that you are reading System V code.
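As a triage aid, an rsp-relative offset can be classified against the red zone. This is a rough sketch with hypothetical names and labels, and it only encodes the rule stated above: 128 bytes below rsp under System V, nothing under the Windows x64 ABI.

```python
RED_ZONE_SIZE = 128  # bytes below rsp reserved by the System V AMD64 ABI

def classify_rsp_offset(offset: int, abi: str) -> str:
    """Classify a memory access at [rsp + offset] for 'sysv' or 'ms' code."""
    if abi not in ("sysv", "ms"):
        raise ValueError(f"unknown abi: {abi}")
    if offset >= 0:
        return "at or above rsp"
    if abi == "sysv" and offset >= -RED_ZONE_SIZE:
        return "red zone"
    return "below rsp, not protected"

print(classify_rsp_offset(-8, "sysv"))    # red zone
print(classify_rsp_offset(-136, "sysv"))  # below rsp, not protected
print(classify_rsp_offset(-8, "ms"))      # below rsp, not protected
```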

System calls

Userland-to-kernel transitions changed completely.

asm

; 32-bit Linux — write(1, buf, len)
mov eax, 4          ; __NR_write (32-bit table)
mov ebx, 1
mov ecx, buf
mov edx, len
int 0x80

; 64-bit Linux — write(1, buf, len)
mov rax, 1          ; __NR_write (64-bit table — different number!)
mov rdi, 1
mov rsi, buf
mov rdx, len
syscall

Three things differ: the instruction (int 0x80 or sysenter versus the dedicated syscall), the argument registers (ebx, ecx, edx... versus rdi, rsi, rdx...), and the syscall numbers themselves, because the 32-bit and 64-bit tables are not the same. The syscall instruction also clobbers rcx and r11 (it stores the return address and the saved flags there), which is why the fourth argument goes in r10 rather than rcx and why you will see those two registers saved around the call. Misreading the table is a classic way to mislabel a malware sample's behavior.
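To make the table mismatch concrete, here is a small lookup covering a handful of well-known Linux syscalls. It is deliberately incomplete, and describe_syscall is a hypothetical helper; for real work, use the full tables from the kernel headers for the target architecture. The register lists reflect the Linux kernel interface, where the fourth 64-bit argument travels in r10.

```python
# Partial Linux syscall tables: only a few well-known entries, for illustration.
SYSCALL_NAMES = {
    32: {1: "exit", 3: "read", 4: "write", 5: "open", 6: "close", 11: "execve"},
    64: {0: "read", 1: "write", 2: "open", 3: "close", 59: "execve", 60: "exit"},
}

SYSCALL_ARG_REGS = {
    32: ["ebx", "ecx", "edx", "esi", "edi", "ebp"],  # number in eax, int 0x80
    64: ["rdi", "rsi", "rdx", "r10", "r8", "r9"],    # number in rax, syscall
}

def describe_syscall(bits: int, number: int, argc: int) -> str:
    """Render a syscall as name(registers...) for the given bitness."""
    name = SYSCALL_NAMES[bits].get(number, f"unknown_{number}")
    regs = SYSCALL_ARG_REGS[bits][:argc]
    return f"{name}({', '.join(regs)})"

# The same number means different things in the two tables.
print(describe_syscall(32, 1, 1))  # exit(ebx)
print(describe_syscall(64, 1, 3))  # write(rdi, rsi, rdx)
```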

Instruction encoding differences

x64 adds the REX prefix (a byte in the 0x40–0x4F range) ahead of many instructions. It encodes operand width (the REX.W bit selects 64-bit) and the extra high bits needed to address r8–r15. In 32-bit mode those same byte values are the one-byte inc and dec instructions, so identical bytes decode differently depending on the mode, and a disassembler set to the wrong mode produces convincing nonsense. It also matters for tricks like overlapping instructions, where jumping mid-instruction yields a different decode in 32-bit versus 64-bit mode.
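A short sketch shows why one byte reads two ways. In 64-bit mode a byte in the 0x40 to 0x4F range is a REX prefix with four flag bits (W, R, X, B); in 32-bit mode the same values are the one-byte inc and dec instructions. The function names here are hypothetical.

```python
REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

def decode_rex(byte: int):
    """Decode a REX prefix (64-bit mode). Returns None if the byte is not REX."""
    if not 0x40 <= byte <= 0x4F:
        return None
    return {
        "W": bool(byte & 0x8),  # 64-bit operand size
        "R": bool(byte & 0x4),  # extends the ModRM reg field
        "X": bool(byte & 0x2),  # extends the SIB index field
        "B": bool(byte & 0x1),  # extends ModRM r/m, SIB base, or opcode reg
    }

def legacy_meaning(byte: int):
    """What the same byte means in 32-bit mode: a one-byte inc or dec."""
    if not 0x40 <= byte <= 0x4F:
        return None
    op = "inc" if byte < 0x48 else "dec"
    return f"{op} {REG32[byte & 7]}"

print(decode_rex(0x48))      # {'W': True, 'R': False, 'X': False, 'B': False}
print(legacy_meaning(0x48))  # dec eax
print(legacy_meaning(0x41))  # inc ecx
```

For example, the bytes 48 89 E5 are mov rbp, rsp in 64-bit mode, but a 32-bit decode yields dec eax followed by mov ebp, esp.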

Why it matters when reversing

Mode confusion is not academic. Set IDA, Ghidra, or objdump to the wrong bitness and you get plausible-looking garbage. Knowing the conventions lets you:

Read argument flow correctly per OS instead of guessing.
Recognize obfuscation that hides data inline, like stack strings, which look different in 32-bit and 64-bit code because of register pressure and stack layout.
Patch RIP-relative and absolute references without breaking offsets.

For the broader workflow of identifying mode, convention, and intent in a binary, see the full reverse engineering techniques catalog.

Next steps

Load a binary into the interactive CPU simulator, break on a function's first instruction, and check which registers are already populated before the prologue runs. That one habit tells you the bitness and the calling convention faster than any cheat sheet. Then dig into the ISA reference for the encoding details behind everything above.