Multi-Layer Packing
Stacking multiple packers and stages so each unpacking step only reveals the next packed layer, forcing analysts to recurse to the real payload.
Multi-layer packing nests one packing scheme inside another so that unpacking is not a single event but a chain. The outer stub decompresses or decrypts a buffer that is itself a packed image; running it produces a second stub, which yields a third, and only the innermost layer is the real payload. Authors deliberately mix techniques — a UPX-style compressor wrapping a custom XOR stub wrapping an aPLib-compressed core — so no single tool or signature unwinds the whole stack.
Each layer commonly reuses the same primitives — VirtualAlloc, a decode loop,
and a tail jump to a per-layer OEP — but with different keys, algorithms, and
section layouts. Families like Emotet, Smoke Loader, and GuLoader have all shipped
samples where two or three distinct unpacking stages must be peeled in order
before any meaningful code or strings are recoverable.
How it works
Conceptually each stage allocates a buffer, materialises the next stage into it, and transfers control; the new stage repeats with its own scheme:
file on disk
└─ Layer 0 stub (compressor, e.g. UPX)
     decompress -> buffer A ───────────► jmp OEP_A
     └─ Layer 1 stub (custom XOR loader)
          VirtualAlloc + xor-decrypt -> buf B ► call OEP_B
          └─ Layer 2 stub (aPLib depacker)
               depack -> buffer C ─────────────────► jmp OEP_C
               └─ Layer 3 = real payload PE
                    resolve IAT, run

each ─► is a distinct OEP; the dumped buffer of stage N
is the *packed* input of stage N+1, not the final code

The defining property is recursion: dumping after the first OEP gives you another packed object, not the payload. Stages may differ in entropy profile (compressed vs. encrypted), in which APIs they call, and in whether they run in-place or in a fresh allocation, which is why a uniform automated unpacker tends to stall on the second layer.
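The recursion can be illustrated with a toy model of the data flow only. This is a minimal sketch, not how any real packer stores its layers: the `ZLB0` and `XOR0` tags, the one-byte key-length field, and every function name are hypothetical conventions invented for the example, and real stubs decode by executing code rather than by reading a tag.

```python
import zlib

MAGIC_ZLIB = b"ZLB0"  # hypothetical tag marking a compressed layer
MAGIC_XOR = b"XOR0"   # hypothetical tag marking an XOR-encrypted layer


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Repeating-key XOR; the same call encrypts and decrypts."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def pack_zlib(inner: bytes) -> bytes:
    return MAGIC_ZLIB + zlib.compress(inner)


def pack_xor(inner: bytes, key: bytes) -> bytes:
    # Toy layout: tag, one-byte key length, key, ciphertext.
    return MAGIC_XOR + bytes([len(key)]) + key + xor_bytes(inner, key)


def unpack_one(blob: bytes):
    """Return the next stage, or None if blob is not a recognised layer."""
    if blob.startswith(MAGIC_ZLIB):
        return zlib.decompress(blob[4:])
    if blob.startswith(MAGIC_XOR):
        key_len = blob[4]
        key = blob[5:5 + key_len]
        return xor_bytes(blob[5 + key_len:], key)
    return None


def unpack_all(blob: bytes, max_depth: int = 16) -> list:
    """Peel layers in order; stages[0] is the input, stages[-1] the innermost."""
    stages = [blob]
    while len(stages) <= max_depth:
        nxt = unpack_one(stages[-1])
        if nxt is None:
            break
        stages.append(nxt)
    return stages


if __name__ == "__main__":
    payload = b"MZ\x90\x00 stand-in for the real payload"
    sample = pack_zlib(pack_xor(pack_zlib(payload), b"\x5a\xa5"))
    for depth, stage in enumerate(unpack_all(sample)):
        print(depth, stage[:4], len(stage))
```

Stopping after the first `unpack_one` call yields the XOR-tagged blob, which is the toy equivalent of dumping at the first OEP and finding another packed object.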
Detection & analysis
Static analysis: Expect to identify only the outermost layer statically: its
section names, entropy, and import table. Beyond that layer, the remaining
stages are opaque blobs until they are unpacked. A clue to depth is multiple
distinct high-entropy regions, or a stub whose output buffer still carries
packer markers (UPX section names, an AP32 header, a second decrypt loop). Treat
the IAT and OEP you recover as per-stage, not final.
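The two depth clues mentioned above (distinct high-entropy regions and leftover packer markers) can be checked with a few lines of Python. This is a rough sketch under assumptions: the 1024-byte window, the 7.0 bits-per-byte threshold, and the function names are illustrative choices, and a raw byte search for four-character markers can produce false positives inside compressed data.

```python
import math
from collections import Counter

# Byte strings commonly left behind by UPX (section names, pack header magic)
# and by aPLib's safe-pack header.
PACKER_MARKERS = (b"UPX0", b"UPX1", b"UPX!", b"AP32")


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for empty input)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())


def high_entropy_regions(blob: bytes, window: int = 1024, threshold: float = 7.0):
    """Return (start, end) offsets of runs of full windows at or above threshold."""
    regions = []
    start = None
    for off in range(0, len(blob), window):
        chunk = blob[off:off + window]
        # A trailing partial window is never counted as high-entropy.
        hot = len(chunk) == window and shannon_entropy(chunk) >= threshold
        if hot and start is None:
            start = off
        elif not hot and start is not None:
            regions.append((start, off))
            start = None
    if start is not None:
        regions.append((start, len(blob)))
    return regions


def find_markers(blob: bytes) -> dict:
    """Map each marker found to the offset of its first occurrence."""
    hits = {}
    for marker in PACKER_MARKERS:
        pos = blob.find(marker)
        if pos != -1:
            hits[marker.decode()] = pos
    return hits
```

Two or more separated regions, or a marker inside a buffer that a stub has just produced, suggest that at least one more layer is waiting; neither signal is conclusive on its own.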
Dynamic analysis: Unpack iteratively. Breakpoint on each allocation or
protection change (VirtualAlloc/VirtualProtect on Windows, mmap/mprotect on
POSIX systems) and on the tail control transfer. At each OEP, dump the new
region and re-examine it as a fresh sample, re-running entropy, header, and
import checks to decide whether another layer remains. Tools like pe-sieve and
Scylla can dump and rebuild a layer, but you repeat the cycle until the dumped
object has a normal entropy profile, a real import table, and readable strings.
That object is the final payload.
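The decision of whether to keep peeling can be approximated by a small triage helper run on each dump. This is a hedged sketch: the 6.8 entropy ceiling, the minimum of 20 strings, and the function names are assumptions chosen for illustration, and the import-table check is left out because it needs a proper PE parser.

```python
import math
import re
from collections import Counter


def entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 for empty input)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())


def triage(dump: bytes) -> dict:
    """Collect the cheap indicators used to judge a dumped region."""
    # Runs of six or more printable ASCII bytes, like the `strings` utility.
    strings = re.findall(rb"[\x20-\x7e]{6,}", dump)
    is_pe = False
    if dump[:2] == b"MZ" and len(dump) >= 0x40:
        # e_lfanew at offset 0x3C points to the "PE\0\0" signature.
        e_lfanew = int.from_bytes(dump[0x3C:0x40], "little")
        is_pe = dump[e_lfanew:e_lfanew + 4] == b"PE\x00\x00"
    return {"entropy": entropy(dump), "strings": len(strings), "is_pe": is_pe}


def another_layer_likely(dump: bytes,
                         max_entropy: float = 6.8,
                         min_strings: int = 20) -> bool:
    """Heuristic only: high entropy or very few strings suggests more packing."""
    t = triage(dump)
    return t["entropy"] > max_entropy or t["strings"] < min_strings
```

A dump that passes this check still deserves a look at its imports before it is treated as the payload, since an encrypted core can sit inside an otherwise ordinary-looking PE.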
Detection rule hint: Flag a process that performs more than one write-then-execute transition (a region that is written and later executed, which a strict W^X policy would forbid) into separate freshly allocated regions during startup, each followed by a control transfer. Repeated unpacking events in a single run distinguish a multi-layer packer from a normal loader that unpacks once and proceeds.
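The rule can be prototyped over a recorded memory-event trace. The sketch below assumes a hypothetical, simplified event format (`MemEvent` with `alloc`, `write`, and `exec` kinds keyed by region base address); real telemetry from a sandbox or EDR sensor would need to be mapped onto it first.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemEvent:
    kind: str    # "alloc", "write", or "exec" (control transfer into region)
    region: int  # base address of the affected region


def unpacked_regions(events) -> list:
    """Regions that were freshly allocated, written, and then executed."""
    allocated, written, unpacked = set(), set(), []
    for ev in events:
        if ev.kind == "alloc":
            allocated.add(ev.region)
        elif ev.kind == "write" and ev.region in allocated:
            written.add(ev.region)
        elif ev.kind == "exec" and ev.region in written:
            unpacked.append(ev.region)
            # Count the region again only if it is rewritten before re-execution.
            written.discard(ev.region)
    return unpacked


def is_multi_layer(events, min_transitions: int = 2) -> bool:
    """True when at least min_transitions distinct regions were unpacked."""
    return len(set(unpacked_regions(events))) >= min_transitions


if __name__ == "__main__":
    trace = [
        MemEvent("alloc", 0x10000), MemEvent("write", 0x10000),
        MemEvent("exec", 0x10000),
        MemEvent("alloc", 0x20000), MemEvent("write", 0x20000),
        MemEvent("exec", 0x20000),
    ]
    print(is_multi_layer(trace))
```

Counting distinct regions rather than raw transitions keeps a loader that repeatedly patches and re-enters one buffer from being scored as multi-layer. Layers that unpack in place would need a separate rule.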