Multi-Layer Packing

Multi-layer packing nests one packing scheme inside another so that unpacking is not a single event but a chain. The outer stub decompresses or decrypts a buffer that is itself a packed image; running it produces a second stub, which yields a third, and only the innermost layer is the real payload. Authors deliberately mix techniques — a UPX-style compressor wrapping a custom XOR stub wrapping an aPLib-compressed core — so no single tool or signature unwinds the whole stack.

Each layer commonly reuses the same primitives — VirtualAlloc, a decode loop, and a tail jump to a per-layer OEP — but with different keys, algorithms, and section layouts. Families like Emotet, Smoke Loader, and GuLoader have all shipped samples where two or three distinct unpacking stages must be peeled in order before any meaningful code or strings are recoverable.

How it works

Conceptually each stage allocates a buffer, materialises the next stage into it, and transfers control; the new stage repeats with its own scheme:

file on disk
  └─ Layer 0 stub (compressor, e.g. UPX)
        decompress -> buffer A  ───────────► jmp OEP_A
  └─ Layer 1 stub (custom XOR loader)
        VirtualAlloc + xor-decrypt -> buf B ► call OEP_B
  └─ Layer 2 stub (aPLib depacker)
        depack -> buffer C ─────────────────► jmp OEP_C
  └─ Layer 3 = real payload PE
        resolve IAT, run

each ─► is a distinct OEP; the dumped buffer of stage N
is the *packed* input of stage N+1, not the final code

The defining property is recursion: dumping after the first OEP gives you another packed object, not the payload. Stages may differ in entropy profile (compressed vs. encrypted), in which APIs they call, and in whether they run in-place or in a fresh allocation, which is why a uniform automated unpacker tends to stall on the second layer.
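The recursive property can be illustrated with a toy model. The sketch below is not a real packer: the 4-byte tags (`ZL01`, `XR01`), the header layout, and the key stored next to the data are all invented for illustration, and zlib stands in for whatever compressor a real stub would use. It only aims to show that peeling one layer yields another packed object, so the unpacker has to loop.

```python
import zlib

# Hypothetical per-layer tags, invented for this sketch (not real packer magics).
MAGIC_ZLIB = b"ZL01"
MAGIC_XOR = b"XR01"


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR data with a repeating key (the same routine encodes and decodes)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


def pack_zlib(data: bytes) -> bytes:
    return MAGIC_ZLIB + zlib.compress(data)


def pack_xor(data: bytes, key: bytes) -> bytes:
    # Toy header: tag, 1-byte key length, key, then the encoded body.
    # A real stub would keep the key in its own code, not beside the data.
    return MAGIC_XOR + bytes([len(key)]) + key + xor_bytes(data, key)


def unpack_one(blob: bytes):
    """Peel exactly one layer; return None when no known layer tag is present."""
    if blob.startswith(MAGIC_ZLIB):
        return zlib.decompress(blob[4:])
    if blob.startswith(MAGIC_XOR):
        key_len = blob[4]
        key = blob[5:5 + key_len]
        return xor_bytes(blob[5 + key_len:], key)
    return None


def unpack_all(blob: bytes, max_layers: int = 8):
    """Peel layers until nothing recognisable remains; return (payload, depth)."""
    depth = 0
    while depth < max_layers:
        inner = unpack_one(blob)
        if inner is None:
            break
        blob, depth = inner, depth + 1
    return blob, depth


payload = b"MZ-this-stands-in-for-the-real-payload"
packed = pack_zlib(pack_xor(pack_zlib(payload), b"\x5a\xc3"))

first = unpack_one(packed)          # one unpack: still a packed object
final, depth = unpack_all(packed)   # loop until the innermost layer
print(first[:4], depth, final == payload)
```

A single call to `unpack_one` returns the XOR-tagged layer rather than the payload, which mirrors what an analyst sees when dumping at the first OEP.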

Detection & analysis

Static analysis: Expect to identify only the outermost layer statically: its section names, entropy, and import table. The inner layers stay opaque blobs until the layer above them has been unpacked. A clue to depth is multiple distinct high-entropy regions, or a stub whose output buffer still carries packer markers (UPX section names, an AP32 header, a second decrypt loop). Treat the IAT and OEP you recover as per-stage, not final.
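The two static clues, separate high-entropy regions and leftover packer markers, can be checked with a few lines of standard-library Python. This is a rough sketch: the 1024-byte window and the 7.0 bits-per-byte threshold are arbitrary assumptions, the marker list is deliberately short, and the sample buffer is synthetic rather than a real binary.

```python
import math
import random
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    n = len(data)
    return -sum(c / n * math.log2(c / n) for c in Counter(data).values())


def high_entropy_regions(buf: bytes, window: int = 1024, threshold: float = 7.0):
    """Return merged (start, end) ranges whose fixed-size windows exceed threshold."""
    regions = []
    for off in range(0, len(buf), window):
        chunk = buf[off:off + window]
        if len(chunk) < window // 2:
            continue  # a short tail gives an unreliable estimate
        if shannon_entropy(chunk) >= threshold:
            if regions and regions[-1][1] == off:
                regions[-1] = (regions[-1][0], off + len(chunk))  # extend the run
            else:
                regions.append((off, off + len(chunk)))
    return regions


# Byte markers: UPX section names, the UPX magic, and the aPLib safe-pack tag.
MARKERS = (b"UPX0", b"UPX1", b"UPX!", b"AP32")


def find_markers(buf: bytes) -> dict:
    """Map each marker present in buf to the offset of its first occurrence."""
    return {m: buf.find(m) for m in MARKERS if m in buf}


# Synthetic sample: padding, one random blob, padding, an AP32-tagged blob.
rng = random.Random(0)
noise = lambda n: bytes(rng.getrandbits(8) for _ in range(n))
sample = bytes(4096) + noise(4096) + bytes(4096) + b"AP32" + noise(4092)

regions = high_entropy_regions(sample)
markers = find_markers(sample)
print(regions, markers)
```

Two separate high-entropy ranges plus a marker inside the second suggest more than one layer, though padding, embedded resources, or a chance byte match can produce the same picture, so the result is a hint rather than proof.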

Dynamic analysis: Unpack iteratively. Set breakpoints on each allocation or protection change (VirtualAlloc/VirtualProtect, mmap/mprotect) and on the tail control transfer. At each OEP, dump the new region and re-examine it as a fresh sample, re-running entropy, header, and import checks to decide whether another layer remains. Tools like pe-sieve and Scylla can dump and rebuild a single layer, but you repeat the cycle until the dumped object has a normal entropy profile, a real import table, and readable strings; that object is the final payload.
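One of the stop criteria above, readable strings, is easy to approximate on a dumped buffer. The sketch below measures what fraction of a dump is covered by printable ASCII runs. The function names, the 6-character minimum, and the 5% cut-off are assumptions chosen for illustration, and the "dumps" are synthetic byte strings, not output from a debugger.

```python
import re


def string_density(buf: bytes, min_len: int = 6) -> float:
    """Fraction of the buffer covered by printable ASCII runs of min_len or more."""
    if not buf:
        return 0.0
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    covered = sum(len(m.group()) for m in pattern.finditer(buf))
    return covered / len(buf)


def looks_final(buf: bytes, min_density: float = 0.05) -> bool:
    """Heuristic only: a dump with readable strings is more likely the payload."""
    return string_density(buf) >= min_density


# Synthetic "final" dump: import-style names mixed with non-printable bytes.
unit = b"kernel32.dll\x00LoadLibraryA\x00GetProcAddress\x00" + bytes(range(32))
plain_dump = unit * 40

# Synthetic "still packed" dump: the same bytes under a single-byte XOR.
packed_dump = bytes(b ^ 0xA5 for b in plain_dump)

print(round(string_density(plain_dump), 3), looks_final(plain_dump))
print(round(string_density(packed_dump), 3), looks_final(packed_dump))
```

In practice this check would sit alongside the entropy and import-table checks rather than replace them, since a packed layer can carry decoy strings and a genuine payload can encrypt its own.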

Detection rule hint: Flag a process that performs more than one write-then-execute transition (memory that is written and then run, the pattern a W^X policy forbids) into separate freshly allocated regions during startup, each followed by a control transfer. Repeated unpacking events in a single run distinguish a multi-layer packer from a normal loader that unpacks once and proceeds.
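The rule can be prototyped over a simplified event trace. The `Event` record, its `kind` values, and the traces below are hypothetical; a real sensor would derive them from API hooks or memory-permission telemetry, and would also need to bound the observation window to process startup.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str    # "alloc", "write", or "exec" (hypothetical event vocabulary)
    region: int  # base address of the affected memory region


def unpack_events(events):
    """Return freshly allocated regions that were written and then executed."""
    fresh = set()     # regions allocated at runtime
    written = set()   # fresh regions written since their allocation
    unpacked = []     # distinct regions seen executing after a write, in order
    for ev in events:
        if ev.kind == "alloc":
            fresh.add(ev.region)
            written.discard(ev.region)
        elif ev.kind == "write" and ev.region in fresh:
            written.add(ev.region)
        elif ev.kind == "exec" and ev.region in written and ev.region not in unpacked:
            unpacked.append(ev.region)
    return unpacked


def is_multilayer(events, min_events: int = 2) -> bool:
    """Flag a trace with more than one write-then-execute into separate regions."""
    return len(unpack_events(events)) >= min_events


def stage(base: int):
    """Events produced by one unpacking stage into a fresh region."""
    return [Event("alloc", base), Event("write", base), Event("exec", base)]


# Execution of the on-disk image (0x400000) is not a fresh region and is ignored.
single_stage = [Event("exec", 0x400000)] + stage(0x10000)
three_stage = [Event("exec", 0x400000)] + stage(0x10000) + stage(0x20000) + stage(0x30000)

print(is_multilayer(single_stage), is_multilayer(three_stage))
```

JIT compilers and some legitimate loaders also write and then execute fresh memory more than once, so a rule like this is better used as one signal among several than as a verdict on its own.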
