Getting Started with Malware Analysis

A beginner's guide to malware analysis: the four analysis types, building a safe lab, static and dynamic triage, and a learning path.

Malware analysis is the practice of taking a suspicious file apart to understand what it does, how it does it, and how to detect and stop it. It is a defensive discipline: analysts dissect threats so that incident responders, detection engineers, and threat-intelligence teams can protect real systems. This guide is an entry point for beginners who want to learn how to analyze malware safely — not how to write it.

What malware analysis actually answers

When a sample lands on an analyst's desk, the goal is to answer a handful of practical questions:

  • What is this file, and is it actually malicious?
  • What does it do when it runs — what does it touch, change, or steal?
  • How does it persist, hide, or spread?
  • What indicators of compromise (IOCs) can we extract to detect it elsewhere?

Everything else — the tooling, the lab, the methodology — exists to answer those questions reliably and without getting infected in the process.

The four types of analysis

Most workflows blend four complementary approaches. You rarely use just one.

Static analysis

Examining the file without executing it: file type, hashes, embedded strings, headers, imports, and resources. It is fast, low-risk, and the natural first step.

Dynamic analysis

Running the sample in a controlled, instrumented environment and watching its behavior — processes spawned, files written, registry keys set, and network connections made.

Code analysis

Deep reverse engineering of the binary itself. Static code analysis means reading disassembly or decompiled output; dynamic code analysis means stepping through the program in a debugger to watch logic unfold. This is where you defeat tricks documented under anti-analysis techniques.

Behavioral analysis

Stepping back to characterize intent: is this a downloader, a banker, ransomware, a backdoor? Behavioral analysis correlates the raw observations from the other three into a story and a family attribution.

Build a safe, isolated lab first

This is the single most important section. Never analyze malware on a machine you care about. Build the lab before you touch a single sample.

Use virtual machines with snapshots

Run a guest operating system (commonly Windows, since most malware targets it) inside a hypervisor such as VirtualBox or VMware. Take a clean snapshot once the guest is configured, and revert to it after every run so each sample starts from a pristine state. Snapshots are your undo button, so use them generously.
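
The snapshot cycle can be scripted from the host. The following is a minimal sketch that assumes VirtualBox with its `VBoxManage` CLI on the PATH; the VM name and snapshot name in the usage note are hypothetical placeholders, and VMware users would need a different tool. The helper only builds the command lists, so nothing touches a VM until you pass one to `run`.

```python
import subprocess

VBOXMANAGE = "VBoxManage"  # assumption: VirtualBox's CLI is installed and on PATH


def snapshot_commands(vm_name: str, snapshot_name: str) -> dict[str, list[str]]:
    """Build (but do not run) the VBoxManage commands for a take/restore cycle."""
    return {
        # Record the clean state of the guest.
        "take": [VBOXMANAGE, "snapshot", vm_name, "take", snapshot_name],
        # A running VM must be stopped before a snapshot can be restored.
        "poweroff": [VBOXMANAGE, "controlvm", vm_name, "poweroff"],
        # Roll the guest back to the clean state.
        "restore": [VBOXMANAGE, "snapshot", vm_name, "restore", snapshot_name],
    }


def run(command: list[str]) -> None:
    """Run one command on the host and raise if VBoxManage reports an error."""
    subprocess.run(command, check=True)
```

For example, `cmds = snapshot_commands("analysis-win10", "clean-baseline")` followed by `run(cmds["poweroff"])` and `run(cmds["restore"])` would revert a guest after a detonation, assuming a VM and snapshot with those names exist.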

Isolate the network

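Assume every sample will try to phone home. Configure an isolated virtual network so the guest cannot reach your real LAN or the internet. Note that a host-only network still lets the guest talk to the host's virtual adapter, so either firewall that interface or use an internal, guest-only network with a second VM providing simulated services. Tools like INetSim or FakeDNS can simulate internet services so the sample behaves naturally while staying contained.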
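
Before detonating anything, it is worth confirming from inside the guest that the isolation actually holds. The sketch below is one simple way to do that with plain TCP connection attempts; it is not a complete audit (it ignores UDP, ICMP, and IPv6), and the target list is something you supply for your own environment. If a service simulator such as INetSim is running, connections to it will succeed by design, so run the check with the simulator stopped or test raw IP addresses you know are outside the lab.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, unreachable networks,
        # and failed DNS lookups.
        return False


def leaks(targets: list[tuple[str, int]]) -> list[tuple[str, int]]:
    """Return the targets that were reachable, i.e. places the guest should NOT reach."""
    return [target for target in targets if can_connect(*target)]
```

A reasonable target list would include your LAN gateway, the host's address, and a public IP, all paired with commonly open ports. An empty result from `leaks(...)` is what you want to see.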

Use purpose-built distributions

Two free toolkits save weeks of setup:

  • FLARE-VM — a Windows-based collection of analysis tools (disassemblers, debuggers, PE viewers, monitors).
  • REMnux — a Linux distribution packed with static, network, and document-analysis utilities.

Harden the lab further: disable shared folders and clipboard sharing, remove guest additions if they widen the attack surface, and never sign into personal accounts inside the VM.

Basic static triage

With the lab ready, start every sample with static triage. It is safe and frequently tells you most of what you need.

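  1. Hash the file. Compute MD5, SHA-1, and SHA-256. Hashes are the most widely shared IOC and let you check threat-intel databases for prior sightings. Prefer SHA-256 as the primary identifier, since MD5 and SHA-1 have known collision attacks, but record all three because many feeds still index on the older hashes.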
  2. Identify the file type. Don't trust the extension. Confirm whether it is a Windows PE, an ELF, a script, or a document.
  3. Pull the strings. Extract printable ASCII and Unicode strings. URLs, IP addresses, registry paths, mutex names, and error messages often leak the sample's purpose.
  4. Inspect the PE headers. Compile timestamp, sections, and entropy hint at the sample's nature. High entropy in a section often signals compression or encryption — a classic sign of packing. A UPX-packed binary, for instance, is a common and quickly reversible case covered in UPX packing.
  5. Read the imports. The Windows APIs a binary imports are a roadmap to its capabilities — CreateRemoteThread and WriteProcessMemory suggest injection, RegSetValueEx suggests persistence, and network APIs suggest command-and-control.
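
Steps 1 to 3 can be sketched with the Python standard library alone. This is an illustrative starting point, not a replacement for the dedicated tools in FLARE-VM or REMnux: the magic-byte table is a small hand-picked subset, an `MZ` header only suggests a PE file rather than proving it, and the function names are made up for this example. The code never executes the sample; it only reads its bytes.

```python
import hashlib
import re
from pathlib import Path

# A few well-known leading byte signatures. Deliberately incomplete.
MAGIC = [
    (b"MZ", "MZ executable (likely Windows PE)"),
    (b"\x7fELF", "ELF"),
    (b"%PDF", "PDF document"),
    (b"PK\x03\x04", "ZIP container (includes OOXML Office files)"),
    (b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1", "OLE2 compound file (legacy Office)"),
    (b"#!", "script with shebang"),
]


def file_hashes(data: bytes) -> dict[str, str]:
    """Step 1: compute MD5, SHA-1, and SHA-256 of the raw bytes."""
    return {name: hashlib.new(name, data).hexdigest()
            for name in ("md5", "sha1", "sha256")}


def guess_type(data: bytes) -> str:
    """Step 2: identify the file by its leading bytes, not its extension."""
    for magic, label in MAGIC:
        if data.startswith(magic):
            return label
    return "unknown"


def extract_strings(data: bytes, min_len: int = 5) -> list[str]:
    """Step 3: pull printable ASCII and UTF-16LE ("Unicode") strings."""
    ascii_re = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    wide_re = re.compile(rb"(?:[\x20-\x7e]\x00){%d,}" % min_len)
    found = [m.group().decode("ascii") for m in ascii_re.finditer(data)]
    found += [m.group().decode("utf-16-le") for m in wide_re.finditer(data)]
    return found


def triage(path: str) -> dict:
    """Run all three steps on a file without executing it."""
    data = Path(path).read_bytes()
    return {
        "hashes": file_hashes(data),
        "type": guess_type(data),
        "strings": extract_strings(data),
    }
```

Calling `triage(path)` on a sample inside the lab returns a dictionary you can save alongside your notes. Reading imports (step 5) needs a real PE parser, which is beyond a standard-library sketch.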

A short import table on a large file is itself a red flag: it usually means the real code is packed and only unfolds at runtime.
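
The entropy signal mentioned in step 4 is straightforward to compute. The sketch below uses Shannon entropy in bits per byte, which ranges from 0 for constant data to 8 for uniformly random bytes. As a simplification, it scans fixed-size windows of the raw file rather than parsing PE section boundaries, and the 4096-byte window is an arbitrary choice. Values close to 8 are commonly treated as a hint of compression or encryption, but that is a rule of thumb, not proof of packing.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy in bits per byte: 0.0 for constant data, 8.0 for uniform bytes."""
    if not data:
        return 0.0
    total = len(data)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(data).values())


def windowed_entropy(data: bytes, window: int = 4096) -> list[float]:
    """Entropy of each consecutive window, to spot high-entropy regions in a file."""
    return [shannon_entropy(data[i:i + window])
            for i in range(0, len(data), window)]
```

A file whose windows are mostly near the top of the range, combined with a very short import table, is a good candidate for unpacking before further static work.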

Basic dynamic triage

When static analysis stalls — typically because the sample is packed or obfuscated — detonate it in the lab and watch. Snapshot first, then monitor:

  • Processes. Watch for spawned children, injected threads, and hollowed-out legitimate processes. Hiding code inside a trusted process is a staple technique; see process hollowing.
  • File system. Note dropped files, especially in temp and startup locations.
  • Registry. Track new keys and modified run keys — the backbone of most persistence mechanisms.
  • Network. Capture DNS lookups, HTTP requests, and raw connections to reveal C2 infrastructure and download URLs.

Run the sample, let it act for a few minutes, collect your observations, then revert the snapshot. Repeat with different conditions (fake internet on/off, different privileges) to coax out more behavior.
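
The file-system part of this loop can be approximated with a before-and-after comparison. The sketch below hashes every file under a directory, then reports what was created, deleted, or modified between two snapshots of that state. It is a coarse illustration under simple assumptions: it reads whole files into memory, it only sees what is on disk at the moment of each scan, and it will miss files that are dropped and removed in between. Dedicated monitors record events as they happen and remain the better tool; the function names here are invented for the example.

```python
import hashlib
import os


def snapshot_tree(root: str) -> dict[str, str]:
    """Map each file's path (relative to root) to the SHA-256 of its contents."""
    state = {}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            full = os.path.join(dirpath, name)
            try:
                with open(full, "rb") as fh:
                    digest = hashlib.sha256(fh.read()).hexdigest()
            except OSError:
                digest = "unreadable"  # locked or permission-denied files
            state[os.path.relpath(full, root)] = digest
    return state


def diff_trees(before: dict[str, str], after: dict[str, str]) -> dict[str, list[str]]:
    """Compare two snapshots and list created, deleted, and modified paths."""
    return {
        "created": sorted(set(after) - set(before)),
        "deleted": sorted(set(before) - set(after)),
        "modified": sorted(p for p in before.keys() & after.keys()
                           if before[p] != after[p]),
    }
```

In practice you would call `snapshot_tree` on directories of interest (temp and startup locations, for instance) before detonation, call it again afterwards, and pass both results to `diff_trees` before reverting the VM.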

Staying safe and ethical

  • Contain everything. Assume any sample can escape a misconfigured lab. Default to no internet and revert after every run.
  • Never run samples on production or personal hardware. Not even "just this once."
  • Obtain samples legally and handle them only for defensive, research, or educational purposes.
  • Respect the law and your policy. Do not distribute live malware or test it against systems you don't own.
  • Document as you go. Reproducibility is the difference between a hunch and an analysis.

A realistic learning path

You don't have to learn everything at once. A sensible order:

  1. Fundamentals. Learn how operating systems, processes, memory, and the Windows API work.
  2. Static triage. Get fluent with hashing, strings, and PE inspection. Build the muscle of forming a hypothesis before running anything.
  3. Lab and dynamic triage. Stand up FLARE-VM and REMnux, master snapshots and isolated networking, and practice safe detonation.
  4. Assembly and code analysis. Learn x86/x64 assembly and a disassembler, then graduate to debugging through packers and anti-analysis tricks.
  5. Automation and intel. Script repetitive triage and learn to write detection rules from your findings.
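
As a taste of what step 5 looks like, here is a small sketch that pulls candidate network IOCs out of a list of extracted strings and defangs them for safe sharing in reports. The regular expressions are intentionally simple and will produce both false positives and misses; the function names and the `hxxp` / `[.]` defanging style are conventions chosen for this example rather than a standard.

```python
import ipaddress
import re

URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)
IPV4_RE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_iocs(strings: list[str]) -> dict[str, list[str]]:
    """Collect unique URLs and valid IPv4 addresses from extracted strings."""
    urls, ips = set(), set()
    for text in strings:
        urls.update(URL_RE.findall(text))
        for candidate in IPV4_RE.findall(text):
            try:
                ipaddress.IPv4Address(candidate)  # rejects e.g. 999.1.1.1
            except ValueError:
                continue
            ips.add(candidate)
    return {"urls": sorted(urls), "ipv4": sorted(ips)}


def defang(indicator: str) -> str:
    """Make an indicator non-clickable so it is safe to paste into a report."""
    return indicator.replace("http", "hxxp", 1).replace(".", "[.]")
```

Feeding this the output of a strings pass gives you a first draft of an IOC list; each entry still needs a human to decide whether it is really attacker infrastructure or just a benign library URL.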

Practice on sanctioned training samples and CTF-style challenges before touching anything live. Browse the full reverse-engineering techniques library to see how specific tricks work, and keep the glossary open for the terms you'll meet along the way.

Start analyzing

Malware analysis rewards patience, curiosity, and discipline about safety. Set up your isolated lab, run your first static triage on a known-safe sample, and work your way up. Explore the techniques library to deepen each skill — one safely-contained sample at a time.
