Tags: yara, Detection Engineering, malware-analysis, threat-hunting

YARA Rules Tutorial: Writing Detection Rules from Scratch

DFIR Lab/April 25, 2026/11 min read

If you work in a SOC or do malware analysis, you have almost certainly heard of YARA. It is one of the most widely used tools in detection engineering — simple enough to write rules in minutes, powerful enough to underpin enterprise-grade threat hunting pipelines. This tutorial walks you through everything you need to get started: installation, rule anatomy, string types, conditions, and real-world examples you can run today.


What Is YARA and Why Does It Matter?

YARA is an open-source pattern-matching tool designed for identifying and classifying files based on textual or binary patterns. Originally developed by Víctor Manuel Álvarez at VirusTotal, it is now maintained as a standalone open-source project. YARA rules describe characteristics of a file — strings, byte sequences, regular expressions — and a condition that determines whether a file matches.

Where a traditional antivirus engine relies on signature hashes, YARA gives you expressive, human-readable rules that match on content. A hash changes the moment a threat actor recompiles their binary. A well-written YARA rule targeting a unique decryption stub or command string will survive that recompile.

YARA is used across the industry for:

  • Malware triage and classification
  • Incident response sweeps across endpoints
  • Threat intelligence enrichment
  • Detection-as-Code pipelines in SIEM and EDR platforms

For a broader introduction to the detection landscape, see our wiki on Detection-as-Code and Malware Analysis.


Installing YARA

YARA is available on all major platforms. Choose the method that fits your environment.

macOS (Homebrew)

bash
brew install yara

Debian/Ubuntu

bash
sudo apt install yara

Python bindings (yara-python)

If you want to integrate YARA into scripts or automate scanning:

bash
pip install yara-python

Verify the installation:

bash
yara --version

As of this writing, the current stable release is YARA 4.x. The Python bindings follow the same version line.


YARA Rule Anatomy

Every YARA rule follows the same structure. Understanding each block is the foundation of everything that follows.

yara
rule RuleName
{
    meta:
        author = "Analyst Name"
        description = "What this rule detects"
        date = "2026-04-10"
        reference = "https://example.com/report"

    strings:
        $string1 = "suspicious text"
        $bytes1 = { 4D 5A 90 00 }
        $regex1 = /https?:\/\/[a-z0-9]{8}\.example\.com/

    condition:
        any of them
}

rule RuleName — The rule identifier. Use a consistent naming convention. Many teams prefix with a category: MAL_, SUSP_, HUNT_.

meta — Free-form key-value metadata. Not used in matching, but critical for operationalizing rules. Include author, description, creation date, and a reference URL whenever possible.

strings — The patterns you want to find. Three types: plain text strings, hex byte sequences, and regular expressions. Each string gets a variable name prefixed with $.

condition — A boolean expression that determines whether the file matches. Conditions can reference individual strings, counts, file size, entry point offset, and more.

See our YARA Rules wiki page for a reference sheet you can bookmark.


Writing Your First Rule

Start simple. The goal of your first rule is to match a file that contains a known suspicious string.

Suppose you are investigating a sample that runs a hardcoded reconnaissance command. You extract the string cmd.exe /c whoami from the binary. Here is a minimal rule:

yara
rule Detect_Whoami_Execution
{
    meta:
        description = "Detects binaries containing a hardcoded whoami command"
        author = "DFIR Lab"
        date = "2026-04-10"

    strings:
        $cmd = "cmd.exe /c whoami" nocase

    condition:
        $cmd
}

The nocase modifier makes the match case-insensitive. If the string appears anywhere in the file, the rule fires.

Test it against a directory:

bash
yara rule.yar /path/to/samples/
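
The same check can be scripted with the yara-python bindings. The following is a minimal sketch, assuming yara-python is installed; the helper name scan_bytes is hypothetical and chosen for illustration. It compiles the rule from a string and matches it against an in-memory buffer instead of a file on disk.

```python
# Requires the third-party yara-python package (pip install yara-python).
RULE_SOURCE = r'''
rule Detect_Whoami_Execution
{
    strings:
        $cmd = "cmd.exe /c whoami" nocase
    condition:
        $cmd
}
'''


def scan_bytes(data: bytes) -> list:
    """Return the names of the rules in RULE_SOURCE that match `data`.

    scan_bytes is a hypothetical helper, not part of yara-python.
    """
    import yara  # imported lazily so this module loads without yara-python

    rules = yara.compile(source=RULE_SOURCE)
    return [match.rule for match in rules.match(data=data)]
```

Calling scan_bytes(b"run CMD.EXE /C WHOAMI now") should report the rule, because nocase ignores letter case. To scan files on disk instead, yara-python also accepts a path as the first argument to match().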

String Types

YARA supports three types of string definitions. Each has a different use case.

Text Strings

Plain ASCII or wide-character strings. Use modifiers to adjust matching behavior.

yara
strings:
    $ascii = "MalwareLoader"
    $wide = "MalwareLoader" wide          // UTF-16LE encoded
    $both = "MalwareLoader" wide ascii    // match either encoding
    $nocase = "malwareloader" nocase
    $full = "exact match" fullword        // must not be preceded/followed by alphanumeric

wide is important. Many Windows executables store strings in UTF-16LE. If you only check for ASCII and the binary uses wide strings, you will miss the match.
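
To see why an ASCII-only string misses a wide string, it helps to look at the bytes. This short sketch uses only the Python standard library and encodes the same text both ways; the variable names are illustrative.

```python
text = "MalwareLoader"

ascii_bytes = text.encode("ascii")      # one byte per character
wide_bytes = text.encode("utf-16-le")   # two bytes per character, low byte first

print(ascii_bytes.hex(" "))
print(wide_bytes.hex(" "))

# The ASCII byte sequence never appears contiguously inside the wide form,
# because every character is followed by a 0x00 byte.
print(ascii_bytes in wide_bytes)  # False
```

This is why a string without the wide modifier will not match a UTF-16LE copy of the same text.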

Hex Strings

Hex strings let you match raw byte sequences, including wildcards and jumps.

yara
strings:
    // Match the MZ header of a PE file
    $mz = { 4D 5A }

    // Wildcard: match any single byte with ??
    $pattern = { E8 ?? ?? ?? ?? 83 C4 04 }

    // Jump: match a sequence with a variable-length gap
    $jump = { 4D 5A [2-10] 50 45 00 00 }

Hex strings are ideal when you identify a unique instruction sequence or a hardcoded byte pattern in a disassembler.
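
One way to build intuition for wildcards and jumps is to translate a hex string into an equivalent byte-level regular expression. The sketch below is a teaching aid, not how YARA works internally: the function yara_hex_to_regex is hypothetical, and it handles only full bytes, ?? wildcards, and bounded [n-m] jumps (no nibble wildcards, alternatives, or unbounded jumps).

```python
import re


def yara_hex_to_regex(hex_string: str) -> re.Pattern:
    """Translate a simplified YARA hex string into a compiled bytes regex."""
    parts = []
    for token in hex_string.split():
        if token == "??":
            # Wildcard: any single byte
            parts.append(b".")
        elif token.startswith("[") and token.endswith("]"):
            # Bounded jump [n-m]: between n and m arbitrary bytes
            low, high = token[1:-1].split("-")
            parts.append(b".{%d,%d}" % (int(low), int(high)))
        else:
            # Literal byte such as E8 or 4D
            parts.append(re.escape(bytes.fromhex(token)))
    # DOTALL so that "." also matches the newline byte 0x0A
    return re.compile(b"".join(parts), re.DOTALL)


call_pattern = yara_hex_to_regex("E8 ?? ?? ?? ?? 83 C4 04")
code = bytes.fromhex("90 E8 10 20 30 40 83 C4 04 C3")
print(call_pattern.search(code).start())  # 1
```

The four wildcard bytes absorb the call's relative offset, which changes between builds, while the fixed bytes around it stay constant.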

Regular Expressions

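YARA regular expressions are wrapped in forward slashes. YARA uses its own regex engine, which implements most PCRE features but not all of them; backreferences and POSIX character classes, for example, are not supported.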

yara
strings:
    // Match a URL with a randomly generated 6-12 character subdomain
    $c2_url = /https?:\/\/[a-z0-9]{6,12}\.example\.com\/[a-z]{4}/

    // Match base64-encoded content (broad)
    $b64 = /[A-Za-z0-9+\/]{40,}={0,2}/

    // Match an IPv4 address embedded in a string
    $ipv4 = /\b(25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})){3}\b/

Regular expressions are the most expressive but also the slowest. Use them when text strings and hex patterns are insufficient, and be specific — overly broad regex patterns generate false positives and slow down scans.
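
Before putting a regex into a rule, it can be worth checking it against a few positive and negative inputs. The sketch below does this with Python's re module, which is an assumption of convenience: Python's engine is not YARA's, so this only sanity-checks simple patterns like the C2 URL example. The sample inputs are made up for illustration.

```python
import re

# Same pattern as the C2 URL example, written as a Python bytes regex.
C2_URL = re.compile(rb"https?://[a-z0-9]{6,12}\.example\.com/[a-z]{4}")

candidates = [
    b"GET http://a1b2c3d4.example.com/ping HTTP/1.1",  # 8-char subdomain, 4-letter path
    b"https://www.example.com/index.html",             # subdomain shorter than 6 chars
    b"http://a1b2c3d4.example.org/ping",               # different top-level domain
]

results = [bool(C2_URL.search(candidate)) for candidate in candidates]
print(results)  # [True, False, False]
```

Only the first input matches; the length bounds on the subdomain and the fixed domain keep the pattern from firing on ordinary URLs.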


Conditions

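The condition block is where YARA becomes powerful. It is a single boolean expression supporting a rich set of operators and keywords. In the snippets below, each line is a separate example expression rather than part of one combined condition.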

Boolean Logic

yara
condition:
    $string1 and $string2                  // both must match
    $string1 or $string2                   // either must match
    not $string1                           // must not match
    ($string1 or $string2) and $string3

String Counts

yara
condition:
    #string1 >= 3                    // $string1 appears at least 3 times
    #string1 == 1 and #string2 > 0

File Size

yara
condition:
    filesize < 1MB                         // common for packed/dropper stages
    filesize > 500KB and filesize < 5MB

Any and All

yara
condition:
    any of ($str*)                               // any string matching the wildcard prefix
    all of ($key*)
    2 of ($pattern1, $pattern2, $pattern3)       // at least 2 of the named set
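
The condition forms above can be combined in a single rule. The following sketch, assuming yara-python is installed, joins a file size limit, a "2 of" set, and a string count, and evaluates them against in-memory data. The rule name Demo_Conditions, its strings, and the helper matches are all hypothetical.

```python
# Requires the third-party yara-python package (pip install yara-python).
DEMO_RULE = r'''
rule Demo_Conditions
{
    strings:
        $key1 = "alpha"
        $key2 = "bravo"
        $key3 = "charlie"
        $beacon = "ping"
    condition:
        filesize < 1MB and      // size limit
        2 of ($key*) and        // at least two of the $key strings
        #beacon >= 3            // $beacon occurs three or more times
}
'''


def matches(data: bytes) -> bool:
    """Return True if DEMO_RULE matches `data` (hypothetical helper)."""
    import yara  # imported lazily so this module loads without yara-python

    rules = yara.compile(source=DEMO_RULE)
    return bool(rules.match(data=data))
```

With this rule, b"alpha bravo ping ping ping" satisfies every clause, while b"alpha ping ping ping" fails the "2 of" clause and b"alpha bravo ping" fails the count.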

PE Module (Entry Point and Imports)

YARA has a module system. The pe module gives you access to PE header fields.

yara
import "pe"

rule Detect_Suspicious_Import
{
    meta:
        description = "PE file importing VirtualAlloc and CreateRemoteThread"

    condition:
        pe.imports("kernel32.dll", "VirtualAlloc") and
        pe.imports("kernel32.dll", "CreateRemoteThread")
}

Other useful modules include math (entropy calculations) and hash.
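
Entropy here means Shannon entropy over byte values, on a scale from 0 (a single repeated byte) to 8 bits per byte (uniformly distributed bytes, typical of compressed or encrypted data). The sketch below computes that value in plain Python so you can get a feel for the numbers before writing a threshold into a rule; the function name shannon_entropy is illustrative.

```python
import math
from collections import Counter


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of `data` in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


print(shannon_entropy(b"\x00" * 64))       # constant data: zero entropy
print(shannon_entropy(bytes(range(256))))  # every byte value once: maximum entropy
```

Plain text and ordinary code usually fall well below the maximum, which is why a high value is often used as a hint that a file or section is packed or encrypted.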


Real-World Examples

Example 1: Detecting a Known Malware Family by Unique Strings

When you analyze a new sample, look for strings that are unique to that family — hard-coded error messages, mutex names, registry keys, or user-agent strings. Avoid matching strings that appear in legitimate software.

yara
rule MAL_FakeUpdater_Strings
{
    meta:
        description = "Detects FakeUpdater dropper by hardcoded strings"
        author = "DFIR Lab"
        date = "2026-04-10"
        reference = "https://example.com/fakeupdater-analysis"

    strings:
        $mutex = "Global\\FU_MUTEX_2026" wide ascii
        $ua = "Mozilla/5.0 (compatible; Updater/3.1)" nocase
        $registry = "SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run\\FUService" wide ascii
        // Standard DOS header bytes found in most PE files: usable only as a
        // file-type check, never as a family indicator on its own
        $payload = { 4D 5A 90 00 03 00 00 00 04 00 00 00 FF FF }

    condition:
        filesize < 2MB and
        $payload at 0 and
        2 of ($mutex, $ua, $registry)
}

Requiring two of three string matches rather than just one reduces false positives significantly.

Example 2: Detecting Suspicious Office Macros

Malicious Office documents often combine specific VBA keywords with network or execution capabilities.

yara
rule SUSP_Office_Macro_Download_Execute
{
    meta:
        description = "Office document with VBA strings associated with download-and-execute"
        author = "DFIR Lab"
        date = "2026-04-10"

    strings:
        $vba1 = "AutoOpen" nocase
        $vba2 = "Document_Open" nocase
        $shell = "Shell(" nocase
        $wscript = "WScript.Shell" nocase
        $urlmon = "URLDownloadToFile" nocase
        $http = "http://" nocase
        $https = "https://" nocase

    condition:
        filesize < 10MB and
        (1 of ($vba1, $vba2)) and
        (1 of ($shell, $wscript, $urlmon)) and
        (1 of ($http, $https))
}

This rule requires a macro entry point, an execution primitive, and a network indicator — all three must be present.

Example 3: Detecting Base64-Encoded PowerShell Commands

Attackers frequently encode PowerShell payloads to evade simple string detection. The string powershell with -EncodedCommand (or its shortened forms) is a strong indicator.

yara
rule SUSP_PowerShell_EncodedCommand
{
    meta:
        description = "Detects encoded PowerShell command execution"
        author = "DFIR Lab"
        date = "2026-04-10"

    strings:
        $enc1 = "-EncodedCommand" nocase
        $enc2 = "-EnC" nocase
        $enc3 = "-EC " nocase
        $ps = "powershell" wide ascii nocase
        // Base64 of "IEX" (Invoke-Expression) encoded as UTF-16LE.
        // Base64 is case-sensitive, so do not add nocase here.
        $iex_b64 = "SQBFAFgA"

    condition:
        (1 of ($enc1, $enc2, $enc3)) and ($ps or $iex_b64)
}
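
The base64 form of a PowerShell keyword can be derived rather than copied from memory, which avoids subtle typos. This sketch uses only the standard library: -EncodedCommand expects base64 over UTF-16LE text, so the helper (a hypothetical name, b64_utf16le) encodes in that order.

```python
import base64


def b64_utf16le(text: str) -> str:
    """Base64 of `text` encoded as UTF-16LE, as PowerShell -EncodedCommand expects."""
    return base64.b64encode(text.encode("utf-16-le")).decode("ascii")


print(b64_utf16le("IEX"))  # SQBFAFgA

# Base64 is case-sensitive: changing one letter's case changes the decoded text.
print(base64.b64decode("SQBFAFYA").decode("utf-16-le"))  # IEV
```

Note that this exact literal only appears when IEX starts at a byte offset that is a multiple of three in the encoded input, for example at the very beginning of the command. At other offsets the same text produces different base64 characters.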

For a deeper dive into PowerShell-based attack techniques, see our wiki on MITRE ATT&CK and Threat Hunting.


Testing Your Rules

Before deploying a rule in production, validate it against known samples and known-clean files.

Scan a single file:

bash
yara rule.yar suspicious_file.exe

Scan a directory recursively:

bash
yara -r rule.yar /path/to/samples/

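Scan with multiple rule files. List each rule file explicitly; the yara command expects rule files, not a directory of rules (the file names below are placeholders):

bash
yara -r rules/first.yar rules/second.yar /path/to/samples/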

Common flags:

Flag          Purpose
-r            Recurse into subdirectories
-s            Print matching strings
-m            Print metadata
-n            Print only the rules that did not match (useful for negative testing)
--timeout=N   Abort scan after N seconds per file

Test against a clean baseline. A rule that fires on every system32 binary is not useful. To reduce noise, tighten the condition: require more strings, add fullword, or constrain filesize. Keep in mind that nocase widens a match rather than narrowing it.
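
A baseline check can be reduced to two numbers: the fraction of known-malicious samples a rule hits and the fraction of known-clean files it hits. The sketch below is one way to compute that, under the assumption that rules is any object with a match(path) method, which is how compiled yara-python rules behave. The function name hit_rate is hypothetical.

```python
from pathlib import Path


def hit_rate(rules, directory: str) -> float:
    """Fraction of regular files under `directory` matched by `rules`.

    `rules` is assumed to expose match(path) and return a non-empty
    result on a hit, as compiled yara-python rules do.
    """
    files = [path for path in Path(directory).rglob("*") if path.is_file()]
    if not files:
        return 0.0
    hits = sum(1 for path in files if rules.match(str(path)))
    return hits / len(files)
```

With yara-python, usage might look like rules = yara.compile(filepath="rule.yar"), followed by hit_rate(rules, "/path/to/samples/") and hit_rate(rules, "/path/to/clean/"), where the clean directory is a placeholder for your own baseline. A good rule scores close to 1.0 on the first and close to 0.0 on the second.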

The DFIR Lab File Analyzer performs static analysis on uploaded files and surfaces strings, imports, and entropy — useful inputs for writing and validating rules before you run them locally.


YARA vs Sigma

A common question from detection engineers new to the field: should I write YARA rules or Sigma rules?

The answer is both — they solve different problems and are complementary.

YARA operates on files. It inspects the content of a binary, document, script, or memory dump. YARA is the right tool when you have a file artifact and want to classify it or hunt for copies of it across endpoints.

Sigma operates on log events. A Sigma rule matches structured log records — Windows Event Logs, firewall logs, proxy logs, SIEM events. Sigma is the right tool when you want to detect behavior: a process spawning an unusual child, a lateral movement pattern in authentication logs, a DNS query to a known bad domain.

In practice, an investigation uses both. You find a suspicious binary (YARA classifies it), then you pivot to logs to find every host that executed it (Sigma detects the execution behavior). Neither replaces the other.

See our Sigma Rules wiki for an equivalent tutorial on writing Sigma detection rules.


Generating YARA Rules with AI

Writing YARA rules manually from a sample is time-consuming. For triage at scale, AI-assisted rule generation significantly reduces the cycle time from artifact to detection.

DFIR Lab's AI Triage feature generates YARA rules directly from natural language descriptions or from uploaded file analysis. You describe what you want to detect — or paste extracted strings and byte patterns — and the engine produces a syntactically valid, commented rule ready for review and deployment.

Documentation: platform.dfir-lab.ch/docs/ai/detect

AI-generated rules should always be reviewed by a human analyst before production deployment. Treat them as a starting point: validate the string choices, tighten the condition, and test against both malicious and benign samples. The output quality scales with the specificity of your input — the more context you provide, the more precise the rule.


Generate a YARA rule from a description. The DFIR API Playground exposes the AI detect endpoint so you can go from malware sample context to a drafted rule in one call — 10 free calls per week, no signup. Bring a behavior description or a known malware family name and see the generated strings and condition block before writing anything by hand.

Conclusion

YARA is a foundational skill for anyone working in malware analysis, detection engineering, or threat hunting. The learning curve is shallow: once you understand the four-block structure and the three string types, you can write useful rules within an hour. The depth comes from operational experience — learning which strings are unique, how to combine conditions to minimize false positives, and how to integrate rules into automated pipelines.

Start with simple rules on real samples from your own investigations. Build a personal rule library. Version-control it. Over time, it becomes one of the most valuable assets in your detection toolkit.

Ready to get started?

  • Use the DFIR Lab File Analyzer to extract strings and imports from samples before you write your first rule.
  • Generate YARA rules from natural language with AI Triage.
  • Free tier includes 100 credits/month. Use code LAUNCH50 for 50% off your first paid month.

Further reading on the DFIR Lab wiki: YARA Rules · Sigma Rules · Malware Analysis · Detection-as-Code · MITRE ATT&CK · Threat Hunting
