YARA Rules Tutorial: Writing Detection Rules from Scratch

If you work in a SOC or do malware analysis, you have almost certainly heard of YARA. It is one of the most widely used tools in detection engineering — simple enough to write rules in minutes, powerful enough to underpin enterprise-grade threat hunting pipelines. This tutorial walks you through everything you need to get started: installation, rule anatomy, string types, conditions, and real-world examples you can run today.

What Is YARA and Why Does It Matter?

YARA is an open-source pattern-matching tool designed for identifying and classifying files based on textual or binary patterns. Originally developed by Víctor Manuel Álvarez at VirusTotal, it is now maintained as a standalone open-source project. YARA rules describe characteristics of a file — strings, byte sequences, regular expressions — and a condition that determines whether a file matches.

Hash-based signatures match only exact copies of a file: the hash changes the moment a threat actor recompiles the binary. YARA instead gives you expressive, human-readable rules that match on content. A well-written YARA rule targeting a unique decryption stub or command string will usually survive that recompile.
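
The difference between hash matching and content matching can be illustrated with a short Python sketch. The two byte strings and the command string below are made up for the example; they stand in for two builds of the same sample that differ by a single byte.

```python
import hashlib

# Two hypothetical builds of the same sample: identical except for one byte
# (for example, a value that changes after a recompile).
build_a = b"\x4d\x5a\x90\x00" + b"connect_to_c2:/gate.php" + b"\x01"
build_b = b"\x4d\x5a\x90\x00" + b"connect_to_c2:/gate.php" + b"\x02"

# The content pattern a rule would target (hypothetical command string).
pattern = b"connect_to_c2:/gate.php"

hash_a = hashlib.sha256(build_a).hexdigest()
hash_b = hashlib.sha256(build_b).hexdigest()

hashes_match = hash_a == hash_b
pattern_in_both = pattern in build_a and pattern in build_b

print("hashes match:", hashes_match)
print("pattern found in both builds:", pattern_in_both)
```

A one-byte change is enough to produce a different SHA-256 digest, while the substring search still finds the pattern in both builds.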

YARA is used across the industry for:

Malware triage and classification
Incident response sweeps across endpoints
Threat intelligence enrichment
Detection-as-Code pipelines in SIEM and EDR platforms

For a broader introduction to the detection landscape, see our wiki on Detection-as-Code and Malware Analysis.

Installing YARA

YARA is available on all major platforms. Choose the method that fits your environment.

macOS (Homebrew)

bash

brew install yara

Debian/Ubuntu

bash

sudo apt install yara

Python bindings (yara-python)

If you want to integrate YARA into scripts or automate scanning:

bash

pip install yara-python

Verify the installation:

bash

yara --version

As of this writing, the current stable release is YARA 4.x. The Python bindings follow the same version line.

YARA Rule Anatomy

Every YARA rule follows the same structure. Understanding each block is the foundation of everything that follows.

yara

rule RuleName
{
    meta:
        author      = "Analyst Name"
        description = "What this rule detects"
        date        = "2026-04-10"
        reference   = "https://example.com/report"

    strings:
        $string1 = "suspicious text"
        $bytes1  = { 4D 5A 90 00 }
        $regex1  = /https?:\/\/[a-z0-9]{8}\.example\.com/

    condition:
        any of them
}

rule RuleName — The rule identifier. Use a consistent naming convention. Many teams prefix with a category: MAL_, SUSP_, HUNT_.

meta — Free-form key-value metadata. Not used in matching, but critical for operationalizing rules. Include author, description, creation date, and a reference URL whenever possible.

strings — The patterns you want to find. Three types: plain text strings, hex byte sequences, and regular expressions. Each string gets a variable name prefixed with $.

condition — A boolean expression that determines whether the file matches. Conditions can reference individual strings, counts, file size, entry point offset, and more.

See our YARA Rules wiki page for a reference sheet you can bookmark.

Writing Your First Rule

Start simple. The goal of your first rule is to match a file that contains a known suspicious string.

Suppose you are investigating a sample that runs a hardcoded reconnaissance command. You extract the string cmd.exe /c whoami from the binary. Here is a minimal rule:

yara

rule Detect_Whoami_Execution
{
    meta:
        description = "Detects binaries containing a hardcoded whoami command"
        author      = "DFIR Lab"
        date        = "2026-04-10"

    strings:
        $cmd = "cmd.exe /c whoami" nocase

    condition:
        $cmd
}

The nocase modifier makes the match case-insensitive. If the string appears anywhere in the file, the rule fires.

Test it against a directory:

bash

yara rule.yar /path/to/samples/
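
The same rule can also be exercised from a script through the yara-python bindings. The sketch below is a minimal example, assuming yara-python is installed; the helper name scan_bytes and the test buffer are illustrative, and the function returns None when the bindings are missing so the snippet still runs.

```python
RULE_SOURCE = r'''
rule Detect_Whoami_Execution
{
    strings:
        $cmd = "cmd.exe /c whoami" nocase
    condition:
        $cmd
}
'''

def scan_bytes(data):
    """Return the names of matching rules, or None if yara-python is missing."""
    try:
        import yara  # third-party package: pip install yara-python
    except ImportError:
        return None
    rules = yara.compile(source=RULE_SOURCE)
    return [match.rule for match in rules.match(data=data)]

# Mixed case on purpose: the nocase modifier should still allow a match.
result = scan_bytes(b"start CMD.EXE /c WHOAMI end")
print(result)
```

Compiling from a source string keeps the example self-contained; in practice you would more likely pass filepath="rule.yar" to yara.compile and a file path to match.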

String Types

YARA supports three types of string definitions. Each has a different use case.

Text Strings

Plain ASCII or wide-character strings. Use modifiers to adjust matching behavior.

yara

strings:
    $ascii  = "MalwareLoader"
    $wide   = "MalwareLoader" wide          // UTF-16LE encoded
    $both   = "MalwareLoader" wide ascii    // match either encoding
    $nocase = "malwareloader" nocase
    $full   = "exact match" fullword        // must not be preceded/followed by alphanumeric

wide is important. Many Windows executables store strings in UTF-16LE. If you only check for ASCII and the binary uses wide strings, you will miss the match.
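
To see why an ASCII-only string misses a wide string, it helps to look at the bytes. This Python sketch encodes the same text both ways and searches a made-up file body that stores only the UTF-16LE form; the surrounding bytes are arbitrary.

```python
text = "MalwareLoader"

ascii_bytes = text.encode("ascii")
wide_bytes = text.encode("utf-16-le")  # each ASCII character followed by 0x00

# A hypothetical file body that only stores the wide form of the string.
file_body = b"\x00\x01" + wide_bytes + b"\x00\x00"

found_ascii = ascii_bytes in file_body
found_wide = wide_bytes in file_body

print(wide_bytes[:8])
print("ascii search:", found_ascii)
print("wide search:", found_wide)
```

The interleaved zero bytes break the contiguous ASCII sequence, so only the wide search succeeds.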

Hex Strings

Hex strings let you match raw byte sequences, including wildcards and jumps.

yara

strings:
    // Match the MZ header of a PE file
    $mz = { 4D 5A }

    // Wildcard: match any single byte with ??
    $pattern = { E8 ?? ?? ?? ?? 83 C4 04 }

    // Jump: match a sequence with a variable-length gap
    $jump = { 4D 5A [2-10] 50 45 00 00 }

Hex strings are ideal when you identify a unique instruction sequence or a hardcoded byte pattern in a disassembler.
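
One way to build intuition for wildcards and jumps is to translate a hex string into an equivalent byte-level regular expression. The Python sketch below does this for a simplified subset (plain bytes, ?? wildcards, and bounded [n-m] jumps); it is not how YARA itself matches, and the helper name hex_pattern_to_regex is made up for this example.

```python
import re

def hex_pattern_to_regex(pattern: str):
    """Translate a simplified YARA-style hex string into a bytes regex.

    Supports plain bytes (4D), full-byte wildcards (??) and bounded
    jumps ([2-10]). Nibble wildcards, alternatives and unbounded jumps
    are deliberately left out of this sketch.
    """
    parts = []
    for token in pattern.strip("{} ").split():
        if token == "??":
            parts.append(b".")  # any single byte (DOTALL is set below)
        elif token.startswith("[") and token.endswith("]"):
            low, high = token[1:-1].split("-")
            parts.append(b".{%d,%d}" % (int(low), int(high)))
        else:
            parts.append(re.escape(bytes([int(token, 16)])))
    return re.compile(b"".join(parts), re.DOTALL)

call_pattern = hex_pattern_to_regex("{ E8 ?? ?? ?? ?? 83 C4 04 }")
sample = b"\x90\x90\xe8\x10\x20\x30\x40\x83\xc4\x04\xc3"
match = call_pattern.search(sample)
print(match.start() if match else None)
```

Here the four ?? wildcards absorb the call operand, so the pattern keeps matching even when the call target changes between builds.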

Regular Expressions

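YARA uses its own regular expression engine, which implements most Perl-style syntax but leaves out a few features such as capture groups, POSIX character classes, and backreferences. Expressions are wrapped in forward slashes.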

yara

strings:
    // Match a URL with a randomly generated 6-12 character subdomain
    $c2_url = /https?:\/\/[a-z0-9]{6,12}\.example\.com\/[a-z]{4}/

    // Match base64-encoded content (broad)
    $b64 = /[A-Za-z0-9+\/]{40,}={0,2}/

    // Match an IPv4 address embedded in a string
    $ipv4 = /\b(25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})(\.(25[0-5]|2[0-4][0-9]|[01]?[0-9]{1,2})){3}\b/

Regular expressions are the most expressive but also the slowest. Use them when text strings and hex patterns are insufficient, and be specific — overly broad regex patterns generate false positives and slow down scans.

Conditions

The condition block is where YARA becomes powerful. It is a boolean expression supporting a rich set of operators and keywords.

Boolean Logic

yara

condition:
    // Each line below is a separate example; a rule has exactly one condition expression
    $string1 and $string2        // both must match
    $string1 or $string2         // either must match
    not $string1                 // must not match
    ($string1 or $string2) and $string3

String Counts

yara

1condition:
2    #string1 >= 3               // $string1 appears at least 3 times
3    #string1 == 1 and #string2 > 0

File Size

yara

1condition:
2    filesize < 1MB              // common for packed/dropper stages
3    filesize > 500KB and filesize < 5MB

Any and All

yara

1condition:
2    any of ($str*)              // any string matching the wildcard prefix
3    all of ($key*)
4    2 of ($pattern1, $pattern2, $pattern3)   // at least 2 of the named set

PE Module (Entry Point and Imports)

YARA has a module system. The pe module gives you access to PE header fields.

yara

import "pe"

rule Detect_Suspicious_Import
{
    meta:
        description = "PE file importing VirtualAlloc and CreateRemoteThread"

    condition:
        pe.imports("kernel32.dll", "VirtualAlloc") and
        pe.imports("kernel32.dll", "CreateRemoteThread")
}

Other useful modules include math (entropy calculations) and hash.
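
For readers unfamiliar with entropy as a detection signal, the following Python sketch computes Shannon entropy in bits per byte, which is the kind of value the math module's entropy function reports for a data range. The function name and sample inputs are illustrative only.

```python
import math
from collections import Counter

def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    entropy = 0.0
    for count in Counter(data).values():
        p = count / total
        entropy -= p * math.log2(p)
    return entropy

# Repetitive data sits at the bottom of the scale; data in which every
# byte value is equally frequent sits at the top.
low = shannon_entropy(b"A" * 1024)
high = shannon_entropy(bytes(range(256)) * 4)
print(low, high)
```

Packed or encrypted sections tend to score close to 8, which is why a condition such as math.entropy(0, filesize) > 7 is sometimes used as a supporting signal rather than a verdict on its own.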

Real-World Examples

Example 1: Detecting a Known Malware Family by Unique Strings

When you analyze a new sample, look for strings that are unique to that family — hard-coded error messages, mutex names, registry keys, or user-agent strings. Avoid matching strings that appear in legitimate software.

yara

rule MAL_FakeUpdater_Strings
{
    meta:
        description = "Detects FakeUpdater dropper by hardcoded strings"
        author      = "DFIR Lab"
        date        = "2026-04-10"
        reference   = "https://example.com/fakeupdater-analysis"

    strings:
        $mutex    = "Global\\FU_MUTEX_2026" wide ascii
        $ua       = "Mozilla/5.0 (compatible; Updater/3.1)" nocase
        $registry = "SOFTWARE\\Microsoft\\Windows\\CurrentVersion\\Run\\FUService" wide ascii
        // Standard DOS/PE header bytes: present in almost every PE file,
        // so this is only a file-type check, not a family indicator
        $mz_header = { 4D 5A 90 00 03 00 00 00 04 00 00 00 FF FF }

    condition:
        $mz_header at 0 and
        filesize < 2MB and
        2 of ($mutex, $ua, $registry)
}

Requiring two of the three family-specific strings rather than just one reduces false positives. The MZ header bytes appear in almost every PE file, so they serve only as a file-type check anchored at offset 0, never as an alternative to the string matches.

Example 2: Detecting Suspicious Office Macros

Malicious Office documents often combine specific VBA keywords with network or execution capabilities.

yara

rule SUSP_Office_Macro_Download_Execute
{
    meta:
        description = "Office document with VBA strings associated with download-and-execute"
        author      = "DFIR Lab"
        date        = "2026-04-10"

    strings:
        $vba1 = "AutoOpen" nocase
        $vba2 = "Document_Open" nocase
        $shell = "Shell(" nocase
        $wscript = "WScript.Shell" nocase
        $urlmon = "URLDownloadToFile" nocase
        $http = "http://" nocase
        $https = "https://" nocase

    condition:
        filesize < 10MB and
        (1 of ($vba1, $vba2)) and
        (1 of ($shell, $wscript, $urlmon)) and
        (1 of ($http, $https))
}

This rule requires a macro entry point, an execution primitive, and a network indicator — all three must be present.
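
The grouping logic can be checked outside YARA with a rough Python emulation. This is only a sketch of the condition's structure: lowercasing the buffer stands in for nocase, the function name is made up, and the two test inputs are invented macro fragments.

```python
def matches_macro_rule(data: bytes) -> bool:
    """Emulate the three-group condition of the macro rule on raw bytes."""
    haystack = data.lower()  # rough stand-in for the nocase modifier

    entry_points = (b"autoopen", b"document_open")
    exec_primitives = (b"shell(", b"wscript.shell", b"urldownloadtofile")
    network_markers = (b"http://", b"https://")

    def any_present(group):
        return any(marker in haystack for marker in group)

    return (
        len(data) < 10 * 1024 * 1024  # YARA's MB suffix means 2**20 bytes
        and any_present(entry_points)
        and any_present(exec_primitives)
        and any_present(network_markers)
    )

full = b'Sub AutoOpen()\n Shell("cmd /c curl http://example.com/a")\nEnd Sub'
partial = b'Sub AutoOpen()\n MsgBox "hello"\nEnd Sub'

print(matches_macro_rule(full))
print(matches_macro_rule(partial))
```

The second input has a macro entry point but no execution primitive or URL, so it does not satisfy the condition. That is the behavior the grouped 1 of clauses are meant to enforce.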

Example 3: Detecting Base64-Encoded PowerShell Commands

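Attackers frequently encode PowerShell payloads to evade simple string detection. The string powershell combined with -EncodedCommand (or its shortened forms) is a useful hunting indicator, although legitimate management tools use the same flag, so expect to tune this rule.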

yara

rule SUSP_PowerShell_EncodedCommand
{
    meta:
        description = "Detects encoded PowerShell command execution"
        author      = "DFIR Lab"
        date        = "2026-04-10"

    strings:
        $enc1 = "-EncodedCommand" nocase
        $enc2 = "-EnC" nocase
        $enc3 = "-EC " nocase
        $ps   = "powershell" wide ascii nocase
        // Base64 of "IEX" (the Invoke-Expression alias) encoded in UTF-16LE.
        // Base64 is case-sensitive, so this string takes no nocase modifier.
        $iex_b64 = "SQBFAFgA"

    condition:
        (1 of ($enc1, $enc2, $enc3)) and ($ps or $iex_b64)
}
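
The base64 form of a UTF-16LE string is easy to derive rather than copy from memory. The Python sketch below computes it for "IEX"; the helper name and the longer example command are illustrative.

```python
import base64

def utf16le_base64(text: str) -> str:
    """Base64 of the UTF-16LE encoding of text, as used by -EncodedCommand."""
    return base64.b64encode(text.encode("utf-16-le")).decode("ascii")

iex_prefix = utf16le_base64("IEX")
full_command = utf16le_base64("IEX (New-Object Net.WebClient)")

print(iex_prefix)
print(full_command.startswith(iex_prefix))
```

"IEX" occupies six bytes in UTF-16LE, a multiple of three, so its eight base64 characters stay the same whatever follows it. The prefix is only stable when IEX sits at the very start of the encoded command; at other offsets the base64 alignment shifts and the characters differ.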

For a deeper dive into PowerShell-based attack techniques, see our wiki on MITRE ATT&CK and Threat Hunting.

Testing Your Rules

Before deploying a rule in production, validate it against known samples and known-clean files.

Scan a single file:

bash

yara rule.yar suspicious_file.exe

Scan a directory recursively:

bash

yara -r rule.yar /path/to/samples/

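Scan with multiple rule files by listing each file before the target. The yara command takes rule files, not a rules directory; the file names here are placeholders:

bash

yara -r rules/malware.yar rules/office.yar /path/to/samples/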

Common flags:

Flag	Purpose
`-r`	Recurse into subdirectories
`-s`	Print matching strings
`-m`	Print metadata
`-n`	Negate: print only rules that are not satisfied (useful for negative testing)
`--timeout=N`	Abort scanning after N seconds

Test against a clean baseline. A rule that fires on every system32 binary is not useful. To reduce noise, tighten the rule: add fullword to short strings, require more strings to match, or bound filesize. Note that nocase widens a match rather than narrowing it.
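
Baseline testing is easier to repeat when the scan output is summarized per rule. The Python sketch below groups the default "rule name, space, file path" output lines of the yara command by rule; the captured output, rule names, and paths are invented for the example, and the parser assumes no flags such as -s or -m that add extra columns or lines.

```python
from collections import defaultdict

def hits_per_rule(yara_output: str) -> dict:
    """Group file paths by rule name from default yara CLI output.

    Assumes the plain "<rule> <path>" line format, one match per line.
    """
    hits = defaultdict(list)
    for line in yara_output.splitlines():
        line = line.strip()
        if not line:
            continue
        rule, _, path = line.partition(" ")
        hits[rule].append(path)
    return dict(hits)

# Hypothetical output from scanning a directory of known-clean files.
clean_scan = """\
SUSP_Generic_Http /baseline/system32/wininet.dll
SUSP_Generic_Http /baseline/system32/urlmon.dll
SUSP_PowerShell_EncodedCommand /baseline/tools/deploy agent.ps1
"""

baseline_hits = hits_per_rule(clean_scan)
for rule, paths in sorted(baseline_hits.items()):
    print(f"{rule}: {len(paths)} hit(s) on clean files")
```

Any rule that appears in this summary fired on a file you believe to be clean, which makes it a candidate for tightening before deployment. Splitting on the first space keeps paths that contain spaces intact.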

The DFIR Lab File Analyzer performs static analysis on uploaded files and surfaces strings, imports, and entropy — useful inputs for writing and validating rules before you run them locally.

YARA vs Sigma

A common question from detection engineers new to the field: should I write YARA rules or Sigma rules?

The answer is both — they solve different problems and are complementary.

YARA operates on files. It inspects the content of a binary, document, script, or memory dump. YARA is the right tool when you have a file artifact and want to classify it or hunt for copies of it across endpoints.

Sigma operates on log events. A Sigma rule matches structured log records — Windows Event Logs, firewall logs, proxy logs, SIEM events. Sigma is the right tool when you want to detect behavior: a process spawning an unusual child, a lateral movement pattern in authentication logs, a DNS query to a known bad domain.

In practice, an investigation uses both. You find a suspicious binary (YARA classifies it), then you pivot to logs to find every host that executed it (Sigma detects the execution behavior). Neither replaces the other.

See our Sigma Rules wiki for an equivalent tutorial on writing Sigma detection rules.

Generating YARA Rules with AI

Writing YARA rules manually from a sample is time-consuming. For triage at scale, AI-assisted rule generation significantly reduces the cycle time from artifact to detection.

DFIR Lab's AI Triage feature generates YARA rules directly from natural language descriptions or from uploaded file analysis. You describe what you want to detect — or paste extracted strings and byte patterns — and the engine produces a syntactically valid, commented rule ready for review and deployment.

Documentation: platform.dfir-lab.ch/docs/ai/detect

AI-generated rules should always be reviewed by a human analyst before production deployment. Treat them as a starting point: validate the string choices, tighten the condition, and test against both malicious and benign samples. The output quality scales with the specificity of your input — the more context you provide, the more precise the rule.

Generate a YARA rule from a description. The DFIR API Playground exposes the AI detect endpoint so you can go from malware sample context to a drafted rule in one call — 10 free calls per week, no signup. Bring a behavior description or a known malware family name and see the generated strings and condition block before writing anything by hand.

Conclusion

YARA is a foundational skill for anyone working in malware analysis, detection engineering, or threat hunting. The learning curve is shallow: once you understand the four-block structure and the three string types, you can write useful rules within an hour. The depth comes from operational experience — learning which strings are unique, how to combine conditions to minimize false positives, and how to integrate rules into automated pipelines.

Start with simple rules on real samples from your own investigations. Build a personal rule library. Version-control it. Over time, it becomes one of the most valuable assets in your detection toolkit.

Ready to get started?

Use the DFIR Lab File Analyzer to extract strings and imports from samples before you write your first rule.
Generate YARA rules from natural language with AI Triage.
Free tier includes 100 credits/month. Use code LAUNCH50 for 50% off your first paid month.

Further reading on the DFIR Lab wiki: YARA Rules · Sigma Rules · Malware Analysis · Detection-as-Code · MITRE ATT&CK · Threat Hunting
