A deep dive into Sigma rules and how to write your own threat detection rules
Written by Hardik Manocha
Co-founder @ FourCore
Sigma as a Detection Language
In our previous blog post, we covered how Windows Event Log IDs can be utilized for threat hunting, featuring Sigma rules.
Released by Florian Roth in 2017, Sigma (The Generic Signature Format for SIEM Systems) has paved the way for platform-agnostic search. With Sigma, defenders can harness the community's power to react promptly to critical threats and new adversary tradecraft. You get a fixed-language specification for the generic rule format, a tool for converting Sigma rules into various query formats and a repository of over one thousand rules for several attack techniques.
Like YARA and Snort rules, Sigma is a tool for the open sharing and crowdsourcing of threat intelligence, but it focuses on SIEM logs instead of files or network traffic. What Snort is to network traffic, and YARA is to files, Sigma is to logs.
Most attacks on IT systems and networks manifest themselves in event logs stored in SIEM systems or other log storage and analysis solutions. This makes the SIEM a crucial tool to detect and alert against intruders. In the earlier days, SIEM detection rulesets lived in vendor- or platform-specific databases. Today, the growing demand for up-to-date detections and analytics requires sharing detection intelligence between different stakeholders and vendors. Sigma solves this challenge by making queries and rulesets platform-agnostic.
Sigma allows defenders to share detections in a common language.
Sigma satisfies various use cases:
Sigma has become a vendor-agnostic way of sharing detections between researchers and intelligence teams who identify new adversary behaviours.
Security teams can avoid vendor lock-in: by defining rules in Sigma, they can move between platforms more easily.
Sigma can be utilized to crowdsource detection methods and make them usable instantly for everyone.
Sigma can be used to share signatures with other threat intel communities.
Sigma rules can be converted into a search query specific to your SIEM solution. Supported targets include:
ElasticSearch Query Strings and DSL
Microsoft Defender Advanced Threat Protection (MDATP)
PowerShell and Grep
Playing with Sigma
Sigma is an open-source project with three major components:
A language specification for the generic Sigma rule format.
An open repository of Sigma signatures with over one thousand rules for several attacker behaviours and techniques.
sigmac, a conversion utility to generate search queries for different SIEM systems from Sigma rules.
Step 1: Get the repository:
First, download or clone the Sigma repository from GitHub.
git clone https://github.com/SigmaHQ/sigma.git
Step 2: Understanding Sigma Rules
A Sigma rule is written in YAML and defines what to look for in system logs and where to look for it. Every Sigma rule also specifies metadata such as the author of the rule, a unique rule identifier (UUID), MITRE ATT&CK techniques, and references, e.g. a URL for additional information.
title: PwnKit Local Privilege Escalation
description: Detects potential PwnKit exploitation CVE-2021-4034 in auth logs
detection:
    keyword:
        - "pkexec"
        - "The value for environment variable XAUTHORITY contains suscipious content"
        - "[USER=root] [TTY=/dev/pts/0]"
    condition: all of keyword
Each rule (yml) has the following sections:
title: Name of the rule, PwnKit Local Privilege Escalation.
id: UUID to uniquely identify the rule, 0506a799-698b-43b4-85a1-ac4c84c720e9.
status: maturity of the rule, such as stable, test or experimental; in this case it's an experimental rule.
description: explains the context of the rule, Detects potential PwnKit exploitation CVE-2021-4034 in auth logs.
author: metadata about the rule creator Sreeman.
date: creation date for the rule.
references: links to blog posts or tweets explaining the issue.
modified: date on which the rule was modified, any new changes introduced.
level: severity level, one of “informational”, “low”, “medium”, “high” or “critical”.
logsource: used to scope the searches, supporting various combinations, e.g. the Windows Security log channel.
product: the product whose logs the rule applies to, e.g. windows.
service: narrows the search to a subset of that product's logs, e.g. the Security event log on Windows.
detection: the search logic, made up of selectors, keywords, timeframes and conditions that match values in specific fields of the log data. Here we look for the tool 'pkexec'.
condition: how the selections are combined, here all of keyword.
falsepositives: Description field to explain which events or situations might trigger that rule leading to a false positive.
Diving deeper into Sigma
Logsource Field: The logsource field in the Sigma YAML describes the log data to which the detection is meant to be applied. The logsource schema is as follows; it can use one or more of the three attributes category, product, and service:
logsource:
    category [optional]
    product [optional]
    service [optional]
    definition [optional]
Category: used to select all the logs generated by a certain family of products, e.g. firewall and web server logs.
Category examples could include: firewall, antivirus and EDR, web.
Product: used to select all the logs generated by a specific product, e.g. Windows, Oracle, Apache, Zscaler.
Product types can go one step beyond, like Windows EventLog types, e.g. Security, System, AppLocker, Application, and Windows Defender.
Service: used to select only a subset of a product's logs, e.g. the Security event log on Windows systems or polkit logs on Linux.
Detection Field: This section contains a set of search-identifiers representing searches on log data and their respective evaluation mechanism, controlled by the two attributes: Selections and Conditions.
Selections: Search identifiers for the log data
Conditions: defines how the selection or filters are to be evaluated
Detection can consist of two different data structures: Lists and Maps.
The list YAML type contains multiple string-based search parameters, applied to the log data and linked with a logical OR.
For example, a list-based detection will match a log line that contains any keyword from the provided list.
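The OR semantics of a keyword list can be sketched in a few lines of Python. This is only an illustration of the matching logic, not how sigmac or any SIEM implements it; the keywords and log lines below are hypothetical, and the case-insensitive comparison is an assumption based on Sigma's default string handling.

```python
def match_keywords(keywords, log_line):
    """Return True if any keyword occurs in the log line (logical OR)."""
    line = log_line.lower()  # assumption: matching is case-insensitive
    return any(keyword.lower() in line for keyword in keywords)


# Hypothetical keywords and log lines, for illustration only
keywords = ["pkexec", "mimikatz"]
print(match_keywords(keywords, "Jan 26 10:00:01 host pkexec[1234]: started"))
print(match_keywords(keywords, "Jan 26 10:00:02 host sshd[99]: session opened"))
```

The first call matches because one keyword is present; the second matches none and returns False.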
The map YAML type consists of key/value pairs, like a dictionary, where the key is a field in the log data, and the value is a string or integer value.
Lists of maps are joined with a logical 'OR'. All elements of a map are joined with a logical 'AND'.
For example, a match will happen if the ImageLoaded field contains any of the listed values.
Here, the selection (search-identifier) matches the ImageLoaded field in the log data and uses the transformation modifier (|contains) to check if the listed keywords are present.
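A rough Python sketch of how a map-based selection could be evaluated is shown below. The values are the ones that appear in the compiled Splunk query later in this post; the event itself and the helper names (match_map, match_selection) are hypothetical, and only the contains modifier is handled.

```python
def match_map(selection, event):
    """All fields of one map must match (logical AND)."""
    for key, expected in selection.items():
        field, _, modifier = key.partition("|")
        actual = str(event.get(field, "")).lower()
        values = expected if isinstance(expected, list) else [expected]
        values = [str(v).lower() for v in values]
        if modifier == "contains":
            ok = any(v in actual for v in values)  # list of values: logical OR
        else:
            ok = any(v == actual for v in values)  # no modifier: exact match
        if not ok:
            return False
    return True


def match_selection(selection, event):
    """A list of maps matches if any single map matches (logical OR)."""
    maps = selection if isinstance(selection, list) else [selection]
    return any(match_map(m, event) for m in maps)


selection = {
    "ImageLoaded|contains": [
        "fgexec", "dumpsvc", "cachedump", "mimidrv",
        "gsecdump", "servpw", "pwdump",
    ]
}
event = {"ImageLoaded": r"C:\Windows\Temp\mimidrv.sys"}  # hypothetical event
print(match_selection(selection, event))
```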
A short list of modifiers is listed below:
We can also use wildcard characters to match a wide list of keywords in the log data. For example, instead of a hardcoded path or command line argument, we can find logs for the execution of rundll32.exe with: *rundll32.exe
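To make the wildcard behaviour concrete, here is a simplified Python sketch that translates a wildcard pattern into a regular expression. It assumes '*' stands for any run of characters and '?' for a single character, ignores escaping of literal wildcards, and uses hypothetical helper names and paths.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern into a compiled, case-insensitive regex."""
    parts = []
    for char in pattern:
        if char == "*":
            parts.append(".*")  # any run of characters
        elif char == "?":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(char))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def wildcard_match(pattern, value):
    """Return True if the whole value matches the wildcard pattern."""
    return wildcard_to_regex(pattern).fullmatch(value) is not None


print(wildcard_match("*rundll32.exe", r"C:\Windows\System32\rundll32.exe"))
print(wildcard_match("*rundll32.exe", r"C:\Windows\System32\cmd.exe"))
```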
For conditions, we can evaluate search-identifiers using:
Logical AND/OR operations
1 of selection or all of selection
Negation using not — eg. not selection
Grouping expressions by using parenthesis — eg. (selection1 and selection2 and selection3) or selection4
In the following example, a match will happen if selection1, selection2 and selection3 all match and the filter does not.
detection:
    selection1:
        ParentImage|endswith:
            - '\winlogon.exe'
            - '\services.exe'
            - '\lsass.exe'
            - '\csrss.exe'
            - '\smss.exe'
            - '\wininit.exe'
            - '\spoolsv.exe'
            - '\searchindexer.exe'
    selection2:
        Image|endswith:
            - '\powershell.exe'
            - '\cmd.exe'
    selection3:
        User|contains: # covers many language settings
            - "AUTHORI"
            - "AUTORI"
    filter:
        CommandLine|contains|all:
            - " route "
            - " ADD "
    condition: selection1 and selection2 and selection3 and not filter
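The way this condition combines the selections can be sketched in Python. The detection dictionary mirrors the rule above; the evaluator, its function names and the sample events are hypothetical, and the sketch handles only the endswith, contains and all modifiers with case-insensitive comparison.

```python
def field_matches(key, expected, event):
    """Evaluate one 'Field|modifier' entry against an event."""
    field, *modifiers = key.split("|")
    actual = str(event.get(field, "")).lower()
    raw = expected if isinstance(expected, list) else [expected]
    values = [str(v).lower() for v in raw]
    if "endswith" in modifiers:
        hits = [actual.endswith(v) for v in values]
    elif "contains" in modifiers:
        hits = [v in actual for v in values]
    else:
        hits = [actual == v for v in values]
    # '|all' requires every value to match; otherwise one value is enough
    return all(hits) if "all" in modifiers else any(hits)


def selection_matches(selection, event):
    """All entries of a map must match (logical AND)."""
    return all(field_matches(k, v, event) for k, v in selection.items())


detection = {
    "selection1": {"ParentImage|endswith": [
        "\\winlogon.exe", "\\services.exe", "\\lsass.exe", "\\csrss.exe",
        "\\smss.exe", "\\wininit.exe", "\\spoolsv.exe", "\\searchindexer.exe",
    ]},
    "selection2": {"Image|endswith": ["\\powershell.exe", "\\cmd.exe"]},
    "selection3": {"User|contains": ["AUTHORI", "AUTORI"]},
    "filter": {"CommandLine|contains|all": [" route ", " ADD "]},
}


def rule_fires(event):
    hit = {name: selection_matches(sel, event) for name, sel in detection.items()}
    # condition: selection1 and selection2 and selection3 and not filter
    return (hit["selection1"] and hit["selection2"]
            and hit["selection3"] and not hit["filter"])


# Hypothetical events, for illustration only
suspicious = {
    "ParentImage": r"C:\Windows\System32\services.exe",
    "Image": r"C:\Windows\System32\cmd.exe",
    "User": r"NT AUTHORITY\SYSTEM",
    "CommandLine": "cmd.exe /c whoami",
}
filtered = {**suspicious,
            "CommandLine": "cmd.exe /c route ADD 10.0.0.0 MASK 255.0.0.0 10.0.0.1"}
print(rule_fires(suspicious))
print(rule_fires(filtered))
```

The first event satisfies all three selections and not the filter, so the rule fires; the second is suppressed because its command line contains both filter strings.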
Sigma supports various advanced filters, which are documented in the Sigma specification.
Taking another interesting example, this time for browser credential dumping, we have the following detection logic: the rule triggers when the selection matches and none of the filters match, which removes the known false positives.
detection:
    selection:
        - FileName|contains:
            - '\AppData\Local\Google\Chrome\User Data\Default\Network\Cookies'
            - '\Appdata\Local\Chrome\User Data\Default\Login Data'
            - '\AppData\Local\Google\Chrome\User Data\Local State'
        - FileName|endswith:
            - '\Appdata\Local\Microsoft\Windows\WebCache\WebCacheV01.dat'
            - '\cookies.sqlite'
            - 'release\key3.db' #firefox
            - 'release\key4.db' #firefox
            - 'release\logins.json' #firefox
    filter_browser:
        Image|endswith:
            - '\firefox.exe'
            - '\chrome.exe'
    filter_programfile:
        Image|startswith:
            - 'C:\Program Files\'
            - 'C:\Program Files (x86)\'
    filter_antimalware:
        Image|endswith:
            - '\MsMpEng.exe'
            - '\MpCopyAccelerator.exe'
    filter_service:
        ParentImage: 'C:\Windows\System32\services.exe'
        TargetFilename|endswith: '\APPDATA\LOCAL\MICROSOFT\WINDOWS\WEBCACHE\WEBCACHEV01.DAT'
    filter_windows:
        - Image: 'C:\Windows\System32\dllhost.exe'
        - CommandLine|contains: '\svchost.exe -k DcomLaunch -p'
    condition: selection and not 1 of filter_*
falsepositives:
    - Antivirus, Anti-Spyware, Anti-Malware Software
    - Backup software
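The "not 1 of filter_*" part of the condition can be illustrated with a short Python sketch. It assumes each selection has already been evaluated to True or False for a single event; the results dictionary and helper names are hypothetical, and the name pattern is matched with the standard fnmatch module.

```python
from fnmatch import fnmatchcase


def one_of(pattern, results):
    """'1 of <pattern>': at least one selection whose name matches fired."""
    return any(hit for name, hit in results.items() if fnmatchcase(name, pattern))


def rule_fires(results):
    # condition: selection and not 1 of filter_*
    return results["selection"] and not one_of("filter_*", results)


# Hypothetical per-selection results for one event
results = {
    "selection": True,
    "filter_browser": False,
    "filter_programfile": False,
    "filter_antimalware": False,
    "filter_service": False,
    "filter_windows": False,
}
print(rule_fires(results))
print(rule_fires({**results, "filter_browser": True}))
```

With no filter matching, the rule fires; as soon as any single filter_ selection matches (here the browser itself reading its own files), the event is suppressed.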
Finally, we have the falsepositives field. After deploying your rules, you might discover some incorrect detections (false positives). These can be dealt with either by tuning the detection or by updating the rule with a list of known false positives that the detection can produce.
I strongly urge you to read the Sigma Rules article by SOC Prime and the Wiki Specification by Sigma HQ.
Step 3: Compiling Sigma Rules
To convert a Sigma rule into a searchable query for the target SIEM or any supported logging platform, you have to use the Sigma compiler sigmac, a Python-based tool shipped with Sigma itself.
sigmac allows you to convert a rule into a target of choice, such as Splunk, Qualys, or QRadar. sigmac also uses field mappings to convert the fields used in the rule into the actual fields available on the desired target.
To view all available targets, configurations, and modifiers, simply run:
$ python3 sigmac --lists
Compiling a rule
To finally convert the rule for your target SIEM, we pass the target system with the -t flag and the configuration with the -c flag, which sets the field mappings correctly:
$ python3 sigmac -t splunk -c splunk-windows ../rules/windows/driver_load/driver_load_mal_creddumper.yml
(ImageLoaded="*fgexec*" OR ImageLoaded="*dumpsvc*" OR ImageLoaded="*cachedump*" OR ImageLoaded="*mimidrv*" OR ImageLoaded="*gsecdump*" OR ImageLoaded="*servpw*" OR ImageLoaded="*pwdump*")
That's it, voila! You now have a compiled rule. We can use this query inside Splunk to detect the execution of credential dumping tools such as Mimikatz or gsecdump.
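To get a feel for what the conversion does, here is a toy Python sketch that turns a contains-style selection into a Splunk-like query string. It is not sigmac's implementation and covers only this one case; the function name is hypothetical, and the field and values are taken from the compiled query above.

```python
def contains_to_splunk(field, values):
    """Render 'field|contains: [values]' as an OR-joined wildcard search."""
    terms = [f'{field}="*{value}*"' for value in values]
    return "(" + " OR ".join(terms) + ")"


values = ["fgexec", "dumpsvc", "cachedump", "mimidrv",
          "gsecdump", "servpw", "pwdump"]
print(contains_to_splunk("ImageLoaded", values))
```

The real tool additionally applies the field mappings from the chosen configuration, which this sketch leaves out.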
A small note: Consider reading the Sigma rules present in the rules directory; you will learn a lot about detections and selections, and it will aid in writing effective Sigma rules quickly.
You can primarily monitor logs from these sources and utilize these logs for detecting intruders:
Successful/Filtered IP/TCP/UDP Communication
Operating System Logs
Authentication and Accounts
Large number of failed logon attempts
Alteration and usage of specific accounts, and SID History
Process Creation and Execution
Execution from unusual locations
Suspicious process relationships
Known executables with Unknown hashes
Known evil hashes
Rare Service installations
New domain trusts
Network: Port Scans, Host Discovery
Web Server Access Logs
4xx Errors: Enumeration and Reconnaissance activity
5xx Errors: Exploitation
Application Error Logs
Exceptions and Other specific messages
Uncoder.io is a web-based tool by SOC Prime, which lets you quickly write and play around with Sigma rules in the browser. It allows you to write, edit, test, and compile the rules for various targets.
You have made it this far. Time to get our hands dirty and do some threat detection with Sigma rules.
Challenge: Write your own Sigma rule
We have been through a lot of technical mumbo jumbo, and a rule for a real-life scenario would look much like what we have learned today. However, since this post has already grown long, I'll skip the process of setting up an actual Splunk instance and creating logs to trigger the Sigma rule that you are going to write.
You can go one step ahead and set up an instance of OpenEDR or use Aurora-Lite, try to simulate an actual threat and begin from there. :)
I encourage you to set up a trial environment, download a vulnerable version of any typical software, and try to build your own detection rules on top of that.
Sigma rule to detect Registry Run Key Persistence Technique
There are multiple ways an attacker can gain persistence for their malicious payloads. In our context, let's talk about Registry-based persistence.
MITRE ATT&CK mentions Registry Run Keys as "Adversaries may achieve persistence by adding a program to a startup folder or referencing it with a Registry run key. Adding an entry to the "run keys" in the Registry or startup folder will cause the program referenced to be executed when a user logs in. These programs will be executed under the context of the user and will have the account's associated permissions level."
A little context on Run keys:
The following run keys are created by default on Windows systems:
title: Registry Add RUN Key Persistence
id: 7e0ed38f-5aa9-427d-bc19-689edccedba1 # some random uuid
status: test
description: "Detects suspicious command line executions monitoring modifications to the RUN key in the Registry Hive "
references:
    - https://attack.mitre.org/techniques/T1547/001/
We can create events for our technique using the firedrill utility. You can find the source here and customize it to run more such techniques to improve your detection capabilities. This technique uses the Registry Key HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Run to execute a sample payload.
With the above information, our primary detection logic should contain a check for the modification of HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Run.
Modifying this key generates event ID 4657 from Microsoft Windows Security Auditing.
The selection criteria specified under the “detection” section are a set of key-value pairs. In this example, the rule triggers only when the CommandLine field contains all of the following arguments: reg, ADD, and Software\Microsoft\Windows\CurrentVersion\Run.
The complete rule:
title: Registry Add RUN Key Persistence
id: 7e0ed38f-5aa9-427d-bc19-689edccedba1 # some random uuid
status: test
description: 'Detects suspicious command line executions monitoring modifications to the RUN key in the Registry Hive '
references:
    - https://attack.mitre.org/techniques/T1547/001/
detection:
    selection:
        CommandLine|contains|all:
            - 'reg'
            - ' ADD '
            - 'Software\Microsoft\Windows\CurrentVersion\Run'
    condition: selection
falsepositives:
    - System provisioning
After writing the Sigma rule, we can use either Uncoder.io or sigmac to convert it into the query format of the target SIEM.
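Before deploying the rule, its logic can be sanity-checked offline. The Python sketch below applies the CommandLine|contains|all selection to two hypothetical command lines; the helper names, the sample events and the case-insensitive comparison are assumptions for illustration, not output from a real log source.

```python
def contains_all(value, needles):
    """Return True if every needle occurs in value (case-insensitive)."""
    haystack = value.lower()
    return all(needle.lower() in haystack for needle in needles)


# Values from the selection's CommandLine|contains|all list
SELECTION = ["reg", " ADD ", r"Software\Microsoft\Windows\CurrentVersion\Run"]


def rule_fires(event):
    # condition: selection
    return contains_all(event.get("CommandLine", ""), SELECTION)


# Hypothetical process-creation events
persistence = {
    "CommandLine": r'reg ADD HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Run '
                   r'/v Updater /t REG_SZ /d "C:\Users\Public\payload.exe"'
}
benign = {"CommandLine": r"reg query HKLM\SOFTWARE\Microsoft\Windows NT\CurrentVersion"}

print(rule_fires(persistence))
print(rule_fires(benign))
```

The first command line contains all three strings and triggers the rule; the second lacks the ADD verb and the Run key path, so it does not.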