A deep dive into Sigma rules and how to write your own threat detection rules

Written by Hardik Manocha
Co-founder @ FourCore
Open-Source SIEM Rules with Sigma

Sigma as a Detection Language

In our previous blog post, we covered how Windows Event Log IDs can be utilized for threat hunting, featuring Sigma rules.

Released by Florian Roth in 2017, Sigma (The Generic Signature Format for SIEM Systems) has paved the way for platform-agnostic search. With Sigma, defenders can harness the community's power to react promptly to critical threats and new adversary tradecraft. You get a fixed-language specification for the generic rule format, a tool for converting Sigma rules into various query formats and a repository of over one thousand rules for several attack techniques.

Like YARA or Snort rules, Sigma is a tool for the open sharing and crowdsourcing of threat intelligence, but it focuses on SIEM instead of files or network traffic. What Snort is to network traffic, and YARA is to files, Sigma is to logs.

Most attacks on IT systems and networks manifest themselves in event logs stored in SIEM systems or other log storage and analysis solutions. This makes the SIEM a crucial tool for detecting and alerting on intruders. In earlier days, SIEM detection rulesets lived in vendor-specific or platform-specific databases. Today, the growing demand for up-to-date detections and analytics requires sharing detection intelligence between different stakeholders and vendors. Sigma solves this challenge by making queries and rulesets platform-agnostic.

Sigma Conversion Process

Sigma allows defenders to share detections in a common language.

Sigma satisfies various use cases:

  • Sigma has become a platform-agnostic way of sharing detections between researchers and intelligence teams who identify new adversary behaviours.
  • Security teams can avoid vendor lock-in: by defining rules in Sigma, we can move between platforms more easily.
  • Sigma can be utilized to crowdsource detection methods and make them usable instantly for everyone.
  • Sigma can be used to share signatures with other threat intel communities.

Sigma rules can be converted into a search query specific to your SIEM solution. The converter supports many targets, including:

  • Splunk
  • Elasticsearch Query Strings and DSL
  • Kibana
  • Microsoft Defender Advanced Threat Protection (MDATP)
  • Azure Sentinel
  • QRadar
  • LogPoint
  • Qualys
  • RSA NetWitness
  • LimaCharlie
  • ArcSight
  • PowerShell and Grep

Playing with Sigma

Sigma is an open-source project with three major components:

  • A language specification for the generic Sigma rule format.
  • An open repository of Sigma signatures with over one thousand rules for several attacker behaviours and techniques.
  • sigmac, a conversion utility to generate search queries for different SIEM systems from Sigma rules.

Step 1: Get the repository:

First, download or clone the Sigma repository from GitHub.

git clone https://github.com/SigmaHQ/sigma.git
Sigma Directory

Step 2: Understanding Sigma Rules

A Sigma rule is written in YAML and defines what to look for and where to look in system logs. Every Sigma rule also specifies metadata such as the author of the rule, a unique rule identifier (UUID), MITRE ATT&CK techniques, and references, e.g. a URL for additional information.

Sigma supports the following log types:

  • Firewall Logs
  • Web Application Logs
  • Proxy / VPN Networking Logs
  • Operating System Logs
    • Event logs
    • Process Creation and Auditing Logs
    • Sysmon Events

Let's take a look at a Sigma rule from the repository: rules/linux/builtin/lnx_pwnkit_local_privilege_escalation.yml. This rule detects exploitation of the PwnKit privilege escalation vulnerability.

title: PwnKit Local Privilege Escalation
id: 0506a799-698b-43b4-85a1-ac4c84c720e9
status: experimental
description: Detects potential PwnKit exploitation CVE-2021-4034 in auth logs
author: Sreeman
date: 2022/01/26
references:
  - https://twitter.com/wdormann/status/1486161836961579020
logsource:
  product: linux
  service: auth
detection:
  keyword:
    - "pkexec"
    - "The value for environment variable XAUTHORITY contains suscipious content"
    - "[USER=root] [TTY=/dev/pts/0]"
  condition: all of keyword
falsepositives:
  - Unknown
level: high
tags:
  - attack.privilege_escalation
  - attack.t1548.001

Each rule (yml) has the following sections:

  • title: name of the rule, here PwnKit Local Privilege Escalation.
  • id: UUID that uniquely identifies the rule, here 0506a799-698b-43b4-85a1-ac4c84c720e9.
  • status: maturity of the rule, such as experimental, test or stable; in this case it's an experimental rule.
  • description: explains the context of the rule, here Detects potential PwnKit exploitation CVE-2021-4034 in auth logs.
  • author: the rule creator, here Sreeman.
  • date: creation date of the rule.
  • references: links to blog posts or tweets explaining the issue.
  • modified: date on which the rule was last modified.
  • level: severity level, one of “informational”, “low”, “medium”, “high” or “critical”.
  • logsource: scopes the search to particular logs; its attributes can be combined in various ways, e.g. to target the Windows Security log channel.
    • product: the product whose logs the rule applies to, here linux.
    • service: a subset of that product's logs, here the auth log.
  • detection: the search logic, made up of search identifiers (selections or keyword lists), an optional timeframe and a condition. Here we look for log lines that mention the tool 'pkexec'.
    • condition: how the search identifiers are combined, here all of keyword.
  • falsepositives: describes events or situations that might trigger the rule and lead to a false positive.
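
To make the detection section of the PwnKit rule concrete, here is a minimal Python sketch of what `condition: all of keyword` asks for: every keyword has to appear in the same log line. This is an illustration only, not how any SIEM backend implements it. The two log lines are made up, and the case-insensitive comparison is an assumption of the sketch.

```python
# Keywords copied from the rule's detection section (spelling as in the rule).
PWNKIT_KEYWORDS = [
    "pkexec",
    "The value for environment variable XAUTHORITY contains suscipious content",
    "[USER=root] [TTY=/dev/pts/0]",
]


def all_keywords_present(log_line: str, keywords: list[str]) -> bool:
    """Mimic `condition: all of keyword`: every keyword must occur in the line."""
    lowered = log_line.casefold()  # assumption: keywords are compared ignoring case
    return all(keyword.casefold() in lowered for keyword in keywords)


# Hypothetical auth.log lines, invented for this illustration.
exploit_line = (
    "Jan 26 10:15:01 host pkexec[1234]: alice: "
    "The value for environment variable XAUTHORITY contains suscipious content "
    "[USER=root] [TTY=/dev/pts/0] [CWD=/home/alice] [COMMAND=/bin/sh]"
)
benign_line = "Jan 26 10:20:44 host pkexec[1301]: session opened for user root"

print(all_keywords_present(exploit_line, PWNKIT_KEYWORDS))  # True
print(all_keywords_present(benign_line, PWNKIT_KEYWORDS))   # False
```

The benign line mentions pkexec but not the other two strings, so the rule stays quiet. That is the point of requiring all keywords rather than any one of them.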

Diving deeper into Sigma

Logsource Field: The logsource field in the Sigma YAML describes the log data to which the detection is meant to be applied. The logsource schema is as follows, and a rule can use one or more of these attributes:

logsource:
  category [optional]
  product [optional]
  service [optional]
  definition [optional]
  • Category:
    • Used to select all the logs generated by a certain family of products, e.g. firewall and web server logs
    • Category examples include: firewall, antivirus and EDR, web
  • Product:
    • Used to select all the logs generated by a specific product, e.g. windows, oracle, apache, zscaler
    • A product can be narrowed one step further, for example to Windows event log types such as Security, System, AppLocker, Application, and Windows Defender
  • Service:
    • Used to select only a subset of a product's logs, e.g. the Security event log on Windows systems or polkit logs on Linux.

Some examples to understand the logsource format.

logsource:
  product: linux
  service: auth
---
logsource:
  product: cisco
  service: aaa
  category: accounting
---
logsource:
  product: zeek
  service: kerberos
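
One way to picture how a logsource scopes a rule is the Python sketch below. It assumes each incoming event has already been labelled with its own category, product and service. Both the labelling and the `logsource_applies` helper are hypothetical and are not part of Sigma or sigmac.

```python
def logsource_applies(rule_logsource: dict, event_source: dict) -> bool:
    """A rule applies when every attribute it sets equals the event's attribute."""
    return all(
        event_source.get(attribute) == value
        for attribute, value in rule_logsource.items()
        if attribute != "definition"  # descriptive note, not used for selection
    )


# First logsource from the examples above.
rule_logsource = {"product": "linux", "service": "auth"}

# Hypothetical labels attached to two incoming events.
auth_event = {"product": "linux", "service": "auth"}
cron_event = {"product": "linux", "service": "cron"}

print(logsource_applies(rule_logsource, auth_event))  # True
print(logsource_applies(rule_logsource, cron_event))  # False
```

Attributes the rule leaves out are simply not checked, which is why a rule that sets only product matches every service of that product.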

Detection Field: This section contains a set of search-identifiers representing searches on log data and their respective evaluation mechanism, controlled by the two attributes: Selections and Conditions.

  • Selections: Search identifiers for the log data
  • Conditions: define how the selections or filters are to be evaluated

Detection can consist of two different data structures: Lists and Maps.

The YAML list type contains multiple string-based search parameters that are applied to the log data and linked with a logical OR.

For example, this detection will match a log line that contains any keyword from the provided list.

detection:
  keywords:
    - "tftp"
    - "rcp"
    - "puts"
    - "copy"
    - "configure replace"
    - "archive tar"
  condition: keywords

The YAML map type consists of key/value pairs, like a dictionary, where the key is a field in the log data and the value is a string or integer value.

A list of maps is joined with a logical 'OR'. All elements of a single map are joined with a logical 'AND'.

For example, in the rule below, a match happens if the ImageLoaded field contains any of the listed values.

logsource:
  product: windows
  category: driver_load
detection:
  selection:
    ImageLoaded|contains:
      - "fgexec"
      - "dumpsvc"
      - "cachedump"
      - "mimidrv"
      - "gsecdump"
      - "servpw"
      - "pwdump"
  condition: selection

Here, the selection (search-identifier) matches the ImageLoaded field in the log data and uses the transformation modifier (|contains) to check if the listed keywords are present.
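
The list and map semantics can be sketched in a few lines of Python. This is a simplified illustration under some assumptions: events are flat dictionaries, values are compared ignoring case, and only the contains, startswith, endswith and all modifiers are handled (base64 is left out). The function names are made up for this post.

```python
def value_matches(actual: str, pattern: str, operator: str) -> bool:
    """Compare one event value with one rule value, ignoring case."""
    actual, pattern = actual.casefold(), pattern.casefold()
    if operator == "contains":
        return pattern in actual
    if operator == "startswith":
        return actual.startswith(pattern)
    if operator == "endswith":
        return actual.endswith(pattern)
    return actual == pattern  # no modifier: plain equality


def map_matches(event: dict, selection: dict) -> bool:
    """All keys of a map must match (AND); values listed under a key are ORed."""
    for key, expected in selection.items():
        field, *modifiers = key.split("|")
        if field not in event:
            return False
        operator = next((m for m in modifiers if m != "all"), "equals")
        patterns = expected if isinstance(expected, list) else [expected]
        combine = all if "all" in modifiers else any  # `all` turns the OR into an AND
        if not combine(value_matches(str(event[field]), str(p), operator) for p in patterns):
            return False
    return True


def selection_matches(event: dict, selection) -> bool:
    """A list of maps is an OR over its maps; a single map is checked directly."""
    if isinstance(selection, list):
        return any(map_matches(event, item) for item in selection)
    return map_matches(event, selection)


selection = {"ImageLoaded|contains": ["fgexec", "dumpsvc", "cachedump", "mimidrv",
                                      "gsecdump", "servpw", "pwdump"]}
# Hypothetical driver-load events.
print(selection_matches({"ImageLoaded": r"C:\Windows\Temp\mimidrv.sys"}, selection))          # True
print(selection_matches({"ImageLoaded": r"C:\Windows\System32\drivers\ntfs.sys"}, selection))  # False
```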

Some commonly used modifiers are:

  • contains
  • all
  • base64
  • endswith
  • startswith

We can also use wildcard characters to match a wider range of values in the log data. For example, instead of hardcoding a full path or command line, we can find logs for the execution of rundll32.exe with the pattern *rundll32.exe
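
As a rough illustration of wildcard matching, the sketch below turns a pattern into a regular expression, treating `*` as any run of characters and `?` as a single character. It ignores Sigma's rules for escaping wildcards with a backslash, and the helper name is made up, so treat it as a simplification.

```python
import re


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Translate * and ? wildcards into an anchored, case-insensitive regex."""
    parts = []
    for char in pattern:
        if char == "*":
            parts.append(".*")  # any run of characters, including none
        elif char == "?":
            parts.append(".")   # exactly one character
        else:
            parts.append(re.escape(char))  # everything else is literal
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


matcher = wildcard_to_regex("*rundll32.exe")
print(bool(matcher.fullmatch(r"C:\Windows\System32\rundll32.exe")))      # True
print(bool(matcher.fullmatch(r"C:\Windows\System32\rundll32.exe.bak")))  # False
```

Because the pattern has a wildcard only at the front, it behaves like an endswith check: the value has to finish with rundll32.exe.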

For conditions, we can evaluate search-identifiers using:

  • Logical AND/OR operations
  • 1 of selection or all of selection
  • Negation using not — eg. not selection
  • Grouping expressions by using parenthesis — eg. (selection1 and selection2 and selection3) or selection4

In the following example, a match happens when selection1, selection2 and selection3 all match and the filter does not.

detection:
  selection1:
    ParentImage|endswith:
      - '\winlogon.exe'
      - '\services.exe'
      - '\lsass.exe'
      - '\csrss.exe'
      - '\smss.exe'
      - '\wininit.exe'
      - '\spoolsv.exe'
      - '\searchindexer.exe'
  selection2:
    Image|endswith:
      - '\powershell.exe'
      - '\cmd.exe'
  selection3:
    User|contains: # covers many language settings
      - "AUTHORI"
      - "AUTORI"
  filter:
    CommandLine|contains|all:
      - " route "
      - " ADD "
  condition: selection1 and selection2 and selection3 and not filter
fields:
  - ParentImage
  - Image
  - User
  - CommandLine
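
Once each search identifier has been evaluated to true or false for an event, the condition is just a boolean expression over those results. The Python sketch below is a small recursive-descent evaluator for and, or, not and parentheses. It is an illustration only, it does not cover expressions such as `1 of selection*`, and the function name is made up.

```python
import re


def evaluate_condition(condition: str, results: dict[str, bool]) -> bool:
    """Evaluate identifiers combined with and/or/not and parentheses."""
    tokens = re.findall(r"\(|\)|[^\s()]+", condition)
    position = 0

    def peek():
        return tokens[position] if position < len(tokens) else None

    def advance():
        nonlocal position
        if position >= len(tokens):
            raise ValueError("unexpected end of condition")
        position += 1
        return tokens[position - 1]

    def parse_or():  # lowest precedence
        value = parse_and()
        while peek() == "or":
            advance()
            right = parse_and()  # always parse the right side to consume its tokens
            value = value or right
        return value

    def parse_and():
        value = parse_not()
        while peek() == "and":
            advance()
            right = parse_not()
            value = value and right
        return value

    def parse_not():  # binds tighter than and/or
        if peek() == "not":
            advance()
            return not parse_not()
        token = advance()
        if token == "(":
            value = parse_or()
            if advance() != ")":
                raise ValueError("missing closing parenthesis")
            return value
        return results[token]  # look up the search identifier's result

    value = parse_or()
    if position != len(tokens):
        raise ValueError("unexpected token: " + tokens[position])
    return value


condition = "selection1 and selection2 and selection3 and not filter"
hits = {"selection1": True, "selection2": True, "selection3": True, "filter": False}
print(evaluate_condition(condition, hits))                        # True
print(evaluate_condition(condition, {**hits, "filter": True}))    # False
```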

Sigma supports various advanced filters; you can learn more about them here.

Let's take another interesting example, this time for browser credential dumping. The detection logic is: the rule fires when selection matches and none of the filters match, which removes known false positives.

detection:
  selection:
    - FileName|contains:
        - '\AppData\Local\Google\Chrome\User Data\Default\Network\Cookies'
        - '\Appdata\Local\Chrome\User Data\Default\Login Data'
        - '\AppData\Local\Google\Chrome\User Data\Local State'
    - FileName|endswith:
        - '\Appdata\Local\Microsoft\Windows\WebCache\WebCacheV01.dat'
        - '\cookies.sqlite'
        - 'release\key3.db' #firefox
        - 'release\key4.db' #firefox
        - 'release\logins.json' #firefox
  filter_browser:
    Image|endswith:
      - '\firefox.exe'
      - '\chrome.exe'
  filter_programfile:
    Image|startswith:
      - 'C:\Program Files\'
      - 'C:\Program Files (x86)\'
  filter_antimalware:
    Image|endswith:
      - '\MsMpEng.exe'
      - '\MpCopyAccelerator.exe'
  filter_service:
    ParentImage: 'C:\Windows\System32\services.exe'
    TargetFilename|endswith: '\APPDATA\LOCAL\MICROSOFT\WINDOWS\WEBCACHE\WEBCACHEV01.DAT'
  filter_windows:
    - Image: 'C:\Windows\System32\dllhost.exe'
    - CommandLine|contains: '\svchost.exe -k DcomLaunch -p'
  condition: selection and not 1 of filter_*
falsepositives:
  - Antivirus, Anti-Spyware, Anti-Malware Software
  - Backup software
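
The `1 of filter_*` part of the condition groups search identifiers by name pattern. A minimal Python sketch of that idea follows. It assumes each identifier has already been evaluated to true or false for the event, and the per-event results shown are invented for illustration.

```python
from fnmatch import fnmatchcase


def one_of(pattern: str, results: dict[str, bool]) -> bool:
    """True if at least one identifier whose name fits the pattern matched."""
    return any(hit for name, hit in results.items() if fnmatchcase(name, pattern))


def rule_fires(results: dict[str, bool]) -> bool:
    """Equivalent of `selection and not 1 of filter_*`."""
    return results["selection"] and not one_of("filter_*", results)


# Hypothetical results: an unknown tool reading Chrome's cookie store.
unknown_tool = {
    "selection": True,
    "filter_browser": False,
    "filter_programfile": False,
    "filter_antimalware": False,
    "filter_service": False,
    "filter_windows": False,
}
# Hypothetical results: chrome.exe reading its own cookie store.
chrome_itself = {**unknown_tool, "filter_browser": True}

print(rule_fires(unknown_tool))   # True
print(rule_fires(chrome_itself))  # False
```

Naming every exclusion with a shared prefix keeps the condition short: a new filter_ block is picked up without touching the condition line.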

Finally, we have the falsepositives field. After deploying your rules, you might discover some incorrect detections (false positives). These can be dealt with either by tuning the detection or by updating the rule with a list of known false positives that the detection can produce.

I strongly urge you to read the article Sigma Rules by SOC Prime and the Wiki Specification by SigmaHQ.

Step 3: Compiling Sigma Rules

To convert a Sigma rule into a searchable query for the target SIEM or any supported logging platform, you have to use the Sigma compiler sigmac, a Python-based tool shipped with Sigma itself.

bane ~/sigma
$ cd tools
bane ~/sigma/tools
$ chmod +x sigmac
bane ~/sigma/tools
$ python3 sigmac
Nothing to do!
usage: sigmac [-h] [--recurse] [--filter FILTER]
              [--target {humio,splunkdm,crowdstrike,kibana,qradar,qualys,splunk,splunkxml,es-qs-lr,es-eql,es-dsl,es-rule,sql,sumologic-cse-rule,carbonblack,athena,hedera,sqlite,es-qs,opensearch-monitor,sentinel-rule,devo,netwitness,elastalert-dsl,arcsight-esm,sumologic,sumologic-cse,uberagent,logiq,logpoint,hawk,powershell,ee-outliers,kibana-ndjson,lacework,limacharlie,graylog,mdatp,es-rule-eql,grep,sysmon,ala-rule,fortisiem,fieldlist,netwitness-epl,chronicle,fireeye-helix,stix,elastalert,ala,arcsight,csharp,xpack-watcher,datadog-logs,streamalert}]
              [--lists] [--lists-files-after-date LISTS_FILES_AFTER_DATE]
              [--config CONFIG] [--output OUTPUT]
              [--output-fields OUTPUT_FIELDS] [--output-format {json,yaml}]
              [--output-extention OUTPUT_EXTENTION] [--print0]
              [--backend-option BACKEND_OPTION]
              [--backend-config BACKEND_CONFIG] [--backend-help BACKEND_HELP]
              [--defer-abort] [--ignore-backend-errors] [--verbose] [--debug]
              [inputs ...]

sigmac allows you to convert a rule for a target of your choice, such as Splunk, Qualys, or QRadar, as visible above. sigmac also uses field mappings to translate the fields used in the rule into the actual fields available in the desired target.

To view all available targets, configurations, and modifiers, simply run:

$ python3 sigmac --lists

Compiling a rule

To finally convert the rule for your target SIEM, we pass the target system with the -t flag and a configuration with the -c flag so that the field mappings are set correctly:

bane ~/sigma/tools
$ python3 sigmac -t splunk -c splunk-windows ../rules/windows/driver_load/driver_load_mal_creddumper.yml
(ImageLoaded="*fgexec*" OR ImageLoaded="*dumpsvc*" OR ImageLoaded="*cachedump*" OR ImageLoaded="*mimidrv*" OR ImageLoaded="*gsecdump*" OR ImageLoaded="*servpw*" OR ImageLoaded="*pwdump*")

That's it, voila! You now have a compiled rule. We can use this query inside Splunk to detect the execution of credential dumping tools such as mimikatz or gsecdump.
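
To get a feel for what the conversion does with a contains selection, here is a small string-building sketch in Python that reproduces the shape of the Splunk query above. It is not sigmac's implementation. The function name and the match_all switch are made up, and the backslash doubling is an assumption based on the converted output shown in this post.

```python
def contains_to_splunk(field: str, values: list[str], match_all: bool = False) -> str:
    """Build a Splunk search string for a `field|contains` selection.

    With match_all=False the values are alternatives (OR). With match_all=True
    every value must be present, written as space-separated terms because
    Splunk treats adjacent terms as an implicit AND.
    """
    # Assumption: backslashes are doubled, as in the converted output in this post.
    terms = ['{}="*{}*"'.format(field, value.replace("\\", "\\\\")) for value in values]
    joiner = " " if match_all else " OR "
    return "(" + joiner.join(terms) + ")"


tools = ["fgexec", "dumpsvc", "cachedump", "mimidrv", "gsecdump", "servpw", "pwdump"]
print(contains_to_splunk("ImageLoaded", tools))
```

Each value is wrapped in wildcards because contains means the value may appear anywhere in the field.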

A small note: Consider reading the Sigma rules present in the rules directory; you will learn a lot about detections and selections, and it will aid in writing effective Sigma rules quickly.

You can primarily monitor logs from these sources and utilize these logs for detecting intruders:

  • Firewall Logs
    • Successful/Filtered IP/TCP/UDP Communication
  • Operating System Logs
    • Authentication and Accounts
      • Large number of failed logon attempts
      • Alteration and usage of specific accounts, and SID History
    • Process Creation and Execution
      • Execution from unusual locations
      • Suspicious process relationships
      • Known executables with unknown hashes
      • Known evil hashes
    • Resource Access
    • Windows events
      • Rare Service installations
      • New domain trusts
    • Network: Port Scans, Host Discovery
  • Proxy Logs
  • Web Server Access Logs
    • 4xx Errors: Enumeration and Reconnaissance activity
    • 5xx Errors: Exploitation
  • Application Error Logs
    • Exceptions and Other specific messages

Uncoder.io is a web-based tool by SOC Prime, which lets you quickly write and play around with Sigma rules in the browser. It allows you to write, edit, test, and compile the rules for various targets.

You have made it this far. Time to get our hands dirty and do some threat detection with Sigma rules.

Challenge: Write your own Sigma rule

We have been through a lot of technical mumbo jumbo, and a rule for a real-life scenario will look much like what we have covered today. However, since this post has already grown long, I will skip setting up an actual Splunk instance and generating logs to trigger the Sigma rule that you are going to write.

You can write the Sigma rule, grab some test events (mock test events: https://github.com/sans-blue-team/DeepBlueCLI/tree/master/evtx), upload them to the Timesketch demo server (username and password: demo) and play around there.

You can go one step ahead and set up an instance of OpenEDR or use Aurora-Lite, try to simulate an actual threat and begin from there. :)

I encourage you to set up a trial environment, download a vulnerable version of any typical software, and try to build your own detection rules on top of that.

Sigma rule to detect Registry Run Key Persistence Technique

There are multiple ways an attacker can gain persistence for their malicious payloads. In our context, let's talk about Registry-based persistence.

MITRE ATT&CK mentions Registry Run Keys as "Adversaries may achieve persistence by adding a program to a startup folder or referencing it with a Registry run key. Adding an entry to the "run keys" in the Registry or startup folder will cause the program referenced to be executed when a user logs in. These programs will be executed under the context of the user and will have the account's associated permissions level."

A little context on run keys: the following run keys are created by default on Windows systems:

  • HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\Run
  • HKEY_CURRENT_USER\Software\Microsoft\Windows\CurrentVersion\RunOnce
  • HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\Run
  • HKEY_LOCAL_MACHINE\Software\Microsoft\Windows\CurrentVersion\RunOnce

Registry run key entries can reference programs directly or list them as a dependency. For example, it is possible to load a DLL at logon using a "Depend" key with RunOnceEx:

reg add HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\RunOnceEx\0001\Depend /v 1 /d "C:\malicious.dll"

Let's build a basic skeleton for our Sigma rule:

title: Registry Add RUN Key Persistence
id: 7e0ed38f-5aa9-427d-bc19-689edccedba1 # some random uuid
status: test
description: "Detects suspicious command line executions monitoring modifications to the RUN key in the Registry Hive "
references:
  - https://attack.mitre.org/techniques/T1547/001/
date: 2022/06/21
author: bane
level: medium

We can create events for our technique using the firedrill utility. You can find the source here and customize it to run more such techniques to improve your detection capabilities. This technique uses the Registry Key HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Run to execute a sample payload.

Registry Run Key Modification Event Log
  • With the above information, our primary detection logic should contain a check for the modification of HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Run.
  • Modifying this key generates an event from Microsoft Windows Security Auditing with event ID 4657.

The selection criteria specified under the “detection” section are a set of key-value pairs. In this example, the rule triggers only when the CommandLine field contains all of the following:

  • reg.exe
  • ADD
  • 'Software\Microsoft\Windows\CurrentVersion\Run'

The complete rule:

title: Registry Add RUN Key Persistence
id: 7e0ed38f-5aa9-427d-bc19-689edccedba1 # some random uuid
status: test
description: 'Detects suspicious command line executions monitoring modifications to the RUN key in the Registry Hive '
references:
  - https://attack.mitre.org/techniques/T1547/001/
date: 2022/06/21
author: bane
level: medium
tags:
  - attack.persistence
  - attack.t1547.001
logsource:
    category: process_creation
    product: windows
detection:
    selection:
        CommandLine|contains|all:
            - 'reg'
            - ' ADD '
            - 'Software\Microsoft\Windows\CurrentVersion\Run'
    condition: selection
falsepositives:
    - System provisioning
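
Before converting the rule, it can help to sanity-check the selection against a few command lines. The Python sketch below mimics `CommandLine|contains|all` with a case-insensitive substring check, which is an assumption of the sketch. The sample command lines are made up for illustration.

```python
# Values from the rule's CommandLine|contains|all selection.
RUN_KEY_PATTERNS = ["reg", " ADD ", r"Software\Microsoft\Windows\CurrentVersion\Run"]


def command_line_matches(command_line: str) -> bool:
    """Every pattern must occur somewhere in the command line (ignoring case)."""
    lowered = command_line.casefold()
    return all(pattern.casefold() in lowered for pattern in RUN_KEY_PATTERNS)


# Hypothetical process-creation command lines.
samples = [
    r"reg.exe ADD HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Run /v Updater /d C:\Users\Public\payload.exe",
    r"reg.exe QUERY HKCU\SOFTWARE\Microsoft\Windows\CurrentVersion\Run",
    r"notepad.exe C:\notes.txt",
]

for sample in samples:
    print(command_line_matches(sample), sample)
```

Only the first sample matches. Because the last pattern is a substring check, additions to RunOnce would match as well, and the short 'reg' pattern is broad, so expect to tune the rule against real data.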

After writing the Sigma rule, we can use either Uncoder.io or sigmac to convert it into the query format of the target SIEM.

bane ~/sigma/tools
% python3 sigmac -t splunk -c splunk-windows run-key-persistence.yml
(CommandLine="*reg*" CommandLine="* ADD *" CommandLine="*Software\\Microsoft\\Windows\\CurrentVersion\\Run*")

Threat Hunting with FourCore ATTACK

You can validate your detection rules and alerts with FourCore ATTACK. Optimize your rules by simulating attacker behaviour and generating events you can use to validate your detections. Get a free assessment with a free trial of FourCore ATTACK.

Resources