I love looking at logs. I love finding evil in logs. I love detecting evil in logs.
When I speak to other analysts about SIEM detection engineering, I hear some recurring themes:
- “Detection engineering in (SIEM query language) is too complicated/time-consuming.”
- “Because of (complexity|redundancy|effort required), when we migrated to our new SIEM from our old SIEM, we didn’t carry over the detection rules.”
- “We don’t have any ideas for new detections to make!”
I sympathize with all these sentiments. They are real problems. That’s why, if you’re detection engineering-adjacent and haven’t yet heard of or dug into the Sigma detection signature format, I highly recommend it.
What is Sigma?
Sigma is, basically, a backend-neutral detection signature language for logs. It was created in 2017 by Florian Roth and has since been expanded and matured through the efforts of talented maintainers and contributors to the project’s primary GitHub repo. This common description sums it up best:
Sigma is for log files what Snort is for network traffic and YARA is for files.
Outside the primary project, Sigma rules can be found throughout the cybersecurity community in blogs, reports (e.g., those from The DFIR Report), and other vendors’ rulesets. Particularly noteworthy is SOC Prime, which has created a “threat bounty” program to “monetize your threat detection content,” essentially flipping the concept of bug bounty on its head.
There’s also wide support for the usage of Sigma rules. Security Onion, for example, natively supports Sigma rules that get converted into ElastAlert detection rules on the backend. In 2023, IBM QRadar announced that it would natively support Sigma rules. Both Hayabusa and Chainsaw, popular DFIR tools, include support for Sigma detection rules.
Sigma for Simplifying Detection Engineering
Many detection query languages are some combination of annoying, cumbersome, and/or complicated. Additionally, some people are (un)fortunate enough to have to deal with multiple detection query languages (e.g., KQL for Microsoft Sentinel, SPL for Splunk). That leads to wasted work: you create a detection rule for one query language, then redo that work for another using different field names, syntaxes, and so on.
Sigma eases that pain in a couple of ways:
- Sigma’s rule syntax is fairly simple. Additionally, since the Sigma conversion tool does most of the heavy lifting, you can significantly lower the skill floor for other query languages.
- Sigma eliminates redundant work by allowing a single human-written detection rule to be converted into any query language that has a Sigma backend.
Sigma Rule Syntax
The simplest way to understand the Sigma rule syntax is to look over a rule. Here’s a fairly simple, slightly modified rule that looks for DNS requests that appear to be associated with a potential Cobalt Strike beacon:
title: Suspicious Cobalt Strike DNS Beaconing - Sysmon
id: f356a9c4-effd-4608-bbf8-408afd5cd006
related:
    - id: 0d18728b-f5bf-4381-9dcf-915539fff6c2
      type: similar
status: test
description: Detects a program that invoked suspicious DNS queries known from Cobalt Strike beacons
references:
    - https://www.icebrg.io/blog/footprints-of-fin7-tracking-actor-patterns
    - https://www.sekoia.io/en/hunting-and-detecting-cobalt-strike/
author: Florian Roth (Nextron Systems)
date: 2021-11-09
modified: 2023-01-16
tags:
    - attack.command-and-control
    - attack.t1071.004
logsource:
    product: windows
    category: dns_query
detection:
    selection1:
        QueryName|startswith:
            - 'aaa.stage.'
            - 'post.1'
    selection2:
        QueryName|contains: '.stage.123456.'
    filter1:
        QueryName|endswith: '.example.com' # example filter
    condition: 1 of selection* and not filter1
falsepositives:
    - Unknown
fields:
    - Image
    - CommandLine
level: critical
Sigma rules are written in YAML, which is great for readability. Rules primarily consist of metadata fields that describe the rule (e.g., title, ID, description, tags), log source fields that say where to find the logs of interest (everything under the logsource key), and detection fields that define what to alert on and what not to alert on (everything under the detection key). If you’re new to YAML and want to start writing your own Sigma detection rules, be aware that YAML is both case- and space-sensitive.
The logsource key is made up of three potential fields: category, product, and service. category is used to select all log sources for a specific category of security tooling (e.g., firewall). product selects all logs from a given product (e.g., windows), and service narrows that down to a subset of the product’s logs (e.g., applocker). SigmaHQ provides fairly comprehensive documentation on what product/service/category pairings to use for different log types. For example, Windows AppLocker logs should use the following logsource definition:
logsource:
    product: windows
    service: applocker
The detection key is where the fun happens. Detections are made up of selections and filters. Define what you are looking for (alerting criteria) with one or more selections, and define exclusions with one or more filters. The names of selection and filter keys are arbitrary: you could use “selection1” per the example, or even “doughnut” (don’t do this), so long as you refer to “doughnut” in the condition definition. Comprehensive documentation on the available modifiers (e.g., startswith, contains, and endswith, from the example rule) is on the Sigma Modifiers documentation page.
Then, under the condition key, you define what to alert on using the Sigma Conditions syntax. In the example above, the condition “1 of selection* and not filter1” is equivalent to the following statement:
Alert when the DNS query name starts with ‘aaa.stage.’ or ‘post.1’, or contains ‘.stage.123456.’, but do not alert when the DNS query name ends with ‘.example.com’.
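If it helps to see that logic spelled out, here is a minimal Python sketch that hand-translates the example rule’s detection section and condition. It is a toy evaluator for illustration only, not how pySigma or any backend works internally; the function names and the sample events are made up. It assumes Sigma’s default case-insensitive string matching.

```python
# Detection section of the example rule, written out as a plain dict.
DETECTION = {
    "selection1": {"QueryName|startswith": ["aaa.stage.", "post.1"]},
    "selection2": {"QueryName|contains": ".stage.123456."},
    "filter1": {"QueryName|endswith": ".example.com"},
}

# Each modifier compares one event value against one pattern.
MODIFIERS = {
    "": lambda value, pattern: value == pattern,
    "startswith": lambda value, pattern: value.startswith(pattern),
    "contains": lambda value, pattern: pattern in value,
    "endswith": lambda value, pattern: value.endswith(pattern),
}


def matches(block, event):
    """True when every field in the block matches (a list of values is OR'd)."""
    for key, expected in block.items():
        field, _, modifier = key.partition("|")
        value = str(event.get(field, "")).lower()
        patterns = expected if isinstance(expected, list) else [expected]
        test = MODIFIERS[modifier]
        if not any(test(value, pattern.lower()) for pattern in patterns):
            return False
    return True


def should_alert(event):
    """Hand-translation of the condition: 1 of selection* and not filter1."""
    selected = any(
        matches(block, event)
        for name, block in DETECTION.items()
        if name.startswith("selection")
    )
    return selected and not matches(DETECTION["filter1"], event)
```

A query name such as “post.1.example.com” satisfies selection1 but also matches filter1, so should_alert returns False for it: the filter always wins over the selections.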
For official documentation on Sigma rule syntax, review the Sigma specification.
Eliminating Redundant Work with SigmaCLI and sigconverter
SigmaHQ maintains SigmaCLI, a Python-based Sigma conversion tool. It can be installed quickly and easily at the command line via pip.
python -m pip install sigma-cli
Once installed, using SigmaCLI is as simple as invoking the sigma command with appropriate arguments. You can convert rules into different backends (SIEM query languages), leveraging different pipelines (e.g., environment-specific mappings).
sigma convert -t <backend> -p <processing pipeline 1> -p <processing pipeline 2> [...] <directory or file>
The Sigma community has created several plugins that provide pipelines and backends for use. You can list and install them using the sigma plugin command. Below are some example commands:
sigma plugin list
sigma plugin list --plugin-type pipeline
sigma plugin install <backend|pipeline>
For example, to convert a Sigma threat hunting rule that I created to detect potentially suspicious Azure Front Door connections to Kusto Query Language for Microsoft Defender for Endpoint’s EDR logs, I would use the following command:
sigma convert -t kusto -p microsoft_xdr net_connection_win_susp_azurefd_connection.yml
This would produce the following result:
Parsing Sigma rules [####################################] 100%
DeviceNetworkEvents
| where RemoteUrl contains "azurefd.net" and (not(((InitiatingProcessFolderPath endswith "msedge.exe" or InitiatingProcessFolderPath endswith "chrome.exe" or InitiatingProcessFolderPath endswith "msedgewebview2.exe" or InitiatingProcessFolderPath endswith "firefox.exe" or InitiatingProcessFolderPath endswith "brave.exe" or InitiatingProcessFolderPath endswith "vivaldi.exe" or InitiatingProcessFolderPath endswith "opera.exe" or InitiatingProcessFolderPath endswith "chromium.exe") or InitiatingProcessFolderPath endswith "searchapp.exe" or (RemoteUrl contains "afdxtest.z01.azurefd.net" or RemoteUrl contains "fp-afd.azurefd.net" or RemoteUrl contains "fp-afdx-bpdee4gtg6frejfd.z01.azurefd.net" or RemoteUrl contains "roxy.azurefd" or RemoteUrl contains "powershellinfraartifacts-gkhedzdeaghdezhr.z01.azurefd.net" or RemoteUrl contains "storage-explorer-publishing-feapcgfgbzc2cjek.b01.azurefd.net" or RemoteUrl contains "graph.azurefd.net"))))
If my environment also included an Elastic SIEM that leveraged Windows Sysmon logs, I could use the following command:
sigma convert -t elastalert -p sysmon net_connection_win_susp_azurefd_connection.yml
This would produce the following result:
Parsing Sigma rules [####################################] 100%
description: Detects connections with Azure Front Door (known legitimate service that can be leveraged for C2)
  that fall outside of known benign behavioral baseline (not using common apps or common azurefd.net endpoints)
name: Potential Suspicious Azure Front Door Connection
index: "*"
filter:
- query:
    query_string:
      query: EventID:3 AND (DestinationHostname:*azurefd.net* AND (NOT ((Image:(*msedge.exe OR *chrome.exe OR *msedgewebview2.exe OR *firefox.exe OR *brave.exe OR *vivaldi.exe OR *opera.exe OR *chromium.exe)) OR Image:*searchapp.exe OR (DestinationHostname:(*afdxtest.z01.azurefd.net* OR *fp\-afd.azurefd.net* OR *fp\-afdx\-bpdee4gtg6frejfd.z01.azurefd.net* OR *roxy.azurefd* OR *powershellinfraartifacts\-gkhedzdeaghdezhr.z01.azurefd.net* OR *storage\-explorer\-publishing\-feapcgfgbzc2cjek.b01.azurefd.net* OR *graph.azurefd.net*)))))
type: any
While the readability and efficiency of the converted rules have some definite opportunities for improvement, they are completely functional. With this, I could maintain a single rule in Sigma and convert it into any backend format needed for my environment. You could even leverage detection-as-code and CI/CD to automagically roll detections out to your environment.
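As a rough idea of what the detection-as-code angle could look like, here is a small Python sketch that plans one sigma convert invocation per target for a single rule. It only uses the flags and the backend and pipeline names from the commands above; the TARGETS list and the function names are my own illustration, and a real CI job would still need to run the commands and ship the output to each SIEM.

```python
import shlex

# Targets for this hypothetical environment: (backend, [pipelines]).
# The names are the ones used in the two conversion commands above.
TARGETS = [
    ("kusto", ["microsoft_xdr"]),
    ("elastalert", ["sysmon"]),
]


def build_convert_command(backend, pipelines, rule_path):
    """Return the argv list for one `sigma convert` invocation."""
    argv = ["sigma", "convert", "-t", backend]
    for pipeline in pipelines:
        argv += ["-p", pipeline]
    argv.append(rule_path)
    return argv


def plan_conversions(rule_path, targets=TARGETS):
    """One command per target, so a single Sigma rule feeds every backend."""
    return [
        build_convert_command(backend, pipelines, rule_path)
        for backend, pipelines in targets
    ]


if __name__ == "__main__":
    for argv in plan_conversions("net_connection_win_susp_azurefd_connection.yml"):
        # A CI job could call subprocess.run(argv, check=True) here instead.
        print(shlex.join(argv))
```

Building the argument list separately from running it keeps the plan easy to review and test before anything touches a production SIEM.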
sigconverter is another tool you can use to perform Sigma conversions. In my experience, it is not as reliable as SigmaCLI, but one definite advantage is its integration into the Sigma Visual Studio Code extension. I find this most useful for spotting, at a glance, whether my detection rule produces the desired output and is free of formatting errors.
I recommend reviewing the extension’s settings in your user settings.json file to configure the backends and pipelines of interest.
Sigma for Detection Ideation
A positive side effect of the Sigma project’s existence is a huge opportunity for ideation. If you ever find yourself stumped on what might be valuable to write new detection rules for, reviewing the library of thousands of rules in the primary SigmaHQ GitHub repository can be a wonderful source of inspiration. Not only can the individual rules be highly informative about interesting things to look for, but because the Sigma rule specification includes a field for references, you can often find excellent new sources of threat information. Consider the below example:
title: AppX Package Installation Attempts Via AppInstaller.EXE
id: 7cff77e1-9663-46a3-8260-17f2e1aa9d0a
related:
    - id: 180c7c5c-d64b-4a63-86e9-68910451bc8b
      type: derived
status: test
description: |
    Detects DNS queries made by "AppInstaller.EXE". The AppInstaller is the default handler for the "ms-appinstaller" URI. It attempts to load/install a package from the referenced URL
references:
    - https://twitter.com/notwhickey/status/1333900137232523264
    - https://lolbas-project.github.io/lolbas/Binaries/AppInstaller/
author: frack113
date: 2021-11-24
modified: 2023-11-09
tags:
    - attack.command-and-control
    - attack.t1105
logsource:
    product: windows
    category: dns_query
detection:
    selection:
        Image|startswith: 'C:\Program Files\WindowsApps\Microsoft.DesktopAppInstaller_'
        Image|endswith: '\AppInstaller.exe'
    condition: selection
falsepositives:
    - Unknown
level: medium
Just by glancing at this rule, if we were not previously aware, we have learned that AppInstaller.exe is a tool that can be used by threat actors to load/install potentially malicious packages, and we’ve learned about a great resource on how threat actors attempt to remain stealthy by living-off-the-land: the LOLBAS Project!
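If you want to mine references in bulk, a quick-and-dirty sketch like the one below can pull the URLs out of a rule’s references list. It deliberately uses only the Python standard library and a simple line scan, which assumes the usual layout of one “- https://...” item per line under a top-level references key; for anything serious, a proper YAML parser would be the safer choice. The function name and the trimmed sample text are mine.

```python
import re

# A list item that is nothing but a URL, e.g. "    - https://example.org/x".
URL_ITEM = re.compile(r"^\s*-\s*(https?://\S+)\s*$")
# Any unindented "key:" line, which ends the references list.
TOP_LEVEL_KEY = re.compile(r"^[A-Za-z_][\w-]*:")


def extract_references(rule_text):
    """Collect the URLs listed under a rule's top-level `references:` key."""
    urls = []
    in_references = False
    for line in rule_text.splitlines():
        if line.startswith("references:"):
            in_references = True
        elif in_references:
            match = URL_ITEM.match(line)
            if match:
                urls.append(match.group(1))
            elif TOP_LEVEL_KEY.match(line):
                in_references = False  # the next top-level key closes the list
    return urls


# Trimmed copy of the rule above, just enough to exercise the function.
SAMPLE_RULE = """\
title: AppX Package Installation Attempts Via AppInstaller.EXE
references:
    - https://twitter.com/notwhickey/status/1333900137232523264
    - https://lolbas-project.github.io/lolbas/Binaries/AppInstaller/
author: frack113
tags:
    - attack.command-and-control
"""

if __name__ == "__main__":
    for url in extract_references(SAMPLE_RULE):
        print(url)
```

Run over a local clone of a rules repository, something like this could give you a deduplicated reading list of blogs and reports to work through.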
Advanced Sigma
If you try to use Sigma in production for any extended period of time, you’ll notice two things:
- Sigma rules don’t always convert, or don’t convert perfectly, for the log sources you actually have available in your SIEM
- The default format for a Sigma rule only allows for simple, “atomic” detections
The remedies for these two issues come in the form of pipelines and Sigma Correlations.
Pipelines
I like to think of Sigma pipelines as intermediaries between a Sigma rule and a backend SIEM language. These are most often valuable for making sure that fields are mapped to the correct field names and values. For example, a Sigma rule may be written for Windows event logs, but your SIEM may be ingesting Sysmon-based Windows logs instead of native event logs. If you convert a Sigma rule looking for process creation to your SIEM backend without applying the common Sysmon pipeline, you’ll end up with a rule that triggers on event ID 4688 instead of Sysmon’s event ID 1.
My Microsoft Sentinel test environment has a custom Syslog-based DNS log source that is presented as a table named IsaacDNS. If I try to convert a Sigma rule with the Kusto pySigma backend using only the pipelines natively available from the package, I run into an issue:
sigma convert -t kusto -p azure_monitor dns_query_win_mal_cobaltstrike.yml
Parsing Sigma rules [####################################] 100%
Error: Error while conversion: Unable to determine table name for category: dns_query, category is not yet supported by the pipeline. Please provide the 'query_table' parameter to the pipeline instead.
This is saying that the pipeline does not know what table to reference for a DNS request. While this will likely be better mitigated over time as the backend is further developed, we would still need to write our own pipeline to map to our own log source, as it is custom and bespoke. Here’s a quick fix pipeline I threw together to be able to leverage my custom DNS log source:
name: "Isaac's Demo Pipeline"
priority: 1
transformations:
    - id: category_dns_to_IsaacDNS
      type: set_state
      key: "query_table"
      val: ["IsaacDNS"]
      rule_conditions:
          - type: logsource
            category: dns_query
    - id: fix_field_name_mappings
      type: field_name_mapping
      mapping:
          QueryName: DnsQuery
With the above pipeline, running the following conversion command results in a successful conversion, leveraging my custom IsaacDNS log source and the correct custom column name, QueryName. Essentially, what the pipeline has done is set the query_table field to be equal to IsaacDNS for Sigma rules that are of the log source category dns_query. These pipelines can be written in YAML or, for more sophisticated use cases such as pySigma backend development, in Python.
sigma convert -t kusto -p azure_monitor -p demo_pipeline.yml dns_query_win_mal_cobaltstrike.yml
The result is shown below:
IsaacDNS
| where (QueryName startswith "aaa.stage." or QueryName startswith "post.1") or QueryName contains ".stage.123456."
This is an extremely simple sample, and this quick fix of creating a pipeline to accommodate a specific rule is not a great showcase of the purpose of Sigma (what’s the point of an agnostic detection language if you need to write a million pipelines?). However, with strategy and planning, you can create a small number of robust pipelines that successfully convert Sigma rules for all of your environment-specific niche log sources and accommodate common errors (e.g., rules that expect fields you simply don’t have).
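To make the “intermediary” idea a bit more concrete, here is a toy Python model of a pipeline: an ordered list of transformations, each optionally restricted to rules whose logsource matches. This is only a sketch of the concept and not pySigma’s actual implementation; the dict layout, the function names, and the Image-to-ProcessPath mapping are all hypothetical.

```python
# Toy model of a processing pipeline: transformations are applied in order,
# and each one may be limited to rules with a matching logsource.
PIPELINE = [
    {
        "type": "set_state",
        "key": "query_table",
        "val": "IsaacDNS",
        "rule_conditions": {"category": "dns_query"},
    },
    {
        "type": "field_name_mapping",
        # Hypothetical mapping, for illustration only.
        "mapping": {"Image": "ProcessPath"},
    },
]


def applies(transformation, logsource):
    """A transformation applies when all of its logsource conditions match."""
    conditions = transformation.get("rule_conditions", {})
    return all(logsource.get(key) == value for key, value in conditions.items())


def run_pipeline(rule, pipeline=PIPELINE):
    """Return (state, fields): the conversion state and the renamed field names."""
    state = {}
    fields = list(rule["fields"])
    for step in pipeline:
        if not applies(step, rule["logsource"]):
            continue
        if step["type"] == "set_state":
            state[step["key"]] = step["val"]
        elif step["type"] == "field_name_mapping":
            fields = [step["mapping"].get(name, name) for name in fields]
    return state, fields
```

In this model, a rule with the dns_query category ends up with query_table set to IsaacDNS, while a process_creation rule leaves the state empty, which mirrors the “unable to determine table name” situation from the error above.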
You can find more information on how pipelines work from SigmaHQ here. I also recommend reviewing an open source pySigma backend project’s pipelines, such as the Kusto backend from this example, as those are some of the most robust and sophisticated examples of pipelines you can draw inspiration from.
Correlation Rules
Presently, >99% of Sigma rules that you will find are what Eric Capuano would call “atomic detection rules,” as opposed to “stateful detection rules.” This graphic from his blog post on the matter (a highly recommended read) describes the difference perfectly. ![[EricCapuanoStatefulRules.png]]
Stateful detection rules have recently been made available in Sigma through “Sigma Correlations.” Unfortunately, at the time of writing, Sigma Correlations are only available in Elasticsearch’s ES|QL and Splunk’s SPL query languages. Below is a simple example:
title: Windows Failed Logon Event
name: failed_logon # Rule Reference
description: Detects failed logon events on Windows systems.
logsource:
    product: windows
    service: security
detection:
    selection:
        EventID: 4625
    condition: selection
---
title: Multiple failed logons for a single user (possible brute force attack)
correlation:
    type: event_count
    rules:
        - failed_logon # Referenced here
    group-by:
        - TargetUserName
        - TargetDomainName
    timespan: 5m
    condition:
        gte: 10
The biggest difference between a “normal” rule and a correlation rule in Sigma is that there are, in a way, multiple rules in one rule. The first rule, “failed_logon,” is referenced by the second “rule,” which counts the matches of the failed_logon rule grouped by TargetUserName and TargetDomainName over a five-minute timespan. Essentially, an alert would fire if a particular account on a particular domain failed to log in ten or more times within five minutes.
This is a game changer for Sigma and could drive significantly wider adoption. I don’t think it’s possible (or even a good idea) for Sigma to strive to support the most advanced features provided by the most advanced query languages (e.g., Sentinel geoinfo for IP addresses via the geo_info_from_ip_address function). However, almost all SIEM languages support some level of correlation, and some of the best detections are only made possible by the ability to correlate, aggregate, rank, stack, and perform other similar actions on logs.
Additional Resources
If you’re interested in learning more about or leveraging Sigma, the first place to look would be SigmaHQ’s Getting Started page. Be sure to check out the wide variety of rules available on the official GitHub repo.
If you’re interested in training on Sigma, I’ve taken and strongly recommend Josh Brower’s Detection Engineering with Sigma course from Applied Network Defense. You’ll learn Sigma fundamentals, review case studies that leverage real logs from Windows Event Logs, Zeek, Sysmon, and AWS CloudTrail, and learn how to use Sigma in production.