This repository provides a script to convert Sigma rules to Elastalert-compatible rules.

Prerequisites:

- Python 3.6 or later
- Git
- A Unix-like shell (e.g., Bash)
1. Create a New Python Virtual Environment and Activate It

```sh
python3 -m venv myenv
source myenv/bin/activate
```
2. Upgrade pip

```sh
pip install --upgrade pip
```
3. Create a Directory for Elastalert Rules and Clone This Repository

```sh
git clone https://github.com/m-aouzal/sigma_to_elastalert.git
cd sigma_to_elastalert
```
4. Clone the Sigma Repository

This step downloads the official Sigma rules:

```sh
git clone https://github.com/Neo23x0/sigma.git
```
5. Install Sigma CLI

Install the sigma CLI tool from PyPI:

```sh
python3 -m pip install sigma-cli
```
6. Install pySigma (for Compatibility Verification)

```sh
pip install pySigma
```
7. Install the Elasticsearch Backend for pySigma

```sh
pip install pySigma-backend-elasticsearch
```
Note: This script uses GNU Parallel to run file conversions concurrently, which significantly speeds up processing when dealing with many files. Please ensure that GNU Parallel is installed on your system before running the conversion command. On Debian or Ubuntu systems, you can install it with:

```sh
sudo apt-get install parallel
```

Sequential Execution Alternative:
This script is optimized for speed using GNU Parallel to convert files concurrently. However, if you prefer not to install GNU Parallel or want to run the conversion sequentially, you can modify the command to process files one by one. Instead of running:
```sh
find "$RULE_DIR" -type f -name "*.yml" | parallel -j "$JOBS" process_rule {}
```

you can use a simple loop, such as:
```sh
find "$RULE_DIR" -type f -name "*.yml" | while read -r file; do
  process_rule "$file"
done
```

This loop runs the conversion function on each file sequentially. Keep in mind that processing files one at a time may take longer if you have many files to convert.
For other platforms or additional installation details, please refer to GNU Parallel's official documentation.

8. Make the Conversion Script Executable and Run It
```sh
chmod +x convert.sh
./convert.sh
```

Note: You can change the input directory (where your Sigma rules are located) and the output directory (where converted Elastalert rules will be saved) by editing the `convert.sh` script.
Once the conversion script has executed, check the output directory for Elastalert-compatible rule files. You can then integrate these rules into your Elastalert setup.
This folder contains a pipeline that:
- Converts Sigma rules from the `sigma/rules/windows` folder into Elastalert-compatible YAML files.
- Performs partial transformations on those converted files (e.g. lowercasing certain values, converting `description:` lines to block scalars, replacing `index: "*"` with `index: "winlogbeat-*"`, and appending `.keyword` to ECS fields in the query strings).
- Prepares final Elastalert rules that can be used in an Elasticsearch + Kibana environment.
Below is an overview of the important files/folders and how they all fit together:
- `check_keywords.py`
  This script (if present) may have once been used to verify or debug `.keyword` usage. It's not directly invoked by `convert.sh` now, but it can be repurposed for checking fields.
- `pySigma-backend-elasticsearch/`
  A custom or experimental Sigma backend for Elasticsearch. Typically, this is used in tandem with `sigma convert -t elastalert ...`. However, note that advanced features or new ES versions (like Elasticsearch 8+) may cause issues in older pySigma code. This is partly why the `.keyword` transformations were introduced: so that older Sigma backends can still produce functional queries on new indexes.
- `converted_sigma_rules_to_elastalert/`
  This is the output folder where `convert.sh` places the final Elastalert rules after transformation. The subfolders mirror the structure from `sigma/rules/windows`. If everything succeeds, you'll find `.yml` files in here containing the final rules.
- `convert.sh`
  The main script that orchestrates the entire pipeline. It:
  - Loads each `.yml` Sigma rule from `RULE_DIR` (by default `./sigma/rules/windows/`).
  - Runs `sigma convert` on them, partially lowercasing the detection block values.
  - Fixes the `description:` lines by turning them into a YAML block scalar.
  - Replaces `index: "*"` with `index: "winlogbeat-*"`.
  - Uses AWK to scan lines that start with `query:` or `query_string:` and appends `.keyword` to ECS fields in that line, but only if they match a certain pattern (e.g. `winlog.channel`) and are not exempt.
- `non_keyword_fields`
  This file contains a list of fields that should not have `.keyword` appended. If a field is listed here, `convert.sh`'s final AWK transformation will skip adding `.keyword` to it.
In older or simpler Sigma backends, you might want a case-insensitive search. However, because these fields become keywords in Elasticsearch, you want them stored in a consistent lowercased format. Specifically:
- The script uses sed to:
  - Enter the detection block: from `detection:` until the next top-level key.
  - Skip lines starting with `condition:` (so the condition is not lowercased).
  - Lowercase everything else in that block.
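The pass above can be sketched as follows. This is an illustration of the idea only, written with awk for readability; the actual `convert.sh` implements it with sed, and its exact expressions may differ:

```sh
# Lowercase a Sigma rule's detection block, leaving the condition: line
# and everything outside the block untouched. Sketch only.
lowercase_detection() {
  awk '
    /^detection:/            { inblock = 1; print; next }  # enter the block
    inblock && /^[A-Za-z_]/  { inblock = 0 }               # next top-level key ends it
    inblock && !/condition:/ { print tolower($0); next }   # lowercase block lines
    { print }                                              # pass everything else through
  '
}
```

Piping a rule through `lowercase_detection` turns a block line like `CommandLine: WHOAMI /ALL` into `commandline: whoami /all`, while `condition: selection` and keys outside the block are left alone.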
Why lowercasing?
- Keywords in Elasticsearch are exact-match. If you want consistent, case-insensitive searching, it's easiest to store them in lowercased form. Alternatively, if a normalizer is used, it can forcibly lowercase values at index time.
YAML's plain scalar lines can break if they contain certain punctuation or colons. The script uses AWK to:
- Detect lines starting with `description:`
- Output `description: |`
- Indent subsequent lines of the description so it is recognized as a multi-line block scalar.
Originally, Sigma rules might have `index: "*"`. But many users store logs in a named pattern like `winlogbeat-*`. The script finds lines that begin with `index:` and replaces `index: "*"` with `index: "winlogbeat-*"`. This helps ensure the generated Elastalert rules only target the correct indices.
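That substitution amounts to a one-line sed; the exact expression in `convert.sh` may differ, but the idea is:

```sh
# Rewrite the catch-all index pattern into the winlogbeat one.
# Sketch; lines that already name a specific index are left unchanged.
rewrite_index() {
  sed 's/^index: "\*"/index: "winlogbeat-*"/'
}

printf 'index: "*"\n' | rewrite_index   # -> index: "winlogbeat-*"
```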
Because newer versions of Elasticsearch do not accept partial text searching on keyword fields (and older pySigma backends may rely on field:value logic that doesn't map well to new text fields), we do the following:
- AWK scans lines starting with `query:` or `query_string:`.
- We find tokens that look like `field:` or `field.xyz:`.
- If the field matches the ECS-like pattern (`^[a-zA-Z_]+(\.[a-zA-Z0-9_]+)+$`) and is not in an exemption list, we append `.keyword`. For example, `winlog.channel` becomes `winlog.channel.keyword`.
- This ensures that searching on certain fields is done as a keyword, not as a text field. The script helps older rules remain functional in new ES setups.
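The effect of that pass can be approximated with a sed one-liner. This is a sketch only: the real AWK in `convert.sh` also consults the `non_keyword_fields` exemption list, which is omitted here:

```sh
# On query:/query_string: lines, append .keyword to dotted ECS-style
# field names, e.g. winlog.channel: -> winlog.channel.keyword:
# Sketch: the non_keyword_fields exemption list is not applied.
append_keyword() {
  sed -E '/^[[:space:]]*(query|query_string):/ s/([a-zA-Z_]+(\.[a-zA-Z0-9_]+)+):/\1.keyword:/g'
}
```

Note that `query:` itself is not rewritten, because the pattern requires at least one dotted segment in the field name.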
The script uses GNU Parallel to process multiple rules concurrently (-j 10 by default). If any file fails, it logs that in ./failed_conversion. At the end, it prints a summary of failed or invalid ones.
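The per-file error handling can be sketched like this (hypothetical names: `convert_one` stands in for the actual `sigma convert` invocation inside `convert.sh`):

```sh
# Record rules that fail to convert so they can be summarized at the end.
# convert_one is a hypothetical stand-in for the real conversion command.
convert_one() { sigma convert -t elastalert "$1" > /dev/null 2>&1; }

process_rule() {
  if ! convert_one "$1"; then
    echo "$1" >> ./failed_conversion   # log the failing rule's path
  fi
}
```

GNU Parallel (or the sequential loop shown earlier) then simply calls `process_rule` once per `.yml` file, and the summary step reads `./failed_conversion` back.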
- `.keyword`
  In modern Elasticsearch, fields stored as `text` are analyzed and used for full-text search. If you want an exact-match filter (like `field:"someValue"`), it's more stable to store them as a keyword field. That's why we do `field.keyword`. This is standard in the ECS (Elastic Common Schema) world, especially if you want to do exact matching or aggregations.
- Lowercasing
  If you want case-insensitive searches (like matching `PROCESS.EXECUTABLE`, `Process.Executable`, or `process.executable` equally), the simplest approach is to store them all as lowercased keywords. Then you only have to search in lowercase. Alternatively, you can use a normalizer on the field to make everything lowercase at index time.
Below is an example snippet you can place in your index template or dev console to define a normalizer that lowercases all keyword fields:
```
PUT winlogbeat-*
{
  "settings": {
    "analysis": {
      "normalizer": {
        "lowercase_normalizer": {
          "type": "custom",
          "filter": ["lowercase"]
        }
      }
    }
  },
  "mappings": {
    "dynamic_templates": [
      {
        "strings_as_text_with_keyword": {
          "match_mapping_type": "string",
          "mapping": {
            "type": "text",
            "fields": {
              "keyword": {
                "type": "keyword",
                "normalizer": "lowercase_normalizer"
              }
            }
          }
        }
      }
    ]
  }
}
```

With this approach, the `.keyword` subfield is always stored in lowercase. That means searching for `field.keyword:"SomeValue"` is effectively searching for `somevalue`, making it case-insensitive.
If you have older indices (e.g. `winlogbeat-8.17.4`) that don't have the normalizer or the new field mappings, you can use:

```
POST _reindex
{
  "source": {
    "index": "winlogbeat-8.17.4"
  },
  "dest": {
    "index": "winlogbeat-8.17.4-2025.03.29"
  }
}
```

This copies data from `winlogbeat-8.17.4` to a new index `winlogbeat-8.17.4-2025.03.29`, which presumably has the new mapping (with the normalizer). Then you can point your queries or Kibana index patterns at that new index.
Hint: If you paste large JSON in Kibana Dev Tools, ensure it’s in pure JSON format, not YAML.
- `convert.sh` automates the process of converting and massaging Sigma rules into functional Elastalert rules for an ECS + Elasticsearch environment.
- We do partial lowercasing, block-scalar descriptions, index rewriting, and `.keyword` appending.
- The approach helps older or simpler Sigma backends remain useful with new ES 8 versions.
- For actual data, you might need to define a normalizer or reindex to ensure your fields exist as lowercased keywords. Otherwise, your queries might not match.
We hope this clarifies how to use the scripts and why .keyword plus lowercasing is important for stable, case-insensitive searching in modern Elasticsearch.
If you have any questions, reach out to me at: [email protected]