HashTraceAI is a lightweight tool for generating and verifying file-level manifests for machine learning models. It calculates cryptographic hashes of files in a model directory, produces a JSON manifest, and verifies those hashes to detect drift, tampering, or unintended changes.
HashTraceAI helps security teams verify the integrity and provenance of machine learning model artifacts in CI/CD pipelines or production deployments. By creating a manifest with cryptographic hashes of each file, teams can quickly detect drift, unauthorized modification, or corruption during storage or transmission.
- Generates a file-level manifest from any directory or model hub (Hugging Face, MLflow)
- Uses SHA-256 for secure hashing
- Supports RSA-signed manifests for tamper detection and authenticity verification
- Uses password-encrypted private keys for secure signing operations
- Verifies model files against a previously generated manifest
- Verifies the manifest's digital signature to prove authenticity
- CLI output supports JSON or colorized text format
- Produces portable JSON output suitable for automation
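The manifest-generation idea described above can be illustrated with a minimal sketch. It is not HashTraceAI's actual implementation: the function names `sha256_of_file` and `build_manifest` are hypothetical, and the field names are borrowed from the manifest structure shown later in this README.

```python
import hashlib
import os
from datetime import datetime, timezone


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a file in fixed-size chunks so large weight files are never fully loaded."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(model_dir, model_name, model_version, created_by):
    """Walk model_dir and record one entry per file (hypothetical helper)."""
    entries = []
    for root, dirs, files in os.walk(model_dir):
        dirs.sort()  # visit subdirectories in a deterministic order
        for filename in sorted(files):
            full_path = os.path.join(root, filename)
            # Store paths relative to the model directory, with forward slashes
            rel_path = os.path.relpath(full_path, model_dir).replace(os.sep, "/")
            entries.append(
                {"name": filename, "path": rel_path, "sha256": sha256_of_file(full_path)}
            )
    return {
        "model_name": model_name,
        "model_version": model_version,
        "created_by": created_by,
        "timestamp_utc": datetime.now(timezone.utc).isoformat(),
        "files": entries,
    }
```

Calling `build_manifest("./your-model-dir", "My Model", "1.0", "Your Name")` and passing the result to `json.dumps` would yield a document of roughly the same shape as the tool's manifest. Sorting the traversal keeps the output stable across runs, which makes manifests easier to diff.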
HashTraceAI supports secure MLOps practices aligned with ISO/IEC 42001:2023 by enabling traceability, integrity verification, provenance validation, and cryptographic signature verification of model artifacts. These features contribute to:
- Clause 5.3 – Roles and responsibilities: Ensures teams can clearly define and enforce responsibility for model integrity and approval workflows.
- Clause 6.1.2 – Risk treatment plan: Supports the detection of unauthorized model drift or tampering via manifest verification and optional digital signatures.
- Clause 7.5 – Documented information: Allows for automated and cryptographically verifiable documentation of model versions and components in regulated environments.
- Clause 8.2.1 – Data and AI system integrity: Confirms that deployed models match the validated and approved versions using strong file-level hashing and signature verification.
- Clause 8.3 – Operational planning and control: Integrates into CI/CD pipelines to enforce provenance and integrity checks for models sourced internally or from third parties (e.g., Hugging Face, MLflow).
Disclaimer: While HashTraceAI aligns with ISO 42001 principles, its use alone does not ensure compliance. Organizations should evaluate it as part of a broader AI management and risk governance program.
Clone the repository and install required dependencies:
git clone https://github.com/vsheahan/hashtraceai.git
cd hashtraceai
pip install -r requirements.txt
Here is the recommended three-step process for ensuring maximum security and authenticity.
First, create a password-protected private key and a corresponding public key. The private key will be used for signing, and the public key will be used for verification.
You will be prompted to create and confirm a password for the private key.
python3 cli.py keys --name my_key --out-dir keys

Keep your private key file (keys/my_key.pem) and its password secret! The public key (keys/my_key.pub) can be distributed freely.
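For readers curious what a password-protected key pair looks like in code, here is a minimal sketch using the `cryptography` package listed in the requirements. The helper name `generate_key_pair`, the 2048-bit key size, and the PKCS#8 and SubjectPublicKeyInfo PEM formats are assumptions for illustration; the README does not state which parameters the `keys` command uses.

```python
from cryptography.hazmat.primitives import serialization
from cryptography.hazmat.primitives.asymmetric import rsa


def generate_key_pair(password: bytes, key_size: int = 2048):
    """Return (encrypted_private_pem, public_pem) as bytes (hypothetical helper)."""
    private_key = rsa.generate_private_key(public_exponent=65537, key_size=key_size)
    # The private key is encrypted with the password before it ever touches disk
    private_pem = private_key.private_bytes(
        encoding=serialization.Encoding.PEM,
        format=serialization.PrivateFormat.PKCS8,
        encryption_algorithm=serialization.BestAvailableEncryption(password),
    )
    # The public key carries no secret and is serialized unencrypted
    public_pem = private_key.public_key().public_bytes(
        encoding=serialization.Encoding.PEM,
        format=serialization.PublicFormat.SubjectPublicKeyInfo,
    )
    return private_pem, public_pem
```

The encrypted PEM can later be reloaded with `serialization.load_pem_private_key(private_pem, password=...)`, which is why the password is needed again at signing time.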
Next, generate a manifest for your model. The --sign-key flag will use your private key to create a digital signature. You will be prompted for the password you created in Step 1.
python3 cli.py generate --path ./your-model-dir --created-by "Your Name" --model-name "My Model" --model-version "1.0" --sign-key keys/my_key.pem

This command creates a single manifest file (e.g., My Model_1.0_manifest.json) that includes the file list, their hashes, and the digital signature.
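A rough sketch of how a manifest could be signed and checked with RSA follows. Several details are assumptions, because the README does not specify them: the signature covers a canonical JSON encoding of every field except `signature`, the padding scheme is PSS with SHA-256, and the signature is stored base64-encoded. The helper names are hypothetical.

```python
import base64
import json

from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# Assumed padding scheme; the tool itself may use a different one
PSS = padding.PSS(mgf=padding.MGF1(hashes.SHA256()), salt_length=padding.PSS.MAX_LENGTH)


def canonical_bytes(manifest):
    """Serialize every field except 'signature' in a stable byte form."""
    unsigned = {k: v for k, v in manifest.items() if k != "signature"}
    return json.dumps(unsigned, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_manifest(manifest, private_key):
    """Return a copy of the manifest with a base64 'signature' field added."""
    signature = private_key.sign(canonical_bytes(manifest), PSS, hashes.SHA256())
    return {**manifest, "signature": base64.b64encode(signature).decode("ascii")}


def signature_is_valid(manifest, public_key):
    """Return True only if the signature matches the manifest contents."""
    try:
        public_key.verify(
            base64.b64decode(manifest["signature"]),
            canonical_bytes(manifest),
            PSS,
            hashes.SHA256(),
        )
        return True
    except InvalidSignature:
        return False


# Usage with a throwaway in-memory key; real use would load the encrypted PEM
demo_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
signed = sign_manifest({"model_name": "My Model", "files": []}, demo_key)
print(signature_is_valid(signed, demo_key.public_key()))
```

Canonicalizing before signing matters: without a fixed key order and separator style, two semantically identical manifests could serialize differently and fail verification.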
Finally, anyone with the public key can verify the integrity of the model files and the authenticity of the manifest. This command checks both that the files haven't changed and that the signature is valid.
python3 cli.py verify --manifest-file your-model-dir/My\ Model_1.0_manifest.json --public-key keys/my_key.pub

| Scenario | Command | Purpose |
|---|---|---|
| Generate an encrypted key pair | python3 cli.py keys --name my_key --out-dir keys | Create a secure, password-protected key pair for signing. |
| Generate and sign a manifest | python3 cli.py generate --path ./model --created-by "Your Name" --sign-key keys/my_key.pem --model-name "MyModel" --model-version "1.0" | Prove authenticity with a digital signature, requires password. |
| Verify files and signature | python3 cli.py verify --manifest-file manifest.json --directory ./model --public-key keys/my_key.pub | Confirm that files are unchanged and the manifest is authentic. |
| Download from Hugging Face | python3 hf_downloader.py --model-id "distilbert-base-uncased" --created-by "Your Name" --sign-key keys/my_key.pem | Download HF model and generate signed manifest in local hf/ directory. |
| Generate manifest for MLflow model | python3 cli.py generate --path mlruns/0/[run_id]/artifacts/model --created-by "Your Name" --sign-key keys/my_key.pem --model-name "MLflowModel" --model-version "1.0" | Hash MLflow model artifacts and create signed manifest. |
| JSON output for CI/CD integration | python3 cli.py verify --manifest-file manifest.json --directory ./model --public-key keys/my_key.pub --format json | Get structured log output for automation. |
You can download a model from the Hugging Face Hub and generate a signed manifest for it in one command:
python3 hf_downloader.py --model-id "distilbert-base-uncased" --created-by "Your Name" --sign-key keys/my_key.pem --model-version "1.0"

This command will:
- Download the specified model from Hugging Face Hub (stored in cache)
- Generate a signed manifest with SHA-256 hashes of all model files
- Save the manifest in the local hf/ directory for easy access
- Provide the exact verification command for testing
The manifest will be saved as hf/[model-name]_[version]_manifest.json in your project directory.
HashTraceAI supports MLflow model artifacts. Generate manifests for MLflow models using:
# For models in mlruns directory
python3 cli.py generate \
--path mlruns/0/[run_id]/artifacts/[model_name] \
--created-by "Your Name" \
--sign-key keys/my_key.pem \
--model-name "MLflowModel" \
--model-version "1.0"
# For models in MLflow cache
python3 cli.py generate \
--path .cache/hashtraceai/mlflow/[model_hash] \
--created-by "Your Name" \
--sign-key keys/my_key.pem \
--model-name "CachedMLflowModel" \
--model-version "1.0"

Requirements:
- Python 3.8 or newer
- huggingface_hub (for Hugging Face model downloads)
- cryptography (for RSA signing)
- mlflow (for MLflow model artifacts)
- colorama
The generated manifest is a JSON file with the following structure:
{
"model_name": "My Model",
"model_version": "1.0",
"created_by": "Your Name",
"timestamp_utc": "2025-07-06T23:50:00.123456+00:00",
"timestamp_local": "2025-07-06T19:50:00.123456",
"files": [
{
"name": "file1.txt",
"path": "file1.txt",
"sha256": "..."
},
{
"name": "image.png",
"path": "data/image.png",
"sha256": "..."
}
],
"signature": "..."
}

If --sign-key is used, a manifest.json.sig file is created. It can be verified using the associated public key.
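Given the manifest structure above, the file-integrity half of verification can be sketched in a few lines. This is an illustration under stated assumptions, not the tool's own code: `verify_manifest` is a hypothetical helper, and it assumes each `path` is relative to the model directory and uses forward slashes.

```python
import hashlib
import os


def verify_manifest(manifest, model_dir):
    """Compare each manifest entry with the file on disk and bucket the results."""
    result = {"missing": [], "mismatched": [], "ok": []}
    for entry in manifest["files"]:
        # Rebuild the on-disk path from the manifest's relative, slash-separated path
        full_path = os.path.join(model_dir, *entry["path"].split("/"))
        if not os.path.isfile(full_path):
            result["missing"].append(entry["path"])
            continue
        digest = hashlib.sha256()
        with open(full_path, "rb") as handle:
            for chunk in iter(lambda: handle.read(1 << 20), b""):
                digest.update(chunk)
        bucket = "ok" if digest.hexdigest() == entry["sha256"] else "mismatched"
        result[bucket].append(entry["path"])
    return result
```

A deployment gate could load the manifest with `json.load` and treat any non-empty `missing` or `mismatched` list as a failure. Note that this sketch does not flag extra files that are present on disk but absent from the manifest.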
This project is licensed under the MIT License.
