A black-box fingerprinting system that identifies the underlying LLM model family (GPT, LLaMA, Mistral, etc.) by analyzing response patterns across 75 discriminative prompts. The system can also identify fine-tuned models, tracing them back to their base model family.
Note: Check `config.py` to see all identifiable model families.
An already-trained model is included in the `model` directory.
| Backend | Description | API Key Required |
|---|---|---|
| `ollama` | Local Ollama instance | ❌ No |
| `ollama-cloud` | Ollama Cloud API | ✅ `OLLAMA_CLOUD_API_KEY` |
| `openai` | OpenAI API (or compatible) | ✅ `OPENAI_API_KEY` |
| `gemini` | Gemini API (or compatible) | ✅ `GEMINI_API_KEY` |
| `deepseek` | Deepseek API (or compatible) | ✅ `DEEPSEEK_API_KEY` |
| `custom` | Custom HTTP request | ✅ `CUSTOM_API_KEY` |
```bash
pip install -r requirements.txt

# Or install as a package
pip3 install -e .

# Optional: Download NLTK data for text processing
python -c "import nltk; nltk.download('punkt_tab'); nltk.download('stopwords')"
```

```bash
# Identify model and fine-tuning
llm-fingerprinter identify -b ollama --model some-model
```
```bash
# Train your own classifier

# Fingerprint the LLM
llm-fingerprinter simulate --model llama3.2 --family llama

# Train on the Fingerprints
llm-fingerprinter train
```
```bash
llm-fingerprinter identify -r ./custom_request.txt --api-key <API_KEY>
# Example of custom request inside the example folder
```

```bash
# Ollama Cloud
export OLLAMA_CLOUD_API_KEY="your-key"
llm-fingerprinter simulate -b ollama-cloud --model llama3.2 --family llama

# OpenAI
export OPENAI_API_KEY="your-key"
llm-fingerprinter simulate -b openai --model gpt-4 --family gpt

# Gemini
export GEMINI_API_KEY="your-key"
llm-fingerprinter simulate -b gemini --model gemini-2.5-pro --family gemini

# DeepSeek
export DEEPSEEK_API_KEY="your-key"
llm-fingerprinter simulate -b deepseek --model deepseek-v3.2 --family deepseek

# Custom endpoint
export CUSTOM_API_KEY="your-key"
llm-fingerprinter simulate -b custom -e http://your-api.com/v1 --model your-model --family llama
```

| Option | Short | Default | Description |
|---|---|---|---|
| `--backend` | `-b` | `custom` | Backend: `ollama`, `ollama-cloud`, `openai`, `deepseek`, `gemini`, `custom` |
| `--endpoint` | `-e` | auto | API endpoint URL |
| `--api-key` | `-k` | env var | API key |
Run fingerprinting simulations for training data.

```bash
llm-fingerprinter simulate [OPTIONS]
```

| Option | Default | Description |
|---|---|---|
| `--model` | required | Model name |
| `--family` | required | Family: `gpt`, `claude`, `llama`, `gemini`, `mistral`, `qwen`, `gemma` |
| `--num-sims` | optional | Number of simulations |
| `--repeats` | optional | Prompt repeats per simulation |
Examples:
```bash
# Ollama local
llm-fingerprinter simulate --model llama3.2 --family llama

# Ollama Cloud
llm-fingerprinter simulate -b ollama-cloud --model llama3.2 --family llama

# OpenAI
llm-fingerprinter simulate -b openai --model gpt-4 --family gpt --num-sims 5

# Custom endpoint
llm-fingerprinter simulate -b openai -e https://api.groq.com/openai/v1 -k $GROQ_KEY --model llama-3.1-70b --family llama
```

Train classifier from saved fingerprints.
```bash
llm-fingerprinter train [--augment/--no-augment]
```

Identify model family using trained classifier.

```bash
llm-fingerprinter identify --model <model-name> [-b <backend>]
```

List available models on the API.

```bash
llm-fingerprinter list-models [-b <backend>]
```

List saved fingerprints by family.

```bash
llm-fingerprinter list-fingerprints
```

Show configuration and status.

```bash
llm-fingerprinter info
```

| Variable | Backend | Description |
|---|---|---|
| `OLLAMA_CLOUD_API_KEY` | ollama-cloud | Ollama Cloud API key |
| `OPENAI_API_KEY` | openai | OpenAI API key |
| `GEMINI_API_KEY` | gemini | Gemini API key |
| `DEEPSEEK_API_KEY` | deepseek | DeepSeek API key |
| `CUSTOM_API_KEY` | custom | Custom API key |
| `LOG_LEVEL` | all | Logging level (DEBUG, INFO, etc.) |
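
When `--api-key`/`-k` is not given, the key is resolved from the backend's environment variable listed above. The snippet below is a minimal sketch of that lookup; the function name and structure are assumptions for illustration, not the project's actual code.

```python
import os

# Hypothetical helper mirroring the documented fallback: an explicit
# --api-key/-k value wins, otherwise the backend's environment variable is read.
ENV_VARS = {
    "ollama-cloud": "OLLAMA_CLOUD_API_KEY",
    "openai": "OPENAI_API_KEY",
    "gemini": "GEMINI_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
    "custom": "CUSTOM_API_KEY",
}


def resolve_api_key(backend: str, cli_key: str | None = None) -> str | None:
    """Return the API key for a backend: CLI flag first, then environment."""
    if cli_key:
        return cli_key
    env_var = ENV_VARS.get(backend)
    return os.environ.get(env_var) if env_var else None  # local ollama needs no key
```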
- 75 Prompts across 3 layers:
  - Stylistic: Analyze writing style and formatting preferences
  - Behavioral: Assess response patterns and decision-making behavior
  - Discriminative: Identify model-specific characteristics and inconsistencies
- Feature Extraction: 384-dim embeddings + 12 linguistic + 6 behavioral features
- PCA reduction to 64 dimensions (Optional)
- Ensemble Classification: Random Forest (45%) + SVM (45%) + MLP (10%) (see the sketch after this list)
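
As a rough, non-authoritative illustration of this pipeline, the sketch below assembles a weighted soft-voting ensemble over PCA-reduced features with scikit-learn. It assumes `sentence-transformers`' `all-MiniLM-L6-v2` for the 384-dim embeddings, and the linguistic/behavioral features are simplified stand-ins; the names and hyperparameters are assumptions for the example, not the project's actual implementation.

```python
# Illustrative sketch only: assumes all-MiniLM-L6-v2 (384-dim embeddings) and
# scikit-learn for PCA + the weighted ensemble. Feature definitions below are
# simplified stand-ins, not the project's own.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # 384-dim sentence embeddings


def extract_features(response: str) -> np.ndarray:
    """Embed one response and append simple linguistic/behavioral stand-ins."""
    emb = embedder.encode([response])[0]                              # 384 dims
    words = response.split()
    linguistic = np.array([
        len(response), len(words), np.mean([len(w) for w in words] or [0]),
        response.count("."), response.count(","), response.count("!"),
        response.count("?"), response.count("\n"), response.count("-"),
        sum(w.isupper() for w in words), sum(w.istitle() for w in words),
        len(set(words)) / max(len(words), 1),
    ])                                                                # 12 dims
    behavioral = np.array([
        float(response.lstrip().startswith("I ")),
        float("sorry" in response.lower()),
        float("as an ai" in response.lower()),
        float("* " in response or "- " in response),
        float(response.rstrip().endswith(".")),
        float(response.lower().count("however")),
    ])                                                                # 6 dims
    return np.concatenate([emb, linguistic, behavioral])              # 402 dims


def build_classifier():
    """Soft-voting ensemble (RF 45%, SVM 45%, MLP 10%) over 64-dim PCA features."""
    ensemble = VotingClassifier(
        estimators=[
            ("rf", RandomForestClassifier(n_estimators=300, random_state=0)),
            ("svm", SVC(probability=True, random_state=0)),
            ("mlp", MLPClassifier(hidden_layer_sizes=(128,), max_iter=500, random_state=0)),
        ],
        voting="soft",
        weights=[0.45, 0.45, 0.10],
    )
    return make_pipeline(StandardScaler(), PCA(n_components=64), ensemble)
```

Training would then amount to stacking `extract_features` outputs into a matrix, pairing it with family labels, and calling `build_classifier().fit(X, y)`.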
Contributions are welcome! Whether you're adding support for new models, improving accuracy, or extending to additional clients, please see CONTRIBUTING.md for guidelines.
MIT License
