An AI-powered comic book cover analyzer that extracts structured metadata from comic book cover images using advanced LLM APIs.
- Fast inference using Groq's meta-llama/llama-4-scout-17b-16e-instruct model
- Vision-enabled analysis of comic book covers
- Structured output with Pydantic validation
- Uses Google's Gemini 2.5 Flash model with Google Search grounding
- Retrieves canonical data from authoritative sources (Comic Vine, GCD, etc.)
- Returns all source URLs used for verification
- More accurate for publication dates and pricing
- Extracts comic book metadata from cover images:
  - Title
  - Publisher
  - Issue Number / Identifier
  - Publication Date
  - Cover Price
  - Creators (Writer, Artist, etc.)
- Gemini RAG version includes source URLs for verification
- Simple command-line interface
- Structured JSON output
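The fields listed above map naturally onto a small typed record. The sketch below is illustrative only: the project validates with Pydantic, while this stand-in uses a stdlib dataclass so that it runs without extra installs. The class name `ComicMetadata` and the helper `parse_metadata` are hypothetical; the field names are taken from the sample JSON output shown in this README.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class ComicMetadata:
    """Hypothetical stand-in for the project's Pydantic model."""
    Title: str
    Publisher: str
    Identifier: str
    PublicationDate: str
    CoverPrice: str
    Creators: List[str] = field(default_factory=list)


def parse_metadata(raw_json: str) -> ComicMetadata:
    """Parse extractor output and check each field's basic type."""
    data = json.loads(raw_json)
    # Keep only the fields this record knows about; extra keys are ignored.
    known = {
        name: data[name]
        for name in ComicMetadata.__dataclass_fields__
        if name in data
    }
    record = ComicMetadata(**known)  # TypeError if a required field is missing
    for name in ("Title", "Publisher", "Identifier", "PublicationDate", "CoverPrice"):
        if not isinstance(getattr(record, name), str):
            raise TypeError(f"{name} must be a string")
    creators = record.Creators
    if not isinstance(creators, list) or not all(isinstance(c, str) for c in creators):
        raise TypeError("Creators must be a list of strings")
    return record
```

Calling `parse_metadata` on the JSON printed by either script should yield a record whose attributes mirror the output keys; keys such as `CanonicalSource` are simply ignored by this sketch.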
- Python 3.8 or higher
- For Groq Extractor: A Groq API key (free)
- For Gemini RAG: A Google Gemini API key (free)
- Visit https://console.groq.com
- Sign up for a free account (no credit card required)
- Navigate to API Keys section in the dashboard
- Click Create API Key
- Copy your API key
Note: Groq offers free API access with generous rate limits!
- Visit https://aistudio.google.com/app/apikey
- Sign in with your Google account
- Click Create API Key
- Copy your API key
Note: Google Gemini offers free API access with Google Search grounding capabilities!
git clone https://github.com/totallyrad1/Groq-comicbook-data-extractor.git
cd Groq-comicbook-data-extractor
On Linux/macOS:
# Create virtual environment
python3 -m venv venv
# Activate virtual environment
source venv/bin/activate
On Windows:
# Create virtual environment
python -m venv venv
# Activate virtual environment
venv\Scripts\activate
You should see (venv) appear at the beginning of your terminal prompt.
pip install -r requirements.txt
Create a .env file in the project root directory:
For Groq Extractor only:
echo "GROQ_API_KEY=your_groq_api_key_here" > .env
For both scripts (recommended):
cat > .env << EOF
GROQ_API_KEY=your_groq_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here
EOF
Or manually create a .env file with:
GROQ_API_KEY=your_actual_groq_api_key_here
GEMINI_API_KEY=your_actual_gemini_api_key_here
Important: Never commit your .env file to version control! Add it to .gitignore.
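As a quick sanity check of the .env file described above, the sketch below parses `KEY=value` lines and reports which expected keys are missing or still set to a placeholder. The scripts themselves rely on python-dotenv to load the file; this hand-rolled parser and the function names `read_env_file` and `missing_keys` are assumptions for illustration, not part of the project.

```python
from pathlib import Path
from typing import Dict, List

EXPECTED_KEYS = ("GROQ_API_KEY", "GEMINI_API_KEY")


def read_env_file(path: str = ".env") -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    env_path = Path(path)
    if not env_path.is_file():
        return values
    for line in env_path.read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(values: Dict[str, str]) -> List[str]:
    """Return expected keys that are absent, empty, or still a placeholder."""
    return [
        key for key in EXPECTED_KEYS
        if not values.get(key) or values[key].startswith("your_")
    ]
```

Running `missing_keys(read_env_file())` from the project root returns an empty list once both keys hold real values. If only the Groq Extractor is used, a missing `GEMINI_API_KEY` can be ignored.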
python groq_extractor.py testimages/46617-3824-55260-1-wonder-woman.jpg
{
  "Title": "Wonder Woman",
  "Publisher": "DC Comics",
  "Identifier": "#1",
  "PublicationDate": "June 1987",
  "CoverPrice": "$0.75",
  "Creators": [
    "George Pérez",
    "Greg Potter"
  ]
}
python groq_plus_gemini_rag.py testimages/46617-3824-55260-1-wonder-woman.jpg
{
  "Title": "Wonder Woman",
  "Publisher": "DC Comics",
  "Identifier": "#1",
  "PublicationDate": "February 1987",
  "CoverPrice": "$0.75 US",
  "Creators": [
    "George Pérez",
    "Greg Potter"
  ],
  "CanonicalSource": "https://comicvine.gamespot.com/wonder-woman-1-the-princess-and-the-power/4000-27660/",
  "GroundingSources": [
    "https://comicvine.gamespot.com/wonder-woman-1-the-princess-and-the-power/4000-27660/",
    "https://dc.fandom.com/wiki/Wonder_Woman_Vol_2_1",
    "https://en.wikipedia.org/wiki/Wonder_Woman_(comic_book)"
  ]
}
# Groq Extractor - Fast analysis
python groq_extractor.py testimages/thor.jpeg
# Gemini RAG - Accurate with web verification
python groq_plus_gemini_rag.py testimages/thor.jpeg
# Analyze any comic cover
python groq_extractor.py ~/Downloads/my-comic-cover.png
python groq_plus_gemini_rag.py ~/Downloads/my-comic-cover.png

| Feature | Groq Extractor | Gemini RAG |
|---|---|---|
| Speed | ⚡ Very Fast | 🐢 Slower (web search) |
| Accuracy | Good for visible info | 🎯 Excellent (web-verified) |
| Source URLs | ❌ No | ✅ Yes |
| API Required | Groq | Google Gemini |
| Best For | Quick batch processing | Accurate cataloging |
| Offline Info | Relies on vision only | Uses web search |
Recommendation: Use Gemini RAG when you need accurate, verifiable data with source citations. Use Groq Extractor for fast batch processing or when sources aren't needed.
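For the batch-processing case mentioned above, one option is a small driver that runs an extractor script once per image. This is a sketch under assumptions: the helper names `build_commands` and `run_batch` and the set of image extensions are hypothetical; only the `python <script> <image>` invocation comes from the usage examples in this README.

```python
import subprocess
import sys
from pathlib import Path
from typing import List

# Assumed set of image types; adjust to whatever the scripts accept.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_commands(image_dir: str, script: str = "groq_extractor.py") -> List[List[str]]:
    """Build one command per image file in image_dir, in sorted order."""
    images = sorted(
        p for p in Path(image_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    return [[sys.executable, script, str(p)] for p in images]


def run_batch(image_dir: str, script: str = "groq_extractor.py") -> None:
    """Run the script on each image and print its output or its error."""
    for cmd in build_commands(image_dir, script):
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode == 0:
            print(result.stdout)
        else:
            print(f"Failed on {cmd[-1]}: {result.stderr.strip()}", file=sys.stderr)
```

Something like `run_batch("testimages")` would process the sample covers one after another. Because the free API tiers are rate limited, a pause between calls may be needed for larger folders.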
Groq-comicbook-data-extractor/
├── groq_extractor.py # Fast Groq vision-based extractor
├── groq_plus_gemini_rag.py # Gemini with Google Search grounding
├── requirements.txt # Python dependencies
├── .env # API keys (create this)
├── README.md # This file
└── testimages/ # Sample comic book covers
    ├── 2403520-ss16.png
    ├── 46617-3824-55260-1-wonder-woman.jpg
    ├── images (2).jpeg
    └── thor.jpeg
Make sure your .env file exists in the project root and contains:
GROQ_API_KEY=your_actual_groq_key
GEMINI_API_KEY=your_actual_gemini_key
If packages aren't found, ensure your virtual environment is activated:
- You should see (venv) in your terminal prompt
- Reactivate with source venv/bin/activate (Linux/macOS) or venv\Scripts\activate (Windows)
Provide the full or relative path to your image:
python groq_extractor.py /full/path/to/image.jpg
python groq_plus_gemini_rag.py /full/path/to/image.jpg
This error is already fixed in the latest version. Make sure you're using the updated groq_plus_gemini_rag.py file.
The Google Search grounding feature requires an active internet connection and may not always return sources. The extraction will still work, but GroundingSources will be an empty list.
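Because GroundingSources can come back empty, code that consumes the JSON output may want to combine it with CanonicalSource so that a reference is still available when one was returned. The sketch below is one way to do that; the function name `verification_urls` is hypothetical, and the key names are taken from the sample Gemini RAG output in this README. Whether CanonicalSource is always present is not stated here, so the code treats it as optional.

```python
import json
from typing import List


def verification_urls(raw_json: str) -> List[str]:
    """Return unique source URLs, canonical source first; may be empty."""
    data = json.loads(raw_json)
    urls = list(data.get("GroundingSources") or [])
    canonical = data.get("CanonicalSource")
    if canonical:
        urls.insert(0, canonical)  # list the canonical source first
    seen = set()
    unique = []
    for url in urls:
        if url not in seen:  # drop duplicates while keeping order
            seen.add(url)
            unique.append(url)
    return unique
```

An empty return value means the result was not web-verified and is best treated like a vision-only extraction.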
Key libraries used:
For Groq Extractor:
- groq - Official Groq API client
- instructor - Structured output from LLMs
- pydantic - Data validation
- python-dotenv - Environment variable management
For Gemini RAG:
- google-genai - Official Google Gemini API client
- pydantic - Data validation
- python-dotenv - Environment variable management
See requirements.txt for the complete list.
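To check that these packages are importable from the active virtual environment, a small probe like the following can help. It is a sketch: the helper `missing_modules` is hypothetical, and the import names (`dotenv` for python-dotenv, `google.genai` for google-genai) are the commonly used ones, assumed here rather than taken from the project source.

```python
import importlib.util
from typing import Iterable, List

# Import names assumed for the packages listed above.
REQUIRED_MODULES = ("groq", "instructor", "pydantic", "dotenv", "google.genai")


def missing_modules(names: Iterable[str] = REQUIRED_MODULES) -> List[str]:
    """Return the module names that cannot be found in this environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised, for example, when a parent package such as "google" is absent.
            found = False
        if not found:
            missing.append(name)
    return missing
```

If `missing_modules()` returns a non-empty list, activating the virtual environment and re-running `pip install -r requirements.txt` is the first thing to try.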
- Both scripts use base64 encoding to send images to their respective APIs
- Groq Extractor: Faster but relies solely on vision analysis
- Gemini RAG: Slower but more accurate, uses Google Search for verification
- Extraction quality depends on image resolution and cover clarity
- The Gemini model analyzes visible information AND verifies with web sources
- Free API tiers have rate limits (check each provider's documentation)
- GroundingSources in Gemini RAG output shows all URLs used for data verification
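The base64 step mentioned in the notes above can be sketched as follows. This shows one common way to turn an image file into a data URL suitable for embedding in a JSON request; the helper name `image_to_data_url` is hypothetical, and whether the project's scripts build a data URL or send the raw base64 string is an assumption not confirmed by this README.

```python
import base64
import mimetypes
from pathlib import Path


def image_to_data_url(image_path: str) -> str:
    """Read an image file and return it as a base64 data URL."""
    path = Path(image_path)
    # Guess the MIME type from the file extension only.
    mime_type, _ = mimetypes.guess_type(path.name)
    if mime_type is None or not mime_type.startswith("image/"):
        raise ValueError(f"Not a recognized image type: {path.name}")
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"
```

Base64 inflates the payload by roughly a third, so very large scans may be worth downsizing before sending, as long as the cover text stays legible.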
Feel free to submit issues, fork the repository, and create pull requests for any improvements.
This project is open source and available under the MIT License.
- Powered by Groq for fast LLM inference
- Powered by Google Gemini for AI with Google Search grounding
- Uses Meta's Llama 4 Scout model for vision capabilities (Groq)
- Uses Google's Gemini 2.5 Flash model for multimodal analysis with web search
Happy Comic Book Analyzing! ✨