🦸 AI Comic Book Data Extractor

An AI-powered analyzer that extracts structured metadata from comic book cover images using vision-capable LLM APIs (Groq and Google Gemini).

🎯 Two Powerful Extraction Methods

1. Groq Extractor (`groq_extractor.py`)

Fast inference with the meta-llama/llama-4-scout-17b-16e-instruct model hosted on Groq
Vision-enabled analysis of comic book covers
Structured output with Pydantic validation

2. Groq + Gemini RAG (`groq_plus_gemini_rag.py`)

Uses Google's Gemini 2.5 Flash model with Google Search grounding
Retrieves canonical data from authoritative sources (Comic Vine, GCD, etc.)
Returns all source URLs used for verification
Typically more accurate than the vision-only extractor for publication dates and pricing

📋 Features

Extracts comic book metadata from cover images:
- Title
- Publisher
- Issue Number / Identifier
- Publication Date
- Cover Price
- Creators (Writer, Artist, etc.)
Gemini RAG version includes source URLs for verification
Simple command-line interface
Structured JSON output

🚀 Getting Started

Prerequisites

Python 3.8 or higher
For Groq Extractor: A Groq API key (free)
For Gemini RAG: A Google Gemini API key (free)

1. Get Your Free API Keys

Groq API Key

Visit https://console.groq.com
Sign up for a free account (no credit card required)
Navigate to API Keys section in the dashboard
Click Create API Key
Copy your API key

Note: Groq offers free API access with generous rate limits!

Gemini API Key (Optional - for RAG version)

Visit https://aistudio.google.com/app/apikey
Sign in with your Google account
Click Create API Key
Copy your API key

Note: Google Gemini offers free API access with Google Search grounding capabilities!

2. Clone or Download the Repository

git clone https://github.com/totallyrad1/Groq-comicbook-data-extractor.git
cd Groq-comicbook-data-extractor

3. Set Up Python Virtual Environment

On Linux/macOS:

# Create virtual environment
python3 -m venv venv

# Activate virtual environment
source venv/bin/activate

On Windows:

# Create virtual environment
python -m venv venv

# Activate virtual environment
venv\Scripts\activate

You should see (venv) appear at the beginning of your terminal prompt.

4. Install Dependencies

pip install -r requirements.txt

5. Configure Your API Key(s)

Create a .env file in the project root directory:

For Groq Extractor only:

echo "GROQ_API_KEY=your_groq_api_key_here" > .env

For both scripts (recommended):

cat > .env << EOF
GROQ_API_KEY=your_groq_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here
EOF

Or manually create a .env file with:

GROQ_API_KEY=your_actual_groq_api_key_here
GEMINI_API_KEY=your_actual_gemini_api_key_here

⚠️ Important: Never commit your .env file to version control! Add it to .gitignore.
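
A script can fail early with a clear message when a key is missing. The following is a minimal sketch using only the standard library. `REQUIRED_KEYS` and `missing_keys` are hypothetical names chosen for illustration; the actual scripts load `.env` through python-dotenv and may perform this check differently.

```python
import os
from typing import Iterable, List, Mapping

# Key names taken from the .env examples above; the tuple itself is hypothetical.
REQUIRED_KEYS = ("GROQ_API_KEY", "GEMINI_API_KEY")


def missing_keys(env: Mapping[str, str], required: Iterable[str] = REQUIRED_KEYS) -> List[str]:
    """Return the required keys that are absent or blank in `env`."""
    return [key for key in required if not env.get(key, "").strip()]


# Report against the current process environment without exiting.
missing = missing_keys(os.environ)
print("Missing keys:", ", ".join(missing) if missing else "none")
```

If only the Groq extractor is used, passing `required=("GROQ_API_KEY",)` limits the check to that key.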

🎯 Usage

Method 1: Groq Extractor (Fast Vision-Only Analysis)

Basic Usage

python groq_extractor.py testimages/46617-3824-55260-1-wonder-woman.jpg

Expected Output

{
    "Title": "Wonder Woman",
    "Publisher": "DC Comics",
    "Identifier": "#1",
    "PublicationDate": "June 1987",
    "CoverPrice": "$0.75",
    "Creators": [
        "George Pérez",
        "Greg Potter"
    ]
}
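
The keys in this output map naturally onto a typed record. The scripts validate with Pydantic; the sketch below swaps in a standard-library dataclass so that it runs without extra dependencies. `ComicMetadata` and `parse_metadata` are hypothetical names, not taken from `groq_extractor.py`.

```python
import json
from dataclasses import dataclass, field
from typing import List


@dataclass
class ComicMetadata:
    """Hypothetical record mirroring the JSON keys in the expected output."""
    Title: str
    Publisher: str
    Identifier: str
    PublicationDate: str
    CoverPrice: str
    Creators: List[str] = field(default_factory=list)


def parse_metadata(raw: str) -> ComicMetadata:
    """Parse extractor output into a ComicMetadata record."""
    data = json.loads(raw)
    # Raises TypeError if a required key is missing or an unknown key is present.
    return ComicMetadata(**data)


sample = """
{
    "Title": "Wonder Woman",
    "Publisher": "DC Comics",
    "Identifier": "#1",
    "PublicationDate": "June 1987",
    "CoverPrice": "$0.75",
    "Creators": ["George Pérez", "Greg Potter"]
}
"""
record = parse_metadata(sample)
print(record.Title, record.Identifier)
```

Unlike Pydantic, a plain dataclass does not check field types at runtime, so this only catches missing or unexpected keys.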

Method 2: Gemini RAG (Web-Grounded Analysis with Source URLs)

Basic Usage

python groq_plus_gemini_rag.py testimages/46617-3824-55260-1-wonder-woman.jpg

Expected Output (with Source URLs!)

{
    "Title": "Wonder Woman",
    "Publisher": "DC Comics",
    "Identifier": "#1",
    "PublicationDate": "February 1987",
    "CoverPrice": "$0.75 US",
    "Creators": [
        "George Pérez",
        "Greg Potter"
    ],
    "CanonicalSource": "https://comicvine.gamespot.com/wonder-woman-1-the-princess-and-the-power/4000-27660/",
    "GroundingSources": [
        "https://comicvine.gamespot.com/wonder-woman-1-the-princess-and-the-power/4000-27660/",
        "https://dc.fandom.com/wiki/Wonder_Woman_Vol_2_1",
        "https://en.wikipedia.org/wiki/Wonder_Woman_(comic_book)"
    ]
}
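
The two sample outputs disagree on some fields (for example the publication date). A small comparison makes such differences easy to spot when both scripts are run on the same cover. This is a sketch only; `diff_fields` is a hypothetical helper, and the dictionaries below copy the shared fields from the two expected outputs.

```python
from typing import Any, Dict, Tuple


def diff_fields(a: Dict[str, Any], b: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (value_in_a, value_in_b)} for shared keys whose values differ."""
    return {key: (a[key], b[key]) for key in a if key in b and a[key] != b[key]}


groq_result = {
    "Title": "Wonder Woman",
    "Publisher": "DC Comics",
    "Identifier": "#1",
    "PublicationDate": "June 1987",
    "CoverPrice": "$0.75",
    "Creators": ["George Pérez", "Greg Potter"],
}
gemini_result = {
    "Title": "Wonder Woman",
    "Publisher": "DC Comics",
    "Identifier": "#1",
    "PublicationDate": "February 1987",
    "CoverPrice": "$0.75 US",
    "Creators": ["George Pérez", "Greg Potter"],
}

for key, (vision, grounded) in diff_fields(groq_result, gemini_result).items():
    print(f"{key}: vision={vision!r} grounded={grounded!r}")
```

Keys that exist in only one result, such as `GroundingSources`, are ignored by design, since the Groq extractor never returns them.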

More Examples

# Groq Extractor - Fast analysis
python groq_extractor.py testimages/thor.jpeg

# Gemini RAG - Accurate with web verification
python groq_plus_gemini_rag.py testimages/thor.jpeg

# Analyze any comic cover
python groq_extractor.py ~/Downloads/my-comic-cover.png
python groq_plus_gemini_rag.py ~/Downloads/my-comic-cover.png
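
For a whole folder of covers, the same commands can be generated programmatically. The sketch below only builds the command lines and does not run them. `build_commands` and `IMAGE_SUFFIXES` are hypothetical names, and the list of extensions is an assumption based on the files in testimages/.

```python
import sys
from pathlib import Path
from typing import List

# Assumption: extensions seen in testimages/; extend as needed.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}


def build_commands(folder: Path, script: str = "groq_extractor.py") -> List[List[str]]:
    """Build one command per image file in `folder`, sorted by path."""
    images = sorted(
        p for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )
    # sys.executable keeps the run inside the active virtual environment.
    return [[sys.executable, script, str(image)] for image in images]
```

Each returned command can then be passed to `subprocess.run(cmd, capture_output=True, text=True)`. On a free API tier, a pause between calls may be needed to stay within rate limits.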

🔄 Which Method Should You Use?

Feature	Groq Extractor	Gemini RAG
Speed	⚡ Very Fast	🐢 Slower (web search)
Accuracy	Good for visible info	🎯 Excellent (web-verified)
Source URLs	❌ No	✅ Yes
API Required	Groq	Google Gemini
Best For	Quick batch processing	Accurate cataloging
Information Source	Cover image only (vision)	Cover image plus web search

Recommendation: Use Gemini RAG when you need accurate, verifiable data with source citations. Use Groq Extractor for fast batch processing or when sources aren't needed.

📁 Project Structure

Groq-comicbook-data-extractor/
├── groq_extractor.py           # Fast Groq vision-based extractor
├── groq_plus_gemini_rag.py     # Gemini with Google Search grounding
├── requirements.txt            # Python dependencies
├── .env                        # API keys (create this)
├── README.md                   # This file
└── testimages/                 # Sample comic book covers
    ├── 2403520-ss16.png
    ├── 46617-3824-55260-1-wonder-woman.jpg
    ├── images (2).jpeg
    └── thor.jpeg

🔧 Troubleshooting

"GROQ_API_KEY is not set" or "GEMINI_API_KEY is not set" Error

Make sure your .env file exists in the project root and contains:

GROQ_API_KEY=your_actual_groq_key
GEMINI_API_KEY=your_actual_gemini_key

Virtual Environment Not Activated

If packages aren't found, ensure your virtual environment is activated:

You should see (venv) in your terminal prompt
Reactivate with source venv/bin/activate (Linux/macOS) or venv\Scripts\activate (Windows)

Image File Not Found

Provide the full or relative path to your image:

python groq_extractor.py /full/path/to/image.jpg
python groq_plus_gemini_rag.py /full/path/to/image.jpg
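
To rule out path problems before calling either script, the path can be checked up front. This is a minimal sketch, not the validation the scripts perform. `resolve_image` and `SUPPORTED_SUFFIXES` are hypothetical, and the accepted extensions are an assumption based on the sample images.

```python
from pathlib import Path

# Assumption: formats seen in testimages/.
SUPPORTED_SUFFIXES = {".jpg", ".jpeg", ".png"}


def resolve_image(path_text: str) -> Path:
    """Expand ~ and return an absolute path, or raise a descriptive error."""
    path = Path(path_text).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Image file not found: {path}")
    if path.suffix.lower() not in SUPPORTED_SUFFIXES:
        raise ValueError(f"Unsupported image type: {path.suffix or '(none)'}")
    return path
```

Because the error message prints the fully resolved path, it shows exactly where the file was expected, which helps when a relative path is run from the wrong directory.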

"Tool use with a response mime type: 'application/json' is unsupported" Error

This error was fixed in the latest version of groq_plus_gemini_rag.py. If you still see it, pull the latest changes from the repository.
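
The error message indicates that the API rejects a request that combines tool use (such as Google Search grounding) with a JSON response MIME type. One possible workaround, assumed here for illustration and not confirmed to be what groq_plus_gemini_rag.py does, is to request plain text and pull the JSON object out of the reply. `extract_json` is a hypothetical helper.

```python
import json
from typing import Any, Dict


def extract_json(reply: str) -> Dict[str, Any]:
    """Parse the first JSON object embedded in a free-text model reply."""
    decoder = json.JSONDecoder()
    start = reply.find("{")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start` and ignores trailing text.
            value, _ = decoder.raw_decode(reply, start)
            if isinstance(value, dict):
                return value
        except json.JSONDecodeError:
            pass  # this brace did not start valid JSON; try the next one
        start = reply.find("{", start + 1)
    raise ValueError("No JSON object found in reply")


reply = 'Here is the data:\n{"Title": "Wonder Woman", "Identifier": "#1"}\nDone.'
print(extract_json(reply))
```

The parsed dictionary can then be validated separately, so structured output does not depend on the API's JSON mode.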

Gemini RAG Returns No Sources

The Google Search grounding feature requires an active internet connection and may not always return sources. The extraction will still work, but GroundingSources will be an empty list.
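
Downstream code can treat an empty list as a signal that the record was not web-verified. A minimal sketch follows; `verification_status` and its labels are hypothetical and not part of the script's output.

```python
from typing import Any, Dict


def verification_status(result: Dict[str, Any]) -> str:
    """Label a result by whether any grounding URLs came back."""
    # A missing key, None, and an empty list are all treated as "no sources".
    sources = result.get("GroundingSources") or []
    return "web-verified" if sources else "unverified"


print(verification_status({"Title": "Wonder Woman", "GroundingSources": []}))
```

Records labeled unverified could be re-run later or checked by hand.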

🛠️ Dependencies

Key libraries used:

For Groq Extractor:

groq - Official Groq API client
instructor - Structured output from LLMs
pydantic - Data validation
python-dotenv - Environment variable management

For Gemini RAG:

google-genai - Official Google Gemini API client
pydantic - Data validation
python-dotenv - Environment variable management

See requirements.txt for the complete list.

📝 Notes

Both scripts use base64 encoding to send images to their respective APIs
Groq Extractor: Faster but relies solely on vision analysis
Gemini RAG: Slower but more accurate, uses Google Search for verification
Extraction quality depends on image resolution and cover clarity
The Gemini model analyzes visible information AND verifies with web sources
Free API tiers have rate limits (check respective documentation):
- Groq Rate Limits
- Gemini Rate Limits
GroundingSources in Gemini RAG output shows all URLs used for data verification
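
The base64 step mentioned in the first note can be sketched with the standard library alone. Wrapping the result in a data URL is an assumption made for illustration; the exact payload each script sends depends on the respective API client. `to_data_url` is a hypothetical helper.

```python
import base64
import mimetypes
from pathlib import Path


def to_data_url(image_path: str) -> str:
    """Read an image file and return it as a base64 data URL."""
    path = Path(image_path)
    # Guess the MIME type from the file extension (e.g. .png -> image/png).
    mime, _ = mimetypes.guess_type(path.name)
    if mime is None or not mime.startswith("image/"):
        raise ValueError(f"Not a recognized image type: {path.name}")
    encoded = base64.b64encode(path.read_bytes()).decode("ascii")
    return f"data:{mime};base64,{encoded}"
```

Base64 increases the payload to roughly four thirds of the file size, so very large scans may be worth downscaling before upload.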

🤝 Contributing

Feel free to submit issues, fork the repository, and create pull requests for any improvements.

📄 License

This project is open source and available under the MIT License.

🙏 Acknowledgments

Powered by Groq for fast LLM inference
Powered by Google Gemini for AI with Google Search grounding
Uses Meta's Llama 4 Scout model for vision capabilities (Groq)
Uses Google's Gemini 2.5 Flash model for multimodal analysis with web search

Happy Comic Book Analyzing! 📚✨
