totallyrad1/AI-comicbook-data-extractor


🦸 AI Comic Book Data Extractor

An AI-powered comic book cover analyzer that extracts structured metadata from comic book cover images using advanced LLM APIs.

🎯 Two Powerful Extraction Methods

1. Groq Extractor (groq_extractor.py)

  • Fast inference using Groq's meta-llama/llama-4-scout-17b-16e-instruct model
  • Vision-enabled analysis of comic book covers
  • Structured output with Pydantic validation

2. Groq + Gemini RAG (groq_plus_gemini_rag.py)

  • Uses Google's Gemini 2.5 Flash model with Google Search grounding
  • Retrieves canonical data from authoritative sources (Comic Vine, GCD, etc.)
  • Returns all source URLs used for verification
  • More accurate for publication dates and pricing

📋 Features

  • Extracts comic book metadata from cover images:
    • Title
    • Publisher
    • Issue Number / Identifier
    • Publication Date
    • Cover Price
    • Creators (Writer, Artist, etc.)
  • Gemini RAG version includes source URLs for verification
  • Simple command-line interface
  • Structured JSON output
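The fields listed above map naturally onto a small typed record. The sketch below uses a standard-library dataclass as a stand-in for the Pydantic model mentioned earlier. The class name `ComicMetadata` and the method `to_json` are hypothetical. Only the key names are taken from the sample output in this README.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List, Optional


@dataclass
class ComicMetadata:
    """Hypothetical record mirroring the JSON keys shown in this README."""

    Title: str
    Publisher: str
    Identifier: str
    PublicationDate: Optional[str] = None
    CoverPrice: Optional[str] = None
    Creators: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        # ensure_ascii=False keeps accented names readable in the output.
        return json.dumps(asdict(self), indent=4, ensure_ascii=False)


record = ComicMetadata(
    Title="Wonder Woman",
    Publisher="DC Comics",
    Identifier="#1",
    PublicationDate="June 1987",
    CoverPrice="$0.75",
    Creators=["George Pérez", "Greg Potter"],
)
print(record.to_json())
```

Optional fields default to `None` here because a cover image may not show a date or price. Whether the actual scripts treat those fields as optional is an assumption.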

🚀 Getting Started

Prerequisites

  • Python 3.8 or higher
  • For Groq Extractor: A Groq API key (free)
  • For Gemini RAG: A Google Gemini API key (free)

1. Get Your Free API Keys

Groq API Key

  1. Visit https://console.groq.com
  2. Sign up for a free account (no credit card required)
  3. Navigate to API Keys section in the dashboard
  4. Click Create API Key
  5. Copy your API key

Note: Groq offers free API access with generous rate limits!

Gemini API Key (Optional - for RAG version)

  1. Visit https://aistudio.google.com/app/apikey
  2. Sign in with your Google account
  3. Click Create API Key
  4. Copy your API key

Note: Google Gemini offers free API access with Google Search grounding capabilities!

2. Clone or Download the Repository

git clone https://github.com/totallyrad1/Groq-comicbook-data-extractor.git
cd Groq-comicbook-data-extractor

3. Set Up Python Virtual Environment

On Linux/macOS:

# Create virtual environment
python3 -m venv venv

# Activate virtual environment
source venv/bin/activate

On Windows:

# Create virtual environment
python -m venv venv

# Activate virtual environment
venv\Scripts\activate

You should see (venv) appear at the beginning of your terminal prompt.

4. Install Dependencies

pip install -r requirements.txt

5. Configure Your API Key(s)

Create a .env file in the project root directory:

For Groq Extractor only:

echo "GROQ_API_KEY=your_groq_api_key_here" > .env

For both scripts (recommended):

cat > .env << EOF
GROQ_API_KEY=your_groq_api_key_here
GEMINI_API_KEY=your_gemini_api_key_here
EOF

Or manually create a .env file with:

GROQ_API_KEY=your_actual_groq_api_key_here
GEMINI_API_KEY=your_actual_gemini_api_key_here

⚠️ Important: Never commit your .env file to version control! Add it to .gitignore.
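To confirm that the keys are present before running either script, a quick check along these lines may help. This is a minimal sketch that reads `.env` with the standard library only, not with python-dotenv (which the project lists as a dependency). The helper names `parse_env_text` and `missing_keys` are hypothetical, and the parser handles only simple `KEY=value` lines.

```python
from pathlib import Path
from typing import Dict, Iterable, List


def parse_env_text(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines; blank lines and # comments are skipped."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip optional surrounding quotes from the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: Dict[str, str], required: Iterable[str]) -> List[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    env = parse_env_text(env_path.read_text()) if env_path.exists() else {}
    missing = missing_keys(env, ["GROQ_API_KEY", "GEMINI_API_KEY"])
    print("Missing keys:", missing or "none")
```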

🎯 Usage

Method 1: Groq Extractor (Fast Vision-Only Analysis)

Basic Usage

python groq_extractor.py testimages/46617-3824-55260-1-wonder-woman.jpg

Expected Output

{
    "Title": "Wonder Woman",
    "Publisher": "DC Comics",
    "Identifier": "#1",
    "PublicationDate": "June 1987",
    "CoverPrice": "$0.75",
    "Creators": [
        "George Pérez",
        "Greg Potter"
    ]
}

Method 2: Gemini RAG (Web-Grounded Analysis with Source URLs)

Basic Usage

python groq_plus_gemini_rag.py testimages/46617-3824-55260-1-wonder-woman.jpg

Expected Output (with Source URLs!)

{
    "Title": "Wonder Woman",
    "Publisher": "DC Comics",
    "Identifier": "#1",
    "PublicationDate": "February 1987",
    "CoverPrice": "$0.75 US",
    "Creators": [
        "George Pérez",
        "Greg Potter"
    ],
    "CanonicalSource": "https://comicvine.gamespot.com/wonder-woman-1-the-princess-and-the-power/4000-27660/",
    "GroundingSources": [
        "https://comicvine.gamespot.com/wonder-woman-1-the-princess-and-the-power/4000-27660/",
        "https://dc.fandom.com/wiki/Wonder_Woman_Vol_2_1",
        "https://en.wikipedia.org/wiki/Wonder_Woman_(comic_book)"
    ]
}
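The two sample outputs disagree on some fields (publication date and price formatting). When cross-checking results from both scripts, a small helper can list such differences. The function name `diff_fields` is hypothetical, and the dictionaries below are abbreviated copies of the sample outputs in this README.

```python
from typing import Any, Dict, Tuple


def diff_fields(a: Dict[str, Any], b: Dict[str, Any]) -> Dict[str, Tuple[Any, Any]]:
    """Return {key: (a_value, b_value)} for shared keys whose values differ."""
    return {key: (a[key], b[key]) for key in a if key in b and a[key] != b[key]}


groq_result = {
    "Title": "Wonder Woman",
    "Identifier": "#1",
    "PublicationDate": "June 1987",
    "CoverPrice": "$0.75",
}
gemini_result = {
    "Title": "Wonder Woman",
    "Identifier": "#1",
    "PublicationDate": "February 1987",
    "CoverPrice": "$0.75 US",
    "GroundingSources": [],  # only present in the Gemini RAG output
}

for key, (groq_value, gemini_value) in diff_fields(groq_result, gemini_result).items():
    print(f"{key}: Groq={groq_value!r} Gemini={gemini_value!r}")
```

Keys that appear in only one output, such as `GroundingSources`, are ignored by design so that only directly comparable fields are reported.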

More Examples

# Groq Extractor - Fast analysis
python groq_extractor.py testimages/thor.jpeg

# Gemini RAG - Accurate with web verification
python groq_plus_gemini_rag.py testimages/thor.jpeg

# Analyze any comic cover
python groq_extractor.py ~/Downloads/my-comic-cover.png
python groq_plus_gemini_rag.py ~/Downloads/my-comic-cover.png

🔄 Which Method Should You Use?

| Feature      | Groq Extractor         | Gemini RAG                  |
|--------------|------------------------|-----------------------------|
| Speed        | ⚡ Very Fast           | 🐢 Slower (web search)      |
| Accuracy     | Good for visible info  | 🎯 Excellent (web-verified) |
| Source URLs  | ❌ No                  | ✅ Yes                      |
| API Required | Groq                   | Google Gemini               |
| Best For     | Quick batch processing | Accurate cataloging         |
| Offline Info | Relies on vision only  | Uses web search             |

Recommendation: Use Gemini RAG when you need accurate, verifiable data with source citations. Use Groq Extractor for fast batch processing or when sources aren't needed.

📁 Project Structure

Groq-comicbook-data-extractor/
├── groq_extractor.py           # Fast Groq vision-based extractor
├── groq_plus_gemini_rag.py     # Gemini with Google Search grounding
├── requirements.txt            # Python dependencies
├── .env                        # API keys (create this)
├── README.md                   # This file
└── testimages/                 # Sample comic book covers
    ├── 2403520-ss16.png
    ├── 46617-3824-55260-1-wonder-woman.jpg
    ├── images (2).jpeg
    └── thor.jpeg

🔧 Troubleshooting

"GROQ_API_KEY is not set" or "GEMINI_API_KEY is not set" Error

Make sure your .env file exists in the project root and contains:

GROQ_API_KEY=your_actual_groq_key
GEMINI_API_KEY=your_actual_gemini_key

Virtual Environment Not Activated

If packages aren't found, ensure your virtual environment is activated:

  • You should see (venv) in your terminal prompt
  • Reactivate with source venv/bin/activate (Linux/macOS) or venv\Scripts\activate (Windows)

Image File Not Found

Provide the full or relative path to your image:

python groq_extractor.py /full/path/to/image.jpg
python groq_plus_gemini_rag.py /full/path/to/image.jpg

"Tool use with a response mime type: 'application/json' is unsupported" Error

This error is already fixed in the latest version. Make sure you're using the updated groq_plus_gemini_rag.py file.

Gemini RAG Returns No Sources

The Google Search grounding feature requires an active internet connection and may not always return sources. The extraction will still work, but GroundingSources will be an empty list.
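Downstream code can treat an empty `GroundingSources` list as "unverified" instead of failing. The helper below is a minimal sketch; the name `verification_status` and the wording of its messages are hypothetical, and only the `GroundingSources` key comes from the sample output.

```python
import json


def verification_status(raw_json: str) -> str:
    """Summarize whether a Gemini RAG result came back with grounding sources."""
    record = json.loads(raw_json)
    # A missing key and an empty list are both treated as "no sources".
    sources = record.get("GroundingSources") or []
    if sources:
        return f"verified against {len(sources)} source(s)"
    return "unverified: no grounding sources returned"


print(verification_status('{"Title": "Wonder Woman", "GroundingSources": []}'))
```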

🛠️ Dependencies

Key libraries used:

For Groq Extractor:

  • groq - Official Groq API client
  • instructor - Structured output from LLMs
  • pydantic - Data validation
  • python-dotenv - Environment variable management

For Gemini RAG:

  • google-genai - Official Google Gemini API client
  • pydantic - Data validation
  • python-dotenv - Environment variable management

See requirements.txt for the complete list.

📝 Notes

  • Both scripts use base64 encoding to send images to their respective APIs
  • Groq Extractor: Faster but relies solely on vision analysis
  • Gemini RAG: Slower but more accurate, uses Google Search for verification
  • Extraction quality depends on image resolution and cover clarity
  • The Gemini model analyzes visible information AND verifies with web sources
  • Free API tiers have rate limits; check the Groq and Gemini documentation for current values
  • GroundingSources in Gemini RAG output shows all URLs used for data verification

🀝 Contributing

Feel free to submit issues, fork the repository, and create pull requests for any improvements.

📄 License

This project is open source and available under the MIT License.

🙏 Acknowledgments

  • Powered by Groq for fast LLM inference
  • Powered by Google Gemini for AI with Google Search grounding
  • Uses Meta's Llama 4 Scout model for vision capabilities (Groq)
  • Uses Google's Gemini 2.5 Flash model for multimodal analysis with web search

Happy Comic Book Analyzing! 📚✨
