Intelligent multi-agent AI system for virtual clothing try-on with automatic 3-view generation. Built with Google ADK & Gemini AI. 🎨👔✨

OTRONTO-F/adk-design-agent

Virtual Try-On AI Agent

ADK Design Agent

An intelligent multi-agent AI system built with Google ADK for virtual clothing try-on. Upload a person image, select from our catalog, and let AI show you how the outfit would look from all angles!

🎯 Features

Core Features

  • βœ… Multi-View Try-On: Automatic 3-angle try-on (front/side/back)
  • βœ… Catalog System: Browse 10 curated fashion garments (no uploads needed)
  • βœ… Auto-Start Mode: New image upload automatically begins workflow
  • βœ… Batch Processing: Try-on all 3 views in one operation
  • 🎬 Video Generation: Create promotional videos with Veo 3.1 ⭐ NEW
  • βœ… Continuous Workflow: Unlimited try-ons with automatic versioning
  • βœ… Rate Limiting: Smart cooldown (5s) prevents excessive API calls
  • βœ… Image Validation: Automatic person image validation
  • βœ… 9:16 Aspect Ratio: Optimized for portrait/mobile viewing
  • βœ… Session Management: Track all uploads and try-on results

v3.2.0 - Multi-View Auto Try-On System ⭐ CURRENT

  • 🎯 Interactive Coordinator: User-driven workflow with LLM intelligence
  • 🖼️ Image Manager Agent: Auto-generates 3 views from 1 image (4 tools)
  • 👔 Catalog Manager Agent: Shows catalog and manages selection (2 tools)
  • ✨ Try-On Specialist Agent: Batch try-on on all 3 views + video generation (5 tools)
  • 📊 Clean Organization: 11 tools distributed across 3 specialized sub-agents
  • 🚀 Fast Workflow: Auto-start mode with automatic multi-view generation
  • 💬 User Control: Interactive with natural conversation flow
  • 🎨 Complete View: See garments from every angle instantly
  • 🎬 Video Marketing: Generate professional videos with Veo 3.1

🏗️ Architecture

Interactive Multi-Agent System (v3.2.0)

COORDINATOR AGENT (LlmAgent - Interactive)
    │
    ├─→ Image Manager Agent (4 tools) - AUTO-START + AUTO-MULTIVIEW
    │   ├─ list_reference_images
    │   ├─ clear_reference_images
    │   ├─ load_artifacts_tool
    │   └─ generate_multiview_person ⭐ NEW
    │
    ├─→ Catalog Manager Agent (2 tools)
    │   ├─ list_catalog_clothes
    │   └─ select_catalog_cloth
    │
    └─→ Try-On Specialist Agent (5 tools)
        ├─ virtual_tryon
        ├─ list_tryon_results
        ├─ get_rate_limit_status
        ├─ batch_multiview_tryon ⭐ NEW
        └─ generate_video_from_results 🎬 NEW

Why Interactive + Auto Batch?

  • ✅ User controls pace at each phase
  • ✅ Auto-generates 3 views from 1 image
  • ✅ Auto-start on new image upload
  • ✅ Batch try-on all 3 views automatically
  • ✅ Natural conversation flow
  • ✅ Flexible and maintainable
  • ✅ Support for continuous operations
  • ✅ Complete angle coverage instantly

🔄 Workflow

Auto-Start Workflow (v3.2.0)

  1. Upload Person Image → AUTO-MULTIVIEW

    • Image saved automatically (reference_image_v1.png)
    • System generates 3 views: front, side, back (~10-15s)
    • Catalog displayed automatically
  2. View Catalog → AUTOMATIC

    • 10 garments displayed
    • User selects by number
  3. Select Garment → INTERACTIVE

    • User chooses garment
    • Confirmation displayed
  4. Execute Batch Try-On → AUTOMATIC

    • Try-on processed on all 3 views (~15-20s)
    • Results saved: tryon_result_v1.png (front), tryon_result_v2.png (side), tryon_result_v3.png (back)
  5. View Results → WAIT

    • User reviews all 3 results
    • Complete angle coverage
  6. Generate Video (Optional) 🎬 → INTERACTIVE

    • System offers video generation
    • Veo 3.1 creates professional video (~40-90s)
    • Video showcases rotating fashion views
    • Perfect for social media marketing
  7. Continue? → INTERACTIVE

    • Upload new person (→ auto-start + auto-multiview)
    • Try different garment (batch try-on all 3 views)
    • View all results
    • Finish

Continuous Workflow

  • Upload multiple person images (v1, v2, v3...)
  • Each upload auto-starts workflow with multi-view generation
  • All 3 views processed automatically for each garment
  • All results preserved (v1, v2, v3 for first garment, v4, v5, v6 for second, etc.)
  • No manual cleanup needed
  • Complete angle coverage for every try-on
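
The result numbering above follows a simple pattern if each garment produces exactly three results in front/side/back order. A minimal sketch under that assumption (`result_names` is a hypothetical helper, not a function from tools/tryon_tool.py):

```python
VIEWS = ("front", "side", "back")


def result_names(garment_index):
    """Map each view to its result filename for the n-th garment tried (1-based)."""
    if garment_index < 1:
        raise ValueError("garment_index must be >= 1")
    # Three results per garment: garment 1 -> v1..v3, garment 2 -> v4..v6, ...
    first_version = 3 * (garment_index - 1) + 1
    return {
        view: f"tryon_result_v{first_version + offset}.png"
        for offset, view in enumerate(VIEWS)
    }


print(result_names(2))
# {'front': 'tryon_result_v4.png', 'side': 'tryon_result_v5.png', 'back': 'tryon_result_v6.png'}
```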

Prerequisites

  • Python 3.10+
  • Google ADK (pip install google-adk)
  • Gemini API key with image generation access

Setup

  1. Clone the repository:

    git clone <repository-url>
    cd adk-design-agent
  2. Install dependencies:

    pip install -r requirements.txt
  3. Set up environment variables: Create a .env file in the project root:

    GEMINI_API_KEY=your_gemini_api_key_here
    GOOGLE_API_KEY=your_gemini_api_key_here
    GOOGLE_GENAI_USE_VERTEXAI=false
  4. Verify installation:

    python -c "import google.adk; print('ADK installed successfully')"
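
Before starting the agent, it can help to confirm that the keys from step 3 are actually visible to Python. The sketch below only inspects a mapping of environment variables; `missing_keys` is a hypothetical helper, and it assumes the variables have already been loaded into the process environment (for example by your shell or a .env loader).

```python
import os

# Key names taken from the .env example in step 3.
REQUIRED_KEYS = ("GEMINI_API_KEY", "GOOGLE_API_KEY")


def missing_keys(env=None):
    """Return the required keys that are absent or blank in the given mapping."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    print("Missing keys:", ", ".join(missing) if missing else "none")
```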

Running the Agent

Quick Start ⭐ Recommended

Start the Interactive Multi-Agent system:

adk run agent:runner

Expected output:

🎯 Virtual Try-On Agent System (v3.1.0) - Ready!

Or use the quick start script:

Windows:

run.bat

Linux/Mac:

chmod +x run.sh
./run.sh

Web UI Mode

Start the ADK web interface:

adk web

Then navigate to: http://127.0.0.1:8000/dev-ui?app=agent

📖 Usage Guide

Basic Workflow

  1. Start Agent: Run adk run agent:runner
  2. Upload Image: Attach person photo (9:16 ratio preferred)
  3. Auto-Start: System automatically shows catalog
  4. Select Garment: Choose by number (1-10)
  5. View Results: See the try-on results for all 3 views
  6. Continue: Upload new person or try different garment

Step 1: Upload Person Image (AUTO-START)

User: "Hi, I want to try on some clothes"
Agent: "Hello! Welcome to our Virtual Try-On service! 👔✨
        
        Let's start by uploading a person image.
        Please provide a 9:16 portrait photo."

User: [Uploads person image]
Agent: "✅ Person image validated and saved!
        Ready to browse our catalog!"

Step 2: Browse Catalog (Catalog Manager)

User: "Show me the catalog"
Agent: "Here are our available garments:

        1. 👕 Classic White T-Shirt - Versatile everyday essential
        2. 👔 Navy Business Suit - Professional formal wear
        3. 🧥 Denim Jacket - Casual cool style
        4. 👗 Floral Summer Dress - Bright and cheerful
        5. 🎽 Athletic Sports Jersey - Active lifestyle
        6. 🧥 Leather Biker Jacket - Edgy street style
        7. 🎩 Vintage Cardigan - Cozy classic comfort
        
        Which garment would you like to try? (1-7)"

This sample shows seven entries; the current catalog has 10 garments, so valid selections are 1-10.

Step 3: Virtual Try-On (Try-On Specialist)

User: "I want to try garment #2"
Agent: "✅ Navy Business Suit selected!
        
        Checking rate limit... ✅ Ready!
        Creating batch try-on on all 3 views... 🎨"
        
Agent: "✨ Virtual Try-On Complete - All 3 Views!
        
        📸 Front view: tryon_result_v1.png
        📸 Side view: tryon_result_v2.png
        📸 Back view: tryon_result_v3.png
        
        You can see the Navy Suit from every angle! 
        Want to try another garment?"

🛠️ Available Tools

Image Manager Agent (4 tools)

  1. list_reference_images - List all uploaded person images
  2. clear_reference_images - Clear all uploaded images
  3. load_artifacts_tool - Load previous artifacts
  4. generate_multiview_person - Generate 3 views (front/side/back) from 1 image ⭐ NEW

Catalog Manager Agent (2 tools)

  1. list_catalog_clothes - Display all 10 catalog garments
  2. select_catalog_cloth - Select garment by ID (1-10)

Try-On Specialist Agent (5 tools)

  1. virtual_tryon - Execute single virtual try-on
  2. list_tryon_results - Show all try-on results
  3. get_rate_limit_status - Check cooldown status
  4. batch_multiview_tryon - Try-on garment on all 3 views automatically ⭐ NEW
  5. generate_video_from_results - Generate Veo 3.1 video from batch results 🎬 NEW

Video Generation Tool (New!)

Purpose: Create professional promotional videos from batch try-on results

When to use:

  • After batch_multiview_tryon completes successfully
  • When user wants marketing content for social media
  • To showcase garments from all angles in video format

Default Configuration (Auto-applied):

{
    "video_length": 8,          # 8 seconds duration
    "aspect_ratio": "16:9",     # Horizontal format (YouTube/presentations)
    "transition_style": "smooth_rotation"  # Smooth transitions between views
}

Example:

# After batch try-on completes with 3 results
# Just call without parameters - uses defaults automatically
generate_video_from_results()
# Returns: Video URL for download
# Processing time: 40-90 seconds
# Output: 8-second video in 16:9 format
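
The defaults above behave like a base configuration that individual calls could override. How generate_video_from_results handles overrides internally is not documented here, so the following is only a sketch of the general pattern; `build_video_config` and its keyword overrides are hypothetical.

```python
# Default values copied from the configuration shown above.
DEFAULT_VIDEO_CONFIG = {
    "video_length": 8,
    "aspect_ratio": "16:9",
    "transition_style": "smooth_rotation",
}


def build_video_config(**overrides):
    """Return the defaults merged with overrides, rejecting unknown option names."""
    unknown = set(overrides) - set(DEFAULT_VIDEO_CONFIG)
    if unknown:
        raise KeyError(f"Unknown video options: {sorted(unknown)}")
    # Later keys win, so overrides replace the matching defaults.
    return {**DEFAULT_VIDEO_CONFIG, **overrides}


print(build_video_config())
print(build_video_config(aspect_ratio="9:16"))
```

Rejecting unknown keys keeps a typo such as `aspect_ration` from being silently ignored.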

Features:

  • ✅ Uses Veo 2.0 model (veo-2.0-generate-001)
  • ✅ Image-to-video generation with your try-on results
  • ✅ Automatic loading of 3 batch results
  • ✅ Professional transitions between views
  • ✅ 16:9 aspect ratio perfect for YouTube/presentations
  • ✅ No text overlays (avoids spelling errors)
  • ✅ Downloadable video URL with 24-hour validity
  • ✅ 8-second duration with smooth rotation

Workflow Integration:

1. User uploads person image → 3 views generated
2. User selects garment → Batch try-on creates 3 results
3. System shows results → Offers video generation
4. User accepts → 8-second video in 16:9 generated (~40-90s)
5. User downloads video → Ready for YouTube, presentations, or social media!

📁 File Structure

```text
adk-design-agent/
├── agent.py                     # ⭐ Interactive Multi-Agent System (v3.1.0)
├── prompts.py                   # All agent instructions
├── prompt.py.old                # Legacy single agent (backup)
├── tools/
│   ├── tryon_tool.py           # Virtual try-on tools (10 tools)
│   └── rate_limiter.py         # Rate limiting utilities
├── catalog/                     # 10 fashion garments
├── reference_images/            # Uploaded person images (auto-versioned)
├── tryon_results/               # Generated results (auto-versioned)
├── deep_think_loop.py          # Deep thinking utilities
├── requirements.txt            # Python dependencies
├── pyproject.toml              # Project configuration
├── run.bat / run.sh            # Quick start scripts
└── README.md                   # This file
```

⚙️ Configuration

Environment Variables

Create a .env file:

# Required for Virtual Try-On
GEMINI_API_KEY=your_gemini_api_key_here

# Required for Video Generation (Veo 3.1)
GOOGLE_API_KEY=your_gemini_api_key_here  # Same key works for both

# Optional
GOOGLE_GENAI_USE_VERTEXAI=false
RATE_LIMIT_COOLDOWN=5.0         # Cooldown seconds (default: 5.0)

Note: GOOGLE_API_KEY is required for video generation. You can use the same API key as GEMINI_API_KEY.

Rate Limiting Configuration

Environment variable in .env:

RATE_LIMIT_COOLDOWN=5.0  # Cooldown between try-ons (default: 5s)

Or edit tools/rate_limiter.py for custom logic.
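
For orientation, a cooldown of this kind can be implemented with a simple timestamp check. The class below is a minimal sketch, not the contents of tools/rate_limiter.py: `CooldownLimiter` and its methods are hypothetical, and the only details taken from this README are the RATE_LIMIT_COOLDOWN variable and its 5.0 default.

```python
import os
import time


class CooldownLimiter:
    """Allow one call per cooldown window (hypothetical helper)."""

    def __init__(self, cooldown=None, clock=time.monotonic):
        if cooldown is None:
            # Fall back to the documented environment variable and default.
            cooldown = float(os.environ.get("RATE_LIMIT_COOLDOWN", "5.0"))
        self.cooldown = cooldown
        self._clock = clock
        self._last_call = None

    def remaining(self):
        """Seconds left before the next call is allowed (0.0 if ready)."""
        if self._last_call is None:
            return 0.0
        elapsed = self._clock() - self._last_call
        return max(0.0, self.cooldown - elapsed)

    def try_acquire(self):
        """Record a call and return True if allowed, otherwise return False."""
        if self.remaining() > 0:
            return False
        self._last_call = self._clock()
        return True


limiter = CooldownLimiter(cooldown=5.0)
print(limiter.try_acquire())  # True: the first call is allowed
print(limiter.try_acquire())  # False: still inside the 5-second window
```

Passing the clock in as a parameter is a design choice that makes the limiter testable without sleeping.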

Catalog Configuration

Add/modify garments in tools/tryon_tool.py:

CATALOG_CLOTHES = [
    {"id": 1, "name": "Your Garment", "description": "Description"},
    # ... add up to 10 garments
]

Place garment images in catalog/ folder.
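
A lookup over entries shaped like the ones above might look as follows. This is a sketch only: the two sample entries reuse names from the usage guide, and `find_garment` is a hypothetical helper rather than the actual select_catalog_cloth implementation.

```python
# Sample entries in the documented shape; the real list lives in tools/tryon_tool.py.
CATALOG_CLOTHES = [
    {"id": 1, "name": "Classic White T-Shirt", "description": "Versatile everyday essential"},
    {"id": 2, "name": "Navy Business Suit", "description": "Professional formal wear"},
]


def find_garment(garment_id, catalog=CATALOG_CLOTHES):
    """Return the catalog entry with the given ID or raise ValueError."""
    for item in catalog:
        if item["id"] == garment_id:
            return item
    valid_ids = ", ".join(str(item["id"]) for item in catalog)
    raise ValueError(f"Invalid garment selection: {garment_id} (valid IDs: {valid_ids})")


print(find_garment(2)["name"])  # Navy Business Suit
```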

🐛 Troubleshooting

Common Issues

  1. "GEMINI_API_KEY environment variable not set"

    • Ensure your .env file contains the API key
    • Verify the key has image generation permissions
    • Check the key is loaded: echo $GEMINI_API_KEY (Linux/Mac) or echo $env:GEMINI_API_KEY (PowerShell)
  2. "Rate limit exceeded"

    • Wait 5 seconds between try-ons (default cooldown)
    • Check status: "What's the rate limit status?"
    • The cooldown resets automatically
  3. "Invalid garment selection"

    • Use garment ID from 1-10 only
    • Run "Show me the catalog" to see available garments
    • Cannot upload custom garments (catalog-only mode)
  4. "Person image validation failed"

    • Ensure image is 9:16 aspect ratio (portrait)
    • Upload clear portrait photos
    • File must be JPG or PNG format
  5. "Cannot import module"

    • Check you're in the project directory
    • Verify all files exist: agent.py, prompts.py, tools/
    • Run: python -c "from agent import root_agent; print('OK')"
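
For issue 4, a quick local check of the aspect ratio can rule out the most common cause before uploading. The sketch below works on plain pixel dimensions; `is_portrait_9_16` and the 2% tolerance are assumptions for illustration, since the validator's actual rules are not described here.

```python
def is_portrait_9_16(width, height, tolerance=0.02):
    """Return True if width:height is within the relative tolerance of 9:16."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = 9 / 16
    return abs(width / height - target) <= tolerance * target


print(is_portrait_9_16(1080, 1920))  # True
print(is_portrait_9_16(1920, 1080))  # False
```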

Debug Mode

Enable detailed logging:

# Windows PowerShell
$env:PYTHONPATH="."; python -c "import logging; logging.basicConfig(level='DEBUG'); from agent import root_agent"

# Linux/Mac
export PYTHONPATH=.; python -c "import logging; logging.basicConfig(level=logging.DEBUG); from agent import root_agent"

Or in your Python code:

import logging

logging.basicConfig(level=logging.DEBUG)

📚 Documentation

- **[Google ADK Docs](https://google.github.io/adk-docs/)** - Official ADK documentation
- **[Gemini API](https://ai.google.dev/gemini-api/docs)** - Gemini model documentation
- **[Image Generation](https://ai.google.dev/gemini-api/docs/imagen)** - Imagen 3 documentation

🚀 Advanced Usage

Adding New Agents

To extend the Interactive system:

```python
# In agent.py
from google.adk.agents import LlmAgent

# 1. Import the new instruction from prompts.py
from prompts import (
    IMAGE_MANAGER_INSTRUCTION,
    CATALOG_MANAGER_INSTRUCTION,
    TRYON_SPECIALIST_INSTRUCTION,
    INTERACTIVE_COORDINATOR_INSTRUCTION,
    STYLE_ADVISOR_INSTRUCTION,  # Add new instruction
)

# 2. Create the new agent (recommend_style and analyze_preferences are
#    example tool functions that you define yourself)
style_advisor_agent = LlmAgent(
    name="style_advisor_agent",
    model="gemini-2.5-flash",
    instruction=STYLE_ADVISOR_INSTRUCTION,
    tools=[recommend_style, analyze_preferences],
    before_model_callback=process_reference_images_callback,
)

# 3. Add it to the coordinator's sub_agents
root_agent = LlmAgent(
    name="virtual_tryon_coordinator",
    # ... keep the coordinator's existing model and instruction arguments
    sub_agents=[
        image_manager_agent,
        catalog_manager_agent,
        tryon_specialist_agent,
        style_advisor_agent,  # Add here
    ],
)
```

Custom Workflows

Modify agent instructions in prompts.py:

CUSTOM_WORKFLOW_INSTRUCTION = """
You are the Custom Workflow Agent for the Virtual Try-On system.

Your responsibilities:
1. Step 1: [Your custom step]
2. Step 2: [Your custom step]
3. Step 3: [Your custom step]

Available tools:
- [list your tools]

Important:
- Always validate inputs
- Provide clear feedback
- Handle errors gracefully
"""

Performance Tuning

# Adjust model for speed/quality
LlmAgent(
    model="gemini-2.5-flash",       # ⭐ Best balance (recommended)
    # model="gemini-2.0-flash-exp", # Fast, experimental features
    # model="gemini-2.5-pro",       # Slower, highest quality
)

Custom Rate Limiting

Edit tools/rate_limiter.py or set environment variable:

# In .env
RATE_LIMIT_COOLDOWN=10.0  # 10 seconds cooldown

🀝 Contributing

Contributions are welcome! Here's how:

  1. Fork the repository
  2. Create a feature branch: git checkout -b feature/amazing-feature
  3. Make your changes
  4. Test thoroughly:
    # Test your changes
    adk run agent:runner
    # Upload test image and verify workflow
  5. Commit: git commit -m "Add amazing feature"
  6. Push: git push origin feature/amazing-feature
  7. Open a Pull Request

Development Guidelines

  • Follow existing code style and structure
  • Add docstrings to new functions
  • Update prompts.py for instruction changes
  • Test interactive workflow thoroughly
  • Update README.md for new features
  • Maintain backward compatibility

📄 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

💬 Support

For issues and questions:

🎯 Version History

v3.2.0 - Multi-View Auto Try-On System ⭐ CURRENT

  • βœ… MULTI-VIEW GENERATION: Auto-generates 3 views (front/side/back) from 1 image
  • βœ… BATCH TRY-ON: Automatic try-on on all 3 views in one operation
  • βœ… AUTO-START + AUTO-MULTIVIEW: New image upload triggers complete workflow
  • βœ… STREAMLINED UX: No questions about garment_type or view selection
  • βœ… Interactive coordinator with 3 specialist agents
  • βœ… User-controlled workflow with automatic batch processing
  • βœ… 10 tools distributed across agents (4+2+4)
  • βœ… Complete angle coverage for every try-on
  • βœ… Enhanced documentation and cleanup

v3.1.0 - Interactive Multi-Agent Architecture

  • βœ… AUTO-START MODE: New image upload triggers automatic workflow
  • βœ… Interactive coordinator with 3 specialist agents
  • βœ… User-controlled workflow pacing (except auto-start)
  • βœ… Continuous workflow support
  • βœ… 10 garments in catalog (expanded from 7)
  • βœ… Consolidated prompts in single file (prompts.py)
  • βœ… Clean logging output
  • βœ… Enhanced documentation

v3.0.0 - LoopAgent Architecture

  • βœ… Automatic workflow with LoopAgent
  • βœ… Sequential chaining with output_key
  • βœ… Fully automated pipeline

v2.0.0 - SequentialAgent Implementation

  • βœ… SequentialAgent with automatic chaining
  • βœ… Output key-based data flow
  • βœ… Improved tool organization

v1.7.0 - Sequential Multi-Agent Architecture

  • βœ… Added Sequential Multi-Agent system (4 agents)
  • βœ… Better tool organization (3+2+5 distribution)
  • βœ… Comprehensive documentation

v1.6.0 - Regular Mode Only

  • βœ… Removed Deep Think mode
  • βœ… Simplified to regular agent only
  • βœ… Code cleanup (removed unused imports)

v1.5.0 - Catalog System

  • βœ… Added catalog with 7 garments
  • βœ… Catalog-only mode (no garment uploads)
  • βœ… English-only output

v1.4.0 - Comparison Features

  • βœ… Compare multiple try-on results
  • βœ… AI-powered comparison summary
  • βœ… Side-by-side result viewing

v1.3.0 - Rate Limiting

  • βœ… 5-second cooldown between try-ons
  • βœ… Rate limit status tool
  • βœ… Prevents excessive API calls

v1.2.0 - Image Validation

  • βœ… Person image validation
  • βœ… Clear reference images tool
  • βœ… List reference images tool

v1.1.0 - Core Features

  • βœ… Virtual try-on tool
  • βœ… List try-on results
  • βœ… Load previous artifacts

v1.0.0 - Initial Release

  • βœ… Basic virtual try-on agent
  • βœ… Single agent architecture
  • βœ… Image upload handling
  • βœ… Result management

Built with ❤️ using Google ADK and Gemini AI

⭐ Recommended: Use the Interactive Multi-Agent architecture (v3.2.0) for production!
