Virtual Try-On AI Agent

ADK Design Agent

A multi-agent AI system built with Google ADK for virtual clothing try-on. Upload a person image, select a garment from the catalog, and the system shows how the outfit looks from three angles (front, side, and back).

🎯 Features

Core Features

✅ Multi-View Try-On: Automatic 3-angle try-on (front/side/back)
✅ Catalog System: Browse 10 curated fashion garments (no uploads needed)
✅ Auto-Start Mode: New image upload automatically begins workflow
✅ Batch Processing: Try-on all 3 views in one operation
🎬 Video Generation: Create promotional videos with Veo 3.1 ⭐ NEW
✅ Continuous Workflow: Unlimited try-ons with automatic versioning
✅ Rate Limiting: Smart cooldown (5s) prevents excessive API calls
✅ Image Validation: Automatic person image validation
✅ 9:16 Aspect Ratio: Optimized for portrait/mobile viewing
✅ Session Management: Track all uploads and try-on results

v3.2.0 - Multi-View Auto Try-On System ⭐ CURRENT

🎯 Interactive Coordinator: User-driven workflow with LLM intelligence
🖼️ Image Manager Agent: Auto-generates 3 views from 1 image (4 tools)
👔 Catalog Manager Agent: Shows catalog and manages selection (2 tools)
✨ Try-On Specialist Agent: Batch try-on on all 3 views + video generation (5 tools)
📊 Clean Organization: 11 tools distributed across 3 specialized sub-agents
🚀 Fast Workflow: Auto-start mode with automatic multi-view generation
💬 User Control: Interactive with natural conversation flow
🎨 Complete View: See garments from every angle instantly
🎬 Video Marketing: Generate professional videos with Veo 3.1

🏗️ Architecture

Interactive Multi-Agent System (v3.2.0)

COORDINATOR AGENT (LlmAgent - Interactive)
    │
    ├─→ Image Manager Agent (4 tools) - AUTO-START + AUTO-MULTIVIEW
    │   ├─ list_reference_images
    │   ├─ clear_reference_images
    │   ├─ load_artifacts_tool
    │   └─ generate_multiview_person ⭐ NEW
    │
    ├─→ Catalog Manager Agent (2 tools)
    │   ├─ list_catalog_clothes
    │   └─ select_catalog_cloth
    │
    └─→ Try-On Specialist Agent (5 tools)
        ├─ virtual_tryon
        ├─ list_tryon_results
        ├─ get_rate_limit_status
        ├─ batch_multiview_tryon ⭐ NEW
        └─ generate_video_from_results 🎬 NEW
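The tree above can also be written down as plain data, which makes the per-agent tool counts easy to check. This is only an illustrative sketch: `AGENT_TOOLS` and `total_tools` are hypothetical names and are not taken from agent.py.

```python
# Illustrative only: the agent/tool layout from the tree above as plain data.
# AGENT_TOOLS and total_tools are hypothetical names, not part of agent.py.
AGENT_TOOLS = {
    "image_manager_agent": [
        "list_reference_images",
        "clear_reference_images",
        "load_artifacts_tool",
        "generate_multiview_person",
    ],
    "catalog_manager_agent": [
        "list_catalog_clothes",
        "select_catalog_cloth",
    ],
    "tryon_specialist_agent": [
        "virtual_tryon",
        "list_tryon_results",
        "get_rate_limit_status",
        "batch_multiview_tryon",
        "generate_video_from_results",
    ],
}


def total_tools(layout: dict[str, list[str]]) -> int:
    """Count the tools across all sub-agents."""
    return sum(len(tools) for tools in layout.values())


print(total_tools(AGENT_TOOLS))  # 11 (4 + 2 + 5)
```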

Why Interactive + Auto Batch?

✅ User controls pace at each phase
✅ Auto-generates 3 views from 1 image
✅ Auto-start on new image upload
✅ Batch try-on all 3 views automatically
✅ Natural conversation flow
✅ Flexible and maintainable
✅ Support for continuous operations
✅ Complete angle coverage instantly

🔄 Workflow

Auto-Start Workflow (v3.2.0)

1. Upload Person Image → AUTO-MULTIVIEW
   - Image saved automatically (reference_image_v1.png)
   - System generates 3 views: front, side, back (~10-15s)
   - Catalog displayed automatically
2. View Catalog → AUTOMATIC
   - 10 garments displayed
   - User selects by number
3. Select Garment → INTERACTIVE
   - User chooses garment
   - Confirmation displayed
4. Execute Batch Try-On → AUTOMATIC
   - Try-on processed on all 3 views (~15-20s)
   - Results saved: tryon_result_v1.png (front), v2.png (side), v3.png (back)
5. View Results → WAIT
   - User reviews all 3 results
   - Complete angle coverage
6. Generate Video (Optional) 🎬 → INTERACTIVE
   - System offers video generation
   - Veo 3.1 creates professional video (~40-90s)
   - Video showcases rotating fashion views
   - Perfect for social media marketing
7. Continue? → INTERACTIVE
   - Upload new person (→ auto-start + auto-multiview)
   - Try different garment (batch try-on all 3 views)
   - View all results
   - Finish

Continuous Workflow

Upload multiple person images (v1, v2, v3...)
Each upload auto-starts workflow with multi-view generation
All 3 views processed automatically for each garment
All results preserved (v1, v2, v3 for first garment, v4, v5, v6 for second, etc.)
No manual cleanup needed
Complete angle coverage for every try-on
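The versioning described above (v1-v3 for the first garment, v4-v6 for the second) can be sketched as simple arithmetic. This is a minimal sketch, assuming a single result counter that is never reset; `VIEWS` and `batch_result_names` are hypothetical names, and the project's own versioning logic may differ.

```python
# Hypothetical helper: maps the Nth batch try-on to its three result files,
# assuming one running version counter shared by all try-ons.
VIEWS = ("front", "side", "back")


def batch_result_names(garment_index: int) -> dict[str, str]:
    """Return view -> filename for the Nth garment tried (1-based)."""
    if garment_index < 1:
        raise ValueError("garment_index starts at 1")
    first_version = (garment_index - 1) * len(VIEWS) + 1
    return {
        view: f"tryon_result_v{first_version + offset}.png"
        for offset, view in enumerate(VIEWS)
    }


print(batch_result_names(1))  # front=v1, side=v2, back=v3
print(batch_result_names(2))  # front=v4, side=v5, back=v6
```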

Prerequisites

Python 3.10+
Google ADK (pip install google-adk)
Gemini API key with image generation access

Setup

Clone the repository:

git clone <repository-url>
cd adk-design-agent

Install dependencies:
```
pip install -r requirements.txt
```

Set up environment variables: Create a .env file in the project root:

GEMINI_API_KEY=your_gemini_api_key_here
GOOGLE_API_KEY=your_gemini_api_key_here
GOOGLE_GENAI_USE_VERTEXAI=false
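Before starting the agent, it can help to confirm that the keys are actually visible to Python. The following is a minimal sketch, assuming the values from .env have already been loaded into the process environment; `REQUIRED_VARS` and `missing_env_vars` are hypothetical names, not part of this project.

```python
import os

# Variables this README lists as needed for try-on and video generation.
REQUIRED_VARS = ("GEMINI_API_KEY", "GOOGLE_API_KEY")


def missing_env_vars(env=None) -> list[str]:
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_env_vars()
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("All required variables are set")
```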

Verify installation:

python -c "import google.adk; print('ADK installed successfully')"

Running the Agent

Quick Start ⭐ Recommended

Start the Interactive Multi-Agent system:

adk run agent:runner

Expected output:

🎯 Virtual Try-On Agent System (v3.1.0) - Ready!

Or use the quick start script:

Windows:

run.bat

Linux/Mac:

chmod +x run.sh
./run.sh

Web UI Mode

Start the ADK web interface:

adk web

Then navigate to: http://127.0.0.1:8000/dev-ui?app=agent

📖 Usage Guide

Basic Workflow

1. Start Agent: Run adk run agent:runner
2. Upload Image: Attach person photo (9:16 ratio preferred)
3. Auto-Start: System automatically generates the 3 views and shows the catalog
4. Select Garment: Choose by number (1-10)
5. View Results: See the try-on results for all 3 views
6. Continue: Upload new person or try different garment

Step 1: Upload Person Image (AUTO-START)

User: "Hi, I want to try on some clothes"
Agent: "Hello! Welcome to our Virtual Try-On service! 👔✨
        
        Let's start by uploading a person image.
        Please provide a 9:16 portrait photo."

User: [Uploads person image]
Agent: "✅ Person image validated and saved!
        Ready to browse our catalog!"

Step 2: Browse Catalog (Catalog Manager)

User: "Show me the catalog"
Agent: "Here are our available garments:

        1. 👕 Classic White T-Shirt - Versatile everyday essential
        2. 👔 Navy Business Suit - Professional formal wear
        3. 🧥 Denim Jacket - Casual cool style
        4. 👗 Floral Summer Dress - Bright and cheerful
        5. 🎽 Athletic Sports Jersey - Active lifestyle
        6. 🧥 Leather Biker Jacket - Edgy street style
        7. 🎩 Vintage Cardigan - Cozy classic comfort
        ... (garments 8-10 follow)
        
        Which garment would you like to try? (1-10)"

Step 3: Virtual Try-On (Try-On Specialist)

User: "I want to try garment #2"
Agent: "✅ Navy Business Suit selected!
        
        Checking rate limit... ✅ Ready!
        Creating batch try-on on all 3 views... 🎨"
        
Agent: "✨ Virtual Try-On Complete - All 3 Views!
        
        📸 Front view: tryon_result_v1.png
        📸 Side view: tryon_result_v2.png
        📸 Back view: tryon_result_v3.png
        
        You can see the Navy Suit from every angle! 
        Want to try another garment?"

🛠️ Available Tools

Image Manager Agent (4 tools)

list_reference_images - List all uploaded person images
clear_reference_images - Clear all uploaded images
load_artifacts_tool - Load previous artifacts
generate_multiview_person - Generate 3 views (front/side/back) from 1 image ⭐ NEW

Catalog Manager Agent (2 tools)

list_catalog_clothes - Display all 10 catalog garments
select_catalog_cloth - Select garment by ID (1-10)

Try-On Specialist Agent (5 tools)

virtual_tryon - Execute single virtual try-on
list_tryon_results - Show all try-on results
get_rate_limit_status - Check cooldown status
batch_multiview_tryon - Try-on garment on all 3 views automatically ⭐ NEW
generate_video_from_results - Generate Veo 3.1 video from batch results 🎬 NEW

Video Generation Tool (New!)

Purpose: Create professional promotional videos from batch try-on results

When to use:

After batch_multiview_tryon completes successfully
When user wants marketing content for social media
To showcase garments from all angles in video format

Default Configuration (Auto-applied):

{
    "video_length": 8,          # 8 seconds duration
    "aspect_ratio": "16:9",     # Horizontal format (YouTube/presentations)
    "transition_style": "smooth_rotation"  # Smooth transitions between views
}

Example:

# After batch try-on completes with 3 results
# Just call without parameters - uses defaults automatically
generate_video_from_results()
# Returns: Video URL for download
# Processing time: 40-90 seconds
# Output: 8-second video in 16:9 format
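One way to apply defaults like the ones above is to merge them with any caller-supplied overrides. This is a hedged sketch only: the README does not say whether generate_video_from_results accepts overrides, and `DEFAULT_VIDEO_CONFIG` and `resolve_video_config` are hypothetical names.

```python
# Defaults copied from the configuration shown above.
DEFAULT_VIDEO_CONFIG = {
    "video_length": 8,
    "aspect_ratio": "16:9",
    "transition_style": "smooth_rotation",
}


def resolve_video_config(**overrides) -> dict:
    """Merge overrides onto the defaults, rejecting unknown option names."""
    unknown = set(overrides) - set(DEFAULT_VIDEO_CONFIG)
    if unknown:
        raise ValueError(f"Unknown video options: {sorted(unknown)}")
    return {**DEFAULT_VIDEO_CONFIG, **overrides}


print(resolve_video_config())                # defaults only
print(resolve_video_config(video_length=5))  # one value overridden
```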

Features:

✅ Uses Veo 2.0 model (veo-2.0-generate-001)
✅ Image-to-video generation with your try-on results
✅ Automatic loading of 3 batch results
✅ Professional transitions between views
✅ 16:9 aspect ratio perfect for YouTube/presentations
✅ No text overlays (avoids spelling errors)
✅ Downloadable video URL with 24-hour validity
✅ 8-second duration with smooth rotation

Workflow Integration:

1. User uploads person image → 3 views generated
2. User selects garment → Batch try-on creates 3 results
3. System shows results → Offers video generation
4. User accepts → 8-second video in 16:9 generated (~40-90s)
5. User downloads video → Ready for YouTube/presentations!



## 📁 File Structure

```text
adk-design-agent/
├── agent.py                     # ⭐ Interactive Multi-Agent System (v3.2.0)
├── prompts.py                   # All agent instructions
├── prompt.py.old                # Legacy single agent (backup)
├── tools/
│   ├── tryon_tool.py           # Virtual try-on tools (11 tools)
│   └── rate_limiter.py         # Rate limiting utilities
├── catalog/                     # 10 fashion garments
├── reference_images/            # Uploaded person images (auto-versioned)
├── tryon_results/               # Generated results (auto-versioned)
├── deep_think_loop.py          # Deep thinking utilities
├── requirements.txt            # Python dependencies
├── pyproject.toml              # Project configuration
├── run.bat / run.sh            # Quick start scripts
└── README.md                   # This file
```

⚙️ Configuration

Environment Variables

Create a .env file:

# Required for Virtual Try-On
GEMINI_API_KEY=your_gemini_api_key_here

# Required for Video Generation (Veo 3.1)
GOOGLE_API_KEY=your_gemini_api_key_here  # Same key works for both

# Optional
GOOGLE_GENAI_USE_VERTEXAI=false
RATE_LIMIT_COOLDOWN=5.0         # Cooldown seconds (default: 5.0)

Note: GOOGLE_API_KEY is required for video generation. You can use the same API key as GEMINI_API_KEY.

Rate Limiting Configuration

Environment variable in .env:

RATE_LIMIT_COOLDOWN=5.0  # Cooldown between try-ons (default: 5s)

Or edit tools/rate_limiter.py for custom logic.
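For readers writing custom logic, a cooldown limiter can be sketched in a few lines. This is not the code in tools/rate_limiter.py; `CooldownLimiter` and its methods are hypothetical, and the only detail taken from this README is the RATE_LIMIT_COOLDOWN variable with its 5.0 second default.

```python
import os
import time


class CooldownLimiter:
    """Allow one call, then refuse further calls until the cooldown has passed."""

    def __init__(self, cooldown_seconds=None, clock=time.monotonic):
        if cooldown_seconds is None:
            # Fall back to the environment variable, then to 5 seconds.
            cooldown_seconds = float(os.environ.get("RATE_LIMIT_COOLDOWN", "5.0"))
        self.cooldown_seconds = cooldown_seconds
        self._clock = clock
        self._last_call = None  # time of the last accepted call

    def seconds_remaining(self) -> float:
        """Seconds left before the next call is allowed (0.0 if ready)."""
        if self._last_call is None:
            return 0.0
        elapsed = self._clock() - self._last_call
        return max(0.0, self.cooldown_seconds - elapsed)

    def try_acquire(self) -> bool:
        """Record a call and return True if allowed, otherwise return False."""
        if self.seconds_remaining() > 0:
            return False
        self._last_call = self._clock()
        return True
```

The clock is injectable so the limiter can be tested without sleeping; `time.monotonic` is used by default because it is unaffected by system clock changes.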

Catalog Configuration

Add/modify garments in tools/tryon_tool.py:

CATALOG_CLOTHES = [
    {"id": 1, "name": "Your Garment", "description": "Description"},
    # ... add up to 10 garments
]

Place garment images in catalog/ folder.
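Selecting by ID from a list shaped like this could look as follows. This is a sketch under stated assumptions: `find_garment` is a hypothetical helper, not the project's select_catalog_cloth tool, and the two entries reuse garment names from the usage example earlier in this README.

```python
# Two sample entries in the same shape as CATALOG_CLOTHES above.
CATALOG_CLOTHES = [
    {"id": 1, "name": "Classic White T-Shirt", "description": "Versatile everyday essential"},
    {"id": 2, "name": "Navy Business Suit", "description": "Professional formal wear"},
]


def find_garment(garment_id: int, catalog=CATALOG_CLOTHES) -> dict:
    """Return the catalog entry with the given ID or raise ValueError."""
    for garment in catalog:
        if garment["id"] == garment_id:
            return garment
    valid_ids = ", ".join(str(g["id"]) for g in catalog)
    raise ValueError(
        f"Invalid garment selection: {garment_id}. Valid IDs: {valid_ids}"
    )


print(find_garment(2)["name"])  # Navy Business Suit
```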

🐛 Troubleshooting

Common Issues

"GEMINI_API_KEY environment variable not set"
- Ensure your .env file contains the API key
- Verify the key has image generation permissions
- Check the key is loaded: echo $GEMINI_API_KEY (Linux/Mac) or echo $env:GEMINI_API_KEY (PowerShell)

"Rate limit exceeded"
- Wait 5 seconds between try-ons (default cooldown)
- Check status: "What's the rate limit status?"
- The cooldown resets automatically

"Invalid garment selection"
- Use garment ID from 1-10 only
- Run "Show me the catalog" to see available garments
- Cannot upload custom garments (catalog-only mode)

"Person image validation failed"
- Ensure image is 9:16 aspect ratio (portrait)
- Upload clear portrait photos
- File must be JPG or PNG format

"Cannot import module"
- Check you're in the project directory
- Verify all files exist: agent.py, prompts.py, tools/
- Run: python -c "from agent import root_agent; print('OK')"
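For the "Person image validation failed" case above, the 9:16 check can be reproduced locally before uploading. This is a rough sketch: `is_portrait_9_16` and its 2% tolerance are assumptions, the project's own validation may apply other rules, and the pixel dimensions must be read separately (for example with an image library).

```python
def is_portrait_9_16(width: int, height: int, tolerance: float = 0.02) -> bool:
    """Return True if width:height is within `tolerance` of 9:16."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = 9 / 16  # 0.5625
    return abs(width / height - target) <= tolerance


print(is_portrait_9_16(1080, 1920))  # True: exactly 9:16
print(is_portrait_9_16(1920, 1080))  # False: landscape 16:9
```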

Debug Mode

Enable detailed logging:

# Windows PowerShell
$env:PYTHONPATH="."; python -c "import logging; logging.basicConfig(level='DEBUG'); from agent import root_agent"

# Linux/Mac
export PYTHONPATH=.; python -c "import logging; logging.basicConfig(level=logging.DEBUG); from agent import root_agent"

Or in your Python code:

import logging

logging.basicConfig(level=logging.DEBUG)



## 📚 Documentation

- **[Google ADK Docs](https://google.github.io/adk-docs/)** - Official ADK documentation
- **[Gemini API](https://ai.google.dev/gemini-api/docs)** - Gemini model documentation
- **[Image Generation](https://ai.google.dev/gemini-api/docs/imagen)** - Imagen 3 documentation

## 🚀 Advanced Usage

### Adding New Agents

To extend the Interactive system:

```python
# In agent.py
from google.adk.agents import LlmAgent

# 1. Import the new instruction from prompts.py
from prompts import (
    IMAGE_MANAGER_INSTRUCTION,
    CATALOG_MANAGER_INSTRUCTION,
    TRYON_SPECIALIST_INSTRUCTION,
    INTERACTIVE_COORDINATOR_INSTRUCTION,
    STYLE_ADVISOR_INSTRUCTION,  # Add new instruction
)

# 2. Create the new agent
# (recommend_style and analyze_preferences are example tools you define yourself)
style_advisor_agent = LlmAgent(
    name="style_advisor_agent",
    model="gemini-2.5-flash",
    instruction=STYLE_ADVISOR_INSTRUCTION,
    tools=[recommend_style, analyze_preferences],
    before_model_callback=process_reference_images_callback,
)

# 3. Add it to the coordinator's sub_agents
root_agent = LlmAgent(
    name="virtual_tryon_coordinator",
    # ... keep the coordinator's existing model and instruction arguments
    sub_agents=[
        image_manager_agent,
        catalog_manager_agent,
        tryon_specialist_agent,
        style_advisor_agent,  # Add here
    ],
)
```

Custom Workflows

Modify agent instructions in prompts.py:

CUSTOM_WORKFLOW_INSTRUCTION = """
You are the Custom Workflow Agent for the Virtual Try-On system.

Your responsibilities:
1. Step 1: [Your custom step]
2. Step 2: [Your custom step]
3. Step 3: [Your custom step]

Available tools:
- [list your tools]

Important:
- Always validate inputs
- Provide clear feedback
- Handle errors gracefully
"""

Performance Tuning

# Adjust model for speed/quality
LlmAgent(
    model="gemini-2.5-flash",       # ⭐ Best balance (recommended)
    # model="gemini-2.0-flash-exp", # Fast, experimental features
    # model="gemini-2.5-pro",       # Slower, highest quality
)

Custom Rate Limiting

Edit tools/rate_limiter.py or set environment variable:

# In .env
RATE_LIMIT_COOLDOWN=10.0  # 10 seconds cooldown

🤝 Contributing

Contributions are welcome! Here's how:

Fork the repository
Create a feature branch: git checkout -b feature/amazing-feature
Make your changes

Test thoroughly:

# Test your changes
adk run agent:runner
# Upload test image and verify workflow

Commit: git commit -m "Add amazing feature"
Push: git push origin feature/amazing-feature
Open a Pull Request

Development Guidelines

Follow existing code style and structure
Add docstrings to new functions
Update prompts.py for instruction changes
Test interactive workflow thoroughly
Update README.md for new features
Maintain backward compatibility

📄 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.

💬 Support

For issues and questions:

Check the troubleshooting section above
Review ADK documentation
Open an issue in the repository
Check existing issues for solutions

🎯 Version History

v3.2.0 - Multi-View Auto Try-On System ⭐ CURRENT

✅ MULTI-VIEW GENERATION: Auto-generates 3 views (front/side/back) from 1 image
✅ BATCH TRY-ON: Automatic try-on on all 3 views in one operation
✅ AUTO-START + AUTO-MULTIVIEW: New image upload triggers complete workflow
✅ STREAMLINED UX: No questions about garment_type or view selection
✅ Interactive coordinator with 3 specialist agents
✅ User-controlled workflow with automatic batch processing
✅ 11 tools distributed across agents (4+2+5)
✅ Complete angle coverage for every try-on
✅ Enhanced documentation and cleanup

v3.1.0 - Interactive Multi-Agent Architecture

✅ AUTO-START MODE: New image upload triggers automatic workflow
✅ Interactive coordinator with 3 specialist agents
✅ User-controlled workflow pacing (except auto-start)
✅ Continuous workflow support
✅ 10 garments in catalog (expanded from 7)
✅ Consolidated prompts in single file (prompts.py)
✅ Clean logging output
✅ Enhanced documentation

v3.0.0 - LoopAgent Architecture

✅ Automatic workflow with LoopAgent
✅ Sequential chaining with output_key
✅ Fully automated pipeline

v2.0.0 - SequentialAgent Implementation

✅ SequentialAgent with automatic chaining
✅ Output key-based data flow
✅ Improved tool organization

v1.7.0 - Sequential Multi-Agent Architecture

✅ Added Sequential Multi-Agent system (4 agents)
✅ Better tool organization (3+2+5 distribution)
✅ Comprehensive documentation

v1.6.0 - Regular Mode Only

✅ Removed Deep Think mode
✅ Simplified to regular agent only
✅ Code cleanup (removed unused imports)

v1.5.0 - Catalog System

✅ Added catalog with 7 garments
✅ Catalog-only mode (no garment uploads)
✅ English-only output

v1.4.0 - Comparison Features

✅ Compare multiple try-on results
✅ AI-powered comparison summary
✅ Side-by-side result viewing

v1.3.0 - Rate Limiting

✅ 5-second cooldown between try-ons
✅ Rate limit status tool
✅ Prevents excessive API calls

v1.2.0 - Image Validation

✅ Person image validation
✅ Clear reference images tool
✅ List reference images tool

v1.1.0 - Core Features

✅ Virtual try-on tool
✅ List try-on results
✅ Load previous artifacts

v1.0.0 - Initial Release

✅ Basic virtual try-on agent
✅ Single agent architecture
✅ Image upload handling
✅ Result management

Built with ❤️ using Google ADK and Gemini AI

⭐ Recommended: Use the Interactive Multi-Agent architecture (v3.2.0) for production!
