Costa Rica
Last updated: 2026-01-16
Important
Disclaimer: This repository contains a demo of the Zava Media AI Assistant, a hybrid system that uses 2 Azure AI Agents (via Azure AI Agents Service) for conversational orchestration and cropping, and code-based orchestration for other media tasks (video, image generation, document processing). It features a fully automated "Zero-Touch" deployment pipeline orchestrated by Terraform, which provisions infrastructure, creates specialized AI agents in MSFT Foundry, and deploys the complete application stack. Feel free to modify it as needed; it is only a reference. Please refer to TechWorkshop L300: AI Apps and Agents and, if needed, contact Microsoft directly (Microsoft Sales and Support) for more guidance.
Important
The deployment process typically takes 15-20 minutes
- Adjust the values in terraform.tfvars
- Initialize Terraform with `terraform init`. The deployment process is described in more detail later in this document
- Run `terraform apply`. This automatically handles the entire deployment, including agent creation and configuration
Warning
- Multi-Region Deployment: Sweden Central hosts 4 models + 2 agents, East US hosts 1 model.
- All models use GlobalStandard SKU for optimal performance and availability.
- Hybrid Agent Architecture: 2 Azure AI Agents for chat-based orchestration + code-based orchestration for media processing
- Multi-Region Deployment:
  - Sweden Central: 4 models + 2 agents
    - Models: model-router, GPT-4o, Sora, FLUX.1-Kontext-pro
    - Agents: `zava-media-orchestrator`, `vision-analyst`
  - East US: 1 model (no agents)
    - Models: FLUX.2-pro
- 2 Azure AI Agents (chat-based via Responses API):
  - `zava-media-orchestrator`: Central request router using the `model-router` chat model, which routes to 18+ other models.
  - `vision-analyst`: Object detection and coordinate analysis using the `GPT-4o` chat model with vision (provides JSON coordinates via HTTPS). It analyzes images to detect objects and returns bounding box coordinates as JSON. Application code handles the actual image manipulation (cropping, resizing, etc.) using the provided coordinates.
- Code-Based Orchestration for generation tasks:
  - Video Generation: Direct calls to `Sora` (Sweden Central), a video generation model (not used by agents, called directly via code).
  - Image Generation: Direct calls to `FLUX.1-Kontext-pro` (Sweden Central) and `FLUX.2-pro` (East US), image generation models (not used by agents, called directly via code).
- Real-Time Image Processing: Upload or paste images directly into the chat for immediate agent action
- Real MSFT Foundry Agents: Integrates with MSFT Foundry to create and host persistent agents across multiple projects
- Zero-Touch Deployment: A single terraform apply command handles the entire lifecycle
- Advanced Task Coordination: Inter-agent task delegation (e.g., "Crop this, then change background, then add text")
- Dynamic Configuration: All settings are managed via terraform.tfvars. No code changes are needed; just add your values there.
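The hand-off between `vision-analyst` and the application code can be sketched as follows. This is a minimal illustration, not the repository's implementation: the JSON payload shape (`label`, `x`, `y`, `width`, `height`, in pixels) and the function name are assumptions, since the README only states that the agent returns bounding box coordinates as JSON.

```python
import json


def crop_box_from_agent_json(payload: str, image_width: int, image_height: int):
    """Convert bounding-box JSON into a clamped (left, top, right, bottom) tuple.

    The payload schema is hypothetical; adapt the keys to whatever the agent returns.
    """
    data = json.loads(payload)
    left = max(0, int(data["x"]))
    top = max(0, int(data["y"]))
    # Clamp the far edges so the box never extends past the image.
    right = min(image_width, int(data["x"]) + int(data["width"]))
    bottom = min(image_height, int(data["y"]) + int(data["height"]))
    if right <= left or bottom <= top:
        raise ValueError("bounding box lies outside the image")
    return (left, top, right, bottom)


# Hypothetical agent response for a 480x640 upload.
example = '{"label": "car", "x": 120, "y": 80, "width": 400, "height": 300}'
box = crop_box_from_agent_json(example, image_width=480, image_height=640)
print(box)  # (120, 80, 480, 380)
```

The resulting tuple uses the (left, upper, right, lower) order that Pillow's `Image.crop` accepts, so the cropping itself could stay in application code, as the feature list describes.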
Important
Agents use CHAT models only (not image generation models). GPT-4o is a chat model with vision: it can see and analyze images in conversation, but it does not generate images.
How It Works:
- Orchestrator Agent (model-router - chat model) receives user requests and routes them appropriately
- Vision Analyst Agent (GPT-4o - chat model with vision) can SEE images in chat and provide object detection coordinates as JSON
- Code Orchestration calls generation models directly:
  - Video generation (Sora - not an agent, direct API call)
  - Image generation (FLUX.1-Kontext-pro and FLUX.2-pro - not agents, direct API calls)
- Key Distinction:
  - Agents = Chat Models (model-router, GPT-4o) for conversation and analysis
  - Code = Generation Models (Sora, FLUX) for creating videos/images
  - GPT-4o is a CHAT model that can see images, NOT an image generation model
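The agent-versus-code split above could be expressed as a small dispatch table. This is only a sketch under stated assumptions: the task labels (`chat`, `vision`, `video`, `image`), the `Target` type, and the fallback to the orchestrator are hypothetical; only the agent and model names come from this README.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Target:
    kind: str  # "agent" (chat model) or "direct" (generation model called from code)
    name: str


# Task labels are illustrative; agent and model names are the ones listed above.
ROUTES = {
    "chat": Target("agent", "zava-media-orchestrator"),
    "vision": Target("agent", "vision-analyst"),
    "video": Target("direct", "Sora"),
    "image": Target("direct", "FLUX.1-Kontext-pro"),
}


def resolve(task: str) -> Target:
    """Return the target for a task, falling back to the orchestrator agent (an assumption)."""
    return ROUTES.get(task, ROUTES["chat"])


print(resolve("video"))
print(resolve("something-else"))
```

Keeping generation models out of the agent entries mirrors the key distinction: agents are backed by chat models, while generation models are reached through direct calls.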
Warning
Azure Quota and Model Availability
The models deployed (model-router, GPT-4o, FLUX.2-pro, FLUX.1-Kontext-pro, Sora) require GPU capacity and are subject to Azure quotas. If you encounter deployment errors related to "Insufficient Quota", request a quota increase: Azure Support
graph TD
User[User] <--> UI[Media Studio UI]
UI <--> App[FastAPI Application]
App <--> Orchestrator[zava-media-orchestrator<br/>Model Router Chat Model<br/>Sweden Central]
App <--> Vision[vision-analyst<br/>GPT-4o Chat + Vision<br/>Object Detection & Coordinates<br/>Sweden Central]
App <--> CodeOrch[Code-Based Orchestration]
CodeOrch --> Sora[Sora<br/>Video Generation<br/>Sweden Central]
CodeOrch --> FLUX1[FLUX.1-Kontext-pro<br/>Image Generation<br/>Sweden Central]
CodeOrch --> FLUX2[FLUX.2-pro<br/>Image Generation<br/>East US]
subgraph "Azure AI Agents - Chat Models Only"
Orchestrator
Vision
end
subgraph "Sweden Central - Generation Models"
Sora
FLUX1
end
subgraph "East US - Generation Models"
FLUX2
end
Architecture Distribution:
- 2 Azure AI Agents (Sweden Central): `zava-media-orchestrator` (model-router), `vision-analyst` (GPT-4o)
- Generation Models: Sora, FLUX.1-Kontext-pro (Sweden Central), FLUX.2-pro (East US)
- Key: As of now, agents use chat models, per the Azure AI Agents SDK design
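For reference, the distribution above can be written down as plain data and queried. The structure and helper names below are assumptions made for illustration; the regions, models, and agents are the ones listed in this README.

```python
# Region -> deployed models and agents, as listed in this README.
TOPOLOGY = {
    "Sweden Central": {
        "models": ["model-router", "GPT-4o", "Sora", "FLUX.1-Kontext-pro"],
        "agents": ["zava-media-orchestrator", "vision-analyst"],
    },
    "East US": {
        "models": ["FLUX.2-pro"],
        "agents": [],
    },
}


def region_for_model(model: str) -> str:
    """Return the region hosting a model deployment, or raise if it is unknown."""
    for region, resources in TOPOLOGY.items():
        if model in resources["models"]:
            return region
    raise KeyError(f"model {model!r} is not deployed in any configured region")


def summarize() -> dict:
    """Count (models, agents) per region, e.g. to sanity-check a configuration change."""
    return {r: (len(v["models"]), len(v["agents"])) for r, v in TOPOLOGY.items()}


print(region_for_model("FLUX.2-pro"))  # East US
print(summarize())  # {'Sweden Central': (4, 2), 'East US': (1, 0)}
```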
When you run `terraform apply`, the following automated sequence occurs:
- Infrastructure Provisioning:
  - Creates a Resource Group, 2 Azure AI Foundry projects (Sweden Central + East US), Key Vault, Storage Account, and Container Registry (ACR)
  - Multi-Region Model Deployment:
    - Sweden Central (4 models):
      - Model Router (Orchestrator - automatic model selection from 18+ options)
      - GPT-4o (Vision and cropping tasks)
      - Sora (Native video generation)
      - FLUX.1-Kontext-pro (Document processing and contextual understanding)
    - East US (1 model):
      - FLUX.2-pro (Background generation, thumbnail creation, artistic image manipulation)
  - All models use GlobalStandard SKU for optimal performance
  - All resources use Managed Identity for secure authentication (no API keys stored)
- Automated Agent Creation:
  - Fully automated by Terraform: No manual intervention required
  - Installs the `azure-ai-projects` SDK and connects to MSFT Foundry projects in both regions
  - Creates specialized media processing agents:
    - Sweden Central: `zava-media-orchestrator`, `vision-analyst`
    - East US: No agents (models accessed directly via code)
  - Automatically stores agent IDs, with region prefixes, in Azure Key Vault for secure access
  - Web app retrieves agent configuration from Key Vault automatically
  - Zero manual configuration: Terraform handles all multi-region agent deployment and setup
- Application Deployment:
  - Builds the Docker container in the cloud (ACR Build)
  - Configures the Azure Web App with the generated Agent IDs and Managed Identity
  - Deploys the container and restarts the app
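The sequence above mentions that agent IDs are stored in Key Vault with region prefixes. The README does not give the exact naming scheme, so the sketch below assumes a hypothetical `<region>-<agent>-id` pattern; the one hard constraint it relies on is that Key Vault secret names may contain only letters, digits, and dashes and must be 1-127 characters long.

```python
import re

# Anything outside Key Vault's allowed secret-name characters.
_INVALID = re.compile(r"[^0-9A-Za-z-]+")


def agent_secret_name(region: str, agent_name: str) -> str:
    """Build a secret name such as 'sweden-central-vision-analyst-id' (hypothetical scheme)."""
    raw = f"{region}-{agent_name}-id".lower()
    name = _INVALID.sub("-", raw).strip("-")
    if not 1 <= len(name) <= 127:
        raise ValueError("Key Vault secret names must be 1-127 characters")
    return name


print(agent_secret_name("Sweden Central", "zava-media-orchestrator"))
# sweden-central-zava-media-orchestrator-id
```

With the `azure-keyvault-secrets` package, a name built this way could then be passed to `SecretClient.get_secret(name)`; the vault URL and credential setup depend on the deployment and are left out here.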
After deployment completes, verify the system:
- Check the Web App:
  - The Terraform output will provide the `application_url`
  - Visit `https://<your-app-name>.azurewebsites.net`
  - You should see the Zava Media AI interface
- Verify Agent Architecture:
  - Go to the MSFT Foundry Portal
  - Check Sweden Central Project -> Build -> Agents:
    - Should see: `zava-media-orchestrator` and `vision-analyst`
  - Check East US Project:
    - Note: No agents are created in East US. The FLUX.2-pro model is accessed directly via code.
  - Agent IDs are automatically stored in Azure Key Vault with region prefixes and retrieved by the web app
- Test Processing: For example:
  - Chat: Ask for information, e.g. "What is GitHub Copilot?"
  - Image Upload: Upload an image and ask "Crop the main subject"
  - Background: "Change the background to a beach scene" (routed to East US for fast generation)
  - Thumbnail: "Create a thumbnail with the text 'AMAZING'" (routed to East US)
  - Multi-Step: "Crop the car, put it on a race track background, and add the text 'SPEED' in red"
  - Video: "Generate a 5-second video of a sunset over mountains" (Sweden Central - Sora)
  - Document: "Extract all text from this PDF" or "Summarize this document" (Sweden Central - FLUX.1-Kontext-pro)
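The multi-step prompt in the list above implies that each step consumes the output of the previous one. A minimal sketch of that chaining follows, assuming the request has already been broken into an ordered plan; the `run_plan` helper, the step labels, and the stub handlers are hypothetical and stand in for calls to `vision-analyst` or an image generation model.

```python
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str]  # (operation, argument)


def run_plan(image: str, plan: List[Step],
             handlers: Dict[str, Callable[[str, str], str]]) -> str:
    """Apply each step in order, feeding the previous result into the next handler."""
    result = image
    for operation, argument in plan:
        if operation not in handlers:
            raise KeyError(f"no handler registered for {operation!r}")
        result = handlers[operation](result, argument)
    return result


# Stub handlers that only record what would happen to the image.
handlers = {
    "crop": lambda img, subject: f"{img}|crop:{subject}",
    "background": lambda img, scene: f"{img}|background:{scene}",
    "text": lambda img, text: f"{img}|text:{text}",
}

plan = [("crop", "car"), ("background", "race track"), ("text", "SPEED in red")]
print(run_plan("upload.png", plan, handlers))
# upload.png|crop:car|background:race track|text:SPEED in red
```

Failing fast on an unknown operation keeps a partially applied plan from silently producing a wrong image.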

