A production-ready Azure Functions implementation of a Retrieval Augmented Generation (RAG) agent, leveraging ChromaDB as a vector store and LlamaIndex for orchestration.
This serverless API answers queries by retrieving relevant documents from a vector store and passing them to a language model. The architecture follows the standard RAG pattern, combining a pre-trained language model with domain-specific knowledge retrieval.
- Serverless Deployment: Azure Functions for efficient scaling and reduced operational overhead
- Vector-Based Retrieval: ChromaDB integration for semantic similarity search
- Agent-Based Interaction: ReActAgent framework for sophisticated query processing
- Token Authentication: Bearer token check on every API request
- CORS Support: Cross-origin resource sharing for web client integration
- Cold Start Optimization: The agent is initialized once and reused across invocations
- Comprehensive Logging: Structured logging for observability and troubleshooting
Prerequisites:
- Azure Functions development environment
- Python 3.12+
- Azure subscription
- OpenAI API key
- ChromaDB instance
The application requires the following environment variables:
| Variable | Description | Default |
|---|---|---|
| CHROMA_HOST | ChromaDB server IP or hostname | - |
| CHROMA_PORT | ChromaDB server port | 8000 |
| CHROMA_COLLECTION_NAME | ChromaDB collection name | knowledge |
| OPENAI_API_KEY | OpenAI API key | - |
| OPENAI_MODEL_NAME | OpenAI model identifier | o3-mini |
| OPENAI_EMBEDDING_MODEL_NAME | OpenAI embedding model | text-embedding-3-small |
| FUNCTION_API_TOKEN | Authentication token for the API | - |
| VERBOSE | Enable verbose agent logging | false |
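
As a minimal sketch of how these variables might be read at startup, the snippet below loads them into a typed settings object and fails early when a value is missing. The `Settings` class and `load_settings` function are hypothetical names, not taken from the project, and treating every variable with a `-` default as required is an assumption.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Assumption: variables listed with "-" as their default are mandatory.
REQUIRED = ("CHROMA_HOST", "OPENAI_API_KEY", "FUNCTION_API_TOKEN")


@dataclass(frozen=True)
class Settings:
    """Hypothetical container for the documented environment variables."""
    chroma_host: str
    openai_api_key: str
    function_api_token: str
    chroma_port: int = 8000
    chroma_collection_name: str = "knowledge"
    openai_model_name: str = "o3-mini"
    openai_embedding_model_name: str = "text-embedding-3-small"
    verbose: bool = False


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from a mapping, applying the documented defaults."""
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return Settings(
        chroma_host=env["CHROMA_HOST"],
        openai_api_key=env["OPENAI_API_KEY"],
        function_api_token=env["FUNCTION_API_TOKEN"],
        chroma_port=int(env.get("CHROMA_PORT", "8000")),
        chroma_collection_name=env.get("CHROMA_COLLECTION_NAME", "knowledge"),
        openai_model_name=env.get("OPENAI_MODEL_NAME", "o3-mini"),
        openai_embedding_model_name=env.get(
            "OPENAI_EMBEDDING_MODEL_NAME", "text-embedding-3-small"
        ),
        # Only the literal string "true" (any case) enables verbose mode.
        verbose=env.get("VERBOSE", "false").strip().lower() == "true",
    )
```

Passing a plain dictionary instead of `os.environ` keeps the loader easy to exercise in tests.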
1. Clone this repository:

       git clone https://github.com/poacosta/agentic-rag-serverless-api-with-azure.git
       cd agentic-rag-serverless-api-with-azure

2. Create and activate a virtual environment:

       python -m venv .venv
       source .venv/bin/activate  # On Windows: .venv\Scripts\activate

3. Install dependencies:

       pip install -r requirements.txt

4. Set up local settings:

       cp local.settings.example.json local.settings.json

5. Edit local.settings.json with your environment variables.

6. Start the Azure Functions runtime:

       func start

Endpoint: /api/query
Method: POST
Headers:
- Content-Type: application/json
- Authorization: Bearer {FUNCTION_API_TOKEN}
Request Body:

    {
      "query": "Your question or query text here"
    }

Response:

    {
      "result": "Response from the agent",
      "status": "success"
    }

Error Response:

    {
      "status": "error",
      "message": "Error description"
    }

Deployment:

    az login
    az account set --subscription <subscription-id>
    func azure functionapp publish <function-app-name>

The system initializes an agent using LlamaIndex's ReActAgent, which follows the ReAct (reasoning and acting) pattern: it alternates between reasoning steps and tool calls to work through complex queries. The agent is initialized once and reused across later invocations on the same instance, so only the first request after a cold start pays the initialization cost.
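
The initialize-once behaviour described above can be sketched as a module-level cache guarded by a lock. This is an illustration of the general pattern, not the project's actual code: `get_agent` and the `factory` argument are hypothetical names, and the factory stands in for whatever builds the real ReActAgent.

```python
import threading
from typing import Any, Callable, Optional

# Module-level state survives between invocations on a warm instance.
_agent: Optional[Any] = None
_agent_lock = threading.Lock()


def get_agent(factory: Callable[[], Any]) -> Any:
    """Return the cached agent, building it with `factory` on first use."""
    global _agent
    if _agent is None:
        with _agent_lock:
            # Re-check inside the lock so concurrent first requests
            # do not build the agent twice.
            if _agent is None:
                _agent = factory()
    return _agent
```

With this shape, the expensive setup runs on the first request handled by a new instance, and later requests on that instance reuse the same object.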
The implementation uses ChromaDB as a vector database, accessed over HTTP through ChromaDB's client. The vector store holds embeddings of the knowledge base documents, which enables semantic search.
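
A connection along these lines might look like the sketch below. It assumes the `chromadb`, `llama-index-vector-stores-chroma`, and `llama-index-embeddings-openai` packages are installed and that `OPENAI_API_KEY` is set; `build_index` is a hypothetical helper name, and the project's own wiring may differ.

```python
def build_index(
    host: str,
    port: int = 8000,
    collection_name: str = "knowledge",
    embedding_model: str = "text-embedding-3-small",
):
    """Connect to a remote ChromaDB collection and wrap it in a LlamaIndex index."""
    # Imports are kept inside the function so that importing this module
    # does not require the third-party packages to be present.
    import chromadb
    from llama_index.core import VectorStoreIndex
    from llama_index.embeddings.openai import OpenAIEmbedding
    from llama_index.vector_stores.chroma import ChromaVectorStore

    # HTTP client for a ChromaDB server running elsewhere.
    client = chromadb.HttpClient(host=host, port=port)
    collection = client.get_or_create_collection(collection_name)

    # Expose the existing collection to LlamaIndex without re-ingesting documents.
    vector_store = ChromaVectorStore(chroma_collection=collection)
    return VectorStoreIndex.from_vector_store(
        vector_store,
        embed_model=OpenAIEmbedding(model=embedding_model),
    )
```

The embedding model used for queries should match the one used when the collection was populated, otherwise similarity scores are not meaningful.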
Request flow:
- Request authentication and validation
- Agent initialization (if not already initialized)
- Query processing through ReActAgent
- Response construction and delivery
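
The steps listed above can be sketched as a single framework-agnostic function. This is an illustration under assumptions, not the project's handler: `handle_query` and `run_agent` are hypothetical names, the 400 status for malformed input is an assumption (the source only mentions 401 and 500), and header lookup assumes the key is spelled `Authorization`.

```python
import hmac
import json
from typing import Callable, Dict, Mapping, Tuple

Response = Tuple[int, Dict[str, str]]


def handle_query(
    headers: Mapping[str, str],
    body: bytes,
    expected_token: str,
    run_agent: Callable[[str], str],
) -> Response:
    """Authenticate, validate, run the agent, and build the JSON payload."""
    # 1. Authentication: expect "Authorization: Bearer <token>".
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    token_ok = hmac.compare_digest(token.encode(), expected_token.encode())
    if scheme != "Bearer" or not token_ok:
        return 401, {"status": "error", "message": "Unauthorized"}

    # 2. Validation: the body must be a JSON object with a non-empty "query".
    try:
        payload = json.loads(body)
    except ValueError:
        return 400, {"status": "error", "message": "Invalid JSON body"}
    query = payload.get("query") if isinstance(payload, dict) else None
    if not isinstance(query, str) or not query.strip():
        return 400, {"status": "error", "message": "Missing 'query' field"}

    # 3. Query processing: `run_agent` stands in for the cached ReActAgent.
    try:
        result = run_agent(query)
    except Exception:
        return 500, {"status": "error", "message": "Internal server error"}

    # 4. Response construction, matching the documented success shape.
    return 200, {"result": str(result), "status": "success"}
```

Injecting `run_agent` as a callable keeps the flow testable without a live ChromaDB or OpenAI connection.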
Performance considerations:
- Cold Start: The agent is initialized on first use and cached, so only the first request on a new instance pays the setup cost
- Memory Usage: Monitor function memory consumption as the vector database grows
- Scaling: Azure Functions automatically scales based on demand
- Timeout Limits: Configure function timeout settings appropriate for complex queries
Security:
- API token authentication
- CORS configuration for controlled client access
- Environment variable encryption for sensitive configuration
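
To illustrate what controlled client access can mean in practice, the sketch below returns CORS headers only for origins on an allow-list. The `cors_headers` function, the allow-list, and the example origins are hypothetical; the project's actual CORS settings are not shown in this document.

```python
from typing import Dict, Iterable, Optional


def cors_headers(origin: Optional[str], allowed_origins: Iterable[str]) -> Dict[str, str]:
    """Return CORS response headers for an allowed origin, or none at all."""
    if origin is None or origin not in set(allowed_origins):
        # Unknown origins get no CORS headers, so browsers block the response.
        return {}
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": "POST, OPTIONS",
        "Access-Control-Allow-Headers": "Content-Type, Authorization",
        # Tell caches that the response varies with the request origin.
        "Vary": "Origin",
    }
```

Echoing back a specific allowed origin, rather than `*`, keeps the API usable from known web clients while leaving other sites blocked by the browser.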
Common Issues:
- 401 Unauthorized: Check your Authorization header format and token value
- 500 Internal Server Error: Verify ChromaDB connection and OpenAI API key validity
- Agent Initialization Failure: Ensure all required environment variables are set
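
When narrowing down which of these issues applies, it can help to send a known-good request and inspect the status code. The sketch below builds such a request with the standard library; `build_query_request` is a hypothetical helper, and `http://localhost:7071` is assumed to be the local address used by `func start`.

```python
import json
import urllib.request


def build_query_request(base_url: str, token: str, query: str) -> urllib.request.Request:
    """Build a POST request for /api/query with the documented headers and body."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/query",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
    )


# Example usage (requires a running function host, so it is left commented out):
# import urllib.error
# req = build_query_request("http://localhost:7071", "my-token", "What is RAG?")
# try:
#     with urllib.request.urlopen(req) as resp:
#         print(resp.status, resp.read().decode())
# except urllib.error.HTTPError as err:
#     print(err.code, err.read().decode())  # e.g. 401 or 500
```

Note that `urlopen` raises `HTTPError` for 4xx and 5xx responses, so the status code is read from the exception in those cases.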
Logging:
The application uses Python's standard logging module. Set VERBOSE=true for detailed agent operation logs and thoughts.
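
One way the flag could feed into the standard logging module is sketched below. Mapping `VERBOSE=true` to the `DEBUG` level and the logger name `rag_agent` are both assumptions made for illustration; the project may instead pass the flag straight to the agent.

```python
import logging


def configure_logging(verbose_value: str, logger_name: str = "rag_agent") -> int:
    """Set the logger level from a VERBOSE-style string and return that level."""
    verbose = verbose_value.strip().lower() == "true"
    # Assumption: verbose mode means DEBUG, otherwise INFO.
    level = logging.DEBUG if verbose else logging.INFO
    logging.getLogger(logger_name).setLevel(level)
    return level
```

In a function app this would typically be called once at startup with `os.environ.get("VERBOSE", "false")`.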
This project is licensed under the MIT License - see the LICENSE file for details.
