+ ╔╦╗╔═╗╔╦╗╔═╗ ╔═╗╔═╗╔═╗╔╗╔╔╦╗
+ ║║╠═╣ ║ ╠═╣ ╠═╣║ ╦║╣ ║║║ ║
+ ═╩╝╩ ╩ ╩ ╩ ╩ ╩ ╩╚═╝╚═╝╝╚╝ ╩
[ Natural Language → SQL Query Agent ]A natural language to SQL (NL2SQL) platform built on LangGraph and Azure OpenAI. This multi-agent system automatically routes user questions to the appropriate database backend and generates optimized SQL queries and results.
Built on top of LangChain's SQLDatabase with extended support for Azure AD authentication, Cosmos DB, and built-in dialect validation.
- Multi-Database Support: PostgreSQL, Azure SQL, Azure Synapse, Azure Cosmos DB, Databricks SQL, and Google BigQuery
- Intent Detection: Automatically routes queries to the correct data agent based on question context
- Multi-Turn Conversations: Follow-up questions with context awareness (e.g., "What's the average?" after a query)
- SQL Validation: Safe query execution with sqlglot-based validation across all dialects
- Data Visualization: Generate charts and graphs from query results using natural language (e.g., "show me a bar chart")
- Configurable Agents: YAML-based configuration for adding new data sources
- A2A Protocol: Agent-to-Agent interoperability for integration with other A2A-compliant systems
Routes user questions to the appropriate data agent based on context.
Generates, validates, and executes SQL queries with retry logic.
- Python 3.12+
- uv package manager
- Azure OpenAI deployment
git clone https://github.com/eosho/langchain_data_agent
cd langchain_data_agent
uv sync --all-extras
cp .env.example .env
# Edit .env with your valuesThe CLI provides commands for querying data agents through natural language.
# Show available commands
data-agent --helpCommands:
| Command | Description |
|---|---|
query |
Run a single query and exit |
chat |
Start interactive chat mode |
configs |
List available configurations |
validate |
Validate configuration files |
Options:
| Option | Description |
|---|---|
-c, --config |
Configuration to use (default: loads all configs) |
-v/-q, --verbose/--quiet |
Show/hide query state (agent, SQL, message history) |
-l, --log-level |
Logging level (debug, info, warning, error) |
Each config file contains multiple specialized data agents. The system automatically routes your question to the appropriate agent.
| Config | Agents | Domain |
|---|---|---|
contoso |
contoso_sales, contoso_products, contoso_inventory |
Retail data (Databricks, Cosmos DB, PostgreSQL) |
adventure_works |
contoso_hr, hotel_analytics |
HR & hotel data (Azure SQL, Synapse) |
amex |
financial_transactions |
Financial data (BigQuery) |
Example Questions by Config:
contoso (default)
contoso_sales (Databricks)
- How many orders were placed last month?
- What is the total revenue by region?
contoso_products (Cosmos DB)
- How many active products are there?
- Find all products with low inventory
- What is the most expensive product?
contoso_inventory (PostgreSQL)
- Which products need to be reordered?
- What is the total inventory by warehouse?
- How many shipments are in transit?
# Single query (loads all configs by default)
data-agent query "How many shipments are in transit?"
# Interactive chat
data-agent chat
# Use specific config
data-agent query "How many shipments are in transit?" -c contoso
data-agent chat -c contosoadventure_works
contoso_hr (Azure SQL)
- How many employees are in each department?
- What is the average salary by department?
- Who are the top performers this year?
hotel_analytics (Synapse)
- What is the total revenue by hotel?
- What is the revenue breakdown by booking channel?
- What is the average daily rate by room type?
data-agent query "What is the total revenue by hotel?" -c adventure_works
data-agent chat -c adventure_worksamex
financial_transactions (BigQuery)
- What are the total deposits by customer segment?
- Show me all high-severity fraud alerts from the past week
- Who are the top 5 customers by transaction volume?
- Show me a bar chart of transactions by type
data-agent query "What are the total deposits by customer segment?" -c amex
data-agent chat -c amex# List available configs
data-agent configs
# Validate configuration files
data-agent validate # Validate all configs
data-agent validate contoso # Validate specific configThe platform includes a Chainlit-based web interface for interactive data exploration.
# Start the Chainlit UI
chainlit run src/data_agent/ui/app.pyAvailable Profiles:
| Profile | Description |
|---|---|
| Contoso | Retail sales database with products, customers, and orders |
| Amex | Financial transactions and merchant data |
| Adventure Works | Sample database with sales and product information |
Environment Variables:
Ensure these are set in your .env file:
AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/
AZURE_OPENAI_API_KEY=your-api-key
AZURE_OPENAI_DEPLOYMENT=gpt-4oimport asyncio
from data_agent import DataAgentFlow
async def main():
async with DataAgentFlow(
config_path="data_agent/config/contoso.yaml",
azure_endpoint="https://your-resource.openai.azure.com/",
api_key="your-api-key",
deployment_name="gpt-4o",
) as flow:
result = await flow.query("Show me all warehouses")
print(result["final_response"])
asyncio.run(main())The platform includes built-in configuration for these databases:
| Database | Datasource Type | SQL Dialect |
|---|---|---|
| PostgreSQL | postgres |
postgres |
| Azure SQL | azure_sql |
tsql |
| Azure Synapse | synapse |
tsql |
| Azure Cosmos DB | cosmos |
cosmosdb |
| Databricks SQL | databricks |
databricks |
| Google BigQuery | bigquery |
bigquery |
| MySQL | mysql |
mysql |
| SQLite | sqlite |
sqlite |
Note: Any SQLAlchemy-compatible database can be used via
shared_dbparameter orconnection_stringin config. The built-in types provide convenience configuration and AAD authentication support.
# Format and lint
uv run pre-commit run --all-files
# Run tests
uv run pytestMIT License - see LICENSE file for details.

