"Our API crashed at 3 AM and we have no idea why."
"This endpoint is slow, but we can't figure out where the bottleneck is."
"We're getting hammered by bots and can't stop them."
Sound familiar? Most APIs are built without proper observability, security, or testing—making debugging a nightmare.
This project is a production-ready template that shows you how to build APIs that are:
- 🔍 Observable — Every request is traced, every error is logged in structured JSON
- 🤖 Intelligent — Uses LLMs (Groq/OpenAI) to analyze logs and suggest fixes
- 🛡️ Secure — Rate limiting & JWT authentication protect your resources
- ✅ Tested — Automated tests catch bugs before they reach production
- 🚀 CI/CD Ready — Push code, tests run automatically
| Skill | Implementation |
|---|---|
| API Development | FastAPI with async endpoints |
| Log Analysis (AI) | Automated error triage via LLM (Groq/OpenAI/Google) |
| Structured Logging | JSON logs via Structlog (ELK/Datadog ready) |
| Distributed Tracing | OpenTelemetry instrumentation |
| Error Handling | Global exception middleware |
| Rate Limiting | Slowapi with IP-based throttling |
| Authentication | JWT tokens with OAuth2 password flow |
| Configuration | Environment-aware settings (Pydantic) |
| Containerization | Multi-stage Docker build |
| Automated Testing | Pytest with async support |
| CI/CD | GitHub Actions with matrix testing |
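To give a flavour of how these pieces fit together, here is a minimal sketch of the rate-limiting row above: IP-based throttling with Slowapi on a FastAPI route. This is the standard Slowapi decorator pattern, not necessarily the exact wiring used in `src/main.py`.

```python
from fastapi import FastAPI, Request
from slowapi import Limiter, _rate_limit_exceeded_handler
from slowapi.errors import RateLimitExceeded
from slowapi.util import get_remote_address

# Key requests by client IP so each address gets its own quota
limiter = Limiter(key_func=get_remote_address)

app = FastAPI()
app.state.limiter = limiter
app.add_exception_handler(RateLimitExceeded, _rate_limit_exceeded_handler)

@app.get("/health")
@limiter.limit("5/minute")  # matches the 5/min limit on /health in the endpoint table
async def health(request: Request):  # the Request parameter is required by Slowapi
    return {"status": "ok"}
```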
```bash
git clone https://github.com/daretechie/api-reliability-suite.git
cd api-reliability-suite

# The fast way (using Makefile)
make install
make run

# ...or manually with Poetry
poetry install
poetry run uvicorn src.main:app --reload
```

```bash
# Using Makefile
make docker-build
make docker-run

# ...or manually
docker build -t reliability-suite .
docker run -p 8000:8000 reliability-suite
```

| Endpoint | Auth | Rate Limited | Description |
|---|---|---|---|
| `GET /health` | ❌ | ✅ 5/min | Health check |
| `GET /slow` | ❌ | ❌ | Simulates a slow request (tracing demo) |
| `POST /login` | ❌ | ❌ | Get JWT token (`demo`/`secret123`) |
| `GET /protected` | ✅ | ❌ | Protected route (requires JWT) |
| `GET /debug/summarize-errors` | ✅ | ❌ | AI analyzes logs and returns insights 🤖 |
| `GET /metrics` | ❌ | ❌ | Prometheus metrics for Grafana 📊 |
| `GET /force-error` | ❌ | ❌ | Triggers 500 error (error handling demo) |
| `GET /docs` | ❌ | ❌ | Interactive Swagger docs |
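For a quick smoke test of the auth flow, something like the following works (a sketch using `httpx`; the form-field names and the `access_token` response key are assumptions based on the standard OAuth2 password flow):

```python
import httpx

BASE_URL = "http://localhost:8000"

with httpx.Client(base_url=BASE_URL) as client:
    # OAuth2 password flow sends credentials as form data, not JSON
    resp = client.post("/login", data={"username": "demo", "password": "secret123"})
    resp.raise_for_status()
    token = resp.json()["access_token"]  # assumed response key

    # Use the bearer token to call the protected route
    protected = client.get("/protected", headers={"Authorization": f"Bearer {token}"})
    print(protected.status_code, protected.json())
```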
This suite supports OTLP (the OpenTelemetry Protocol) for production-grade tracing.
To see your traces in a dashboard, update your .env:
```
OTLP_ENDPOINT="http://localhost:4317"
```

Then run Jaeger via Docker:

```bash
docker run -d --name jaeger \
  -e COLLECTOR_OTLP_ENABLED=true \
  -p 16686:16686 \
  -p 4317:4317 \
  jaegertracing/all-in-one:latest
```

View your traces at http://localhost:16686.
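Under the hood, pointing the OpenTelemetry SDK at that endpoint looks roughly like this (a minimal sketch rather than the project's exact wiring; assumes the `opentelemetry-sdk` and OTLP gRPC exporter packages are installed):

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import BatchSpanProcessor
from opentelemetry.exporter.otlp.proto.grpc.trace_exporter import OTLPSpanExporter

# Export spans over gRPC to the OTLP endpoint configured in .env
provider = TracerProvider()
provider.add_span_processor(
    BatchSpanProcessor(OTLPSpanExporter(endpoint="http://localhost:4317", insecure=True))
)
trace.set_tracer_provider(provider)

# Any span created now ends up in Jaeger at http://localhost:16686
tracer = trace.get_tracer(__name__)
with tracer.start_as_current_span("demo-span"):
    pass
```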
The app automatically exposes RED metrics (Rate, Errors, Duration) at /metrics.
Spin up the full observability stack with:
```bash
docker compose up -d --build
```

| Service | URL | Login |
|---|---|---|
| API | http://localhost:8000 | — |
| Prometheus | http://localhost:9099 | — |
| Grafana | http://localhost:3030 | admin / admin |
First-Time Grafana Setup:
- Open Grafana → Connections → Data Sources → Add Prometheus.
- Set URL to `http://prometheus:9090` (internal Docker DNS).
- Save & Test → Create your first dashboard!
This project includes a Self-Healing AI Agent that reads app.json logs and provides actionable insights.
How to use:
- Set an API key in `.env`: `GROQ_API_KEY`, `OPENAI_API_KEY`, or `GOOGLE_API_KEY`.
- Hit the `/debug/summarize-errors` endpoint (requires auth).
- Receive a JSON summary of root causes and fixes.
Why do we log to both Console and File?
- Console (`stdout`): Follows 12-Factor App principles. Allows Docker/K8s/Datadog to capture logs without config.
- File (`app.json`): Allows the AI Agent (running inside the app) to read its own history and perform self-diagnosis.
This "Loopback" architecture enables the application to debug itself without external dependencies.
This project implements several advanced patterns that are often overlooked:
The system doesn't rely on a single AI provider. src/core/llm.py automatically detects which API keys are present (Groq, OpenAI, or Google) and dynamically selects the best available provider. This prevents vendor lock-in.
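Conceptually, the detection boils down to something like this (an illustrative sketch; the real priority order and return type in `src/core/llm.py` may differ):

```python
import os

def pick_llm_provider() -> str:
    """Return the first configured provider; the priority order here is an assumption."""
    if os.getenv("GROQ_API_KEY"):
        return "groq"
    if os.getenv("OPENAI_API_KEY"):
        return "openai"
    if os.getenv("GOOGLE_API_KEY"):
        return "google"
    raise RuntimeError("No LLM API key found (GROQ_API_KEY, OPENAI_API_KEY, or GOOGLE_API_KEY)")
```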
Tests (tests/conftest.py) are designed to run in complete isolation. We mock the OpenTelemetry exporters so that running make test doesn't require a running Jaeger instance or Docker container. The tests verify the logic, not the infrastructure.
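One way to get that isolation, sketched here with assumed fixture names rather than the project's actual `conftest.py`, is to swap the OTLP exporter for OpenTelemetry's in-memory exporter:

```python
import pytest
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import SimpleSpanProcessor
from opentelemetry.sdk.trace.export.in_memory_span_exporter import InMemorySpanExporter

@pytest.fixture(scope="session", autouse=True)
def in_memory_tracing():
    # Spans stay in memory, so running the suite needs no Jaeger or Docker
    exporter = InMemorySpanExporter()
    provider = TracerProvider()
    provider.add_span_processor(SimpleSpanProcessor(exporter))
    trace.set_tracer_provider(provider)
    yield exporter
    exporter.clear()
```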
Using Pydantic Settings (src/core/config.py), the application enforces strict type validation on startup. If a required environment variable is missing, the app crashes immediately (Fail Fast) rather than failing silently at runtime.
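The pattern looks roughly like this (a sketch; apart from the documented OTLP endpoint and API keys, the field names are placeholders):

```python
from pydantic_settings import BaseSettings, SettingsConfigDict

class Settings(BaseSettings):
    model_config = SettingsConfigDict(env_file=".env")

    app_name: str = "api-reliability-suite"
    jwt_secret: str                    # hypothetical required field: missing value -> ValidationError
    otlp_endpoint: str | None = None
    groq_api_key: str | None = None

# Instantiated at import time, so a missing or malformed variable crashes the
# app before it ever starts serving traffic (fail fast).
settings = Settings()
```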
This project uses Ruff for linting and Pre-Commit for quality checks.
```bash
# Install git hooks (runs automatically on commit)
make install-hooks

# Run tests
make test

# Format code manually
make format
```

If this template helps you, consider sponsoring my work!
Looking for a developer who understands API reliability, security, and DevOps? 📧 [email protected] | LinkedIn
