A Flow-Engineered Agent that lifts LLM code generation from ~30% to ~80% pass rate on hard problems through iterative, sandbox-validated repair loops.
Large Language Models (LLMs) often generate code that looks correct but fails on edge cases or runtime constraints. "Zero-shot" prompting hits a ceiling (~30-40% on hard problems).
Instead of asking once, AlphaKhulnasoft treats code generation as a Search & Repair problem.
- Analyze: Semantic parsing of constraints (System 2 thinking).
- Generate: Drafts initial solution.
- Adversarial Test: Runs code in a secure `subprocess` sandbox.
- Root Cause Analysis: Feeds specific error logs (stderr) back to the agent.
- Iterative Repair: Loops until success or max retries.
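To make the loop concrete, here is a minimal, self-contained sketch of the Analyze → Generate → Test → Repair cycle. It illustrates the idea rather than AlphaKhulnasoft's actual API: `ask_llm` is a placeholder for whatever model call you use, and the sandbox is reduced to a bare `subprocess` run with a timeout.

```python
import subprocess
import sys
import tempfile

MAX_RETRIES = 5

def run_in_sandbox(code: str, timeout: int = 10) -> tuple[bool, str]:
    """Execute candidate code in a separate Python process; return (passed, stderr)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    try:
        proc = subprocess.run(
            [sys.executable, path], capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return False, f"TimeoutExpired: exceeded {timeout}s"
    return proc.returncode == 0, proc.stderr

def solve(problem: str, ask_llm) -> str | None:
    """Search & Repair: generate, test in the sandbox, feed stderr back, retry."""
    plan = ask_llm(f"Analyze the constraints of this problem:\n{problem}")          # Analyze
    code = ask_llm(f"Write a Python solution.\nProblem: {problem}\nPlan: {plan}")   # Generate
    for _ in range(MAX_RETRIES):
        passed, stderr = run_in_sandbox(code)                                       # Adversarial Test
        if passed:
            return code                                                             # Solved
        hypothesis = ask_llm(f"The code failed with:\n{stderr}\nDiagnose the root cause.")  # Root Cause
        code = ask_llm(f"Apply a fix.\nDiagnosis: {hypothesis}\nCode:\n{code}")     # Iterative Repair
    return None  # give up after MAX_RETRIES
```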
| Metric | Zero-Shot (Baseline) | AlphaKhulnasoft (Iter 5) | Improvement |
|---|---|---|---|
| Pass Rate | 30% | 80% | +166% |
| Logic | Implicit | Chain-of-Thought | N/A |
| Safety | None | Sandboxed | ✅ |
(See repair_curve.png for the full trajectory)
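For clarity, the Improvement column is relative to the baseline pass rate, not a difference in percentage points:

```python
baseline, flow_engineered = 0.30, 0.80   # pass rates from the table above
relative_gain = (flow_engineered - baseline) / baseline
# 0.50 / 0.30 ≈ 1.66, i.e. the "+166%" reported in the table
```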
```mermaid
graph TD
A[Problem] --> B(Semantic Analyzer)
B --> C{Code Generator}
C --> D[Sandbox Execution]
D -->|Pass| E[✅ Solved]
D -->|Fail| F[Root Cause Agent]
F -->|Hypothesis| C
```
```bash
git clone https://github.com/KhulnaSoft/alphakhulnasoft.git
cd alphakhulnasoft
uv sync
cp .env.example .env  # Add your OPENAI_API_KEY
```

Bootstrap a hard dataset using the LLM itself:

```bash
uv run python -m alphakhulnasoft.dataset_gen
```

Execute the flow-engineering benchmark:

```bash
uv run python -m alphakhulnasoft.benchmark data/hard_mode.jsonl
```

Generate the efficiency report and visualization:

```bash
uv run python -m alphakhulnasoft.visualizer results_latest.json
```

If you prefer containerized execution:
```bash
# 1. Build and Run the benchmark
docker compose run alphakhulnasoft

# 2. Run a specific command
docker compose run alphakhulnasoft python -m alphakhulnasoft.dataset_gen
```

Images are automatically published to the GitHub Container Registry:

```bash
docker pull ghcr.io/khulnasoft/alphakhulnasoft:main
```

We use modern tooling to ensure high code quality:
- Linting & Formatting: `ruff`
- Type Checking: `mypy`
- Testing: `pytest`
Run quality checks locally:
```bash
# Lint & Format check
uv run ruff check .
uv run ruff format --check .

# Type check
uv run mypy alphakhulnasoft

# Run tests
uv run pytest tests/
```

CI is automatically handled by GitHub Actions on every push to main.
- alphakhulnasoft/alpha_repair.py: Flow state and logic.
- alphakhulnasoft/prompts.py: The specialized personas (Architect, Debugger).
- alphakhulnasoft/sandbox.py: Secure execution engine.
- alphakhulnasoft/evaluator.py: Scoring and metrics logic.
- alphakhulnasoft/visualizer.py: Research-grade plotting.
- alphakhulnasoft/data_loader.py: Ingestion from local and Hugging Face.
- alphakhulnasoft/publisher.py: Results sharing to HF Hub.
AlphaKhulnasoft now integrates directly with the Hugging Face ecosystem:
- Load Datasets: Fetch popular coding benchmarks (HumanEval, MBPP) directly from the HF Hub using `openai_humaneval` or `mbpp`.
- Publish Results: Automatically push your benchmark reports to a HF Dataset repository.
- Serve Models: Use `huggingface/` model prefixes via `litellm` to run local or Inference API models.
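For orientation, the snippet below shows the two upstream pieces this list relies on: loading a benchmark from the HF Hub with the `datasets` library and routing a completion through `litellm`'s `huggingface/` prefix. It is a minimal sketch of those libraries' standard usage, not AlphaKhulnasoft's own loader or publisher API, and the model name is only an example.

```python
from datasets import load_dataset
import litellm

# Fetch a coding benchmark directly from the Hugging Face Hub.
humaneval = load_dataset("openai_humaneval", split="test")
print(humaneval[0]["prompt"][:80])

# Route a completion through litellm's huggingface/ prefix
# (example model name; needs an HF token or a local inference endpoint).
response = litellm.completion(
    model="huggingface/bigcode/starcoder2-15b",
    messages=[{"role": "user", "content": "Write a function that reverses a string."}],
)
print(response.choices[0].message.content)
```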
AlphaKhulnasoft is enterprise-ready with support for:
- Google Cloud Vertex AI: Run Gemini models with enterprise-grade security and reliability.
- Subprocess Isolation: Standard isolation for safe code execution (ready for optional Docker/nsjail hardening).
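As a rough illustration of what the subprocess boundary can be hardened into before reaching for Docker or nsjail, the sketch below caps CPU time and memory for the child process on POSIX systems. It is an assumption-laden example, not the project's `sandbox.py`; `untrusted_solution.py` is a hypothetical file name.

```python
import resource
import subprocess
import sys

def limit_child() -> None:
    # Applied inside the child before exec (POSIX only):
    # cap CPU time at 5 s and address space at 256 MiB.
    resource.setrlimit(resource.RLIMIT_CPU, (5, 5))
    resource.setrlimit(resource.RLIMIT_AS, (256 * 1024**2, 256 * 1024**2))

proc = subprocess.run(
    [sys.executable, "untrusted_solution.py"],   # hypothetical candidate file
    capture_output=True,
    text=True,
    timeout=10,              # wall-clock cap on top of the CPU limit
    preexec_fn=limit_child,
)
print(proc.returncode, proc.stderr[:200])
```

Swapping the raw `subprocess.run` call for a Docker or nsjail invocation is the optional hardening path the bullet above refers to.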
Built as a demonstration of System 2 AI Architecture.