AI CLI - Ollama-Powered Chat Interface


A minimal, modular AI command-line interface that connects to Ollama services (local or remote) for interactive AI conversations.

🤖 AI-Assisted Development: This project leverages AI coding assistants (GitHub Copilot, Claude) to accelerate development iterations while maintaining human oversight for architecture decisions, code review, and quality control. AI assists with implementation details; humans drive the vision.

Features

Core CLI Features

  • 🤖 Connect to local or remote Ollama services
  • 💬 Interactive chat with AI models
  • 📁 @ Prefix Autocomplete - TAB completion for files/directories with automatic context injection
  • 🔍 RAG Context System - Redis-backed vector embeddings for semantic search
  • 🌳 Directory Tree Visualization - ASCII tree structure for directory context
  • 🐍 Code Generation - Write Python/R code to files automatically
  • 📝 MCP Tool System - Model Context Protocol for extensible operations
  • 📚 Session Management - Context-persistent conversations with history injection
  • ⚡ Code Execution - Run Python/R code with automatic output capture
  • 🎯 Dynamic Model Management - Add, remove, and switch models at runtime (no restart needed)
  • 🔌 Embedding Service Abstraction - External embedding services with automatic fallback
  • 🚀 /code Command - Unified interface for complex code task orchestration
  • 🔧 /make Command - Execute Makefile targets using natural language (: shortcut)
  • 💡 $ Prefix (NEW) - Direct MCP tool execution with interactive dropdowns
  • ⚙️ Configurable via YAML file
  • 🔄 Streaming and non-streaming response modes

Ollama++ API Service (NEW)

  • 🌐 OpenWebUI Compatible - Drop-in Ollama replacement API
  • 🔌 OpenAI API Compatibility - Works with standard OpenAI clients (see the example after this list)
  • 🛠️ 11 Built-in MCP Tools - Code execution, file operations, RAG tools
  • 🎯 Intelligent Tool Matching - Semantic search finds the right tool automatically
  • 📎 File Upload Support - Upload and reference files in conversations
  • 🧠 Multi-step Orchestration - Break down complex tasks automatically
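
For example, a standard OpenAI-style chat completion request can be sent straight to the service. This is a minimal sketch: the /v1/chat/completions path follows the OpenAI convention the compatibility layer is expected to implement, and tinyllama is simply the default model pulled by the setup scripts.

# Sketch: send an OpenAI-style chat completion request to the Ollama++ API
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "tinyllama",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'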

Session Persistence (NEW)

  • 💾 Auto-save Sessions - Conversations saved to Redis automatically
  • 🔄 Restore Sessions - Resume any previous conversation by ID
  • 📋 List Sessions - View all saved sessions with metadata
  • 🗑️ Session Management - Clear specific or all sessions

Infrastructure

  • 🐳 Docker Compose with Redis, PostgreSQL, Transformer services
  • 🚀 Easy setup with automated scripts and Makefile
  • 📊 Sentry integration for error tracking

Quick Start

CLI Mode

# Automated setup (recommended)
make setup

# Build and start all services (Ollama, Redis, Transformer, PostgreSQL)
make build-all-services
make up-all

# Run the CLI
make run
# Or: ./start.sh

Ollama++ API Mode (OpenWebUI Compatible)

# Start all services including the Ollama API
docker compose --profile app --profile api up -d

# API available at http://localhost:8080
# Works with OpenWebUI, standard Ollama clients, and OpenAI API clients
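
To verify the API is reachable before pointing clients at it, query the model listing endpoint (a sketch: /api/tags is the standard Ollama endpoint for listing models, which a drop-in replacement is expected to expose):

# Should return the models known to the service
curl http://localhost:8080/api/tags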

Using with OpenWebUI:

  1. Set Ollama API URL in OpenWebUI settings to: http://host.docker.internal:8080
  2. All MCP tools and RAG features are automatically available

Using Remote Ollama: Set OLLAMA_API_URL in .env to point to your remote Ollama server:

OLLAMA_API_URL=http://your-ollama-server:11434

See DOCUMENTATION.md for detailed guides and the docs/ directory for feature-specific documentation.

Project Structure

cli/
├── config.yaml              # Configuration for Ollama and chat
├── docker-compose.yml       # Multi-service Docker setup
├── Makefile                 # Build automation and commands
├── main.py                  # Main CLI entry point
├── requirements.txt         # Python dependencies
├── .env.example             # Environment variables template
├── docs/                    # Feature documentation
│   ├── AT_PREFIXER_FEATURE.md
│   ├── SESSION_FEATURE.md
│   ├── SESSION_PERSISTENCE.md
│   ├── TOOL_RETRIEVAL_FEATURE.md
│   └── MAKEFILE_COMMANDS.md
├── src/                     # Core modules
│   ├── config/              # Configuration management
│   ├── ollama_client/       # Ollama client
│   ├── chat/                # Chat management
│   ├── mcp/                 # MCP client system
│   ├── session/             # Session persistence
│   ├── transformer/         # Embedding service (Docker)
│   ├── redis/               # Redis API service
│   │   └── flask-app/       # Flask API for embeddings
│   ├── postgresql/          # PostgreSQL API service
│   │   └── flask-app/       # Flask API for MCP tools
│   ├── utils/               # Utilities (tree, etc.)
│   └── file_completer.py    # @ prefix autocomplete
├── ollama_api_service/      # Ollama++ API (NEW)
│   ├── app.py               # FastAPI application
│   ├── models.py            # Pydantic models
│   ├── routes/              # API route handlers
│   │   ├── chat.py          # /api/chat endpoints
│   │   ├── generate.py      # /api/generate endpoints
│   │   ├── openai.py        # OpenAI compatibility layer
│   │   ├── tools.py         # MCP tool endpoints
│   │   └── files.py         # File upload endpoints
│   └── utils/               # Ollama adapter utilities
├── system_mcps/             # MCP tool servers
│   └── coder/               # Code execution & file tools
├── tests/                   # Test suite
│   ├── test_ollama_api_*.py # API integration tests
│   └── test_tool_retrieval.py
└── testing/                 # Test applications
    ├── python_app/          # Python test structure
    └── r_app/               # R test structure

Prerequisites

  • Python 3.7 or higher
  • Docker and Docker Compose (for running Ollama in a container)
  • Make (optional, for using Makefile commands)

Installation

Quick Setup (Recommended)

# Run automated setup
make setup
# Or: ./setup.sh

# This will:
# - Create Python virtual environment
# - Install all dependencies
# - Create .env file
# - Optionally start Docker containers

Option 1: Using Docker Compose

  1. Clone the repository:

    git clone <repository-url>
    cd cli
  2. Create environment file:

    cp .env.example .env
  3. Start Ollama service:

    docker compose --profile ollama up -d

    This will:

    • Start the Ollama service in a container
    • Automatically pull the tinyllama model (~1GB, CPU-friendly)
    • Create a persistent volume for model storage
  4. Wait for model download (first time only):

    docker compose logs -f ollama-setup

    Wait until you see "Ollama setup complete!"

  5. Run the CLI:

    ./start.sh

Option 2: Using Local Ollama Installation

  1. Clone the repository:

    git clone <repository-url>
    cd cli
  2. Install Ollama locally: Follow instructions at https://ollama.ai/

  3. Pull a model:

    ollama pull tinyllama  # or llama2, mistral, etc.
  4. Configure the CLI: Edit config.yaml to set your Ollama service URL and preferred model:

    ollama:
      url: "http://localhost:11434"  # Change for remote Ollama
      model: "tinyllama"             # Change to your preferred model
      timeout: 120
    
    chat:
      system_prompt: "You are a helpful AI assistant."
      max_context_length: 10
      temperature: 0.7
      stream: true
  5. Run the CLI:

    ./start.sh

    The script will automatically:

    • Create a Python virtual environment
    • Install all required dependencies
    • Start the AI CLI

Usage

Once the CLI starts, you can:

  • Chat with AI: Simply type your message and press Enter
  • @ Prefix for files: Type @filename + TAB for autocomplete, automatically adds file context
  • @ Prefix for directories: Type @dirname/ to add entire directory with tree visualization
  • Generate code: Type @newfile.py <description> to generate code directly to file
  • Clear history: Type clear to reset the conversation
  • List models: Type models to see available Ollama models
  • Switch model: Type switch to change the current model
  • MCP tools: Type mcps to list available tools, mcp-tools <name> for tool details
  • Model management commands:
    • /model status - Show all configured models
    • /model list - List all models
    • /model <type> list - List models of specific type (general/coder/embedding)
    • /model <type> add <url> <model_name> - Add a general or coder model
    • /model embedding add <url> [timeout] - Add external embedding service
    • /model <type> use <model_id> - Set active model
    • /model <type> remove <model_id> - Remove model
    • /model check [model_id] - Check model availability
  • Code command:
    • /code <prompt> - Execute complex coding tasks with automatic tool orchestration
  • Direct MCP tool execution (NEW):
    • $ <prompt> - Interactive MCP tool selection with coder model parameter extraction
  • Session commands:
    • /session start - Start a context-persistent session
    • /session end - End the current session
    • /session info - Display current session information
    • /session list - List all saved sessions (NEW)
    • /session restore <id> - Restore a previous session (NEW)
    • /session clear - Clear all saved sessions (NEW)
  • Context commands:
    • /context add @file - Add file/directory to context without LLM call (NEW)
    • /context show - Display current context (chat, session, metadata)
    • /context clear - Clear context (keeps session active)
  • Repomap commands:
    • /repomap create - Create a repository map from working directory
    • /repomap load - Load existing .repomap file into context
  • Make commands:
    • /make <prompt> - Execute make commands using natural language
    • /make map generate - Generate .makemap from Makefile
    • /make map update - Update .makemap with new targets
    • /make map load - Load .makemap into context
    • : - Shortcut for /make (e.g., :run tests)
  • Exit: Type exit or quit to close the CLI

Repomap Feature

The repomap feature helps you create and maintain a comprehensive map of your repository structure. This is useful for providing AI assistants with context about your codebase.

Creating a Repository Map:

▶ /repomap create
📦 Creating repository map...
📂 Collecting source code files...
✓ Found 50 source files
🌳 Generating directory tree...
✓ Directory tree generated
🤖 Generating repository map with LLM...
✓ Repository map created successfully!
📄 Saved to: /path/to/.repomap

Loading a Repository Map into Context:

▶ /repomap load
📂 Loading repository map: .repomap
✓ Repository map loaded into context!
  Size: 15,230 bytes
  Session: temporary (start a session for persistence)

Automatic Loading with /code Command: When using the /code command, the .repomap file (if it exists) is automatically loaded into context to provide better understanding of the codebase structure.

The .repomap file contains:

  • Directory Tree: ASCII visualization of the project structure
  • Project overview and purpose
  • Architecture and design patterns
  • Directory structure explanation
  • Key components and their responsibilities
  • Entry points and dependencies
  • Data flow and configuration details
  • Testing structure and getting started guide

Make Command (NEW)

The /make command lets you execute Makefile targets using natural language. It automatically parses your Makefile, generates a .makemap file with target descriptions, and matches your intent to the right make commands.

Quick Shortcut: Use : as a shortcut for /make:

▶ :run tests
# Equivalent to: /make run tests

Generating a Make Map:

▶ /make map generate
📦 Generating .makemap from Makefile...
📂 Parsing Makefile...
✓ Found 15 targets
🤖 Generating descriptions with LLM...
✓ .makemap created successfully!
📄 Saved to: /path/to/.makemap

Executing Make Commands with Natural Language:

▶ /make run the integration tests
🔍 Matching command...
✅ Matched: make test-integration
⚡ Executing...
  Running integration tests (requires containers)...
✓ Integration tests completed

Auto-detection: If a .makemap file exists, the CLI automatically detects make-related prompts:

▶ build the docker images
🔧 Detected make command
✅ Matched: make build-all-services
⚡ Executing...

The .makemap file contains:

  • Targets: All available make targets with descriptions
  • Dependencies: Target dependencies
  • Variables: Makefile variables and their defaults
  • Recipes: Command summaries for each target

Session Feature

The session feature allows you to maintain conversation context across multiple prompts by automatically injecting previous interactions as context. When a session is active, the last 5 interactions are automatically included as context, enabling coherent multi-turn conversations.

Example:

▶ /session start
📝 Session started at 14:30:45

▶ What is the capital of France?
▶ Paris

▶ What's the population of that city?
▶ Paris has approximately 2.2 million people...
# AI understands "that city" refers to Paris

▶ /session end
✅ Session ended (started at 14:30:45, 2 interactions)

Session Persistence (NEW): Sessions are automatically saved to Redis and can be restored later:

▶ /session list
📋 Saved Sessions:
  1. abc123... | 2025-11-25 14:30 | 5 interactions | "Python help"
  2. def456... | 2025-11-24 10:15 | 12 interactions | "Docker setup"

▶ /session restore abc123
✅ Session restored with 5 interactions

See docs/SESSION_FEATURE.md and docs/SESSION_PERSISTENCE.md for detailed documentation.

Context Management (NEW)

The /context add command allows you to add files and directories to the conversation context without triggering an LLM response. This is useful when you want to incrementally build up context before asking questions, avoiding unnecessary token usage.

Difference from @ prefix:

  • Using @file.py what does this do? - Adds file to context AND triggers LLM response
  • Using /context add @file.py - Only adds file to context, no LLM call

Example:

▶ /context add @src/main.py
✓ Added 1 file(s) to context:
  • src/main.py

▶ /context add @src/utils/
✓ Added 1 directory(s) to context:
  • src/utils/ (15 files, 3 directories)

▶ /context show
📋 Current Context:
Chat Messages: 0
Session: Active
  • ID: abc123...
  • Duration: 120s
  • Interactions: 2

▶ Now explain how main.py uses the utils module
# AI now has both main.py and utils/ in context

Use Cases:

  • Batch Context Loading: Add multiple files/directories before asking questions
  • Token Efficiency: Build context incrementally without triggering LLM on each addition
  • Session Preparation: Prepare context at the start of a session for later questions
  • Code Review Setup: Load all relevant files first, then ask specific questions

Code Command (NEW)

The /code command provides a unified interface for complex code task orchestration. It automatically:

  1. Analyzes your prompt and breaks it into steps
  2. Matches the best MCP tools for each step
  3. Executes the tools in sequence

Example:

▶ /code create a python script that reads data from users.csv, filters active users, and generates a bar chart

🔍 Analyzing task...
📋 Task breakdown:
  1. Read CSV file
  2. Filter data for active users
  3. Generate bar chart visualization

🔧 Matching tools...
✅ Matched 3 tools for execution

⚡ Executing...
  ✓ Step 1: read_csv_file
  ✓ Step 2: filter_data
  ✓ Step 3: create_visualization

✅ Task completed successfully!

Auto-session: If no session is active, /code will automatically start one for you.

See docs/CODE_COMMAND.md for detailed documentation and examples.

Direct MCP Tool Execution with $ Prefix (NEW)

The $ prefix provides interactive MCP tool selection with automatic parameter extraction using your coder model. It is useful when you know the kind of task you want to run but want to browse the available tools first.

How it works:

  1. Type $ followed by your request
  2. Select MCP server from dropdown (arrow keys + Enter)
  3. Select tool from dropdown
  4. System extracts parameters using coder model
  5. Tool executes with results displayed

Example:

▶ $ generate 100 fake records from @users.csv

🔧 Direct MCP Tool Execution Mode
Select MCP server and tool to execute...

📦 Select MCP Server:
  ▶ coder
    data-engineer

✓ Selected MCP: data-engineer

🔧 Select Tool from data-engineer:
    compare_ast_similarity
    compare_code_similarity
    generate_ast
  ▶ generate_fake_data
    generate_fake_data_ctgan

✓ Selected Tool: generate_fake_data

🤖 Extracting parameters from prompt using coder model...
Parameters extracted: {
  "file_path": "users.csv",
  "num_samples": 100,
  "working_dir": "/path/to/dir"
}

⚡ Executing tool 'generate_fake_data' on MCP 'data-engineer'...
✓ Tool execution completed

Key Features:

  • Interactive selection - Browse all MCPs and tools with arrow keys
  • Coder model enforced - Uses coder model for accurate parameter extraction
  • Smart parameter detection - Extracts file paths, counts, and options from natural language
  • Filtered tool list - Automatically excludes meta/orchestration tools

See docs/DOLLAR_PREFIX_MCP_TOOL_EXECUTION.md for complete documentation.

Dynamic Model Management (NEW)

Manage AI models at runtime without restarting the application:

▶ /model status
📊 Model Configuration:
  General: llama3.1:8b (http://192.168.31.23:11434) ✓
  Coder: qwen2.5-coder:7b (http://192.168.31.23:11434) ✓
  Embedding: External Service (http://localhost:16050) ✓

▶ /model general add http://localhost:11434 mistral
✅ Added general model: mistral (ID: abc123)

▶ /model general use abc123
✅ Switched to model: mistral

Features:

  • Add/remove models dynamically
  • Switch between models instantly
  • Support for external embedding services
  • Automatic availability checking
  • Redis-backed persistence

See docs/DYNAMIC_MODEL_MANAGEMENT.md for detailed documentation.

Example Session

==================================================
  AI CLI - Powered by Ollama
==================================================
Type 'exit' or 'quit' to exit
Type 'clear' to clear chat history
Type 'models' to list available models
==================================================

Using model: tinyllama
Connected to: http://localhost:11434

You: Hello! Can you help me with Python?
AI: Of course! I'd be happy to help you with Python...

You: exit
Goodbye!

Makefile Commands

The project includes a Makefile for easy management:

# Show all available commands
make help

# Setup
make setup                    # Complete setup (venv + dependencies + Docker)
make venv                     # Create virtual environment
make install                  # Install Python dependencies

# Build & Run
make build-all-services       # Build all Docker images
make up-all                   # Start all services (Ollama + Redis + Transformer + PostgreSQL)
make up-redis                 # Start only Redis services
make run                      # Run the CLI

# Web UI
make ui                       # Start the AI CLI Web UI in background
make ui-logs                  # Start the AI CLI Web UI with logs (foreground)
make ui-stop                  # Stop the AI CLI Web UI

# Docker Management
make down                     # Stop Docker containers
make restart                  # Restart Docker containers
make logs                     # View container logs
make status                   # Show container status

# Redis Management
make build-redis              # Build Redis API image
make redis-logs               # Show Redis API logs
make redis-cli                # Execute Redis CLI
make redis-clear              # Clear all Redis data (with confirmation)
make redis-info               # Show Redis statistics
make redis-api-health         # Check Redis API health
make transformer-health       # Check Transformer service health

# Database
make migrate-session          # Apply session database migration
make update-schema            # Update PostgreSQL schema

# Ollama
make pull-model MODEL=llama2  # Pull a specific model
make list-models              # List available models

# Cleanup
make clean                    # Remove venv and volumes

See docs/MAKEFILE_COMMANDS.md for detailed command documentation.

Docker Compose Management

Using Makefile (Recommended):

# Start Ollama
make up

# Stop Ollama
make down

# View logs
make logs

# Check status
make status

Using Docker Compose directly:

Start Ollama service:

docker compose --profile ollama up -d

Stop Ollama service:

docker compose --profile ollama down

View Ollama logs:

docker compose logs -f ollama

Check service status:

docker compose ps

Remove volumes (delete downloaded models):

docker compose down -v

Configuration Options

Ollama Settings

  • url: Ollama service URL (default: http://localhost:11434)
  • model: AI model to use (e.g., tinyllama, llama2, mistral, codellama)
  • timeout: Request timeout in seconds (default: 120)

Chat Settings

  • system_prompt: Initial prompt to guide AI behavior
  • max_context_length: Number of messages to keep in context (default: 10)
  • temperature: Response randomness, 0.0-1.0 (default: 0.7)
  • stream: Enable streaming responses (default: true)

Manual Setup (Alternative)

If you prefer to set up manually instead of using start.sh:

# Create virtual environment
python3 -m venv venv

# Activate virtual environment
source venv/bin/activate  # On Windows: venv\Scripts\activate

# Install dependencies
pip install -r requirements.txt

# Run the CLI
python main.py

Troubleshooting

Ollama Connection Issues

Using Docker Compose:

  • Ensure Ollama container is running: docker compose ps
  • Check container logs: docker compose logs ollama
  • Verify the service is healthy: docker compose ps (should show "healthy")
  • Restart the service: docker compose restart ollama

Using Local Installation:

  • Ensure Ollama is running: ollama serve
  • Verify the URL in config.yaml matches your Ollama service
  • For remote Ollama, ensure network connectivity (a quick check is shown below)
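
A quick way to confirm the service is reachable (a sketch: /api/tags is the standard Ollama endpoint for listing installed models; replace the URL with the value from config.yaml):

# Should return a JSON list of installed models if Ollama is reachable
curl http://localhost:11434/api/tags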

Model Not Found

Using Docker Compose:

  • Check if setup container completed: docker compose logs ollama-setup
  • Manually pull a model: docker compose exec ollama ollama pull tinyllama
  • List available models: docker compose exec ollama ollama list

Using Local Installation:

  • Pull the model: ollama pull <model-name>
  • Update config.yaml with the correct model name
  • Use the models command in the CLI to see available models

Docker Compose Issues

  • Ensure Docker and Docker Compose are installed
  • Check if ports are available (default: 11434)
  • View all logs: docker compose logs
  • Recreate containers: docker compose down && docker compose --profile ollama up -d

Development

The project uses a modular architecture:

Core CLI Modules

  • Config Module (src/config/): Handles configuration loading and management
  • Ollama Client Module (src/ollama_client/): Manages communication with Ollama
  • Chat Module (src/chat/): Handles conversation context and message management
  • MCP Module (src/mcp/): Model Context Protocol client for tool execution
  • Session Module (src/session/): Session persistence and management

Microservices

  • Transformer Service (src/transformer/): Sentence embeddings for semantic search
  • PostgreSQL API (src/postgresql/flask-app/): MCP tool storage and retrieval
  • Redis API (src/redis/flask-app/): RAG vector storage and session persistence
  • Ollama++ API (ollama_api_service/): OpenWebUI-compatible API with enhanced features

Running Tests

# Run all tests
make test

# Run specific test file
pytest tests/test_ollama_api_integration.py -v

# Run with coverage
pytest --cov=src tests/

License

This project is licensed under the MIT License - see the LICENSE file for details.

Copyright (c) 2025 Toavina A.
