A Python application that uses a Retrieval-Augmented Generation (RAG) approach to query system telemetry data using natural language. The application collects CPU, memory, and disk usage metrics and stores them in DuckDB. It then uses natural language processing via a local LLM (Llama2) to translate user queries into SQL and provide human-readable insights.
## Features

- Natural Language Queries: Ask questions about system metrics in plain English
- Telemetry Collection: Automatic collection of CPU, memory, and disk usage data
- Columnar Database: Fast storage in DuckDB
- Local LLM: Powered by Llama2 via Ollama
- Web UI: Interactive Streamlit interface for queries and results
- Concurrent Access: Optimized for concurrent read/write access to the database
## Prerequisites

- Python 3.8+
- Ollama:
  - Download and install Ollama from https://ollama.ai/
  - Pull the Llama2 model:

    ```bash
    ollama pull llama2:7b
    ```

  - Ensure Ollama is running in the background (default: `http://localhost:11434`)
- Conda (recommended for environment management)
## Installation

1. Clone this repository:

   ```bash
   git clone https://github.com/Rhushya/Kloudmate.git
   cd kloudmate
   ```

2. Create and activate the Conda environment:

   ```bash
   conda create -n rag_obs_assistant python=3.8
   conda activate rag_obs_assistant
   ```

3. Install dependencies:

   ```bash
   pip install -r requirements.txt
   ```
4. Configure (optional):
   - The `.env` file contains the configuration for Ollama. By default, it uses `http://localhost:11434`.
   - You can modify the collection interval in `telemetry_collector.py` if needed.
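For reference, a minimal `.env` might contain just the Ollama base URL; the variable name `OLLAMA_BASE_URL` is taken from the project structure comments below, and the value shown is the documented default:

```env
# Base URL of the local Ollama server
OLLAMA_BASE_URL=http://localhost:11434
```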
## Usage

1. Make sure your environment is activated:

   ```bash
   conda activate rag_obs_assistant
   ```

2. Start the telemetry collector:

   ```bash
   python telemetry_collector.py
   ```

   This will begin collecting system metrics and storing them in `telemetry.db`. Let it run for a few minutes to collect meaningful data. (A sketch of one collection cycle follows this list.)

3. Start the Streamlit app:

   ```bash
   streamlit run app.py
   ```

   This will open the application in your web browser.
4. Ask questions:
   - "Which servers had >65% memory usage in the last 24 hours?"
   - "Did any service spike over 85% CPU last week?"
   - "List hosts with >90% disk usage in the past 12 hours"
   - "Show me CPU usage for the last 10 minutes"
   - "Any high memory usage recently?"
## How It Works

- The telemetry collector (`telemetry_collector.py`) uses `psutil` to gather system metrics.
- Metrics are stored in DuckDB with the schema `(timestamp, hostname, cpu_usage, memory_usage, disk_usage)`.
- When you ask a question:
  1. The LLM converts your natural language query to SQL
  2. The SQL query runs against DuckDB
  3. The LLM summarizes the results into human-readable insights
  4. The answer is displayed in the Streamlit UI
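A hypothetical sketch of that flow, assuming the `metrics` table from the earlier sketch and Ollama's `/api/generate` endpoint; the real `app.py` may structure this differently:

```python
import duckdb
import requests

OLLAMA_BASE_URL = "http://localhost:11434"  # default, per the configuration notes

def ask_llm(prompt: str) -> str:
    """Send a prompt to the local Llama2 model via Ollama and return its reply."""
    resp = requests.post(
        f"{OLLAMA_BASE_URL}/api/generate",
        json={"model": "llama2:7b", "prompt": prompt, "stream": False},
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]

def answer(question: str) -> str:
    # 1. The LLM translates the question into SQL over the known schema.
    sql = ask_llm(
        "Write a single DuckDB SQL query over the table metrics(timestamp, "
        "hostname, cpu_usage, memory_usage, disk_usage) that answers this "
        f"question. Return only the SQL.\nQuestion: {question}"
    )
    # 2. The generated SQL runs against the telemetry database (read-only).
    con = duckdb.connect("telemetry.db", read_only=True)
    rows = con.execute(sql).fetchall()
    con.close()
    # 3. The LLM summarizes the rows into a human-readable insight.
    return ask_llm(
        f"Question: {question}\nSQL results: {rows}\n"
        "Summarize these results in plain English."
    )
```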
## Concurrent Access

The application uses DuckDB's concurrency features to allow simultaneous read/write access:
- The telemetry collector opens and closes database connections for each metrics collection cycle
- The Streamlit app uses READ_ONLY access mode to safely query the database while the collector is writing
- This approach prevents "File is already open" errors that can occur with concurrent access
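For example, the reading side can open the database file in read-only mode (a sketch, again assuming the hypothetical `metrics` table from above):

```python
import duckdb

# A read-only connection can query telemetry.db without taking the write lock.
con = duckdb.connect("telemetry.db", read_only=True)
recent = con.execute(
    "SELECT hostname, cpu_usage FROM metrics ORDER BY timestamp DESC LIMIT 10"
).fetchall()
con.close()
print(recent)
```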
If you want to test the concurrency model, you can run:

```bash
python test_concurrency.py
```

This simulates simultaneous read/write operations to verify the solution works correctly.
## Project Structure

```text
observability_assistant/
├── telemetry_collector.py   # Collects and stores psutil data into DuckDB
├── app.py                   # Streamlit application (RAG logic)
├── .env                     # For OLLAMA_BASE_URL configuration
├── requirements.txt         # Python dependencies
├── test_concurrency.py      # Test script for DB concurrency
└── README.md                # This file
```
## Demo
20250508_142207.mp4
## Limitations

- The application is designed to collect metrics from the local machine. A multi-host setup would require additional configuration.
- The current schema is simple (timestamp, hostname, cpu_usage, memory_usage, disk_usage). For more complex telemetry, the schema and query generation would need to be expanded.
- The LLM runs locally through Ollama, which might have performance implications on resource-constrained systems.
## License

MIT