A CLI-based AI chatbot for analyzing UBER SEC 10-K filings using Retrieval-Augmented Generation (RAG) and Google's Gemini API.
- AI-Powered Analysis: Interactive chat with a FunctionAgent that answers questions about UBER's financial data from 2019-2022.
- RAG with LlamaIndex: Indexes SEC filings for accurate, context-aware responses.
- CLI Interface: Simple command-line tools for data loading and chatting.
- Modular Design: Clean, maintainable code structure for easy extension.
1. Clone the repo:

   ```bash
   git clone https://github.com/yourusername/gemi-chat.git
   cd gemi-chat
   ```

2. Install dependencies (using `uv` for speed):

   ```bash
   uv sync
   ```

3. Set up the environment:

   - Create a `.env` file:

     ```
     GOOGLE_API_KEY=your_gemini_api_key
     ```

   - Or set the environment variable (PowerShell):

     ```powershell
     $env:GOOGLE_API_KEY = "your_key"
     ```

4. Load and index data:

   ```bash
   python -m src.cli --load-data
   ```

5. Start an interactive chat:

   ```bash
   python -m src.cli --chat
   ```

6. View help:

   ```bash
   python -m src.cli --help
   ```
Example chat: Ask questions like "What were UBER's revenue trends in 2020?" and get AI-powered answers based on the filings.
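If you are curious how the `.env` lookup in step 3 can work without a dedicated library, it can be approximated in a few lines of standard-library Python. This is a hypothetical sketch only; the project's `config.py` may well use `python-dotenv` or similar instead:

```python
import os
from pathlib import Path

def load_dotenv(path: str = ".env") -> None:
    """Minimal .env loader: copy KEY=value lines into os.environ.

    Already-set environment variables take precedence over file values.
    """
    env_file = Path(path)
    if not env_file.exists():
        return
    for line in env_file.read_text().splitlines():
        line = line.strip()
        # Skip blanks, comments, and malformed lines.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip())

load_dotenv()
api_key = os.environ.get("GOOGLE_API_KEY")  # None if neither .env nor env var is set
```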
Keep architecture diagrams up-to-date with your codebase:
```bash
python generate_architecture.py
```

This script automatically analyzes the `src/` directory and regenerates:

- `architecture_diagram.md` - Detailed component diagram
- `codebase_analysis.md` - Module analysis report
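The regeneration script itself is not shown here, but the core of such an analysis can be sketched with the standard-library `ast` module. The names `module_imports` and `analyze` below are hypothetical, not the script's real API:

```python
import ast
from pathlib import Path

def module_imports(source: str) -> list[str]:
    """Return the top-level module names imported by the given Python source."""
    tree = ast.parse(source)
    names = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            names.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            names.add(node.module.split(".")[0])
    return sorted(names)

def analyze(src_dir: str = "src") -> dict[str, list[str]]:
    """Map each module under src_dir to the modules it imports."""
    return {
        path.stem: module_imports(path.read_text())
        for path in Path(src_dir).glob("*.py")
    }
```

From a mapping like this, emitting a Mermaid or Markdown dependency diagram is a straightforward formatting step.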
```
Gemi_Chat/
├── src/                   # Main package
│   ├── __init__.py        # Package init
│   ├── config.py          # Settings & env vars
│   ├── data_loader.py     # Loads UBER HTML data
│   ├── index_manager.py   # Manages vector indices
│   ├── ageny.py           # AI agent & tools
│   ├── cli.py             # Command-line interface
│   ├── custom_console.py  # Console utilities
│   └── google_llm_init.py # Gemini LLM setup
├── pyproject.toml         # Project config & deps
├── system_prompt.txt      # Agent system prompt
├── .env                   # Environment variables
├── data/UBER/             # UBER SEC filings
└── storage/               # Persisted indices
```
- `config.py`: Centralized configuration (years, paths, API keys).
- `data_loader.py`: Data ingestion with UnstructuredReader.
- `index_manager.py`: Vector index creation/persistence.
- `ageny.py`: Agent setup with query engines and chat loop.
- `cli.py`: CLI with argparse for commands.
- `custom_console.py`: Spinners, colors, timers.
- `google_llm_init.py`: Google Gemini LLM initialization.
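As a rough illustration of the `cli.py` layer, an argparse setup matching the documented commands might look like the following. This is a sketch; the real module's flags, help text, and wiring may differ:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Mirrors the documented commands: --load-data and --chat.
    parser = argparse.ArgumentParser(
        prog="src.cli",
        description="Chat with UBER SEC 10-K filings via RAG.",
    )
    parser.add_argument("--load-data", action="store_true",
                        help="Load the filings and build the vector index")
    parser.add_argument("--chat", action="store_true",
                        help="Start an interactive chat session")
    return parser

# e.g. `python -m src.cli --chat` would parse to chat=True, load_data=False
args = build_parser().parse_args(["--chat"])
```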
The Gemi_Chat application follows a modular RAG (Retrieval-Augmented Generation) architecture:
CLI Interface → Data Processing → Vector Indexing → AI Agent → Chat Interface
- Data Pipeline: HTML SEC filings → Document parsing → Vector embeddings → Persistent storage
- Query Pipeline: User question → Index retrieval → Context augmentation → LLM generation → Response
- Agent System: FunctionAgent with specialized tools for multi-year financial analysis
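The query pipeline above can be illustrated with a toy retriever. Real retrieval uses Gemini embeddings through LlamaIndex; this sketch substitutes bag-of-words counts and cosine similarity purely to show the retrieve-then-augment shape:

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Stand-in for a real embedding model: word-count vectors.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

# Toy document chunks standing in for indexed filing passages.
chunks = [
    "Uber revenue declined in 2020 due to reduced mobility demand.",
    "Uber Eats delivery bookings grew during 2020.",
    "Risk factors include regulatory changes affecting drivers.",
]
context = retrieve("What were Uber's revenue trends in 2020?", chunks)
# Context augmentation: the retrieved chunks become the LLM prompt's grounding.
prompt = "Answer using only this context:\n" + "\n".join(context)
```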
- Detailed Component Diagram: Complete system architecture with all modules and dependencies
- Process Flow Diagram: High-level user journey and data flow
- LlamaIndex: Vector indexing, query engines, and agent framework
- Google Gemini: LLM for generation and embeddings for semantic search
- Unstructured.io: Document parsing for HTML SEC filings
- RAG Pattern: Retrieval-augmented generation for accurate financial analysis
- Fork the repo.
- Create a feature branch.
- Commit changes.
- Push and open a PR.
MIT License - see LICENSE file for details.
This is for educational purposes only. Not financial advice.