A comprehensive Retrieval-Augmented Generation (RAG) system for medical and healthcare applications, supporting a variety of LLMs and embedding techniques.
Note: The current repository version is intended for testing and is under active development.
This project implements a RAG system specifically designed for medical and healthcare applications. It supports multiple Large Language Models (LLMs) and embedding models, with a focus on extracting and processing medical concepts, relationships, and entities.
- Specialized medical concept extraction and processing:
  - SNOMED CT concept extraction
  - Medical entity recognition
  - Relationship extraction
  - Knowledge graph integration
```
HACID_RAG/
├── README.md                      # Project documentation
├── requirements.txt               # Python dependencies
├── requirements_minimal.txt       # The top-level necessary dependencies
├── .gitignore                     # Git ignore rules
├── .env.template                  # Environment variables template
├── data/                          # Data storage directory
│   ├── BC5CDR/                    # BC5CDR dataset
│   ├── mimiciv/                   # MIMIC-IV dataset
│   └── pubmed/                    # PubMed dataset
├── notebooks/                     # Jupyter notebooks for experimentation and testing
│   ├── rag_playground.ipynb       # Main playground for RAG system testing
│   ├── retriever_test.ipynb       # Retriever functionality testing
│   ├── RAG_format_enforcer.ipynb  # Triple extraction with format enforcement
│   ├── RAG_grammarllm.ipynb       # Grammar-constrained triple extraction
│   └── grammarllm/                # Grammar LLM utilities
│       └── temp/
├── scripts/                       # Utility scripts (currently empty)
└── src/                           # Source code directory
    ├── rag_moduler.py             # Core RAG functionality and LLM pipeline
    ├── llm_factory.py             # Modular factory for building LLMs and embedding models
    ├── rag_auxiliaries            # Auxiliary LLM utilities (extractor, parser, generator)
    ├── rag_prompt_template.py     # Prompt templates for various tasks
    ├── rag_util.py                # Utility functions
    ├── llm_extraction.py          # LLM-based extraction utilities
    ├── rag_extraction.py          # RAG-specific extraction utilities
    ├── rag_index_builder_csv.py   # Index building utilities
    ├── rag_llm_series_extraction.py  # Series extraction utilities
    ├── rag_run.sh                 # Shell script for running RAG
    ├── format_enforcer_wrapper.py # Format enforcement wrapper
    ├── grammar_llm_utils.py       # Grammar LLM utilities
    ├── deepseek_llm.py            # DeepSeek LLM integration
    ├── grammarllm/                # Grammar LLM package
    │   ├── README_PACKAGE.md      # Package documentation
    │   ├── __init__.py            # Package initialization
    │   ├── generate_with_constraints.py  # Constrained generation
    │   ├── modules/               # Grammar modules
    │   ├── scripts/               # Grammar scripts
    │   └── utils/                 # Grammar utilities
    └── utils/                     # General utilities
        └── deepseek_llm.py        # DeepSeek LLM utilities
```
The `notebooks/` directory contains Jupyter notebooks for experimentation, testing, and demonstration of the RAG system's capabilities. For detailed descriptions of each notebook's features, pipeline, and examples, see `notebooks/README.md`.

- `rag_playground.ipynb` - Main interactive playground for comprehensive RAG system testing
- `retriever_test.ipynb` - Focused testing and comparison of retrieval components
- `RAG_format_enforcer.ipynb` - Triple extraction with structured output enforcement
- `RAG_grammarllm.ipynb` - Grammar-constrained generation for reliable structured outputs
- Clone the repository.
- (Recommended) Create a Python 3.10 environment:

  ```bash
  conda create -n hacid_rag python=3.10
  conda activate hacid_rag
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- `requirements.txt`: Contains all packages (including sub-dependencies and system/conda packages) from a full environment export. Use this for exact environment replication.
- `requirements_minimal.txt`: Contains only the top-level, necessary Python packages. All other dependencies will be installed automatically. Use this for a cleaner, more portable, and easier-to-maintain setup.

Recommendation:

- Use `requirements.txt` for legacy or debugging purposes.
- Use `requirements_minimal.txt` for new installations or lightweight environments.
Ensure you have the `ollama` package installed, as it is required for embedding computations by the custom RAG retriever in `src/rag_moduler.py`. You can install it with:

```bash
pip install ollama
```
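For reference, here is a minimal sketch of computing an embedding with the `ollama` Python client. The model name below is only an example (not necessarily the one the retriever uses), and a local Ollama server with that model pulled is assumed:

```python
import ollama

# Example only: "all-minilm" must be available locally (`ollama pull all-minilm`);
# the model the custom retriever actually uses may differ.
resp = ollama.embeddings(model="all-minilm", prompt="acute myocardial infarction")
vector = resp["embedding"]  # list of floats
print(len(vector))
```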
- Download the .zip file of the indexed knowledge graph here.
  - `snomed_all_dataset_nodoc_hitsnomed.zip`: Indices of the full knowledge graph, embedded with hitsnomed
  - `snomed_all_dataset_nodoc_minilml12v2.zip`: Indices of the full knowledge graph, embedded with minilml12v2
  - `snomed_partial_dataset_nodoc_minilml6v2.zip`: Indices of a partial knowledge graph, embedded with minilml6v2
- Download the LLM and embedding model (recommendations: mistral-small and hitsnomed). Place them in a new folder and check the paths in `src/rag_moduler.py` (the `LLM = {}` and `EMBED_MODEL = {}` dictionaries).
- Unzip the index file and put it into the "index" folder. (If there is no "index" folder, create one under the root directory.)

Note: The folders "logs", "results", and "llm" need to be created manually if they do not exist.
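For illustration, the model-path dictionaries might look like the following; the keys and paths here are assumptions and must be adjusted to your local folder layout:

```python
# Hypothetical contents of the path dictionaries in src/rag_moduler.py;
# the actual keys and locations depend on where you placed the models.
LLM = {
    "mistralsmall": "llm/Mistral-Small-Instruct",   # example local path
}
EMBED_MODEL = {
    "hitsnomed": "llm/hitsnomed-embedding",         # example local path
    "bgem3": "llm/bge-m3",
}
```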
This section shows how to initialize and run the core RAG pipeline. Implementation details are in `src/rag_moduler.py`, and a live example is available in `notebooks/rag_playground.ipynb`.

- Initialize the LLM service context with your chosen LLM and embedding model:

  ```python
  from rag_moduler import init_llm_service_context

  llm = init_llm_service_context(
      llm_model_name="mistralsmall",
      embed_model_name="bgem3"
  )
  ```

- Load the knowledge graph index (create or load an existing index directory):

  ```python
  from rag_moduler import init_kg_storage_context

  kg_index = init_kg_storage_context(
      llm,
      storage_dir="path/to/your/index"
  )
  ```

- Build the RAG pipeline by combining the LLM and KG index into a query engine:

  ```python
  from rag_moduler import init_rag_pipeline

  rag_pipeline = init_rag_pipeline(
      kg_index,
      similarity_top_k=3,
      retriever_mode="hybrid"
  )
  ```

- Query the pipeline and inspect the response:

  ```python
  response = rag_pipeline.query("What medical concepts are associated with acute myocardial infarction?")
  print(response)
  ```
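To check which evidence the retriever supplied for an answer, the response object can usually be inspected further; a hedged sketch, assuming a LlamaIndex-style `Response` object with `source_nodes`:

```python
# Inspect the retrieved nodes behind the answer (LlamaIndex-style Response;
# attribute availability may vary with retriever mode and library version).
for nws in response.source_nodes:
    score = f"{nws.score:.3f}" if nws.score is not None else "n/a"
    print(f"score={score}")
    print(nws.node.get_content()[:200])  # first 200 characters of the evidence
```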
- Medical Concept Extraction
  - SNOMED CT concept extraction
  - BC5CDR entity-type extraction
  - NCBI disease entity extraction
  - MIMIC-IV SNOMED entity extraction
- Relationship Extraction
  - SNOMED CT relationship extraction
  - Medical concept relationship mapping
- Knowledge Graph Integration
  - Vector storage
  - Knowledge graph storage
  - Hybrid retrieval
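For context, the relationship-extraction outputs used throughout the notebooks are subject-relation-object triples. A purely hypothetical example of the shape such output might take (the project's actual schema may differ):

```python
# Hypothetical triple format for extracted medical relationships.
triples = [
    ("Acute myocardial infarction", "finding_site", "Myocardium structure"),
    ("Acute myocardial infarction", "associated_with", "Chest pain"),
]
for subj, rel, obj in triples:
    print(f"({subj}) -[{rel}]-> ({obj})")
```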
The codebase has been refactored to improve modularity and maintainability:
A dedicated factory module, `llm_factory.py`, for building and managing LLMs and embedding models:
- `build_hf_transformers()`: Explicitly load HuggingFace models with fine-grained control over quantization, device mapping, and memory management
- `wrap_hf_llm()`: Wrap Transformers models into LlamaIndex `HuggingFaceLLM` objects
- `build_llm_by_name()`: Factory method supporting multiple LLM types (HuggingFace, OpenAI, DeepSeek)
- `build_embed_model()`: Create embedding models with a consistent interface
- `configure_global_settings()`: Set global LLM and embedding model settings
- `release_model()`: Properly release models and free GPU memory
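A hedged usage sketch of the factory API; the exact argument names and accepted model identifiers are assumptions based on the function descriptions above:

```python
from llm_factory import (
    build_llm_by_name,
    build_embed_model,
    configure_global_settings,
    release_model,
)

# Build an LLM and an embedding model by name (example identifiers).
llm = build_llm_by_name("mistralsmall")
embed_model = build_embed_model("bgem3")

# Register both globally so downstream components pick them up.
configure_global_settings(llm, embed_model)

# ... use the models ...

# Explicitly release the LLM and free GPU memory when done.
release_model(llm)
```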
Auxiliary LLM utilities, now in `rag_auxiliaries`, extracted from the original `rag_moduler.py` for better code organization:
- `init_llm_pipeline()`: Initialize an LLM pipeline for extractors, parsers, and generators
- `llm_entity_extractor()`: Extract entities from text using an LLM
- `llm_parser()`: Parse RAG or LLM extractor outputs
- `llm_generation()`: Generate responses from LLM models
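A hedged sketch of how these utilities might chain together; the signatures shown are assumptions inferred from the descriptions above:

```python
from rag_auxiliaries import init_llm_pipeline, llm_entity_extractor, llm_parser

# Initialize a pipeline for extraction (illustrative; check the module
# for the real parameters).
pipeline = init_llm_pipeline()

text = "Patient presents with chest pain and elevated troponin."
raw_output = llm_entity_extractor(pipeline, text)  # raw LLM output
entities = llm_parser(raw_output)                  # parsed, structured entities
print(entities)
```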
Core RAG functionality in `rag_moduler.py` now leverages the new factory pattern and has been streamlined:
- Imports and uses functions from `llm_factory.py` for all model initialization
- Moved auxiliary LLM functions (extractors, parsers, generators) to `rag_auxiliaries`
- Simplified `init_llm_service_context()` by delegating model loading to factory functions
- Improved `init_kg_storage_context()` with better embedding model handling
- Enhanced `CustomKGTableRetriever` with Ollama-based semantic similarity computation
- Added `swap_llm_and_rebuild_engine()` for dynamic LLM switching (experimental; see the sketch below)
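For example, the experimental swap might be invoked as below; the parameter list is an assumption, so check the actual signature in `src/rag_moduler.py`:

```python
from rag_moduler import swap_llm_and_rebuild_engine

# Swap the active LLM and rebuild the query engine (experimental feature;
# argument names here are illustrative, not confirmed).
rag_pipeline = swap_llm_and_rebuild_engine(
    kg_index,
    new_llm_name="deepseek",  # example target model
    similarity_top_k=3,
)
```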
These changes provide:
- Better separation of concerns: Model management is isolated from RAG logic
- Improved memory management: Explicit control over model loading and release
- Enhanced flexibility: Easier to add new LLM providers or modify model configurations
- Cleaner code: Reduced duplication and improved maintainability
The system supports various configurations through the `llm_factory.py` and `rag_moduler.py` files:
- LLM model selection (HuggingFace, OpenAI, DeepSeek)
- Embedding model selection
- Quantization settings (4-bit, 8-bit)
- Context window size
- Temperature and sampling parameters
- Device mapping and memory optimization
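As an illustration of the quantization and device-mapping settings, 4-bit loading of a HuggingFace model is typically configured through `bitsandbytes` in Transformers. A hedged sketch of the kind of setup `build_hf_transformers()` likely wraps (an assumption, not its confirmed internals):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

# Standard Transformers 4-bit quantization plus automatic device mapping.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "path/to/local/model",            # example path to a downloaded model
    quantization_config=bnb_config,
    device_map="auto",                # spread layers across available devices
)
tokenizer = AutoTokenizer.from_pretrained("path/to/local/model")
```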
Contributions are welcome! Please feel free to submit a Pull Request.