AI-IOWA-MAKE-RAG-BETTER

A repo that shows a few approachable techniques for improving your document retrieval results.

Areas to tweak in RAG Flows

See the diagram here for where some of these fall in the standard RAG process. Most changes you can make fall into one of three categories:

Document Indexing/Preparation
- The process of getting documents from your source, then chunking, cleaning, parsing for metadata, and otherwise transforming them before creating your embeddings.
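As a rough illustration of the preparation step, here is a minimal sketch of whitespace cleanup plus overlapping word-based chunking. The names `chunk_text` and `prepare_document`, the chunk sizes, and the output dictionary layout are all assumptions made for this example and are not taken from main.ipynb.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into word-based chunks where neighbours share `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        # Stop once this chunk has reached the end of the text.
        if start + chunk_size >= len(words):
            break
    return chunks


def prepare_document(title: str, plot: str, chunk_size: int = 200, overlap: int = 50) -> list[dict]:
    """Clean a plot and return chunk records that carry the title as metadata.

    The record layout here is hypothetical; adapt it to whatever you index.
    """
    cleaned = " ".join(plot.split())  # collapse newlines and repeated spaces
    return [
        {"title": title, "chunk_index": i, "text": chunk}
        for i, chunk in enumerate(chunk_text(cleaned, chunk_size, overlap))
    ]
```

Chunk size and overlap are two of the easier knobs to experiment with here: larger chunks keep more context together, while overlap reduces the chance that a relevant sentence is split across a boundary.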
Document Retrieval
- The process of finding the indexed documents most similar to your query. This is typically where many apps struggle.
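To make "most similar" concrete, the sketch below ranks indexed vectors by cosine similarity to a query vector in plain Python. It is only an illustration: the helper names and the toy two-dimensional vectors are assumptions, and in practice the similarity search would more likely run inside the vector store than in application code.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec: list[float], indexed: dict[str, list[float]], k: int = 3) -> list[tuple[str, float]]:
    """Return the k (doc_id, score) pairs with the highest similarity to the query."""
    scored = [(doc_id, cosine_similarity(query_vec, vec)) for doc_id, vec in indexed.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy example with made-up embeddings (real embeddings have many more dimensions).
index = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [1.0, 1.0]}
results = top_k([1.0, 0.1], index, k=2)
```

With these toy vectors, `results` lists document "a" first and "c" second, because the query points almost entirely along the first axis.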
Inference
- Once you have found the document chunks you intend to reference, this is the process of sending the user query, those chunks, and any other information to the LLM to generate your answer.
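The inference step largely comes down to how the query and the retrieved chunks are assembled into a prompt. Below is a minimal sketch of that assembly; the function name `build_prompt`, the instruction wording, and the `max_chars` budget are all assumptions for illustration, and the actual call to the LLM is deliberately left out.

```python
def build_prompt(question: str, chunks: list[str], max_chars: int = 4000) -> str:
    """Combine retrieved chunks and the user question into a single prompt string.

    Chunks are added in order until the character budget is reached, so pass
    them in with the most relevant chunk first.
    """
    context_parts = []
    used = 0
    for i, chunk in enumerate(chunks, start=1):
        entry = f"[{i}] {chunk.strip()}"
        if used + len(entry) > max_chars:
            break  # stop before exceeding the context budget
        context_parts.append(entry)
        used += len(entry)
    context = "\n\n".join(context_parts)
    return (
        "Answer the question using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )
```

Numbering the chunks makes it easier to ask the model to cite which chunk it used, which is one of the simpler inference-side tweaks to try.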

Running the code

Before running anything in main.ipynb:

- Create a Python virtual environment.
- Run docker compose up to create the Postgres container.
- Create a .env file in the root of this repository that sets COHERE_API_KEY to your API key.
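For reference, a .env file is just a list of KEY=VALUE lines. The notebook may well rely on a library such as python-dotenv to read it; the sketch below is a dependency-free stand-in that shows what that loading amounts to. The function name `load_env_file` is hypothetical.

```python
import os
from pathlib import Path


def load_env_file(path: str = ".env") -> dict[str, str]:
    """Read KEY=VALUE lines from a .env file into os.environ.

    Variables that are already set in the environment are not overwritten.
    Returns the key/value pairs found in the file.
    """
    loaded: dict[str, str] = {}
    env_path = Path(path)
    if not env_path.exists():
        return loaded
    for raw_line in env_path.read_text().splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        key = key.strip()
        value = value.strip().strip('"').strip("'")  # drop optional quotes
        os.environ.setdefault(key, value)
        loaded[key] = value
    return loaded
```

After calling `load_env_file()`, the key can be read with `os.environ["COHERE_API_KEY"]`. Keep the .env file out of version control so the key is never committed.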

The Data

This uses a snapshot of data pulled from Kaggle with around 15,000 movie titles and plots. The data was pulled verbatim, so it has not been cleaned for either quality or appropriateness. If you find data you don't think is appropriate, feel free to remove it!

grey-lovelace/ai-iowa-make-rag-better
