AI Augmentation - Naive Semantic RAG lesson #81
Conversation
EricThomson
left a comment
This is fantastic, thanks so much for putting this together! Just a couple of notes.
1. Embedding: reference intro LLM lesson
When talking about embeddings, maybe put a link to the LLMs lesson; we talk about them a lot there. This is sort of the payoff of spending so much time there discussing embeddings (it will be 01_intro_nlp_llms.md in the 05_AI_intro week). It isn't merged yet, but that's where it will be 😄
2. Index
In the code block where semantic retrieval is introduced, I think we should break things up into sections; there is a lot going on. I'd first show the first few lines of code and introduce the index with something like this:
Before we build the FAISS index, it's worth pausing to focus on what an index means in the context of RAG pipelines. An index is a vector store that holds our embeddings and is optimized for fast similarity search over those embeddings. The term is used loosely -- sometimes it refers to storage, sometimes to search, and sometimes to both at once. In our case, FAISS gives us a simple, in-memory way to store embeddings and retrieve the most similar ones, which corresponds to the Data Indexing and Retrieval Similarity steps in the figure above. FAISS is not a full SQL-based database like the pgvector setup we will see later. Instead, it provides a temporary (ephemeral) vector storage solution that exists only while your script is running. This lets us keep the focus on understanding the retrieval workflow without needing to set up a database server.
(and you can figure out how to incorporate similarity search, which is part of the index; something like "there are lots of measures of similarity to measure how close embeddings are to each other. Here, we use something called cosine similarity." or whatever)
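If it helps, the cosine similarity point could even be shown standalone in a couple of lines (pure NumPy, toy 2-D vectors, nothing from the actual lesson code):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: dot product over norms
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])  # 45 degrees from a
print(round(cosine_similarity(a, b), 3))  # → 0.707
```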
Then you can introduce the `retrieve` function definition and discuss what it's doing.
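For that discussion, the shape of such a function could be sketched in miniature like this (a hypothetical `retrieve` over a toy NumPy corpus, not the lesson's actual FAISS-backed implementation; the document names and vectors are made up):

```python
import numpy as np

# Toy pre-normalized corpus embeddings standing in for the lesson's real ones
corpus = np.array([[1.0, 0.0], [0.0, 1.0], [0.7071, 0.7071]])
docs = ["cats", "finance", "cat food prices"]

def retrieve(query_vec, k=2):
    """Return the k documents whose embeddings are most similar to the query."""
    sims = corpus @ query_vec      # cosine similarities (rows are unit-normalized)
    top = np.argsort(-sims)[:k]    # indices of the k highest scores
    return [docs[i] for i in top]

print(retrieve(np.array([1.0, 0.0])))  # → ['cats', 'cat food prices']
```

Walking through it line by line (embed the query, score against the index, take top-k) maps neatly onto the pipeline figure.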