AI Augmentation - Naive Semantic RAG lesson #81
Conversation
EricThomson
left a comment
This is fantastic, thanks so much for putting this together! Just a couple of notes.
1. Embedding: reference intro LLM lesson
When talking about embeddings, maybe put a link to the LLMs lesson; we talk about them a lot there. This is sort of the payoff of spending so much time there discussing embeddings (it will be 01_intro_nlp_llms.md in the 05_AI_intro week). It isn't merged yet, but that's where it will be 😄
2. Index
In the code block where semantic retrieval is introduced, I think we should break things up into sections; there is a lot going on. I'd first show the first few lines of code and introduce the index with something like this:
Before we build the FAISS index, it's worth pausing to focus on what an index means in the context of RAG pipelines. An index is a vector store that holds our embeddings and is optimized for fast similarity search over those embeddings. The term is used loosely -- sometimes it refers to storage, sometimes to search, and sometimes to both at once. In our case, FAISS gives us a simple, in-memory way to store embeddings and retrieve the most similar ones, which corresponds to the Data Indexing and Retrieval Similarity steps in the figure above. FAISS is not a full SQL-based database like the pgvector setup we will see later. Instead, it provides a temporary (ephemeral) vector storage solution that exists only while your script is running. This lets us keep the focus on understanding the retrieval workflow without needing to set up a database server.
(and you can figure out how to incorporate similarity search, which is part of the index; something like "there are lots of measures of similarity to measure how close embeddings are to each other. Here, we use something called cosine similarity." or whatever)
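If it helps, the cosine similarity point could even be shown standalone in a couple of lines (pure NumPy, toy 2-D vectors, nothing from the actual lesson code):

```python
import numpy as np

def cosine_similarity(a, b):
    # Cosine of the angle between two vectors: dot product over norms
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

a = np.array([1.0, 0.0])
b = np.array([1.0, 1.0])  # 45 degrees from a
print(round(cosine_similarity(a, b), 3))  # → 0.707
```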
Then you can introduce the `retrieve` function definition and discuss what it's doing.
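For that discussion, the shape of such a function could be sketched in miniature like this (a hypothetical `retrieve` over a toy NumPy corpus, not the lesson's actual FAISS-backed implementation; the document names and vectors are made up):

```python
import numpy as np

# Toy pre-normalized corpus embeddings standing in for the lesson's real ones
corpus = np.array([[1.0, 0.0], [0.0, 1.0], [0.7071, 0.7071]])
docs = ["cats", "finance", "cat food prices"]

def retrieve(query_vec, k=2):
    """Return the k documents whose embeddings are most similar to the query."""
    sims = corpus @ query_vec      # cosine similarities (rows are unit-normalized)
    top = np.argsort(-sims)[:k]    # indices of the k highest scores
    return [docs[i] for i in top]

print(retrieve(np.array([1.0, 0.0])))  # → ['cats', 'cat food prices']
```

Walking through it line by line (embed the query, score against the index, take top-k) maps neatly onto the pipeline figure.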