intro to agents #92
base: main
Conversation
Habeebah157
left a comment
Looks good, I don't see anything wrong
roshansuresh
left a comment
This is a great intro lesson to agents! I've left a few comments/questions for clarification but this looks good to go. Also appreciate the addition of MCP like we had discussed!
| @@ -0,0 +1,45 @@ | |||
| # Introduction to AI Agents | |||
| AI agents are autonomous systems that can perform tasks on behalf of users by combining external tools with decision-making processes. A helpful way to think about the difference between an agent and a traditional LLM is that a basic LLM answers questions, while an agent is able to perform actions using *tools*. This is why agentic AI has become so popular recently: it breaks LLMs out of the "text-in, text-out" box and lets them interact with the (software) world in much more interesting ways. | |||
Might also be good to point out that agents are able to determine when they have completed a task, as opposed to a traditional LLM that stops once the response is generated
You actually discuss this later. You can disregard this comment
|
|
||
| This pattern helps agents avoid brittle, one-shot responses. Instead of trying to solve everything at once, the agent can think step-by-step, verify intermediate results, and adjust its plan as needed. In theory, this makes agents more reliable and easier to debug, because you can see not just the final answer, but the sequence of thoughts and actions that led there. | ||
|
|
||
| ReAct is not a single library or tool: it is a design pattern that includes LLMs in the loop. Many agent frameworks employ some variation of this loop under the hood. |
The phrase "includes LLMs in the loop" makes it feel like ReAct was developed independently of transformers and LLM theory. Is this true? If not, maybe we can reframe it: ReAct is not a single library or tool: it is a response framework that leverages the inherent reasoning ability of LLMs.
|
|
||
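The think, act, observe loop described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the lesson's code or any framework's implementation: `scripted_llm` is a hypothetical stand-in for a real model call, and the `calculator` tool and the `Action: name[input]` text format are invented for the example.

```python
import re

def calculator(expression: str) -> str:
    """Toy tool: evaluate arithmetic made only of digits, spaces, and + - * / . ( )."""
    if not re.fullmatch(r"[\d\s+\-*/().]+", expression):
        return "error: unsupported expression"
    try:
        # eval is tolerable here only because the input was whitelisted above.
        return str(eval(expression))
    except Exception as error:
        return f"error: {error}"

TOOLS = {"calculator": calculator}

def scripted_llm(transcript: str) -> str:
    """Hypothetical stand-in for a model: chooses the next step from the transcript."""
    if "Observation:" not in transcript:
        return "Thought: I need to compute this.\nAction: calculator[12 * 7]"
    observation = transcript.rsplit("Observation: ", 1)[1].strip()
    return f"Thought: I have the result.\nFinal Answer: {observation}"

def react_loop(question: str, llm=scripted_llm, max_steps: int = 5) -> str:
    """Alternate between model output (thought + action) and tool output (observation)."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        step = llm(transcript)
        transcript += step + "\n"
        if "Final Answer:" in step:
            # The model, not the caller, decides when the task is done.
            return step.split("Final Answer:", 1)[1].strip()
        match = re.search(r"Action: (\w+)\[(.*)\]", step)
        if match is None:
            transcript += "Observation: error: no action found\n"
            continue
        name, argument = match.groups()
        tool = TOOLS.get(name)
        observation = tool(argument) if tool else f"error: unknown tool {name}"
        transcript += f"Observation: {observation}\n"
    return "stopped: step limit reached"

print(react_loop("What is 12 * 7?"))
```

The `max_steps` cap is a design choice worth keeping even in toy code: because the model decides when it is finished, the loop needs an external limit in case it never does.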
| ### Frameworks: smolagents, LangChain, and LlamaIndex | ||
| Building an agent from scratch is possible (indeed, we will build a couple in order to demystify agents and LLM tool use). However, just like with RAG, things can get complex very quickly, and [many agentic frameworks](https://github.com/Azure-Samples/python-ai-agent-frameworks-demos/) have been created to handle this complexity for you. Just to name a few: |
typo: demystify
|
|
||
| **[smolagents](https://huggingface.co/docs/smolagents/en/index)** is a lightweight framework from Hugging Face that emphasizes simplicity and transparency. It is especially well-suited for educational settings, because the agent loop is explicit and easy to inspect. Code-based agents are a first-class concept, which makes smolagents a good fit for learning how agents actually work under the hood. smolagents is the framework we will be using for the hands-on portion of this lesson, partly because it is simple and easy to learn, and partly because its code-based agents are so powerful and flexible. | ||
|
|
||
| **[LangChain](https://www.langchain.com/agents)** is a general-purpose framework for building LLM-powered applications, including agents. It provides abstractions for tools, memory, chains, and agents, making it easier to assemble complex systems quickly. The trade-off is complexity. LangChain can feel heavy, and understanding what is happening internally requires more effort. We initially planned to use LangChain for Code the Dream, but the learning curve was much too steep for our purposes. |
What does "chains" mean in this context?
|
|
||
| **[LlamaIndex](https://developers.llamaindex.ai/python/framework/understanding/agent/)** isn't just for RAG, but also has an agentic framework that can be very powerful for building agents that interact with external data sources. |
It seems like logically we should include a reason why we don't use this for our hands-on portion. But we don't really need to do that, we can discuss
| ### Tool-Based vs. Code-Based Agents | ||
| Not all agents work the same way. A useful distinction that is becoming more prevalent is between tool-based agents and code-based agents. | ||
|
|
||
| *Tool-based agents* interact with the world by calling predefined tools that are explicitly enumerated to the LLM. These tools might include things like a plotting function, a calculator, or any other well-defined software tool. The agent's job is to decide which tools to use and when, given the task. This approach is generally safer and easier to control, because the agent can only do what the available tools allow. |
typo: enumerated
typo: a plotting function
|
|
||
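To make the "agent can only do what the available tools allow" point concrete, here is a minimal sketch of a tool registry and dispatcher. Everything in it is an assumption for illustration: the `add` and `word_count` tools, the registry layout, and the JSON shape of a tool call are hypothetical, and real frameworks each define their own formats.

```python
import json

def add(a: float, b: float) -> float:
    """Toy tool: add two numbers."""
    return a + b

def word_count(text: str) -> int:
    """Toy tool: count whitespace-separated words."""
    return len(text.split())

# The registry is the complete list of actions the agent may take.
TOOL_REGISTRY = {
    "add": {"function": add, "description": "Add two numbers.", "parameters": ["a", "b"]},
    "word_count": {"function": word_count, "description": "Count words in a text.", "parameters": ["text"]},
}

def describe_tools() -> str:
    """Build the text that would go into the prompt so the model knows its options."""
    return "\n".join(
        f"{name}({', '.join(spec['parameters'])}): {spec['description']}"
        for name, spec in TOOL_REGISTRY.items()
    )

def dispatch(tool_call_json: str):
    """Run a model-requested tool call, rejecting anything outside the registry."""
    call = json.loads(tool_call_json)
    spec = TOOL_REGISTRY.get(call.get("tool"))
    if spec is None:
        raise ValueError(f"tool not allowed: {call.get('tool')!r}")
    arguments = call.get("arguments", {})
    if set(arguments) != set(spec["parameters"]):
        raise ValueError(f"expected parameters {spec['parameters']}, got {sorted(arguments)}")
    return spec["function"](**arguments)

print(describe_tools())
print(dispatch('{"tool": "add", "arguments": {"a": 2, "b": 3}}'))
```

A request for a tool that is not in the registry, such as a hypothetical `delete_files`, raises `ValueError` instead of running, which is where the extra control of the tool-based approach comes from.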
| *Code-based agents*, on the other hand, are given free rein to generate and execute novel code to reach their given goal. Instead of selecting from a fixed set of tools, the agent writes code, runs it, inspects the output, and continues from there. This is extremely powerful and flexible. Careful sandboxing and guardrails are required to keep things safe, but in practice it turns out that LLMs are surprisingly good at writing correct code, so code-based agents can be very effective. We will see an example of this in a hands-on lesson. |
typo: guardrails
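The "write code, run it, inspect the output" step of a code-based agent can be sketched as follows. This is an assumption-laden illustration, not how smolagents or any other framework executes code: `run_generated_code` and `SAFE_BUILTINS` are hypothetical names, the `generated` string stands in for model output, and restricting builtins this way is not a real sandbox.

```python
import contextlib
import io

# A deliberately small set of builtins exposed to generated code.
# NOTE: this narrows what the code can reach but is NOT a security boundary.
SAFE_BUILTINS = {"print": print, "range": range, "sum": sum, "len": len, "min": min, "max": max}

def run_generated_code(code: str) -> str:
    """Execute model-written code and return what it printed, or an error message."""
    buffer = io.StringIO()
    namespace = {"__builtins__": SAFE_BUILTINS}
    try:
        with contextlib.redirect_stdout(buffer):
            exec(code, namespace)
    except Exception as error:
        # The error text becomes the observation the agent sees on its next step.
        return f"error: {type(error).__name__}: {error}"
    return buffer.getvalue().strip()

# Stand-in for code the model might write to answer "sum of squares of 1..3".
generated = "total = sum(n * n for n in range(1, 4))\nprint(total)"
print(run_generated_code(generated))
```

Returning errors as text rather than raising them mirrors the loop in the paragraph above: a failed attempt is just another output for the agent to inspect before it tries again.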
| Also, the [Model Context Protocol](https://en.wikipedia.org/wiki/Model_Context_Protocol) (MCP), created by Anthropic, is an emerging standard that defines a *common interface* for exposing tools and resources to agents. It is a protocol that standardizes how agents discover tools, understand their inputs and outputs, and call them safely, much like USB standardizes how devices connect to computers. | ||
|
|
||
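As a rough picture of what "discover tools, understand their inputs and outputs, and call them" looks like on the wire, the sketch below imitates the two JSON-RPC requests MCP uses for tools, `tools/list` and `tools/call`. It is a heavily simplified, in-process toy: the `get_weather` tool and `handle_request` function are hypothetical, the initialization handshake and transport are omitted, and the message fields reflect a reading of the specification rather than a complete implementation of it.

```python
import json

# Hypothetical tool table; "handler" is internal and never sent to the client.
TOOLS = {
    "get_weather": {
        "description": "Return a canned weather report for a city.",
        "inputSchema": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
        "handler": lambda arguments: f"Sunny in {arguments['city']}",
    }
}

def handle_request(raw: str) -> str:
    """Answer one JSON-RPC 2.0 request the way a tiny MCP-style server might."""
    request = json.loads(raw)
    method = request.get("method")
    if method == "tools/list":
        # Discovery: name, description, and a JSON Schema for the inputs.
        result = {
            "tools": [
                {"name": name, "description": spec["description"], "inputSchema": spec["inputSchema"]}
                for name, spec in TOOLS.items()
            ]
        }
    elif method == "tools/call":
        params = request["params"]
        text = TOOLS[params["name"]]["handler"](params["arguments"])
        result = {"content": [{"type": "text", "text": text}]}
    else:
        error = {"code": -32601, "message": "Method not found"}
        return json.dumps({"jsonrpc": "2.0", "id": request.get("id"), "error": error})
    return json.dumps({"jsonrpc": "2.0", "id": request.get("id"), "result": result})

listing = json.loads(handle_request(json.dumps({"jsonrpc": "2.0", "id": 1, "method": "tools/list"})))
call = json.loads(handle_request(json.dumps({
    "jsonrpc": "2.0", "id": 2, "method": "tools/call",
    "params": {"name": "get_weather", "arguments": {"city": "Lisbon"}},
})))
print(listing["result"]["tools"][0]["name"], "->", call["result"]["content"][0]["text"])
```

Because the agent learns about `get_weather` from the `tools/list` response instead of from hard-coded knowledge, the same client could talk to any server that speaks the protocol, which is the point of the USB comparison.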
| ## Next steps | ||
| AI agents represent a shift from "text exchange" to "tool building". The core ideas (autonomy, iterative reasoning, and tool use) are simple but powerful. In the rest of the lessons, we will move from these concepts to hands-on examples, where you will see how agents behave in practice. |
typo: an extra " added after "tool building"
Brief introduction to agents: what are they, some different types, and a few of the frameworks out there. Just enough to whet their appetite for the lessons.