“According to the Gartner Report, by 2028, one-third of interactions with generative AI (GenAI) services will use action models and autonomous agents for task completion.”
In the rapidly changing world of Artificial Intelligence, the development of autonomous agents has taken a significant leap forward. These AI agents, equipped with advanced tools and supported by large language models (LLMs) like those from OpenAI, can autonomously perform complex, multi-step tasks with minimal or no human intervention.
Different industries like finance, retail, e-commerce, and many more have already taken the plunge in utilizing this technology across different use cases. According to a Forrester report, generative AI and autonomous workplace assistants are among the top emerging technologies poised to deliver significant ROI and transform business processes across various sectors (Forrester).
LangChain, an open-source framework that has emerged as a de facto standard for building generative AI applications, offers powerful libraries and tooling for developing such AI agents. Many AI applications and platforms already leverage it to implement sophisticated use cases, as it helps harness the full potential of multiple language models.
In this series, I’ll share my journey and learnings as I explore how to build AI agents using LangChain. I will start with a simple example and, in subsequent posts, gradually take it to a level where multiple agents perform complex tasks together.
What are AI Agents?
AI agents are software programs that can analyze information, make decisions, and take actions to achieve specific tasks or goals. These agents are context-aware, capable of learning and adapting based on human-in-the-loop input, and can handle sophisticated tasks.
LangChain and OpenAI
LangChain simplifies the development of applications that leverage the power of large language models (LLMs). Using the comprehensive toolkit of libraries and frameworks it provides, developers can create a wide range of natural language processing applications, including chatbots, text summarizers, question-answering systems, sentiment analysis tools, and more.
OpenAI, known as the company behind ChatGPT, has become a go-to AI platform for both technical and non-technical users, even for everyday tasks. It has developed advanced language models like GPT-3.5, GPT-4, DALL-E, and Whisper, which can process and generate text, images, and audio. These models are ideal for tasks such as answering questions, summarizing information, and creating original content.
Building The First Agent
To understand any concept and dive deeper, it is always better to start with a simple example. In this example, I will create a simple calculator agent that performs basic mathematical calculations. To calculate, the agent is equipped with a very basic tool, and it leverages OpenAI to comprehend the questions asked and generate responses.
One more thing: even though this is a basic example, I have exposed the agent’s tasks through a Flask API. The core agent component can also be executed as a standalone Python script. The rationale behind an API-first approach will become clear as I go deeper and share examples with multiple agents collaborating to achieve complex goals.
So let us jump into the example:
Required Dependencies
We will need the following dependencies for this example:
import os
from dotenv import load_dotenv
from flask import Flask, request, jsonify
from langchain.agents import initialize_agent, Tool
from langchain.agents import AgentType
from langchain.chat_models import ChatOpenAI
from langchain.callbacks import StdOutCallbackHandler
Install these dependencies by running the following pip command:
pip install flask python-dotenv langchain openai
Basic Building Blocks
Step 1: Creating the Calculator Tool
As shared earlier, for the agent to perform mathematical calculations, we will create a simple tool in Python that does the calculation. LangChain has a concept of tools, which an AI agent can leverage to perform tasks. Below is the code for the same:
def calculator(expression):
    try:
        return str(eval(expression))
    except Exception:
        return "Error: Invalid expression"

tools = [
    Tool(
        name="Calculator",
        func=calculator,
        description="useful for when you need to perform mathematical calculations",
    )
]
As you can see, the calculator is nothing but a Python function with a name and a description, which is added to the tools list.
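A word of caution: `eval()` executes arbitrary Python, so a malicious expression could do real damage. Below is a safer variation of the calculator (my own sketch, not part of the original example) that uses the standard-library `ast` module to allow only basic arithmetic. The name `safe_calculator` is hypothetical:

```python
import ast
import operator

# Map supported AST operator nodes to their arithmetic implementations.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
    ast.USub: operator.neg,
}

def safe_calculator(expression):
    """Evaluate a purely arithmetic expression; reject anything else."""
    def _eval(node):
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](_eval(node.operand))
        raise ValueError("unsupported expression")
    try:
        return str(_eval(ast.parse(expression, mode="eval").body))
    except Exception:
        return "Error: Invalid expression"
```

You could drop this in as the `func` of the `Tool` above; function calls, imports, and attribute access all fall through to the error branch instead of being executed.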
Step 2: Initializing the language model
While LangChain provides the framework for building our agent, OpenAI’s language model serves as the agent’s “brain,” powering its language understanding and generation capabilities.
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)
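Note that `ChatOpenAI` picks up the API key from the `OPENAI_API_KEY` environment variable, which `load_dotenv()` reads from your `.env` file. A small helper (my own addition, with a hypothetical name) can fail fast with a clear message when the key is missing:

```python
import os

def require_api_key():
    """Return the OpenAI API key from the environment, or raise a clear error."""
    key = os.getenv("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to your .env file")
    return key

# Example: call this once at startup, before creating the LLM.
# require_api_key()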
Step 3: Initializing the agent
The below code initializes the agent:
agent = initialize_agent(
tools, llm, agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)
In the above code block, the following things are set while initializing the agent:
a) tools: The list of Tool objects the agent can use.
b) llm: The language model the agent will use for understanding and generating text.
c) agent: Specifies the type of agent to use. Here we use CHAT_ZERO_SHOT_REACT_DESCRIPTION for simple chat interactions, as this agent type doesn’t require examples to understand how to use tools.
d) verbose: Set to True, which provides detailed output of the agent’s thought process.
Step 4: Executing the Agent
The agent is run by calling the agent.run() method. When this is called, the following steps are performed internally:
a) The agent receives the question and analyzes it using the language model.
b) It determines if it needs to use any tools to answer the question.
c) If needed, it uses one or more tools.
d) It formulates a response based on the tool outputs and its understanding of the question.
e) If verbose is True, it outputs its thought process.
f) Finally, it returns the response.
This process can iterate multiple times if the agent determines it needs more information or needs to use additional tools to complete the task.
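To make the loop above concrete, here is a deliberately simplified, self-contained sketch of a ReAct-style iteration. The `fake_llm` function is a stub standing in for the real model, and all names here are illustrative, not LangChain internals:

```python
def fake_llm(prompt):
    """Stub LLM: ask for the Calculator once, then give a final answer."""
    if "Observation" not in prompt:
        return "Action: Calculator\nAction Input: 2 + 2"
    return "Final Answer: 4"

def calculator(expression):
    try:
        return str(eval(expression))
    except Exception:
        return "Error: Invalid expression"

def run_agent(question, max_steps=5):
    prompt = f"Question: {question}"
    for _ in range(max_steps):
        reply = fake_llm(prompt)
        # Step d/f: if the model has enough information, return the answer.
        if reply.startswith("Final Answer:"):
            return reply.removeprefix("Final Answer:").strip()
        # Steps b/c: parse the tool call and run the tool.
        tool_input = reply.split("Action Input:")[1].strip()
        # The tool's output becomes an "Observation" for the next iteration.
        prompt += f"\n{reply}\nObservation: {calculator(tool_input)}"
    return "Agent stopped after max_steps"

print(run_agent("What is 2 + 2?"))  # → 4
```

The real agent does the same thing with actual model calls and more robust parsing; the `max_steps` cap mirrors LangChain’s iteration limit, which stops an agent that keeps requesting tools without converging.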
Below is the code that takes the query and runs the agent:
@app.route("/query", methods=["POST"])
def query_agent():
    data = request.json
    if "question" not in data:
        return jsonify({"error": "No question provided"}), 400
    question = data["question"]
    try:
        response = agent.run(question)
        return jsonify({"response": response})
    except Exception as e:
        return jsonify({"error": str(e)}), 500
Complete Code
Below is the full code of the agent that performs mathematical calculations:
import os
from dotenv import load_dotenv
from flask import Flask, request, jsonify
from langchain.agents import initialize_agent, Tool
from langchain.agents import AgentType
from langchain.chat_models import ChatOpenAI
from langchain.callbacks import StdOutCallbackHandler

# Load environment variables
load_dotenv()

app = Flask(__name__)

# Set up a simple calculator tool
def calculator(expression):
    try:
        return str(eval(expression))
    except Exception:
        return "Error: Invalid expression"

tools = [
    Tool(
        name="Calculator",
        func=calculator,
        description="useful for when you need to perform mathematical calculations",
    )
]

# Set up the LLM
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

# Initialize the agent
agent = initialize_agent(
    tools, llm, agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

@app.route("/query", methods=["POST"])
def query_agent():
    data = request.json
    if "question" not in data:
        return jsonify({"error": "No question provided"}), 400
    question = data["question"]
    try:
        response = agent.run(question)
        return jsonify({"response": response})
    except Exception as e:
        return jsonify({"error": str(e)}), 500

if __name__ == "__main__":
    app.run(debug=True)
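To see the request/response contract of the `/query` endpoint without spending OpenAI tokens, you can exercise the route with Flask’s built-in test client. The sketch below replicates the route with a stubbed agent (`run_stub` is a placeholder I invented, not a LangChain API) so it runs without an API key:

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

def run_stub(question):
    """Placeholder for agent.run(); returns a canned answer."""
    return "4" if "2 + 2" in question else "I don't know"

@app.route("/query", methods=["POST"])
def query_agent():
    data = request.json
    if "question" not in data:
        return jsonify({"error": "No question provided"}), 400
    return jsonify({"response": run_stub(data["question"])})

# Exercise the endpoint in-process, without starting a server.
client = app.test_client()
resp = client.post("/query", json={"question": "What is 2 + 2?"})
print(resp.get_json())
```

Against the real app, the same POST body returns the agent’s computed answer in the `response` field, and a missing `question` key yields a 400 with an `error` field.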
Conclusion & Next Steps
This is just a basic agent built using LangChain and OpenAI. When you query the agent with a basic mathematical question, it will respond with the answer.
In the next few posts, I will share how to create an agent with a slightly more complex tool and expand it further to multiple agents.
Additional Notes
The code uses specific versions of LangChain and other dependencies. While running the code, it is always better to create a Python virtual environment and install the dependencies using requirements.txt. The complete working code can be found in my GitHub repo.
Happy Learning!!