August 25, 2024 | September 8, 2024 | Madhukar Chaubey | Artificial Intelligence, Gen AI

Building Intelligent Agents with LangChain and OpenAI — Part 3: Creating Agents with Advanced Task Handling and Chain of Thought Reasoning

Introduction

In the first two parts [Part 1, Part 2] of this series, we introduced the basics of building AI agents using LangChain and OpenAI. We started with a simple calculator agent and then delved into more advanced LangChain concepts, which allowed us to build more capable agents.
In this post, we will go through the core concept on which these agents work, i.e., “Chain of Thought” (CoT) reasoning, and learn how to implement it in LangChain. We will also enhance our AI agent to make it generic and reusable so that it can handle complex tasks.

Understanding Chain of Thought Reasoning in AI Agents

Most AI agents nowadays, including the ones we are building and enhancing, use LLMs and prompts behind the scenes. Chain of Thought is a prompting technique used to enhance the reasoning capabilities of Large Language Models (LLMs). By leveraging Chain of Thought, AI agents can break down complex tasks into smaller, manageable steps, allowing them to reason, make informed decisions, and solve complex problems more effectively. Working through explicit intermediate steps, much as a person would, also tends to improve the accuracy of the final answer.

Chain of Thought Prompting Techniques

The chain of thought prompting technique involves providing a structured sequence of steps so that the LLM can reason through the problem before producing its output. This technique can be applied in the following ways:

Direct Prompting: In this approach, explicit step-by-step instructions are given in the prompt so that the LLM can reason and act effectively. LangChain’s prompt templates make this easy.
Structured Prompting with LangChain: LangChain also has additional capabilities, like chains, tools, and memory, that can be used to enhance the reasoning process. By leveraging them, a complex problem can be broken down into a series of prompts or actions, each representing an intermediate step.

By utilizing these techniques, the performance and outcome of AI models can be further improved so that these models can solve real-life complex problems in various domains.

Implementing Chain of Thought Reasoning in LangChain

Let us explore how each of these is implemented in LangChain:

Direct Prompting: In this technique, the prompt defined using PromptTemplate includes clear instructions and steps that guide the language model to break down complex tasks into intermediate steps.

from langchain_openai import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain

# Initialize the language model
llm = OpenAI(temperature=0.7)

# Define the prompt template with specific chain of thought instructions
cot_prompt = PromptTemplate(
    input_variables=["question"],
    template="""Question: {question}

Let's solve this problem step-by-step using a chain of thought approach:

1. Understand the question:
   - Restate the problem in your own words
   - Identify what we're asked to find

2. Identify the given information:
   - List all the relevant data provided in the question
   - Note any units of measurement

3. Determine the appropriate formula or method:
   - Recall any relevant formulas that apply to this problem
   - Explain why this formula or method is suitable

4. Solve the problem:
   - Plug the given information into the formula
   - Show each calculation step clearly
   - Carry units through your calculations

5. Check the result:
   - Verify if the answer makes sense in the context of the question
   - Ensure the units of the final answer are correct

6. State the final answer:
   - Provide a clear, concise answer to the original question
   - Include the appropriate units

Now, let's apply this process to solve the problem:

"""
)

# Create the chain
chain = LLMChain(llm=llm, prompt=cot_prompt)

# Example question
question = "If a train travels 120 miles in 2 hours, what is its average speed in miles per hour?"

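# Run the chain (the input key matches the template's input variable)
response = chain.invoke({"question": question})

# invoke() returns a dict; LLMChain's default output key is "text"
print(response["text"])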

In the above code, the prompt template includes a direct instruction (“Let's solve this problem step-by-step using a chain of thought approach”) to encourage the model to reason through the problem. The model is also instructed to work through intermediate reasoning steps before giving the final answer.

LLMChain with Intermediate Steps: This method uses LangChain’s SequentialChain. A separate LLMChain is defined for each step, and the chains are executed in sequence by the SequentialChain. Each intermediate LLMChain has its own prompt template for its step.

from langchain_openai import OpenAI
from langchain.prompts import PromptTemplate
from langchain.chains import LLMChain, SequentialChain

# Initialize the language model
llm = OpenAI(temperature=0.7)

# Define prompt templates for each step
restate_prompt = PromptTemplate(
    input_variables=["question"],
    template="Restate the following question in your own words:\n\nQuestion: {question}\n\nRestatement:"
)

identify_info_prompt = PromptTemplate(
    input_variables=["question", "restatement"],
    template="Given this question and restatement:\n\nQuestion: {question}\n\nRestatement: {restatement}\n\nIdentify and list the key information provided:"
)

determine_method_prompt = PromptTemplate(
    input_variables=["question", "key_info"],
    template="Based on this question and key information:\n\nQuestion: {question}\n\nKey Information: {key_info}\n\nDetermine the appropriate formula or method to solve this problem:"
)

solve_problem_prompt = PromptTemplate(
    input_variables=["question", "method", "key_info"],
    template="Solve this problem step-by-step:\n\nQuestion: {question}\n\nMethod: {method}\n\nKey Information: {key_info}\n\nSolution:"
)

check_answer_prompt = PromptTemplate(
    input_variables=["question", "solution"],
    template="Given this question and solution:\n\nQuestion: {question}\n\nSolution: {solution}\n\nVerify if the answer makes sense and explain why:"
)

# Create individual chains for each step
restate_chain = LLMChain(llm=llm, prompt=restate_prompt, output_key="restatement")
identify_info_chain = LLMChain(llm=llm, prompt=identify_info_prompt, output_key="key_info")
determine_method_chain = LLMChain(llm=llm, prompt=determine_method_prompt, output_key="method")
solve_problem_chain = LLMChain(llm=llm, prompt=solve_problem_prompt, output_key="solution")
check_answer_chain = LLMChain(llm=llm, prompt=check_answer_prompt, output_key="verification")

# Combine chains into a sequential chain
overall_chain = SequentialChain(
    chains=[restate_chain, identify_info_chain, determine_method_chain, solve_problem_chain, check_answer_chain],
    input_variables=["question"],
    output_variables=["restatement", "key_info", "method", "solution", "verification"],
    verbose=True
)

# Example question
question = "If a train travels 120 miles in 2 hours, what is its average speed in miles per hour?"

# Run the chain
print("Running the chain...")
result = overall_chain.invoke({"question": question})

print("\nFinal Results:")
for key, value in result.items():
    print(f"\n{key.capitalize()}:")
    print(value)

This approach breaks down the reasoning process into a series of LLMChains, each with its own prompt template and each representing an intermediate step. It suits complex tasks requiring multiple steps, such as strategic planning or multi-stage decision-making.
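To make the data flow concrete, here is a minimal sketch in plain Python of how outputs from one step become inputs to the next. It is a simplified stand-in, not LangChain's implementation: `run_sequential` and `fake_llm` are hypothetical names, and the "model" only records the prompt it receives so that we can inspect it.

```python
from typing import Callable, Dict, List, Tuple


def run_sequential(
    steps: List[Tuple[str, str]],
    inputs: Dict[str, str],
    llm: Callable[[str], str],
) -> Dict[str, str]:
    """Run (output_key, template) steps in order, sharing one state dict."""
    state = dict(inputs)
    for output_key, template in steps:
        # Each template only pulls the keys it names; extra keys are ignored
        prompt = template.format(**state)
        state[output_key] = llm(prompt)
    return state


prompts_seen: List[str] = []


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model call; records the prompt it was given."""
    prompts_seen.append(prompt)
    return f"<reply {len(prompts_seen)}>"


# Output keys mirror the ones used in the SequentialChain example
steps = [
    ("restatement", "Question: {question}\nRestatement:"),
    ("key_info", "Question: {question}\nRestatement: {restatement}\nKey information:"),
    ("solution", "Question: {question}\nKey information: {key_info}\nSolution:"),
]

result = run_sequential(steps, {"question": "How fast is the train?"}, fake_llm)
for key, value in result.items():
    print(f"{key}: {value}")
```

Each step's reply is stored under its output key, so a later template can reference it by name. This is the same idea the `output_key` and `input_variables` settings express in the chains above.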

Building the Agent and Task Execution Framework

Now that we understand the concept of CoT and how it can be implemented with different techniques in LangChain, let us bring our focus back to the AI agent. The goal here is to enhance it and add the concept of a Task. This is a step toward a simple agentic framework in which we can dynamically define tasks and agents and execute them.

Task Class

The Task class defines the purpose and outcome of the task and provides a method to execute the task using an assigned agent.

#other imports
import uuid
# Task class
class Task:
    def __init__(self, name: str, description: str):
        self.id = str(uuid.uuid4())
        self.name = name
        self.description = description
        self.result = None

    def execute(self, agent, **kwargs):
        task_prompt = f"Complete the following task: {self.name}\n{self.description}\n\nAdditional Information:\n"
        for key, value in kwargs.items():
            task_prompt += f"{key}: {value}\n"
        self.result = agent.process_task(task_prompt)
        return self.result
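To see what prompt the agent actually receives, we can run the Task class against a stub. This is only a sketch: `EchoAgent` is a hypothetical stand-in that returns the prompt instead of calling an LLM, and the task description is shortened for the example.

```python
import uuid


class Task:
    def __init__(self, name: str, description: str):
        self.id = str(uuid.uuid4())
        self.name = name
        self.description = description
        self.result = None

    def execute(self, agent, **kwargs):
        task_prompt = f"Complete the following task: {self.name}\n{self.description}\n\nAdditional Information:\n"
        for key, value in kwargs.items():
            task_prompt += f"{key}: {value}\n"
        self.result = agent.process_task(task_prompt)
        return self.result


class EchoAgent:
    """Hypothetical stub: returns the prompt it receives instead of calling an LLM."""

    def process_task(self, task_prompt: str) -> str:
        return task_prompt


task = Task(
    name="Research Latest Trends in e-commerce",
    description="Summarize key findings.",
)
prompt_seen = task.execute(EchoAgent(), focus_area="e-commerce marketing")
print(prompt_seen)
```

Any keyword arguments passed to `execute` end up as `key: value` lines under "Additional Information", which is how the `additional_data` payload reaches the agent later in this post.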

Agent Class

The Agent class defines our intelligent AI agent. It leverages LangChain’s tools and language models to perform tasks.

#other imports
from langchain.agents import AgentExecutor, create_react_agent

# Agent class
class Agent:
    def __init__(self, name: str):
        self.name = name
        self.memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
        self.tools = [tavily_search]
        self.agent_executor = self._create_agent()

    def _create_agent(self):
        template = '''Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought:{agent_scratchpad}'''

        prompt = PromptTemplate.from_template(template)
        
        agent = create_react_agent(llm, self.tools, prompt)
        return AgentExecutor(agent=agent, tools=self.tools, verbose=True, memory=self.memory)

    def process_task(self, task_prompt: str) -> str:
        response = self.agent_executor.invoke({"input": task_prompt})
        return response['output']

    def execute_task(self, task: Task, **kwargs):
        return task.execute(self, **kwargs)

As you can see in the above code, we have used LangChain's pre-defined ReAct agent, which uses the CoT principle to accomplish the assigned task.
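One detail worth checking, based on my reading of the code above: the agent is given a ConversationBufferMemory keyed as chat_history, but the prompt template has no {chat_history} placeholder, so the stored history would not appear in the text sent to the model. The sketch below uses only the standard library to list a template's placeholders. The `placeholders` helper and the "Previous conversation" wording are my own assumptions, and whether a task-execution agent should see earlier turns at all is a design choice.

```python
from string import Formatter

# The ReAct template from the Agent class above
REACT_TEMPLATE = '''Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought:{agent_scratchpad}'''


def placeholders(template: str) -> set:
    """Return the set of {name} fields used in a format-style template."""
    return {field for _, field, _, _ in Formatter().parse(template) if field}


# Hypothetical variant that gives the history a place in the prompt
WITH_HISTORY = REACT_TEMPLATE.replace(
    "Begin!", "Previous conversation:\n{chat_history}\n\nBegin!"
)

print(sorted(placeholders(REACT_TEMPLATE)))
print(sorted(placeholders(WITH_HISTORY)))
```

If you do want the history in the prompt, a variant like `WITH_HISTORY` is one option. How the history renders depends on the memory settings and the LangChain version, so verify it against the version you install.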

Using the Framework via API

As in the previous examples, where the agent was exposed via a Flask API, we will do the same here. The reason I am using Flask (you can choose any other Python microservices framework) is to follow an API-first approach, which helps us SaaSify the framework.

# Flask API routes

@app.route('/add_task', methods=['POST'])
def add_task():
    """Add a new task."""
    data = request.json
    name = data.get('name')
    description = data.get('description')
    
    if not name or not description:
        return jsonify({"error": "Task name and description are required"}), 400

    task = Task(name=name, description=description)
    tasks[task.id] = task
    return jsonify({"task_id": task.id}), 201

@app.route('/add_agent', methods=['POST'])
def add_agent():
    """Add a new agent."""
    data = request.json
    name = data.get('name')
    
    if not name:
        return jsonify({"error": "Agent name is required"}), 400

    agent = Agent(name=name)
    agents[name] = agent
    return jsonify({"agent_name": agent.name}), 201

@app.route('/assign_task', methods=['POST'])
def assign_task():
    """Assign a task to an agent without executing it."""
    data = request.json
    agent_name = data.get('agent_name')
    task_id = data.get('task_id')

    agent = agents.get(agent_name)
    task = tasks.get(task_id)

    if not agent or not task:
        return jsonify({"error": "Agent or task not found"}), 404

    assignments[task_id] = {"agent_name": agent_name, "assigned": True}
    return jsonify({"message": f"Task '{task_id}' assigned to agent '{agent_name}'."}), 200

@app.route('/execute_task', methods=['POST'])
def execute_task():
    """Execute an assigned task for an agent."""
    data = request.json
    task_id = data.get('task_id')
    additional_data = data.get('additional_data', {})

    assignment = assignments.get(task_id)

    if not assignment or not assignment['assigned']:
        return jsonify({"error": "Task is not assigned or does not exist."}), 404

    agent_name = assignment['agent_name']
    agent = agents.get(agent_name)
    task = tasks.get(task_id)

    result = agent.execute_task(task, **additional_data)

    return jsonify({"result": result})

@app.route('/get_agents', methods=['GET'])
def get_agents():
    """Get all agents."""
    return jsonify({"agents": list(agents.keys())})

@app.route('/get_tasks', methods=['GET'])
def get_tasks():
    """Get all tasks."""
    return jsonify({"tasks": {task_id: {"name": task.name, "description": task.description} for task_id, task in tasks.items()}})

@app.route('/get_assignments', methods=['GET'])
def get_assignments():
    """Get all task assignments."""
    return jsonify({"assignments": assignments})

Plugging the Pieces Together

import os
from dotenv import load_dotenv
from flask import Flask, request, jsonify
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_react_agent
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.prompts import PromptTemplate
from langchain.memory import ConversationBufferMemory
from typing import Dict, Any
import uuid

# Load environment variables
load_dotenv()

app = Flask(__name__)

# Initialize the LLM
llm = ChatOpenAI(model="gpt-4", temperature=0)

# Initialize Tavily Search tool
tavily_search = TavilySearchResults(max_results=3)

# In-memory storage for tasks, agents, and assignments
tasks: Dict[str, 'Task'] = {}
agents: Dict[str, 'Agent'] = {}
assignments: Dict[str, Dict[str, Any]] = {}

# Task class
class Task:
    def __init__(self, name: str, description: str):
        self.id = str(uuid.uuid4())
        self.name = name
        self.description = description
        self.result = None

    def execute(self, agent, **kwargs):
        task_prompt = f"Complete the following task: {self.name}\n{self.description}\n\nAdditional Information:\n"
        for key, value in kwargs.items():
            task_prompt += f"{key}: {value}\n"
        self.result = agent.process_task(task_prompt)
        return self.result

# Agent class
class Agent:
    def __init__(self, name: str):
        self.name = name
        self.memory = ConversationBufferMemory(memory_key="chat_history", return_messages=True)
        self.tools = [tavily_search]
        self.agent_executor = self._create_agent()

    def _create_agent(self):
        template = '''Answer the following questions as best you can. You have access to the following tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of [{tool_names}]
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought:{agent_scratchpad}'''

        prompt = PromptTemplate.from_template(template)
        
        agent = create_react_agent(llm, self.tools, prompt)
        return AgentExecutor(agent=agent, tools=self.tools, verbose=True, memory=self.memory)

    def process_task(self, task_prompt: str) -> str:
        response = self.agent_executor.invoke({"input": task_prompt})
        return response['output']

    def execute_task(self, task: Task, **kwargs):
        return task.execute(self, **kwargs)

# Flask API routes

@app.route('/add_task', methods=['POST'])
def add_task():
    """Add a new task."""
    data = request.json
    name = data.get('name')
    description = data.get('description')
    
    if not name or not description:
        return jsonify({"error": "Task name and description are required"}), 400

    task = Task(name=name, description=description)
    tasks[task.id] = task
    return jsonify({"task_id": task.id}), 201

@app.route('/add_agent', methods=['POST'])
def add_agent():
    """Add a new agent."""
    data = request.json
    name = data.get('name')
    
    if not name:
        return jsonify({"error": "Agent name is required"}), 400

    agent = Agent(name=name)
    agents[name] = agent
    return jsonify({"agent_name": agent.name}), 201

@app.route('/assign_task', methods=['POST'])
def assign_task():
    """Assign a task to an agent without executing it."""
    data = request.json
    agent_name = data.get('agent_name')
    task_id = data.get('task_id')

    agent = agents.get(agent_name)
    task = tasks.get(task_id)

    if not agent or not task:
        return jsonify({"error": "Agent or task not found"}), 404

    assignments[task_id] = {"agent_name": agent_name, "assigned": True}
    return jsonify({"message": f"Task '{task_id}' assigned to agent '{agent_name}'."}), 200

@app.route('/execute_task', methods=['POST'])
def execute_task():
    """Execute an assigned task for an agent."""
    data = request.json
    task_id = data.get('task_id')
    additional_data = data.get('additional_data', {})

    assignment = assignments.get(task_id)

    if not assignment or not assignment['assigned']:
        return jsonify({"error": "Task is not assigned or does not exist."}), 404

    agent_name = assignment['agent_name']
    agent = agents.get(agent_name)
    task = tasks.get(task_id)

    result = agent.execute_task(task, **additional_data)

    return jsonify({"result": result})

@app.route('/get_agents', methods=['GET'])
def get_agents():
    """Get all agents."""
    return jsonify({"agents": list(agents.keys())})

@app.route('/get_tasks', methods=['GET'])
def get_tasks():
    """Get all tasks."""
    return jsonify({"tasks": {task_id: {"name": task.name, "description": task.description} for task_id, task in tasks.items()}})

@app.route('/get_assignments', methods=['GET'])
def get_assignments():
    """Get all task assignments."""
    return jsonify({"assignments": assignments})

if __name__ == '__main__':
    app.run(debug=True)

Example API Calls

Define Task

Endpoint: POST /add_task
Method: POST
Example Payload:

{
  "name": "Research Latest Trends in e-commerce",
  "description": "Investigate current trends and research in the given focus area and summarize key findings."
}

curl -X POST http://localhost:5000/add_task -H "Content-Type: application/json" -d '{"name": "Research Latest Trends in e-commerce", "description": "Investigate current trends and research in the given focus area and summarize key findings."}'

This returns the GUID of the task (the task_id field in the response), which is used when we assign the task to the agent.

Define Agent

Endpoint: POST /add_agent
Method: POST
Example Payload:

{
  "name": "ResearchAgent"
}

curl -X POST http://localhost:5000/add_agent -H "Content-Type: application/json" -d '{"name": "ResearchAgent"}'

Assign Task to an Agent

Endpoint: POST /assign_task
Method: POST
Example Payload:

{
  "agent_name": "ResearchAgent",
  "task_id": "task_uuid_here"
}

curl -X POST http://localhost:5000/assign_task -H "Content-Type: application/json" -d '{"agent_name": "ResearchAgent", "task_id": "task_uuid_here"}'

Execution of the Task

Endpoint: POST /execute_task
Method: POST

Example Payload:

{
  "task_id": "task_uuid_here",
  "additional_data": {
    "focus_area": "e-commerce marketing"
  }
}

curl -X POST http://localhost:5000/execute_task -H "Content-Type: application/json" -d '{"task_id": "task_uuid_here", "additional_data": {"focus_area": "e-commerce marketing"}}'
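The four calls above can also be chained from a small Python client, so the task_id returned by /add_task is passed along automatically. This is a sketch using only the standard library. `BASE_URL`, `build_post`, `call`, and `run_workflow` are hypothetical names, and it assumes the Flask app is running locally on port 5000 as in the curl examples.

```python
import json
from urllib import request

BASE_URL = "http://localhost:5000"  # assumption: same address as the curl examples


def build_post(path: str, payload: dict) -> request.Request:
    """Build a JSON POST request for one of the API endpoints."""
    return request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(path: str, payload: dict) -> dict:
    """Send the request and decode the JSON response."""
    with request.urlopen(build_post(path, payload)) as resp:
        return json.loads(resp.read().decode("utf-8"))


def run_workflow() -> str:
    """Define a task and an agent, assign the task, then execute it."""
    task_id = call("/add_task", {
        "name": "Research Latest Trends in e-commerce",
        "description": "Investigate current trends and summarize key findings.",
    })["task_id"]
    call("/add_agent", {"name": "ResearchAgent"})
    call("/assign_task", {"agent_name": "ResearchAgent", "task_id": task_id})
    result = call("/execute_task", {
        "task_id": task_id,
        "additional_data": {"focus_area": "e-commerce marketing"},
    })
    return result["result"]
```

With the server running, `print(run_workflow())` would walk through all four steps. Note that `urlopen` raises an exception on 4xx/5xx responses, so a real client would want error handling around `call`.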

Conclusion and Next Steps

In this post, we took a deep dive into the concept of CoT and saw how it can be implemented in LangChain using different methods. We also enhanced the agent code with tasks and APIs to make it more flexible and reusable.
In the next part of this series, we will enhance this code and create simple multi-agent systems where multiple agents collaborate, leveraging the Chain of Thought framework to achieve even more complex objectives.

Additional Notes
The code uses a specific version of LangChain (v0.1) and other dependencies. We have also used TavilySearch, for which you need to register here and get the API key. When running the code, it is always better to create a Python virtual environment and install the dependencies with specific versions. The complete working code can be found in my GitHub repo.

Happy Learning!!

August 25, 2024 | September 7, 2024 | Madhukar Chaubey | Artificial Intelligence, Gen AI

Building Intelligent Agents with LangChain and OpenAI — Part 2: Diving Deeper into LangChain Agentic Concepts

1. Introduction

In the previous article, we talked about AI agents and developed a basic calculator agent using LangChain and OpenAI. The purpose of starting with a very basic agent was to demonstrate LangChain’s core components and explore how OpenAI’s large language models (LLMs) can be used to create intelligent systems.

Building on the same foundation, let us explore more sophisticated concepts and delve deeper into the LangChain framework. We’ll expand our understanding of the fundamentals of agentic frameworks and create a more sophisticated set of tools.

2. Fundamentals of Agentic Frameworks

At its core, an agentic framework is a system that allows AI agents to interact with their environment, make decisions, and perform actions to achieve specific goals. Key components of agentic frameworks include:

Agents: The AI entities that perceive, decide, and act.
Environment: The context in which agents operate.
Tools: Functionalities that agents can use to perform actions.
Memory: Mechanisms for storing and retrieving information.
Decision-making processes: Methods for choosing actions based on perceptions and goals.

3. Types of Agents

LangChain offers several agent types, which differ in the models they target and the tool interfaces they support:

Tool Calling: Use this agent when the model supports calling external tools.

OpenAI Tools: Similar to the above, but specific to OpenAI models and tools.

OpenAI Functions: A legacy type used with OpenAI models that support function calling.

XML: Useful with LLMs that handle XML input and output well, such as Anthropic's models.

Structured Chat: Best suited when the agent needs tools that take multiple inputs.

JSON Chat: Useful with LLMs that handle JSON input well.

ReAct: The best fit when simpler models are used.

Self Ask With Search: This agent supports only one tool, which must be a search tool. Best suited to simple Q&A use cases.

Word of Caution: As stated earlier, Gen AI, and agentic platforms in particular, are changing rapidly. You need to be very careful with the version of LangChain you use. The above list is supported in v0.1, and there may be differences between v0.1.x releases.

4. Decision-Making Process Of Agentic Platforms

The following steps are taken by LangChain agents during the decision-making process:

Perception: The agent receives input (e.g., a user query).
Thought: The agent considers the input and its available tools.
Action: The agent decides on an action (e.g., using a specific tool).
Observation: The agent observes the result of its action.
Repeat: The agent repeats this process until it reaches a final answer or conclusion.

This process is part of what’s known as the “ReAct” (Reason+Act) framework. ReAct is crucial for enabling agents to reason through tasks and decide on the best actions to take based on the situation.
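The loop described above can be sketched in a few lines of plain Python. This is a simplified illustration, not LangChain's agent executor: `react_loop`, `scripted_llm`, and the `divide` tool are hypothetical, and the model is replaced by a fixed script of replies so that the example runs without an API key.

```python
import re
from typing import Callable, Dict, List

ACTION_RE = re.compile(r"Action:\s*(.+?)\s*\nAction Input:\s*(.+)")
FINAL_RE = re.compile(r"Final Answer:\s*(.+)", re.DOTALL)


def react_loop(
    question: str,
    llm: Callable[[str], str],
    tools: Dict[str, Callable[[str], str]],
    max_steps: int = 5,
) -> str:
    """Repeat Thought -> Action -> Observation until a final answer appears."""
    scratchpad = ""
    for _ in range(max_steps):
        output = llm(f"Question: {question}\nThought:{scratchpad}")
        final = FINAL_RE.search(output)
        if final:
            return final.group(1).strip()
        action = ACTION_RE.search(output)
        if not action:
            raise ValueError("Model output contained neither an action nor a final answer")
        tool_name, tool_input = action.group(1), action.group(2).strip()
        observation = tools[tool_name](tool_input)
        # Feed the observation back so the next model call can build on it
        scratchpad += f"{output}\nObservation: {observation}\nThought:"
    raise RuntimeError("No final answer within max_steps")


def divide(expression: str) -> str:
    """Hypothetical tool: evaluates 'a / b'."""
    left, right = expression.split("/")
    return str(float(left) / float(right))


prompts_seen: List[str] = []
replies = iter([
    " I need to divide the distance by the time.\nAction: divide\nAction Input: 120 / 2",
    " I now know the final answer.\nFinal Answer: 60 miles per hour",
])


def scripted_llm(prompt: str) -> str:
    """Stand-in for a model: records the prompt and returns the next scripted reply."""
    prompts_seen.append(prompt)
    return next(replies)


answer = react_loop("What is the train's average speed?", scripted_llm, {"divide": divide})
print(answer)
```

The second prompt the "model" receives contains the first thought, the action, and the observation, which is what lets it conclude with a final answer.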

5. LangChain: A Closer Look

LangChain provides a robust architecture for building AI applications. Let’s dive deeper into its core concepts:

Models: At the heart of LangChain are the language models. These can be:
· Large Language Models (LLMs): Models that take a string prompt as input and return a string completion as output.
· Chat Models: Models that take a list of chat messages as input and return a chat message as output.

Prompts: Prompts are the inputs to models. LangChain provides several utilities for working with prompts:
· PromptTemplates: For creating reproducible prompts with dynamic inputs.
· Example Selectors: For choosing relevant examples to include in prompts.
· Output Parsers: For structuring model outputs.

Chains: Chains are sequences of calls to components like models, prompts, or even other chains. They allow you to combine multiple steps into a single coherent workflow.

Agents: Agents use language models to determine which actions to take and in what order. They can use tools and manage multi-step tasks.

Tools: Tools are functions that agents can use to interact with the world or perform specific tasks.

Memory: Memory components allow chains and agents to retain information across multiple calls.

Indexes: Indexes are data structures used to organize documents or other data for efficient retrieval. They’re crucial for tasks like question-answering over specific document sets.

6. LangChain Libraries & Building Blocks

Let us look at the key LangChain libraries that implement these concepts.

langchain.tools & chain
In this example, we will create a web search tool that uses the Serper API to search the Internet. We will also use a summarizer tool and set these tools as a list, which the agent will use further.

The code below also uses a summarization chain:

# Imports used by this snippet
from typing import List
from langchain.agents import Tool
from langchain.chat_models import ChatOpenAI
from langchain.utilities import GoogleSerperAPIWrapper
from langchain.chains.summarize import load_summarize_chain
from langchain.text_splitter import CharacterTextSplitter
from langchain.docstore.document import Document

# Tools definition and other supporting methods
# Custom text splitter
class CustomTextSplitter(CharacterTextSplitter):
    def __init__(self, chunk_size=1000, chunk_overlap=20, **kwargs):
        self.chunk_size = chunk_size
        self.chunk_overlap = chunk_overlap
        super().__init__(chunk_size=chunk_size, chunk_overlap=chunk_overlap, **kwargs)

    def split_text(self, text: str) -> List[str]:
        # Simple word-based splitting
        words = text.split()
        chunks = []
        current_chunk = []
        current_chunk_length = 0
        for word in words:
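            # Flush only a non-empty chunk, so an over-long first word
            # cannot produce an empty chunk
            if current_chunk and current_chunk_length + len(word) > self.chunk_size: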
                chunks.append(" ".join(current_chunk))
                current_chunk = []
                current_chunk_length = 0
            current_chunk.append(word)
            current_chunk_length += len(word) + 1  # +1 for space
        if current_chunk:
            chunks.append(" ".join(current_chunk))
        return chunks

# Set up the LLM
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

# Set up Serper for web search
search = GoogleSerperAPIWrapper()

# Set up text splitter for summarization
text_splitter = CustomTextSplitter(chunk_size=1000, chunk_overlap=20)

# Set up summarization chain
summarize_chain = load_summarize_chain(llm, chain_type="map_reduce")

def web_search(query):
    return search.run(query)

def summarize(text):
    docs = [Document(page_content=chunk) for chunk in text_splitter.split_text(text)]
    summary = summarize_chain.invoke(docs)
    return summary["output_text"]

tools = [
    Tool(
        name="Web Search",
        func=web_search,
        description="Useful for searching the web for current information on a topic."
    ),
    Tool(
        name="Summarizer",
        func=summarize,
        description="Useful for summarizing long pieces of text."
    )
]
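The word-based chunking idea used by the custom splitter can be tried on its own, without LangChain. Below is a standalone sketch: `split_words` is a hypothetical helper, it includes a guard so that an over-long first word does not produce an empty chunk, and it does not apply any overlap between chunks.

```python
from typing import List


def split_words(text: str, chunk_size: int = 1000) -> List[str]:
    """Greedily pack whole words into chunks of roughly chunk_size characters."""
    chunks: List[str] = []
    current: List[str] = []
    current_length = 0
    for word in text.split():
        # Start a new chunk when adding this word would exceed the limit
        if current and current_length + len(word) > chunk_size:
            chunks.append(" ".join(current))
            current, current_length = [], 0
        current.append(word)
        current_length += len(word) + 1  # +1 for the joining space
    if current:
        chunks.append(" ".join(current))
    return chunks


print(split_words("aaa bbb ccc ddd", chunk_size=7))       # ['aaa bbb', 'ccc ddd']
print(split_words("supercalifragilistic", chunk_size=5))  # ['supercalifragilistic']
```

A single word longer than `chunk_size` is kept whole rather than cut, so chunks can exceed the limit in that case. For summarization that is usually preferable to splitting mid-word.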

langchain.agents
We already touched upon this part in the previous post of the series:

# Other imports
from langchain.agents import AgentExecutor, create_react_agent, Tool

# Other setup 
# Construct the ReAct agent
agent = create_react_agent(llm, tools, prompt)

langchain.memory
Although our current example doesn’t explicitly use memory, it’s crucial for many agent applications.
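To give a feel for what a memory component does, here is a minimal sketch of a conversation buffer in plain Python. It is not LangChain's class: `SimpleBufferMemory`, `remember`, `as_prompt_text`, and `build_prompt` are hypothetical names chosen for illustration.

```python
from typing import List, Tuple


class SimpleBufferMemory:
    """Hypothetical, simplified conversation buffer (not LangChain's class)."""

    def __init__(self) -> None:
        self.turns: List[Tuple[str, str]] = []

    def remember(self, user_input: str, agent_output: str) -> None:
        """Store one question/answer exchange."""
        self.turns.append((user_input, agent_output))

    def as_prompt_text(self) -> str:
        """Render the stored exchanges as text that can be placed in a prompt."""
        return "\n".join(f"Human: {u}\nAI: {a}" for u, a in self.turns)


def build_prompt(memory: SimpleBufferMemory, question: str) -> str:
    """Prepend the conversation so far to the new question."""
    return f"Previous conversation:\n{memory.as_prompt_text()}\n\nQuestion: {question}"


memory = SimpleBufferMemory()
memory.remember("What is LangChain?", "A framework for building LLM applications.")
prompt = build_prompt(memory, "Who maintains it?")
print(prompt)
```

The point is that "memory" here is just stored text that is injected into the next prompt. Without it, the model could not resolve the "it" in the follow-up question.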

7. Plugging the Pieces Together
This section contains the complete code, with all the previous snippets integrated. As before, we expose an API endpoint to query the agent.

import os
from dotenv import load_dotenv
from flask import Flask, request, jsonify
from langchain.agents import initialize_agent, Tool
from langchain.agents import AgentType

from langchain.chains import LLMChain
from langchain.prompts import PromptTemplate
from langchain.text_splitter import CharacterTextSplitter
from langchain.schema import Document
import requests
import json

from langchain.chat_models import ChatOpenAI

# Load environment variables
load_dotenv()

app = Flask(__name__)
SERPER_API_KEY = os.getenv('SERPER_API_KEY')
OPENAI_API_KEY = os.getenv('OPENAI_API_KEY')

# Custom text splitter for summarization
def custom_text_splitter(text, chunk_size=1000, chunk_overlap=20):
    splitter = CharacterTextSplitter(chunk_size=chunk_size, chunk_overlap=chunk_overlap)
    return splitter.split_text(text)

def summarize(text):
    chunks = custom_text_splitter(text)
    docs = [Document(page_content=chunk) for chunk in chunks]
    
    prompt_template = """Write a concise summary of the following text:
    "{text}"
    CONCISE SUMMARY:"""
    prompt = PromptTemplate(template=prompt_template, input_variables=["text"])
    
    llm = ChatOpenAI(temperature=0.7)
    summarize_chain = LLMChain(llm=llm, prompt=prompt)
    
    summaries = []
    for doc in docs:
        summary = summarize_chain.run(doc.page_content)
        summaries.append(summary)
    
    return " ".join(summaries)


def web_search(query: str) -> str:
    url = "https://google.serper.dev/search"
    payload = json.dumps({"q": query})
    headers = {
        'X-API-KEY': SERPER_API_KEY,
        'Content-Type': 'application/json'
    }
    response = requests.request("POST", url, headers=headers, data=payload)
    
    if response.status_code == 200:
        results = response.json()
        formatted_results = []
        for item in results.get('organic', [])[:3]:
            title = item.get('title', 'No title')
            snippet = item.get('snippet', 'No snippet')
            link = item.get('link', 'No link')
            formatted_results.append(f"Title: {title}\nSnippet: {snippet}\nLink: {link}\n")
        return "\n".join(formatted_results)
    else:
        return f"Error in web search: {response.status_code} - {response.text}"
    
tools = [
    Tool(
        name="Web Search",
        func=web_search,
        description="Search the web for current information.",
    ),
    Tool(
        name="Summarizer",
        func=summarize,
        description="Summarize long pieces of text.",
    )
]


# Set up the LLM
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

# Initialize the agent
agent = initialize_agent(
    tools, llm, agent=AgentType.ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

@app.route("/query", methods=["POST"])
def query_agent():
    data = request.json
    if "question" not in data:
        return jsonify({"error": "No question provided"}), 400
    
    question = data["question"]
    try:
        response = agent.run(question)
        return jsonify({"response": response})
    except Exception as e:
        return jsonify({"error": str(e)}), 500

if __name__ == "__main__":
    app.run(debug=True)

Run the application:

This will start the Flask server on http://localhost:5000

To query the agent, send a POST request to the /query endpoint:

curl -X POST -H "Content-Type: application/json" -d '{"question": "What are the latest developments in AI?"}' http://localhost:5000/query

8. Conclusion and Next Steps

In this article, we’ve explored more concepts around the agentic framework and created a more complex agent using LangChain and OpenAI. We’ve seen how to create flexible, extensible systems that can use multiple tools. We’ve also introduced the ReAct framework, setting the stage for a deeper exploration in our next blog.

Stay tuned for Part 3 of our series, where we’ll explore how the ReAct framework and Chain of Thought reasoning enable agents to tackle even more complex problems.

Additional Notes
Before you run the code, get an API key from https://serper.dev/. Keep your API keys in the environment file, and don’t forget to install the dependencies using requirements.txt. The complete working code can be found in my GitHub repo.
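
As a small sketch of the "keep your API keys in the environment file" advice, the helper below fails fast when a key is missing instead of failing later inside a tool call. The helper name require_env and the variable name SERPER_API_KEY are assumptions for illustration; OPENAI_API_KEY is the variable the OpenAI client reads by default. It assumes load_dotenv() has already copied the .env values into the process environment.

```python
import os


def require_env(name, environ=None):
    """Return the value of an environment variable or raise a clear error."""
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


def load_api_keys(environ=None):
    """Collect the keys the agent needs; the Serper variable name is hypothetical."""
    return {
        "openai": require_env("OPENAI_API_KEY", environ),
        "serper": require_env("SERPER_API_KEY", environ),
    }
```

Calling load_api_keys() once at startup, right after load_dotenv(), makes a missing key show up as one readable error message.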

Happy Learning!!

August 24, 2024 (updated August 26, 2024) Madhukar Chaubey Artificial Intelligence, Gen AI

Building Intelligent Agents with LangChain and OpenAI — Part 1

“According to the Gartner Report, by 2028, one-third of interactions with generative AI (GenAI) services will use action models and autonomous agents for task completion.”

In the rapidly changing world of Artificial Intelligence, the development of autonomous agents has taken a significant leap forward. These AI agents, equipped with advanced tools and supported by large language models (LLMs) like those from OpenAI, can autonomously perform complex, multi-step tasks with minimal or no human intervention.

Industries such as finance, retail, e-commerce, and many more have already taken the plunge and are using this technology in different use cases. According to a Forrester report, generative AI and autonomous workplace assistants are among the top emerging technologies poised to deliver significant ROI and transform business processes across various sectors (Forrester).

LangChain, an open-source framework that has emerged as the de facto standard for building generative AI applications, offers powerful libraries and a toolkit for developing such AI agents. Many AI applications and platforms already leverage it to implement sophisticated use cases, as it helps in harnessing the full potential of multiple language models.

In this series, I’ll share my journey and learnings as I explore how to build AI agents using LangChain. I will start with a simple example and, in subsequent posts, gradually take it to a level where we create multiple agents performing complex tasks.

What are AI Agents?

AI agents are software programs that can analyze information, make decisions, and take actions to achieve specific tasks or goals. These agents are context-aware, capable of learning from and adapting to input from a human in the loop, and can handle sophisticated tasks.

LangChain and OpenAI

LangChain simplifies the development of applications that leverage the power of large language models (LLMs). By using the comprehensive toolkit of libraries and frameworks it provides, developers can create a wide range of natural language processing applications, including chatbots, text summarizers, question-answering systems, sentiment analysis tools, and many more.

OpenAI, known as the company behind ChatGPT, has become a go-to AI platform for both technical and non-technical users, even for everyday tasks. They have developed advanced language models like GPT-3.5, GPT-4, DALL-E and Whisper, which can process and generate text, images, audio etc. These models are ideal for tasks such as answering questions, summarizing information, and creating original content.

Building The First Agent

To understand any concept and dive deeper, it is always better to start with a simple example. In this example, I will create a simple calculator agent that performs basic mathematical calculations. To calculate, this agent will be equipped with a very basic tool. The agent leverages OpenAI to comprehend the questions asked and create the response.

One more thing: Even though this is a basic example, I have exposed the agent’s tasks through a Flask API. The core agentic component can also be executed as standalone Python scripts. The rationale behind an API-first approach will become clear as I go deeper and share examples with multiple agents collaborating to achieve complex goals.

So let us jump into the example:

Required Dependencies
We will need the following dependencies for the example:

import os
from dotenv import load_dotenv
from flask import Flask, request, jsonify
from langchain.agents import initialize_agent, Tool
from langchain.agents import AgentType
from langchain.chat_models import ChatOpenAI
from langchain.callbacks import StdOutCallbackHandler

Install these dependencies by running the following pip command:

pip install flask python-dotenv langchain openai

Basic Building Blocks
Step 1: Creating the Calculator Tool
As mentioned earlier, for the agent to perform mathematical calculations, we will create a simple tool in Python that does the calculation. LangChain has a concept of tools, which an AI agent can leverage to perform tasks. Below is the code:

def calculator(expression):
    # Note: eval() executes arbitrary Python, so only use it with trusted input
    try:
        return str(eval(expression))
    except Exception:
        return "Error: Invalid expression"

tools = [
    Tool(
        name="Calculator",
        func=calculator,
        description="useful for when you need to perform mathematical calculations",
    )
]

As you can see, the calculator is nothing but a Python function, wrapped in a Tool with a name and a description, and added to the tools list.
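
Because the calculator above passes the model's output to eval, it will run any Python expression it is given. As a hedged alternative, here is a minimal sketch that parses the expression with the standard ast module and allows only basic arithmetic. The name safe_calculator and the chosen set of operators are my assumptions, not part of the original code. It can be passed as func in the Tool definition in place of calculator.

```python
import ast
import operator

# Arithmetic operators the tool is willing to evaluate
_BINARY_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}


def _evaluate(node):
    """Recursively evaluate a parsed expression, rejecting anything but arithmetic."""
    if (
        isinstance(node, ast.Constant)
        and isinstance(node.value, (int, float))
        and not isinstance(node.value, bool)
    ):
        return node.value
    if isinstance(node, ast.BinOp) and type(node.op) in _BINARY_OPS:
        return _BINARY_OPS[type(node.op)](_evaluate(node.left), _evaluate(node.right))
    if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
        return _UNARY_OPS[type(node.op)](_evaluate(node.operand))
    raise ValueError("Unsupported expression")


def safe_calculator(expression):
    """Evaluate a basic arithmetic expression and return the result as a string."""
    try:
        tree = ast.parse(expression, mode="eval")
        return str(_evaluate(tree.body))
    except Exception:
        return "Error: Invalid expression"
```

Function calls, attribute access, and names are rejected, so an input such as __import__('os') returns the error string instead of being executed.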

Step 2: Initializing the language model
While LangChain provides the framework for building our agent, OpenAI’s language model serves as the agent’s “brain,” powering its language understanding and generation capabilities.

llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

Step 3: Initializing the agent
The code below initializes the agent with the tools and the LLM created above:

agent = initialize_agent(
    tools, llm, agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

In the above code block, the following are set while initializing the agent:
a) tools: The list of Tool objects the agent can use.
b) llm: The language model the agent will use for understanding and generating text.
c) agent: Specifies the type of agent to use. Here I am using CHAT_ZERO_SHOT_REACT_DESCRIPTION for simple chat interaction, as this agent doesn’t require examples to understand how to use tools.
d) verbose: This is set to True, which provides detailed output of the agent’s thought process.
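
To make the "doesn’t require examples" point concrete: a zero-shot agent relies only on each tool's name and description, which are written into the prompt sent to the LLM. The sketch below shows the idea in plain Python. The prompt wording and the names SimpleTool and render_tool_prompt are my own approximation for illustration, not LangChain's actual template or API.

```python
from collections import namedtuple

# Lightweight stand-in for a tool: only the fields that end up in the prompt
SimpleTool = namedtuple("SimpleTool", ["name", "description"])


def render_tool_prompt(tools, question):
    """Build a ReAct-style prompt that lists the tools by name and description."""
    tool_lines = "\n".join(f"{t.name}: {t.description}" for t in tools)
    tool_names = ", ".join(t.name for t in tools)
    return (
        "Answer the following question. You have access to these tools:\n"
        f"{tool_lines}\n\n"
        "Use this format:\n"
        "Thought: reason about what to do next\n"
        f"Action: one of [{tool_names}]\n"
        "Action Input: the input to the tool\n"
        "Observation: the tool result\n"
        "Final Answer: the answer to the question\n\n"
        f"Question: {question}\n"
    )


example_tools = [
    SimpleTool(
        "Calculator",
        "useful for when you need to perform mathematical calculations",
    )
]
example_prompt = render_tool_prompt(example_tools, "What is 12 times 7?")
```

This is why a clear, specific description matters: it is the only information the model has when deciding whether a tool applies to a question.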

Step 4: Executing the Agent
The agent is run by calling the agent.run method. When it is called, the following steps are performed internally:
a) The agent receives the question and analyzes it using the language model.
b) It determines if it needs to use any tools to answer the question.
c) If needed, it uses one or more tools.
d) It formulates a response based on the tool outputs and its understanding of the question.
e) If verbose is True, it outputs its thought process.
f) Finally, it returns the response.
This process can iterate multiple times if the agent determines it needs more information or needs to use additional tools to complete the task.
Below is the code that takes the query and runs the agent:

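def query_agent():
    # Parse the JSON body; fall back to an empty dict if it is missing or invalid
    data = request.get_json(silent=True) or {}
    if "question" not in data:
        return jsonify({"error": "No question provided"}), 400

    question = data["question"]
    try:
        # Let the agent reason over the question and call tools as needed
        response = agent.run(question)
        return jsonify({"response": response})
    except Exception as e:
        return jsonify({"error": str(e)}), 500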
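
The internal steps (a) to (f) listed in Step 4 can be illustrated with a toy loop in plain Python. This is only a sketch of the idea, not LangChain's implementation: scripted_llm is a hard-coded stand-in for the real model, and run_agent, the reply format, and the tiny calculator are assumptions made for illustration.

```python
import operator
import re

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def calculator(expression):
    """Toy tool: evaluates a single 'number operator number' expression."""
    match = re.fullmatch(
        r"\s*(-?\d+(?:\.\d+)?)\s*([-+*/])\s*(-?\d+(?:\.\d+)?)\s*", expression
    )
    if not match:
        return "Error: Invalid expression"
    left, op, right = match.groups()
    try:
        result = OPS[op](float(left), float(right))
    except ZeroDivisionError:
        return "Error: Invalid expression"
    return str(int(result)) if result == int(result) else str(result)


def scripted_llm(transcript):
    """Stand-in for a real LLM: replies based on what the transcript already holds."""
    if "Observation:" not in transcript:
        return "Thought: I need to calculate this.\nAction: Calculator\nAction Input: 12 * 7"
    observation = transcript.rsplit("Observation:", 1)[1].strip()
    return f"Thought: I now know the answer.\nFinal Answer: {observation}"


def run_agent(question, llm, tools, max_steps=5, verbose=False):
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = llm(transcript)  # (a), (b): the model analyzes and decides on a tool
        if verbose:
            print(reply)  # (e): show the thought process
        transcript += reply + "\n"
        final = re.search(r"Final Answer:\s*(.*)", reply)
        if final:
            return final.group(1).strip()  # (f): return the response
        action = re.search(r"Action:\s*(.*)\nAction Input:\s*(.*)", reply)
        if not action:
            return "Error: could not parse the model reply"
        name, tool_input = action.group(1).strip(), action.group(2).strip()
        tool = tools.get(name)
        observation = tool(tool_input) if tool else f"Error: unknown tool {name}"
        transcript += f"Observation: {observation}\n"  # (c), (d): feed the tool output back
    return "Error: step limit reached"


answer = run_agent("What is 12 times 7?", scripted_llm, {"Calculator": calculator})
```

Each pass through the loop is one iteration of the process described above: the tool's output is appended to the transcript, and the model sees it on the next call.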

Complete Code
Below is the full code of the agent which does the mathematical calculation:

import os
from dotenv import load_dotenv
from flask import Flask, request, jsonify
from langchain.agents import initialize_agent, Tool
from langchain.agents import AgentType
from langchain.chat_models import ChatOpenAI
from langchain.callbacks import StdOutCallbackHandler

# Load environment variables
load_dotenv()

app = Flask(__name__)

# Set up a simple calculator tool
def calculator(expression):
    # Note: eval() executes arbitrary Python, so only use it with trusted input
    try:
        return str(eval(expression))
    except Exception:
        return "Error: Invalid expression"

tools = [
    Tool(
        name="Calculator",
        func=calculator,
        description="useful for when you need to perform mathematical calculations",
    )
]

# Set up the LLM
llm = ChatOpenAI(model_name="gpt-3.5-turbo", temperature=0)

# Initialize the agent
agent = initialize_agent(
    tools, llm, agent=AgentType.CHAT_ZERO_SHOT_REACT_DESCRIPTION, verbose=True
)

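@app.route("/query", methods=["POST"])
def query_agent():
    # Parse the JSON body; fall back to an empty dict if it is missing or invalid
    data = request.get_json(silent=True) or {}
    if "question" not in data:
        return jsonify({"error": "No question provided"}), 400

    question = data["question"]
    try:
        # Let the agent reason over the question and call tools as needed
        response = agent.run(question)
        return jsonify({"response": response})
    except Exception as e:
        return jsonify({"error": str(e)}), 500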

if __name__ == "__main__":
    app.run(debug=True)

Conclusion & Next Steps
This is just a basic agent built using LangChain and OpenAI. When you query the Agent with a basic question to perform mathematical operations, it will respond with an answer.
In the next few posts, I will share how to create an agent with a slightly more complex tool and expand it further to more than one agent.

Additional Notes
The code uses specific versions of LangChain and other dependencies. When running the code, it is always better to create a Python virtual environment and install the dependencies using requirements.txt. The complete working code can be found in my GitHub repo.

Happy Learning!!

March 31, 2021 Madhukar Chaubey Religion, Yoga & Philosophy

Pranava

एकाक्षराय रुद्राय अकारायात्मरूपिणे ⁠। 
उकारायादिदेवाय विद्यादेहाय वै नमः ⁠।⁠।⁠ १ 
तृतीयाय मकाराय शिवाय परमात्मने ⁠। 
सूर्याग्निसोमवर्णाय यजमानाय वै नमः ⁠।⁠।⁠ २
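Salutations to Rudra, who is without a second and imperishable, in the form of the Pranava. Salutations to the Supreme Self in the form of the letter A, and to the primordial deity in the form of the letter U, whose body is knowledge. (1)
Salutations to the Supreme Self Shiva in the form of the third letter M, and to Rudra of the color of the sun, fire, and moon, Mahadeva in the form of the sacrificer (yajamana). (2)
-- Shri Linga Mahapurana (Chapter 18)
The greatness of Omkara has been expressed through different symbols in different places.
Pranava is the support, Pranava is the means, Pranava is the goal.
Pranava is Brahman, Pranava is the knowledge of Brahman. Pranava is Brahma, Pranava is Vishnu, Pranava is Mahesh.
The Vedas come from Pranava; Pranava is the knowledge of the Vedas.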
🙏🙏

March 31, 2021 Madhukar Chaubey Poetry

The Farmer

One bigha of land, a small family,
Carrying last year's debt,
Ask the worried, troubled farmer
What is to become of the law.
The farmer of Bihar, robbed by floods in the rains,
The farmer of the west, sitting and
Waiting for the rain,
Ask them how the new bill
Should be opposed.
Ask them what the bill should be like.
The farmer selling his field for his daughter's wedding,
What would Rihanna and Greta understand of him?
What would they understand of the worth of his sweat,
People like us, who in air-conditioned rooms
Get everything done by ordering online?
We have not read the law, have not seen the face of a village,
Yet, marching left and right, we surely hold opinions.
Today, far from reality, everyone is just a pawn;
The move is someone else's, but the one who loses is another.
                              (Bhramar)

March 31, 2021 Madhukar Chaubey Poetry

The Glory of Shiva

You are Brahma, you are Vishnu,
You indeed are the seedless (nirbija) Mahadeva.
You are of rajas, and of tamas too,
The white-hued one of sattva is you.
You are creation; destruction comes from you,
The universe is yours; its creatures come from you.
The Rig, Yajur, and Sama Vedas come from you,
You are the Vedas; yoga comes from you.
Form of Brahman, form of bliss,
You are the form of renunciation, the form of well-being.
You are the perfected one, you are the all-knowing,
You are Rudradeva, the form of liberation.
You are the guru, you are the father,
Ardhanarishvara, you are the mother.
You are the sounds A, U, and M,
The embodiment of Omkara, you are the Pranava.
To Brahma in the form of a white swan,
To mighty Vishnu in the form of a boar,
Through the sound of Omkara, in the form of the linga,
The one who gave knowledge, Maheshvara, is you.
You are the ice linga, you are the upright linga,
You are the Shivalinga pervading the sky.
The beginning is from you, the end is from you,
God of gods, the primordial god is you.
           🙏🙏

March 31, 2021 Madhukar Chaubey Personal, Poetry

Being and Not Being

What difference will it make to anyone whether I am here or not?
Everyone will remember for a while, then it will all fade.
What did I bring when I came? Only the threads of a few relationships;
Time added a few more, but all will mingle with the ashes.
Whom I counted as my own, with whom my ties were broken,
Why weep over it? Everything will end right here.
Heaven is here, hell is here, the account of deeds will be settled here;
However hard I may try, what is to happen will happen.
If I speak from the heart, I have only ever wished well;
I never thought of fame or disgrace; whatever comes, we shall see.
Do not weep when I go; laugh a little more instead,
For weeping will hardly bring Bhramar back.
I tell the truth: all is illusion, this whole life is false;
Death alone is the true event, and it will come in its time.
It must come in its own time; whoever has gone will not return,
Whoever has gone will not return.
                                      --------     (Bhramar)

March 31, 2021 Madhukar Chaubey Religion

Worship of Shiva in the Linga Form

The Shri Linga Mahapurana (Chapter 17) describes Umapati Maheshvara appearing before Shri Brahma and Shri Vishnu in the form of a radiant linga. When Brahma and Vishnu, out of ego, began to fight, each claiming to be the sustainer, destroyer, and creator of the universe, Mahadeva appeared as a blazing linga whose beginning and end could not be seen. Both Brahma and Vishnu were captivated on seeing it and then set out to learn what it was.
Brahma took the form of a white swan and headed upward, and Vishnu took the form of a huge boar and went downward. But neither found the base or the end of that pillar of fire.
When both returned exhausted and stood with folded hands, the sound of Omkara was heard. They then had a vision of Lord Rudradeva Shiva, free of beginning, middle, and end, in the form of bliss, appearing as pure crystal. After that they received the knowledge of the Vedas, the greatness of the Pranava, and the knowledge of Brahman.
At that time Mahadeva gave Lord Vishnu five mantras, which are as follows:
1) The sacred Ishana mantra, born of Omkara and white in color:
"ईशानः सर्वविद्यानामीश्वरः सर्वभूतानां ब्रह्माधिपतिर्ब्रह्मणोऽधिपतिर्ब्रह्मा शिवो मे अस्तु सदाशिवोम् ⁠।⁠।"
2) The most excellent mantra, born of the Gayatri and green in color:
"तत्पुरुषाय विद्महे महादेवाय धीमहि ⁠। तन्नो रुद्रः प्रचोदयात् ⁠।⁠।"
3) The Aghora mantra, born of the Atharvaveda and black in color:
"अघोरेभ्योऽथ घोरेभ्यो घोरघोरतरेभ्यः ⁠। सर्वेभ्यः सर्वशर्वेभ्यो नमस्ते अस्तु रुद्ररूपेभ्यः ⁠।⁠।"
4) The Sadyojata mantra, born of the Yajurveda and white in color:
"सद्योजातं प्रपद्यामि सद्योजाताय वै नमोनमः ⁠। भवे भवे नाति भवे भवस्व मां भवोद्भवाय नमः ⁠।⁠।"
5) The excellent mantra, born of the Samaveda and red in color:
"वामदेवाय नमो ज्येष्ठाय नमः श्रेष्ठाय नमो रुद्राय नमः कालाय नमः कलविकरणाय नमो बलविकरणाय नमो बलाय नमो बलप्रमथनाय नमः सर्वभूतदमनाय नमो मनोन्मनाय नमः ⁠।⁠।"
With these five mantras, Lord Vishnu then sang the praise of the ancient Purusha, Mahadeva, Brahmadhipati Shiva.
                               🙏🙏

March 31, 2021 Madhukar Chaubey Uncategorized

Shiva and ॐ (Om)

                          ॐ
According to the Shri Shiva Mahapurana, Vidyeshvara Samhita:
It should be known to all that when Brahma and Vishnu, out of ego, began to battle each other, Mahadeva Sadashiva manifested in the form of the linga. The day Mahadeva descended in linga form is known as Shivaratri.
Mahadeva, imparting knowledge to Vishnu and Brahma, goes on to say: I alone am the supreme Brahman. I exist in both forms, saguna (with attributes) and nirguna (without attributes). I am worthy of worship both in the nishkala (formless) linga form and in the sakala (manifest) form bearing five faces.
From my northern face arose the letter A, from the western face the letter U, from the southern face the letter M, from the eastern face the bindu, and from the central face the nada. Together these five parts formed the single syllable Pranava, 'ॐ'. The Pranava mantra signifies both Shiva and Shakti. From this Pranava mantra arose the five-syllable (panchakshara) mantra, "ॐ नमः शिवाय" (Om Namah Shivaya). From it came the Shiromantra, and from the four faces the Gayatri was revealed. From the Gayatri the Vedas were revealed. From the Vedas came tens of millions of mantras. Mantras accomplish all tasks, but this Pranava and the panchakshara mantra fulfil every desire. This root mantra brings both worldly enjoyment and liberation.
Mahadeva says that the Pranava mantra is his very own form. Its constant repetition brings liberation. Although Mahadeva has both nishkala and sakala forms, a seeker of liberation should worship the linga. Mahadeva in linga form should be worshipped with the Omkara mantra, and Mahadeva in sakala form with the panchakshara mantra.
                         ॐ ॐ ॐ ॐ ॐ

March 31, 2021 Madhukar Chaubey Uncategorized

Shiva

अमूर्ते यत्पराख्यं वै तस्य मूर्तिस्सदाशिवः ।
अर्वाचीनाः पराचीना ईश्वरं तं जगुर्बुधाः ॥
The formless supreme Brahman has Lord Sadashiva as its form. Scholars both modern and ancient call him Ishvara.
