In the past two to three years, we’ve witnessed remarkable progress in AI, particularly in large language models, diffusion models, and multimodal systems. One area I find especially interesting is agentic workflows. Early this year, Andrew Ng, co-founder of Coursera and a pioneer in deep learning, tweeted that “Agentic workflows will drive massive AI progress this year.” Since then, we’ve seen rapid development in the field of agents, with many people building autonomous agents, multi-agent architectures, and more.
In this article, we’ll dive deep into the implementation of a REAcT Agent, a powerful approach in agentic workflows. We’ll explore what REAcT prompting is, why it’s useful, and how to implement it using LlamaIndex and the Gemini LLM.
Learning Objectives
- Understand REAcT prompting and its role in building more capable AI agents that can reason, act, and think through complex tasks.
- See an example of how a REAcT prompt is written.
- Implement a REAcT Agent within the LlamaIndex framework, which lets the agent incorporate feedback from its tool calls.
- Explore the capabilities of REAcT Agents, particularly their ability to break down tasks, use various tools, and expose their reasoning transparently.
What is REAcT Prompting?
REAcT stands for Reasoning, Acting, and Thinking. It’s a prompting technique that enables large language models (LLMs) to break down complex tasks into a series of thought processes, actions, and observations.
REAcT prompting is very straightforward and follows a three-step process:
- Think: Based on the user query, the agent formulates thoughts about how to approach the problem.
- Act: The agent performs certain actions based on its thoughts, utilizing available tools or APIs. For example, you can use Google Search, File loading, and so on.
- Observe: The agent analyzes the results of its actions and decides whether to continue the process or provide a final answer.
This process runs in a loop until a satisfactory result is achieved or a maximum number of iterations is reached.
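To make the loop concrete, here is a minimal, framework-agnostic sketch of the Think-Act-Observe cycle in Python. The helper names (llm_generate, parse_action, run_tool) are hypothetical placeholders, not part of any specific library:

MAX_ITERATIONS = 5

def react_loop(query: str) -> str:
    # Running transcript of thoughts, actions, and observations.
    transcript = f"Question: {query}\n"
    for _ in range(MAX_ITERATIONS):
        # Think: ask the LLM for the next thought and action given the transcript so far.
        step = llm_generate(transcript)            # hypothetical LLM call
        transcript += step
        if "Final Answer:" in step:
            return step.split("Final Answer:")[-1].strip()
        # Act: execute the requested tool (e.g. a web search) with its input.
        action, action_input = parse_action(step)  # hypothetical parser
        observation = run_tool(action, action_input)
        # Observe: feed the result back so the next thought can build on it.
        transcript += f"\nObservation: {observation}\n"
    return "No final answer reached within the iteration limit."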
Example of REAcT Prompt
Let’s look at a typical structure of a REAcT prompt:
Thought: [The agent's reasoning about the task]
Action: [The action to be taken, e.g., "Search"]
Action Input: [The input for the action, e.g., a search query]
Observation: [The result of the action]
Thought: [Further reasoning based on the observation]
...
[This process repeats until a final answer is reached]
Final Answer: [The agent's conclusion]
Let’s walk through an example of how a REAcT Agent might process a query about a recent sporting event. This example demonstrates the agent’s thought process, actions, and observations.
User Query: “Who was the man of the series in the recent India vs England Test series, and what were their key performances?”
Output: (generated using crewAI)
Thought: To answer this query accurately, I need to find information about the recent India vs England Test series, identify the man of the series, and gather details about their performances. I'll start by searching for the series results.
Action: Search
Action Input: "India vs England Test series 2024 results"
Observation: The India vs England Test series concluded recently with India winning the series 4-1. The series was played in India from January to March 2024.
Thought: Now that I have confirmed the series details, I need to find out who was named the man of the series.
Action: Search
Action Input: "India vs England Test series 2024 man of the series"
Observation: Yashasvi Jaiswal was named the man of the series for his outstanding batting performances throughout the series.
Thought: I've identified the man of the series. Now, I need to gather information about Yashasvi Jaiswal's key performances in this series.
Action: Search
Action Input: "Yashasvi Jaiswal performance India vs England Test series 2024"
Observation: Yashasvi Jaiswal had an exceptional series:
- He scored a total of 712 runs in the series.
- His scores included two double centuries (209 and 214*).
- He hit the most sixes (26) in a Test series by any batter.
- Jaiswal maintained an impressive average of 89 throughout the series.
Thought: I now have all the necessary information to provide a comprehensive answer about the man of the series and his key performances.
Final Answer: The man of the series in the recent India vs England Test series (played from January to March 2024) was Yashasvi Jaiswal. India won the series 4-1, and Jaiswal's performances were crucial to this victory.
Key performances by Yashasvi Jaiswal in the series:
1. Total runs: Jaiswal amassed an impressive 712 runs across the five-match series.
2. Double centuries: He scored two double hundreds - 209 and 214* (not out).
3. Six-hitting record: Jaiswal hit 26 sixes, setting a new record for the most sixes in a Test series by any batter.
4. Batting average: He maintained an outstanding average of 89 throughout the series.
These performances demonstrate Jaiswal's consistency, ability to play long innings, and his aggressive style of play, especially evident in his six-hitting prowess. His contributions were pivotal in India's dominant performance in the series, justifying his selection as the man of the series.
Jaiswal's performance is particularly noteworthy given that this was one of his early Test series, marking him as a rising star in Indian cricket. His ability to perform at such a high level against a strong England team suggests a promising future in Test cricket for the young batsman.
Why is REAcT Prompting Useful for Building Agents?
A typical approach to obtaining results from LLMs is to write a well-structured prompt. However, it’s important to remember that LLMs lack inherent reasoning capabilities. Various methods have been proposed to enable LLMs to reason and plan, but many of them fall short. Techniques like Chain of Thought, Tree of Thoughts, and Self-Consistency CoT have shown promise, but none fully achieved robust reasoning. ReAct, by contrast, succeeds to a greater extent at producing logical, grounded plans than these earlier methods.
By breaking complex tasks down into a series of thoughts, actions, and observations, REAcT agents can tackle intricate problems with a level of transparency and adaptability that was previously hard to achieve. This methodology gives developers a more nuanced view of the agent’s decision-making process, making it easier to debug, refine, and optimize LLM responses.
Moreover, the iterative nature of REAcT prompting enables agents to handle uncertainty. As the agent progresses through multiple cycles of thinking, acting, and observing, it can adjust its approach based on new information, much like a human would when faced with a complex task. By grounding its decisions in concrete actions and observations, a REAcT agent can provide more reliable and contextually appropriate responses, thus significantly reducing the risk of hallucination.
Key Applications and Use Cases of REAcT Agents
We’ll explore the diverse applications and real-world use cases of REAcT Agents, highlighting their potential to transform industries through enhanced reasoning, decision-making, and adaptability in various contexts.
Real-time Sports Analysis and Prediction
Drawing on the vast amount of information available on the internet, ReAcT agents can provide analysis and predictions for the sports industry. An agent could process live match data, player statistics, and historical performance to deliver in-depth analysis and predictions. For example, during an IPL match, the agent could:
- Analyze player performance trends
- Predict optimal batting orders or bowling changes
- Suggest field placements based on batsman’s hitting zones
Automated Customer Support
Customer support demands accurate, helpful responses. A ReAcT agent is a good fit when an LLM needs to look up information and incorporate feedback before answering. Such an agent can:
- Understand complex customer queries
- Access relevant product information and troubleshooting guides
- Walk customers through step-by-step solutions
Personalized Learning for Students
Education is another field where ReAcT Agents could make a massive impact. Imagine a personalized AI tutor that can:
- Assess a student’s current knowledge level
- Break down complex topics into manageable chunks
- Adapt its teaching style based on the student’s responses
- Provide real-time feedback and suggest additional resources
In our code implementation, we will look at querying and analyzing real-time sports data.
Implementing a REAcT Agent Using LlamaIndex
Now, let’s get into the exciting part – implementing a REAcT Agent using LlamaIndex. The implementation is surprisingly straightforward and can be done in just a few lines of code.
Installation and Setup
Before we proceed with the code implementation, let’s install a few necessary libraries, including LlamaIndex. LlamaIndex is a framework that efficiently connects large language models to your data. For our action tool, we’ll be using DuckDuckGo Search, and Gemini will be the LLM we integrate into the code.
!pip install llama-index
!pip install duckduckgo-search
!pip install llama-index-llms-gemini
First, we need to import the necessary components. Since the ReAct agent needs to interact with external tools to fetch data, we use FunctionTool from LlamaIndex’s core tools. The logic is straightforward: whenever the agent needs real-world data, it calls a Python function that retrieves the required information. This is where DuckDuckGo Search comes into play, fetching the relevant context for the agent.
from llama_index.core.tools import FunctionTool
from duckduckgo_search import DDGS
from llama_index.llms.gemini import Gemini
Define Gemini LLM
In LlamaIndex, OpenAI is the default LLM; to use Gemini instead, we need to initialize it and register it in Settings. To use the Gemini LLM, get an API key from here: https://aistudio.google.com/
from llama_index.core import Settings
import os
GOOGLE_API_KEY = "" # add your API key here
os.environ["GOOGLE_API_KEY"] = GOOGLE_API_KEY
llm = Gemini()
Settings.llm = llm
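Optionally, you can confirm that the API key and model are wired up correctly with a one-off completion before building the agent (a quick sanity check, not part of the original walkthrough):

# Quick check that the Gemini LLM responds.
print(llm.complete("In one sentence, what is a ReAct agent?").text)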
Next, we define our search tool, DuckDuckGo Search. One important detail to remember is that you need to specify the data type of the input parameter when defining the FunctionTool for performing actions. For example, search(query: str) -> str ensures the query parameter is a string. Since DuckDuckGo returns the search results with additional metadata, we’ll extract only the body content from the results to streamline the response.
def search(query: str) -> str:
    """
    Args:
        query (str): user prompt

    Returns:
        context (str): search results for the user query
    """
    req = DDGS()
    response = req.text(query, max_results=4)
    context = ""
    for result in response:
        context += result["body"]
    return context
search_tool = FunctionTool.from_defaults(fn=search)
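It can also help to call the search function directly once to confirm that DuckDuckGo returns usable text before handing the tool to the agent (again, an optional check):

# Verify the tool works on its own before attaching it to the agent.
print(search("India vs England Test series 2024 man of the series")[:300])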
Writing a REAcT Agent with LlamaIndex
With the major components of the agent in place, we can now define the ReAct agent, using ReActAgent directly from LlamaIndex core. We set verbose=True to see what is happening behind the scenes, and allow_parallel_tool_calls=True so the agent can issue more than one tool call in a single step when that is useful, rather than strictly one action per reasoning cycle.
from llama_index.core.agent import ReActAgent

agent = ReActAgent.from_tools(
    [search_tool],
    llm=llm,
    verbose=True,
    allow_parallel_tool_calls=True,
)
That’s it! We’ve created our REAcT Agent. Now we can use it to answer queries by calling the agent.chat method.
template = """
You are an expert sports analysis reporter.
Understand the trends in Virat Kohli's performance in IPL 2024 and describe his strengths and weaknesses.
Also provide the total score of Virat Kohli in IPL 2024.
I also need the highest score of Virat Kohli in the same season.
"""
response = agent.chat(template)
print(response)
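With verbose=True, the Thought/Action/Observation steps are printed as the agent runs. If you also want the tool outputs programmatically, the returned chat response exposes the tool calls it made; the attributes below reflect the current AgentChatResponse interface and may differ across LlamaIndex versions:

# Inspect the tool calls the agent made and a snippet of what each returned.
for source in response.sources:
    print(source.tool_name, "->", source.content[:200])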
Conclusion
REAcT Agents represent a significant step forward in the field of AI and agentic workflows. By implementing a REAcT Agent using LlamaIndex, we’ve created a powerful tool that can reason, act, and think its way through real-time user queries.
Key Takeaways
- REAcT prompting represents a significant advancement in agentic workflows, offering a structured approach to complex reasoning with large language models.
- The implementation of REAcT Agents using LlamaIndex is surprisingly straightforward, requiring just a few lines of code to create powerful, adaptive AI systems.
- The iterative nature of REAcT prompting allows for dynamic problem-solving, enabling agents to adapt their approach based on intermediate results and new information.
- REAcT Agents significantly reduce the risk of hallucination, a common challenge in language models.
Frequently Asked Questions
Q1. How do REAcT Agents reduce hallucinations?
A. By grounding responses in concrete actions and observations, REAcT Agents reduce hallucinations. Instead of generating unsupported or inaccurate information, the agent performs actions (like searching for information) to verify its reasoning and adjusts its response based on real-world data.
Q2. Can I implement a ReAct agent using LangChain?
A. Yes, you can implement a ReAct agent using LangChain, and it is very straightforward as well. You first define the tools the agent can use, such as search functions, along with the LLM, and then create the agent from these tools. The agent then operates in an iterative loop of reasoning, acting, and observing until a satisfactory answer is reached.
Q3. Where are REAcT agents commonly used?
A. REAcT agents are commonly used in complex problem-solving environments such as customer support, research analysis, autonomous systems, and educational tools.