How to Build a SQL Agent with CrewAI and Composio?

Introduction

SQL is easily one of the most important languages in the computer world. It serves as the primary means for communicating with relational databases, where most organizations store crucial data. SQL plays a significant role including analyzing complex data, creating data pipelines, and efficiently managing data warehouses. However, writing optimized SQL queries can often be challenging and cumbersome. But thanks to the rapid progress in AI in the past few years, we now have AI agents augmented with Large Language Models capable of writing queries on our behalf.

This article demonstrates how to build an AI agent using CrewAI, Composio, and Gemini to access databases and execute SQL queries to retrieve data.

Learning Objectives

  • Understand what CrewAI is.
  • Learn about Composio tools and integrations.
  • Understand the workflow of the AI agent.
  • A step-by-step guide to building an SQL agent using Composio and CrewAI.

This article was published as a part of the Data Science Blogathon.

What is CrewAI?

CrewAI is an open-source collaborative multi-agent framework. It lets you build a crew of AI agents with various tasks, tools, roles, and motivations akin to a real-world crew. CrewAI manages the flow of information from one agent to another, letting you build autonomous efficient agentic workflows.

CrewAI mainly consists of five core features Agents, Tasks, Tools, Processes, and Tasks.

  • Agents: Agents are the fundamental unit of CrewAI and are responsible for decision-making, performing tasks, and communicating with other agents.
  • Tasks: These are the goals Agents are motivated to accomplish. A task can be accomplished by one or many agents.
  • Tools: Tools enable the Agents to interact with the external environment such as using a web scrapper to retrieve the latest news or a scheduler to schedule calendar events.
  • Process: The Process is responsible for managing tasks in CrewAI. It allocates tasks to agents in a defined order. The process can be sequential, hierarchical, or consensual. In a sequential process, one task follows another; in a hierarchical process, a managerial hierarchy dictates the order of tasks; and in a consensual process, agents perform tasks collaboratively.
  • Crews: Crews within CrewAI consist of collaborative agents equipped with tasks and tools, all working together to accomplish complex tasks.

Here is a mind-map CrewAI.

What is CrewAI?

Also Read: How to Build a Collaborative AI Agents With CrewAI?

What is Composio?

Composio is an open-source platform that provides tooling solutions for building reliable and useful AI agents. Composio provides over 150 tools and applications with built-in user authentication and authorization to help developers build reliable, secure, and production-ready agentic workflows. The tools have been designed from the ground up keeping real-world readiness of agents in mind.

Composio offers several advantages over other tooling solutions, including managed user authentication and authorizations, a wide array of tools and integrations, a dashboard for monitoring live integrations, and the flexibility to add custom tools

Composio has four key concepts.

  • Entities: In Composio, an “entity” is a container for all user or organization accounts and tools, allowing centralized management from a single dashboard.
  • Integrations: Integrations link your account with external apps, involving the setup of authentication mechanisms like OAuth and defining access permissions to control app actions. Once established, all users can access the same integration through their accounts.
  • Actions: are tasks performed by integrated tools, like sending a Slack message or scheduling a calendar event.
  • Triggers: Predefined conditions trigger webhooks to your agents when met, sending event details such as entities, message text, and more.

Designing an AI-Powered SQL Agent Workflow

Here, you will create an agentic system, which takes a user query regarding the data stored in a database, fetches the relevant data using an SQL agent, and creates nice plots to visualize the data. For this workflow, we shall use CrewAI to orchestrate agents and Composio for tooling support.

The magnetic system will have an SQL query writer agent and a coding agent to write and execute the queries. The SQL agent will have access to the SQL tool from Composio and the coding agent will be able to use the E2B’s Codeinterpreter via Composio. The Codeinterpreter provides a sandboxed environment for executing Python programs. 

The SQL agent will connect to a local database and query from an appropriate table. The data fetched from the SQL agent will be used by the Coding agent to create plots. The crew will return the plot as the final output.

Prerequisites for Building the SQL Agent

To run this project successfully, you will need the Composio API key. First, create an account on Composio and we will explain how to get API keys later. Also, get a free Gemini API key from Google AI studio, but remember the free account is rate-limited. The execution may take longer.

Furthermore, you will need a dummy database for executing queries. If you do not have a spare database, run the following code to create an employee table with names, departments, and salaries.

import sqlite3

# Connect to the SQLite database
connection = sqlite3.connect('company.db')

# Create a cursor object
cursor = connection.cursor()

# Create the 'employee' table
create_table_query = '''
CREATE TABLE IF NOT EXISTS employee (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    employee_name TEXT NOT NULL,
    department TEXT NOT NULL,
    salary INTEGER NOT NULL
);
'''
cursor.execute(create_table_query)

# Data to insert into the 'employee' table
employees = [
    ("John Doe", "Engineering", 70000),
    ("Jane Smith", "Human Resources", 65000),
    ("Alice Johnson", "Marketing", 72000),
    ("Bob Brown", "Sales", 68000),
    ("Charlie Black", "Engineering", 71000),
    ("Daisy White", "Human Resources", 66000),
    ("Edward Green", "Marketing", 69000),
    ("Fiona Grey", "Sales", 64000),
    ("George Yellow", "Engineering", 73000),
    ("Hannah Blue", "Human Resources", 61000),
    ("Ivan Purple", "Marketing", 75000),
    ("Jessica Cyan", "Sales", 70000),
    ("Kyle Red", "Engineering", 68000),
    ("Lily Orange", "Human Resources", 67000),
    ("Martin Indigo", "Marketing", 72000),
    ("Nina Teal", "Sales", 65000),
    ("Oscar Lime", "Engineering", 73000),
    ("Penny Olive", "Human Resources", 62000),
    ("Quentin Silver", "Marketing", 74000),
    ("Rachel Maroon", "Sales", 69000),
    ("Steve Pink", "Engineering", 71000),
    ("Tina Violet", "Human Resources", 68000),
    ("Ursula Gold", "Marketing", 76000),
    ("Victor Bronze", "Sales", 64000),
    ("Wendy Mauve", "Engineering", 69000),
    ("Xavier Cream", "Human Resources", 65000),
    ("Yolanda Peach", "Marketing", 70000),
    ("Zack Sage", "Sales", 68000),
    ("Abby Coral", "Engineering", 72000),
    ("Bill Moss", "Human Resources", 63000)
]

# Insert data into the 'employee' table
insert_query = 'INSERT INTO employee (employee_name, department, salary) VALUES (?, ?, ?);'
cursor.executemany(insert_query, employees)

# Commit the changes
connection.commit()

# Close the connection
connection.close()

print("Table created and data inserted successfully.")

Step-by-Step Guide to Creating an SQL Agent

Now, that the basics are covered, we can start with the coding part. As with any Python project, we will first set up a virtual environment and environment variables, and install libraries. The project will use Gemini 1.5 Pro as the language model.

Step1: Installing Libraries

Create a virtual environment using Python Venv.

python -m venv sqlagent
cd sqlagent
source bin/active

Install the following libraries using pip install .

composio-core
composio-crewai
langchain-google-genai
dotenv

Step2: Set Environment Variables

To use Composio toolsets, you need to authenticate your Composio account. Run the below command to log in to Composio and follow the login flow.

composio login

Now, get your Composio API keys.

Composio whoami

Create a .env file and add COMPOSIO_API_KEY and GOOGLE_API_KEY variables to it.

COMPOSIO_API_KEY=your Composio API key
GOOGLE_API_KEY=your Gemini API key

Step3: Define Tools and LLM

Now, create a Python file and import the necessary libraries.

import os

import dotenv
from composio_langchain import App, ComposioToolSet
from crewai import Agent, Crew, Process, Task
from langchain_google_genai import ChatGoogleGenerativeAI

# Load environment variables from the .env file
dotenv.load_dotenv()

Define Composio tools.

# Initialize the ComposioToolSet
toolset = ComposioToolSet(api_key=os.environ["COMPOSIO_API_KEY"])

code_interpreter_tools = toolset.get_tools([App.CODEINTERPRETER])
sql_tools = toolset.get_tools([App.SQLTOOL])

Now, define the LLM with Gemini 1.5 Pro.

llm = ChatGoogleGenerativeAI(model="gemini-1.5-pro", 
                             api_key=os.environ['GOOGLE_API_KEY']
                          )

Step4: Define Agents and Tasks

As discussed earlier, we will here define two agents and two tasks. The SQL agent, Coding agent, and their respective tasks.

code_interpreter_agent = Agent(
        role="Python Code Interpreter Agent",
        goal="Run a code to get achieve a task given by the user",
        backstory="You are an agent that helps users run Python code.",
        verbose=True,
        tools=code_interpreter_tools,
        llm=llm,
    )
    
sql_agent = Agent(
        role="SQL Agent",
        goal="Run SQL queries to get achieve a task given by the user",
        backstory=(
            "You are an agent that helps users run SQL queries. "
            "Connect to the local SQLite DB at connection string = company.db"
            "Try to analyze the tables first by listing all the tables and columns "
            "and doing distinct values for each column and once sure, make a query to \
            get the data you need."
        ),
        verbose=True,
        tools=sql_tools,
        llm=llm,
        allow_delegation=True,
    )

In the above code snippet, we defined the agents, each with a defined role, goal, and backstory. The additional information offers additional context to LLMs before generating responses to queries. A tool equips each agent to perform the actions.

Now, define tasks.

code_interpreter_task = Task(
        description=f"Run Python code to achieve the task - {main_task}. \
        Exit once the image has been created.",
        expected_output="Python code executed successfully. Return the image path.",
        agent=code_interpreter_agent,
    )
    
   sql_task = Task(
        description=f"Run SQL queries to achieve a task - {main_task}",
        expected_output=f"SQL queries executed successfully. The result of the task \
        is returned - {main_task}",
        agent=sql_agent,
    )

We defined the tasks that the agents will perform. Each task has a description, expected output, and the agent responsible for performing it.

Now, define the Crew with the agents and the tasks.

crew = Crew(
        agents=[sql_agent, code_interpreter_agent],
        tasks=[sql_task, code_interpreter_task],
    )
result = crew.kickoff()
print(result)

You can put this flow in a while loop to make it more engaging, 

while True:
    main_task = input("Enter the task you want to perform (or type 'exit' to quit): ")
    if main_task.lower() == "exit":
        break
        
    code_interpreter_agent = Agent(
        role="Python Code Interpreter Agent",
        goal="Run a code to get achieve a task given by the user",
        backstory="You are an agent that helps users run Python code.",
        verbose=True,
        tools=code_interpreter_tools,
        llm=llm,
    )
    
     sql_agent = Agent(
        role="SQL Agent",
        goal="Run SQL queries to get achieve a task given by the user",
        backstory=(
            "You are an agent that helps users run SQL queries. "
            "Connect to the local SQLite DB at connection string = company.db"
            "Try to analyze the tables first by listing all the tables and columns "
            "and doing distinct values for each column and once sure, make a query to \
            get the data you need."
        ),
        verbose=True,
        tools=sql_tools,
        llm=llm,
        allow_delegation=True,
    )
    code_interpreter_task = Task(
        description=f"Run Python code to achieve the task - {main_task}. \
        Exit once the image has been created.",
        expected_output="Python code executed successfully. Return the image path.",
        agent=code_interpreter_agent,
    )
    
    sql_task = Task(
        description=f"Run SQL queries to achieve a task - {main_task}",
        expected_output=f"SQL queries executed successfully. The result of the task \
        is returned - {main_task}",
        agent=sql_agent,
    )


    crew = Crew(
        agents=[sql_agent, code_interpreter_agent],
        tasks=[sql_task, code_interpreter_task],
    )

    result = crew.kickoff()
    print(result)

This will prompt you to enter a query, which will then be passed to the Crew of AI agents. After execution, you will have the option to either ask another question or exit the loop.

Once the execution of a query is completed, it will output the file path for the plot’s image.

Building an SQL Agent with CrewAI and Composio

I asked it to create a bar plot of the number of employees in each department. This was the outcome.

Building an SQL Agent with CrewAI and Composio

GitHub Gist:  SQLsgent

These steps demonstrated how to build an agentic workflow to automate SQL data extraction and visualization. However, you can go further, and make it more robust and reliable by adding a memory component to agents and the Crew. This will help the Agents remember their past outcomes, which will make them steer the workflow better, You can also add a frontend with Streamlit or Gradio, to make it more interactive. 

Conclusion

The AI landscape is evolving at an unprecedented pace. As the quality of AI models, frameworks, and tools continues to improve, building powerful AI agents is becoming increasingly convenient each day.  The future of the workforce is agentic, where humans and AI will complement each other to create even more efficient systems. With frameworks like CrewAI and Composio, you can conveniently create AI workflows to automate many routine tasks. This article demonstrates how to automate data extraction and visualization. You can expand this workflow to handle even more complex scenarios.

Key Takeaways

  • CrewAI is an open-source framework for orchestrating LLM agents to collaboratively accomplish complex tasks by assigning roles, sharing goals, and delegating tasks.
  • Composio is an open-source tooling platform that offers production-ready tools and integrations to empower AI agents to accomplish tasks reliably.
  • You can integrate Composio tools with popular platforms like LangChain, Autogen, CrewAI, and Llamaindex.

Frequently Asked Questions

Q1. What is CrewAI?

A. A.  CrewAI is an open-source agent orchestration framework for building role-playing and collaborative agents.

Q2. What is the difference between CrewAI and Autogen?

A.  In Autogen, orchestrating agents’ interactions requires additional programming, which can become complex and cumbersome as the scale of tasks grows. CrewAi has a simplified multi-agent AI implementation.

Q3. What can CrewAI do?

A. A. CrewAI lets you build collaborative multi-agent AI systems to accomplish complex automation workflows.

Q4. Is CrewAI open source?

A. A.  CrewAI is an open-source AI agent orchestration framework distributed under MIT license.

Q5. 5. What is an SQL agent?

A. An SQL agent is an AI-augmented software that can autonomously perform SQL operations like querying, insertion, deletion, and updation.

The media shown in this article is not owned by Analytics Vidhya and is used at the Author’s discretion.

Source link

Picture of quantumailabs.net
quantumailabs.net

Leave a Reply

Your email address will not be published. Required fields are marked *