Demystifying AI data agents:
A plain English guide that explains how AI agents help you make better sense of your data.
When we think of AI agents, we often think of agents that provide text or image outputs from unstructured data like knowledge bases and blogs. But what about agents that make sense of structured data from databases, data warehouses, or spreadsheets?
Enterprises spend millions of dollars to become data-driven. Despite these efforts, a survey by Bayes Business School found that 70% of executives struggled to unlock value from their data. It’s not an easy task. Getting the answer we want can mean scouring multiple tools or asking multiple teams. Data analysts are often overwhelmed by questions and requests. And sometimes, executives just don’t know what data is even available to them.
What is an AI data agent and why is it useful?
An AI data agent is exactly what it sounds like: an agent that interprets data. Rather than navigating a complex interface or writing complex SQL queries, you describe the data you need and the format you’d like it presented in.
Imagine having a tool that abstracts that complexity away. It sounds like magic, but it’s not. This guide walks you through how an AI data agent works, what it can do for your organization, and how it can help unlock the value of your data in ways that traditional BI tools cannot. So what are the benefits?
Make decisions faster and easier. It’s hard to make timely data-driven decisions when data analysts have a backlog. A natural language interface for querying data lets decision makers pull basic reports themselves, which frees up analysts for deeper reporting.
Create insights, not queries. Knowing what data you need is half the battle. It shouldn’t take days to figure out where it is, how to pull it, and how to visualize it. Between SQL, SOQL, Cypher, SparkQL, API calls, etc., it can be overwhelming even to seasoned data veterans.
Modernize in place with your existing tools. Data migrations are never fun. Rather than migrating data into Yet Another Data Tool, a data agent can be built on top of existing data sources, ensuring teams can still use them without interruption.
Visualize data across various sources. Data agents can connect to multiple databases, providing a wealth of insights across business units, products, customers, etc.
What kind of data can AI data agents consume?
When we talk about data agents, we’re referring specifically to AI agents that can interpret and act on structured data. Structured data is organized in a predefined format, typically using a well-defined schema, so it can be easily stored, queried, and processed in databases, data warehouses, or SaaS tools. Bear in mind that it’s possible to build an agent that understands both unstructured knowledge and structured data, but an agent processes the two differently.
Now we’ll walk through what’s happening under the hood of a data agent.
How can an AI agent talk to your data?
AI agents rely on Large Language Models (LLMs) under the hood, such as OpenAI’s GPT models, Gemini, Mistral, or Llama, to interpret and generate responses. LLMs are large neural networks trained on huge amounts of text, which is how they learn the patterns, contexts, and nuances of language. LLMs understand language and they understand SQL, so they can be used to translate between the two.
LLMs are often used to answer questions based on unstructured data like blog posts, manuals, and PDFs. How do they work with structured data like databases? It’s fairly similar. They tokenize content, i.e. break each word or piece of content down into parts, and then convert those parts into something computers understand: numbers. In this case, both your text and the schema are broken into tokens, and each token is turned into a list of numbers called a vector.
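To make the tokenizing step concrete, here is a toy sketch in Python. It is only an illustration: real LLM tokenizers use learned subword vocabularies and then map each token ID to a learned vector, and the `tokenize`, `build_vocab`, and `encode` helpers below are hypothetical names made up for this example.

```python
import re

def tokenize(text):
    """Split text into lowercase word tokens and single punctuation tokens."""
    return re.findall(r"[a-z0-9_]+|[^\sa-z0-9_]", text.lower())

def build_vocab(token_lists):
    """Assign each distinct token the next free integer ID."""
    vocab = {}
    for tokens in token_lists:
        for token in tokens:
            if token not in vocab:
                vocab[token] = len(vocab)
    return vocab

def encode(tokens, vocab):
    """Replace each token with its integer ID."""
    return [vocab[token] for token in tokens]

# Both the question and the schema go through the same process.
question = "Revenue by access level"
schema = "tickets(access_level, price)"

q_tokens = tokenize(question)
s_tokens = tokenize(schema)
vocab = build_vocab([q_tokens, s_tokens])

print(q_tokens, encode(q_tokens, vocab))
print(s_tokens, encode(s_tokens, vocab))
```

In a real model, each of those integer IDs is then looked up in a table of learned vectors, which is the form the model actually works with.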
So imagine you want to talk to your database instead of writing SQL queries. What should the owner of the database do first?
Choosing the right data: How much data should be accessible? Using a database view, you can expose only subsets of your data. Squid also supports database joins, making it easy to query multiple databases using the same data agent.
Cleaning your data: Your database needs to be clean. This means the data is accurate and deduplicated, so your agent doesn’t have to guess which version of “John D. Smith” you’re looking for. The field names that describe each column should be descriptive and easy to understand.
Connect your database to the Squid platform. It’s easy.
Using Squid AI’s Query with AI functionality, you can now create an agent that allows you to talk to your data. Squid supports a number of SQL and NoSQL options.
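As an illustration of the "choosing the right data" step, here is a minimal sketch of a database view that exposes only some columns. It uses Python's built-in sqlite3 module as a stand-in database; the `customers` table, its columns, and the `customers_for_agent` view are assumptions invented for this example, not part of Squid.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        id INTEGER PRIMARY KEY,
        name TEXT,
        email TEXT,
        card_number TEXT,
        lifetime_spend REAL
    );
    INSERT INTO customers VALUES
        (1, 'John D. Smith', 'john@example.com', '4111-0000-0000-0000', 250.0),
        (2, 'Ana Lopez', 'ana@example.com', '4222-0000-0000-0000', 980.5);

    -- The view leaves out email and card_number, so anything that
    -- only queries the view never sees those columns.
    CREATE VIEW customers_for_agent AS
        SELECT id, name, lifetime_spend FROM customers;
""")

cursor = conn.execute("SELECT * FROM customers_for_agent ORDER BY id")
view_columns = [column[0] for column in cursor.description]
rows = cursor.fetchall()

print(view_columns)
print(rows)
```

Pointing an agent at the view rather than the underlying table is one way to limit what it can read. Note that the view alone is not a security boundary unless the agent's database permissions are also restricted to it.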
What does Squid do then?
Squid’s schema autodiscovery feature breaks your data down into something LLMs understand. It captures the structure of your data, i.e. the field and table names, which gives your agent key context. Squid also adds descriptions of your data, either automatically or with input from you. Squid can then pass this context on to an LLM.
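To give a feel for what "structure plus descriptions" can look like, here is a rough sketch that reads table and column names from a SQLite database and attaches human-written notes. This is not Squid's implementation: the `describe_schema` function, the `tickets` table, and the description text are all assumptions made for illustration.

```python
import sqlite3

def describe_schema(conn, descriptions=None):
    """Return a plain-text summary of every table and column.

    descriptions maps (table, column) to an optional human-written note.
    """
    descriptions = descriptions or {}
    lines = []
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    for (table,) in tables:
        lines.append(f"table {table}")
        # PRAGMA table_info yields (cid, name, type, notnull, dflt_value, pk).
        for _cid, column, column_type, *_rest in conn.execute(
            f'PRAGMA table_info("{table}")'
        ):
            note = descriptions.get((table, column), "")
            suffix = f" -- {note}" if note else ""
            lines.append(f"  {column} {column_type}{suffix}")
    return "\n".join(lines)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tickets ("
    "id INTEGER PRIMARY KEY, event_name TEXT, access_level TEXT, price REAL)"
)

schema_text = describe_schema(
    conn, {("tickets", "access_level"): "ticket tier, e.g. GA or VIP"}
)
print(schema_text)
```

The resulting text is the kind of context that can be sent to an LLM alongside a question, so the model knows which tables and columns exist and what they mean.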
Imagine you want to pull data on your latest financials for a concert you hosted. Let’s walk through this.
You ask your AI agent to “Make a bar chart of the revenue we made by access level on the Taylor Swift concert last night.”
Squid passes this question, along with the annotated schema, to an LLM. The LLM works out the meaning of your question, then generates the appropriate SQL query and returns it to Squid.
Squid runs the query against the database and then checks the response.
If the answer makes sense, the LLM will deliver a bar chart of the data you need back to you.
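The walkthrough above can be sketched end to end in a few lines of Python. This is a simplified illustration under stated assumptions, not how Squid works internally: `fake_llm_generate_sql` is a hypothetical stand-in that returns a canned query instead of calling a real model, the `tickets` table and its rows are invented, the "last night" date filter is left out for brevity, and the chart is plain text.

```python
import sqlite3

def fake_llm_generate_sql(question, schema_text):
    """Hypothetical stand-in for an LLM call. A real model would write the
    query from the question and schema; here it is canned for the demo."""
    return (
        "SELECT access_level, SUM(price) AS revenue "
        "FROM tickets WHERE event_name = 'Taylor Swift' "
        "GROUP BY access_level ORDER BY access_level"
    )

def looks_reasonable(rows):
    """Rough sanity check: rows came back and no revenue is missing or negative."""
    return bool(rows) and all(v is not None and v >= 0 for _, v in rows)

def text_bar_chart(rows, width=20):
    """Render (label, value) rows as a plain-text bar chart."""
    top = max(value for _, value in rows) or 1  # avoid dividing by zero
    return "\n".join(
        f"{label:<4} {'#' * round(width * value / top)} {value:.0f}"
        for label, value in rows
    )

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (event_name TEXT, access_level TEXT, price REAL)")
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?, ?)",
    [
        ("Taylor Swift", "GA", 100),
        ("Taylor Swift", "GA", 100),
        ("Taylor Swift", "VIP", 400),
        ("Other Show", "GA", 50),
    ],
)

schema_text = "tickets(event_name TEXT, access_level TEXT, price REAL)"
question = (
    "Make a bar chart of the revenue we made by access level "
    "on the Taylor Swift concert last night."
)

sql = fake_llm_generate_sql(question, schema_text)  # question + schema in, SQL out
rows = conn.execute(sql).fetchall()                 # run the query
if looks_reasonable(rows):                          # check the response
    chart = text_bar_chart(rows)
else:
    chart = "The result did not pass the sanity check."
print(chart)
```

The point of the check step is that generated SQL can be wrong, so a result is validated before anything is shown to the person who asked.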
But wait...there’s more
Data agents don’t just pull data and make graphs. Squid supports AI functions, which give your AI agent more explicit instructions for acting on this data across your tools. Squid also supports schedulers, so you can run reports on a schedule too.
Imagine your company offers a 10% discount for entering an email address on your site, and you’d like to chart how often this coupon code is used. You can ask your agent the following:
Create a line chart that shows how many times the NEW10 coupon code was used by week this month. Create this chart each month on the last day of the month, and post it in the #promotions Slack channel with the text “Here’s how many new customers used the NEW10 promo.”
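Behind a request like this sit two small pieces of logic: counting coupon uses per week, and deciding when "the last day of the month" is. The sketch below shows both using Python and SQLite. The `orders` table, its sample rows, and the helper names are assumptions for illustration, and the charting and Slack posting steps are left out.

```python
import calendar
import sqlite3
from datetime import date

def is_last_day_of_month(d):
    """True when d is the final calendar day of its month."""
    return d.day == calendar.monthrange(d.year, d.month)[1]

def weekly_coupon_usage(conn, code, year_month):
    """Count uses of a coupon code per week within one month (YYYY-MM)."""
    sql = (
        "SELECT strftime('%W', order_date) AS week, COUNT(*) AS uses "
        "FROM orders "
        "WHERE coupon_code = ? AND strftime('%Y-%m', order_date) = ? "
        "GROUP BY week ORDER BY week"
    )
    return conn.execute(sql, (code, year_month)).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (order_date TEXT, coupon_code TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [
        ("2024-05-30", "NEW10"),    # previous month, excluded
        ("2024-06-03", "NEW10"),
        ("2024-06-04", "NEW10"),
        ("2024-06-12", "NEW10"),
        ("2024-06-12", "SUMMER5"),  # different code, excluded
    ],
)

usage = weekly_coupon_usage(conn, "NEW10", "2024-06")

today = date(2024, 6, 30)
if is_last_day_of_month(today):
    # A real scheduler would build the chart and post it here.
    print(usage)
```

A scheduler would run the check daily (or use its own month-end trigger) and only produce the report when the condition holds.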
Where is this useful?
Every part of an organization benefits from better access to data. Teams from Finance to Marketing to Product rely on data teams, and those data teams can easily get overwhelmed by ad hoc requests. Enabling teams to pull their own data and reports frees up data analysts to do deeper work and provide unique insights for the organization.
Here are some examples of where this could be applied:
Learning about customers and their transactions: Often the Data team has the best insights about customers, but can’t necessarily get those insights out to other teams. Data teams can democratize access to data and make it easier to act on it.
Recognizing the products that make the most money: When do people buy certain products? What products have the greatest margin? If you can put it in a database, you can learn what products are driving your business and how.
Evaluating costs: How much are you spending across lines of business? How does this compare year over year?
If you can put it in a database, you can talk to it in natural language using Squid.
Get started with AI agents
AI agents can solve real world problems for your organization. If you’d like the benefits of AI agents without having to set up AI infrastructure or data connectors, contact the team at Squid AI to discuss your use case. Squid AI offers configurable agents that can automate or augment a variety of tasks.