Featured Article

What exactly is an AI agent?

The answer depends on who you ask


Illustration of a robotic agent helping workers do their jobs.
Image Credits: girafchik123 / Getty Images

AI agents are supposed to be the next big thing in AI, but so far people can’t agree on what exactly constitutes one.

At its simplest, an AI agent is best described as AI-fueled software that does a series of jobs for you that a human customer service agent, HR person or IT help desk employee might have done in the past, although it could ultimately involve any task. You ask it to do things, and it does them for you, sometimes crossing multiple systems and going well beyond simply answering questions.

Seems simple enough, right? Yet it is complicated by a lack of clarity. Even among the tech giants, there isn’t a consensus. Google sees them as task-based assistants that vary with the job: helping developers with code, helping marketers create a color scheme, or helping an IT pro track down an issue by querying log data.

For Asana, an agent may act like an extra employee, taking care of assigned tasks like any good co-worker. Sierra, a startup founded by former Salesforce co-CEO Bret Taylor and Google vet Clay Bavor, sees agents as customer experience tools, helping people achieve actions that go well beyond the chatbots of yesteryear to help solve more complex sets of problems.

This lack of a cohesive definition leaves room for confusion over exactly what these things are going to do, but regardless of how they’re defined, agents are meant to help complete tasks in an automated way with as little human interaction as possible.

Rudina Seseri, founder and managing partner at Glasswing Ventures, says it’s early days and that could account for the lack of agreement. “There is no single definition of what an ‘AI agent’ is. However, the most frequent view is that an agent is an intelligent software system designed to perceive its environment, reason about it, make decisions, and take actions to achieve specific objectives autonomously,” Seseri told TechCrunch.

She says they use a number of AI technologies to make that happen. “These systems incorporate various AI/ML techniques such as natural language processing, machine learning, and computer vision to operate in dynamic domains, autonomously or alongside other agents and human users.”
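The perceive, reason, decide and act cycle Seseri describes can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not how any vendor builds agents: the help desk queue, the function names and the keyword rule are all hypothetical, and the rule stands in for the step where a real agent would call a model.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TicketQueue:
    """Hypothetical environment: a help desk queue of open tickets."""
    open_tickets: list = field(default_factory=list)
    resolved: list = field(default_factory=list)


def perceive(env: TicketQueue) -> Optional[str]:
    """Observe the environment: return the next open ticket, if any."""
    return env.open_tickets[0] if env.open_tickets else None


def decide(ticket: str) -> str:
    """Stand-in for reasoning: a keyword rule where a real agent would call a model."""
    if "password" in ticket.lower():
        return "reset_password"
    return "escalate_to_human"


def act(env: TicketQueue, ticket: str, action: str) -> None:
    """Apply the chosen action and record the result in the environment."""
    env.open_tickets.remove(ticket)
    env.resolved.append(f"{action}: {ticket}")


def run_agent(env: TicketQueue, max_steps: int = 10) -> list:
    """Repeat perceive, decide, act until the queue is empty or the step budget runs out."""
    for _ in range(max_steps):
        ticket = perceive(env)
        if ticket is None:
            break
        act(env, ticket, decide(ticket))
    return env.resolved
```

Given a queue holding a forgotten-password ticket and a hardware ticket, run_agent would handle the first on its own and hand the second to a person, which is roughly the "assistant" level of autonomy several people quoted here describe.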

Aaron Levie, co-founder and CEO at Box, says that over time, as AI becomes more capable, AI agents will be able to do much more on behalf of humans, and there are already dynamics at play that will drive that evolution.

“With AI agents, there are multiple components to a self-reinforcing flywheel that will serve to dramatically improve what AI Agents can accomplish in the near and long-term: GPU price/performance, model efficiency, model quality and intelligence, AI frameworks and infrastructure improvements,” Levie wrote on LinkedIn recently.

That’s an optimistic take on the technology that assumes growth will happen in all these areas, when that’s not necessarily a given. MIT robotics pioneer Rodney Brooks pointed out in a recent TechCrunch interview that AI has to deal with much tougher problems than most technology, and it won’t necessarily grow in the same rapid way as, say, chips under Moore’s law have.

“When a human sees an AI system perform a task, they immediately generalize it to things that are similar and make an estimate of the competence of the AI system; not just the performance on that, but the competence around that,” Brooks said during that interview. “And they’re usually very over-optimistic, and that’s because they use a model of a person’s performance on a task.”

The problem is that crossing systems is hard, and this is complicated by the fact that some legacy systems lack basic API access. While we are seeing steady improvements that Levie alluded to, getting software to access multiple systems while solving problems it may encounter along the way could prove more challenging than many think.

If that’s the case, everyone could be overestimating what AI agents should be able to do. David Cushman, a research leader at HFS Research, sees the current crop of bots more like Asana does: assistants that help humans complete certain tasks in the interest of achieving some sort of user-defined strategic goal. The challenge is helping a machine handle contingencies in a truly automated way, and we are clearly not anywhere close to that yet.

“I think it’s the next step,” Cushman said. “It’s where AI is operating independently and effectively at scale. So this is where humans set the guidelines, the guardrails, and apply multiple technologies to take the human out of the loop, when everything has been about keeping the human in the loop with GenAI.” The key, he said, is to let the AI agent take over and apply true automation.

Jon Turow, a partner at Madrona Ventures, says this is going to require the creation of an AI agent infrastructure, a tech stack designed specifically for creating the agents (however you define them). In a recent blog post, Turow outlined examples of AI agents currently working in the wild and how they are being built today.

In Turow’s view, the growing proliferation of AI agents — and he admits, too, that the definition is still a bit elusive — requires a tech stack like any other technology. “All of this means that our industry has work to do to build infrastructure that supports AI agents and the applications that rely upon them,” he wrote in the piece.

“Over time, reasoning will gradually improve, frontier models will come to steer more of the workflows, and developers will want to focus on product and data — the things that differentiate them. They want the underlying platform to ‘just work’ with scale, performance, and reliability.”

One other thing to keep in mind here is that it’s probably going to take multiple models, rather than a single LLM, to make agents work, and this makes sense if you think about these agents as a collection of different tasks. “I don’t think right now any single large language model, at least publicly available, monolithic large language model, is able to handle agentic tasks. I don’t think that they can yet do the multi-step reasoning that would really make me excited about an agentic future. I think we’re getting closer, but it’s just not there yet,” said Fred Havemeyer, head of U.S. AI and software research at Macquarie US Equity Research.

“I do think the most effective agents will likely be multiple collections of multiple different models with a routing layer that sends requests or prompts to the most effective agent and model. And I think it would be kind of like an interesting [automated] supervisor, delegating kind of role.”
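To make the routing idea concrete, here is a minimal sketch of a supervisor that sends each prompt to one of several specialists. Everything in it is an assumption for illustration: the three "models" are plain functions standing in for real model calls, and the keyword table is a placeholder for whatever logic a production routing layer would actually use.

```python
from typing import Callable

# A "model" here is any function that takes a prompt and returns a reply (hypothetical).
Model = Callable[[str], str]


def code_model(prompt: str) -> str:
    return f"[code-model] {prompt}"


def support_model(prompt: str) -> str:
    return f"[support-model] {prompt}"


def general_model(prompt: str) -> str:
    return f"[general-model] {prompt}"


# Hypothetical routing table: keywords that suggest which specialist fits best.
ROUTES = [
    (("bug", "function", "compile"), code_model),
    (("refund", "order", "invoice"), support_model),
]


def route(prompt: str) -> Model:
    """Pick the first specialist whose keywords appear in the prompt, else the generalist."""
    text = prompt.lower()
    for keywords, model in ROUTES:
        if any(word in text for word in keywords):
            return model
    return general_model


def supervisor(prompts: list) -> list:
    """Delegate each prompt to the model chosen by the router and collect the replies."""
    return [route(p)(p) for p in prompts]
```

The design choice worth noting is the fallback: any request the router cannot place goes to a general model rather than failing, which mirrors the delegating, supervisor-style role Havemeyer describes.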

Ultimately for Havemeyer, the industry is working toward this goal of agents operating independently. “As I’m thinking about the future of agents, I want to see and I’m hoping to see agents that are truly autonomous and able to take abstract goals and then reason out all the individual steps in between completely independently,” he told TechCrunch.

But the fact is that we are still in a period of transition where these agents are concerned, and we don’t know when we’ll get to this end state that Havemeyer described. While what we’ve seen so far is clearly a promising step in the right direction, we still need some advances and breakthroughs for AI agents to operate as they are being envisioned today. And it’s important to understand that we aren’t there yet.
