My first exposure to the intersection of AI and crypto was in December 2023 when I heard about Bittensor, a collection of AI projects federated around a blockchain. Fast forward to today and oh boy, the web3 AI train is in full swing. But beyond the hype, there is a real sense that the industry is overselling and labeling as "AI" a lot of things that aren't really AI.
So I want to take this post as an opportunity to classify different types of "AI Agents" in the crypto space.
First of all, let's define what an AI Agent is. At its core, an agent is a system that can take autonomous actions.
We will define 3 levels of autonomy:
The key technological paradigm powering all these levels is called Function Calling or Tool Usage.
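This is the process of writing functions, describing them to the LLM, and letting the LLM decide which function to call and with which arguments. The LLM does not run the code itself; our program executes the call and hands the result back. Imagine we wanted to make an agent that helps users invest in bitcoin.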
What are the latest news headlines about bitcoin?
Find the latest Bitcoin news with 24-hour trading volume, analysis, video, price updates for BTC cryptocurrency, blockchain, mining at Cointelegraph.
Now we are getting somewhere! We can now search the web for news about bitcoin. This is the essence of tool calling. This was a good example of the first level of autonomy. These agents are simply responding to a user's input and executing a predefined process. This is essentially a new form of user experience driven by natural language instead of user clicks or forms.
The second level we defined is where things start to get interesting! Humans have defined processes since the industrial revolution. The difference is that we can now use language models to stitch these processes together with tool calling. Let's look at an example.
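To make the mechanics concrete, here is a minimal sketch of that first-level flow. Everything in it is an assumption made for illustration: `search_news`, `fake_llm`, and the shape of `TOOL_SPECS` are hypothetical stand-ins, not a specific vendor's API. A real agent would replace `fake_llm` with an actual model call and `search_news` with a real search API.

```python
import json
from typing import Callable

# Hypothetical tool: a real agent would call a search or news API here.
def search_news(query: str) -> list[str]:
    return [f"Placeholder headline about {query}"]

# Registry mapping tool names to the Python functions that implement them.
TOOLS: dict[str, Callable[..., list[str]]] = {"search_news": search_news}

# Tool descriptions shown to the model (illustrative shape, not a vendor schema).
TOOL_SPECS = [
    {
        "name": "search_news",
        "description": "Search the web for recent news headlines.",
        "parameters": {"query": "string"},
    }
]

def fake_llm(user_message: str, tool_specs: list[dict]) -> str:
    """Stand-in for a model call. A real LLM would pick the tool and
    arguments itself; here the choice is hard-coded."""
    return json.dumps({"tool": "search_news", "arguments": {"query": "bitcoin"}})

def run_agent(user_message: str) -> list[str]:
    # 1. Ask the model what to do. 2. Parse its tool request. 3. Run the tool.
    request = json.loads(fake_llm(user_message, TOOL_SPECS))
    tool = TOOLS[request["tool"]]
    return tool(**request["arguments"])

if __name__ == "__main__":
    print(run_agent("What are the latest news headlines about bitcoin?"))
```

The point of the sketch is the division of labor: the model only produces a structured request, and ordinary code decides whether and how to execute it.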
As a human we want to invest in bitcoin:
Now let's explore how this translates in the second level of autonomy. We'll graph the architecture of this process and make it "agentic".
Here we see a typical investment algorithm that invests based on sentiment analysis of news. The twist is that we are driving the process with a language model and a series of tools. Such a process could already exist before the launch of ChatGPT, when agents were not a thing.
The key difference is the availability of powerful, cheap compute and the speed at which such a process can be built. It used to be essentially restricted to rich organizations with the following capabilities:
Now a solo developer can build this in a few hours simply by calling APIs and running this on their local machine. This is where the concept of agents starts to make sense. Automation is now something that can be done by tinkerers and even non-technical users, given the rise of no-code automation tools.
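As a rough illustration of the sentiment-driven process described above, here is a small sketch. It is built on assumptions: the headlines are hard-coded, `score_sentiment` uses a keyword count as a stand-in for an LLM sentiment call, and the `threshold` value is arbitrary. None of this is investment logic anyone should rely on.

```python
from statistics import mean

# Hypothetical keyword lists standing in for an LLM sentiment call.
POSITIVE = {"rallies", "grows", "record", "high", "surge"}
NEGATIVE = {"crash", "warn", "risk", "ban", "hack"}

def fetch_headlines() -> list[str]:
    # Hard-coded sample data; a real pipeline would call a news API tool.
    return [
        "Bitcoin rallies as adoption grows",
        "Regulators warn of crypto crash risk",
        "Bitcoin hits record high",
    ]

def score_sentiment(headline: str) -> int:
    """Return 1 (positive), -1 (negative), or 0 (neutral)."""
    words = set(headline.lower().split())
    positive_hits = len(words & POSITIVE)
    negative_hits = len(words & NEGATIVE)
    return (positive_hits > negative_hits) - (positive_hits < negative_hits)

def decide(scores: list[int], threshold: float = 0.2) -> str:
    """Map the average sentiment score to a buy, sell, or hold signal."""
    average = mean(scores)
    if average > threshold:
        return "buy"
    if average < -threshold:
        return "sell"
    return "hold"

def run_daily() -> str:
    # The order of steps is fixed by a human; only the scoring is delegated.
    headlines = fetch_headlines()
    return decide([score_sentiment(h) for h in headlines])

if __name__ == "__main__":
    print(run_daily())
```

Note that the sequence of steps lives in `run_daily`, written by a person. That is what keeps this at the second level of autonomy.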
But this is still a semi-autonomous process defined by a human to run each morning. Let's look at the third level of autonomy.
Now let's explore what kind of process we could build at this level. There have been attempts at fully autonomous systems but they always hit fundamental issues:
Reasoning is not possible for LLMs!
No matter what OpenAI touts about o1 and reasoning chains, LLMs are still fundamentally text predictors. A reasoning chain is the process of cutting a problem into multiple prompts and tool calls:
Human prompt: Should I buy bitcoin?
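A chain of this kind can be sketched in a few lines. This is only an illustration under assumptions: the three step templates are invented for the example, and `echo_llm` is a placeholder that returns a canned string where a real model call would go.

```python
from typing import Callable

# Hypothetical step templates; each step sees the previous step's output.
STEPS = [
    "List the facts needed to answer: {question}",
    "Given these notes:\n{previous}\nSummarize the main risks.",
    "Given this summary:\n{previous}\nAnswer the question: {question}",
]

def run_chain(question: str, llm: Callable[[str], str]) -> list[tuple[str, str]]:
    """Run the prompts in order, feeding each output into the next prompt.
    Returns the (prompt, output) pairs so the chain can be inspected."""
    transcript: list[tuple[str, str]] = []
    previous = ""
    for template in STEPS:
        prompt = template.format(question=question, previous=previous)
        previous = llm(prompt)
        transcript.append((prompt, previous))
    return transcript

def echo_llm(prompt: str) -> str:
    # Placeholder for a real model call: echoes the first line of the prompt.
    return f"[model output for: {prompt.splitlines()[0]}]"

if __name__ == "__main__":
    for prompt, output in run_chain("Should I buy bitcoin?", echo_llm):
        print(prompt)
        print(output)
        print("---")
```

Each step is still just text prediction on a prompt; the "reasoning" is the scaffolding around it.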
This might be a good start, but it faces a really common problem most developers face when coding with AI:
The prompting valley of death is a big reason why Devin and other AI agents are not yet replacing developers.
This is important context to understand the limits of web3 agents. Prompting is a powerful tool but it is not a silver bullet. We hit the fundamental limits of LLMs when we give full autonomy.
The general answer is that we build specialized systems and stitch them together to form a more complete semi-autonomous system. A single well-defined prompt tends to get better results than a loose set of prompts. This is caused by the concept of "attention" in LLMs. LLMs can be given vast amounts of information but, like a human, they will focus on what they consider the most important.
Knowing this, let's explore how we could build a web3 agent that runs a meme coin. We'll split it into parts:
This then starts to run as a continuous script but the logic between each agent is defined by a human planning the process.
We won't dive into the code here, but coding these systems essentially comes back to good old software engineering mixed with LLMs and tool calling. If we move beyond the hype and into the reality of what we can build with AI agents, we can see that there is a lot of potential but nothing magic about it.
We'll need to give the agent a database, webhooks, and other tools to make it work. It becomes a matter of stitching together a system that can be run in a loop.
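Without going into a full implementation, the shape of such a loop can be sketched as follows. The assumptions are loud here: SQLite stands in for whatever database the agent would use, the `runs` table layout is invented, and the two steps are hypothetical placeholders for specialized agents. Webhooks and real tools are left out.

```python
import sqlite3
import time
from typing import Callable

def init_db(conn: sqlite3.Connection) -> None:
    # Hypothetical table that records what each step produced on each pass.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS runs ("
        "id INTEGER PRIMARY KEY, step TEXT NOT NULL, output TEXT NOT NULL)"
    )
    conn.commit()

def run_loop(
    conn: sqlite3.Connection,
    steps: dict[str, Callable[[str], str]],
    iterations: int,
    delay_seconds: float = 0.0,
) -> None:
    """Run the human-defined steps in order, persisting each output."""
    for _ in range(iterations):
        previous = ""
        for name, step in steps.items():
            previous = step(previous)  # each step receives the prior output
            conn.execute(
                "INSERT INTO runs (step, output) VALUES (?, ?)", (name, previous)
            )
        conn.commit()
        time.sleep(delay_seconds)

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    init_db(conn)
    # Placeholder steps; real ones would wrap a prompt plus its tools.
    steps = {
        "collect": lambda _: "raw community messages",
        "draft": lambda notes: f"draft post based on {notes}",
    }
    run_loop(conn, steps, iterations=2)
    print(conn.execute("SELECT COUNT(*) FROM runs").fetchone()[0])
```

The order of the steps is plain Python, decided by the person who wrote the script, which matches the point above: the agents are specialized, and a human plans how they connect.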
Conclusion
The intersection of AI agents and crypto represents an exciting frontier, but it's crucial to separate the hype from reality. While we're not yet at the stage of fully autonomous AI agents running complex crypto operations, we are seeing meaningful applications emerge at different levels of autonomy.
The key takeaway is that successful AI agent implementations in crypto tend to be well-scoped, specialized systems rather than catch-all solutions. By breaking down complex processes into discrete, manageable tasks and combining traditional software engineering practices with LLM capabilities, we can create valuable tools that enhance rather than replace human decision-making.
The future of AI agents in crypto likely lies not in building fully autonomous systems, but in creating thoughtful, semi-autonomous processes that leverage the strengths of both LLMs and traditional programming. This means embracing the limitations of current AI technology while taking advantage of its ability to process natural language, analyze information, and execute specific tasks efficiently.
For developers entering this space, the focus should be on building robust, practical systems that solve real problems rather than chasing the illusion of full autonomy. By understanding these constraints and opportunities, we can create meaningful applications that advance the field without falling into the trap of overselling AI's current capabilities.
The real innovation isn't in creating magical AI agents that do everything, but in thoughtfully combining AI capabilities with solid engineering practices to build tools that genuinely improve how we interact with crypto systems.