Product Category Memo #8: Infrastructure to build LLM apps
In-depth analysis of infrastructure products that are used to build LLM apps
Hello friends,
Large Language Models (LLMs) have huge potential to help us extract knowledge and structure it as we need. But how do we interact with them to get the exact answers we want? In this post, I decided to discuss the category of infrastructure products that enable developers to build LLM apps.
I asked DALL-E to generate a hand-drawn sketch of machines connected with pipes. The picture is supposed to represent an LLM app, but things may have gotten out of hand. I wanted to show text flowing through the machines, but it looks like we'll just have to use our imagination for that. In this post, we'll talk about:
What is a prompt
What is prompt chaining and why do we need it
Why do we need products for prompt chaining
What are the characteristics of a good infrastructure product
What products are competing in this market
Let’s dive in.
What is a prompt?
AI models are being used in production across many use cases, and this is especially true for LLMs. What are LLMs? They are AI models, trained on enormous datasets, that can understand input text and perform a range of tasks including generation, summarization, translation, prediction, and knowledge extraction. The input text you give them is called a "prompt".
What is prompt chaining?
LLMs take these prompts as input and produce text output. For example, you can ask an LLM to generate a brief paragraph on the history of transistors, and it will provide an appropriate answer. The input and output of an LLM have the same modality -- text. This enables us to do all sorts of interesting things: you can give text input to an LLM and feed its text output to the next LLM. The text output of one LLM becomes a prompt for the next LLM. This act of chaining together LLM prompts is called "prompt chaining".
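To make this concrete, here's a minimal sketch in Python. The call_llm function is a hypothetical placeholder, not any specific vendor's API; the point is simply that the output of the first call becomes part of the prompt for the second:

```python
# Minimal prompt chaining. call_llm is a hypothetical placeholder --
# wire it up to whatever client your LLM provider exposes.

def call_llm(prompt: str) -> str:
    """Placeholder: send the prompt to an LLM and return its text output."""
    raise NotImplementedError("connect this to your LLM provider")

# Step 1: ask for an outline.
outline = call_llm("List three key milestones in the history of transistors.")

# Step 2: the first model's output becomes part of the next prompt.
paragraph = call_llm(f"Write a brief paragraph covering these milestones:\n{outline}")
```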
Why do we need prompt chaining?
Prompt chaining becomes relevant when you have a big goal to achieve that involves complex knowledge extraction. A single call to an LLM usually won't be able to achieve this goal, so you break it down into multiple steps. Within each step, you prompt an LLM to get a text response. You use that response to prompt another LLM (or maybe the same LLM) to get a new response. You keep doing this until you get the final answer.
Let's say I want to find out why crude oil futures spiked in June 2022, and all I want is a simple 100-word summary. How would a person normally find this information? You type your query into Google, read the top 2-3 articles, gather the relevant information, and then condense what you found into 100 words. Those are multiple tasks you need to perform to achieve the goal, and each one can be treated as a prompt to an LLM. If you chain these steps together, you can build an LLM app that takes your text input and gives you exactly what you need. This chaining capability allows developers to build powerful LLM apps that would otherwise not be possible.
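Here's what that might look like as a chain, again sketched in Python. Both web_search and call_llm are hypothetical placeholders standing in for a real search API and a real LLM client:

```python
# The crude oil question as a three-step chain: search, extract, summarize.
# web_search and call_llm are hypothetical placeholders.

def web_search(query: str, num_results: int = 3) -> list[str]:
    """Placeholder: return the text of the top search results."""
    raise NotImplementedError

def call_llm(prompt: str) -> str:
    """Placeholder: send the prompt to an LLM and return its text output."""
    raise NotImplementedError

query = "Why did crude oil futures spike in June 2022?"

# Step 1: gather the top few articles.
articles = web_search(query)

# Step 2: extract the relevant facts from each article.
facts = [call_llm(f"Extract the facts relevant to '{query}':\n{a}") for a in articles]

# Step 3: condense everything into a 100-word summary.
summary = call_llm("Summarize the following in 100 words:\n" + "\n".join(facts))
```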
What are the characteristics of a good infrastructure product to build LLM apps?
Here are the key characteristics to look for:
It should allow you to work with any LLM you want. You shouldn't be tied to a specific platform or vendor; you should be able to chain together LLMs built by disparate entities (see the model-agnostic sketch after this list).
It should allow you to optimize prompts. The quality of the output depends on the quality of the prompts, so a good platform lets you refine them until the final output is high quality (sketched after this list).
It should include functionality to interact with various sources of knowledge. To build a good LLM app, you may have to query Google or look up an internal database; it's not always just a sequence of calls to LLMs. A good platform allows you to weave these steps into your app.
It should allow you to chain together sequences of calls to models and code. You should be able to weave together any chain of actions, not just LLM calls.
It should have version control. Prompts and chains change constantly, and the platform should let you keep track of those changes.
It should have caching so that you are not wasting resources on repeated model interactions. If you make the exact same call to an LLM more than once, the platform shouldn't redo the work; caching speeds things up and cuts cost (see the caching sketch after this list).
It should allow models to take actions. A good LLM app gives its models the freedom to take actions, and a good platform enables this (see the agent-loop sketch after this list).
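To illustrate the first point, here's one way a model-agnostic layer might look. The classes are illustrative, not real SDKs; the idea is that a chain only ever sees the abstract LLM interface, never a specific vendor:

```python
# A model-agnostic interface: chains depend on the abstract LLM class,
# so any vendor's model can slot in. The subclasses are illustrative.

from abc import ABC, abstractmethod

class LLM(ABC):
    @abstractmethod
    def complete(self, prompt: str) -> str:
        """Send a prompt, return the text completion."""

class VendorAModel(LLM):
    def complete(self, prompt: str) -> str:
        raise NotImplementedError("call vendor A's API here")

class VendorBModel(LLM):
    def complete(self, prompt: str) -> str:
        raise NotImplementedError("call vendor B's API here")

def chain(models: list[LLM], prompt: str) -> str:
    """Feed each model's output into the next, regardless of vendor."""
    text = prompt
    for model in models:
        text = model.complete(text)
    return text
```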
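Prompt optimization can be as simple as trying several templates against a small evaluation set and keeping the winner. A rough sketch, with call_llm and score as hypothetical placeholders:

```python
# Crude prompt optimization: score each template on a tiny eval set
# and keep the best one. call_llm and score are hypothetical placeholders.

def call_llm(prompt: str) -> str:
    raise NotImplementedError

def score(output: str, reference: str) -> float:
    """Placeholder: rate how close the output is to the reference."""
    raise NotImplementedError

templates = [
    "Summarize: {text}",
    "Summarize the following in 100 words: {text}",
    "You are an analyst. Write a 100-word summary of: {text}",
]

eval_set = [("<input text>", "<reference summary>")]  # keep this small

def avg_score(template: str) -> float:
    scores = [score(call_llm(template.format(text=t)), ref) for t, ref in eval_set]
    return sum(scores) / len(scores)

best_template = max(templates, key=avg_score)
```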
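For caching, even Python's standard library gets you surprisingly far. A sketch, with call_llm again a placeholder:

```python
# Cache identical (model, prompt) calls so the work is never redone.
# functools.lru_cache is real; call_llm is a hypothetical placeholder.

from functools import lru_cache

def call_llm(model: str, prompt: str) -> str:
    """Placeholder: send the prompt to the named model."""
    raise NotImplementedError

@lru_cache(maxsize=1024)
def cached_call(model: str, prompt: str) -> str:
    # Only runs on a cache miss; repeated identical calls are served
    # from memory, saving both latency and API cost.
    return call_llm(model, prompt)
```

A production platform would persist this cache and normalize prompts before hashing, but the principle is the same.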
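And for actions, the common pattern is a loop: the model proposes an action, the app executes it, and the result is fed back into the next prompt. Everything below is illustrative, not any specific product's API:

```python
# A minimal action loop: the model picks a tool, the app runs it, and
# the result goes back into the next prompt. All names are illustrative.

def call_llm(prompt: str) -> str:
    raise NotImplementedError

TOOLS = {
    "search": lambda arg: f"(search results for {arg!r})",   # stub tool
    "lookup": lambda arg: f"(database record for {arg!r})",  # stub tool
}

def run_agent(goal: str, max_steps: int = 5) -> str:
    history = f"Goal: {goal}"
    for _ in range(max_steps):
        # Ask the model for its next move, e.g. "search:oil prices"
        # or "final:<answer>".
        move = call_llm(history + "\nNext action (tool:arg or final:answer)?")
        kind, _, arg = move.partition(":")
        if kind == "final":
            return arg
        result = TOOLS[kind](arg)  # execute the chosen tool
        history += f"\n{kind}({arg}) -> {result}"
    return "No answer within the step budget."
```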
What products are available in the market?
We are still very early in this sector, but here are a few products in the market: