The 3 Biggest Unlocks From OpenAI Dev Day
The 60%+ reduction in price doesn't even make the cut
Today was OpenAI's first-ever Dev Day event.
Before we dig in, quick disclosures:
1. I’ve been a long-time fan of OpenAI — and am also an investor.
2. I have been tinkering with alpha versions of the APIs for a little while now.
Here’s what I’m personally most excited about.
#1 Memory/Thread Management
If you’ve done any development of a chatbot (like ChatSpot.com) using GPT, you’ve likely wanted to implement some sort of “memory” so that users can have an ongoing conversation with your bot, similar to what can be done in ChatGPT. This is what allows users to ask follow-up questions.
The problem is that implementing such memory is non-trivial. First off, all of that memory has to be squeezed into the context window, and it’s often hard to know a priori how much of the window to consume for the memory vs. the prompt output. Then, you likely need to “summarize” some of the conversation in order to compress more of the previous exchange into the context window. And if what you have in memory is actually data (like the results of a data query), life gets really hard, because there’s no good way to summarize that.
Now, with the new Assistants API, OpenAI does all of the heavy lifting for you. You just create a “thread” and add messages to that thread. All the management of memory is done for you.
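Here’s a minimal sketch of what that looks like, assuming the openai Python SDK’s beta Assistants endpoints; the assistant name, instructions, and prompt are just placeholders:

```python
import time
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Create an assistant once; it holds your instructions and model choice.
assistant = client.beta.assistants.create(
    name="Support Bot",  # illustrative
    instructions="You answer questions about our product.",
    model="gpt-4-1106-preview",
)

# One thread per user conversation; OpenAI manages the message history.
thread = client.beta.threads.create()

# Append the user's message to the thread -- no manual context stitching.
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="What plans do you offer?",
)

# Kick off a run; the assistant reads the whole thread and responds.
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
)

# Poll until the run completes, then read the newest message.
while run.status in ("queued", "in_progress"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

messages = client.beta.threads.messages.list(thread_id=thread.id)
print(messages.data[0].content[0].text.value)
```

Follow-up questions are just additional `messages.create` calls on the same thread, so the summarization/truncation bookkeeping largely goes away (within the model’s context limits).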
#2 Retrieval Support in API
One of the most common approaches to implementing LLM apps that need access to custom knowledge is what’s called Retrieval Augmented Generation (RAG). With RAG, you take the custom knowledge/data you have, create vector embeddings of it, and store them in a vector database of some sort. Then, when a user submits a query, you do a semantic search to find the most relevant documents for that query. You then pass *those* documents along to the LLM inside the context window along with the user query. This all works pretty well, but takes some effort.
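For context, here’s roughly what the do-it-yourself version looks like — a sketch using OpenAI embeddings and a brute-force in-memory search in place of a real vector database (the documents and query are made up):

```python
import numpy as np
from openai import OpenAI

client = OpenAI()

def embed(texts):
    """Get embedding vectors for a list of strings."""
    resp = client.embeddings.create(model="text-embedding-ada-002", input=texts)
    return np.array([d.embedding for d in resp.data])

# 1. Embed your custom knowledge up front and keep it alongside the text.
docs = [
    "Our Pro plan costs $50/month and includes API access.",
    "Refunds are available within 30 days of purchase.",
]
doc_vectors = embed(docs)

# 2. At query time, embed the question and find the closest document.
query = "How much is the Pro plan?"
q_vec = embed([query])[0]
scores = doc_vectors @ q_vec / (
    np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q_vec)
)
best_doc = docs[int(np.argmax(scores))]

# 3. Stuff the retrieved text into the prompt along with the user query.
answer = client.chat.completions.create(
    model="gpt-4-1106-preview",
    messages=[
        {"role": "system", "content": f"Answer using this context:\n{best_doc}"},
        {"role": "user", "content": query},
    ],
)
print(answer.choices[0].message.content)
```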
OpenAI has now made that easy with support for retrieval right in the API. When building out your bot/assistant, you can first upload your custom knowledge. GPT will then access that knowledge as needed in order to respond to user prompts. It basically takes care of the RAG for you.
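A sketch of how that looks with the built-in retrieval tool, again assuming the beta Assistants endpoints (the file name is a placeholder):

```python
from openai import OpenAI

client = OpenAI()

# Upload your custom knowledge once, flagged for use by assistants.
kb_file = client.files.create(
    file=open("product_docs.pdf", "rb"),  # placeholder file name
    purpose="assistants",
)

# Attach the file and enable retrieval; OpenAI handles the chunking,
# embedding, and semantic search behind the scenes.
assistant = client.beta.assistants.create(
    name="Docs Bot",  # illustrative
    instructions="Answer questions using the attached product docs.",
    model="gpt-4-1106-preview",
    tools=[{"type": "retrieval"}],
    file_ids=[kb_file.id],
)
```

From there, threads and runs work exactly as in the memory example above; the assistant pulls in relevant chunks of the uploaded file when it needs them.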
#3 Code Interpreter Support
With the new Assistants API you can now also enable “code interpreter” support (also known as data analysis support). This lets you leverage a Python runtime engine right inside GPT so GPT can generate and run code to do data analysis or otherwise respond to a user prompt. Very, very powerful.
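Enabling it is just another tool flag on the assistant. A sketch, with an illustrative prompt:

```python
from openai import OpenAI

client = OpenAI()

# The code_interpreter tool gives the assistant a Python runtime to execute code in.
assistant = client.beta.assistants.create(
    name="Data Analyst",  # illustrative
    instructions="Write and run Python code to answer data questions.",
    model="gpt-4-1106-preview",
    tools=[{"type": "code_interpreter"}],
)

# Threads can be seeded with an initial user message.
thread = client.beta.threads.create(
    messages=[{
        "role": "user",
        "content": "What is the standard deviation of 3, 7, 7, 19?",
    }]
)

run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id=assistant.id,
)
# The assistant writes the Python, executes it server-side, and replies
# with the result once the run completes.
```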
Those are the three things I’m most excited about as a developer right now. There’s also, of course, the new 128k-token context window (huge!), support for multi-modal input/output, and more, but the items above are the biggest unlocks for me.
Are you going to use the Assistants API in ChatSpot now?
Memory management seems like a great help for multi-agent setups. Less context to keep building up between agent actions.
Retrieval seems great for things like customer support bots that need to reference docs.
Code interpreter also seems great for multi-agent setups. I wonder how much of the work you can do without running a separate interpreter now.