After a long holiday break, we’re back with some hot takes to kick 2024 off!
2023 was a heck of a year in the generative AI and LLM space — full of great demos (some of which were even real!), ridiculously fast innovation in every area, and even some corporate drama to close the year. 2024 is likely to bring much of the same.
We wanted to share some predictions for what this year is going to look like. These predictions are based on our work and our conversations in the space, but we want to keep a healthy dose of humility. Each prediction has a rough confidence level (50%, 70%, or 90%) associated with it. Naturally, all numbers are estimates, but they capture the direction of our beliefs.
OpenAI and commercial LLMs
OpenAI’s corporate drama might slow down the most cutting-edge work around GPT-5, but the steady progress on cost-cutting is very likely to continue. It’s virtually guaranteed we’ll see a cost cut on the order of what happened in November (2-3x), and competitive pressures might force even more aggressive cuts. Nonetheless, we don’t think the quality or usage advantage that OpenAI has over existing LLMs is going anywhere.
There’s a lot to be seen with how OpenAI structures and markets the GPTs store, but it seems reasonable that a clever application will benefit from the massive distribution and inevitable marketing tailwind that OpenAI will provide.
OpenAI will not release GPT-5. 50%
GPT-4 per-token costs will come down by at least 5x in 2024. 70%
GPT-4 (or GPT-5 if released) will be at the top of the LMSys Leaderboard at the end of 2024. 90%
Amazon and Google combined will have less enterprise LLM usage than OpenAI. 70%
At least two GPTs on the OpenAI app store will generate $100K in revenue. 50%
Open-source LLMs
The open-source LLM race is not likely to slow down anytime soon, and other tech companies are likely to throw their hats in the ring, especially if they’re not a major cloud provider.
Despite the investment, we’re relatively down on the success of open-source LLM companies. We think (as we said above) that commercial LLMs will continue to dominate for general-purpose tasks, and consequently open-source LLM plays might struggle. The 30% chance we give to a large model provider closing shop or being acquired reflects this. We still believe open-source LLMs’ best bet is to be platforms for specialization.
Llama 3 will be released in 2024. 90%
At least 3 open-source LLM companies will raise funding rounds of $100MM or more. 70%
There will be a new open-source LLM release from an established technology company. (Meta + Llama 3 do not count.) 50%
No open-source LLM will be within 5% of the quality (ELO) of the top commercial model on the LMSys Leaderboard. 90%
Note: We wrote this prediction to be 10% before Mixtral (1121 ELO) was added to the LMSys leaderboard. The intention here was to show that open-source LLM quality won’t match commercial quality, but 10% was a bad estimate. As you can tell by comparing GPT-3.5 (1117 ELO) to GPT-4 Turbo (1243 ELO), 10% is a massive gap. As a result, we’ve updated our prediction to be 5%.
One open-source foundation model company that has raised at least $50MM as of 12/2023 will close shop or be acquired. 30% (1 - 70%)
No open-source LLM company will reach $20MM in revenue. 70%
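For the curious, the arithmetic behind the ELO note above can be sketched in a few lines of Python. This is only an illustration: it assumes "within X% of the quality" means the relative gap to the top model's rating, which is our reading rather than an official LMSys definition, and the helper names `relative_gap` and `min_elo_within` are made up for this example. The ratings are the ones quoted in the note.

```python
def relative_gap(top_elo: float, other_elo: float) -> float:
    """Fraction by which other_elo trails top_elo (assumed definition of 'within X%')."""
    return (top_elo - other_elo) / top_elo


def min_elo_within(top_elo: float, threshold: float) -> float:
    """Lowest rating that still sits within `threshold` of top_elo under the same assumption."""
    return top_elo * (1 - threshold)


TOP_ELO = 1243  # GPT-4 Turbo, as quoted in the note

# Ratings quoted in the note
others = {"GPT-3.5": 1117, "Mixtral": 1121}

for name, elo in others.items():
    print(f"{name}: {relative_gap(TOP_ELO, elo):.1%} below the top model")

# Rating an open-source model would need to break each version of the prediction
for threshold in (0.10, 0.05):
    print(f"within {threshold:.0%}: needs at least {min_elo_within(TOP_ELO, threshold):.0f} ELO")
```

Under this reading, GPT-3.5 trails GPT-4 Turbo by roughly 10%, and Mixtral already sat just inside the original 10% band, which is why the tighter 5% threshold is the more meaningful bar.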
Other Predictions
There will be fewer dollars invested in AI companies in 2024 than in 2023. 50%
It remains to be seen how the hype cycle plays out, but we very well might see a slowdown in dollars invested as the world waits to see how existing investments play out.
At least one US government agency will be involved in the building of a publicly available LLM. 50%
The Biden administration has shown that it’s paying attention to AI with its recent executive order. It seems unlikely the government will sit on the sidelines if it considers this to be generational technology.
Per-token fine-tuning prices from third-party services will come down by at least 5x. 90%
Per-token fine-tuning prices from third-party services will come down by at least 10x. 70%
Related to the point above about investment in the space, it seems that companies providing fine-tuning and inference of open-source models are in a price war. There are a ton of smart people working on optimizations here, so while these numbers might seem large, we believe it’s very likely that we’ll see large price cuts.
Llama 3 will have multimodal capabilities. 70%
Anthropic will release a model with multimodal capabilities. 70%
Multimodality was all the rage at the end of 2023, and even if Google’s Gemini demo wasn’t real, it showed just how powerful multimodality can be. Other model builders are likely to follow suit quickly.
Bonus: LLM-Generated LLM Predictions
While writing this post, we mistakenly asked Notion AI to finish the post. We thought we’d leave this for fun, just in case our AI overlords are reading this in 12 months — hopefully they look favorably upon us. 🙂 Jokes aside, these are somewhat random (and wild) predictions, and we certainly don’t stand by them.
At least one LLM will achieve a BLEU score of 0.8 or higher on the widely recognized WMT translation benchmark dataset, indicating human-level performance. 70%
There will be a 50% increase in the number of real-time language translation applications that adopt LLMs as their primary translation engine, compared to 2023. 90%
LLMs will generate at least one novel or screenplay that is published and positively reviewed by a renowned literary critic or film critic. 50%
We are so back!