We intentionally stayed away from commenting on DeepSeek R1 last week to let the dust settle. The AI world had a collective convulsion for a few days, and things have predictably settled down over the last week. The general consensus seems to be that DeepSeek R1 is an impressive model that performs well on benchmarks, that other model providers have some catching up to do on price-performance tradeoffs, and that the world as we know it has not, in fact, ended.
Now that we’ve all had a chance to take a deep breath, we thought it would be useful to share how we’re thinking about DeepSeek in the context of RunLLM. We don’t think it’s useful to comment on the implications for AI or society because you’ve probably seen a million of those posts on the internet in the last week — we’ll stay focused here on how R1 might affect an AI application builder’s thinking.
To start off, it’s worth establishing that we think good models are a good thing for anyone building AI applications. Putting aside any broader implications about geopolitics or open vs. closed models, a wider model selection is a huge win. Every model tends to have areas and tasks at which it excels, so more LLM options mean that we can more quickly improve the quality of our applications. For regulatory and compliance reasons, we wouldn’t be able to use a foreign-hosted model ourselves, but because DeepSeek released R1 as an open-weight model, the likes of Fireworks AI and Together AI were able to quickly spin up inference services for it — something that makes our lives dramatically easier.
From what we’ve heard, DeepSeek also set off massive alarms inside all of the major AI research labs, and competition is, of course, a good thing. OpenAI ended last week by releasing o3-mini, both as part of ChatGPT and on its API. Whether this was always the planned timeline or DeepSeek forced their hand is hard to say. It’s very likely that OpenAI felt the need to stay relevant — more in the broader zeitgeist than for those paying close attention to AI — which led them to push a new model out as soon as possible. We’ll see whether OpenAI fast-tracks the release of the full o3 model and how Deep Research affects the general perception, but it’s clear that DeepSeek changed AI labs’ expectations.
Even more interestingly, OpenAI seems to be reconsidering its commitment to closed-weight models. If nothing else, this is a big win for open-weight advocates. (We’ll also note that we were wrong about the race between open-weight models and OpenAI last year.) From our perspective, more open-weight model providers would be a big win for application builders too. Security and compliance teams are always wary of private data leaving their cloud environments, and the fact that OpenAI’s state-of-the-art models are available only on Azure (and with restrictions at that) makes it difficult to move applications like RunLLM into customer clouds. If most SOTA models were open weight, the major cloud providers would immediately offer them as hosted services, which would in turn enable us to build on top of whatever cloud our customers are using. This is by no means guaranteed, but being able to host R1 anywhere is already a win, and OpenAI following suit would be a huge coup.
Whatever your opinions about foreign model providers might be, it’s clear that DeepSeek’s strategy has had a huge impact on everyone else’s thinking. The decision to release everything into the open also means that we’ll very quickly see recreations of DeepSeek’s approach built on models like Llama, which will further alleviate concerns about where models come from — and possibly put more pressure on AI labs to release weights.
Across all of the above points, it’s clear that frontier models will continue to commodify. We’ve written about this in the past, so we’ll avoid repeating ourselves at length, but the same quantum of intelligence (however you measure it) has gotten dramatically cheaper over the last three years. It’s also clear that the race for better, faster, and cheaper AI models is just getting started. Even if gains in intelligence begin to asymptote, there’s plenty of systems innovation that will drive costs down in the coming months and years. Lower costs are good both because they reduce the cost of adoption and because they allow AI to solve new problems at a scale that wasn’t previously possible (which is what we wrote about last week).
With all that said, our opinions about the major model providers haven’t really changed all that dramatically. There was a weekend when the world thought that everything had changed, but it’s pretty clear that o1 and o3 are still stronger models than R1 (and we’ll see how o3-mini stacks up). If OpenAI and Anthropic weren’t already experimenting with the techniques that power R1, they certainly are now. We’ll either find out that this is what made o3 so good in the first place or that there is another level of intelligence and/or efficiency waiting to be unlocked. Either way, OpenAI is still the leading model provider in our view — though we reserve the right to change our minds.
January was a more exciting month in AI than we were expecting, and DeepSeek R1 was certainly not on our bingo card. That said, the general trend — more competition amongst model providers, model quality rising quickly, and adoption getting easier — is in line with what we’ve seen over the last two years.
From the perspective of an application builder, these are all boons. We should be very excited for what these changes already enable and what future iterations we’ll see. It’s only been one month, but we’re off to a crazy start in 2025 — hope you’re ready for more!