In many ways, the principles behind building a B2B AI application are very similar to the principles you would follow to build a good SaaS application. You want to build products that are low friction, integrated into customers’ workflows, and customizable. A lot of what we find ourselves doing on a daily basis is translating those principles into something that’s specific to an AI application. That said, there’s one key area where things are dramatically different in AI applications than in traditional SaaS: how you handle failure.
When most software engineers think of the word failure, they think of servers crashing or API calls returning 500s. In those cases, your goal is to design a system that returns to minimum (and eventually full) functionality as quickly as possible — by switching over to a different cloud provider or a different API provider, or by reverting to a code commit that was known to work. These are good software engineering principles that we should all still follow.
AI systems fail in different ways, though, largely due to their probabilistic nature. This isn’t necessarily new — in the 2010s, Facebook was known for having fallback mechanisms to render newsfeed posts and contact lists in < 200ms even if the ranking algorithms they were running didn’t return in time. What’s new with LLMs is that they can generate totally plausible-sounding responses that are, in fact, garbage. Colloquially, everyone calls this hallucination, but it can happen for a variety of reasons — actual model hallucination, bad search, lack of available data, etc.
Given both the complexity of problems AI systems are expected to solve and the fluent-but-garbage problem, you should assume by default that your application will fail. It will absolutely encounter problems it can’t solve, and despite your guardrails, it might even occasionally produce nonsense outputs. As an application builder, you’ll have to figure out how to handle this — how to design your application for the inevitable failure case.
To illustrate the point, take a fairly innocuous example: we asked o3 to read all the posts on the AI Frontier and suggest areas of improvement. The first suggestion it came up with was the strangest — it said that we needed to publish posts more consistently because we had a larger cluster of posts in 2024 and no new posts in 2025. It then linked to our archive, which clearly shows posts from almost every week this year. Most likely, this was caused by a failure to properly parse the metadata, but it led to a hilariously useless suggestion. We pushed it further, asking about the most recent post (it said May 2025, but that there was a six-week gap to January), what the first post in January was (suddenly it found a post only two weeks after the previous one), and so on. The more questions we asked, the more incoherent the answers got. In a silly example like this, that’s mostly just amusing, but imagine this system in front of a high-value enterprise customer — it would be almost unusable. The correct approach in this case would have been to fail a lot earlier; the system should have told us that it didn’t have enough information to provide a useful answer.
What this boils down to is that — as we’ve discussed before — LLMs have a helpfulness problem. Despite how intelligent LLMs can be, they’ll still twist themselves into knots to find a way to solve your problem, even when they have no idea how. (When we gave a separate o3 instance very specific instructions for the timing problem, it did mostly fine.) That means that as an application builder, you have to think carefully about what you do, when you do it, and when you throw up your hands and say “I don’t know.”
How you do this well will vary from application to application, but we’ve found a few critical themes that we think apply to anyone building an AI application.
Don’t be afraid to say “I don’t know.” One of the first things we tell all our customers at RunLLM is that our null hypothesis is to not answer a question. Our inference pipeline is effectively trying to prove that it should answer the question, and we only answer once we’ve reached enough confidence. The devil’s in the details, so you’ll have to think carefully about how you do that. For us at RunLLM, it’s a question of carefully analyzing the question itself, the data we have access to, and how the documentation provided relates to the question’s nuances (e.g., deployment, language, etc.). If any of those boxes isn’t adequately checked, we default to saying “I don’t know.” This is one of our customers’ favorite things about us — end users are of course disappointed not to get an answer to their question, but customers prefer the slightly more careful approach in every case.
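To make the idea concrete, here’s a minimal sketch of what confidence-gated answering can look like. This is not RunLLM’s actual pipeline; `retrieve`, `generate`, `score`, and the threshold value are hypothetical placeholders for your own retrieval, generation, and confidence-scoring steps.

```python
# A minimal sketch of confidence-gated answering: the system has to "prove"
# it should answer before it does. All names and the threshold are illustrative.

CONFIDENCE_THRESHOLD = 0.8  # hypothetical value; tune per deployment

def answer_or_abstain(question: str, retrieve, generate, score) -> str:
    docs = retrieve(question)
    if not docs:
        # No supporting data at all: abstain immediately rather than guess.
        return "I don't know: I couldn't find documentation relevant to this question."

    draft = generate(question, docs)
    confidence = score(question, docs, draft)  # e.g., an LLM-as-judge or a calibrated model
    if confidence < CONFIDENCE_THRESHOLD:
        # A draft exists but isn't well supported: abstain and say why.
        return "I don't know: the sources I found don't clearly answer this."

    return draft
```

The key design choice is that abstaining is the default path, and answering is the branch that has to be earned.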
Fail loudly but gracefully. Saying “I don’t know” is a good starting point, but it’s not enough. Whether it’s an AI SDR convincing a prospect, an AI Support Engineer answering customer questions, or an AI SRE analyzing alerts, each system is most likely going to process a volume of data that a person shouldn’t need to check by hand; if they did, that would defeat the purpose of having the AI system in place. When you hit a question you can’t answer, you need to fall back to a person in a way that makes it clear their attention is needed but that also arms them with all the detail they need to avoid duplicating your work. For RunLLM, that means giving our customers hooks to create tickets or Slack threads when we can’t answer questions, then summarizing the themes and trends across those scenarios.
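As an illustration of what “loud but graceful” can look like in code, the sketch below hands an unanswered question to a person along with the context the system already gathered. The `create_ticket` hook and the payload fields are assumptions, not a real API.

```python
def escalate_to_human(question, sources_consulted, partial_findings, create_ticket):
    """Hand off an unanswered question with enough context to avoid duplicated work.

    `create_ticket` stands in for whatever hook you expose (ticketing system,
    Slack thread, pager); the payload fields here are illustrative.
    """
    payload = {
        "question": question,
        "sources_consulted": sources_consulted,  # what the system already looked at
        "partial_findings": partial_findings,    # what it did manage to figure out
        "reason": "confidence below answer threshold",
    }
    # Loud: a human is explicitly notified. Graceful: they start from the
    # system's work instead of from scratch.
    return create_ticket(payload)
```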
Learn from your mistakes. Again, the assumption behind all of this is that your application is inevitably going to fail. That’s okay, but just like a good team member, you don’t want to make the same mistake twice. One of the best examples we’ve seen of this is Devin’s custom knowledge repository — when you give Devin feedback, it adds it to a repository of things it previously got wrong, along with the correct behavior. On a future job, if it uses any of that information, it proactively tells you what rule it applied. Not only does this help it avoid issues, it also gives the user a sense of control over, and satisfaction with, the system’s behavior. Similarly, customers will often ask RunLLM the same question again after using our instant learning feature, to verify that it learned what they expected.
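Here’s a deliberately simple sketch of the underlying pattern: store each correction alongside the right behavior, check that store before future work, and surface any stored rule to the user when it’s applied. The file-based store and keyword matching below are stand-ins for whatever persistence and retrieval you actually use, not a description of Devin’s or RunLLM’s implementation.

```python
import json
from pathlib import Path

LESSONS_PATH = Path("lessons.jsonl")  # hypothetical on-disk store of corrections

def record_correction(mistake: str, correct_behavior: str) -> None:
    """Persist a piece of user feedback so the same mistake isn't repeated."""
    with LESSONS_PATH.open("a") as f:
        f.write(json.dumps({"mistake": mistake, "rule": correct_behavior}) + "\n")

def applicable_rules(task_description: str) -> list[dict]:
    """Return past corrections that look relevant to a new task.

    Naive keyword overlap for brevity; a real system would use embeddings or a
    retriever. Any rule returned here should be mentioned to the user
    ("applying your earlier feedback about X") when it's used.
    """
    if not LESSONS_PATH.exists():
        return []
    with LESSONS_PATH.open() as f:
        rules = [json.loads(line) for line in f]
    words = set(task_description.lower().split())
    return [r for r in rules if words & set(r["mistake"].lower().split())]
```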
While system-level failures are still something we should work to minimize, intelligence failures are absolutely inevitable, just as they are with people. That’s okay — you should accept it, and you certainly shouldn’t hide anything from your customers. The more open you are about this, the more they will trust you.
What matters is how you handle those failure cases — if you act like a diligent teammate who’s eager to get feedback, careful about improving their performance, and willing to demonstrate that improvement, you’ll be in a great place. On the other hand, if you insist on keeping your behavior constant and expect your customers to adapt, you’re going to leave a bad impression.