We’re launching RunLLM v2 next week! We’ll post about it here on Tuesday, and we’d appreciate your help getting the word out. You can check out our pre-launch post on ProductHunt here.
Since the end of last year, we’ve been thinking a lot about how to build the right UX for AI applications. The difference is palpable between product experiences built with AI from the ground up (think Cursor) and products trying to bolt AI on for the sake of it (we won’t name names, but you can probably think of plenty!). That’s of course been a critical part of our work at RunLLM as well, as we’ve evolved from being “just a chatbot” into an AI Support Engineer.
When we originally started talking about the idea of a new UX for AI, we were partially fighting the last war: With the web and mobile platform shifts, the new UX came from the fact that you had a different kind of real estate available to you, and you had to think about how best to present or hide information in that context. Thinking about the UX for AI is different. The unlock here — at least in a B2B context — isn’t that you can access your CRM anywhere via the internet or that you could read email from the subway. Instead, the unlock is that LLMs can now do work for you automatically that you didn’t have the cycles to do yourself.
With that framing, we realized that the right way to think about the UX for an AI application is not by coming up with some clever new information presentation technique. Instead, we have two main principles that we think everyone should be following.
Be Present: AI applications should integrate deeply into the tools and workflows that you already use.
Build Trust: AI applications should enable their users to trust them by maximizing visibility into the work being done and control over results.
Neither of these requires fancy new UI components; rather, both require being thoughtful about what integrations you build and what information you present.
Working where you work
We often say to customers (a little tongue-in-cheek) that if you hired a new support engineer on your team and that person told you that they refuse to use Zendesk, you would be a little confused. This is a silly joke, but it illustrates what we think is a useful point: In the SaaS era, joining a new team means learning what tools they use and how those tools integrate with each other. If you’re not able to follow the workflows and best practices that your new team follows, you’re going to create chaos by handing off the wrong conversations or asking for help in the wrong places. Integrations have always been a critical part of enterprise software, but for an AI product that purports to do work on your behalf, the requirements for integrations are different.
Integrations in the era of AI are more than connecting to other services and tools and reading data. Instead, the goal of a well-built integration is to use the target service in the same way your customer does. For example, RunLLM’s integration with ticketing systems isn’t just about reading past tickets — instead, we can triage & tag tickets, draft responses, and even communicate directly with users.
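To make the distinction concrete, here’s a minimal sketch of the difference between an integration that only reads tickets and one that acts on them. The `Ticket` shape, the `classify` step, and the `draft_reply` step are all hypothetical stand-ins (real ticketing APIs like Zendesk’s differ, and in practice the classify/draft steps would be LLM calls):

```python
from dataclasses import dataclass, field

# Hypothetical ticket shape -- real ticketing APIs differ.
@dataclass
class Ticket:
    id: str
    subject: str
    body: str
    tags: list = field(default_factory=list)

def triage(ticket: Ticket, classify, draft_reply) -> dict:
    """Act on a ticket the way a support engineer would:
    tag it, categorize it, and draft a response -- not just read it."""
    category = classify(ticket.subject + "\n" + ticket.body)  # e.g. an LLM call
    ticket.tags.append(category)
    return {
        "ticket_id": ticket.id,
        "tags": ticket.tags,
        "draft": draft_reply(ticket, category),
    }

# Stub "models" so the sketch runs end to end.
result = triage(
    Ticket(id="T-1", subject="Login fails", body="SSO redirect loops"),
    classify=lambda text: "auth",
    draft_reply=lambda t, c: f"Thanks for the report on {t.subject!r} (filed under {c}).",
)
print(result["tags"])  # ['auth']
```

The point of the sketch is the write path: a read-only integration would stop after fetching the ticket, while this one changes state in the target system and produces work product for a human to review.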
A good AI product has to work where you work, so that it can both learn from the expertise that your team already has and communicate with you in the way that you expect. The most obvious manifestation of this is that basically every AI product has now built a Slack app as a first line of engagement. In the context of a product like RunLLM, Slack has become the de facto standard for enterprise support, so we’re often deployed as the first line of defense in customer Slack channels. For Cursor, being able to pick up from a conversation where a team is discussing a product feature and respond with a code change makes for an extremely appealing demo.
In startups, it’s well understood that the best-quality product will lose out to the product that has the best distribution. The same thing is true here. You can have the fanciest AI in the world, but if it doesn’t have the right context or isn’t present where and when users need it, it will be far less sticky than something that’s easier to use.
Tagging an agent from Slack is only the first cut of meeting customers where they are. We’ve already seen plenty of sales agents on the market assemble and proactively share prep materials for a customer call. Similarly, RunLLM integrates across everything in your support stack — Slack, ticketing system, task trackers, etc. As these products mature, we’ll likely see that the ability to move and manage data across systems will only get better.
Actively fostering trust
Earlier this year, we wrote about how to build applications that customers will trust, and we settled on the idea of visibility and control as the two main things that customers would look for. We’ve only become more convinced about the value of those as first-class citizens in every AI product. LLMs by their nature are inscrutable. When you put the same prompt into ChatGPT multiple times, you very well might get markedly different answers. Leaving an AI system as a black box makes customers nervous; even if it solves the problem correctly most of the time, how are they going to make sure it doesn’t make the same mistake twice when something goes wrong? (And something will inevitably go wrong!)
We won’t repeat everything from the previous post here, but here’s a quick recap:
Visibility: When an AI system is doing work on your behalf, you’re going to want to know what it did for you and why it chose to do things in a certain way. That means that you’re going to need to make it very clear to the user what problems were solved and what internal decisions were made. (Hopefully you’re not still relying on a single LLM call!)
Control: For those cases where things do go wrong, users should have the ability to give the system feedback — not just on the ultimate output but on the incremental work along the way — and feel confident that the system will follow their instructions. This is the “teach a person to fish” approach to improving AI systems.
How do you build a UX around these principles? This is an area where good information presentation is necessary. Cursor does a good job of showing you what it’s reading and what it’s thinking at each step and giving you the ability to accept or reject changes. On the other hand, while Cursor Rules are very useful, they don’t make it particularly easy for you to give feedback to the system in-line.
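One way to picture the visibility-and-control pairing described above is a step trace where each action the system takes is recorded with its rationale, and the user explicitly accepts or rejects it before the result is applied. This is a minimal sketch, not anyone’s actual implementation; the `Step` record and `run_step` helper are hypothetical:

```python
from dataclasses import dataclass

# Visibility: every step records what was done and why.
# Control: every step waits on an explicit user decision.
@dataclass
class Step:
    description: str        # what the system did and why (visibility)
    output: str
    accepted: bool = False  # user decision (control)

trace: list[Step] = []

def run_step(description: str, fn) -> Step:
    """Run one unit of work and log it so the user can inspect it."""
    step = Step(description=description, output=fn())
    trace.append(step)
    return step

step = run_step("Searched past tickets for 'SSO redirect loop'", lambda: "3 matches")
step.accepted = True  # user approves; a rejection would route feedback back to the system
```

The Cursor accept/reject flow for diffs follows this shape: the unit of feedback is the individual step, not just the final answer.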
As we discussed in the previous post on the topic, we felt that RunLLM fell short on both visibility and control. In our product launch next week, we’re rolling out new features that show RunLLM’s fine-grained reasoning at each step, give you the ability to experiment with different answers, and control agent behavior at a much finer granularity. And while we can learn from answers that we’ve gotten wrong, we still have work to do when it comes to giving you control over fine-grained reasoning steps.
As has always been true of good UX, there’s no magic bullet that’s going to solve your problems. Most of the top AI application companies that we talk to are becoming integration machines (reach out if you want to use RunLLM as a shortcut for building integrations 👀), and that’s because meeting your customer where they live is where you can add the most value.
To beat a dead horse, buying an AI application is like hiring a person on your team. It needs to use the tools that your team uses, it needs to make its work visible, and it needs to be receptive to feedback. You’d be displeased with a teammate falling short on any of those things, and the same holds true for an AI application. If you can excel at those things, then your new teammates are going to love working with you.