AI Artists vs. AI Engineers
Emerging archetypes for complex applications
As you all probably know, we’re big believers in the capabilities of LLMs. We’ve said more than once that if you paused all improvements in models — whether for regulatory reasons or because of technological limitations — we think there’s 5-10 years of innovation left to be done simply by taking the existing technology and applying it to the many use cases that we haven’t yet thought of. If you’re building an AI application startup today, you very likely agree with us (though you would probably debate the number of years). However, even amongst people who are excited about AI, we’ve started to see two different approaches to the way more complex AI applications are built. We’re calling these archetypes the AI Artist (give AI systems full creative control) and the AI Engineer (operate under constraints to optimize for quality).
Before we dive into some preliminary definitions of these phrases, let’s start with a little context for why this framework has popped up in our thinking recently. As you all likely know, we’re focused on building specific AI-powered applications for enterprises at RunLLM. We’ve historically been focused on AI Support Engineering and more recently have gotten pulled into AI SRE (which we’re very excited about). Regardless of what specific job function you’re thinking about, we’re big believers in the idea of modern AI applications fulfilling specific job functions. It’s one thing to build a Q&A bot for HR questions, but when you get deeper into tasks like SRE, sales development, or security analysis, you find that every team has its own way of working. Your product will add the most value when it goes beyond being a tool and slots seamlessly into your team, and the archetypes above are the two approaches we’ve seen emerge for handling that depth of integration.
It’s in the context of these highly complex use cases that we’re starting to see the emerging gap between AI Artists and AI Engineers. We, of course, have opinions on the right approach, but let’s start by defining the terms.
Artists vs. Engineers
AI Artists. The idea behind AI Artists is that LLMs are very good at processing free-form data, and if you give them enough context, specific instructions, and the freedom to reason, they’ll be able to solve most problems. AI Artist agents give an LLM-powered planner full freedom to make each decision, and as a result, these systems can generalize quite well to problems they have never seen before — a powerful concept if you want to pitch your customers on a coworker who’s going to get better over time. Of course, when you have a sharp knife, you can cut yourself with it very easily too. The drawback of AI Artist agents is that they rely very heavily on the context they’re provided — if that context is lacking in some way, they can very easily take the wrong action, mislead the user, or make some other mistake. The other pitfall is that these agents are entirely dependent on the decisions of the planner, and if that model makes a bad choice, the agent can go off the rails (though this should become less of a concern as models improve).
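To make the Artist pattern concrete, here’s a minimal sketch in Python. Everything here is hypothetical scaffolding: `call_llm` stands in for whatever model API you use, and the tools are dummy lambdas. What it shows is the core loop, where the planner sees the full transcript and freely chooses the next action on every turn.

```python
import json
from typing import Callable

# Hypothetical stand-ins: wire call_llm to a real model API and replace the
# dummy lambdas with real tool implementations.
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM provider here")

TOOLS: dict[str, Callable[..., str]] = {
    "search_logs": lambda query: f"(log lines matching {query!r})",
    "run_query": lambda sql: f"(rows returned for {sql!r})",
}

def artist_agent(task: str, max_turns: int = 20) -> str:
    """Artist-style loop: the planner freely picks the next action each turn."""
    transcript = [f"Task: {task}"]
    for _ in range(max_turns):
        # The model sees everything so far and may choose ANY tool, or finish.
        decision = json.loads(call_llm(
            "Given the transcript below, return the next action as JSON, "
            'either {"tool": str, "args": dict} or {"final_answer": str}.\n'
            + "\n".join(transcript)
        ))
        if "final_answer" in decision:
            return decision["final_answer"]
        result = TOOLS[decision["tool"]](**decision["args"])
        transcript.append(f'{decision["tool"]} -> {result}')
    return "Stopped: turn budget exhausted."
```

Notice that nothing outside the prompt constrains what the planner does; that is exactly where both the generality and the risk come from.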
AI Engineers. AI Engineer agents take a different approach. These systems try to learn as much as they can from how companies already do things. The idea isn’t that LLMs aren’t important — they’re of course key enabling technologies — but that you don’t want them to run amok on all the data that’s available. While the product specifics will vary by job function, AI Engineer systems map a particular workflow into a well-defined plan that the agent can execute.¹ In this context, having precisely generated plans that are then faithfully executed is a very common approach that we’re seeing. These plans might be generated as code or simply as sequences of steps that are then translated into tool calls. This approach means that much stronger guardrails can be enforced (sometimes even via formal analysis) on the actions the agent takes. Of course, the drawback of these systems is that by biasing them towards existing processes/plans/procedures, you’re potentially restricting how well they generalize beyond them.
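And here’s the Engineer counterpart, under the same hypothetical scaffolding as the sketch above: the plan is generated once up front, checked against an allowlist of tools before anything runs, and then executed step by step without re-planning.

```python
import json
from typing import Callable

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM provider here")

# The guardrail lives in code, not in the prompt: only these tools can run.
ALLOWED_TOOLS: dict[str, Callable[..., str]] = {
    "fetch_runbook": lambda name: f"(steps from runbook {name!r})",
    "check_metrics": lambda service: f"(dashboard snapshot for {service!r})",
}

def engineer_agent(task: str) -> list[str]:
    """Engineer-style: plan once, validate the plan, then execute faithfully."""
    plan = json.loads(call_llm(
        "Map this task onto our known workflow. Return a JSON list of steps, "
        'each of the form {"tool": str, "args": dict}. Task: ' + task
    ))
    # Validate the entire plan before any step runs; a richer version of this
    # check is where the formal analysis mentioned above would live.
    for step in plan:
        if step["tool"] not in ALLOWED_TOOLS:
            raise ValueError(f"plan uses a disallowed tool: {step['tool']}")
    # Faithful execution: no re-planning mid-flight.
    return [ALLOWED_TOOLS[step["tool"]](**step["args"]) for step in plan]
```

The tradeoff is visible in the code: a bad plan gets caught before it does anything, but a task that doesn’t map onto the known workflow simply fails.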
Case Study. To illustrate this point, let’s consider a difference in approaches that we’ve discussed in the past on the blog — Cursor vs. Devin. We’ll note that it’s been about 6 months since we’ve played with Devin, so some of our references might be out of date, but the analogy still holds. Devin takes an AI Artist approach to AI agents for software engineering. The first version of the product gave the agent free rein to generate a plan, execute it using its best judgment, and update the plan based on the results that it saw. That approach — when it works — means that Devin can theoretically solve a number of interesting problems.
However, as we shared in our blog post earlier this year, that level of freedom and flexibility didn’t always work as well as you’d have hoped, especially when the planner didn’t understand instructions. (There’s an interesting recent research paper showing that scripted SWE agents outperform dynamic ones.) Cursor, on the other hand, took a much narrower approach to start, focusing on doing very specific tasks that a user requested and asking for user input and engagement in fine-grained increments. Even today, we tell every engineer at RunLLM that the best way to get value out of Cursor is to give it narrowly scoped steps and incrementally ask it to solve a problem — the broader the scope, the more variance you get. More recently, Cursor has started generating a set of steps for each task and showing you how it executes each step incrementally.² The plan is then followed much more closely, which represents an AI Engineer approach.
Takeaways
As with any framework like this, you’ll very rarely find a product or team that is purely AI Artist or purely AI Engineer. Just like in human society, there’s value in both approaches. The distinction between the two is a spectrum, and most won’t fall at the ends but will sit closer to one side or the other. As you can see in the case study above, Devin isn’t letting its planner formulate each new step without rhyme or reason — the planner tries to stick to its plan unless it sees some new, unexpected information — so it’s more of an Artist than Cursor but not purely an Artist.
The other thing worth noting about this framework is that you might not be able to neatly put a whole product in one camp or the other. For example, in the context of building an AI SRE, we’ve found that it’s incredibly valuable to learn from our customers’ existing runbooks and processes — a very AI Engineer approach. However, sometimes customers want to explore and ask arbitrary, potentially complex questions about their infrastructure, which requires more of an Artist implementation. As a result, we’ve ended up with aspects of both.
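One way to picture that hybrid, reusing the hypothetical `artist_agent` and `engineer_agent` sketches from above: route requests that match a known runbook to the structured executor, and fall back to the free-form loop for everything else. The matching function here is a deliberately naive stand-in.

```python
def match_runbook(task: str, known_runbooks: set[str]) -> str | None:
    """Naive keyword match; a real system would use retrieval or a classifier."""
    for name in known_runbooks:
        if name.lower() in task.lower():
            return name
    return None

def hybrid_agent(task: str, known_runbooks: set[str]) -> str | list[str]:
    # Engineer path when the task maps onto a known workflow; Artist otherwise.
    runbook = match_runbook(task, known_runbooks)
    if runbook is not None:
        return engineer_agent(f"Follow runbook {runbook!r} for: {task}")
    return artist_agent(task)
```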
So, where do we fall on the spectrum? Generally — as you can probably tell from everything we’ve written — we tend towards Engineering over Artistry. We think that LLMs are very powerful tools, but in the absence of the ability to learn in the fine-grained way that a human does, you need stricter guardrails than you would ideally want from a truly general-purpose system. The tradeoff we’ve consistently made is that we’d rather build trust with our customers by doing well-scoped, expected things than take the risk that we could do something really cool… or fail and do something really silly.
That said, we don’t think there’s one right answer, either. You need to be pushing the bounds of what’s possible in order to understand where you can loosen your constraints. As LLMs get better, our general expectation is that we’ll be able to build systems that are more and more Artist-oriented — systems that are given more and more leeway to take actions based on their own best judgment rather than on the narrow constraints they’re placed under. ChatGPT’s search-powered research feature is a good example of an agent that can be given a highly free-form task and execute accordingly (though the scope of actions it can take is quite limited).
The final thing that’s worth noting is the point we referred to above about fine-grained learning. We don’t have strong preferences about how this should be done — it could be a Dwarkesh-style continual learning approach, or it could simply be better and better context management and memory. We’re slightly biased towards investing in the latter, as it’s something we’ve already implemented in RunLLM, and we see a variety of opportunities to keep getting better at learning from our mistakes by writing the right things down.
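To make the “writing the right things down” idea concrete, here’s a hypothetical sketch (not how RunLLM actually implements memory): append a lesson to a persistent store whenever the agent gets something wrong, and retrieve the most relevant lessons to prepend to future prompts.

```python
import json
from pathlib import Path

NOTES_PATH = Path("agent_notes.jsonl")  # hypothetical persistent store

def record_lesson(situation: str, lesson: str) -> None:
    """Write down what went wrong so future runs can avoid repeating it."""
    with NOTES_PATH.open("a") as f:
        f.write(json.dumps({"situation": situation, "lesson": lesson}) + "\n")

def relevant_lessons(task: str, limit: int = 5) -> list[str]:
    """Crude keyword-overlap retrieval; a real system might use embeddings."""
    if not NOTES_PATH.exists():
        return []
    notes = [json.loads(line) for line in NOTES_PATH.read_text().splitlines()]

    def overlap(note: dict) -> int:
        return len(set(note["situation"].lower().split())
                   & set(task.lower().split()))

    return [n["lesson"] for n in sorted(notes, key=overlap, reverse=True)[:limit]]
```

Even a mechanism this simple lets a system visibly improve across sessions, which is at least part of what the “coworker who gets better over time” pitch promises.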
Independent of where you fall on this spectrum, it’s probably worth asking yourself how you could learn from the other side. As default-Engineers, we’ve pushed ourselves in some of our latest product experiments to loosen some of our constraints, and we’ve found — to our surprise — that we can accomplish a lot more than we initially thought. If you’re an Artist, ask yourself where adding strong constraints might benefit your product. At the end of the day, we’re fundamentally just fans of moderation.
¹ Of course, there’s not going to be a workflow or process for every scenario, so your agent will still need to be able to plan dynamically. Nonetheless, having a strong starting point helps reduce variance in behavior.
² You might point out that Cursor Agent is actually a more Artist approach than the first version of Cursor was — we completely agree, but the in-IDE UX encourages a more fine-grained and interactive mode of operation compared to Devin, which is Slack-first. As with everything, Artists vs. Engineers is a spectrum rather than a binary.