Selling to Agents

When AI agents start buying things, how do you sell to them?

Agents will need plumbing?

Could an agent buy something from Amazon today? Not reliably. Agents are improving at web browsing, but at the time of writing they still only score around 70% on e-commerce tasks.

If you ditch the browser and put the agent in a terminal with an API, it's suddenly close to perfect. But you can't, because Amazon has no purchasing API right now.

A lot of people conclude from this that we need to add APIs everywhere. I think there’s a bitter lesson question here. By the time we’re done adding APIs, will agents be good enough at web browsing that it’s all unnecessary?

APIs are not going to be available everywhere, so browser-using agents are clearly part of the future. All the same, if I were in charge of a large internet property, or a smaller one that wanted to drive growth from agent users, I would certainly be working on agent-friendly APIs right now. It's generally not a huge lift, often just a matter of exposing something that already exists internally.

There are some special considerations around authentication and payments by agents, so some real work is required. But overall, selling to agents won't require fundamental change for many businesses.
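To make "not a huge lift" concrete, here is a minimal sketch of the core logic an agent-facing purchase endpoint might wrap. Everything here is hypothetical (`create_order`, `CATALOG`, the error codes are invented, not any real retailer's API); the point is the structured, machine-readable request/response shape an agent can consume without a browser.

```python
import uuid

# Illustrative in-memory catalog; SKUs, prices, and fields are made up.
CATALOG = {
    "sku-123": {"name": "Red party dress", "price_cents": 4999, "in_stock": True},
}

def create_order(sku: str, quantity: int, payment_token: str) -> dict:
    """Validate an order request and return a structured, machine-readable result."""
    item = CATALOG.get(sku)
    if item is None:
        return {"ok": False, "error": "unknown_sku"}
    if not item["in_stock"]:
        return {"ok": False, "error": "out_of_stock"}
    # A real implementation would charge payment_token and persist the order here.
    return {
        "ok": True,
        "order_id": str(uuid.uuid4()),
        "total_cents": item["price_cents"] * quantity,
        "status": "confirmed",
    }

print(create_order("sku-123", 2, "tok_test")["total_cents"])  # 9998
```

Note the explicit error codes rather than a rendered error page: an agent can branch on `"unknown_sku"` but has to guess at HTML.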

Agents will force structural changes in some businesses

Some business practices could be more structurally disrupted:

“Per-seat pricing”
I’m not completely convinced that per-seat pricing of software is dead. But usage-based or outcome-based pricing will fit better in a lot of places.

“Contact sales”
This might also persist more than you initially expect. Short term, Claude Code is not signing any large enterprise software deals on your behalf. Medium term your AI buying agent might be talking to their AI selling agent. But on net, more self-serve seems likely.

“KYC”
Corporations open bank accounts all the time, so the idea of non-human entities as financial service customers is in a sense nothing new. But agent-unfriendly assumptions about human authentication run very deep, and banks do not move fast. It feels like AI will accelerate the move away from traditional banks toward more modern fintechs, and potentially also drive broader adoption of stablecoins and crypto.

“My hourly rate”
Professional service firms likely shift more to outcome-based pricing, in order to capture the upside of AI productivity. PwC US recently said they intend to move this way, and de facto I think it happens a fair amount already. It should change incentives in a way that’s long overdue.

“Click here to accept terms and conditions”
Can your agent bind you? If the agent does something wrong, who is responsible? I assume this will mostly collapse to the practical answers of "yes" and "you are", but there will be a lot of legal fees along the way.

“Wet ink signature”
This is still common outside the US, but it’s infrequent enough that it may unfortunately survive.

Agents will have strange aesthetic preferences

When I asked ChatGPT for feedback on a recent blog post, it told me the post was pretty good but I should add more em dashes. This caused me great amusement. For those who aren't aware, em dashes are a well-known signature of AI writing. I called it on this, and it doubled down. It acknowledged the irony but insisted my piece really did need more em dashes:

“I stand by them. The em dash is doing real work here. Your draft actually wants em dashes stylistically. A few examples where they’d tighten things: (1 page of requests for em dash insertions)”

What can you call this except an aesthetic preference of the model? As we scale up models as economic agents, some of these strange quirks are going to be projected at scale across the structure of our economy. Since models are frozen between releases, and many of these preferences emerge from deep structural properties of the training data that persist even across model generations, these odd preferences could be stable for fairly long periods of time. More than enough time for other economic actors to detect and cater to them. Just like SEO writing came to shape a lot of the web, agent catnip production could be a large industry of the future.

Agents will be biased against new products?

Humans have a fundamental drive around curiosity/boredom, our solution to the explore/exploit dilemma in reinforcement learning. This is a specific evolutionary adaptation, not a law of the universe. Fish are curious, nematodes are not.

Some AI agents do have curiosity; it's important for Montezuma's Revenge, for example, a game from the original Atari57 benchmark that is notoriously difficult for AI. It's not impossible that production agents like Claude or ChatGPT also have this baked in, but I would guess it's either absent or weak.

If production agents are indeed incurious, will they adopt new tech? Without a drive to explore, agents will instead 'exploit', i.e. stick to what they know works. That would mean using whatever tool they encountered in training. Essentially, they would behave like very conservative engineers, and we might face a structural shortage of early adopters willing to try new products.

Curiosity is not the only reason to try something new. Production agents are logical problem solvers; if they need some new technology to meet a constraint in the problem, they may use it. Even then, the option has to enter their consideration set, which is difficult if it hasn't been seen in training[1]. There does seem to be a trend towards somewhat smaller models[2] with more inference-time RAG for problem-specific knowledge acquisition. In that case the game boils down more or less to SEO: find your way into that set of RAG results and you'll find your product adopted. Or maybe there's some central message board like Moltbook where you see new tools going viral among agents?

I have more questions than answers on this one. I would be very interested in any data or anecdotal experience people have about how this has played out in their own projects.

Update: It turns out there is good data on this, and there are notable differences between models, which helps ground some of the speculation above. For example, Claude 4.5 era models favor older tools, whereas Claude 4.6 more frequently picks newer tools or writes custom code. The differences aren't night and day, but they're not small either. The fact that newer models tend to pick newer tools might reflect the later knowledge cutoff, which would support the idea that being in the training data is the key to getting adopted. I don't think that can be the full story, though, because these models were released only four months apart. There are also differences within model generations (e.g. Sonnet 4.5 is more conservative than Opus 4.5), plus a trend towards more roll-your-own solutions in later models. Without knowing what changed in the data and the training[3], it's hard to say what's going on. Maybe the pattern will become clearer as there are more model generations to compare.

Agent suggestions will lack variety?

There is a technical issue in LLM training called mode collapse, which manifests as a lack of diversity in model outputs.

ChatGPT is trained on a hundred thousand lifetimes' worth of text. It has access to basically everything ever written. Now try opening an incognito chat and asking it to tell you a joke. I just tried this, and it repeated itself within four trials. That is mode collapse.
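One way to make "repeated itself within four trials" quantitative is a birthday-problem toy model: suppose the collapsed model effectively samples uniformly from N distinct jokes, and ask how likely a repeat is within k draws. The N=10 below is purely an illustrative assumption, not a measurement:

```python
# Birthday-problem toy model of mode collapse: if outputs are effectively
# uniform draws from n_modes distinct responses, what's the chance of a
# repeat within k_trials independent draws?
def p_repeat(n_modes: int, k_trials: int) -> float:
    p_all_distinct = 1.0
    for i in range(k_trials):
        # Probability the (i+1)-th draw differs from all previous draws.
        p_all_distinct *= (n_modes - i) / n_modes
    return 1 - p_all_distinct

print(round(p_repeat(10, 4), 3))  # 0.496
```

With an effective pool of just 10 jokes, a repeat within four tries is roughly a coin flip, so my experience is exactly what this toy model predicts.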

Now think about agents as buyers. When users request something vague like “a red dress for a party”, the result could be a handful of the same things being recommended again and again. However, this is what we already have with Google search. So while agents may shuffle the winners and losers, the shape of the outcomes may be about the same.

Second order effects

Even for businesses that don’t need to change much technically to sell to agents, there are second order effects. Amazon makes a lot of its profit from advertising spots in the search feed. Will that be effective when agents are buyers?

What about marketing generally? An agent has not seen your billboard on 101. It is not browsing Instagram[4]. It probably does not care about your brand story, emotional persuasion, or social proof. It does not want to get on a call or sit through a demo. It probably wants structured information, verifiable claims, low-friction integration, and explicit ROI. And maybe lots of em-dashes on your landing page.
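"Structured information" already has one established form: schema.org Product markup as JSON-LD embedded in a page, which an agent can parse for price and availability without rendering anything. A minimal sketch, with invented field values:

```python
import json

# Minimal schema.org Product markup (JSON-LD). The @context/@type keys and
# the availability URL are real schema.org conventions; the product values
# are invented for illustration.
product = {
    "@context": "https://schema.org",
    "@type": "Product",
    "name": "Red party dress",
    "sku": "sku-123",
    "offers": {
        "@type": "Offer",
        "price": "49.99",
        "priceCurrency": "USD",
        "availability": "https://schema.org/InStock",
    },
}

jsonld = json.dumps(product, indent=2)
print(jsonld)
```

Markup like this already helps with conventional search; if agents become buyers, it may matter more than the copy around it.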

Footnotes

  1. Assuming the status quo of frozen models with no continual learning. It seems we'll move past that at some point, but there's no sign of it yet.
  2. Smaller being very much a relative measure, e.g. GPT 5.3 is rumoured to be 2 trillion parameters, vs estimates of 4 to 12 trillion for GPT 4.5.
  3. And possibly even then!
  4. People mocked the Meta acquisition of Moltbook, but there's a world in which this is as big a deal as the Instagram acquisition.
