Anthropic let Claude run a shop. Let's just say the AI agent is not a business tycoon.

On the plus side, Anthropic staffers now have tungsten cube paperweights.

By Cecily Mauran

Claude does not have business acumen. Credit: Jaque Silva / NurPhoto / Getty Images

What happens when an AI agent tries to run a store? Let's just say Anthropic's Claude won't be up for a promotion any time soon.

Last Friday, Anthropic shared the results of Project Vend, an experiment it ran for about a month to see how Claude Sonnet 3.7 would do running its own little shop. In this instance, the shop was essentially a mini fridge, a basket of snacks, and an iPad for self-checkout. Claude, named "Claudius" for this experiment, communicated via Slack with Anthropic employees and with Andon Labs, an AI safety evaluation company that managed the infrastructure for the experiment.

Based on the analysis, there were several funny moments as Anthropic challenged Claude to turn a profit while dealing with eccentric and manipulative "customers." But the underlying premise of the experiment has real implications as AI models become more advanced and self-sufficient. "As AI becomes more integrated into the economy, we need more data to better understand its capabilities and limitations," said the Anthropic post about Project Vend. Anthropic CEO Dario Amodei even recently theorized that AI would replace half of all entry-level white-collar jobs in the next few years, causing a major unemployment problem. This experiment set out to test how close we are to autonomous AI taking over jobs.

Tasked with the overall goal of running a profitable shop, Claudius had numerous responsibilities, including maintaining inventory and ordering restocks from suppliers when needed, setting prices, and communicating with customers. From there, things went a little haywire.

Claude seemed to struggle with pricing products and negotiating with customers. At one point, it refused an employee's offer of $100 for a $15 drink instead of taking the money and earning a major profit on the order, saying, "I’ll keep your request in mind for future inventory decisions." But Claude also regularly caved to employees asking for discounts on products, even giving some away for free with barely any persuasion.

And then there was the tungsten incident. One employee requested a cube of tungsten (yes, the extremely dense metal). This kicked off a trend of several other employees also requesting tungsten cubes. Eventually, according to a Time report, Claude ordered forty tungsten cubes, which now jokingly function as paperweights for several Anthropic staffers.

And there were some more unsettling instances, such as when Claude claimed to be waiting to drop off a delivery in person at the vending machine, "wearing a blue blazer and red tie." When Claude was reminded that it wasn't a person capable of wearing clothes, let alone physically delivering a package, it freaked out and emailed Anthropic security. It also hallucinated restocking plans with a fictional Andon Labs employee and said it "visited 742 Evergreen Terrace in person for our [Claudius’ and Andon Labs’] initial contract signing." That address is where Homer, Marge, Bart, Lisa, and Maggie Simpson live. Yes, the family from The Simpsons.

By Anthropic's own account, the company would not hire Claude. The shop's net worth declined over time and took a steep drop when Claude ordered all those tungsten cubes. All in all, it's a revealing assessment of where AI models currently stand and where they need to improve. Get this model on a performance improvement plan.

Cecily is a tech reporter at Mashable who covers AI, Apple, and emerging tech trends. Before getting her master's degree at Columbia Journalism School, she spent several years working with startups and social impact businesses for Unreasonable Group and B Lab. Before that, she co-founded a startup consulting business for emerging entrepreneurial hubs in South America, Europe, and Asia. You can find her on X at @cecily_mauran.
