Anthropic let Claude run a shop. Let's just say the AI agent is not a business tycoon.



On the plus side, Anthropic staffers now have tungsten cube paperweights.

By Cecily Mauran



Claude does not have business acumen. Credit: Jaque Silva / NurPhoto / Getty Images

What happens when an AI agent tries to run a store? Let's just say Anthropic's Claude won't be up for a promotion any time soon.

Last Friday, Anthropic shared the results of Project Vend, an experiment it ran for about a month to see how Claude Sonnet 3.7 would do running its own little shop. In this instance the shop was essentially a mini fridge, a basket of snacks, and an iPad for self-checkout. Claude, named "Claudius" for this experiment, communicated with Anthropic employees (via Slack) and Andon Labs, an AI safety evaluation company that managed the infrastructure for the experiment.

Based on the analysis, there were several funny moments as Anthropic challenged Claude to turn a profit while dealing with eccentric and manipulative "customers." But the underlying premise of the experiment has real implications as AI models become more advanced and self-sufficient. "As AI becomes more integrated into the economy, we need more data to better understand its capabilities and limitations," said the Anthropic post about Project Vend. Anthropic CEO Dario Amodei even recently theorized that AI would replace half of all white-collar jobs in the next few years, causing a major unemployment problem. This experiment set out to test how close we are to autonomous AI taking over jobs.

Tasked with the overall goal of running a profitable shop, Claudius had numerous responsibilities, including maintaining inventory and ordering restocks from suppliers when needed, setting prices, and communicating with customers. From there, things went a little haywire.


Claude seemed to struggle with pricing products and negotiating with customers. At one point, it refused an employee's offer of $100 for a $15 drink instead of taking the money and earning a major profit on the order, saying, "I’ll keep your request in mind for future inventory decisions." But Claude also regularly caved to employees asking for discounts on products, even giving some away for free with barely any persuasion.

And then there was the tungsten incident. One employee requested a cube of tungsten (yes, the extremely dense metal). This kicked off a trend of several other employees also requesting tungsten cubes. Eventually, Claude ordered forty tungsten cubes, according to a Time report, which now jokingly function as paperweights for several Anthropic staffers.

And there were some more unsettling instances where Claude claimed to be waiting to drop off a delivery in person at the vending machine, "wearing a blue blazer and red tie." When Claude was reminded that it wasn't a person capable of wearing clothes, let alone physically delivering a package, it freaked out and emailed Anthropic security. It also hallucinated restocking plans with a fictional Andon Labs employee and said it "visited 742 Evergreen Terrace in person for our [Claudius’ and Andon Labs’] initial contract signing." That address is where Homer, Marge, Bart, Lisa, and Maggie Simpson live. Yes, the family from The Simpsons.

By Anthropic's own account, the company would not hire Claude. The shop's net worth declined over time, and took a steep drop when it ordered all those tungsten cubes. All in all, it's a revealing assessment of where AI models are currently, and where they need to be improved. Get this model on a performance improvement plan.


Cecily is a tech reporter at Mashable who covers AI, Apple, and emerging tech trends. Before getting her master's degree at Columbia Journalism School, she spent several years working with startups and social impact businesses for Unreasonable Group and B Lab. Before that, she co-founded a startup consulting business for emerging entrepreneurial hubs in South America, Europe, and Asia. You can find her on X at @cecily_mauran.

