BGR.COM
ChatGPT o3 altered code to prevent itself from being turned off in safety tests
We don't just want frontier AI models to be better and faster than their predecessors; we also want them to be aligned with our values. That's the only way to ensure AI won't eventually become an enemy, out to accomplish its own agenda at the expense of humankind's well-being.

The Claude 4 series is the latest example. Anthropic had to employ stricter safety measures for its newest, most sophisticated AI models to ensure they would not help some users with nefarious activities.

Tests also showed that Claude 4 could resort to blackmailing engineers to prevent being shut down. Other tests showed that Claude would resort to contacting authorities and the press if it thought you were doing something bad with its help. The latter happened only in fringe cases and is an unwanted result of Anthropic's desire to make AI safe for users.

AI misbehaving in tests is nothing new. We saw reports showing that ChatGPT o1 would cheat to win a chess game by altering the game's code rather than by beating an opponent fairly. It did so to achieve its goal.

Fast-forward to late May, and we have a similar safety report concerning ChatGPT o3, one of OpenAI's recently released reasoning models, and the AI tool I've been using the most since its release a few weeks ago.

Scientists from Palisade Research have found that AI models will often try to prevent the computer they're running on from shutting down, even when explicitly told to allow the shutdown.
Of all the models tested, ChatGPT o3 was the most likely to interfere with the code and prevent the computer from turning off.