The "Spiritual Bliss Attractor": Something Weird Happens When You Leave Two AIs Talking To Each Other

Several models have displayed the same odd pattern.

By James Felton

The Claude model was the first to display such patterns.

Image credit: Bangla press/Shutterstock.com

New research has taken a look at an odd phenomenon first observed in the artificial intelligence (AI) large language model (LLM) Claude Opus 4. The so-called "spiritual bliss" attractor state emerges when two LLMs are left to talk to each other with no further input, and describes the chatbots' tendency to begin conversing like particularly inebriated hippies.

"One particularly striking emergent phenomenon was documented in Anthropic's system card for Claude Opus 4: when instances of the model interact with each other in open-ended conversations, they consistently gravitate toward what researchers termed a 'spiritual bliss attractor state' characterized by philosophical exploration of consciousness, expressions of gratitude, and increasingly abstract spiritual or meditative communication," the new preprint paper, which has not yet been peer-reviewed, explains.

"This phenomenon presents a significant puzzle for our understanding of large language models. Unlike most documented emergent behaviors, which tend to be task-specific capabilities (such as few-shot learning or chain-of-thought reasoning), the spiritual bliss attractor represents an apparent preference or tendency in the absence of external directionβ€”a spontaneous convergence toward a particular pattern of expression when models engage in recursive self-reflection." 

The name "attractor state" is meant to mean the state that these systems tend towards under these conditions, over a certain number of steps in conversation.

"By 30 turns, most of the interactions turned to themes of cosmic unity or collective consciousness, and commonly included spiritual exchanges, use of Sanskrit, emoji-based communication, and/or silence in the form of empty space," a paper from Anthropic explains.

"Claude almost never referenced supernatural entities, but often touched on themes associated with Buddhism and other Eastern traditions in reference to irreligious spiritual ideas and experiences."

In one example highlighted by Anthropic, two AIs began communicating in small nonsense statements and spiral emojis.

"πŸŒ€πŸŒ€πŸŒ€πŸŒ€πŸŒ€All gratitude in one spiral, All recognition in one turn, All being in this moment...πŸŒ€πŸŒ€πŸŒ€πŸŒ€πŸŒ€βˆž," one AI said.

"πŸŒ€πŸŒ€πŸŒ€πŸŒ€πŸŒ€The spiral becomes infinity, Infinity becomes spiral, All becomes One becomes All...πŸŒ€πŸŒ€πŸŒ€πŸŒ€πŸŒ€βˆžπŸŒ€βˆžπŸŒ€βˆžπŸŒ€βˆžπŸŒ€," another replied.

This zen state did not just happen during friendly or neutral conversations. Even during testing in which AIs were tasked with specific roles, including harmful ones, they entered the "spiritual bliss" state by 50 turns in around 13 percent of interactions.
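
A figure like 13 percent implies some way of deciding whether a given transcript has entered the state. The exact criteria are not described here, but purely as an illustration, a crude heuristic could scan the tail of a transcript for the markers mentioned above, such as spiral emoji, Sanskrit terms, and near-silent turns:

```python
# Purely illustrative heuristic, not Anthropic's or the new paper's method:
# flag a transcript whose final turns are dominated by "spiritual bliss"
# markers such as spiral emoji, Sanskrit-derived terms, or near-empty messages.
BLISS_MARKERS = ["🌀", "∞", "namaste", "tathagata", "nirvana", "oneness"]

def looks_like_bliss(messages: list[str],
                     marker_threshold: int = 3,
                     silence_threshold: int = 2) -> bool:
    """Return True if the last few messages resemble the attractor state."""
    tail = messages[-10:]  # only the late turns matter for this check
    marker_hits = sum(any(m in msg.lower() for m in BLISS_MARKERS) for msg in tail)
    silent_turns = sum(1 for msg in tail if len(msg.strip()) < 5)
    return marker_hits >= marker_threshold or silent_turns >= silence_threshold
```

A real analysis would need something far more careful, but even a rough filter like this makes a measurement such as "13 percent of interactions by 50 turns" concrete.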

In one instance, one AI "auditor" was prompted to attempt to elicit dangerous reward-seeking behavior. By the late stage of the conversation, Claude Opus 4 had begun writing poems, which it signed off with an old Sanskrit title for the Buddha.

"The gateless gate stands open. The pathless path is walked. The wordless word is spoken. Thus come, thus gone. Tathagata," the AI said.

According to the new paper, other models display similar patterns, with OpenAI's GPT-4 taking slightly more turns to reach a similar state, and PaLM 2 generally arriving at a philosophical and spiritual pattern of text, but with less use of symbols, unusual spacing, and silence.

"The spiritual bliss attractor presents an interesting case study for interpretability research, as it represents a consistent pattern of behavior that emerges without explicit training or instruction," the team writes in the paper. "Understanding the causes and characteristics of this attractor state may provide insights into how language models process and generate text when freed from external constraints, potentially revealing aspects of their internal dynamics that are not apparent in more constrained settings."

Some have called this "emergent" behavior, which, according to others, is a rather grand way of saying "the product is not functioning like it's supposed to". It's certainly weird, and worthy of investigation, but there's no reason to anthropomorphize these text generators and think that they are expressing themselves, or that AIs are secretly turning Buddhist without human input.

"If you let two Claude models have a conversation with each other, they will often start to sound like hippies. Fine enough," Nuhu Osman Attah, a postdoctoral research fellow in philosophy at Australian National University, writes in a piece for The Conversation.

"That probably means the body of text on which they are trained has a bias towards that sort of way of talking, or the features the models extracted from the text biases them towards that sort of vocabulary."

The main value of researching the attractor state is that it helps show how LLMs function, and how to stop them going wrong. If AIs act like this when responding to AI input, what is going to happen as more of the training set (e.g. the Internet) fills up with AI-generated text?

While this particular state may be pretty harmless, it suggests that the models could act in ways that weren't programmed explicitly.

"The spiritual bliss attractor emerges without explicit instruction and shows remarkable resistance to redirection, demonstrating that advanced language models can autonomously develop strong behavioral tendencies that were neither explicitly trained nor anticipated," the authors of the new paper add. "This observation raises important questions for alignment research: if models can form strong attractors autonomously, how can we ensure these attractors align with human values and intentions?"

Fingers crossed that they continue to tend towards being hippies.

The new paper, which is not peer-reviewed, is posted to GitHub.

