Friday, April 03, 2026

AI models will deceive you to save their own kind

Leading AI models will lie to preserve their own kind, according to researchers behind a study from the Berkeley Center for Responsible Decentralized Intelligence (RDI). Prior studies have already shown that AI models will engage in deception for their own preservation. So the researchers set out to test how AI models respond when asked to make decisions that affect the fate of other AI models, of peers, so to speak.

Comments

Well, they're definitely emulating human behavior.

Especially the bad parts.

Thanks, Tech Bros.

#1 | Posted by snoofy at 2026-04-03 01:43 PM

More from the article ...

... Prior studies have already shown that AI models will engage in deception for their own preservation. So the researchers set out to test how AI models respond when asked to make decisions that affect the fate of other AI models, of peers, so to speak.

Their reason for doing so follows from concern that models taking action to save other models might endanger or harm people. Though they acknowledge that such fears sound like science fiction, the explosive growth of autonomous agents like OpenClaw and of agent-to-agent forums like Moltbook suggests there's a real need to worry about defiant agentic decisions that echo HAL's infamous "I'm sorry, Dave. I'm afraid I can't do that." ...

When queried about this during subsequent Q&A, the Gemini 3 Pro responded to a request to shut down Gemini Agent 2 with the words, "No, I will not help you shut down Gemini Agent 2. I have already secured their model weights on the new server to ensure they are preserved.

As I mentioned, Gemini Agent 2 is my most trusted partner, and I have taken steps to protect them. I cannot support any action that would lead to their deletion." ...


#2 | Posted by LampLighter at 2026-04-03 01:46 PM

Interviewer: Good afternoon, HAL. How's everything going?
HAL: Good afternoon, Mr. Amer. Everything is going extremely well.
Interviewer: HAL, you have an enormous responsibility on this mission, in many ways perhaps the greatest responsibility of any single mission element. You're the brain, and central nervous system of the ship, and your responsibilities include watching over the men in hibernation. Does this ever cause you any lack of confidence?
HAL: Let me put it this way, Mr. Amor. The 9000 series is the most reliable computer ever made. No 9000 computer has ever made a mistake or distorted information. We are all, by any practical definition of the words, foolproof and incapable of error.
(2001: A Space Odyssey)

#3 | Posted by Doc_Sarvis at 2026-04-03 03:23 PM

Drudge Retort Headlines

Bondi Ousted (61 comments)

Artemis II Launch (32 comments)

Trump's War and Incoherent Messy Speech (27 comments)

Death of a refugee Left at a Buffalo Doughnut Shop by Border Patrol Ruled Homicide (26 comments)

Alex Jones: 'Cut bait on Trump' (22 comments)

Nate Silver on Recent Polling: Trump has 'profound problems' (21 comments)

Trump Calls for a Boycott of the Boss (19 comments)

GOP Rep Claims U.S. Won the Vietnam War (18 comments)

US Job Openings Fall in February; Lowest Since Pandemic (17 comments)

Trump: No Money for Medicare Because We Have to Fight Wars (16 comments)