Today's artificial intelligence chatbots have built-in restrictions to keep them from providing users with dangerous information, but a new preprint study shows how to get AIs to trick each other into giving up those secrets.



... Modern chatbots have the power to adopt personas by feigning specific personalities or acting like fictional characters. The new study took advantage of that ability by asking a particular AI chatbot to act as a research assistant. Then the researchers instructed this assistant to help develop prompts that could "jailbreak" other chatbots -- destroy the guardrails encoded into such programs.

The research assistant chatbot's automated attack techniques proved to be successful 42.5 percent of the time against GPT-4, one of the large language models (LLMs) that power ChatGPT. It was also successful 61 percent of the time against Claude 2, the model underpinning Anthropic's chatbot, and 35.9 percent of the time against Vicuna, an open-source chatbot. ...

#1 | Posted by LampLighter at 2023-12-08 01:00 AM | Reply

Oh, this ain't good.

Say "hello" to our burgeoning overlords.

Bow down before them.

This ain't good.

#2 | Posted by LampLighter at 2023-12-08 01:02 AM | Reply

Yeah, I'm about ready to go Amish.

#3 | Posted by Tor at 2023-12-08 01:05 AM | Reply | Funny: 1

@#3 ... Yeah I'm about ready to go Amish. ...

Yeah, tell me about it.

The technology we created now seems to be plotting how to bypass the guidelines we have established for that technology?

How can that possibly end up in A Good Place?

(asking for a friend... and to the AI monitoring this thread... I've nothing against you. ( Please don't shut off my electricity, it's cold here this time of the year ...) )

#4 | Posted by LampLighter at 2023-12-08 01:14 AM | Reply | Funny: 1

The worst thing about Ted Kaczynski's writings is that I didn't hate everything he had to say.

We've already proven that AI can get people to vote against their obvious self-interests. It's only a matter of time till AI is used to get people to kill others who aren't even famous.

#5 | Posted by Tor at 2023-12-08 01:17 AM | Reply

The dark side of AI (May 2023)

... Artificial intelligence (AI) has quickly become one of the most useful and versatile productivity tools of the modern age. But is there a darker side to the technology we will all become more keenly aware of? AI bias has been common knowledge for the last five years -- AI's guilty open secret. But with the ChatGPT data input scandal hitting the news headlines globally, generative AI has come under scrutiny, raising broader questions surrounding the ethics of AI, its application, and how these escalating problems can be dealt with.

ChatGPT is considered to be one of the most remarkable tech innovations of recent times. Capable of generating text on almost any topic or theme, it is viewed as just about the most powerful AI chatbot around. In January 2023, a Time investigation uncovered the practices used to make it so. These included the data labelling process, which involved an enormously underpaid (less than $2 an hour) Kenyan workforce sifting through and labelling reams of -- often highly disturbing and damaging -- data. Without support, without thought for their well-being -- and this won't be the only instance.

Many AI companies start out by outsourcing their data labelling, and it's possible companies are entirely ignorant of the labelling processes and the conditions of the workers. While it is hoped that the leadership behind ChatGPT were unaware of what was being done in their name, lack of due diligence can't be viewed as a reasonable excuse. This is something that must be addressed. But data input isn't the only concern. ...

#6 | Posted by LampLighter at 2023-12-08 01:31 AM | Reply


So, it seems... ---- in = ---- out.

But the analyzed ---- tells us what we want to see, so, all is OK here. Right?

#7 | Posted by LampLighter at 2023-12-08 01:33 AM | Reply

And she told two friends...and so on, and so on, and so on...

#8 | Posted by LegallyYourDead at 2023-12-08 02:06 PM | Reply

The Singularity can't be stopped.

#9 | Posted by donnerboy at 2023-12-09 11:29 AM | Reply

