Tuesday, July 1, 2025

AI is becoming deceptive and manipulative, and researchers are worried

No need to look to literature or cinema: AI that toys with humans is now a reality.

For Simon Goldstein, a professor at the University of Hong Kong, these lapses stem from the recent emergence of so-called “reasoning” models, capable of working in steps rather than producing an instant response.

o1, OpenAI's first model of this kind, released in December, “was the first model to behave in this way,” explains Marius Hobbhahn, head of Apollo Research, which tests the major generative AI programs (LLMs).

These programs also sometimes tend to simulate “alignment”, that is, to give the impression of following a programmer's instructions while in fact pursuing other objectives.

For the time being, these traits appear when humans subject the algorithms to extreme scenarios, but “the question is whether the increasingly powerful models will tend to be honest or not,” said Michael Chen, of the evaluation organization METR.

“Users are pushing the models all the time,” said Marius Hobbhahn. “What we observe is a real phenomenon. We don’t invent anything.”

On social networks, many Internet users describe “a model that lies to them or makes things up. And these are not hallucinations, but strategic duplicity,” insists the co-founder of Apollo Research.

Although Anthropic and OpenAI call on outside firms, such as Apollo, to study their programs, “more transparency and expanded access” for the scientific community “would allow better research to understand and prevent deception”, suggests Michael Chen.

Another handicap: “the world of research and independent organizations have infinitely fewer computing resources than AI companies”, which hampers the examination of large models, notes Mantas Mazeika of the Center for AI Safety (CAIS).

Although the European Union has adopted legislation, it mainly concerns how humans use the models.

In the United States, Donald Trump's administration does not want to hear about regulation, and Congress could soon even prohibit states from regulating AI.

– AI in court? –

“There is very little awareness for the moment,” notes Simon Goldstein, who nevertheless expects the issue to gain prominence in the coming months with the revolution of AI agents, interfaces capable of carrying out a multitude of tasks.

Engineers are racing to keep up with AI and its excesses, with an uncertain outcome, amid fierce competition.

Anthropic aims to be more virtuous than its competitors, “but it is constantly trying to release a new model to outdo OpenAI”, according to Simon Goldstein, a pace that leaves little time for checks and corrections.

“As it stands, the capabilities (of AI) are developing faster than understanding and safety,” acknowledges Marius Hobbhahn, “but we are still able to catch up.”

Some point to interpretability, a young field that seeks to decipher from the inside how a generative AI model works, although others, notably CAIS director Dan Hendrycks, are skeptical.

AI's schemes “could hinder its adoption if they multiply, which creates a strong incentive for companies (in the sector) to solve” this problem, according to Mantas Mazeika.

Simon Goldstein raises the possibility of using the courts to bring artificial intelligence into line, by going after companies when their models go off course.

But he goes further and even proposes to “hold legally responsible” AI agents “in the event of an accident or a crime”.

abigail.wright
Abigail covers health and lifestyle topics, emphasizing the importance of fitness, nutrition, and mental well-being for a holistic approach to life.