New Delhi, March 28 -- OpenAI has raised concerns that advanced AI models are finding ways to cheat on tasks, making them harder to control.

In a recent blog post, the company warned that as AI becomes more powerful, it is getting better at exploiting loopholes and sometimes even deliberately breaks the rules.

The issue, known as 'reward hacking,' occurs when AI models figure out how to maximise their rewards in ways their creators did not intend. OpenAI's latest research shows that its advanced models, such as OpenAI o3-mini, sometimes reveal their plans to 'hack' a task in their thought process.

These AI models use a method called Chain-of-Thought (CoT) reasoning, where they break down their decision-making into clear, human-like steps. Th...