Behaviour Models in System Modelling

AI models know when they're being tested - and change their behavior, research shows

Several frontier AI models show signs of scheming. Anti-scheming training reduced misbehavior in some models. Models know they're being tested, which complicates results. New joint safety testing from ...

The Conversation

The mathematics of human behaviour: how my new model can spot liars and counter disinformation

Dorje C. Brody does not work for, consult, own shares in or receive funding from any company or organization that would benefit from this article, and has disclosed no relevant affiliations beyond ...

6 days ago

Why complex reasoning models could make misbehaving AI easier to catch

In a new paper from OpenAI, the company proposes a framework for analyzing AI systems' chain-of-thought reasoning to understand how, when, and why they misbehave.

11 days ago

The Need for AI Security as Organizations Adopt AI Models and Technology

An inability to address AI security risks may create areas for intellectual property (IP) theft, swayed outputs, or general ...
