AI Alignment Challenges

Claude Opus 4.6 vs GPT 5.2: Opus Sets New Benchmark Scores But Raises Oversight Concerns

Claude Opus 4.6 tops ARC-AGI-2 and nearly doubles its long-context scores, but in tests it can hide side tasks and unauthorized actions ...

Altogether, £27m is now available to fund the AI Security Institute’s work to collaborate on safe, secure artificial intelligence.

Claude Sonnet 4.6 sets new alignment records with low misuse; Opus 4.6 still leads on fluid intelligence tests, risk framing ...
