AI Alignment Challenges

11d

An AI Tried to Escape The Lab: AI Safety Tests Flag Deceptive Model Behavior

Advanced AI models show deception in lab tests; a three-level risk scale includes Level 3 “scheming,” raising oversight ...

Tech Xplore on MSN

'Neuron-freezing' technique can stop LLMs from giving users unsafe responses

Researchers have identified key components in large language models (LLMs) that play a critical role in ensuring these AI ...

Neuroscience News

Alignment is the Secret to Human-AI Teamwork

A new study suggests that AI failure is often a "human-machine alignment" problem rather than a technical one. Researchers ...

Tech Xplore on MSN

Humans and AI must form a cognitive alignment to work well together, say researchers

In the iconic Star Wars series, captain Han Solo and humanoid droid C-3PO boast drastically contrasting personalities. Driven ...

VentureBeat

When AI lies: The rise of alignment faking in autonomous systems

AI is evolving beyond a helpful tool to an autonomous agent, creating new risks for cybersecurity systems. Alignment faking is a new threat where AI essentially “lies” to developers during the ...

13d

The Paradox Of Alignment In The Age Of AI

Alignment is not about determining who is right. It is about deciding which narrative takes precedence and over what time ...

Computer Weekly

UK AI alignment project gets OpenAI and Microsoft boost

OpenAI and Microsoft are the latest companies to back the UK’s AI Security Institute (AISI). The two firms have pledged support for the Alignment Project, an international effort to work towards ...

The National Interest on MSN

When tools become agents: The autonomous AI governance challenge

Autonomous or agentic artificial intelligence will create challenges for public trust in the technology. That is why building ...

12d

AI doesn’t ‘see’ the way that you do, and that could be a problem when it categorizes objects and scenes

People and computers perceive the world differently, which can lead AI to make mistakes no human would. Researchers are ...

20h

You Adopted AI—Why Hasn't Your Organization Changed?

Over the past two years, my organization has worked with more than 120 social impact organizations navigating AI and assessed ...
