Tests that once challenged advanced AI models are now being passed with ease, making it harder for researchers to pinpoint ...