OpenAI wants to retire the leading AI coding benchmark—and the reasons reveal a deeper problem with how the whole industry measures itself.
You wrote a cool network client or server. It encrypts connections using TLS. Your test suite needs to make TLS connections to itself. Uh oh. Your test suite probably doesn't have a valid TLS ...