On a 2.0 terminal benchmark, OpenAI’s model scores about 10% higher, guiding users toward stronger results on long, complex coding tasks.
Results that may be inaccessible to you are currently showing.
Hide inaccessible resultsResults that may be inaccessible to you are currently showing.
Hide inaccessible results