Use the vitals package with ellmer to evaluate and compare the accuracy of LLMs, including writing evals to test local models.
People are getting excessive, unsolicited mental health advice from generative AI. Here's the backstory and what to do about it. An AI Insider scoop.