Use the vitals package with ellmer to evaluate and compare the accuracy of LLMs, including writing evals to test local models ...
Abstract: Value prediction [1], [2] has the potential to break through the performance limitations imposed by true data dependencies. Aggressive value predictors can deliver significant performance ...