Llm_benchmark
llm_benchmark is an open-source tool for evaluating large language models (LLMs). It uses a private, rolling question bank to track how models evolve over the long term, focusing on logic, math, programming, and human intuition.