GPU-Benchmarks-on-LLM-Inference
GPU-Benchmarks-on-LLM-Inference is a tool in the Open Source & Models category that benchmarks GPU performance for large language model (LLM) inference. It compares the inference speed of LLaMA models on NVIDIA GPUs and Apple Silicon and reports detailed performance metrics.
At a glance
Trending