llm-compressor
llm-compressor is an open-source library that optimizes large language models for deployment with vLLM. It provides quantization algorithms for both weight-only quantization and weight-and-activation quantization, and it works directly with Hugging Face models.
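To make "weight-only quantization" concrete, here is a minimal sketch of the general idea: symmetric round-to-nearest quantization of a small weight list to 8-bit integers. It does not use llm-compressor's API, and it is not necessarily the scheme the library applies. The function names `quantize_int8` and `dequantize` are hypothetical and exist only for illustration.

```python
# Illustrative only: plain-Python symmetric int8 quantization.
# These helpers are hypothetical and are NOT part of llm-compressor.

def quantize_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # One scale for the whole tensor; fall back to 1.0 for all-zero input.
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    # Round to the nearest integer step and clamp to the int8 range.
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float weights from the integer codes."""
    return [v * scale for v in q]


weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

print("integer codes:", q)
print("scale:", scale)
print("restored:", restored)
```

Each restored weight differs from the original by at most half a quantization step (`scale / 2`), which is the accuracy cost paid for storing 8-bit integers instead of full-precision floats. Activation quantization applies the same kind of mapping to the values flowing through the model at inference time rather than to the stored weights.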