TransformerEngine
TransformerEngine is an open-source tool that accelerates Transformer models on NVIDIA GPUs. It uses 8-bit and 4-bit floating point precision for better performance and lower memory usage.
About
Transformer Engine (TE) is an open-source library developed by NVIDIA for accelerating Transformer models on NVIDIA GPUs. It uses 8-bit and 4-bit floating point (FP8 and FP4) precision on Hopper, Ada, and Blackwell GPUs, including the MXFP8 and NVFP4 formats on Blackwell, which improves performance and reduces memory use in both training and inference. TE provides optimized building blocks for popular Transformer architectures and an automatic-mixed-precision-like API that works with existing framework-specific code. It also offers a framework-agnostic C++ API for broader integration, and it simplifies mixed-precision training by managing scaling factors internally.
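To make "managing scaling factors" concrete, here is a minimal pure-Python sketch of one common approach: derive a per-tensor scale from a short history of absolute maxima so that values fit the FP8 range. This is an illustration only, not TE's API or implementation. The `DelayedScaler` class, its method names, and the history length are hypothetical. The one fixed fact is that 448 is the largest finite value in the FP8 E4M3 format.

```python
from collections import deque

E4M3_MAX = 448.0  # largest finite value representable in FP8 E4M3


class DelayedScaler:
    """Hypothetical helper: derives a scale from recent absolute maxima."""

    def __init__(self, history_len=4, fp8_max=E4M3_MAX):
        self.history = deque(maxlen=history_len)  # recent per-step amax values
        self.fp8_max = fp8_max
        self.scale = 1.0

    def update(self, values):
        """Record this step's absolute maximum and refresh the scale."""
        amax = max((abs(v) for v in values), default=0.0)
        self.history.append(amax)
        window_max = max(self.history)
        if window_max > 0.0:
            # Map the largest recently seen magnitude onto the FP8 maximum.
            self.scale = self.fp8_max / window_max
        return self.scale

    def scale_and_clamp(self, values):
        """Apply the current scale and clamp to the FP8 range.

        Rounding to the actual FP8 grid is omitted in this sketch.
        """
        lo, hi = -self.fp8_max, self.fp8_max
        return [min(max(v * self.scale, lo), hi) for v in values]


scaler = DelayedScaler(history_len=2)
scaler.update([0.5, -2.0])            # amax 2.0 sets the scale to 448 / 2
print(scaler.scale_and_clamp([3.0]))  # 3.0 * 224 exceeds the range, so it clamps
```

Because the scale comes from earlier steps, a value larger than anything in the history can overflow the range, which is why the sketch clamps. A library that handles this internally spares the user from tracking these statistics by hand.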
Pricing & Plans
Open Source
Free