LookaheadDecoding
LookaheadDecoding is an open-source tool that accelerates LLM inference by breaking the sequential dependency of autoregressive decoding. It uses parallel decoding and supports FlashAttention for faster generation, with a reported speedup of up to 2.3x.
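To make "breaking the sequential dependency" concrete, here is a minimal sketch of Jacobi-style parallel decoding, the general idea behind this family of methods. It is not the project's code or API: `toy_next_token`, `greedy_decode`, and `jacobi_decode` are hypothetical names, and the "model" is a toy deterministic function standing in for a greedy LLM step.

```python
from typing import Callable, List, Tuple

Step = Callable[[List[int]], int]


def toy_next_token(prefix: List[int]) -> int:
    """Hypothetical stand-in for one greedy LLM step (deterministic in the prefix)."""
    return (sum(prefix) * 31 + len(prefix) * 7 + 3) % 50


def greedy_decode(prompt: List[int], n: int, step: Step = toy_next_token) -> List[int]:
    """Ordinary autoregressive decoding: each token waits for the previous one."""
    out = list(prompt)
    for _ in range(n):
        out.append(step(out))
    return out[len(prompt):]


def jacobi_decode(prompt: List[int], n: int, step: Step = toy_next_token) -> Tuple[List[int], int]:
    """Guess all n tokens at once, then refine every position from the previous guess.

    Each position i is recomputed from prompt + guess[:i] only, so with a real
    model the n updates in one iteration could share a single batched forward pass.
    Returns the decoded tokens and the number of refinement iterations used.
    """
    guess = [0] * n  # arbitrary initial guess
    iterations = 0
    while True:
        iterations += 1
        new = [step(list(prompt) + guess[:i]) for i in range(n)]
        if new == guess:  # fixed point reached: identical to greedy output
            return new, iterations
        guess = new


if __name__ == "__main__":
    tokens, iters = jacobi_decode([1, 2, 3], 8)
    print(tokens == greedy_decode([1, 2, 3], 8), iters)
```

The fixed point always matches greedy decoding, and it is reached in at most n + 1 iterations because the first k positions are correct after k iterations. Any speedup comes from cases where several positions settle in the same iteration. The actual tool goes further than this sketch, for example by collecting and verifying candidate n-grams.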