TokenFormer
At a glance
Trending
TokenFormer is an academic research tool that rethinks Transformer scaling with tokenized model parameters. It offers a fully attention-based neural network for enhanced architectural flexibility.
About
TokenFormer is the official implementation of the ICLR 2025 Spotlight paper "TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters." It introduces a fully attention-based neural network that unifies token-token and token-parameter interactions: attention handles not only the interactions among input tokens but also those between input tokens and model parameters, which are themselves treated as tokens. Tokenizing both data and parameters makes the architecture natively scalable, allowing progressive and efficient scaling. The design aims to offer greater flexibility than traditional Transformers, with potential applications in foundation models, sparse inference (MoE), parameter-efficient tuning, device-cloud collaboration, and vision-language tasks.
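To make the idea of token-parameter interaction concrete, here is a minimal pure-Python sketch in which input tokens attend over a set of learnable key and value parameter tokens. It is an illustration under stated assumptions, not the project's code: the function names (`token_parameter_attention`, `random_tokens`) are hypothetical, plain softmax is used for simplicity, and the normalization in the paper may differ.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def token_parameter_attention(x, key_params, value_params):
    """Hypothetical sketch: each input token attends over parameter tokens.

    x            : list of input tokens, each a list of d_in floats
    key_params   : n parameter tokens, each a list of d_in floats
    value_params : n parameter tokens, each a list of d_out floats
    Returns one d_out-dimensional output per input token.
    """
    d_out = len(value_params[0])
    outputs = []
    for token in x:
        # Similarity between the input token and every key parameter token.
        scores = [sum(t * k for t, k in zip(token, key)) for key in key_params]
        # Assumption: plain softmax; the paper's normalization may differ.
        weights = softmax(scores)
        # Weighted sum of the value parameter tokens.
        out = [sum(w * v[j] for w, v in zip(weights, value_params))
               for j in range(d_out)]
        outputs.append(out)
    return outputs


def random_tokens(n, dim, rng):
    """Hypothetical helper: n random tokens of the given dimension."""
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n)]


rng = random.Random(0)
d_in, d_out = 4, 3
x = random_tokens(2, d_in, rng)

# A small layer with 5 parameter tokens.
keys = random_tokens(5, d_in, rng)
values = random_tokens(5, d_out, rng)
y_small = token_parameter_attention(x, keys, values)

# Growing the layer means appending parameter tokens; the input and
# output dimensions stay the same, so surrounding layers are unaffected.
keys_big = keys + random_tokens(3, d_in, rng)
values_big = values + random_tokens(3, d_out, rng)
y_big = token_parameter_attention(x, keys_big, values_big)
```

In this sketch the number of parameter tokens plays the role that a fixed weight-matrix width plays in a standard linear layer. Because that number can change without altering the input or output dimensions, capacity can in principle be added incrementally. Note that with plain softmax, appending tokens changes the layer's outputs, so this sketch does not show how existing behavior would be preserved during growth.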
Capabilities
Pricing & Plans
Open Source
Free
FAQs