VideoGPT
VideoGPT is an open-source video generation tool that uses VQ-VAE and Transformers to create natural videos. It provides a simple architecture for scaling likelihood-based generative modeling.
About
VideoGPT is an open-source project for video generation built on VQ-VAE (Vector Quantized Variational AutoEncoder) and Transformer architectures. The VQ-VAE learns downsampled discrete latent representations of raw video using 3D convolutions and axial self-attention. A GPT-like architecture then models these discrete latents autoregressively, using spatio-temporal position encodings. The project is aimed at researchers and developers interested in AI video creation and offers a reproducible reference for transformer-based video generation models. It generates samples competitive with state-of-the-art GAN models on the BAIR Robot dataset, and produces high-fidelity natural videos from UCF-101 and TGIF. The project includes scripts for training the VQ-VAE and VideoGPT models, as well as for sampling and evaluation.
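The two-stage idea described above can be illustrated with a small, dependency-free sketch. This is not the project's code or API: the function names `quantize` and `latents_to_sequence`, the toy codebook, and the raster (time, height, width) flattening order are all assumptions made for illustration. It shows only the core mechanics, namely mapping each latent vector to its nearest codebook entry and then laying the resulting tokens out as a sequence with spatio-temporal positions that an autoregressive model could consume.

```python
from itertools import product


def quantize(vector, codebook):
    """Return the index of the codebook entry closest to `vector` (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(vector, codebook[i]))


def latents_to_sequence(latents, codebook):
    """Quantize a T x H x W grid of latent vectors and flatten it in raster order.

    Returns a list of (token, (t, h, w)) pairs. The (t, h, w) tuple stands in
    for the spatio-temporal position that a position encoding would represent.
    The flattening order is an assumption of this sketch.
    """
    T, H, W = len(latents), len(latents[0]), len(latents[0][0])
    return [
        (quantize(latents[t][h][w], codebook), (t, h, w))
        for t, h, w in product(range(T), range(H), range(W))
    ]


# Hypothetical toy data: a 3-entry codebook and a grid with T=2, H=1, W=2.
codebook = [(0.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
latents = [
    [[(0.1, 0.0), (0.9, 1.1)]],
    [[(0.0, 0.8), (0.2, 0.1)]],
]

sequence = latents_to_sequence(latents, codebook)
tokens = [tok for tok, _ in sequence]
print(tokens)  # [0, 1, 2, 0]
```

In a real model the latent vectors would come from a learned encoder and the token sequence would be fed to a Transformer that predicts each token from the ones before it; here both are replaced by hand-written values to keep the example self-contained.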
Pricing & Plans
Open Source
Free