Minimind

Visit Tool

minimind is an open-source coding & development tool that enables training a 64M-parameter GPT from scratch in just 2 hours. It provides a complete training pipeline for small language models with minimal cost.

Claim this tool

No Views Yet

At a glance

Pricing

Open Source

Free tier

Yes

API

Yes

Skill level

Technical

About

What is minimind?

minimind is an open-source project designed to simplify and accelerate the training of small language models (LLMs) from scratch. It allows users to train a 64M-parameter GPT model in approximately 2 hours with a cost as low as 3 RMB. The project offers a complete training pipeline, including MoE, data cleaning, pre-training, supervised fine-tuning (SFT), LoRA, RLHF (DPO), RLAIF (PPO/GRPO/CISPO), Tool Use, Agentic RL, adaptive thinking, and model distillation. All core algorithms are implemented from scratch using PyTorch, avoiding reliance on high-level abstractions from third-party libraries. minimind aims to be an accessible entry point for LLM learning and practice, providing a reproducible, understandable, and extensible foundation for the AI community.

Best used for

Ideal for developers and data scientists who need to understand the inner workings of large language models, train small-parameter GPTs efficiently, and experiment with various LLM training techniques. Especially valuable for educational purposes and rapid prototyping of custom AI models with minimal computational cost.

Common actions

train language models

implement LLM algorithms

experiment with AI models

develop custom GPTs

workflowsautomated workflowdeepfakelow-code/no-codeopen-sourcecollaboration"AI Agents"face swappinggithub copilot

Capabilities

Key features

Train 64M GPT in 2 hours
Complete LLM training pipeline
MoE architecture support
Custom tokenizer training
PyTorch native algorithm implementation
Supports single/multi-GPU training
OpenAI API compatible server

Target Audience

developerdata scientiststudent

Integrations

transformerstrlpeftllama.cppvllmollamawandbswanlab+ 2 more

Pricing & Plans

Open Source

Free

FAQs

What is the primary goal of the minimind project?

The minimind project aims to enable individuals to train a small-parameter GPT model from scratch with minimal cost and time. It focuses on providing a complete, understandable, and reproducible open-source framework for LLM development, making advanced AI accessible to a broader audience.

What kind of models can be trained using minimind?

minimind supports training various small language models, including 64M-parameter GPTs, MoE models, and experimental versions like MiniMind-V (vision multimodal), MiniMind-dLM (diffusion language model), and MiniMind-Linear (linear attention model).

Does minimind support different training stages for LLMs?

Yes, minimind covers a comprehensive range of LLM training stages. This includes pre-training, supervised fine-tuning (SFT), LoRA, RLHF (DPO), RLAIF (PPO/GRPO/CISPO), Tool Use, Agentic RL, adaptive thinking, and model distillation, providing a full development pipeline.

What are the hardware requirements for training with minimind?

The project emphasizes low-cost training, with the SFT stage on a single NVIDIA 3090 GPU taking about 2 hours and costing around 3 RMB. While it supports single and multi-GPU setups, a single 24GB GPU like the NVIDIA 3090 is sufficient for rapid reproduction.

Is minimind compatible with existing LLM frameworks and tools?

Yes, minimind is designed for compatibility. It works with mainstream frameworks such as transformers, trl, and peft, as well as popular inference engines like llama.cpp, vllm, and ollama. It also supports visualization tools like wandb and swanlab.

Trending

Subcategories trending in Coding & Development

Code Assistants DevOps & Infrastructure No-Code / Low-Code Testing & QA Backend & APIs Prompt Engineering

Trending

Explore

Browse AI tools by category

Content & Design Productivity & Business Coding & Development AI Agents & Automation Research & Education Wellness & Lifestyle Career Development Marketing & Growth Data & Analytics Customer Support & CX Finance E-commerce