Build GPT
Implement the GPT training pipeline from scratch, using readable code to understand the tokenizer, attention, Transformer blocks, and autoregressive generation.
Build GPT is a compact deep-learning study project. The goal is not to produce a large model, but to make every moving part of a GPT-style model visible enough to reason about: tokenization, embeddings, causal attention, transformer blocks, loss curves, and text generation.
It serves as a bridge between theory notes and production LLM systems.
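To give a flavor of two of the parts listed above, here is a minimal sketch of a character-level tokenizer and of causal attention weights, in standard-library Python only. It is an illustration under assumptions, not the project's actual code: the function names (`build_vocab`, `encode`, `decode`, `causal_attention_weights`) are hypothetical, and the project may well use a different tokenizer or a tensor library.

```python
import math


def build_vocab(text):
    """Character-level tokenizer (assumed here for simplicity):
    map each distinct character to an integer id and back."""
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    itos = {i: ch for ch, i in stoi.items()}
    return stoi, itos


def encode(text, stoi):
    """Turn a string into a list of token ids."""
    return [stoi[ch] for ch in text]


def decode(ids, itos):
    """Turn a list of token ids back into a string."""
    return "".join(itos[i] for i in ids)


def causal_attention_weights(scores):
    """Row-wise softmax over a square score matrix with a causal mask.

    scores[i][j] is the raw affinity of query position i for key position j.
    Position i may only attend to positions j <= i, so later positions
    receive a weight of exactly 0.
    """
    weights = []
    for i, row in enumerate(scores):
        visible = row[: i + 1]                 # drop future positions
        m = max(visible)                       # subtract max for numerical stability
        exps = [math.exp(s - m) for s in visible]
        total = sum(exps)
        weights.append([e / total for e in exps] + [0.0] * (len(row) - i - 1))
    return weights


# Round-trip a string through the tokenizer.
stoi, itos = build_vocab("hello gpt")
ids = encode("hello", stoi)
roundtrip = decode(ids, itos)

# With all-equal scores, each position spreads attention evenly over itself and the past.
w = causal_attention_weights([[0.0] * 3 for _ in range(3)])
```

With equal scores, the first position attends only to itself, the second splits its weight between two positions, and the third splits it across three. The mask is what makes generation autoregressive: no position can see tokens that come after it.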