🧠 Strands MicroGPT
The entire GPT algorithm in pure Python. As a Strands model provider.
Based on @karpathy's atomic GPT gist: "The most atomic way to train and run inference for a GPT in pure, dependency-free Python. This file is the complete algorithm. Everything else is just efficiency."
What is this?¶
A complete GPT implementation — autograd engine, transformer, tokenizer, Adam optimizer, training, and inference — in pure Python. No PyTorch, no NumPy, no CUDA.
```mermaid
graph LR
    A["🗣️ Strands Agent"] --> B{"MicroGPTModel"}
    B -->|Train| C["📚 Any Text Dataset"]
    B -->|Generate| D["✨ New Text"]
    B -->|Checkpoint| E["💾 Save / Load"]
    style B fill:#e65100,color:#fff
    style A fill:#264653,color:#fff
```
Get Started in 3 Lines¶
```python
from strands_microgpt import MicroGPT

model, tokenizer, docs = MicroGPT.from_dataset()
model.train_on_docs(docs, tokenizer, num_steps=1000)
for name in model.generate(tokenizer, num_samples=10):
    print(name)
```
→ Full Quickstart | Installation
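"Pure, dependency-free Python" means even attention runs as plain loops over lists of floats. As a rough illustration of the idea only (this is not the package's actual code, and the function name is made up for the sketch), causal single-head scaled dot-product attention can be written like this:

```python
import math

def causal_attention(q, k, v):
    """Illustrative sketch: causal single-head attention in pure Python.

    q, k, v: lists of T vectors (lists of floats), all of dimension d.
    Returns a list of T output vectors.
    """
    d = len(q[0])
    out = []
    for t, qt in enumerate(q):
        # Score against positions <= t only (the causal mask).
        scores = [sum(qt[i] * k[s][i] for i in range(d)) / math.sqrt(d)
                  for s in range(t + 1)]
        # Numerically stable softmax over the visible positions.
        m = max(scores)
        exps = [math.exp(x - m) for x in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        # Weighted sum of the value vectors.
        out.append([sum(weights[s] * v[s][i] for s in range(t + 1))
                    for i in range(d)])
    return out
```

Everything else in the model (multi-head projections, MLP, norms) is the same kind of loop; as the gist puts it, "everything else is just efficiency."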
Three Ways to Use¶
What's Inside¶
- ⚡ **Autograd Engine**: a `Value` class with full backpropagation. Build computation graphs, compute gradients automatically.
- 🧠 **GPT Transformer**: multi-head attention, MLP, RMSNorm, Adam optimizer. The full algorithm in ~300 lines.
- 🔧 **Strands Tools**: train and generate as tool calls from any Strands agent (Bedrock, OpenAI, etc.).
- 📚 **Custom Datasets**: train on anything: names, poems, code, molecules, DNA sequences.
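The autograd engine's centerpiece is a scalar `Value` that records how it was computed and back-propagates gradients through that graph. A minimal sketch in the spirit of micrograd (the package's actual class supports more operations; this is not its exact code):

```python
class Value:
    """Scalar with autograd: records children and a local backward rule."""

    def __init__(self, data, children=()):
        self.data = data
        self.grad = 0.0
        self._children = children
        self._grad_fn = None  # pushes self.grad into the children

    def __add__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data + other.data, (self, other))
        def grad_fn():
            self.grad += out.grad      # d(a+b)/da = 1
            other.grad += out.grad     # d(a+b)/db = 1
        out._grad_fn = grad_fn
        return out

    def __mul__(self, other):
        other = other if isinstance(other, Value) else Value(other)
        out = Value(self.data * other.data, (self, other))
        def grad_fn():
            self.grad += other.data * out.grad   # d(a*b)/da = b
            other.grad += self.data * out.grad   # d(a*b)/db = a
        out._grad_fn = grad_fn
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule
        # from the output back to the leaves.
        topo, seen = [], set()
        def build(v):
            if v not in seen:
                seen.add(v)
                for c in v._children:
                    build(c)
                topo.append(v)
        build(self)
        self.grad = 1.0
        for v in reversed(topo):
            if v._grad_fn:
                v._grad_fn()
```

With just `+` and `*` (and a few more ops like `exp` and `pow` in the real engine), every forward pass builds a graph, and one `backward()` call fills in the gradients the Adam optimizer needs:

```python
a, b = Value(2.0), Value(3.0)
c = a * b + a
c.backward()  # a.grad = b + 1 = 4.0, b.grad = a = 2.0
```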
Resources¶
- Karpathy's GPT gist — The original
- micrograd — Karpathy's autograd engine
- makemore — Character-level language modeling
- Strands Agents — The agent framework
- PyPI Package