# Installation
Ten seconds to a robot that sees time.
## Quick Install
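The install itself is a single pip command. This is a sketch that assumes the package is published on PyPI under the name `neon` — verify the exact name against the project page:

```shell
# Assumed PyPI package name; check the project's README before copying
pip install neon
```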
That's it. One line. The backbone downloads lazily when you first call load_backbone() — nothing heavy happens at import time.
## From Source
For development, contributing, or if you want to live on the edge:
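A typical from-source setup looks like the following; the repository URL is a placeholder assumption, and the `[dev]` extra is the one described below:

```shell
# Repository URL is a placeholder; substitute the real one
git clone https://github.com/your-org/neon.git
cd neon

# Editable install with the [dev] extra (tests, linters, type checkers)
pip install -e ".[dev]"
```

An editable (`-e`) install means changes you make to the source take effect immediately, without reinstalling.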
The [dev] extra installs test runners, linters, and type checkers. Everything you need to hack on Neon and know immediately if you broke something.
## What Gets Installed
Neon keeps its dependency tree tight. Heavy imports (torch, transformers) are lazy — they load on use, not on import.
| Package | Version | Why It's Here |
|---|---|---|
| `torch` | ≥ 2.2.0 | Tensor operations, model execution |
| `transformers` | ≥ 4.48.0, < 5.3.0 | Backbone loading (Qwen2.5-VL, Cosmos) |
| `datasets` | ≥ 3.0.0 | HuggingFace data soup loading |
| `huggingface-hub` | ≥ 0.23.0 | Model push/pull |
| `numpy` | ≥ 1.24.0 | The bedrock of everything |
| `pillow` | ≥ 10.0.0 | Image I/O |
| `einops` | ≥ 0.7.0 | Tensor reshaping without headaches |
| `pyyaml` | ≥ 6.0 | Config parsing |
## Optional Extras
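The only extra named on this page is `[dev]`. Installing it from PyPI would look like this, again assuming the package name `neon` (an assumption, not confirmed by this page):

```shell
# [dev] pulls in test runners, linters, and type checkers
pip install "neon[dev]"
```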
## Hardware Requirements
| What You Want To Do | GPU | VRAM | Notes |
|---|---|---|---|
| Inference (3B) | Jetson Orin / RTX 3060 | 8 GB | 4-bit quantized, ~50ms latency |
| Inference (7B) | RTX 4090 / A100 | 16 GB | 4-bit quantized |
| Train action heads | RTX 4090 / L4 | 24 GB | Backbone frozen, only 6M params train |
| Train with LoRA | A100 / L40S | 40+ GB | Fine-tune backbone attention layers |
No GPU? No problem.
Everything runs on CPU for testing and development. It'll be slow (~5s per prediction) but completely functional. All 168 tests pass on CPU without any GPU or backbone weights.
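To confirm the CPU-only path on your own machine, run the test suite with pytest. This assumes a source checkout and that the `[dev]` extra (which includes the test runner) is installed:

```shell
# From the repository root; pytest is provided by the [dev] extra
python -m pytest
```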
## Verify It Works
```python
import neon
from neon.model.neon_vla import NeonConfig, NeonVLA
from neon.data.action_space import G1ActionSpace

# Create the model (backbone hasn't loaded yet — this is fast)
config = NeonConfig(control_mode="arms_only")
model = NeonVLA(config)

# Check the action space
print(model.action_space)
# → G1ActionSpace(mode=arms_only, joints=14, action_dim=17)

print(f"Trainable parameters: {sum(p.numel() for p in model.parameters() if p.requires_grad):,}")
# → Trainable parameters: ~2,345,678 (action heads + fusion only)
```
If you see that output, you're ready. The video backbone downloads the first time you call model.load_backbone().
## Next
→ Quickstart — create a model, predict actions, control a robot