Jetson Deployment
Run Cosmos-Reason2 on NVIDIA Jetson edge devices (AGX Thor, Orin).
Performance on Jetson AGX Thor
- ⚛️ Text-Only Physics (~11s)
- 🚗 Driving Analysis + CoT (~16s)

All recordings above were captured on Jetson AGX Thor with 132 GB of unified memory.
Supported Jetson Devices
| Device | GPU Memory | Model | Status |
|---|---|---|---|
| Jetson AGX Thor | 132 GB | 2B + 8B | ✅ |
| Jetson AGX Orin 64 | 64 GB | 2B + 8B | ✅ |
| Jetson AGX Orin 32 | 32 GB | 2B | ✅ |
| Jetson Orin NX 16 | 16 GB | None | ❌ Not enough memory |
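
The table reads as a simple memory threshold. A minimal sketch of that lookup follows, assuming the cut-offs are exactly the listed rows; the function name `supported_models` is hypothetical and not part of strands-cosmos, and any memory size between two rows is an extrapolation rather than a tested configuration.

```python
def supported_models(memory_gb: float) -> list[str]:
    """Return the Cosmos-Reason2 sizes the table above marks as supported.

    Thresholds are read off the table rows, not measured here:
    64 GB and up runs 2B and 8B, 32 GB runs 2B only, less runs neither.
    """
    if memory_gb >= 64:
        return ["2B", "8B"]
    if memory_gb >= 32:
        return ["2B"]
    return []


if __name__ == "__main__":
    for name, mem in [("AGX Thor", 132), ("AGX Orin 32", 32), ("Orin NX 16", 16)]:
        print(f"{name}: {supported_models(mem) or 'not supported'}")
```

An empty list corresponds to the "Not enough memory" row.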
Setup
```bash
# 1. Install
pip install strands-cosmos strands-agents
# 2. Fix CUBLAS (required for Jetson)
strands-cosmos-fix-cublas
```
The CUBLAS Problem
PyTorch wheels installed from pip bundle their own `libcublas.so`, which is not built for Jetson GPU architectures:
- Thor: SM 11.0, not in pip torch's CUBLAS
- Orin: SM 8.7, may not be in pip torch's CUBLAS

Symptom: `CUBLAS_STATUS_INVALID_VALUE` on any matrix operation.
```mermaid
graph LR
    A["pip install torch"] --> B["Bundled CUBLAS<br/>SM 7.0–9.0 only"]
    B -->|"Jetson SM 11.0"| C["❌ CUBLAS_STATUS_INVALID_VALUE"]
    D["strands-cosmos-fix-cublas"] --> E["System CUBLAS<br/>from JetPack"]
    E -->|"Jetson SM 11.0"| F["✅ Works"]
    style C fill:#9b2226,color:#fff
    style F fill:#2d6a4f,color:#fff
```
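
To confirm that a failure is this CUBLAS mismatch and not something else, one option is to run a tiny matrix multiply and look for the status string in the exception. The sketch below assumes the error surfaces as a `RuntimeError` whose message contains `CUBLAS_STATUS_INVALID_VALUE`; the helper names `is_cublas_arch_error` and `probe_cublas` are hypothetical and not part of strands-cosmos.

```python
def is_cublas_arch_error(exc: BaseException) -> bool:
    """True if the exception message carries the CUBLAS status from the symptom above."""
    return "CUBLAS_STATUS_INVALID_VALUE" in str(exc)


def probe_cublas() -> str:
    """Run one small GPU matmul and report what happened as a short string."""
    try:
        import torch  # imported lazily so the helper above works without torch
    except ImportError:
        return "torch not installed"
    if not torch.cuda.is_available():
        return "no CUDA device"
    try:
        a = torch.randn(64, 64, device="cuda")
        torch.mm(a, a)            # float32 matmul on the GPU
        torch.cuda.synchronize()  # surface asynchronous CUDA errors here
    except RuntimeError as exc:
        if is_cublas_arch_error(exc):
            return "cublas mismatch"
        raise  # some other failure; do not hide it
    major, minor = torch.cuda.get_device_capability(0)
    return f"ok (SM {major}.{minor})"


if __name__ == "__main__":
    print(probe_cublas())
```

A result of `cublas mismatch` suggests the fix described in this section applies; any other exception is re-raised unchanged.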
Fix Commands
```bash
# Auto-detect and fix
strands-cosmos-fix-cublas
# Check status without fixing
strands-cosmos-fix-cublas --check
# Revert to original
strands-cosmos-fix-cublas --revert
```
What the Fix Does
- Backs up torch's bundled `libcublas.so` and `libcublasLt.so`
- Copies system CUBLAS from JetPack (`/usr/local/cuda/targets/*/lib/`)
- Verifies with a quick `torch.mm` test
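
The back-up-then-copy steps above can be illustrated as follows. This is only a sketch of the idea, not the code of `strands-cosmos-fix-cublas`: the function name `swap_cublas`, the `.bak` suffix, and the rule of matching files by identical name are all assumptions, and both directories are passed in by the caller and never guessed.

```python
import shutil
from pathlib import Path

CUBLAS_PATTERNS = ("libcublas.so*", "libcublasLt.so*")


def swap_cublas(bundled_dir: Path, system_dir: Path, suffix: str = ".bak") -> list[Path]:
    """Back up bundled CUBLAS files, then overwrite them with same-named system copies.

    Returns the bundled paths that were replaced.
    """
    replaced = []
    for pattern in CUBLAS_PATTERNS:
        for bundled in sorted(bundled_dir.glob(pattern)):
            if bundled.name.endswith(suffix):
                continue  # a backup left by an earlier run
            system = system_dir / bundled.name
            if not system.is_file():
                continue  # no same-named system library; leave this file alone
            backup = bundled.with_name(bundled.name + suffix)
            if not backup.exists():
                shutil.copy2(bundled, backup)  # keep only the first original
            shutil.copy2(system, bundled)
            replaced.append(bundled)
    return replaced


def verify() -> bool:
    """Quick matmul check; needs torch and a CUDA device, raises on failure."""
    import torch
    a = torch.randn(8, 8, device="cuda")
    torch.mm(a, a)
    torch.cuda.synchronize()
    return True
```

Keeping the first backup untouched on repeated runs is what makes a later revert possible in this sketch: restoring means copying each `.bak` file back over its original name.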
Run after every torch upgrade
If you upgrade PyTorch, re-run strands-cosmos-fix-cublas — the new torch will overwrite the fix.
Benchmarks (Cosmos-Reason2-2B on Thor)
| Task | Time | Recording |
|---|---|---|
| Text-only physics | ~11s | cast |
| Video caption (10s @ 4fps) | ~15s | cast |
| Driving analysis + CoT | ~16s | cast |
| Embodied reasoning + CoT | ~43s | cast |
| Tool invocation | ~9s | cast |
Troubleshooting
```mermaid
flowchart TD
    ERR["Error on Jetson?"] --> E1{"Error message?"}
    E1 -->|"CUBLAS_STATUS_INVALID_VALUE"| FIX1["strands-cosmos-fix-cublas"]
    E1 -->|"Out of memory"| FIX2["Use 2B model<br/>Reduce max_vision_tokens"]
    E1 -->|"Model download fails"| FIX3["Check HF token:<br/>huggingface-cli login"]
    E1 -->|"Slow inference"| FIX4["Ensure GPU is in MAX power mode:<br/>sudo nvpmodel -m 0"]
    style ERR fill:#9b2226,color:#fff
    style FIX1 fill:#2d6a4f,color:#fff
    style FIX2 fill:#2d6a4f,color:#fff
    style FIX3 fill:#2d6a4f,color:#fff
    style FIX4 fill:#2d6a4f,color:#fff
```
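
The same decision tree can be written as a small lookup for use in scripts or log triage. This is a sketch under assumptions: `suggest_fix` is a hypothetical helper, and the substrings it matches (in particular `"download"`) are illustrative guesses at what the error text contains. Slow inference is left out because it does not produce an error message.

```python
from typing import Optional

# (substring to look for, suggestion from the flowchart above)
FIXES = [
    ("cublas_status_invalid_value", "Run: strands-cosmos-fix-cublas"),
    ("out of memory", "Use the 2B model and reduce max_vision_tokens"),
    ("download", "Check the Hugging Face token: huggingface-cli login"),
]


def suggest_fix(error_message: str) -> Optional[str]:
    """Return the first matching suggestion, or None if nothing matches."""
    lowered = error_message.lower()
    for needle, suggestion in FIXES:
        if needle in lowered:
            return suggestion
    return None


if __name__ == "__main__":
    print(suggest_fix("RuntimeError: CUDA out of memory. Tried to allocate 2.00 GiB"))
```

Matching is case-insensitive and order-dependent, so the more specific CUBLAS entry is checked first.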
What's Next
- Quickstart — Run your first agent
- Video Understanding — Process video on Jetson
- Examples — All runnable examples with recordings