Driving Analysis with Chain-of-Thought¶
Autonomous driving safety analysis using dashcam video with step-by-step reasoning.
Terminal Recording¶

📺 Can't see the animation? Download MP4
View full output
$ python examples/03_driving_analysis.py
=== 03: Driving Analysis (CoT) ===
Loading nvidia/Cosmos-Reason2-2B (vision, reasoning=True)... ✅ loaded
Processing video: sample.mp4 @ 4 FPS... 40 frames
Agent:
<think>
Let me analyze this dashcam footage step by step.
First, key elements:
- Suburban residential street
- Multiple parked vehicles on both sides
- A pedestrian on the right sidewalk
- Trees creating partial shade
- Road appears dry, good visibility
Potential hazards:
1. Pedestrian could step into the road
2. Car could open a door from parked vehicles
3. Limited visibility at upcoming intersection
4. Children might be playing behind parked cars
</think>
⚠️ Safety Hazards:
• Pedestrian on right sidewalk — risk of jaywalking
• Parked cars on both sides — door opening risk
• Upcoming intersection — limited cross-traffic visibility
✅ Recommended Actions:
• Maintain ≤25 mph through residential zone
• Stay center-lane, avoid door zone
• Watch for pedestrian course changes
• Slow to 15 mph approaching intersection
Time: 16.4s
=== PASS ===
Play locally: asciinema play docs/assets/casts/03_driving_analysis.cast
Code¶
examples/03_driving_analysis.py
from strands import Agent
from strands_cosmos import CosmosVisionModel
model = CosmosVisionModel(
model_id="nvidia/Cosmos-Reason2-2B",
reasoning=True, # Enable <think> CoT
fps=4,
params={"max_tokens": 2048, "temperature": 0.6},
)
agent = Agent(model=model)
result = agent(
"<video>sample.mp4</video> "
"Identify safety hazards and recommend actions."
)
Chain-of-Thought Flow¶
graph LR
V["🎬 Dashcam Video"] --> F["Extract Frames<br/>@ 4 FPS"]
F --> M["Cosmos-Reason2"]
M --> T["<think><br/>Step-by-step<br/>analysis<br/></think>"]
T --> H["⚠️ Hazards"]
T --> A["✅ Actions"]
style T fill:#fff3cd,stroke:#ffc107,color:#333
style H fill:#9b2226,color:#fff
style A fill:#2d6a4f,color:#fff
Why Chain-of-Thought Matters for Driving¶
Without CoT, the model might miss subtle hazards. With reasoning=True:
| Aspect | Without CoT | With CoT |
|---|---|---|
| Output | Direct answer | Reasoning → Answer |
| Hazard detection | Obvious only | Subtle + obvious |
| Confidence | "Be careful" | Specific actions with rationale |
| Token count | ~200 | ~600 |
| Inference time | ~8s | ~16s |
Built-in driving prompt
Cosmos includes an optimized driving analysis prompt:
Use Cases¶
- Fleet safety monitoring — Automated dashcam review
- ADAS development — Scene understanding for driver assistance
- Insurance analysis — Incident reconstruction from video
- Driving instruction — Identify learning moments in student footage
- Autonomous vehicle testing — Perception validation
→ Next: Embodied Reasoning | All Examples