Chain-of-Thought Reasoning¶
Enable <think> step-by-step reasoning before the final answer. Critical for complex physical reasoning tasks.
See It In Action¶

📺 Can't see the animation? Download MP4
How It Works¶
graph LR
Q["Question"] --> T["<think><br/>Step-by-step<br/>reasoning<br/></think>"]
T --> A["Final Answer"]
style T fill:#fff3cd,stroke:#ffc107,color:#333
When reasoning=True, Cosmos generates a reasoning trace wrapped in <think>...</think> tags, then provides a concise final answer.
Enable Chain-of-Thought¶
from strands import Agent
from strands_cosmos import CosmosVisionModel
model = CosmosVisionModel(
model_id="nvidia/Cosmos-Reason2-2B",
reasoning=True, # ← This enables CoT
params={"temperature": 0.6},
)
agent = Agent(model=model)
→ Full driving analysis example with CoT
Example Output¶
<think>
I can see a residential street with parked cars on both sides.
There is a crosswalk ahead with a pedestrian waiting to cross.
The traffic light appears to be green for the ego vehicle.
However, the pedestrian might step onto the road.
The recommended action is to slow down and prepare to yield.
</think>
The driver should slow down approaching the crosswalk and be
prepared to yield to the pedestrian on the right.
When to Use Reasoning¶
| Scenario | reasoning= |
Why |
|---|---|---|
| Quick captions | False |
Speed matters |
| Safety analysis | True |
Need thorough analysis |
| Robot planning | True |
Multi-step planning |
| Video descriptions | False |
Direct description |
| Physics reasoning | True |
Logical chain needed |
Performance Impact¶
Reasoning increases output length and inference time:
| Mode | Typical Tokens | Inference Time (2B) |
|---|---|---|
Direct (reasoning=False) |
100–500 | 5–15s |
CoT (reasoning=True) |
500–2000 | 15–55s |
Temperature for reasoning
Use temperature=0.6 with reasoning enabled. Lower values produce more focused chains, higher values explore more possibilities.
Robot Embodied Reasoning with CoT¶

model = CosmosVisionModel(
model_id="nvidia/Cosmos-Reason2-2B",
reasoning=True,
)
agent = Agent(model=model)
agent("<image>robot_workspace.png</image> What is the next action?")
→ Full embodied reasoning example
What's Next¶
- Video Understanding — Apply CoT to video analysis
- Tool Usage — Use reasoning in multi-agent setups