Basic Text — Physics Reasoning¶

Text-only inference with Cosmos-Reason2. No video or image needed — pure physical world reasoning.

Terminal Recording¶

Basic text inference demo

📺 Can't see the animation? Download MP4

View output text

$ python examples/01_basic_text.py
=== 01: Basic Text Inference ===
Loading nvidia/Cosmos-Reason2-2B... ✅ loaded

Agent: When a ball rolls down a ramp, several physics principles are at work:

1. Gravitational Potential Energy → Kinetic Energy
   The ball at the top has PE = mgh. As it descends,
   gravity converts this to KE = ½mv².

2. Rolling Without Slipping
   Static friction at the contact point causes the ball
   to rotate rather than slide.

3. Moment of Inertia
   For a solid sphere: I = (2/5)mr². ~71% of energy goes
   to translation, ~29% to rotation.

4. Acceleration
   a = (5/7)g·sin(θ), less than a sliding block.

Time: 11.2s
=== PASS ===

Play locally: asciinema play docs/assets/casts/01_basic_text.cast

Code¶

examples/01_basic_text.py

from strands import Agent
from strands_cosmos import CosmosModel

model = CosmosModel(model_id="nvidia/Cosmos-Reason2-2B")
agent = Agent(model=model)

result = agent("Explain the physics of a ball rolling down a ramp. Be concise.")

How It Works¶

sequenceDiagram
    participant You
    participant Agent as Strands Agent
    participant Cosmos as Cosmos-Reason2

    You->>Agent: "Explain physics of ball on ramp"
    Agent->>Cosmos: Tokenize text prompt
    Cosmos->>Cosmos: Autoregressive generation
    Cosmos-->>Agent: Physics explanation tokens
    Agent-->>You: Formatted response

Key Points¶

Uses CosmosModel (text-only) — lighter than vision model
No GPU memory needed for vision encoder
Good for physics reasoning, causal inference, knowledge queries
~11s on Jetson AGX Thor

When to use CosmosModel vs CosmosVisionModel

Use CosmosModel for text-only tasks. It loads faster and uses less memory. Use CosmosVisionModel when you need video or image input.

→ Next: Video Captioning | All Examples