Skip to content

Tool Usage

Use Cosmos tools inside any Strands agent — Bedrock, Anthropic, OpenAI, Ollama, or any other provider.


Overview

strands-cosmos provides 21 tools that let any agent orchestrate the full NVIDIA Cosmos pipeline:

Category Tools
Reason2 VLM cosmos_inference, cosmos_reason_hf, cosmos_serve
World Models cosmos_predict_generate, cosmos_transfer_generate
Model Lifecycle cosmos_model_download, cosmos_quantize, cosmos_export_onnx, cosmos_build_engine
Training cosmos_post_train, cosmos_distill
Data & Eval cosmos_curate, cosmos_evaluate
I/O rtp_capture_frame, nats_publish, video_probe, video_extract_frames, image_read
System cosmos_sysinfo
Legacy cosmos_invoke, cosmos_vision_invoke

Basic Example

from strands import Agent
from strands_cosmos import cosmos_reason_hf, video_probe, cosmos_sysinfo

# Create an agent with Cosmos tools (agent uses any model — Bedrock, OpenAI, etc.)
agent = Agent(tools=[cosmos_reason_hf, video_probe, cosmos_sysinfo])

# The agent decides when to call each tool
agent("Check the GPU, then analyze /tmp/dashcam.mp4 for safety hazards")

Vision Tool (Direct VLM Inference)

from strands import Agent
from strands_cosmos import cosmos_reason_hf

agent = Agent(tools=[cosmos_reason_hf])

# Video analysis
agent("Use cosmos_reason_hf to describe what's happening in /tmp/scene.mp4")

# Image analysis
agent("Use cosmos_reason_hf to identify objects in /tmp/workspace.jpg")

Pipeline Orchestration

An agent can chain multiple tools to automate the full Cosmos workflow:

from strands import Agent
from strands_cosmos import (
    cosmos_model_download,
    cosmos_quantize,
    cosmos_export_onnx,
    cosmos_build_engine,
    cosmos_serve,
    cosmos_inference,
)

agent = Agent(tools=[
    cosmos_model_download,
    cosmos_quantize,
    cosmos_export_onnx,
    cosmos_build_engine,
    cosmos_serve,
    cosmos_inference,
])

# Agent orchestrates the full edge-deployment pipeline
agent("""
Deploy Cosmos-Reason2-2B to this Jetson Thor:
1. Download the model
2. Quantize to FP8
3. Export to ONNX
4. Build TRT engine
5. Start the server
6. Run a test inference with "describe the scene"
""")

Media Utilities

from strands import Agent
from strands_cosmos import video_probe, video_extract_frames, image_read

agent = Agent(tools=[video_probe, video_extract_frames, image_read])

# Agent can analyze video metadata, extract frames, read images
agent("Probe /tmp/video.mp4, extract 4 frames at 1fps to /tmp/frames/, then read the first frame")

System Diagnostics

from strands import Agent
from strands_cosmos import cosmos_sysinfo

agent = Agent(tools=[cosmos_sysinfo])
agent("What GPU do I have and how much memory is available?")

World Model Generation

from strands import Agent
from strands_cosmos import cosmos_predict_generate

agent = Agent(tools=[cosmos_predict_generate])
agent("Generate future video frames using the config at /tmp/predict_config.json")

All Tools in One Agent

from strands import Agent
from strands_cosmos import *  # Import everything

# Create an omniscient Cosmos agent
agent = Agent(
    model=CosmosVisionModel(model_id="nvidia/Cosmos-Reason2-2B"),
    tools=[
        cosmos_sysinfo, video_probe, video_extract_frames, image_read,
        cosmos_model_download, cosmos_predict_generate, cosmos_transfer_generate,
        cosmos_quantize, cosmos_export_onnx, cosmos_build_engine,
        cosmos_serve, cosmos_post_train, cosmos_distill,
        cosmos_curate, cosmos_evaluate, rtp_capture_frame, nats_publish,
    ],
)

agent("What can you do?")  # Agent will describe all available tools

How Tools Work Internally

Each tool is a thin Python wrapper that delegates to just <recipe> commands:

@tool
def cosmos_predict_generate(config_path: str) -> dict:
    """Generate video with Cosmos Predict2.5 world model."""
    return _run_just("predict-generate", config_path)

This means: - Tools are reproducible — run the same just command manually - Tools are platform-aware — justfile handles OS/GPU detection - Tools auto-clone dependencies — just ensure-predict runs if repo is missing