Strands ADB

Give your agent a phone.

@tool-decorated Android control for Strands Agents & DevDuck — drive any adb-connected Android device (phone, tablet, emulator) from an LLM.


The Pitch

Your agent can already read files, call APIs, and run shell commands. Now it can also:

  • See the screen (screenshots come back as Converse API image blocks, not paths)
  • Tap & swipe — real UI automation
  • Read notifications, battery, sensors, thermals
  • Launch apps, open URLs, send SMS drafts, make calls
  • Take physical photos (yes, through the actual camera)
  • Stream logcat into your agent's event bus
  • Mutate settings — brightness, ringer, airplane mode, bluetooth

One tool. One verb: action=.... Works over USB or wireless adb.


2-Minute Quickstart

```shell
pip install strands-adb
brew install android-platform-tools   # adb on PATH
adb devices                           # plug in phone, accept USB debugging
```

```python
from strands import Agent
from strands_adb import adb

agent = Agent(tools=[adb])
agent("take a screenshot of my phone and describe what's on screen")
```

Full Quickstart | Installation | Connect a Device


👁️ Agent Can SEE the Screen

screenshot returns a proper Converse API image block — the same format as strands_tools.image_reader. The agent doesn't just get a file path; it receives the actual pixels and reasons over them.

```python
agent("take a screenshot and tell me what app is open")
# → adb(action="screenshot") returns PNG bytes in a Converse image block
# → vision model reads it → "You're on the WhatsApp chat with Mom..."
```

```mermaid
graph LR
    A["🗣️ Agent"] -->|action=screenshot| B["📱 adb exec-out screencap"]
    B -->|PNG bytes| C["🖼️ Converse image block"]
    C -->|in context| D["👁️ Vision model"]
    D -->|"WhatsApp, chat w/ Mom"| A
```
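A minimal sketch of what that image block looks like, assuming the standard Converse API content-block shape (the helper name `make_image_block` is illustrative, not part of the package):

```python
def make_image_block(png_bytes: bytes) -> dict:
    """Wrap raw screenshot bytes in a Converse-style image content block."""
    return {
        "image": {
            "format": "png",                 # Converse also accepts jpeg, gif, webp
            "source": {"bytes": png_bytes},  # raw bytes, not a file path
        }
    }

block = make_image_block(b"\x89PNG...")      # placeholder bytes for illustration
```

Because the bytes ride along inside the tool result, the vision model sees the screen directly — no extra file-read step.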

Capabilities

  • 👁️ Vision

    Screenshots as image blocks. Agent literally sees the screen.

    Vision guide

  • 🎯 Smart Tap

    Find UI elements by text/content-desc, tap by semantic meaning.

    Smart tap

  • 📸 Physical Camera

    Drive GoogleCamera via intent + shutter — real photos, real video.

    Camera guide

  • 🔔 Notifications & Logcat

    Parse notifications, stream logcat events to DevDuck's event bus.

    Logcat streaming

  • 🌡️ Sensors & Thermals

    Accelerometer, gyro, light, CPU temps — the phone as sensor platform.

    Sensors

  • ⚙️ Settings Mutation

    Ringer mode, brightness, airplane, bluetooth — programmatic device state.

    Settings

  • ♿ Accessibility

    Read screen via accessibility service, magnification, captions.

    Accessibility

  • 🤖 DevDuck Native

    DEVDUCK_TOOLS="strands_adb:adb" — zero-config integration.

    DevDuck integration
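One plausible mechanism behind smart tap (a sketch, not necessarily this package's implementation): dump the UI hierarchy with `uiautomator dump`, find the node whose text or content-desc matches, parse its `bounds` attribute, and tap the center. The bounds math looks like:

```python
import re

def center_of_bounds(bounds: str) -> tuple[int, int]:
    """Parse a uiautomator bounds string like '[x1,y1][x2,y2]' into its center point."""
    x1, y1, x2, y2 = map(int, re.findall(r"\d+", bounds))
    return (x1 + x2) // 2, (y1 + y2) // 2

# A "Send" button with bounds [840,1650][1040,1750] gets tapped at its center:
x, y = center_of_bounds("[840,1650][1040,1750]")
# → (940, 1700), i.e. `adb shell input tap 940 1700`
```

Tapping by semantic meaning instead of raw coordinates is what makes the automation survive layout changes.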


DevDuck in 1 Line

```shell
export DEVDUCK_TOOLS="strands_adb:adb;strands_tools:shell"
devduck "open whatsapp, read the last message from mom, reply 'on my way'"
```

DevDuck guide


90+ Actions, One Tool

```python
adb(action="list_devices")
adb(action="screenshot")
adb(action="smart_tap", text="Send")
adb(action="launch", package="com.whatsapp")
adb(action="camera_photo", facing="front")
adb(action="notifications_parsed")
adb(action="sensors")
adb(action="set_brightness", value=128)
adb(action="log_stream_start", filter="WhatsApp")
adb(action="setting_put", namespace="global", key="airplane_mode_on", value="1")
```

Full actions overview | API reference


Architecture

```mermaid
graph TD
    AGENT["🗣️ Strands Agent"] -->|@tool call| ADB["strands_adb.adb"]
    ADB -->|dispatch action| ROUTER{Action Router}

    ROUTER -->|screenshot| SCREEN["📸 screencap → PNG bytes"]
    ROUTER -->|tap/swipe| INPUT["🎯 input tap/swipe"]
    ROUTER -->|launch| AM["📱 am start"]
    ROUTER -->|logcat| LC["📜 logcat -v"]
    ROUTER -->|sensors| DS["🌡️ dumpsys sensorservice"]

    SCREEN --> PHONE["📱 Android Device"]
    INPUT --> PHONE
    AM --> PHONE
    LC --> PHONE
    DS --> PHONE

    SCREEN -->|Converse image block| AGENT
```

Full architecture
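The action router in the diagram can be sketched as a plain dict dispatch from action name to the adb argv it runs. The commands below are illustrative, not the package's exact internals:

```python
# Map each action to the adb argv it would execute (commands are illustrative).
ROUTES = {
    "screenshot": lambda p: ["exec-out", "screencap", "-p"],
    "tap":        lambda p: ["shell", "input", "tap", str(p["x"]), str(p["y"])],
    "sensors":    lambda p: ["shell", "dumpsys", "sensorservice"],
    "logcat":     lambda p: ["logcat", "-d"],
}

def route(action: str, **params) -> list[str]:
    """Resolve an action name to the full adb command line it dispatches to."""
    if action not in ROUTES:
        raise ValueError(f"unknown action: {action}")
    return ["adb"] + ROUTES[action](params)

route("tap", x=940, y=1700)
# → ['adb', 'shell', 'input', 'tap', '940', '1700']
```

A flat dispatch table like this is what lets one `@tool` entry point cover 90+ actions without 90 separate tool definitions.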


Use Cases

  • Personal phone assistant — "text mom back, she asked about dinner"
  • Accessibility agent — read notifications aloud, navigate apps via voice
  • Device monitoring — watch sensor/thermal streams, alert on anomalies
  • Automated testing — semantic UI automation for QA
  • Remote operation — control your phone from anywhere via SSH + wireless adb
  • Fleet management — one agent, many devices (multi-serial targeting)


Resources