# Strands ADB

Give your agent a phone.

`@tool`-decorated Android control for Strands Agents & DevDuck — drive any adb-connected Android device (phone, tablet, emulator) from an LLM.
## The Pitch
Your agent can already read files, call APIs, and run shell commands. Now it can also:
- See the screen (screenshots come back as Converse API image blocks, not paths)
- Tap & swipe — real UI automation
- Read notifications, battery, sensors, thermals
- Launch apps, open URLs, send SMS drafts, make calls
- Take physical photos (yes, through the actual camera)
- Stream logcat into your agent's event bus
- Mutate settings — brightness, ringer, airplane mode, bluetooth
One tool. One verb: `action="..."`. Works over USB or wireless adb.
## 2-Minute Quickstart

```bash
pip install strands-adb
brew install android-platform-tools  # adb on PATH
adb devices                          # plug in phone, accept USB debugging
```

```python
from strands import Agent
from strands_adb import adb

agent = Agent(tools=[adb])
agent("take a screenshot of my phone and describe what's on screen")
```
→ Full Quickstart | Installation | Connect a Device
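Under the hood the tool shells out to adb, so `action="list_devices"` amounts to running `adb devices -l` and parsing the output. A minimal sketch of that parsing — `list_devices` here is an illustrative helper, not the package's actual API:

```python
def list_devices(raw: str) -> list[dict]:
    """Parse `adb devices -l` output into serial/state/property dicts."""
    devices = []
    for line in raw.splitlines()[1:]:          # skip the "List of devices attached" header
        parts = line.split()
        if len(parts) < 2:                     # blank or malformed line
            continue
        serial, state, *extras = parts
        info = {"serial": serial, "state": state}
        for extra in extras:                   # e.g. "model:Pixel_7"
            key, _, value = extra.partition(":")
            if value:
                info[key] = value
        devices.append(info)
    return devices

# Live usage would feed it real output:
# raw = subprocess.run(["adb", "devices", "-l"], capture_output=True, text=True).stdout
sample = """List of devices attached
emulator-5554          device product:sdk_gphone64_x86_64 model:sdk_gphone64_x86_64
192.168.1.42:5555      device product:panther model:Pixel_7
"""
print(list_devices(sample))
```

Note that wireless-adb serials like `192.168.1.42:5555` come through unchanged, which is what makes multi-serial fleet targeting straightforward.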
## 👁️ Agent Can SEE the Screen

`screenshot` returns a proper Converse API image block — the same format as `strands_tools.image_reader`. The agent doesn't just get a file path; it receives the actual pixels and reasons over them.
```python
agent("take a screenshot and tell me what app is open")
# → adb(action="screenshot") returns PNG bytes in a Converse image block
# → vision model reads it → "You're on the WhatsApp chat with Mom..."
```
```mermaid
graph LR
  A["🗣️ Agent"] -->|action=screenshot| B["📱 adb exec-out screencap"]
  B -->|PNG bytes| C["🖼️ Converse image block"]
  C -->|in context| D["👁️ Vision model"]
  D -->|"WhatsApp, chat w/ Mom"| A
```
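The wrapping step in the diagram is small but it's the whole trick: raw `screencap` bytes go into a Converse-shaped content block instead of a temp file. A sketch of that shape (the helper name is illustrative; the block layout follows the Bedrock Converse API):

```python
def screenshot_to_converse_block(png_bytes: bytes) -> dict:
    """Wrap raw screencap PNG bytes in a Converse API image content block."""
    return {"image": {"format": "png", "source": {"bytes": png_bytes}}}

# A live capture would look like:
# png = subprocess.run(["adb", "exec-out", "screencap", "-p"], capture_output=True).stdout
block = screenshot_to_converse_block(b"\x89PNG\r\n\x1a\n")
print(block["image"]["format"])
```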
## Capabilities

- **👁️ Vision** — screenshots as image blocks; the agent literally sees the screen.
- **🎯 Smart Tap** — find UI elements by text/content-desc, tap by semantic meaning.
- **📸 Physical Camera** — drive GoogleCamera via intent + shutter for real photos and real video.
- **🔔 Notifications & Logcat** — parse notifications, stream logcat events to DevDuck's event bus.
- **🌡️ Sensors & Thermals** — accelerometer, gyro, light, CPU temps: the phone as a sensor platform. → Sensors
- **⚙️ Settings Mutation** — ringer mode, brightness, airplane mode, Bluetooth: programmatic device state. → Settings
- **♿ Accessibility** — read the screen via the accessibility service, control magnification and captions.
- **🤖 DevDuck Native** — `DEVDUCK_TOOLS="strands_adb:adb"` — zero-config integration.
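Smart tap is typically built on a view-hierarchy dump (`adb shell uiautomator dump`): match a node by `text` or `content-desc`, then tap the center of its `bounds`. A sketch under those assumptions — the helper name and exact matching rules are illustrative:

```python
import re
import xml.etree.ElementTree as ET

def find_tap_point(xml_dump: str, label: str):
    """Return the (x, y) center of the first node whose text or content-desc matches."""
    root = ET.fromstring(xml_dump)
    for node in root.iter("node"):
        if label in (node.get("text", ""), node.get("content-desc", "")):
            # bounds look like "[x1,y1][x2,y2]"
            x1, y1, x2, y2 = map(int, re.findall(r"\d+", node.get("bounds", "")))
            return ((x1 + x2) // 2, (y1 + y2) // 2)
    return None

sample = """<hierarchy rotation="0">
  <node text="" content-desc="" bounds="[0,0][1080,2400]">
    <node text="Send" content-desc="Send message" bounds="[900,2200][1060,2320]"/>
  </node>
</hierarchy>"""
print(find_tap_point(sample, "Send"))
```

The resulting point then feeds a plain `adb shell input tap <x> <y>`, which is how "tap by semantic meaning" bottoms out in coordinates.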
## DevDuck in 1 Line

```bash
export DEVDUCK_TOOLS="strands_adb:adb;strands_tools:shell"
devduck "open whatsapp, read the last message from mom, reply 'on my way'"
```
## 90+ Actions, One Tool

```python
adb(action="list_devices")
adb(action="screenshot")
adb(action="smart_tap", text="Send")
adb(action="launch", package="com.whatsapp")
adb(action="camera_photo", facing="front")
adb(action="notifications_parsed")
adb(action="sensors")
adb(action="set_brightness", value=128)
adb(action="log_stream_start", filter="WhatsApp")
adb(action="setting_put", namespace="global", key="airplane_mode_on", value="1")
```
→ Full actions overview | API reference
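The single-verb design implies a registry that maps each `action` string to a handler. A minimal sketch of that dispatch pattern — `register`, `adb_dispatch`, and the handler body are illustrative, not the package's internals:

```python
from typing import Any, Callable

HANDLERS: dict[str, Callable[..., Any]] = {}

def register(name: str):
    """Decorator that registers a handler under an action name."""
    def wrap(fn):
        HANDLERS[name] = fn
        return fn
    return wrap

@register("set_brightness")
def set_brightness(value: int, **_):
    # A real handler would run:
    #   adb shell settings put system screen_brightness <value>
    return f"settings put system screen_brightness {value}"

def adb_dispatch(action: str, **kwargs):
    """One verb in, one handler out — unknown actions fail loudly."""
    if action not in HANDLERS:
        raise ValueError(f"unknown action: {action!r}")
    return HANDLERS[action](**kwargs)

print(adb_dispatch("set_brightness", value=128))
```

Failing loudly on unknown actions matters here: the caller is an LLM, and a clear error message is what lets it self-correct on the next tool call.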
## Architecture

```mermaid
graph TD
  AGENT["🗣️ Strands Agent"] -->|@tool call| ADB["strands_adb.adb"]
  ADB -->|dispatch action| ROUTER{Action Router}
  ROUTER -->|screenshot| SCREEN["📸 screencap → PNG bytes"]
  ROUTER -->|tap/swipe| INPUT["🎯 input tap/swipe"]
  ROUTER -->|launch| AM["📱 am start"]
  ROUTER -->|logcat| LC["📜 logcat -v"]
  ROUTER -->|sensors| DS["🌡️ dumpsys sensorservice"]
  SCREEN --> PHONE["📱 Android Device"]
  INPUT --> PHONE
  AM --> PHONE
  LC --> PHONE
  DS --> PHONE
  SCREEN -->|Converse image block| AGENT
```
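Turning raw logcat output into event-bus events means parsing each line into structured fields. A sketch assuming the `threadtime` format (the default on modern Android); the helper name and field names are illustrative:

```python
import re

# threadtime lines: "MM-DD HH:MM:SS.mmm  PID  TID LEVEL TAG: message"
LOGCAT_RE = re.compile(
    r"^(?P<date>\d{2}-\d{2})\s+(?P<time>\S+)\s+"
    r"(?P<pid>\d+)\s+(?P<tid>\d+)\s+"
    r"(?P<level>[VDIWEF])\s+(?P<tag>[^:]+):\s(?P<msg>.*)$"
)

def parse_logcat_line(line: str):
    """Parse one `logcat -v threadtime` line into an event dict, or None."""
    m = LOGCAT_RE.match(line)
    return m.groupdict() if m else None

line = "03-14 10:22:01.123  1234  5678 D WhatsApp: message received"
print(parse_logcat_line(line))
```

Lines that don't match (stack-trace continuations, buffer separators) come back as `None`, so a streaming loop can skip or buffer them instead of crashing mid-stream.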
## Use Cases
- Personal phone assistant — "text mom back, she asked about dinner"
- Accessibility agent — read notifications aloud, navigate apps via voice
- Device monitoring — watch sensor/thermal streams, alert on anomalies
- Automated testing — semantic UI automation for QA
- Remote operation — control your phone from anywhere via SSH + wireless adb
- Fleet management — one agent, many devices (multi-serial targeting)