Changelog
0.1.0
🧬 Give your agent a body.
Added
- `sapiens_video` tool: frame-by-frame video processing through any dense task (seg/normal/albedo/pointmap) with FPS subsampling, a frame cap, and MP4 reassembly.
- DETR-ResNet-101-DC5 person detector support (upstream sapiens2 standard). Falls back to legacy RTMDet if present.
- Pointmap `.ply` export: when `open3d` is installed, the pointmap tool exports colored point clouds alongside `.npy`.
- Inline image return: all dense tools return output visualizations as inline image content blocks (Converse API compatible).
- Video processing guide at `docs/guide/video.md`.
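To illustrate how FPS subsampling and a frame cap can combine, here is a minimal sketch of frame selection. It is not the `sapiens_video` implementation: the function name `select_frame_indices` and its parameters are hypothetical, and the real tool may pick frames differently.

```python
def select_frame_indices(total_frames, source_fps, target_fps, max_frames=None):
    """Return the source-frame indices to process (hypothetical helper).

    Assumption: subsampling keeps every (source_fps / target_fps)-th frame,
    and the frame cap simply truncates the resulting list.
    """
    if total_frames <= 0 or source_fps <= 0 or target_fps <= 0:
        return []

    # Never upsample: a target rate above the source rate keeps every frame.
    step = max(source_fps / target_fps, 1.0)

    indices = []
    k = 0
    while True:
        # Multiply rather than accumulate to avoid floating-point drift.
        idx = int(k * step)
        if idx >= total_frames:
            break
        indices.append(idx)
        k += 1

    # Apply the frame cap after subsampling.
    if max_frames is not None:
        indices = indices[:max_frames]
    return indices
```

Under these assumptions, a 10-frame clip at 30 FPS subsampled to 10 FPS yields every third frame, and a cap of 2 keeps only the first two of those.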
Fixed
- Albedo config path: `configs/albedo/metasim_render_people` → `configs/albedo/render_people` (matching upstream).
- Pointmap config path: `configs/pointmap/metasim_render_people` → `configs/pointmap/render_people` (matching upstream).
- Pointmap scale normalization: upstream divides by the returned scale factor to get metric coordinates; now we do too.
- Pointmap/Albedo padding removal: upstream removes pipeline padding before resize; now we do too.
- Albedo clamping: upstream clamps output to `[0, 1]`; now we do too.
- Pose detector: updated from RTMDet (`rtmdet_m.pth`) to DETR (`detr-resnet-101-dc5/`), matching sapiens2.
- `sapiens_info` detector reporting: now reports a `detector_type` field (`"detr-resnet-101-dc5"`, `"rtmdet_m"`, or `"none"`).
- Docs: all response format examples corrected to match the actual Strands `ToolResult` format (`status` + `content` list with `text`/`json`/`image` blocks).
- Docs: all RTMDet references updated to DETR throughout guides, API reference, and architecture.
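As a rough illustration of the `status` + `content` response shape mentioned above, the sketch below assembles such a result as a plain dictionary. The helper `make_tool_result` is hypothetical, and the inner layout of the image block (`format`, `source`, `bytes`) is an assumption based on the Converse API style; any other fields the real `ToolResult` carries are left out.

```python
def make_tool_result(summary, payload, png_bytes=None, ok=True):
    """Build a dict with a status and a list of content blocks (hypothetical helper)."""
    content = [
        {"text": summary},   # human-readable summary block
        {"json": payload},   # structured data block
    ]
    if png_bytes is not None:
        # Assumed inline image block layout; check the actual docs before relying on it.
        content.append({"image": {"format": "png", "source": {"bytes": png_bytes}}})
    return {"status": "success" if ok else "error", "content": content}


result = make_tool_result(
    "Segmentation complete",
    {"detector_type": "detr-resnet-101-dc5"},
    png_bytes=b"\x89PNG",  # placeholder bytes, not a valid image
)
```

Keeping each payload in its own block lets a caller pick out the text, JSON, or image part without parsing a combined string.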
Changed
- Version: `0.1.0` (first public release).
- All 8 tools exported in the `TOOLS` list.