Pose + seg pipeline
Combine 308-keypoint pose estimation with body-part segmentation to get per-limb attention weights that answer the question "how much of this limb is visible and confidently detected?".
This is a building block for action recognition, injury analysis, or clothing fit.
Code
import json
import numpy as np
from strands_sapiens import sapiens_seg, sapiens_pose
IMG = "dance.jpg"
OUT = "out/"
# 1) Run both heads
sapiens_seg(input_path=IMG, output_dir=OUT, model_size="0.4b")
sapiens_pose(input_path=IMG, output_dir=OUT, model_size="0.4b", kpt_thres=0.3)
# 2) Load raw outputs
seg = np.load(f"{OUT}/dance_seg.npy") # (H, W) class indices
with open(f"{OUT}/dance.json") as f:  # close the file once it is parsed
    pose = json.load(f)  # {"instances": [...]}
# 3) For each detected person, compute per-limb coverage
# (rough mapping - check your palette for exact ids)
LIMB_CLASSES = {
    "head": [3, 4, 5],
    "torso": [1, 2],
    "left_arm": [6, 7, 10],
    "right_arm": [8, 9, 11],
    "left_leg": [12, 13, 16],
    "right_leg": [14, 15, 17],
}
H, W = seg.shape
for person in pose["instances"]:
    kpts = np.array(person["keypoints"])  # (308, 2)
    scores = np.array(person["keypoint_scores"])  # (308,)
    x1, y1, x2, y2 = map(int, person["bbox"])
    # Clip the bbox to the image before cropping the segmentation map
    crop = seg[max(0, y1):min(H, y2), max(0, x1):min(W, x2)]
    crop_area = crop.size or 1  # avoid division by zero on an empty crop
    print(f"Person bbox=({x1},{y1})-({x2},{y2}) avg_kpt_score={scores.mean():.2f}")
    for limb, ids in LIMB_CLASSES.items():
        # Fraction of bbox pixels labeled with any of this limb's class ids
        share = np.isin(crop, ids).sum() / crop_area
        print(f" {limb:10s}: {share:.1%} of bbox pixels")
Sample output
Person bbox=(120,48)-(540,820) avg_kpt_score=0.78
 head      : 4.2% of bbox pixels
 torso     : 18.6% of bbox pixels
 left_arm  : 6.3% of bbox pixels
 right_arm : 7.1% of bbox pixels
 left_leg  : 11.5% of bbox pixels
 right_leg : 12.0% of bbox pixels
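The class ids in LIMB_CLASSES are described in the code as a rough mapping, so it can help to look at which ids actually occur in a segmentation map before trusting the per-limb shares. The sketch below is one way to do that with plain NumPy. The helper name class_histogram is hypothetical, and the small array stands in for the real (H, W) map of class indices.

```python
import numpy as np

def class_histogram(seg: np.ndarray) -> dict:
    """Return {class_id: fraction of pixels} for a 2-D map of class indices."""
    ids, counts = np.unique(seg, return_counts=True)
    total = seg.size or 1  # guard against an empty map
    return {int(i): float(c) / total for i, c in zip(ids, counts)}

# Tiny synthetic map standing in for the real segmentation output
demo = np.array([
    [0, 0, 3, 3],
    [0, 1, 1, 3],
    [0, 1, 1, 0],
])

hist = class_histogram(demo)
for class_id, frac in sorted(hist.items()):
    print(f"class {class_id:2d}: {frac:.1%} of pixels")
```

Running the same helper on the loaded seg array, and comparing the ids it reports against the palette that ships with the model, shows whether any limb in LIMB_CLASSES points at an id that never appears.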
Going further
- Occlusion detection: if left_arm has low seg share and low keypoint score, the arm is probably occluded.
- Self-healing pipeline: if pose confidence < 0.3 in a region, fall back to seg-only analysis.
- Action features: per-limb (seg_area × pose_confidence) is a dense, pose-aware descriptor.
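The ideas in this list can be combined into one small helper. The sketch below is illustrative only and rests on several assumptions. Keypoints are taken to be (x, y) pixel coordinates in the full image. Each keypoint is assigned to a limb by the segmentation class under it, which is a shortcut that avoids needing a keypoint-index-to-limb table and is not necessarily how the library groups keypoints. The function name limb_descriptor and the thresholds MIN_SHARE and MIN_SCORE are hypothetical.

```python
import numpy as np

# Hypothetical thresholds, chosen only for illustration
MIN_SHARE = 0.02  # below this fraction of bbox pixels, seg share counts as "low"
MIN_SCORE = 0.3   # below this mean keypoint score, pose confidence counts as "low"

def limb_descriptor(seg, kpts, scores, bbox, limb_classes):
    """Per-limb seg share, mean keypoint score, and their product for one bbox."""
    H, W = seg.shape
    x1, y1, x2, y2 = (int(v) for v in bbox)
    crop = seg[max(0, y1):min(H, y2), max(0, x1):min(W, x2)]
    area = crop.size or 1  # avoid division by zero on an empty crop

    # Class id under each keypoint (coordinates rounded and clipped to the image)
    xs = np.clip(np.rint(kpts[:, 0]).astype(int), 0, W - 1)
    ys = np.clip(np.rint(kpts[:, 1]).astype(int), 0, H - 1)
    kpt_cls = seg[ys, xs]

    out = {}
    for limb, ids in limb_classes.items():
        share = float(np.isin(crop, ids).sum() / area)
        on_limb = np.isin(kpt_cls, ids)
        # No keypoint lands on this limb: treat pose confidence as zero
        conf = float(scores[on_limb].mean()) if on_limb.any() else 0.0
        out[limb] = {
            "share": share,
            "conf": conf,
            "weight": share * conf,  # seg_area x pose_confidence
            "maybe_occluded": bool(share < MIN_SHARE and conf < MIN_SCORE),
        }
    return out

# Synthetic example: a 10x10 map with a 6x4 "torso" patch (class 1)
demo_seg = np.zeros((10, 10), dtype=int)
demo_seg[2:8, 3:7] = 1
demo_kpts = np.array([[4.0, 3.0], [5.0, 5.0], [0.0, 0.0]])  # (x, y)
demo_scores = np.array([0.9, 0.7, 0.1])
desc = limb_descriptor(demo_seg, demo_kpts, demo_scores, (0, 0, 10, 10),
                       {"torso": [1], "head": [3]})
for limb, d in desc.items():
    print(limb, d)
```

Because keypoints are grouped by the class under them, a limb with no segmented pixels also ends up with zero pose confidence by construction. If a keypoint-to-limb index table is available for the 308-keypoint layout, using it instead would keep the two signals independent.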