Four-step image referring-expression pipeline: turns images plus KITTI bounding-box labels into region descriptions, scene captions, grounded referring expressions, and (optionally) verified expressions via VLM distillation. Use when the user wants to generate referring-expression annotations from images with KITTI labels, build region descriptions, produce grouped grounding phrases tied to bboxes, run a double-check verification pass on grounding expressions, auto-label traffic / scene images for referring datasets, or run the image_referring_expression pipeline. Triggers include 'referring expression', 'region description', 'KITTI labels', 'spatial relationship annotation', 'auto-label image referring expression', 'image_referring_expression'.
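For orientation, here is a minimal sketch of how a KITTI label line could be read into a 2D bounding box before any region description is generated. It assumes the standard KITTI object-label layout (class name first, then truncation, occlusion, alpha, and the 2D box as left, top, right, bottom in pixels). The names `KittiBox`, `parse_kitti_line`, and `horizontal_position` are hypothetical and are not taken from the skill itself, whose actual parsing code is not shown here.

```python
from dataclasses import dataclass


@dataclass
class KittiBox:
    """2D box from one KITTI label line (hypothetical helper type)."""
    label: str
    left: float
    top: float
    right: float
    bottom: float


def parse_kitti_line(line: str) -> KittiBox:
    """Parse one KITTI object-label line, keeping only class and 2D bbox.

    Assumed field order: type, truncated, occluded, alpha,
    bbox_left, bbox_top, bbox_right, bbox_bottom, then 3D fields.
    """
    fields = line.split()
    if len(fields) < 8:
        raise ValueError(f"expected at least 8 fields, got {len(fields)}")
    left, top, right, bottom = (float(v) for v in fields[4:8])
    return KittiBox(fields[0], left, top, right, bottom)


def horizontal_position(box: KittiBox, image_width: float) -> str:
    """Coarse spatial cue: which third of the image the box centre falls in."""
    center_x = (box.left + box.right) / 2.0
    if center_x < image_width / 3.0:
        return "left"
    if center_x < 2.0 * image_width / 3.0:
        return "center"
    return "right"


if __name__ == "__main__":
    sample = ("Car 0.00 0 -1.58 587.01 173.33 614.12 200.12 "
              "1.65 1.67 3.64 -0.65 1.71 46.70 -1.59")
    box = parse_kitti_line(sample)
    # Image width of 1242 px is an assumption for this example.
    print(box.label, horizontal_position(box, image_width=1242))
```

A coarse cue like this is only one possible input to a referring expression; the real pipeline may derive spatial relationships differently.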