Scrask

When the user sends a screenshot via any chat surface (Telegram, iMessage, Slack, etc.), parse it for events and tasks using OpenClaw's configured vision LLM...

SkillRank score: 8.2 / 10

evaluated by implexa, claude-haiku-4-5 · 2026-05-26

scrask-bot parses screenshots for events and tasks via gemini or claude, emits structured intent json, and delegates creation to installed calendar and task skills without direct persistence.

structure: 9.0
trigger phrases: 8.0
procedure: 9.0
edge cases: 8.0
documentation: 8.0

original SKILL.md from clawhub

---
name: scrask-bot
version: 4.3.0
description: >
  When the user sends a screenshot via any chat surface (Telegram, iMessage, Slack, etc.),
  parse it for events and tasks using OpenClaw's configured vision LLM by default, with
  optional Gemini fast-path and Claude fallback for users who bring their own keys. Then
  delegate creation to the user's installed calendar / task skills. Scrask does not write
  to any store itself; it emits structured intent and the agent routes it.
author: sandip
metadata:
  openclaw:
    emoji: "🦞"
    # Invocation: implicit by default (agent reads the Trigger Conditions section
    # of this manifest and routes), with explicit override via the aliases below.
    # If any alias appears at the start of a user message (with or without `@`
    # or `/` prefix), the platform must dispatch to scrask regardless of the
    # implicit trigger conditions.
    invocation:
      mode: hybrid              # 'implicit' | 'explicit' | 'hybrid'
      aliases:
        - scrask
        - "scrask this"
        - screenshot
        - "screenshot to calendar"
    # No mandatory env vars. Default 'auto' provider routing uses OpenClaw's
    # configured vision LLM when no skill-level keys are set, so the skill works
    # out of the box.
    requires:
      env: []
      bins:
        - python3
    optional_env:
      # GEMINI_API_KEY: enables the cheap+fast Gemini-first routing in 'auto' mode,
      #   and is required if you pin --provider gemini.
      # ANTHROPIC_API_KEY: enables Claude fallback in 'auto' mode (when Gemini
      #   confidence is shaky), and is required if you pin --provider claude.
      - GEMINI_API_KEY
      - ANTHROPIC_API_KEY
    suggests:
      # Calendar destination skills (any one is enough)
      - calctl
      - accli
      - apple-calendar
      - brainz-calendar
      - gcal-pro
      # Task destination skills (any one is enough)
      - apple-reminders
      - things-mac
      - notion
    config:
      vision_provider:
        type: string
        description: >
          'auto' (default) routes by what you have: GEMINI_API_KEY → Gemini-first
          with Claude fallback; else ANTHROPIC_API_KEY → Claude only; else falls
          back to OpenClaw's configured vision LLM. 'openclaw' always uses the
          platform LLM. 'gemini' / 'claude' pin a specific provider (and require
          the matching key).
        default: auto
      fallback_threshold:
        type: number
        description: "Worst per-field confidence floor for auto mode. If any per-field score drops below this, Claude reruns the parse."
        default: 0.60
      timezone:
        type: string
        description: "User's IANA timezone. Used when none is detected in the screenshot."
        default: "UTC"
      confidence_threshold:
        type: number
        description: "Legacy 0.0–1.0 per-item gate. Kept for backward-compatible callers; the new thresholds below drive clarification behaviour."
        default: 0.75
      actionable_threshold:
        type: number
        description: "Top-level 'is this actually an event/task?' gate. Below this, the parser flags needs_actionable_confirmation."
        default: 0.70
      type_threshold:
        type: number
        description: "Per-item 'calendar or task list?' gate. Below this type_confidence, the parser emits a type clarification."
        default: 0.70
      field_threshold:
        type: number
        description: "Per mandatory field. Below this confidence (or null value) the parser emits a targeted clarification question for that field."
        default: 0.70
---

# Scrask Bot

## Overview

Scrask is a **screenshot-to-intent parser**. The user sends a screenshot via whatever chat surface
they have wired into OpenClaw (Telegram, iMessage, Slack, etc.). Scrask:

1. Decides whether the screenshot contains any actionable content (event, reminder, task). If not, ignores it.
2. Extracts every actionable item — a single screenshot may yield both an event and a task.
3. Emits structured intent JSON.
4. The OpenClaw agent then delegates each item to the user's installed destination skill:
   - `destination: "calendar"` → `calctl` / `accli` / `apple-calendar` / `brainz-calendar` / `gcal-pro` / etc.
   - `destination: "task"` → `apple-reminders` / `things-mac` / `notion` / etc.

Scrask never writes to a store directly. No service account JSON, no OAuth, no API keys for the
calendar/task layer — that's the destination skill's job.

## Invocation

Scrask is invoked in two ways. The platform tries explicit invocation first; if no alias matches, it falls back to the implicit trigger conditions.

### Explicit override (checked first)

If the user message begins with any of these aliases (case-insensitive, with or without a `@` or `/` prefix), the platform dispatches to Scrask regardless of the implicit conditions below:

- `scrask`
- `scrask this`
- `screenshot`
- `screenshot to calendar`

Examples that force-route to Scrask:

- `scrask this` (with an attached image)
- `@scrask` (with an attached image)
- `/scrask` (with an attached image)
- `screenshot to calendar` (with an attached image)

When invoked explicitly with no image attached, Scrask responds with a brief prompt asking the user to attach a screenshot, then stops. Do not run the parser without an image.
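
The alias check described above can be sketched in Python. This is a minimal sketch rather than OpenClaw's actual dispatcher: `matches_alias` is a hypothetical helper, and requiring whitespace or end-of-message after the alias (so that `screenshots` does not match `screenshot`) is an assumption the manifest does not state.

```python
import re

# Aliases copied from the manifest's invocation block.
ALIASES = ["scrask this", "scrask", "screenshot to calendar", "screenshot"]


def matches_alias(message: str) -> bool:
    """Return True if the message starts with an alias (optional @ or / prefix)."""
    text = message.strip().lower()
    # Drop a single leading "@" or "/" if present.
    if text[:1] in ("@", "/"):
        text = text[1:]
    for alias in ALIASES:
        # Assumption: the alias must be followed by whitespace or end of message.
        if re.match(re.escape(alias) + r"(\s|$)", text):
            return True
    return False
```

A caller would still need to check for an attached image before running the parser, as the paragraph above requires.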

### Implicit (default, used when no alias matches)

The OpenClaw agent reads the incoming message and activates Scrask when:

1. The user sends a message in any connected chat surface that contains an **image attachment**.
2. The image appears to be a **screenshot** — not a photo of a person, place, or physical object.
3. No other skill has already claimed the image.

Do not activate (implicitly) for:

- Photos of people, places, food, scenery.
- Screenshots of code, errors, or UI bugs (leave for other skills).
- Images the user explicitly asks to edit, describe, or analyze for another purpose.

The implicit path is the one users will hit by default. The explicit aliases exist for two cases:

1. **Debugging / power-user override** — force Scrask to run on an ambiguous image the agent would otherwise route elsewhere (or skip).
2. **Recovery** — if the agent misses an obvious screenshot, the user can recover with `scrask this` instead of resending.

## Step-by-Step Instructions

### Step 1: Acknowledge Immediately

Reply on the user's current chat surface so they know the skill is working:

> "📸 Got it — analyzing your screenshot..."

### Step 2: Run the Parser

```bash
python3 {baseDir}/scripts/scrask_bot.py \
  --image-path "<path-to-temp-image>" \
  --provider "$CONFIG_VISION_PROVIDER" \
  --timezone "$CONFIG_TIMEZONE" \
  --confidence-threshold "$CONFIG_CONFIDENCE_THRESHOLD" \
  --actionable-threshold "$CONFIG_ACTIONABLE_THRESHOLD" \
  --type-threshold "$CONFIG_TYPE_THRESHOLD" \
  --field-threshold "$CONFIG_FIELD_THRESHOLD"
```

The script reads credentials from the environment — never pass them on the command line.
In default `auto` mode it routes by what is available:

- `GEMINI_API_KEY` set → Gemini-first with Claude fallback (cheap + fast path).
- `ANTHROPIC_API_KEY` set (no Gemini key) → Claude only.
- Neither set → OpenClaw's configured vision LLM, read from the platform-injected env vars
  `OPENCLAW_VISION_PROVIDER`, `OPENCLAW_VISION_KEY`, and optional `OPENCLAW_VISION_MODEL`.

So the skill works out of the box for any OpenClaw user with a vision-capable LLM
configured at the platform level. Bringing your own Gemini key only adds the cost-and-speed
optimisation on top.
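
The routing rules above can be sketched as a small function. This is an illustration only: `resolve_providers` is a hypothetical name, the real logic lives in `scrask_bot.py` (not shown here), and raising `ValueError` for a pinned provider without its key is an assumption about how that case is reported.

```python
def resolve_providers(vision_provider: str, env: dict) -> list:
    """Return the providers to try, in order, for a single parse."""
    has_gemini = bool(env.get("GEMINI_API_KEY"))
    has_claude = bool(env.get("ANTHROPIC_API_KEY"))

    if vision_provider == "auto":
        if has_gemini:
            # Gemini first; Claude is only a fallback when its key exists.
            return ["gemini", "claude"] if has_claude else ["gemini"]
        if has_claude:
            return ["claude"]
        return ["openclaw"]

    if vision_provider == "openclaw":
        return ["openclaw"]

    # Pinned providers require their matching key.
    required = {"gemini": "GEMINI_API_KEY", "claude": "ANTHROPIC_API_KEY"}
    if vision_provider not in required:
        raise ValueError(f"unknown vision_provider: {vision_provider!r}")
    if not env.get(required[vision_provider]):
        raise ValueError(
            f"{required[vision_provider]} is required for --provider {vision_provider}"
        )
    return [vision_provider]
```

Passing `os.environ` as `env` would mirror how the script reads credentials from the environment.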

The script returns JSON with:

- `success` — whether parsing worked
- `no_actionable_content` — true if nothing actionable was found
- `actionable_confidence` — 0.0–1.0, how sure the parser is the screenshot is actionable
- `needs_actionable_confirmation` — true if `actionable_confidence` falls below `actionable_threshold` (the "maybe" band);
  the bot should confirm "is this actually an event or task?" before dispatching
- `items[]` — one entry per detected item with:
  - `type`, `destination`, `confidence` (legacy aggregate), `type_confidence`
  - `confidences{}` — per-field 0.0–1.0 scores (`title`, `date`, `time`, `location`,
    `participants`, `description`, `priority`, …)
  - `needs_confirmation` — true when there is at least one outstanding clarification
  - `clarifications[]` — targeted questions to ask the user, e.g.
    `{ "field": "time", "question": "What time is dinner with Priya?", "reason": "low_confidence" }`
  - all the extracted fields (`title`, `date`, `time`, `location`, `participants`, etc.)
- `summary_text` — chat-ready preview of what was found; send this verbatim, do not rephrase
- `screenshot_summary`, `parse_notes` — context
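
To make the shape concrete, here is a hedged sketch that loads a payload and collects the outstanding questions. The payload values are invented for illustration; only the field names come from the list above, and `pending_questions` is a hypothetical helper.

```python
import json

# Illustrative payload: field names follow the list above, values are invented.
RAW = """
{
  "success": true,
  "no_actionable_content": false,
  "actionable_confidence": 0.93,
  "needs_actionable_confirmation": false,
  "items": [
    {
      "type": "event",
      "destination": "calendar",
      "type_confidence": 0.9,
      "title": "Dinner with Priya",
      "date": "2026-03-01",
      "time": null,
      "confidences": {"title": 0.95, "date": 0.88, "time": 0.0},
      "needs_confirmation": true,
      "clarifications": [
        {"field": "time",
         "question": "What time is dinner with Priya?",
         "reason": "missing"}
      ]
    }
  ],
  "summary_text": "Found 1 event: Dinner with Priya"
}
"""


def pending_questions(result: dict) -> list:
    """Collect (title, question) pairs for items that still need confirmation."""
    questions = []
    for item in result.get("items", []):
        if item.get("needs_confirmation"):
            for clarification in item.get("clarifications", []):
                questions.append((item.get("title"), clarification["question"]))
    return questions


result = json.loads(RAW)
```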

### Step 3: Handle the Output

**If `no_actionable_content` is true:**
Silently ignore the screenshot — or, if the user clearly meant for scrask to act on it,
reply with the `summary_text` field (which is a polite "couldn't find anything" message).

**If `success` is true:**
Send the `summary_text` value back to the user on the same chat surface. Then process each item.

### Step 4: Route Each Item to a Destination Skill

Check the top-level gate once, then handle each item in `items[]`:

**If `needs_actionable_confirmation: true` (top level):**
Send `summary_text` (which already opens with "Is this actually an event or task?") and wait for
the user. On "yes", proceed item-by-item below. On "no", reply "Got it, skipped ✓" and stop.

**For each item — if `needs_confirmation: false` (no outstanding clarifications):**
Invoke the appropriate destination skill **without** asking the user first.

- `destination: "calendar"` → invoke the user's installed calendar skill. Preference order:
  `calctl` → `accli` → `apple-calendar` → `brainz-calendar` → `gcal-pro` → first available.
- `destination: "task"` → invoke the user's installed task skill. Preference order:
  `apple-reminders` → `things-mac` → `notion` → first available.

Pass the item fields (`title`, `date`, `time`, `end_time`, `end_date`, `location`, `participants`,
`description`, `recurrence`, `online_link`, etc.) to whatever creation command that skill exposes.
If `end_date` is present and different from `date`, treat the item as a multi-day event.
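
A minimal sketch of the selection rule, assuming the agent can supply `installed`, a list of the installed skills able to handle the given destination (how OpenClaw exposes that list is not covered here). `pick_skill` and `is_multi_day` are hypothetical helper names.

```python
# Preference orders copied from the step above.
PREFERENCE = {
    "calendar": ["calctl", "accli", "apple-calendar", "brainz-calendar", "gcal-pro"],
    "task": ["apple-reminders", "things-mac", "notion"],
}


def pick_skill(destination: str, installed: list):
    """Pick a destination skill: preference order first, then first available."""
    for name in PREFERENCE[destination]:
        if name in installed:
            return name
    # "First available": any other installed skill for this destination.
    return installed[0] if installed else None


def is_multi_day(item: dict) -> bool:
    """True when end_date is present and differs from date."""
    end_date = item.get("end_date")
    return bool(end_date) and end_date != item.get("date")
```

A `None` result corresponds to the "no calendar / task skill installed" edge case.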

**For each item — if `needs_confirmation: true`:**
The `clarifications[]` array lists the specific things to ask. Each entry has:
- `field` — which field needs clarification (e.g. `"time"`, `"date"`, `"type"`)
- `question` — the user-facing question (already pre-formatted with the item title)
- `reason` — `"missing"` (value is null) or `"low_confidence"` (extracted but uncertain) or
  `"low_type_confidence"` (unsure whether this is a calendar event or a task)

The `summary_text` already renders these as a bullet list. Ask the user the questions in order
and patch the corresponding fields with their replies. Once every clarification is resolved,
route the item to the destination skill as above. If the user says **skip** at any point, drop
the item and confirm "Got it, skipped ✓".

For the special case of `field: "type"`, the user's reply determines whether the item routes to
`calendar` or `task` — update `destination` accordingly before dispatch.
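
The patching step can be sketched as below. This is an assumption-laden illustration: `apply_reply` is a hypothetical helper, and the keyword matching used to interpret a `type` reply ("task" or "reminder" versus "calendar" or "event") is invented for the example, not taken from Scrask.

```python
def apply_reply(item: dict, field: str, reply: str):
    """Return a patched copy of the item, or None if the user said skip."""
    answer = reply.strip()
    if answer.lower() == "skip":
        return None

    patched = dict(item)
    if field == "type":
        # Assumption: simple keyword matching decides the destination.
        lowered = answer.lower()
        if "task" in lowered or "reminder" in lowered:
            patched["destination"] = "task"
        elif "calendar" in lowered or "event" in lowered:
            patched["destination"] = "calendar"
        else:
            raise ValueError(f"cannot tell calendar from task in reply: {reply!r}")
    else:
        patched[field] = answer

    # Drop the resolved clarification and recompute the flag.
    remaining = [c for c in item.get("clarifications", []) if c["field"] != field]
    patched["clarifications"] = remaining
    patched["needs_confirmation"] = bool(remaining)
    return patched
```

Returning a copy keeps the original item intact, so a later "skip" can discard the partial edits without side effects.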

### Step 5: Confirm Saves

After each destination skill returns, relay a one-line confirmation to the user. Examples:

- `📅 Added to Calendar via calctl: **Team Standup** — 2026-03-01 at 09:00`
- `🔔 Added to Reminders: **Pay electricity bill** (due 2026-02-28)`
- `✅ Added to Things: **Send Sandip my resume**`

If the destination skill errors, surface the error and ask whether to retry with a different destination.

## Edge Cases

| Scenario | Behavior |
|---|---|
| Single screenshot has both an event and a task | Process each independently; route to its own destination. |
| Event implies a prep step (e.g. dinner at a restaurant → book table) | The parser emits BOTH an event and a prep reminder. Inferred fields on the prep reminder land in the 0.65–0.80 band, so most prep reminders hit `needs_confirmation: true` with targeted clarifications (typically `time` and `date`). |
| Multi-day event (trip, conference) | `end_date` is set and differs from `date`. Pass both to the calendar skill (e.g. `calctl add --date --end-date --all-day`). |
| Rescheduled / cancelled event | Parser extracts the NEW date; `parse_notes` flags it as a reschedule. Confirm with user before overwriting any existing entry. |
| Screenshot is in Hindi, Tamil, or another language | Title and description are already in English; `language` holds the ISO code. Save as-is. |
| Recurring event ("every Monday") | Pass `recurrence` and `recurrence_day` to the calendar skill. |
| Date has already passed | Flag in the reply: "⚠️ This date has already passed. Save anyway?" |
| Screenshot of someone's calendar | `already_in_calendar_hint: true` — reply: "Looks like this is already in your calendar 🗓️" and skip. |
| No calendar / task skill installed | Reply with the missing-skill hint and stop. |
| Zoom/Meet link found | Pass `online_link` to the calendar skill; it should set both location and description. |
| Meme / non-actionable screenshot | `no_actionable_content: true` — ignore silently unless user clearly asked for action. |

## Configuration

```json
{
  "skills": {
    "entries": {
      "scrask-bot": {
        "enabled": true,
        "env": {
          // Both keys are OPTIONAL in v4.2+. Without either, Scrask uses
          // OpenClaw's configured vision LLM via the platform-injected
          // OPENCLAW_VISION_* env vars. Setting GEMINI_API_KEY opts into
          // the cheap+fast Gemini routing. Setting ANTHROPIC_API_KEY adds
          // Claude as a fallback (or as the primary if no Gemini key).
          "GEMINI_API_KEY": "AIza-your-gemini-key",
          "ANTHROPIC_API_KEY": "sk-ant-your-key-here"
        },
        "config": {
          "vision_provider": "auto",
          "fallback_threshold": 0.60,
          "timezone": "Asia/Kolkata",
          "confidence_threshold": 0.75,
          "actionable_threshold": 0.70,
          "type_threshold": 0.70,
          "field_threshold": 0.70
        }
      }
    }
  }
}
```

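Both keys are optional. Without `ANTHROPIC_API_KEY`, auto mode runs Gemini only (provided `GEMINI_API_KEY` is set); with neither key, it uses OpenClaw's configured vision LLM.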

## Permissions Required

- `image:read` — to access the screenshot from the chat surface.
- `network:outbound` — to call the vision model API (OpenClaw's configured vision LLM, or Gemini and optionally Claude when their keys are set).
- `chat:reply` — to send confirmation messages back via the user's chat surface.
- Whatever permissions the downstream calendar / task skill needs (handled by that skill).

related skills

semantically similar in the cross-vendor index

Calendar Extractor (clawhub, 67% match): Periodically scans recent transcripts to extract calendar events and sends a daily summary of meetings to your iOS chat via push notifications.


restructured into implexa's 6-component format with explicit decision trees for routing logic, api fallbacks, error handling, multi-item processing, and edge cases. clarified vision provider auto-routing, destination skill selection, and confirmation gates. added timeout, rate limit, and auth error handling.

Scrask Bot

intent

scrask parses screenshots sent via any chat surface (telegram, imessage, slack, etc.) for actionable events and tasks using a vision llm. when the user sends an image, the skill extracts calendar events, reminders, and tasks, then emits structured intent for the agent to route to the user's installed calendar or task management skills. use scrask when you want to turn a screenshot of a calendar invite, meeting notes, or to-do list into actual entries in your calendar or task manager without manual re-entry.

inputs

required:

image attachment (jpeg, png, webp, gif) from any connected chat surface
python3 binary available on the system
iana timezone string for fallback date parsing (default: UTC)

external connections:

vision llm provider (one of: openclaw platform-configured, gemini api, anthropic claude)
destination skill for calendar (calctl, accli, apple-calendar, brainz-calendar, gcal-pro, or any installed calendar skill)
destination skill for tasks (apple-reminders, things-mac, notion, or any installed task skill)

optional environment variables:

GEMINI_API_KEY (string): enables gemini-first routing in auto mode and is required if you pin --provider gemini. recommended for cost and speed.
ANTHROPIC_API_KEY (string): enables claude fallback in auto mode when gemini confidence is low, and is required if you pin --provider claude.

platform-injected environment variables (used in auto mode if no gemini or anthropic keys):

OPENCLAW_VISION_PROVIDER (string): the platform's configured vision provider (e.g. openai, vertex, bedrock)
OPENCLAW_VISION_KEY (string): platform vision api credentials
OPENCLAW_VISION_MODEL (string, optional): specific model name

configuration parameters:

vision_provider (string, default: "auto"): routing strategy. "auto" routes by available keys: gemini first with claude fallback if GEMINI_API_KEY is set, else claude only if ANTHROPIC_API_KEY is set, else the openclaw platform llm; "openclaw" always uses the platform llm; "gemini" or "claude" pins a specific provider and requires the matching key
fallback_threshold (number, default: 0.60): per-field confidence floor for auto mode. if any field drops below this, claude reruns the parse
timezone (string, default: "UTC"): iana timezone for date inference when the screenshot contains no explicit timezone
confidence_threshold (number, default: 0.75): legacy aggregate gate; kept for backward compatibility
actionable_threshold (number, default: 0.70): top-level gate: is the screenshot actually an event or task? below this triggers confirmation
type_threshold (number, default: 0.70): per-item gate: is this a calendar event or task? below this confidence triggers type clarification
field_threshold (number, default: 0.70): per mandatory field gate. below this confidence, or if the value is null, the parser emits a targeted clarification question

procedure

step 1: acknowledge immediately

on receiving an image, reply on the user's current chat surface within 2 seconds:

📸 Got it, analyzing your screenshot...

do not wait for parsing to complete. the user needs to know the skill is active.

step 2: extract the image path and validate

obtain the temporary file path where the chat platform has stored the image
confirm the file exists and is readable
confirm the file size is under 10 mb (gemini and claude both have size limits; openclaw may vary by platform)
if the file is too large, reply "screenshot is too large to parse. please resize and resend" and stop
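
the validation checks above can be sketched as follows. this is a minimal sketch: `validate_image` is a hypothetical helper, the "unreadable" message is invented for the example, and treating 10 mb as 10 * 1024 * 1024 bytes is an assumption.

```python
import os

MAX_BYTES = 10 * 1024 * 1024  # assumption: "10 mb" means 10 MiB
TOO_LARGE_MESSAGE = "screenshot is too large to parse. please resize and resend"
UNREADABLE_MESSAGE = "could not read the screenshot. please resend"  # hypothetical


def validate_image(path: str, max_bytes: int = MAX_BYTES):
    """Return None if the image is usable, else a user-facing error message."""
    if not os.path.isfile(path) or not os.access(path, os.R_OK):
        return UNREADABLE_MESSAGE
    if os.path.getsize(path) > max_bytes:
        return TOO_LARGE_MESSAGE
    return None
```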

step 3: run the parser script

execute the python3 scrask parser:

python3 {baseDir}/scripts/scrask_bot.py \
  --image-path "<path-to-temp-image>" \
  --provider "$CONFIG_VISION_PROVIDER" \
  --timezone "$CONFIG_TIMEZONE" \
  --confidence-threshold "$CONFIG_CONFIDENCE_THRESHOLD" \
  --actionable-threshold "$CONFIG_ACTIONABLE_THRESHOLD" \
  --type-threshold "$CONFIG_TYPE_THRESHOLD" \
  --field-threshold "$CONFIG_FIELD_THRESHOLD"

the script reads all api credentials from environment variables. never pass keys on the command line.
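
one way an agent might launch the parser from python, keeping keys out of the argument list. this is a sketch under assumptions: `build_command` and `run_parser` are hypothetical helpers, `base_dir` stands in for {baseDir}, the 30-second default timeout is illustrative, and the script is assumed to print its json result on stdout.

```python
import json
import os
import subprocess


def build_command(base_dir: str, image_path: str, config: dict) -> list:
    """Build the argv list; API keys are deliberately absent from it."""
    return [
        "python3", os.path.join(base_dir, "scripts", "scrask_bot.py"),
        "--image-path", image_path,
        "--provider", config["vision_provider"],
        "--timezone", config["timezone"],
        "--confidence-threshold", str(config["confidence_threshold"]),
        "--actionable-threshold", str(config["actionable_threshold"]),
        "--type-threshold", str(config["type_threshold"]),
        "--field-threshold", str(config["field_threshold"]),
    ]


def run_parser(base_dir: str, image_path: str, config: dict,
               timeout: float = 30.0) -> dict:
    """Run the parser; the child process inherits the environment, keys included."""
    completed = subprocess.run(
        build_command(base_dir, image_path, config),
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    # Assumption: the script prints its JSON result on stdout.
    return json.loads(completed.stdout)
```

passing a list to `subprocess.run` (instead of a shell string) also avoids quoting problems with image paths that contain spaces.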

in auto mode, the script checks credentials in order:

if GEMINI_API_KEY is set, route to gemini first. if gemini returns low confidence (below fallback_threshold on any field), retry with claude (if ANTHROPIC_API_KEY is set)
else if ANTHROPIC_API_KEY is set, route to claude only
else use OPENCLAW_VISION_PROVIDER, OPENCLAW_VISION_KEY, and OPENCLAW_VISION_MODEL from platform environment

step 4: parse the script output

the script returns json with these top-level fields:

success (boolean): whether parsing completed without error
no_actionable_content (boolean): true if the screenshot contains no actionable event or task
actionable_confidence (number 0.0-1.0): how confident the parser is that there is something actionable
needs_actionable_confirmation (boolean): true if actionable_confidence falls below actionable_threshold (the uncertain band); user confirmation is required before routing
items (array): one entry per detected event or task with:
- type (string): "event" or "task"
- destination (string): "calendar" or "task"
- confidence (number): legacy aggregate score
- type_confidence (number 0.0-1.0): confidence that this is a calendar event vs. a task
- confidences (object): per-field scores: title, date, time, location, participants, description, priority, end_date, end_time, recurrence, online_link
- needs_confirmation (boolean): true if there are outstanding clarifications
- clarifications (array): objects with field (e.g. "time", "date", "type"), question (user-facing prompt), and reason ("missing" or "low_confidence" or "low_type_confidence")
- extracted fields: title, date, time, end_date, end_time, location, participants, description, recurrence, recurrence_day, online_link, priority, language, already_in_calendar_hint, is_rescheduled
summary_text (string): chat-ready preview of what was found; send this verbatim to the user
screenshot_summary (string): internal context (date range, language, etc.)
parse_notes (string): flags and context (e.g. "reschedule detected", "prep task inferred")

step 5: handle no actionable content

if no_actionable_content is true:

if the user explicitly invoked scrask (used an alias like "scrask this"), reply with the summary_text (which provides a polite "nothing found" message)
if the user sent the image implicitly (no alias), silently ignore it and do not reply

step 6: handle top-level confirmation gate

if needs_actionable_confirmation is true (actionable_confidence is uncertain):

send summary_text to the user; it already opens with "is this actually an event or task?"
wait for explicit user confirmation (yes, no, skip)
on "yes": proceed to step 7
on "no" or "skip": reply "got it, skipped ✓" and stop

step 7: route each item to clarification or destination

for each item in items[]:

if needs_confirmation is false (no clarifications outstanding):

skip to step 8

if needs_confirmation is true:

extract the clarifications array
for each clarification, send the question field to the user and wait for a reply
for the special case of field: "type" (unsure if event or task), the user's reply determines destination: update it to "calendar" or "task"
patch the corresponding field in the item with the user's reply
once all clarifications are resolved, proceed to step 8
if the user says "skip" at any point, drop this item and confirm "got it, skipped ✓"

step 8: invoke the destination skill

determine which destination skill is installed and preferred:

for destination: "calendar":

preference order: calctl, accli, apple-calendar, brainz-calendar, gcal-pro, first available
check if the skill is installed; if none are found, reply "no calendar skill installed. install one of: calctl, accli, apple-calendar, brainz-calendar, gcal-pro" and stop
invoke the skill's create command with the item fields: title, date, time, end_time, end_date, location, participants, description, recurrence, recurrence_day, online_link
if end_date is present and differs from date, set the event as multi-day (pass --all-day flag if available, or let the skill infer all-day from missing end_time)
wait for the skill to return success or error

for destination: "task":

preference order: apple-reminders, things-mac, notion, first available
check if the skill is installed; if none are found, reply "no task skill installed. install one of: apple-reminders, things-mac, notion" and stop
invoke the skill's create command with: title, date (due date), time, description, priority
wait for the skill to return success or error

step 9: confirm the save

after the destination skill returns:

on success:

send a one-line confirmation to the user. examples:
- "📅 added to calendar via calctl: team standup, 2026-03-01 at 09:00"
- "🔔 added to reminders: pay electricity bill (due 2026-02-28)"
- "✅ added to things: send sandip my resume"

on error from the destination skill:

surface the error message to the user
ask "retry with a different destination skill, or skip?"
on "retry": go back to step 8 and pick the next skill in the preference order
on "skip": confirm "got it, skipped ✓" and move to the next item

step 10: clean up and finish

after all items are processed (routed or skipped):

delete the temporary image file from disk
confirm to the user that parsing is complete (one summary line)

decision points

implicit vs. explicit invocation

when a user sends a message with an image, scrask is invoked in two steps:

check explicit aliases first. if the message begins with (case-insensitive, optional @ or / prefix): "scrask", "scrask this", "screenshot", or "screenshot to calendar", route to scrask immediately
if no alias matches, the platform checks implicit trigger conditions: the message contains an image, the image appears to be a screenshot (not a photo of a person, place, food, or scenery), and no other skill has claimed it. if all three are true, route to scrask implicitly

explicit invocation with no image

if the user says "scrask" or "screenshot to calendar" but attaches no image, reply "please attach a screenshot" and stop. do not proceed.

gemini vs. claude routing in auto mode

if vision_provider is "auto":

if GEMINI_API_KEY is set, send the image to gemini and get back a parse with per-field confidence scores
if any field score falls below fallback_threshold and ANTHROPIC_API_KEY is also set, immediately retry the entire parse with claude
if only ANTHROPIC_API_KEY is set (no gemini key), skip gemini and use claude only
if neither key is set, use the platform-injected openclaw vision credentials

if vision_provider is "openclaw", "gemini", or "claude", use only that provider. if the chosen provider's credentials are missing, reply "vision provider {provider} not configured. check your env vars" and stop.
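
the fallback decision in auto mode can be sketched like this. `should_fallback` is a hypothetical helper; checking every item's `confidences` map and ignoring null scores are assumptions about details the text leaves open.

```python
def should_fallback(items: list, fallback_threshold: float = 0.60,
                    has_claude_key: bool = True) -> bool:
    """True if any per-field score in any item is below the floor."""
    if not has_claude_key:
        # Without ANTHROPIC_API_KEY there is nothing to fall back to.
        return False
    scores = [
        score
        for item in items
        for score in item.get("confidences", {}).values()
        if score is not None
    ]
    return any(score < fallback_threshold for score in scores)
```

a score exactly equal to the threshold does not trigger a rerun here, since the rule is stated as "falls below".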

no actionable content (implicit vs. explicit context)

if no_actionable_content is true and the image was sent implicitly (no alias), silently ignore it.

if no_actionable_content is true and the user explicitly invoked scrask (used an alias), send summary_text so the user knows the skill ran and found nothing.

actionable confirmation gate

if needs_actionable_confirmation is true, send summary_text (which opens with "is this actually an event or task?") and wait. only proceed if the user says "yes". any other response (no, skip, or silence after 30 seconds) means drop the screenshot.

type clarification (event vs. task)

if a single item has field: "type" in clarifications (low type_confidence), ask the user "is this a calendar event or a task?" and update destination accordingly ("calendar" or "task"). this determines which skill to route to in step 8.

multi-item handling

if a single screenshot yields both an event and a task, process them independently:

each item goes through its own clarification loop (if needed)
each item routes to its own destination skill (calendar or task)
send separate confirmations for each

date-has-passed check

after parsing, if any item has a date field in the past (earlier than the current date in the user's timezone), ask "this date has already passed. save anyway?" before routing. allow the user to correct the date, skip, or proceed anyway.
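
a small sketch of the date check, assuming item dates are iso `YYYY-MM-DD` strings (as in the confirmation examples) and python 3.9+ for `zoneinfo`. `today_in` and `date_has_passed` are hypothetical helpers.

```python
from datetime import date, datetime
from zoneinfo import ZoneInfo


def today_in(timezone: str) -> date:
    """Current calendar date in the user's IANA timezone."""
    return datetime.now(ZoneInfo(timezone)).date()


def date_has_passed(item_date: str, today: date) -> bool:
    """True if an ISO 'YYYY-MM-DD' item date is strictly before today."""
    return date.fromisoformat(item_date) < today
```

a caller would combine the two as `date_has_passed(item["date"], today_in(config_timezone))`, so that "today" is evaluated in the user's timezone and not the server's.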

rescheduled event

if is_rescheduled is true on the item (or parse_notes flags a reschedule), ask the user "this looks like a rescheduled event. should i update an existing entry?" before creating. this prevents duplicate entries.

already-in-calendar flag

if already_in_calendar_hint is true, reply "looks like this is already in your calendar 🗓️" and skip without routing.

no destination skill installed

if the item routes to "calendar" but no calendar skill is installed, reply "no calendar skill found. install one of: calctl, accli, apple-calendar, brainz-calendar, gcal-pro" and stop.

if the item routes to "task" but no task skill is installed, reply "no task skill found. install one of: apple-reminders, things-mac, notion" and stop.

destination skill error handling

if the destination skill returns an error (timeout, auth fail, rate limit, network error), surface the error to the user and ask "retry with a different destination, or skip?" on retry, try the next skill in the preference order. if all skills fail or are unavailable, ask to skip.

network timeout

if the vision api call times out after 30 seconds, reply "parsing took too long. please try again" and stop. do not retry automatically.

api rate limit

if the vision api returns a 429 (rate limit), reply "i'm hitting rate limits. try again in a few minutes" and stop.

image too large

if the image is over 10 mb, reply "screenshot is too large. please resize and resend" and stop.

language detection

if the parser detects a non-english language in the screenshot (field language), the title and description are already translated to english. save as-is. include a note in the confirmation: "note: translated from {language}" if helpful.

prep task inference

if the parser infers a prep task from an event (e.g. dinner at a restaurant implies "book table"), two items are emitted: the event and the reminder. the prep reminder typically has low field confidence (0.65-0.80 range) and will hit needs_confirmation: true. ask the standard clarifications before routing.

recurring event

if recurrence is set (e.g. "every monday") and recurrence_day is populated, pass both to the calendar skill's create command.

multi-day event

if end_date is present and differs from date, the event spans multiple days. pass both --date and --end-date to the calendar skill, or set --all-day flag if the skill supports it.

online link (zoom, meet, etc.)

if online_link is detected, pass it to the calendar skill. the skill should set it in both the location and description fields if possible.

output contract

success path:

user receives immediate ack: "📸 Got it, analyzing your screenshot..."
parser returns json with success: true
if no_actionable_content is true (and implicit invocation), no further output
if no_actionable_content is true (and explicit invocation), send summary_text
if no_actionable_content is false and needs_actionable_confirmation is true, send summary_text and await user confirmation
for each item with needs_confirmation: true, send each clarification question from the clarifications array, collect user replies, and patch the item
for each item (clarified or not), invoke the destination skill and await success/error
on success from destination skill, send one-line confirmation: "{emoji} added to {destination} via {skill-name}: {title}, {date/time}"
on error from destination skill, surface error message and offer retry or skip
after all items processed, send final summary: "✅ done parsing your screenshot"

error path:

parsing fails (network error, api down, invalid image, etc.): "something went wrong parsing your screenshot. try again in a moment"
vision api timeout: "parsing took too long. please try again"
vision api rate limit: "i'm hitting rate limits. try again in a few minutes"
image too large: "screenshot is too large. please resize and resend"
no destination skill installed: "no {calendar|task} skill installed. install one of: {list}"
destination skill error: "failed to save to {skill}: {error message}. retry or skip?"
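
the parser-side error messages above could be selected with a small mapping. this is only a sketch: `user_message` is a hypothetical helper, and detecting a timeout via python's built-in `TimeoutError` and a rate limit via an http 429 status code are assumptions about how the caller observes those failures.

```python
from typing import Optional

# Message texts copied from the error path above.
MESSAGES = {
    "timeout": "parsing took too long. please try again",
    "rate_limit": "i'm hitting rate limits. try again in a few minutes",
    "generic": "something went wrong parsing your screenshot. try again in a moment",
}


def user_message(exc: Optional[BaseException] = None,
                 status_code: Optional[int] = None) -> str:
    """Map a parser failure to the reply the user should see."""
    if isinstance(exc, TimeoutError):
        return MESSAGES["timeout"]
    if status_code == 429:
        return MESSAGES["rate_limit"]
    return MESSAGES["generic"]
```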

data format and retention:

scrask does not persist any data to disk or cache. the skill is stateless.
the temporary image file is deleted after parsing completes.
all structured intent (items, clarifications, confirmations) flows through the agent's memory and is not stored by scrask itself.
the destination skills are responsible for writing to the calendar or task store.

outcome signal

user confirms the skill worked:

immediate reply "📸 Got it, analyzing..." within 2 seconds of sending the image shows the skill is active
summary_text is sent (chat-ready preview of what was found) within 10 seconds
if clarifications are needed, the user is asked specific questions (not generic "is this right?") with field names and context (e.g. "what time is the team standup?")
after the user replies to clarifications, the event or task appears in their calendar or task app within 5 seconds (depending on destination skill performance)
a one-line confirmation is sent per item: "📅 added to calendar via calctl: team standup, 2026-03-01 at 09:00"
multiple items from a single screenshot are routed independently (e.g. an event and a prep task both show up in the right places)
if the date has passed, the user is warned before save
if the screenshot is already in the user's calendar, scrask politely skips it
if a destination skill fails, the error is surfaced and the user can retry or skip without losing the other items

success without user action required:

if a screenshot has no clarifications needed (all per-field confidence scores are above field_threshold and type_confidence is above type_threshold), the item is routed to the destination skill immediately after the summary is sent. the user sees the confirmation in their chat and the entry in their calendar/task app, with no extra prompts.
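
the no-prompt condition can be illustrated with a sketch of the gate. the parser's real logic is not shown in this document, so treat this as an approximation: `clarification_reasons` is a hypothetical helper, and the default `mandatory` field list is an assumption (the source does not say which fields are mandatory).

```python
def clarification_reasons(item: dict, field_threshold: float = 0.70,
                          type_threshold: float = 0.70,
                          mandatory: tuple = ("title", "date")) -> list:
    """Return (field, reason) pairs; an empty list means no prompts are needed."""
    reasons = []
    if item.get("type_confidence", 0.0) < type_threshold:
        reasons.append(("type", "low_type_confidence"))
    confidences = item.get("confidences", {})
    for field in mandatory:
        if item.get(field) is None:
            reasons.append((field, "missing"))
        elif confidences.get(field, 0.0) < field_threshold:
            reasons.append((field, "low_confidence"))
    return reasons
```

an item whose result is an empty list corresponds to needs_confirmation: false and can be routed straight to its destination skill.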