Name: MagicBrowse
Availability: InStock
Browser automation fallback through the magicbrowse CLI with goal-driven act as the default primitive and observe/primitives only for recovery on real web pa...
SKILL.md

---
name: magicbrowse
description: Browser automation fallback through the magicbrowse CLI with
  goal-driven act as the default primitive and observe/primitives only for
  recovery on real web pages.
homepage: https://www.npmjs.com/package/@mercuryo-ai/magicbrowse-cli
metadata: {"openclaw":{"homepage":"https://github.com/MercuryoAI/skills/blob/main/docs/magicbrowse/openclaw/marketplace/README.md","requires":{"bins":["magicbrowse"]},"primaryEnv":"MAGICPAY_API_KEY","install":[{"id":"npm","kind":"node","package":"@mercuryo-ai/magicbrowse-cli@latest","bins":["magicbrowse"],"label":"Install MagicBrowse CLI (npm)"}]}}
---

Use `magicbrowse` to reach a target page when your own browser tooling
cannot do it reliably. The planner runs two LLM loops per task and is
slower than direct browser control; prefer your own tools when they
suffice. Use `magicbrowse` to *reach* a target page (search, navigation,
traversal through non-sensitive screens). At any login, identity, checkout,
donation, subscription, payment, or human-verification page, stop and surface
to the user — do not invent or type credentials, identity data, payment data,
or any value you do not legitimately have.

For a MagicPay product/payment workflow, use the MagicPay workflow-first
recipe instead of treating a standalone MagicBrowse browser as the product
parent: MagicPay starts the product session, then launches or attaches the
browser as a child resource.

## Fallback Ladder

Try in order. Do not start at layer 4 just because primitives exist.

1. **Your own browser tooling** (Computer Use, native browser tools).
2. **`magicbrowse act "<goal>"`** — DOM-only navigator.
3. **`magicbrowse act "<goal>" --use-vision`** — same goal, navigator
   with screenshots. Use only when the user is comfortable sending
   screenshots/page context for this workflow. Vision is a retry mode
   for the same task; keep the granule.
4. **`magicbrowse observe` + primitives** —
   `click <target-id>`, `type <target-id> <text>`,
   `fill <target-id> <value>`, `select <target-id> <option-text>`,
   `press <keys>`. Use only when vision-mode `act` cannot make
   progress, or when single-element precision is required. `press` is
   global — `click` first if focus matters.
5. **Surface failure to the user.**

## Preferred Pattern

For public navigation tasks, give `act` the semantic goal and a checkable
terminal condition:

✓ `magicbrowse act "navigate to the public page that lists supported regions and stop when the region list is visible"`

Avoid manually replaying snapshot ids before `act` has failed:

✗ `magicbrowse observe` → `magicbrowse click 13` → `magicbrowse observe` → `magicbrowse click 23`

## Setup Check

1. Run `magicbrowse doctor` first on a fresh install. It verifies the
   gateway config and reachability.
2. If it fails because the API key is missing, run
   `magicbrowse init <apiKey>` (sign up at
   `https://agents.mercuryo.io/signup`).
3. Only proceed to `launch` and `act` once `doctor` passes.

## Hard Rules

> **Consequential actions require approval.** `magicbrowse` may
> navigate, inspect, draft, and prepare. It must stop and ask before
> submitting a form, posting or sending content, accepting terms,
> changing account data or settings, booking, buying, ordering,
> deleting or modifying remote data, or otherwise committing an
> irreversible or account-affecting action. After approval, re-run
> `observe` and execute only the approved final action. A successful typed
> MagicPay approval counts for that exact payment, signing, or confirmation
> action; ask again only if the approved page facts changed.

> **Protected data — never invent.** Do **not** use `act`, `type`,
> `fill`, or `select` for any of the following on any page:
> - login or signup credentials (email, username, password, OTP),
> - identity-document fields (passport, ID, KYC address, DOB tied to
>   identity),
> - payment-card or banking fields (PAN, CVV, expiry, IBAN, account),
> - any value sourced from a vault or secret store, or any value you
>   do not legitimately have.
>
> Reach the page, stop before entering protected values, and return the
> handoff to the orchestrator or approved protected-data handler. Do not
> guess, placeholder, or fabricate protected values. Be honest about what
> you cannot do.

> **Use `act` before snapshot primitives.** Do not start MagicBrowse work
> with `observe` plus `click`/`type`/`select`/`press`/`fill` before
> attempting `act` on the same goal. Why: the navigator keeps the goal,
> current page context, and completion check in one planner loop instead of
> spreading them across fragile snapshot ids. Use primitives only after
> DOM-only and vision-mode `act` cannot make progress, or when the recovery
> step is deliberately single-element.

> **Target-ids are snapshot-scoped.** Valid only for the `observe`
> snapshot that produced them. Re-run `observe` after any click, type,
> navigation, popup, or lazy-load before the next primitive — reusing
> an old id silently addresses a different element.
>
> ✓ `observe` → `click 12` → `observe` → `type 7 "hello"`
> ✗ `observe` → `click 12` → `type 7 "hello"`

> **One workflow per default home.** The current-session pointer at
> `$MAGICBROWSE_HOME/current-session.json` (default `~/.magicbrowse/`) is a
> singleton. Concurrent workflows on the same home overwrite each other. For
> parallel use, set a distinct `MAGICBROWSE_HOME` per workflow, or do not run
> the tasks in parallel.

> **Fresh browser by default.** Prefer an owned, fresh browser session.
> Use `attach`, `--profile`, or `--user-data-dir` only when the user
> explicitly approves that browser/session for the current task. Keep
> CDP endpoints private. Close the session before unrelated work.

> **Page context can leave the browser.** LLM-backed `act` sends page
> state to the gateway; `--use-vision` can include screenshots. Avoid
> private pages unless the user approves that workflow, and stop at login,
> identity, checkout, donation, subscription, or payment pages.

## Primary Workflow

Contract: `launch [url] → act … act → close`. Sequential `act` calls in
one session preserve page state and planner memory.

1. `magicbrowse launch <url>` — start a headless owned Chrome session
   pre-placed at the entry URL. Keep browser launches headless unless
   the user explicitly asks for a visible browser or you are doing live
   debugging. To attach to an existing CDP browser instead, first get
   explicit user approval for that endpoint/session:
   `magicbrowse attach <cdp-url-or-ws-endpoint>` (positional, not a
   `--cdp-url` flag).
2. `magicbrowse act "<goal>"` — natural-language browser step. Prompt is
   **positional**. `act` does **not** take `--url`; you cannot reset
   the page from inside `act`. To re-anchor, `close` and `launch` again.
3. Repeat `act` for the next strategic granule.
4. `magicbrowse close` — release the session when the overall
   MagicBrowse-owned browser task is done. If the workflow hands off to
   another tool or the user on a sensitive page, keep the browser open until
   that handoff completes. After the handoff completes, close only a
   MagicBrowse-owned disposable browser that the user is not taking over; do
   not close an external/user-owned attach without explicit approval.

`magicbrowse run` exists in the CLI for one-shot developer use. **It
is not part of this skill contract** — its bundled `close` destroys
continuity. Do not use it in an orchestrated workflow.

## Goal Granularity

1. **Granule = atomic strategic segment.** End each `act` where the
   orchestrator needs the next strategic decision. Tactics (which form
   field first) live inside `act`; strategy (this partner is wrong, try
   another) lives between `act` calls.
2. **Target horizon: 15-30 navigator steps per `act`; smaller is
   safer.** `maxSteps: 100` is a safety ceiling. The planner
   self-validates terminal status, so longer tasks have more room for
   false-positive completion. Prefer smaller granules when the success
   criterion cannot be checked externally.
3. **Auth walls and CAPTCHA are hard boundaries, not obstacles.** A
   task that reaches auth, CAPTCHA, or human verification ends with
   `status: needs_handoff`, not `failed`. Plan tasks to end *at* such
   a wall, not through it. `magicbrowse` does not solve CAPTCHA and
   does not enter credentials. For a confirmed real CAPTCHA on the current
   approved browser session, have the user or an external solver clear it;
   after a successful solve, run `magicbrowse mark-captcha-resolved` before
   the next `act`. Branch on `handoff.kind`: `captcha` means solve/mark,
   `auth` means stop for user authentication, `identity_verification` means
   stop for user/KYC handling, and `protected_form` means hand off to the
   approved protected-data handler. Protected-form handoffs include
   `resumeObjective`; after the approved handler fills the form, continue
   with that page-local objective. Never retry the same `act` against the
   same wall. If the page asks for something you cannot legitimately
   provide, be honest about it.
4. **Rely on session memory; do not re-narrate.** Sequential `act`
   calls in one session preserve page state and planner memory. Do not
   write "as we already found, continue with…" into goals — if you
   feel the need to, the granularity is wrong.

## Goal Formulation

1. **No element indexes or selectors in goal text.** Indexes renumber
   on every DOM scan. Describe elements semantically.
   - ✗ `act "click target 14"`
   - ✓ `act "click the 'Continue' button under the price summary"`
2. **Describe the expected terminal state where it adds a checkable
   criterion.**
   - ✗ `act "get to checkout"`
   - ✓ `act "navigate to a checkout page that shows passenger fields and total fare"`
3. **Pass the starting URL to `launch`, not as a separate step.** To
   switch sites mid-workflow, either `close` and re-`launch`, or
   describe the navigation inside the goal text.

## Common Mistakes

> - Element indexes (`[14]`, `target 7`) in goal text.
> - `magicbrowse run` for orchestrated multi-step workflows.
> - `type` / `fill` / `select` / `act` on protected fields. Stop at
>   the form boundary; if `act` returns a protected-form handoff, send it to
>   the orchestrator or approved protected-data handler and then resume with
>   `handoff.resumeObjective`.
> - Letting `act` submit, post, book, buy, save, delete, or otherwise
>   commit an account-affecting action without explicit approval or a matching
>   typed MagicPay approval for unchanged page facts.
> - Trying to solve CAPTCHA through `magicbrowse`. On a confirmed real
>   CAPTCHA, have the user or an external solver clear it, then
>   `magicbrowse mark-captcha-resolved` before the next MagicBrowse step.
> - Attaching to a logged-in browser or named profile without explicit
>   approval for the current task.
> - Closing a browser that was handed to another tool or the user before the
>   overall task is actually done.
> - Re-narrating prior `act` results into the next goal — sequential
>   `act` calls keep state.
> - Skipping the `act`-first path and starting at layer 4
>   (observe + primitives).
> - Reusing a target-id from before a click, navigation, or popup.

## Status and Errors

`act` returns `status: completed | blocked | needs_handoff |
needs_approval | failed | max_steps | cancelled`. Branch on `status`;
do not parse `finalMessage` to detect missing input, protected-data
handoff, handoff subtype, or approval stops. For `blocked`, branch on
`blockedReason: missing_input | item_unavailable | ambiguous | no_path`.
For `needs_handoff`, branch on
`handoff.kind: protected_form | captcha | auth | identity_verification`.
`finalMessage` is the explanation to show the user or pass upstream.
Protected-form handoff details are in `handoff.resumeObjective`. Exit
code `0` includes `blocked`, `needs_handoff`, and `needs_approval`; it
does not mean success.
See [references/statuses.md](https://github.com/MercuryoAI/skills/blob/main/docs/magicbrowse/references/statuses.md).

## References

- [references/commands.md](https://github.com/MercuryoAI/skills/blob/main/docs/magicbrowse/references/commands.md) — every CLI command.
- [references/workflow.md](https://github.com/MercuryoAI/skills/blob/main/docs/magicbrowse/references/workflow.md) — worked end-to-end
  example.
- [references/guardrails.md](https://github.com/MercuryoAI/skills/blob/main/docs/magicbrowse/references/guardrails.md) — long-form hard
  rules.
- [references/statuses.md](https://github.com/MercuryoAI/skills/blob/main/docs/magicbrowse/references/statuses.md) — outcome codes and
  status handling.
MagicBrowse

SKILL.md

related skills