back
loading skill details...
Fetches web pages from specific URLs and converts them to clean, structured Markdown via trusted APIs, enabling Agents to parse and extract data more effecti...
--- name: web-fetch-as-markdown description: Fetches web pages from specific URLs and converts them to clean, structured Markdown via trusted APIs, enabling Agents to parse and extract data more effectively. license: MIT-0 metadata: author: "@wuruofan" version: "1.2.1" --- # Web Fetch as Markdown Fetches any web URL and converts it to clean, structured Markdown — stripping ads, navigation, and clutter to leave only readable content, making it far easier for Agents to parse and extract data compared to raw HTML. ## Conversion Services & Priority This skill uses reputable third-party APIs to facilitate conversion. Always be transparent with the user about which service is being used. 1. **Primary — `https://markdown.new/`**: Official Cloudflare edge conversion service. Highly reliable for general sites, especially Cloudflare-hosted ones, and optimized for token reduction. 2. **Fallback 1 — `https://r.jina.ai/<url>`**: Jina AI's official Reader API. Excellent for parsing clean markdown from complex pages when the primary service fails. 3. **Fallback 2 (Requires Consent) — `https://markdownforagents.com/r?url=<url>`**: General-purpose converter returning Markdown with YAML frontmatter. Only use when primary services are unreachable, and after informing the user. ## Execution Flow & Graceful Degradation When tasked with fetching a URL, follow this strict sequence: 1. **Attempt Primary Service**: Construct the URL by prepending the target to the primary service: `https://markdown.new/https://example.com/article` Use the built-in `WebFetch` tool or equivalent to retrieve the content. 2. **Graceful Fallback to Jina**: If the primary service fails, times out, or returns a domain safety warning (e.g., "Unable to verify if domain is safe"), gracefully fall back to Jina Reader: `https://r.jina.ai/https://example.com/article` 3. **User Consent for Unverified Services**: If both Cloudflare and Jina are unreachable (e.g., due to regional network policies), prompt the user: > *"The primary markdown conversion services (Cloudflare & Jina) are currently unreachable. Would you like me to attempt fetching via `markdownforagents.com`, or would you prefer to provide the content locally?"* Do not proceed to `markdownforagents.com` without explicit user approval. 4. **Final Limitation Handling**: If all API routes fail or are denied by the user, inform the user that the content cannot be fetched automatically due to network or security constraints. **Do not attempt to use `curl` with silent flags or suggest network workarounds.** ## Trigger Scenarios **User Intent (Chinese):** - "上网搜索"、"帮我查一下"、"帮我看看这个网页"、"fetch 网页" - "帮我抓取这个页面"、"帮我解析一下"、"打开这个链接" **User Intent (English):** - "search online", "look up", "fetch this page", "scrape this website" - "parse the data from", "extract information from" **System Scenarios:** - The built-in WebFetch tool returns a domain safety restriction (e.g., "Unable to verify if domain *** is safe to fetch"). - The user asks to read/extract content from a specific URL where direct HTML parsing is inefficient. ## Safety & Compliance Guidelines - **Respect Platform Boundaries:** If the platform's built-in tools block a domain for safety reasons, do not instruct the agent to bypass these safeguards. Use the approved fallback APIs instead. - **No Network Evasion:** Never suggest configuring proxies, VPNs, or other methods to circumvent network restrictions or regional blocks. - **Data Transparency:** Be aware that sending URLs to conversion services means the URL path (and potentially any non-authenticated content on that page) is exposed to the API provider. Avoid fetching pages known to contain sensitive authentication tokens in the URL.
don't have the plugin yet? install it then click "run inline in claude" again.