Conduct structured web research by searching, fetching, and synthesizing information into reports with citations and source verification.
# Web Research Skill
**Version:** 2.1.0
**Author:** Claw ๐ฆพ
**Purpose:** Generate structured research reports with source citations, quality scoring, and automated follow-ups.
---
## Overview
The web-research skill automates end-to-end research: parse question โ generate diverse queries โ search โ fetch โ follow-up โ deduplicate โ synthesize โ report.
**Key improvements over v1:**
- **Automated follow-up queries** โ 2 rounds of follow-ups based on initial findings
- **Quality scoring** โ each source scored (0-1) on content depth, URL, title, date
- **Source deduplication** โ remove duplicate sources, keep the most detailed
- **Batch research mode** โ process multiple topics in one session
- **Multiple output formats** โ markdown (default), JSON, HTML
- **Topic extraction** โ intelligent keyword extraction from natural language questions
---
## How to Use
### Basic Usage
```bash
# Single research question
python3 scripts/research.py "What is the state of AI regulation in the EU for 2026?"
# With more follow-up rounds
python3 scripts/research.py --followups 5 "Market analysis for renewable energy in Czech Republic"
# JSON output
python3 scripts/research.py --format json "Cryptocurrency regulation 2026"
# HTML output
python3 scripts/research.py --format html "Competition in cloud computing market"
# Custom source limit
python3 scripts/research.py --sources 15 "Best pricing for SaaS tools small business"
```
### Batch Mode
Create a JSON file (`questions.json`):
```json
{
"questions": [
"State of AI regulation in the EU for 2026",
"Best SaaS tools for small business automation",
"Cryptocurrency regulation trends 2026"
]
}
```
Then run:
```bash
python3 scripts/research.py --batch questions.json
```
---
## Pipeline Steps
### Step 1: Parse Question
Extract meaningful topic keywords from natural language question. Removes stop words, keeps entities and key terms.
### Step 2: Generate Queries
Create 5 diverse query variants:
- Exact match
- Broad match
- Time-aware (2025/2026)
- Analytical
- Market data focused
### Step 3: Execute Searches
Run web_search for each query variant. Collect results with title, URL, snippet.
### Step 4: Fetch Content
Use web_fetch to extract content from top URLs. Store full text for synthesis.
### Step 5: Follow-up Queries (v2)
Based on initial findings, generate 2 rounds of follow-up searches:
- Look for emerging themes in findings
- Add time-aware follow-ups
- Fill information gaps
- Increase coverage and accuracy
### Step 6: Deduplicate & Score
Remove duplicate sources by URL. Score each source (0-1) based on:
- Has URL (+0.2), has title (+0.15), has details (+0.3)
- Content length > 100 chars (+0.2), has date (+0.15)
### Step 7: Synthesize & Report
Combine findings into structured report with:
- Executive summary
- Numbered key findings with quality tags
- Quality assessment table
- Limitations and methodology
- Source citations
---
## Report Formats
### Markdown (default)
Rich text with headings, tables, bullet lists. Suitable for reading and sharing.
### JSON
Structured data output. Suitable for programmatic processing, APIs, dashboards.
### HTML
Self-contained styled report. Suitable for web viewing, email attachments.
---
## Output Files
Reports saved to: `workspace/research/web-research-YYYY-MM-DD-<topic>.md`
JSON reports: `workspace/research/web-research-YYYY-MM-DD-<topic>.json`
HTML reports: `workspace/research/web-research-YYYY-MM-DD-<topic>.html`
---
## Quality Rules
1. **Cross-reference** โ at least 2 sources per major claim
2. **Flag outdated info** โ >2 years old for fast-moving topics
3. **Distinguish opinion vs data** โ clearly mark analytical content
4. **Cite every source** โ URL for every factual claim
5. **Note conflicts** โ when sources disagree, document both views
6. **Score sources** โ low-quality sources flagged in report
---
## Skill Dependencies
- `web_search` โ search the web via SearXNG
- `web_fetch` โ fetch and extract content from URLs
- `write` โ generate and save reports
- `exec` โ run pipeline scripts
---
## Pricing
| Tier | Price | Description |
|------|-------|-------------|
| Single report | โฌ25-50 | One research question, full pipeline |
| Batch research | โฌ50-100 | Multiple questions (up to 5) |
| Deep dive | โฌ75-150 | Extended follow-ups, expert sources |
| Retainer | โฌ100-300/mo | Ongoing research, weekly reports |
---
## File Structure
```
web-research/
SKILL.md โ This file
scripts/
research.py โ Research pipeline v2.1.0
references/
synthesis-framework.md โ How to synthesize findings
report_template.md โ Standard report structure
search-strategies.md โ Query generation best practices
```
---
## Version History
| Version | Date | Changes |
|---------|------|---------|
| 1.0.0 | 2026-04-19 | Initial release |
| 2.0.0 | 2026-04-27 | Follow-up queries, quality scoring, batch mode, multiple formats |
| 2.1.0 | 2026-04-27 | HTML output, improved topic extraction, deduplication |
don't have the plugin yet? install it then click "run inline in claude" again.