Extract structured data from websites. Use when: collecting competitor pricing; scraping product listings; extracting contact information; gathering research…
Web Scraper
Extract structured data from websites using BeautifulSoup and requests - turn any webpage into usable data.
When to Use This Skill
Competitor research - Scrape pricing, features, positioning
Lead generation - Extract contact info from directories
Content audit - Pull headings, links, metadata
Price monitoring - Track competitor pricing changes
Data collection - Gather research data from multiple sources
What Claude Does vs What You Decide
Claude Does                          You Decide
Structures analysis frameworks       Strategic priorities
Synthesizes market data              Competitive positioning
Identifies opportunities             Resource allocation
Creates strategic options            Final strategy selection
Suggests implementation approaches   Execution decisions
Dependencies
pip install beautifulsoup4 requests pandas click lxml
Commands
Scrape Elements
python scripts/main.py scrape https://example.com --selector "h1,h2,p"
python scripts/main.py scrape https://example.com --selector ".product-price"
Extract Links
python scripts/main.py links https://example.com
python scripts/main.py links https://example.com --internal-only
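To illustrate what a link extractor with an internal-only filter might do, here is a minimal sketch. It is not the code in scripts/main.py: the names LinkCollector and extract_links are hypothetical, and it uses the standard library's html.parser in place of BeautifulSoup so it runs without extra dependencies. It also assumes "internal" means "same host as the page URL".

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect absolute URLs from every <a href="..."> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href:
            # Resolve relative links against the page URL.
            self.links.append(urljoin(self.base_url, href))


def extract_links(html, base_url, internal_only=False):
    """Return links found in html; optionally keep only same-host links."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    if not internal_only:
        return collector.links
    host = urlparse(base_url).netloc
    return [url for url in collector.links if urlparse(url).netloc == host]


# Hypothetical sample markup, used instead of a live request.
sample = '<a href="/pricing">Pricing</a> <a href="https://other.example.org/x">Other</a>'
print(extract_links(sample, "https://example.com/", internal_only=True))
# -> ['https://example.com/pricing']
```

Comparing hosts with urlparse treats subdomains such as blog.example.com as external; whether that matches the behavior of --internal-only is not stated here.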
Extract Emails
python scripts/main.py emails https://example.com
python scripts/main.py emails https://example.com --depth 2
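Email extraction usually comes down to running a pattern over the page text and removing duplicates. The sketch below shows that idea for a single page's text; the regular expression and the extract_emails helper are assumptions for illustration, not the pattern the script actually uses, and the pattern is deliberately simple rather than a full RFC 5322 matcher.

```python
import re

# A pragmatic pattern: local part, "@", domain, and a TLD of 2+ letters.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(text):
    """Return unique addresses in first-seen order, compared case-insensitively."""
    seen = {}
    for match in EMAIL_RE.findall(text):
        seen.setdefault(match.lower(), match)
    return list(seen.values())


# Hypothetical sample text standing in for a downloaded page.
sample = "Contact sales@example.com or Support@Example.com; sales@example.com again."
print(extract_emails(sample))
# -> ['sales@example.com', 'Support@Example.com']
```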
Extract Structured Data
python scripts/main.py structured https://example.com/article --schema article
python scripts/main.py structured https://example.com/product --schema product
Examples
Example 1: Scrape Competitor Pricing
python scripts/main.py scrape https://competitor.com/pricing --selector ".price,.plan-name"
# Output:
# Extracted 6 elements
# 1. Starter - $29/mo
# 2. Pro - $99/mo
# 3. Enterprise - Contact us
Example 2: Extract Article Content
python scripts/main.py structured https://blog.example.com/post --schema article
# Output: article_data.json
# {
# "title": "How to Scale Your Startup",
# "author": "Jane Doe",
# "date": "2024-01-15",
# "content": "...",
# "word_count": 1523
# }
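One way a record with these fields could be assembled with BeautifulSoup is sketched below. The mapping from page elements to fields (an h1 inside article, a meta author tag, a time element) is an assumption about a typical blog layout, not a description of how the article schema is actually implemented, and extract_article is a hypothetical helper. The sample HTML is made up and parsed from a string so that no request is sent.

```python
import json

from bs4 import BeautifulSoup

# Hypothetical page markup; real sites will need different selectors.
html = """
<html><head><meta name="author" content="Jane Doe"></head>
<body><article>
  <h1>How to Scale Your Startup</h1>
  <time datetime="2024-01-15">January 15, 2024</time>
  <p>Start small.</p><p>Measure everything.</p>
</article></body></html>
"""


def extract_article(html):
    """Build an article record; missing elements become None."""
    soup = BeautifulSoup(html, "html.parser")
    title = soup.select_one("article h1")
    author = soup.select_one('meta[name="author"]')
    date = soup.select_one("time[datetime]")
    paragraphs = [p.get_text(" ", strip=True) for p in soup.select("article p")]
    content = "\n".join(paragraphs)
    return {
        "title": title.get_text(strip=True) if title else None,
        "author": author.get("content") if author else None,
        "date": date.get("datetime") if date else None,
        "content": content,
        "word_count": len(content.split()),
    }


print(json.dumps(extract_article(html), indent=2))
```

Guarding each lookup with a None check keeps the extractor from raising on pages that lack one of the elements.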
CSS Selector Reference
Selector        Description     Example
tag             Element type    h1, p, div
.class          Class name      .price, .title
#id             Element ID      #main-content
tag.class       Tag with class  div.product
tag[attr]       Has attribute   a[href]
parent > child  Direct child    ul > li
tag1, tag2      Multiple        h1, h2, h3
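The selectors above can be tried directly with BeautifulSoup's select method. The snippet below is a small sketch against made-up HTML (the markup and the texts helper are hypothetical); it parses a string rather than fetching a page, and uses the built-in html.parser backend so lxml is not required.

```python
from bs4 import BeautifulSoup

# Hypothetical markup that exercises each selector type from the reference.
html = """
<div id="main-content">
  <div class="product"><h2 class="title">Starter</h2><span class="price">$29/mo</span></div>
  <div class="product"><h2 class="title">Pro</h2><span class="price">$99/mo</span></div>
  <ul><li><a href="/docs">Docs</a></li><li><a>No link</a></li></ul>
</div>
"""
soup = BeautifulSoup(html, "html.parser")


def texts(selector):
    """Return the stripped text of every element matching a CSS selector."""
    return [el.get_text(strip=True) for el in soup.select(selector)]


print(texts(".price"))            # class          -> ['$29/mo', '$99/mo']
print(texts("div.product > h2"))  # direct child   -> ['Starter', 'Pro']
print(texts("a[href]"))           # has attribute  -> ['Docs']
print(texts("ul > li"))           # direct child   -> ['Docs', 'No link']
print(texts(".title, .price"))    # multiple, returned in document order
```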
Ethical Scraping Guidelines
Check robots.txt - Respect the site's scraping policy
Rate limit - Don't overload servers (1-2 requests per second)
Identify yourself - Use a descriptive User-Agent
Cache requests - Don't re-scrape unchanged pages
Terms of Service - Check whether scraping is allowed
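Two of these guidelines, honoring robots.txt and rate limiting, can be sketched with the standard library alone. The example below is illustrative only: the USER_AGENT value, the RateLimiter class, and the sample robots.txt text are hypothetical, and the rules are parsed from a string so that no network request is made.

```python
import time
from urllib.robotparser import RobotFileParser

USER_AGENT = "example-research-bot"  # hypothetical; use a name that identifies you


def build_robots(robots_txt):
    """Parse robots.txt text that has already been downloaded."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


class RateLimiter:
    """Ensure at least min_interval seconds pass between calls to wait()."""

    def __init__(self, min_interval):
        self.min_interval = min_interval
        self._last = None

    def wait(self):
        now = time.monotonic()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                time.sleep(remaining)
        self._last = time.monotonic()


robots = build_robots("User-agent: *\nDisallow: /private/\n")
limiter = RateLimiter(min_interval=0.5)  # at most 2 requests per second

for url in ["https://example.com/pricing", "https://example.com/private/report"]:
    if not robots.can_fetch(USER_AGENT, url):
        print("skipping (disallowed by robots.txt):", url)
        continue
    limiter.wait()
    print("would fetch:", url)  # a real scraper would send the request here
```

In a real run the robots.txt text would come from the target site, and the same User-Agent string would be sent in the request headers.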
Skill Boundaries
What This Skill Does Well
Structuring strategic analysis
Identifying market opportunities
Creating strategic frameworks
Synthesizing competitive data
What This Skill Cannot Do
Replace market research
Guarantee strategic success
Know proprietary competitor info
Make executive decisions
Related Skills
competitor-monitor - Monitor competitor changes
pdf-extractor - Extract from PDFs
Skill Metadata
Mode: centaur
category: automation
subcategory: data-extraction
dependencies: [beautifulsoup4, requests, pandas]
difficulty: intermediate
time_saved: 5+ hours/week