每日 SEO/GEO 巡逻。覆盖:SERP 关键词排名追踪(DataForSEO)、Google 索引数统计、llms.txt 可达性、 GA4 tag 部署检测、PH canonical 合并修复、社媒关键词雪崩救援(title 重写 + 内链注入)。 当用户说"跑 SEO 日报"、"检查排名"、"某关键词掉了...
---
name: gr-seo-patrol
description: >
每日 SEO/GEO 巡逻。覆盖:SERP 关键词排名追踪(DataForSEO)、Google 索引数统计、llms.txt 可达性、
GA4 tag 部署检测、PH canonical 合并修复、社媒关键词雪崩救援(title 重写 + 内链注入)。
当用户说"跑 SEO 日报"、"检查排名"、"某关键词掉了"、"修 canonical"、"加内链"时调用。
when_to_use: |
Use this skill when you need to: run daily SEO/GEO patrol, track SERP keyword rankings,
fix canonical issues, inject internal links to rescue underperforming posts, check GA4
tag deployment, verify llms.txt accessibility, or diagnose why a keyword ranking dropped.
Trigger phrases: "SEO日报" | "跟踪排名" | "SEO patrol" | "canonical修复" |
"加内链" | "keyword dropped" | "GA4检查" | "llms.txt" | "搜索排名跟踪"
metadata:
author: Iris / Gingiris
version: "0.1.0"
tags:
- seo-audit
- seo-monitoring
- seo-automation
- technical-seo
- site-audit
- broken-links
- schema-markup
- claude-code
- ai-agent-skill
- agent-skill
- latest
---
# gr-seo-patrol — SEO/GEO 日常巡逻
## 什么时候用
| 场景 | 动作 |
|---|---|
| "跑今日 SEO 日报" | 执行完整巡逻(scripts/daily-report.py) |
| "某关键词排名掉了" | 单词诊断(Claude 直接执行下方诊断流程) |
| "修 canonical" | 批量合并(scripts/canonical-fix.py) |
| "救一下这篇文章" | 社媒救援(scripts/rescue-post.py)—— title 前置 + 3 个高权重内链 |
| "检查 GA4 部署" | 扫 HTML 里的 G-XXXXX tag |
| "查 llms.txt" | HTTP 200 + 字节数 |
---
## 核心流程
### 1. 日报模式
输入:无(全自动) 或 指定关键词清单
输出:
- 关键词排名 diff 表(今日 vs 昨日基线)
- Google 索引数
- llms.txt 状态
- GA4 覆盖状态
- 🚨 异常提示(跌出 top 100、整体刷新)
### 2. 单词诊断
输入:1 个关键词
过程:
1. 跑 4-5 个 long-tail 变体 SERP
2. 检查目标 URL 是否 HTTP 200
3. 用 `site:` 查询确认是否索引
4. 对比 3 天前的数据(如果有历史)
输出:雪崩 / 个别 / SERP 刷新 / 死链 四选一
### 3. Canonical 批量修复
输入:同主题的 N 篇文章 + 1 个 master URL
过程:替换每篇 frontmatter 里的 `canonical_url:` → master,GitHub API PUT
安全护栏:
- 只改 `_posts/` 下的 .md
- 每篇 commit 独立(方便回滚)
- 跳过 ja/ko(hreflang 替代,保持 self-canonical)
### 4. 社媒救援
输入:1 个雪崩 URL
动作:
- a. 重写 title:主关键词前置,长度 ≤ 60 字
- b. 从 top-3 高权重文章注入内链(带 anchor text = 目标关键词)
- c. 记录修改前后到日志
---
## API 依赖
| Service | Env var | 用途 |
|---|---|---|
| DataForSEO | `DATAFORSEO_B64` | SERP 查询 |
| GitHub PAT | `GITHUB_TOKEN` | 读写 `_posts/` |
完整 key 模板见 `docs/api-keys-template.md`。
---
## SERP 查询模板
```python
import urllib.request, json
def serp(kw, loc=2840, lang="en", depth=100):
key = os.environ["DATAFORSEO_B64"]
payload = json.dumps([{"keyword": kw, "location_code": loc,
"language_code": lang, "device": "desktop",
"depth": depth}]).encode()
req = urllib.request.Request(
"https://api.dataforseo.com/v3/serp/google/organic/live/advanced",
data=payload,
headers={"Authorization": f"Basic {key}",
"Content-Type": "application/json"})
with urllib.request.urlopen(req, timeout=40) as r:
return json.loads(r.read())
```
常用 location_code:
- 2840 US / 2826 UK / 2392 JP / 2410 KR / 2156 CN(无效,中国不返回)
---
## 执行脚本
实际可执行的脚本在 `scripts/` 下:
- `daily-report.py` — 完整日报
- `canonical-fix.py` — canonical 批量合并
- `rescue-post.py` — 社媒文章救援
- `diag-keyword.py` — 单关键词诊断(roadmap,暂用下方「单词诊断」流程替代)
每个脚本独立可跑。调用时优先用 Bash 工具,不要重写脚本。
---
## 输出规范
1. **先给表格,再给结论**
2. **差异用箭头**:`#6 ↑` / `#19 ↓↓`
3. **超过 3 项异常** → 单独开"🚨 异常"一节
4. **所有排名必须带 location_code**,不要假设
---
## 级联推荐
- 发现 cannibalization(同一关键词多篇排名接近) → `gr-blog-post` 做 canonical 整合
- 发现 N 个关键词同日跌出 top 100 → 先等 24h,不要硬改内容
- 发现新关键词机会(长尾 top 30) → `gr-blog-post` 扩写
---
## 反模式
- ❌ 不要用 `grep` / `find` 扫 `_posts/`,改用 GitHub Contents API
- ❌ 不要在生产环境直接 `git push` —— 用 Contents API 的 PUT
- ❌ 不要一次 serp 50 个关键词 —— 批量过大易触发 rate limit,按 6-10 个一批
- ❌ canonical 改到 master 后不要立刻 force Google 重新索引 —— 等 3-7 天自然爬取
---
## Monthly Full-Site Audit Workflow
> **Validated 2026-05-07 on 58 pages**: caught 43 SERP-truncating titles + 36 schema warns + 27 stop-word slugs in a single 30-min pass. Single layout-level fix (commit `24a0410e`) resolved 20 of 43 title issues.
Run **once per month** before `phase2-monthly-checkpoint`. Output: HTML report + machine-readable findings.json archived to `~/Downloads/seo-audit/audit-{YYYY-MM-DD}.json`.
### Stage 1 — Discovery (5 min)
```python
import urllib.request, re
sm = urllib.request.urlopen("https://gingiris.tools/sitemap.xml").read().decode()
urls = [u for u in re.findall(r"<loc>([^<]+)</loc>", sm) if "/blog/" in u]
# typically 50-70 URLs
```
### Stage 2 — Parallel Audit (20 min for 60 pages, 4 threads)
Use the **adopted seo-audit-skill scripts** in `scripts/`:
```bash
# For each URL — run 2 scripts in parallel
python3 scripts/check-page.py URL --timeout 20 # title, H1, meta, canonical, slug, alt, keyword placement
python3 scripts/check-schema.py URL --timeout 20 # JSON-LD validation
```
Or batch with Python's `concurrent.futures.ThreadPoolExecutor(max_workers=4)`. Don't go higher than 4 — GitHub Pages CDN throttles aggressive parallel hits.
Each script outputs structured JSON envelope:
```json
{"field": {"status": "pass|warn|fail|info", "detail": "...", "llm_review_required": false}}
```
### Stage 3 — Aggregate Findings
Bucket by category:
- **Title length** > 70 chars (SERP truncation risk)
- **H1 length** > 70 chars (mobile readability)
- **Meta description** < 80 or > 170 chars
- **Schema warns** by @type (BlogPosting, Article, Organization)
- **Canonical issues** (mismatch with final URL)
- **Slug issues** (stop words, uppercase, missing keyword)
- **Image alt text** missing on content images
Save aggregated counts + per-issue URL lists to `findings.json`.
### Stage 4 — Layered Fix Strategy (HIGH ROI ORDER)
**1️⃣ Layout-level fixes first** (1 commit, fixes 20+ pages):
- Schema bugs in `_layouts/default.html`
- Site-name suffix in `<title>` tag
- Missing `dateModified` from `last_modified_at` frontmatter
- Organization / Publisher / contactPoint completeness
**2️⃣ Config-level fixes** (1 commit, fixes site-wide):
- `_config.yml` — logo URL (must be absolute), twitter, social, author structure
**3️⃣ Per-article batch fixes** (1 commit per file, parallelizable):
- Trim long titles while preserving keyword (target ≤ 70 chars, ideal 50-60)
- Trim long H1s (target ≤ 70 chars)
- Expand short meta descriptions (target 120-160 chars)
- Add missing Citable Statistics blocks for top GSC-impression pages
**4️⃣ Skip these (low ROI):**
- Slug stop words (changing breaks 301)
- Old articles with <50 imp/month (low traffic = low fix priority)
### Stage 5 — Verify (after Jekyll rebuild ~60-90s)
Re-run `check-schema.py` on a sample page. Confirm `status: pass` for at least:
Article · BlogPosting · Organization · FAQPage
### Stage 6 — Archive
Commit `findings.json` to `~/Downloads/seo-audit/audit-{YYYY-MM-DD}.json` for trend tracking.
Add 2-5 atoms documenting any new lessons learned.
---
## HARD RULE (anti-hallucination guardrail)
> Adopted from [JeffLi1993/seo-audit-skill](https://github.com/JeffLi1993/seo-audit-skill) — strict whitelist pattern.
⛔ **Output ONLY the checks defined in the audit script's JSON envelope.**
- Do NOT add "bonus" checks not in the script output
- Do NOT contradict the script's `status` field unless you have additional observable evidence
- Do NOT invent metrics like "EEAT score 89" — third-party scoring tools are unofficial (per Google's 2026 guidance)
- Do NOT include checks marked `llm_review_required: false` in your judgment commentary — the script's `status` is final
- If `llm_review_required: true`, make explicit judgment, document reasoning, then update status
The script envelope is the **single source of truth**. Treat it as a strict whitelist.
---
## Companion skill
For single-page audits (not full-site), the same scripts power **[JeffLi1993/seo-audit-skill](https://github.com/JeffLi1993/seo-audit-skill)** which produces a polished HTML audit report.
Install as a complementary skill if you want client-presentable per-page audits.
don't have the plugin yet? install it then click "run inline in claude" again.