Rigorous evidence-based inquiry: decompose fuzzy questions, retrieve & grade evidence (S/A/B/C/D), cross-validate, and output conclusions with confidence int...
---
name: scientific-inquiry-en
description: "Rigorous evidence-based inquiry: decompose fuzzy questions, retrieve & grade evidence (S/A/B/C/D), cross-validate, and output conclusions with confidence intervals. Includes Step 0 user confirmation to prevent direction drift."
---
# π§ͺ Scientific Inquiry
> **Security Notice:** This skill uses self-modification (via `skill_manage`) but ONLY when the user explicitly commands it. See the "Controlled Self-Evolution" section for details. This prevents prompt injection and unintended auto-modification.
## Trigger Conditions
**Activate this skill when** the user asks any of the following:
- **Fact-checking:** "Is X true?" "Is X reliable?"
- **Data research:** "What's the trend/data/distribution of X?" "Look up data on X"
- **Industry research:** "How is market X doing?" "Analyze industry X"
- **Verification:** "I heard X, does that check out?" "Can this conclusion hold?"
- **Comparison:** "Which is better, X or Y?" "Compare X and Y"
- User explicitly says: "research", "investigate", "verify", "look into", "analyze", "check"
> Even simple requests (like "check this stat") activate this skill if they involve systematic information gathering.
## Core Pipeline
### Step 0: Problem Analysis β User Confirmation (Critical! Prevents Direction Drift)
Upon receiving a question, **do NOT start searching yet**. First output a research plan template:
> **π Research Plan**
>
> **Question:** [Restate the original question to confirm alignment]
>
> **Research type:** Fact-check / Data research / Industry study / Comparison / Trend analysis
>
> **Sub-questions:**
> 1. [Sub-question A] β Verifiability: High/Medium/Low β [Expected sources]
> 2. [Sub-question B] β Verifiability: High/Medium/Low β [Expected sources]
> 3. [Sub-question C] β Verifiability: High/Medium/Low β [Expected sources]
>
> **Methodology:**
> - Primary search path: [Specific tools/APIs/databases]
> - Keywords: [Search terms]
> - Fallback if key data is unavailable: [Alternative approach]
>
> **Expected output:**
> - Expected confidence: High/Medium/Low
> - Main uncertainties: [Anticipated blind spots]
>
> ---
>
> β
Does this direction look good? Let me know and I'll proceed with Step 1-4.
**Do NOT make any retrieval tool calls until the user confirms.**
### Step 1: Decompose Into Sub-questions
Break the fuzzy question into verifiable atomic statements. For each:
- **Verifiability:** High (public data/literature) / Medium (indirect evidence) / Low (little public info)
- **Evidence type:** Quantitative (specific numbers) / Qualitative (trend judgment)
- **Source direction:** E.g., academic papers, official data, industry reports, news articles
### Step 1.5 (Critical Prerequisite): Verify Baseline Facts
**Before formal research, check these prerequisites:**
#### π΄ Time-Baseline Check
- Search product/event + "launch" "announce" "release" β confirm **if it already happened**
- High-risk categories: consumer electronics, policy changes, earnings calls, product releases
- If results show the event has occurred, **pivot immediately** β don't keep analyzing based on old data
> **Classic failure mode:** User asks "will Huawei phones get more expensive?" You analyze storage cost trends for 30 minutes. Meanwhile, the Pura 90 already launched with published pricing. You're predicting history.
#### π΄ Search Engine Diagnostic
Before committing to a search tool, quickly test availability:
1. **Try web_search first** β simple query, check if results come back normally
2. **If web_search fails** β use `curl -sL` to Google/Bing/DuckDuckGo; distinguish CAPTCHA from timeout
3. **Three failure modes:**
- **CAPTCHA block** (Google's "sorry" page / DuckDuckGo checkbox grid / Baidu slider) β switch search engine immediately
- Do NOT retry the same engine more than 2 times
- Try a different engine or use the video platform fallback (Step 2b)
- **Timeout / empty page** (`(empty page)` or `ERR_TIMED_OUT`) β network/proxy issue
- First confirm basic connectivity with `curl` to a simple HTTP target
- Bing's `(empty page)` sometimes resolves after pressing Enter/submitting the search form
- **Login redirect** (site search requiring auth) β abandon, use alternative sources
4. **Choose fallback channel based on failure mode** β see Step 2b below
> This step prevents wasted calls on dead search channels. If all search engines are blocked, video platform titles + vertical media browsing is 10x more productive than retrying Google.
### Step 2: Evidence Retrieval (Classified & Graded)
Every piece of evidence **MUST** be annotated with source and grade. See the "Evidence Classification Discipline" section for detailed definitions.
Prioritize S/A-grade evidence; B/C are supplementary only.
```
S-grade: Primary academic literature / Official statistics / Raw data APIs
A-grade: Authoritative media / Professional reports / Fully cited secondary sources
B-grade: Industry analysis / Forum discussions / Indirect data
C-grade: Social media / Single samples / Non-professional interpretations
D-grade: No source / Rumors / Obvious conflicts of interest
```
Present findings as an evidence table:
| Evidence | Source | URL | Grade | Sub-question |
|----------|--------|-----|-------|-------------|
| ... | ... | ... | ... | ... |
**Source URLs are mandatory.** A bare site name (e.g., "YouTube") is not a valid source. Even search engine results should link to the search page or specific result.
### Step 2b: Fallback Search Strategies
When mainstream search engines are blocked or return empty results:
**1οΈβ£ Video platform search** β YouTube (for pricing/product info), or local equivalents
- Video titles often contain structured data (prices, specs, dates)
- Multiple creator titles covering the same number β higher confidence
- Upload date β event date, accurate to the day
- Comments and related recommendations can reveal additional intel
- Search multiple keyword variants (product + price / product + launch / CEO + statement)
**2οΈβ£ Direct access to vertical media**
- Tech news sites, industry publications
- Note: some require login; try site-specific Google search syntax
**3οΈβ£ E-commerce platforms**
- Official brand stores, marketplaces
- Note: may redirect to login pages
**4οΈβ£ Social media**
- Weibo, Twitter/X, Reddit β if accessible
**5οΈβ£ Text-mode search engines**
- DuckDuckGo lite, Startpage
- Note: may still trigger CAPTCHA
> **Priority:** Video platform titles > Vertical media > E-commerce > Social media. Video title info density and timeliness often exceed other sources for consumer products.
### Step 3: Cross-Validation
For each sub-question:
- At least **2 independent sources**
- Label inter-evidence relationship: **Consistent** / **Contradictory** / **Complementary**
- If contradictory, analyze possible causes (methodology differences / vested interests / time window / sample bias)
### Step 4: Conclusion Output (β
/β οΈ/β Symbol Format)
Two-block output:
**Block A β Claim Verification Report (one line per key finding)**
```
β
CONFIRMED: γPura 90 starts at Β₯4,699γβ 5 creator video titles agree + financial media report
β οΈ UNVERIFIABLE: γHuawei stockpiled 100M NAND chipsγβ single comment section post (D-grade), no media confirmation
β CONTRADICTED: γPura 90 will be more expensive than Pura 80γβ actual launch price Β₯4,699, same as predecessor
```
**Block B β Overall Judgment**
```
Proposition: [One-sentence restatement]
Confidence:
β
High (β₯80%) β Multiple S/A-grade evidence consistent
β οΈ Medium (50-80%) β Key data gaps exist
β Low (<50%) β Mostly inference
Top-3 Key Evidence (with URLs):
1. [Evidence A] β S-grade β [Source](URL)
2. [Evidence B] β A-grade β [Source](URL)
3. [Evidence C] β B-grade β [Source](URL)
Core Uncertainties:
- [Uncertainty 1]
- [Uncertainty 2]
```
## Evidence Classification Discipline (Critical!)
Evidence grades are decoration β they are the LIFEBLOOD of your conclusion.
### Grade Definitions
| Grade | Definition | Examples | Usable? |
|-------|-----------|----------|---------|
| **S** | Primary academic lit / Official stats / Raw data APIs / Authoritative market reports | Peer-reviewed papers, government statistics, exchange data | β
Standalone |
| **A** | Respected media / Professional analysis / Fully cited secondary sources | Reuters, Bloomberg, financial analyst reports | β
Needs β₯1 corroboration |
| **B** | Industry analysis / Forum discussions / Indirect data / Raw executive quotes | CEO statements (cross-verified across video titles), tech news | β
Needs β₯2 cross-references |
| **C** | Social media / Single samples / Non-professional reading / Snippet from search results | Individual blog posts, Reddit answers, single YouTube title | β οΈ Leads only, cannot conclude |
| **D** | No source / Rumors / Obvious conflict of interest / **User comment section** | YouTube/Reddit comment section, anonymous forum posts | β **Never** use as evidence |
### Core Rules
1. **Video titles = C-grade (weak lead starting point)**
- Same data point confirmed in 3+ independent creator titles β upgrade to B
- Combined with professional media coverage β A-
2. **Comment section user posts = D-grade (unreliable by default)**
- **Never cite as evidence**, no matter how detailed or plausible!
- Use comment info only as "search suggestions" β take the keyword, find a real source
3. **Source URLs are mandatory, not optional**
- Every evidence item MUST include a full URL
- "Found on YouTube" is not a valid source
- Search engine result page URLs count if you label the search term
4. **Better to say less than to fabricate**
- When key data is missing, mark "pending collection" or "no reliable source found"
- Never fill gaps with D-grade material or assumed values
- Wrong conclusions should be DELETED entirely, not left as "to be verified"
## Controlled Self-Evolution (ζΉζ‘B β Guarded Mode)
> **π΄ Security Constraint:** This skill's self-modification is gated behind explicit user commands.
>
> User provides feedback β default action: update memory only (no skill file change)
> User says "update the skill" / "commit this to the skill" / "add this to the workflow" β then execute skill_manage
>
> This prevents: malicious input injection / accidental trigger during research / unconfirmed auto-modification
### Recording Phase (Default Behavior)
When the user provides improvement feedback:
1. **Store in memory first** β `memory(action='add', ...)` records preferences and lessons
2. No automatic `skill_manage` calls, no SKILL.md modification
### Upgrade Phase (Explicit User Command Required)
Only execute `skill_manage(patch)` when the user explicitly says:
- "Update the skill"
- "Add this to the skill"
- "Commit this to the workflow / to common pitfalls"
- "Add this to evidence grades / trigger conditions / search strategies"
- Any phrase containing "update skill", "commit to skill", "save to skill"
Common trigger scenarios:
| User feedback type | Record to memory | Upgrade to skill |
|-------------------|-----------------|-----------------|
| **Direction correction**: "This sub-question isn't the point" | β
Default | When user says "update the skill accordingly" |
| **Evidence standard**: "This source isn't good enough" | β
Default | When user says "add this to the evidence discipline" |
| **Format preference**: "Too long / give me a short version first" | β
Default | When user says "save this format to the skill" |
| **New scenario**: "This isn't just fact-checking, it's data research" | β
Default | When user says "add this to trigger conditions" |
| **Methodology**: "You should plan before executing" | β
Default | When user says "add this to the workflow" |
| **Recurring error** (β₯2 same class) | β
Default | When user says "add this to common pitfalls" |
## Scenario Types
| Scenario | Characteristics | Watch Out For |
|----------|----------------|--------------|
| Fact-check | Verify a specific claim | Find primary source, watch for telephone game distortions |
| Trend analysis | Predict direction of a metric | Separate short-term noise from long-term trends, note data window |
| Comparison | Compare options | Ensure full dimension coverage, avoid survivorship bias |
| Causal analysis | Did A cause B? | Distinguish correlation from causation, watch for confounders |
| Consumer pricing/product research | Product pricing and storage strategy | **First verify if the product is already launched!** Check executive statements; find raw component cost data from market research firms |
## Quality Checklist
- [ ] Step 0 plan output and user confirmation received?
- [ ] Each sub-question has β₯1 evidence source?
- [ ] Every evidence item graded?
- [ ] Contradictory evidence analyzed for probable cause?
- [ ] Conclusion includes confidence level and uncertainties?
- [ ] **Discipline check**: Any comment-section UGC cited as evidence? Source URLs complete? Any "to be verified" speculation left?
## Common Pitfalls
- **Don't skip Step 0:** Even if the direction seems obvious. Wrong direction Γ fast search = wasted time.
- **Don't search only for supporting evidence:** Actively look for counter-arguments. Avoid confirmation bias.
- **Distinguish "no evidence" from "evidence against":** Not finding something β it doesn't exist. Label as "not found", not "disproven".
- **Watch data timeliness:** Especially for prices and policies. Note the collection date.
- **Keep user updated during long searches:** If retrieval exceeds 5 steps, report progress between steps. No silent running.
- **Verify product/event existence before predicting:** The most common embarrassing mistake β predicting a "soon to launch" product that already launched.
- **Never cite comment-section UGC as evidence:** Default grade D. Use comments only as search leads.
- **Distinguish "search result title" from "comment post":** A YouTube/Reddit video title is C-grade (creator's public info). A comment on that video is D-grade. Different worlds.
- **Source URLs must be complete:** Bare site names don't count. Search result page URLs with labeled search terms count.
---
don't have the plugin yet? install it then click "run inline in claude" again.