Phy Content Humanizer Audit

AI content signature detector for social media posts. Measures 8 linguistic dimensions that LinkedIn's 360Brew and other platforms use to detect AI-generated...

SkillRank score: 7.8/10

evaluated by implexa, claude-haiku-4-5 · 2026-05-26

phy-content-humanizer-audit measures 8 linguistic dimensions to detect ai-generated social media content before publishing. reports per-platform risk thresholds for linkedin, reddit, twitter, and hackernews with specific remediation guidance.

structure

8.0

trigger phrases

8.0

procedure

8.0

edge cases

6.0

documentation

8.0

---
name: Content Humanizer Audit
description: AI content signature detector for social media posts. Measures 8 linguistic dimensions that LinkedIn's 360Brew and other platforms use to detect AI-generated content — lexical diversity, sentence length variance, transition word density, hedging ratio, contraction usage, personal pronoun density, question frequency, and specific data density. Not a humanizer that rewrites your text — an auditor that tells you exactly which signals are triggering detection so you fix only what's wrong. Research-backed (DivEye arXiv:2509.18880, LinkedIn 360Brew algorithm analysis, stylometric detection studies). Per-platform thresholds for LinkedIn (strictest), Reddit, Twitter/X, HackerNews. Zero external dependencies.
license: Apache-2.0
homepage: https://canlah.ai
metadata:
  author: Canlah AI
  version: "1.0.3"
tags:
  - social-media
  - content
  - linkedin
  - ai-detection
  - writing
  - marketing
  - authenticity
  - brand-voice
---

# phy-content-humanizer-audit — AI Content Signature Detector

LinkedIn's 360Brew algorithm penalizes AI-detected content with **30% less reach and 55% less engagement**. This tool tells you exactly which linguistic signals are triggering detection — so you fix only what's wrong instead of rewriting everything.

**Not a humanizer. An auditor.**

## The Problem

You draft a LinkedIn post (maybe with AI help), publish it, and reach tanks. Why?

LinkedIn's 360Brew uses an LLM to evaluate:
- **Lexical diversity** — AI repeats vocabulary patterns
- **Sentence rhythm** — AI maintains unnaturally consistent sentence lengths
- **Transition words** — AI overuses "Furthermore", "Moreover", "Additionally"
- **Hedging language** — AI says "arguably" and "it seems" instead of stating opinions
- **Formality** — AI avoids contractions ("do not" instead of "don't")
- **Impersonality** — AI rarely uses first-person pronouns
- **No questions** — AI makes statements, doesn't ask
- **Vagueness** — AI uses abstract language with no specific data

This tool measures all 8 dimensions, scores each 0-10, and tells you your **AI Signature %** — the probability a platform algorithm will flag your content as AI-generated.

## Quick Start

```bash
# Audit a LinkedIn post draft
echo "Your post text here" | python3 ~/.claude/skills/phy-content-humanizer-audit/scripts/content_humanizer_audit.py --platform linkedin

# Audit from file
python3 ~/.claude/skills/phy-content-humanizer-audit/scripts/content_humanizer_audit.py --file draft.txt --platform reddit

# Inline text
python3 ~/.claude/skills/phy-content-humanizer-audit/scripts/content_humanizer_audit.py --text "My post..." --platform twitter

# JSON output (for pipelines)
python3 ~/.claude/skills/phy-content-humanizer-audit/scripts/content_humanizer_audit.py --file draft.txt --format json
```

## The 8 Dimensions

| # | Dimension | What It Measures | Human Signal | AI Signal |
|---|-----------|-----------------|-------------|-----------|
| 1 | **Lexical Diversity (TTR)** | Vocabulary variety (type-token ratio) | TTR 0.55-0.80 | TTR 0.35-0.55 |
| 2 | **Sentence Length Variance** | Mix of short/long sentences (coefficient of variation) | CV > 0.4 | CV < 0.3 |
| 3 | **Transition Word Density** | "Furthermore", "Moreover" per 100 words | < 1.5/100w | > 3.0/100w |
| 4 | **Hedging Ratio** | "arguably", "it seems" per 100 words | < 1.0/100w | > 2.0/100w |
| 5 | **Contraction Usage** | "don't", "I've", "it's" per 100 words | > 1.5/100w | < 0.5/100w |
| 6 | **Personal Pronoun Density** | "I", "my", "we" per 100 words | > 3.0/100w | < 1.5/100w |
| 7 | **Question Frequency** | % of sentences that are questions | 10-25% | 0-5% |
| 8 | **Specific Data Density** | Numbers, dates, names per 100 words | > 2.0/100w | < 1.0/100w |

Each dimension scores 0-10 (10 = very human). Total /80, mapped to an **AI Signature %**.
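As a rough illustration of how a raw metric could become a 0-10 score, here is a minimal sketch that interpolates linearly between the AI and human thresholds in the table. The function name `linear_score` and the `BOUNDS` keys are hypothetical, and the real script may use a softer curve: the FAIL example further down scores 3.4 transitions per 100 words at 2.6/10 rather than 0, which a strict clamp would not produce.

```python
def linear_score(value: float, ai_bound: float, human_bound: float) -> float:
    """Map a raw metric to 0-10, where 10 is the human end of the range.

    Works in both directions: human_bound may sit above ai_bound
    (contractions) or below it (transition words).
    """
    fraction = (value - ai_bound) / (human_bound - ai_bound)
    fraction = max(0.0, min(1.0, fraction))  # clamp values outside the two bounds
    return round(10.0 * fraction, 1)


# (AI bound, human bound) pairs copied from the table above.
# The dictionary keys are illustrative names, not the script's own.
BOUNDS = {
    "sentence_length_cv": (0.3, 0.4),
    "transition_density": (3.0, 1.5),
    "hedging_density": (2.0, 1.0),
    "contraction_density": (0.5, 1.5),
    "pronoun_density": (1.5, 3.0),
    "data_density": (1.0, 2.0),
}

if __name__ == "__main__":
    # Halfway between the transition-word bounds.
    print(linear_score(2.25, *BOUNDS["transition_density"]))
    # No contractions at all.
    print(linear_score(0.0, *BOUNDS["contraction_density"]))
```

Passing the bounds as an (AI, human) pair keeps one function usable for metrics where "more" is human and metrics where "less" is human.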

## Platform Thresholds

| Platform | WARN above | FAIL above | Why |
|----------|-----------|-----------|-----|
| **LinkedIn** | 45% | 65% | 360Brew LLM actively detects AI. Strictest. |
| **HackerNews** | 50% | 70% | Technical audience spots AI quickly. |
| **Reddit** | 55% | 75% | Community policing + mod tools. Moderate. |
| **Twitter/X** | 60% | 80% | Short form = less surface for detection. |
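
The threshold table can be read as a simple lookup. The sketch below is one way to turn an AI Signature % into a status and an exit code; `THRESHOLDS`, `EXIT_CODES`, and `classify` are hypothetical names, and "above" is taken literally, so a score exactly on a threshold stays in the lower band.

```python
from typing import Tuple

# (WARN above, FAIL above) per platform, copied from the table above.
THRESHOLDS = {
    "linkedin": (45.0, 65.0),
    "hackernews": (50.0, 70.0),
    "reddit": (55.0, 75.0),
    "twitter": (60.0, 80.0),
}

# Exit codes as documented for CI use: 0=PASS, 1=WARN, 2=FAIL.
EXIT_CODES = {"PASS": 0, "WARN": 1, "FAIL": 2}


def classify(ai_signature: float, platform: str) -> Tuple[str, int]:
    """Return (status, exit_code) for a score on the given platform."""
    warn_above, fail_above = THRESHOLDS[platform.lower()]
    if ai_signature > fail_above:
        status = "FAIL"
    elif ai_signature > warn_above:
        status = "WARN"
    else:
        status = "PASS"
    return status, EXIT_CODES[status]


if __name__ == "__main__":
    print(classify(10.5, "linkedin"))
    print(classify(62.0, "twitter"))
```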

## AI-Flagged Word List

The tool flags 37 words that are strong AI signals on social media:

> leverage, robust, crucial, delve, tapestry, holistic, synergy, paradigm, ecosystem, landscape, streamline, cutting-edge, game-changer, innovative, revolutionary, transformative, comprehensive, meticulous, nuanced, multifaceted, pivotal, seamless, foster, utilize, facilitate, endeavor, underscore, realm, navigate, embark, spearhead, harness, unveil, bolster, cornerstone, unparalleled, groundbreaking

Each found word adds 3% to your AI signature score.
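
A minimal sketch of the word scan, assuming exact lowercase matches on whole tokens and counting each flagged word once no matter how often it appears. The tokenizer regex and the function names are illustrative; inflected forms such as "leveraging" are not caught by this version.

```python
import re
from typing import List

# The 37 flagged words listed above.
AI_FLAGGED_WORDS = frozenset({
    "leverage", "robust", "crucial", "delve", "tapestry", "holistic",
    "synergy", "paradigm", "ecosystem", "landscape", "streamline",
    "cutting-edge", "game-changer", "innovative", "revolutionary",
    "transformative", "comprehensive", "meticulous", "nuanced",
    "multifaceted", "pivotal", "seamless", "foster", "utilize",
    "facilitate", "endeavor", "underscore", "realm", "navigate", "embark",
    "spearhead", "harness", "unveil", "bolster", "cornerstone",
    "unparalleled", "groundbreaking",
})

# Keeps hyphenated words such as "cutting-edge" together as one token.
TOKEN_RE = re.compile(r"[a-z]+(?:-[a-z]+)*")
POINTS_PER_WORD = 3


def find_flagged_words(text: str) -> List[str]:
    """Return the distinct flagged words in the text, sorted alphabetically."""
    tokens = set(TOKEN_RE.findall(text.lower()))
    return sorted(tokens & AI_FLAGGED_WORDS)


def flagged_word_bonus(text: str) -> int:
    """Percentage points added to the AI signature: 3 per distinct word."""
    return POINTS_PER_WORD * len(find_flagged_words(text))


if __name__ == "__main__":
    sample = "We leverage a robust, cutting-edge ecosystem."
    print(find_flagged_words(sample), flagged_word_bonus(sample))
```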

## Example Output

### Human-written post (PASS)

```
==================================================================
  phy-content-humanizer-audit — AI Signature Report
==================================================================
  Platform : Linkedin
  Words    : 183
  AI Sig   : 10.5% ✅ PASS
  Human    : 74.0/80.0
  Threshold: WARN >45%, FAIL >65%
==================================================================

📊  Dimension Scores (0-10, higher = more human)

  Lexical Diversity (TTR)      ██████████ 10.0/10
  Sentence Length Variance     ██████████ 10.0/10
  Transition Word Density      ██████████ 10.0/10
  Hedging Ratio                ██████████ 10.0/10
  Contraction Usage            ██████████ 10.0/10
  Personal Pronoun Density     █████░░░░░  5.5/10
  Question Frequency           ████████░░  8.5/10
  Specific Data Density        ██████████ 10.0/10
```

### AI-generated post (FAIL)

```
==================================================================
  Platform : Linkedin
  AI Sig   : 100% 🔴 FAIL
  Human    : 22.0/80.0
==================================================================

  Transition Word Density      ██░░░░░░░░  2.6/10   (3.4/100w)
  Hedging Ratio                █░░░░░░░░░  1.1/10   (3.4/100w)
  Contraction Usage            ░░░░░░░░░░  0.0/10   (0.0/100w)
  Question Frequency           █░░░░░░░░░  1.0/10   (0%)
  Specific Data Density        ░░░░░░░░░░  0.0/10   (0.0/100w)

  🚫 14 AI-flagged words: comprehensive, crucial, cutting-edge,
     ecosystem, facilitate, harness, holistic, innovative, landscape,
     navigate, paradigm, revolutionary, robust, transformative
```

## How to Use the Fixes

The tool outputs your **top 3 fixes** ranked by impact:

```
💡  Top 3 Fixes to Lower AI Signature:

  1. Add contractions: change 'do not' → 'don't', 'I have' → 'I've'
  2. Add specific data: include a number, date, or tool name
  3. Remove AI words: comprehensive, crucial — replace with plain terms
```

Fix just those 3 things and re-run. Usually drops AI signature by 20-30%.
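
The ranking can be sketched as a fixed priority list, using the order given in the procedure section of this page (contractions, specific data, flagged words, hedging, transitions) and its "score below 5" trigger. The score keys, message wording, and `top_fixes` name are hypothetical.

```python
from typing import Dict, List


def top_fixes(scores: Dict[str, float], flagged_words: List[str],
              limit: int = 3) -> List[str]:
    """Return up to `limit` suggestions, highest priority first."""
    candidates = [
        (scores["contraction_usage"] < 5,
         "Add contractions: change 'do not' to 'don't', 'I have' to 'I've'"),
        (scores["specific_data_density"] < 5,
         "Add specific data: include a number, date, or tool name"),
        (bool(flagged_words),
         "Remove AI words: " + ", ".join(flagged_words)),
        (scores["hedging_ratio"] < 5,
         "Cut hedging: state the opinion directly"),
        (scores["transition_word_density"] < 5,
         "Cut transition words such as 'Furthermore' and 'Moreover'"),
    ]
    # Keep only the rules that fired, preserving priority order.
    return [message for fired, message in candidates if fired][:limit]


if __name__ == "__main__":
    # Scores taken from the FAIL example above.
    failing = {
        "contraction_usage": 0.0,
        "specific_data_density": 0.0,
        "hedging_ratio": 1.1,
        "transition_word_density": 2.6,
    }
    for rank, fix in enumerate(top_fixes(failing, ["comprehensive", "crucial"]), 1):
        print(f"{rank}. {fix}")
```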

## CI / Pre-publish Gate

```bash
# Fail the job only if AI signature > 65% (LinkedIn FAIL threshold)
# Exit code: 0=PASS, 1=WARN, 2=FAIL
status=0
echo "$POST_TEXT" | python3 content_humanizer_audit.py --platform linkedin || status=$?
if [ "$status" -ge 2 ]; then exit "$status"; fi
```

## Research Basis

| Source | Key Finding | How We Use It |
|--------|------------|---------------|
| DivEye (arXiv:2509.18880) | Human text has richer variability in lexical/structural unpredictability | TTR + sentence variance scoring |
| LinkedIn 360Brew (2026) | LLM-based feed ranking detects AI via lexical patterns, profile alignment | Platform-specific thresholds |
| Stylometric detection studies | AI shows lower sentence length variance, higher transition density | 8-dimension framework |
| LinkedIn algorithm data | 30% reach drop, 55% engagement drop for AI content | WARN/FAIL calibration |
| Consumer research | 52% reduce engagement with suspected AI content | Motivation for the tool |

## Technical Notes

- **Zero external dependencies** — pure Python 3.7+ stdlib
- **Sentence splitting** — regex-based, handles abbreviations
- **Windowed TTR** — sliding window of 100 tokens to normalize for text length
- **Exit codes** — 0 (PASS), 1 (WARN), 2 (FAIL) for CI integration
- **JSON output** — `--format json` for pipeline integration
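
To make the windowed TTR note concrete, here is a minimal sketch. The window size of 100 comes from the note above; the step of one token, the fallback to plain TTR for short texts, and the tokenizer regex are assumptions, not details confirmed by the script.

```python
import re
from typing import List

WINDOW = 100  # tokens per window, as stated in the technical notes


def tokenize(text: str) -> List[str]:
    """Lowercase word tokens; apostrophes are kept so "don't" stays whole."""
    return re.findall(r"[a-z0-9']+", text.lower())


def windowed_ttr(tokens: List[str], window: int = WINDOW) -> float:
    """Average type-token ratio over every full window of `window` tokens."""
    if not tokens:
        return 0.0
    if len(tokens) <= window:
        # Assumption: texts shorter than one window use plain TTR.
        return len(set(tokens)) / len(tokens)
    ratios = [
        len(set(tokens[start:start + window])) / window
        for start in range(len(tokens) - window + 1)
    ]
    return sum(ratios) / len(ratios)


if __name__ == "__main__":
    print(windowed_ttr(tokenize("I don't think this works. Do you think it works?")))
```

Averaging fixed-size windows is what keeps a 600-word post from scoring lower than a 60-word one simply because long texts inevitably repeat common words.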

## Companion Skills

| Skill | Relationship |
|-------|-------------|
| `phy-brand-voice-guard` | Brand-specific content rules (this tool = platform-universal AI detection) |
| `phy-post-forensics` | Analyzes why posts worked/failed (this tool = pre-publish prevention) |
| `phy-platform-rules-engine` | Platform-specific invisible rules (this tool = AI signature specifically) |

---

## Author

**[Canlah AI](https://canlah.ai)** — Run performance marketing without breaking your brand.

- GitHub: [github.com/PHY041](https://github.com/PHY041)
- All Skills: [clawhub.ai/PHY041](https://clawhub.ai/PHY041)


added explicit inputs section with edge cases, expanded procedure into 15 granular steps with input/output per step, formalized decision points for empty input/invalid platform/short text/nan values, specified output contract with both text and json schemas including exit codes, and clarified outcome signals with concrete examples of skill working.

phy-content-humanizer-audit

Item: Phy Content Humanizer Audit
Rating: 7.8
Author: Implexa

intent

audit social media posts for linguistic signatures that platform algorithms (LinkedIn 360Brew, Reddit moderation, HackerNews detection, Twitter/X analysis) use to flag AI-generated content. this skill does not rewrite or humanize text. it measures 8 dimensions (lexical diversity, sentence length variance, transition word density, hedging ratio, contraction usage, personal pronoun density, question frequency, specific data density), scores each 0-10, and outputs an AI Signature percentage that tells you whether your post will trigger algorithmic suppression. use this before publishing to identify exactly which signals are firing, then fix only what matters instead of rewriting everything.

inputs

content

source: stdin, --text flag, or --file path
format: plain text, normalized to utf-8; no upper length limit
edge case: empty input or fewer than 10 words halts with exit code 3 and the error "input text must be >= 10 words"

platform selection

parameter: --platform (linkedin, reddit, twitter, hackernews)
default: linkedin (strictest thresholds)
validates against supported platforms only

output format

parameter: --format (text or json)
default: text (human-readable table + recommendations)
json includes raw scores, flagged words, top 3 fixes as structured data

python environment

requirement: python 3.7+
dependencies: none (pure stdlib)
location: ~/.claude/skills/phy-content-humanizer-audit/scripts/content_humanizer_audit.py

optional: configuration file

path: ~/.claude/skills/phy-content-humanizer-audit/config.yaml (if it exists)
allows override of ai-flagged word list, dimension weights, platform thresholds
if missing, skill uses hardcoded defaults

procedure

parse input source: read text from stdin (piped), --text inline argument, or --file path. normalize whitespace and encoding (utf-8). validate text is >= 10 words, else halt with exit code 3 and error message "input text must be >= 10 words".
tokenize and clean: split text into sentences using regex that handles common abbreviations (dr., mr., etc.). split sentences into words (whitespace + punctuation). convert to lowercase for analysis. track original word count.
calculate lexical diversity (dimension 1): compute type-token ratio (TTR) using a sliding window of 100 tokens. TTR = unique words / total words in each window. average across windows. map to 0-10 score: TTR >= 0.80 = 10, TTR in the 0.35-0.55 ai range = 0, linear interpolation between 0.55 and 0.80. output: individual window TTRs, final TTR score, human vs ai signal threshold.
calculate sentence length variance (dimension 2): for all sentences, record word counts. compute mean and standard deviation. calculate coefficient of variation (cv) = std / mean. map to 0-10 score: cv > 0.4 = 10 (human), cv < 0.3 = 0 (ai), linear between. output: cv value, score, narrative explanation.
scan for transition words (dimension 3): match the text against a hardcoded list of 24 transition words and phrases (furthermore, moreover, additionally, consequently, subsequently, nevertheless, however, in addition, on the other hand, for instance, in particular, notably, specifically, generally, ultimately, in conclusion, as a result, meanwhile, similarly, conversely, undoubtedly, obviously, clearly, essentially). count occurrences and compute per-100-word density as (occurrences / word count) * 100. map score: < 1.5/100w = 10 (human), > 3.0/100w = 0 (ai), linear between. output: raw density, flagged words with positions, score.
scan for hedging language (dimension 4): check for 18 hedging words and phrases, including arguably, it seems, it appears, somewhat, perhaps, possibly, likely, seemingly, one might say, relatively, kind of, sort of, and in a sense ("arguably" is the strongest signal). count per 100 words. map score: < 1.0/100w = 10 (human), > 2.0/100w = 0 (ai), linear. output: raw density, flagged words with positions, score.
count contractions (dimension 5): regex search for contractions (don't, can't, won't, i've, i'm, you're, it's, that's, there's, what's, who's, we've, they've, isn't, aren't, wasn't, weren't, haven't, hasn't, hadn't, etc. - minimum 24 patterns). count occurrences per 100 words. map score: > 1.5/100w = 10 (human), < 0.5/100w = 0 (ai), linear. output: raw density, all contractions found with positions, score.
count personal pronouns (dimension 6): regex search for first-person (i, me, my, mine, we, us, our, ours) and second-person (you, your, yours, yourself, yourselves) pronouns. count per 100 words. map score: > 3.0/100w = 10 (human), < 1.5/100w = 0 (ai), linear. output: raw density, breakdown by category (i/me/my vs. we/us/our vs. you), score.
calculate question frequency (dimension 7): count sentences ending with "?". divide by total sentence count. map to percentage. map score: 10-25% = 10 (human), 0-5% = 0 (ai), linear. output: raw percentage, total questions, total sentences, score.
count specific data (dimension 8): use regex to match numbers (integers, floats, percentages), dates (yyyy-mm-dd, mm/dd/yyyy, month day year), and proper nouns (capitalized word not at sentence start unless it's the first word of text and 5+ sentences exist). count per 100 words. map score: > 2.0/100w = 10 (human), < 1.0/100w = 0 (ai), linear. output: raw density, breakdown (numbers/dates/names), flagged instances with positions, score.
scan for ai-flagged word list (meta signal, not a dimension): check text against hardcoded list of 37 ai-signature words (leverage, robust, crucial, delve, tapestry, holistic, synergy, paradigm, ecosystem, landscape, streamline, cutting-edge, game-changer, innovative, revolutionary, transformative, comprehensive, meticulous, nuanced, multifaceted, pivotal, seamless, foster, utilize, facilitate, endeavor, underscore, realm, navigate, embark, spearhead, harness, unveil, bolster, cornerstone, unparalleled, groundbreaking). count unique matches. multiply by 3 to get additive bonus to ai signature. output: flagged words with positions, total count, bonus points.
aggregate scores and compute ai signature percentage: sum all 8 dimension scores (0-80 total). divide by 80 to get "human score" percentage. compute ai signature % = 100 - human score %. add ai-flagged word bonus (capped at 15 points max). clamp final ai signature to 0-100. output: all intermediate calculations, final ai signature %, human score.
apply platform thresholds: load thresholds for selected platform: linkedin (warn > 45%, fail > 65%), reddit (warn > 55%, fail > 75%), twitter (warn > 60%, fail > 80%), hackernews (warn > 50%, fail > 70%). compare final ai signature to thresholds. assign status: pass (green checkmark), warn (yellow), fail (red x). output: platform name, threshold values, status, emoji.
rank top 3 fixes by impact: sort by potential reduction impact: (1) contractions gap (if dimension 5 < 5, fix contractions), (2) specific data gap (if dimension 8 < 5, add data), (3) ai-flagged words (if count > 0, replace), (4) hedging gap (if dimension 4 < 5, reduce hedging), (5) transition word gap (if dimension 3 < 5, cut transitions). output: top 3 as numbered list with actionable suggestions.
format and output: if --format text, print human-readable table with bar charts (ascii ██░░) for each dimension, summary line with platform/words/ai-sig/status/threshold. if --format json, output structured object with all scores, flagged words, thresholds, status. exit with code 0 (pass), 1 (warn), or 2 (fail).
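
A few of the steps above can be sketched in code: the raw metrics for sentence length variance and question frequency, the per-100-word density shared by several dimensions, and the final aggregation. This is a sketch under stated assumptions, not the script itself. The sentence splitter here is deliberately naive (no abbreviation handling), population standard deviation is an assumption, and all function names are hypothetical. The bonus cap is a parameter because the two example reports earlier on this page pull in different directions: the PASS report (74.0/80, 10.5%) fits either way, while the FAIL report (22.0/80, 14 flagged words, 100%) is only reproduced with the cap disabled.

```python
import re
import statistics
from typing import List, Sequence


def split_sentences(text: str) -> List[str]:
    """Naive split after ., ! or ? followed by whitespace (no abbreviation handling)."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [part for part in parts if part]


def sentence_length_cv(sentences: Sequence[str]) -> float:
    """Coefficient of variation (std / mean) of sentence lengths in words."""
    lengths = [len(sentence.split()) for sentence in sentences]
    if len(lengths) < 2:
        return float("nan")  # undefined for zero or one sentence
    mean = statistics.mean(lengths)
    # Assumption: population standard deviation.
    return statistics.pstdev(lengths) / mean if mean else float("nan")


def question_percentage(sentences: Sequence[str]) -> float:
    """Share of sentences ending in '?', as a percentage."""
    if not sentences:
        return float("nan")
    questions = sum(1 for sentence in sentences if sentence.rstrip().endswith("?"))
    return 100.0 * questions / len(sentences)


def density_per_100_words(count: int, word_count: int) -> float:
    """Occurrences per 100 words, the unit used by dimensions 3-6 and 8."""
    return 100.0 * count / word_count if word_count else float("nan")


def ai_signature(dimension_scores: Sequence[float], flagged_count: int,
                 bonus_cap: float = 15.0) -> float:
    """AI signature % = 100 - human % + flagged-word bonus, clamped to 0-100."""
    human_pct = 100.0 * sum(dimension_scores) / 80.0
    bonus = min(3.0 * flagged_count, bonus_cap)
    return max(0.0, min(100.0, 100.0 - human_pct + bonus))


if __name__ == "__main__":
    sentences = split_sentences("I shipped it. Did it work? Yes!")
    print(sentence_length_cv(sentences), question_percentage(sentences))
    # Dimension scores from the PASS example, with one flagged word assumed.
    print(ai_signature([10, 10, 10, 10, 10, 5.5, 8.5, 10], flagged_count=1))
```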

decision points

if input is empty or < 10 words: halt with exit code 3, error message "input text must be >= 10 words". do not proceed to analysis.

if platform argument is not in (linkedin, reddit, twitter, hackernews): halt with exit code 3, error message "unsupported platform. choose: linkedin, reddit, twitter, hackernews".

if output format is not in (text, json): default to text format, warn to stderr that format argument was invalid.

if text length is between 10 and 50 words: proceed with analysis but include warning in output that "text is very short; dimension scores may be less reliable".

if any single dimension score is nan or undefined (e.g., no sentences for question frequency calculation): set that dimension to 5.0 (neutral) and log warning to stderr.

if ai signature is at or below the platform's warn threshold (45% for linkedin): output status "pass" (green).

if ai signature is above the warn threshold but at or below the fail threshold (over 45% up to 65% for linkedin): output status "warn" (yellow).

if ai signature is above the platform's fail threshold (over 65% for linkedin): output status "fail" (red).

if no ai-flagged words are found: ai word bonus is 0. output message "no ai-signature words detected".

if all 8 dimensions score 9-10: output encouragement message "strong human signal across all dimensions".

if dimension is < 3: highlight that dimension in red in text output and suggest it as a top fix.
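
The input-validation and undefined-score rules above could look roughly like this. It is a sketch only: the error messages and exit code 3 are taken from the decision points, while the function names and the way scores are passed around are assumptions.

```python
import math
import sys
from typing import Dict, Optional

SUPPORTED_PLATFORMS = ("linkedin", "reddit", "twitter", "hackernews")
EXIT_INPUT_ERROR = 3   # exit code for bad input, per the decision points
NEUTRAL_SCORE = 5.0    # stand-in for a dimension that cannot be computed


def validate(text: str, platform: str) -> Optional[str]:
    """Return an error message, or None when analysis may proceed."""
    if len(text.split()) < 10:
        return "input text must be >= 10 words"
    if platform not in SUPPORTED_PLATFORMS:
        return "unsupported platform. choose: linkedin, reddit, twitter, hackernews"
    return None


def neutralize_undefined(scores: Dict[str, Optional[float]]) -> Dict[str, float]:
    """Replace NaN or missing dimension scores with the neutral 5.0."""
    cleaned = {}
    for name, score in scores.items():
        if score is None or math.isnan(score):
            print(f"warning: {name} undefined, using {NEUTRAL_SCORE}", file=sys.stderr)
            cleaned[name] = NEUTRAL_SCORE
        else:
            cleaned[name] = score
    return cleaned


if __name__ == "__main__":
    error = validate("too short", "linkedin")
    if error:
        print(error, file=sys.stderr)
        sys.exit(EXIT_INPUT_ERROR)
```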

output contract

text format (default)

header: "==================================================================\n phy-content-humanizer-audit , AI Signature Report\n=================================================================="
metadata: platform, word count, ai signature %, status (emoji + pass/warn/fail), human score /80, warn/fail thresholds
dimension table: 8 rows, each with name, ascii bar chart (██░░), numeric score /10, raw metric (e.g., "0.65 TTR" or "2.3/100w")
ai-flagged words section (if any): "🚫 N ai-flagged words: [word1, word2, ...] , replace with plain terms"
top 3 fixes section: "💡 Top 3 Fixes to Lower AI Signature:\n 1. [fix]\n 2. [fix]\n 3. [fix]"
exit code: 0 (pass), 1 (warn), 2 (fail)

json format

root object with keys:
- platform (string)
- word_count (int)
- ai_signature_percent (float, 0-100)
- human_score (float, 0-80)
- status (string: "pass", "warn", "fail")
- thresholds (object: {warn_above, fail_above})
- dimensions (object with 8 keys, each containing {score, raw_metric, flagged_instances})
- ai_flagged_words (array of strings)
- ai_word_bonus_points (int)
- top_3_fixes (array of 3 strings)
- metadata (object: {version, run_timestamp_iso})
exit code: same as text format
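
For readers wiring this into a pipeline, the following sketch assembles a report with the keys listed above and serializes it with the standard `json` module. The `build_report` helper, its parameter names, and the sample values are hypothetical; the version string is the one from the skill's frontmatter.

```python
import json
from datetime import datetime, timezone
from typing import Any, Dict, List


def build_report(platform: str, word_count: int, ai_signature_percent: float,
                 human_score: float, status: str, thresholds: Dict[str, float],
                 dimensions: Dict[str, Dict[str, Any]],
                 ai_flagged_words: List[str], ai_word_bonus_points: int,
                 top_3_fixes: List[str], version: str = "1.0.3") -> Dict[str, Any]:
    """Assemble the root object described in the json output contract."""
    return {
        "platform": platform,
        "word_count": word_count,
        "ai_signature_percent": ai_signature_percent,
        "human_score": human_score,
        "status": status,
        "thresholds": thresholds,
        "dimensions": dimensions,
        "ai_flagged_words": ai_flagged_words,
        "ai_word_bonus_points": ai_word_bonus_points,
        "top_3_fixes": top_3_fixes,
        "metadata": {
            "version": version,
            "run_timestamp_iso": datetime.now(timezone.utc).isoformat(),
        },
    }


if __name__ == "__main__":
    # Sample values for illustration only.
    report = build_report(
        platform="linkedin", word_count=183, ai_signature_percent=10.5,
        human_score=74.0, status="pass",
        thresholds={"warn_above": 45.0, "fail_above": 65.0},
        dimensions={"contraction_usage": {"score": 10.0, "raw_metric": 2.1,
                                          "flagged_instances": []}},
        ai_flagged_words=[], ai_word_bonus_points=0, top_3_fixes=[],
    )
    print(json.dumps(report, indent=2))
```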

file output (if --output flag provided)

writes text or json to specified file path
creates parent directories if missing
exits 0 on success, 3 on file write error

outcome signal

user knows the skill worked when:

skill outputs a numeric ai signature percentage (0-100) with clear pass/warn/fail status for the chosen platform. user sees which of the 8 dimensions are dragging the score down (red bar chart items). user can identify exactly which 3 actions will improve the score most.
if text is a known human post (e.g., personal anecdote with contractions, questions, specific data), ai signature is < 15% and status is green (pass). if text is known ai-generated (e.g., chatgpt post with generic language, no contractions, no questions), ai signature is > 80% and status is red (fail).
user runs skill again after fixing top 3 items and sees ai signature drop by 15-30 percentage points. this confirms the fixes worked.
user integrates skill into ci pipeline with --format json and exit codes, sees automated pre-publish gates working (pull request fails if ai signature > 65% on linkedin, passes otherwise).
json output is valid and parseable by downstream tools (ci systems, dashboards). timestamp and version fields allow audit trail.
skill completes in < 500ms for typical 500-word post (no network calls, pure python).

credits: original skill by Canlah AI. research basis: DivEye (arXiv:2509.18880), LinkedIn 360Brew algorithm analysis, stylometric detection studies.
