Use this skill any time a .docx file is involved -- as input, output, or both. This includes: creating Word documents, reports, letters, memos, or proposals;...
---
# officecli: v1.0.23
name: officecli-docx
description: "Use this skill any time a .docx file is involved -- as input, output, or both. This includes: creating Word documents, reports, letters, memos, or proposals; reading, parsing, or extracting text from any .docx file; editing, modifying, or updating existing documents; working with templates, tracked changes, comments, headers/footers, or tables of contents. Trigger whenever the user mentions 'Word doc', 'document', 'report', 'letter', 'memo', or references a .docx filename."
---
# officecli: v1.0.23
# OfficeCLI DOCX Skill
## BEFORE YOU START (CRITICAL)
**Every time before using officecli, run this check:**
```bash
if ! command -v officecli &> /dev/null; then
echo "Installing officecli..."
curl -fsSL https://raw.githubusercontent.com/iOfficeAI/OfficeCli/main/install.sh -o /tmp/officecli_install.sh && bash /tmp/officecli_install.sh && rm -f /tmp/officecli_install.sh
# Windows: irm https://raw.githubusercontent.com/iOfficeAI/OfficeCli/main/install.ps1 -OutFile "$env:TEMP\officecli_install.ps1"; & "$env:TEMP\officecli_install.ps1"
else
CURRENT=$(officecli --version 2>&1 | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -1)
LATEST=$(curl -fsSL https://api.github.com/repos/iOfficeAI/OfficeCLI/releases/latest | grep '"tag_name"' | sed -E 's/.*"v?([0-9.]+)".*/\1/')
if [ "$CURRENT" != "$LATEST" ]; then
echo "Upgrading officecli $CURRENT → $LATEST..."
curl -fsSL https://raw.githubusercontent.com/iOfficeAI/OfficeCli/main/install.sh -o /tmp/officecli_install.sh && bash /tmp/officecli_install.sh && rm -f /tmp/officecli_install.sh
else
echo "officecli $CURRENT is up to date"
fi
fi
officecli --version
```
---
# officecli: v1.0.23
## Quick Reference
| Task | Action |
|------|--------|
| Read / analyze content | Use `view` and `get` commands below |
| Edit existing document | Read [editing.md](editing.md) |
| Create from scratch | Read [creating.md](creating.md) |
---
# officecli: v1.0.23
## Reading & Analyzing
### Text Extraction
```bash
officecli view doc.docx text
officecli view doc.docx text --max-lines 200
officecli view doc.docx text --start 1 --end 50
```
`text` mode shows `[/body/p[N]] text content`, tables as `[Table: N rows]`, equations as readable math. Use `--max-lines` or `--start`/`--end` for large documents to avoid dumping entire content.
### Structure Overview
```bash
officecli view doc.docx outline
```
Output shows file stats (paragraph count, tables, images, equations), watermark presence, headers/footers, and heading hierarchy tree.
### Detailed Inspection
```bash
officecli view doc.docx annotated
```
Shows style, font, size, bold/italic per run; equations as LaTeX; images with alt text. Empty paragraphs shown as `[] <-- empty paragraph`.
### Statistics
```bash
officecli view doc.docx stats
```
Paragraph count, style distribution, font usage, font size usage, empty paragraph count.
### Element Inspection
```bash
# Document root (metadata, page setup)
officecli get doc.docx /
# List body children at depth 1
officecli get doc.docx /body --depth 1
# Specific paragraph
officecli get doc.docx "/body/p[1]"
# Specific run
officecli get doc.docx "/body/p[1]/r[1]"
# Table structure
officecli get doc.docx "/body/tbl[1]" --depth 3
# Style definitions
officecli get doc.docx /styles
# Specific style
officecli get doc.docx "/styles/Heading1"
# Header/footer
officecli get doc.docx "/header[1]"
officecli get doc.docx "/footer[1]"
# Numbering definitions
officecli get doc.docx /numbering
# JSON output for scripting
officecli get doc.docx "/body/p[1]" --json
```
### CSS-like Queries
```bash
# Find paragraphs by style
officecli query doc.docx 'paragraph[style=Heading1]'
# Find paragraphs containing text
officecli query doc.docx 'p:contains("quarterly")'
# Find empty paragraphs
officecli query doc.docx 'p:empty'
# Find images without alt text
officecli query doc.docx 'image:no-alt'
# Find bold runs in centered paragraphs
officecli query doc.docx 'p[alignment=center] > r[bold=true]'
# Find large text
officecli query doc.docx 'paragraph[size>=24pt]'
# Find fields by type
officecli query doc.docx 'field[fieldType!=page]'
```
---
# officecli: v1.0.23
## Design Principles
**Professional documents need clear structure and consistent formatting.**
### Document Structure
Every document needs clear hierarchy -- title, headings, subheadings, body text. Don't create a wall of unstyled Normal paragraphs.
### Typography
Choose a readable body font (Calibri, Cambria, Georgia, Times New Roman). Keep body at 11-12pt. Headings should step up: H1=16-18pt bold, H2=14pt bold, H3=12pt bold.
### Spacing
Use paragraph spacing (`spaceBefore`/`spaceAfter`) instead of empty paragraphs. Line spacing of 1.15x-1.5x for body text.
### Page Setup
Always set margins explicitly. US Letter default: `pageWidth=12240`, `pageHeight=15840`, margins=1440 (1 inch).
### Headers & Footers
Professional documents include page numbers at minimum. Consider company name in header, page X of Y in footer.
### Table Design
Alternate row shading for readability. Header row with contrasting background. Consistent cell padding.
### Color Usage
Use color sparingly in documents -- accent color for headings or table headers, not rainbow formatting.
### Content-to-Element Mapping
| Content Type | Recommended Element(s) | Why |
|---|---|---|
| Sequential items | Bulleted list (`listStyle=bullet`) | Scanning is faster than inline commas |
| Step-by-step process | Numbered list (`listStyle=numbered`) | Numbers communicate order |
| Comparative data | Table with header row | Columns enable side-by-side comparison |
| Trend data | Embedded chart (`chartType=line/column`) | Visual pattern recognition |
| Key definition | Hanging indent paragraph | Offset term from definition |
| Legal/contract clause | Numbered list with bookmarks | Cross-referencing via bookmarks |
| Mathematical content | Equation element (`formula=LaTeX`) | Proper OMML rendering |
| Citation/reference | Footnote or endnote | Keeps body text clean |
| Pull quote / callout | Paragraph with border + shading | Visual distinction from body |
| Multi-section layout | Section breaks with columns | Column control per section |
---
# officecli: v1.0.23
## QA (Required)
**Assume there are problems. Your job is to find them.**
### Issue Detection
```bash
# Check for formatting, content, and structure issues automatically
officecli view doc.docx issues
# Filter by issue type
officecli view doc.docx issues --type format
officecli view doc.docx issues --type content
officecli view doc.docx issues --type structure
```
### Content QA
```bash
# Extract all text, check for missing content, typos, wrong order
officecli view doc.docx text
# Check structure
officecli view doc.docx outline
# Check for empty paragraphs (common clutter)
officecli query doc.docx 'p:empty'
# Check for images without alt text
officecli query doc.docx 'image:no-alt'
```
When editing templates, check for leftover placeholder text:
```bash
officecli query doc.docx 'p:contains("lorem")'
officecli query doc.docx 'p:contains("xxxx")'
officecli query doc.docx 'p:contains("placeholder")'
```
### Validation
```bash
officecli validate doc.docx
```
### Pre-Delivery Checklist
- [ ] Metadata set (title, author)
- [ ] Page numbers present (field in header or footer)
- [ ] Heading hierarchy is correct (no skipped levels, e.g., H1 -> H3)
- [ ] No empty paragraphs used as spacing (use spaceBefore/spaceAfter instead)
- [ ] All images have alt text
- [ ] Tables have header rows
- [ ] TOC present if document has 3+ headings
- [ ] Document validates with `officecli validate`
- [ ] No placeholder text remaining
### Verification Loop
1. Generate document
2. Run `view issues` + `view outline` + `view text` + `validate`
3. List issues found (if none found, look again more critically)
4. Fix issues
5. Re-verify -- one fix often creates another problem
6. Repeat until a full pass reveals no new issues
**Do not declare success until you've completed at least one fix-and-verify cycle.**
**NOTE**: Unlike pptx, there is no visual preview mode (`view svg`/`view html`) for docx. Content verification relies on `view text`, `view annotated`, `view outline`, `view issues`, and `validate`. For visual verification, the user must open the file in Word.
**QA display notes:**
- `view text` shows "1." for ALL numbered list items regardless of their actual rendered number. This is a display limitation -- the actual document renders correct auto-incrementing numbers (1, 2, 3...) in Word and LibreOffice. Do not treat this as a defect.
- `view issues` flags "body paragraph missing first-line indent" on centered paragraphs, list items, bibliography entries, and other intentionally non-indented content. These are expected for block-style formatting and can be ignored when the paragraph has explicit `spaceAfter`, `listStyle`, `alignment=center`, or `hangingIndent`.
---
# officecli: v1.0.23
## Common Pitfalls
| Pitfall | Correct Approach |
|---------|-----------------|
| `--name "foo"` | Use `--prop name="foo"` -- all attributes go through `--prop` |
| Guessing property names | Run `officecli docx set paragraph` to see exact names |
| `\n` in shell strings | Use `\\n` for newlines in `--prop text="line1\\nline2"` |
| Modifying an open file | Close the file in Word first |
| Hex colors with `#` | Use `FF0000` not `#FF0000` -- no hash prefix |
| Paths are 1-based | `/body/p[1]`, `/body/tbl[1]` -- XPath convention |
| `--index` is 0-based | `--index 0` = first position -- array convention |
| Unquoted `[N]` in zsh/bash | Shell glob-expands `/body/p[1]` -- always quote paths: `"/body/p[1]"` |
| Spacing in raw numbers | Use unit-qualified values: `'12pt'`, `'0.5cm'`, `'1.5x'` not raw twips |
| Empty paragraphs for spacing | Use `spaceBefore`/`spaceAfter` properties on paragraphs |
| `$` and `'` in batch JSON | Use heredoc: `cat <<'EOF' \| officecli batch` -- single-quoted delimiter prevents shell expansion |
| Wrong border format | Use `style;size;color;space` format: `single;4;FF0000;1` |
| listStyle on run instead of paragraph | `listStyle` is a paragraph property, not a run property |
| Row-level bold/color/shd | Row `set` only supports `height`, `header`, and `c1/c2/c3` text shortcuts. Use cell-level `set` for formatting (bold, shd, color, font) |
| Section vs root property names | Section uses `pagewidth`/`pageheight` (lowercase). Document root uses `pageWidth`/`pageHeight` (camelCase) |
---
# officecli: v1.0.23
## Performance: Resident Mode
For multi-step workflows (3+ commands on the same file), use `open`/`close`:
```bash
officecli open doc.docx
officecli add doc.docx ...
officecli set doc.docx ...
officecli close doc.docx
```
## Performance: Batch Mode
Execute multiple operations in a single open/save cycle:
```bash
cat <<'EOF' | officecli batch doc.docx
[
{"command":"add","parent":"/body","type":"paragraph","props":{"text":"Introduction","style":"Heading1"}},
{"command":"add","parent":"/body","type":"paragraph","props":{"text":"This report covers Q4 results.","font":"Calibri","size":"11pt"}}
]
EOF
```
Batch supports: `add`, `set`, `get`, `query`, `remove`, `move`, `view`, `raw`, `raw-set`, `validate`.
Batch fields: `command`, `path`, `parent`, `type`, `from`, `to`, `index`, `props` (dict), `selector`, `mode`, `depth`, `part`, `xpath`, `action`, `xml`.
`parent` = container to add into (for `add`). `path` = element to modify (for `set`, `get`, `remove`, `move`).
---
# officecli: v1.0.23
## Known Issues
| Issue | Workaround |
|---|---|
| **No visual preview** | Unlike pptx (SVG/HTML), docx has no built-in rendering. Use `view text`/`view outline`/`view annotated`/`view issues` for verification. Users must open in Word for visual check. |
| **Track changes creation requires raw XML** | OfficeCLI can accept/reject tracked changes (`set / --prop accept-changes=all`) but cannot create tracked changes (insertions/deletions with author markup) via high-level commands. Use `raw-set` with XML for tracked change creation. |
| **Tab stops may require raw XML** | Tab stop creation is not exposed in officecli docx high-level commands. Use `raw-set` to add tab stop definitions in paragraph properties. |
| **Chart series cannot be added after creation** | Same as pptx: `set --prop data=` can only update existing series, not add new ones. Delete and recreate the chart with all series in the `add` command. |
| **Complex numbering definitions** | `listStyle=bullet/numbered` covers simple cases. For multi-level lists with custom formatting, use `numId`/`numLevel` properties. Creating new numbering definitions may require understanding the numbering part. |
| **Shell quoting in batch with echo** | `echo '...' \| officecli batch` fails when JSON values contain apostrophes or `$`. Use heredoc: `cat <<'EOF' \| officecli batch doc.docx`. |
| **Batch intermittent failure** | Approximately 1-in-15 batch operations may fail with "Failed to send to resident" when using batch+resident mode. Retry the command, or close/reopen the file. Split large batch arrays into 10-15 operation chunks. |
| **Table-level `padding` produces invalid XML** | Do not use `set tbl[N] --prop padding=N`. It creates invalid `tblCellMar`. Use cell-level `padding.top`/`padding.bottom` instead. If already applied, remove with `raw-set --xpath "//w:tbl[N]/w:tblPr/w:tblCellMar" --action remove`. |
| **Internal hyperlinks not supported** | The `hyperlink` command only accepts absolute URIs (`https://...`). Fragment URLs (`#bookmark`) are rejected. For internal cross-references, use descriptive text or `raw-set` with `<w:hyperlink w:anchor="bookmarkName">`. |
| **Table `--index` positioning unreliable** | `--index N` on `add /body --type table` may be ignored (table appends to end). `move` also may not work for tables. Workaround: add content in the desired order, or remove/re-add surrounding elements. |
| **`\mathcal` in equations causes validation errors** | The `\mathcal` LaTeX command generates invalid `m:scr` XML. Use `\mathit` or plain letters instead. |
| **`view text` shows "1." for all numbered items** | Display-only limitation. Rendered output in Word/LibreOffice shows correct auto-incrementing numbers. |
---
# officecli: v1.0.23
## Help System
**When unsure about property names, value formats, or command syntax, run help instead of guessing.** One help query is faster than guess-fail-retry loops.
```bash
officecli docx set # All settable elements and their properties
officecli docx set paragraph # Paragraph properties in detail
officecli docx set table # Table properties
officecli docx add # All addable element types
officecli docx view # All view modes
officecli docx get # All navigable paths
officecli docx query # Query selector syntax
```
don't have the plugin yet? install it then click "run inline in claude" again.
restructured original procedural reference into implexa 6-component format, extracted decision logic into explicit if-else branches, clarified external dependencies (officecli binary only), added edge cases (file locking, batch retry, track changes xml), preserved all original commands and design principles, added validation loop requirement and qa display limitations.
use this skill to manipulate, inspect, validate, and generate .docx files via command line. trigger whenever the user mentions word doc, document, report, letter, memo, or references a .docx filename. covers the full lifecycle: creating documents from scratch, reading and parsing existing content, editing and modifying structure or formatting, extracting text or metadata, and validating output before delivery. this skill is your interface to docx when you can't or won't use the Microsoft Office UI.
officecli binary: required. the skill auto-installs on first run via bash/powershell script from https://github.com/iOfficeAI/OfficeCli. version 1.0.23 or later required.
input docx file: optional. path to an existing .docx file to read, edit, or validate. if not provided, you'll create a new document from scratch.
external connections: none required. officecli is a local CLI tool with no network dependencies after installation.
setup checklist:
officecli --version)run the pre-flight check. if officecli is missing, the script installs it. if installed but outdated, the script upgrades.
if ! command -v officecli &> /dev/null; then
echo "Installing officecli..."
curl -fsSL https://raw.githubusercontent.com/iOfficeAI/OfficeCli/main/install.sh -o /tmp/officecli_install.sh && bash /tmp/officecli_install.sh && rm -f /tmp/officecli_install.sh
else
CURRENT=$(officecli --version 2>&1 | grep -oE '[0-9]+\.[0-9]+\.[0-9]+' | head -1)
LATEST=$(curl -fsSL https://api.github.com/repos/iOfficeAI/OfficeCLI/releases/latest | grep '"tag_name"' | sed -E 's/.*"v?([0-9.]+)".*/\1/')
if [ "$CURRENT" != "$LATEST" ]; then
echo "Upgrading officecli $CURRENT to $LATEST..."
curl -fsSL https://raw.githubusercontent.com/iOfficeAI/OfficeCli/main/install.sh -o /tmp/officecli_install.sh && bash /tmp/officecli_install.sh && rm -f /tmp/officecli_install.sh
fi
fi
officecli --version
output: confirmed version string printed to stdout (e.g., "officecli v1.0.23").
if reading/editing an existing .docx:
officecli view doc.docx outline to map structure (headings, section count, images, tables)officecli view doc.docx text to extract all body text; use --max-lines 200 to cap output for large docsofficecli view doc.docx annotated to see style, font, size, bold/italic per runofficecli view doc.docx stats to get paragraph/style/font distributionofficecli get doc.docx / to inspect root metadata (page setup, margins, title, author)if creating a new .docx:
output: text content, outline tree, metadata dict (from get), or stats summary printed to stdout.
for existing documents: use officecli set to modify properties of existing elements.
example: set paragraph style, font, spacing, alignment, color, bold/italic on a run.
# set paragraph style to Heading1
officecli set doc.docx "/body/p[1]" --prop style="Heading1"
# set run font and size
officecli set doc.docx "/body/p[1]/r[1]" --prop font="Calibri" size="12pt"
# set paragraph spacing and alignment
officecli set doc.docx "/body/p[2]" --prop spaceBefore="240" spaceAfter="120" alignment="center"
for new content: use officecli add to insert elements into the document.
# add paragraph to body
officecli add doc.docx /body --type paragraph --prop text="Chapter 1" style="Heading1"
# add table with 3 rows, 2 columns
officecli add doc.docx /body --type table --prop rows=3 cols=2
# add bulleted list
officecli add doc.docx /body --type paragraph --prop text="First item" listStyle="bullet"
officecli add doc.docx /body --type paragraph --prop text="Second item" listStyle="bullet"
for bulk operations: use batch mode (resident mode on same file with open/close).
# resident mode: open file, run multiple commands, close file
officecli open doc.docx
officecli add doc.docx /body --type paragraph --prop text="Introduction" style="Heading1"
officecli set doc.docx "/body/p[1]" --prop font="Calibri"
officecli close doc.docx
or batch json:
cat <<'EOF' | officecli batch doc.docx
[
{"command":"add","parent":"/body","type":"paragraph","props":{"text":"Introduction","style":"Heading1"}},
{"command":"add","parent":"/body","type":"paragraph","props":{"text":"Report content.","font":"Calibri","size":"11pt"}}
]
EOF
output: modified .docx file on disk at the original path. stdout confirms operation success or lists errors.
use css-like selectors to find specific content or formatting.
# find paragraphs by style
officecli query doc.docx 'paragraph[style=Heading1]'
# find paragraphs containing text
officecli query doc.docx 'p:contains("quarterly")'
# find empty paragraphs
officecli query doc.docx 'p:empty'
# find images without alt text
officecli query doc.docx 'image:no-alt'
# find bold runs in centered paragraphs
officecli query doc.docx 'p[alignment=center] > r[bold=true]'
# find paragraphs with font size >= 24pt
officecli query doc.docx 'paragraph[size>=24pt]'
# find fields by type (e.g., not page number)
officecli query doc.docx 'field[fieldType!=page]'
output: list of XPath results (element paths) printed to stdout.
run validation to catch formatting, structure, and content issues.
# check for all issue types
officecli view doc.docx issues
# filter by issue type
officecli view doc.docx issues --type format
officecli view doc.docx issues --type content
officecli view doc.docx issues --type structure
# run full validation
officecli validate doc.docx
output: list of issues (or "No issues found") printed to stdout. each issue includes path, type, and description.
view issues + view outline + view text + validate in step 5set/add/remove/raw-set commandsoutput: clean validation report. no new issues after fix.
the .docx file is ready when:
query doc.docx 'p:contains("lorem")' to verify)query doc.docx 'image:no-alt' to verify)spaceBefore/spaceAfter instead)user can download or copy the .docx file from the output location.
if the user provides an existing .docx file: read and parse it first (step 2). inspect structure before making any edits.
if the user needs to create a document from scratch: skip step 2 read phase. go directly to step 3 with add commands to build structure. apply design principles during add (choose fonts, sizes, spacing upfront).
if the user wants to bulk-update many elements: use batch mode (resident open/close or batch json) instead of running individual commands. batch is 10-50x faster and prevents cumulative file locking issues.
if validation finds issues in step 5: never skip the fix-verify loop. run at least one cycle of fix + re-validation. issues often cascade (fixing one creates another). re-validate after each fix.
if the document is open in Microsoft Word or LibreOffice: close it before running officecli commands. file locking will cause "permission denied" errors. inform the user to close the file if they get a lock error.
if the docx contains track changes, comments, or footnotes: use view annotated to see author markup. use set --prop accept-changes=all to accept all tracked changes. for creating tracked changes, use raw-set with XML (high-level commands do not support track creation).
if the user needs complex numbering (multi-level lists with custom formatting): use numId/numLevel properties on paragraphs instead of listStyle=bullet/numbered. listStyle covers simple cases only.
if an element property is unknown or you're unsure of the value format: run officecli docx set <element-type> (e.g., officecli docx set paragraph) to print all valid properties and their formats. guessing property names causes commands to fail silently.
if the shell is zsh or bash: always quote XPath arguments to prevent glob expansion. "/body/p[1]" not /body/p[1].
if batch mode fails with "Failed to send to resident": retry the command. approximately 1-in-15 batch operations may fail. if it persists, close/reopen the file or split large batch arrays into 10-15 operation chunks.
if the user needs visual verification: officecli has no svg/html preview mode (unlike pptx). visual checks require opening the file in Word, LibreOffice, or Google Docs. use view text, view annotated, view outline, view issues, and validate for text-based verification.
success criteria:
file format: .docx (Office Open XML, zipped XML structure compliant with ECMA-376 standard).
file location: path specified by user (or current working directory). file is overwritten if it already exists.
metadata fields populated:
dc:title (document title)dc:creator (author)dcterms:created (timestamp)dcterms:modified (timestamp)data format of validation output: plain text list with element path, issue type (format/content/structure), and human-readable description. example:
Issues found in doc.docx:
/body/p[5]: content - contains placeholder text "XXXX"
/body/tbl[1]: format - table header row missing style
/body/p[12]: structure - empty paragraph (use spaceBefore/spaceAfter instead)
the user knows the skill worked when:
officecli commands run without errors. stdout shows operation success (e.g., "Added paragraph to /body", "Set Heading1 style on /body/p[1]"). no "command not found", "permission denied", or "invalid argument" messages.
the .docx file exists and has a non-zero size. check with ls -lh doc.docx or equivalent. file size should grow as content is added.
validation passes. officecli validate doc.docx returns zero exit code. officecli view doc.docx issues shows no issues (or only expected/intentional issues).
content is correct. officecli view doc.docx text shows all user-requested body text in the correct order. officecli view doc.docx outline shows correct heading hierarchy and section count. officecli view doc.docx annotated shows correct font, size, bold/italic, style on each run.
the file opens in Word/LibreOffice without errors. user opens the .docx in Microsoft Word, LibreOffice, Google Docs, or equivalent and confirms:
query results are non-empty when content exists. officecli query doc.docx 'p:contains("text")' returns matching XPaths if that text was added. officecli query doc.docx 'image:no-alt' returns empty list if all images have alt text.
help commands return expected syntax. officecli docx set paragraph prints all available paragraph properties. this confirms the help system is working (useful for diagnosing unknown property names).