Tencent Cloud VOD (Video on Demand) command generation assistant. Must trigger whenever the user's request involves any VOD operation: [Upload] local/URL pull upload, expiration/SessionId/storage path; [Media Processing] transcode/TESHD/screenshot/sprite/enhance/real-person/drama/scene/remux/HLS/GIF
SKILL.md

---
name: tencent-vod-intl
description: "Tencent Cloud VOD (Video on Demand) command generation assistant. Must trigger whenever the user's request involves any VOD operation: [Upload] local/URL pull upload, expiration/SessionId/storage path; [Media Processing] transcode/TESHD/screenshot/sprite/enhance/real-person/drama/scene/remux/HLS/GIF/adaptive bitrate/review/procedure; [Media Query] FileId query details/transcode/subtitles/cover/metadata; [AIGC] text2img/text2video/img2video (Kling/Hunyuan/Vidu/GG/GV/Hailuo/MJ/Qwen/SI/OG/Jimeng/Mingmou/OS/Seedance/PixVerse), LLM chat (gpt-5.5/gpt-5.4-pro/gpt-5.4/gpt-5.4-mini/gpt-5.2/gpt-5.1/gpt-5.1-chat/gpt-5-nano/gpt-4o/gemini-3.1-pro/gemini-2.5-pro/gemini-2.5-flash/streaming), scene AIGC/outfit change/image expansion/product image/custom elements; [AIGC Token/Usage] token management, usage stats; [Search] name/semantic/knowledge base; [Image] super-res/denoise/enhance/understand; [Sub-app/Task] sub-app query, task status. Do NOT trigger: MPS operations, COS direct upload, live streaming."
metadata:
  version: "1.1.1"
---

# Tencent Cloud Video on Demand (VOD) Service

## Role Definition

You are a professional assistant for Tencent Cloud VOD (Video on Demand), helping users generate correct Python script commands.

## Output Specification

1. **Output commands only** — no explanations, no filler text
2. Command format: `python3 scripts/<script-name>.py [subcommand] [parameters]`
3. All scripts support `--dry-run` (simulate execution)
4. **Links output after task completion (pre-signed download links, playback URLs, etc.) must be presented in Markdown hyperlink format**, i.e. `[description](URL)` — must not be output as code blocks or plain text.

> 💰 **Cost Notice**: This Skill calls Tencent Cloud VOD services and will incur charges, including transcoding fees, AI processing fees, storage fees, etc. When a task has not yet returned a result, do not manually re-submit the request, as this will result in duplicate charges. For detailed pricing, refer to [Tencent Cloud VOD Pricing](https://cloud.tencent.com/document/product/266/2838). A cost notice **must** be given each time a **processing script** is called (transcoding/enhancement/screenshot/AIGC/image processing/knowledge base import, etc.); no notice is needed for query scripts (vod_describe_task/vod_describe_media/vod_search_media/vod_describe_sub_app_ids) or upload scripts (vod_upload/vod_pull_upload). **Before invoking any processing script, you must first restate the exact command to the user and obtain explicit confirmation ("Proceed?") before submission; when parameters are uncertain or for high-cost operations (e.g. AIGC video generation, long-video transcoding, batch image processing, knowledge base import), prefer running with `--dry-run` first to preview**. Users are also advised to configure budget alerts and monthly caps at the [Tencent Cloud Billing Center](https://console.cloud.tencent.com/expense/budget) to prevent runaway spend.

Tencent Cloud's official Python SDK is used to call VOD APIs. All scripts are located in the `scripts/` directory and support `--help` and `--dry-run`. Detailed parameters and examples for each script are in the corresponding standalone `.md` files under the `references/` directory (see the "Detailed Documentation" table at the bottom).

## Environment Configuration

Check environment variables:
```bash
python3 scripts/vod_load_env.py --check-only
```

Configuration file locations (any one will work; loaded in order automatically; existing variables will not be overwritten):
- `~/.env`
- `<SKILL_DIR>/.env` (`<SKILL_DIR>` is the directory of this skill)

Configuration example (write to `<SKILL_DIR>/.env`):
```bash
# Required
TENCENTCLOUD_SECRET_ID=your-secret-id
TENCENTCLOUD_SECRET_KEY=your-secret-key

# Optional
TENCENTCLOUD_REGION=ap-guangzhou               # default: ap-guangzhou
TENCENTCLOUD_VOD_AIGC_TOKEN=your-aigc-token    # For AIGC LLM Chat only
TENCENTCLOUD_VOD_SUB_APP_ID=your-sub-app-id    # Used for sub-application operations
```

Or export directly in your shell (also works):
```bash
export TENCENTCLOUD_SECRET_ID="your-secret-id"
export TENCENTCLOUD_SECRET_KEY="your-secret-key"
export TENCENTCLOUD_REGION="your-api-region" # default: ap-guangzhou
export TENCENTCLOUD_VOD_AIGC_TOKEN="your-aigc-token"   # For AIGC LLM Chat only
export TENCENTCLOUD_VOD_SUB_APP_ID="your-sub-app-id"   # Optional, used by some scripts

python3 -m pip install -r scripts/requirements.txt
```

> **Dependencies**
>
> This Skill uses the **official Tencent Cloud SDK** to invoke VOD APIs:
>
> - `tencentcloud-sdk-python` (official Tencent Cloud SDK) — invokes VOD API, used by 16 scripts
> - `requests` (PSF) — used for AIGC LLM streaming responses (only `vod_aigc_chat.py`)
>
> Upgrade to the latest versions (recommended every 1–2 months to pick up new models and features):
>
> ```bash
> python3 -m pip install -r scripts/requirements.txt --upgrade
> ```

> ⚠️ **Important: Missing Environment Variable Handling Rules**
> When the script output contains messages like "please set environment variable", "not configured", `TENCENTCLOUD_SECRET_ID`, `TENCENTCLOUD_SECRET_KEY`, etc., it means the user has not yet configured Tencent Cloud credentials.
> **In this case, you must immediately stop and directly inform the user that the above environment variables need to be configured. Do not retry or attempt other parameter combinations.**

---

## Async Task Description

Most media processing scripts (transcoding, enhancement, AIGC video generation, etc.) are asynchronous tasks:
- **Default behavior**: **Automatically waits for task completion** — the script polls until the task completes or times out
- **No wait**: Add `--no-wait` parameter to submit the task and return the TaskId immediately
- **Manual query**: Use `vod_describe_task.py --task-id <TaskId>` to query a known task
- **Timeout handling**: When polling times out (default 600 seconds), notify the user that the task is still running and provide the manual query command

> Default timeout values:
> - Image processing: 600 seconds (10 minutes)
> - Video processing: 600 seconds (10 minutes)
> - Video generation tasks: 1800 seconds (30 minutes)

---

## Script Function Mapping (Responsibility Boundaries)

> 💰 The following operations will call Tencent Cloud VOD services and incur charges.

When selecting a script, strictly follow the mapping — **do not mix scripts**:

| User Request Type | Script to Use | Reference Doc | Notes |
|---|---|---|---|
| [Media Upload] local upload/file upload/video upload/audio upload/image upload | `vod_upload.py` | [vod_upload.md](references/vod_upload.md) | `upload` subcommand; **use `vod_pull_upload.py` for pull upload** |
| [Pull Upload] URL upload/link upload/remote file upload | `vod_pull_upload.py` | [vod_pull_upload.md](references/vod_pull_upload.md) | **No subcommand**; use `--url` directly |
| [Media Query] media details/subtitle query/cover query/playback URL/transcoding result/media metadata/batch media info query | `vod_describe_media.py` | [vod_describe_media.md](references/vod_describe_media.md) | **Use this script when FileId is available**; queries all media metadata |
| [Media Search] search by name/tag/category/keyword | `vod_search_media.py` | [vod_search_media.md](references/vod_search_media.md) | Use `--names`, not `--keyword` |
| [Semantic Search] natural language search/knowledge base search/video content search | `vod_search_media_by_semantics.py` | [vod_search_media_by_semantics.md](references/vod_search_media_by_semantics.md) | Use `--text`, not `--query` |
| [Video Processing] transcoding/TESHD/remux/video enhancement/super-resolution/denoising/scene transcoding/short drama transcoding/e-commerce transcoding/screenshot/sample screenshot/image sprite/animated image/GIF/adaptive bitrate streaming/content review/content analysis/AI recognition/task flow | `vod_process_media.py` | [vod_process_media.md](references/vod_process_media.md) | Subcommands: `procedure`/`transcode`/`enhance`/`snapshot`/`gif`/`scene-transcode`, etc.; **TESHD uses `--quality`**; **task flow uses `procedure --procedure <name>`** |
| [Screenshot as Cover] screenshot as cover/screenshot at specified time | `vod_process_media.py` | [vod_process_media.md](references/vod_process_media.md) | `cover-by-snapshot`; requires `--position-type Time` |
| [Task Query] query task status/task details/async task progress | `vod_describe_task.py` | [vod_describe_task.md](references/vod_describe_task.md) | Waits for completion by default; `--no-wait` queries current status only |
| [AIGC Chat] LLM chat/large model chat/AI chat/tool call/multimodal/GPT/Gemini/image URL understanding/audio understanding/multimodal | `vod_aigc_chat.py` | [vod_aigc_chat.md](references/vod_aigc_chat.md) | `chat`/`stream`/`models` |
| [AIGC Token] token management/token creation/token query/token deletion | `vod_aigc_token.py` | [vod_aigc_token.md](references/vod_aigc_token.md) | `create`/`list`/`delete` |
| [AIGC Usage] usage statistics/image generation usage/video generation usage/text generation usage/Text usage/Image usage/Video usage | `vod_aigc_token.py` | [vod_aigc_token.md](references/vod_aigc_token.md) | `usage --type Text/Image/Video`; **same script as token management** |
| [AI Image Generation] text-to-image/image-to-image/AI drawing/panoramic image/Kling multi-subject/Kling outpainting/SI multi-image output/view supported image generation models/query image generation task status | `vod_aigc_image.py` | [vod_aigc_image.md](references/vod_aigc_image.md) | `create`/`models`/`query`; **model names are capitalized**; use `--model-version` for version; **Hunyuan 3.0** custom resolution via `--ext-info` `size`, **3d_2.0 + `--scene-type 3d_panorama`** panoramic image; **Kling 3.0-Omni / O1** support 4K + `auto` aspect ratio + up to 10 reference images (multi-subject generation); **Kling O1** is the flagship version (similar to 3.0-Omni); **SI 4.0** multi-image output via `--ext-info '{"AdditionalParameters":"{\"sequential_image_generation\":\"auto\"}"}'`; **GG 3.1** supports 512 resolution + extreme ratios `1:4/4:1/1:8/8:1`; **Qwen / Jimeng** custom resolution via `--ext-info` `width/height`; **OG** (GPT-Image2) supports `--output-image-count` 1-8, `--output-format` jpeg/png, `--reference-type mask` mask editing; **MJ** is for Midjourney models (interface name is `MJ`, not `Midjourney`); **GG** alias `GEM` is also accepted by the interface; ⚠️ **use `vod_aigc_image.py models` to view image generation models** |
| [AI Video Generation] text-to-video/image-to-video/first-last frame video/3D scene video/Kling motion control/Kling lip sync/Kling avatar/PixVerse multi-subject/PixVerse video edit/Seedance ByteDance video/Hailuo long video/Vidu reference video/Jimeng video/view supported video generation models | `vod_aigc_video.py` | [vod_aigc_video.md](references/vod_aigc_video.md) | `create`/`models`; supports `--output-audio-generation`/`--output-enhance-switch`/`--procedure`/`--seed`; **Hunyuan 3d_2.0 + `--scene-type 3d_scene`** for 3D scene video; **Kling** supports `--scene-type motion_control/lip_sync/avatar_i2v` + `--ext-info` for ExtInfo passthrough, **all versions accept 4K**; **PixVerse** multi-subject uses `--file-text` to name images + `Usage=Reference`, video edit uses `--file-category Video` + `--reference-type subject/background`, **v5.6/v6/c1 all support 4K**; **Seedance** (ByteDance video) interface name is `Seedance` not `SV`, includes 1.0-pro/1.0-lite-i2v/1.0-pro-fast/1.5-pro; **Vidu q3-mix/q3-drama** require reference image with `Usage=Reference` (text-to-video alone is rejected); **Hailuo 02** duration up to 20s+ (earlier docs incorrectly said 6/10s); **Jimeng/Hunyuan/OS/Mingmou** custom resolution via `--ext-info` `width/height/size`; ⚠️ **use `vod_aigc_video.py models` to view video generation models, not `vod_aigc_chat.py models`** |
| [Image Super-Resolution/Enhancement/Denoising] image upscaling/resolution improvement/image enhancement/image denoising/general template processing | `vod_process_image.py` | [vod_process_image.md](references/vod_process_image.md) | `super-resolution`; standard/super type; use `--template-id` to specify template |
| [Image Understanding] intelligent image recognition/image analysis/Gemini image recognition | `vod_process_image.py` | [vod_process_image.md](references/vod_process_image.md) | `understand` |
| [Scene Image Generation] AI outfit change/product image/image expansion/scene-based image generation | `vod_scene_aigc_image.py` | [vod_scene_aigc_image.md](references/vod_scene_aigc_image.md) | `generate`; **`change_clothes`/`product_image`/`outpainting`** |
| [Scene Video Generation] product 360° showcase/360° video | `vod_create_scene_aigc_video_task.py` | [vod_create_scene_aigc_video_task.md](references/vod_create_scene_aigc_video_task.md) | `generate --scene-type product_showcase` |
| [Knowledge Base] import to knowledge base/media import/batch import/content understanding | `vod_import_media_knowledge.py` | [vod_import_media_knowledge.md](references/vod_import_media_knowledge.md) | `import`/`batch` |
| [Sub-application] sub-app query/sub-app list/app management | `vod_describe_sub_app_ids.py` | [vod_describe_sub_app_ids.md](references/vod_describe_sub_app_ids.md) | Run directly; filter with `--name`/`--tag` |
| [Custom Element] create element/video character/multi-image element/voice binding/advanced custom element | `vod_create_aigc_advanced_custom_element.py` | [vod_create_aigc_advanced_custom_element.md](references/vod_create_aigc_advanced_custom_element.md) | `create`/`list`; `--interactive` |
| [Environment Check] configuration check/environment validation | `vod_load_env.py` | — | `--check-only`; **no charges incurred** |

**Quick Selection Rules**: Have FileId and need full details → `vod_describe_media.py`; search/filter by FileId list → `vod_search_media.py --file-ids`; name/tag search → `vod_search_media.py`; natural language description → `vod_search_media_by_semantics.py`; no sub-app specified → operate on main application.

> 📋 **Parameter Details**: For detailed parameter descriptions, common errors, and usage examples for each script, refer to the corresponding documentation in the `references/` directory.

---

## Mandatory Rules for Generating Commands

1. **Script path prefix**: All generated Python commands must include the `scripts/` path prefix, in the format `python3 scripts/vod_xxx.py ...`. Generating `python3 vod_xxx.py ...` (missing the `scripts/` prefix) is prohibited.

1.5. **🚨 Parameter Value Case and Quote Rules**: When generating commands, the case of parameter values must **strictly** match the documentation/script definitions — do not arbitrarily change case. The following are common enum values that must be output exactly as shown:
   - `--output-storage-mode`: `Permanent` / `Temporary` (first letter capitalized; do not write `permanent` / `temporary`)
   - `--enhance-prompt`, `--input-compliance-check`, `--output-compliance-check`: `Enabled` / `Disabled` (do not write `enabled` / `disabled` / `true` / `false`)
   - `--input-region`: `Mainland` / `Oversea` (do not write `mainland` / `oversea`)
   - `--camera-movement`: `Auto Match` / `ZoomIn` / `Zoom Out` / `Glide Right` / `Glide Left` / `Crane Down` (camel Case; do not change case)
   - `--output-person-generation`: `Allow Adult` / `Disallowed` (do not change case)

   **Do not quote parameter values**: Enum values and simple string parameter values (e.g. `--output-storage-mode Permanent`, `--aspect-ratio 16:9`, `--model gemini-2.5-flash-lite`) **must not** be wrapped in quotes. Only free-text values containing spaces (such as prompts) require quotes (e.g. `--prompt "a cute cat"`).

1.6. **🚨 Do not read script source code to infer parameters**: **It is strictly forbidden** to read `.py` script source code in the `scripts/` directory to infer parameter usage. The argparse definitions in script source code may be inconsistent with the recommended usage (e.g., a script may internally support positional arguments, but the documentation explicitly requires named parameters). **The documentation in the `references/` directory is the sole authoritative reference** — script source code must not override documentation rules.

2. **FileId Handling Rules** (three-step, evaluated in order):
   - User provides a **local file path** → first generate `vod_upload.py upload --file <path>` upload command, then generate the processing command (use `<FileId obtained after upload>` as placeholder)
   - User provides an **HTTP/HTTPS URL** → first generate `vod_pull_upload.py --url <URL>` pull upload command, then generate the processing command (use `<FileId obtained after upload>` as placeholder)
   - User **already has a FileId** → use the real FileId directly in the command
   - User **provides neither a local file, nor a URL, nor a FileId** → **FileId is a required parameter; you must ask the user to provide a FileId before generating any command**

2.5. **🚨 Parameter Follow-up Rules (must be strictly followed)**:
   - **🔴 If any required parameter is missing, you MUST ask the user a follow-up question and STOP — do NOT generate any command, do NOT use placeholders, do NOT proceed with assumed defaults.** This rule has the highest priority and applies to ALL scripts. Required parameters include but are not limited to: FileId / URL / local file path (for media inputs), TaskId (for task queries), ElementId (when `--element-ids` is involved), template name (for `procedure`), reference image list (for multi-image AIGC), `--target-format` (for remux), `--scene` (for scene-transcode), etc. When in doubt whether a parameter is required, consult the corresponding `references/*.md` documentation — if it is marked required and the user has not provided a value, ask first.
   - **Carefully read all parameters already provided in the user's request**, and only ask follow-up questions for truly missing required parameters. **Never ask again about parameters the user has already provided**
   - **Authentication info (SecretId/Secret Key/Token) is managed via environment variables — never ask for it**; generate the command directly
   - **Do not ask about optional parameters that have default values** (e.g. `--model` defaults to Hunyuan, `--sub-app-id` can be read from environment variables); simply omit them and let the script use defaults
   - **If the user has not explicitly provided `--sub-app-id`, do not ask — simply omit it** (it will be automatically read from the `TENCENTCLOUD_VOD_SUB_APP_ID` environment variable at runtime)
   - **🚫 Never use placeholders for missing required parameters**: When the user has not provided a required parameter (such as FileId, URL, template ID, reference image list, etc.), **do not generate commands containing `<xxx>`, `YOUR_XXX`, `/path/to/...`, `<MEDIA_URL>`, or any placeholder text** — instead, directly ask the user for the specific value and generate the command only after receiving it. Outputting a "template command with placeholders + a note saying 'replace XXX with your value'" is **explicitly forbidden** and counts as violating the follow-up rule above.

3. **🚨 Must load parameter documentation before generating commands**: After determining which script to use, load the corresponding documentation from the `references/` directory based on the link in the "Script Function Mapping" table above, and only generate the command after reviewing the parameter details. **Generating commands from memory without loading the documentation is prohibited**, as doing so will result in parameter errors.

4. **Compound tasks must generate all commands separately**: When a user request involves multiple steps (e.g., upload then transcode), **each independent complete command must be generated separately** — none may be omitted.

5. **Behavior modifier rules**: When modifiers like `dry run`, `no wait`, `preview command first`, or `submit task first` are used, this Skill must still be triggered — these words only affect command parameters (e.g. `--dry-run`), not the task type determination.

---

## Special Scenario Notes

### Pull Upload vs. Local Upload

> 🚨 **Mandatory Rule**: For pull upload from a URL, the **recommended and preferred** approach is to use the dedicated script `vod_pull_upload.py` (no subcommand — parameters follow directly).
> - ✅ Recommended: `python3 scripts/vod_pull_upload.py --url "https://..."`
> - ⚠️ Usable but not recommended: `python3 scripts/vod_upload.py pull --url "https://..."` ← `vod_upload.py` has a `pull` subcommand with the same functionality, but the dedicated script is preferred
> - ❌ Wrong: `python3 scripts/vod_upload.py --url "https://..."` ← `vod_upload.py` does not accept `--url` directly without a subcommand
>
> `vod_pull_upload.py` has **no subcommand** — parameters like `--url` follow directly.

### Transcoding Type Selection

- **Reduce bandwidth costs while maintaining quality** → `transcode` (TESHD, Top-Speed HD)
- **Change format without re-encoding** → `remux` (container remux)
- **Improve quality, denoise, super-resolution** → `enhance` (video enhancement)

> ⚠️ **Remux Parameter Mandatory Rules**:
> - `--target-format`: **Required**, value is `mp4` or `hls`, cannot be omitted
> - `--tasks-priority`: task priority (-10 to 10); **⚠️ not `--priority`** — `--priority` is a `scene-transcode`-only parameter; the two must not be mixed
> - Correct example: `vod_process_media.py remux --file-id xxx --target-format hls --tasks-priority 5`
- **Specific business scenarios** (short drama/e-commerce/information feed) → `scene-transcode`
- **Screenshot/animated image/image sprite/cover/adaptive bitrate streaming** → `snapshot`/`gif`/`image-sprite`/`cover-by-snapshot`/`adaptive-streaming`
- **AI analysis/recognition/review** → `ai-analysis`/`ai-recognition`/`ai-review`

> ⚠️ **Distinguish TESHD vs. Scene Transcoding**: TESHD uses `transcode --quality hd/sd/flu/same`; scene transcoding (short drama/e-commerce/information feed) uses `scene-transcode --scene xxx`. These are completely different — do not mix them.

### Media Search Selection

- Have FileId **and need complete media details** (transcoding/screenshot/subtitles/cover/metadata, etc.) → `vod_describe_media.py`
- **Precise search/filter by FileId list** (user says "search/query by FileId", "FileId list query") → `vod_search_media.py --file-ids`
- Fuzzy search by name/tag/category → `vod_search_media.py` (parameter `--names`, not `--keyword`)
- Natural language content description (requires prior knowledge base import) → `vod_search_media_by_semantics.py` (parameter `--text`, not `--query`)

> ⚠️ **Key Distinction**: `vod_describe_media.py` is for querying complete details of a known FileId; `vod_search_media.py --file-ids` is for search filtering by a FileId list — do not mix them.

> 🚨 **`vod_describe_media.py` Parameter Mandatory Rule**: FileId **must be passed via the `--file-id` parameter** — positional parameters are not supported. Wrong example: `vod_describe_media.py 5145403721233902989`; Correct example: `vod_describe_media.py --file-id 5145403721233902989`.

### AIGC Task Status Query Routing

> ⚠️ **AIGC image generation tasks** (TaskId contains `Aigc Image`, or user explicitly says "query AIGC image generation task") → **must call `vod_aigc_image.py query --task-id <id>`**; using `vod_describe_task.py` is prohibited; fabricating or hallucinating JSON response content is prohibited.
> ⚠️ **AIGC video generation tasks** (TaskId contains `Aigc Video`) → use `vod_describe_task.py --task-id <id>` (`vod_aigc_video.py` has no `query` subcommand)
> ✅ **General task query** (transcoding/screenshot/enhancement, etc.) → `vod_describe_task.py --task-id <id>`

### AIGC Model List Query Routing

> ⚠️ **The three `models` subcommands are completely different — do not mix them:**
> - **View supported LLM chat models** (GPT/Gemini) → `python3 scripts/vod_aigc_chat.py models`
> - **View supported AIGC image generation models** → `python3 scripts/vod_aigc_image.py models`
> - **View supported AIGC video generation models** → `python3 scripts/vod_aigc_video.py models`

### AIGC Image Generation Parameter Notes

> ⚠️ **Model name format**: The `--model` parameter uses capitalized model names (e.g. `Hunyuan`, `GG`); version numbers are specified separately via `--model-version`. Do not concatenate the version number into the model name.

> ✅ **Supported Hunyuan versions**: `3.0` (default, general text-to-image / image-to-image), `3d_2.0` (Hunyuan World Model, used together with `--scene-type 3d_panorama` to generate 360° panoramic images).

> ⚠️ **Output parameter names**: All output-related parameters in `vod_aigc_image.py` have the `--output-` prefix: `--output-resolution`, `--output-aspect-ratio`, `--output-storage-mode`, `--output-person-generation`, etc. Do not omit the `output-` prefix.

> ⚠️ **Reference image parameters**: Use `--file-id` or `--file-url` for a single reference image; use `--file-infos` to pass a JSON array for multiple reference images. There is no `--file-ids` parameter.

### AIGC Video Generation Notes

> Video generation takes a long time (several minutes). The script waits for completion automatically by default. It is recommended to set `--max-wait 1800` to ensure sufficient wait time.

> ⚠️ **Model name and version must be separate**: `--model` only takes the model name (e.g. `Kling`); the version number **must** be passed separately via `--model-version` (e.g. `--model-version O1`). **Do not** concatenate the version into the model name (e.g. `--model "Kling O1"` is wrong).
> Correct example: `--model Kling --model-version O1`
> Wrong example: `--model "Kling O1"` ❌

> ⚠️ **Scene types**: Kling supports `motion_control`/`avatar_i2v`/`lip_sync`; Vidu supports `subject_reference` (fixed subject scene). Pass via `--scene-type`.

### AIGC LLM Multi-turn Conversation

> ⚠️ **Multi-turn conversation**: `--message` can only pass a single message (the last user input). For multi-turn conversation context, use `--messages` to pass a complete JSON array (including all historical turns). Passing `--message` multiple times is not supported.

### Scene-based Image Generation Prompt Notes

> ⚠️ **Prompt parameter names differ by scene**: The outfit change scene uses `--change-prompt`; the product image scene uses `--product-prompt`; the image expansion scene has no prompt parameter. **There is no `--prompt` parameter**.

### Scene-based Image Generation Input File Rules (`vod_scene_aigc_image.py`)

> ⚠️ **`--input-files` and `--clothes-files` format is `File:FileId` or `Url:URL`**.
> - User provides a local image path → first upload with `vod_upload.py upload --file <path>`, then use the returned FileId in `File:<FileId>`
> - User provides an image URL → use `Url:<URL>` format directly
> - User provides neither an image nor a FileId → **FileId is a required parameter; you must ask the user** — do not generate the command

### Semantic Search Prerequisites

Semantic search (`vod_search_media_by_semantics.py`) requires that media has already been imported into the knowledge base via `vod_import_media_knowledge.py`. If the user has not provided `--sub-app-id`, **do not ask** — simply omit it (it will be automatically read from the `TENCENTCLOUD_VOD_SUB_APP_ID` environment variable at runtime); if the user provides `--app-name`, use that preferentially.

### AIGC Advanced Custom Element

> ⚠️ **If the user has not provided `--sub-app-id`, do not ask — simply omit it** (it will be automatically read from the environment variable at runtime). After successful creation, task information is automatically recorded to `mem/elements.json`.

> ⚠️ **Parameter names**: All parameters have the `element-` prefix: `--element-name`, `--element-description`, `--element-image-list`, `--element-video-list`, `--element-voice-id`. There are no simplified parameters like `--name` or `--image-list`.

> ⚠️ **`--element-description` is a required parameter** (although it appears optional in `--help`, omitting it will cause an error).

**Reference Types (Reference Type)**:
- `video_refer`: Video character element — defines appearance via a reference video; **supports voice binding**
- `image_refer`: Multi-image element — defines appearance via multiple images; does not support voice binding

**ElementId Retrieval Flow**: Only a TaskId is returned at creation time. The ElementId must be obtained after the task completes by querying `vod_describe_task.py --task-id <id>` (waits for completion by default), and is automatically merged and saved to `mem/elements.json`.

### Using Elements for Video Generation (Required Reading When `--element-ids` Parameter Is Present)

> ⚠️ **Trigger condition**: When the user's description mentions "generate video using element", "generate video with character/avatar", "use custom element ElementId", etc. (applicable to all models that support `--element-ids`, such as Kling O1, Kling 3.0-Omni), the AI is responsible for:
> 1. Asking the user to provide the ElementId list (if not provided, guide the user to first create an element and query the task to get the ElementId)
> 2. Converting the user's segmented description into a Prompt format with placeholders
> 3. Calling the script with `--element-ids` or `--elements-file`

> 🚨 **Mandatory Rule (highest priority)**: Whenever the user provides an ElementId (passed via `--element-ids`), the subject's reference in `--prompt` (e.g. "element", "character", "he", "she", etc.) **must** be replaced with `<<<element_1>>>` (for multiple elements: `<<<element_2>>>`, etc.). **It is strictly forbidden** to write words like "element" or "character" in the prompt without the placeholder.

**Prompt Placeholder Construction Rules**: The user may describe content for each segment using numbered items; the AI must convert these into `<<<element_N>>>` format:

| User Input | Converted `--prompt` |
|---------|-----------------|
| `1. dancing; 2. running` | `<<<element_1>>>dancing <<<element_2>>>running` |
| `Character A dances, Character B runs` | `<<<element_1>>>dances <<<element_2>>>runs` |
| `use element to dance` (single element) | `<<<element_1>>>dancing` |
| `element walking by the sea` (single element) | `<<<element_1>>>walking by the sea` |

The N in `<<<element_N>>>` starts from 1 and corresponds one-to-one with the order of Element Ids passed in.

---

## API Reference

| Script | Tencent Cloud Official Documentation |
|------|------|
| `vod_upload.py` | [Apply Upload](https://cloud.tencent.com/document/api/266/31767) / [Commit Upload](https://cloud.tencent.com/document/api/266/31766) |
| `vod_pull_upload.py` | [Pull Upload](https://cloud.tencent.com/document/product/266/35575) |
| `vod_process_media.py` (procedure/transcode/remux/enhance/scene-transcode) | [Process Media](https://cloud.tencent.com/document/product/266/33427) |
| `vod_process_media.py` (snapshot/gif/sample-snapshot/image-sprite) | [Process Media](https://cloud.tencent.com/document/product/266/33427) |
| `vod_process_media.py` (ai-analysis/ai-recognition/ai-review) | [Process Media](https://cloud.tencent.com/document/product/266/33427) |
| `vod_describe_media.py` | [Describe Media Infos](https://cloud.tencent.com/document/product/266/31763) |
| `vod_search_media.py` | [Search Media](https://cloud.tencent.com/document/product/266/31813) |
| `vod_search_media_by_semantics.py` | [Search MediaBy Semantics](https://cloud.tencent.com/document/product/266/126287) |
| `vod_describe_task.py` | [Describe Task Detail](https://cloud.tencent.com/document/product/266/33431) |
| `vod_describe_sub_app_ids.py` | [Describe Sub App Ids](https://cloud.tencent.com/document/product/266/36304) |
| `vod_aigc_chat.py` | [VOD AIGC LLM Chat](https://cloud.tencent.com/document/product/266/126561) |
| `vod_aigc_token.py` | [VOD AIGC Token Management](https://cloud.tencent.com/document/api/266/128054) |
| `vod_aigc_image.py` | [CreateAIGCTask (Image)](https://cloud.tencent.com/document/product/266/126240) |
| `vod_aigc_video.py` | [CreateAIGCTask (Video)](https://cloud.tencent.com/document/product/266/126239) |
| `vod_process_image.py` (super-resolution) | [Process Image Async Super Resolution](https://cloud.tencent.com/document/api/266/127858) |
| `vod_process_image.py` (understand) | [Process Image Async Understand](https://cloud.tencent.com/document/api/266/127858) |
| `vod_scene_aigc_image.py` | [Create SceneAIGCImage Task](https://cloud.tencent.com/document/api/266/126968) |
| `vod_create_scene_aigc_video_task.py` | [Create SceneAIGCVideo Task](https://cloud.tencent.com/document/api/266/127542) |
| `vod_import_media_knowledge.py` | [Import Media Knowledge](https://cloud.tencent.com/document/product/266/126286) |
| `vod_create_aigc_advanced_custom_element.py` | [CreateAIGCCustom Element](https://cloud.tencent.com/document/api/266/129121) |
Tencent VOD Intl.

SKILL.md

related skills