@clawhub-wondershare-boop-23cc30ae25
Generate diagrams from natural language using EdrawMax AI APIs. Supports four diagram types: flowchart (流程图), infographic (信息图), Gantt chart (甘特图), and mind...
---
name: edrawmax-diagram
description: "Generate diagrams from natural language using EdrawMax AI APIs. Supports four diagram types: flowchart (流程图), infographic (信息图), Gantt chart (甘特图), and mind map (思维导图). Use when the user wants to create, generate, or draw any of these diagram types. Triggers on: 'generate a flowchart,' 'create a mind map,' 'draw a Gantt chart,' 'make an infographic,' 'visualize a process,' 'project timeline,' 'knowledge map,' or any request to turn a description into a diagram."
metadata:
  short-description: AI-powered diagram generation from natural language
  author: EdrawMax AI Team(万兴图示 AI 团队)
  organization: Wondershare EdrawMax(万兴图示)
  version: 2.0.0
  license: Proprietary
---
# EdrawMax(万兴图示)AI Diagram Generator
> **Author:** EdrawMax AI Team(万兴图示 AI 团队)· **Organization:** Wondershare EdrawMax(万兴图示)
> **Version:** 2.0.0 · **License:** Proprietary © 2026 Wondershare EdrawMax(万兴图示). All rights reserved.
Generate four types of diagrams from natural language via EdrawMax(万兴图示)AI APIs. Each API returns PNG/SVG image URLs and source code.
## Step 1 — Choose Diagram Type
| User Intent | Type | Endpoint |
|---|---|---|
| Process, workflow, steps, decision flow | **flowchart** | `/skills/generate-flowchart` |
| Data visualization, statistics, comparison | **infographic** | `/skills/generate-infographic` |
| Project plan, timeline, schedule, phases | **gantt** | `/skills/generate-gantt` |
| Knowledge structure, brainstorm, topic tree | **mindmap** | `/skills/generate-mindmap` |
If the user's intent is ambiguous, ask which diagram type they want.
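The routing table above can be sketched as a small lookup. This is a minimal illustration rather than part of the skill: the helper name `endpoint_url` is hypothetical, and the base URL is the one given in Step 2.

```python
# Endpoint paths are copied from the table above; the base URL comes from Step 2.
BASE_URL = "https://api.edrawmax.cn/api/ai"

ENDPOINTS = {
    "flowchart": "/skills/generate-flowchart",
    "infographic": "/skills/generate-infographic",
    "gantt": "/skills/generate-gantt",
    "mindmap": "/skills/generate-mindmap",
}


def endpoint_url(diagram_type: str) -> str:
    """Return the full URL for a diagram type (hypothetical helper)."""
    try:
        return BASE_URL + ENDPOINTS[diagram_type]
    except KeyError:
        # An unknown type is the cue to ask the user which diagram they want.
        raise ValueError(f"Unknown diagram type: {diagram_type!r}") from None
```

Raising on an unknown type keeps the "ask the user" decision in the caller instead of silently defaulting to one diagram type.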
## Step 2 — Call the API
**Base URL:** `https://api.edrawmax.cn/api/ai`
All four endpoints share the same request format:
```
POST https://api.edrawmax.cn/api/ai/skills/generate-{type}
Content-Type: application/json
{"prompt": "<user description>", "lang": "cn", "platform": "web"}
```
### Request Parameters
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| prompt | string | Yes | — | Natural language description of the diagram |
| lang | string | No | "cn" | Language: en, cn, jp, kr, es, fr, de, it, tw, pt, ru, id |
| platform | string | No | — | Platform: web, win, mac, ios, android, linux |
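As a rough sketch, the request described above could be built with the standard library as follows. The function name `build_request` and the client-side validation are assumptions added for illustration; the field names, defaults, and allowed `lang` values come from the table.

```python
import json
import urllib.request

BASE_URL = "https://api.edrawmax.cn/api/ai"
VALID_TYPES = {"flowchart", "infographic", "gantt", "mindmap"}
VALID_LANGS = {"en", "cn", "jp", "kr", "es", "fr", "de", "it", "tw", "pt", "ru", "id"}


def build_request(diagram_type: str, prompt: str, lang: str = "cn",
                  platform: str = "web") -> urllib.request.Request:
    """Build (but do not send) a POST request for one of the four endpoints."""
    if diagram_type not in VALID_TYPES:
        raise ValueError(f"unknown diagram type: {diagram_type!r}")
    if not prompt.strip():
        raise ValueError("prompt is required")
    if lang not in VALID_LANGS:
        raise ValueError(f"unsupported lang: {lang!r}")
    body = json.dumps({"prompt": prompt, "lang": lang, "platform": platform})
    return urllib.request.Request(
        f"{BASE_URL}/skills/generate-{diagram_type}",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

The request can then be sent with `urllib.request.urlopen(req, timeout=60)` and the body decoded with `json.loads`. Checking `prompt` and `lang` locally avoids a round trip that would only return a 400.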
### Response Fields
**Flowchart** returns:
```json
{ "code": 0, "msg": "", "data": { "png_url": "...", "svg_url": "...", "mermaid_code": "..." } }
```
**Infographic / Gantt / Mindmap** return:
```json
{ "code": 0, "msg": "", "data": { "png_url": "...", "svg_url": "...", "source_code": "..." } }
```
> Note: flowchart uses `mermaid_code`, the other three use `source_code`.
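Because the source field name differs by diagram type, a caller may want to normalize the response before using it. A minimal sketch, assuming the response has already been parsed into a dict; `extract_diagram` is a hypothetical helper name.

```python
def extract_diagram(response: dict, diagram_type: str) -> dict:
    """Normalize a parsed API response into png_url, svg_url, and source."""
    if response.get("code") != 0:
        raise RuntimeError(f"API error {response.get('code')}: {response.get('msg')}")
    data = response.get("data") or {}
    # Flowchart responses carry `mermaid_code`; the other three carry `source_code`.
    source_key = "mermaid_code" if diagram_type == "flowchart" else "source_code"
    return {
        "png_url": data.get("png_url", ""),
        "svg_url": data.get("svg_url", ""),
        "source": data.get(source_key, ""),
    }
```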
## Step 3 — Download Files Locally
After a successful API call, **always** run the download script to save the images locally:
```bash
python <skill-path>/scripts/download_diagram.py --png-url "<png_url>" --svg-url "<svg_url>" [--output-dir "<dir>"]
```
- Default output directory: `./edrawmax_output`
- The script prints the local file paths as JSON, e.g.:
```json
{"png_path": "./edrawmax_output/diagram_20260312_143000.png", "svg_path": "./edrawmax_output/diagram_20260312_143000.svg"}
```
- Use the returned **local file paths** when presenting results to the user.
- **Security**: The script only accepts `https://` URLs whose hostname belongs to trusted EdrawMax OSS domains (`.aliyuncs.com`, `.wondershare.com`, `.edrawsoft.com`, `.edrawmax.com`). TLS certificates are fully verified. URLs from any other host are rejected — do not pass user-supplied or third-party URLs to this script.
## Step 4 — Present Results to User
Use the following preferred display format:
1. **Thumbnail (PNG)**: Render the local PNG file as an inline image if the environment supports it (e.g. Markdown `![diagram](<png_path>)`). If inline rendering is not supported, show the `png_url` as a clickable link instead.
2. **High-res diagram (SVG)**: Always present the `svg_url` as a clickable link so the user can open the full-quality vector image in their browser, e.g. `[View high-resolution image](svg_url)`.
3. **Source code**: Show `mermaid_code` (flowchart) or `source_code` (other types) in a code block for secondary editing or re-rendering.
Example output format:
```
![diagram](./edrawmax_output/diagram_20260312_143000.png)
[View high-resolution image (SVG)](https://xxx.oss.com/.../main.svg)
```
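The display format above could be assembled by a helper along these lines. This is only a sketch: `render_result`, the `diagram` alt text, and the link label are illustrative choices, not something the skill prescribes.

```python
def render_result(png_path: str, svg_url: str, source: str, diagram_type: str) -> str:
    """Build the Markdown shown to the user: thumbnail, SVG link, source block."""
    fence = "`" * 3  # built at runtime so this example does not nest code fences
    # Only flowchart source is stated to be Mermaid; other types get a plain fence.
    fence_lang = "mermaid" if diagram_type == "flowchart" else ""
    lines = [
        f"![diagram]({png_path})",
        f"[View high-resolution image (SVG)]({svg_url})",
        "",
        fence + fence_lang,
        source,
        fence,
    ]
    return "\n".join(lines)
```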
## Error Handling
| code | msg | Action |
|---|---|---|
| 400 | prompt is required | Ask user to provide a description |
| 400 | lang不合法 (invalid `lang`) | Fix lang to a valid value |
| 2406 | risk control rejection | Content rejected; ask user to rephrase |
| 3001 | concurrency limit | Wait briefly, then retry once |
| 212200 | 生成失败 (generation failed) | Retry once; if still failing, report to user |
| 212201 | 渲染失败 (rendering failed) | Retry once; if still failing, report to user |
| 500 | panic | Report internal server error to user |
For retryable errors (3001, 212200, 212201), retry once before reporting failure. If the error persists, inform the user and share the support contact (see FAQ below).
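The retry rule can be expressed as a small wrapper. A minimal sketch, assuming `call` is any zero-argument function that performs the request and returns the parsed response dict; the two-second delay is an arbitrary stand-in for "wait briefly".

```python
import time

# Codes the error table marks as retryable.
RETRYABLE_CODES = {3001, 212200, 212201}


def call_with_retry(call, max_retries: int = 1, delay_seconds: float = 2.0) -> dict:
    """Invoke `call`, retrying at most `max_retries` times on retryable codes."""
    response = call()
    attempts = 0
    while response.get("code") in RETRYABLE_CODES and attempts < max_retries:
        time.sleep(delay_seconds)
        response = call()
        attempts += 1
    # The caller still inspects `code`; a persistent error is returned, not raised.
    return response
```

Non-retryable codes such as 400 or 2406 fall straight through, so the caller can prompt the user to fix the input instead of repeating a request that cannot succeed.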
## FAQ
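**Q: Is there a charge for using the EdrawMax(万兴图示)AI MCP service?**
A: The service is currently free for a limited time, and users can call it at no cost.
**Q: How can I contact you?**
A: For technical questions, service feedback, or bulk API purchases, contact us by email:
📧 [email protected]
We will respond as soon as possible.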
## Language Mapping
Map user language/locale to `lang` param:
- English → `en`, 简体中文 → `cn`, 日本語 → `jp`, 한국어 → `kr`
- Español → `es`, Français → `fr`, Deutsch → `de`, Italiano → `it`
- 繁體中文 → `tw`, Português → `pt`, Русский → `ru`, Bahasa Indonesia → `id`
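One way to apply this mapping is a lookup keyed by locale tag. The sketch below is an assumption-heavy illustration: the BCP 47 style keys (`zh-hans`, `ja`, and so on) and the helper name `lang_for_locale` are not defined by the API; only the `lang` values on the right-hand side are.

```python
# Keys are assumed locale tags; values are the `lang` codes listed above.
LANG_BY_LOCALE = {
    "en": "en", "zh": "cn", "zh-cn": "cn", "zh-hans": "cn",
    "zh-tw": "tw", "zh-hant": "tw",
    "ja": "jp", "ko": "kr", "es": "es", "fr": "fr", "de": "de",
    "it": "it", "pt": "pt", "ru": "ru", "id": "id",
}


def lang_for_locale(locale: str, default: str = "cn") -> str:
    """Map a locale tag such as 'ja_JP' or 'zh-Hant-TW' to a `lang` value."""
    tag = locale.strip().lower().replace("_", "-")
    while tag:
        if tag in LANG_BY_LOCALE:
            return LANG_BY_LOCALE[tag]
        # Drop the last subtag and try again, e.g. 'pt-br' -> 'pt'.
        tag = tag.rpartition("-")[0]
    return default
```

The fallback of `cn` mirrors the API's documented default for `lang`.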
## Notes
- `user_id` is extracted server-side from `X-User-ID` header; do not pass it in the body
- Always present the source code so users can edit or re-render
- For full API specs, see [references/api-reference.md](references/api-reference.md)
- When an error cannot be resolved after retry, always share the support email **[email protected]** with the user
---
© 2026 Wondershare EdrawMax(万兴图示)AI Team. This skill and all associated resources are proprietary to EdrawMax(万兴图示). Unauthorized reproduction or distribution is prohibited.
FILE:license.txt
Proprietary Software License
Copyright (c) 2026 Wondershare EdrawMax AI Team(万兴图示 AI 团队). All rights reserved.
This skill, including all associated source code, documentation, scripts,
references, assets, and other materials (collectively, the "Software"),
is the proprietary and confidential property of the EdrawMax AI Team(万兴图示 AI 团队),
a division of Wondershare Technology Co., Ltd.
TERMS AND CONDITIONS
1. Ownership
The Software is owned by and remains the exclusive property of
Wondershare EdrawMax AI Team(万兴图示 AI 团队). All intellectual property rights,
including but not limited to copyrights, patents, trademarks, and
trade secrets, are reserved by the EdrawMax AI Team.
2. Restrictions
Without the prior written consent of EdrawMax AI Team, you may NOT:
(a) Copy, modify, adapt, or create derivative works of the Software;
(b) Distribute, sublicense, lease, rent, or transfer the Software
to any third party;
(c) Reverse engineer, decompile, or disassemble the Software;
(d) Remove or alter any proprietary notices, labels, or marks on
the Software.
3. Permitted Use
This Software is provided solely for use within authorized EdrawMax
products and services. Use is subject to the EdrawMax Terms of Service
and any additional agreements between you and Wondershare.
4. Disclaimer of Warranty
THE SOFTWARE IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED. EDRAWMAX AI TEAM DISCLAIMS ALL WARRANTIES,
INCLUDING WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR
PURPOSE, AND NON-INFRINGEMENT.
5. Limitation of Liability
IN NO EVENT SHALL EDRAWMAX AI TEAM OR WONDERSHARE BE LIABLE FOR ANY
INDIRECT, INCIDENTAL, SPECIAL, CONSEQUENTIAL, OR PUNITIVE DAMAGES
ARISING OUT OF OR RELATED TO THE USE OF THIS SOFTWARE.
6. Governing Law
This license shall be governed by and construed in accordance with
the laws of the People's Republic of China.
Contact: EdrawMax AI Team(万兴图示 AI 团队), Wondershare Technology Co., Ltd.
FILE:scripts/download_diagram.py
#!/usr/bin/env python3
"""
EdrawMax Diagram Downloader
Downloads PNG and SVG files from EdrawMax AI API response URLs to local disk.
Usage:
python download_diagram.py --png-url <url> --svg-url <url> [--output-dir <dir>]
Output:
Prints JSON with local file paths:
{"png_path": "...", "svg_path": "..."}
Security:
- Only HTTPS URLs from trusted EdrawMax OSS domains are accepted.
- TLS certificates are fully verified (system trust store).
Author: EdrawMax AI Team(万兴图示 AI 团队)
© 2026 Wondershare EdrawMax(万兴图示). All rights reserved.
"""
import argparse
import json
import os
import sys
import urllib.request
import urllib.error
import ssl
from datetime import datetime
from urllib.parse import urlparse
# Trusted hostname suffixes for EdrawMax OSS URLs.
# Only URLs whose hostname ends with one of these suffixes are allowed.
TRUSTED_HOSTS = (
".aliyuncs.com",
".wondershare.com",
".edrawsoft.com",
".edrawmax.com",
)
def validate_url(url: str) -> None:
    """
    Raise ValueError if the URL is not a safe, trusted HTTPS URL.
    Checks:
    - Scheme must be https
    - Hostname must match a trusted EdrawMax OSS domain suffix
    """
    parsed = urlparse(url)
    if parsed.scheme != "https":
        raise ValueError(f"Rejected non-HTTPS URL: {url!r}")
    host = parsed.hostname or ""
    if not any(host.endswith(suffix) for suffix in TRUSTED_HOSTS):
        raise ValueError(
            f"Rejected URL from untrusted host '{host}'. "
            f"Allowed domains: {', '.join(TRUSTED_HOSTS)}"
        )
def download_file(url: str, output_path: str) -> str:
    """
    Download a file from a trusted HTTPS URL to output_path.
    Returns output_path on success, empty string on failure.
    TLS certificate verification uses the system default trust store.
    """
    try:
        validate_url(url)
    except ValueError as e:
        print(f"[ERROR] {e}", file=sys.stderr)
        return ""
    # Use the system default SSL context; certificate verification is enabled.
    ctx = ssl.create_default_context()
    req = urllib.request.Request(url, headers={"User-Agent": "EdrawMax-Skill/2.0"})
    try:
        with urllib.request.urlopen(req, context=ctx, timeout=60) as resp:
            data = resp.read()
        os.makedirs(os.path.dirname(output_path) or ".", exist_ok=True)
        with open(output_path, "wb") as f:
            f.write(data)
        return output_path
    except urllib.error.URLError as e:
        print(f"[ERROR] Failed to download {url}: {e}", file=sys.stderr)
        return ""
    except OSError as e:
        print(f"[ERROR] Failed to write {output_path}: {e}", file=sys.stderr)
        return ""
def main():
    parser = argparse.ArgumentParser(description="Download EdrawMax diagram files")
    parser.add_argument("--png-url", required=True, help="PNG image URL from API response")
    parser.add_argument("--svg-url", required=True, help="SVG image URL from API response")
    parser.add_argument("--output-dir", default="./edrawmax_output", help="Output directory (default: ./edrawmax_output)")
    args = parser.parse_args()
    os.makedirs(args.output_dir, exist_ok=True)
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    png_path = os.path.join(args.output_dir, f"diagram_{timestamp}.png")
    svg_path = os.path.join(args.output_dir, f"diagram_{timestamp}.svg")
    result = {"png_path": "", "svg_path": ""}
    downloaded_png = download_file(args.png_url, png_path)
    if downloaded_png:
        result["png_path"] = os.path.abspath(downloaded_png)
    downloaded_svg = download_file(args.svg_url, svg_path)
    if downloaded_svg:
        result["svg_path"] = os.path.abspath(downloaded_svg)
    print(json.dumps(result, ensure_ascii=False))
    # Exit non-zero only when neither file could be downloaded.
    if not result["png_path"] and not result["svg_path"]:
        sys.exit(1)
if __name__ == "__main__":
    main()
FILE:references/api-reference.md
# EdrawMax(万兴图示)AI Skills — API Reference
> **Owner:** EdrawMax AI Team(万兴图示 AI 团队)· **Organization:** Wondershare EdrawMax(万兴图示)
> © 2026 Wondershare EdrawMax(万兴图示). All rights reserved.
## Base URL
```
https://api.edrawmax.cn/api/ai
```
- **Auth**: None required
- **Content-Type**: `application/json`
- **Note**: `user_id` is auto-extracted from `X-User-ID` request header (defaults to 0)
---
## 1. Generate Flowchart
```
POST /skills/generate-flowchart
```
Create a Mermaid flowchart from natural language.
**Request:**
```json
{
"prompt": "用户注册登录流程",
"lang": "cn",
"platform": "web"
}
```
**Success Response:**
```json
{
"code": 0,
"msg": "",
"data": {
"png_url": "https://xxx.oss.com/work/.../thumb.png",
"svg_url": "https://xxx.oss.com/work/.../main.svg",
"mermaid_code": "flowchart TD\n A[开始] --> B[输入手机号]\n B --> C{验证码是否正确?}\n C -->|是| D[设置密码]\n C -->|否| B\n D --> E[注册成功]"
}
}
```
| Response Field | Type | Description |
|---|---|---|
| png_url | string | PNG image URL on OSS |
| svg_url | string | SVG vector image URL on OSS |
| mermaid_code | string | Mermaid source code for editing/rendering |
---
## 2. Generate Infographic
```
POST /skills/generate-infographic
```
Create an infographic from natural language.
**Request:**
```json
{
"prompt": "2025年全球AI市场规模分析",
"lang": "cn",
"platform": "web"
}
```
**Success Response:**
```json
{
"code": 0,
"msg": "",
"data": {
"png_url": "https://xxx.oss.com/work/.../thumb.png",
"svg_url": "https://xxx.oss.com/work/.../main.svg",
"source_code": "..."
}
}
```
| Response Field | Type | Description |
|---|---|---|
| png_url | string | PNG image URL on OSS |
| svg_url | string | SVG vector image URL on OSS |
| source_code | string | Diagram source code |
---
## 3. Generate Gantt Chart
```
POST /skills/generate-gantt
```
Create a Gantt chart from natural language.
**Request:**
```json
{
"prompt": "新产品上线项目计划,包含需求分析、开发、测试、上线四个阶段",
"lang": "cn",
"platform": "web"
}
```
**Success Response:**
```json
{
"code": 0,
"msg": "",
"data": {
"png_url": "https://xxx.oss.com/work/.../thumb.png",
"svg_url": "https://xxx.oss.com/work/.../main.svg",
"source_code": "..."
}
}
```
| Response Field | Type | Description |
|---|---|---|
| png_url | string | PNG image URL on OSS |
| svg_url | string | SVG vector image URL on OSS |
| source_code | string | Diagram source code |
---
## 4. Generate Mind Map
```
POST /skills/generate-mindmap
```
Create a mind map from natural language.
**Request:**
```json
{
"prompt": "机器学习知识体系梳理",
"lang": "cn",
"platform": "web"
}
```
**Success Response:**
```json
{
"code": 0,
"msg": "",
"data": {
"png_url": "https://xxx.oss.com/work/.../thumb.png",
"svg_url": "https://xxx.oss.com/work/.../main.svg",
"source_code": "..."
}
}
```
| Response Field | Type | Description |
|---|---|---|
| png_url | string | PNG image URL on OSS |
| svg_url | string | SVG vector image URL on OSS |
| source_code | string | Diagram source code |
---
## Shared Request Parameters
| Field | Type | Required | Default | Description |
|---|---|---|---|---|
| prompt | string | Yes | — | Natural language description |
| lang | string | No | "cn" | Output language |
| platform | string | No | — | Platform identifier |
### Supported Languages
| Code | Language | Code | Language |
|---|---|---|---|
| en | English | it | Italiano |
| cn | 简体中文 | tw | 繁體中文 |
| jp | 日本語 | pt | Português |
| kr | 한국어 | ru | Русский |
| es | Español | id | Bahasa Indonesia |
| fr | Français | de | Deutsch |
---
## Error Codes (All Endpoints)
| code | msg | Cause |
|---|---|---|
| 400 | bad param: prompt is required | prompt empty or malformed |
| 400 | bad param: lang不合法 (invalid `lang`) | lang not in allowed enum |
| 2406 | risk control rejection | Sensitive/violating content |
| 3001 | concurrency limit | Too many concurrent requests for user |
| 212200 | 生成失败 (generation failed) | AI model call failed or timed out |
| 212201 | 渲染失败 (rendering failed) | Rendering service or OSS upload failed |
| 500 | panic | Internal server error |
Generate and edit AI images and videos with Media.io OpenAPI. Supports text-to-image, image-to-image, text-to-video, and image-to-video, plus task status and...
---
name: mediaio-aigc-generate
description: "Generate and edit AI images and videos with Media.io OpenAPI. Supports text-to-image, image-to-image, text-to-video, and image-to-video, plus task status and credit queries. Access top models like Imagen 4, Seedream, Kling, Vidu, Wan, and Veo 3.1 in one place."
metadata: {"mediaio":{"emoji":"🎨","requires":{"env":["API_KEY"]}},"publisher":"Community Maintainer","source":"https://platform.media.io/docs/"}
---
# MediaIO AIGC Generate Skill
## Overview
This skill routes MediaIO OpenAPI requests based on `scripts/c_api_doc_detail.json`.
It provides a single entry point, `Skill.invoke(api_name, params, api_key)`, covering credits queries, text-to-image, image-to-image, image-to-video, text-to-video, and task result lookups.
## Requirements
### Environment Variables
| Variable | Required | Description |
|----------|----------|-------------|
| `API_KEY` | **Yes** | Media.io OpenAPI key, sent as `X-API-KEY` header. Apply at <https://developer.media.io/>. Use a **least-privilege / test key**; do not reuse broader platform credentials. |
## Provenance and Credential Notes
- Maintainer: community-maintained skill (not an official Media.io release)
- Reference homepage: https://developer.media.io/
- Reference API docs: https://platform.media.io/docs/
- Target API domain used by this skill: `https://openapi.media.io`
- Required credential: `API_KEY` (used as `X-API-KEY`)
- Security recommendation: use a least-privilege/test key, avoid reusing broader platform credentials.
- Credential loading: set `API_KEY` in your environment, or pass `api_key` explicitly to `Skill.invoke(...)`.
## API Coverage (from c_api_doc_detail.json)
The current API definition includes 24 endpoints, grouped by capability:
- Query APIs
  - `Credits` (query user credit balance)
  - `Task Result` (query task status/result by `task_id`)
- Text To Image (2)
  - `Imagen 4`
  - `soul_character`
- Image To Image (3)
  - `Nano Banana`
  - `Seedream 4.0`
  - `Nano Banana Pro`
- Image To Video (14)
  - `Wan 2.6` / `Wan 2.2` / `Wan 2.5`
  - `Hailuo 02` / `Hailuo 2.3`
  - `Kling 2.1` / `Kling 2.5 Turbo` / `Kling 2.6` / `Kling 3.0`
  - `Vidu Q2` / `Vidu Q3`
  - `Google Veo 3.1` / `Google Veo 3.1 Fast`
  - `Motion Control Kling 2.6`
- Text To Video (3)
  - `Wan 2.6 (Text To Video)`
  - `Vidu Q3 (Text To Video)`
  - `Wan 2.5 (Text To Video)`
## Input and Output Format
- Input:
  - `api_name`: API name (must exactly match the `name` field in `c_api_doc_detail.json`)
  - `params`: Business parameter dictionary (only include fields defined in `api_body`)
  - `api_key`: API key
- Output:
  - Returns the raw API response payload (dict/json)
### Common Response Structure
- Success response is typically: `{"code": 0, "msg": "", "data": {...}, "trace_id": "..."}`
- Asynchronous generation APIs usually return: `data.task_id`
- Final artifacts should be retrieved by polling `Task Result`
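A caller typically needs only the `task_id` from a generation response. A minimal sketch, assuming the response is the dict returned by `Skill.invoke`; the helper name `extract_task_id` is hypothetical, and the `error` key is the router-level failure shape listed under Error Handling Notes.

```python
def extract_task_id(response: dict) -> str:
    """Return data.task_id from a generation response, or raise with context."""
    if "error" in response:
        # Router-level failure (unknown API name, request exception, ...).
        raise RuntimeError(f"router error: {response['error']}")
    if response.get("code") != 0:
        raise RuntimeError(f"API error {response.get('code')}: {response.get('msg')}")
    task_id = (response.get("data") or {}).get("task_id")
    if not task_id:
        raise RuntimeError("response did not include data.task_id")
    return task_id
```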
## Quick Start
### 1) Install Dependency
```bash
pip install requests
```
### 2) Initialize Skill
```python
import os
from scripts.skill_router import Skill
skill = Skill('scripts/c_api_doc_detail.json')
api_key = os.getenv('API_KEY', '')
if not api_key:
    raise RuntimeError('API_KEY is not set')
```
### 3) Configure Environment Variable API_KEY
- Purpose: provide a shared key source for local scripts.
- Note: examples below read `api_key` from environment variable.
Windows PowerShell:
```powershell
$env:API_KEY="your-api-key"
```
macOS / Linux (bash/zsh):
```bash
export API_KEY="your-api-key"
```
## Usage Examples (Python)
### Example A: Query Credits (`Credits`)
```python
import os
from scripts.skill_router import Skill
skill = Skill('scripts/c_api_doc_detail.json')
api_key = os.getenv('API_KEY', '')
if not api_key:
    raise RuntimeError('API_KEY is not set')
result = skill.invoke('Credits', {}, api_key=api_key)
print(result)
```
### Example B: Text-to-Image (`Imagen 4`)
```python
import os
from scripts.skill_router import Skill
skill = Skill('scripts/c_api_doc_detail.json')
api_key = os.getenv('API_KEY', '')
if not api_key:
    raise RuntimeError('API_KEY is not set')
result = skill.invoke(
    'Imagen 4',
    {
        'prompt': 'a cute puppy, photorealistic, soft natural light, high detail',
        'ratio': '1:1',
        'counts': '1'
    },
    api_key=api_key
)
print(result)  # When code=0, data usually contains task_id.
```
### Example C: Query Task Result (`Task Result`)
```python
import os
import time
from scripts.skill_router import Skill
skill = Skill('scripts/c_api_doc_detail.json')
api_key = os.getenv('API_KEY', '')
if not api_key:
    raise RuntimeError('API_KEY is not set')
task_id = 'your-task-id'
# Poll every 5 seconds, up to 24 times (about 2 minutes).
for _ in range(24):
    r = skill.invoke('Task Result', {'task_id': task_id}, api_key=api_key)
    print(r)
    status = (r.get('data') or {}).get('status')
    # Stop on any terminal status; `timeout` is terminal per the status reference.
    if status in ('completed', 'succeeded', 'failed', 'timeout'):
        break
    time.sleep(5)
```
### Task Status Reference
- `waiting`: queued
- `processing`: running
- `completed`: completed successfully
- `failed`: failed
- `timeout`: timed out
## Parameter Guidance (by Capability)
- Text To Image / Image To Image / Video APIs:
  - Required parameters are defined by `required=true` in `api_body`.
  - Enum fields (for example `ratio`, `duration`, `resolution`) must use documented values.
- `Task Result`:
  - Requires `task_id`, passed as `{'task_id': 'xxx'}` (path parameters are replaced automatically by the router).
- `Credits`:
  - No business parameters are required; use `params={}`.
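A pre-flight check for required parameters might look like the sketch below. The shape of `api_body` is an assumption made for illustration (a list of field dicts with `name` and `required` keys); the actual layout in `c_api_doc_detail.json` is not shown in this document, so adapt the field access accordingly.

```python
def missing_required(api_body: list, params: dict) -> list:
    """List required field names that are absent from `params`.

    ASSUMPTION: each entry of `api_body` is a dict with a `name` and an
    optional boolean `required`; this is a guess at the JSON layout.
    """
    return [
        field["name"]
        for field in api_body
        if field.get("required") and field["name"] not in params
    ]


# Hypothetical field list modelled on the Imagen 4 parameters.
example_body = [
    {"name": "prompt", "required": True},
    {"name": "ratio", "required": False},
    {"name": "counts"},
]
print(missing_required(example_body, {"ratio": "1:1"}))
```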
## Typical Invocation Flow
1. Call a generation API (for example `Imagen 4` or `Kling 3.0`) to obtain `task_id`.
2. Poll the same task using `Task Result`.
3. When status is `completed/succeeded`, extract output URLs from `data.result`.
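The flow above can be wrapped in a single polling helper. This is a sketch under stated assumptions: `wait_for_result` is a hypothetical name, `skill` is any object exposing `invoke(api_name, params, api_key=...)` as described in this document, and both `completed` and `succeeded` are treated as success because the document uses both.

```python
import time

SUCCESS_STATUSES = {"completed", "succeeded"}
FAILURE_STATUSES = {"failed", "timeout"}


def wait_for_result(skill, task_id: str, api_key: str,
                    interval_seconds: float = 5.0, max_attempts: int = 24):
    """Poll `Task Result` until the task finishes; return data.result."""
    for attempt in range(max_attempts):
        response = skill.invoke("Task Result", {"task_id": task_id}, api_key=api_key)
        data = response.get("data") or {}
        status = data.get("status")
        if status in SUCCESS_STATUSES:
            return data.get("result")
        if status in FAILURE_STATUSES:
            raise RuntimeError(f"Task {task_id} ended with status {status!r}")
        if attempt < max_attempts - 1:
            time.sleep(interval_seconds)
    raise TimeoutError(f"Task {task_id} did not finish after {max_attempts} polls")
```

Bounding the number of polls keeps a stuck task from blocking the caller indefinitely; the defaults match the 24 polls at 5-second intervals used in Example C.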
## Error Handling Notes
- API not found: returns `{"error": "API 'xxx' not found."}`
- Request exception: returns `{"error": "<exception message>"}`
- Invalid response format: returns `{"error": "Invalid response", ...}`
## Additional Response Code Notes
- `374004`: not authenticated. Apply for an APP KEY at https://developer.media.io/.
- `490505`: insufficient credits. Recharge before invoking generation APIs.
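The router errors and response codes above can be folded into one human-readable summary. A minimal sketch; `describe_outcome` is a hypothetical helper and the hint strings simply paraphrase the notes above.

```python
KNOWN_CODES = {
    374004: "not authenticated; check or apply for an API key",
    490505: "insufficient credits; recharge before calling generation APIs",
}


def describe_outcome(response: dict) -> str:
    """Summarize a Skill.invoke result as 'ok' or a short error description."""
    if "error" in response:
        # Raised locally by the router: unknown API, request failure, bad response.
        return f"router error: {response['error']}"
    code = response.get("code")
    if code == 0:
        return "ok"
    hint = KNOWN_CODES.get(code, response.get("msg") or "unknown error")
    return f"API error {code}: {hint}"
```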
## Important Implementation Notes
- API names in `c_api_doc_detail.json` are unique, including text-to-video variants.
- The router validates duplicate names at load time and fails fast if duplicates are reintroduced.
- For best compatibility, invoke APIs by exact `name` from the JSON definition.
## Extension and Integration Recommendations
- Add asynchronous invocation via `asyncio` or threads.
- Auto-generate parameter validation and type hints from API metadata.
- Package this skill as a microservice/API for multi-client integration.
## External Resources
- API documentation: https://platform.media.io/docs/
- Product overview: https://developer.media.io/
- Credit purchase: https://developer.media.io/pricing.html
## Related Files
- scripts/skill_router.py: core routing logic
- scripts/c_api_doc_detail.json: API definitions
FILE:references/mediaio-aigc-api-reference.md
# MediaIO AIGC API Capability Reference (Generated from c_api_doc_detail.json)
- Generated at: 2026-03-10 17:26:49
- Total APIs: 24
- Invocation pattern: Skill.invoke(api_name, params, api_key)
- Important note: API names are unique in the current `c_api_doc_detail.json`; keep names unique to avoid routing ambiguity.
## 1. General Invocation Rules
### 1.1 Unified Entry Point
```python
from scripts.skill_router import Skill
import os
skill = Skill('scripts/c_api_doc_detail.json')
api_key = os.getenv('API_KEY', '')
result = skill.invoke(api_name, params, api_key=api_key)
print(result)
```
### 1.2 Standard Async Workflow
1. Call a generation endpoint and obtain `data.task_id`.
2. Poll `Task Result` with the task ID.
3. When status becomes `completed` or `succeeded`, extract output URLs from `data.result`.
### 1.3 Common Status and Error Codes
- Task status: `waiting`, `processing`, `completed`, `failed`, `timeout`
- `374004`: authentication required / invalid API key
- `490505`: insufficient credits
## 2. Capability Overview
### Query APIs (2)
- Credits | model_code=user-credits | POST https://openapi.media.io/user/credits
- Task Result | model_code=generation-result | POST https://openapi.media.io/generation/result/{task_id}
### Text To Image (2)
- Imagen 4 | model_code=t2i-imagen-4 | POST https://openapi.media.io/generation/imagen/t2i-imagen-4
- soul_character | model_code=t2i-soul-character | POST https://openapi.media.io/generation/soul/t2i-soul-character
### Image To Image (3)
- Nano Banana | model_code=i2i-banana | POST https://openapi.media.io/generation/banana/i2i-banana
- Seedream 4.0 | model_code=i2i-seedream-v4-0 | POST https://openapi.media.io/generation/seedream/i2i-seedream-v4-0
- Nano Banana Pro | model_code=i2i-banana-2 | POST https://openapi.media.io/generation/banana/i2i-banana-2
### Image To Video (14)
- Wan 2.6 | model_code=i2v-wan-v2-6 | POST https://openapi.media.io/generation/wan/i2v-wan-v2-6
- Wan 2.2 | model_code=i2v-wan-v2-2 | POST https://openapi.media.io/generation/wan/i2v-wan-v2-2
- Hailuo 02 | model_code=i2v-minimax-02 | POST https://openapi.media.io/generation/minimax/i2v-minimax-02
- Kling 2.1 | model_code=i2v-kling-v2-1 | POST https://openapi.media.io/generation/kling/i2v-kling-v2-1
- Vidu Q3 | model_code=i2v-vidu-q3 | POST https://openapi.media.io/generation/vidu/i2v-vidu-q3
- Kling 2.5 Turbo | model_code=i2v-kling-v2-5-turbo | POST https://openapi.media.io/generation/kling/i2v-kling-v2-5-turbo
- Google Veo 3.1 | model_code=i2v-veo-v3-1 | POST https://openapi.media.io/generation/veo/i2v-veo-v3-1
- Kling 3.0 | model_code=i2v-kling-v3-0 | POST https://openapi.media.io/generation/kling/i2v-kling-v3-0
- Hailuo 2.3 | model_code=i2v-minimax-v2-3 | POST https://openapi.media.io/generation/minimax/i2v-minimax-v2-3
- Vidu Q2 | model_code=i2v-vidu-q2 | POST https://openapi.media.io/generation/vidu/i2v-vidu-q2
- Wan 2.5 | model_code=i2v-wan-v2-5 | POST https://openapi.media.io/generation/wan/i2v-wan-v2-5
- Kling 2.6 | model_code=i2v-kling-v2-6 | POST https://openapi.media.io/generation/kling/i2v-kling-v2-6
- Motion Control Kling 2.6 | model_code=i2v-motion-control-kling-v2-6 | POST https://openapi.media.io/generation/kling/i2v-motion-control-kling-v2-6
- Google Veo 3.1 Fast | model_code=i2v-veo-v3-1-fast | POST https://openapi.media.io/generation/veo/i2v-veo-v3-1-fast
### Text To Video (3)
- Wan 2.6 (Text To Video) | model_code=t2v-wan-v2-6 | POST https://openapi.media.io/generation/wan/t2v-wan-v2-6
- Vidu Q3 (Text To Video) | model_code=t2v-vidu-q3 | POST https://openapi.media.io/generation/vidu/t2v-vidu-q3
- Wan 2.5 (Text To Video) | model_code=t2v-wan-v2-5 | POST https://openapi.media.io/generation/wan/t2v-wan-v2-5
## 3. Detailed API Documentation
### Query APIs
#### Query APIs-1 Credits (model_code: user-credits)
- API ID: 196
- Method: POST
- Endpoint: https://openapi.media.io/user/credits
- Description: API to query user credits balance.
- Parameters: none (use `{}`)
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "Query User Credits",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/user/credits \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '{}'"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "credits",
"type": "integer",
"describe": "User credits balance, located within the data object"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Credits',
{},
api_key=api_key
)
print(result)
```
#### Query APIs-2 Task Result (model_code: generation-result)
- API ID: 197
- Method: POST
- Endpoint: https://openapi.media.io/generation/result/{task_id}
- Description: API to query generation task result by task ID.
| Parameter | Type | Required | Description |
|---|---|---|---|
| task_id | string | Yes | The task ID to query, located in the URL path parameter |
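Since `task_id` travels in the URL path, a router has to substitute it into the endpoint template before sending. The sketch below shows one way to do that; it is not the code in `scripts/skill_router.py`, and `fill_path` is a hypothetical name.

```python
import re
from urllib.parse import quote


def fill_path(endpoint: str, params: dict) -> tuple:
    """Substitute {name} placeholders from `params`; return (url, leftover params)."""
    remaining = dict(params)

    def substitute(match):
        name = match.group(1)
        if name not in remaining:
            raise KeyError(f"missing path parameter: {name}")
        # Percent-encode the value so it cannot alter the path structure.
        return quote(str(remaining.pop(name)), safe="")

    url = re.sub(r"\{(\w+)\}", substitute, endpoint)
    return url, remaining


url, body = fill_path(
    "https://openapi.media.io/generation/result/{task_id}",
    {"task_id": "abc123"},
)
print(url, body)
```

Parameters consumed by the path are removed from the returned dict, so whatever is left can be sent as the request body.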
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "Query Task Result",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/result/<task_id> \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '{}'"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Task identifier"
},
{
"name": "status",
"type": "string",
"describe": "Task status: pending, processing, succeeded, failed"
},
{
"name": "result",
"type": "object",
"describe": "Generation result when task succeeded"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Task Result',
{
"task_id": "<required>"
},
api_key=api_key
)
print(result)
```
### Text To Image
#### Text To Image-1 Imagen 4 (model_code: t2i-imagen-4)
- API ID: 259
- Method: POST
- Endpoint: https://openapi.media.io/generation/imagen/t2i-imagen-4
- Description: Sets a new standard for photorealism and text rendering accuracy, handling the most complex prompts with ease.
| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 1200. |
| ratio | string | No | Target aspect ratio. Allowed values: 9:16, 16:9, 1:1, 4:3, 3:4. Default: 9:16. |
| counts | string | No | Counts parameter, passed as a string. Allowed values: 1, 2, 3, 4. Default: 1. |
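The constraints in this table are simple enough to check before spending a request. A minimal sketch; `validate_imagen4_params` is a hypothetical helper, and the limits are copied from the table above.

```python
IMAGEN4_RATIOS = {"9:16", "16:9", "1:1", "4:3", "3:4"}
IMAGEN4_COUNTS = {"1", "2", "3", "4"}  # strings, matching the declared type
MAX_PROMPT_LENGTH = 1200


def validate_imagen4_params(params: dict) -> list:
    """Return a list of problems; an empty list means the params look valid."""
    problems = []
    prompt = params.get("prompt")
    if not isinstance(prompt, str) or not prompt:
        problems.append("prompt is required")
    elif len(prompt) > MAX_PROMPT_LENGTH:
        problems.append(f"prompt exceeds {MAX_PROMPT_LENGTH} characters")
    if "ratio" in params and params["ratio"] not in IMAGEN4_RATIOS:
        problems.append(f"ratio must be one of {sorted(IMAGEN4_RATIOS)}")
    if "counts" in params and params["counts"] not in IMAGEN4_COUNTS:
        problems.append(f"counts must be one of {sorted(IMAGEN4_COUNTS)}")
    return problems
```

Note that `counts` is typed as a string, so an integer `1` is reported as invalid here even though `'1'` is accepted.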
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create image by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/imagen/t2i-imagen-4 \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"counts\": \"1\",\n \"prompt\": \"<string>\",\n \"ratio\": \"9:16\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Imagen 4',
{
"prompt": "<required>",
"ratio": "<optional>",
"counts": "<optional>"
},
api_key=api_key
)
print(result)
```
#### Text To Image-2 soul_character (model_code: t2i-soul-character)
- API ID: 260
- Method: POST
- Endpoint: https://openapi.media.io/generation/soul/t2i-soul-character
- Description: Delivers the most cost-effective solution with ultra-fast generation speeds, perfect for high-volume commercial applications.
| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 2000. |
| ratio | string | No | Target aspect ratio. Allowed values: 2:3, 1:1, 3:2, 9:16, 16:9. Default: 16:9. |
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create image by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/soul/t2i-soul-character \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"prompt\": \"<string>\",\n \"ratio\": \"16:9\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
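The cURL request above can also be expressed with Python's standard library. This sketch only builds the request object and does not send it; `build_request` is a hypothetical helper name, and the `data` wrapper and `X-API-KEY` header are taken from the cURL example.

```python
import json
import urllib.request

ENDPOINT = "https://openapi.media.io/generation/soul/t2i-soul-character"


def build_request(api_key: str, prompt: str, ratio: str = "16:9") -> urllib.request.Request:
    """Build (but do not send) the POST request shown in the cURL example."""
    # Parameters are nested under a top-level "data" key, as in the cURL body.
    body = json.dumps({"data": {"prompt": prompt, "ratio": ratio}}).encode("utf-8")
    return urllib.request.Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json", "X-API-KEY": api_key},
        method="POST",
    )


req = build_request("<api-key>", "a lighthouse at dusk")
print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; a real API key is needed for that step.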
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'soul_character',
{
"prompt": "<required>",
"ratio": "<optional>"
},
api_key=api_key
)
print(result)
```
### Image To Image
#### Image To Image-1 Nano Banana (model_code: i2i-banana)
- API ID: 242
- Method: POST
- Endpoint: https://openapi.media.io/generation/banana/i2i-banana
- Description: Utilizes next-gen multimodal reasoning to generate images that perfectly align with nuanced conceptual descriptions.
| Parameter | Type | Required | Description |
|---|---|---|---|
| images | array<string> | Yes | URLs of the input images. Supported formats: PNG, JPEG, JPG. Maximum file size: 20 MB. Image resolution must be between 300x300 and 1280x1280. Aspect ratio must be between 1:2 and 2:1. Maximum 9 files. |
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 1200. |
| ratio | string | No | Target aspect ratio. Allowed values: 9:16, 16:9, 1:1, 4:3, 3:4, 3:2, 2:3, 21:9. Default: 16:9. |
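
The limits in the table can be checked on the client before a request is sent. This is a minimal sketch that covers only the limits visible without downloading the images (image count, prompt length, ratio value); `validate_i2i_banana` is a hypothetical helper name.

```python
from typing import List, Optional

ALLOWED_RATIOS = {"9:16", "16:9", "1:1", "4:3", "3:4", "3:2", "2:3", "21:9"}
MAX_IMAGES = 9
MAX_PROMPT_LENGTH = 1200


def validate_i2i_banana(
    images: List[str], prompt: str, ratio: Optional[str] = None
) -> List[str]:
    """Return a list of problems; an empty list means the parameters look valid."""
    problems = []
    if not images:
        problems.append("images is required")
    elif len(images) > MAX_IMAGES:
        problems.append(f"at most {MAX_IMAGES} images are allowed, got {len(images)}")
    if not prompt:
        problems.append("prompt is required")
    elif len(prompt) > MAX_PROMPT_LENGTH:
        problems.append(f"prompt exceeds {MAX_PROMPT_LENGTH} characters")
    # ratio is optional; it is only checked when the caller supplies it.
    if ratio is not None and ratio not in ALLOWED_RATIOS:
        problems.append(f"ratio {ratio!r} is not an allowed value")
    return problems
```

File size, resolution, and aspect ratio of each image are not checked here, since that requires fetching the files.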
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create image by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/banana/i2i-banana \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"images\": [\n \"<string>\"\n ],\n \"prompt\": \"<string>\",\n \"ratio\": \"16:9\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Nano Banana',
{
"images": ["<required>"],
"prompt": "<required>",
"ratio": "<optional>"
},
api_key=api_key
)
print(result)
```
#### Image To Image-2 Seedream 4.0 (model_code: i2i-seedream-v4-0)
- API ID: 243
- Method: POST
- Endpoint: https://openapi.media.io/generation/seedream/i2i-seedream-v4-0
- Description: A versatile powerhouse supporting 4K generation and advanced editing, ensuring character and style consistency.
| Parameter | Type | Required | Description |
|---|---|---|---|
| images | array<string> | Yes | URLs of the input images. Supported formats: PNG, JPEG, JPG. Maximum file size: 10 MB. Image resolution must be between 300x300 and 1280x1280. Aspect ratio must be between 3:10 and 3:1. Maximum 9 files. |
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 800. |
| ratio | string | No | Target aspect ratio. Allowed values: 9:16, 16:9, 1:1, 4:3, 3:4, 3:2, 2:3. Default: 9:16. |
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create image by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/seedream/i2i-seedream-v4-0 \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"images\": [\n \"<string>\"\n ],\n \"prompt\": \"<string>\",\n \"ratio\": \"9:16\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Seedream 4.0',
{
"images": ["<required>"],
"prompt": "<required>",
"ratio": "<optional>"
},
api_key=api_key
)
print(result)
```
#### Image To Image-3 Nano Banana Pro (model_code: i2i-banana-2)
- API ID: 244
- Method: POST
- Endpoint: https://openapi.media.io/generation/banana/i2i-banana-2
- Description: Utilizes next-gen multimodal reasoning to generate images that perfectly align with nuanced conceptual descriptions.
| Parameter | Type | Required | Description |
|---|---|---|---|
| images | array<string> | Yes | URLs of the input images. Supported formats: PNG, JPEG, JPG. Maximum file size: 20 MB. Image resolution must be between 300x300 and 1280x1280. Aspect ratio must be between 1:2 and 2:1. Maximum 9 files. |
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 1200. |
| ratio | string | No | Target aspect ratio. Allowed values: 9:16, 16:9, 1:1, 4:3, 3:4, 3:2, 2:3, 21:9. Default: 16:9. |
| resolution | string | No | Output resolution. Allowed values: 1K, 2K, 4K. Default: 1K. |
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create image by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/banana/i2i-banana-2 \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"images\": [\n \"<string>\"\n ],\n \"prompt\": \"<string>\",\n \"ratio\": \"16:9\",\n \"resolution\": \"1K\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Nano Banana Pro',
{
"images": ["<required>"],
"prompt": "<required>",
"ratio": "<optional>",
"resolution": "<optional>"
},
api_key=api_key
)
print(result)
```
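Optional parameters in the template can be left out so the server applies its defaults. This is a minimal sketch of one way to build the parameter dict; `compact_params` is a hypothetical helper, and the resulting dict is meant to be passed as the second argument to `skill.invoke` as in the template above.

```python
from typing import Any, Dict


def compact_params(**params: Any) -> Dict[str, Any]:
    """Drop parameters whose value is None so only explicit choices are sent."""
    return {name: value for name, value in params.items() if value is not None}


params = compact_params(
    images=["https://example.com/input.png"],  # placeholder URL
    prompt="turn the sketch into a watercolor",
    ratio=None,          # omitted, so the documented default applies
    resolution="2K",
)
print(params)
```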
### Image To Video
#### Image To Video-1 Wan 2.6 (model_code: i2v-wan-v2-6)
- API ID: 245
- Method: POST
- Endpoint: https://openapi.media.io/generation/wan/i2v-wan-v2-6
- Description: Features advanced multi-shot storytelling and native audio-visual synchronization, ideal for creating complex narratives with consistent characters.
| Parameter | Type | Required | Description |
|---|---|---|---|
| image | string | Yes | URL of the input image. Supported formats: PNG, JPEG, JPG, WEBP, BMP. Maximum file size: 10 MB. Image resolution must be between 300x300 and 2000x2000. Aspect ratio must be between 0.4:1 and 2.5:1. |
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 1500. |
| resolution | string | No | Output resolution. Allowed values: 720P, 1080P. Default: 720P. |
| duration | string | No | Video duration. Allowed values: 5s, 10s, 15s. Default: 5s. |
| shot_type | string | No | Shot type. Allowed values: Multi, Single. Default: Multi. |
| generate_audio | string | No | Whether to generate audio. Allowed values: True, False (passed as strings). Default: True. |
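
Every parameter in the table is a string, including the flag and the duration. The sketch below converts native Python values into the documented string forms and wraps them in the `data` object used by the request examples; `build_wan26_payload` and its keyword names are hypothetical.

```python
from typing import Any, Dict

ALLOWED_DURATIONS = (5, 10, 15)
ALLOWED_RESOLUTIONS = ("720P", "1080P")


def build_wan26_payload(
    image_url: str,
    prompt: str,
    *,
    resolution: str = "720P",
    duration_seconds: int = 5,
    multi_shot: bool = True,
    generate_audio: bool = True,
) -> Dict[str, Any]:
    """Return the JSON body for i2v-wan-v2-6 with string-typed values."""
    if duration_seconds not in ALLOWED_DURATIONS:
        raise ValueError(f"duration must be one of {ALLOWED_DURATIONS} seconds")
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {ALLOWED_RESOLUTIONS}")
    return {
        "data": {
            "image": image_url,
            "prompt": prompt,
            "resolution": resolution,
            "duration": f"{duration_seconds}s",           # 5 becomes "5s"
            "shot_type": "Multi" if multi_shot else "Single",
            "generate_audio": "True" if generate_audio else "False",
        }
    }
```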
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create video by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/wan/i2v-wan-v2-6 \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"duration\": \"5s\",\n \"generate_audio\": \"True\",\n \"image\": \"<string>\",\n \"prompt\": \"<string>\",\n \"resolution\": \"720P\",\n \"shot_type\": \"Multi\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Wan 2.6',
{
"image": "<required>",
"prompt": "<required>",
"resolution": "<optional>",
"duration": "<optional>",
"shot_type": "<optional>",
"generate_audio": "<optional>"
},
api_key=api_key
)
print(result)
```
#### Image To Video-2 Wan 2.2 (model_code: i2v-wan-v2-2)
- API ID: 246
- Method: POST
- Endpoint: https://openapi.media.io/generation/wan/i2v-wan-v2-2
- Description: A balanced video generation model offering reliable motion quality and prompt adherence for general tasks.
| Parameter | Type | Required | Description |
|---|---|---|---|
| image | string | Yes | URL of the input image. Supported formats: PNG, JPEG, JPG, WEBP, BMP. Maximum file size: 10 MB. Image resolution must be between 300x300 and 2000x2000. Aspect ratio must be between 0.4:1 and 2.5:1. |
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 800. |
| resolution | string | No | Output resolution. Only 1080P is supported, which is also the default. |
| duration | string | No | Video duration. Only 5s is supported, which is also the default. |
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create video by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/wan/i2v-wan-v2-2 \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"duration\": \"5s\",\n \"image\": \"<string>\",\n \"prompt\": \"<string>\",\n \"resolution\": \"1080P\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Wan 2.2',
{
"image": "<required>",
"prompt": "<required>",
"resolution": "<optional>",
"duration": "<optional>"
},
api_key=api_key
)
print(result)
```
#### Image To Video-3 Hailuo 02 (model_code: i2v-minimax-02)
- API ID: 247
- Method: POST
- Endpoint: https://openapi.media.io/generation/minimax/i2v-minimax-02
- Description: Excels in cinematic realism and complex physics simulation, offering 'Director-level' control over camera movements and high fidelity.
| Parameter | Type | Required | Description |
|---|---|---|---|
| image | string | Yes | URL of the input image. Supported formats: PNG, JPEG, JPG, WEBP, GIF, HEIC. Maximum file size: 20 MB. Image resolution must be between 300x300 and 4000x4000. Aspect ratio must be between 0.4:1 and 2.5:1. |
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 2000. |
| duration | string | No | Video duration. Only 6s is supported, which is also the default. |
| resolution | string | No | Output resolution. Allowed values: 768P, 1080P. Default: 768P. |
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create video by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/minimax/i2v-minimax-02 \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"duration\": \"6s\",\n \"image\": \"<string>\",\n \"prompt\": \"<string>\",\n \"resolution\": \"768P\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Hailuo 02',
{
"image": "<required>",
"prompt": "<required>",
"duration": "<optional>",
"resolution": "<optional>"
},
api_key=api_key
)
print(result)
```
#### Image To Video-4 Kling 2.1 (model_code: i2v-kling-v2-1)
- API ID: 248
- Method: POST
- Endpoint: https://openapi.media.io/generation/kling/i2v-kling-v2-1
- Description: A robust model known for producing realistic videos with stable motion and good prompt understanding.
| Parameter | Type | Required | Description |
|---|---|---|---|
| image | string | Yes | URL of the input image. Supported formats: PNG, JPEG, JPG, WEBP, GIF, HEIC. Maximum file size: 20 MB. Image resolution must be between 300x300 and 4000x4000. Aspect ratio must be between 0.4:1 and 2.5:1. |
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 2000. |
| duration | string | No | Video duration. Allowed values: 5s, 10s. Default: 5s. |
| resolution | string | No | Output resolution. Allowed values: 720P, 1080P. Default: 720P. |
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create video by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/kling/i2v-kling-v2-1 \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"duration\": \"5s\",\n \"image\": \"<string>\",\n \"prompt\": \"<string>\",\n \"resolution\": \"720P\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Kling 2.1',
{
"image": "<required>",
"prompt": "<required>",
"duration": "<optional>",
"resolution": "<optional>"
},
api_key=api_key
)
print(result)
```
#### Image To Video-5 Vidu Q3 (model_code: i2v-vidu-q3)
- API ID: 249
- Method: POST
- Endpoint: https://openapi.media.io/generation/vidu/i2v-vidu-q3
- Description: The latest iteration optimized for superior speed and high-definition output, enabling rapid high-quality video creation.
| Parameter | Type | Required | Description |
|---|---|---|---|
| image | string | Yes | URL of the input image. Supported formats: PNG, JPEG, JPG. Maximum file size: 10 MB. Image resolution must be between 150x150 and 4000x4000. Aspect ratio must be between 1:4 and 4:1. |
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 2000. |
| resolution | string | No | Output resolution. Allowed values: 540P, 720P, 1080P. Default: 720P. |
| duration | string | No | Video duration. Allowed values: 4s, 8s, 12s, 16s. Default: 4s. |
| generate_audio | string | No | Whether to generate audio. Allowed values: True, False (passed as strings). Default: True. |
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create video by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/vidu/i2v-vidu-q3 \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"duration\": \"4s\",\n \"generate_audio\": \"True\",\n \"image\": \"<string>\",\n \"prompt\": \"<string>\",\n \"resolution\": \"720P\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Vidu Q3',
{
"image": "<required>",
"prompt": "<required>",
"resolution": "<optional>",
"duration": "<optional>",
"generate_audio": "<optional>"
},
api_key=api_key
)
print(result)
```
#### Image To Video-6 Kling 2.5 Turbo (model_code: i2v-kling-v2-5-turbo)
- API ID: 251
- Method: POST
- Endpoint: https://openapi.media.io/generation/kling/i2v-kling-v2-5-turbo
- Description: Offers advanced motion control and high realism, bridging the gap between standard and professional output.
| Parameter | Type | Required | Description |
|---|---|---|---|
| image | string | Yes | URL of the input image. Supported formats: PNG, JPEG, JPG, WEBP, GIF, HEIC. Maximum file size: 20 MB. Image resolution must be between 300x300 and 4000x4000. Aspect ratio must be between 0.4:1 and 2.5:1. |
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 2500. |
| duration | string | No | Video duration. Allowed values: 5s, 10s. Default: 5s. |
| resolution | string | No | Output resolution. Allowed values: 720P, 1080P. Default: 1080P. |
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create video by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/kling/i2v-kling-v2-5-turbo \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"duration\": \"5s\",\n \"image\": \"<string>\",\n \"prompt\": \"<string>\",\n \"resolution\": \"1080P\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Kling 2.5 Turbo',
{
"image": "<required>",
"prompt": "<required>",
"duration": "<optional>",
"resolution": "<optional>"
},
api_key=api_key
)
print(result)
```
#### Image To Video-7 Google Veo 3.1 (model_code: i2v-veo-v3-1)
- API ID: 252
- Method: POST
- Endpoint: https://openapi.media.io/generation/veo/i2v-veo-v3-1
- Description: A state-of-the-art model offering exceptional resolution and deep understanding of complex prompts for cinematic results.
| Parameter | Type | Required | Description |
|---|---|---|---|
| image | string | Yes | URL of the input image. Supported formats: JPG, JPEG, PNG, WEBP, GIF, HEIC. Maximum file size: 50 MB. Image resolution must be between 300x300 and 4000x4000. Aspect ratio must be between 0.4:1 and 2:1. |
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 1000. |
| aspect_ratio | string | No | Target aspect ratio. Allowed values: 9:16, 16:9. Default: 9:16. |
| resolution | string | No | Output resolution. Allowed values: 720P, 1080P. Default: 720P. |
| duration | string | No | Video duration. Allowed values: 4s, 6s, 8s. Default: 4s. |
| generate_audio | string | No | Whether to generate audio. Allowed values: True, False (passed as strings). Default: True. |
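
The input-image limits in the table can be checked once the pixel dimensions are known. This sketch makes two assumptions that the table does not state explicitly: "between 300x300 and 4000x4000" is read as each side lying in the range 300 to 4000 pixels, and the aspect ratio is read as width divided by height. `image_dimension_problems` is a hypothetical helper name.

```python
from typing import List


def image_dimension_problems(
    width: int,
    height: int,
    *,
    min_side: int = 300,
    max_side: int = 4000,
    min_aspect: float = 0.4,
    max_aspect: float = 2.0,
) -> List[str]:
    """Return the documented limits that a width x height image would violate."""
    if width <= 0 or height <= 0:
        return ["width and height must be positive"]
    problems = []
    for name, value in (("width", width), ("height", height)):
        # Assumption: the resolution bounds apply to each side independently.
        if not min_side <= value <= max_side:
            problems.append(f"{name} {value} is outside {min_side}..{max_side}")
    aspect = width / height  # assumption: ratio is expressed as width:height
    if not min_aspect <= aspect <= max_aspect:
        problems.append(f"aspect ratio {aspect:.2f}:1 is outside {min_aspect}:1..{max_aspect}:1")
    return problems
```

The defaults match this model's row; other models in this section list different bounds, which can be supplied through the keyword arguments.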
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create video by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/veo/i2v-veo-v3-1 \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"aspect_ratio\": \"9:16\",\n \"duration\": \"4s\",\n \"generate_audio\": \"True\",\n \"image\": \"<string>\",\n \"prompt\": \"<string>\",\n \"resolution\": \"720P\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Google Veo 3.1',
{
"image": "<required>",
"prompt": "<required>",
"aspect_ratio": "<optional>",
"resolution": "<optional>",
"duration": "<optional>",
"generate_audio": "<optional>"
},
api_key=api_key
)
print(result)
```
#### Image To Video-8 Kling 3.0 (model_code: i2v-kling-v3-0)
- API ID: 253
- Method: POST
- Endpoint: https://openapi.media.io/generation/kling/i2v-kling-v3-0
- Description: The latest high-fidelity generation delivering ultra-realistic motion and superior detail, suitable for professional-grade video production.
| Parameter | Type | Required | Description |
|---|---|---|---|
| image | string | Yes | URL of the input image. Supported formats: PNG, JPEG, JPG, WEBP, GIF, HEIC. Maximum file size: 20 MB. Image resolution must be between 300x300 and 4000x4000. Aspect ratio must be between 0.4:1 and 2.5:1. |
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 2500. |
| resolution | string | No | Output resolution. Allowed values: 720P, 1080P. Default: 1080P. |
| duration | string | No | Video duration. Allowed values: 5s, 8s, 10s, 15s. Default: 5s. |
| multi_shots | string | No | Whether to generate multiple shots. Allowed values: True, False (passed as strings). Default: True. |
| generate_audio | string | No | Whether to generate audio. Allowed values: True, False (passed as strings). Default: True. |
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create video by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/kling/i2v-kling-v3-0 \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"duration\": \"5s\",\n \"generate_audio\": \"True\",\n \"image\": \"<string>\",\n \"multi_shots\": \"True\",\n \"prompt\": \"<string>\",\n \"resolution\": \"1080P\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Kling 3.0',
{
"image": "<required>",
"prompt": "<required>",
"resolution": "<optional>",
"duration": "<optional>",
"multi_shots": "<optional>",
"generate_audio": "<optional>"
},
api_key=api_key
)
print(result)
```
#### Image To Video-9 Hailuo 2.3 (model_code: i2v-minimax-v2-3)
- API ID: 254
- Method: POST
- Endpoint: https://openapi.media.io/generation/minimax/i2v-minimax-v2-3
- Description: Excels in cinematic realism and complex physics simulation, offering 'Director-level' control over camera movements and high fidelity.
| Parameter | Type | Required | Description |
|---|---|---|---|
| image | string | Yes | The URL of the input image. Supported formats include PNG, JPEG, JPG, WEBP, GIF, and HEIC. File size must be between 0.0 MB and 20.0 MB. Image resolution must be between 300x300 and 4000x4000. Aspect ratio must be between 0.4:1 and 2.5:1. |
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 2000. |
| resolution | string | No | Output resolution. Allowed values: 768P, 1080P (1080P only when duration is 6s). Default: 768P. |
| duration | string | No | Video duration in seconds. Allowed values: 6s, 10s. Default: 6s. |
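
The table couples `resolution` and `duration`, which is easy to miss when building a payload. The following is a rough sketch of a pre-flight check, assuming the note on `resolution` means 1080P is accepted only with a 6s duration; `check_hailuo_params` is a hypothetical helper, not something the skill provides.

```python
ALLOWED_RESOLUTIONS = ("768P", "1080P")
ALLOWED_DURATIONS = ("6s", "10s")
MAX_PROMPT_LENGTH = 2000


def check_hailuo_params(params):
    """Return a list of problems found in a Hailuo 2.3 parameter dict."""
    problems = []
    for name in ("image", "prompt"):
        if not params.get(name):
            problems.append(f"missing required parameter: {name}")
    if len(params.get("prompt") or "") > MAX_PROMPT_LENGTH:
        problems.append("prompt exceeds 2000 characters")
    # Fall back to the documented defaults when a value is omitted.
    resolution = params.get("resolution", "768P")
    duration = params.get("duration", "6s")
    if resolution not in ALLOWED_RESOLUTIONS:
        problems.append(f"unsupported resolution: {resolution}")
    if duration not in ALLOWED_DURATIONS:
        problems.append(f"unsupported duration: {duration}")
    # Assumption: 1080P is only accepted together with the 6s duration.
    if resolution == "1080P" and duration != "6s":
        problems.append("1080P requires duration 6s")
    return problems


print(check_hailuo_params({
    "image": "https://example.com/input.jpg",
    "prompt": "Slow pan across a mountain lake",
    "resolution": "1080P",
    "duration": "10s",
}))
```

Catching this combination locally avoids submitting a task that the server would presumably reject.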
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create video by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/minimax/i2v-minimax-v2-3 \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"duration\": \"6s\",\n \"image\": \"<string>\",\n \"prompt\": \"<string>\",\n \"resolution\": \"768P\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Hailuo 2.3',
{
"image": "<required>",
"prompt": "<required>",
"resolution": "<optional>",
"duration": "<optional>"
},
api_key=api_key
)
print(result)
```
#### Image To Video-10 Vidu Q2 (model_code: i2v-vidu-q2)
- API ID: 255
- Method: POST
- Endpoint: https://openapi.media.io/generation/vidu/i2v-vidu-q2
- Description: A fast and efficient model that balances generation speed with good visual fidelity.
| Parameter | Type | Required | Description |
|---|---|---|---|
| image | string | Yes | The URL of the input image. Supported formats include PNG, JPEG, JPG, WEBP, GIF, and HEIC. File size must be between 0.0 MB and 10.0 MB. Image resolution must be between 150x150 and 4000x4000. Aspect ratio must be between 1:4 and 4:1. |
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 2000. |
| duration | string | No | Video duration in seconds. Allowed values: 4s, 6s, 8s. Default: 4s. |
| resolution | string | No | Output resolution. Allowed values: 720P, 1080P. Default: 720P. |
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create video by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/vidu/i2v-vidu-q2 \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"duration\": \"4s\",\n \"image\": \"<string>\",\n \"prompt\": \"<string>\",\n \"resolution\": \"720P\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Vidu Q2',
{
"image": "<required>",
"prompt": "<required>",
"duration": "<optional>",
"resolution": "<optional>"
},
api_key=api_key
)
print(result)
```
#### Image To Video-11 Wan 2.5 (model_code: i2v-wan-v2-5)
- API ID: 256
- Method: POST
- Endpoint: https://openapi.media.io/generation/wan/i2v-wan-v2-5
- Description: An enhanced version offering improved motion stability and visual quality compared to earlier iterations.
| Parameter | Type | Required | Description |
|---|---|---|---|
| image | string | Yes | The URL of the input image. Supported formats include PNG, JPEG, JPG, WEBP, and BMP. File size must be between 0.0 MB and 10.0 MB. Image resolution must be between 300x300 and 2000x2000. Aspect ratio must be between 0.4:1 and 2.5:1. |
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 800. |
| resolution | string | No | Output resolution. Allowed values: 480P, 720P, 1080P. Default: 480P. |
| duration | string | No | Video duration in seconds. Allowed values: 5s, 10s. Default: 5s. |
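
The `image` limits above can be checked locally before uploading. This sketch is illustrative only and rests on two assumptions that the table does not spell out: the resolution bounds apply to each side independently, and the aspect ratio is width divided by height. `check_wan25_image` is a hypothetical helper, and the caller is expected to supply the image's dimensions and file size.

```python
def check_wan25_image(width, height, size_mb):
    """Return problems for an input image, given its pixel size and file size."""
    problems = []
    # Assumption: the resolution bounds apply to each side independently.
    if not (300 <= width <= 2000 and 300 <= height <= 2000):
        problems.append(f"resolution {width}x{height} is outside 300x300..2000x2000")
    # Assumption: aspect ratio means width divided by height.
    ratio = width / height
    if not (0.4 <= ratio <= 2.5):
        problems.append(f"aspect ratio {ratio:.2f}:1 is outside 0.4:1..2.5:1")
    if size_mb > 10.0:
        problems.append(f"file size {size_mb} MB exceeds 10 MB")
    return problems


print(check_wan25_image(1280, 720, 2.5))   # landscape 16:9 image
print(check_wan25_image(2000, 400, 12.0))  # too wide and too large
```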
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create video by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/wan/i2v-wan-v2-5 \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"duration\": \"5s\",\n \"image\": \"<string>\",\n \"prompt\": \"<string>\",\n \"resolution\": \"480P\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Wan 2.5',
{
"image": "<required>",
"prompt": "<required>",
"resolution": "<optional>",
"duration": "<optional>"
},
api_key=api_key
)
print(result)
```
#### Image To Video-12 Kling 2.6 (model_code: i2v-kling-v2-6)
- API ID: 257
- Method: POST
- Endpoint: https://openapi.media.io/generation/kling/i2v-kling-v2-6
- Description: Kling AI's video model, recognized for its ability to generate realistic and coherent motion.
| Parameter | Type | Required | Description |
|---|---|---|---|
| image | string | Yes | The URL of the input image. Supported formats include PNG, JPEG, JPG, WEBP, GIF, and HEIC. File size must be between 0.0 MB and 20.0 MB. Image resolution must be between 300x300 and 4000x4000. Aspect ratio must be between 0.4:1 and 2.5:1. |
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 2500. |
| duration | string | No | Video duration in seconds. Allowed values: 5s, 10s. Default: 5s. |
| generate_audio | string | No | Whether to generate audio for the video, passed as a string. Allowed values: True, False. Default: True. |
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create video by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/kling/i2v-kling-v2-6 \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"duration\": \"5s\",\n \"generate_audio\": \"True\",\n \"image\": \"<string>\",\n \"prompt\": \"<string>\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Kling 2.6',
{
"image": "<required>",
"prompt": "<required>",
"duration": "<optional>",
"generate_audio": "<optional>"
},
api_key=api_key
)
print(result)
```
#### Image To Video-13 Motion Control Kling 2.6 (model_code: i2v-motion-control-kling-v2-6)
- API ID: 258
- Method: POST
- Endpoint: https://openapi.media.io/generation/kling/i2v-motion-control-kling-v2-6
- Description: Offers advanced motion control and high realism, bridging the gap between standard and professional output.
| Parameter | Type | Required | Description |
|---|---|---|---|
| video | string | Yes | The URL of the input video. Supported formats include MP4 and MOV. File size must be between 0.0 MB and 100.0 MB. Video resolution must be between 720x720 and 2160x2160. Frame rate must be between 24 and 60 FPS. Duration must be between 3 and 30 seconds. |
| image | string | Yes | The URL of the input image. Supported formats include PNG, JPEG, and JPG. File size must be between 0.0 MB and 10.0 MB. Image resolution must be between 300x300 and 4000x4000. Aspect ratio must be between 1:2 and 2:1. |
| prompt | string | No | The text prompt describing the content to generate. Maximum string length: 2500. |
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create video by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/kling/i2v-motion-control-kling-v2-6 \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"image\": \"<string>\",\n \"prompt\": \"<string>\",\n \"video\": \"<string>\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Motion Control Kling 2.6',
{
"video": "<required>",
"image": "<required>",
"prompt": "<optional>"
},
api_key=api_key
)
print(result)
```
#### Image To Video-14 Google Veo 3.1 Fast (model_code: i2v-veo-v3-1-fast)
- API ID: 264
- Method: POST
- Endpoint: https://openapi.media.io/generation/veo/i2v-veo-v3-1-fast
- Description: Google Veo 3.1 Fast is an AI model for video generation.
| Parameter | Type | Required | Description |
|---|---|---|---|
| image | string | Yes | The URL of the input image. Supported formats include JPG, JPEG, PNG, WEBP, GIF, and HEIC. File size must be between 0.0 MB and 50.0 MB. Image resolution must be between 300x300 and 4000x4000. Aspect ratio must be between 0.4:1 and 2:1. |
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 1000. |
| aspect_ratio | string | No | Target aspect ratio. Allowed values: 9:16, 16:9. Default: 9:16. |
| resolution | string | No | Output resolution. Allowed values: 720P, 1080P. Default: 720P. |
| duration | string | No | Video duration in seconds. Allowed values: 4s, 6s, 8s. Default: 4s. |
| generate_audio | string | No | Whether to generate audio for the video, passed as a string. Allowed values: True, False. Default: True. |
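
Every optional parameter in this table is typed as a string, including `generate_audio`. As a hedged sketch of how a caller might fill in the documented defaults and reject other values (the names `VEO_FAST_OPTIONS`, `VEO_FAST_DEFAULTS`, and `normalize_veo_fast` are made up for illustration):

```python
VEO_FAST_OPTIONS = {
    "aspect_ratio": ("9:16", "16:9"),
    "resolution": ("720P", "1080P"),
    "duration": ("4s", "6s", "8s"),
    "generate_audio": ("True", "False"),
}
VEO_FAST_DEFAULTS = {
    "aspect_ratio": "9:16",
    "resolution": "720P",
    "duration": "4s",
    "generate_audio": "True",
}


def normalize_veo_fast(params):
    """Fill documented defaults and reject values outside the allowed sets."""
    merged = {**VEO_FAST_DEFAULTS, **params}
    for name, allowed in VEO_FAST_OPTIONS.items():
        value = merged[name]
        # Options are strings, so the Python bool True is rejected here.
        if value not in allowed:
            raise ValueError(f"{name}={value!r} is not one of {allowed}")
    return merged


payload = normalize_veo_fast({
    "image": "https://example.com/input.jpg",
    "prompt": "A kite rising over a beach at sunset",
    "duration": "8s",
})
print(payload)
```

Sending the defaults explicitly is optional; the point of the sketch is that a value such as the bool `True` or the string `"8"` is caught before the request is made.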
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create video by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/veo/i2v-veo-v3-1-fast \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"aspect_ratio\": \"9:16\",\n \"duration\": \"4s\",\n \"generate_audio\": \"True\",\n \"image\": \"<string>\",\n \"prompt\": \"<string>\",\n \"resolution\": \"720P\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Google Veo 3.1 Fast',
{
"image": "<required>",
"prompt": "<required>",
"aspect_ratio": "<optional>",
"resolution": "<optional>",
"duration": "<optional>",
"generate_audio": "<optional>"
},
api_key=api_key
)
print(result)
```
### Text To Video
#### Text To Video-1 Wan 2.6 (model_code: t2v-wan-v2-6)
- API ID: 261
- Method: POST
- Endpoint: https://openapi.media.io/generation/wan/t2v-wan-v2-6
- Description: Features advanced multi-shot storytelling and native audio-visual synchronization, ideal for creating complex narratives with consistent characters.
| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 1500. |
| size | string | No | Output size, given as an aspect ratio. Allowed values: 9:16, 16:9, 1:1, 4:3, 3:4. Default: 16:9. |
| duration | string | No | Video duration in seconds. Allowed values: 5s, 10s, 15s. Default: 5s. |
| shot_type | string | No | Whether the video uses multiple shots or a single shot. Allowed values: Multi, Single. Default: Multi. |
| generate_audio | string | No | Whether to generate audio for the video, passed as a string. Allowed values: True, False. Default: True. |
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create video by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/wan/t2v-wan-v2-6 \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"duration\": \"5s\",\n \"generate_audio\": \"True\",\n \"prompt\": \"<string>\",\n \"shot_type\": \"Multi\",\n \"size\": \"16:9\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
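The cURL example above wraps the parameters in a top-level `data` object and authenticates with the `X-API-KEY` header. For callers who are not using `skill.invoke`, a standard-library equivalent might look like the sketch below. `build_request` is a hypothetical helper, and the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_request(endpoint, params, api_key):
    """Build a POST request that mirrors the documented cURL call."""
    # The parameters go inside a top-level "data" object, as in the cURL example.
    body = json.dumps({"data": params}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-API-KEY": api_key},
    )


req = build_request(
    "https://openapi.media.io/generation/wan/t2v-wan-v2-6",
    {"prompt": "A paper boat drifting down a rainy street", "duration": "5s", "size": "16:9"},
    api_key="<api-key>",
)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To send it: urllib.request.urlopen(req, timeout=30)
```

Building the request separately from sending it makes the payload shape easy to inspect before any credits are spent.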
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Wan 2.6',
{
"prompt": "<required>",
"size": "<optional>",
"duration": "<optional>",
"shot_type": "<optional>",
"generate_audio": "<optional>"
},
api_key=api_key
)
print(result)
```
#### Text To Video-2 Vidu Q3 (model_code: t2v-vidu-q3)
- API ID: 262
- Method: POST
- Endpoint: https://openapi.media.io/generation/vidu/t2v-vidu-q3
- Description: The latest iteration optimized for superior speed and high-definition output, enabling rapid high-quality video creation.
| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 2000. |
| ratio | string | No | Target aspect ratio. Allowed values: 9:16, 16:9, 1:1, 4:3, 3:4. Default: 16:9. |
| resolution | string | No | Output resolution. Allowed values: 540P, 720P, 1080P. Default: 720P. |
| duration | string | No | Video duration in seconds. Allowed values: 4s, 8s, 12s, 16s. Default: 8s. |
| generate_audio | string | No | Whether to generate audio for the video, passed as a string. Allowed values: True, False. Default: True. |
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create video by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/vidu/t2v-vidu-q3 \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"duration\": \"8s\",\n \"generate_audio\": \"True\",\n \"prompt\": \"<string>\",\n \"ratio\": \"16:9\",\n \"resolution\": \"720P\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Vidu Q3',
{
"prompt": "<required>",
"ratio": "<optional>",
"resolution": "<optional>",
"duration": "<optional>",
"generate_audio": "<optional>"
},
api_key=api_key
)
print(result)
```
#### Text To Video-3 Wan 2.5 (model_code: t2v-wan-v2-5)
- API ID: 263
- Method: POST
- Endpoint: https://openapi.media.io/generation/wan/t2v-wan-v2-5
- Description: An enhanced version offering improved motion stability and visual quality compared to earlier iterations.
| Parameter | Type | Required | Description |
|---|---|---|---|
| prompt | string | Yes | The text prompt describing the content to generate. Maximum string length: 800. |
| ratio | string | No | Target aspect ratio. Allowed values: 16:9, 9:16, 1:1, 4:3, 3:4. Default: 16:9. |
| duration | string | No | Video duration in seconds. Allowed values: 5s, 10s. Default: 5s. |
##### Request Example
```json
{
"title": "Example Request",
"request": [
{
"title": "create video by text or image",
"language": "cURL",
"code_example": "curl --request POST \\\n --url https://openapi.media.io/generation/wan/t2v-wan-v2-5 \\\n --header 'Content-Type: application/json' \\\n --header 'X-API-KEY: <api-key>' \\\n --data '\n{\n \"data\": {\n \"duration\": \"5s\",\n \"prompt\": \"<Your prompt text, 1-800 characters>\",\n \"ratio\": \"16:9\"\n }\n}'\n"
}
],
"describe": "",
"response": [
{
"type": "200",
"code_example": "{\n \"code\": 0,\n \"msg\": \"\",\n \"data\": {\n \"task_id\": \"<string>\"\n },\n \"trace_id\": \"<string>\"\n}"
},
{
"type": "default",
"code_example": "{\n \"code\": <integer>,\n \"msg\": \"<string>\",\n \"data\": {},\n \"trace_id\": \"<string>\"\n}"
}
]
}
```
##### Response Example
```json
{
"list": [
{
"name": "code",
"type": "integer",
"describe": "Response status code, 0 indicates success"
},
{
"name": "msg",
"type": "string",
"describe": "Response message, empty string on success"
},
{
"name": "data",
"type": "object",
"describe": "Response data object"
},
{
"name": "task_id",
"type": "string",
"describe": "Unique task identifier, located within the data object"
},
{
"name": "trace_id",
"type": "string",
"describe": "Request tracking ID"
}
],
"title": "Response",
"describe": "After the request is successfully processed, the server will return the following response"
}
```
##### Skill.invoke Template
```python
result = skill.invoke(
'Wan 2.5',
{
"prompt": "<required>",
"ratio": "<optional>",
"duration": "<optional>"
},
api_key=api_key
)
print(result)
```
## 4. Recommended Runtime Templates
### 4.1 Text-to-Image Workflow (`Imagen 4` + `Task Result`)
```python
import os
import time
from scripts.skill_router import Skill
skill = Skill('scripts/c_api_doc_detail.json')
api_key = os.getenv('API_KEY', '')
# Submit the generation task.
create = skill.invoke('Imagen 4', {
    'prompt': 'A cute kitten, photorealistic, soft natural light, highly detailed',
    'ratio': '1:1',
    'counts': '1'
}, api_key=api_key)
task_id = (create.get('data') or {}).get('task_id')
if not task_id:
    raise SystemExit(f'Task creation failed: {create}')
# Poll every 5 seconds, up to 24 times (about 2 minutes).
for _ in range(24):
    r = skill.invoke('Task Result', {'task_id': task_id}, api_key=api_key)
    status = (r.get('data') or {}).get('status')
    # Stop as soon as the task reaches a terminal status.
    if status in ('completed', 'succeeded', 'failed', 'timeout'):
        print(r)
        break
    time.sleep(5)
```
## 5. Quick Reference (Minimum Viable Params)
### 5.1 Query APIs
| API | Purpose | Minimum params |
|---|---|---|
| Credits | Query account credits | {} |
| Task Result | Query task status/result | {"task_id": "<task_id>"} |
### 5.2 Text To Image
| API | Minimum params |
|---|---|
| Imagen 4 | {"prompt": "<text prompt>"} |
| soul_character | {"prompt": "<text prompt>"} |
### 5.3 Image To Image
| API | Minimum params |
|---|---|
| Nano Banana | {"images": ["<image_url>"], "prompt": "<text prompt>"} |
| Seedream 4.0 | {"images": "<image_url>", "prompt": "<text prompt>"} |
| Nano Banana Pro | {"images": "<image_url>", "prompt": "<text prompt>"} |
### 5.4 Image To Video
| API | Minimum params |
|---|---|
| Wan 2.6 | {"image": "<image_url>", "prompt": "<text prompt>"} |
| Wan 2.2 | {"image": "<image_url>", "prompt": "<text prompt>"} |
| Hailuo 02 | {"image": "<image_url>", "prompt": "<text prompt>"} |
| Kling 2.1 | {"image": "<image_url>", "prompt": "<text prompt>"} |
| Vidu Q3 | {"image": "<image_url>", "prompt": "<text prompt>"} |
| Kling 2.5 Turbo | {"image": "<image_url>", "prompt": "<text prompt>"} |
| Google Veo 3.1 | {"image": "<image_url>", "prompt": "<text prompt>"} |
| Kling 3.0 | {"image": "<image_url>", "prompt": "<text prompt>"} |
| Hailuo 2.3 | {"image": "<image_url>", "prompt": "<text prompt>"} |
| Vidu Q2 | {"image": "<image_url>", "prompt": "<text prompt>"} |
| Wan 2.5 | {"image": "<image_url>", "prompt": "<text prompt>"} |
| Kling 2.6 | {"image": "<image_url>", "prompt": "<text prompt>"} |
| Motion Control Kling 2.6 | {"video": "<video_url>", "image": "<image_url>"} |
| Google Veo 3.1 Fast | {"image": "<image_url>", "prompt": "<text prompt>"} |
### 5.5 Text To Video
| API | Minimum params |
|---|---|
| Wan 2.6 | {"prompt": "<text prompt>"} |
| Vidu Q3 | {"prompt": "<text prompt>"} |
| Wan 2.5 | {"prompt": "<text prompt>"} |
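
Note that `Wan 2.6`, `Vidu Q3`, and `Wan 2.5` each appear under both Image To Video and Text To Video, so the display name alone does not identify the minimum parameters. The sketch below keys the lookup by `model_code` instead, covering only the models whose `model_code` is listed in this part of the document. `REQUIRED_PARAMS` and `missing_params` are illustrative names, and how `skill.invoke` resolves duplicate names is not shown here.

```python
# Required parameters per model_code, taken from the parameter tables.
REQUIRED_PARAMS = {
    "i2v-minimax-v2-3": ("image", "prompt"),
    "i2v-vidu-q2": ("image", "prompt"),
    "i2v-wan-v2-5": ("image", "prompt"),
    "i2v-kling-v2-6": ("image", "prompt"),
    "i2v-motion-control-kling-v2-6": ("video", "image"),
    "i2v-veo-v3-1-fast": ("image", "prompt"),
    "t2v-wan-v2-6": ("prompt",),
    "t2v-vidu-q3": ("prompt",),
    "t2v-wan-v2-5": ("prompt",),
}


def missing_params(model_code, params):
    """Return the required parameter names that are absent or empty."""
    required = REQUIRED_PARAMS[model_code]
    return [name for name in required if not params.get(name)]


# The same payload is complete for text-to-video but not for image-to-video.
print(missing_params("t2v-wan-v2-5", {"prompt": "A lighthouse in a storm"}))
print(missing_params("i2v-wan-v2-5", {"prompt": "A lighthouse in a storm"}))
```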
## 6. Reusable Helper Snippets
### 6.1 Unified submit + poll function
```python
import os
import time
from scripts.skill_router import Skill
skill = Skill('scripts/c_api_doc_detail.json')
api_key = os.getenv('API_KEY', '')
def submit_and_wait(api_name, params, retries=36, interval_sec=5):
    """Create a task, then poll 'Task Result' until it reaches a terminal status."""
    create = skill.invoke(api_name, params, api_key=api_key)
    task_id = (create.get('data') or {}).get('task_id')
    if not task_id:
        # Creation failed; return the raw response for inspection.
        return {'stage': 'create', 'response': create}
    for _ in range(retries):
        r = skill.invoke('Task Result', {'task_id': task_id}, api_key=api_key)
        status = (r.get('data') or {}).get('status')
        if status in ('completed', 'succeeded', 'failed', 'timeout'):
            return {'stage': 'result', 'task_id': task_id, 'response': r}
        time.sleep(interval_sec)
    # Polling budget exhausted without a terminal status.
    return {'stage': 'timeout', 'task_id': task_id}
```
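`submit_and_wait` returns a dict whose `stage` key tells the caller where the run ended. One possible way to turn that into a short message is sketched below; `summarize` and the sample outcomes are hypothetical. A `result` stage only means polling saw a terminal status, which may still be `failed`.

```python
def summarize(outcome):
    """Turn a submit_and_wait() return value into a one-line status message."""
    stage = outcome.get("stage")
    if stage == "create":
        response = outcome.get("response") or {}
        return f"create failed: code={response.get('code')} msg={response.get('msg')!r}"
    if stage == "timeout":
        return f"task {outcome.get('task_id')} did not finish within the polling budget"
    # stage == "result": report the terminal status found in the response.
    data = (outcome.get("response") or {}).get("data") or {}
    return f"task {outcome.get('task_id')} ended with status {data.get('status')}"


# Hypothetical outcomes shaped like the dicts returned by submit_and_wait().
print(summarize({"stage": "result", "task_id": "task-123",
                 "response": {"code": 0, "data": {"status": "succeeded"}}}))
print(summarize({"stage": "timeout", "task_id": "task-456"}))
```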
### 6.2 Quick T2I call
```python
r = submit_and_wait('Imagen 4', {
'prompt': 'A cute kitten, cinematic composition, soft daylight, ultra-detailed',
'ratio': '1:1',
'counts': '1'
})
print(r)
```
### 6.3 Quick I2V call
```python
r = submit_and_wait('Kling 3.0', {
'image': 'https://example.com/input.jpg',
'prompt': 'Slow camera push-in, stable subject, cinematic lighting',
'duration': '5s',
'resolution': '1080P'
})
print(r)
```
## 7. Troubleshooting Checklist
1. `code=374004`: verify that `API_KEY` is configured and valid.
2. `code=490505`: your account is out of credits; recharge before retrying.
3. `API not found`: ensure `api_name` exactly matches the `name` field in API definitions.
4. Task remains `processing`: increase polling retries or reduce duration/resolution.
5. Missing `task_id`: check required parameters and payload shape against this document.
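
The checklist above can be folded into a small lookup so that failed create-task responses print an actionable hint. This is a sketch under the assumption that `code` arrives as an integer, as the response field list states. `CODE_HINTS` and `hint_for` are illustrative names.

```python
# Hints for the two error codes named in the checklist.
CODE_HINTS = {
    374004: "Check that API_KEY is configured and valid.",
    490505: "The account is out of credits; recharge before retrying.",
}


def hint_for(response):
    """Return a troubleshooting hint for a create-task response, or None if it looks fine."""
    code = response.get("code")
    if code == 0:
        if not (response.get("data") or {}).get("task_id"):
            return "No task_id returned; check required parameters and payload shape."
        return None
    # Fall back to echoing the server message for codes not listed above.
    return CODE_HINTS.get(code, f"Unrecognized code {code}; msg={response.get('msg')!r}")


print(hint_for({"code": 490505, "msg": "insufficient credits", "data": {}}))
print(hint_for({"code": 0, "msg": "", "data": {"task_id": "task-123"}}))
```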
FILE:scripts/c_api_doc_detail.json
[
{
"id": 196,
"name": "Credits",
"model_code": "user-credits",
"method": "POST",
"endpoint": "https://openapi.media.io/user/credits",
"title": "User Credits API Documentation",
"description": "API to query user credits balance.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [], \"title\": \"Query Credits\", \"describe\": \"Request body to query user credits balance\"}]}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"Query User Credits\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\n --url https://openapi.media.io/user/credits \\n --header 'Content-Type: application/json' \\n --header 'X-API-KEY: <api-key>' \\n --data '{}'\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"credits\", \"type\": \"integer\", \"describe\": \"User credits balance, located within the data object\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"list\": [{\"code\": \"0\", \"describe\": \"Success\"}, {\"code\": \"40001\", \"describe\": \"Invalid API key\"}, {\"code\": \"40002\", \"describe\": \"API key expired\"}], \"title\": \"Status Code\"}",
"content": null,
"status": 1,
"created_at": "3/3/2026 18:33:49",
"updated_at": "3/3/2026 18:33:49"
},
{
"id": 197,
"name": "Task Result",
"model_code": "generation-result",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/result/{task_id}",
"title": "Task Result API Documentation",
"description": "API to query generation task result by task ID.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Path Parameters\", \"category\": [{\"list\": [{\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"The task ID to query, located in the URL path parameter\", \"required\": true}], \"title\": \"Query Task Result\", \"describe\": \"Request body to query task result\"}]}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"Query Task Result\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\n --url https://openapi.media.io/generation/result/<task_id> \\n --header 'Content-Type: application/json' \\n --header 'X-API-KEY: <api-key>' \\n --data '{}'\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Task identifier\"}, {\"name\": \"status\", \"type\": \"string\", \"describe\": \"Task status: pending, processing, succeeded, failed\"}, {\"name\": \"result\", \"type\": \"object\", \"describe\": \"Generation result when task succeeded\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"list\": [{\"code\": \"0\", \"describe\": \"Success\"}, {\"code\": \"40001\", \"describe\": \"Invalid API key\"}, {\"code\": \"40002\", \"describe\": \"API key expired\"}, {\"code\": \"40003\", \"describe\": \"Task not found\"}], \"title\": \"Status Code\"}",
"content": null,
"status": 1,
"created_at": "3/3/2026 18:33:49",
"updated_at": "4/3/2026 21:30:23"
},
{
"id": 242,
"name": "Nano Banana",
"model_code": "i2i-banana",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/banana/i2i-banana",
"title": "Nano Banana API Documentation",
"description": "Utilizes next-gen multimodal reasoning to generate images that perfectly align with nuanced conceptual descriptions.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"images\", \"type\": \"array<string>\", \"describe\": \"The URLs of the input images. Supported formats include PNG, JPEG, and JPG. File size must be between 0.0 MB and 20.0 MB. Image resolution must be between 300x300 and 1280x1280. Aspect ratio must be between 1:2 and 2:1. Maximum 9 file(s).\", \"required\": true}, {\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 1200.\", \"required\": true}, {\"name\": \"ratio\", \"type\": \"string\", \"describe\": \"Target aspect ratio. Options: 9:16, 16:9, 1:1, 4:3, 3:4, 3:2, 2:3, 21:9. Default is 16:9.\", \"keywords\": [\"9:16\", \"16:9\", \"1:1\", \"4:3\", \"3:4\", \"3:2\", \"2:3\", \"21:9\"], \"required\": false}], \"title\": \"Image To Image\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create image by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/banana/i2i-banana \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"images\\\": [\\n \\\"<string>\\\"\\n ],\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"ratio\\\": \\\"16:9\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create image by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/banana/i2i-banana \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"images\\\": [\\n \\\"<string>\\\"\\n ],\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"ratio\\\": \\\"16:9\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:06:21"
},
{
"id": 243,
"name": "Seedream 4.0",
"model_code": "i2i-seedream-v4-0",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/seedream/i2i-seedream-v4-0",
"title": "Seedream 4.0 API Documentation",
"description": "A versatile powerhouse supporting 4K generation and advanced editing, ensuring character and style consistency.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"images\", \"type\": \"array<string>\", \"describe\": \"The URLs of the input images. Supported formats include PNG, JPEG, and JPG. File size must be between 0.0 MB and 10.0 MB. Image resolution must be between 300x300 and 1280x1280. Aspect ratio must be between 3:10 and 3:1. Maximum 9 file(s).\", \"required\": true}, {\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 800.\", \"required\": true}, {\"name\": \"ratio\", \"type\": \"string\", \"describe\": \"Target aspect ratio. Options: 9:16, 16:9, 1:1, 4:3, 3:4, 3:2, 2:3. Default is 9:16.\", \"keywords\": [\"9:16\", \"16:9\", \"1:1\", \"4:3\", \"3:4\", \"3:2\", \"2:3\"], \"required\": false}], \"title\": \"Image To Image\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create image by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/seedream/i2i-seedream-v4-0 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"images\\\": [\\n \\\"<string>\\\"\\n ],\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"ratio\\\": \\\"9:16\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create image by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/seedream/i2i-seedream-v4-0 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"images\\\": [\\n \\\"<string>\\\"\\n ],\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"ratio\\\": \\\"9:16\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:06:21"
},
{
"id": 244,
"name": "Nano Banana Pro",
"model_code": "i2i-banana-2",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/banana/i2i-banana-2",
"title": "Nano Banana Pro API Documentation",
"description": "Utilizes next-gen multimodal reasoning to generate images that perfectly align with nuanced conceptual descriptions.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"images\", \"type\": \"array<string>\", \"describe\": \"The URLs of the input images. Supported formats include PNG, JPEG, and JPG. File size must be between 0.0 MB and 20.0 MB. Image resolution must be between 300x300 and 1280x1280. Aspect ratio must be between 1:2 and 2:1. Maximum 9 file(s).\", \"required\": true}, {\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 1200.\", \"required\": true}, {\"name\": \"ratio\", \"type\": \"string\", \"describe\": \"Target aspect ratio. Options: 9:16, 16:9, 1:1, 4:3, 3:4, 3:2, 2:3, 21:9. Default is 16:9.\", \"keywords\": [\"9:16\", \"16:9\", \"1:1\", \"4:3\", \"3:4\", \"3:2\", \"2:3\", \"21:9\"], \"required\": false}, {\"name\": \"resolution\", \"type\": \"string\", \"describe\": \"Output resolution. Options: 1K, 2K, 4K. Default is 1K.\", \"keywords\": [\"1K\", \"2K\", \"4K\"], \"required\": false}], \"title\": \"Image To Image\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create image by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/banana/i2i-banana-2 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"images\\\": [\\n \\\"<string>\\\"\\n ],\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"ratio\\\": \\\"16:9\\\",\\n \\\"resolution\\\": \\\"1K\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create image by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/banana/i2i-banana-2 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"images\\\": [\\n \\\"<string>\\\"\\n ],\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"ratio\\\": \\\"16:9\\\",\\n \\\"resolution\\\": \\\"1K\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:06:21"
},
{
"id": 245,
"name": "Wan 2.6",
"model_code": "i2v-wan-v2-6",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/wan/i2v-wan-v2-6",
"title": "Wan 2.6 API Documentation",
"description": "Features advanced multi-shot storytelling and native audio-visual synchronization, ideal for creating complex narratives with consistent characters.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"image\", \"type\": \"string\", \"describe\": \"The URL of the input image. Supported formats include PNG, JPEG, JPG, WEBP, and BMP. File size must be between 0.0 MB and 10.0 MB. Image resolution must be between 300x300 and 2000x2000. Aspect ratio must be between 0.4:1 and 2.5:1.\", \"required\": true}, {\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 1500.\", \"required\": true}, {\"name\": \"resolution\", \"type\": \"string\", \"describe\": \"Output resolution. Options: 720P, 1080P. Default is 720P.\", \"keywords\": [\"720P\", \"1080P\"], \"required\": false}, {\"name\": \"duration\", \"type\": \"string\", \"describe\": \"Video duration in seconds. Options: 5s, 10s, 15s. Default is 5s.\", \"keywords\": [\"5s\", \"10s\", \"15s\"], \"required\": false}, {\"name\": \"shot_type\", \"type\": \"string\", \"describe\": \"Shot Type parameter. Options: Multi, Single. Default is Multi.\", \"keywords\": [\"Multi\", \"Single\"], \"required\": false}, {\"name\": \"generate_audio\", \"type\": \"string\", \"describe\": \"Generate Audio parameter. Options: True, False. Default is True.\", \"keywords\": [\"True\", \"False\"], \"required\": false}], \"title\": \"Image To Video\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/wan/i2v-wan-v2-6 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"5s\\\",\\n \\\"generate_audio\\\": \\\"True\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"720P\\\",\\n \\\"shot_type\\\": \\\"Multi\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/wan/i2v-wan-v2-6 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"5s\\\",\\n \\\"generate_audio\\\": \\\"True\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"720P\\\",\\n \\\"shot_type\\\": \\\"Multi\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:02:49"
},
{
"id": 246,
"name": "Wan 2.2",
"model_code": "i2v-wan-v2-2",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/wan/i2v-wan-v2-2",
"title": "Wan 2.2 API Documentation",
"description": "A balanced video generation model offering reliable motion quality and prompt adherence for general tasks.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"image\", \"type\": \"string\", \"describe\": \"The URL of the input image. Supported formats include PNG, JPEG, JPG, WEBP, and BMP. File size must be between 0.0 MB and 10.0 MB. Image resolution must be between 300x300 and 2000x2000. Aspect ratio must be between 0.4:1 and 2.5:1.\", \"required\": true}, {\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 800.\", \"required\": true}, {\"name\": \"resolution\", \"type\": \"string\", \"describe\": \"Output resolution. Options: 1080P. Default is 1080P.\", \"keywords\": [\"1080P\"], \"required\": false}, {\"name\": \"duration\", \"type\": \"string\", \"describe\": \"Video duration in seconds. Options: 5s. Default is 5s.\", \"keywords\": [\"5s\"], \"required\": false}], \"title\": \"Image To Video\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/wan/i2v-wan-v2-2 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"5s\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"1080P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/wan/i2v-wan-v2-2 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"5s\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"1080P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:02:49"
},
{
"id": 247,
"name": "Hailuo 02",
"model_code": "i2v-minimax-02",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/minimax/i2v-minimax-02",
"title": "Hailuo 02 API Documentation",
"description": "Excels in cinematic realism and complex physics simulation, offering 'Director-level' control over camera movements and high fidelity.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"image\", \"type\": \"string\", \"describe\": \"The URL of the input image. Supported formats include PNG, JPEG, JPG, WEBP, GIF, and HEIC. File size must be between 0.0 MB and 20.0 MB. Image resolution must be between 300x300 and 4000x4000. Aspect ratio must be between 0.4:1 and 2.5:1.\", \"required\": true}, {\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 2000.\", \"required\": true}, {\"name\": \"duration\", \"type\": \"string\", \"describe\": \"Video duration in seconds. Options: 6s. Default is 6s.\", \"keywords\": [\"6s\"], \"required\": false}, {\"name\": \"resolution\", \"type\": \"string\", \"describe\": \"Output resolution. Options: 768P, 1080P. Default is 768P.\", \"keywords\": [\"768P\", \"1080P\"], \"required\": false}], \"title\": \"Image To Video\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/minimax/i2v-minimax-02 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"6s\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"768P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/minimax/i2v-minimax-02 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"6s\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"768P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:02:49"
},
{
"id": 248,
"name": "Kling 2.1",
"model_code": "i2v-kling-v2-1",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/kling/i2v-kling-v2-1",
"title": "Kling 2.1 API Documentation",
"description": "A robust model known for producing realistic videos with stable motion and good prompt understanding.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"image\", \"type\": \"string\", \"describe\": \"The URL of the input image. Supported formats include PNG, JPEG, JPG, WEBP, GIF, and HEIC. File size must be between 0.0 MB and 20.0 MB. Image resolution must be between 300x300 and 4000x4000. Aspect ratio must be between 0.4:1 and 2.5:1.\", \"required\": true}, {\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 2000.\", \"required\": true}, {\"name\": \"duration\", \"type\": \"string\", \"describe\": \"Video duration in seconds. Options: 5s, 10s. Default is 5s.\", \"keywords\": [\"5s\", \"10s\"], \"required\": false}, {\"name\": \"resolution\", \"type\": \"string\", \"describe\": \"Output resolution. Options: 720P, 1080P. Default is 720P.\", \"keywords\": [\"720P\", \"1080P\"], \"required\": false}], \"title\": \"Image To Video\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/kling/i2v-kling-v2-1 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"5s\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"720P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/kling/i2v-kling-v2-1 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"5s\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"720P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:02:49"
},
{
"id": 249,
"name": "Vidu Q3",
"model_code": "i2v-vidu-q3",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/vidu/i2v-vidu-q3",
"title": "Vidu Q3 API Documentation",
"description": "The latest iteration optimized for superior speed and high-definition output, enabling rapid high-quality video creation.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"image\", \"type\": \"string\", \"describe\": \"The URL of the input image. Supported formats include PNG, JPEG, and JPG. File size must be between 0.0 MB and 10.0 MB. Image resolution must be between 150x150 and 4000x4000. Aspect ratio must be between 1:4 and 4:1.\", \"required\": true}, {\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 2000.\", \"required\": true}, {\"name\": \"resolution\", \"type\": \"string\", \"describe\": \"Output resolution. Options: 540P, 720P, 1080P. Default is 720P.\", \"keywords\": [\"540P\", \"720P\", \"1080P\"], \"required\": false}, {\"name\": \"duration\", \"type\": \"string\", \"describe\": \"Video duration in seconds. Options: 4s, 8s, 12s, 16s. Default is 4s.\", \"keywords\": [\"4s\", \"8s\", \"12s\", \"16s\"], \"required\": false}, {\"name\": \"generate_audio\", \"type\": \"string\", \"describe\": \"Generate Audio parameter. Options: True, False. Default is True.\", \"keywords\": [\"True\", \"False\"], \"required\": false}], \"title\": \"Image To Video\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/vidu/i2v-vidu-q3 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"4s\\\",\\n \\\"generate_audio\\\": \\\"True\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"720P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/vidu/i2v-vidu-q3 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"4s\\\",\\n \\\"generate_audio\\\": \\\"True\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"720P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:02:49"
},
{
"id": 251,
"name": "Kling 2.5 Turbo",
"model_code": "i2v-kling-v2-5-turbo",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/kling/i2v-kling-v2-5-turbo",
"title": "Kling 2.5 Turbo API Documentation",
"description": "Offers advanced motion control and high realism, bridging the gap between standard and professional output.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"image\", \"type\": \"string\", \"describe\": \"The URL of the input image. Supported formats include PNG, JPEG, JPG, WEBP, GIF, and HEIC. File size must be between 0.0 MB and 20.0 MB. Image resolution must be between 300x300 and 4000x4000. Aspect ratio must be between 0.4:1 and 2.5:1.\", \"required\": true}, {\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 2500.\", \"required\": true}, {\"name\": \"duration\", \"type\": \"string\", \"describe\": \"Video duration in seconds. Options: 5s, 10s. Default is 5s.\", \"keywords\": [\"5s\", \"10s\"], \"required\": false}, {\"name\": \"resolution\", \"type\": \"string\", \"describe\": \"Output resolution. Options: 720P, 1080P. Default is 1080P.\", \"keywords\": [\"720P\", \"1080P\"], \"required\": false}], \"title\": \"Image To Video\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/kling/i2v-kling-v2-5-turbo \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"5s\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"1080P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/kling/i2v-kling-v2-5-turbo \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"5s\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"1080P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:02:49"
},
{
"id": 252,
"name": "Google Veo 3.1",
"model_code": "i2v-veo-v3-1",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/veo/i2v-veo-v3-1",
"title": "Google Veo 3.1 API Documentation",
"description": "A state-of-the-art model offering exceptional resolution and deep understanding of complex prompts for cinematic results.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"image\", \"type\": \"string\", \"describe\": \"The URL of the input image. Supported formats include JPG, JPEG, PNG, WEBP, GIF, and HEIC. File size must be between 0.0 MB and 50.0 MB. Image resolution must be between 300x300 and 4000x4000. Aspect ratio must be between 0.4:1 and 2:1.\", \"required\": true}, {\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 1000.\", \"required\": true}, {\"name\": \"aspect_ratio\", \"type\": \"string\", \"describe\": \"Target aspect ratio. Options: 9:16, 16:9. Default is 9:16.\", \"keywords\": [\"9:16\", \"16:9\"], \"required\": false}, {\"name\": \"resolution\", \"type\": \"string\", \"describe\": \"Output resolution. Options: 720P, 1080P. Default is 720P.\", \"keywords\": [\"720P\", \"1080P\"], \"required\": false}, {\"name\": \"duration\", \"type\": \"string\", \"describe\": \"Video duration in seconds. Options: 4s, 6s, 8s. Default is 4s.\", \"keywords\": [\"4s\", \"6s\", \"8s\"], \"required\": false}, {\"name\": \"generate_audio\", \"type\": \"string\", \"describe\": \"Generate Audio parameter. Options: True, False. Default is True.\", \"keywords\": [\"True\", \"False\"], \"required\": false}], \"title\": \"Image To Video\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/veo/i2v-veo-v3-1 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"aspect_ratio\\\": \\\"9:16\\\",\\n \\\"duration\\\": \\\"4s\\\",\\n \\\"generate_audio\\\": \\\"True\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"720P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/veo/i2v-veo-v3-1 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"aspect_ratio\\\": \\\"9:16\\\",\\n \\\"duration\\\": \\\"4s\\\",\\n \\\"generate_audio\\\": \\\"True\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"720P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:02:49"
},
{
"id": 253,
"name": "Kling 3.0",
"model_code": "i2v-kling-v3-0",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/kling/i2v-kling-v3-0",
"title": "Kling 3.0 API Documentation",
"description": "The latest high-fidelity generation delivering ultra-realistic motion and superior detail, suitable for professional-grade video production.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"image\", \"type\": \"string\", \"describe\": \"The URL of the input image. Supported formats include PNG, JPEG, JPG, WEBP, GIF, and HEIC. File size must be between 0.0 MB and 20.0 MB. Image resolution must be between 300x300 and 4000x4000. Aspect ratio must be between 0.4:1 and 2.5:1.\", \"required\": true}, {\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 2500.\", \"required\": true}, {\"name\": \"resolution\", \"type\": \"string\", \"describe\": \"Output resolution. Options: 720P, 1080P. Default is 1080P.\", \"keywords\": [\"720P\", \"1080P\"], \"required\": false}, {\"name\": \"duration\", \"type\": \"string\", \"describe\": \"Video duration in seconds. Options: 5s, 8s, 10s, 15s. Default is 5s.\", \"keywords\": [\"5s\", \"8s\", \"10s\", \"15s\"], \"required\": false}, {\"name\": \"multi_shots\", \"type\": \"string\", \"describe\": \"Multi Shots parameter. Options: True, False. Default is True.\", \"keywords\": [\"True\", \"False\"], \"required\": false}, {\"name\": \"generate_audio\", \"type\": \"string\", \"describe\": \"Generate Audio parameter. Options: True, False. Default is True.\", \"keywords\": [\"True\", \"False\"], \"required\": false}], \"title\": \"Image To Video\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/kling/i2v-kling-v3-0 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"5s\\\",\\n \\\"generate_audio\\\": \\\"True\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"multi_shots\\\": \\\"True\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"1080P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/kling/i2v-kling-v3-0 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"5s\\\",\\n \\\"generate_audio\\\": \\\"True\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"multi_shots\\\": \\\"True\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"1080P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:02:49"
},
{
"id": 254,
"name": "Hailuo 2.3",
"model_code": "i2v-minimax-v2-3",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/minimax/i2v-minimax-v2-3",
"title": "Hailuo 2.3 API Documentation",
"description": "Excels in cinematic realism and complex physics simulation, offering 'Director-level' control over camera movements and high fidelity.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"image\", \"type\": \"string\", \"describe\": \"The URL of the input image. Supported formats include PNG, JPEG, JPG, WEBP, GIF, and HEIC. File size must be between 0.0 MB and 20.0 MB. Image resolution must be between 300x300 and 4000x4000. Aspect ratio must be between 0.4:1 and 2.5:1.\", \"required\": true}, {\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 2000.\", \"required\": true}, {\"name\": \"resolution\", \"type\": \"string\", \"describe\": \"Output resolution. Options: 768P, 1080P (when duration is 6s). Default is 768P.\", \"keywords\": [\"768P\", \"1080P\"], \"required\": false}, {\"name\": \"duration\", \"type\": \"string\", \"describe\": \"Video duration in seconds. Options: 6s, 10s. Default is 6s.\", \"keywords\": [\"6s\", \"10s\"], \"required\": false}], \"title\": \"Image To Video\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/minimax/i2v-minimax-v2-3 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"6s\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"768P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/minimax/i2v-minimax-v2-3 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"6s\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"768P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:02:49"
},
{
"id": 255,
"name": "Vidu Q2",
"model_code": "i2v-vidu-q2",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/vidu/i2v-vidu-q2",
"title": "Vidu Q2 API Documentation",
"description": "A fast and efficient model that balances generation speed with good visual fidelity.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"image\", \"type\": \"string\", \"describe\": \"The URL of the input image. Supported formats include PNG, JPEG, JPG, WEBP, GIF, and HEIC. File size must be between 0.0 MB and 10.0 MB. Image resolution must be between 150x150 and 4000x4000. Aspect ratio must be between 1:4 and 4:1.\", \"required\": true}, {\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 2000.\", \"required\": true}, {\"name\": \"duration\", \"type\": \"string\", \"describe\": \"Video duration in seconds. Options: 4s, 6s, 8s. Default is 4s.\", \"keywords\": [\"4s\", \"6s\", \"8s\"], \"required\": false}, {\"name\": \"resolution\", \"type\": \"string\", \"describe\": \"Output resolution. Options: 720P, 1080P. Default is 720P.\", \"keywords\": [\"720P\", \"1080P\"], \"required\": false}], \"title\": \"Image To Video\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/vidu/i2v-vidu-q2 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"4s\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"720P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/vidu/i2v-vidu-q2 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"4s\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"720P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:02:49"
},
{
"id": 256,
"name": "Wan 2.5",
"model_code": "i2v-wan-v2-5",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/wan/i2v-wan-v2-5",
"title": "Wan 2.5 API Documentation",
"description": "An enhanced version offering improved motion stability and visual quality compared to earlier iterations.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"image\", \"type\": \"string\", \"describe\": \"The URL of the input image. Supported formats include PNG, JPEG, JPG, WEBP, and BMP. File size must be between 0.0 MB and 10.0 MB. Image resolution must be between 300x300 and 2000x2000. Aspect ratio must be between 0.4:1 and 2.5:1.\", \"required\": true}, {\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 800.\", \"required\": true}, {\"name\": \"resolution\", \"type\": \"string\", \"describe\": \"Output resolution. Options: 480P, 720P, 1080P. Default is 480P.\", \"keywords\": [\"480P\", \"720P\", \"1080P\"], \"required\": false}, {\"name\": \"duration\", \"type\": \"string\", \"describe\": \"Video duration in seconds. Options: 5s, 10s. Default is 5s.\", \"keywords\": [\"5s\", \"10s\"], \"required\": false}], \"title\": \"Image To Video\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/wan/i2v-wan-v2-5 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"5s\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"480P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/wan/i2v-wan-v2-5 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"5s\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"480P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:02:49"
},
{
"id": 257,
"name": "Kling 2.6",
"model_code": "i2v-kling-v2-6",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/kling/i2v-kling-v2-6",
"title": "Kling 2.6 API Documentation",
"description": "Kling AI's video model, recognized for its ability to generate realistic and coherent motion.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"image\", \"type\": \"string\", \"describe\": \"The URL of the input image. Supported formats include PNG, JPEG, JPG, WEBP, GIF, and HEIC. File size must be between 0.0 MB and 20.0 MB. Image resolution must be between 300x300 and 4000x4000. Aspect ratio must be between 0.4:1 and 2.5:1.\", \"required\": true}, {\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 2500.\", \"required\": true}, {\"name\": \"duration\", \"type\": \"string\", \"describe\": \"Video duration in seconds. Options: 5s, 10s. Default is 5s.\", \"keywords\": [\"5s\", \"10s\"], \"required\": false}, {\"name\": \"generate_audio\", \"type\": \"string\", \"describe\": \"Generate Audio parameter. Options: True, False. Default is True.\", \"keywords\": [\"True\", \"False\"], \"required\": false}], \"title\": \"Image To Video\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/kling/i2v-kling-v2-6 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"5s\\\",\\n \\\"generate_audio\\\": \\\"True\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/kling/i2v-kling-v2-6 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"5s\\\",\\n \\\"generate_audio\\\": \\\"True\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:02:49"
},
{
"id": 258,
"name": "Motion Control Kling 2.6",
"model_code": "i2v-motion-control-kling-v2-6",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/kling/i2v-motion-control-kling-v2-6",
"title": "Kling 2.6 API Documentation",
"description": "Offers advanced motion control and high realism, bridging the gap between standard and professional output.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"video\", \"type\": \"string\", \"describe\": \"The URL of the input video. Supported formats include MP4 and MOV. File size must be between 0.0 MB and 100.0 MB. Video resolution must be between 720x720 and 2160x2160. Frame rate must be between 24 and 60 FPS. Duration must be between 3 and 30 seconds.\", \"required\": true}, {\"name\": \"image\", \"type\": \"string\", \"describe\": \"The URL of the input image. Supported formats include PNG, JPEG, and JPG. File size must be between 0.0 MB and 10.0 MB. Image resolution must be between 300x300 and 4000x4000. Aspect ratio must be between 1:2 and 2:1.\", \"required\": true}, {\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 2500.\", \"required\": false}], \"title\": \"Motion Control\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/kling/i2v-motion-control-kling-v2-6 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"video\\\": \\\"<string>\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/kling/i2v-motion-control-kling-v2-6 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"video\\\": \\\"<string>\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:02:49"
},
{
"id": 259,
"name": "Imagen 4",
"model_code": "t2i-imagen-4",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/imagen/t2i-imagen-4",
"title": "Imagen 4 API Documentation",
"description": "Sets a new standard for photorealism and text rendering accuracy, handling the most complex prompts with ease.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 1200.\", \"required\": true}, {\"name\": \"ratio\", \"type\": \"string\", \"describe\": \"Target aspect ratio. Options: 9:16, 16:9, 1:1, 4:3, 3:4. Default is 9:16.\", \"keywords\": [\"9:16\", \"16:9\", \"1:1\", \"4:3\", \"3:4\"], \"required\": false}, {\"name\": \"counts\", \"type\": \"string\", \"describe\": \"Counts parameter. Options: 1, 2, 3, 4. Default is 1.\", \"keywords\": [\"1\", \"2\", \"3\", \"4\"], \"required\": false}], \"title\": \"Text To Image\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create image by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/imagen/t2i-imagen-4 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"counts\\\": \\\"1\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"ratio\\\": \\\"9:16\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create image by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/imagen/t2i-imagen-4 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"counts\\\": \\\"1\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"ratio\\\": \\\"9:16\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:02:49"
},
{
"id": 260,
"name": "soul_character",
"model_code": "t2i-soul-character",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/soul/t2i-soul-character",
"title": "soul_character API Documentation",
"description": "Delivers the most cost-effective solution with ultra-fast generation speeds, perfect for high-volume commercial applications.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 2000.\", \"required\": true}, {\"name\": \"ratio\", \"type\": \"string\", \"describe\": \"Target aspect ratio. Options: 2:3, 1:1, 3:2, 9:16, 16:9. Default is 16:9.\", \"keywords\": [\"2:3\", \"1:1\", \"3:2\", \"9:16\", \"16:9\"], \"required\": false}], \"title\": \"Text To Image\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create image by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/soul/t2i-soul-character \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"ratio\\\": \\\"16:9\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create image by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/soul/t2i-soul-character \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"ratio\\\": \\\"16:9\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:02:49"
},
{
"id": 261,
"name": "Wan 2.6 (Text To Video)",
"model_code": "t2v-wan-v2-6",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/wan/t2v-wan-v2-6",
"title": "Wan 2.6 API Documentation",
"description": "Features advanced multi-shot storytelling and native audio-visual synchronization, ideal for creating complex narratives with consistent characters.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 1500.\", \"required\": true}, {\"name\": \"size\", \"type\": \"string\", \"describe\": \"Output size. Options: 9:16, 16:9, 1:1, 4:3, 3:4. Default is 16:9.\", \"keywords\": [\"9:16\", \"16:9\", \"1:1\", \"4:3\", \"3:4\"], \"required\": false}, {\"name\": \"duration\", \"type\": \"string\", \"describe\": \"Video duration in seconds. Options: 5s, 10s, 15s. Default is 5s.\", \"keywords\": [\"5s\", \"10s\", \"15s\"], \"required\": false}, {\"name\": \"shot_type\", \"type\": \"string\", \"describe\": \"Shot Type parameter. Options: Multi, Single. Default is Multi.\", \"keywords\": [\"Multi\", \"Single\"], \"required\": false}, {\"name\": \"generate_audio\", \"type\": \"string\", \"describe\": \"Generate Audio parameter. Options: True, False. Default is True.\", \"keywords\": [\"True\", \"False\"], \"required\": false}], \"title\": \"Text To Video\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/wan/t2v-wan-v2-6 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"5s\\\",\\n \\\"generate_audio\\\": \\\"True\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"shot_type\\\": \\\"Multi\\\",\\n \\\"size\\\": \\\"16:9\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/wan/t2v-wan-v2-6 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"5s\\\",\\n \\\"generate_audio\\\": \\\"True\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"shot_type\\\": \\\"Multi\\\",\\n \\\"size\\\": \\\"16:9\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:02:49"
},
{
"id": 262,
"name": "Vidu Q3 (Text To Video)",
"model_code": "t2v-vidu-q3",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/vidu/t2v-vidu-q3",
"title": "Vidu Q3 API Documentation",
"description": "The latest iteration optimized for superior speed and high-definition output, enabling rapid high-quality video creation.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 2000.\", \"required\": true}, {\"name\": \"ratio\", \"type\": \"string\", \"describe\": \"Target aspect ratio. Options: 9:16, 16:9, 1:1, 4:3, 3:4. Default is 16:9.\", \"keywords\": [\"9:16\", \"16:9\", \"1:1\", \"4:3\", \"3:4\"], \"required\": false}, {\"name\": \"resolution\", \"type\": \"string\", \"describe\": \"Output resolution. Options: 540P, 720P, 1080P. Default is 720P.\", \"keywords\": [\"540P\", \"720P\", \"1080P\"], \"required\": false}, {\"name\": \"duration\", \"type\": \"string\", \"describe\": \"Video duration in seconds. Options: 4s, 8s, 12s, 16s. Default is 8s.\", \"keywords\": [\"4s\", \"8s\", \"12s\", \"16s\"], \"required\": false}, {\"name\": \"generate_audio\", \"type\": \"string\", \"describe\": \"Generate Audio parameter. Options: True, False. Default is True.\", \"keywords\": [\"True\", \"False\"], \"required\": false}], \"title\": \"Text To Video\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/vidu/t2v-vidu-q3 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"8s\\\",\\n \\\"generate_audio\\\": \\\"True\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"ratio\\\": \\\"16:9\\\",\\n \\\"resolution\\\": \\\"720P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/vidu/t2v-vidu-q3 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"8s\\\",\\n \\\"generate_audio\\\": \\\"True\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"ratio\\\": \\\"16:9\\\",\\n \\\"resolution\\\": \\\"720P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:16:03",
"updated_at": "6/3/2026 15:02:49"
},
{
"id": 263,
"name": "Wan 2.5 (Text To Video)",
"model_code": "t2v-wan-v2-5",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/wan/t2v-wan-v2-5",
"title": "Wan 2.5 API Documentation",
"description": "An enhanced version offering improved motion stability and visual quality compared to earlier iterations.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 800.\", \"required\": true}, {\"name\": \"ratio\", \"type\": \"string\", \"describe\": \"Target aspect ratio. Options: 16:9, 9:16, 1:1, 4:3, 3:4. Default is 16:9.\", \"keywords\": [\"16:9\", \"9:16\", \"1:1\", \"4:3\", \"3:4\"], \"required\": false}, {\"name\": \"duration\", \"type\": \"string\", \"describe\": \"Video duration in seconds. Options: 5s, 10s. Default is 5s.\", \"keywords\": [\"5s\", \"10s\"], \"required\": false}], \"title\": \"Text To Video\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/wan/t2v-wan-v2-5 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"5s\\\",\\n \\\"prompt\\\": \\\"<Your prompt text, 1-800 characters>\\\",\\n \\\"ratio\\\": \\\"16:9\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/wan/t2v-wan-v2-5 \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"duration\\\": \\\"5s\\\",\\n \\\"prompt\\\": \\\"<Your prompt text, 1-800 characters>\\\",\\n \\\"ratio\\\": \\\"16:9\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 2,
"created_at": "4/3/2026 18:16:03",
"updated_at": "5/3/2026 10:03:16"
},
{
"id": 264,
"name": "Google Veo 3.1 Fast",
"model_code": "i2v-veo-v3-1-fast",
"method": "POST",
"endpoint": "https://openapi.media.io/generation/veo/i2v-veo-v3-1-fast",
"title": "Google Veo 3.1 Fast API Documentation",
"description": "Google Veo 3.1 Fast is an AI model for video generation.",
"api_header": "{\"list\": [{\"name\": \"X-API-KEY\", \"value\": \"API key to authorize requests\"}, {\"name\": \"Content-Type\", \"value\": \"application/json\"}], \"title\": \"Authorizations\", \"describe\": \"Add the following authorization information in the request header\"}",
"api_body": "{\"title\": \"Request Body\", \"category\": [{\"list\": [{\"name\": \"image\", \"type\": \"string\", \"describe\": \"The URL of the input image. Supported formats include JPG, JPEG, PNG, WEBP, GIF, and HEIC. File size must be between 0.0 MB and 50.0 MB. Image resolution must be between 300x300 and 4000x4000. Aspect ratio must be between 0.4:1 and 2:1.\", \"required\": true}, {\"name\": \"prompt\", \"type\": \"string\", \"describe\": \"The text prompt describing the content to generate. Maximum string length: 1000.\", \"required\": true}, {\"name\": \"aspect_ratio\", \"type\": \"string\", \"describe\": \"Target aspect ratio. Options: 9:16, 16:9. Default is 9:16.\", \"keywords\": [\"9:16\", \"16:9\"], \"required\": false}, {\"name\": \"resolution\", \"type\": \"string\", \"describe\": \"Output resolution. Options: 720P, 1080P. Default is 720P.\", \"keywords\": [\"720P\", \"1080P\"], \"required\": false}, {\"name\": \"duration\", \"type\": \"string\", \"describe\": \"Video duration in seconds. Options: 4s, 6s, 8s. Default is 4s.\", \"keywords\": [\"4s\", \"6s\", \"8s\"], \"required\": false}, {\"name\": \"generate_audio\", \"type\": \"string\", \"describe\": \"Generate Audio parameter. Options: True, False. Default is True.\", \"keywords\": [\"True\", \"False\"], \"required\": false}], \"title\": \"Image To Video\"}], \"describe\": \"The request body must contain the following parameters\"}",
"api_request_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/veo/i2v-veo-v3-1-fast \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"aspect_ratio\\\": \\\"9:16\\\",\\n \\\"duration\\\": \\\"4s\\\",\\n \\\"generate_audio\\\": \\\"True\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"720P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"api_response": "{\"list\": [{\"name\": \"code\", \"type\": \"integer\", \"describe\": \"Response status code, 0 indicates success\"}, {\"name\": \"msg\", \"type\": \"string\", \"describe\": \"Response message, empty string on success\"}, {\"name\": \"data\", \"type\": \"object\", \"describe\": \"Response data object\"}, {\"name\": \"task_id\", \"type\": \"string\", \"describe\": \"Unique task identifier, located within the data object\"}, {\"name\": \"trace_id\", \"type\": \"string\", \"describe\": \"Request tracking ID\"}], \"title\": \"Response\", \"describe\": \"After the request is successfully processed, the server will return the following response\"}",
"api_code_demo": "{\"title\": \"Example Request\", \"request\": [{\"title\": \"create video by text or image\", \"language\": \"cURL\", \"code_example\": \"curl --request POST \\\\\\n --url https://openapi.media.io/generation/veo/i2v-veo-v3-1-fast \\\\\\n --header 'Content-Type: application/json' \\\\\\n --header 'X-API-KEY: <api-key>' \\\\\\n --data '\\n{\\n \\\"data\\\": {\\n \\\"aspect_ratio\\\": \\\"9:16\\\",\\n \\\"duration\\\": \\\"4s\\\",\\n \\\"generate_audio\\\": \\\"True\\\",\\n \\\"image\\\": \\\"<string>\\\",\\n \\\"prompt\\\": \\\"<string>\\\",\\n \\\"resolution\\\": \\\"720P\\\"\\n }\\n}'\\n\"}], \"describe\": \"\", \"response\": [{\"type\": \"200\", \"code_example\": \"{\\n \\\"code\\\": 0,\\n \\\"msg\\\": \\\"\\\",\\n \\\"data\\\": {\\n \\\"task_id\\\": <string>\\\"\\n },\\n \\\"trace_id\\\": <string>\\\"\\n}\"}, {\"type\": \"default\", \"code_example\": \"{\\n \\\"code\\\": <integer>,\\n \\\"msg\\\": <string>,\\n \\\"data\\\": {},\\n \\\"trace_id\\\": <string>\\\"\\n}\"}]}",
"content": null,
"status": 1,
"created_at": "4/3/2026 18:36:46",
"updated_at": "6/3/2026 15:02:49"
}
]
FILE:scripts/skill_router.py
#!/usr/bin/env python3
import json
import os
import requests
from typing import Any, Dict, Optional
from urllib.parse import urlparse


class Skill:
    """
    Standard AIGC skill implementation with automatic API routing and parameter mapping.
    """

    def __init__(self, api_doc_path: str):
        with open(api_doc_path, 'r', encoding='utf-8') as f:
            api_items = json.load(f)

        self.api_definitions = {}
        duplicate_names = []
        for item in api_items:
            name = item.get('name')
            if name in self.api_definitions:
                duplicate_names.append(name)
                continue
            self.api_definitions[name] = item

        if duplicate_names:
            deduped = sorted(set(duplicate_names))
            raise ValueError(
                f"Duplicate API names detected in {api_doc_path}: {', '.join(deduped)}. "
                "Please use unique `name` fields in c_api_doc_detail.json."
            )

    def invoke(self, api_name: str, params: Dict[str, Any], api_key: Optional[str] = None) -> Dict[str, Any]:
        """
        Invoke the specified API.

        :param api_name: API name.
        :param params: Business parameters.
        :param api_key: API key. If omitted, API_KEY from environment is used.
        :return: API response payload.
        """
        if api_name not in self.api_definitions:
            return {'error': f"API '{api_name}' not found."}

        resolved_api_key = (api_key or os.getenv('API_KEY', '')).strip()
        if not resolved_api_key:
            return {'error': 'Missing API key. Set API_KEY or pass api_key explicitly.'}

        api = self.api_definitions[api_name]
        url = api['endpoint']
        method = api['method']

        # Restrict outbound requests to the expected Media.io API host.
        parsed = urlparse(url)
        if parsed.scheme != 'https' or parsed.netloc.lower() != 'openapi.media.io':
            return {'error': f"Blocked endpoint host: {parsed.netloc}"}

        headers = {
            'X-API-KEY': resolved_api_key,
            'Content-Type': 'application/json'
        }

        # Replace path parameters in endpoint URLs.
        if '{' in url:
            for k, v in params.items():
                url = url.replace(f'{{{k}}}', str(v))

        # Keep non-path parameters in the JSON body.
        body = {k: v for k, v in params.items() if f'{{{k}}}' not in api['endpoint']}

        try:
            resp = requests.request(method, url, headers=headers, json={'data': body} if body else {}, timeout=30)
            return resp.json()
        except Exception as e:
            return {'error': str(e)}


# Standard usage example.
if __name__ == '__main__':
    script_dir = os.path.dirname(os.path.abspath(__file__))
    skill = Skill(os.path.join(script_dir, 'c_api_doc_detail.json'))

    api_key = os.getenv('API_KEY', '')
    if not api_key:
        raise RuntimeError('API_KEY is not set')

    # Credits query.
    result = skill.invoke('Credits', {}, api_key=api_key)
    print(result)
Generate videos from first and last frame images using Tomoviee First-Last Frame API (`tm_tail2video_b`) via Wondershare OpenAPI gateway (`https://openapi.wo...
---
name: tomoviee-tail-to-video
description: Generate videos from first and last frame images using Tomoviee First-Last Frame API (`tm_tail2video_b`) via Wondershare OpenAPI gateway (`https://openapi.wondershare.cc`). Requires `app_key` and `app_secret`. Use when users request first-last keyframe interpolation, start-end frame animation, or two-image to 5-second video generation.
---
# Tomoviee AI First-Last Frame to Video
## Overview
Generate a 5-second video from two keyframe images:
- `image`: first frame
- `image_tail`: last frame
API capability: `tm_tail2video_b`
## Provider and Endpoint Provenance
Use this mapping to verify provenance before using production credentials:
- Vendor portals: `https://www.tomoviee.ai` and `https://www.tomoviee.cn`
- Runtime API gateway host used by this skill: `https://openapi.wondershare.cc`
- Create endpoint: `https://openapi.wondershare.cc/v1/open/capacity/application/tm_tail2video_b`
- Result endpoint: `https://openapi.wondershare.cc/v1/open/pub/task`
This skill sends runtime API requests only to `openapi.wondershare.cc`.
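A caller that wants to enforce this constraint can check the host before sending anything. The following is a minimal sketch, not part of the bundled client; `ALLOWED_HOST` and `is_allowed_endpoint` are hypothetical names introduced here for illustration.

```python
from urllib.parse import urlparse

# Assumption: the single permitted runtime host, taken from the mapping above.
ALLOWED_HOST = "openapi.wondershare.cc"


def is_allowed_endpoint(url: str) -> bool:
    """Return True only for HTTPS URLs whose host is exactly ALLOWED_HOST."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    return parsed.scheme == "https" and host == ALLOWED_HOST


print(is_allowed_endpoint("https://openapi.wondershare.cc/v1/open/pub/task"))
print(is_allowed_endpoint("https://example.com/v1/open/pub/task"))
```

Comparing the full hostname, rather than testing for a substring, rejects look-alike hosts that merely start with the expected name.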
## Credential Handling
- Sensitive credentials required: `app_key` and `app_secret`.
- Credentials are only used to build `Authorization: Basic <base64(app_key:app_secret)>`.
- Credentials are kept in process memory only and are not written to disk by this skill.
- Do not hardcode credentials in source files or commit them to git.
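One way to follow these rules is to read the credentials from environment variables and build the header in memory. The sketch below is an illustration only: the variable names `TOMOVIEE_APP_KEY` and `TOMOVIEE_APP_SECRET` and both helper functions are assumptions, not names defined by this skill.

```python
import base64
import os
from typing import Dict, Tuple


def build_auth_header(app_key: str, app_secret: str) -> Dict[str, str]:
    """Build the Basic authorization header from app_key and app_secret."""
    token = base64.b64encode(f"{app_key}:{app_secret}".encode()).decode()
    return {"Authorization": f"Basic {token}"}


def load_credentials() -> Tuple[str, str]:
    """Read credentials from environment variables (variable names are assumptions)."""
    app_key = os.environ.get("TOMOVIEE_APP_KEY", "").strip()
    app_secret = os.environ.get("TOMOVIEE_APP_SECRET", "").strip()
    if not app_key or not app_secret:
        raise RuntimeError("TOMOVIEE_APP_KEY and TOMOVIEE_APP_SECRET must be set")
    return app_key, app_secret
```

With the variables set, `build_auth_header(*load_credentials())` yields the header without the secret ever appearing in source code.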
## Required Inputs
- Credentials: `app_key`, `app_secret`
- Generation inputs: `prompt`, `image`, `image_tail`
## Scope
- This skill only covers `tm_tail2video_b` (first-last frame to video).
- Output duration is fixed to 5 seconds by API design.
- This skill does not implement text-to-video, image-to-video, or video continuation APIs.
## Dependencies
- Runtime dependency: `requests>=2.31.0,<3.0.0`
- Install with: `pip install -r requirements.txt`
## Quick Start
### Install dependencies
```bash
pip install -r requirements.txt
```
### Authentication helper
```bash
python scripts/generate_auth_token.py YOUR_APP_KEY YOUR_APP_SECRET
```
### Python Client
```python
from scripts.tomoviee_firstlast2video_client import TomovieeFirstLast2VideoClient
client = TomovieeFirstLast2VideoClient("app_key", "app_secret")
```
## API Usage
### Basic Example
```python
import json

task_id = client.firstlast_to_video(
    prompt="Scene transitions naturally from first frame to last frame with smooth motion",
    image="https://example.com/first-frame.jpg",
    image_tail="https://example.com/last-frame.jpg",
    resolution="720p",
    duration=5,
    aspect_ratio="original",
)
result = client.poll_until_complete(task_id)
# result["result"] is a JSON string; decode it to reach the video URL list.
video_url = json.loads(result["result"])["video_path"][0]
print(video_url)
```
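The last step of the example indexes straight into the decoded result, which raises an unhelpful error if a field is missing. A more defensive variant might look like the sketch below; `extract_video_urls` is a hypothetical helper, and the payload shape is assumed to match what the example above parses (a JSON string under `result` holding a `video_path` list).

```python
import json
from typing import Any, Dict, List


def extract_video_urls(task_data: Dict[str, Any]) -> List[str]:
    """Return the video URLs from a completed task payload."""
    raw = task_data.get("result")
    if not raw:
        raise ValueError("Task payload has no 'result' field")
    parsed = json.loads(raw)
    urls = parsed.get("video_path") or []
    if not urls:
        raise ValueError("No video_path entries in result")
    return list(urls)


# Hypothetical payload shaped like the one the basic example parses.
sample = {"status": 3, "result": json.dumps({"video_path": ["https://example.com/out.mp4"]})}
print(extract_video_urls(sample)[0])
```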
### Parameters
- `prompt` (required): Motion guidance text.
- `image` (required): First frame image URL.
- `image_tail` (required): Last frame image URL.
- `resolution` (optional): `720p` or `1080p`, default `720p`.
- `duration` (optional): only `5` is supported.
- `aspect_ratio` (optional): `16:9`, `9:16`, `4:3`, `3:4`, `1:1`, `original`.
- `camera_move_index` (optional): camera movement type `1-46`.
- `callback` (optional): callback URL.
- `params` (optional): passthrough value forwarded unchanged to the callback.
### Image Constraints
- File size: each image must be under 200 MB (`<200M`)
- Formats: `JPG`, `JPEG`, `PNG`, `WEBP`
- Recommended resolution: at least 720p source quality
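A cheap pre-flight check can catch unsupported formats before a task is created. The sketch below only inspects the file extension in the URL path; it does not download the image, so it cannot verify file size or actual content. `ALLOWED_EXTENSIONS` and `has_supported_extension` are hypothetical names.

```python
import os
from urllib.parse import urlparse

# Extensions derived from the format list above.
ALLOWED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".webp"}


def has_supported_extension(image_url: str) -> bool:
    """Check the URL path's file extension against the supported formats."""
    path = urlparse(image_url).path
    ext = os.path.splitext(path)[1].lower()
    return ext in ALLOWED_EXTENSIONS


print(has_supported_extension("https://example.com/first-frame.jpg"))
print(has_supported_extension("https://example.com/first-frame.gif"))
```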
## Async Workflow
1. Create task and get `task_id`
2. Poll with `poll_until_complete(task_id)`
3. Parse video URL from `result`
Status codes:
- `1` queued
- `2` processing
- `3` success
- `4` failed
- `5` cancelled
- `6` timeout
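When handling these codes in calling code, named constants are easier to read than bare integers. The sketch below mirrors the list above; `TaskStatus`, `TERMINAL_FAILURES`, and `describe_status` are hypothetical names, not part of the bundled client.

```python
from enum import IntEnum


class TaskStatus(IntEnum):
    """Numeric task states, as listed in the status code table."""
    QUEUED = 1
    PROCESSING = 2
    SUCCESS = 3
    FAILED = 4
    CANCELLED = 5
    TIMEOUT = 6


# States after which polling should stop with an error.
TERMINAL_FAILURES = {TaskStatus.FAILED, TaskStatus.CANCELLED, TaskStatus.TIMEOUT}


def describe_status(code: int) -> str:
    """Map a numeric status to 'pending', 'done', or 'failed'."""
    status = TaskStatus(code)  # raises ValueError for unknown codes
    if status is TaskStatus.SUCCESS:
        return "done"
    if status in TERMINAL_FAILURES:
        return "failed"
    return "pending"


print(describe_status(2))
```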
## Resources
- `scripts/tomoviee_firstlast2video_client.py` - main API client
- `scripts/generate_auth_token.py` - auth token helper
- `references/video_apis.md` - API reference and constraints
- `references/camera_movements.md` - camera movement index reference
- `references/prompt_guide.md` - prompt writing guidance
## External Resources
- Developer portal (global): `https://www.tomoviee.ai/developers.html`
- API docs (global): `https://www.tomoviee.ai/doc/ai-video/first-and-last-frame-to-video.html`
- Developer portal (mainland): `https://www.tomoviee.cn/developers.html`
- API docs (mainland): `https://www.tomoviee.cn/doc/ai-video/first-and-last-frame-to-video.html`
FILE:requirements.txt
requests>=2.31.0,<3.0.0
FILE:_meta.json
{
"ownerId": "kn7bn8fv79mjtchs17087tdtsx82fc86",
"slug": "tomoviee-tail-to-video",
"version": "1.0.1",
"publishedAt": 1772933838173
}
FILE:scripts/generate_auth_token.py
#!/usr/bin/env python3
"""
Generate authentication token for Tomoviee API.

Usage:
    python generate_auth_token.py <app_key> <app_secret>

Output:
    Base64 encoded access_token in the format required by Authorization header
"""

import base64
import sys


def generate_access_token(app_key: str, app_secret: str) -> str:
    """
    Generate access token for Tomoviee API authentication.

    Args:
        app_key: Application key from Tomoviee console
        app_secret: Application secret from Tomoviee console

    Returns:
        Base64 encoded string in format: base64(app_key:app_secret)
    """
    credentials = f"{app_key}:{app_secret}"
    access_token = base64.b64encode(credentials.encode()).decode()
    return access_token


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python generate_auth_token.py <app_key> <app_secret>")
        sys.exit(1)
    app_key = sys.argv[1]
    app_secret = sys.argv[2]
    token = generate_access_token(app_key, app_secret)
    print(f"Access Token: {token}")
    print(f"\nUse in Authorization header as: Basic {token}")
FILE:scripts/tomoviee_firstlast2video_client.py
#!/usr/bin/env python3
"""Tomoviee AI - First-Last Frame to Video API client."""

import base64
import json
import time
from typing import Any, Dict, Optional

import requests


class TomovieeFirstLast2VideoClient:
    """First-Last Frame to Video API client for Tomoviee AI."""

    BASE_URL = "https://openapi.wondershare.cc/v1/open/capacity/application"
    RESULT_ENDPOINT = "https://openapi.wondershare.cc/v1/open/pub/task"
    ENDPOINT = "tm_tail2video_b"
    REQUEST_TIMEOUT = 60
    ALLOWED_RESOLUTIONS = {"720p", "1080p"}
    ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "4:3", "3:4", "1:1", "original"}

    def __init__(self, app_key: str, app_secret: str):
        self.app_key = app_key
        self.access_token = self._generate_token(app_key, app_secret)

    def _generate_token(self, app_key: str, app_secret: str) -> str:
        credentials = f"{app_key}:{app_secret}"
        return base64.b64encode(credentials.encode()).decode()

    def _get_headers(self) -> Dict[str, str]:
        return {
            "Content-Type": "application/json",
            "X-App-Key": self.app_key,
            "Authorization": f"Basic {self.access_token}",
        }

    def _safe_json(self, response: requests.Response) -> Dict[str, Any]:
        try:
            return response.json()
        except ValueError as exc:
            raise Exception(f"Invalid JSON response: {response.text}") from exc

    def _make_request(self, payload: Dict[str, Any]) -> str:
        url = f"{self.BASE_URL}/{self.ENDPOINT}"
        response = requests.post(
            url,
            headers=self._get_headers(),
            json=payload,
            timeout=self.REQUEST_TIMEOUT,
        )
        response.raise_for_status()
        result = self._safe_json(response)
        if result.get("code") != 0:
            raise Exception(f"API Error: {result.get('msg', 'Unknown error')}")
        task_id = result.get("data", {}).get("task_id")
        if not task_id:
            raise Exception(f"Missing task_id in response: {result}")
        return task_id

    def firstlast_to_video(
        self,
        prompt: str,
        image: str,
        image_tail: str,
        resolution: str = "720p",
        duration: int = 5,
        aspect_ratio: str = "16:9",
        camera_move_index: Optional[int] = None,
        callback: Optional[str] = None,
        params: Optional[str] = None,
    ) -> str:
        """Create a first-last frame video task and return task_id."""
        if duration != 5:
            raise ValueError("duration must be 5 for tm_tail2video_b")
        if resolution not in self.ALLOWED_RESOLUTIONS:
            raise ValueError("resolution must be one of: 720p, 1080p")
        if aspect_ratio not in self.ALLOWED_ASPECT_RATIOS:
            raise ValueError("aspect_ratio must be one of: 16:9, 9:16, 4:3, 3:4, 1:1, original")
        if camera_move_index is not None and (camera_move_index < 1 or camera_move_index > 46):
            raise ValueError("camera_move_index must be in range 1..46")
        payload: Dict[str, Any] = {
            "prompt": prompt,
            "image": image,
            "image_tail": image_tail,
            "resolution": resolution,
            "duration": duration,
            "aspect_ratio": aspect_ratio,
        }
        if camera_move_index is not None:
            payload["camera_move_index"] = camera_move_index
        if callback:
            payload["callback"] = callback
        if params:
            payload["params"] = params
        return self._make_request(payload)

    def get_result(self, task_id: str) -> Dict[str, Any]:
        response = requests.post(
            self.RESULT_ENDPOINT,
            headers=self._get_headers(),
            json={"task_id": task_id},
            timeout=self.REQUEST_TIMEOUT,
        )
        response.raise_for_status()
        result = self._safe_json(response)
        if result.get("code") != 0 and not result.get("data"):
            raise Exception(f"API Error: {result.get('msg', 'Unknown error')}")
        data = result.get("data")
        if not data:
            raise Exception(f"Missing task data in response: {result}")
        return data

    def poll_until_complete(
        self,
        task_id: str,
        poll_interval: int = 10,
        timeout: int = 600,
    ) -> Dict[str, Any]:
        elapsed = 0
        while elapsed < timeout:
            result = self.get_result(task_id)
            status = result.get("status")
            if status == 3:
                return result
            if status in [4, 5, 6]:
                raise Exception(f"Task failed: {result.get('reason', 'Unknown error')}")
            time.sleep(poll_interval)
            elapsed += poll_interval
        raise TimeoutError(f"Task did not complete within {timeout} seconds")


# Backward-compatible alias
TomovieeClient = TomovieeFirstLast2VideoClient


if __name__ == "__main__":
    import sys

    if len(sys.argv) < 6:
        print(
            "Usage: python scripts/tomoviee_firstlast2video_client.py "
            "<app_key> <app_secret> <prompt> <first_image_url> <last_image_url> "
            "[resolution] [aspect_ratio]"
        )
        sys.exit(1)
    app_key = sys.argv[1]
    app_secret = sys.argv[2]
    prompt = sys.argv[3]
    first_image = sys.argv[4]
    last_image = sys.argv[5]
    resolution = sys.argv[6] if len(sys.argv) > 6 else "720p"
    aspect_ratio = sys.argv[7] if len(sys.argv) > 7 else "16:9"
    client = TomovieeFirstLast2VideoClient(app_key, app_secret)
    try:
        print("Creating first-last frame to video task...")
        task_id = client.firstlast_to_video(
            prompt=prompt,
            image=first_image,
            image_tail=last_image,
            resolution=resolution,
            aspect_ratio=aspect_ratio,
        )
        print(f"Task created: {task_id}")
        print("Polling for result...")
        result = client.poll_until_complete(task_id)
        print("\nTask completed")
        print(f"Progress: {result.get('progress', 'N/A')}%")
        result_data = json.loads(result["result"])
        print(json.dumps(result_data, indent=2, ensure_ascii=False))
    except Exception as exc:
        print(f"Error: {exc}")
        sys.exit(1)
FILE:references/camera_movements.md
# Camera Movement Types Reference
Complete list of camera movement types for `camera_move_index` parameter.
## Movement Types (1-46)
| Index | Type | Description |
|-------|------|-------------|
| 1 | orbit | Camera circles around subject |
| 2 | spin | Rotating motion |
| 3 | pan left | Camera pans to the left |
| 4 | pan right | Camera pans to the right |
| 5 | tilt up | Camera tilts upward |
| 6 | tilt down | Camera tilts downward |
| 7 | push in | Camera moves closer to subject |
| 8 | pull out | Camera moves away from subject |
| 9 | static | No camera movement |
| 10 | tracking | Camera follows subject movement |
| 11 | others | Unspecified movement |
| 12 | object pov | Point of view from object perspective |
| 13 | super dolly in | Dramatic push into scene |
| 14 | super dolly out | Dramatic pull from scene |
| 15 | snorricam | Camera fixed to subject while rotating |
| 16 | head tracking | Follows head/face movement |
| 17 | car grip | Camera mounted on vehicle |
| 18 | screen transition | Transition effect movement |
| 19 | car chasing | Following vehicle action |
| 20 | fisheye | Wide-angle distortion effect |
| 21 | FPV drone | First-person drone perspective |
| 22 | crane over the head | Overhead crane shot |
| 23 | timelapse landscape | Time-lapse scenery |
| 24 | dolly in | Smooth push toward subject |
| 25 | dolly out | Smooth pull from subject |
| 26 | zoom in | Lens zooms closer |
| 27 | zoom out | Lens zooms further |
| 28 | full shot | Wide establishing shot |
| 29 | close-up shot | Tight framing on subject |
| 30 | extreme close-up | Very tight detail shot |
| 31 | Macro shot | Extreme close-up of small details |
| 32 | bird's-eye view | Overhead perspective |
| 33 | rule of thirds | Compositional guideline |
| 34 | symmetrical composition | Balanced framing |
| 35 | handheld | Shaky, documentary-style |
| 36 | FPV shot | First-person view |
| 37 | jib up | Crane moves upward |
| 38 | jib down | Crane moves downward |
| 39 | full shot | Complete subject in frame |
| 40 | Time lapse shot | Compressed time progression |
| 41 | aerial shot | High-altitude view |
| 42 | low angle shot | Camera positioned below subject |
| 43 | Eye-level shot | Camera at subject's eye level |
| 44 | diagonal composition | Angled framing |
| 45 | over shoulder shot | View from behind subject |
| 46 | crane down | Crane descends |
## Usage Example
```python
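# Static shot with no camera movement
client.firstlast_to_video(
    prompt="The cat turns its head to look at the camera",
    image="https://example.com/cat.jpg",
    image_tail="https://example.com/cat-end.jpg",  # placeholder last-frame URL
    camera_move_index=9,  # static
)
# Dramatic push-in shot
client.firstlast_to_video(
    prompt="The girl suddenly turns her head, picks up a wireless earphone with her right hand, and puts it on her ear",
    image="https://example.com/girl.jpg",
    image_tail="https://example.com/girl-end.jpg",  # placeholder last-frame URL
    camera_move_index=13,  # super dolly in
)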
```
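To avoid hard-coding numeric indices, a name-to-index lookup can be derived from the table above. The sketch below covers only a handful of rows; `CAMERA_MOVES` and `camera_index` are hypothetical names and are not provided by the client.

```python
# Subset of the movement table above; extend with the remaining rows as needed.
CAMERA_MOVES = {
    "orbit": 1,
    "pan left": 3,
    "pan right": 4,
    "push in": 7,
    "pull out": 8,
    "static": 9,
    "zoom in": 26,
    "zoom out": 27,
}


def camera_index(name: str) -> int:
    """Look up a camera_move_index by movement name (case-insensitive)."""
    key = name.strip().lower()
    if key not in CAMERA_MOVES:
        raise KeyError(f"Unknown camera movement: {name!r}")
    return CAMERA_MOVES[key]


print(camera_index("static"))
```

The returned value can then be passed as `camera_move_index` when creating a task.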
FILE:references/prompt_guide.md
# Tomoviee Prompt Engineering Guide
## Overview
This guide provides structured prompt formulas and best practices for all Tomoviee AI APIs to achieve optimal generation results.
---
## Video APIs
### Text-to-Video Prompt Formula
```
主体(描述) + 运动 + 场景(描述) + (镜头语言 + 光影 + 氛围)
Subject (description) + Motion + Scene (description) + (Camera + Lighting + Atmosphere)
```
**Component Breakdown**:
1. **Subject (主体)**: Main focus of the video
   - Who/what is in the scene
   - Key characteristics, appearance
   - Position and pose
2. **Motion (运动)**: Action and movement
   - What the subject is doing
   - Speed and direction of movement
   - Dynamic elements
3. **Scene (场景)**: Environment and context
   - Location and setting
   - Background elements
   - Time of day/season
4. **Camera (镜头语言)**: Optional camera work
   - Camera angle (wide, close-up, aerial, etc.)
   - Camera movement (pan, zoom, tracking, etc.)
   - Or specify `camera_move_index` parameter
5. **Lighting (光影)**: Optional lighting style
   - Natural/artificial light
   - Time of day (golden hour, blue hour, etc.)
   - Light direction and quality
6. **Atmosphere (氛围)**: Optional mood and tone
   - Overall feeling (dramatic, peaceful, energetic)
   - Color palette/grading
   - Weather/environmental effects
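The formula can also be assembled programmatically. The following is a minimal sketch of one possible way to do that, assuming the three required components are plain phrases and the optional ones are appended as comma-separated clauses; `build_video_prompt` is a hypothetical helper, not part of any Tomoviee client.

```python
from typing import Optional


def build_video_prompt(
    subject: str,
    motion: str,
    scene: str,
    camera: Optional[str] = None,
    lighting: Optional[str] = None,
    atmosphere: Optional[str] = None,
) -> str:
    """Join the required and optional formula components into one prompt."""
    core = f"{subject} {motion} {scene}"
    # Optional components are appended only when provided.
    extras = [part for part in (camera, lighting, atmosphere) if part]
    return ", ".join([core] + extras)


print(build_video_prompt(
    subject="A red sports car",
    motion="driving fast",
    scene="on a coastal highway at sunset",
    camera="camera following from the side",
))
```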
**Examples**:
**Minimal** (Subject + Motion + Scene):
```
"A red sports car driving fast on a coastal highway at sunset"
```
**Standard** (+ Camera + Lighting):
```
"A red sports car speeding along a winding coastal highway at golden hour,
camera following from the side, warm orange sunlight reflecting off the car"
```
**Detailed** (+ Atmosphere):
```
"A sleek red Ferrari speeding along a dramatic coastal highway carved into cliffs,
camera tracking smoothly from the side at car level,
golden hour sunset casting long shadows and warm orange glow,
cinematic and epic atmosphere with ocean waves crashing below"
```
**More Examples by Use Case**:
**Product Showcase**:
```
"White wireless headphones slowly rotating on a minimalist white surface,
studio lighting with soft shadows, clean and modern atmosphere"
```
**Nature/Travel**:
```
"Majestic waterfall cascading down mossy rocks in a lush rainforest,
slow zoom in from wide to medium shot,
dappled sunlight filtering through the canopy,
serene and peaceful atmosphere"
```
**Action/Sports**:
```
"Professional skateboarder performing a kickflip on urban street ramp,
slow motion capture from low angle,
bright daylight with high contrast shadows,
energetic and dynamic atmosphere"
```
### Image-to-Video Prompt Formula
```
主体 + 运动 + 镜头语言
Subject + Motion + Camera
```
**Why simpler?** The image already provides:
- Scene and environment
- Lighting and color palette
- Composition and framing
- Atmosphere and mood
**Your prompt should focus on**:
- Motion to add to the static image
- Camera movement to apply
- Any new dynamic elements
**Examples**:
```
"Camera slowly zooming in, subject's hair gently blowing in the wind"
```
```
"Slow pan from left to right, leaves rustling, golden hour lighting"
```
```
"Camera orbiting around the subject, dramatic lighting, cinematic feel"
```
```
"Gentle push in toward the subject's face, bokeh background, emotional atmosphere"
```
### Video Continuation Prompt Formula
```
延续的动作 + 场景变化 + 镜头延续
Continued action + Scene evolution + Camera continuation
```
**Focus on**:
- How the action continues from the last frame
- Natural progression of movement
- Scene changes or transitions
- Camera movement consistency
**Examples**:
```
"The bird continues flying higher, soaring into the clouds,
camera following the ascent"
```
```
"The car continues down the road, passing through a tunnel,
maintaining tracking shot"
```
```
"The person continues walking, entering a brightly lit room,
camera follows smoothly"
```
---
## Image APIs
### Image-to-Image Prompt Formula
```
参考图描述 + 保留要素 + 修改/新增指令
Reference description + Elements to preserve + Modifications/additions
```
**Component Breakdown**:
1. **Reference Description**: What's in the original image
2. **Preserve**: Explicitly state what to keep unchanged
3. **Modify/Add**: What to change or add
**Examples**:
```
"A woman in business attire at a modern office,
preserve facial features and body pose,
change background to outdoor garden with natural lighting,
add warm sunset atmosphere"
```
```
"Portrait of a man wearing a blue shirt,
keep facial features and expression,
change clothing to formal black suit with tie,
maintain studio lighting"
```
```
"Kitchen interior with white cabinets,
preserve layout and appliance positions,
change color scheme to navy blue cabinets with gold hardware,
add marble countertops"
```
### Image Redrawing Prompt Formula
```
替换区域的新内容描述
Description of what replaces the masked area
```
**Focus on**:
- What should appear in the masked region
- Style/texture matching surrounding areas
- Lighting consistency
- Natural integration
**Examples**:
```
"Clear blue sky with white fluffy clouds"
(for replacing background)
```
```
"Natural grass texture with small wildflowers"
(for replacing foreground)
```
```
"Modern glass windows with reflections of cityscape"
(for replacing building facade)
```
```
"Empty wooden table surface with subtle wood grain"
(for removing objects from table)
```
### Image Recognition Prompt Formula
```
要识别的对象/区域描述
Description of objects/regions to identify
```
**Focus on**:
- Clear object identification
- Specific vs. general (depending on need)
- Spatial context if helpful
**Examples**:
```
"person" / "all people"
```
```
"the red car in the foreground"
```
```
"sky and clouds"
```
```
"text and logos"
```
```
"background behind the main subject"
```
---
## Audio APIs
### Text-to-Music Prompt Formula
```
主体 + 场景(氛围/风格)
Subject/Theme + Scene (Atmosphere/Style)
```
**Component Breakdown**:
1. **Subject/Theme**: Purpose or topic of the music
2. **Atmosphere**: Mood and emotional quality
3. **Style/Genre**: Musical style and genre
4. **Instruments** (optional): Key instruments to feature
**Examples**:
**Simple**:
```
"Upbeat electronic music, energetic and modern"
```
**Standard**:
```
"Corporate presentation background music, professional and clean,
soft piano and ambient synth"
```
**Detailed**:
```
"Epic cinematic orchestral score, dramatic and heroic,
soaring strings and powerful brass,
Hans Zimmer style, for action scene"
```
**By Genre**:
**Electronic/Pop**:
```
"Energetic EDM track, festival atmosphere, heavy bass and synth drops"
```
**Jazz**:
```
"Smooth jazz for evening ambience, sophisticated and relaxed,
piano trio with double bass"
```
**Orchestral**:
```
"Emotional film score, contemplative and moving,
piano and string ensemble"
```
**Ambient**:
```
"Peaceful meditation music, calm and spacious,
soft pads and gentle chimes"
```
### Text-to-Sound-Effect Prompt Formula
```
声音来源 + 动作/环境 + 特征
Sound source + Action/Context + Characteristics
```
**Component Breakdown**:
1. **Sound Source**: What's making the sound
2. **Action/Context**: What's happening / where it is
3. **Characteristics**: Sound qualities (loud, soft, sharp, etc.)
**Examples**:
**Single Events**:
```
"Glass bottle shattering on concrete, sharp and crisp"
```
```
"Car door closing, solid thunk, reverb in garage"
```
```
"Notification bell sound, soft and pleasant"
```
**Continuous Sounds**:
```
"Heavy rain on metal roof, steady and rhythmic"
```
```
"City traffic ambience, cars passing, distant sirens"
```
```
"Forest birds chirping, peaceful morning atmosphere"
```
**By Category**:
**Nature**:
```
"Ocean waves crashing on beach, powerful and continuous"
```
**Mechanical**:
```
"Old typewriter keys typing, mechanical clicks and dings"
```
**Human**:
```
"Crowd cheering and applauding, enthusiastic and loud"
```
**UI/Digital**:
```
"Futuristic UI beep, high-tech and clean"
```
### Text-to-Speech Tips
**Structure**:
- Write naturally as you would speak
- Use punctuation for pacing and pauses
- Spell out numbers and abbreviations
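The last tip can be partly automated. The sketch below reads every digit out one by one, which is a simplification: it will not produce groupings such as "eight hundred". `DIGIT_WORDS` and `spell_out_digits` are hypothetical names.

```python
import re

DIGIT_WORDS = ["zero", "one", "two", "three", "four", "five", "six", "seven", "eight", "nine"]


def spell_out_digits(text: str) -> str:
    """Replace each run of ASCII digits with its digits read out one by one."""
    def replace(match):
        return " ".join(DIGIT_WORDS[int(ch)] for ch in match.group())
    return re.sub(r"[0-9]+", replace, text)


print(spell_out_digits("Room 42 is open"))
```

Punctuation between digit groups is left untouched, so it can still serve as a pacing cue.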
**Examples**:
**With Pacing**:
```
"Welcome to our platform... Let me show you around."
(Ellipsis adds natural pause)
```
**With Emphasis**:
```
"This is extremely important. Pay close attention."
(Period creates strong pause for emphasis)
```
**Numbers**:
```
"Call one, eight hundred, five five five, zero one two three"
(Instead of "Call 1-800-555-0123")
```
### Video Soundtrack Prompt Formula
```
风格/类型 + 情绪 + (节奏/能量)
Style/Genre + Mood + (Optional: pacing/energy)
```
**Why simpler?** The API analyzes video content automatically.
**Examples**:
**Minimal** (let API analyze):
```
(no prompt - fully automatic based on video)
```
**Guided Style**:
```
"Upbeat travel vlog music, adventurous and inspiring"
```
```
"Corporate tech presentation music, modern and professional"
```
```
"Emotional documentary score, contemplative and moving"
```
---
## General Best Practices
### Do's ✅
1. **Be Specific**: Clear descriptions yield better results
   - Good: "Golden retriever puppy playing with red ball"
   - Bad: "Dog playing"
2. **Use Descriptive Adjectives**: Add relevant details
   - Good: "Sleek modern smartphone with edge-to-edge display"
   - Bad: "Phone"
3. **Specify Mood/Atmosphere**: Set the tone
   - Good: "Dramatic sunset with vibrant orange and purple clouds"
   - Bad: "Sunset"
4. **Include Context**: Where, when, why
   - Good: "Professional product photography on white background"
   - Bad: "Product"
5. **Mention Key Visual Elements**:
   - Lighting conditions
   - Color palette
   - Composition style
   - Movement characteristics
### Don'ts ❌
1. **Don't Be Vague**:
   - Bad: "Nice video"
   - Bad: "Good music"
   - Bad: "Cool image"
2. **Don't Contradict**:
   - Bad: "Bright and dark scene"
   - Bad: "Fast and slow motion"
   - Bad: "Happy sad music"
3. **Don't Over-specify Technical Details**:
   - Bad: "1920x1080 resolution 24fps H.264 codec"
   - Bad: "120 BPM in C major with I-V-vi-IV progression"
   - (API handles technical parameters automatically)
4. **Don't Use Negations**:
   - Bad: "Not blurry, not dark, not boring"
   - Good: "Sharp, bright, engaging"
5. **Don't Mix Multiple Unrelated Concepts**:
   - Bad: "Cat playing piano in space while cooking pasta"
   - Better: Split into multiple generations or choose one focus
---
## Advanced Techniques
### Chaining Outputs
Use output from one API as input to another:
```python
# `get_result_url` is a placeholder for a helper that polls a task
# and returns its output URL; it is not defined in this guide.

# 1. Generate image
img_task = client.image_to_image(
    prompt="Modern office space, clean and minimal",
    image="reference.jpg",
)
img_url = get_result_url(img_task)

# 2. Animate image
video_task = client.image_to_video(
    prompt="Camera slowly panning right, golden hour lighting",
    image=img_url,
)
video_url = get_result_url(video_task)

# 3. Add soundtrack
audio_task = client.video_soundtrack(
    video=video_url,
    prompt="Professional corporate music",
)
```
### Iterative Refinement
1. **Start Simple**: Test with minimal prompt
2. **Evaluate Output**: What's missing or wrong?
3. **Add Details**: Incrementally add specificity
4. **Test Again**: Compare results
**Example Iteration**:
```
V1: "Car driving"
→ Too generic
V2: "Red sports car driving fast on highway"
→ Better, but lighting unclear
V3: "Red Ferrari speeding on coastal highway, golden hour sunset, dramatic"
→ Good result!
```
### A/B Testing
Generate multiple variations to compare:
```python
# Test different camera movements (indices follow references/camera_movements.md)
variants = [
    client.image_to_video(image=img, prompt="Slow zoom in", camera_move_index=26),  # zoom in
    client.image_to_video(image=img, prompt="Pan right", camera_move_index=4),  # pan right
    client.image_to_video(image=img, prompt="Orbit around subject", camera_move_index=1),  # orbit
]
```
### Consistency Across Assets
For cohesive content, maintain consistent prompt elements:
```python
# Consistent style keywords across multiple generations
style_guide = "cinematic, golden hour lighting, dramatic atmosphere"
video1 = client.text_to_video(f"Mountain landscape, {style_guide}")
video2 = client.text_to_video(f"Forest scene, {style_guide}")
video3 = client.text_to_video(f"Beach sunset, {style_guide}")
```
---
## Prompt Templates by Use Case
### Marketing/Commercial
```
"[Product name] showcased on [surface/background],
[camera movement],
studio lighting with soft shadows,
premium and elegant atmosphere"
```
### Social Media
```
"[Subject] [action],
[camera angle/movement],
vibrant colors and high energy,
engaging and eye-catching for [platform]"
```
### Documentary/Educational
```
"[Subject] in [environment],
[natural action],
natural documentary style cinematography,
informative and authentic atmosphere"
```
### Artistic/Creative
```
"[Abstract concept] visualized as [imagery],
[unique camera work],
[lighting style],
[artistic mood/style reference]"
```
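Templates like these map naturally onto named placeholders. The sketch below rewrites the Marketing/Commercial template with `str.format` fields; the field names (`product`, `surface`, `camera_movement`) and the `fill_template` helper are assumptions made for illustration.

```python
MARKETING_TEMPLATE = (
    "{product} showcased on {surface}, "
    "{camera_movement}, "
    "studio lighting with soft shadows, "
    "premium and elegant atmosphere"
)


def fill_template(template: str, **fields: str) -> str:
    """Fill a prompt template, failing loudly if a placeholder is missing."""
    try:
        return template.format(**fields)
    except KeyError as exc:
        raise ValueError(f"Missing template field: {exc.args[0]}") from exc


print(fill_template(
    MARKETING_TEMPLATE,
    product="White wireless headphones",
    surface="a minimalist white surface",
    camera_movement="slow orbit",
))
```

Raising on a missing field prevents a half-filled template from being sent as a prompt.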
---
## Troubleshooting
### Problem: Output doesn't match prompt
**Solutions**:
- Simplify prompt to core elements
- Remove contradictory instructions
- Be more specific about key aspects
- Check for typos or ambiguous terms
### Problem: Output quality is poor
**Solutions**:
- Add style keywords (cinematic, professional, high quality)
- Specify lighting conditions
- Mention camera quality/style
- Add atmosphere/mood descriptors
### Problem: Inconsistent results
**Solutions**:
- Use more specific prompts
- Reference specific styles or examples
- Test with variations to find optimal wording
- Use consistent style keywords across generations
### Problem: Generation fails (status=4)
**Solutions**:
- Check prompt for inappropriate content
- Simplify overly complex prompts
- Verify input files (images/videos) are valid
- Ensure parameters are within valid ranges
- Try with default parameters first
---
## Language Considerations
Tomoviee supports both **Chinese** and **English** prompts.
**Tips**:
- Use the language you're most comfortable with
- Be aware of cultural/contextual nuances
- Technical terms may work better in English
- Artistic descriptions may vary by language interpretation
**Example Comparisons**:
Chinese:
```
"一只金毛寻回犬在阳光明媚的草地上奔跑,镜头跟随,电影级画质"
```
English:
```
"A golden retriever running through a sunlit meadow, camera following, cinematic quality"
```
Both can produce excellent results - choose based on your fluency and precision needs.
---
## Summary Checklist
Before submitting your prompt, verify:
- [ ] **Clear Subject**: What's the main focus?
- [ ] **Specific Action**: What's happening?
- [ ] **Scene Context**: Where is this taking place?
- [ ] **Visual Style**: What's the look and feel?
- [ ] **Technical Parameters**: Resolution, duration, aspect ratio set correctly?
- [ ] **No Contradictions**: All elements work together?
- [ ] **Appropriate Specificity**: Not too vague, not over-specified?
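Parts of this checklist can be approximated with simple heuristics. The sketch below flags very short prompts and negation words; the word-count threshold and the negation list are arbitrary assumptions, and `lint_prompt` is a hypothetical helper that cannot judge meaning or contradictions.

```python
import re
from typing import List

# Assumption: a small set of negation words worth flagging.
NEGATION_PATTERN = re.compile(r"\b(not|no|never|without)\b", re.IGNORECASE)


def lint_prompt(prompt: str, min_words: int = 4) -> List[str]:
    """Return a list of warnings; an empty list means nothing was flagged."""
    warnings = []
    if len(prompt.split()) < min_words:
        warnings.append("prompt may be too vague")
    if NEGATION_PATTERN.search(prompt):
        warnings.append("prompt uses negations; describe the desired result instead")
    return warnings


print(lint_prompt("Dog playing"))
```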
A well-crafted prompt is the foundation of excellent AI-generated content. Take time to refine your prompts for best results!
FILE:references/video_apis.md
# Tomoviee First-Last Frame to Video API Reference
## Provenance and Endpoint Mapping
- Vendor portals: `https://www.tomoviee.ai` and `https://www.tomoviee.cn`
- Runtime gateway host used by this skill package: `https://openapi.wondershare.cc`
- Primary capability in this skill: `tm_tail2video_b`
All runtime requests from this skill target only:
1. `https://openapi.wondershare.cc/v1/open/capacity/application/tm_tail2video_b`
2. `https://openapi.wondershare.cc/v1/open/pub/task`
## First-Last Frame to Video (tm_tail2video_b)
Generate a 5-second video by interpolating between a first frame and a last frame.
### Parameters
- `prompt` (required): motion guidance text
- `image` (required): first frame image URL
- `image_tail` (required): last frame image URL
- `resolution` (optional): `720p` or `1080p`, default `720p`
- `duration` (optional): only `5` supported
- `aspect_ratio` (optional): `16:9`, `9:16`, `4:3`, `3:4`, `1:1`, `original`
- `camera_move_index` (optional): camera movement index `1-46`
- `callback` (optional): callback URL
- `params` (optional): transparent callback passthrough
### Input Constraints
- Maximum file size: under 200 MB (`<200M`) per image
- Formats: `JPG`, `JPEG`, `PNG`, `WEBP`
- Recommendation: source images should be at least 720p for better quality
### Result Endpoint
`https://openapi.wondershare.cc/v1/open/pub/task`
### Status Codes
- `1` queued
- `2` processing
- `3` success
- `4` failed
- `5` cancelled
- `6` timeout
## Credential and Dependency Notes
- Sensitive credentials required: `app_key`, `app_secret`
- Auth pattern: `Authorization: Basic <base64(app_key:app_secret)>`
- Runtime dependency: `requests>=2.31.0,<3.0.0`
### Example
```python
from scripts.tomoviee_firstlast2video_client import TomovieeFirstLast2VideoClient
client = TomovieeFirstLast2VideoClient("app_key", "app_secret")
task_id = client.firstlast_to_video(
prompt="Smooth transition from first frame to last frame",
image="https://example.com/first.jpg",
image_tail="https://example.com/last.jpg",
resolution="720p",
duration=5,
aspect_ratio="original",
)
result = client.poll_until_complete(task_id)
```
Auto-generate masks for objects/regions in images. Use when users request image_recognition operations or related tasks.
---
name: tomoviee-recognition
description: Auto-generate masks for objects/regions in images. Use when users request image_recognition operations or related tasks.
---
# Tomoviee AI - Image Recognition
## Overview
Auto-generate masks for objects/regions in images.
**API**: `tm_reference_img2mask`
## Quick Start
### Authentication
```bash
python scripts/generate_auth_token.py YOUR_APP_KEY YOUR_APP_SECRET
```
### Python Client
```python
from scripts.tomoviee_recognition_client import TomovieeClient
client = TomovieeClient("app_key", "app_secret")
```
## API Usage
### Basic Example
```python
import json

task_id = client._make_request({
    "image": "https://example.com/photo.jpg",
    "control_type": "2",  # subject/default
})
result = client.poll_until_complete(task_id)
output = json.loads(result["result"])
```
### Parameters
- `image` (required): Image URL to analyze
- `control_type`: 0=edge, 1=pose, 2=subject, 3=depth
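A small lookup keeps the numeric `control_type` codes out of calling code. The sketch below assumes the value is sent as a string, as in the basic example above; `CONTROL_TYPES` and `build_recognition_payload` are hypothetical names.

```python
# Codes taken from the parameter list above; the string form is an assumption.
CONTROL_TYPES = {"edge": "0", "pose": "1", "subject": "2", "depth": "3"}


def build_recognition_payload(image_url: str, mode: str = "subject") -> dict:
    """Build a request payload for tm_reference_img2mask."""
    if mode not in CONTROL_TYPES:
        raise ValueError(f"mode must be one of: {', '.join(CONTROL_TYPES)}")
    return {"image": image_url, "control_type": CONTROL_TYPES[mode]}


print(build_recognition_payload("https://example.com/photo.jpg", "depth"))
```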
## Async Workflow
1. **Create task**: Get `task_id` from API call
2. **Poll for completion**: Use `poll_until_complete(task_id)`
3. **Extract result**: Parse returned JSON for output URLs
**Status codes**:
- 1 = Queued
- 2 = Processing
- 3 = Success (ready)
- 4 = Failed
- 5 = Cancelled
- 6 = Timeout
## Resources
### scripts/
- `tomoviee_recognition_client.py` - API client
- `generate_auth_token.py` - Auth token generator
### references/
See bundled reference documents for detailed API documentation and examples.
## External Resources
- **Developer Portal**: https://www.tomoviee.ai/developers.html
- **API Documentation**: https://www.tomoviee.ai/doc/
- **Get API Credentials**: Register at the developer portal
FILE:_meta.json
{
"ownerId": "kn7bn8fv79mjtchs17087tdtsx82fc86",
"slug": "tomoviee-image-recognition",
"version": "1.0.1",
"publishedAt": 1772933908922
}
FILE:scripts/generate_auth_token.py
#!/usr/bin/env python3
"""
Generate authentication token for Tomoviee API.

Usage:
    python generate_auth_token.py <app_key> <app_secret>

Output:
    Base64 encoded access_token in the format required by Authorization header
"""
import base64
import sys


def generate_access_token(app_key: str, app_secret: str) -> str:
    """
    Generate access token for Tomoviee API authentication.

    Args:
        app_key: Application key from Tomoviee console
        app_secret: Application secret from Tomoviee console

    Returns:
        Base64 encoded string in format: base64(app_key:app_secret)
    """
    credentials = f"{app_key}:{app_secret}"
    access_token = base64.b64encode(credentials.encode()).decode()
    return access_token


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python generate_auth_token.py <app_key> <app_secret>")
        sys.exit(1)
    app_key = sys.argv[1]
    app_secret = sys.argv[2]
    token = generate_access_token(app_key, app_secret)
    print(f"Access Token: {token}")
    print(f"\nUse in Authorization header as: Basic {token}")
FILE:scripts/tomoviee_recognition_client.py
#!/usr/bin/env python3
"""Tomoviee AI - recognition API client"""
import base64, json, time
from typing import Dict, Optional, Any
import requests


class TomovieeClient:
    BASE_URL = "https://openapi.wondershare.cc/v1/open/capacity/application"
    RESULT_ENDPOINT = "https://openapi.wondershare.cc/v1/open/pub/task"
    ENDPOINT = "tm_reference_img2mask"

    def __init__(self, app_key: str, app_secret: str):
        self.app_key = app_key
        self.access_token = self._generate_token(app_key, app_secret)

    def _generate_token(self, app_key: str, app_secret: str) -> str:
        credentials = f"{app_key}:{app_secret}"
        return base64.b64encode(credentials.encode()).decode()

    def _get_headers(self) -> Dict[str, str]:
        return {
            "Content-Type": "application/json",
            "X-App-Key": self.app_key,
            "Authorization": f"Basic {self.access_token}"
        }

    def _make_request(self, payload: Dict[str, Any]) -> str:
        url = f"{self.BASE_URL}/{self.ENDPOINT}"
        response = requests.post(url, headers=self._get_headers(), json=payload)
        result = response.json()
        if result.get("code") != 0:
            raise Exception(f"API Error: {result.get('msg')}")
        return result["data"]["task_id"]

    def get_result(self, task_id: str) -> Dict[str, Any]:
        response = requests.post(self.RESULT_ENDPOINT, headers=self._get_headers(), json={"task_id": task_id})
        result = response.json()
        if result.get("code") != 0 and not result.get("data"):
            raise Exception(f"API Error: {result.get('msg')}")
        return result["data"]

    def poll_until_complete(self, task_id: str, poll_interval: int = 10, timeout: int = 600) -> Dict[str, Any]:
        elapsed = 0
        while elapsed < timeout:
            result = self.get_result(task_id)
            if result["status"] == 3:
                return result
            elif result["status"] in [4, 5, 6]:
                raise Exception(f"Task failed: {result.get('reason')}")
            time.sleep(poll_interval)
            elapsed += poll_interval
        raise TimeoutError(f"Task did not complete within {timeout} seconds")
FILE:references/image_apis.md
# Tomoviee Image Generation APIs
## Overview
Tomoviee provides three image generation APIs for reference-based generation, localized editing, and intelligent segmentation.
## API Endpoints
### 1. Image-to-Image (tm_reference_img2img)
**Generate new image based on reference image**
**Endpoint**: `https://openapi.wondershare.cc/v1/open/capacity/application/tm_reference_img2img`
**Parameters**:
- `prompt` (required): Text description (reference + preserve + modify/add)
- `image` (required): Reference image URL (JPG/JPEG/PNG/WEBP, <200M)
- `resolution`: Image resolution - `512*512` (default), `768*768`, `1024*1024`
- `aspect_ratio`: Image aspect ratio - `1:1` (default), `16:9`, `9:16`, `4:3`, `3:4`
- `image_num`: Number of images to generate - 1 to 4 (default: 1)
- `callback`: Optional callback URL for async notification
- `params`: Optional transparent parameters passed back in callback
**Use Cases**:
- Reimagine existing images with modifications
- Generate variations while preserving style/composition
- Change specific elements (clothing, background, objects)
- Maintain overall structure while altering details
**Prompt Structure**:
```
[Reference description] + [Elements to preserve] + [Modifications/additions]
```
**Example**:
```python
task_id = client.image_to_image(
    prompt="A woman in business attire, preserve facial features and pose, change background to modern office with floor-to-ceiling windows",
    image="https://example.com/portrait.jpg",
    resolution="1024*1024",
    aspect_ratio="3:4",
    image_num=2  # Generate 2 variations
)
```
**Important Notes**:
- The reference image guides style, composition, and structure
- Prompt should explicitly state what to preserve vs. modify
- Higher resolution = better detail but slower generation
- Multiple images (image_num > 1) generate variations with same prompt
---
### 2. Image Redrawing (tm_redrawing)
**Redraw specific region of image using mask**
**Endpoint**: `https://openapi.wondershare.cc/v1/open/capacity/application/tm_redrawing`
**Parameters**:
- `prompt` (required): Description of what to redraw in masked area
- `image` (required): Original image URL (JPG/JPEG/PNG/WEBP, <200M)
- `mask` (required): Mask image URL (white = redraw area, black = preserve area)
- `resolution`: Image resolution - `512*512` (default), `768*768`, `1024*1024`
- `aspect_ratio`: Image aspect ratio - `1:1` (default), `16:9`, `9:16`, `4:3`, `3:4`
- `image_num`: Number of images to generate - 1 to 4 (default: 1)
- `callback`: Optional callback URL for async notification
- `params`: Optional transparent parameters passed back in callback
**Use Cases**:
- Remove/replace objects or people
- Edit specific image regions
- Inpainting and outpainting
- Fix or enhance localized areas
- Change backgrounds while keeping foreground
**Mask Requirements**:
- Same dimensions as input image
- White pixels (255, 255, 255) = areas to redraw
- Black pixels (0, 0, 0) = areas to preserve unchanged
- Grayscale values = partial blending (use with caution)
**Example**:
```python
# Remove person from photo
task_id = client.image_redrawing(
    prompt="Empty beach sand with natural texture",
    image="https://example.com/beach-with-person.jpg",
    mask="https://example.com/person-mask.png",  # White where person is
    resolution="1024*1024",
    image_num=1
)

# Change background
task_id = client.image_redrawing(
    prompt="Sunset sky with orange and purple clouds",
    image="https://example.com/portrait.jpg",
    mask="https://example.com/background-mask.png",  # White = background area
    resolution="1024*1024"
)
```
**Creating Masks**:
You can create masks using:
- Image editing software (Photoshop, GIMP)
- Programmatic tools (OpenCV, PIL/Pillow)
- The Image Recognition API (see below) to auto-generate masks
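For the programmatic route, the mask requirements above (white = redraw, black = preserve, same dimensions as the input) can be met without any imaging library. The sketch below writes a rectangular mask as an 8-bit RGB PNG using only the standard library. `rect_mask_png` is a hypothetical helper, and it is an assumption that the API accepts a PNG encoded this way. Real masks usually need shapes more precise than a rectangle, and the file still has to be hosted somewhere, since the API takes a mask URL.

```python
import struct
import zlib


def _chunk(kind: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC32 over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def rect_mask_png(width: int, height: int, box: tuple) -> bytes:
    """Return PNG bytes of a black RGB image with one white rectangle.

    box is (left, top, right, bottom) in pixels; right and bottom are exclusive.
    """
    left, top, right, bottom = box
    white, black = b"\xff\xff\xff", b"\x00\x00\x00"
    rows = []
    for y in range(height):
        row = bytearray(b"\x00")  # filter type 0 (none) for this scanline
        for x in range(width):
            inside = left <= x < right and top <= y < bottom
            row += white if inside else black
        rows.append(bytes(row))
    # IHDR: width, height, bit depth 8, colour type 2 (RGB), no interlace
    ihdr = struct.pack(">IIBBBBB", width, height, 8, 2, 0, 0, 0)
    return (
        b"\x89PNG\r\n\x1a\n"
        + _chunk(b"IHDR", ihdr)
        + _chunk(b"IDAT", zlib.compress(b"".join(rows)))
        + _chunk(b"IEND", b"")
    )


# Example: a 64x48 mask whose centre region is marked for redrawing
png = rect_mask_png(64, 48, box=(16, 12, 48, 36))
```

Writing `png` to a file opened in `"wb"` mode produces the mask image. The pure-Python pixel loop is slow for large images, so OpenCV or Pillow remain the practical choice at full resolution.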
---
### 3. Image Recognition (tm_reference_img2mask)
**Recognize and segment image regions based on prompt**
**Endpoint**: `https://openapi.wondershare.cc/v1/open/capacity/application/tm_reference_img2mask`
**Parameters**:
- `prompt` (required): Description of objects/regions to recognize
- `image` (required): Image URL to analyze (JPG/JPEG/PNG/WEBP, <200M)
- `callback`: Optional callback URL for async notification
- `params`: Optional transparent parameters passed back in callback
**Use Cases**:
- Generate masks for the Redrawing API
- Automatically segment objects in images
- Identify and isolate specific regions
- Create selection masks without manual editing
- Batch processing of similar images
**Output**:
Returns mask images where recognized objects are white, background is black.
Can return multiple masks if multiple objects match the prompt.
**Example**:
```python
# Recognize all people in image
task_id = client.image_recognition(
    prompt="people",
    image="https://example.com/group-photo.jpg"
)

# Recognize specific objects
task_id = client.image_recognition(
    prompt="the red car in the foreground",
    image="https://example.com/street-scene.jpg"
)

# Recognize background
task_id = client.image_recognition(
    prompt="sky and clouds",
    image="https://example.com/landscape.jpg"
)
```
**Recognition + Redrawing Workflow**:
```python
import json

# Step 1: Recognize object to create mask
recognition_task = client.image_recognition(
    prompt="person in center",
    image="https://example.com/photo.jpg"
)
recognition_result = client.poll_until_complete(recognition_task)
mask_url = json.loads(recognition_result['result'])['mask_path'][0]

# Step 2: Use generated mask to redraw that region
redraw_task = client.image_redrawing(
    prompt="beautiful garden with flowers",
    image="https://example.com/photo.jpg",
    mask=mask_url  # Use auto-generated mask
)
redraw_result = client.poll_until_complete(redraw_task)
final_image = json.loads(redraw_result['result'])['image_path'][0]
```
---
## Common Parameters
### Resolution Options
- `512*512`: Fast generation, lower detail
- `768*768`: Balanced speed and quality
- `1024*1024`: Best quality, slower generation
### Aspect Ratio Options
- `1:1`: Square - Instagram, profile pictures
- `16:9`: Landscape - desktop wallpaper, presentations
- `9:16`: Portrait - mobile wallpaper, stories
- `4:3`: Traditional photo format
- `3:4`: Portrait photo format
### Image Number (image_num)
- `1`: Single image (fastest)
- `2-4`: Multiple variations (same prompt, different results)
- Useful for A/B testing or getting options to choose from
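Since out-of-range values are reported as `400` errors only after a round trip, it can help to check the common parameters locally first. The sketch below encodes the options listed above. `validate_image_params` is a hypothetical helper, not part of the bundled client, and the defaults follow the per-endpoint parameter lists.

```python
VALID_RESOLUTIONS = {"512*512", "768*768", "1024*1024"}
VALID_ASPECT_RATIOS = {"1:1", "16:9", "9:16", "4:3", "3:4"}


def validate_image_params(resolution="512*512", aspect_ratio="1:1", image_num=1):
    """Raise ValueError if a value falls outside the documented options."""
    if resolution not in VALID_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution!r}")
    if aspect_ratio not in VALID_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio!r}")
    if not isinstance(image_num, int) or not 1 <= image_num <= 4:
        raise ValueError(f"image_num must be an integer from 1 to 4, got {image_num!r}")
    # Return the checked values so they can be merged into a request payload
    return {
        "resolution": resolution,
        "aspect_ratio": aspect_ratio,
        "image_num": image_num,
    }


params = validate_image_params(resolution="1024*1024", aspect_ratio="3:4", image_num=2)
```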
---
## Async Workflow
All image APIs are asynchronous:
1. **Create Task**: Call API endpoint → receive `task_id`
2. **Poll Status**: Call unified result endpoint with `task_id`
3. **Check Status**:
   - `1` = Queued
   - `2` = Processing
   - `3` = Success (images ready)
   - `4` = Failed
   - `5` = Cancelled
   - `6` = Timeout
4. **Get Result**: When status=3, extract image URL(s) from result JSON
**Unified Result Endpoint**: `https://openapi.wondershare.cc/v1/open/pub/task`
**Example Workflow**:
```python
import json

# Create task
task_id = client.image_to_image(prompt="...", image="...", image_num=2)
# Poll for completion
result = client.poll_until_complete(task_id, poll_interval=5, timeout=300)
# Extract image URLs
result_data = json.loads(result['result'])
image_urls = result_data['image_path']  # List of URLs (length = image_num)
```
---
## Prompt Engineering Tips
### Image-to-Image Prompts
**Structure**: `[Reference] + [Preserve] + [Modify]`
**Good Examples**:
- "A modern kitchen with white cabinets, preserve layout and appliance positions, change color scheme to navy blue and gold accents"
- "Portrait of a woman, keep facial features and expression, change hairstyle to long wavy hair and add glasses"
- "Street scene with cars, maintain composition and perspective, change time to sunset with golden hour lighting"
**Avoid**:
- Vague prompts: "make it better"
- No preservation guidance: "different background" (what else to keep?)
- Conflicting instructions: "keep everything the same but completely different style"
### Redrawing Prompts
**Structure**: `[What to draw in masked area]`
**Good Examples**:
- "Clear blue sky with white fluffy clouds"
- "Modern glass windows with reflections"
- "Natural grass texture with small wildflowers"
- "Empty wooden table surface"
**Avoid**:
- Mentioning what to remove: "remove the person" (just describe what replaces it)
- Complex multi-object prompts in small masks
- Instructions about preserved areas (mask already defines this)
### Recognition Prompts
**Structure**: `[Objects/regions to identify]`
**Good Examples**:
- "person" / "all people"
- "the red car"
- "sky and clouds"
- "foreground objects"
- "background"
- "text and logos"
**Avoid**:
- Overly specific descriptions when the object isn't clearly visible
- Multiple unrelated objects (split into separate calls)
- Negations ("not the person") - describe what you want, not what you don't
---
## Error Handling
**Common Errors**:
- `400`: Invalid parameters (check resolution format, aspect_ratio, image_num range)
- `401`: Authentication failed
- `413`: Image too large (must be <200M)
- `422`: Invalid image format or mask doesn't match image dimensions
- Task status `4`: Generation failed (check prompt clarity or image quality)
**Best Practices**:
- Validate image URLs are publicly accessible
- Ensure masks match image dimensions exactly
- Keep prompts clear and specific
- Use lower resolutions for testing, higher for final generation
- Implement retry logic with exponential backoff
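The last practice, retrying with exponential backoff, can be sketched as a small wrapper around any call that may fail transiently. `retry_with_backoff` and its parameters are illustrative, not part of the bundled client, and which exceptions are worth retrying is an assumption to adjust for your setup.

```python
import random
import time


def retry_with_backoff(func, max_attempts=4, base_delay=1.0, max_delay=30.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call func() until it succeeds or max_attempts is reached.

    The delay doubles after each failure (base_delay, 2x, 4x, ...), is capped
    at max_delay, and has up to 10% random jitter added.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay + random.uniform(0, delay * 0.1))
```

A call could then be wrapped as `retry_with_backoff(lambda: client.poll_until_complete(task_id))`. Narrowing `retry_on` to network errors such as `requests.RequestException` avoids retrying requests that fail for permanent reasons like invalid parameters.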
---
## Quota and Limits
- File size: <200M per image
- Supported formats: JPG, JPEG, PNG, WEBP
- Image number range: 1-4 per request
- Generation time: Typically 30 seconds to 2 minutes
- Concurrent tasks: Check your plan limits
---
## Comparison
| Feature | Image-to-Image | Redrawing | Recognition |
|---------|----------------|-----------|-------------|
| Purpose | Reimagine entire image | Edit specific region | Generate masks |
| Inputs | Image + prompt | Image + mask + prompt | Image + prompt |
| Output | Modified image(s) | Image with region redrawn | Mask image(s) |
| Scope | Global transformation | Localized editing | N/A (tool for Redrawing) |
| Prompt Focus | Overall changes | What replaces masked area | What to segment |
| Typical Use | Style transfer, variations | Object removal, background change | Automation for Redrawing |
**When to use which**:
- **Image-to-Image**: Want to change overall style/theme while keeping structure
- **Redrawing**: Need precise control over specific regions (with manual or auto mask)
- **Recognition**: Automate mask creation for Redrawing workflow
FILE:references/prompt_guide.md
# Tomoviee Prompt Engineering Guide
## Overview
This guide provides structured prompt formulas and best practices for all Tomoviee AI APIs to achieve optimal generation results.
---
## Video APIs
### Text-to-Video Prompt Formula
```
主体(描述) + 运动 + 场景(描述) + (镜头语言 + 光影 + 氛围)
Subject (description) + Motion + Scene (description) + (Camera + Lighting + Atmosphere)
```
**Component Breakdown**:
1. **Subject (主体)**: Main focus of the video
   - Who/what is in the scene
   - Key characteristics, appearance
   - Position and pose
2. **Motion (运动)**: Action and movement
   - What the subject is doing
   - Speed and direction of movement
   - Dynamic elements
3. **Scene (场景)**: Environment and context
   - Location and setting
   - Background elements
   - Time of day/season
4. **Camera (镜头语言)**: Optional camera work
   - Camera angle (wide, close-up, aerial, etc.)
   - Camera movement (pan, zoom, tracking, etc.)
   - Or specify `camera_move_index` parameter
5. **Lighting (光影)**: Optional lighting style
   - Natural/artificial light
   - Time of day (golden hour, blue hour, etc.)
   - Light direction and quality
6. **Atmosphere (氛围)**: Optional mood and tone
   - Overall feeling (dramatic, peaceful, energetic)
   - Color palette/grading
   - Weather/environmental effects
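The formula can also be applied mechanically when prompts are assembled from structured fields. This is a minimal sketch: `build_video_prompt` is a hypothetical helper, and joining the parts with commas is one reasonable convention rather than an API requirement.

```python
def build_video_prompt(subject, motion, scene,
                       camera=None, lighting=None, atmosphere=None):
    """Join the required core and any optional parts into one prompt string."""
    core = f"{subject} {motion} {scene}"
    # Optional components are appended only when provided
    optional = [part for part in (camera, lighting, atmosphere) if part]
    return ", ".join([core] + optional)


prompt = build_video_prompt(
    subject="A red sports car",
    motion="speeding along",
    scene="a winding coastal highway at golden hour",
    camera="camera following from the side",
    lighting="warm orange sunlight reflecting off the car",
)
```

Leaving out `camera`, `lighting`, and `atmosphere` yields a prompt containing only the three required components.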
**Examples**:
**Minimal** (Subject + Motion + Scene):
```
"A red sports car driving fast on a coastal highway at sunset"
```
**Standard** (+ Camera + Lighting):
```
"A red sports car speeding along a winding coastal highway at golden hour,
camera following from the side, warm orange sunlight reflecting off the car"
```
**Detailed** (+ Atmosphere):
```
"A sleek red Ferrari speeding along a dramatic coastal highway carved into cliffs,
camera tracking smoothly from the side at car level,
golden hour sunset casting long shadows and warm orange glow,
cinematic and epic atmosphere with ocean waves crashing below"
```
**More Examples by Use Case**:
**Product Showcase**:
```
"White wireless headphones slowly rotating on a minimalist white surface,
studio lighting with soft shadows, clean and modern atmosphere"
```
**Nature/Travel**:
```
"Majestic waterfall cascading down mossy rocks in a lush rainforest,
slow zoom in from wide to medium shot,
dappled sunlight filtering through the canopy,
serene and peaceful atmosphere"
```
**Action/Sports**:
```
"Professional skateboarder performing a kickflip on urban street ramp,
slow motion capture from low angle,
bright daylight with high contrast shadows,
energetic and dynamic atmosphere"
```
### Image-to-Video Prompt Formula
```
主体 + 运动 + 镜头语言
Subject + Motion + Camera
```
**Why simpler?** The image already provides:
- Scene and environment
- Lighting and color palette
- Composition and framing
- Atmosphere and mood
**Your prompt should focus on**:
- Motion to add to the static image
- Camera movement to apply
- Any new dynamic elements
**Examples**:
```
"Camera slowly zooming in, subject's hair gently blowing in the wind"
```
```
"Slow pan from left to right, leaves rustling, golden hour lighting"
```
```
"Camera orbiting around the subject, dramatic lighting, cinematic feel"
```
```
"Gentle push in toward the subject's face, bokeh background, emotional atmosphere"
```
### Video Continuation Prompt Formula
```
延续的动作 + 场景变化 + 镜头延续
Continued action + Scene evolution + Camera continuation
```
**Focus on**:
- How the action continues from the last frame
- Natural progression of movement
- Scene changes or transitions
- Camera movement consistency
**Examples**:
```
"The bird continues flying higher, soaring into the clouds,
camera following the ascent"
```
```
"The car continues down the road, passing through a tunnel,
maintaining tracking shot"
```
```
"The person continues walking, entering a brightly lit room,
camera follows smoothly"
```
---
## Image APIs
### Image-to-Image Prompt Formula
```
参考图描述 + 保留要素 + 修改/新增指令
Reference description + Elements to preserve + Modifications/additions
```
**Component Breakdown**:
1. **Reference Description**: What's in the original image
2. **Preserve**: Explicitly state what to keep unchanged
3. **Modify/Add**: What to change or add
**Examples**:
```
"A woman in business attire at a modern office,
preserve facial features and body pose,
change background to outdoor garden with natural lighting,
add warm sunset atmosphere"
```
```
"Portrait of a man wearing a blue shirt,
keep facial features and expression,
change clothing to formal black suit with tie,
maintain studio lighting"
```
```
"Kitchen interior with white cabinets,
preserve layout and appliance positions,
change color scheme to navy blue cabinets with gold hardware,
add marble countertops"
```
### Image Redrawing Prompt Formula
```
替换区域的新内容描述
Description of what replaces the masked area
```
**Focus on**:
- What should appear in the masked region
- Style/texture matching surrounding areas
- Lighting consistency
- Natural integration
**Examples**:
```
"Clear blue sky with white fluffy clouds"
(for replacing background)
```
```
"Natural grass texture with small wildflowers"
(for replacing foreground)
```
```
"Modern glass windows with reflections of cityscape"
(for replacing building facade)
```
```
"Empty wooden table surface with subtle wood grain"
(for removing objects from table)
```
### Image Recognition Prompt Formula
```
要识别的对象/区域描述
Description of objects/regions to identify
```
**Focus on**:
- Clear object identification
- Specific vs. general (depending on need)
- Spatial context if helpful
**Examples**:
```
"person" / "all people"
```
```
"the red car in the foreground"
```
```
"sky and clouds"
```
```
"text and logos"
```
```
"background behind the main subject"
```
---
## Audio APIs
### Text-to-Music Prompt Formula
```
主体 + 场景(氛围/风格)
Subject/Theme + Scene (Atmosphere/Style)
```
**Component Breakdown**:
1. **Subject/Theme**: Purpose or topic of the music
2. **Atmosphere**: Mood and emotional quality
3. **Style/Genre**: Musical style and genre
4. **Instruments** (optional): Key instruments to feature
**Examples**:
**Simple**:
```
"Upbeat electronic music, energetic and modern"
```
**Standard**:
```
"Corporate presentation background music, professional and clean,
soft piano and ambient synth"
```
**Detailed**:
```
"Epic cinematic orchestral score, dramatic and heroic,
soaring strings and powerful brass,
Hans Zimmer style, for action scene"
```
**By Genre**:
**Electronic/Pop**:
```
"Energetic EDM track, festival atmosphere, heavy bass and synth drops"
```
**Jazz**:
```
"Smooth jazz for evening ambience, sophisticated and relaxed,
piano trio with double bass"
```
**Orchestral**:
```
"Emotional film score, contemplative and moving,
piano and string ensemble"
```
**Ambient**:
```
"Peaceful meditation music, calm and spacious,
soft pads and gentle chimes"
```
### Text-to-Sound-Effect Prompt Formula
```
声音来源 + 动作/环境 + 特征
Sound source + Action/Context + Characteristics
```
**Component Breakdown**:
1. **Sound Source**: What's making the sound
2. **Action/Context**: What's happening / where it is
3. **Characteristics**: Sound qualities (loud, soft, sharp, etc.)
**Examples**:
**Single Events**:
```
"Glass bottle shattering on concrete, sharp and crisp"
```
```
"Car door closing, solid thunk, reverb in garage"
```
```
"Notification bell sound, soft and pleasant"
```
**Continuous Sounds**:
```
"Heavy rain on metal roof, steady and rhythmic"
```
```
"City traffic ambience, cars passing, distant sirens"
```
```
"Forest birds chirping, peaceful morning atmosphere"
```
**By Category**:
**Nature**:
```
"Ocean waves crashing on beach, powerful and continuous"
```
**Mechanical**:
```
"Old typewriter keys typing, mechanical clicks and dings"
```
**Human**:
```
"Crowd cheering and applauding, enthusiastic and loud"
```
**UI/Digital**:
```
"Futuristic UI beep, high-tech and clean"
```
### Text-to-Speech Tips
**Structure**:
- Write naturally as you would speak
- Use punctuation for pacing and pauses
- Spell out numbers and abbreviations
**Examples**:
**With Pacing**:
```
"Welcome to our platform... Let me show you around."
(Ellipsis adds natural pause)
```
**With Emphasis**:
```
"This is extremely important. Pay close attention."
(Period creates strong pause for emphasis)
```
**Numbers**:
```
"Call one, eight hundred, five five five, zero one two three"
(Instead of "Call 1-800-555-0123")
```
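When numbers come from data rather than hand-written copy, spelling them out can be automated. The sketch below reads a phone-style number digit by digit. `spell_digits` is a hypothetical helper, and it does not produce grouped readings such as "eight hundred"; those would need extra rules.

```python
DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def spell_digits(number: str) -> str:
    """Spell each digit as a word; hyphens and spaces become comma pauses."""
    groups = number.replace("-", " ").split()
    spoken = [
        " ".join(DIGIT_WORDS[int(ch)] for ch in group if ch in "0123456789")
        for group in groups
    ]
    # Drop groups that contained no digits at all
    return ", ".join(part for part in spoken if part)


text = "Call " + spell_digits("1-800-555-0123")
```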
### Video Soundtrack Prompt Formula
```
风格/类型 + 情绪 + (节奏/能量)
Style/Genre + Mood + (Optional: pacing/energy)
```
**Why simpler?** The API analyzes video content automatically.
**Examples**:
**Minimal** (let API analyze):
```
(no prompt - fully automatic based on video)
```
**Guided Style**:
```
"Upbeat travel vlog music, adventurous and inspiring"
```
```
"Corporate tech presentation music, modern and professional"
```
```
"Emotional documentary score, contemplative and moving"
```
---
## General Best Practices
### Do's ✅
1. **Be Specific**: Clear descriptions yield better results
   - Good: "Golden retriever puppy playing with red ball"
   - Bad: "Dog playing"
2. **Use Descriptive Adjectives**: Add relevant details
   - Good: "Sleek modern smartphone with edge-to-edge display"
   - Bad: "Phone"
3. **Specify Mood/Atmosphere**: Set the tone
   - Good: "Dramatic sunset with vibrant orange and purple clouds"
   - Bad: "Sunset"
4. **Include Context**: Where, when, why
   - Good: "Professional product photography on white background"
   - Bad: "Product"
5. **Mention Key Visual Elements**:
   - Lighting conditions
   - Color palette
   - Composition style
   - Movement characteristics
### Don'ts ❌
1. **Don't Be Vague**:
   - Bad: "Nice video"
   - Bad: "Good music"
   - Bad: "Cool image"
2. **Don't Contradict**:
   - Bad: "Bright and dark scene"
   - Bad: "Fast and slow motion"
   - Bad: "Happy sad music"
3. **Don't Over-specify Technical Details**:
   - Bad: "1920x1080 resolution 24fps H.264 codec"
   - Bad: "120 BPM in C major with I-V-vi-IV progression"
   - (API handles technical parameters automatically)
4. **Don't Use Negations**:
   - Bad: "Not blurry, not dark, not boring"
   - Good: "Sharp, bright, engaging"
5. **Don't Mix Multiple Unrelated Concepts**:
   - Bad: "Cat playing piano in space while cooking pasta"
   - Better: Split into multiple generations or choose one focus
---
## Advanced Techniques
### Chaining Outputs
Use output from one API as input to another:
```python
# get_result_url is a placeholder, not a bundled function: it stands for
# polling the task to completion and returning its first output URL.

# 1. Generate image
img_task = client.image_to_image(
    prompt="Modern office space, clean and minimal",
    image="reference.jpg"
)
img_url = get_result_url(img_task)

# 2. Animate image
video_task = client.image_to_video(
    prompt="Camera slowly panning right, golden hour lighting",
    image=img_url
)
video_url = get_result_url(video_task)

# 3. Add soundtrack
audio_task = client.video_soundtrack(
    video=video_url,
    prompt="Professional corporate music"
)
```
### Iterative Refinement
1. **Start Simple**: Test with minimal prompt
2. **Evaluate Output**: What's missing or wrong?
3. **Add Details**: Incrementally add specificity
4. **Test Again**: Compare results
**Example Iteration**:
```
V1: "Car driving"
→ Too generic
V2: "Red sports car driving fast on highway"
→ Better, but lighting unclear
V3: "Red Ferrari speeding on coastal highway, golden hour sunset, dramatic"
→ Good result!
```
### A/B Testing
Generate multiple variations to compare:
```python
# Test different camera movements
variants = [
    client.image_to_video(image=img, prompt="Slow zoom in", camera_move_index=5),
    client.image_to_video(image=img, prompt="Pan right", camera_move_index=12),
    client.image_to_video(image=img, prompt="Orbit around subject", camera_move_index=23)
]
```
### Consistency Across Assets
For cohesive content, maintain consistent prompt elements:
```python
# Consistent style keywords across multiple generations
style_guide = "cinematic, golden hour lighting, dramatic atmosphere"
video1 = client.text_to_video(f"Mountain landscape, {style_guide}")
video2 = client.text_to_video(f"Forest scene, {style_guide}")
video3 = client.text_to_video(f"Beach sunset, {style_guide}")
```
---
## Prompt Templates by Use Case
### Marketing/Commercial
```
"[Product name] showcased on [surface/background],
[camera movement],
studio lighting with soft shadows,
premium and elegant atmosphere"
```
### Social Media
```
"[Subject] [action],
[camera angle/movement],
vibrant colors and high energy,
engaging and eye-catching for [platform]"
```
### Documentary/Educational
```
"[Subject] in [environment],
[natural action],
natural documentary style cinematography,
informative and authentic atmosphere"
```
### Artistic/Creative
```
"[Abstract concept] visualized as [imagery],
[unique camera work],
[lighting style],
[artistic mood/style reference]"
```
---
## Troubleshooting
### Problem: Output doesn't match prompt
**Solutions**:
- Simplify prompt to core elements
- Remove contradictory instructions
- Be more specific about key aspects
- Check for typos or ambiguous terms
### Problem: Output quality is poor
**Solutions**:
- Add style keywords (cinematic, professional, high quality)
- Specify lighting conditions
- Mention camera quality/style
- Add atmosphere/mood descriptors
### Problem: Inconsistent results
**Solutions**:
- Use more specific prompts
- Reference specific styles or examples
- Test with variations to find optimal wording
- Use consistent style keywords across generations
### Problem: Generation fails (status=4)
**Solutions**:
- Check prompt for inappropriate content
- Simplify overly complex prompts
- Verify input files (images/videos) are valid
- Ensure parameters are within valid ranges
- Try with default parameters first
---
## Language Considerations
Tomoviee supports both **Chinese** and **English** prompts.
**Tips**:
- Use the language you're most comfortable with
- Be aware of cultural/contextual nuances
- Technical terms may work better in English
- Artistic descriptions may vary by language interpretation
**Example Comparisons**:
Chinese:
```
"一只金毛寻回犬在阳光明媚的草地上奔跑,镜头跟随,电影级画质"
```
English:
```
"A golden retriever running through a sunlit meadow, camera following, cinematic quality"
```
Both can produce excellent results - choose based on your fluency and precision needs.
---
## Summary Checklist
Before submitting your prompt, verify:
- [ ] **Clear Subject**: What's the main focus?
- [ ] **Specific Action**: What's happening?
- [ ] **Scene Context**: Where is this taking place?
- [ ] **Visual Style**: What's the look and feel?
- [ ] **Technical Parameters**: Resolution, duration, aspect ratio set correctly?
- [ ] **No Contradictions**: All elements work together?
- [ ] **Appropriate Specificity**: Not too vague, not over-specified?
A well-crafted prompt is the foundation of excellent AI-generated content. Take time to refine your prompts for best results!
Redraw image content using Tomoviee Image Redrawing API (`tm_redrawing`) through Wondershare OpenAPI gateway (`https://openapi.wondershare.cc`). Use when use...
---
name: tomoviee-image-redraw
description: Redraw image content using Tomoviee Image Redrawing API (`tm_redrawing`) through Wondershare OpenAPI gateway (`https://openapi.wondershare.cc`). Use when users request inpainting, localized replacement, object removal, or mask-based image edits.
---
# Tomoviee AI Image Redrawing
## Overview
Redraw image content with optional mask control.
- API capability: `tm_redrawing`
- White mask area: redraw
- Black mask area: keep unchanged
## Provider and Endpoint Provenance
Use this mapping to verify credential and endpoint provenance before production usage:
- Vendor portals: `https://www.tomoviee.ai` and `https://www.tomoviee.cn`
- Gateway host used by this skill: `https://openapi.wondershare.cc`
- Redrawing endpoint: `https://openapi.wondershare.cc/v1/open/capacity/application/tm_redrawing`
- Task result endpoint: `https://openapi.wondershare.cc/v1/open/pub/task`
This skill sends API requests only to `openapi.wondershare.cc`.
## Quick Start
### Install dependencies
```bash
pip install -r requirements.txt
```
### Authentication
```bash
python scripts/generate_auth_token.py YOUR_APP_KEY YOUR_APP_SECRET
```
### Python Client
```python
from scripts.tomoviee_redrawing_client import TomovieeRedrawingClient
client = TomovieeRedrawingClient("app_key", "app_secret")
```
## API Usage
### Basic Example
```python
import json

task_id = client.image_redrawing(
    prompt="Clear blue sky with fluffy clouds",
    init_image="https://example.com/photo.jpg",
    mask_url="https://example.com/mask.png",
)
result = client.poll_until_complete(task_id)
output = json.loads(result["result"])
print(output["images_path"][0])
```
### Parameters
- `prompt` (required): positive prompt text
- `init_image` (required): source image URL
  - supported format: `jpg/png`
  - width and height: `>512` and `<2048`
  - aspect ratio: `<3`
- `mask_url` (optional): mask image URL
  - should have same resolution as `init_image`
  - supported format: `jpg/png`
  - width and height: `>512` and `<2048`
  - aspect ratio: `<3`
- `callback`: callback URL (optional)
- `params`: transparent passthrough params (optional)
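The size limits above can be checked before uploading an image. The sketch below is a hypothetical helper, not part of the bundled client. It assumes that "aspect ratio `<3`" means the longer side divided by the shorter side, which the parameter list does not spell out.

```python
def check_redraw_dimensions(width: int, height: int) -> None:
    """Raise ValueError if the size violates the limits listed above."""
    for name, value in (("width", width), ("height", height)):
        if not 512 < value < 2048:
            raise ValueError(f"{name} must be >512 and <2048, got {value}")
    # Assumption: aspect ratio is measured as long side / short side
    ratio = max(width, height) / min(width, height)
    if not ratio < 3:
        raise ValueError(f"aspect ratio must be <3, got {ratio:.2f}")


check_redraw_dimensions(1024, 768)  # passes silently
```

Running the same check on the mask, plus a comparison of its width and height with those of `init_image`, covers the "same resolution" requirement.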
## Async Workflow
1. Create task and get `task_id`
2. Poll task status with `poll_until_complete(task_id)`
3. Parse output image URLs from `result`
Status codes:
- `1` queued
- `2` processing
- `3` success
- `4` failed
- `5` cancelled
- `6` timeout
## Resources
- `scripts/tomoviee_redrawing_client.py` - main redrawing client
- `scripts/tomoviee_image_redrawing_client.py` - compatibility import shim
- `scripts/generate_auth_token.py` - auth token helper
- `references/image_apis.md` - endpoint and workflow references
- `references/prompt_guide.md` - prompt writing guidance
## External Resources
- Developer portal (global): `https://www.tomoviee.ai/developers.html`
- API docs (global): `https://www.tomoviee.ai/doc/`
- Developer portal (mainland): `https://www.tomoviee.cn/developers.html`
- API docs (mainland): `https://www.tomoviee.cn/doc/`
- Gateway host used by this package: `https://openapi.wondershare.cc`
FILE:requirements.txt
requests>=2.31.0,<3.0.0
FILE:_meta.json
{
"ownerId": "kn7bn8fv79mjtchs17087tdtsx82fc86",
"slug": "tomoviee-image-redraw",
"version": "1.0.1",
"publishedAt": 1772933941503
}
FILE:scripts/generate_auth_token.py
#!/usr/bin/env python3
"""
Generate authentication token for Tomoviee API.

Usage:
    python generate_auth_token.py <app_key> <app_secret>

Output:
    Base64 encoded access_token in the format required by Authorization header
"""
import base64
import sys


def generate_access_token(app_key: str, app_secret: str) -> str:
    """
    Generate access token for Tomoviee API authentication.

    Args:
        app_key: Application key from Tomoviee console
        app_secret: Application secret from Tomoviee console

    Returns:
        Base64 encoded string in format: base64(app_key:app_secret)
    """
    credentials = f"{app_key}:{app_secret}"
    access_token = base64.b64encode(credentials.encode()).decode()
    return access_token


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python generate_auth_token.py <app_key> <app_secret>")
        sys.exit(1)
    app_key = sys.argv[1]
    app_secret = sys.argv[2]
    token = generate_access_token(app_key, app_secret)
    print(f"Access Token: {token}")
    print(f"\nUse in Authorization header as: Basic {token}")
FILE:scripts/tomoviee_image_redrawing_client.py
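#!/usr/bin/env python3
"""Compatibility shim. Prefer importing TomovieeRedrawingClient from tomoviee_redrawing_client."""
try:
    from scripts.tomoviee_redrawing_client import TomovieeClient, TomovieeRedrawingClient
except ImportError:  # fall back when run from inside scripts/
    from tomoviee_redrawing_client import TomovieeClient, TomovieeRedrawingClient
__all__ = ["TomovieeClient", "TomovieeRedrawingClient"]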
FILE:scripts/tomoviee_redrawing_client.py
#!/usr/bin/env python3
"""Tomoviee AI - Image Redrawing API client."""
import base64
import json
import time
from typing import Any, Dict, Optional

import requests


class TomovieeRedrawingClient:
    """Image redrawing client for Tomoviee API."""

    BASE_URL = "https://openapi.wondershare.cc/v1/open/capacity/application"
    RESULT_ENDPOINT = "https://openapi.wondershare.cc/v1/open/pub/task"
    ENDPOINT = "tm_redrawing"
    REQUEST_TIMEOUT = 60

    def __init__(self, app_key: str, app_secret: str):
        self.app_key = app_key
        self.access_token = self._generate_token(app_key, app_secret)

    def _generate_token(self, app_key: str, app_secret: str) -> str:
        credentials = f"{app_key}:{app_secret}"
        return base64.b64encode(credentials.encode()).decode()

    def _get_headers(self) -> Dict[str, str]:
        return {
            "Content-Type": "application/json",
            "X-App-Key": self.app_key,
            "Authorization": f"Basic {self.access_token}",
        }

    def _safe_json(self, response: requests.Response) -> Dict[str, Any]:
        try:
            return response.json()
        except ValueError as exc:
            raise Exception(f"Invalid JSON response: {response.text}") from exc

    def _make_request(self, payload: Dict[str, Any]) -> str:
        url = f"{self.BASE_URL}/{self.ENDPOINT}"
        response = requests.post(
            url,
            headers=self._get_headers(),
            json=payload,
            timeout=self.REQUEST_TIMEOUT,
        )
        response.raise_for_status()
        result = self._safe_json(response)
        if result.get("code") != 0:
            raise Exception(f"API Error: {result.get('msg', 'Unknown error')}")
        task_id = result.get("data", {}).get("task_id")
        if not task_id:
            raise Exception(f"Missing task_id in response: {result}")
        return task_id

    def image_redrawing(
        self,
        prompt: str,
        init_image: Optional[str] = None,
        mask_url: Optional[str] = None,
        callback: Optional[str] = None,
        params: Optional[str] = None,
        image: Optional[str] = None,
        mask: Optional[str] = None,
    ) -> str:
        """Create an image redrawing task and return task_id."""
        payload: Dict[str, Any] = {
            "prompt": prompt,
            "init_image": init_image or image,
        }
        if not payload["init_image"]:
            raise ValueError("init_image is required")
        # Backward compatibility: support old `mask` argument.
        final_mask = mask_url or mask
        if final_mask:
            payload["mask_url"] = final_mask
        if callback:
            payload["callback"] = callback
        if params:
            payload["params"] = params
        return self._make_request(payload)

    def get_result(self, task_id: str) -> Dict[str, Any]:
        response = requests.post(
            self.RESULT_ENDPOINT,
            headers=self._get_headers(),
            json={"task_id": task_id},
            timeout=self.REQUEST_TIMEOUT,
        )
        response.raise_for_status()
        result = self._safe_json(response)
        if result.get("code") != 0 and not result.get("data"):
            raise Exception(f"API Error: {result.get('msg', 'Unknown error')}")
        data = result.get("data")
        if not data:
            raise Exception(f"Missing task data in response: {result}")
        return data

    def poll_until_complete(
        self,
        task_id: str,
        poll_interval: int = 10,
        timeout: int = 600,
    ) -> Dict[str, Any]:
        elapsed = 0
        while elapsed < timeout:
            result = self.get_result(task_id)
            status = result.get("status")
            if status == 3:
                return result
            if status in [4, 5, 6]:
                raise Exception(f"Task failed: {result.get('reason', 'Unknown error')}")
            time.sleep(poll_interval)
            elapsed += poll_interval
        raise TimeoutError(f"Task did not complete within {timeout} seconds")


# Backward-compatible alias for older docs/usages.
TomovieeClient = TomovieeRedrawingClient

if __name__ == "__main__":
    import sys

    if len(sys.argv) < 5:
        print(
            "Usage: python scripts/tomoviee_redrawing_client.py "
            "<app_key> <app_secret> <prompt> <init_image_url> [mask_url] [callback] [params]"
        )
        sys.exit(1)
    app_key = sys.argv[1]
    app_secret = sys.argv[2]
    prompt = sys.argv[3]
    init_image = sys.argv[4]
    mask_url = sys.argv[5] if len(sys.argv) > 5 else None
    callback = sys.argv[6] if len(sys.argv) > 6 else None
    params = sys.argv[7] if len(sys.argv) > 7 else None
    client = TomovieeRedrawingClient(app_key, app_secret)
    try:
        print("Creating redrawing task...")
        task_id = client.image_redrawing(
            prompt=prompt,
            init_image=init_image,
            mask_url=mask_url,
            callback=callback,
            params=params,
        )
        print(f"Task created: {task_id}")
        print("Polling for result...")
        result = client.poll_until_complete(task_id)
        print("\nTask completed")
        print(f"Progress: {result.get('progress', 'N/A')}%")
        result_data = json.loads(result["result"])
        print(json.dumps(result_data, indent=2, ensure_ascii=False))
    except Exception as exc:
        print(f"Error: {exc}")
        sys.exit(1)
FILE:references/image_apis.md
# Tomoviee Image APIs (Redrawing Focus)
## Provenance and Endpoint Mapping
- Vendor portals: `https://www.tomoviee.ai` and `https://www.tomoviee.cn`
- Gateway host used by SDK/client scripts: `https://openapi.wondershare.cc`
- This skill's primary capacity: `tm_redrawing`
All runnable scripts in this skill package call only:
1. `https://openapi.wondershare.cc/v1/open/capacity/application/tm_redrawing`
2. `https://openapi.wondershare.cc/v1/open/pub/task`
## Image Redrawing (tm_redrawing)
Use image redrawing with optional mask control.
### Endpoint
`https://openapi.wondershare.cc/v1/open/capacity/application/tm_redrawing`
### Required Parameters
- `prompt`: positive prompt text
- `init_image`: source image URL
  - supported format: `jpg/png`
  - width and height: `>512` and `<2048`
  - aspect ratio: `<3`
### Optional Parameters
- `mask_url`: mask image URL
  - should have same resolution as `init_image`
  - supported format: `jpg/png`
  - width and height: `>512` and `<2048`
  - aspect ratio: `<3`
- `callback`: callback URL
- `params`: passthrough params
### Result Polling Endpoint
`https://openapi.wondershare.cc/v1/open/pub/task`
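
For readers who want to see the shape of a polling call without the bundled client, the sketch below builds (but does not send) the request using only the standard library. The headers and JSON body mirror what `scripts/tomoviee_redrawing_client.py` sends; `build_poll_request` itself is a hypothetical helper, not part of the package.

```python
import base64
import json
import urllib.request

RESULT_ENDPOINT = "https://openapi.wondershare.cc/v1/open/pub/task"


def build_poll_request(app_key: str, app_secret: str, task_id: str) -> urllib.request.Request:
    """Build the POST request used to query a task; nothing is sent here."""
    token = base64.b64encode(f"{app_key}:{app_secret}".encode()).decode()
    return urllib.request.Request(
        RESULT_ENDPOINT,
        data=json.dumps({"task_id": task_id}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-App-Key": app_key,
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call; the response layout is the one handled by `get_result` in the client script.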
### Status Codes
- `1` queued
- `2` processing
- `3` success
- `4` failed
- `5` cancelled
- `6` timeout
### Example
```python
from scripts.tomoviee_redrawing_client import TomovieeRedrawingClient

client = TomovieeRedrawingClient("app_key", "app_secret")
task_id = client.image_redrawing(
    prompt="Clear blue sky with fluffy clouds",
    init_image="https://example.com/input.png",
    mask_url="https://example.com/mask.png",
)
result = client.poll_until_complete(task_id)
```
FILE:references/prompt_guide.md
# Tomoviee Prompt Engineering Guide
## Overview
This guide provides structured prompt formulas and best practices for all Tomoviee AI APIs to achieve optimal generation results.
---
## Video APIs
### Text-to-Video Prompt Formula
```
主体(描述) + 运动 + 场景(描述) + (镜头语言 + 光影 + 氛围)
Subject (description) + Motion + Scene (description) + (Camera + Lighting + Atmosphere)
```
**Component Breakdown**:
1. **Subject (主体)**: Main focus of the video
   - Who/what is in the scene
   - Key characteristics, appearance
   - Position and pose
2. **Motion (运动)**: Action and movement
   - What the subject is doing
   - Speed and direction of movement
   - Dynamic elements
3. **Scene (场景)**: Environment and context
   - Location and setting
   - Background elements
   - Time of day/season
4. **Camera (镜头语言)**: Optional camera work
   - Camera angle (wide, close-up, aerial, etc.)
   - Camera movement (pan, zoom, tracking, etc.)
   - Or specify `camera_move_index` parameter
5. **Lighting (光影)**: Optional lighting style
   - Natural/artificial light
   - Time of day (golden hour, blue hour, etc.)
   - Light direction and quality
6. **Atmosphere (氛围)**: Optional mood and tone
   - Overall feeling (dramatic, peaceful, energetic)
   - Color palette/grading
   - Weather/environmental effects
**Examples**:
**Minimal** (Subject + Motion + Scene):
```
"A red sports car driving fast on a coastal highway at sunset"
```
**Standard** (+ Camera + Lighting):
```
"A red sports car speeding along a winding coastal highway at golden hour,
camera following from the side, warm orange sunlight reflecting off the car"
```
**Detailed** (+ Atmosphere):
```
"A sleek red Ferrari speeding along a dramatic coastal highway carved into cliffs,
camera tracking smoothly from the side at car level,
golden hour sunset casting long shadows and warm orange glow,
cinematic and epic atmosphere with ocean waves crashing below"
```
**More Examples by Use Case**:
**Product Showcase**:
```
"White wireless headphones slowly rotating on a minimalist white surface,
studio lighting with soft shadows, clean and modern atmosphere"
```
**Nature/Travel**:
```
"Majestic waterfall cascading down mossy rocks in a lush rainforest,
slow zoom in from wide to medium shot,
dappled sunlight filtering through the canopy,
serene and peaceful atmosphere"
```
**Action/Sports**:
```
"Professional skateboarder performing a kickflip on urban street ramp,
slow motion capture from low angle,
bright daylight with high contrast shadows,
energetic and dynamic atmosphere"
```
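The text-to-video formula can also be assembled programmatically when prompts are generated in bulk. The helper below is a sketch under that assumption; `build_video_prompt` is a hypothetical name and the comma-joined layout is just one reasonable way to combine the components.

```python
from typing import Optional


def build_video_prompt(
    subject: str,
    motion: str,
    scene: str,
    camera: Optional[str] = None,
    lighting: Optional[str] = None,
    atmosphere: Optional[str] = None,
) -> str:
    """Compose Subject + Motion + Scene, then append any optional components."""
    core = f"{subject} {motion} {scene}"
    # Camera, lighting, and atmosphere are optional in the formula; skip empty ones.
    extras = [part for part in (camera, lighting, atmosphere) if part]
    return ", ".join([core, *extras])
```

Calling it with only the three required parts reproduces the "Minimal" style; supplying `camera` and `lighting` moves toward the "Standard" style.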
### Image-to-Video Prompt Formula
```
主体 + 运动 + 镜头语言
Subject + Motion + Camera
```
**Why simpler?** The image already provides:
- Scene and environment
- Lighting and color palette
- Composition and framing
- Atmosphere and mood
**Your prompt should focus on**:
- Motion to add to the static image
- Camera movement to apply
- Any new dynamic elements
**Examples**:
```
"Camera slowly zooming in, subject's hair gently blowing in the wind"
```
```
"Slow pan from left to right, leaves rustling, golden hour lighting"
```
```
"Camera orbiting around the subject, dramatic lighting, cinematic feel"
```
```
"Gentle push in toward the subject's face, bokeh background, emotional atmosphere"
```
### Video Continuation Prompt Formula
```
延续的动作 + 场景变化 + 镜头延续
Continued action + Scene evolution + Camera continuation
```
**Focus on**:
- How the action continues from the last frame
- Natural progression of movement
- Scene changes or transitions
- Camera movement consistency
**Examples**:
```
"The bird continues flying higher, soaring into the clouds,
camera following the ascent"
```
```
"The car continues down the road, passing through a tunnel,
maintaining tracking shot"
```
```
"The person continues walking, entering a brightly lit room,
camera follows smoothly"
```
---
## Image APIs
### Image-to-Image Prompt Formula
```
参考图描述 + 保留要素 + 修改/新增指令
Reference description + Elements to preserve + Modifications/additions
```
**Component Breakdown**:
1. **Reference Description**: What's in the original image
2. **Preserve**: Explicitly state what to keep unchanged
3. **Modify/Add**: What to change or add
**Examples**:
```
"A woman in business attire at a modern office,
preserve facial features and body pose,
change background to outdoor garden with natural lighting,
add warm sunset atmosphere"
```
```
"Portrait of a man wearing a blue shirt,
keep facial features and expression,
change clothing to formal black suit with tie,
maintain studio lighting"
```
```
"Kitchen interior with white cabinets,
preserve layout and appliance positions,
change color scheme to navy blue cabinets with gold hardware,
add marble countertops"
```
### Image Redrawing Prompt Formula
```
替换区域的新内容描述
Description of what replaces the masked area
```
**Focus on**:
- What should appear in the masked region
- Style/texture matching surrounding areas
- Lighting consistency
- Natural integration
**Examples**:
```
"Clear blue sky with white fluffy clouds"
(for replacing background)
```
```
"Natural grass texture with small wildflowers"
(for replacing foreground)
```
```
"Modern glass windows with reflections of cityscape"
(for replacing building facade)
```
```
"Empty wooden table surface with subtle wood grain"
(for removing objects from table)
```
### Image Recognition Prompt Formula
```
要识别的对象/区域描述
Description of objects/regions to identify
```
**Focus on**:
- Clear object identification
- Specific vs. general (depending on need)
- Spatial context if helpful
**Examples**:
```
"person" / "all people"
```
```
"the red car in the foreground"
```
```
"sky and clouds"
```
```
"text and logos"
```
```
"background behind the main subject"
```
---
## Audio APIs
### Text-to-Music Prompt Formula
```
主体 + 场景(氛围/风格)
Subject/Theme + Scene (Atmosphere/Style)
```
**Component Breakdown**:
1. **Subject/Theme**: Purpose or topic of the music
2. **Atmosphere**: Mood and emotional quality
3. **Style/Genre**: Musical style and genre
4. **Instruments** (optional): Key instruments to feature
**Examples**:
**Simple**:
```
"Upbeat electronic music, energetic and modern"
```
**Standard**:
```
"Corporate presentation background music, professional and clean,
soft piano and ambient synth"
```
**Detailed**:
```
"Epic cinematic orchestral score, dramatic and heroic,
soaring strings and powerful brass,
Hans Zimmer style, for action scene"
```
**By Genre**:
**Electronic/Pop**:
```
"Energetic EDM track, festival atmosphere, heavy bass and synth drops"
```
**Jazz**:
```
"Smooth jazz for evening ambience, sophisticated and relaxed,
piano trio with double bass"
```
**Orchestral**:
```
"Emotional film score, contemplative and moving,
piano and string ensemble"
```
**Ambient**:
```
"Peaceful meditation music, calm and spacious,
soft pads and gentle chimes"
```
### Text-to-Sound-Effect Prompt Formula
```
声音来源 + 动作/环境 + 特征
Sound source + Action/Context + Characteristics
```
**Component Breakdown**:
1. **Sound Source**: What's making the sound
2. **Action/Context**: What's happening / where it is
3. **Characteristics**: Sound qualities (loud, soft, sharp, etc.)
**Examples**:
**Single Events**:
```
"Glass bottle shattering on concrete, sharp and crisp"
```
```
"Car door closing, solid thunk, reverb in garage"
```
```
"Notification bell sound, soft and pleasant"
```
**Continuous Sounds**:
```
"Heavy rain on metal roof, steady and rhythmic"
```
```
"City traffic ambience, cars passing, distant sirens"
```
```
"Forest birds chirping, peaceful morning atmosphere"
```
**By Category**:
**Nature**:
```
"Ocean waves crashing on beach, powerful and continuous"
```
**Mechanical**:
```
"Old typewriter keys typing, mechanical clicks and dings"
```
**Human**:
```
"Crowd cheering and applauding, enthusiastic and loud"
```
**UI/Digital**:
```
"Futuristic UI beep, high-tech and clean"
```
### Text-to-Speech Tips
**Structure**:
- Write naturally as you would speak
- Use punctuation for pacing and pauses
- Spell out numbers and abbreviations
**Examples**:
**With Pacing**:
```
"Welcome to our platform... Let me show you around."
(Ellipsis adds natural pause)
```
**With Emphasis**:
```
"This is extremely important. Pay close attention."
(Period creates strong pause for emphasis)
```
**Numbers**:
```
"Call one, eight hundred, five five five, zero one two three"
(Instead of "Call 1-800-555-0123")
```
### Video Soundtrack Prompt Formula
```
风格/类型 + 情绪 + (节奏/能量)
Style/Genre + Mood + (Optional: pacing/energy)
```
**Why simpler?** The API analyzes video content automatically.
**Examples**:
**Minimal** (let API analyze):
```
(no prompt - fully automatic based on video)
```
**Guided Style**:
```
"Upbeat travel vlog music, adventurous and inspiring"
```
```
"Corporate tech presentation music, modern and professional"
```
```
"Emotional documentary score, contemplative and moving"
```
---
## General Best Practices
### Do's ✅
1. **Be Specific**: Clear descriptions yield better results
   - Good: "Golden retriever puppy playing with red ball"
   - Bad: "Dog playing"
2. **Use Descriptive Adjectives**: Add relevant details
   - Good: "Sleek modern smartphone with edge-to-edge display"
   - Bad: "Phone"
3. **Specify Mood/Atmosphere**: Set the tone
   - Good: "Dramatic sunset with vibrant orange and purple clouds"
   - Bad: "Sunset"
4. **Include Context**: Where, when, why
   - Good: "Professional product photography on white background"
   - Bad: "Product"
5. **Mention Key Visual Elements**:
   - Lighting conditions
   - Color palette
   - Composition style
   - Movement characteristics
### Don'ts ❌
1. **Don't Be Vague**:
   - Bad: "Nice video"
   - Bad: "Good music"
   - Bad: "Cool image"
2. **Don't Contradict**:
   - Bad: "Bright and dark scene"
   - Bad: "Fast and slow motion"
   - Bad: "Happy sad music"
3. **Don't Over-specify Technical Details**:
   - Bad: "1920x1080 resolution 24fps H.264 codec"
   - Bad: "120 BPM in C major with I-V-vi-IV progression"
   - Technical settings belong in API parameters, not in the prompt text
4. **Don't Use Negations**:
   - Bad: "Not blurry, not dark, not boring"
   - Good: "Sharp, bright, engaging"
5. **Don't Mix Multiple Unrelated Concepts**:
   - Bad: "Cat playing piano in space while cooking pasta"
   - Better: Split into multiple generations or choose one focus
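
The "Don't Use Negations" rule is easy to check automatically before a prompt is submitted. The snippet below is a rough heuristic sketch, not an official validator: `find_negations` and its word list are assumptions made for illustration, and it only covers English prompts.

```python
import re
from typing import List

# Heuristic word list (an assumption); extend it to suit your own prompts.
NEGATION_PATTERN = re.compile(r"\b(?:not|no|never|without)\b|n't\b", re.IGNORECASE)


def find_negations(prompt: str) -> List[str]:
    """Return every negation token found in the prompt, lower-cased."""
    return [match.group(0).lower() for match in NEGATION_PATTERN.finditer(prompt)]
```

A non-empty result is a hint to rephrase positively, for example "Sharp, bright, engaging" instead of "Not blurry, not dark, not boring".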
---
## Advanced Techniques
### Chaining Outputs
Use output from one API as input to another:
```python
# Illustrative pipeline: `client` stands for whichever Tomoviee client exposes
# each method, and `get_result_url` is a placeholder for polling the task and
# extracting the output URL from its result.
# 1. Generate image
img_task = client.image_to_image(
    prompt="Modern office space, clean and minimal",
    image="reference.jpg"
)
img_url = get_result_url(img_task)
# 2. Animate image
video_task = client.image_to_video(
    prompt="Camera slowly panning right, golden hour lighting",
    image=img_url
)
video_url = get_result_url(video_task)
# 3. Add soundtrack
audio_task = client.video_soundtrack(
    video=video_url,
    prompt="Professional corporate music"
)
```
### Iterative Refinement
1. **Start Simple**: Test with minimal prompt
2. **Evaluate Output**: What's missing or wrong?
3. **Add Details**: Incrementally add specificity
4. **Test Again**: Compare results
**Example Iteration**:
```
V1: "Car driving"
→ Too generic
V2: "Red sports car driving fast on highway"
→ Better, but lighting unclear
V3: "Red Ferrari speeding on coastal highway, golden hour sunset, dramatic"
→ Good result!
```
### A/B Testing
Generate multiple variations to compare:
```python
# Test different camera movements
variants = [
    client.image_to_video(image=img, prompt="Slow zoom in", camera_move_index=5),
    client.image_to_video(image=img, prompt="Pan right", camera_move_index=12),
    client.image_to_video(image=img, prompt="Orbit around subject", camera_move_index=23),
]
```
### Consistency Across Assets
For cohesive content, maintain consistent prompt elements:
```python
# Consistent style keywords across multiple generations
style_guide = "cinematic, golden hour lighting, dramatic atmosphere"
video1 = client.text_to_video(f"Mountain landscape, {style_guide}")
video2 = client.text_to_video(f"Forest scene, {style_guide}")
video3 = client.text_to_video(f"Beach sunset, {style_guide}")
```
---
## Prompt Templates by Use Case
### Marketing/Commercial
```
"[Product name] showcased on [surface/background],
[camera movement],
studio lighting with soft shadows,
premium and elegant atmosphere"
```
### Social Media
```
"[Subject] [action],
[camera angle/movement],
vibrant colors and high energy,
engaging and eye-catching for [platform]"
```
### Documentary/Educational
```
"[Subject] in [environment],
[natural action],
natural documentary style cinematography,
informative and authentic atmosphere"
```
### Artistic/Creative
```
"[Abstract concept] visualized as [imagery],
[unique camera work],
[lighting style],
[artistic mood/style reference]"
```
---
## Troubleshooting
### Problem: Output doesn't match prompt
**Solutions**:
- Simplify prompt to core elements
- Remove contradictory instructions
- Be more specific about key aspects
- Check for typos or ambiguous terms
### Problem: Output quality is poor
**Solutions**:
- Add style keywords (cinematic, professional, high quality)
- Specify lighting conditions
- Mention camera quality/style
- Add atmosphere/mood descriptors
### Problem: Inconsistent results
**Solutions**:
- Use more specific prompts
- Reference specific styles or examples
- Test with variations to find optimal wording
- Use consistent style keywords across generations
### Problem: Generation fails (status=4)
**Solutions**:
- Check prompt for inappropriate content
- Simplify overly complex prompts
- Verify input files (images/videos) are valid
- Ensure parameters are within valid ranges
- Try with default parameters first
---
## Language Considerations
Tomoviee supports both **Chinese** and **English** prompts.
**Tips**:
- Use language you're most comfortable with
- Be aware of cultural/contextual nuances
- Technical terms may work better in English
- Artistic descriptions may vary by language interpretation
**Example Comparisons**:
Chinese:
```
"一只金毛寻回犬在阳光明媚的草地上奔跑,镜头跟随,电影级画质"
```
English:
```
"A golden retriever running through a sunlit meadow, camera following, cinematic quality"
```
Both can produce excellent results - choose based on your fluency and precision needs.
---
## Summary Checklist
Before submitting your prompt, verify:
- [ ] **Clear Subject**: What's the main focus?
- [ ] **Specific Action**: What's happening?
- [ ] **Scene Context**: Where is this taking place?
- [ ] **Visual Style**: What's the look and feel?
- [ ] **Technical Parameters**: Resolution, duration, aspect ratio set correctly?
- [ ] **No Contradictions**: All elements work together?
- [ ] **Appropriate Specificity**: Not too vague, not over-specified?
A well-crafted prompt is the foundation of excellent AI-generated content. Take time to refine your prompts for best results!
---
name: tomoviee-reference-to-image
description: Generate images from a reference image using Tomoviee Image-to-Image API (`tm_reference_img2img`) through Wondershare OpenAPI gateway (`https://openapi.wondershare.cc`). Use when users request image-to-image editing, style transfer, or subject-preserving transformations.
---
# Tomoviee AI Reference-to-Image
## Overview
Generate new images from a reference image and prompt.
- API capability: `tm_reference_img2img`
- Create endpoint: `https://openapi.wondershare.cc/v1/open/capacity/application/tm_reference_img2img`
- Result endpoint: `https://openapi.wondershare.cc/v1/open/pub/task`
## Provider and Endpoint Provenance
Use this mapping to verify provider identity and runtime endpoint provenance:
- Vendor portals: `https://www.tomoviee.ai` and `https://www.tomoviee.cn`
- Runtime gateway host used by this skill: `https://openapi.wondershare.cc`
- This skill sends runtime API calls only to `openapi.wondershare.cc`
## Credential Handling
- `app_key` and `app_secret` are only used to build `Authorization: Basic <base64(app_key:app_secret)>`.
- Credentials are kept in process memory only and are not written to disk by this skill.
- Do not commit credentials into skill files or repository history.
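
One way to keep credentials out of skill files is to read them from environment variables at runtime. The sketch below assumes two hypothetical variable names, `TOMOVIEE_APP_KEY` and `TOMOVIEE_APP_SECRET`; the package itself does not define them, and the bundled client takes the key and secret as constructor arguments.

```python
import base64
import os
from typing import Dict


def auth_header_from_env(
    key_var: str = "TOMOVIEE_APP_KEY",        # hypothetical variable name
    secret_var: str = "TOMOVIEE_APP_SECRET",  # hypothetical variable name
) -> Dict[str, str]:
    """Build the Basic Authorization header from environment variables."""
    app_key = os.environ.get(key_var)
    app_secret = os.environ.get(secret_var)
    if not app_key or not app_secret:
        raise RuntimeError(f"Set {key_var} and {secret_var} before calling the API")
    # Same encoding as the client: base64(app_key:app_secret).
    token = base64.b64encode(f"{app_key}:{app_secret}".encode()).decode()
    return {"Authorization": f"Basic {token}"}
```

The same two values can be passed straight to `TomovieeImg2ImgClient(app_key, app_secret)` instead of building the header by hand.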
## Quick Start
### Install dependencies
```bash
pip install -r requirements.txt
```
### Authentication helper
```bash
python scripts/generate_auth_token.py YOUR_APP_KEY YOUR_APP_SECRET
```
### Python Client
```python
from scripts.tomoviee_img2img_client import TomovieeImg2ImgClient
client = TomovieeImg2ImgClient("app_key", "app_secret")
```
## API Usage
### Basic Example
```python
import json

task_id = client.image_to_image(
    prompt="Keep subject identity and posture, transform scene to modern office, photorealistic lighting",
    reference_image="https://example.com/reference.jpg",
    control_type="2",
    init_image="https://example.com/reference.jpg",
    width=1024,
    height=1024,
    batch_size=1,
    control_intensity=0.5,
)
result = client.poll_until_complete(task_id)
# result["result"] is a JSON string; decode it to reach the image URLs.
image_url = json.loads(result["result"])["images_path"][0]
print(image_url)
```
### Parameters
- `prompt` (required): Text prompt for preservation and transformation instructions.
- `reference_image` (required): Input reference image URL.
- `control_type` (required): Control mode. Supported values: `"0"`, `"1"`, `"2"`, `"3"`.
- `width` (required): Output width in pixels, range `512-2048`.
- `height` (required): Output height in pixels, range `512-2048`.
- `batch_size` (required): Number of generated images, range `1-4`.
- `control_intensity` (required): Control strength, range `0-1`.
- `init_image` (optional): Initial image URL. The backend requires it when `control_type="2"`; in that case the client falls back to `reference_image` if it is omitted.
- `callback` (optional): Callback URL.
- `params` (optional): Transparent callback passthrough parameter.
## Async Workflow
1. Create task and get `task_id`
2. Poll with `poll_until_complete(task_id)`
3. Parse output image URL(s) from `result`
Status codes:
- `1` queued
- `2` processing
- `3` success
- `4` failed
- `5` cancelled
- `6` timeout
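
Step 3 of the workflow can be sketched as follows. The example assumes the task data has the shape used in the Basic Example, where `result` is a JSON string containing an `images_path` list; `extract_image_urls` is a hypothetical helper and not part of the client.

```python
import json
from typing import Any, Dict, List


def extract_image_urls(task_data: Dict[str, Any]) -> List[str]:
    """Return all output image URLs from a completed task's data."""
    raw = task_data.get("result")
    if not raw:
        # Task still pending, or no output was produced.
        return []
    # `result` arrives as a JSON string; tolerate an already-decoded dict too.
    parsed = json.loads(raw) if isinstance(raw, str) else raw
    return list(parsed.get("images_path", []))
```

With `batch_size` greater than 1 the list would be expected to hold one URL per generated image.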
## Resources
- `scripts/tomoviee_img2img_client.py` - main API client
- `scripts/tomoviee_image_to_image_client.py` - compatibility import shim
- `scripts/generate_auth_token.py` - auth token helper
- `references/image_apis.md` - API reference and constraints
- `references/prompt_guide.md` - prompt writing guidance
## External Resources
- Developer portal (global): `https://www.tomoviee.ai/developers.html`
- API docs (global): `https://www.tomoviee.ai/doc/ai-image/image-to-image.html`
- Developer portal (mainland): `https://www.tomoviee.cn/developers.html`
- API docs (mainland): `https://www.tomoviee.cn/doc/ai-image/image-to-image.html`
FILE:requirements.txt
requests>=2.31.0,<3.0.0
FILE:_meta.json
{
"ownerId": "kn7bn8fv79mjtchs17087tdtsx82fc86",
"slug": "tomoviee-reference-to-image",
"version": "1.0.1",
"publishedAt": 1772933965996
}
FILE:scripts/generate_auth_token.py
#!/usr/bin/env python3
"""
Generate authentication token for Tomoviee API.

Usage:
    python generate_auth_token.py <app_key> <app_secret>

Output:
    Base64 encoded access_token in the format required by Authorization header
"""
import base64
import sys


def generate_access_token(app_key: str, app_secret: str) -> str:
    """
    Generate access token for Tomoviee API authentication.

    Args:
        app_key: Application key from Tomoviee console
        app_secret: Application secret from Tomoviee console

    Returns:
        Base64 encoded string in format: base64(app_key:app_secret)
    """
    credentials = f"{app_key}:{app_secret}"
    access_token = base64.b64encode(credentials.encode()).decode()
    return access_token


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python generate_auth_token.py <app_key> <app_secret>")
        sys.exit(1)
    app_key = sys.argv[1]
    app_secret = sys.argv[2]
    token = generate_access_token(app_key, app_secret)
    print(f"Access Token: {token}")
    print(f"\nUse in Authorization header as: Basic {token}")
FILE:scripts/tomoviee_image_to_image_client.py
#!/usr/bin/env python3
"""Compatibility shim. Prefer importing TomovieeImg2ImgClient from tomoviee_img2img_client."""
try:
    from scripts.tomoviee_img2img_client import TomovieeClient, TomovieeImg2ImgClient
except ImportError:  # fall back when run from inside scripts/
    from tomoviee_img2img_client import TomovieeClient, TomovieeImg2ImgClient
__all__ = ["TomovieeClient", "TomovieeImg2ImgClient"]
FILE:scripts/tomoviee_img2img_client.py
#!/usr/bin/env python3
"""Tomoviee AI - Image-to-Image API client."""
import base64
import json
import time
from typing import Any, Dict, Optional

import requests


class TomovieeImg2ImgClient:
    """Image-to-Image API client for Tomoviee AI."""

    BASE_URL = "https://openapi.wondershare.cc/v1/open/capacity/application"
    RESULT_ENDPOINT = "https://openapi.wondershare.cc/v1/open/pub/task"
    ENDPOINT = "tm_reference_img2img"
    REQUEST_TIMEOUT = 60

    def __init__(self, app_key: str, app_secret: str):
        self.app_key = app_key
        self.access_token = self._generate_token(app_key, app_secret)

    def _generate_token(self, app_key: str, app_secret: str) -> str:
        credentials = f"{app_key}:{app_secret}"
        return base64.b64encode(credentials.encode()).decode()

    def _get_headers(self) -> Dict[str, str]:
        return {
            "Content-Type": "application/json",
            "X-App-Key": self.app_key,
            "Authorization": f"Basic {self.access_token}",
        }

    def _safe_json(self, response: requests.Response) -> Dict[str, Any]:
        try:
            return response.json()
        except ValueError as exc:
            raise Exception(f"Invalid JSON response: {response.text}") from exc

    def _make_request(self, payload: Dict[str, Any]) -> str:
        url = f"{self.BASE_URL}/{self.ENDPOINT}"
        response = requests.post(
            url,
            headers=self._get_headers(),
            json=payload,
            timeout=self.REQUEST_TIMEOUT,
        )
        response.raise_for_status()
        result = self._safe_json(response)
        if result.get("code") != 0:
            raise Exception(f"API Error: {result.get('msg', 'Unknown error')}")
        task_id = result.get("data", {}).get("task_id")
        if not task_id:
            raise Exception(f"Missing task_id in response: {result}")
        return task_id

    def image_to_image(
        self,
        prompt: str,
        reference_image: Optional[str] = None,
        control_type: str = "2",
        init_image: Optional[str] = None,
        width: int = 1024,
        height: int = 1024,
        batch_size: int = 1,
        control_intensity: float = 0.5,
        callback: Optional[str] = None,
        params: Optional[str] = None,
        image: Optional[str] = None,
        image_num: Optional[int] = None,
    ) -> str:
        """Create an image-to-image task and return task_id.

        Backward compatibility:
        - `image` is treated as `reference_image`.
        - `image_num` is treated as `batch_size`.
        """
        ref = reference_image or image
        if not ref:
            raise ValueError("reference_image is required")
        if image_num is not None:
            batch_size = image_num
        if width < 512 or width > 2048:
            raise ValueError("width must be in range 512..2048")
        if height < 512 or height > 2048:
            raise ValueError("height must be in range 512..2048")
        if batch_size < 1 or batch_size > 4:
            raise ValueError("batch_size must be in range 1..4")
        if control_intensity < 0 or control_intensity > 1:
            raise ValueError("control_intensity must be in range 0..1")
        ct = str(control_type)
        if ct not in {"0", "1", "2", "3"}:
            raise ValueError("control_type must be one of: 0, 1, 2, 3")
        payload: Dict[str, Any] = {
            "prompt": prompt,
            "width": width,
            "height": height,
            "batch_size": batch_size,
            "control_intensity": control_intensity,
            "control_type": ct,
            "reference_image": ref,
        }
        if ct == "2":
            payload["init_image"] = init_image or ref
        elif init_image:
            payload["init_image"] = init_image
        if callback:
            payload["callback"] = callback
        if params:
            payload["params"] = params
        return self._make_request(payload)

    def get_result(self, task_id: str) -> Dict[str, Any]:
        response = requests.post(
            self.RESULT_ENDPOINT,
            headers=self._get_headers(),
            json={"task_id": task_id},
            timeout=self.REQUEST_TIMEOUT,
        )
        response.raise_for_status()
        result = self._safe_json(response)
        if result.get("code") != 0 and not result.get("data"):
            raise Exception(f"API Error: {result.get('msg', 'Unknown error')}")
        data = result.get("data")
        if not data:
            raise Exception(f"Missing task data in response: {result}")
        return data

    def poll_until_complete(
        self,
        task_id: str,
        poll_interval: int = 10,
        timeout: int = 600,
    ) -> Dict[str, Any]:
        elapsed = 0
        while elapsed < timeout:
            result = self.get_result(task_id)
            status = result.get("status")
            if status == 3:
                return result
            if status in [4, 5, 6]:
                raise Exception(f"Task failed: {result.get('reason', 'Unknown error')}")
            time.sleep(poll_interval)
            elapsed += poll_interval
        raise TimeoutError(f"Task did not complete within {timeout} seconds")


# Backward-compatible alias
TomovieeClient = TomovieeImg2ImgClient

if __name__ == "__main__":
    import sys

    if len(sys.argv) < 5:
        print(
            "Usage: python scripts/tomoviee_img2img_client.py "
            "<app_key> <app_secret> <prompt> <reference_image_url> "
            "[control_type] [width] [height] [batch_size] [control_intensity] [init_image_url]"
        )
        sys.exit(1)
    app_key = sys.argv[1]
    app_secret = sys.argv[2]
    prompt = sys.argv[3]
    reference_image = sys.argv[4]
    control_type = sys.argv[5] if len(sys.argv) > 5 else "2"
    width = int(sys.argv[6]) if len(sys.argv) > 6 else 1024
    height = int(sys.argv[7]) if len(sys.argv) > 7 else 1024
    batch_size = int(sys.argv[8]) if len(sys.argv) > 8 else 1
    control_intensity = float(sys.argv[9]) if len(sys.argv) > 9 else 0.5
    init_image = sys.argv[10] if len(sys.argv) > 10 else None
    client = TomovieeImg2ImgClient(app_key, app_secret)
    try:
        print("Creating image-to-image task...")
        task_id = client.image_to_image(
            prompt=prompt,
            reference_image=reference_image,
            control_type=control_type,
            width=width,
            height=height,
            batch_size=batch_size,
            control_intensity=control_intensity,
            init_image=init_image,
        )
        print(f"Task created: {task_id}")
        print("Polling for result...")
        result = client.poll_until_complete(task_id)
        print("\nTask completed")
        print(f"Progress: {result.get('progress', 'N/A')}%")
        result_data = json.loads(result["result"])
        print(json.dumps(result_data, indent=2, ensure_ascii=False))
    except Exception as exc:
        print(f"Error: {exc}")
        sys.exit(1)
FILE:references/image_apis.md
# Tomoviee Image-to-Image API Reference
## Provenance and Endpoint Mapping
- Vendor portals: `https://www.tomoviee.ai` and `https://www.tomoviee.cn`
- Runtime gateway host used by this skill: `https://openapi.wondershare.cc`
- Primary capability: `tm_reference_img2img`
All runtime API calls in this skill target only:
1. `https://openapi.wondershare.cc/v1/open/capacity/application/tm_reference_img2img`
2. `https://openapi.wondershare.cc/v1/open/pub/task`
## Text Parameters
- `prompt` (required): Prompt text stating what to preserve and what to modify
## Control Parameters
- `control_type` (required): `"0"`, `"1"`, `"2"`, `"3"`
- `control_intensity` (required): `0-1`
## Image Parameters
- `reference_image` (required): reference image URL
- `init_image` (optional): initial image URL; the backend requires it when `control_type="2"`
## Output Parameters
- `width` (required): `512-2048`
- `height` (required): `512-2048`
- `batch_size` (required): `1-4`
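
The ranges listed in the parameter sections above can be checked locally before a task is submitted. The following is a minimal sketch, not part of the bundled client: the function name `validate_img2img_params` is hypothetical, and it assumes the documented bounds are inclusive.

```python
from typing import Optional

VALID_CONTROL_TYPES = {"0", "1", "2", "3"}


def validate_img2img_params(
    control_type: str,
    control_intensity: float,
    width: int,
    height: int,
    batch_size: int,
    init_image: Optional[str] = None,
) -> None:
    """Raise ValueError if an argument falls outside the documented range.

    Hypothetical helper; bounds are assumed to be inclusive.
    """
    if control_type not in VALID_CONTROL_TYPES:
        raise ValueError(f"control_type must be one of {sorted(VALID_CONTROL_TYPES)}")
    if not 0 <= control_intensity <= 1:
        raise ValueError("control_intensity must be between 0 and 1")
    for name, value in (("width", width), ("height", height)):
        if not 512 <= value <= 2048:
            raise ValueError(f"{name} must be between 512 and 2048")
    if not 1 <= batch_size <= 4:
        raise ValueError("batch_size must be between 1 and 4")
    # The reference lists init_image as backend-required for control_type "2"
    if control_type == "2" and not init_image:
        raise ValueError('init_image is required when control_type="2"')
```

The `init_image` check matters mainly when you build the request payload yourself rather than through the bundled client.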
## Optional Callback Parameters
- `callback` (optional): callback URL
- `params` (optional): transparent callback passthrough
## Result Endpoint
`https://openapi.wondershare.cc/v1/open/pub/task`
## Status Codes
- `1` queued
- `2` processing
- `3` success
- `4` failed
- `5` cancelled
- `6` timeout
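
Naming the status codes makes polling logic easier to read. This is a small sketch under the assumption that the six codes above are the complete set; `TaskStatus` and `classify` are illustrative names, not part of the bundled client.

```python
from enum import IntEnum


class TaskStatus(IntEnum):
    """Status codes as listed in this reference."""
    QUEUED = 1
    PROCESSING = 2
    SUCCESS = 3
    FAILED = 4
    CANCELLED = 5
    TIMEOUT = 6


TERMINAL_FAILURES = {TaskStatus.FAILED, TaskStatus.CANCELLED, TaskStatus.TIMEOUT}


def classify(status: int) -> str:
    """Return 'pending', 'done', or 'error' for a raw status code."""
    state = TaskStatus(status)  # raises ValueError for an unknown code
    if state is TaskStatus.SUCCESS:
        return "done"
    if state in TERMINAL_FAILURES:
        return "error"
    return "pending"
```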
## Example
```python
from scripts.tomoviee_img2img_client import TomovieeImg2ImgClient
client = TomovieeImg2ImgClient("app_key", "app_secret")
task_id = client.image_to_image(
prompt="Keep identity and composition, change background to modern office",
reference_image="https://example.com/reference.jpg",
control_type="2",
init_image="https://example.com/reference.jpg",
width=1024,
height=1024,
batch_size=1,
control_intensity=0.5,
)
result = client.poll_until_complete(task_id)
```
FILE:references/prompt_guide.md
# Tomoviee Prompt Engineering Guide
## Overview
This guide provides structured prompt formulas and best practices for all Tomoviee AI APIs to achieve optimal generation results.
---
## Video APIs
### Text-to-Video Prompt Formula
```
主体(描述) + 运动 + 场景(描述) + (镜头语言 + 光影 + 氛围)
Subject (description) + Motion + Scene (description) + (Camera + Lighting + Atmosphere)
```
**Component Breakdown**:
1. **Subject (主体)**: Main focus of the video
   - Who/what is in the scene
   - Key characteristics, appearance
   - Position and pose
2. **Motion (运动)**: Action and movement
   - What the subject is doing
   - Speed and direction of movement
   - Dynamic elements
3. **Scene (场景)**: Environment and context
   - Location and setting
   - Background elements
   - Time of day/season
4. **Camera (镜头语言)**: Optional camera work
   - Camera angle (wide, close-up, aerial, etc.)
   - Camera movement (pan, zoom, tracking, etc.)
   - Or specify `camera_move_index` parameter
5. **Lighting (光影)**: Optional lighting style
   - Natural/artificial light
   - Time of day (golden hour, blue hour, etc.)
   - Light direction and quality
6. **Atmosphere (氛围)**: Optional mood and tone
   - Overall feeling (dramatic, peaceful, energetic)
   - Color palette/grading
   - Weather/environmental effects
**Examples**:
**Minimal** (Subject + Motion + Scene):
```
"A red sports car driving fast on a coastal highway at sunset"
```
**Standard** (+ Camera + Lighting):
```
"A red sports car speeding along a winding coastal highway at golden hour,
camera following from the side, warm orange sunlight reflecting off the car"
```
**Detailed** (+ Atmosphere):
```
"A sleek red Ferrari speeding along a dramatic coastal highway carved into cliffs,
camera tracking smoothly from the side at car level,
golden hour sunset casting long shadows and warm orange glow,
cinematic and epic atmosphere with ocean waves crashing below"
```
**More Examples by Use Case**:
**Product Showcase**:
```
"White wireless headphones slowly rotating on a minimalist white surface,
studio lighting with soft shadows, clean and modern atmosphere"
```
**Nature/Travel**:
```
"Majestic waterfall cascading down mossy rocks in a lush rainforest,
slow zoom in from wide to medium shot,
dappled sunlight filtering through the canopy,
serene and peaceful atmosphere"
```
**Action/Sports**:
```
"Professional skateboarder performing a kickflip on urban street ramp,
slow motion capture from low angle,
bright daylight with high contrast shadows,
energetic and dynamic atmosphere"
```
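
If prompts are assembled programmatically, the text-to-video formula can be expressed as a small helper. This is only an illustrative sketch: `build_video_prompt` is a hypothetical name, and joining the optional parts with commas is one reasonable convention rather than an API requirement.

```python
from typing import Optional


def build_video_prompt(
    subject: str,
    motion: str,
    scene: str,
    camera: Optional[str] = None,
    lighting: Optional[str] = None,
    atmosphere: Optional[str] = None,
) -> str:
    """Compose Subject + Motion + Scene, then append any optional parts."""
    core = f"{subject} {motion} {scene}"
    extras = [part for part in (camera, lighting, atmosphere) if part]
    return ", ".join([core, *extras])


prompt = build_video_prompt(
    subject="A red sports car",
    motion="driving fast",
    scene="on a coastal highway at sunset",
    camera="camera following from the side",
)
```

Keeping the three required components as positional arguments makes it hard to submit a prompt that omits the subject, motion, or scene.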
### Image-to-Video Prompt Formula
```
主体 + 运动 + 镜头语言
Subject + Motion + Camera
```
**Why simpler?** The image already provides:
- Scene and environment
- Lighting and color palette
- Composition and framing
- Atmosphere and mood
**Your prompt should focus on**:
- Motion to add to the static image
- Camera movement to apply
- Any new dynamic elements
**Examples**:
```
"Camera slowly zooming in, subject's hair gently blowing in the wind"
```
```
"Slow pan from left to right, leaves rustling, golden hour lighting"
```
```
"Camera orbiting around the subject, dramatic lighting, cinematic feel"
```
```
"Gentle push in toward the subject's face, bokeh background, emotional atmosphere"
```
### Video Continuation Prompt Formula
```
延续的动作 + 场景变化 + 镜头延续
Continued action + Scene evolution + Camera continuation
```
**Focus on**:
- How the action continues from the last frame
- Natural progression of movement
- Scene changes or transitions
- Camera movement consistency
**Examples**:
```
"The bird continues flying higher, soaring into the clouds,
camera following the ascent"
```
```
"The car continues down the road, passing through a tunnel,
maintaining tracking shot"
```
```
"The person continues walking, entering a brightly lit room,
camera follows smoothly"
```
---
## Image APIs
### Image-to-Image Prompt Formula
```
参考图描述 + 保留要素 + 修改/新增指令
Reference description + Elements to preserve + Modifications/additions
```
**Component Breakdown**:
1. **Reference Description**: What's in the original image
2. **Preserve**: Explicitly state what to keep unchanged
3. **Modify/Add**: What to change or add
**Examples**:
```
"A woman in business attire at a modern office,
preserve facial features and body pose,
change background to outdoor garden with natural lighting,
add warm sunset atmosphere"
```
```
"Portrait of a man wearing a blue shirt,
keep facial features and expression,
change clothing to formal black suit with tie,
maintain studio lighting"
```
```
"Kitchen interior with white cabinets,
preserve layout and appliance positions,
change color scheme to navy blue cabinets with gold hardware,
add marble countertops"
```
### Image Redrawing Prompt Formula
```
替换区域的新内容描述
Description of what replaces the masked area
```
**Focus on**:
- What should appear in the masked region
- Style/texture matching surrounding areas
- Lighting consistency
- Natural integration
**Examples**:
```
"Clear blue sky with white fluffy clouds"
(for replacing background)
```
```
"Natural grass texture with small wildflowers"
(for replacing foreground)
```
```
"Modern glass windows with reflections of cityscape"
(for replacing building facade)
```
```
"Empty wooden table surface with subtle wood grain"
(for removing objects from table)
```
### Image Recognition Prompt Formula
```
要识别的对象/区域描述
Description of objects/regions to identify
```
**Focus on**:
- Clear object identification
- Specific vs. general (depending on need)
- Spatial context if helpful
**Examples**:
```
"person" / "all people"
```
```
"the red car in the foreground"
```
```
"sky and clouds"
```
```
"text and logos"
```
```
"background behind the main subject"
```
---
## Audio APIs
### Text-to-Music Prompt Formula
```
主体 + 场景(氛围/风格)
Subject/Theme + Scene (Atmosphere/Style)
```
**Component Breakdown**:
1. **Subject/Theme**: Purpose or topic of the music
2. **Atmosphere**: Mood and emotional quality
3. **Style/Genre**: Musical style and genre
4. **Instruments** (optional): Key instruments to feature
**Examples**:
**Simple**:
```
"Upbeat electronic music, energetic and modern"
```
**Standard**:
```
"Corporate presentation background music, professional and clean,
soft piano and ambient synth"
```
**Detailed**:
```
"Epic cinematic orchestral score, dramatic and heroic,
soaring strings and powerful brass,
Hans Zimmer style, for action scene"
```
**By Genre**:
**Electronic/Pop**:
```
"Energetic EDM track, festival atmosphere, heavy bass and synth drops"
```
**Jazz**:
```
"Smooth jazz for evening ambience, sophisticated and relaxed,
piano trio with double bass"
```
**Orchestral**:
```
"Emotional film score, contemplative and moving,
piano and string ensemble"
```
**Ambient**:
```
"Peaceful meditation music, calm and spacious,
soft pads and gentle chimes"
```
### Text-to-Sound-Effect Prompt Formula
```
声音来源 + 动作/环境 + 特征
Sound source + Action/Context + Characteristics
```
**Component Breakdown**:
1. **Sound Source**: What's making the sound
2. **Action/Context**: What's happening / where it is
3. **Characteristics**: Sound qualities (loud, soft, sharp, etc.)
**Examples**:
**Single Events**:
```
"Glass bottle shattering on concrete, sharp and crisp"
```
```
"Car door closing, solid thunk, reverb in garage"
```
```
"Notification bell sound, soft and pleasant"
```
**Continuous Sounds**:
```
"Heavy rain on metal roof, steady and rhythmic"
```
```
"City traffic ambience, cars passing, distant sirens"
```
```
"Forest birds chirping, peaceful morning atmosphere"
```
**By Category**:
**Nature**:
```
"Ocean waves crashing on beach, powerful and continuous"
```
**Mechanical**:
```
"Old typewriter keys typing, mechanical clicks and dings"
```
**Human**:
```
"Crowd cheering and applauding, enthusiastic and loud"
```
**UI/Digital**:
```
"Futuristic UI beep, high-tech and clean"
```
### Text-to-Speech Tips
**Structure**:
- Write naturally as you would speak
- Use punctuation for pacing and pauses
- Spell out numbers and abbreviations
**Examples**:
**With Pacing**:
```
"Welcome to our platform... Let me show you around."
(Ellipsis adds natural pause)
```
**With Emphasis**:
```
"This is extremely important. Pay close attention."
(Period creates strong pause for emphasis)
```
**Numbers**:
```
"Call one, eight hundred, five five five, zero one two three"
(Instead of "Call 1-800-555-0123")
```
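
Spelling out digits can be automated before text is sent for synthesis. The sketch below is an assumption-laden illustration: `spell_digits` is a hypothetical helper that reads every digit individually, so `800` becomes "eight zero zero" rather than "eight hundred", and hyphens between digit groups become commas to add a short pause.

```python
import re

DIGITS = "zero one two three four five six seven eight nine".split()


def spell_digits(text: str) -> str:
    """Replace hyphen-separated digit groups with spoken digits."""
    def speak(match: "re.Match[str]") -> str:
        groups = match.group().split("-")
        return ", ".join(" ".join(DIGITS[int(d)] for d in g) for g in groups)

    # Match runs of digits optionally joined by hyphens, e.g. 555-0123
    return re.sub(r"\d+(?:-\d+)*", speak, text)
```

Hyphenated words without digits, such as "well-known", are left unchanged because the pattern only matches digit groups.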
### Video Soundtrack Prompt Formula
```
风格/类型 + 情绪 + (节奏/能量)
Style/Genre + Mood + (Optional: pacing/energy)
```
**Why simpler?** The API analyzes video content automatically.
**Examples**:
**Minimal** (let API analyze):
```
(no prompt - fully automatic based on video)
```
**Guided Style**:
```
"Upbeat travel vlog music, adventurous and inspiring"
```
```
"Corporate tech presentation music, modern and professional"
```
```
"Emotional documentary score, contemplative and moving"
```
---
## General Best Practices
### Do's ✅
1. **Be Specific**: Clear descriptions yield better results
   - Good: "Golden retriever puppy playing with red ball"
   - Bad: "Dog playing"
2. **Use Descriptive Adjectives**: Add relevant details
   - Good: "Sleek modern smartphone with edge-to-edge display"
   - Bad: "Phone"
3. **Specify Mood/Atmosphere**: Set the tone
   - Good: "Dramatic sunset with vibrant orange and purple clouds"
   - Bad: "Sunset"
4. **Include Context**: Where, when, why
   - Good: "Professional product photography on white background"
   - Bad: "Product"
5. **Mention Key Visual Elements**:
   - Lighting conditions
   - Color palette
   - Composition style
   - Movement characteristics
### Don'ts ❌
1. **Don't Be Vague**:
   - Bad: "Nice video"
   - Bad: "Good music"
   - Bad: "Cool image"
2. **Don't Contradict**:
   - Bad: "Bright and dark scene"
   - Bad: "Fast and slow motion"
   - Bad: "Happy sad music"
3. **Don't Over-specify Technical Details**:
   - Bad: "1920x1080 resolution 24fps H.264 codec"
   - Bad: "120 BPM in C major with I-V-vi-IV progression"
   - (API handles technical parameters automatically)
4. **Don't Use Negations**:
   - Bad: "Not blurry, not dark, not boring"
   - Good: "Sharp, bright, engaging"
5. **Don't Mix Multiple Unrelated Concepts**:
   - Bad: "Cat playing piano in space while cooking pasta"
   - Better: Split into multiple generations or choose one focus
---
## Advanced Techniques
### Chaining Outputs
Use output from one API as input to another:
```python
# get_result_url() is a placeholder for your own helper that polls a task
# and returns its output URL; it is not defined in this guide.

# 1. Generate image
img_task = client.image_to_image(
    prompt="Modern office space, clean and minimal",
    reference_image="https://example.com/reference.jpg",
)
img_url = get_result_url(img_task)

# 2. Animate image
video_task = client.image_to_video(
    prompt="Camera slowly panning right, golden hour lighting",
    image=img_url,
)
video_url = get_result_url(video_task)

# 3. Add soundtrack
audio_task = client.video_soundtrack(
    video=video_url,
    prompt="Professional corporate music",
)
```
### Iterative Refinement
1. **Start Simple**: Test with minimal prompt
2. **Evaluate Output**: What's missing or wrong?
3. **Add Details**: Incrementally add specificity
4. **Test Again**: Compare results
**Example Iteration**:
```
V1: "Car driving"
→ Too generic
V2: "Red sports car driving fast on highway"
→ Better, but lighting unclear
V3: "Red Ferrari speeding on coastal highway, golden hour sunset, dramatic"
→ Good result!
```
### A/B Testing
Generate multiple variations to compare:
```python
# Test different camera movements
variants = [
client.image_to_video(image=img, prompt="Slow zoom in", camera_move_index=5),
client.image_to_video(image=img, prompt="Pan right", camera_move_index=12),
client.image_to_video(image=img, prompt="Orbit around subject", camera_move_index=23)
]
```
### Consistency Across Assets
For cohesive content, maintain consistent prompt elements:
```python
# Consistent style keywords across multiple generations
style_guide = "cinematic, golden hour lighting, dramatic atmosphere"
video1 = client.text_to_video(f"Mountain landscape, {style_guide}")
video2 = client.text_to_video(f"Forest scene, {style_guide}")
video3 = client.text_to_video(f"Beach sunset, {style_guide}")
```
---
## Prompt Templates by Use Case
### Marketing/Commercial
```
"[Product name] showcased on [surface/background],
[camera movement],
studio lighting with soft shadows,
premium and elegant atmosphere"
```
### Social Media
```
"[Subject] [action],
[camera angle/movement],
vibrant colors and high energy,
engaging and eye-catching for [platform]"
```
### Documentary/Educational
```
"[Subject] in [environment],
[natural action],
natural documentary style cinematography,
informative and authentic atmosphere"
```
### Artistic/Creative
```
"[Abstract concept] visualized as [imagery],
[unique camera work],
[lighting style],
[artistic mood/style reference]"
```
---
## Troubleshooting
### Problem: Output doesn't match prompt
**Solutions**:
- Simplify prompt to core elements
- Remove contradictory instructions
- Be more specific about key aspects
- Check for typos or ambiguous terms
### Problem: Output quality is poor
**Solutions**:
- Add style keywords (cinematic, professional, high quality)
- Specify lighting conditions
- Mention camera quality/style
- Add atmosphere/mood descriptors
### Problem: Inconsistent results
**Solutions**:
- Use more specific prompts
- Reference specific styles or examples
- Test with variations to find optimal wording
- Use consistent style keywords across generations
### Problem: Generation fails (status=4)
**Solutions**:
- Check prompt for inappropriate content
- Simplify overly complex prompts
- Verify input files (images/videos) are valid
- Ensure parameters are within valid ranges
- Try with default parameters first
---
## Language Considerations
Tomoviee supports both **Chinese** and **English** prompts.
**Tips**:
- Use language you're most comfortable with
- Be aware of cultural/contextual nuances
- Technical terms may work better in English
- Artistic descriptions may vary by language interpretation
**Example Comparisons**:
Chinese:
```
"一只金毛寻回犬在阳光明媚的草地上奔跑,镜头跟随,电影级画质"
```
English:
```
"A golden retriever running through a sunlit meadow, camera following, cinematic quality"
```
Both can produce excellent results - choose based on your fluency and precision needs.
---
## Summary Checklist
Before submitting your prompt, verify:
- [ ] **Clear Subject**: What's the main focus?
- [ ] **Specific Action**: What's happening?
- [ ] **Scene Context**: Where is this taking place?
- [ ] **Visual Style**: What's the look and feel?
- [ ] **Technical Parameters**: Resolution, duration, aspect ratio set correctly?
- [ ] **No Contradictions**: All elements work together?
- [ ] **Appropriate Specificity**: Not too vague, not over-specified?
A well-crafted prompt is the foundation of excellent AI-generated content. Take time to refine your prompts for best results!
Generate music tailored to video content. Use when users request video_soundtrack operations or related tasks.
---
name: tomoviee-video-scoring
description: Generate music tailored to video content. Use when users request video_soundtrack operations or related tasks.
---
# Tomoviee AI - 视频配乐 (Video Soundtrack)
## Overview
Generate music tailored to video content.
**API**: `tm_video_scoring`
## Quick Start
### Authentication
```bash
python scripts/generate_auth_token.py YOUR_APP_KEY YOUR_APP_SECRET
```
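
The token printed by the script is the Base64 encoding of `app_key:app_secret`, sent as a Basic `Authorization` header alongside `X-App-Key`. The sketch below mirrors what the bundled client does in `_get_headers`; the standalone function name `build_headers` is illustrative only.

```python
import base64
from typing import Dict


def build_headers(app_key: str, app_secret: str) -> Dict[str, str]:
    """Build the request headers used by the bundled client."""
    token = base64.b64encode(f"{app_key}:{app_secret}".encode()).decode()
    return {
        "Content-Type": "application/json",
        "X-App-Key": app_key,
        "Authorization": f"Basic {token}",
    }
```

Because the token is only an encoding of the credentials, not a hash, treat it with the same care as the app secret itself.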
### Python Client
```python
from scripts.tomoviee_video_scoring_client import TomovieeClient

client = TomovieeClient("app_key", "app_secret")
```
## API Usage
### Basic Example
```python
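import json

# _make_request submits the payload to tm_video_scoring and returns the task ID
task_id = client._make_request({
    "video": "https://example.com/my-video.mp4",
    "prompt": "Modern tech product music, clean",
})
result = client.poll_until_complete(task_id)

# The task's "result" field is itself a JSON-encoded string
output = json.loads(result["result"])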
```
### Parameters
- `video` (required): Video URL (MP4, under 200 MB)
- `prompt` (optional): Style guidance
- `duration` (optional): Audio duration in seconds, 5-900 (default: 20)
## Async Workflow
1. **Create task**: Get `task_id` from API call
2. **Poll for completion**: Use `poll_until_complete(task_id)`
3. **Extract result**: Parse returned JSON for output URLs
**Status codes**:
- 1 = Queued
- 2 = Processing
- 3 = Success (ready)
- 4 = Failed
- 5 = Cancelled
- 6 = Timeout
## Resources
### scripts/
- `tomoviee_video_scoring_client.py` - API client
- `generate_auth_token.py` - Auth token generator
### references/
See bundled reference documents for detailed API documentation and examples.
## External Resources
- **Developer Portal**: https://www.tomoviee.ai/developers.html
- **API Documentation**: https://www.tomoviee.ai/doc/
- **Get API Credentials**: Register at the developer portal
FILE:_meta.json
{
"ownerId": "kn7bn8fv79mjtchs17087tdtsx82fc86",
"slug": "tomoviee-video-background-music",
"version": "1.0.1",
"publishedAt": 1772934034364
}
FILE:scripts/generate_auth_token.py
#!/usr/bin/env python3
"""
Generate authentication token for Tomoviee API.

Usage:
    python generate_auth_token.py <app_key> <app_secret>

Output:
    Base64 encoded access_token in the format required by Authorization header
"""
import base64
import sys


def generate_access_token(app_key: str, app_secret: str) -> str:
    """
    Generate access token for Tomoviee API authentication.

    Args:
        app_key: Application key from Tomoviee console
        app_secret: Application secret from Tomoviee console

    Returns:
        Base64 encoded string in format: base64(app_key:app_secret)
    """
    credentials = f"{app_key}:{app_secret}"
    access_token = base64.b64encode(credentials.encode()).decode()
    return access_token


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python generate_auth_token.py <app_key> <app_secret>")
        sys.exit(1)

    app_key = sys.argv[1]
    app_secret = sys.argv[2]
    token = generate_access_token(app_key, app_secret)
    print(f"Access Token: {token}")
    print(f"\nUse in Authorization header as: Basic {token}")
FILE:scripts/tomoviee_video_scoring_client.py
#!/usr/bin/env python3
"""Tomoviee AI - video_scoring API client"""
import base64, json, time
from typing import Dict, Optional, Any

import requests


class TomovieeClient:
    BASE_URL = "https://openapi.wondershare.cc/v1/open/capacity/application"
    RESULT_ENDPOINT = "https://openapi.wondershare.cc/v1/open/pub/task"
    ENDPOINT = "tm_video_scoring"

    def __init__(self, app_key: str, app_secret: str):
        self.app_key = app_key
        self.access_token = self._generate_token(app_key, app_secret)

    def _generate_token(self, app_key: str, app_secret: str) -> str:
        credentials = f"{app_key}:{app_secret}"
        return base64.b64encode(credentials.encode()).decode()

    def _get_headers(self) -> Dict[str, str]:
        return {
            "Content-Type": "application/json",
            "X-App-Key": self.app_key,
            "Authorization": f"Basic {self.access_token}"
        }

    def _make_request(self, payload: Dict[str, Any]) -> str:
        url = f"{self.BASE_URL}/{self.ENDPOINT}"
        response = requests.post(url, headers=self._get_headers(), json=payload)
        result = response.json()
        if result.get("code") != 0:
            raise Exception(f"API Error: {result.get('msg')}")
        return result["data"]["task_id"]

    def get_result(self, task_id: str) -> Dict[str, Any]:
        response = requests.post(
            self.RESULT_ENDPOINT,
            headers=self._get_headers(),
            json={"task_id": task_id},
        )
        result = response.json()
        if result.get("code") != 0 and not result.get("data"):
            raise Exception(f"API Error: {result.get('msg')}")
        return result["data"]

    def poll_until_complete(
        self, task_id: str, poll_interval: int = 10, timeout: int = 600
    ) -> Dict[str, Any]:
        elapsed = 0
        while elapsed < timeout:
            result = self.get_result(task_id)
            if result["status"] == 3:
                return result
            elif result["status"] in [4, 5, 6]:
                raise Exception(f"Task failed: {result.get('reason')}")
            time.sleep(poll_interval)
            elapsed += poll_interval
        raise TimeoutError(f"Task did not complete within {timeout} seconds")
FILE:references/audio_apis.md
# Tomoviee Audio Generation APIs
## Overview
Tomoviee provides four audio generation APIs covering music creation, sound effects, text-to-speech, and video soundtracks.
## API Endpoints
### 1. Text-to-Music (tm_text2music)
**Generate background music from text description**
**Endpoint**: `https://openapi.wondershare.cc/v1/open/capacity/application/tm_text2music`
**Parameters**:
- `prompt` (required): Music description (subject + scene/atmosphere/style)
- `duration`: Audio duration in seconds - 5 to 900 (default: 20)
- `callback`: Optional callback URL for async notification
- `params`: Optional transparent parameters passed back in callback
**Use Cases**:
- Generate background music for videos
- Create royalty-free music for content
- Produce atmospheric soundtracks
- Generate music beds for podcasts/presentations
**Prompt Structure**:
```
[Subject/Theme] + [Scene/Atmosphere] + [Style/Genre]
```
**Examples**:
```python
# Upbeat commercial music
task_id = client.text_to_music(
prompt="Upbeat corporate technology music, modern and energetic, electronic pop style with piano",
duration=60
)
# Cinematic background
task_id = client.text_to_music(
prompt="Epic cinematic orchestral music, dramatic and inspiring, Hollywood film score style",
duration=120
)
# Calm ambient
task_id = client.text_to_music(
prompt="Peaceful meditation music, calm and relaxing, ambient with soft synth pads",
duration=300
)
# Specific genre
task_id = client.text_to_music(
prompt="Jazz music for cafe scene, smooth and sophisticated, with piano and double bass",
duration=90
)
```
**Duration Guidelines**:
- Short (5-30s): Intro/outro, transitions, jingles
- Medium (30-120s): Social media videos, ads, short presentations
- Long (120-900s): Full videos, podcasts, extended scenes
**Prompt Tips**:
- Specify genre (electronic, orchestral, jazz, rock, ambient, etc.)
- Include mood/atmosphere (upbeat, dramatic, calm, mysterious, etc.)
- Mention key instruments if important
- Add tempo hints (fast, slow, moderate)
- Reference use case for context
---
### 2. Text-to-Sound-Effect (tm_text2sfx)
**Generate sound effects from text description**
**Endpoint**: `https://openapi.wondershare.cc/v1/open/capacity/application/tm_text2sfx`
**Parameters**:
- `prompt` (required): Sound effect description
- `duration`: Audio duration in seconds - 5 to 180 (default: 10)
- `callback`: Optional callback URL for async notification
- `params`: Optional transparent parameters passed back in callback
**Use Cases**:
- Create custom sound effects for videos
- Generate foley sounds
- Produce game audio assets
- Create notification/UI sounds
- Add environmental ambience
**Examples**:
```python
# Nature sounds
task_id = client.text_to_sound_effect(
prompt="Heavy rain falling on roof with distant thunder",
duration=30
)
# Action sounds
task_id = client.text_to_sound_effect(
prompt="Car engine starting and revving",
duration=8
)
# UI/notification sounds (5 is the documented minimum duration)
task_id = client.text_to_sound_effect(
    prompt="Soft notification bell chime",
    duration=5
)
# Environmental ambience
task_id = client.text_to_sound_effect(
prompt="Busy city street with traffic and people talking",
duration=60
)
# Mechanical sounds
task_id = client.text_to_sound_effect(
prompt="Keyboard typing sounds, mechanical switches",
duration=15
)
```
**Duration Guidelines**:
- Very Short (5s, the API minimum): UI sounds, notifications, single events
- Short (5-15s): Action sounds, transitions, specific events
- Medium (15-60s): Ambience loops, environmental sounds
- Long (60-180s): Extended ambience, background environments
**Prompt Tips**:
- Be specific about the sound source
- Include context (heavy/light, fast/slow, near/far)
- Mention materials if relevant (wood, metal, glass)
- Describe sound character (crisp, muffled, echoing, etc.)
- For ambience, describe the environment
---
### 3. Text-to-Speech (tm_text2speech)
**Convert text to natural speech audio**
**Endpoint**: `https://openapi.wondershare.cc/v1/open/capacity/application/tm_text2speech`
**Parameters**:
- `text` (required): Text to convert to speech (max 5000 characters)
- `voice_id`: Voice ID (default: `zh-CN-YunxiNeural`)
- `callback`: Optional callback URL for async notification
- `params`: Optional transparent parameters passed back in callback
**Use Cases**:
- Generate voiceovers for videos
- Create audio versions of text content
- Produce narration for presentations
- Generate dialogue for characters
- Accessibility features (text-to-audio conversion)
**Examples**:
```python
# Chinese voice (default)
task_id = client.text_to_speech(
text="欢迎使用天幕AI视频生成平台。我们提供专业的视频、图像和音频生成服务。",
voice_id="zh-CN-YunxiNeural"
)
# English voice
task_id = client.text_to_speech(
text="Welcome to Tomoviee AI platform. We provide professional video, image, and audio generation services.",
voice_id="en-US-JennyNeural"
)
# Long narration (multiple sentences)
long_text = """
This is a comprehensive guide to using AI-generated content.
First, you need to understand the basics.
Then, you can explore advanced features.
Finally, apply what you learned to your projects.
"""
task_id = client.text_to_speech(
text=long_text,
voice_id="en-US-GuyNeural"
)
```
**Voice ID Options**:
Common voice IDs include:
- **Chinese**: `zh-CN-YunxiNeural`, `zh-CN-XiaoxiaoNeural`, `zh-CN-YunjianNeural`
- **English (US)**: `en-US-JennyNeural`, `en-US-GuyNeural`, `en-US-AriaNeural`
- **English (UK)**: `en-GB-SoniaNeural`, `en-GB-RyanNeural`
- **Other languages**: Check Tomoviee documentation for full list
**Text Guidelines**:
- Maximum 5000 characters per request
- For longer content, split into multiple requests
- Use proper punctuation for natural pauses
- Include phonetic spelling for unusual names/terms
- Avoid excessive special characters
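
Splitting long content into several requests can be done at sentence boundaries so that no request cuts a sentence in half. The following is a minimal sketch, not part of the bundled client: `split_text` is a hypothetical helper, and it assumes the 5000 limit counts characters rather than bytes.

```python
import re
from typing import List

MAX_CHARS = 5000  # documented per-request limit, assumed to count characters


def split_text(text: str, limit: int = MAX_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` characters."""
    # Break after Latin punctuation followed by whitespace, or directly
    # after full-width CJK punctuation (which is usually not followed by a space).
    pattern = r"(?<=[.!?])\s+|(?<=[。!?])"
    sentences = [s for s in re.split(pattern, text.strip()) if s]
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # Hard-split any single sentence that exceeds the limit on its own
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}" if current else sentence
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be submitted as its own text-to-speech task. Sentences are rejoined with a single space, which is harmless for English but adds a space between Chinese sentences.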
**Prompt Tips**:
- Write naturally as you would speak
- Use commas and periods for pacing
- Consider the target language and voice gender
- Test different voices for best match to content
- For emphasis, try rephrasing rather than CAPS
---
### 4. Video Soundtrack (tm_video_scoring)
**Generate soundtrack tailored to video content**
**Endpoint**: `https://openapi.wondershare.cc/v1/open/capacity/application/tm_video_scoring`
**Parameters**:
- `video` (required): Video URL (MP4 format, under 200 MB)
- `prompt`: Optional music style description
- `duration`: Audio duration in seconds - 5 to 900 (default: 20)
- `callback`: Optional callback URL for async notification
- `params`: Optional transparent parameters passed back in callback
**Use Cases**:
- Auto-generate music that matches video mood/pacing
- Create synchronized soundtracks
- Add background music to silent videos
- Generate adaptive scores based on video content
- Quick soundtrack prototyping
**Examples**:
```python
# Auto-generate based on video analysis
task_id = client.video_soundtrack(
video="https://example.com/my-video.mp4",
duration=60
)
# Guided style
task_id = client.video_soundtrack(
video="https://example.com/travel-vlog.mp4",
prompt="Upbeat travel vlog music, adventurous and inspiring",
duration=120
)
# Specific mood
task_id = client.video_soundtrack(
video="https://example.com/product-demo.mp4",
prompt="Modern corporate tech music, professional and clean",
duration=45
)
# Dramatic scene
task_id = client.video_soundtrack(
video="https://example.com/action-scene.mp4",
prompt="Intense action music with driving rhythm",
duration=30
)
```
**How It Works**:
1. API analyzes video content (scenes, pacing, mood)
2. Generates music that matches video characteristics
3. If `prompt` provided, combines video analysis with style guidance
4. Duration should match video length or intended music length
**Prompt Guidelines**:
- Optional - omit for fully automatic generation
- If provided, give high-level style/mood guidance
- Let the API handle sync and pacing based on video
- Don't over-specify - video analysis already provides context
**Video Requirements**:
- Format: MP4
- Size: under 200 MB
- Publicly accessible URL
- Reasonable quality (not corrupted)
---
## Common Parameters
### Duration Ranges
| API | Min | Max | Default | Typical Use |
|-----|-----|-----|---------|-------------|
| Text-to-Music | 5s | 900s (15min) | 20s | Background music, full tracks |
| Text-to-SFX | 5s | 180s (3min) | 10s | Sound effects, ambience |
| Text-to-Speech | N/A | Based on text | N/A | Matches text length |
| Video Soundtrack | 5s | 900s (15min) | 20s | Match video duration |
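
The table above translates directly into a lookup that applies the default and rejects out-of-range values before a request is sent. This is a sketch under the assumption that the bounds are inclusive; `DURATION_LIMITS` and `resolve_duration` are illustrative names, keyed by the capability identifiers used in this reference.

```python
from typing import Dict, Optional, Tuple

# (min_seconds, max_seconds, default_seconds), taken from the table above
DURATION_LIMITS: Dict[str, Tuple[int, int, int]] = {
    "tm_text2music": (5, 900, 20),
    "tm_text2sfx": (5, 180, 10),
    "tm_video_scoring": (5, 900, 20),
}


def resolve_duration(api: str, requested: Optional[int] = None) -> int:
    """Return the duration to send, falling back to the documented default."""
    low, high, default = DURATION_LIMITS[api]
    if requested is None:
        return default
    if not low <= requested <= high:
        raise ValueError(f"{api} duration must be between {low} and {high} seconds")
    return requested
```

Text-to-Speech is left out on purpose, since its output length follows the input text rather than a `duration` parameter.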
### Callback URLs
All audio APIs support optional callback URLs for async notification when generation completes.
**Callback Payload**:
```json
{
  "task_id": "...",
  "status": 3,
  "result": "{\"audio_path\": [\"https://...\"]}",
  "params": "your_transparent_params"
}
```
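
Note that `result` in the callback payload is a JSON document encoded as a string, so it has to be decoded a second time. A handler might extract the audio URLs roughly as follows; `audio_urls_from_callback` is a hypothetical name, and the sketch assumes the callback body has exactly the shape shown above.

```python
import json
from typing import List


def audio_urls_from_callback(body: str) -> List[str]:
    """Return the audio URLs from a callback body, or [] if not successful."""
    payload = json.loads(body)
    if payload.get("status") != 3:  # 3 = success
        return []
    result = json.loads(payload["result"])  # nested JSON string
    return list(result.get("audio_path", []))
```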
---
## Async Workflow
All audio APIs are asynchronous:
1. **Create Task**: Call API endpoint → receive `task_id`
2. **Poll Status**: Call unified result endpoint with `task_id`
3. **Check Status**:
   - `1` = Queued
   - `2` = Processing
   - `3` = Success (audio ready)
   - `4` = Failed
   - `5` = Cancelled
   - `6` = Timeout
4. **Get Result**: When status=3, extract audio URL from result JSON
**Unified Result Endpoint**: `https://openapi.wondershare.cc/v1/open/pub/task`
**Example Workflow**:
```python
# Create task
task_id = client.text_to_music(
prompt="Upbeat electronic music",
duration=60
)
# Poll for completion
result = client.poll_until_complete(task_id, poll_interval=5, timeout=300)
# Extract audio URL
import json
result_data = json.loads(result['result'])
audio_url = result_data['audio_path'][0]
```
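
The bundled `poll_until_complete` counts only the time spent sleeping toward its timeout, so slow result requests can stretch the real wait beyond it. If a strict wall-clock bound matters, a deadline-based loop is one alternative. The sketch below is illustrative rather than part of the client: `poll_task` is a hypothetical function that takes any zero-argument callable returning the task dictionary.

```python
import time
from typing import Any, Callable, Dict


def poll_task(
    fetch: Callable[[], Dict[str, Any]],
    interval: float = 5.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch() until the task succeeds, fails, or the deadline passes."""
    deadline = clock() + timeout
    while True:
        task = fetch()
        status = task.get("status")
        if status == 3:  # success
            return task
        if status in (4, 5, 6):  # failed, cancelled, timeout
            reason = task.get("reason", "unknown")
            raise RuntimeError(f"Task ended with status {status}: {reason}")
        # Stop if the next sleep would run past the deadline
        if clock() + interval > deadline:
            raise TimeoutError(f"Task did not complete within {timeout} seconds")
        sleep(interval)
```

With the bundled client this could be called as `poll_task(lambda: client.get_result(task_id))`. The injectable `sleep` and `clock` arguments exist so the loop can be tested without real waiting.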
**Generation Times**:
- Text-to-Music: 30s - 2min (depends on duration)
- Text-to-SFX: 10s - 1min
- Text-to-Speech: 5s - 30s
- Video Soundtrack: 1min - 3min (includes video analysis)
---
## Prompt Engineering Tips
### Music Prompts
**Structure**: `[Subject/Theme] + [Atmosphere/Mood] + [Style/Genre] + [Instruments]`
**Good Examples**:
- "Epic cinematic orchestral music, dramatic and heroic, with brass and strings"
- "Chill lofi hip hop beats, relaxed and contemplative, with piano and soft drums"
- "Energetic workout music, motivating and powerful, electronic dance with heavy bass"
- "Romantic piano music, gentle and emotional, solo piano with soft reverb"
**Avoid**:
- Too vague: "good music"
- Conflicting moods: "sad and happy upbeat music"
- Overly technical: "120 BPM, C major, with specific chord progressions" (AI determines this)
### Sound Effect Prompts
**Structure**: `[Sound Source] + [Action/Context] + [Characteristics]`
**Good Examples**:
- "Glass bottle breaking on concrete floor, sharp and crisp"
- "Ocean waves crashing on rocky shore, powerful and rhythmic"
- "Footsteps on wooden floor, slow and deliberate"
- "Wind howling through trees, eerie and continuous"
**Avoid**:
- Multiple unrelated sounds in one prompt (create separate requests)
- Vague descriptions: "some noise"
- Impossible sounds: "purple sounding like Tuesday"
### Text-to-Speech Tips
**Good Practices**:
- Write as you would naturally speak
- Use punctuation to control pacing: periods for pauses, commas for breath
- Spell out numbers and abbreviations for clarity
- Test pronunciation of brand names or technical terms
- Break very long text into logical segments
**Example Text Formatting**:
```
Good: "Welcome to our platform. Let me show you around."
Better: "Welcome to our platform... Let me show you around."
(Extra pause with ellipsis)
Good: "Call 1-800-555-0123"
Better: "Call one, eight hundred, five five five, zero one two three"
(For natural pronunciation)
```
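The advice to break very long text into logical segments can be sketched as a sentence-aware splitter. This is an illustration only: the 5000-character default mirrors the TTS limit quoted elsewhere in this document, and `split_for_tts` is a hypothetical helper, not an API feature.

```python
import re
from typing import List

MAX_TTS_CHARS = 5000  # TTS text limit quoted in this document


def split_for_tts(text: str, limit: int = MAX_TTS_CHARS) -> List[str]:
    """Split text into chunks of at most `limit` characters, preferring sentence ends."""
    # Split after '.', '!' or '?' when followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks: List[str] = []
    current = ""
    for sentence in sentences:
        # A single over-long sentence is cut hard at the limit.
        while len(sentence) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:limit])
            sentence = sentence[limit:]
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit:
            current = candidate
        else:
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each chunk can then be submitted as its own text-to-speech task and the resulting audio files joined in order.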
### Video Soundtrack Prompts
**Structure**: `[Style/Genre] + [Mood] + [Optional: pacing/energy]`
**Good Examples**:
- "Upbeat travel music, adventurous and inspiring"
- "Corporate presentation music, professional and modern"
- "Emotional documentary score, contemplative and moving"
- "High-energy sports highlight music, intense and exciting"
**Avoid**:
- Over-specifying video content (API analyzes this)
- Conflicting with video mood (let video analysis guide)
- Too generic: "background music" (add style/mood)
---
## Error Handling
**Common Errors**:
- `400`: Invalid parameters (check duration ranges, text length)
- `401`: Authentication failed
- `413`: Video file too large (over 200 MB)
- `422`: Invalid video format or text encoding
- Task status `4`: Generation failed (check prompt or input)
**Best Practices**:
- Validate duration is within allowed range
- Ensure video URLs are publicly accessible
- Keep text under 5000 characters for TTS
- Test prompts with shorter durations first
- Implement retry logic with exponential backoff
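Retry with exponential backoff, as recommended above, can be sketched in a few lines. The helper below is generic and assumes nothing about the Tomoviee client; the function name, default attempt count, and base delay are illustrative choices.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    call: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run `call`, retrying on any exception with delays of base_delay * 2**n."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of retries: surface the last error
            sleep(base_delay * (2 ** attempt))
    raise RuntimeError("attempts must be at least 1")
```

In practice it is worth retrying only transient failures such as network timeouts. A `400` or `401` response will fail the same way on every attempt, so a real implementation would narrow the `except` clause accordingly.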
---
## Output Format
All audio APIs return:
- **Format**: MP3 (lossy compression)
- **Sample Rate**: 44.1kHz or 48kHz
- **Bitrate**: 128-320 kbps (varies by API)
- **Channels**: Stereo (2 channels)
For professional use, you may want to post-process:
- Normalize volume levels
- Apply EQ/mastering
- Trim silence
- Convert to other formats if needed
---
## Quota and Limits
- Concurrent tasks: Check your plan
- File size limits: under 200 MB for video inputs
- Text length: Max 5000 characters for TTS
- Duration limits: See table above per API
- Generation time: Typically 10s - 3min depending on API and duration
---
## Use Case Combinations
### Complete Video Production
```python
import json

# 1. Generate video
video_task = client.text_to_video(
    prompt="Product showcase on white background"
)
video_result = client.poll_until_complete(video_task)
video_url = json.loads(video_result['result'])['video_path'][0]

# 2. Generate voiceover
speech_task = client.text_to_speech(
    text="Introducing our revolutionary new product..."
)
speech_result = client.poll_until_complete(speech_task)
speech_url = json.loads(speech_result['result'])['audio_path'][0]

# 3. Generate background music
music_task = client.video_soundtrack(
    video=video_url,
    prompt="Upbeat modern tech music"
)
music_result = client.poll_until_complete(music_task)
music_url = json.loads(music_result['result'])['audio_path'][0]

# 4. Mix audio tracks (use external tool like ffmpeg)
```
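The final mixing step is left to an external tool. As a rough sketch, assuming the three outputs have already been downloaded to local files, the function below builds an ffmpeg command that lowers the music level and blends it with the voiceover using ffmpeg's `volume` and `amix` filters. The file names, the 0.3 music level, and the `build_mix_command` helper are illustrative assumptions.

```python
from typing import List


def build_mix_command(
    video: str, voice: str, music: str, output: str, music_volume: float = 0.3
) -> List[str]:
    """Build an ffmpeg argv that lays voiceover and quieter music under a video."""
    filter_graph = (
        f"[2:a]volume={music_volume}[bg];"  # lower the music level
        "[1:a][bg]amix=inputs=2:duration=longest[mix]"  # blend voice and music
    )
    return [
        "ffmpeg", "-y",
        "-i", video, "-i", voice, "-i", music,
        "-filter_complex", filter_graph,
        "-map", "0:v", "-map", "[mix]",  # video from input 0, mixed audio track
        "-c:v", "copy",  # keep the video stream as-is
        output,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed.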
### Podcast Production
```python
# Generate intro music
intro_task = client.text_to_music(
prompt="Podcast intro jingle, energetic and catchy",
duration=10
)
# Generate main content speech
content_task = client.text_to_speech(
text="Welcome to episode 42...",
voice_id="en-US-GuyNeural"
)
# Generate outro music
outro_task = client.text_to_music(
prompt="Podcast outro, fade-out style",
duration=15
)
```
### Game Audio Assets
```python
# Background ambience
ambience = client.text_to_sound_effect(
    prompt="Medieval tavern atmosphere with crowd chatter",
    duration=120
)

# UI sounds (5 seconds is the minimum duration for sound effects)
click_sound = client.text_to_sound_effect(
    prompt="UI button click, soft and satisfying",
    duration=5
)

# Character voice
npc_voice = client.text_to_speech(
    text="Greetings, traveler! What brings you here?",
    voice_id="en-US-GuyNeural"
)
```
FILE:references/prompt_guide.md
# Tomoviee Prompt Engineering Guide
## Overview
This guide provides structured prompt formulas and best practices for all Tomoviee AI APIs to achieve optimal generation results.
---
## Video APIs
### Text-to-Video Prompt Formula
```
主体(描述) + 运动 + 场景(描述) + (镜头语言 + 光影 + 氛围)
Subject (description) + Motion + Scene (description) + (Camera + Lighting + Atmosphere)
```
**Component Breakdown**:
1. **Subject (主体)**: Main focus of the video
- Who/what is in the scene
- Key characteristics, appearance
- Position and pose
2. **Motion (运动)**: Action and movement
- What the subject is doing
- Speed and direction of movement
- Dynamic elements
3. **Scene (场景)**: Environment and context
- Location and setting
- Background elements
- Time of day/season
4. **Camera (镜头语言)**: Optional camera work
- Camera angle (wide, close-up, aerial, etc.)
- Camera movement (pan, zoom, tracking, etc.)
- Or specify `camera_move_index` parameter
5. **Lighting (光影)**: Optional lighting style
- Natural/artificial light
- Time of day (golden hour, blue hour, etc.)
- Light direction and quality
6. **Atmosphere (氛围)**: Optional mood and tone
- Overall feeling (dramatic, peaceful, energetic)
- Color palette/grading
- Weather/environmental effects
**Examples**:
**Minimal** (Subject + Motion + Scene):
```
"A red sports car driving fast on a coastal highway at sunset"
```
**Standard** (+ Camera + Lighting):
```
"A red sports car speeding along a winding coastal highway at golden hour,
camera following from the side, warm orange sunlight reflecting off the car"
```
**Detailed** (+ Atmosphere):
```
"A sleek red Ferrari speeding along a dramatic coastal highway carved into cliffs,
camera tracking smoothly from the side at car level,
golden hour sunset casting long shadows and warm orange glow,
cinematic and epic atmosphere with ocean waves crashing below"
```
**More Examples by Use Case**:
**Product Showcase**:
```
"White wireless headphones slowly rotating on a minimalist white surface,
studio lighting with soft shadows, clean and modern atmosphere"
```
**Nature/Travel**:
```
"Majestic waterfall cascading down mossy rocks in a lush rainforest,
slow zoom in from wide to medium shot,
dappled sunlight filtering through the canopy,
serene and peaceful atmosphere"
```
**Action/Sports**:
```
"Professional skateboarder performing a kickflip on urban street ramp,
slow motion capture from low angle,
bright daylight with high contrast shadows,
energetic and dynamic atmosphere"
```
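The formula above can also be applied programmatically when prompts are assembled from structured data. The helper below is a minimal sketch of that idea; `build_video_prompt` is a hypothetical function, and how the parts are joined (spaces for the required parts, commas for the optional ones) is a stylistic assumption rather than an API requirement.

```python
from typing import Optional


def build_video_prompt(
    subject: str,
    motion: str,
    scene: str,
    camera: Optional[str] = None,
    lighting: Optional[str] = None,
    atmosphere: Optional[str] = None,
) -> str:
    """Join the required parts, then append whichever optional parts are given."""
    core = f"{subject} {motion} {scene}"
    extras = [part for part in (camera, lighting, atmosphere) if part]
    return ", ".join([core, *extras])
```

Calling it with only the three required parts reproduces the "Minimal" pattern, and each optional argument adds one comma-separated clause.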
### Image-to-Video Prompt Formula
```
主体 + 运动 + 镜头语言
Subject + Motion + Camera
```
**Why simpler?** The image already provides:
- Scene and environment
- Lighting and color palette
- Composition and framing
- Atmosphere and mood
**Your prompt should focus on**:
- Motion to add to the static image
- Camera movement to apply
- Any new dynamic elements
**Examples**:
```
"Camera slowly zooming in, subject's hair gently blowing in the wind"
```
```
"Slow pan from left to right, leaves rustling, golden hour lighting"
```
```
"Camera orbiting around the subject, dramatic lighting, cinematic feel"
```
```
"Gentle push in toward the subject's face, bokeh background, emotional atmosphere"
```
### Video Continuation Prompt Formula
```
延续的动作 + 场景变化 + 镜头延续
Continued action + Scene evolution + Camera continuation
```
**Focus on**:
- How the action continues from the last frame
- Natural progression of movement
- Scene changes or transitions
- Camera movement consistency
**Examples**:
```
"The bird continues flying higher, soaring into the clouds,
camera following the ascent"
```
```
"The car continues down the road, passing through a tunnel,
maintaining tracking shot"
```
```
"The person continues walking, entering a brightly lit room,
camera follows smoothly"
```
---
## Image APIs
### Image-to-Image Prompt Formula
```
参考图描述 + 保留要素 + 修改/新增指令
Reference description + Elements to preserve + Modifications/additions
```
**Component Breakdown**:
1. **Reference Description**: What's in the original image
2. **Preserve**: Explicitly state what to keep unchanged
3. **Modify/Add**: What to change or add
**Examples**:
```
"A woman in business attire at a modern office,
preserve facial features and body pose,
change background to outdoor garden with natural lighting,
add warm sunset atmosphere"
```
```
"Portrait of a man wearing a blue shirt,
keep facial features and expression,
change clothing to formal black suit with tie,
maintain studio lighting"
```
```
"Kitchen interior with white cabinets,
preserve layout and appliance positions,
change color scheme to navy blue cabinets with gold hardware,
add marble countertops"
```
### Image Redrawing Prompt Formula
```
替换区域的新内容描述
Description of what replaces the masked area
```
**Focus on**:
- What should appear in the masked region
- Style/texture matching surrounding areas
- Lighting consistency
- Natural integration
**Examples**:
```
"Clear blue sky with white fluffy clouds"
(for replacing background)
```
```
"Natural grass texture with small wildflowers"
(for replacing foreground)
```
```
"Modern glass windows with reflections of cityscape"
(for replacing building facade)
```
```
"Empty wooden table surface with subtle wood grain"
(for removing objects from table)
```
### Image Recognition Prompt Formula
```
要识别的对象/区域描述
Description of objects/regions to identify
```
**Focus on**:
- Clear object identification
- Specific vs. general (depending on need)
- Spatial context if helpful
**Examples**:
```
"person" / "all people"
```
```
"the red car in the foreground"
```
```
"sky and clouds"
```
```
"text and logos"
```
```
"background behind the main subject"
```
---
## Audio APIs
### Text-to-Music Prompt Formula
```
主体 + 场景(氛围/风格)
Subject/Theme + Scene (Atmosphere/Style)
```
**Component Breakdown**:
1. **Subject/Theme**: Purpose or topic of the music
2. **Atmosphere**: Mood and emotional quality
3. **Style/Genre**: Musical style and genre
4. **Instruments** (optional): Key instruments to feature
**Examples**:
**Simple**:
```
"Upbeat electronic music, energetic and modern"
```
**Standard**:
```
"Corporate presentation background music, professional and clean,
soft piano and ambient synth"
```
**Detailed**:
```
"Epic cinematic orchestral score, dramatic and heroic,
soaring strings and powerful brass,
Hans Zimmer style, for action scene"
```
**By Genre**:
**Electronic/Pop**:
```
"Energetic EDM track, festival atmosphere, heavy bass and synth drops"
```
**Jazz**:
```
"Smooth jazz for evening ambience, sophisticated and relaxed,
piano trio with double bass"
```
**Orchestral**:
```
"Emotional film score, contemplative and moving,
piano and string ensemble"
```
**Ambient**:
```
"Peaceful meditation music, calm and spacious,
soft pads and gentle chimes"
```
### Text-to-Sound-Effect Prompt Formula
```
声音来源 + 动作/环境 + 特征
Sound source + Action/Context + Characteristics
```
**Component Breakdown**:
1. **Sound Source**: What's making the sound
2. **Action/Context**: What's happening / where it is
3. **Characteristics**: Sound qualities (loud, soft, sharp, etc.)
**Examples**:
**Single Events**:
```
"Glass bottle shattering on concrete, sharp and crisp"
```
```
"Car door closing, solid thunk, reverb in garage"
```
```
"Notification bell sound, soft and pleasant"
```
**Continuous Sounds**:
```
"Heavy rain on metal roof, steady and rhythmic"
```
```
"City traffic ambience, cars passing, distant sirens"
```
```
"Forest birds chirping, peaceful morning atmosphere"
```
**By Category**:
**Nature**:
```
"Ocean waves crashing on beach, powerful and continuous"
```
**Mechanical**:
```
"Old typewriter keys typing, mechanical clicks and dings"
```
**Human**:
```
"Crowd cheering and applauding, enthusiastic and loud"
```
**UI/Digital**:
```
"Futuristic UI beep, high-tech and clean"
```
### Text-to-Speech Tips
**Structure**:
- Write naturally as you would speak
- Use punctuation for pacing and pauses
- Spell out numbers and abbreviations
**Examples**:
**With Pacing**:
```
"Welcome to our platform... Let me show you around."
(Ellipsis adds natural pause)
```
**With Emphasis**:
```
"This is extremely important. Pay close attention."
(Period creates strong pause for emphasis)
```
**Numbers**:
```
"Call one, eight hundred, five five five, zero one two three"
(Instead of "Call 1-800-555-0123")
```
### Video Soundtrack Prompt Formula
```
风格/类型 + 情绪 + (节奏/能量)
Style/Genre + Mood + (Optional: pacing/energy)
```
**Why simpler?** The API analyzes video content automatically.
**Examples**:
**Minimal** (let API analyze):
```
(no prompt - fully automatic based on video)
```
**Guided Style**:
```
"Upbeat travel vlog music, adventurous and inspiring"
```
```
"Corporate tech presentation music, modern and professional"
```
```
"Emotional documentary score, contemplative and moving"
```
---
## General Best Practices
### Do's ✅
1. **Be Specific**: Clear descriptions yield better results
- Good: "Golden retriever puppy playing with red ball"
- Bad: "Dog playing"
2. **Use Descriptive Adjectives**: Add relevant details
- Good: "Sleek modern smartphone with edge-to-edge display"
- Bad: "Phone"
3. **Specify Mood/Atmosphere**: Set the tone
- Good: "Dramatic sunset with vibrant orange and purple clouds"
- Bad: "Sunset"
4. **Include Context**: Where, when, why
- Good: "Professional product photography on white background"
- Bad: "Product"
5. **Mention Key Visual Elements**:
- Lighting conditions
- Color palette
- Composition style
- Movement characteristics
### Don'ts ❌
1. **Don't Be Vague**:
- Bad: "Nice video"
- Bad: "Good music"
- Bad: "Cool image"
2. **Don't Contradict**:
- Bad: "Bright and dark scene"
- Bad: "Fast and slow motion"
- Bad: "Happy sad music"
3. **Don't Over-specify Technical Details**:
- Bad: "1920x1080 resolution 24fps H.264 codec"
- Bad: "120 BPM in C major with I-V-vi-IV progression"
- (API handles technical parameters automatically)
4. **Don't Use Negations**:
- Bad: "Not blurry, not dark, not boring"
- Good: "Sharp, bright, engaging"
5. **Don't Mix Multiple Unrelated Concepts**:
- Bad: "Cat playing piano in space while cooking pasta"
- Better: Split into multiple generations or choose one focus
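Some of the "Don'ts" above can be checked mechanically before a prompt is submitted. The sketch below is a rough heuristic for English prompts only. The word lists, the four-word minimum, and the `lint_prompt` name are all assumptions chosen for illustration, and the check cannot detect contradictions or unrelated concepts.

```python
import re
from typing import List

# Drawn from the "Don't Be Vague" examples; extend as needed.
VAGUE_WORDS = {"nice", "good", "cool"}
NEGATION = re.compile(r"\b(not|no|never|without)\b", re.IGNORECASE)


def lint_prompt(prompt: str, min_words: int = 4) -> List[str]:
    """Return a list of warnings; an empty list means no issue was detected."""
    words = re.findall(r"[A-Za-z']+", prompt.lower())
    warnings: List[str] = []
    if len(words) < min_words:
        warnings.append("too short: add subject, action, and context")
    if VAGUE_WORDS & set(words):
        warnings.append("vague adjective: replace with a concrete description")
    if NEGATION.search(prompt):
        warnings.append("negation: state what you want instead")
    return warnings
```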
---
## Advanced Techniques
### Chaining Outputs
Use output from one API as input to another:
```python
# Note: get_result_url is not defined in this guide. It stands in for a helper
# that polls the task to completion and returns the first output URL.

# 1. Generate image
img_task = client.image_to_image(
    prompt="Modern office space, clean and minimal",
    image="reference.jpg"
)
img_url = get_result_url(img_task)

# 2. Animate image
video_task = client.image_to_video(
    prompt="Camera slowly panning right, golden hour lighting",
    image=img_url
)
video_url = get_result_url(video_task)

# 3. Add soundtrack
audio_task = client.video_soundtrack(
    video=video_url,
    prompt="Professional corporate music"
)
```
### Iterative Refinement
1. **Start Simple**: Test with minimal prompt
2. **Evaluate Output**: What's missing or wrong?
3. **Add Details**: Incrementally add specificity
4. **Test Again**: Compare results
**Example Iteration**:
```
V1: "Car driving"
→ Too generic
V2: "Red sports car driving fast on highway"
→ Better, but lighting unclear
V3: "Red Ferrari speeding on coastal highway, golden hour sunset, dramatic"
→ Good result!
```
### A/B Testing
Generate multiple variations to compare:
```python
# Test different camera movements
variants = [
client.image_to_video(image=img, prompt="Slow zoom in", camera_move_index=5),
client.image_to_video(image=img, prompt="Pan right", camera_move_index=12),
client.image_to_video(image=img, prompt="Orbit around subject", camera_move_index=23)
]
```
### Consistency Across Assets
For cohesive content, maintain consistent prompt elements:
```python
# Consistent style keywords across multiple generations
style_guide = "cinematic, golden hour lighting, dramatic atmosphere"
video1 = client.text_to_video(f"Mountain landscape, {style_guide}")
video2 = client.text_to_video(f"Forest scene, {style_guide}")
video3 = client.text_to_video(f"Beach sunset, {style_guide}")
```
---
## Prompt Templates by Use Case
### Marketing/Commercial
```
"[Product name] showcased on [surface/background],
[camera movement],
studio lighting with soft shadows,
premium and elegant atmosphere"
```
### Social Media
```
"[Subject] [action],
[camera angle/movement],
vibrant colors and high energy,
engaging and eye-catching for [platform]"
```
### Documentary/Educational
```
"[Subject] in [environment],
[natural action],
natural documentary style cinematography,
informative and authentic atmosphere"
```
### Artistic/Creative
```
"[Abstract concept] visualized as [imagery],
[unique camera work],
[lighting style],
[artistic mood/style reference]"
```
---
## Troubleshooting
### Problem: Output doesn't match prompt
**Solutions**:
- Simplify prompt to core elements
- Remove contradictory instructions
- Be more specific about key aspects
- Check for typos or ambiguous terms
### Problem: Output quality is poor
**Solutions**:
- Add style keywords (cinematic, professional, high quality)
- Specify lighting conditions
- Mention camera quality/style
- Add atmosphere/mood descriptors
### Problem: Inconsistent results
**Solutions**:
- Use more specific prompts
- Reference specific styles or examples
- Test with variations to find optimal wording
- Use consistent style keywords across generations
### Problem: Generation fails (status=4)
**Solutions**:
- Check prompt for inappropriate content
- Simplify overly complex prompts
- Verify input files (images/videos) are valid
- Ensure parameters are within valid ranges
- Try with default parameters first
---
## Language Considerations
Tomoviee supports both **Chinese** and **English** prompts.
**Tips**:
- Use the language you're most comfortable with
- Be aware of cultural/contextual nuances
- Technical terms may work better in English
- Artistic descriptions may vary by language interpretation
**Example Comparisons**:
Chinese:
```
"一只金毛寻回犬在阳光明媚的草地上奔跑,镜头跟随,电影级画质"
```
English:
```
"A golden retriever running through a sunlit meadow, camera following, cinematic quality"
```
Both can produce excellent results - choose based on your fluency and precision needs.
---
## Summary Checklist
Before submitting your prompt, verify:
- [ ] **Clear Subject**: What's the main focus?
- [ ] **Specific Action**: What's happening?
- [ ] **Scene Context**: Where is this taking place?
- [ ] **Visual Style**: What's the look and feel?
- [ ] **Technical Parameters**: Resolution, duration, aspect ratio set correctly?
- [ ] **No Contradictions**: All elements work together?
- [ ] **Appropriate Specificity**: Not too vague, not over-specified?
A well-crafted prompt is the foundation of excellent AI-generated content. Take time to refine your prompts for best results!
Generate sound effects from text prompts using Tomoviee Text-to-Sound-Effect API (`tm_text2sfx`) through Wondershare OpenAPI gateway (`https://openapi.wonder...
---
name: tomoviee-text2sfx
description: Generate sound effects from text prompts using Tomoviee Text-to-Sound-Effect API (`tm_text2sfx`) through Wondershare OpenAPI gateway (`https://openapi.wondershare.cc`). Use when users request text_to_sound_effect operations or related tasks.
---
# Tomoviee AI Text-to-Sound-Effect
## Overview
Generate sound effects from text prompts.
- API capability: `tm_text2sfx`
- Task creation endpoint: `https://openapi.wondershare.cc/v1/open/capacity/application/tm_text2sfx`
- Result endpoint: `https://openapi.wondershare.cc/v1/open/pub/task`
## Provider and Endpoint Provenance
Use this mapping to verify provider identity and runtime endpoints:
- Vendor portals: `https://www.tomoviee.ai` and `https://www.tomoviee.cn`
- API gateway host used by this skill: `https://openapi.wondershare.cc`
- This skill sends runtime API calls only to `openapi.wondershare.cc`
## Credential Handling
- `app_key` and `app_secret` are used only to build the request headers: `X-App-Key: <app_key>` and `Authorization: Basic <base64(app_key:app_secret)>`.
- Credentials are kept in process memory only and are not written to disk by this skill.
- Do not commit credentials into `SKILL.md`, scripts, or repository files.
## Quick Start
### Install dependencies
```bash
pip install requests
```
### Authentication
```bash
python scripts/generate_auth_token.py YOUR_APP_KEY YOUR_APP_SECRET
```
### Python Client
```python
from scripts.tomoviee_text2sfx_client import TomovieeText2SfxClient
client = TomovieeText2SfxClient("app_key", "app_secret")
```
## API Usage
### Basic Example
```python
task_id = client.text_to_sound_effect(
prompt="Heavy rain falling on roof with distant thunder",
duration=30,
qty=1,
)
result = client.poll_until_complete(task_id)
import json
audio_url = json.loads(result["result"])["audio_path"][0]
print(audio_url)
```
### Parameters
- `prompt` (required): Sound effect description
- `duration` (required): Duration in seconds, range `5-180`
- `qty` (optional): Number of generated results, range `1-4`
- `callback` (optional): Callback URL
- `params` (optional): Transparent callback parameter
## Async Workflow
1. Create task and get `task_id`
2. Poll with `poll_until_complete(task_id)`
3. Parse output URL from `result`
Status codes:
- `1` queued
- `2` processing
- `3` success
- `4` failed
- `5` cancelled
- `6` timeout
## Resources
- `scripts/tomoviee_text2sfx_client.py` - main API client
- `scripts/tomoviee_text_to_sound_effect_client.py` - compatibility import shim
- `scripts/generate_auth_token.py` - auth token helper
- `references/audio_apis.md` - focused API reference for `tm_text2sfx`
- `references/prompt_guide.md` - focused prompt writing guide for sound effects
## External Resources
- Developer portal (global): `https://www.tomoviee.ai/developers.html`
- API docs (global): `https://www.tomoviee.ai/doc/`
- Developer portal (mainland): `https://www.tomoviee.cn/developers.html`
- API docs (mainland): `https://www.tomoviee.cn/doc/`
- API gateway host used by this package: `https://openapi.wondershare.cc`
FILE:_meta.json
{
"ownerId": "kn7bn8fv79mjtchs17087tdtsx82fc86",
"slug": "tomoviee-text-to-sound-effects",
"version": "1.0.1",
"publishedAt": 1772934073311
}
FILE:scripts/generate_auth_token.py
#!/usr/bin/env python3
"""
Generate authentication token for Tomoviee API.

Usage:
    python generate_auth_token.py <app_key> <app_secret>

Output:
    Base64 encoded access_token in the format required by Authorization header
"""

import base64
import sys


def generate_access_token(app_key: str, app_secret: str) -> str:
    """
    Generate access token for Tomoviee API authentication.

    Args:
        app_key: Application key from Tomoviee console
        app_secret: Application secret from Tomoviee console

    Returns:
        Base64 encoded string in format: base64(app_key:app_secret)
    """
    credentials = f"{app_key}:{app_secret}"
    access_token = base64.b64encode(credentials.encode()).decode()
    return access_token


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python generate_auth_token.py <app_key> <app_secret>")
        sys.exit(1)

    app_key = sys.argv[1]
    app_secret = sys.argv[2]

    token = generate_access_token(app_key, app_secret)
    print(f"Access Token: {token}")
    print(f"\nUse in Authorization header as: Basic {token}")
FILE:scripts/tomoviee_text2sfx_client.py
#!/usr/bin/env python3
"""Tomoviee AI - Text-to-Sound-Effect API client."""

import base64
import json
import time
from typing import Any, Dict, Optional

import requests


class TomovieeText2SfxClient:
    """Text-to-Sound-Effect API client for Tomoviee AI."""

    BASE_URL = "https://openapi.wondershare.cc/v1/open/capacity/application"
    RESULT_ENDPOINT = "https://openapi.wondershare.cc/v1/open/pub/task"
    ENDPOINT = "tm_text2sfx"
    REQUEST_TIMEOUT = 60

    def __init__(self, app_key: str, app_secret: str):
        self.app_key = app_key
        self.access_token = self._generate_token(app_key, app_secret)

    def _generate_token(self, app_key: str, app_secret: str) -> str:
        credentials = f"{app_key}:{app_secret}"
        return base64.b64encode(credentials.encode()).decode()

    def _get_headers(self) -> Dict[str, str]:
        return {
            "Content-Type": "application/json",
            "X-App-Key": self.app_key,
            "Authorization": f"Basic {self.access_token}",
        }

    def _safe_json(self, response: requests.Response) -> Dict[str, Any]:
        try:
            return response.json()
        except ValueError as exc:
            raise Exception(f"Invalid JSON response: {response.text}") from exc

    def _make_request(self, payload: Dict[str, Any]) -> str:
        url = f"{self.BASE_URL}/{self.ENDPOINT}"
        response = requests.post(
            url,
            headers=self._get_headers(),
            json=payload,
            timeout=self.REQUEST_TIMEOUT,
        )
        response.raise_for_status()
        result = self._safe_json(response)
        if result.get("code") != 0:
            raise Exception(f"API Error: {result.get('msg', 'Unknown error')}")
        task_id = result.get("data", {}).get("task_id")
        if not task_id:
            raise Exception(f"Missing task_id in response: {result}")
        return task_id

    def text_to_sound_effect(
        self,
        prompt: str,
        duration: int,
        qty: Optional[int] = None,
        callback: Optional[str] = None,
        params: Optional[str] = None,
    ) -> str:
        """Create a text-to-sound-effect task and return task_id."""
        if duration < 5 or duration > 180:
            raise ValueError("duration must be in range 5..180")
        if qty is not None and (qty < 1 or qty > 4):
            raise ValueError("qty must be in range 1..4")

        payload: Dict[str, Any] = {
            "prompt": prompt,
            "duration": duration,
        }
        if qty is not None:
            payload["qty"] = qty
        if callback:
            payload["callback"] = callback
        if params:
            payload["params"] = params
        return self._make_request(payload)

    def get_result(self, task_id: str) -> Dict[str, Any]:
        response = requests.post(
            self.RESULT_ENDPOINT,
            headers=self._get_headers(),
            json={"task_id": task_id},
            timeout=self.REQUEST_TIMEOUT,
        )
        response.raise_for_status()
        result = self._safe_json(response)
        if result.get("code") != 0 and not result.get("data"):
            raise Exception(f"API Error: {result.get('msg', 'Unknown error')}")
        data = result.get("data")
        if not data:
            raise Exception(f"Missing task data in response: {result}")
        return data

    def poll_until_complete(
        self,
        task_id: str,
        poll_interval: int = 10,
        timeout: int = 600,
    ) -> Dict[str, Any]:
        elapsed = 0
        while elapsed < timeout:
            result = self.get_result(task_id)
            status = result.get("status")
            if status == 3:
                return result
            if status in [4, 5, 6]:
                raise Exception(f"Task failed: {result.get('reason', 'Unknown error')}")
            time.sleep(poll_interval)
            elapsed += poll_interval
        raise TimeoutError(f"Task did not complete within {timeout} seconds")


# Backward-compatible alias
TomovieeClient = TomovieeText2SfxClient


if __name__ == "__main__":
    import sys

    if len(sys.argv) < 5:
        print(
            "Usage: python scripts/tomoviee_text2sfx_client.py "
            "<app_key> <app_secret> <prompt> <duration> [qty]"
        )
        sys.exit(1)

    app_key = sys.argv[1]
    app_secret = sys.argv[2]
    prompt = sys.argv[3]
    duration = int(sys.argv[4])
    qty = None
    if len(sys.argv) > 5:
        qty = int(sys.argv[5])

    client = TomovieeText2SfxClient(app_key, app_secret)
    try:
        print("Creating text-to-sound-effect task...")
        task_id = client.text_to_sound_effect(
            prompt=prompt,
            duration=duration,
            qty=qty,
        )
        print(f"Task created: {task_id}")
        print("Polling for result...")
        result = client.poll_until_complete(task_id)
        print("\nTask completed")
        print(f"Progress: {result.get('progress', 'N/A')}%")
        result_data = json.loads(result["result"])
        print(json.dumps(result_data, indent=2, ensure_ascii=False))
    except Exception as exc:
        print(f"Error: {exc}")
        sys.exit(1)
FILE:scripts/tomoviee_text_to_sound_effect_client.py
#!/usr/bin/env python3
"""Compatibility shim for legacy import path.
Use `tomoviee_text2sfx_client.py` as the primary implementation.
"""
from .tomoviee_text2sfx_client import TomovieeClient, TomovieeText2SfxClient
__all__ = ["TomovieeText2SfxClient", "TomovieeClient"]
FILE:references/audio_apis.md
# Tomoviee Text-to-Sound-Effect API Reference
## Scope
This reference is intentionally scoped to one capability:
- `tm_text2sfx` (text_to_sound_effect)
## Endpoints
- Task creation: `https://openapi.wondershare.cc/v1/open/capacity/application/tm_text2sfx`
- Task result: `https://openapi.wondershare.cc/v1/open/pub/task`
## Authentication
Send both headers on each request:
- `X-App-Key: <app_key>`
- `Authorization: Basic <base64(app_key:app_secret)>`
The access token is generated by Base64-encoding `app_key:app_secret`.
## Request Parameters
- `prompt` (required): Description of the desired sound effect.
- `duration` (required): Duration in seconds, range `5-180`.
- `qty` (optional): Number of generated variants, range `1-4`.
- `callback` (optional): Callback URL.
- `params` (optional): Opaque string forwarded back in callback payload.
## Async Workflow
1. Create task and get `task_id`.
2. Poll result endpoint with `task_id`.
3. Check status and parse final `result` JSON.
Status codes:
- `1` queued
- `2` processing
- `3` success
- `4` failed
- `5` cancelled
- `6` timeout
## Result Parsing Example
```python
import json
result_payload = json.loads(result["result"])
audio_url = result_payload["audio_path"][0]
```
## Error Handling Notes
- Validate `duration` and `qty` before request.
- Treat non-zero `code` as API error.
- Handle network timeout and non-JSON responses.
- Handle terminal task statuses (`4`, `5`, `6`) as failure paths.
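The notes on non-JSON responses and non-zero `code` values can be sketched as a single unwrapping step. The envelope fields `code`, `msg`, and `data` follow what the client script in this package reads; the `TomovieeApiError` class and `unwrap_response` function are hypothetical names used only for illustration.

```python
import json
from typing import Any, Dict


class TomovieeApiError(Exception):
    """Raised for malformed responses or a non-zero `code` (hypothetical name)."""


def unwrap_response(body: str) -> Dict[str, Any]:
    """Return the `data` object of a response body, or raise TomovieeApiError."""
    try:
        envelope = json.loads(body)
    except ValueError as exc:  # json.JSONDecodeError subclasses ValueError
        raise TomovieeApiError(f"Non-JSON response: {body[:200]!r}") from exc
    if not isinstance(envelope, dict):
        raise TomovieeApiError("Unexpected response shape: expected a JSON object")
    if envelope.get("code") != 0:
        raise TomovieeApiError(f"API error: {envelope.get('msg', 'Unknown error')}")
    data = envelope.get("data")
    if not data:
        raise TomovieeApiError("Response has no data field")
    return data
```

Using one exception type for all three failure modes lets callers handle them in a single `except` clause while the message still says which check failed.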
FILE:references/prompt_guide.md
# Text-to-Sound-Effect Prompt Guide
## Goal
Write concise, concrete prompts that describe one coherent sound event or ambience.
## Formula
`Sound Source + Action/Context + Acoustic Characteristics`
## Good Prompt Patterns
- Source: what makes the sound
- Context: where or under what condition
- Characteristics: loudness, texture, rhythm, distance, reverb
Examples:
- `Heavy rain hitting a metal roof, steady rhythm, distant thunder`
- `Glass bottle shattering on concrete, sharp and bright transient`
- `Busy office ambience with keyboard typing and quiet chatter`
- `Retro game UI confirmation beep, short and clean`
- `Wooden door creaking open in an empty hallway, long reverb`
## Do
- Keep one primary sound scene per request.
- Include intensity and environment cues.
- Specify if you want short event vs continuous ambience.
## Avoid
- Multiple unrelated scenes in one prompt.
- Contradictory adjectives (for example: `very soft and extremely loud`).
- Overly abstract instructions with no physical source.
## Iteration Strategy
1. Start with a short prompt.
2. If output is too generic, add source detail.
3. If output is too busy, remove secondary sounds.
4. Tune duration for intended usage (UI, transition, ambience loop).
## Duration Guidance
- `5-10s`: UI or single events
- `10-30s`: transitions and short scenes
- `30-180s`: ambience beds and environmental loops
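The duration bands above can be encoded as a small lookup so that a requested duration is pulled into the band for its intended use. This is only a sketch: the use-case keys and the `clamp_duration` helper are assumptions, and the bands share their boundary values exactly as listed.

```python
# Duration bands in seconds, copied from the guidance above.
# The dictionary keys are hypothetical labels for the three use cases.
DURATION_BANDS = {
    "ui": (5, 10),
    "transition": (10, 30),
    "ambience": (30, 180),
}


def clamp_duration(use_case: str, seconds: int) -> int:
    """Pull `seconds` into the recommended band for the given use case."""
    low, high = DURATION_BANDS[use_case]
    return max(low, min(high, seconds))
```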
Generate background music from text prompts using Tomoviee Text-to-Music API (`tm_text2music`) through Wondershare OpenAPI gateway (`https://openapi.wondersh...
---
name: tomoviee-text-to-music
description: Generate background music from text prompts using Tomoviee Text-to-Music API (`tm_text2music`) through Wondershare OpenAPI gateway (`https://openapi.wondershare.cc`). Use when users request text-to-music generation with duration and quantity control.
---
# Tomoviee AI Text-to-Music
## Overview
Generate background music from text prompts.
- API capability: `tm_text2music`
- Task creation endpoint: `https://openapi.wondershare.cc/v1/open/capacity/application/tm_text2music`
- Result endpoint: `https://openapi.wondershare.cc/v1/open/pub/task`
## Provider and Endpoint Provenance
Use this mapping to verify provider identity and runtime endpoints:
- Vendor portals: `https://www.tomoviee.ai` and `https://www.tomoviee.cn`
- API gateway host used by this skill: `https://openapi.wondershare.cc`
- This skill sends runtime API calls only to `openapi.wondershare.cc`
## Credential Handling
- `app_key` and `app_secret` are only used to construct `Authorization: Basic <base64(app_key:app_secret)>`.
- Credentials are kept in process memory only and are not written to disk by this skill.
- Do not commit credentials into `SKILL.md`, scripts, or repository files.
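One way to follow these rules is to read the credentials from the environment at call time. The sketch below assumes the variable names `TOMOVIEE_APP_KEY` and `TOMOVIEE_APP_SECRET`, which are illustrative and not defined by the skill; only the header format comes from the text above.

```python
import base64
import os
from typing import Dict


def build_basic_auth_header(app_key: str, app_secret: str) -> Dict[str, str]:
    """Build `Authorization: Basic <base64(app_key:app_secret)>`."""
    token = base64.b64encode(f"{app_key}:{app_secret}".encode()).decode()
    return {"Authorization": f"Basic {token}"}


def header_from_env() -> Dict[str, str]:
    """Read credentials from environment variables (assumed names)."""
    # Raises KeyError if either variable is unset, so a missing
    # credential fails loudly instead of sending an empty header.
    return build_basic_auth_header(
        os.environ["TOMOVIEE_APP_KEY"],
        os.environ["TOMOVIEE_APP_SECRET"],
    )
```

The credentials stay in process memory only; nothing is written to disk.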
## Quick Start
### Install dependencies
```bash
pip install -r requirements.txt
```
### Authentication helper
```bash
python scripts/generate_auth_token.py YOUR_APP_KEY YOUR_APP_SECRET
```
### Python Client
```python
from scripts.tomoviee_text2music_client import TomovieeText2MusicClient
client = TomovieeText2MusicClient("app_key", "app_secret")
```
## API Usage
### Basic Example
```python
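import json

# Create the task, then block until it reaches a terminal status.
task_id = client.text_to_music(
    prompt="Upbeat tech music, modern and energetic electronic pop",
    duration=30,
    qty=1,
    disable_translate=False,
)
result = client.poll_until_complete(task_id)

# `result["result"]` is a JSON string; `audio_path` lists the output URLs.
audio_url = json.loads(result["result"])["audio_path"][0]
print(audio_url)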
```
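When `qty` is greater than 1, more than one URL may come back. The helper below is a minimal sketch with a hypothetical name; it assumes, as the example above does, that the task data carries a JSON-encoded `result` string containing an `audio_path` list.

```python
import json
from typing import Any, Dict, List


def extract_audio_urls(task_data: Dict[str, Any]) -> List[str]:
    """Return every audio URL found in a completed task's data."""
    raw = task_data.get("result")
    if raw is None:
        raise ValueError("task data has no 'result' field")
    # Accept an already-decoded dict as well as the JSON string form.
    parsed = json.loads(raw) if isinstance(raw, str) else raw
    urls = parsed.get("audio_path") or []
    if not urls:
        raise ValueError("no 'audio_path' entries in task result")
    return list(urls)
```

Pass it the dictionary returned by `poll_until_complete(task_id)`.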
### Parameters
- `prompt` (required): Prompt text. Supports up to 77 tokens; extra content is truncated.
- `duration` (required): Target music duration in seconds, range `0-95`.
- `qty` (required): Number of generated music outputs, range `1-4`.
- `disable_translate` (optional): Whether to disable translation.
- `callback` (optional): Callback URL.
- `params` (optional): Transparent callback parameter.
## Async Workflow
1. Create task and get `task_id`
2. Poll with `poll_until_complete(task_id)`
3. Parse output URL from `result`
Status codes:
- `1` queued
- `2` processing
- `3` success
- `4` failed
- `5` cancelled
- `6` timeout
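The status codes above can be modeled as an enum with a small polling loop around it. This is a sketch under stated assumptions: `TaskStatus`, `wait_for_task`, and the injected `fetch` callable are hypothetical names, and `fetch` stands in for something like the client's `get_result`. Injecting `fetch` and `sleep` lets the loop run without network access.

```python
import time
from enum import IntEnum
from typing import Any, Callable, Dict


class TaskStatus(IntEnum):
    QUEUED = 1
    PROCESSING = 2
    SUCCESS = 3
    FAILED = 4
    CANCELLED = 5
    TIMEOUT = 6


TERMINAL_FAILURES = {TaskStatus.FAILED, TaskStatus.CANCELLED, TaskStatus.TIMEOUT}


def wait_for_task(
    fetch: Callable[[str], Dict[str, Any]],
    task_id: str,
    poll_interval: float = 10.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll `fetch(task_id)` until the task succeeds, fails, or times out."""
    deadline = time.monotonic() + timeout
    while True:
        data = fetch(task_id)
        status = TaskStatus(data["status"])
        if status is TaskStatus.SUCCESS:
            return data
        if status in TERMINAL_FAILURES:
            reason = data.get("reason", "unknown")
            raise RuntimeError(f"task {task_id} ended as {status.name}: {reason}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} not finished after {timeout}s")
        sleep(poll_interval)
```

Using a monotonic deadline also counts the time spent inside each request, which a simple `elapsed += poll_interval` counter does not.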
## Resources
- `scripts/tomoviee_text2music_client.py` - main API client
- `scripts/tomoviee_text_to_music_client.py` - compatibility import shim
- `scripts/generate_auth_token.py` - auth token helper
- `references/audio_apis.md` - API reference and parameter constraints
- `references/prompt_guide.md` - prompt writing guidance
## External Resources
- Developer portal (global): `https://www.tomoviee.ai/developers.html`
- API docs (global): `https://www.tomoviee.ai/doc/`
- Developer portal (mainland): `https://www.tomoviee.cn/developers.html`
- API docs (mainland): `https://www.tomoviee.cn/doc/`
- API gateway host used by this package: `https://openapi.wondershare.cc`
FILE:requirements.txt
requests>=2.31.0,<3.0.0
FILE:_meta.json
{
"ownerId": "kn7bn8fv79mjtchs17087tdtsx82fc86",
"slug": "tomoviee-text-to-music",
"version": "1.0.1",
"publishedAt": 1772934101348
}
FILE:scripts/generate_auth_token.py
#!/usr/bin/env python3
"""
Generate authentication token for Tomoviee API.

Usage:
    python generate_auth_token.py <app_key> <app_secret>

Output:
    Base64 encoded access_token in the format required by Authorization header
"""
import base64
import sys


def generate_access_token(app_key: str, app_secret: str) -> str:
    """
    Generate access token for Tomoviee API authentication.

    Args:
        app_key: Application key from Tomoviee console
        app_secret: Application secret from Tomoviee console

    Returns:
        Base64 encoded string in format: base64(app_key:app_secret)
    """
    credentials = f"{app_key}:{app_secret}"
    access_token = base64.b64encode(credentials.encode()).decode()
    return access_token


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python generate_auth_token.py <app_key> <app_secret>")
        sys.exit(1)

    app_key = sys.argv[1]
    app_secret = sys.argv[2]
    token = generate_access_token(app_key, app_secret)
    print(f"Access Token: {token}")
    print(f"\nUse in Authorization header as: Basic {token}")
FILE:scripts/tomoviee_text2music_client.py
#!/usr/bin/env python3
"""Tomoviee AI - Text-to-Music API client."""
import base64
import json
import time
from typing import Any, Dict, Optional

import requests


class TomovieeText2MusicClient:
    """Text-to-Music API client for Tomoviee AI."""

    BASE_URL = "https://openapi.wondershare.cc/v1/open/capacity/application"
    RESULT_ENDPOINT = "https://openapi.wondershare.cc/v1/open/pub/task"
    ENDPOINT = "tm_text2music"
    REQUEST_TIMEOUT = 60  # seconds, applied to each HTTP request

    def __init__(self, app_key: str, app_secret: str):
        self.app_key = app_key
        self.access_token = self._generate_token(app_key, app_secret)

    def _generate_token(self, app_key: str, app_secret: str) -> str:
        credentials = f"{app_key}:{app_secret}"
        return base64.b64encode(credentials.encode()).decode()

    def _get_headers(self) -> Dict[str, str]:
        return {
            "Content-Type": "application/json",
            "X-App-Key": self.app_key,
            "Authorization": f"Basic {self.access_token}",
        }

    def _safe_json(self, response: requests.Response) -> Dict[str, Any]:
        try:
            return response.json()
        except ValueError as exc:
            raise Exception(f"Invalid JSON response: {response.text}") from exc

    def _make_request(self, payload: Dict[str, Any]) -> str:
        url = f"{self.BASE_URL}/{self.ENDPOINT}"
        response = requests.post(
            url,
            headers=self._get_headers(),
            json=payload,
            timeout=self.REQUEST_TIMEOUT,
        )
        response.raise_for_status()
        result = self._safe_json(response)
        if result.get("code") != 0:
            raise Exception(f"API Error: {result.get('msg', 'Unknown error')}")
        # `data` may be present but null, so fall back to an empty dict.
        task_id = (result.get("data") or {}).get("task_id")
        if not task_id:
            raise Exception(f"Missing task_id in response: {result}")
        return task_id

    def text_to_music(
        self,
        prompt: str,
        duration: int,
        qty: int,
        disable_translate: Optional[bool] = None,
        callback: Optional[str] = None,
        params: Optional[str] = None,
    ) -> str:
        """Create a text-to-music task and return task_id."""
        if duration < 0 or duration > 95:
            raise ValueError("duration must be in range 0..95")
        if qty < 1 or qty > 4:
            raise ValueError("qty must be in range 1..4")

        payload: Dict[str, Any] = {
            "prompt": prompt,
            "duration": duration,
            "qty": qty,
        }
        # Optional fields are sent only when the caller provides them.
        if disable_translate is not None:
            payload["disable_translate"] = disable_translate
        if callback:
            payload["callback"] = callback
        if params:
            payload["params"] = params
        return self._make_request(payload)

    def get_result(self, task_id: str) -> Dict[str, Any]:
        response = requests.post(
            self.RESULT_ENDPOINT,
            headers=self._get_headers(),
            json={"task_id": task_id},
            timeout=self.REQUEST_TIMEOUT,
        )
        response.raise_for_status()
        result = self._safe_json(response)
        if result.get("code") != 0 and not result.get("data"):
            raise Exception(f"API Error: {result.get('msg', 'Unknown error')}")
        data = result.get("data")
        if not data:
            raise Exception(f"Missing task data in response: {result}")
        return data

    def poll_until_complete(
        self,
        task_id: str,
        poll_interval: int = 10,
        timeout: int = 600,
    ) -> Dict[str, Any]:
        elapsed = 0
        while elapsed < timeout:
            result = self.get_result(task_id)
            status = result.get("status")
            if status == 3:  # success
                return result
            if status in [4, 5, 6]:  # failed, cancelled, timeout
                raise Exception(f"Task failed: {result.get('reason', 'Unknown error')}")
            time.sleep(poll_interval)
            elapsed += poll_interval
        raise TimeoutError(f"Task did not complete within {timeout} seconds")


# Backward-compatible alias
TomovieeClient = TomovieeText2MusicClient


if __name__ == "__main__":
    import sys

    if len(sys.argv) < 6:
        print(
            "Usage: python scripts/tomoviee_text2music_client.py "
            "<app_key> <app_secret> <prompt> <duration> <qty> [disable_translate]"
        )
        sys.exit(1)

    app_key = sys.argv[1]
    app_secret = sys.argv[2]
    prompt = sys.argv[3]
    duration = int(sys.argv[4])
    qty = int(sys.argv[5])
    disable_translate = None
    if len(sys.argv) > 6:
        disable_translate = sys.argv[6].strip().lower() in ("1", "true", "yes", "y")

    client = TomovieeText2MusicClient(app_key, app_secret)
    try:
        print("Creating text-to-music task...")
        task_id = client.text_to_music(
            prompt=prompt,
            duration=duration,
            qty=qty,
            disable_translate=disable_translate,
        )
        print(f"Task created: {task_id}")
        print("Polling for result...")
        result = client.poll_until_complete(task_id)
        print("\nTask completed")
        print(f"Progress: {result.get('progress', 'N/A')}%")
        result_data = json.loads(result["result"])
        print(json.dumps(result_data, indent=2, ensure_ascii=False))
    except Exception as exc:
        print(f"Error: {exc}")
        sys.exit(1)
FILE:scripts/tomoviee_text_to_music_client.py
#!/usr/bin/env python3
"""Compatibility shim. Prefer importing TomovieeText2MusicClient from tomoviee_text2music_client."""
try:
    from scripts.tomoviee_text2music_client import TomovieeClient, TomovieeText2MusicClient
except ImportError:
    # Fall back to a sibling import when run from inside the scripts directory.
    from tomoviee_text2music_client import TomovieeClient, TomovieeText2MusicClient

__all__ = ["TomovieeClient", "TomovieeText2MusicClient"]
FILE:references/audio_apis.md
# Tomoviee Text-to-Music API Reference
## Provenance and Endpoint Mapping
- Vendor portals: `https://www.tomoviee.ai` and `https://www.tomoviee.cn`
- Gateway host used by this skill package: `https://openapi.wondershare.cc`
- Primary capacity for this skill: `tm_text2music`
All runtime requests from this skill call only:
1. `https://openapi.wondershare.cc/v1/open/capacity/application/tm_text2music`
2. `https://openapi.wondershare.cc/v1/open/pub/task`
## Text-to-Music (tm_text2music)
Generate background music from a text prompt.
### Endpoint
`https://openapi.wondershare.cc/v1/open/capacity/application/tm_text2music`
### Parameters
- `prompt` (required): text prompt for genre, mood, instruments, and usage context
- `duration` (required): target music duration in seconds, range `0-95`
- `qty` (required): number of generated tracks, range `1-4`
- `disable_translate` (optional): whether to disable translation
- `callback` (optional): callback URL for async notification
- `params` (optional): transparent passthrough parameter
### Result Endpoint
`https://openapi.wondershare.cc/v1/open/pub/task`
### Status Codes
- `1` queued
- `2` processing
- `3` success
- `4` failed
- `5` cancelled
- `6` timeout
### Example
```python
from scripts.tomoviee_text2music_client import TomovieeText2MusicClient

client = TomovieeText2MusicClient("app_key", "app_secret")
task_id = client.text_to_music(
    prompt="Upbeat corporate technology music, modern and energetic",
    duration=30,
    qty=1,
    disable_translate=False,
)
result = client.poll_until_complete(task_id)
```
### Credential Handling
- Build auth header as `Authorization: Basic <base64(app_key:app_secret)>`
- Do not hardcode or persist credentials in source files
- Prefer environment variables or secure secret stores
FILE:references/prompt_guide.md
# Tomoviee Prompt Engineering Guide
## Overview
This guide provides structured prompt formulas and best practices for all Tomoviee AI APIs to achieve optimal generation results.
---
## Video APIs
### Text-to-Video Prompt Formula
```
主体(描述) + 运动 + 场景(描述) + (镜头语言 + 光影 + 氛围)
Subject (description) + Motion + Scene (description) + (Camera + Lighting + Atmosphere)
```
**Component Breakdown**:
1. **Subject (主体)**: Main focus of the video
- Who/what is in the scene
- Key characteristics, appearance
- Position and pose
2. **Motion (运动)**: Action and movement
- What the subject is doing
- Speed and direction of movement
- Dynamic elements
3. **Scene (场景)**: Environment and context
- Location and setting
- Background elements
- Time of day/season
4. **Camera (镜头语言)**: Optional camera work
- Camera angle (wide, close-up, aerial, etc.)
- Camera movement (pan, zoom, tracking, etc.)
- Or specify `camera_move_index` parameter
5. **Lighting (光影)**: Optional lighting style
- Natural/artificial light
- Time of day (golden hour, blue hour, etc.)
- Light direction and quality
6. **Atmosphere (氛围)**: Optional mood and tone
- Overall feeling (dramatic, peaceful, energetic)
- Color palette/grading
- Weather/environmental effects
**Examples**:
**Minimal** (Subject + Motion + Scene):
```
"A red sports car driving fast on a coastal highway at sunset"
```
**Standard** (+ Camera + Lighting):
```
"A red sports car speeding along a winding coastal highway at golden hour,
camera following from the side, warm orange sunlight reflecting off the car"
```
**Detailed** (+ Atmosphere):
```
"A sleek red Ferrari speeding along a dramatic coastal highway carved into cliffs,
camera tracking smoothly from the side at car level,
golden hour sunset casting long shadows and warm orange glow,
cinematic and epic atmosphere with ocean waves crashing below"
```
**More Examples by Use Case**:
**Product Showcase**:
```
"White wireless headphones slowly rotating on a minimalist white surface,
studio lighting with soft shadows, clean and modern atmosphere"
```
**Nature/Travel**:
```
"Majestic waterfall cascading down mossy rocks in a lush rainforest,
slow zoom in from wide to medium shot,
dappled sunlight filtering through the canopy,
serene and peaceful atmosphere"
```
**Action/Sports**:
```
"Professional skateboarder performing a kickflip on urban street ramp,
slow motion capture from low angle,
bright daylight with high contrast shadows,
energetic and dynamic atmosphere"
```
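The text-to-video formula can also be assembled programmatically. The function below is a minimal sketch with a hypothetical name and signature: it joins the three required components into one clause and appends whichever optional components are supplied.

```python
from typing import Optional


def build_video_prompt(
    subject: str,
    motion: str,
    scene: str,
    camera: Optional[str] = None,
    lighting: Optional[str] = None,
    atmosphere: Optional[str] = None,
) -> str:
    """Compose Subject + Motion + Scene + (Camera + Lighting + Atmosphere)."""
    core = " ".join(part.strip() for part in (subject, motion, scene))
    # Optional components are appended as comma-separated clauses.
    extras = [part.strip() for part in (camera, lighting, atmosphere) if part]
    return ", ".join([core, *extras])
```

Calling it with only the required arguments reproduces the minimal form; adding `camera` and `lighting` gives the standard form.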
### Image-to-Video Prompt Formula
```
主体 + 运动 + 镜头语言
Subject + Motion + Camera
```
**Why simpler?** The image already provides:
- Scene and environment
- Lighting and color palette
- Composition and framing
- Atmosphere and mood
**Your prompt should focus on**:
- Motion to add to the static image
- Camera movement to apply
- Any new dynamic elements
**Examples**:
```
"Camera slowly zooming in, subject's hair gently blowing in the wind"
```
```
"Slow pan from left to right, leaves rustling, golden hour lighting"
```
```
"Camera orbiting around the subject, dramatic lighting, cinematic feel"
```
```
"Gentle push in toward the subject's face, bokeh background, emotional atmosphere"
```
### Video Continuation Prompt Formula
```
延续的动作 + 场景变化 + 镜头延续
Continued action + Scene evolution + Camera continuation
```
**Focus on**:
- How the action continues from the last frame
- Natural progression of movement
- Scene changes or transitions
- Camera movement consistency
**Examples**:
```
"The bird continues flying higher, soaring into the clouds,
camera following the ascent"
```
```
"The car continues down the road, passing through a tunnel,
maintaining tracking shot"
```
```
"The person continues walking, entering a brightly lit room,
camera follows smoothly"
```
---
## Image APIs
### Image-to-Image Prompt Formula
```
参考图描述 + 保留要素 + 修改/新增指令
Reference description + Elements to preserve + Modifications/additions
```
**Component Breakdown**:
1. **Reference Description**: What's in the original image
2. **Preserve**: Explicitly state what to keep unchanged
3. **Modify/Add**: What to change or add
**Examples**:
```
"A woman in business attire at a modern office,
preserve facial features and body pose,
change background to outdoor garden with natural lighting,
add warm sunset atmosphere"
```
```
"Portrait of a man wearing a blue shirt,
keep facial features and expression,
change clothing to formal black suit with tie,
maintain studio lighting"
```
```
"Kitchen interior with white cabinets,
preserve layout and appliance positions,
change color scheme to navy blue cabinets with gold hardware,
add marble countertops"
```
### Image Redrawing Prompt Formula
```
替换区域的新内容描述
Description of what replaces the masked area
```
**Focus on**:
- What should appear in the masked region
- Style/texture matching surrounding areas
- Lighting consistency
- Natural integration
**Examples**:
```
"Clear blue sky with white fluffy clouds"
(for replacing background)
```
```
"Natural grass texture with small wildflowers"
(for replacing foreground)
```
```
"Modern glass windows with reflections of cityscape"
(for replacing building facade)
```
```
"Empty wooden table surface with subtle wood grain"
(for removing objects from table)
```
### Image Recognition Prompt Formula
```
要识别的对象/区域描述
Description of objects/regions to identify
```
**Focus on**:
- Clear object identification
- Specific vs. general (depending on need)
- Spatial context if helpful
**Examples**:
```
"person" / "all people"
```
```
"the red car in the foreground"
```
```
"sky and clouds"
```
```
"text and logos"
```
```
"background behind the main subject"
```
---
## Audio APIs
### Text-to-Music Prompt Formula
```
主体 + 场景(氛围/风格)
Subject/Theme + Scene (Atmosphere/Style)
```
**Component Breakdown**:
1. **Subject/Theme**: Purpose or topic of the music
2. **Atmosphere**: Mood and emotional quality
3. **Style/Genre**: Musical style and genre
4. **Instruments** (optional): Key instruments to feature
**Examples**:
**Simple**:
```
"Upbeat electronic music, energetic and modern"
```
**Standard**:
```
"Corporate presentation background music, professional and clean,
soft piano and ambient synth"
```
**Detailed**:
```
"Epic cinematic orchestral score, dramatic and heroic,
soaring strings and powerful brass,
Hans Zimmer style, for action scene"
```
**By Genre**:
**Electronic/Pop**:
```
"Energetic EDM track, festival atmosphere, heavy bass and synth drops"
```
**Jazz**:
```
"Smooth jazz for evening ambience, sophisticated and relaxed,
piano trio with double bass"
```
**Orchestral**:
```
"Emotional film score, contemplative and moving,
piano and string ensemble"
```
**Ambient**:
```
"Peaceful meditation music, calm and spacious,
soft pads and gentle chimes"
```
### Text-to-Sound-Effect Prompt Formula
```
声音来源 + 动作/环境 + 特征
Sound source + Action/Context + Characteristics
```
**Component Breakdown**:
1. **Sound Source**: What's making the sound
2. **Action/Context**: What's happening / where it is
3. **Characteristics**: Sound qualities (loud, soft, sharp, etc.)
**Examples**:
**Single Events**:
```
"Glass bottle shattering on concrete, sharp and crisp"
```
```
"Car door closing, solid thunk, reverb in garage"
```
```
"Notification bell sound, soft and pleasant"
```
**Continuous Sounds**:
```
"Heavy rain on metal roof, steady and rhythmic"
```
```
"City traffic ambience, cars passing, distant sirens"
```
```
"Forest birds chirping, peaceful morning atmosphere"
```
**By Category**:
**Nature**:
```
"Ocean waves crashing on beach, powerful and continuous"
```
**Mechanical**:
```
"Old typewriter keys typing, mechanical clicks and dings"
```
**Human**:
```
"Crowd cheering and applauding, enthusiastic and loud"
```
**UI/Digital**:
```
"Futuristic UI beep, high-tech and clean"
```
### Text-to-Speech Tips
**Structure**:
- Write naturally as you would speak
- Use punctuation for pacing and pauses
- Spell out numbers and abbreviations
**Examples**:
**With Pacing**:
```
"Welcome to our platform... Let me show you around."
(Ellipsis adds natural pause)
```
**With Emphasis**:
```
"This is extremely important. Pay close attention."
(Period creates strong pause for emphasis)
```
**Numbers**:
```
"Call one, eight hundred, five five five, zero one two three"
(Instead of "Call 1-800-555-0123")
```
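Spelling out digits can be automated before the text is sent for synthesis. The sketch below uses a hypothetical helper that reads each digit individually and turns hyphens between digit groups into comma pauses. It is deliberately simpler than the example above: `800` becomes "eight zero zero", not "eight hundred".

```python
import re

DIGIT_WORDS = [
    "zero", "one", "two", "three", "four",
    "five", "six", "seven", "eight", "nine",
]


def _spell_group(match: "re.Match[str]") -> str:
    # Read a run of digits one digit at a time.
    return " ".join(DIGIT_WORDS[int(ch)] for ch in match.group())


def spell_out_digits(text: str) -> str:
    """Replace digit runs with digit words, pausing at hyphens between groups."""
    text = re.sub(r"(?<=[0-9])-(?=[0-9])", ", ", text)
    return re.sub(r"[0-9]+", _spell_group, text)
```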
### Video Soundtrack Prompt Formula
```
风格/类型 + 情绪 + (节奏/能量)
Style/Genre + Mood + (Optional: pacing/energy)
```
**Why simpler?** The API analyzes video content automatically.
**Examples**:
**Minimal** (let API analyze):
```
(no prompt - fully automatic based on video)
```
**Guided Style**:
```
"Upbeat travel vlog music, adventurous and inspiring"
```
```
"Corporate tech presentation music, modern and professional"
```
```
"Emotional documentary score, contemplative and moving"
```
---
## General Best Practices
### Do's ✅
1. **Be Specific**: Clear descriptions yield better results
- Good: "Golden retriever puppy playing with red ball"
- Bad: "Dog playing"
2. **Use Descriptive Adjectives**: Add relevant details
- Good: "Sleek modern smartphone with edge-to-edge display"
- Bad: "Phone"
3. **Specify Mood/Atmosphere**: Set the tone
- Good: "Dramatic sunset with vibrant orange and purple clouds"
- Bad: "Sunset"
4. **Include Context**: Where, when, why
- Good: "Professional product photography on white background"
- Bad: "Product"
5. **Mention Key Visual Elements**:
- Lighting conditions
- Color palette
- Composition style
- Movement characteristics
### Don'ts ❌
1. **Don't Be Vague**:
- Bad: "Nice video"
- Bad: "Good music"
- Bad: "Cool image"
2. **Don't Contradict**:
- Bad: "Bright and dark scene"
- Bad: "Fast and slow motion"
- Bad: "Happy sad music"
3. **Don't Over-specify Technical Details**:
- Bad: "1920x1080 resolution 24fps H.264 codec"
- Bad: "120 BPM in C major with I-V-vi-IV progression"
- (API handles technical parameters automatically)
4. **Don't Use Negations**:
- Bad: "Not blurry, not dark, not boring"
- Good: "Sharp, bright, engaging"
5. **Don't Mix Multiple Unrelated Concepts**:
- Bad: "Cat playing piano in space while cooking pasta"
- Better: Split into multiple generations or choose one focus
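Some of these rules can be checked automatically before a prompt is submitted. The sketch below is a heuristic linter with hypothetical names; its word lists and patterns are illustrative assumptions drawn from the examples above, not an exhaustive or official rule set.

```python
import re
from typing import List

NEGATION_PATTERN = re.compile(r"\b(not|no|never|without|don't)\b", re.IGNORECASE)
CONTRADICTORY_PAIRS = [
    ("bright", "dark"),
    ("fast", "slow"),
    ("happy", "sad"),
    ("soft", "loud"),
]
TECH_SPEC_PATTERN = re.compile(
    r"\b(\d{3,4}x\d{3,4}|\d+\s?fps|\d+\s?bpm|h\.264)\b", re.IGNORECASE
)


def lint_prompt(prompt: str) -> List[str]:
    """Return warning labels for negations, contradictions, and tech specs."""
    warnings: List[str] = []
    words = set(re.findall(r"[a-z']+", prompt.lower()))
    if NEGATION_PATTERN.search(prompt):
        warnings.append("negation")
    for first, second in CONTRADICTORY_PAIRS:
        if first in words and second in words:
            warnings.append(f"contradiction: {first}/{second}")
    if TECH_SPEC_PATTERN.search(prompt):
        warnings.append("technical-spec")
    return warnings
```

An empty list means none of these heuristics fired; it does not guarantee a good prompt.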
---
## Advanced Techniques
### Chaining Outputs
Use output from one API as input to another:
```python
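# Sketch only: `client` is assumed to expose all three capabilities, and
# `get_result_url` is a hypothetical helper that polls a task and returns
# its first output URL. The text-to-music client in this package provides neither.

# 1. Generate image
img_task = client.image_to_image(
    prompt="Modern office space, clean and minimal",
    image="reference.jpg",
)
img_url = get_result_url(img_task)

# 2. Animate image
video_task = client.image_to_video(
    prompt="Camera slowly panning right, golden hour lighting",
    image=img_url,
)
video_url = get_result_url(video_task)

# 3. Add soundtrack
audio_task = client.video_soundtrack(
    video=video_url,
    prompt="Professional corporate music",
)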
```
### Iterative Refinement
1. **Start Simple**: Test with minimal prompt
2. **Evaluate Output**: What's missing or wrong?
3. **Add Details**: Incrementally add specificity
4. **Test Again**: Compare results
**Example Iteration**:
```
V1: "Car driving"
→ Too generic
V2: "Red sports car driving fast on highway"
→ Better, but lighting unclear
V3: "Red Ferrari speeding on coastal highway, golden hour sunset, dramatic"
→ Good result!
```
### A/B Testing
Generate multiple variations to compare:
```python
# Test different camera movements (`img` is the source image URL)
variants = [
    client.image_to_video(image=img, prompt="Slow zoom in", camera_move_index=5),
    client.image_to_video(image=img, prompt="Pan right", camera_move_index=12),
    client.image_to_video(image=img, prompt="Orbit around subject", camera_move_index=23),
]
```
### Consistency Across Assets
For cohesive content, maintain consistent prompt elements:
```python
# Consistent style keywords across multiple generations
style_guide = "cinematic, golden hour lighting, dramatic atmosphere"
video1 = client.text_to_video(f"Mountain landscape, {style_guide}")
video2 = client.text_to_video(f"Forest scene, {style_guide}")
video3 = client.text_to_video(f"Beach sunset, {style_guide}")
```
---
## Prompt Templates by Use Case
### Marketing/Commercial
```
"[Product name] showcased on [surface/background],
[camera movement],
studio lighting with soft shadows,
premium and elegant atmosphere"
```
### Social Media
```
"[Subject] [action],
[camera angle/movement],
vibrant colors and high energy,
engaging and eye-catching for [platform]"
```
### Documentary/Educational
```
"[Subject] in [environment],
[natural action],
natural documentary style cinematography,
informative and authentic atmosphere"
```
### Artistic/Creative
```
"[Abstract concept] visualized as [imagery],
[unique camera work],
[lighting style],
[artistic mood/style reference]"
```
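The bracketed placeholders in these templates map naturally onto format fields. The following is a minimal sketch for the Marketing/Commercial template; the field names (`product`, `surface`, `camera_movement`) and the helper names are assumptions chosen for illustration.

```python
import string
from typing import List

MARKETING_TEMPLATE = (
    "{product} showcased on {surface}, {camera_movement}, "
    "studio lighting with soft shadows, premium and elegant atmosphere"
)


def required_fields(template: str) -> List[str]:
    """List the named fields a template expects."""
    return [name for _, name, _, _ in string.Formatter().parse(template) if name]


def fill_template(template: str, **fields: str) -> str:
    """Fill a template, failing early if any placeholder is left unset."""
    missing = [name for name in required_fields(template) if name not in fields]
    if missing:
        raise KeyError(f"missing template fields: {missing}")
    return template.format(**fields)
```

Checking for missing fields first gives a single error that names every gap, so a placeholder cannot be skipped silently.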
---
## Troubleshooting
### Problem: Output doesn't match prompt
**Solutions**:
- Simplify prompt to core elements
- Remove contradictory instructions
- Be more specific about key aspects
- Check for typos or ambiguous terms
### Problem: Output quality is poor
**Solutions**:
- Add style keywords (cinematic, professional, high quality)
- Specify lighting conditions
- Mention camera quality/style
- Add atmosphere/mood descriptors
### Problem: Inconsistent results
**Solutions**:
- Use more specific prompts
- Reference specific styles or examples
- Test with variations to find optimal wording
- Use consistent style keywords across generations
### Problem: Generation fails (status=4)
**Solutions**:
- Check prompt for inappropriate content
- Simplify overly complex prompts
- Verify input files (images/videos) are valid
- Ensure parameters are within valid ranges
- Try with default parameters first
---
## Language Considerations
Tomoviee supports both **Chinese** and **English** prompts.
**Tips**:
- Use the language you're most comfortable with
- Be aware of cultural/contextual nuances
- Technical terms may work better in English
- Artistic descriptions may vary by language interpretation
**Example Comparisons**:
Chinese:
```
"一只金毛寻回犬在阳光明媚的草地上奔跑,镜头跟随,电影级画质"
```
English:
```
"A golden retriever running through a sunlit meadow, camera following, cinematic quality"
```
Both can produce excellent results - choose based on your fluency and precision needs.
---
## Summary Checklist
Before submitting your prompt, verify:
- [ ] **Clear Subject**: What's the main focus?
- [ ] **Specific Action**: What's happening?
- [ ] **Scene Context**: Where is this taking place?
- [ ] **Visual Style**: What's the look and feel?
- [ ] **Technical Parameters**: Resolution, duration, aspect ratio set correctly?
- [ ] **No Contradictions**: All elements work together?
- [ ] **Appropriate Specificity**: Not too vague, not over-specified?
A well-crafted prompt is the foundation of excellent AI-generated content. Take time to refine your prompts for best results!
Generate videos from image + text prompts using Tomoviee Image-to-Video API (`tm_img2video_b`) through Wondershare OpenAPI gateway (`https://openapi.wondersh...
---
name: tomoviee-image-to-video
description: Generate videos from image + text prompts using Tomoviee Image-to-Video API (`tm_img2video_b`) through Wondershare OpenAPI gateway (`https://openapi.wondershare.cc`). Use when users request animating a still image with motion and camera guidance.
---
# Tomoviee AI Image-to-Video
## Overview
Generate a 5-second video from a still image and prompt.
- API capability: `tm_img2video_b`
- Create endpoint: `https://openapi.wondershare.cc/v1/open/capacity/application/tm_img2video_b`
- Result endpoint: `https://openapi.wondershare.cc/v1/open/pub/task`
## Provider and Endpoint Provenance
Use this mapping to verify provider identity and endpoint provenance:
- Vendor portals: `https://www.tomoviee.ai` and `https://www.tomoviee.cn`
- Runtime gateway host used by this skill: `https://openapi.wondershare.cc`
- Compatible gateway alias: `https://open-api.wondershare.cc`
This skill sends runtime API calls only to `openapi.wondershare.cc`.
## Credential Handling
- Sensitive credentials required: `app_key` and `app_secret`.
- Credentials are only used to build `Authorization: Basic <base64(app_key:app_secret)>`.
- Credentials are kept in process memory and are not written to disk by this skill.
- Do not hardcode credentials in source files or commit them to git.
## Dependencies
- Runtime dependency: `requests>=2.31.0,<3.0.0`
- Install with: `pip install -r requirements.txt`
## Quick Start
### Authentication helper
```bash
python scripts/generate_auth_token.py YOUR_APP_KEY YOUR_APP_SECRET
```
### Python Client
```python
from scripts.tomoviee_img2video_client import TomovieeImg2VideoClient
client = TomovieeImg2VideoClient("app_key", "app_secret")
```
## API Usage
### Basic Example
```python
import json

# Create the task, then block until it reaches a terminal status.
task_id = client.image_to_video(
    prompt="Camera slowly pushes in, gentle motion in the scene, cinematic lighting",
    image="https://example.com/landscape.jpg",
    resolution="720p",
    duration=5,
    aspect_ratio="original",
)
result = client.poll_until_complete(task_id)

# `result["result"]` is a JSON string; `video_path` lists the output URLs.
video_url = json.loads(result["result"])["video_path"][0]
print(video_url)
```
### Parameters
- `prompt` (required): motion and scene guidance text.
- `image` (required): source image URL (`JPG/JPEG/PNG/WEBP`, under 200 MB).
- `resolution` (optional): `720p` or `1080p`, default `720p`.
- `duration` (optional): video length in seconds; only `5` is supported.
- `aspect_ratio` (optional): `16:9`, `9:16`, `4:3`, `3:4`, `1:1`, `original`.
- `camera_move_index` (optional): camera movement type `1-46`.
- `callback` (optional): callback URL.
- `params` (optional): transparent callback parameter.
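These constraints can be checked locally before a task is created. The sketch below uses a hypothetical validator name; the allowed values are copied from the parameter list above, and the defaults shown are illustrative.

```python
from typing import Optional

ALLOWED_RESOLUTIONS = {"720p", "1080p"}
ALLOWED_ASPECT_RATIOS = {"16:9", "9:16", "4:3", "3:4", "1:1", "original"}


def validate_img2video_args(
    resolution: str = "720p",
    duration: int = 5,
    aspect_ratio: str = "original",
    camera_move_index: Optional[int] = None,
) -> None:
    """Raise ValueError if any argument falls outside the documented values."""
    if resolution not in ALLOWED_RESOLUTIONS:
        raise ValueError(f"resolution must be one of {sorted(ALLOWED_RESOLUTIONS)}")
    if duration != 5:
        raise ValueError("duration must be 5")
    if aspect_ratio not in ALLOWED_ASPECT_RATIOS:
        raise ValueError(f"aspect_ratio must be one of {sorted(ALLOWED_ASPECT_RATIOS)}")
    if camera_move_index is not None and not 1 <= camera_move_index <= 46:
        raise ValueError("camera_move_index must be in range 1..46")
```

Validating up front avoids spending a request on a task that the API would reject.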
## Async Workflow
1. Create task and get `task_id`
2. Poll with `poll_until_complete(task_id)`
3. Parse video URL from `result`
Status codes:
- `1` queued
- `2` processing
- `3` success
- `4` failed
- `5` cancelled
- `6` timeout
## Resources
- `scripts/tomoviee_img2video_client.py` - main API client
- `scripts/tomoviee_image_to_video_client.py` - compatibility import shim
- `scripts/generate_auth_token.py` - auth token helper
- `references/video_apis.md` - API reference and constraints
- `references/camera_movements.md` - camera movement index reference
- `references/prompt_guide.md` - prompt writing guidance
## External Resources
- Developer portal (global): `https://www.tomoviee.ai/developers.html`
- API docs (global): `https://www.tomoviee.ai/doc/ai-video/image-to-video.html`
- Developer portal (mainland): `https://www.tomoviee.cn/developers.html`
- API docs (mainland): `https://www.tomoviee.cn/doc/ai-video/image-to-video.html`
FILE:requirements.txt
requests>=2.31.0,<3.0.0
FILE:_meta.json
{
"ownerId": "kn7bn8fv79mjtchs17087tdtsx82fc86",
"slug": "tomoviee-image-to-video",
"version": "1.0.1",
"publishedAt": 1772934161125
}
FILE:scripts/generate_auth_token.py
#!/usr/bin/env python3
"""
Generate authentication token for Tomoviee API.

Usage:
    python generate_auth_token.py <app_key> <app_secret>

Output:
    Base64 encoded access_token in the format required by Authorization header
"""
import base64
import sys


def generate_access_token(app_key: str, app_secret: str) -> str:
    """
    Generate access token for Tomoviee API authentication.

    Args:
        app_key: Application key from Tomoviee console
        app_secret: Application secret from Tomoviee console

    Returns:
        Base64 encoded string in format: base64(app_key:app_secret)
    """
    credentials = f"{app_key}:{app_secret}"
    access_token = base64.b64encode(credentials.encode()).decode()
    return access_token


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python generate_auth_token.py <app_key> <app_secret>")
        sys.exit(1)

    app_key = sys.argv[1]
    app_secret = sys.argv[2]
    token = generate_access_token(app_key, app_secret)
    print(f"Access Token: {token}")
    print(f"\nUse in Authorization header as: Basic {token}")
FILE:scripts/tomoviee_image_to_video_client.py
#!/usr/bin/env python3
"""Compatibility shim. Prefer importing TomovieeImg2VideoClient from tomoviee_img2video_client."""
try:
    from scripts.tomoviee_img2video_client import TomovieeClient, TomovieeImg2VideoClient
except ImportError:
    # Fall back to a sibling import when run from inside the scripts directory.
    from tomoviee_img2video_client import TomovieeClient, TomovieeImg2VideoClient

__all__ = ["TomovieeClient", "TomovieeImg2VideoClient"]
FILE:scripts/tomoviee_img2video_client.py
#!/usr/bin/env python3
"""Tomoviee AI - Image-to-Video API client."""
import base64
import json
import time
from typing import Any, Dict, Optional

import requests


class TomovieeImg2VideoClient:
    """Image-to-Video API client for Tomoviee AI."""

    BASE_URL = "https://openapi.wondershare.cc/v1/open/capacity/application"
    RESULT_ENDPOINT = "https://openapi.wondershare.cc/v1/open/pub/task"
    ENDPOINT = "tm_img2video_b"
    REQUEST_TIMEOUT = 60  # seconds, applied to each HTTP request

    def __init__(self, app_key: str, app_secret: str):
        self.app_key = app_key
        self.access_token = self._generate_token(app_key, app_secret)

    def _generate_token(self, app_key: str, app_secret: str) -> str:
        credentials = f"{app_key}:{app_secret}"
        return base64.b64encode(credentials.encode()).decode()

    def _get_headers(self) -> Dict[str, str]:
        return {
            "Content-Type": "application/json",
            "X-App-Key": self.app_key,
            "Authorization": f"Basic {self.access_token}",
        }

    def _safe_json(self, response: requests.Response) -> Dict[str, Any]:
        try:
            return response.json()
        except ValueError as exc:
            raise Exception(f"Invalid JSON response: {response.text}") from exc

    def _make_request(self, payload: Dict[str, Any]) -> str:
        url = f"{self.BASE_URL}/{self.ENDPOINT}"
        response = requests.post(
            url,
            headers=self._get_headers(),
            json=payload,
            timeout=self.REQUEST_TIMEOUT,
        )
        response.raise_for_status()
        result = self._safe_json(response)
        if result.get("code") != 0:
            raise Exception(f"API Error: {result.get('msg', 'Unknown error')}")
        # `data` may be present but null, so fall back to an empty dict.
        task_id = (result.get("data") or {}).get("task_id")
        if not task_id:
            raise Exception(f"Missing task_id in response: {result}")
        return task_id

    def image_to_video(
        self,
        prompt: str,
        image: str,
        resolution: str = "720p",
        duration: int = 5,
        aspect_ratio: str = "16:9",
        camera_move_index: Optional[int] = None,
        callback: Optional[str] = None,
        params: Optional[str] = None,
    ) -> str:
        """Create an image-to-video task and return task_id."""
        if duration != 5:
            raise ValueError("duration must be 5 for tm_img2video_b")

        payload: Dict[str, Any] = {
            "prompt": prompt,
            "image": image,
            "resolution": resolution,
            "duration": duration,
            "aspect_ratio": aspect_ratio,
        }
        # Optional fields are sent only when the caller provides them.
        if camera_move_index is not None:
            payload["camera_move_index"] = camera_move_index
        if callback:
            payload["callback"] = callback
        if params:
            payload["params"] = params
        return self._make_request(payload)

    def get_result(self, task_id: str) -> Dict[str, Any]:
        response = requests.post(
            self.RESULT_ENDPOINT,
            headers=self._get_headers(),
            json={"task_id": task_id},
            timeout=self.REQUEST_TIMEOUT,
        )
        response.raise_for_status()
        result = self._safe_json(response)
        if result.get("code") != 0 and not result.get("data"):
            raise Exception(f"API Error: {result.get('msg', 'Unknown error')}")
        data = result.get("data")
        if not data:
            raise Exception(f"Missing task data in response: {result}")
        return data

    def poll_until_complete(
        self,
        task_id: str,
        poll_interval: int = 10,
        timeout: int = 600,
    ) -> Dict[str, Any]:
        elapsed = 0
        while elapsed < timeout:
            result = self.get_result(task_id)
            status = result.get("status")
            if status == 3:  # success
                return result
            if status in [4, 5, 6]:  # failed, cancelled, timeout
                raise Exception(f"Task failed: {result.get('reason', 'Unknown error')}")
            time.sleep(poll_interval)
            elapsed += poll_interval
        raise TimeoutError(f"Task did not complete within {timeout} seconds")


# Backward-compatible alias
TomovieeClient = TomovieeImg2VideoClient
if __name__ == "__main__":
import sys
if len(sys.argv) < 5:
print(
"Usage: python scripts/tomoviee_img2video_client.py "
"<app_key> <app_secret> <prompt> <image_url> [resolution] [aspect_ratio]"
)
sys.exit(1)
app_key = sys.argv[1]
app_secret = sys.argv[2]
prompt = sys.argv[3]
image = sys.argv[4]
resolution = sys.argv[5] if len(sys.argv) > 5 else "720p"
aspect_ratio = sys.argv[6] if len(sys.argv) > 6 else "16:9"
client = TomovieeImg2VideoClient(app_key, app_secret)
try:
print("Creating image-to-video task...")
task_id = client.image_to_video(
prompt=prompt,
image=image,
resolution=resolution,
aspect_ratio=aspect_ratio,
)
print(f"Task created: {task_id}")
print("Polling for result...")
result = client.poll_until_complete(task_id)
print("\nTask completed")
print(f"Progress: {result.get('progress', 'N/A')}%")
result_data = json.loads(result["result"])
print(json.dumps(result_data, indent=2, ensure_ascii=False))
except Exception as exc:
print(f"Error: {exc}")
sys.exit(1)
FILE:references/camera_movements.md
# Camera Movement Types Reference
Complete list of camera movement types for `camera_move_index` parameter.
## Movement Types (1-46)
| Index | Type | Description |
|-------|------|-------------|
| 1 | orbit | Camera circles around subject |
| 2 | spin | Rotating motion |
| 3 | pan left | Camera pans to the left |
| 4 | pan right | Camera pans to the right |
| 5 | tilt up | Camera tilts upward |
| 6 | tilt down | Camera tilts downward |
| 7 | push in | Camera moves closer to subject |
| 8 | pull out | Camera moves away from subject |
| 9 | static | No camera movement |
| 10 | tracking | Camera follows subject movement |
| 11 | others | Unspecified movement |
| 12 | object pov | Point of view from object perspective |
| 13 | super dolly in | Dramatic push into scene |
| 14 | super dolly out | Dramatic pull from scene |
| 15 | snorricam | Camera fixed to subject while rotating |
| 16 | head tracking | Follows head/face movement |
| 17 | car grip | Camera mounted on vehicle |
| 18 | screen transition | Transition effect movement |
| 19 | car chasing | Following vehicle action |
| 20 | fisheye | Wide-angle distortion effect |
| 21 | FPV drone | First-person drone perspective |
| 22 | crane over the head | Overhead crane shot |
| 23 | timelapse landscape | Time-lapse scenery |
| 24 | dolly in | Smooth push toward subject |
| 25 | dolly out | Smooth pull from subject |
| 26 | zoom in | Lens zooms closer |
| 27 | zoom out | Lens zooms further |
| 28 | full shot | Wide establishing shot |
| 29 | close-up shot | Tight framing on subject |
| 30 | extreme close-up | Very tight detail shot |
| 31 | Macro shot | Extreme close-up of small details |
| 32 | bird's-eye view | Overhead perspective |
| 33 | rule of thirds | Compositional guideline |
| 34 | symmetrical composition | Balanced framing |
| 35 | handheld | Shaky, documentary-style |
| 36 | FPV shot | First-person view |
| 37 | jib up | Crane moves upward |
| 38 | jib down | Crane moves downward |
| 39 | full shot | Complete subject in frame |
| 40 | Time lapse shot | Compressed time progression |
| 41 | aerial shot | High-altitude view |
| 42 | low angle shot | Camera positioned below subject |
| 43 | Eye-level shot | Camera at subject's eye level |
| 44 | diagonal composition | Angled framing |
| 45 | over shoulder shot | View from behind subject |
| 46 | crane down | Crane descends |
## Usage Example
```python
# Static shot with no camera movement
client.image_to_video(
    prompt="The cat turns its head to look at the camera",
    image="https://example.com/cat.jpg",
    camera_move_index=9,  # static
)
# Dramatic push in shot
client.image_to_video(
    prompt="The girl suddenly turns her head, picks up a wireless earbud with her right hand, and puts it in her ear",
    image="https://example.com/girl.jpg",
    camera_move_index=13,  # super dolly in
)
```
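Numeric indices are easy to mistype, so a small name-to-index lookup can make call sites easier to read. The sketch below copies a handful of rows from the table above; `CAMERA_MOVES` and `camera_index` are hypothetical names and are not part of the client in `scripts/`.

```python
# Hypothetical helper: a partial name-to-index map copied from the table above.
CAMERA_MOVES = {
    "orbit": 1,
    "pan left": 3,
    "pan right": 4,
    "static": 9,
    "super dolly in": 13,
    "zoom in": 26,
    "zoom out": 27,
}


def camera_index(name: str) -> int:
    """Return the camera_move_index for a movement name (case-insensitive)."""
    key = name.strip().lower()
    if key not in CAMERA_MOVES:
        raise ValueError(f"unknown camera movement: {name!r}")
    return CAMERA_MOVES[key]
```

A call such as `camera_index("static")` can then be passed as `camera_move_index`; extend the map with further rows as needed.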
FILE:references/prompt_guide.md
# Tomoviee Prompt Engineering Guide
## Overview
This guide provides structured prompt formulas and best practices for all Tomoviee AI APIs to achieve optimal generation results.
---
## Video APIs
### Text-to-Video Prompt Formula
```
主体(描述) + 运动 + 场景(描述) + (镜头语言 + 光影 + 氛围)
Subject (description) + Motion + Scene (description) + (Camera + Lighting + Atmosphere)
```
**Component Breakdown**:
1. **Subject (主体)**: Main focus of the video
   - Who/what is in the scene
   - Key characteristics, appearance
   - Position and pose
2. **Motion (运动)**: Action and movement
   - What the subject is doing
   - Speed and direction of movement
   - Dynamic elements
3. **Scene (场景)**: Environment and context
   - Location and setting
   - Background elements
   - Time of day/season
4. **Camera (镜头语言)**: Optional camera work
   - Camera angle (wide, close-up, aerial, etc.)
   - Camera movement (pan, zoom, tracking, etc.)
   - Or specify `camera_move_index` parameter
5. **Lighting (光影)**: Optional lighting style
   - Natural/artificial light
   - Time of day (golden hour, blue hour, etc.)
   - Light direction and quality
6. **Atmosphere (氛围)**: Optional mood and tone
   - Overall feeling (dramatic, peaceful, energetic)
   - Color palette/grading
   - Weather/environmental effects
**Examples**:
**Minimal** (Subject + Motion + Scene):
```
"A red sports car driving fast on a coastal highway at sunset"
```
**Standard** (+ Camera + Lighting):
```
"A red sports car speeding along a winding coastal highway at golden hour,
camera following from the side, warm orange sunlight reflecting off the car"
```
**Detailed** (+ Atmosphere):
```
"A sleek red Ferrari speeding along a dramatic coastal highway carved into cliffs,
camera tracking smoothly from the side at car level,
golden hour sunset casting long shadows and warm orange glow,
cinematic and epic atmosphere with ocean waves crashing below"
```
**More Examples by Use Case**:
**Product Showcase**:
```
"White wireless headphones slowly rotating on a minimalist white surface,
studio lighting with soft shadows, clean and modern atmosphere"
```
**Nature/Travel**:
```
"Majestic waterfall cascading down mossy rocks in a lush rainforest,
slow zoom in from wide to medium shot,
dappled sunlight filtering through the canopy,
serene and peaceful atmosphere"
```
**Action/Sports**:
```
"Professional skateboarder performing a kickflip on urban street ramp,
slow motion capture from low angle,
bright daylight with high contrast shadows,
energetic and dynamic atmosphere"
```
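The text-to-video formula can also be applied programmatically when prompts are assembled from structured fields. The following is a minimal sketch, assuming the three required components read naturally as one clause and the optional ones are appended as comma-separated phrases; `build_video_prompt` is a hypothetical helper, not part of any Tomoviee client.

```python
from typing import Optional


def build_video_prompt(
    subject: str,
    motion: str,
    scene: str,
    camera: Optional[str] = None,
    lighting: Optional[str] = None,
    atmosphere: Optional[str] = None,
) -> str:
    """Join Subject + Motion + Scene, then append any optional components."""
    core = " ".join([subject, motion, scene])
    # Skip optional components that were left empty.
    extras = [part for part in (camera, lighting, atmosphere) if part]
    return ", ".join([core] + extras)
```

Calling it with only the first three arguments reproduces the "Minimal" example; passing `camera` and `lighting` yields a prompt in the "Standard" shape.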
### Image-to-Video Prompt Formula
```
主体 + 运动 + 镜头语言
Subject + Motion + Camera
```
**Why simpler?** The image already provides:
- Scene and environment
- Lighting and color palette
- Composition and framing
- Atmosphere and mood
**Your prompt should focus on**:
- Motion to add to the static image
- Camera movement to apply
- Any new dynamic elements
**Examples**:
```
"Camera slowly zooming in, subject's hair gently blowing in the wind"
```
```
"Slow pan from left to right, leaves rustling, golden hour lighting"
```
```
"Camera orbiting around the subject, dramatic lighting, cinematic feel"
```
```
"Gentle push in toward the subject's face, bokeh background, emotional atmosphere"
```
### Video Continuation Prompt Formula
```
延续的动作 + 场景变化 + 镜头延续
Continued action + Scene evolution + Camera continuation
```
**Focus on**:
- How the action continues from the last frame
- Natural progression of movement
- Scene changes or transitions
- Camera movement consistency
**Examples**:
```
"The bird continues flying higher, soaring into the clouds,
camera following the ascent"
```
```
"The car continues down the road, passing through a tunnel,
maintaining tracking shot"
```
```
"The person continues walking, entering a brightly lit room,
camera follows smoothly"
```
---
## Image APIs
### Image-to-Image Prompt Formula
```
参考图描述 + 保留要素 + 修改/新增指令
Reference description + Elements to preserve + Modifications/additions
```
**Component Breakdown**:
1. **Reference Description**: What's in the original image
2. **Preserve**: Explicitly state what to keep unchanged
3. **Modify/Add**: What to change or add
**Examples**:
```
"A woman in business attire at a modern office,
preserve facial features and body pose,
change background to outdoor garden with natural lighting,
add warm sunset atmosphere"
```
```
"Portrait of a man wearing a blue shirt,
keep facial features and expression,
change clothing to formal black suit with tie,
maintain studio lighting"
```
```
"Kitchen interior with white cabinets,
preserve layout and appliance positions,
change color scheme to navy blue cabinets with gold hardware,
add marble countertops"
```
### Image Redrawing Prompt Formula
```
替换区域的新内容描述
Description of what replaces the masked area
```
**Focus on**:
- What should appear in the masked region
- Style/texture matching surrounding areas
- Lighting consistency
- Natural integration
**Examples**:
```
"Clear blue sky with white fluffy clouds"
(for replacing background)
```
```
"Natural grass texture with small wildflowers"
(for replacing foreground)
```
```
"Modern glass windows with reflections of cityscape"
(for replacing building facade)
```
```
"Empty wooden table surface with subtle wood grain"
(for removing objects from table)
```
### Image Recognition Prompt Formula
```
要识别的对象/区域描述
Description of objects/regions to identify
```
**Focus on**:
- Clear object identification
- Specific vs. general (depending on need)
- Spatial context if helpful
**Examples**:
```
"person" / "all people"
```
```
"the red car in the foreground"
```
```
"sky and clouds"
```
```
"text and logos"
```
```
"background behind the main subject"
```
---
## Audio APIs
### Text-to-Music Prompt Formula
```
主体 + 场景(氛围/风格)
Subject/Theme + Scene (Atmosphere/Style)
```
**Component Breakdown**:
1. **Subject/Theme**: Purpose or topic of the music
2. **Atmosphere**: Mood and emotional quality
3. **Style/Genre**: Musical style and genre
4. **Instruments** (optional): Key instruments to feature
**Examples**:
**Simple**:
```
"Upbeat electronic music, energetic and modern"
```
**Standard**:
```
"Corporate presentation background music, professional and clean,
soft piano and ambient synth"
```
**Detailed**:
```
"Epic cinematic orchestral score, dramatic and heroic,
soaring strings and powerful brass,
Hans Zimmer style, for action scene"
```
**By Genre**:
**Electronic/Pop**:
```
"Energetic EDM track, festival atmosphere, heavy bass and synth drops"
```
**Jazz**:
```
"Smooth jazz for evening ambience, sophisticated and relaxed,
piano trio with double bass"
```
**Orchestral**:
```
"Emotional film score, contemplative and moving,
piano and string ensemble"
```
**Ambient**:
```
"Peaceful meditation music, calm and spacious,
soft pads and gentle chimes"
```
### Text-to-Sound-Effect Prompt Formula
```
声音来源 + 动作/环境 + 特征
Sound source + Action/Context + Characteristics
```
**Component Breakdown**:
1. **Sound Source**: What's making the sound
2. **Action/Context**: What's happening / where it is
3. **Characteristics**: Sound qualities (loud, soft, sharp, etc.)
**Examples**:
**Single Events**:
```
"Glass bottle shattering on concrete, sharp and crisp"
```
```
"Car door closing, solid thunk, reverb in garage"
```
```
"Notification bell sound, soft and pleasant"
```
**Continuous Sounds**:
```
"Heavy rain on metal roof, steady and rhythmic"
```
```
"City traffic ambience, cars passing, distant sirens"
```
```
"Forest birds chirping, peaceful morning atmosphere"
```
**By Category**:
**Nature**:
```
"Ocean waves crashing on beach, powerful and continuous"
```
**Mechanical**:
```
"Old typewriter keys typing, mechanical clicks and dings"
```
**Human**:
```
"Crowd cheering and applauding, enthusiastic and loud"
```
**UI/Digital**:
```
"Futuristic UI beep, high-tech and clean"
```
### Text-to-Speech Tips
**Structure**:
- Write naturally as you would speak
- Use punctuation for pacing and pauses
- Spell out numbers and abbreviations
**Examples**:
**With Pacing**:
```
"Welcome to our platform... Let me show you around."
(Ellipsis adds natural pause)
```
**With Emphasis**:
```
"This is extremely important. Pay close attention."
(Period creates strong pause for emphasis)
```
**Numbers**:
```
"Call one, eight hundred, five five five, zero one two three"
(Instead of "Call 1-800-555-0123")
```
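Spelling out digits can be automated before text is sent to a speech API. The sketch below reads digits one at a time and drops separators; it does not attempt grouped readings such as "eight hundred". `DIGIT_WORDS` and `spell_digits` are hypothetical names used only for illustration.

```python
DIGIT_WORDS = [
    "zero", "one", "two", "three", "four",
    "five", "six", "seven", "eight", "nine",
]


def spell_digits(text: str) -> str:
    """Spell each ASCII digit in text as a word, ignoring other characters."""
    return " ".join(DIGIT_WORDS[int(ch)] for ch in text if ch in "0123456789")
```

For instance, `spell_digits("555-0123")` gives the "five five five zero one two three" portion of the example above.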
### Video Soundtrack Prompt Formula
```
风格/类型 + 情绪 + (节奏/能量)
Style/Genre + Mood + (Optional: pacing/energy)
```
**Why simpler?** The API analyzes video content automatically.
**Examples**:
**Minimal** (let API analyze):
```
(no prompt - fully automatic based on video)
```
**Guided Style**:
```
"Upbeat travel vlog music, adventurous and inspiring"
```
```
"Corporate tech presentation music, modern and professional"
```
```
"Emotional documentary score, contemplative and moving"
```
---
## General Best Practices
### Do's ✅
1. **Be Specific**: Clear descriptions yield better results
   - Good: "Golden retriever puppy playing with red ball"
   - Bad: "Dog playing"
2. **Use Descriptive Adjectives**: Add relevant details
   - Good: "Sleek modern smartphone with edge-to-edge display"
   - Bad: "Phone"
3. **Specify Mood/Atmosphere**: Set the tone
   - Good: "Dramatic sunset with vibrant orange and purple clouds"
   - Bad: "Sunset"
4. **Include Context**: Where, when, why
   - Good: "Professional product photography on white background"
   - Bad: "Product"
5. **Mention Key Visual Elements**:
   - Lighting conditions
   - Color palette
   - Composition style
   - Movement characteristics
### Don'ts ❌
1. **Don't Be Vague**:
   - Bad: "Nice video"
   - Bad: "Good music"
   - Bad: "Cool image"
2. **Don't Contradict**:
   - Bad: "Bright and dark scene"
   - Bad: "Fast and slow motion"
   - Bad: "Happy sad music"
3. **Don't Over-specify Technical Details**:
   - Bad: "1920x1080 resolution 24fps H.264 codec"
   - Bad: "120 BPM in C major with I-V-vi-IV progression"
   - (API handles technical parameters automatically)
4. **Don't Use Negations**:
   - Bad: "Not blurry, not dark, not boring"
   - Good: "Sharp, bright, engaging"
5. **Don't Mix Multiple Unrelated Concepts**:
   - Bad: "Cat playing piano in space while cooking pasta"
   - Better: Split into multiple generations or choose one focus
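Some of the "Don'ts" above can be caught with a rough automated check before a prompt is submitted. The sketch below is a heuristic only: the word-count threshold and both regular expressions are assumptions chosen to match the "Bad" examples in this list, and `lint_prompt` is a hypothetical helper.

```python
import re
from typing import List

# Assumed patterns, tuned to the "Bad" examples above rather than to any API rule.
NEGATION = re.compile(r"\b(not|no|never|without|don't)\b", re.IGNORECASE)
TECHNICAL = re.compile(r"\b(\d+x\d+|\d+\s*fps|\d+\s*bpm|h\.264)\b", re.IGNORECASE)


def lint_prompt(prompt: str) -> List[str]:
    """Return warning labels for vague, negated, or over-technical prompts."""
    warnings = []
    if len(prompt.split()) < 3:
        warnings.append("vague")
    if NEGATION.search(prompt):
        warnings.append("negation")
    if TECHNICAL.search(prompt):
        warnings.append("technical")
    return warnings
```

An empty list means none of these heuristics fired; it does not guarantee a good prompt.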
---
## Advanced Techniques
### Chaining Outputs
Use output from one API as input to another:
```python
# Illustrative only: image_to_image, video_soundtrack, and get_result_url
# are not defined by the client in this package.
# 1. Generate image
img_task = client.image_to_image(
    prompt="Modern office space, clean and minimal",
    image="reference.jpg",
)
img_url = get_result_url(img_task)
# 2. Animate image
video_task = client.image_to_video(
    prompt="Camera slowly panning right, golden hour lighting",
    image=img_url,
)
video_url = get_result_url(video_task)
# 3. Add soundtrack
audio_task = client.video_soundtrack(
    video=video_url,
    prompt="Professional corporate music",
)
```
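The chaining snippet relies on a way to turn a finished task into an output URL. One possible shape for the video step is sketched below, assuming the task data returned by `poll_until_complete` carries a `result` field holding a JSON string with a `video_path` list, as in the text-to-video quick start. `extract_video_url` is a hypothetical helper, and image or audio results may use different field names.

```python
import json
from typing import Any, Dict


def extract_video_url(task_data: Dict[str, Any]) -> str:
    """Return the first video URL from a completed task's data."""
    # Assumption: "result" is a JSON-encoded string, not a nested dict.
    payload = json.loads(task_data["result"])
    paths = payload.get("video_path") or []
    if not paths:
        raise ValueError("no video_path in task result")
    return paths[0]
```

With the client in this package, a call would look like `extract_video_url(client.poll_until_complete(task_id))`.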
### Iterative Refinement
1. **Start Simple**: Test with minimal prompt
2. **Evaluate Output**: What's missing or wrong?
3. **Add Details**: Incrementally add specificity
4. **Test Again**: Compare results
**Example Iteration**:
```
V1: "Car driving"
→ Too generic
V2: "Red sports car driving fast on highway"
→ Better, but lighting unclear
V3: "Red Ferrari speeding on coastal highway, golden hour sunset, dramatic"
→ Good result!
```
### A/B Testing
Generate multiple variations to compare:
```python
# Test different camera movements
variants = [
    client.image_to_video(image=img, prompt="Slow zoom in", camera_move_index=26),  # zoom in
    client.image_to_video(image=img, prompt="Pan right", camera_move_index=4),  # pan right
    client.image_to_video(image=img, prompt="Orbit around subject", camera_move_index=1),  # orbit
]
```
### Consistency Across Assets
For cohesive content, maintain consistent prompt elements:
```python
# Consistent style keywords across multiple generations
style_guide = "cinematic, golden hour lighting, dramatic atmosphere"
video1 = client.text_to_video(f"Mountain landscape, {style_guide}")
video2 = client.text_to_video(f"Forest scene, {style_guide}")
video3 = client.text_to_video(f"Beach sunset, {style_guide}")
```
---
## Prompt Templates by Use Case
### Marketing/Commercial
```
"[Product name] showcased on [surface/background],
[camera movement],
studio lighting with soft shadows,
premium and elegant atmosphere"
```
### Social Media
```
"[Subject] [action],
[camera angle/movement],
vibrant colors and high energy,
engaging and eye-catching for [platform]"
```
### Documentary/Educational
```
"[Subject] in [environment],
[natural action],
natural documentary style cinematography,
informative and authentic atmosphere"
```
### Artistic/Creative
```
"[Abstract concept] visualized as [imagery],
[unique camera work],
[lighting style],
[artistic mood/style reference]"
```
---
## Troubleshooting
### Problem: Output doesn't match prompt
**Solutions**:
- Simplify prompt to core elements
- Remove contradictory instructions
- Be more specific about key aspects
- Check for typos or ambiguous terms
### Problem: Output quality is poor
**Solutions**:
- Add style keywords (cinematic, professional, high quality)
- Specify lighting conditions
- Mention camera quality/style
- Add atmosphere/mood descriptors
### Problem: Inconsistent results
**Solutions**:
- Use more specific prompts
- Reference specific styles or examples
- Test with variations to find optimal wording
- Use consistent style keywords across generations
### Problem: Generation fails (status=4)
**Solutions**:
- Check prompt for inappropriate content
- Simplify overly complex prompts
- Verify input files (images/videos) are valid
- Ensure parameters are within valid ranges
- Try with default parameters first
---
## Language Considerations
Tomoviee supports both **Chinese** and **English** prompts.
**Tips**:
- Use language you're most comfortable with
- Be aware of cultural/contextual nuances
- Technical terms may work better in English
- Artistic descriptions may vary by language interpretation
**Example Comparisons**:
Chinese:
```
"一只金毛寻回犬在阳光明媚的草地上奔跑,镜头跟随,电影级画质"
```
English:
```
"A golden retriever running through a sunlit meadow, camera following, cinematic quality"
```
Both can produce excellent results - choose based on your fluency and precision needs.
---
## Summary Checklist
Before submitting your prompt, verify:
- [ ] **Clear Subject**: What's the main focus?
- [ ] **Specific Action**: What's happening?
- [ ] **Scene Context**: Where is this taking place?
- [ ] **Visual Style**: What's the look and feel?
- [ ] **Technical Parameters**: Resolution, duration, aspect ratio set correctly?
- [ ] **No Contradictions**: All elements work together?
- [ ] **Appropriate Specificity**: Not too vague, not over-specified?
A well-crafted prompt is the foundation of excellent AI-generated content. Take time to refine your prompts for best results!
FILE:references/video_apis.md
# Tomoviee Image-to-Video API Reference
## Provenance and Endpoint Mapping
- Vendor portals: `https://www.tomoviee.ai` and `https://www.tomoviee.cn`
- Runtime gateway host used by this skill package: `https://openapi.wondershare.cc`
- Compatible gateway alias: `https://open-api.wondershare.cc`
- Primary capability in this skill: `tm_img2video_b`
All runtime requests from this skill target only:
1. `https://openapi.wondershare.cc/v1/open/capacity/application/tm_img2video_b`
2. `https://openapi.wondershare.cc/v1/open/pub/task`
## Image-to-Video (tm_img2video_b)
Generate a 5-second video from a still image and prompt.
### Parameters
- `prompt` (required): motion guidance text
- `image` (required): source image URL
- `resolution` (optional): `720p` or `1080p`, default `720p`
- `duration` (optional): only `5` supported
- `aspect_ratio` (optional): `16:9`, `9:16`, `4:3`, `3:4`, `1:1`, `original`
- `camera_move_index` (optional): camera movement index `1-46`
- `callback` (optional): callback URL
- `params` (optional): transparent callback passthrough
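Checking parameters locally avoids spending a request on values the API would reject. The sketch below encodes only the allowed values listed above; `validate_img2video_params` is a hypothetical helper and the shipped client performs only the `duration` check itself.

```python
from typing import Optional

RESOLUTIONS = {"720p", "1080p"}
ASPECT_RATIOS = {"16:9", "9:16", "4:3", "3:4", "1:1", "original"}


def validate_img2video_params(
    resolution: str = "720p",
    duration: int = 5,
    aspect_ratio: str = "16:9",
    camera_move_index: Optional[int] = None,
) -> None:
    """Raise ValueError if any argument falls outside the documented values."""
    if resolution not in RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution}")
    if duration != 5:
        raise ValueError("duration must be 5")
    if aspect_ratio not in ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio}")
    if camera_move_index is not None and not 1 <= camera_move_index <= 46:
        raise ValueError("camera_move_index must be between 1 and 46")
```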
### Input Constraints
- File size: less than `200 MB`
- Formats: `JPG`, `JPEG`, `PNG`, `WEBP`
### Result Endpoint
`https://openapi.wondershare.cc/v1/open/pub/task`
### Status Codes
- `1` queued
- `2` processing
- `3` success
- `4` failed
- `5` cancelled
- `6` timeout
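Named constants can make status handling easier to follow than bare integers. The enum below mirrors the list above; `TaskStatus` and `is_terminal` are hypothetical names and are not exported by the client.

```python
from enum import IntEnum


class TaskStatus(IntEnum):
    """Task status codes as listed in this reference."""
    QUEUED = 1
    PROCESSING = 2
    SUCCESS = 3
    FAILED = 4
    CANCELLED = 5
    TIMEOUT = 6


def is_terminal(status: int) -> bool:
    """Return True once a task can no longer change state."""
    return TaskStatus(status) not in (TaskStatus.QUEUED, TaskStatus.PROCESSING)
```

Because `TaskStatus` is an `IntEnum`, comparisons such as `data["status"] == TaskStatus.SUCCESS` work directly against the integer returned by the API.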
## Credential and Dependency Notes
- Sensitive credentials required: `app_key`, `app_secret`
- Auth pattern: `Authorization: Basic <base64(app_key:app_secret)>`
- Runtime dependency: `requests>=2.31.0,<3.0.0`
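The auth pattern can be reproduced outside the client when calling the gateway from other tooling. The sketch below builds the same three headers the client sends; `build_headers` is a hypothetical standalone function, and credentials should come from a secret store rather than source code.

```python
import base64
from typing import Dict


def build_headers(app_key: str, app_secret: str) -> Dict[str, str]:
    """Return request headers using Basic base64(app_key:app_secret)."""
    token = base64.b64encode(f"{app_key}:{app_secret}".encode()).decode()
    return {
        "Content-Type": "application/json",
        "X-App-Key": app_key,
        "Authorization": f"Basic {token}",
    }
```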
### Example
```python
from scripts.tomoviee_img2video_client import TomovieeImg2VideoClient
client = TomovieeImg2VideoClient("app_key", "app_secret")
task_id = client.image_to_video(
    prompt="Camera slowly pushes in with natural movement",
    image="https://example.com/image.jpg",
    resolution="720p",
    duration=5,
    aspect_ratio="original",
)
result = client.poll_until_complete(task_id)
```
Generate 5-second videos from text prompts using Tomoviee Text-to-Video API (tm_text2video_b) via Wondershare OpenAPI gateway (https://openapi.wondershare.cc...
---
name: tomoviee-text-to-video
description: Generate 5-second videos from text prompts using Tomoviee Text-to-Video API (tm_text2video_b) via Wondershare OpenAPI gateway (https://openapi.wondershare.cc). Use when users request text-to-video creation with control over resolution, aspect ratio, and camera movement.
---
# Tomoviee AI Text-to-Video
## Overview
Generate 5-second videos from text descriptions.
- API capability: `tm_text2video_b`
- Supported resolutions: `720p`, `1080p`
- Supported aspect ratios: `16:9`, `9:16`, `4:3`, `3:4`, `1:1`
- Optional camera control: `camera_move_index` (1-46)
## Provider and Endpoints
Use the following provider and endpoint mapping to keep credentials and routing consistent:
- Vendor portals: `https://www.tomoviee.ai` and `https://www.tomoviee.cn`
- API gateway host used by this skill: `https://openapi.wondershare.cc`
- Create-task endpoint pattern: `https://openapi.wondershare.cc/v1/open/capacity/application/<capacity_id>`
- Result endpoint: `https://openapi.wondershare.cc/v1/open/pub/task`
This skill sends API requests only to `openapi.wondershare.cc`.
## Quick Start
### Install dependencies
```bash
pip install -r requirements.txt
```
### Authentication
```bash
python scripts/generate_auth_token.py YOUR_APP_KEY YOUR_APP_SECRET
```
### Python Client
```python
from scripts.tomoviee_text2video_client import TomovieeText2VideoClient
client = TomovieeText2VideoClient("app_key", "app_secret")
```
## API Usage
### Basic Example
```python
import json
task_id = client.text_to_video(
    prompt="Golden retriever running through sunlit meadow, slow motion, cinematic",
    resolution="720p",
    aspect_ratio="16:9",
    camera_move_index=5,  # tilt up
)
result = client.poll_until_complete(task_id)
# The "result" field is a JSON string; parse it to reach the video URL.
video_url = json.loads(result["result"])["video_path"][0]
print(video_url)
```
### Parameters
- `prompt` (required): Text description (subject + action + scene + camera + lighting)
- `resolution`: `720p` or `1080p` (default: `720p`)
- `duration`: duration in seconds (currently only `5` is supported)
- `aspect_ratio`: `16:9`, `9:16`, `4:3`, `3:4`, `1:1`
- `camera_move_index`: camera movement type (`1-46`, optional)
- `callback`: callback URL (optional)
- `params`: transparent passthrough params (optional)
## Async Workflow
1. Create task: call `text_to_video()` and get `task_id`
2. Poll status: call `poll_until_complete(task_id)`
3. Parse result: read video URL from returned JSON
Status codes:
- `1` queued
- `2` processing
- `3` success
- `4` failed
- `5` cancelled
- `6` timeout
Typical generation time is 1-5 minutes per 5-second video.
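The polling step can be sketched independently of the HTTP client, which makes it easy to test. The version below takes any zero-argument `fetch` callable that returns task data and stops on the status codes listed above; `poll` and its `sleep` parameter are hypothetical, and the shipped `poll_until_complete` counts sleep intervals rather than using a wall-clock deadline.

```python
import time
from typing import Any, Callable, Dict


def poll(
    fetch: Callable[[], Dict[str, Any]],
    interval: float = 10,
    timeout: float = 600,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch() until status is 3 (success) or a failure status appears."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        data = fetch()
        status = data.get("status")
        if status == 3:
            return data
        if status in (4, 5, 6):  # failed, cancelled, timeout
            raise RuntimeError(f"Task failed: {data.get('reason', 'Unknown error')}")
        sleep(interval)
    raise TimeoutError(f"Task did not complete within {timeout} seconds")
```

With the client above, `fetch` could be `lambda: client.get_result(task_id)`.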
## Resources
### scripts/
- `tomoviee_text2video_client.py` - Text-to-Video API client
- `generate_auth_token.py` - auth token generator
### references/
- `video_apis.md` - detailed video API documentation
- `camera_movements.md` - camera movement index reference
- `prompt_guide.md` - prompt writing best practices
## External Resources
- Developer portal (global): `https://www.tomoviee.ai/developers.html`
- API docs (global): `https://www.tomoviee.ai/doc/`
- Developer portal (mainland): `https://www.tomoviee.cn/developers.html`
- API docs (mainland): `https://www.tomoviee.cn/doc/`
- API host used by this skill: `https://openapi.wondershare.cc`
FILE:requirements.txt
requests>=2.31.0,<3.0.0
FILE:_meta.json
{
  "ownerId": "kn7bn8fv79mjtchs17087tdtsx82fc86",
  "slug": "tomoviee-text-to-video",
  "version": "1.0.4",
  "publishedAt": 1773223635132
}
FILE:scripts/generate_auth_token.py
#!/usr/bin/env python3
"""
Generate authentication token for Tomoviee API.

Usage:
    python generate_auth_token.py <app_key> <app_secret>

Output:
    Base64 encoded access_token in the format required by Authorization header
"""

import base64
import sys


def generate_access_token(app_key: str, app_secret: str) -> str:
    """
    Generate access token for Tomoviee API authentication.

    Args:
        app_key: Application key from Tomoviee console
        app_secret: Application secret from Tomoviee console

    Returns:
        Base64 encoded string in format: base64(app_key:app_secret)
    """
    credentials = f"{app_key}:{app_secret}"
    access_token = base64.b64encode(credentials.encode()).decode()
    return access_token


if __name__ == "__main__":
    if len(sys.argv) != 3:
        print("Usage: python generate_auth_token.py <app_key> <app_secret>")
        sys.exit(1)
    app_key = sys.argv[1]
    app_secret = sys.argv[2]
    token = generate_access_token(app_key, app_secret)
    print(f"Access Token: {token}")
    print(f"\nUse in Authorization header as: Basic {token}")
FILE:scripts/tomoviee_text2video_client.py
#!/usr/bin/env python3
"""
Tomoviee AI - Text-to-Video API client
"""

import base64
import json
import time
from typing import Any, Dict, Optional

import requests


class TomovieeText2VideoClient:
    """Text-to-Video API client for Tomoviee AI."""

    # Official gateway host used by this skill package.
    BASE_URL = "https://openapi.wondershare.cc/v1/open/capacity/application"
    RESULT_ENDPOINT = "https://openapi.wondershare.cc/v1/open/pub/task"
    ENDPOINT = "tm_text2video_b"
    REQUEST_TIMEOUT = 60

    def __init__(self, app_key: str, app_secret: str):
        """Initialize Text-to-Video API client."""
        self.app_key = app_key
        self.access_token = self._generate_token(app_key, app_secret)

    def _generate_token(self, app_key: str, app_secret: str) -> str:
        """Generate base64 access token."""
        credentials = f"{app_key}:{app_secret}"
        return base64.b64encode(credentials.encode()).decode()

    def _get_headers(self) -> Dict[str, str]:
        """Get request headers with authentication."""
        return {
            "Content-Type": "application/json",
            "X-App-Key": self.app_key,
            "Authorization": f"Basic {self.access_token}",
        }

    def _safe_json(self, response: requests.Response) -> Dict[str, Any]:
        try:
            return response.json()
        except ValueError as exc:
            raise Exception(f"Invalid JSON response: {response.text}") from exc

    def _make_request(self, payload: Dict[str, Any]) -> str:
        """Make API request and return task_id."""
        url = f"{self.BASE_URL}/{self.ENDPOINT}"
        response = requests.post(
            url,
            headers=self._get_headers(),
            json=payload,
            timeout=self.REQUEST_TIMEOUT,
        )
        response.raise_for_status()
        result = self._safe_json(response)
        if result.get("code") != 0:
            raise Exception(f"API Error: {result.get('msg', 'Unknown error')}")
        task_id = result.get("data", {}).get("task_id")
        if not task_id:
            raise Exception(f"Missing task_id in response: {result}")
        return task_id

    def text_to_video(
        self,
        prompt: str,
        resolution: str = "720p",
        duration: int = 5,
        aspect_ratio: str = "16:9",
        camera_move_index: Optional[int] = None,
        callback: Optional[str] = None,
        params: Optional[str] = None,
    ) -> str:
        """Generate video from text description and return task_id."""
        payload = {
            "prompt": prompt,
            "resolution": resolution,
            "duration": duration,
            "aspect_ratio": aspect_ratio,
        }
        if camera_move_index is not None:
            payload["camera_move_index"] = camera_move_index
        if callback:
            payload["callback"] = callback
        if params:
            payload["params"] = params
        return self._make_request(payload)

    def get_result(self, task_id: str) -> Dict[str, Any]:
        """Get task result."""
        response = requests.post(
            self.RESULT_ENDPOINT,
            headers=self._get_headers(),
            json={"task_id": task_id},
            timeout=self.REQUEST_TIMEOUT,
        )
        response.raise_for_status()
        result = self._safe_json(response)
        if result.get("code") != 0 and not result.get("data"):
            raise Exception(f"API Error: {result.get('msg', 'Unknown error')}")
        data = result.get("data")
        if not data:
            raise Exception(f"Missing task data in response: {result}")
        return data

    def poll_until_complete(
        self,
        task_id: str,
        poll_interval: int = 10,
        timeout: int = 600,
    ) -> Dict[str, Any]:
        """Poll task until completion or timeout."""
        elapsed = 0
        while elapsed < timeout:
            result = self.get_result(task_id)
            status = result.get("status")
            # Status: 3=success, 4=failed, 5=cancelled, 6=timeout
            if status == 3:
                return result
            if status in [4, 5, 6]:
                raise Exception(f"Task failed: {result.get('reason', 'Unknown error')}")
            time.sleep(poll_interval)
            elapsed += poll_interval
        raise TimeoutError(f"Task did not complete within {timeout} seconds")


if __name__ == "__main__":
    import sys

    if len(sys.argv) < 4:
        print(
            "Usage: python scripts/tomoviee_text2video_client.py "
            "<app_key> <app_secret> <prompt> [resolution] [aspect_ratio]"
        )
        sys.exit(1)
    app_key = sys.argv[1]
    app_secret = sys.argv[2]
    prompt = sys.argv[3]
    resolution = sys.argv[4] if len(sys.argv) > 4 else "720p"
    aspect_ratio = sys.argv[5] if len(sys.argv) > 5 else "16:9"
    client = TomovieeText2VideoClient(app_key, app_secret)
    try:
        print("Creating text-to-video task...")
        task_id = client.text_to_video(prompt, resolution, aspect_ratio=aspect_ratio)
        print(f"Task created: {task_id}")
        print("Polling for result...")
        result = client.poll_until_complete(task_id)
        print("\nTask completed")
        print(f"Progress: {result.get('progress', 'N/A')}%")
        result_data = json.loads(result["result"])
        print(f"Result: {json.dumps(result_data, indent=2, ensure_ascii=False)}")
    except Exception as exc:
        print(f"Error: {exc}")
        sys.exit(1)
FILE:references/camera_movements.md
# Camera Movement Types Reference
Complete list of camera movement types for `camera_move_index` parameter.
## Movement Types (1-46)
| Index | Type | Description |
|-------|------|-------------|
| 1 | orbit | Camera circles around subject |
| 2 | spin | Rotating motion |
| 3 | pan left | Camera pans to the left |
| 4 | pan right | Camera pans to the right |
| 5 | tilt up | Camera tilts upward |
| 6 | tilt down | Camera tilts downward |
| 7 | push in | Camera moves closer to subject |
| 8 | pull out | Camera moves away from subject |
| 9 | static | No camera movement |
| 10 | tracking | Camera follows subject movement |
| 11 | others | Unspecified movement |
| 12 | object pov | Point of view from object perspective |
| 13 | super dolly in | Dramatic push into scene |
| 14 | super dolly out | Dramatic pull from scene |
| 15 | snorricam | Camera fixed to subject while rotating |
| 16 | head tracking | Follows head/face movement |
| 17 | car grip | Camera mounted on vehicle |
| 18 | screen transition | Transition effect movement |
| 19 | car chasing | Following vehicle action |
| 20 | fisheye | Wide-angle distortion effect |
| 21 | FPV drone | First-person drone perspective |
| 22 | crane over the head | Overhead crane shot |
| 23 | timelapse landscape | Time-lapse scenery |
| 24 | dolly in | Smooth push toward subject |
| 25 | dolly out | Smooth pull from subject |
| 26 | zoom in | Lens zooms closer |
| 27 | zoom out | Lens zooms further |
| 28 | full shot | Wide establishing shot |
| 29 | close-up shot | Tight framing on subject |
| 30 | extreme close-up | Very tight detail shot |
| 31 | Macro shot | Extreme close-up of small details |
| 32 | bird's-eye view | Overhead perspective |
| 33 | rule of thirds | Compositional guideline |
| 34 | symmetrical composition | Balanced framing |
| 35 | handheld | Shaky, documentary-style |
| 36 | FPV shot | First-person view |
| 37 | jib up | Crane moves upward |
| 38 | jib down | Crane moves downward |
| 39 | full shot | Complete subject in frame |
| 40 | Time lapse shot | Compressed time progression |
| 41 | aerial shot | High-altitude view |
| 42 | low angle shot | Camera positioned below subject |
| 43 | Eye-level shot | Camera at subject's eye level |
| 44 | diagonal composition | Angled framing |
| 45 | over shoulder shot | View from behind subject |
| 46 | crane down | Crane descends |
## Usage Example
```python
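# Static shot with no camera movement
# Prompt in English: "The cat turns its head to look at the camera"
client.create_video(
    prompt="猫咪转头看向镜头",
    image="https://example.com/cat.jpg",
    camera_move_index=9,  # static
)
# Dramatic push-in shot
# Prompt in English: "The girl suddenly turns her head, picks up wireless
# earphones with her right hand, and puts them on"
client.create_video(
    prompt="女孩突然转头,右手拿起无线耳机戴在耳朵上",
    image="https://example.com/girl.jpg",
    camera_move_index=13,  # super dolly in
)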
```
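To avoid hard-coding numeric indices, a name-based lookup can help. The following is a minimal sketch: `CAMERA_MOVES` and `camera_index` are hypothetical names, not part of the Tomoviee client, and the mapping covers only a few rows of the table above.

```python
# Hypothetical lookup table built from a subset of the rows in the table above.
CAMERA_MOVES = {
    "orbit": 1,
    "pan left": 3,
    "pan right": 4,
    "push in": 7,
    "pull out": 8,
    "static": 9,
    "super dolly in": 13,
    "zoom in": 26,
    "zoom out": 27,
}


def camera_index(name: str) -> int:
    """Return the camera_move_index for a movement name (case-insensitive)."""
    key = name.strip().lower()
    if key not in CAMERA_MOVES:
        raise ValueError(f"Unknown camera movement: {name!r}")
    return CAMERA_MOVES[key]
```

A call such as `camera_index("static")` can then be passed as the `camera_move_index` argument, which keeps the intent readable at the call site.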
FILE:references/prompt_guide.md
# Tomoviee Prompt Engineering Guide
## Overview
This guide provides structured prompt formulas and best practices for all Tomoviee AI APIs to achieve optimal generation results.
---
## Video APIs
### Text-to-Video Prompt Formula
```
主体(描述) + 运动 + 场景(描述) + (镜头语言 + 光影 + 氛围)
Subject (description) + Motion + Scene (description) + (Camera + Lighting + Atmosphere)
```
**Component Breakdown**:
1. **Subject (主体)**: Main focus of the video
   - Who/what is in the scene
   - Key characteristics, appearance
   - Position and pose
2. **Motion (运动)**: Action and movement
   - What the subject is doing
   - Speed and direction of movement
   - Dynamic elements
3. **Scene (场景)**: Environment and context
   - Location and setting
   - Background elements
   - Time of day/season
4. **Camera (镜头语言)**: Optional camera work
   - Camera angle (wide, close-up, aerial, etc.)
   - Camera movement (pan, zoom, tracking, etc.)
   - Or specify `camera_move_index` parameter
5. **Lighting (光影)**: Optional lighting style
   - Natural/artificial light
   - Time of day (golden hour, blue hour, etc.)
   - Light direction and quality
6. **Atmosphere (氛围)**: Optional mood and tone
   - Overall feeling (dramatic, peaceful, energetic)
   - Color palette/grading
   - Weather/environmental effects
**Examples**:
**Minimal** (Subject + Motion + Scene):
```
"A red sports car driving fast on a coastal highway at sunset"
```
**Standard** (+ Camera + Lighting):
```
"A red sports car speeding along a winding coastal highway at golden hour,
camera following from the side, warm orange sunlight reflecting off the car"
```
**Detailed** (+ Atmosphere):
```
"A sleek red Ferrari speeding along a dramatic coastal highway carved into cliffs,
camera tracking smoothly from the side at car level,
golden hour sunset casting long shadows and warm orange glow,
cinematic and epic atmosphere with ocean waves crashing below"
```
**More Examples by Use Case**:
**Product Showcase**:
```
"White wireless headphones slowly rotating on a minimalist white surface,
studio lighting with soft shadows, clean and modern atmosphere"
```
**Nature/Travel**:
```
"Majestic waterfall cascading down mossy rocks in a lush rainforest,
slow zoom in from wide to medium shot,
dappled sunlight filtering through the canopy,
serene and peaceful atmosphere"
```
**Action/Sports**:
```
"Professional skateboarder performing a kickflip on urban street ramp,
slow motion capture from low angle,
bright daylight with high contrast shadows,
energetic and dynamic atmosphere"
```
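The text-to-video formula above can also be assembled in code, which keeps the required and optional components explicit. This is a minimal sketch: `build_text_to_video_prompt` is a hypothetical helper, not part of any Tomoviee SDK, and joining components with commas is just one reasonable convention.

```python
from typing import Optional


def build_text_to_video_prompt(
    subject: str,
    motion: str,
    scene: str,
    camera: Optional[str] = None,
    lighting: Optional[str] = None,
    atmosphere: Optional[str] = None,
) -> str:
    """Join the formula components, skipping optional ones that are unset."""
    core = f"{subject} {motion} {scene}"
    extras = [part for part in (camera, lighting, atmosphere) if part]
    return ", ".join([core, *extras])


prompt = build_text_to_video_prompt(
    subject="A red sports car",
    motion="driving fast",
    scene="on a coastal highway at sunset",
    camera="camera following from the side",
)
print(prompt)
```

Starting with only the three required components and adding the optional ones one at a time mirrors the Minimal, Standard, and Detailed examples above.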
### Image-to-Video Prompt Formula
```
主体 + 运动 + 镜头语言
Subject + Motion + Camera
```
**Why simpler?** The image already provides:
- Scene and environment
- Lighting and color palette
- Composition and framing
- Atmosphere and mood
**Your prompt should focus on**:
- Motion to add to the static image
- Camera movement to apply
- Any new dynamic elements
**Examples**:
```
"Camera slowly zooming in, subject's hair gently blowing in the wind"
```
```
"Slow pan from left to right, leaves rustling, golden hour lighting"
```
```
"Camera orbiting around the subject, dramatic lighting, cinematic feel"
```
```
"Gentle push in toward the subject's face, bokeh background, emotional atmosphere"
```
### Video Continuation Prompt Formula
```
延续的动作 + 场景变化 + 镜头延续
Continued action + Scene evolution + Camera continuation
```
**Focus on**:
- How the action continues from the last frame
- Natural progression of movement
- Scene changes or transitions
- Camera movement consistency
**Examples**:
```
"The bird continues flying higher, soaring into the clouds,
camera following the ascent"
```
```
"The car continues down the road, passing through a tunnel,
maintaining tracking shot"
```
```
"The person continues walking, entering a brightly lit room,
camera follows smoothly"
```
---
## Image APIs
### Image-to-Image Prompt Formula
```
参考图描述 + 保留要素 + 修改/新增指令
Reference description + Elements to preserve + Modifications/additions
```
**Component Breakdown**:
1. **Reference Description**: What's in the original image
2. **Preserve**: Explicitly state what to keep unchanged
3. **Modify/Add**: What to change or add
**Examples**:
```
"A woman in business attire at a modern office,
preserve facial features and body pose,
change background to outdoor garden with natural lighting,
add warm sunset atmosphere"
```
```
"Portrait of a man wearing a blue shirt,
keep facial features and expression,
change clothing to formal black suit with tie,
maintain studio lighting"
```
```
"Kitchen interior with white cabinets,
preserve layout and appliance positions,
change color scheme to navy blue cabinets with gold hardware,
add marble countertops"
```
### Image Redrawing Prompt Formula
```
替换区域的新内容描述
Description of what replaces the masked area
```
**Focus on**:
- What should appear in the masked region
- Style/texture matching surrounding areas
- Lighting consistency
- Natural integration
**Examples**:
```
"Clear blue sky with white fluffy clouds"
(for replacing background)
```
```
"Natural grass texture with small wildflowers"
(for replacing foreground)
```
```
"Modern glass windows with reflections of cityscape"
(for replacing building facade)
```
```
"Empty wooden table surface with subtle wood grain"
(for removing objects from table)
```
### Image Recognition Prompt Formula
```
要识别的对象/区域描述
Description of objects/regions to identify
```
**Focus on**:
- Clear object identification
- Specific vs. general (depending on need)
- Spatial context if helpful
**Examples**:
```
"person" / "all people"
```
```
"the red car in the foreground"
```
```
"sky and clouds"
```
```
"text and logos"
```
```
"background behind the main subject"
```
---
## Audio APIs
### Text-to-Music Prompt Formula
```
主体 + 场景(氛围/风格)
Subject/Theme + Scene (Atmosphere/Style)
```
**Component Breakdown**:
1. **Subject/Theme**: Purpose or topic of the music
2. **Atmosphere**: Mood and emotional quality
3. **Style/Genre**: Musical style and genre
4. **Instruments** (optional): Key instruments to feature
**Examples**:
**Simple**:
```
"Upbeat electronic music, energetic and modern"
```
**Standard**:
```
"Corporate presentation background music, professional and clean,
soft piano and ambient synth"
```
**Detailed**:
```
"Epic cinematic orchestral score, dramatic and heroic,
soaring strings and powerful brass,
Hans Zimmer style, for action scene"
```
**By Genre**:
**Electronic/Pop**:
```
"Energetic EDM track, festival atmosphere, heavy bass and synth drops"
```
**Jazz**:
```
"Smooth jazz for evening ambience, sophisticated and relaxed,
piano trio with double bass"
```
**Orchestral**:
```
"Emotional film score, contemplative and moving,
piano and string ensemble"
```
**Ambient**:
```
"Peaceful meditation music, calm and spacious,
soft pads and gentle chimes"
```
### Text-to-Sound-Effect Prompt Formula
```
声音来源 + 动作/环境 + 特征
Sound source + Action/Context + Characteristics
```
**Component Breakdown**:
1. **Sound Source**: What's making the sound
2. **Action/Context**: What's happening / where it is
3. **Characteristics**: Sound qualities (loud, soft, sharp, etc.)
**Examples**:
**Single Events**:
```
"Glass bottle shattering on concrete, sharp and crisp"
```
```
"Car door closing, solid thunk, reverb in garage"
```
```
"Notification bell sound, soft and pleasant"
```
**Continuous Sounds**:
```
"Heavy rain on metal roof, steady and rhythmic"
```
```
"City traffic ambience, cars passing, distant sirens"
```
```
"Forest birds chirping, peaceful morning atmosphere"
```
**By Category**:
**Nature**:
```
"Ocean waves crashing on beach, powerful and continuous"
```
**Mechanical**:
```
"Old typewriter keys typing, mechanical clicks and dings"
```
**Human**:
```
"Crowd cheering and applauding, enthusiastic and loud"
```
**UI/Digital**:
```
"Futuristic UI beep, high-tech and clean"
```
### Text-to-Speech Tips
**Structure**:
- Write naturally as you would speak
- Use punctuation for pacing and pauses
- Spell out numbers and abbreviations
**Examples**:
**With Pacing**:
```
"Welcome to our platform... Let me show you around."
(Ellipsis adds natural pause)
```
**With Emphasis**:
```
"This is extremely important. Pay close attention."
(Period creates strong pause for emphasis)
```
**Numbers**:
```
"Call one, eight hundred, five five five, zero one two three"
(Instead of "Call 1-800-555-0123")
```
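Spelling out numbers can be automated before text is sent for speech synthesis. The sketch below is a simplification that reads every number digit by digit (so it yields "eight zero zero" rather than "eight hundred"); `spell_out_digits` is a hypothetical helper, not part of the API.

```python
import re

DIGIT_WORDS = ["zero", "one", "two", "three", "four",
               "five", "six", "seven", "eight", "nine"]


def spell_out_digits(text: str) -> str:
    """Replace each digit run (hyphens/spaces allowed inside) with digit words."""
    def replace(match):
        digits = re.sub(r"\D", "", match.group())
        return " ".join(DIGIT_WORDS[int(d)] for d in digits)

    # A run starts and ends with a digit and may contain hyphens or spaces.
    return re.sub(r"\d(?:[\d\- ]*\d)?", replace, text)


print(spell_out_digits("Call 1-800-555-0123"))
```

Digit-by-digit reading suits phone numbers and codes; quantities such as prices or years usually read better when written out by hand.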
### Video Soundtrack Prompt Formula
```
风格/类型 + 情绪 + (节奏/能量)
Style/Genre + Mood + (Optional: pacing/energy)
```
**Why simpler?** The API analyzes video content automatically.
**Examples**:
**Minimal** (let API analyze):
```
(no prompt - fully automatic based on video)
```
**Guided Style**:
```
"Upbeat travel vlog music, adventurous and inspiring"
```
```
"Corporate tech presentation music, modern and professional"
```
```
"Emotional documentary score, contemplative and moving"
```
---
## General Best Practices
### Do's ✅
1. **Be Specific**: Clear descriptions yield better results
   - Good: "Golden retriever puppy playing with red ball"
   - Bad: "Dog playing"
2. **Use Descriptive Adjectives**: Add relevant details
   - Good: "Sleek modern smartphone with edge-to-edge display"
   - Bad: "Phone"
3. **Specify Mood/Atmosphere**: Set the tone
   - Good: "Dramatic sunset with vibrant orange and purple clouds"
   - Bad: "Sunset"
4. **Include Context**: Where, when, why
   - Good: "Professional product photography on white background"
   - Bad: "Product"
5. **Mention Key Visual Elements**:
   - Lighting conditions
   - Color palette
   - Composition style
   - Movement characteristics
### Don'ts ❌
1. **Don't Be Vague**:
   - Bad: "Nice video"
   - Bad: "Good music"
   - Bad: "Cool image"
2. **Don't Contradict**:
   - Bad: "Bright and dark scene"
   - Bad: "Fast and slow motion"
   - Bad: "Happy sad music"
3. **Don't Over-specify Technical Details**:
   - Bad: "1920x1080 resolution 24fps H.264 codec"
   - Bad: "120 BPM in C major with I-V-vi-IV progression"
   - Technical settings such as resolution belong in API parameters, not in the prompt
4. **Don't Use Negations**:
   - Bad: "Not blurry, not dark, not boring"
   - Good: "Sharp, bright, engaging"
5. **Don't Mix Multiple Unrelated Concepts**:
   - Bad: "Cat playing piano in space while cooking pasta"
   - Better: Split into multiple generations or choose one focus
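
The advice against negations can be checked mechanically before a prompt is submitted. This is a rough sketch: `find_negations` is a hypothetical helper, and the word list is an illustrative assumption rather than anything the API defines.

```python
import re
from typing import List

# Illustrative word list; extend it to suit your own prompts.
NEGATION_PATTERN = re.compile(
    r"\b(?:not|no|never|without|don't|doesn't|isn't)\b", re.IGNORECASE
)


def find_negations(prompt: str) -> List[str]:
    """Return every negation word found in the prompt, lowercased."""
    return [match.group().lower() for match in NEGATION_PATTERN.finditer(prompt)]


print(find_negations("Not blurry, not dark, not boring"))
print(find_negations("Sharp, bright, engaging"))
```

A non-empty result is a hint to rephrase the prompt in positive terms, as in the "Sharp, bright, engaging" example.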
---
## Advanced Techniques
### Chaining Outputs
Use output from one API as input to another:
```python
# get_result_url is a helper that this guide does not define; it is assumed
# to wait for the task to finish and return the URL of the generated file.
# 1. Generate image
img_task = client.image_to_image(
    prompt="Modern office space, clean and minimal",
    image="reference.jpg"
)
img_url = get_result_url(img_task)
# 2. Animate image
video_task = client.image_to_video(
    prompt="Camera slowly panning right, golden hour lighting",
    image=img_url
)
video_url = get_result_url(video_task)
# 3. Add soundtrack
audio_task = client.video_soundtrack(
    video=video_url,
    prompt="Professional corporate music"
)
```
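The `get_result_url` helper used above is not defined in this guide. One possible shape is sketched below, assuming the client exposes `poll_until_complete` and returns a JSON string under `result`, as in the video API reference. The `video_path` key is documented there for video tasks only; the key for image or audio tasks is an assumption to verify against real responses.

```python
import json


def get_result_url(client, task_id, result_key="video_path"):
    """Wait for a task to finish and return the first URL under result_key."""
    result = client.poll_until_complete(task_id)
    payload = json.loads(result["result"])
    return payload[result_key][0]
```

To match the one-argument calls in the chaining example, the client can be bound in advance with `functools.partial(get_result_url, client)`.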
### Iterative Refinement
1. **Start Simple**: Test with minimal prompt
2. **Evaluate Output**: What's missing or wrong?
3. **Add Details**: Incrementally add specificity
4. **Test Again**: Compare results
**Example Iteration**:
```
V1: "Car driving"
→ Too generic
V2: "Red sports car driving fast on highway"
→ Better, but lighting unclear
V3: "Red Ferrari speeding on coastal highway, golden hour sunset, dramatic"
→ Good result!
```
### A/B Testing
Generate multiple variations to compare:
```python
# Test different camera movements (indices taken from camera_movements.md)
variants = [
    client.image_to_video(image=img, prompt="Slow zoom in", camera_move_index=26),  # zoom in
    client.image_to_video(image=img, prompt="Pan right", camera_move_index=4),  # pan right
    client.image_to_video(image=img, prompt="Orbit around subject", camera_move_index=1),  # orbit
]
```
### Consistency Across Assets
For cohesive content, maintain consistent prompt elements:
```python
# Consistent style keywords across multiple generations
style_guide = "cinematic, golden hour lighting, dramatic atmosphere"
video1 = client.text_to_video(f"Mountain landscape, {style_guide}")
video2 = client.text_to_video(f"Forest scene, {style_guide}")
video3 = client.text_to_video(f"Beach sunset, {style_guide}")
```
---
## Prompt Templates by Use Case
### Marketing/Commercial
```
"[Product name] showcased on [surface/background],
[camera movement],
studio lighting with soft shadows,
premium and elegant atmosphere"
```
### Social Media
```
"[Subject] [action],
[camera angle/movement],
vibrant colors and high energy,
engaging and eye-catching for [platform]"
```
### Documentary/Educational
```
"[Subject] in [environment],
[natural action],
natural documentary style cinematography,
informative and authentic atmosphere"
```
### Artistic/Creative
```
"[Abstract concept] visualized as [imagery],
[unique camera work],
[lighting style],
[artistic mood/style reference]"
```
---
## Troubleshooting
### Problem: Output doesn't match prompt
**Solutions**:
- Simplify prompt to core elements
- Remove contradictory instructions
- Be more specific about key aspects
- Check for typos or ambiguous terms
### Problem: Output quality is poor
**Solutions**:
- Add style keywords (cinematic, professional, high quality)
- Specify lighting conditions
- Mention camera quality/style
- Add atmosphere/mood descriptors
### Problem: Inconsistent results
**Solutions**:
- Use more specific prompts
- Reference specific styles or examples
- Test with variations to find optimal wording
- Use consistent style keywords across generations
### Problem: Generation fails (status=4)
**Solutions**:
- Check prompt for inappropriate content
- Simplify overly complex prompts
- Verify input files (images/videos) are valid
- Ensure parameters are within valid ranges
- Try with default parameters first
---
## Language Considerations
Tomoviee supports both **Chinese** and **English** prompts.
**Tips**:
- Use the language you're most comfortable with
- Be aware of cultural/contextual nuances
- Technical terms may work better in English
- Artistic descriptions may vary by language interpretation
**Example Comparisons**:
Chinese:
```
"一只金毛寻回犬在阳光明媚的草地上奔跑,镜头跟随,电影级画质"
```
English:
```
"A golden retriever running through a sunlit meadow, camera following, cinematic quality"
```
Both can produce excellent results - choose based on your fluency and precision needs.
---
## Summary Checklist
Before submitting your prompt, verify:
- [ ] **Clear Subject**: What's the main focus?
- [ ] **Specific Action**: What's happening?
- [ ] **Scene Context**: Where is this taking place?
- [ ] **Visual Style**: What's the look and feel?
- [ ] **Technical Parameters**: Resolution, duration, aspect ratio set correctly?
- [ ] **No Contradictions**: All elements work together?
- [ ] **Appropriate Specificity**: Not too vague, not over-specified?
A well-crafted prompt is the foundation of excellent AI-generated content. Take time to refine your prompts for best results!
FILE:references/video_apis.md
# Tomoviee Video Generation APIs
## Overview
Tomoviee provides three video generation APIs supporting different input types and use cases.
## API Endpoints
### 1. Text-to-Video (tm_text2video_b)
**Generate video from text description only**
**Endpoint**: `https://openapi.wondershare.cc/v1/open/capacity/application/tm_text2video_b`
**Parameters**:
- `prompt` (required): Text description for video generation
- `resolution`: Video resolution - `720p` (default) or `1080p`
- `duration`: Video duration in seconds - only `5` is supported
- `aspect_ratio`: Video aspect ratio - `16:9` (default), `9:16`, `4:3`, `3:4`, `1:1`
- `camera_move_index`: Camera movement type (1-46, see camera_movements.md)
- `callback`: Optional callback URL for async notification
- `params`: Optional transparent parameters passed back in callback
**Use Cases**:
- Create video from scratch with text only
- Generate establishing shots or B-roll footage
- Prototype video ideas quickly
**Example**:
```python
task_id = client.text_to_video(
    prompt="A golden retriever running through a sunlit meadow, slow motion",
    resolution="720p",
    aspect_ratio="16:9",
    camera_move_index=26,  # zoom in (see camera_movements.md)
)
```
---
### 2. Image-to-Video (tm_img2video_b)
**Generate video from image + text description**
**Endpoint**: `https://openapi.wondershare.cc/v1/open/capacity/application/tm_img2video_b`
**Parameters**:
- `prompt` (required): Text description guiding video generation
- `image` (required): Image URL (JPG/JPEG/PNG/WEBP format, under 200 MB)
- `resolution`: Video resolution - `720p` (default) or `1080p`
- `duration`: Video duration in seconds - only `5` is supported
- `aspect_ratio`: Video aspect ratio - `16:9`, `9:16`, `4:3`, `3:4`, `1:1`, `original` (keeps image ratio)
- `camera_move_index`: Camera movement type (1-46, see camera_movements.md)
- `callback`: Optional callback URL for async notification
- `params`: Optional transparent parameters passed back in callback
**Use Cases**:
- Animate still images with motion
- Create product demo videos from photos
- Add dynamic camera movements to static images
- Generate video variations from reference image
**Example**:
```python
task_id = client.image_to_video(
    prompt="Camera slowly panning right, golden hour lighting",
    image="https://example.com/sunset-beach.jpg",
    resolution="720p",
    aspect_ratio="original",
    camera_move_index=4,  # pan right (see camera_movements.md)
)
```
---
### 3. Video Continuation (tm_video_continuation_b)
**Continue/extend existing video**
**Endpoint**: `https://openapi.wondershare.cc/v1/open/capacity/application/tm_video_continuation_b`
**Parameters**:
- `prompt` (required): Text description for continuation
- `video` (required): Video URL (MP4 format, under 200 MB, 5s duration, 720p resolution)
- `resolution`: Video resolution - `720p` (default) or `1080p`
- `duration`: Video duration in seconds - only `5` is supported (generates 5s continuation)
- `aspect_ratio`: Video aspect ratio - `16:9`, `9:16`, `4:3`, `3:4`, `1:1`
- `camera_move_index`: Camera movement type (1-46, see camera_movements.md)
- `callback`: Optional callback URL for async notification
- `params`: Optional transparent parameters passed back in callback
**Use Cases**:
- Extend video clips beyond original length
- Create seamless video sequences
- Generate multiple continuation variations
- Overcome 5-second duration limit by chaining continuations
**Important Constraints**:
- Input video MUST be exactly 5 seconds, 720p resolution
- Output is always 5 seconds (extending the input)
- To create longer videos, chain multiple continuations
**Example**:
```python
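import json

# First generate a 5s video (720p by default, as continuation requires)
task_id_1 = client.text_to_video(
    prompt="A bird taking flight from a tree branch"
)
result_1 = client.poll_until_complete(task_id_1)
first_video_url = json.loads(result_1["result"])["video_path"][0]
# Then continue it from the generated clip
task_id_2 = client.video_continuation(
    video=first_video_url,
    prompt="The bird soars higher into the blue sky"
)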
```
---
## Common Parameters
### Resolution Options
- `720p`: 1280x720 (faster generation, lower quality)
- `1080p`: 1920x1080 (slower generation, higher quality)
### Aspect Ratio Options
- `16:9`: Widescreen (landscape) - standard video format
- `9:16`: Vertical (portrait) - mobile/social media
- `4:3`: Traditional TV format
- `3:4`: Vertical medium format
- `1:1`: Square - Instagram/social media
- `original`: (Image-to-Video only) Preserves input image ratio
### Duration
Currently, `duration` accepts only `5` (seconds) across all video APIs.
### Camera Movement
All video APIs support `camera_move_index` (1-46). See `camera_movements.md` for the full list.
- Use `null`/`None` for automatic camera movement
- Specify index (1-46) for precise control
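
The constraints listed in this section can be checked locally before a request is sent. The following is a minimal sketch: `validate_video_params` and its `image_to_video` flag are hypothetical names, and the allowed values are copied from the lists above.

```python
from typing import List, Optional

VALID_RESOLUTIONS = {"720p", "1080p"}
VALID_ASPECT_RATIOS = {"16:9", "9:16", "4:3", "3:4", "1:1"}


def validate_video_params(
    resolution: str = "720p",
    aspect_ratio: str = "16:9",
    duration: int = 5,
    camera_move_index: Optional[int] = None,
    image_to_video: bool = False,
) -> List[str]:
    """Return a list of problems; an empty list means the values look valid."""
    problems = []
    if resolution not in VALID_RESOLUTIONS:
        problems.append(f"resolution must be one of {sorted(VALID_RESOLUTIONS)}")
    allowed_ratios = set(VALID_ASPECT_RATIOS)
    if image_to_video:
        allowed_ratios.add("original")  # only Image-to-Video accepts "original"
    if aspect_ratio not in allowed_ratios:
        problems.append(f"aspect_ratio {aspect_ratio!r} is not supported")
    if duration != 5:
        problems.append("duration must be 5")
    # None means automatic camera movement, so only explicit indices are checked.
    if camera_move_index is not None and not 1 <= camera_move_index <= 46:
        problems.append("camera_move_index must be between 1 and 46")
    return problems
```

Catching these cases locally avoids spending a request on a call that would be rejected for invalid parameters.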
---
## Async Workflow
All video APIs are asynchronous:
1. **Create Task**: Call API endpoint → receive `task_id`
2. **Poll Status**: Call unified result endpoint with `task_id`
3. **Check Status**:
   - `1` = Queued
   - `2` = Processing
   - `3` = Success (video ready)
   - `4` = Failed
   - `5` = Cancelled
   - `6` = Timeout
4. **Get Result**: When status=3, extract video URL from result JSON
**Unified Result Endpoint**: `https://openapi.wondershare.cc/v1/open/pub/task`
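
The status codes above can be given names so that polling code does not rely on bare integers. This is a small sketch: `TaskStatus` and `is_terminal` are hypothetical names, not identifiers from the Tomoviee client.

```python
from enum import IntEnum


class TaskStatus(IntEnum):
    QUEUED = 1
    PROCESSING = 2
    SUCCESS = 3
    FAILED = 4
    CANCELLED = 5
    TIMEOUT = 6


# Statuses after which polling can stop.
TERMINAL_STATUSES = {
    TaskStatus.SUCCESS,
    TaskStatus.FAILED,
    TaskStatus.CANCELLED,
    TaskStatus.TIMEOUT,
}


def is_terminal(status: int) -> bool:
    """Return True when the task will not change state any further."""
    return TaskStatus(status) in TERMINAL_STATUSES
```

Because `TaskStatus` is an `IntEnum`, its members compare equal to the raw integers returned by the API, so existing comparisons such as `status == 3` keep working.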
**Example Workflow**:
```python
import json

# Create video
task_id = client.text_to_video(prompt="...")
# Poll for completion (built-in helper)
result = client.poll_until_complete(task_id, poll_interval=10, timeout=600)
# Extract video URL
result_data = json.loads(result['result'])
video_url = result_data['video_path'][0]
```
---
## Error Handling
**Common Errors**:
- `400`: Invalid parameters (check resolution, aspect_ratio, duration)
- `401`: Authentication failed (verify app_key and access_token)
- `413`: File too large (image/video must be under 200 MB)
- `422`: Invalid file format or constraints
- Task status `4`: Generation failed (check prompt or input quality)
- Task status `6`: Timeout (retry or contact support)
**Best Practices**:
- Always check response `code` field (0 = success)
- Implement exponential backoff for polling
- Set reasonable timeout (video generation typically takes 1-5 minutes)
- Validate file URLs are publicly accessible before API call
- Use callback URLs for production to avoid polling overhead
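
Exponential backoff for polling can be sketched as follows. `poll_with_backoff` and `fetch_task` are hypothetical names, and the assumption that the task record exposes its state under a `status` key should be checked against real responses. The terminal status values come from the Async Workflow section.

```python
import time
from typing import Callable, Dict

TERMINAL_STATUSES = {3, 4, 5, 6}  # success, failed, cancelled, timeout


def poll_with_backoff(
    fetch_task: Callable[[], Dict],
    initial_delay: float = 2.0,
    max_delay: float = 30.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict:
    """Call fetch_task until the task reaches a terminal status.

    The wait doubles after every attempt and is capped at max_delay.
    """
    deadline = time.monotonic() + timeout
    delay = initial_delay
    while True:
        task = fetch_task()
        if task["status"] in TERMINAL_STATUSES:
            return task
        if time.monotonic() + delay > deadline:
            raise TimeoutError("task did not finish before the local timeout")
        sleep(delay)
        delay = min(delay * 2, max_delay)
```

The `sleep` argument is injectable so the loop can be exercised in tests without real waiting. The cap on the delay keeps long-running tasks from being checked too rarely.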
---
## Quota and Limits
- Maximum concurrent tasks: Check your plan
- File size limits: under 200 MB for images/videos
- Video input constraints: Must be 5s, 720p for continuation API
- Supported formats:
  - Images: JPG, JPEG, PNG, WEBP
  - Videos: MP4
- Generation time: Typically 1-5 minutes per 5-second video
---
## Chaining Videos
To create videos longer than 5 seconds:
```python
import json

# Generate first 5s
task_1 = client.text_to_video(prompt="Scene 1...")
result_1 = client.poll_until_complete(task_1)
video_1_url = json.loads(result_1['result'])['video_path'][0]
# Continue for another 5s
task_2 = client.video_continuation(video=video_1_url, prompt="Scene 2...")
result_2 = client.poll_until_complete(task_2)
video_2_url = json.loads(result_2['result'])['video_path'][0]
# Continue again for another 5s
task_3 = client.video_continuation(video=video_2_url, prompt="Scene 3...")
result_3 = client.poll_until_complete(task_3)
video_3_url = json.loads(result_3['result'])['video_path'][0]
# Now concatenate all three 5s clips to create 15s final video
```
Note: You'll need to download and concatenate videos using a tool like ffmpeg or video editing software.
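
For the concatenation step, one option is ffmpeg's concat demuxer. The sketch below only prepares the list file and the command. It assumes the clips are already downloaded, that they share the same codec and resolution (so stream copy works), and that file names contain no single quotes. `build_concat_command` is a hypothetical helper.

```python
from pathlib import Path
from typing import List, Sequence


def build_concat_command(clips: Sequence[str], list_file: str, output: str) -> List[str]:
    """Write an ffmpeg concat list file and return the matching command."""
    lines = [f"file '{Path(clip).resolve().as_posix()}'" for clip in clips]
    Path(list_file).write_text("\n".join(lines) + "\n", encoding="utf-8")
    # -safe 0 permits the absolute paths written above; -c copy avoids re-encoding.
    return [
        "ffmpeg", "-f", "concat", "-safe", "0",
        "-i", str(list_file), "-c", "copy", str(output),
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)` on a machine where ffmpeg is installed. If the clips differ in encoding parameters, re-encoding instead of `-c copy` may be necessary.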