# Video Synthesis
The 2D digital human video synthesis service lets you select a 2D digital human avatar model, supply text or audio, synthesize a 2D virtual digital human video in MP4 or WebM format, and download the result through the returned video link.
- Avatar Configuration
  - The 2D digital human avatar used for a video synthesis task is specified via parameters. The system provides several default 2D digital human avatar models to choose from. For details, please contact operations to set up your account.
- Voice Configuration
  - The system supports two types of voice configuration:
    - Upload recording files: Record online and upload, or select an existing audio file. The audio undergoes noise reduction, and the original voice is used in the final synthesized video.
    - Upload text + select voice: The speaker voice for a video synthesis task is specified via parameters, along with speech rate, pitch, and volume. The system provides several default TTS voice models to choose from. The specified voice reads the supplied text, and the resulting audio is used for video synthesis.
- Digital Human Driving
  - Supports digital human expression and lip-sync driving.
- Video Encoding Information
  - Encoding format: H.264
  - Frame rate: 25 fps
- Video Format
  - Currently supports MP4/WebM video formats. Video duration is determined by the content selected during video synthesis.
- Video Resolution
  - The output video resolution can be specified when creating a video synthesis task. Recommended options: 480p, 720p, 1080p
- Subtitles
  - Supports generating subtitle files that match the text or voice content supplied by the user
- Custom Foreground/Background/Title Text
  - The video background image can be specified via URL; supported image formats are jpg and png
  - The video foreground image can be specified via URL; supported image formats are jpg and png
  - The font, font size, and position of title text in the video can be specified via parameters
- Custom Character Beauty Effects
  - Character beauty effects can be adjusted via parameters, including whitening, skin smoothing, face shaping, eye shaping, hairline adjustment, apple muscle adjustment, nose adjustment, chin adjustment, mouth adjustment, philtrum adjustment, head shrinking, contrast, saturation, clarity, sharpening, and more than a dozen other adjustments. See the JSON Parameter Description for usage rules
- Maximum Storage Duration
  - The PaaS platform stores generated content online for 7 days. Transfer it promptly; after 7 days it can no longer be downloaded.
# Video Synthesis Sequence Diagram

# API Reference
All API calls on the platform go to the service endpoint aigc.softsugar.com and must include the token information in the request header.
# Create Video Synthesis Task
# API Description
Synthesizes a video from the content uploaded by the user and returns an MP4 or WebM video file for download. The PaaS platform stores generated content online for 7 days; transfer it promptly, because it can no longer be downloaded after 7 days.
# Request URL
POST /api/2dvh/v1/material/video/create
# Request Headers
Content-Type: application/json
# Request Parameters
| Field | Type | Required | Description |
|---|---|---|---|
| param | String | True | The video synthesis parameters, passed as a JSON-escaped string. Valid param information is required to create a video synthesis task. See the parameter descriptions, the JSON example, and the example effect. |
| videoName | String | True | Video name |
| thumbnailUrl | String | False | Thumbnail URL |
# Request Example
```json
{
  "videoName": "xxx",
  "param": "{\"version\":\"0.0.4\",\"resolution\":[1080,1920],\"bit_rate\":16,\"frame_rate\":25,\"watermark\":{\"show\":true,\"content\":\"示例视频\"},\"digital_role\":{\"id\":3964,\"face_feature_id\":\"0401_chenying_s1\",\"name\":\"0401_chenying_s1\",\"url\":\"https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/materials/77/0401_chenying_s1_20230427133135306.zip\",\"position\":{\"x\":0,\"y\":0},\"scale\":1.0},\"tts_config\":{\"id\":\"nina\",\"name\":\"Nina\",\"vendor_id\":3,\"language\":\"zh-CN\",\"pitch_offset\":0.0,\"speed_ratio\":1,\"volume\":100},\"tts_query\":{\"content\":\"丝绸之路是一条连接东西方的古老商路,在这条路上,东西方通过贸易和文化交流,促进了不同文明的不断融合。 历史上张骞出使西域,开启了最早的丝绸之路,从此丝绸之路上的商人一次次穿越沙漠和山脉进行通商往来。 中国的丝绸、瓷器、茶叶,以及印度的佛教、希腊的哲学等都在这条路上得到充分地传承和发展。\",\"ssml\":false},\"backgrounds\":[{\"type\":0,\"name\":\"背景\",\"url\":\"http://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/test/background.png\",\"rect\":[0,0,1080,1920],\"cycle\":false,\"start\":0,\"duration\":-1}],\"foregrounds\":[{\"type\":0,\"name\":\"前景\",\"url\":\"http://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/test/frontgroud.png\",\"rect\":[0,1359,1092,561],\"cycle\":false,\"start\":0,\"duration\":-1}],\"foreground-texts\":[{\"text\":\"丝绸之路介绍\",\"font_size\":20,\"font_family\":\"Noto Sans S Chinese Black\",\"position\":{\"x\":100,\"y\":200},\"rgba\":[100,200,100,100]}]}"
}
```
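
Because `param` must be a JSON-escaped string rather than a nested object, it is easy to get the escaping wrong by hand. The following is a minimal Python sketch of one way to build the request body; the helper name `build_create_body` and the abbreviated `param` dictionary are illustrative assumptions, not part of the platform API. The token header is left out because its name is not given in this section.

```python
import json

# Path taken from the "Request URL" section.
CREATE_PATH = "/api/2dvh/v1/material/video/create"


def build_create_body(video_name, param, thumbnail_url=None):
    """Build the JSON body for the create-task request (hypothetical helper).

    `param` is a plain dict. It is serialized to a string here because the
    API expects the `param` field to be a JSON-escaped string.
    """
    body = {
        "videoName": video_name,
        # ensure_ascii=False keeps non-ASCII text (e.g. Chinese TTS content) readable.
        "param": json.dumps(param, ensure_ascii=False, separators=(",", ":")),
    }
    if thumbnail_url is not None:
        body["thumbnailUrl"] = thumbnail_url
    return body


# Abbreviated param for illustration only; a real task needs every required field.
param = {
    "version": "0.0.4",
    "resolution": [1080, 1920],
    "frame_rate": 25,
    "tts_query": {"content": "丝绸之路是一条连接东西方的古老商路", "ssml": False},
}

body = build_create_body("demo", param)
payload = json.dumps(body, ensure_ascii=False)  # text to send as the POST body
print(payload)
```

Serializing twice (once for `param`, once for the whole body) produces the backslash-escaped quotes seen in the request example without any manual escaping.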
# Response Elements
| Field | Type | Required | Description |
|---|---|---|---|
| code | Integer | True | 0 - Success, Other - Error |
| message | String | True | Error details |
| data | Integer | False | Task ID |
# Response Example
```json
{
  "code": 0,
  "message": "success",
  "data": 1
}
```
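
A caller typically checks `code` before using `data`. The sketch below shows one way to do that in Python; the function `parse_create_response` and the exception `TaskCreateError` are hypothetical names introduced for illustration, and the sketch assumes only the three response fields documented above.

```python
import json


class TaskCreateError(Exception):
    """Raised when the create-task response reports a non-zero code."""


def parse_create_response(raw):
    """Return the task ID from a create-task response body given as JSON text."""
    resp = json.loads(raw)
    # code 0 means success; any other value is an error described by `message`.
    if resp.get("code") != 0:
        raise TaskCreateError(f"code={resp.get('code')}: {resp.get('message')}")
    return resp.get("data")


task_id = parse_create_response('{"code": 0, "message": "success", "data": 1}')
print(task_id)
```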
# JSON Parameter Description
| Name | Type | Example | Required | Description |
|---|---|---|---|---|
| version | String | "0.0.17" | Yes | Latest version number of the video synthesis JSON configuration file |
| video_format | String | "mp4" | No | Video output format. Values: MP4, WEBM, MOV. Defaults to MP4 if not specified. WEBM and MOV formats support alpha (transparency) channel. |
| resolution | Int Array | [1080,1920] | Yes | Video resolution. Recommended options: [480,854], [720,1280], [1080,1920] (portrait formats). Avatar models come in 2K (1080×1920) and 4K (2160×3840) resolutions. Adjust the digital human avatar scale to suit the output resolution for best results. For example, at [1080,1920] resolution, the scale parameter for a 4K avatar should be set to approximately 0.5 |
| bit_rate | Float | 8 | No | Video bitrate (Mbps), maximum 16, minimum 1 |
| frame_rate | Integer | 25 | Yes | Video frame rate. Currently only supports 25fps |
| watermark | Object | | No | Video watermark |
| show | Boolean | True | Yes | Whether to display video watermark |
| content | String | "Test" | Yes | Video watermark content. If enabled but content is not provided, it will be auto-filled. |
| invisible-watermark | Object | | No | Hidden video watermark, supports MP4 only |
| show | Boolean | True | Yes | Whether to enable hidden video watermark. |
| content | String | "1234567890123456" | No | Hidden watermark text. Chinese characters are not allowed; use English letters and digits only, 16 characters in total. If fewer than 16 characters are given, trailing zeros are appended automatically; if more than 16, only the first 16 are used. |
| digital_role | Object | As below | | |
| id | Integer | 1 | No | Digital human ID |
| face_feature_id | String | "1" | Yes | Digital human face feature ID |
| name | String | "Xiao Li" | No | Digital human name |
| url | String | "https://xxx/role.zip" | Yes | Digital human avatar zip package. Required for video synthesis. In live streaming scenarios, this field is not needed; use fileId instead |
| fileId | String | "12345" | No | ID after preheated material upload. Required for live streaming scenarios |
| position | Object | As below | Yes | Starting pixel position of the digital human avatar image, with the top-left corner of the 1080×1920 resolution canvas as the origin; right is the x-direction, down is the y-direction |
| x | Integer | 0 | Yes | X-axis coordinate value |
| y | Integer | 0 | Yes | Y-axis coordinate value |
| scale | Float | 1.0 | Yes | Digital human avatar scale ratio |
| rotation | Float | 0.0 | Yes | Rotation angle, range [0.0, 360.0]. The opposite direction of the canvas Y-axis is 0 degrees; the clockwise angle is the rotation angle. The image center point is used as the anchor point during rotation |
| volume | Integer | 0 | No | Digital human broadcast volume, range 0–100. Note: Minimum version required: 0.0.13. |
| z_position | Integer | 0 | Yes | Layer order. Each z_position must be unique; higher numbers display in front. Note: Minimum version required: 0.0.6 (required field). |
| start_frame_index | Integer | 0 | No | Synthesis video start frame, range [1, N]. Returns error if out of range. Note: Minimum version required: 0.0.14. Not recommended for premium digital humans. |
| tts_config | Object | As below | Yes | TTS configuration. Either tts_query or audios must exist; when both exist, tts_query takes priority |
| qid | String | 8wfZav:AEA_Z10Mqp9GCwDGMrz8xIzi3VScxNzUtLCh | No | Filling this field overrides the voiceid, language, and vendor_id fields. |
| id | String | "zh-CN-XiaoxuanNeural" | Yes | Speaker ID, same as voiceID |
| name | String | "Xiaoxuan" | Yes | Speaker name |
| vendor_id | Integer | 4 | Yes | Vendor ID. Must match the TTS voice model information in use. Do not fill arbitrarily. Not required when using qid. |
| language | String | "zh-CN" | Yes | Language code |
| pitch_offset | Float | 0.0 | Yes | Pitch. Higher values are sharper, lower values are deeper. Range [-60, 60] |
| speed_ratio | Float | 1 | Yes | Speech rate. Higher values mean slower speech. Range [0.5, 2] |
| volume | Integer | 100 | Yes | Volume. Higher values mean louder. Range [1, 400] |
| tts_query | Object | As below | No | TTS speech synthesis. Either tts_query or audios must exist; when both exist, tts_query takes priority |
| content | String | "Dear audience and friends, hello! It is a great honor to gather with you at this wonderful moment. Welcome to today's program." | Yes | Text content for speech synthesis. Must be at least 10 characters. All language speakers can synthesize English queries; all language speakers can synthesize queries in their own language; Cantonese, Shanghainese, and other Chinese dialect speakers can synthesize Chinese queries |
| use_action | Boolean | false | No | Whether the TTS text supports action editing. To insert an action, place {action index:action_number} at the desired position in the text, e.g. {action index:0}; note the space between "action" and "index". Action numbers can be obtained from the digital human's result JSON. If the TTS text itself needs to output {action}, escape it as ^{action} so that it is not extracted as an action |
| ssml | Boolean | false | No | Whether to use SSML. When enabled, query can use USSML. USSML is recommended |
| audios | Object Array | As below | No | Audio driving. Either tts_query or audios must exist; when both exist, tts_query takes priority |
| url | Object | {"url":"https://xxx/audio.mp3"} | Yes | Array. Supports multiple MP3 format driving audio files |
| subtitle | Object | As below | No | Subtitles |
| url | String | "https://xxx/subtitle.srt" | Yes | Subtitle file list. Version 0.0.13 and earlier only parse this field. |
| urls | String Array | ["https://xxx/subtitle.srt","https://xxx/subtitle.srt"] | No | Version 0.0.14 and later prioritize parsing this field. If this field does not exist, the url field is parsed. Special note: If version ≥ 0.0.14 and audios contains multiple audio files, the url field is still parsed, showing only one subtitle. This is expected behavior. |
| scale | Float | 1.0 | Yes | Text scale ratio, range 0~+∞, default is 1. Original reference size is font_size. |
| position | Object | As below | No | Subtitle starting position, with the top-left corner of the 1080×1920 resolution canvas as the origin; right is x-direction, down is y-direction. Default position is at the bottom of the video with center alignment. Note: Minimum version required: 0.0.13. |
| x | Integer | 0 | No | X-axis coordinate value |
| y | Integer | 0 | No | Y-axis coordinate value |
| rgba | Int Array | [100,100,100,100] | Yes | Subtitle color in RGBA format, range 0–255. [Alpha channel not currently supported] |
| font_size | Integer | 20 | Yes | Subtitle font size setting |
| font_family | String | "Noto Sans S Chinese Black" | Yes | Font name. See the JSON supported font list for available fonts |
| stroke_width | Float | 2 | No | Stroke width, range 0~+∞, default is 0. Note: Minimum version required: 0.0.10. |
| stroke_rgba | Int Array | [100,100,100,100] | No | Subtitle stroke color in RGBA format, range 0–255. [Alpha channel not currently supported]. Note: Minimum version required: 0.0.10. |
| background_rgba | Int Array | [100,100,100,100] | Yes | Subtitle background (font background color), range 0–255. Alpha channel value of 0 means fully transparent. Note: Minimum version required: 0.0.10. |
| opacity | Float | 0.5 | No | Subtitle layer opacity, range 0–1. 0 means fully transparent, 1 means opaque. Note: Minimum version required: 0.0.10. |
| subtitle_max_len | Integer | 10 | No | Maximum subtitle segment length. Default is 0 (no limit). If max segment length is not set, subtitle occupies max 80% of canvas width; overflow auto-wraps. Note: Minimum version required: 0.0.10. |
| subtitle_cut_by_punc | Boolean | True | No | Whether to split by punctuation. Note: Minimum version required: 0.0.10. |
| rotation | Float | 0.0 | Yes | Rotation angle, range [0.0, 360.0]. The opposite direction of the canvas Y-axis is 0 degrees; clockwise angle is the rotation angle. Image center point is the anchor. Note: Minimum version required: 0.0.14. |
| auto_font_size | Boolean | True | No | Default is True. Subtitle calculates final display font size based on formula; display differs from foreground text/title at the same font_size setting. False: Subtitle uses the same font size rules as foreground text/title. |
| sub_to_canvas_width_ratio | Float | 1.0 | No | Default is 1.0. Ratio of subtitle width to canvas width, range (0, 2]. Values ≤0 or >2 are reset to 1.0. Line breaks if single line cannot fit. |
| backgrounds | Object Array | As below | No | Backgrounds |
| type | Integer | 0 | Yes | 0: Image (jpg, png); 1: Video (mp4, frame rate ≥25, no resolution requirements). Different resolution videos are processed by fitting the short edge and scaling proportionally |
| name | String | "Background" | Yes | Background name |
| url | String | "https://xxx/bg.png" | Yes | Background file URL. If no background image or video is set, WebM format shows black background; MP4 format shows gray default frame background. In live streaming, this field is empty; use fileId |
| fileId | String | "12345" | No | ID after preheated material upload. Required for live streaming scenarios |
| rect | Int Array | [0,0,1080,1920] | Yes | [Not currently supported] Background starting position and size, based on 1080×1920 resolution canvas. Currently does not support customization; defaults to fitting short edge with proportional scaling |
| cycle | Boolean | false | Yes | For videos only. false: Single play, true: Loop play |
| start | Integer | 0 | Yes | Background start time in ms |
| duration | Integer | -1 | Yes | Background duration in ms. -1 is default, meaning it persists throughout the video |
| play_offset | Integer | 1 | No | Effective in live streaming. For videos only. The playback start time of the background video itself, in ms |
| volume | Integer | 0 | No | Background video volume. Higher values mean louder. Range [0, 100], standard volume. Note: Minimum version required: 0.0.13. |
| background-musics | Object Array | As below | No | Background music |
| url | String | "https://xxx/bgm.mp3" | Yes | Background music URL |
| volume | Integer | 100 | Yes | Volume. Higher values mean louder. Range [0, 100], standard volume 100 |
| duration | Integer | -1 | No | Duration in ms. -1 is default, meaning it persists throughout the video. Stops when duration is reached regardless of loop setting |
| start | Integer | 0 | No | Start time in ms. 0 is default, meaning playback starts from the 0th millisecond of the video |
| cycle | Boolean | True | No | false: Single play, true: Loop play |
| foregrounds | Object Array | As below | No | Foregrounds |
| type | Integer | 0 | Yes | 0: Image (jpg, png); 1: Video (mp4) |
| name | String | "Foreground" | Yes | Foreground name |
| url | String | "https://xxx/fg.png" | Yes | Foreground file URL. Images support png or jpg; videos support mp4. In live streaming, this field is empty; use fileId |
| fileId | String | "12345" | No | ID after preheated material upload. Required for live streaming scenarios |
| rect | Int Array | [0,0,1080,1920] | Yes | Starting position and size, based on 1080×1920 resolution canvas |
| rotation | Float | 0.0 | Yes | Rotation angle, range [0.0, 360.0]. The opposite direction of the canvas Y-axis is 0 degrees; clockwise angle is the rotation angle. Image center point is the anchor |
| cycle | Boolean | False | No | For videos only. false: Single play, true: Loop play. After single play completes, if the specified duration has not been reached, the foreground video stays on the last frame |
| z_position | Integer | 2 | Yes | Layer order. Each z_position must be unique; higher numbers display in front. Note: Minimum version required: 0.0.6 (required field). |
| start | Integer | 0 | Yes | Foreground start time in ms |
| play_offset | Integer | 1 | No | Effective in live streaming. For videos only. The playback start time of the foreground video itself, in ms |
| duration | Integer | -1 | Yes | Foreground duration in ms. -1 is default, meaning it persists throughout the video |
| volume | Integer | 0 | No | Foreground video volume. Higher values mean louder. Range [0, 100], standard volume. Note: Minimum version required: 0.0.13. |
| foreground-texts | Object Array | As below | No | Foreground text |
| text | String | "Foreground text" | Yes | Foreground text content |
| scale | Float | 1.0 | Yes | Text scale ratio, range 0~+∞, default is 1. Original reference size is font_size. |
| duration | Integer | -1 | No | Duration in ms. -1 is default, meaning it persists throughout the video. Stops when duration is reached regardless of loop setting |
| start | Integer | 0 | No | Start time in ms. 0 is default, meaning foreground text appears from the 0th millisecond |
| position | Object | As below | Yes | Foreground text starting position, with the top-left corner of the 1080×1920 resolution canvas as the origin; right is x-direction, down is y-direction |
| x | Integer | 0 | Yes | X-axis coordinate value |
| y | Integer | 0 | Yes | Y-axis coordinate value |
| rgba | Int Array | [100,100,100,100] | Yes | Foreground text color in RGBA format, range 0–255. [Alpha channel not currently supported] |
| font_size | Integer | 20 | Yes | Foreground text font size setting |
| font_family | String | "Noto Sans S Chinese Black" | Yes | Font name. See the JSON supported font list for available fonts |
| stroke_width | Float | 2 | No | Stroke width, range 0~+∞, default is 0 |
| stroke_rgba | Int Array | [100,100,100,100] | No | Foreground text stroke color in RGBA format, range 0–255. [Alpha channel not currently supported] |
| background_rgba | Int Array | [100,100,100,100] | Yes | Foreground text background (font background color), range 0–255. Alpha channel value of 0 means fully transparent. Note: Minimum version required: 0.0.10. |
| opacity | Float | 0.5 | No | Foreground text layer opacity, range 0–1. 0 means fully transparent, 1 means opaque. Note: Minimum version required: 0.0.10. |
| z_position | Integer | 2 | Yes | Layer order. Each z_position must be unique; higher numbers display in front. Note: Minimum version required: 0.0.8 (required field). |
| rotation | Float | 0.0 | Yes | Rotation angle, range [0.0, 360.0]. The opposite direction of the canvas Y-axis is 0 degrees; clockwise angle is the rotation angle. Image center point is the anchor. Note: Minimum version required: 0.0.14. |
| title | Object Array | As below | No | Title text. Its layer is above the digital human, background, and foreground text. Note: Minimum version required: 0.0.10. |
| text | String | "Title text" | Yes | Title text content |
| scale | Float | 1.0 | Yes | Text scale ratio, range 0~+∞, default is 1. Original reference size is font_size. |
| position | Object | As below | Yes | Title text starting position, with the top-left corner of the 1080×1920 resolution canvas as the origin; right is x-direction, down is y-direction |
| x | Integer | 0 | Yes | X-axis coordinate value |
| y | Integer | 0 | Yes | Y-axis coordinate value |
| rgba | Int Array | [100,100,100,100] | Yes | Title text color in RGBA format, range 0–255. [Alpha channel not currently supported] |
| font_size | Integer | 20 | Yes | Title text font size setting. Unit: px. |
| font_family | String | "Noto Sans S Chinese Black" | Yes | Font name. See the JSON supported font list for available fonts |
| stroke_rgba | Int Array | [100,100,100,100] | No | Title text stroke color in RGBA format, range 0–255. [Alpha channel not currently supported] |
| stroke_width | Float | 2 | Yes | Stroke width, range 0~+∞, default is 0 |
| background_rgba | Int Array | [100,100,100,100] | Yes | Title text background (font background color), range 0–255. Alpha channel value of 0 means fully transparent. [Alpha channel not currently supported] |
| opacity | Float | 0.5 | No | Title text layer opacity, range 0–1. 0 means fully transparent, 1 means opaque |
| rotation | Float | 0.0 | Yes | Rotation angle, range [0.0, 360.0]. The opposite direction of the canvas Y-axis is 0 degrees; clockwise angle is the rotation angle. Image center point is the anchor. Note: Minimum version required: 0.0.14. |
| effects | Object | As below | No | |
| version | String | "1.0" | Yes | Effects engine version |
| beautify | Object | As below | No | Beauty effects |
| whitenStrength | Float | 0.3 | No | [0,1.0] Whitening. Default 0.30. 0.0 means no whitening |
| whiten_mode | Integer | 0 | No | Whitening mode: 0 (pinkish white), 1 (natural white), 2 (natural white on skin areas only) |
| reddenStrength | Float | 0.36 | No | [0,1.0] Ruddiness. Default 0.36. 0.0 means no ruddiness |
| smoothStrength | Float | 0.74 | No | [0,1.0] Skin smoothing. Default 0.74. 0.0 means no smoothing |
| smooth_mode | Integer | 0 | No | Smoothing mode: 0 (face area smoothing), 1 (full image smoothing), 2 (face area fine smoothing) |
| shrinkRatio | Float | 0.11 | No | [0,1.0] Face slimming. Default 0.11. 0.0 means no face slimming |
| enlargeRatio | Float | 0.13 | No | [0,1.0] Eye enlargement. Default 0.13. 0.0 means no eye enlargement |
| smallRatio | Float | 0.10 | No | [0,1.0] Small face. Default 0.10. 0.0 means no small face effect |
| narrowFace | Float | 0.0 | No | [0,1.0] Narrow face. Default 0.0. 0.0 means no narrow face |
| roundEyesRatio | Float | 0.0 | No | [0,1.0] Round eyes. Default 0.0. 0.0 means no round eyes |
| thinFaceShapeRatio | Float | 0.0 | No | [0,1.0] Thin face shape. Default 0.0. 0.0 means no thin face shape |
| chinLength | Float | 0.0 | No | [-1, 1] Chin length. Default 0.0. [-1, 0] for shorter chin, [0, 1] for longer chin |
| hairlineHeightRatio | Float | 0.0 | No | [-1, 1] Hairline. Default 0.0. [-1, 0] for lower hairline, [0, 1] for higher hairline |
| appleMusle | Float | 0.0 | No | [0, 1.0] Apple muscle. Default 0.0. 0.0 means no apple muscle |
| narrowNoseRatio | Float | 0.0 | No | [0, 1.0] Nose slimming/nostril slimming. Default 0.0. 0.0 means no nose slimming |
| noseLengthRatio | Float | 0.0 | No | [-1, 1] Nose length. Default 0.0. [-1, 0] for shorter nose, [0, 1] for longer nose |
| profileRhinoplasty | Float | 0.0 | No | [0, 1.0] Side-face nose bridge. Default 0.0. 0.0 means no side-face nose bridge effect |
| mouthSize | Float | 0.0 | No | [-1, 1] Mouth size. Default 0.0. [-1, 0] to enlarge mouth, [0, 1] to shrink mouth |
| philtrumLengthRatio | Float | 0.0 | No | [-1, 1] Philtrum length. Default 0.0. [-1, 0] for longer philtrum, [0, 1] for shorter philtrum |
| eyeDistanceRatio | Float | 0.0 | No | [-1, 1] Eye distance. Default 0.0. [-1, 0] to decrease eye distance, [0, 1] to increase eye distance |
| eyeAngleRatio | Float | 0.0 | No | [-1, 1] Eye angle. Default 0.0. [-1, 0] for left eye counter-clockwise rotation, [0, 1] for left eye clockwise rotation; right eye mirrors left eye |
| openCanthus | Float | 0.0 | No | [0, 1.0] Open canthus. Default 0.0. 0.0 means no open canthus |
| shrinkJawbone | Float | 0.0 | No | [0, 1.0] Jawbone slimming ratio. Default 0.0. 0.0 means no jawbone slimming |
| shrinkRoundFace | Float | 0.0 | No | [0, 1.0] Round face slimming. Default 0.0. 0.0 means no slimming |
| shrinkLongFace | Float | 0.0 | No | [0, 1.0] Long face slimming. Default 0.0. 0.0 means no slimming |
| shrinkGoddessFace | Float | 0.0 | No | [0, 1.0] Goddess face slimming. Default 0.0. 0.0 means no slimming |
| shrinkNaturalFace | Float | 0.0 | No | [0, 1.0] Natural face slimming. Default 0.0. 0.0 means no slimming |
| shrinkWholeHead | Float | 0.0 | No | [0, 1.0] Overall head shrinking. Default 0.0. 0.0 means no head shrinking |
| contrastStrength | Float | 0.05 | No | [0,1.0] Contrast. Default 0.05. 0.0 means no contrast processing |
| saturationStrength | Float | 0.1 | No | [0,1.0] Saturation. Default 0.10. 0.0 means no saturation processing |
| sharpen | Float | 0.0 | No | [0, 1.0] Sharpening. Default 0.0. 0.0 means no sharpening |
| clear | Float | 0.0 | No | [0, 1.0] Clarity strength. Default 0.0. 0.0 means no clarity |
| bokehStrength | Float | 0.0 | No | [0, 1.0] Background blur strength. Default 0.0. 0.0 means no background blur |
| eyeHeight | Float | 0.0 | No | [-1, 1] Eye position ratio. Default 0.0. [-1, 0] moves eyes down, [0, 1] moves eyes up |
| mouthCorner | Float | 0.0 | No | [0, 1.0] Mouth corner lift ratio. Default 0.0. 0.0 means no mouth corner adjustment |
| hairline | Float | 0.0 | No | [-1, 1] New hairline height ratio. Default 0.0. [-1, 0] for lower hairline, [0, 1] for higher hairline |
| packages | Object Array | As below | No | Makeup parameters |
| url | String | "https://xxx/res.zip" | Yes | Makeup resource URL. Please contact customer service for makeup resource packages |
| strength | Float | 0.3 | Yes | Makeup intensity |
| filter | Object | As below | No | Filter parameters |
| onlyFigure | Boolean | false | Yes | Whether the filter effect only applies to the digital human. true: Filter only on digital human; false: Global filter |
| url | String | "https://xxx/res.zip" | Yes | Filter resource URL. Please contact customer service for filter resource packages |
| strength | Float | 0.3 | Yes | Filter intensity |
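
Several constraints in the table above can be checked locally before a task is submitted. The following Python sketch is illustrative only: the function names are hypothetical, the checks cover just a handful of the documented rules (frame rate, bit rate, TTS ranges, the tts_query/audios requirement, hidden watermark text), and it assumes that z_position uniqueness is meant across digital_role, foregrounds, and foreground-texts, which the table does not state explicitly.

```python
import string

_ALLOWED = set(string.ascii_letters + string.digits)
# Ranges copied from the tts_config rows of the table.
_TTS_RANGES = {"pitch_offset": (-60, 60), "speed_ratio": (0.5, 2), "volume": (1, 400)}


def normalize_hidden_watermark(text):
    """Apply the documented 16-character rule locally: pad with zeros or truncate."""
    if not set(text) <= _ALLOWED:
        raise ValueError("hidden watermark allows English letters and digits only")
    return text[:16].ljust(16, "0")


def validate_param(param):
    """Return a list of human-readable problems found in a param dict."""
    problems = []
    if param.get("frame_rate") != 25:
        problems.append("frame_rate must be 25")
    bit_rate = param.get("bit_rate")
    if bit_rate is not None and not 1 <= bit_rate <= 16:
        problems.append("bit_rate must be within [1, 16]")
    tts = param.get("tts_config", {})
    for key, (low, high) in _TTS_RANGES.items():
        if key in tts and not low <= tts[key] <= high:
            problems.append(f"tts_config.{key} must be within [{low}, {high}]")
    if "tts_query" not in param and "audios" not in param:
        problems.append("either tts_query or audios must be present")
    # Assumption: uniqueness is checked across these three layer groups.
    layers = [param.get("digital_role", {})]
    layers += param.get("foregrounds", [])
    layers += param.get("foreground-texts", [])
    z_values = [layer["z_position"] for layer in layers if "z_position" in layer]
    if len(z_values) != len(set(z_values)):
        problems.append("z_position values must be unique")
    return problems


draft = {
    "frame_rate": 25,
    "bit_rate": 20,
    "tts_config": {"pitch_offset": 0.0, "speed_ratio": 3, "volume": 100},
    "tts_query": {"content": "Welcome to today's program."},
    "digital_role": {"z_position": 1},
    "foregrounds": [{"z_position": 1}],
}
for problem in validate_param(draft):
    print(problem)
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a draft configuration at once. The server remains the authority on what is valid.
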
The above covers all video synthesis capabilities provided by the platform.