# Platform Capabilities

The platform provides a range of algorithmic capabilities for enterprise accounts, including video synthesis, character image model generation, TTS personal voice model generation, character image model updates, and video face swapping.

# Feature Introduction

# Video Synthesis

The 2D digital human video synthesis service lets you select a 2D digital human model, add text or audio, and synthesize a 2D virtual digital human video in MP4/WebM format; the video content can be downloaded via the returned video link.

  • Image Configuration
    • Supports specifying the 2D digital human image for this video synthesis through parameters. The system provides several default 2D digital human image models to choose from; details are available after contacting the operations team to open an account.
  • Voice Configuration
    • The system supports two forms of voice configuration:
      • Uploading a recording file: supports online recording or uploading an existing audio file. The audio undergoes noise reduction before the original sound is used in the final video content.
      • Uploading text and selecting a voice: supports specifying the speaker's voice and adjusting its speed, pitch, and volume through parameters for this video synthesis. The system provides several default TTS personal voice models to choose from; the selected voice reads the text content, and the resulting audio is used for video synthesis.
  • Digital Human Driving
    • Supports digital human expression and mouth shape driving.
  • Video Encoding Information
    • Encoding format: H264
    • Frame rate: 25FPS
  • Video Format
    • Currently supports MP4/WebM formats. The video length is determined by the content selected during video synthesis.
  • Video Resolution
    • Supports specifying the output video resolution when creating a video synthesis task. Recommended options: 480p, 720p, or 1080p.
  • Subtitles
    • Supports generating subtitle files that match the text or voice content entered by the user.
  • Custom Foreground/Background/Title Text
    • Supports specifying the video background image through a URL, with jpg and png formats supported.
    • Supports specifying the video foreground image through a URL, with jpg and png formats supported.
    • Supports specifying the font, font size, and position of the title text content in the video through parameters.
  • Custom Human Beautification Effects
    • Supports adjusting human beautification effects through parameters, including whitening, skin smoothing, face shape, eye shape, hairline, cheekbone, nose, chin, mouth, and philtrum adjustments, head shrinking, and contrast, saturation, clarity, and sharpness adjustments, more than ten adjustable parameters in all. For the usage rules, please refer to the Parameter Description (opens new window).
  • Maximum Storage Time
    • The PaaS platform supports 7 days of online storage. Transfer the data in time: generated content is no longer downloadable after 7 days.

# Video Synthesis Sequence Diagram

(Sequence diagram: video composite process)

# TTS Personal Voice Model Generation

The TTS personal voice model generation service trains, from user-uploaded recordings of a real human voice, a digital human TTS voice model that matches the voice of the material provider. Please follow the SenseTime digital human voice copy collection production specifications during collection, which cover environmental requirements, equipment requirements, pronunciation requirements, authorization requirements, and reading scripts. For details, refer to: Collection Specifications (opens new window). The PaaS platform supports 7 days of online storage; transfer the data in time, as generated content is no longer downloadable after 7 days.

# Character Image Model Generation

The character image model generation service trains, from user-uploaded video of a real human, an AI-driven digital human character image model that is almost indistinguishable from a real person. For a faithful clone of the character image, please follow the SenseTime digital human collection production specifications during filming, which cover the video and voice requirements for training and testing 2D digital humans. For details, refer to: Collection Specifications (opens new window). The PaaS platform supports 7 days of online storage; transfer the data in time, as generated content is no longer downloadable after 7 days.

# Character Image Model Generation Sequence Diagram

(Sequence diagram: human model process)

# Character Image Model Update

The 2D digital human character image model update service updates completed character image models, supporting modification of the digital human training action clips. The PaaS platform supports 7 days of online storage; transfer the data in time, as generated content is no longer downloadable after 7 days.

# Green Screen Segmentation Effect Preview

The platform supports previewing the green screen segmentation effect on images and videos, either to confirm the segmentation parameters before submitting a character model generation task or to confirm that the filming environment meets the shooting requirements before official shooting.

# Video Character Face Swap (Not Supported Yet)

The video character face swap task applies algorithmic face swapping to user-uploaded video content using template images, and returns the processed video file and a thumbnail for download. The PaaS platform supports 7 days of online storage; transfer the data in time, as generated content is no longer downloadable after 7 days.

# API Description

To call any API service of the platform, access the service entry point aigc.softsugar.com and include token information in the request header.
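As a minimal sketch, an authenticated request could be built like this in Python. The `Authorization` header name is an assumption (the source only says "add token information in the request header"); substitute whatever header your account credentials specify:

```python
import json
import urllib.request

BASE_URL = "https://aigc.softsugar.com"

def build_request(path: str, payload: dict, token: str) -> urllib.request.Request:
    """Build an authenticated JSON POST request for the platform API.

    The token header name below is an assumption, not confirmed by the
    platform documentation; adjust it to match your credentials.
    """
    return urllib.request.Request(
        url=BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": token,  # assumed header name
        },
        method="POST",
    )

# Build (but do not send) a create-task request:
req = build_request(
    "/api/2dvh/v1/material/video/create",
    {"videoName": "demo", "param": "{}"},
    token="<your-token>",
)
```

Sending the request is then a matter of `urllib.request.urlopen(req)` once a valid token is in place.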

# Creating a Video Synthesis Task

# Interface Description

Calls algorithm capabilities on the content uploaded by the user to perform video synthesis, ultimately returning an MP4/WebM video file for download. The PaaS platform supports 7 days of online storage; transfer the data in time, as generated content is no longer downloadable after 7 days.

# Request Address

POST /api/2dvh/v1/material/video/create

# Request Header

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| param | String | True | The video synthesis parameters for the task, passed as a JSON string-escaped value. Please refer to the parameter description and JSON example (opens new window), example effect (opens new window). |
| videoName | String | True | Video name |
| thumbnailUrl | String | False | Thumbnail URL |

# Request Example

{
  "videoName": "xxx",
  "param": "{\"version\":\"0.0.4\",\"resolution\":[1080,1920],\"bit_rate\":16,\"frame_rate\":25,\"watermark\":{\"show\":true,\"content\":\"Example Video\"},\"digital_role\":{\"id\":3964,\"face_feature_id\":\"0401_chenying_s1\",\"name\":\"0401_chenying_s1\",\"url\":\"https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/materials/77/0401_chenying_s1_20230427133135306.zip\",\"position\":{\"x\":0,\"y\":0},\"scale\":1.0},\"tts_config\":{\"id\":\"nina\",\"name\":\"Nina\",\"vendor_id\":3,\"language\":\"zh-CN\",\"pitch_offset\":0.0,\"speed_ratio\":1,\"volume\":100},\"tts_query\":{\"content\":\"The Silk Road is an ancient trade route connecting the East and the West. On this road, the East and the West promoted the continuous integration of different civilizations through trade and cultural exchanges. Historically, Zhang Qian's mission to the Western Regions opened the earliest Silk Road, since then, merchants on the Silk Road have crossed deserts and mountains for trade exchanges time and again. Chinese silk, porcelain, tea, as well as Indian Buddhism, Greek philosophy, and more were fully inherited and developed on this road.\",\"ssml\":false},\"backgrounds\":[{\"type\":0,\"name\":\"Background\",\"url\":\"http://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/test/background.png\",\"rect\":[0,0,1080,1920],\"cycle\":false,\"start\":0,\"duration\":-1}],\"foregrounds\":[{\"type\":0,\"name\":\"Foreground\",\"url\":\"http://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/test/frontgroud.png\",\"rect\":[0,1359,1092,561],\"cycle\":false,\"start\":0,\"duration\":-1}],\"foreground-texts\":[{\"text\":\"Introduction to the Silk Road\",\"font_size\":20,\"font_family\":\"Noto Sans S Chinese Black\",\"position\":{\"x\":100,\"y\":200},\"rgba\":[100,200,100,100]}]}"
}
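Note that `param` is itself a JSON document serialized into a string, so the request body is effectively JSON-encoded twice. A minimal Python sketch of building such a body (the field values here are illustrative, not a complete configuration):

```python
import json

# Inner video-synthesis configuration (illustrative subset of fields).
param = {
    "version": "0.0.13",
    "resolution": [1080, 1920],
    "frame_rate": 25,
    "tts_query": {"content": "Hello, welcome to today's program.", "ssml": False},
}

# The request body carries param as an escaped JSON string, so it is
# serialized twice: once for the inner config, once for the whole body.
body = {
    "videoName": "demo",
    "param": json.dumps(param, ensure_ascii=False),
}
payload = json.dumps(body, ensure_ascii=False)

# Round-trip check: the inner config survives the double encoding.
decoded = json.loads(json.loads(payload)["param"])
```

Letting `json.dumps` produce the escaped string avoids hand-writing the `\"` escapes shown in the example above.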

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, Others - Exception |
| message | String | True | Detailed information of the exception |
| data | Integer | False | Task ID |

# Response Example

{
    "code": 0,
    "message": "success",
    "data": 1
}
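A minimal sketch of handling this response in Python; per the table above, `code` 0 means success and `data` carries the task ID:

```python
import json

def parse_create_response(raw: str) -> int:
    """Return the task ID from a create-task response, or raise on error."""
    resp = json.loads(raw)
    if resp["code"] != 0:
        # Non-zero code signals an exception; message holds the details.
        raise RuntimeError(f"task creation failed ({resp['code']}): {resp['message']}")
    return resp["data"]

task_id = parse_create_response('{"code": 0, "message": "success", "data": 1}')
```

Keep the returned task ID: it identifies the synthesis job in any subsequent status or result queries.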
# JSON Parameter Description
Indented field names (prefixed with ·) belong to the object in the row above.

| Name | Type | Example Value | Required | Description |
| --- | --- | --- | --- | --- |
| version | String | "0.0.17" | Yes | The latest version number of the video synthesis JSON configuration file |
| video_format | String | "mp4" | No | Output format: MP4, WEBM, or MOV; defaults to MP4 if omitted. WEBM and MOV support alpha transparency. |
| resolution | Int Array | [1080,1920] | Yes | Video resolution. One of the three vertical formats [480,854], [720,1280], [1080,1920] is recommended. Human models come in 2K (1080×1920) and 4K (2160×3840); different resolutions require adjusting the digital human figure scale for the best effect. For example, at [1080,1920] it is recommended to set a 4K digital human's scale parameter to about 0.5. |
| bit_rate | Float | 8 | No | Video bitrate (Mbps), minimum 1, maximum 16 |
| frame_rate | Integer | 25 | Yes | Video frame rate; currently only 25 fps is supported |
| watermark | Object | | No | Video watermark |
| · show | Boolean | true | Yes | Whether to display the video watermark |
| · content | String | "Testing Test" | Yes | Video watermark content |
| invisible-watermark | Object | | No | Invisible video watermark; MP4 only |
| · show | Boolean | true | Yes | Whether to enable the invisible video watermark |
| · content | String | "1234567890123456" | No | Invisible watermark text. Chinese characters are not allowed; use only English letters and digits, 16 characters in total. Shorter values are padded with 0; longer values are truncated to the first 16 characters. If omitted, the watermark "AI Synthesis" is used. The watermark sits in the bottom-right corner of the image and disappears after three seconds. |
| digital_role | Object | Content as below | | |
| · id | Integer | 1 | No | Digital human id |
| · face_feature_id | String | "1" | Yes | Digital human face feature id |
| · name | String | "Xiao Li" | No | Digital human name |
| · url | String | "https://xxx/role.zip" | Yes | Digital human figure zip package |
| · position | Object | Content as below | Yes | Starting pixel position of the digital human figure image, with the top-left corner of the 1080×1920 canvas as the origin, x to the right and y downward |
| · · x | Integer | 0 | Yes | x-direction coordinate value |
| · · y | Integer | 0 | Yes | y-direction coordinate value |
| · scale | Float | 1.0 | Yes | Digital human figure scale |
| · rotation | Float | 0.0 | Yes | Rotation angle, range [0.0, 360.0]; 0 degrees points opposite the canvas Y axis, increasing clockwise, rotating around the image center |
| · volume | Integer | 0 | No | Digital human broadcast volume, range 0 to 100. Minimum version: 0.0.13. |
| · z_position | Integer | 0 | Yes | Layer order; each z_position must be unique, and larger values display further in front. Minimum version: 0.0.6 (mandatory field). |
| · start_frame_index | Integer | 0 | No | Starting frame of the composite video, range [1, N]; out-of-range values return an error. Minimum version: 0.0.14. Not recommended for premium digital humans. |
| tts_config | Object | Content as below | Yes | TTS configuration. Either tts_query or audios must be present; when both are present, tts_query takes precedence. |
| · qid | String | "8wfZav:AEA_Z10Mqp9GCwDGMrz8xIzi3VScxNzUtLCh" | No | If set, overrides the id (voice id), language, and vendor_id fields |
| · id | String | "zh-CN-XiaoxuanNeural" | Yes | Speaker id, same as the voice ID |
| · name | String | "Xiaoxuan" | Yes | Speaker name |
| · vendor_id | Integer | 4 | Yes | Vendor id; must match the TTS voice model in use, otherwise errors result. May be omitted when qid is used. |
| · language | String | "zh-CN" | Yes | Language code |
| · pitch_offset | Float | 0.0 | Yes | Pitch; higher values sound sharper, lower values deeper. Range [-60, 60]. |
| · speed_ratio | Float | 1 | Yes | Speech rate; the higher the value, the slower the speech. Range [0.5, 2]. |
| · volume | Integer | 100 | Yes | Volume; higher values are louder. Range [1, 400]. |
| tts_query | Object | Content as below | No | TTS voice synthesis. Either tts_query or audios must be present; when both are present, tts_query takes precedence. |
| · content | String | "Dear audience, hello! I am very honored to gather with everyone at this beautiful moment; welcome to today's program." | Yes | Text to synthesize, at least 10 characters. All speakers can synthesize English queries and queries in their own language; Cantonese, Shanghainese, and other Chinese dialect speakers can synthesize Chinese queries. |
| · use_action | Boolean | false | No | Whether action editing is supported in the TTS text. To insert an action, place {action index:N} at the desired position in the text, e.g. {action index:0} (note the space between "action" and "index"). Action numbers come from the digital human's result JSON. If the text itself needs to output {action}, escape it as ^{action } so it is not parsed as an action. |
| · ssml | Boolean | false | No | Whether to use SSML; when enabled, the query may use USSML and SSML (USSML is recommended) |
| audios | Object Array | Content as below | No | Audio drive. Either tts_query or audios must be present; when both are present, tts_query takes precedence. |
| · url | Object | {"url":"https://xxx/audio.mp3"} | Yes | Array; supports multiple mp3-format driving audio files |
| subtitle | Object | Content as below | No | Subtitles |
| · url | String | "https://xxx/subtitle.srt" | Yes | Subtitle file. Versions 0.0.13 and earlier parse only this field. |
| · urls | String Array | ["https://xxx/subtitle.srt","https://xxx/subtitle.srt"] | No | Versions 0.0.14 and later parse this field first, falling back to url if it is absent. Special case: if the version is 0.0.14 or later and audios contains multiple entries, the url field is still parsed and only one subtitle is displayed; this is expected behavior. |
| · scale | Float | 1.0 | Yes | Text scaling ratio, range 0 to +∞, default 1; the reference size is font_size |
| · position | Object | Content as below | No | Subtitle starting position, with the top-left corner of the 1080×1920 canvas as the origin, x to the right and y downward. The default position is below the video, with the subtitle centered. Minimum version: 0.0.13. |
| · · x | Integer | 0 | No | x-direction coordinate value |
| · · y | Integer | 0 | No | y-direction coordinate value |
| · rgba | Int Array | [100,100,100,100] | Yes | Subtitle color in RGBA, values 0 to 255 (alpha channel not supported) |
| · font_size | Integer | 20 | Yes | Subtitle font size |
| · font_family | String | "Noto Sans S Chinese Black" | Yes | Font name; see the list of fonts supported by JSON |
| · stroke_width | Float | 2 | No | Stroke width, range 0 to +∞, default 0. Minimum version: 0.0.10. |
| · stroke_rgba | Int Array | [100,100,100,100] | No | Subtitle stroke color in RGBA, values 0 to 255 (alpha channel not supported). Minimum version: 0.0.10. |
| · background_rgba | Int Array | [100,100,100,100] | Yes | Subtitle background (font base) color, values 0 to 255; an alpha value of 0 is fully transparent. Minimum version: 0.0.10. |
| · opacity | Float | 0.5 | No | Subtitle layer opacity, range 0 to 1; 0 is fully transparent, 1 is opaque. Minimum version: 0.0.10. |
| · subtitle_max_len | Integer | 10 | No | Maximum subtitle split length; default 0 means no limit. If no maximum is set, a subtitle line occupies at most 80% of the canvas width and wraps automatically when exceeded. Minimum version: 0.0.10. |
| · subtitle_cut_by_punc | Boolean | true | No | Whether to split by punctuation. Minimum version: 0.0.10. |
| · rotation | Float | 0.0 | Yes | Rotation angle, range [0.0, 360.0]; 0 degrees points opposite the canvas Y axis, increasing clockwise, rotating around the image center. Minimum version: 0.0.14. |
| · auto_font_size | Boolean | true | No | Defaults to true: the subtitle computes its final display font size from a formula, so the same font_size setting renders differently from foreground text and titles. When false, the subtitle uses the same font-size rule as foreground text and titles. |
| · sub_to_canvas_width_ratio | Float | 1.0 | No | Defaults to 1.0. Ratio of subtitle width to canvas width, range (0, 2]; values ≤0 or >2 are reset to 1.0. Text wraps if it cannot fit on a single line. |
| backgrounds | Object Array | Content as below | No | Background |
| · type | Integer | 0 | Yes | 0: image, jpg or png; 1: video, mp4 with frame rate above 25 and no resolution requirement; videos of other resolutions are scaled so the short side fills and the video scales proportionally |
| · name | String | "Background" | Yes | Background name |
| · url | String | "https://xxx/bg.png" | Yes | Background file URL. If no background image or video is set, WebM output shows a black background and MP4 shows the default frame's background. |
| · rect | Int Array | [0,0,1080,1920] | Yes | (Not supported yet) Background position and size on the 1080×1920 canvas, with the top-left corner at (0,0). Customization is not currently supported; by default the short side fills and the long side scales proportionally. |
| · cycle | Boolean | false | Yes | For videos: false plays once, true loops |
| · start | Integer | 0 | Yes | Background start time, in ms |
| · duration | Integer | -1 | Yes | Background duration in ms; -1 (the default) lasts the whole video |
| · volume | Integer | 0 | No | Background video volume; higher values are louder, range [0, 100], standard volume. Minimum version: 0.0.13. |
| background-musics | Object Array | Content as below | No | Background music |
| · url | String | "https://xxx/bgm.mp3" | Yes | Background music URL |
| · volume | Integer | 100 | Yes | Volume; higher values are louder, range [0, 100] with a standard volume of 100 |
| · duration | Integer | -1 | No | Duration in ms; -1 (the default) lasts the whole video. Once the duration is reached, playback stops regardless of looping. |
| · start | Integer | 0 | No | Start time in ms; 0 (the default) starts the background music at the 0th millisecond of the video |
| · cycle | Boolean | true | No | false: play once; true: loop |
| foregrounds | Object Array | Content as below | No | |
| · type | Integer | 0 | Yes | 0: image, jpg or png; 1: video, mp4 |
| · name | String | "Foreground" | Yes | Foreground name |
| · url | String | "https://xxx/fg.png" | Yes | Foreground file URL; images support png or jpg, videos support mp4 |
| · rect | Int Array | [0,0,1080,1920] | Yes | Starting position and size, relative to the 1080×1920 canvas |
| · rotation | Float | 0.0 | Yes | Rotation angle, range [0.0, 360.0]; 0 degrees points opposite the canvas Y axis, increasing clockwise, rotating around the image center |
| · cycle | Boolean | false | No | For videos: false plays once, true loops. If a foreground video finishes before the specified duration, it holds on its last frame. |
| · z_position | Integer | 2 | Yes | Layer order; each z_position must be unique, and larger values display further in front. Minimum version: 0.0.6 (mandatory field). |
| · start | Integer | 0 | Yes | Foreground start time, in ms |
| · duration | Integer | -1 | Yes | Foreground duration in ms; -1 (the default) lasts the whole video |
| · volume | Integer | 0 | No | Foreground video volume; higher values are louder, range [0, 100], standard volume. Minimum version: 0.0.13. |
| foreground-texts | Object Array | Content as below | No | Foreground texts |
| · text | String | "Foreground Text" | Yes | Foreground text content |
| · scale | Float | 1.0 | Yes | Text scaling ratio, range 0 to +∞, default 1; the reference size is font_size |
| · duration | Integer | -1 | No | Duration in ms; -1 (the default) lasts the whole video. Once the duration is reached, the text disappears regardless of looping. |
| · start | Integer | 0 | No | Start time in ms; 0 (the default) shows the text from the 0th millisecond of the video |
| · position | Object | Content as below | Yes | Foreground text starting position, with the top-left corner of the 1080×1920 canvas as the origin, x to the right and y downward |
| · · x | Integer | 0 | Yes | x-direction coordinate value |
| · · y | Integer | 0 | Yes | y-direction coordinate value |
| · rgba | Int Array | [100,100,100,100] | Yes | Foreground text color in RGBA, values 0 to 255 (alpha channel not supported) |
| · font_size | Integer | 20 | Yes | Foreground text font size |
| · font_family | String | "Noto Sans S Chinese Black" | Yes | Font name; see the list of fonts supported by JSON |
| · stroke_width | Float | 2 | No | Stroke width, range 0 to +∞, default 0 |
| · stroke_rgba | Int Array | [100,100,100,100] | No | Foreground text stroke color in RGBA, values 0 to 255 (alpha channel not supported) |
| · background_rgba | Int Array | [100,100,100,100] | Yes | Foreground text background (font base) color, values 0 to 255; an alpha value of 0 is fully transparent. Minimum version: 0.0.10. |
| · opacity | Float | 0.5 | No | Foreground text layer opacity, range 0 to 1; 0 is fully transparent, 1 is opaque. Minimum version: 0.0.10. |
| · z_position | Integer | 2 | Yes | Layer order; each z_position must be unique, and larger values display further in front. Minimum version: 0.0.8 (mandatory field). |
| · rotation | Float | 0.0 | Yes | Rotation angle, range [0.0, 360.0]; 0 degrees points opposite the canvas Y axis, increasing clockwise, rotating around the image center. Minimum version: 0.0.14. |
| title | Object Array | Content as below | No | Title text; its layer sits above digital humans, backgrounds, and foreground text. Minimum version: 0.0.10. |
| · text | String | "Title Text" | Yes | Title text content |
| · scale | Float | 1.0 | Yes | Text scaling ratio, range 0 to +∞, default 1; the reference size is font_size |
| · position | Object | Content as below | Yes | Title text starting position, with the top-left corner of the 1080×1920 canvas as the origin, x to the right and y downward |
| · · x | Integer | 0 | Yes | x-direction coordinate value |
| · · y | Integer | 0 | Yes | y-direction coordinate value |
| · rgba | Int Array | [100,100,100,100] | Yes | Title text color in RGBA, values 0 to 255 (alpha channel not supported) |
| · font_size | Integer | 20 | Yes | Title text font size, in px |
| · font_family | String | "Noto Sans S Chinese Black" | Yes | Font name; see the list of fonts supported by JSON |
| · stroke_rgba | Int Array | [100,100,100,100] | No | Title text stroke color in RGBA, values 0 to 255 (alpha channel not supported) |
| · stroke_width | Float | 2 | Yes | Stroke width, range 0 to +∞, default 0 |
| · background_rgba | Int Array | [100,100,100,100] | Yes | Title text background (font base) color, values 0 to 255; an alpha value of 0 is fully transparent (alpha channel not supported) |
| · opacity | Float | 0.5 | No | Title text layer opacity, range 0 to 1; 0 is fully transparent, 1 is opaque |
| · rotation | Float | 0.0 | Yes | Rotation angle, range [0.0, 360.0]; 0 degrees points opposite the canvas Y axis, increasing clockwise, rotating around the image center. Minimum version: 0.0.14. |
| effects | Object | Content as below | No | |
| · version | String | "1.0" | Yes | Effects engine version |
| · beautify | Object | Content as below | No | Beautification |
| · · whitenStrength | Float | 0.3 | No | [0, 1.0] Whitening; default 0.30, 0.0 means no whitening |
| · · whiten_mode | Integer | 0 | No | Whitening mode: 0 (pinkish white), 1 (natural white), 2 (natural white in skin areas only) |
| · · reddenStrength | Float | 0.36 | No | [0, 1.0] Rosiness; default 0.36, 0.0 means no rosiness |
| · · smoothStrength | Float | 0.74 | No | [0, 1.0] Smoothing; default 0.74, 0.0 means no smoothing |
| · · smooth_mode | Integer | 0 | No | Smoothing mode: 0 (face-area smoothing), 1 (whole-picture smoothing), 2 (detailed face-area smoothing) |
| · · shrinkRatio | Float | 0.11 | No | [0, 1.0] Slim face; default 0.11, 0.0 means no effect |
| · · enlargeRatio | Float | 0.13 | No | [0, 1.0] Big eyes; default 0.13, 0.0 means no effect |
| · · smallRatio | Float | 0.10 | No | [0, 1.0] Small face; default 0.10, 0.0 means no effect |
| · · narrowFace | Float | 0.0 | No | [0, 1.0] Narrow face; default 0.0, 0.0 means no effect |
| · · roundEyesRatio | Float | 0.0 | No | [0, 1.0] Round eyes; default 0.0, 0.0 means no effect |
| · · thinFaceShapeRatio | Float | 0.0 | No | [0, 1.0] Thin face shape; default 0.0, 0.0 means no effect |
| · · chinLength | Float | 0.0 | No | [-1, 1] Chin length; default 0.0, [-1, 0] shortens the chin, [0, 1] lengthens it |
| · · hairlineHeightRatio | Float | 0.0 | No | [-1, 1] Hairline; default 0.0, [-1, 0] lowers the hairline, [0, 1] raises it |
| · · appleMusle | Float | 0.0 | No | [0, 1.0] Apple muscle; default 0.0, 0.0 means no effect |
| · · narrowNoseRatio | Float | 0.0 | No | [0, 1.0] Slim nose and nostrils; default 0.0, 0.0 means no effect |
| · · noseLengthRatio | Float | 0.0 | No | [-1, 1] Nose length; default 0.0, [-1, 0] shortens the nose, [0, 1] lengthens it |
| · · profileRhinoplasty | Float | 0.0 | No | [0, 1.0] Side-profile nose lift; default 0.0, 0.0 means no effect |
| · · mouthSize | Float | 0.0 | No | [-1, 1] Mouth size; default 0.0, [-1, 0] enlarges the mouth, [0, 1] shrinks it |
| · · philtrumLengthRatio | Float | 0.0 | No | [-1, 1] Philtrum length; default 0.0, [-1, 0] lengthens the philtrum, [0, 1] shortens it |
| · · eyeDistanceRatio | Float | 0.0 | No | [-1, 1] Eye distance; default 0.0, [-1, 0] decreases the distance, [0, 1] increases it |
| · · eyeAngleRatio | Float | 0.0 | No | [-1, 1] Eye angle; default 0.0, [-1, 0] rotates the left eye counterclockwise, [0, 1] clockwise; the right eye rotates oppositely |
| · · openCanthus | Float | 0.0 | No | [0, 1.0] Open canthus; default 0.0, 0.0 means no effect |
| · · shrinkJawbone | Float | 0.0 | No | [0, 1.0] Slim jawbone ratio; default 0.0, 0.0 means no effect |
| · · shrinkRoundFace | Float | 0.0 | No | [0, 1.0] Slim round face; default 0.0, 0.0 means no effect |
| · · shrinkLongFace | Float | 0.0 | No | [0, 1.0] Slim long face; default 0.0, 0.0 means no effect |
| · · shrinkGoddessFace | Float | 0.0 | No | [0, 1.0] Goddess slim face; default 0.0, 0.0 means no effect |
| · · shrinkNaturalFace | Float | 0.0 | No | [0, 1.0] Natural slim face; default 0.0, 0.0 means no effect |
| · · shrinkWholeHead | Float | 0.0 | No | [0, 1.0] Overall head shrinking; default 0.0, 0.0 means no effect |
| · · contrastStrength | Float | 0.05 | No | [0, 1.0] Contrast; default 0.05, 0.0 means no adjustment |
| · · saturationStrength | Float | 0.1 | No | [0, 1.0] Saturation; default 0.10, 0.0 means no adjustment |
| · · sharpen | Float | 0.0 | No | [0, 1.0] Sharpening; default 0.0, 0.0 means no sharpening |
| · · clear | Float | 0.0 | No | [0, 1.0] Clarity strength; default 0.0, 0.0 means no clarity adjustment |
| · · bokehStrength | Float | 0.0 | No | [0, 1.0] Background blur strength; default 0.0, 0.0 means no blur |
| · · eyeHeight | Float | 0.0 | No | [-1, 1] Eye position ratio; default 0.0, [-1, 0] moves the eyes down, [0, 1] up |
| · · mouthCorner | Float | 0.0 | No | [0, 1.0] Mouth-corner lifting ratio; default 0.0, 0.0 means no adjustment |
| · · hairline | Float | 0.0 | No | [-1, 1] New hairline height ratio; default 0.0, [-1, 0] lowers the hairline, [0, 1] raises it |
| · packages | Object Array | Content as below | No | Makeup parameters |
| · · url | String | "https://xxx/res.zip" | Yes | Makeup resource URL; contact customer service for the makeup resource pack |
| · · strength | Float | 0.3 | Yes | Makeup strength |
| · filter | Object | Content as below | No | Filter parameters |
| · · onlyFigure | Boolean | false | Yes | Whether the filter applies only to the digital human: true for digital-human-only, false for global |
| · · url | String | "https://xxx/res.zip" | Yes | Filter resource URL; contact customer service for the filter resource pack |
| · · strength | Float | 0.3 | Yes | Filter strength |
# List of Fonts Supported by JSON
| Language | Font Name |
| --- | --- |
| Chinese | Noto Sans S Chinese Black |
| Chinese | Noto Sans S Chinese Bold |
| Chinese | Noto Sans S Chinese DemiLight |
| Chinese | Noto Sans S Chinese Light |
| Chinese | Noto Sans S Chinese Medium |
| Chinese | Noto Sans S Chinese Regular |
| Chinese | Noto Sans S Chinese Thin |
| Chinese | 仓耳渔阳体 W03 |
| Chinese | 站酷酷黑 |
| Chinese | 站酷快乐体2016修订版 |
| Chinese | 站酷庆科黄油体 |
| Chinese | 站酷文艺体 |
| Chinese | 站酷小薇LOGO体 |
| Chinese | 得意黑 |
| Chinese | 钉钉进步体 |
| Chinese | 阿里妈妈东方大楷 |
| Chinese | 阿里妈妈数黑体 |
| Chinese | 字魂扁桃体 |
| Chinese | 包图小白体 |
| Chinese | 庞门正道粗书体 |
| Chinese | 杨任东竹石体-Bold |
| Chinese | 优设标题黑 |
| Chinese | Gen Jyuu Gothic Normal |
| Chinese | 字制区喜脉体 |
| Chinese | 文道潮黑 |
| Chinese | Alibaba-PuHuiTi-Bold |
| Chinese | Alibaba-PuHuiTi-Heavy |
| Chinese | Alibaba-PuHuiTi-Light |
| Chinese | Alibaba-PuHuiTi-Medium |
| Chinese | Alibaba-PuHuiTi-Regular |
| Arabic | mastollehregular-2oaxk |
| Korean | HANDotumLVT |
| Korean | HANDotumLVT-bold |
| Japanese | SourceHanSansJP-Bold |
| Japanese | SourceHanSansJP-ExtraLight |
| Japanese | SourceHanSansJP-Heavy |
| Japanese | SourceHanSansJP-Light |
| Japanese | SourceHanSansJP-Medium |
| Japanese | SourceHanSansJP-Normal |
| Japanese | SourceHanSansJP-Regular |
# JSON Example
{
	"version": "0.0.13",
	"video_format": "MP4",
	"resolution": [1080, 1920],
	"bit_rate": 8,
	"frame_rate": 25,
	"watermark": {
		"show": true,
		"content": "内部测试"
	},
	"digital_role": {
		"id": 4051,
		"face_feature_id": "0325_nina_s3_beauty",
		"name": "Nina",
		"url": "https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/materials/77/0325_nina_s3_beauty_20230523213912566.zip",
		"position": {
			"x": 0,
			"y": 0
		},
		"scale": 1.0,
		"z_position": 1,
		"rotation": 0.0
	},
	"tts_config": {
		"id": "xiaoyue",
		"name": "晓月",
		"vendor_id": 3,
		"language": "zh-CN",
		"pitch_offset": 0.0,
		"speed_ratio": 1,
		"volume": 100
	},
	"tts_query": {
		"content": "您好,尊贵的客户",
		"ssml": false
	},
	"audios": [{
		"url": "https://dhpoc.softsugar.com/adapter/static/9b158cc9-8e42-4d09-b928-49dd9941d922.mp3"
	}, {
		"url": "https://dhpoc.softsugar.com/adapter/static/9b158cc9-8e42-4d09-b928-49dd9941d922.mp3"
	}],
	"subtitle": {
		"url": "https://aigc.blob.core.chinacloudapi.cn/audio/tts-srt/823v6j88s1k7aobpe7wmqm83q_de347214-96f2-4246-b283-17f40fe6abba.srt",
		"position": {
			"x": 100,
			"y": 300
		},
		"rgba": [100, 200, 100, 100],
		"font_size": 20,
    "stroke_width": 5.0,
		"stroke_rgba": [255, 0, 0, 0],
		"opacity": 0.5,
		"background_rgba": [0, 255, 0, 200],
		"subtitle_max_len": 8,
		"subtitle_cut_by_punc": "True",
		"font_family": "Noto Sans S Chinese Black"
	},
	"backgrounds": [{
		"type": 0,
		"name": "背景",
		"url": "https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/test/background.png",
		"rect": [0, 0, 1080, 1920],
		"cycle": false,
		"start": 0,
		"duration": -1
	}],
	"background-musics": [{
		"url": "https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/mayahui/%E7%BE%A4%E6%98%9F%20-%20%E5%96%9C%E6%B4%8B%E6%B4%8B.mp3",
		"volume": 100,
		"cycle": false
	}],
	"foregrounds": [{
		"type": 0,
		"name": "前景",
		"url": "http://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/test/frontgroud.png",
		"rect": [0, 0, 1080, 1920],
		"rotation": 0.0,
		"z_position": 0,
		"cycle": false,
		"start": 0,
		"duration": -1
	}],
	"foreground-texts": [{
		"text": "前景",
		"font_size": 20,
		"font_family": "Noto Sans S Chinese Black",
    "z_position": 10,
    "stroke_width": 5.0,
		"stroke_rgba": [255, 0, 0, 0],
    "opacity": 0.5,
		"position": {
			"x": 0,
			"y": 0
		},
    "background_rgba": [0, 255, 0, 200],
		"rgba": [100, 200, 100, 100]
	}],
  "title": {
		"text": "这是标题",
		"rgba": [100, 255, 255, 255],
		"position": {
			"x": 540,
			"y": 200
		},
		"font_size": 50,
		"font_family": "Noto Sans S Chinese Black",
		"stroke_width": 5.0,
		"stroke_rgba": [255, 0, 0, 0],
		"scale": 1.0,
		"opacity": 0.5,
		"background_rgba": [0, 255, 0, 200]
	},
	"effects": {
		"version": "1.0",
		"beautify": {
			"whitenStrength": 0.30,
			"whiten_mode": 0,
			"reddenStrength": 0.36,
			"smoothStrength": 0.74,
			"smooth_mode": 0,
			"shrinkRatio": 0.11,
			"enlargeRatio": 0.13,
			"smallRatio": 0.10,
			"narrowFace": 0.0,
			"roundEyesRatio": 0.0,
			"thinFaceShapeRatio": 0.0,
			"chinLength": 0.0,
			"hairlineHeightRatio": 0.0,
			"appleMusle": 0.0,
			"narrowNoseRatio": 0.0,
			"noseLengthRatio": 0.0,
			"profileRhinoplasty": 0.0,
			"mouthSize": 0.0,
			"philtrumLengthRatio": 0.0,
			"eyeDistanceRatio": 0.0,
			"eyeAngleRatio": 0.0,
			"openCanthus": 0.0,
			"brightEyeStrength": 0.0,
			"removeDarkCircleStrength": 0.0,
			"removeNasolabialFoldsStrength": 0.0,
			"whiteTeeth": 0.0,
			"shrinkCheekbone": 0.0,
			"thinnerHead": 0.0,
			"openExternalCanthus": 0.0,
			"shrinkJawbone": 0.0,
			"shrinkRoundFace": 0.0,
			"shrinkLongFace": 0.0,
			"shrinkGoddessFace": 0.0,
			"shrinkNaturalFace": 0.0,
			"shrinkWholeHead": 0.0,
			"contrastStrength": 0.05,
			"saturationStrength": 0.10,
			"sharpen": 0.0,
			"clear": 0.0,
      "eyeHeight": 0.0,
			"mouthCorner": 0.05,
			"hairline": 0.10,
			"bokehStrength": 0.0
		},
		"packages": [{
			"url": "https://xxx/xxx.zip",
			"strength": 0.3
		}, {
			"url": "https://xxx/xxx.model",
			"strength": 0.5
		}],
		"filter": {
			"onlyFigure": false,
			"url": "https://xxx/xxx.model",
			"strength": 0.5
		}
	}
}
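A client can sanity-check a configuration like the one above before submitting it, for example confirming the 25 FPS frame rate and that background rects cover the canvas. A minimal Python sketch; the validation rules, and the assumption that rect is [x, y, width, height], are illustrative rather than platform requirements:

```python
import json

def check_config(cfg: dict) -> list:
    """Collect simple problems with a video-synthesis JSON config."""
    problems = []
    if cfg.get("frame_rate") != 25:
        problems.append("frame_rate should be 25")
    width, height = cfg.get("resolution", [0, 0])
    for bg in cfg.get("backgrounds", []):
        # Assuming rect is [x, y, width, height]: warn if the canvas is not fully covered.
        if bg.get("rect") != [0, 0, width, height]:
            problems.append('background "%s" does not cover the full canvas' % bg.get("name"))
    return problems

cfg = json.loads('{"frame_rate": 25, "resolution": [1080, 1920],'
                 ' "backgrounds": [{"name": "bg", "rect": [0, 0, 1080, 1920]}]}')
print(check_config(cfg))  # []
```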

# Batch Video Synthesis Task Creation

# Interface Description

Synthesizes videos in batch by calling the algorithm capability on the content specified by the user, and returns a list of mp4 video files for the user to download. The PaaS platform retains generated content online for 7 days; transfer it promptly, as it can no longer be downloaded after 7 days.

# Request URL

POST /api/2dvh/v1/material/video/batchCreate

# Request Header

Content-Type: application/json

# Request Parameters

In the format of a JSON array, the fields of the objects in the array are defined as follows:

Field Type Required Description
param String True Video generation parameters (a JSON document serialized into an escaped string)
videoRequestId String True Video synthesis ID, needs to be unique
videoName String True Video name
thumbnailUrl String False Thumbnail URL

# Request Example

[
  {
    "param": "video config",
    "videoName": "name",
    "videoRequestId": "aaa"
  },
  {
    "param": "video config",
    "videoName": "name",
    "videoRequestId": "bbb"
  }
]
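Because `param` is itself a JSON document passed as an escaped string, the video config must be serialized once on its own before it is embedded in the request array. A minimal Python sketch (the config fields are truncated for illustration; sending the request and authentication are omitted):

```python
import json

# Video config as a plain dict; fields follow the JSON definition above (truncated).
video_config = {"version": "0.0.13", "video_format": "MP4", "resolution": [1080, 1920]}

batch_payload = [
    # json.dumps(video_config) turns the config into the escaped string "param" expects.
    {"param": json.dumps(video_config), "videoName": "name", "videoRequestId": "aaa"},
    {"param": json.dumps(video_config), "videoName": "name", "videoRequestId": "bbb"},
]

# Request body for POST /api/2dvh/v1/material/video/batchCreate
body = json.dumps(batch_payload)
```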

# Response Elements

Field Type Required Description
code Integer True 0 - Success, Others - Exception
message String True Detailed information of exception
data Object False Data object, usually empty in case of exceptions
  - videoRequestId String True Video synthesis ID, needs to be unique
  - taskId Long True Task ID
  - description String True Description of task dispatch result

# Response Example

{
  "code": 0,
  "message": "success",
  "data": [
    {
      "videoRequestId": "aaa",
      "taskId": 26,
      "description": "Queue waiting"
    },
    {
      "videoRequestId": "bbb",
      "taskId": 27,
      "description": "Queue waiting"
    }
  ]
}
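A caller typically checks `code` before using `data`, then indexes the dispatched tasks by `videoRequestId`. A sketch against the response above:

```python
import json

response_text = """{"code": 0, "message": "success",
  "data": [{"videoRequestId": "aaa", "taskId": 26, "description": "Queue waiting"},
           {"videoRequestId": "bbb", "taskId": 27, "description": "Queue waiting"}]}"""

resp = json.loads(response_text)
if resp["code"] != 0:
    raise RuntimeError("batchCreate failed: %s" % resp["message"])

# Map each video request ID to the task ID dispatched for it.
tasks = {item["videoRequestId"]: item["taskId"] for item in resp["data"]}
print(tasks)  # {'aaa': 26, 'bbb': 27}
```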

# Creation of TTS Personal Voice Model Generation Task (QID)

# Interface Description

The TTS personal voice model generation (QID) service produces a digital human TTS voice model that matches the pronunciation of the voice material provider, trained from the recorded real-human voice material files and the voice cloning consent files uploaded by the user. To ensure training quality, follow the SenseTime digital human voice cloning collection and production standards during collection; they cover environment, equipment, pronunciation, authorization, and reading scripts. For details, refer to: Collection Standards (opens new window). The PaaS platform retains generated content online for 7 days; transfer it promptly, as it can no longer be downloaded after 7 days.

# Request URL

POST /api/2dvh/v1/material/voice/clone/qid/create

# Request Header

Content-Type: application/json

# Request Parameters

Field Type Required Description
audioUrl String True URL of the training audio file. Supported formats: wav, mp3, m4a, mp4, mov, aac
audioLanguage String True The primary language used in the audio file: zh-CN for Mandarin Chinese, en-US for American English. Follows the BCP 47 standard
consent Object True User consent information
  - audioUrl String True URL of the consent audio file. The consent file should be recorded in the same environment and in the same language as the audio file.
Chinese: "我(发音人姓名)确认我的声音将会被(公司名称)使用于创建合成版本语音。"
English: "I [state your first and last name] am aware that recordings of my voice will be used by [state the name of the company] to create and use a synthetic version of my voice."
Japanese: "私(姓名を記入)は自身の音声を(会社名を記入)が使用し、合成音声を作り使用されることに同意します。"
Korean: "나는 [본인의 이름을 말씀하세요] 내 목소리의 녹음을 이용해 합성 버전을 만들어 사용된다는 것을 [회사 이름을 말씀하세요]알고 있습니다."
Supported formats: wav, mp3, m4a, mp4, mov, aac
  - speakerName String True The name of the speaker in the consent audio file, must be consistent with the name of the speaker in the audio file. Length limit is no more than 64 characters
  - companyName String True The company name used in the consent file, must be consistent with the company name in the audio file. Length limit is no more than 64 characters
taskType String True Training algorithm type: TTS3, TTS6, TTS7, TTS8, or TTS101. TTS3 is the default. For additional requirements, please consult technical support
voice Object True Information about the speaker
  - name String True Name of the speaker. Length limit is no more than 64 characters
  - gender Integer True Gender of the speaker (1: Male, 2: Female)
musicSep Boolean False Whether to remove background music from the audio (source separation)
trainMode String False Training mode; only effective for TTS3. common: regular training mode (default); backend_only: fast training mode, which significantly shortens model training time but affects output quality

# Request Example

{
  "audioUrl": "http://oss.com/abc/object.mp3",
  "audioLanguage": "zh-CN",
  "consent": {
      "audioUrl":"http://oss.com/abc/xx.mp3",
      "speakerName": "xiaowang",
      "companyName": "XXXX"
  },
  "taskType": "TTS3",
  "voice": {
    "name": "xiaotang0",
    "gender": 2
  },
  "musicSep": false,
  "trainMode": "common"
}
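The 64-character limits on speakerName, companyName, and the voice name can be enforced client-side before calling the interface. A minimal sketch; `build_qid_request` is a hypothetical helper, and sending the request is omitted:

```python
import json

def build_qid_request(audio_url, consent_url, speaker, company, voice_name, gender,
                      language="zh-CN", task_type="TTS3"):
    """Build the request body for /api/2dvh/v1/material/voice/clone/qid/create."""
    # Enforce the documented 64-character length limits before the API call.
    for field, value in [("speakerName", speaker), ("companyName", company),
                         ("voice.name", voice_name)]:
        if len(value) > 64:
            raise ValueError("%s exceeds 64 characters" % field)
    return json.dumps({
        "audioUrl": audio_url,
        "audioLanguage": language,
        "consent": {"audioUrl": consent_url, "speakerName": speaker,
                    "companyName": company},
        "taskType": task_type,
        "voice": {"name": voice_name, "gender": gender},
    })

body = build_qid_request("http://oss.com/abc/object.mp3", "http://oss.com/abc/xx.mp3",
                         "xiaowang", "XXXX", "xiaotang0", 2)
```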

# Response Elements

Field Type Required Description
code Integer True 0 - Success, Others - Exception
message String True Detailed information of exception
data Object False Task ID

# Response Example

{
    "code": 0,
    "message": "success",
    "data": 11890
}
# TTS Voice Training Audio Time Requirements
Training Algorithm Type Time Requirements
TTS3 At least 5 minutes, better if more than 20 minutes
TTS6 30-90 seconds
TTS7 30-300 seconds
TTS8 30-300 seconds
TTS101 At least 5 minutes, better if more than 20 minutes
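These duration requirements can be checked client-side before uploading training audio; probing the actual audio duration (e.g. with ffprobe) is out of scope here. A sketch transcribing the table into a lookup:

```python
# Minimum/maximum training-audio duration in seconds per algorithm type,
# transcribed from the table above (None = no upper bound; the "better if more
# than 20 minutes" guidance for TTS3/TTS101 is advisory and not enforced here).
DURATION_LIMITS = {
    "TTS3": (300, None),
    "TTS6": (30, 90),
    "TTS7": (30, 300),
    "TTS8": (30, 300),
    "TTS101": (300, None),
}

def duration_ok(task_type: str, seconds: float) -> bool:
    low, high = DURATION_LIMITS[task_type]
    return seconds >= low and (high is None or seconds <= high)

print(duration_ok("TTS6", 45))   # True
print(duration_ok("TTS3", 120))  # False: TTS3 needs at least 5 minutes
```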
# TTS Language Standards (BCP 47 Standard)
Code Language (Region)
en-US English (United States)
zh-CN Chinese (China)
af-ZA Afrikaans (South Africa)
am-ET Amharic (Ethiopia)
ar-EG Arabic (Egypt)
ar-SA Arabic (Saudi Arabia)
az-AZ Azerbaijani (Azerbaijan)
bg-BG Bulgarian (Bulgaria)
bn-BD Bengali (Bangladesh)
bn-IN Bengali (India)
bs-BA Bosnian (Bosnia and Herzegovina)
ca-ES Catalan (Spain)
cs-CZ Czech (Czech Republic)
cy-GB Welsh (United Kingdom)
da-DK Danish (Denmark)
de-AT German (Austria)
de-CH German (Switzerland)
de-DE German (Germany)
el-GR Greek (Greece)
en-AU English (Australia)
en-CA English (Canada)
en-GB English (United Kingdom)
en-IE English (Ireland)
en-IN English (India)
es-ES Spanish (Spain)
es-MX Spanish (Mexico)
et-EE Estonian (Estonia)
eu-ES Basque (Spain)
fa-IR Persian (Iran)
fi-FI Finnish (Finland)
fil-PH Filipino (Philippines)
fr-BE French (Belgium)
fr-CA French (Canada)
fr-CH French (Switzerland)
fr-FR French (France)
ga-IE Irish (Ireland)
gl-ES Galician (Spain)
he-IL Hebrew (Israel)
hi-IN Hindi (India)
hr-HR Croatian (Croatia)
hu-HU Hungarian (Hungary)
hy-AM Armenian (Armenia)
id-ID Indonesian (Indonesia)
is-IS Icelandic (Iceland)
it-IT Italian (Italy)
ja-JP Japanese (Japan)
jv-ID Javanese (Indonesia)
ka-GE Georgian (Georgia)
kk-KZ Kazakh (Kazakhstan)
km-KH Khmer (Cambodia)
kn-IN Kannada (India)
ko-KR Korean (South Korea)
lo-LA Lao (Laos)
lt-LT Lithuanian (Lithuania)
lv-LV Latvian (Latvia)
mk-MK Macedonian (North Macedonia)
ml-IN Malayalam (India)
mn-MN Mongolian (Mongolia)
ms-MY Malay (Malaysia)
mt-MT Maltese (Malta)
my-MM Burmese (Myanmar)
nb-NO Norwegian Bokmål (Norway)
ne-NP Nepali (Nepal)
nl-BE Dutch (Belgium)
nl-NL Dutch (Netherlands)
pl-PL Polish (Poland)
ps-AF Pashto (Afghanistan)
pt-BR Portuguese (Brazil)
pt-PT Portuguese (Portugal)
ro-RO Romanian (Romania)
ru-RU Russian (Russia)
si-LK Sinhala (Sri Lanka)
sk-SK Slovak (Slovakia)
sl-SI Slovenian (Slovenia)
so-SO Somali (Somalia)
sq-AL Albanian (Albania)
sr-RS Serbian (Serbia)
su-ID Sundanese (Indonesia)
sv-SE Swedish (Sweden)
sw-KE Swahili (Kenya)
ta-IN Tamil (India)
te-IN Telugu (India)
th-TH Thai (Thailand)
tr-TR Turkish (Turkey)
uk-UA Ukrainian (Ukraine)
ur-PK Urdu (Pakistan)
uz-UZ Uzbek (Uzbekistan)
vi-VN Vietnamese (Vietnam)
zh-HK Chinese (Hong Kong)
zh-TW Chinese (Taiwan)
zu-ZA Zulu (South Africa)

# Creation of TTS Personal Voice Model Generation Task

# Interface Description

The TTS personal voice model generation service produces a digital human TTS voice model that matches the pronunciation of the voice material provider, trained from the real human voice material files uploaded by the user. To ensure training quality, the training audio must be at least 5 minutes long. Follow the SenseTime digital human voice cloning collection and production specification during collection; it covers environment, equipment, pronunciation, authorization, and reading scripts. For details, refer to: Collection Specification (opens new window). The PaaS platform retains generated content online for 7 days; transfer it promptly, as it can no longer be downloaded after 7 days.

# Request URL

POST /api/2dvh/v1/material/voice/clone/create

# Request Header

Content-Type: application/json

# Request Parameters

Field Type Required Description
url String True Training audio file URL, duration not less than 5 minutes
voice Object True Voice parameters
  - name String True Name of the speaker
  - gender Integer True Gender of the speaker (1: Male, 2: Female)
  - language String True Language of the speaker (currently only supports zh-CN: Mandarin Chinese)
musicSep Boolean False Whether to remove background music from the audio
sampleAudioMsg String False Sample audio content text. By default, no sample audio is generated. Not more than 500 characters.
trainMode String False Training mode. common: regular training mode (default); backend_only: fast training mode, which significantly shortens model training time but affects output quality.

# Request Example

{
  "url": "http://oss.com/abc/object.zip",
  "voice": {
    "name": "xiaotang0",
    "gender": 2,
    "language": "zh-CN"
  },
  "sampleAudioMsg": "I am SenseTime Digital Human!",
  "musicSep": true,
  "trainMode": "common"
}

# Response Elements

Field Type Required Description
code Integer True 0 - Success, Others - Error
message String True Error detailed information
data Object False Task id

# Response Example

{
    "code": 0,
    "message": "success",
    "data": 11890
}

# Create Character Image Model Generation Task

# Interface Description

Generates character image models by calling algorithm capabilities on one or more videos uploaded by the user and the specified content; a single training session can produce one or multiple model files. The algorithm returns a compressed package and thumbnail files of the character image model for the user to download. For the uploaded video content, please refer to the Collection Specification (opens new window). The PaaS platform retains generated content online for 7 days; transfer it promptly, as it can no longer be downloaded after 7 days. If the character model generation result is unsatisfactory, refer to the case solutions in this document for training parameter adjustments.

Supports ordinary digital human training and premium digital human training.

# Request Address

POST /api/2dvh/v1/material/2davatar/model/multi/create

# Request Header

Content-Type: application/json

# Request Parameters

Field Type Required Description
materialName String True Name of the character model material, only one name is supported per training task
videoUrl String True Download address of the base video material; the base video must be longer than 6 minutes
param String True Parameters for the multi-video character model generation task (a JSON document serialized into an escaped string); see the parameter description and JSON example below

# Parameter Description for param

Field Type Required Description
personal Object True Basic video parameters; can be overridden by auxiliary videos. If no auxiliary video is provided, the basic video parameters are used for processing.
  - segmentStyle Integer True Background segmentation method: 0: no segmentation, 1: green screen segmentation, 2: general segmentation, 3: SDK green screen segmentation post-processing (GPU post-processing during video synthesis)
  - removeGreenEdge Boolean False Effective when segmentStyle=2, default is false, removes green edges around the character
  - greenParamsRefinethHBgr Integer False Effective when segmentStyle=1 or 3, default is 160, range 0-255; refine alpha high threshold (for red-green-blue background), used to adjust the degree of background retention, the larger the value, the more the background is retained
  - greenParamsRefinethLBgr Integer False Effective when segmentStyle=1 or 3, default is 40, range 0-255; refine alpha low threshold (for red-green-blue background), used to adjust the edge width of the human/body, the larger the value, the more is retained
  - greenParamsBlurKs Integer False Effective when segmentStyle=1 or 3, default is 3, smoothness; blur coefficient for de-noising, greater than or equal to 0. The larger the value, the smoother the result, which affects the edges. If black edges appear, increase this value; if the edges erode inward, decrease it
  - greenParamsColorbalance Integer False Effective when segmentStyle=1 or 3, default is 100, green degree, range 0-100, the larger the value, the higher the green degree
  - greenParamsSpillByalpha Double False Effective when segmentStyle=1 or 3, default is 0.5, color balance for removing green, range [-1.0 ~ 1.0], 0 ~ 1 reduces color cast, -1 ~ 0 enhances color, less than 0.5 may cause yellow color cast, greater than 0.5 may cause cyan-blue color cast. If using blue screen segmentation, the default value should be changed to 0.0
  - greenParamsSamplePointBgr int[] False Effective when segmentStyle=1 or 3, sample colors, composed of three values, each with a range of 0-255, e.g., [0, 255, 0]. If using blue screen segmentation, the default value should be changed to [255, 0, 0]
  - assetStart Float False Video material clipping start time (seconds) (not valid for premium digital human)
  - assetEnd Float False Video material clipping end time (seconds) (not valid for premium digital human)
  - assetScale Float False Video material scale ratio (default 1.0)
  - actionChange Object False Premium digital human parameters. Effective when support=true, indicating that the trained digital human is a premium digital human.
actionChange is mutually exclusive with actionEdit; do not set support=true in both at the same time.
  -   - support Boolean True Whether premium digital human is supported, true for premium digital human.
  -   - staticRangeStart Float True Static material start time (seconds) (only supports premium digital human)
  -   - staticRangeEnd Float True Static material end time (seconds) (only supports premium digital human)
  -   - dynamicRangeStart Float True Dynamic material start time (seconds) (only supports premium digital human)
  -   - dynamicRangeEnd Float True Dynamic material end time (seconds) (only supports premium digital human)
  -   - gap Integer False Maximum interval frame number for clipping points (default 75)
  - actionEdit Object False Action-editing digital human parameters. Effective when support=true, indicating that the trained digital human is an action-editing digital human.
actionChange is mutually exclusive with actionEdit; do not set support=true in both at the same time.
  -   - support Boolean True Whether action-editing is supported, true for support.
  -   - videoPath String True Dynamic material file address
  -   - gap Integer False Maximum interval frame number for clipping points (default 25)
  -   - actionList Array True Action list
  -   -   - name String True Action name
  -   -   - clipRangeStart Float True Start time (seconds)
  -   -   - clipRangeEnd Float True End time (seconds)
  -   -   - description String False Text description of the action
persistent Object False Global parameters of the model. Cannot be overridden by auxiliary video parameters.
  - avatarType Integer False Digital human type, default is 0. (0: ordinary digital human, 1: dynamic digital human, 2: action-editing digital human, 3: fast digital human)
  - videoCrfQuality Integer False Video coding quality parameter crf, the smaller the parameter, the better the quality but the larger the file size, default is 23, allowing a range of 0-51, recommended 14-28
  - stage1Config Array False Mouth-shape training configuration for the character model. Each element is 0 for the original mouth-shape model or 1 for the universal mouth-shape model; users can manually switch between the two mouth-shape models based on actual results
  - dev Object False Video material model training configuration
  -   - stage2 Object False Video material model training configuration
  -   -   - config Integer False Video material model training configuration, model size, default is 0 indicating 2k resolution model; 1 indicating 4k resolution model
override Array False Auxiliary video information. (not valid for premium digital human parameter group and action-editing digital human parameter group)
  - videoUrl String True Auxiliary video URL. If auxiliary video is not configured, parameters in personal will be used
  - segmentStyle Integer False Background segmentation method: 0: no segmentation, 1: green screen segmentation, 2: general segmentation, 3: SDK green screen segmentation post-processing (GPU post-processing during video synthesis)
  - removeGreenEdge Boolean False Effective when segmentStyle=2, default is false, removes green edges around the character
  - greenParamsRefinethHBgr Integer False Effective when segmentStyle=1 or 3, default is 160, range 0-255; refine alpha high threshold (for red-green-blue background), used to adjust the degree of background retention, the larger the value, the more the background is retained
  - greenParamsRefinethLBgr Integer False Effective when segmentStyle=1 or 3, default is 40, range 0-255; refine alpha low threshold (for red-green-blue background), used to adjust the edge width of the human/body, the larger the value, the more is retained
  - greenParamsBlurKs Integer False Effective when segmentStyle=1 or 3, default is 3, smoothness; blur coefficient for de-noising, greater than or equal to 0. The larger the value, the smoother the result, which affects the edges. If black edges appear, increase this value; if the edges erode inward, decrease it
  - greenParamsColorbalance Integer False Effective when segmentStyle=1 or 3, default is 100, green degree, range 0-100, the larger the value, the higher the green degree
  - greenParamsSpillByalpha Double False Effective when segmentStyle=1 or 3, default is 0.5, color balance for removing green, range [-1.0 ~ 1.0], 0 ~ 1 reduces color cast, -1 ~ 0 enhances color, less than 0.5 may cause yellow color cast, greater than 0.5 may cause cyan-blue color cast. If using blue screen segmentation, the default value should be changed to 0.0
  - greenParamsSamplePointBgr int[] False Effective when segmentStyle=1 or 3, sample colors, composed of three values, each with a range of 0-255, e.g., [0, 255, 0]. If using blue screen segmentation, the default value should be changed to [255, 0, 0]
  - assetStart Float False Video material clipping start time (seconds)
  - assetEnd Float False Video material clipping end time (seconds)
  - assetScale Float False Video material scale ratio (default 1.0)

# Request Example

{
  "materialName": "534",
  "videoUrl": "https://xxx/materials/33/demo_20230228104258028_20230720185601860.mp4",
  "param": "{\"personal\":{\"segmentStyle\":1,\"removeGreenEdge\":false,\"greenParamsRefinethHBgr\":180,\"greenParamsRefinethLBgr\":50,\"greenParamsBlurKs\":3,\"greenParamsColorbalance\":90,\"greenParamsSpillByalpha\":0.4,\"greenParamsSamplePointBgr\":[0,255,0],\"assetStart\":0.1,\"assetEnd\":0.6,\"assetScale\":1},\"persistent\":{\"videoCrfQuality\":23,\"stage1Config\":[0,1],\"dev\":{\"stage2\":{\"config\":1}}},\"override\":[{\"videoUrl\":\"https://aigc-video-saas.oss-cn-hangzhou.aliyuncs.com/AIGC/online/vendor/24/customization/1700120490581/package_1700120490581.mp4\",\"segmentStyle\":1,\"removeGreenEdge\":false,\"greenParamsRefinethHBgr\":180,\"greenParamsRefinethLBgr\":50,\"greenParamsBlurKs\":3,\"greenParamsColorbalance\":90,\"greenParamsSpillByalpha\":0.4,\"greenParamsSamplePointBgr\":[0,255,0],\"assetStart\":0.1,\"assetEnd\":0.6,\"assetScale\":1},{\"videoUrl\":\"https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/demo.mp4\",\"segmentStyle\":1,\"removeGreenEdge\":false,\"greenParamsRefinethHBgr\":180,\"greenParamsRefinethLBgr\":50,\"greenParamsBlurKs\":3,\"greenParamsColorbalance\":90,\"greenParamsSpillByalpha\":0.4,\"greenParamsSamplePointBgr\":[0,255,0],\"assetStart\":0.1,\"assetEnd\":0.6,\"assetScale\":1}]}"
}
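The escaped string in `param` above can be produced by building the parameter object as an ordinary dict and serializing it once; the outer request body is then serialized separately, which yields the double-escaped form shown. A sketch with a subset of fields and a placeholder material URL:

```python
import json

param_obj = {
    "personal": {"segmentStyle": 1, "removeGreenEdge": False,
                 "greenParamsSamplePointBgr": [0, 255, 0], "assetScale": 1},
    "persistent": {"videoCrfQuality": 23, "stage1Config": [0, 1],
                   "dev": {"stage2": {"config": 1}}},
    "override": [],  # auxiliary videos would be listed here
}

request_body = json.dumps({
    "materialName": "534",
    "videoUrl": "https://example.com/demo.mp4",  # placeholder material URL
    "param": json.dumps(param_obj),  # nested JSON becomes the escaped string
})
```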

# Response Elements

Field Type Required Description
code Integer True 0 - Success, Others - Error
message String True Detailed error information
data Object False Task id

# Response Example

{
    "code": 0,
    "message": "success",
    "data": 1
}

# Create Character Image Model Generation Task (Old Interface)

# Interface Description

Note: This interface only supports ordinary digital human model generation tasks and will not be updated with new content. It is recommended to use the Create Character Image Model Generation Task interface instead.

Generates character image models by calling algorithm capabilities on the specified content uploaded by the user, and returns a compressed package and thumbnail file of the character image model for the user to download. For uploaded content, please refer to the Collection Standard (opens new window). The PaaS platform retains generated content online for 7 days; transfer it promptly, as it can no longer be downloaded after 7 days.

# Request URL

POST /api/2dvh/v1/material/2davatar/model/create

# Request Header

Content-Type: application/json

# Request Parameters

Field Type Required Description
materialName String True Name of the character model material
videoUrl String True Video material download address
segmentStyle Integer True Background segmentation method: 0: No segmentation, 1: Green screen segmentation, 2: Normal segmentation, 3: Green screen segmentation post-processing with SDK (GPU post-processing during video synthesis)
removeGreenEdge Boolean False Effective when segmentStyle=2, default is false, function to remove green edges around the character
greenParamsRefinethHBgr Integer False Effective when segmentStyle=1 or 3, default 160, range 0-255; refine alpha high threshold (for red/green/blue background), used to adjust the degree of background retention, the larger the value, the greater the degree of background retention
greenParamsRefinethLBgr Integer False Effective when segmentStyle=1 or 3, default 40, range 0-255; refine alpha low threshold (for red/green/blue background), used to adjust the width of the edge retention of the body/object, the larger the value, the more retention
greenParamsBlurKs Integer False Effective when segmentStyle=1 or 3, default 3, smoothness; blur coefficient for noise reduction, greater than or equal to 0. The larger the value, the smoother the result, which affects the edges. If black edges or color aberration appear on the edges, increase this value; if the edges erode inward, reduce it appropriately
greenParamsColorbalance Integer False Effective when segmentStyle=1 or 3, default 100, degree of green removal, range 0-100, the larger the value, the higher the degree of green removal
greenParamsSpillByalpha Double False Effective when segmentStyle=1 or 3, default 0.5, green color balance adjustment, range [-1.0 ~ 1.0]; 0 ~ 1 reduces color bias, -1 ~ 0 enhances color. Below 0.5 colors shift toward yellow, above 0.5 toward cyan/blue. If using blue screen segmentation, change the default value to 0.0
greenParamsSamplePointBgr int[] False Effective when segmentStyle=1 or 3, sampling color, consisting of three values, each ranging from 0-255, for example, [0, 255, 0], if using blue screen segmentation, the default value needs to be changed to [255, 0, 0]
videoCrfQuality Integer False Video encoding quality parameter crf, the smaller the parameter, the better the quality but the larger the file, default 23, allowable range 0-51, recommended 14-28
assetStart Float False Start time for trimming video material (seconds)
assetEnd Float False End time for trimming video material (seconds)
assetScale Float False Video material scaling ratio (default 1.0)
devStage2Config Integer False Video material model training configuration, model size, default is 0, indicating 2k precision model; 1 indicates 4k precision model
stage1Template Integer False Character model lip shape training configuration, default is 0 indicating the generation of the original lip shape model; 1 indicates the generation of the universal lip shape model, users can choose to manually switch between the two lip shape models according to the actual effect

# Request Example

Example when segmentStyle=0

{
  "materialName": "534",
  "segmentStyle": 0,
  "assetScale": 1,
  "videoCrfQuality": 21,
  "stage1Template": 0,
  "devStage2Config": 0,
  "videoUrl": "https://xxx/materials/33/demo_20230228104258028_20230720185601860.mp4"
}

Example when segmentStyle=1

{
  "materialName": "534",
  "segmentStyle": 1,
  "assetScale": 1,
  "devStage2Config": 0,
  "greenParamsRefinethHBgr": 167,
  "greenParamsRefinethLBgr": 17,
  "greenParamsBlurKs": 7,
  "greenParamsColorbalance": 97,
  "greenParamsSpillByalpha": 0.3,
  "greenParamsSamplePointBgr": [
    7,
    255,
    7
  ],
  "videoCrfQuality": 21,
  "stage1Template": 0,
  "videoUrl": "https://xxx/materials/33/demo_20230228104258028_20230720185601860.mp4"
}

Example when segmentStyle=2 (only greenParamsSpillByalpha needs tuning; the other green screen parameters take their default values)

{
  "materialName": "534",
  "segmentStyle": 2,
  "devStage2Config": 0,
  "stage1Template": 0,
  "removeGreenEdge" : true,
  "assetScale": 1,
  "greenParamsSpillByalpha": 0.3,
  "videoCrfQuality": 21,
  "videoUrl": "https://xxx/materials/33/demo_20230228104258028_20230720185601860.mp4"
}

Example when segmentStyle=3

{
  "materialName": "534",
  "segmentStyle": 3,
  "assetScale": 1,
  "devStage2Config": 0,
  "stage1Template": 0,
  "greenParamsRefinethHBgr": 167,
  "greenParamsRefinethLBgr": 17,
  "greenParamsBlurKs": 7,
  "greenParamsColorbalance": 97,
  "greenParamsSpillByalpha": 0.3,
  "greenParamsSamplePointBgr": [
    7,
    255,
    7
  ],
  "videoCrfQuality": 21,
  "videoUrl": "https://xxx/materials/33/demo_20230228104258028_20230720185601860.mp4"
}

# Parameter Explanation

In general, default parameters can adapt to most scenarios. However, if scenario-specific issues arise, parameters need to be adjusted accordingly. Below are parameter suggestions for typical scenarios.

  1. General Scenario Parameters (default)

The default values provided above suit most scenarios.

  2. Adjusting parameters for unclear digital human images

Method 1: Lower the video encoding quality parameter (videoCrfQuality). Setting the value to 14 brings the clarity of the digital human material in line with the original material, though it may slightly increase the material size.

Method 2: Add a sharpening value appropriately in the input request for video composition or live creation. Refer to the sharpen value under the beautify object in the JSON definition for more details.

Method 3: Choose 4k version digital human training.

  3. Adjusting parameters when black edges and a slight green hue appear around the character (a frequent occurrence, particularly in scenes with white clothing)

Update (rebuild) the character model with the following parameters, lowering the degree of background retention and the width of character edge retention. This method mainly suits green screen segmentation scenarios. Reference values:

{
  "materialName": "534",
  "segmentStyle": 1,
  "removeGreenEdge": false,
  "assetScale": 1,
  "devStage2Config": 0,
  "stage1Template": 0,
  "greenParamsRefinethHBgr": 90,
  "greenParamsRefinethLBgr": 10,
  "greenParamsBlurKs": 3,
  "greenParamsColorbalance": 100,
  "greenParamsSpillByalpha": -0.3,
  "greenParamsSamplePointBgr": [
    0,
    275,
    0
  ],
  "videoCrfQuality": 21,
  "videoUrl": "https://xxx/materials/33/demo_20230228104258028_20230720185601860.mp4"
}
  4. Adjusting parameters for green edges or an overall green hue around the character

Decrease the green color balance parameter (greenParamsSpillByalpha): the smaller the value, the stronger the green removal, which may also distort colors; for example, lemon yellow may turn orange as green components are removed. A minimum of -0.3 is recommended. This method enhances color and is suitable when there is no yellow in the image. It supports both green screen and normal segmentation. Reference values:

{
  "materialName": "534",
  "segmentStyle": 2,
  "removeGreenEdge": true,
  "devStage2Config": 0,
  "stage1Template": 0,
  "greenParamsSpillByalpha": -0.3,
  "videoCrfQuality": 21,
  "videoUrl": "https://xxx/materials/33/demo_20230228104258028_20230720185601860.mp4"
}
  5. Adjusting parameters when large cheek movements result in gray edges on the cheek or neck

This occurs because the initial material segmentation results do not match the driven digital human cheek edge. Choose green screen segmentation post-processing (segmentStyle=3) for training. This method mainly adapts to green screen segmentation processed digital humans.

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Exception |
| message | String | True | Detailed exception information |
| data | Object | False | Task ID |

# Response Example

{
    "code": 0,
    "message": "success",
    "data": 1
}

# Create Character Image Model Update Task

# Interface Description

The motion clips displayed by the 2D digital human are extracted from the training video, by default from the first second up to the 3.5-minute mark. If you are not satisfied with the 2D digital human's motion clips, you can use this interface to modify them, adjusting the length and content of the displayed motions. When updating a character image model, use the same background segmentation method that was used to generate the original model file; changing the segmentation method can cause abnormal effects. The PaaS platform keeps generated content online for 7 days only; transfer it to your own storage in time, as it can no longer be downloaded after 7 days.

# Request Address

POST /api/2dvh/v1/material/2davatar/model/rebuilding/video

# Request Header

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| materialName | String | True | Name of the character model material |
| videoUrl | String | True | Video material download address |
| modelUrl | String | True | Download address of the original model file generated |
| segmentStyle | Integer | True | Background segmentation method: 0: no segmentation; 1: green screen segmentation; 2: normal segmentation; 3: green screen segmentation post-processed with the SDK (GPU post-processing during video synthesis) |
| removeGreenEdge | Boolean | False | Effective when segmentStyle=2; default false; removes green edges around the character |
| greenParamsRefinethHBgr | Integer | False | Effective when segmentStyle=1 or 3; default 160, range 0-255. Refine-alpha high threshold (for red/green/blue backgrounds), used to adjust the degree of background retention; the larger the value, the more background is retained |
| greenParamsRefinethLBgr | Integer | False | Effective when segmentStyle=1 or 3; default 40, range 0-255. Refine-alpha low threshold (for red/green/blue backgrounds), used to adjust the width of edge retention around the body/object; the larger the value, the more is retained |
| greenParamsBlurKs | Integer | False | Effective when segmentStyle=1 or 3; default 3. Smoothness: blur coefficient for noise reduction, greater than or equal to 0; the larger the value, the smoother the edges. Increase this value if black edges or color aberration appear on the edges; decrease it if the edges appear eroded |
| greenParamsColorbalance | Integer | False | Effective when segmentStyle=1 or 3; default 100, range 0-100. Degree of green removal; the larger the value, the more green is removed |
| greenParamsSpillByalpha | Double | False | Effective when segmentStyle=1 or 3; default 0.5, range [-1.0, 1.0]. Green color balance adjustment: 0 to 1 reduces color bias, -1 to 0 enhances color; below 0.5 the image shifts toward yellow, above 0.5 toward cyan/blue. For blue screen segmentation, change the default to 0.0 |
| greenParamsSamplePointBgr | int[] | False | Effective when segmentStyle=1 or 3. Sampling color, consisting of three values, each in the range 0-255, for example [0, 255, 0]. For blue screen segmentation, change the default to [255, 0, 0] |
| videoCrfQuality | Integer | False | Video encoding quality (CRF); the smaller the value, the better the quality and the larger the file. Default 23, allowed range 0-51, recommended 14-28 |
| assetStart | Float | False | Start time for trimming the video material (seconds) |
| assetEnd | Float | False | End time for trimming the video material (seconds) |
| assetScale | Float | False | Video material scaling ratio (default 1.0) |
| actionChange | Object | False | Parameters for switching between dynamic and static material |
| - support | Boolean | True | Whether material action switching is supported |
| - staticRangeStart | Float | True | Start time of the static material (seconds) |
| - staticRangeEnd | Float | True | End time of the static material (seconds) |
| - dynamicRangeStart | Float | True | Start time of the dynamic material (seconds) |
| - dynamicRangeEnd | Float | True | End time of the dynamic material (seconds) |
| - gap | Integer | False | Maximum frame gap between cut points (default 75) |
| actionEdit | Object | False | Action list parameters, effective when support=true |
| - support | Boolean | True | Whether action editing is supported; true to enable |
| - videoPath | String | True | URL of the dynamic material file |
| - gap | Integer | False | Maximum frame gap between cut points (default 25) |
| - actionList | Array | True | List of actions |
| - - name | String | True | Name of the action |
| - - clipRangeStart | Float | True | Start time (seconds) |
| - - clipRangeEnd | Float | True | End time (seconds) |
| - - description | String | False | Text description of the action |

# Request Example

{
  "materialName": "2d任务A",
  "videoUrl": "https://xxx.oss-cn-hangzhou.aliyuncs.com/xxx/audio1.mp4",
  "modelUrl": "https://xxx.oss-cn-hangzhou.aliyuncs.com/xxx/model1.zip",
  "assetStart": 0.0,
  "assetEnd": 120.0,
  "assetScale": 1.0,
  "segmentStyle": 1,
  "devStage2Config": 0,
  "stage1Template": 0,
  "greenParamsRefinethHBgr": 167,
  "greenParamsRefinethLBgr": 17,
  "greenParamsBlurKs": 7,
  "segmentGreenUseGpu":false,
  "greenParamsColorbalance": 97,
  "greenParamsSpillByalpha": 0.3,
  "greenParamsSamplePointBgr": [
    7,
    275,
    7
  ],
  "videoCrfQuality": 21
}
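As a sketch of how such a request body might be assembled and validated client-side, the following Python helper builds the JSON body for this interface. The helper name and the validation rules shown (segmentStyle in 0-3, videoCrfQuality in 0-51, both taken from the parameter table above) are illustrative assumptions, not part of the API.

```python
import json

def build_rebuild_payload(material_name, video_url, model_url,
                          segment_style, **optional):
    """Build the JSON body for the model rebuilding request (sketch)."""
    # segmentStyle must be one of the documented values 0-3.
    if segment_style not in (0, 1, 2, 3):
        raise ValueError("segmentStyle must be 0, 1, 2, or 3")
    # videoCrfQuality, when given, must stay in the documented 0-51 range.
    crf = optional.get("videoCrfQuality")
    if crf is not None and not 0 <= crf <= 51:
        raise ValueError("videoCrfQuality must be in 0-51")
    payload = {
        "materialName": material_name,
        "videoUrl": video_url,
        "modelUrl": model_url,
        "segmentStyle": segment_style,
    }
    payload.update(optional)  # e.g. greenParams*, assetStart/assetEnd
    return json.dumps(payload)

# Placeholder URLs; the resulting string is what would be POSTed
# to /api/2dvh/v1/material/2davatar/model/rebuilding/video.
body = build_rebuild_payload(
    "demo", "https://example.com/a.mp4", "https://example.com/m.zip",
    segment_style=1, videoCrfQuality=21)
```

Validating locally before sending avoids a round trip for requests the service would reject anyway.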

# Parameter Description

In most cases, the default parameters are sufficient. However, if issues arise in specific scenes, the parameters need to be adjusted accordingly. Below are some recommended parameters for typical scenarios.

  1. General Scene Parameters (Default)

Most cases are covered by the default values provided above.

  2. Adjusting parameters for a blurred digital human image

Method 1: Lower the video encoding quality parameter (videoCrfQuality). When set to 14, the clarity of the digital human matches the original human material. This method may slightly increase the material size.

Method 2: Increase the sharp value in the input request of the synthesized video or live broadcast creation. Refer to the beautify object's sharpen value in the JSON definition.

Method 3: Train a 4K version of the digital human; note that the resolution cannot be changed after the update.

  3. Adjusting parameters for black edges with slight green around the character's edges (common, especially in scenes with white clothing)

Refer to the parameters below for updating the person model (rebuilding). Also, lower the background retention and edge retention width. This method mainly fits green screen segmentation. Suggested values below:

{
  "materialName": "534",
  "segmentStyle": 1,
  "removeGreenEdge": false,
  "assetScale": 1,
  "greenParamsRefinethHBgr": 90,
  "greenParamsRefinethLBgr": 10,
  "greenParamsBlurKs": 3,
  "greenParamsColorbalance": 100,
  "greenParamsSpillByalpha": -0.3,
  "greenParamsSamplePointBgr": [
    0,
    275,
    0
  ],
  "videoCrfQuality": 21,
  "videoUrl": "https://xxx/materials/33/demo_20230228104258028_20230720185601860.mp4"
}
  4. Adjusting parameters for green edges or an overall green hue around the character

Decrease the green color balance parameter (greenParamsSpillByalpha) for stronger green removal; over-reduction may cause color shift (e.g., lemon yellow turning orange as green is removed). The recommended minimum is -0.3. This method enhances color and is suitable when there is no yellow in the image; it supports both green screen and normal segmentation. Suggested values below:

{
  "materialName": "534",
  "segmentStyle": 2,
  "removeGreenEdge": true,
  "greenParamsSpillByalpha": -0.3,
  "videoCrfQuality": 21,
  "videoUrl": "https://xxx/materials/33/demo_20230228104258028_20230720185601860.mp4"
}
  5. Adjusting parameters for gray edges on the cheek or neck during large cheek movements while talking

This usually occurs due to a mismatch between initial material segmentation and driven edge of the digital human's cheek. Use green screen segmentation post-processing (segmentStyle=3) for training, suitable for green screen segmented digital humans.

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Exception |
| message | String | True | Detailed exception information |
| data | Object | False | Task ID |

# Response Example

{
    "code": 0,
    "message": "success",
    "data": 1
}

# Create Image Green Screen Segmentation Preview Task

# Interface Description

Image green screen segmentation effect preview

# Request Address

POST /api/2dvh/v1/material/2davatar/model/green/segment/image/create

# Request Header

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| materialName | String | True | Name of the image green screen segmentation preview task |
| url | String | True | Image material download address |
| param | String | True | Segmentation parameters for the preview task, passed as a JSON-escaped string; see the parameter description and JSON example below |

# param Parameter Description

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| greenParamsRefinethHBgr | Integer | False | Default 160, range 70-220. Refine-alpha high threshold (for red/green/blue backgrounds), used to adjust the degree of background retention; the larger the value, the more background is retained |
| greenParamsRefinethLBgr | Integer | False | Default 40, range 10-80. Refine-alpha low threshold (for red/green/blue backgrounds), used to adjust the width of edge retention around the body/object; the larger the value, the more is retained |
| greenParamsBlurKs | Integer | False | Default 3, range 1-24. Smoothness: blur coefficient for noise reduction, greater than or equal to 0; the larger the value, the smoother the edges. Increase this value if black edges or color aberration appear on the edges; decrease it if the edges appear eroded |
| greenParamsColorbalance | Integer | False | Default 100, range 0-100. Degree of green removal; the larger the value, the more green is removed |
| greenParamsSpillByalpha | Double | False | Default 0.5, range [-1.0, 1.0]. Green color balance adjustment: 0 to 1 reduces color bias, -1 to 0 enhances color; below 0.5 the image shifts toward yellow, above 0.5 toward cyan/blue. For blue screen segmentation, change the default to 0.0 |
| greenParamsSamplePointBgr | int[] | False | Sampling color, consisting of three values, each in the range 0-255, for example [0, 255, 0]. For blue screen segmentation, change the default to [255, 0, 0] |
| greenParamsSampleBackground | object | False | Background parameters; see the parameter description and JSON example below |

# Explanation of the greenParamsSampleBackground Parameter

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| color | int[] | False | Default [0,255,0], RGB color value, each channel in the range 0-255 |

# Request Example

{
  "materialName": "534",
  "url": "https://xxx/materials/33/demo_20230228104258028_20230720185601860.mp4",
  "param": "{\"green_params_refineth_h_bgr\":230,\"green_params_refineth_l_bgr\":70,\"green_params_blur_ks\":3,\"green_params_colorbalance\":100,\"green_params_spill_byalpha\":0,\"green_params_sample_point_bgr\":[0,255,0],\"green_params_sample_background\":{\"color\":[0,100,255]}}"
}
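Note that param is a JSON object serialized to a string. A minimal Python sketch of producing the escaped string with json.dumps, using the snake_case keys from the request example above (the URL is a placeholder):

```python
import json

# Segmentation parameters as a plain dict, keyed as in the request example.
segmentation_params = {
    "green_params_refineth_h_bgr": 230,
    "green_params_refineth_l_bgr": 70,
    "green_params_blur_ks": 3,
    "green_params_colorbalance": 100,
    "green_params_spill_byalpha": 0,
    "green_params_sample_point_bgr": [0, 255, 0],
    "green_params_sample_background": {"color": [0, 100, 255]},
}

request_body = {
    "materialName": "534",
    "url": "https://example.com/material.png",  # placeholder material URL
    "param": json.dumps(segmentation_params),   # nested JSON as a string
}
```

Serializing the inner object with json.dumps produces exactly the escaped-quote form shown in the request example, so no manual escaping is needed.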

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Exception |
| message | String | True | Detailed exception information |
| data | Object | False | Task information |
| - id | Long | True | Task ID |
| - url | String | True | Image URL |

# Response Example

{
  "code": 0,
  "message": "success",
  "data": 1
}

# Video Green Screen Segmentation Effect Preview

# Interface Description

Video green screen segmentation effect preview

# Request URL

POST /api/2dvh/v1/material/2davatar/model/green/segment/video/create

# Request Header

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| materialName | String | True | Name of the video green screen segmentation preview task |
| url | String | True | Base video material download address; the base video must be longer than 6 minutes |
| param | String | True | Segmentation parameters for the preview task, passed as a JSON-escaped string; see the parameter description and JSON example below |

# Param Parameter Description

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| greenParamsRefinethHBgr | Integer | False | Default 160, range 70-220. Refine-alpha high threshold (for red/green/blue backgrounds), used to adjust the degree of background retention; the larger the value, the more background is retained |
| greenParamsRefinethLBgr | Integer | False | Default 40, range 10-80. Refine-alpha low threshold (for red/green/blue backgrounds), used to adjust the width of edge retention around humans/objects; the larger the value, the more is retained |
| greenParamsBlurKs | Integer | False | Default 3, range 1-24. Smoothness: blur coefficient for noise reduction, greater than or equal to 0; the larger the value, the smoother the edges. Increase this value if black edges or color aberration appear on the edges; decrease it if the edges appear eroded |
| greenParamsColorbalance | Integer | False | Default 100, range 0-100. Degree of green removal; the larger the value, the more green is removed |
| greenParamsSpillByalpha | Double | False | Default 0.5, range [-1.0, 1.0]. Green color balance removal: 0 to 1 reduces color bias, -1 to 0 enhances color; below 0.5 the image shifts toward yellow, above 0.5 toward cyan. For blue screen segmentation, change the default to 0.0 |
| greenParamsSamplePointBgr | int[] | False | Sampling color, consisting of three values, each in the range 0-255, for example [0, 255, 0]. For blue screen segmentation, change the default to [255, 0, 0] |
| greenParamsSampleBackground | object | False | Background parameters; see the parameter description and JSON example below |

# Explanation of the greenParamsSampleBackground Parameter

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| color | int[] | False | Default [0,255,0], RGB color value, each channel in the range 0-255 |

# Request Example

{
  "materialName": "534",
  "url": "https://xxx/materials/33/demo_20230228104258028_20230720185601860.mp4",
  "param": "{\"green_params_refineth_h_bgr\":230,\"green_params_refineth_l_bgr\":70,\"green_params_blur_ks\":3,\"green_params_colorbalance\":100,\"green_params_spill_byalpha\":0,\"green_params_sample_point_bgr\":[0,255,0],\"green_params_sample_background\":{\"color\":[0,100,255]}}"
}

# Response Elements

Field Type Required Description
code Integer True 0 - Success, others - Exception
message String True Detailed exception information
data Object False Task ID

# Response Example

{
  "code": 0,
  "message": "success",
  "data": 1
}

# Create Video Character Face Swap Task (Internal Test)

# Interface Description

Use algorithm capabilities based on the video content and template image uploaded by the user to process video character face swap, and finally return the processed video file and thumbnail for the user to download.

# Request URL

POST /api/2dvh/v1/material/face/swap/create

# Request Header

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| facePhotoUrl | String | True | Template face photo for the face swap |
| videoUrl | String | True | Original video file for the face swap |
| materialName | String | True | Name of the face swap task |

# Request Example

{
  "facePhotoUrl": "facePhotoUrl",
  "videoUrl": "videoUrl",
  "materialName": "materialName"
}

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Error |
| message | String | True | Detailed error message |
| data | Object | False | Task ID |

# Response Example

{
  "code": 0,
  "message": "success",
  "data": 1
}


# Get Information on a Specific Task

# Interface Description

Query the corresponding information and current status of the task based on the task ID provided by the user.

# Request URL

POST /api/2dvh/v1/task/info

# Request Header

Content-Type: application/json

# Request Parameters

The request body is a JSON object with the following fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| ids | Long[] | True | List of task IDs |

# Request Example

{
    "ids": [7,27]
}

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Error |
| message | String | True | Detailed error message |
| data | Object | False | Task information |
| - id | Long | True | Task ID |
| - materialId | Integer | True | Material ID |
| - materialName | String | True | Material name |
| - algoType | Integer | True | Task type |
| - algoSubType1 | String | True | Character model specification: 2K/4K |
| - algoSubType2 | String | False | Video synthesis: result format (webm/mp4) |
| - algoSubType3 | String | False | Video synthesis: result frame rate |
| - status | Integer | True | Task status |
| - extendParam | String | False | Extended parameters |
| - productParam | String | True | Task result as a JSON string; the format differs by task type |
| - startTime | String | True | Algorithm start time (yyyy-MM-dd HH:mm:ss) |
| - endTime | String | True | Algorithm end time (yyyy-MM-dd HH:mm:ss) |

# Response Example

{
    "code": 0,
    "message": "success",
    "data": {
        "id": 7,
        "materialId": 27,
        "materialName": "Sample Material",
        "algoType": 14,
        "algoSubType1": "2K",
        "algoSubType2": "mp4",
        "algoSubType3": "30fps",
        "status": 2,
        "extendParam": "",
        "productParam": "{}",
        "startTime": "2023-01-01 00:00:00",
        "endTime": "2023-01-01 00:10:00"
    }
}

# Case 1: Video Synthesis

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| duration | Integer | True | Video synthesis duration (milliseconds) |
| lastFrameIndex | Integer | True | Index of the last video frame |
| algoSubType1 | String | True | Character model specification: 2K/4K |
| algoSubType2 | String | True | Result format: webm/mp4 |
| algoSubType3 | String | True | Result frame rate |
| thumbPath | String | True | Thumbnail download URL (valid for 7 days) |
| videoPath | String | True | Video download URL (valid for 7 days) |

# Case 1 Return Example:
{
    "code": 0,
    "message": "success",
    "data": [
        {
            "id": 913318,
            "materialId": 854513,
            "materialName": "913288",
            "productParam": "{\"duration\": 880, \"thumbPath\": \"https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/download/116/xxxx/thumb.png\", \"videoPath\": \"https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/download/116/xxxx/video.mp4\"}",
            "extendParam": null,
            "startTime": "2024-05-27 16:38:54",
            "endTime": "2024-05-27 16:39:03",
            "status": 5,
            "message": "{\"time_info\": {\"parse_json\": {\"avg\": 15, \"end\": \"2024-05-27 16:38:52.913\", \"sum\": 15, \"start\": \"2024-05-27 16:38:52.897\"}, \"preprocess\": {\"avg\": 342, \"end\": \"2024-05-27 16:38:56.134\", \"sum\": 342, \"start\": \"2024-05-27 16:38:55.792\"}, \"postprocess\": {\"avg\": 380, \"end\": \"2024-05-27 16:39:02.351\", \"sum\": 380, \"start\": \"2024-05-27 16:39:01.971\"}, \"main_process\": {\"avg\": 5837, \"end\": \"2024-05-27 16:39:01.971\", \"sum\": 5837, \"start\": \"2024-05-27 16:38:56.134\"}, \"audio_process\": {\"avg\": 123.91666412353516, \"sum\": 2974}, \"video_process\": {\"avg\": 9.47826099395752, \"sum\": 218}, \"wait_srt_stream\": {\"avg\": 0, \"sum\": 0}, \"send_task_response\": {\"start\": \"2024-05-27 16:39:02.775\"}, \"receive_task_from_agent\": {\"start\": \"2024-05-27 16:38:52.897\"}, \"st_mobile_change_package\": {\"avg\": 1832, \"end\": \"2024-05-27 16:38:54.745\", \"sum\": 1832, \"start\": \"2024-05-27 16:38:52.913\"}}, \"video_info\": {\"fps\": 25, \"format\": \"mp4\", \"digital_type\": \"2K\", \"last_frame_index\": 22}}",
            "algoType": 14,
            "algoId": "8216eaea6xxxxxx2e0798d21",
            "algoSubType1": "2K",
            "algoSubType2": "mp4",
            "algoSubType3": "25",
            "isDelete": 0
        }
    ]
}
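
In the return example above, productParam is itself a JSON string embedded in the response JSON, so it must be decoded a second time before its fields can be read. A minimal Python sketch, with sample values abbreviated from the Case 1 example (URLs are placeholders):

```python
import json

# Abbreviated Case 1 response; productParam is a JSON string, not an object.
response = {
    "code": 0,
    "data": [{
        "id": 913318,
        "status": 5,  # terminal status observed in the example
        "productParam": json.dumps({
            "duration": 880,
            "thumbPath": "https://example.com/thumb.png",
            "videoPath": "https://example.com/video.mp4",
        }),
    }],
}

task = response["data"][0]
product = json.loads(task["productParam"])  # second decode
duration_ms = product["duration"]           # synthesis duration in ms
video_url = product["videoPath"]            # valid for 7 days
```

The same double-decode applies to the other cases below; only the fields inside productParam differ.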
# Case 2: Character Model Generation

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| thumbPath | String | True | Download URL for the thumbnail of the character model generated from the base video (valid for 7 days) |
| multi | Array | True | Model results |
| - width | String | True | Width |
| - height | String | True | Height |
| - pkgPath | String | True | Download URL for the character model (valid for 7 days); not returned when generating models from multiple videos |
| - thumbPath | String | True | Download URL for the thumbnail of the character model generated from the video (valid for 7 days) |
| - faceFeatureId | String | True | Face feature ID |
| - userJson | String | True | Training parameters |
| - avatarResultJson | String | True | Training results |

# Case 2 Return Example:
{
    "code": 0,
    "message": "success",
    "data": [
        {
            "id": 908438,
            "materialId": 850297,
            "materialName": "蘇xxx",
            "productParam": "{\"multi\": [{\"common\": {\"pkgPath\": \"https://dwg-aigc-paas.oss-cn-hangzhou.x.com/x/116/xxxx/input_source/2/xx.zip\", \"userJson\": \"https://dwg-aigc-paas.oss-cnxxxx/input_source/2/xxx.json\", \"thumbPath\": \"https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/download/116/xxx/input_source/2/xxx.png\", \"faceFeatureId\": \"xxxxx\"}, \"origin\": {\"pkgPath\": \"https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/download/116/2bd5e869e94d4995967428fa7ad7cf49_s1/input_source/0/xxx.zip\", \"userJson\": \"https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/download/116/xxx/input_source/0/xxx.json\", \"thumbPath\": \"https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/download/116/xxx/input_source/0/xxx.png\", \"faceFeatureId\": \"2bd5e869e94dxxxxxa7ad7cf49_s1_0\"}, \"videoUrl\": \"https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/xxxxB.mp4\"}], \"thumbPath\": \"https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/download/116/xxx/input_source/0/xxx.png\"}",
            "extendParam": null,
            "startTime": "2024-05-22 18:14:21",
            "endTime": "2024-05-22 21:57:19",
            "status": 5,
            "message": "{}",
            "algoType": 12,
            "algoId": "2bd5exxxxxa7ad7cf49_s1",
            "algoSubType1": "4K",
            "algoSubType2": "multi",
            "algoSubType3": "normal",
            "isDelete": 0
        }
    ]
}
# Case 3: Character Model Generation from Multiple Videos (result of the /model/multi/create interface)

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| thumbPath | String | True | Download URL for the thumbnail of the character model generated from the base video (valid for 7 days) |
| multi | Array | False | Character model results for the multiple videos, as an array |
| - videoUrl | String | True | URL of the original video file |
| - origin | Object | True | Original lip-sync character model object (when the stage1Template parameter is 0) |
| - - thumbPath | String | True | Download URL for the thumbnail of the character model (valid for 7 days) |
| - - pkgPath | String | True | Download URL for the character model (valid for 7 days) |
| - - faceFeatureId | String | True | Face feature ID |
| - common | Object | True | Common lip-sync character model object (when the stage1Template parameter is 1) |
| - - thumbPath | String | True | Download URL for the thumbnail of the character model (valid for 7 days) |
| - - pkgPath | String | True | Download URL for the character model (valid for 7 days) |
| - - faceFeatureId | String | True | Face feature ID |

# Case 3 productParam Return Example:
{
	"multi": [{
		"common": {
			"pkgPath": "https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/download/xxx47_s1_input_source_2_result.zip",
			"userJson": "https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/download/ss_user.json",
			"thumbPath": "https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/xxlt.png",
			"faceFeatureId": "8c19c600a75addd9e666eca06413f47_s1_1",
                         "avatarResultJson": "https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/xxlt.Json"
		},
		"origin": {
			"pkgPath": "https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/download/aa_0_result.zip",
			"userJson": "https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/bbuser.json",
			"thumbPath": "https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/download/xx0_result.png",
			"faceFeatureId": "8c19c600a75a4f323e666eca06413f47_s1_0",
                         "avatarResultJson": "https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/xxlt.Json"
		},
		"videoUrl": "https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/xx.mp4"
	}],
	"thumbPath": "https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/xx.png"
}
# Case 4: Character Model Update

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| thumbPath | String | True | Download URL for the thumbnail of the character model (valid for 7 days) |
| pkgPath | String | True | Download URL for the character model (valid for 7 days) |
| modelInfo | String | True | Character model specification: 2K/4K |

# Case 4 Return Example:
{
    "code": 0,
    "message": "success",
    "data": [
        {
            "id": 890212,
            "materialId": 833961,
            "materialName": "KURUMI",
            "productParam": "{\"pkgPath\": \"https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/download/116/xxxx/xxx.zip\", \"userJson\": \"https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/download/116/xxxx/xxxx.json\", \"modelInfo\": \"2K\", \"thumbPath\": \"https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/download/116/xxx/xxxx.png\"}",
            "extendParam": null,
            "startTime": "2024-05-10 10:51:35",
            "endTime": "2024-05-10 11:50:24",
            "status": 5,
            "message": "{}",
            "algoType": 18,
            "algoId": "cut_bf9c19046exxxxxb9af791587_s1",
            "algoSubType1": "2K",
            "algoSubType2": null,
            "algoSubType3": "normal",
            "isDelete": 0
        }
    ]
}
# Case 5: TTS Voice Model Generation

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| taskId | String | True | Corresponding task ID |
| voice | Object | True | Voice information |
| - id | String | True | Voice ID |
| - name | String | True | Speaker's name |
| - gender | Integer | True | Speaker's gender: 0 = not known, 1 = male, 2 = female, 9 = not applicable |
| - language | String | True | Speaker's language: zh-CN for Mandarin Chinese, en-US for American English |
| - vendor_id | Integer | True | Voice vendor ID |
| taskStatus | Integer | True | Task status: 1 = queuing, 2 = in progress, 3 = cancelled, 5 = completed, 9 = exception |
| msg | String | True | Task status description |
| stage | String | True | Task sub-step: preprocess = data preprocessing, label = data labeling, training = model training, deployment = deployment phase |
| stageStatus | Integer | True | Stage status: 1 = queuing, 2 = in progress, 5 = completed, 9 = exception |
| sampleAudioUrl | String | False | Sample audio URL (valid for 7 days) |
| tenant | String | True | Tenant owning the task |
| updatedTime | String | True | Task information update time, RFC 3339 format |
| modelUrl | String | False | Model download URL on success; internal network only, not publicly accessible |
# Case 6: Video Character Face Swap

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| thumbPath | String | True | Thumbnail download URL (valid for 7 days) |
| pkgPath | String | True | Character image download URL (valid for 7 days) |
# Case 7: TTS Qid Voice Model Generation

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| taskId | String | True | Corresponding task ID |
| voice | Object | True | Voice information |
| - qid | String | True | Voice QID |
| - name | String | True | Speaker's name |
| - gender | Integer | True | Speaker's gender: 0 = not known, 1 = male, 2 = female, 9 = not applicable |
| - languages | String | True | List of languages supported by the speaker, returned only upon task completion: zh-CN for Mandarin Chinese, en-US for American English |
| taskType | String | True | Voice training algorithm type |
| taskStatus | Integer | True | Task status: 1 = queuing, 2 = in progress, 3 = cancelled, 5 = completed, 9 = exception |
| msg | String | True | Task status description |
| stage | String | True | Task sub-step: preprocess = data preprocessing, label = data labeling, training = model training, deployment = deployment phase |
| stageStatus | Integer | True | Stage status: 1 = queuing, 2 = in progress, 5 = completed, 9 = exception |
| sampleAudioUrl | String | False | Sample audio URL (valid for 7 days) |
| tenant | String | True | Tenant owning the task |
| updatedTime | String | True | Task information update time, RFC 3339 format |

extendParam (character model parameter information):

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| faceFeatureId | String | True | Face feature ID |

# Case 7 Return Example:
{
    "code": 0,
    "message": "success",
    "data": [
        {
            "id": 206216,
            "materialId": 1990411,
            "materialName": "TTS_yunynsent",
            "productParam": "{\"msg\":\"task is finished\",\"stage\":\"deployment\",\"voice\":{\"qid\":\"VQ1fQv:AEAygt1ixxxxxxdRPNLE11kg1TLXWSzMxNExLTksK\",\"name\":\"TTS6_yunyxxxxxen_consent\",\"gender\":1,\"languages\":[\"en-US\",\"zh-CN\",\"af-ZA\",\"am-ET\",\"ar-EG\",\"ar-SA\",\"az-AZ\",\"bg-BG\",\"bn-BD\",\"bn-IN\",\"bs-BA\",\"ca-ES\",\"cs-CZ\",\"cy-GB\",\"da-DK\",\"de-AT\",\"de-CH\",\"de-DE\",\"el-GR\",\"en-AU\",\"en-CA\",\"en-GB\",\"en-IE\",\"en-IN\",\"es-ES\",\"es-MX\",\"et-EE\",\"eu-ES\",\"fa-IR\",\"fi-FI\",\"fil-PH\",\"fr-BE\",\"fr-CA\",\"fr-CH\",\"fr-FR\",\"ga-IE\",\"gl-ES\",\"he-IL\",\"hi-IN\",\"hr-HR\",\"hu-HU\",\"hy-AM\",\"id-ID\",\"is-IS\",\"it-IT\",\"ja-JP\",\"jv-ID\",\"ka-GE\",\"kk-KZ\",\"km-KH\",\"kn-IN\",\"ko-KR\",\"lo-LA\",\"lt-LT\",\"lv-LV\",\"mk-MK\",\"ml-IN\",\"mn-MN\",\"ms-MY\",\"mt-MT\",\"my-MM\",\"nb-NO\",\"ne-NP\",\"nl-BE\",\"nl-NL\",\"pl-PL\",\"ps-AF\",\"pt-BR\",\"pt-PT\",\"ro-RO\",\"ru-RU\",\"si-LK\",\"sk-SK\",\"sl-SI\",\"so-SO\",\"sq-AL\",\"sr-RS\",\"su-ID\",\"sv-SE\",\"sw-KE\",\"ta-IN\",\"te-IN\",\"th-TH\",\"tr-TR\",\"uk-UA\",\"ur-PK\",\"uz-UZ\",\"vi-VN\",\"zh-HK\",\"zh-TW\",\"zu-ZA\"]},\"taskId\":\"tts6-xxx-xxxx-xxx-xx-789308\",\"tenant\":\"0\",\"taskType\":\"TTS6\",\"taskStatus\":5,\"stageStatus\":5,\"updatedTime\":\"2024-05-29T09:41:51.373802578Z\",\"sampleAudioUrl\":\"\"}",
            "extendParam": null,
            "startTime": "2024-05-29 17:38:31",
            "endTime": "2024-05-29 17:41:51",
            "status": 5,
            "message": "{\"tts resp msg\": \"task is finished\"}",
            "algoType": 41,
            "algoId": "f627-a980-78aba9c20308",
            "algoSubType1": null,
            "algoSubType2": null,
            "algoSubType3": null,
            "isDelete": 0
        }
    ]
}
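
For TTS voice model tasks such as the one above, `productParam` is itself a JSON-encoded string, so it must be decoded before the voice descriptor inside it can be read. A minimal Python sketch; the helper name and the trimmed sample payload are illustrative, not part of the API:

```python
import json

def parse_tts_product_param(product_param: str) -> dict:
    """Decode a TTS task's productParam string and pull out the voice descriptor."""
    info = json.loads(product_param)
    voice = info.get("voice", {})
    return {
        "qid": voice.get("qid"),
        "name": voice.get("name"),
        "languages": voice.get("languages", []),
        "taskStatus": info.get("taskStatus"),
    }

# A trimmed-down payload shaped like the response above:
sample = json.dumps({
    "msg": "task is finished",
    "voice": {"qid": "VQ1...", "name": "TTS6_demo", "languages": ["en-US", "zh-CN"]},
    "taskStatus": 5,
})
parsed = parse_tts_product_param(sample)
print(parsed["name"], parsed["taskStatus"])  # TTS6_demo 5
```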

# Interface Description

Queries all tasks of a given algorithm type under a user's account, identified by account ID, together with each task's current status. The task list supports pagination.

# Request URL

POST /api/2dvh/v1/task/listByAccount

# Request Header

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| algoType | Integer | True | Task type (11: TTS voice model generation, 12: character image model generation, 14: video synthesis, 20: video character face swap, 18: character image model update, 25: voice conversion, 32: image green screen preview, 33: video green screen preview, 41: TTS V3 voice model generation) |
| pageNo | Integer | False | Current page number (default 1) |
| pageSize | Integer | False | Number of items per page (default 10) |
| sortName | String | False | Sort field name |
| sortValue | String | False | Sort order: asc, desc |

# Request Example

{
  "algoType": 12,
  "pageSize": 10,
  "pageNo": 1
}

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Number | True | 0 - success; others - exception |
| message | String | True | Detailed exception information |
| data | Object | False | Data object, usually empty in case of exception |
| data.pagination | Pagination | True | Pagination information (refer to the common data structure description) |
| data.result | Array | True | Task list (see the description below) |

Task List

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| id | Long | True | Task ID |
| algoType | Integer | True | Task type (11: TTS voice model generation, 12: character image model generation, 14: video synthesis, 20: video character face swap, 18: character image model update, 25: voice conversion, 32: image green screen preview, 33: video green screen preview, 41: TTS V3 voice model generation) |
| algoSubType1 | String | True | Character model tasks: model specification (2K/4K); video synthesis tasks: specification of the character model used (2K/4K) |
| algoSubType2 | String | False | Video synthesis: result format (webm/mp4) |
| algoSubType3 | String | False | Video synthesis: result frame rate |
| status | Integer | True | Task status (0: not started, 1: dispatcher queue waiting, 2: algorithm processing, 3: canceled, 5: completed, 9: exception) |
| productParam | String | True | Task result JSON string, including video address videoPath, video duration duration, and thumbnail address thumbPath |
| startTime | String | True | Algorithm start time (yyyy-MM-dd HH:mm:ss) |
| endTime | String | True | Algorithm end time (yyyy-MM-dd HH:mm:ss) |

# Response Example

{
  "code": 0,
  "message": "success",
  "data": {
    "pagination": {
      "pageNo": 1,
      "numberPages": 1,
      "numberRecords": 2,
      "pageSize": 2,
      "startIndex": 0
    },
    "result": [
      {
        "id": 27,
        "algoType": 12,
        "algoSubType1": "4K",
        "algoSubType2": null,
        "algoSubType3": null,
        "productParam": "\"{\\\"duration\\\":1234,\\\"thumbPath\\\":\\\"https://oss-cn-hangzhou.aliyuncs.com/dwg-aigc-paas/materials/a8610d001aaa412ab2e0433fc848b48f/thumb.jpg\\\",\\\"videoPath\\\":\\\"https://oss-cn-hangzhou.aliyuncs.com/dwg-aigc-paas/materials/a8610d001aaa412ab2e0433fc848b48f/output.mp4\\\"}\"",
        "startTime": "2023-02-17 16:53:26",
        "endTime": "2023-02-18 10:03:21",
        "status": 5
      },
      {
        "id": 7,
        "algoType": 12,
        "algoSubType1": "4K",
        "algoSubType2": null,
        "algoSubType3": null,
        "productParam": "{}",
        "startTime": "2023-02-17 16:56:26",
        "endTime": "2023-02-17 17:43:19",
        "status": 9
      }
    ]
  }
}
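
Because the task list is paginated, a client typically walks `pageNo` forward until `numberPages` is reached. A standard-library Python sketch; `BASE_URL`, the generator name, and the `has_more_pages` helper are assumptions for illustration:

```python
import json
import urllib.request

BASE_URL = "http://xxx"  # placeholder host, as in the request examples

def has_more_pages(pagination: dict, current_page: int) -> bool:
    """True while the pagination block reports further pages."""
    return current_page < pagination.get("numberPages", 1)

def list_tasks_by_account(algo_type: int, page_size: int = 10):
    """Yield every task of the given algoType, following pagination."""
    page_no = 1
    while True:
        body = json.dumps({"algoType": algo_type,
                           "pageNo": page_no,
                           "pageSize": page_size}).encode("utf-8")
        req = urllib.request.Request(
            BASE_URL + "/api/2dvh/v1/task/listByAccount",
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            payload = json.load(resp)
        if payload["code"] != 0:
            raise RuntimeError(payload["message"])
        yield from payload["data"]["result"]
        if not has_more_pages(payload["data"]["pagination"], page_no):
            break
        page_no += 1
```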

# Get Account Task Information Details

# Interface Description

Queries all tasks of a given algorithm type under a user's account, together with each task's current status, original input content, and returned results. The task list supports pagination.

# Request URL

POST /api/2dvh/v1/task/listWithQueue

# Request Header

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| userId | Long | False | User ID; defaults to the currently logged-in account ID |
| algoType | Integer | True | Task type (11: TTS voice model generation (old), 12: character image model generation, 14: video synthesis, 20: video character face swap, 18: character image model update, 25: voice conversion, 32: image green screen preview, 33: video green screen preview, 41: TTS Qid voice model generation) |
| status | Integer | True | Task status (0: not started, 1: dispatcher queue waiting, 2: algorithm processing, 3: canceled, 5: completed, 9: exception, -1: all) |
| key | String | False | Task ID / role name query (exact match) |
| pageNo | Integer | False | Current page number (default 1) |
| pageSize | Integer | False | Number of items per page (default 10) |
| sortName | String | False | Sort field name |
| sortValue | String | False | Sort order: asc, desc |

# Request Example

{
  "algoType": 12,
  "pageSize": 10,
  "pageNo": 1
}

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Number | True | 0 - success; others - exception |
| message | String | True | Detailed exception information |
| data | Object | False | Data object, usually empty in case of exception |
| data.pagination | Pagination | True | Pagination information (refer to the common data structure description) |
| data.result | Array | True | Task list (see the description below) |

Task List

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| id | Long | True | Task ID |
| algoType | Integer | True | Task type (11: TTS voice model generation, 12: character image model generation, 14: video synthesis, 20: video character face swap, 18: character image model update, 25: voice conversion, 32: image green screen preview, 33: video green screen preview, 41: TTS V3 voice model generation) |
| materialId | Long | True | Model ID |
| materialName | String | True | Material name |
| queueInfo | String | False | Queue information |
| status | Integer | True | Task status (0: not started, 1: dispatcher queue waiting, 2: algorithm processing, 3: canceled, 5: completed, 9: exception) |
| productParam | String | True | Task result JSON string, including video address videoPath, video duration duration, and thumbnail address thumbPath |
| extendParam | String | False | Extended parameters, including faceFeatureId for character image model generation |
| algoSubType1 | String | True | Character model tasks: model specification (2K/4K); video synthesis tasks: specification of the character model used (2K/4K) |
| algoSubType2 | String | False | Video synthesis: result format (webm/mp4) |
| algoSubType3 | String | False | Video synthesis: result frame rate |
| taskInfo | String | True | Initial task parameters and original files |
| algoId | String | True | Algorithm task ID |
| message | String | False | Error information |
| submitTime | String | True | Algorithm submission time (yyyy-MM-dd HH:mm:ss) |
| startTime | String | False | Algorithm start time (yyyy-MM-dd HH:mm:ss) |
| endTime | String | False | Algorithm end time (yyyy-MM-dd HH:mm:ss) |
| owner | Long | True | Account owning the task |
| ownerPhone | String | True | Account phone number |

# Response Example

{
  "code": 0,
  "message": "success",
  "data": {
    "pagination": {
      "pageNo": 1,
      "numberPages": 1,
      "numberRecords": 2,
      "pageSize": 2,
      "startIndex": 0
    },
    "result": [
      {
        "id": 8833,
        "materialId": 8122,
        "materialName": "Mario_4_talk.mp4_sensetime-segment_type_green screen segmentation",
        "productParam": "{\"pkgPath\": \"https://dwg-aigc-paas-test.oss-cn-hangzhou.aliyuncs.com/download/8/b6ecebc8233b47809dedd6731c052d15_s1/b6ecebc8233b47809dedd6731c052d15_s1_result.zip\", \"thumbPath\": \"https://dwg-aigc-paas-test.oss-cn-hangzhou.aliyuncs.com/download/8/b6ecebc8233b47809dedd6731c052d15_s1/b6ecebc8233b47809dedd6731c052d15_s1_result.png\", \"faceFeaturePath\": \"https://dwg-aigc-paas-test.oss-cn-hangzhou.aliyuncs.com/download/8/b6ecebc8233b47809dedd6731c052d15_s1/b6ecebc8233b47809dedd6731c052d15_s1_face_feature.zip\"}",
        "extendParam": "{\"faceFeatureId\": \"b6ecebc8233b47809dedd6731c052d15_s1\"}",
        "startTime": "2023-06-07 23:31:50",
        "endTime": "2023-06-08 05:19:29",
        "status": 5,
        "message": "{}",
        "algoType": 12,
        "algoId": "b6ecebc8233b47809dedd6731c052d15_s1",
        "algoSubType1": "4K",
        "algoSubType2": null,
        "algoSubType3": null,
        "submitTime": "2023-06-07 17:34:40",
        "ownerPhone": "18311096857",
        "owner": 8,
        "queueInfo": null,
        "taskInfo": "{\"create2DAvatarModel\": {\"videoUrl\": \"https://ailab-storage-eus.oss-us-west-1.aliyuncs.com/31_trim_result/Mario_4_talk.mp4?OSSAccessKeyId=LTAI5tE2Hq2BAqr8EBzxmSrR&Expires=37686060051&Signature=C1L%2FxpHD%2FW155s%2BuhTocyVvsUfo%3D\", \"accountId\": 8, \"assetScale\": 1.0, \"existTaskId\": 0, \"firstCreate\": true, \"materialName\": \"Mario_4_talk.mp4_sensetime-segment_type_green screen segmentation\", \"segmentStyle\": 1}}"
      },
      {
        "id": 9093,
        "materialId": 8258,
        "materialName": "wu0609_sensetime-segment_type_green screen segmentation",
        "productParam": null,
        "extendParam": null,
        "startTime": "2023-06-09 10:56:38",
        "endTime": null,
        "status": 2,
        "message": "{}",
        "algoType": 12,
        "algoId": "5f6006acb891496f93bfeeff601201fe_s1",
        "algoSubType1": "4K",
        "algoSubType2": null,
        "algoSubType3": null,
        "submitTime": "2023-06-09 10:56:36",
        "ownerPhone": "18311096857",
        "owner": 8,
        "queueInfo": null,
        "taskInfo": "{\"create2DAvatarModel\": {\"videoUrl\": \"http://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/wanxing_0606/zhuzong.mp4\", \"accountId\": 8, \"assetScale\": 1.0, \"existTaskId\": 0, \"firstCreate\": true, \"materialName\": \"wu0609_sensetime-segment_type_green screen segmentation\", \"segmentStyle\": 1}}"
      },
      {
        "id": 8528,
        "materialId": 7908,
        "materialName": "Claire_3_talk.mp4_sensetime-segment_type_green screen segmentation",
        "productParam": "{\"pkgPath\": \"https://dwg-aigc-paas-test.oss-cn-hangzhou.aliyuncs.com/download/8/1907b913f78845168529bad59f36a43f_s1/1907b913f78845168529bad59f36a43f_s1_result.zip\", \"thumbPath\": \"https://dwg-aigc-paas-test.oss-cn-hangzhou.aliyuncs.com/download/8/1907b913f78845168529bad59f36a43f_s1/1907b913f78845168529bad59f36a43f_s1_result.png\", \"faceFeaturePath\": \"https://dwg-aigc-paas-test.oss-cn-hangzhou.aliyuncs.com/download/8/1907b913f78845168529bad59f36a43f_s1/1907b913f78845168529bad59f36a43f_s1_face_feature.zip\"}",
        "extendParam": "{\"faceFeatureId\": \"1907b913f78845168529bad59f36a43f_s1\"}",
        "startTime": "2023-06-06 02:32:13",
        "endTime": "2023-06-06 20:17:19",
        "status": 9,
        "message": "{\"errorMsg\": \"Algorithm heart beat is overtime!!!\"}",
        "algoType": 12,
        "algoId": "1907b913f78845168529bad59f36a43f_s1",
        "algoSubType1": "4K",
        "algoSubType2": null,
        "algoSubType3": null,
        "submitTime": "2023-06-05 20:41:43",
        "ownerPhone": "18311096857",
        "owner": 8,
        "queueInfo": null,
        "taskInfo": "{\"create2DAvatarModel\": {\"videoUrl\": \"https://ailab-storage-eus.oss-us-west-1.aliyuncs.com/online_videos/Claire_3_talk.mp4?OSSAccessKeyId=LTAI5tE2Hq2BAqr8EBzxmSrR&Expires=1689391554&Signature=pMSBmAlawZ7h2sxjUO8Dk%2B1dHRg%3D\", \"accountId\": 8, \"assetScale\": 1.0, \"existTaskId\": 0, \"firstCreate\": true, \"materialName\": \"Claire_3_talk.mp4_sensetime-segment_type_green screen segmentation\", \"segmentStyle\": 2}}"
      },
      {
        "id": 9161,
        "materialId": 8317,
        "materialName": "Eddie_3_talk_trim_sensetime_0_绿幕分割",
        "productParam": null,
        "extendParam": null,
        "startTime": null,
        "endTime": null,
        "status": 1,
        "message": "{}",
        "algoType": 18,
        "algoId": "1667070933254279169",
        "algoSubType1": "4K",
        "algoSubType2": null,
        "algoSubType3": null,
        "submitTime": "2023-06-09 15:28:48",
        "ownerPhone": "18311096857",
        "owner": 8,
        "queueInfo": "8/9",
        "taskInfo": "{\"rebuild2DAvatarModelVideo\": {\"assetEnd\": 120.0, \"modelUrl\": \"https://dwg-aigc-paas-test.oss-cn-hangzhou.aliyuncs.com/download/8/ba80636d8a77423083af66174375a130_s1/ba80636d8a77423083af66174375a130_s1_result.zip\", \"videoUrl\": \"http://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/wanxing_0606/Eddie_3_talk_trim.mp4.mp4\", \"accountId\": 8, \"assetScale\": 1.0, \"assetStart\": 60.0, \"existTaskId\": 0, \"firstCreate\": true, \"materialName\": \"Eddie_3_talk_trim_sensetime_0_绿幕分割\", \"segmentStyle\": 1}}"
      }
    ]
  }
}
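
In responses like the one above, `productParam` and `extendParam` arrive as JSON strings (or null), so they need a second decode before fields such as `pkgPath` or `faceFeatureId` can be read. A small Python sketch; `summarize_task` and the trimmed `record` are illustrative, not part of the API:

```python
import json

# Status codes from the task list description above.
STATUS = {0: "not started", 1: "queuing", 2: "processing",
          3: "canceled", 5: "completed", 9: "exception"}

def summarize_task(task: dict) -> dict:
    """Condense one listWithQueue record into the fields most callers need.

    productParam and extendParam are JSON strings (or null), so they are
    decoded defensively before reading pkgPath / faceFeatureId.
    """
    product = json.loads(task["productParam"]) if task.get("productParam") else {}
    extend = json.loads(task["extendParam"]) if task.get("extendParam") else {}
    return {
        "id": task["id"],
        "status": STATUS.get(task["status"], "unknown"),
        "pkgPath": product.get("pkgPath"),
        "faceFeatureId": extend.get("faceFeatureId"),
        "queueInfo": task.get("queueInfo"),
    }

# A trimmed record shaped like the response above:
record = {
    "id": 8833,
    "status": 5,
    "productParam": "{\"pkgPath\": \"https://example.com/result.zip\"}",
    "extendParam": "{\"faceFeatureId\": \"b6ec_s1\"}",
    "queueInfo": None,
}
print(summarize_task(record))
```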

# Cancel Task

# Interface Description

Supports the operation of canceling tasks for users.

# Request Address

GET /api/2dvh/v1/task/cancel

# Request Header

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| id | Long | True | Task ID |

# Request Example

http://xxx/api/2dvh/v1/task/cancel?id=1

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - success; others - error |
| message | String | True | Detailed error information |
| data | Object | False | Value is null |

# Response Example

{
    "code": 0,
    "message": "success",
    "data": null
}

# Delete Task

# Interface Description

Allows users to delete tasks that are not currently in progress. After deletion, the task information is no longer retained.

# Request Address

DELETE /api/2dvh/v1/task/del

# Request Header

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| id | Long | True | Task ID |

# Request Example

http://xxx/api/2dvh/v1/task/del/id

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - success; others - error |
| message | String | True | Detailed error information |
| data | Object | False | Value is null |

# Response Example

{
    "code": 0,
    "message": "success",
    "data": null
}

# Restart Task

# Interface Description

Allows users to restart tasks that ended in an exception state. The task ID remains unchanged after restarting.

# Request Address

GET /api/2dvh/v1/task/restart

# Request Header

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| id | Long | True | Task ID |

# Request Example

http://xxx/api/2dvh/v1/task/restart?id=1

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - success; others - error |
| message | String | True | Detailed error information |
| data | Object | False | Task ID |

# Response Example

{
    "code": 0,
    "message": "success",
    "data": 2
}
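
A failed task can be retried without losing its ID by calling the restart interface with the task ID as a query parameter. A standard-library Python sketch; `BASE_URL` and the helper names are assumptions for illustration:

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://xxx"  # placeholder host, as in the request examples

def build_restart_url(base: str, task_id: int) -> str:
    """Assemble the restart URL with the task ID as a query parameter."""
    return base + "/api/2dvh/v1/task/restart?" + urllib.parse.urlencode({"id": task_id})

def restart_task(task_id: int) -> int:
    """Restart an exception task and return the task ID from the response data."""
    with urllib.request.urlopen(build_restart_url(BASE_URL, task_id)) as resp:
        payload = json.load(resp)
    if payload["code"] != 0:
        raise RuntimeError(payload["message"])
    return payload["data"]
```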

# Query Task Phase Time Consumption Information

# Interface Description

Queries the time consumed by each phase of a task. Currently only video synthesis tasks are supported.

# Request Address

GET /api/2dvh/v1/task/phase/cost

# Request Header

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| id | Long | True | Task ID |

# Request Example

https://xxx/api/2dvh/v1/task/phase/cost?id=1

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - success; others - error |
| message | String | True | Detailed error information |
| data | Array | False | List of phase records (fields below); null on error |
| phase | String | True | Algorithm phase: asset_download, parse_json, st_mobile_change_package, preprocess, main_process, postprocess, result_upload |
| startTime | String | True | Phase start time (yyyy-MM-dd HH:mm:ss) |
| endTime | String | True | Phase completion time (yyyy-MM-dd HH:mm:ss) |
| costTime | Integer | True | Time consumed (milliseconds) |
| callCount | Integer | False | Phase repeat count; null indicates no repetition |

# Response Example

{
  "code": 0,
  "message": "success",
  "data": [
    {
      "phase": "asset_download",
      "startTime": "2023-11-01 16:41:17",
      "endTime": "2023-11-01 16:41:17",
      "costTime": 0
    },
    {
      "id": 291340,
      "phase": "parse_json",
      "startTime": "2023-11-01 16:41:17",
      "endTime": "2023-11-01 16:41:18",
      "costTime": 124
    },
    {
      "id": 291340,
      "phase": "st_mobile_change_package",
      "startTime": "2023-11-01 16:41:18",
      "endTime": "2023-11-01 16:41:20",
      "costTime": 2242
    },
    {
      "id": 291340,
      "phase": "preprocess",
      "startTime": "2023-11-01 16:41:20",
      "endTime": "2023-11-01 16:41:21",
      "costTime": 1238
    },
    {
      "id": 291340,
      "phase": "main_process",
      "startTime": "2023-11-01 16:41:21",
      "endTime": "2023-11-01 16:41:25",
      "costTime": 3692
    },
    {
      "id": 291340,
      "phase": "result_upload",
      "startTime": "2023-11-01 16:41:25",
      "endTime": "2023-11-01 16:41:25",
      "costTime": 189
    },
    {
      "id": 291340,
      "phase": "postprocess",
      "startTime": "2023-11-01 16:41:25",
      "endTime": "2023-11-01 16:41:25",
      "costTime": 457
    }
  ]
}
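
The per-phase records above can be aggregated to find the total wall time and the dominant phase of a synthesis task. A small Python sketch using the `costTime` values from the example response; the helper name is illustrative:

```python
def total_cost_ms(phases: list) -> int:
    """Sum costTime across all phases of one task (milliseconds)."""
    return sum(int(p["costTime"]) for p in phases)

# costTime values taken from the example response above.
phases = [
    {"phase": "asset_download", "costTime": 0},
    {"phase": "parse_json", "costTime": 124},
    {"phase": "st_mobile_change_package", "costTime": 2242},
    {"phase": "preprocess", "costTime": 1238},
    {"phase": "main_process", "costTime": 3692},
    {"phase": "result_upload", "costTime": 189},
    {"phase": "postprocess", "costTime": 457},
]
slowest = max(phases, key=lambda p: int(p["costTime"]))
print(total_cost_ms(phases), slowest["phase"])  # 7942 main_process
```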

# Task Completion Callback Parameters

When using the API, the system returns task status information to the callback address you have registered. If you need task callbacks, contact the administrator to configure the callback address when your account is created. If an AuthKey is configured for the account, the callback also carries authentication information (a timestamp and signature).

The callback endpoint you provide must accept HTTP POST requests with Content-Type: application/json. The request body contains the following fields:

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| taskId | Integer | True | Task ID |
| materialId | Integer | True | Material ID |
| materialName | String | True | Material name |
| algoType | Integer | True | Task type (same values as listed for the query interfaces above) |
| algoSubType1 | String | True | Character model tasks: model specification (2K/4K); video synthesis tasks: specification of the character model used (2K/4K) |
| algoSubType2 | String | False | Video synthesis: result format (webm/mp4) |
| algoSubType3 | String | False | Video synthesis: result frame rate |
| status | Integer | True | Status (3: canceled, 5: completed, 9: error) |
| taskResult | String | False | Error message |
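
A callback receiver only needs to accept the POST body and branch on `status` (3/5/9). A minimal Python sketch using the standard-library HTTP server; the handler names are illustrative, and signature verification for AuthKey accounts is omitted:

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def handle_callback(payload: dict) -> str:
    """Route a task-completion callback by its status code.

    Status meanings follow the table above: 3 canceled, 5 completed, 9 error.
    """
    status = payload.get("status")
    if status == 5:
        return f"task {payload['taskId']} completed"
    if status == 9:
        return f"task {payload['taskId']} failed: {payload.get('taskResult')}"
    if status == 3:
        return f"task {payload['taskId']} canceled"
    return f"task {payload.get('taskId')} unknown status {status}"

class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        payload = json.loads(self.rfile.read(length))
        print(handle_callback(payload))
        self.send_response(200)
        self.end_headers()

# To listen: HTTPServer(("0.0.0.0", 8080), CallbackHandler).serve_forever()
```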

This encompasses all the algorithmic capabilities the platform can provide.

Last Updated: 7/31/2024, 9:10:45 PM