# Interactive Live Broadcasting

The platform's digital human products can push digital human streaming media content to a live broadcasting platform specified by the user, and support real-time voice interaction between users and 2D digital humans. The interfaces of the historical version of interactive live broadcasting are retained for now but are about to be taken offline, so users are advised to integrate the new version of the live broadcasting service by following the online documentation. For the historical documentation, refer to: Documentation Link

# Feature Introduction

The virtual digital human live broadcasting service integrates graphics, image, and voice algorithm capabilities and exposes them as API services so that customers can quickly create and broadcast live content. The main target industries are government, finance, education and training, and media. Interactive digital human live broadcasting currently supports text and voice interaction with 2D digital humans. While the digital human is speaking, it can be interrupted with specific phrases, which gives a more natural duplex voice dialogue. The aim is to offer users personalized services and to improve user experience, efficiency, and competitiveness.

# API Description

All API services of the platform are called through the service entry point aigc.softsugar.com, with the token information added to the request header. For WebSocket (WS) interfaces, the token information is instead appended to the URL.
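The two ways of attaching the token can be sketched as follows. This is a minimal illustration, not the platform's client: the HTTP header name `Authorization` is an assumption (the text only says the token goes in the request header), while the `Authorization` query parameter for WebSocket URLs follows the Streaming Voice Takeover request URL. The helper names `http_headers` and `ws_url` are hypothetical.

```python
from urllib.parse import quote, urlencode

BASE_HOST = "aigc.softsugar.com"  # service entry point named in this document


def http_headers(token: str) -> dict:
    """Headers for HTTP calls. The header name 'Authorization' is an assumption."""
    return {"Content-Type": "application/json", "Authorization": token}


def ws_url(path: str, token: str, **query: str) -> str:
    """WebSocket URL with the token appended as a query parameter."""
    params = dict(query, Authorization=token)
    # quote_via=quote percent-encodes spaces as %20 instead of '+'.
    return f"wss://{BASE_HOST}{path}?{urlencode(params, quote_via=quote)}"
```

Percent-encoding the token is a choice made in this sketch; whether the server also accepts an unencoded value is not stated here.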

# Live Broadcasting Sequence Diagram

[Figure: live process sequence diagram]

# Create a Live Broadcasting Room

# Interface Description

Creates a digital human live broadcasting instance. The specified parameters are passed in for system initialization, and the session ID and other information are returned for subsequent use. If initialization takes too long, status 32 is returned, and the completion status is delivered later by callback or can be obtained by client query. Status 35 means that video stream initialization is complete and live broadcasting can be started.
The web sample plays the stream through an integrated third-party SDK service; see the Third-party SDK Usage Instructions.

Note: Timing begins as soon as the live broadcasting room is created.

# Request URL

POST /api/2dvh/v2/material/live/channel/create

# Request Header

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| param | String | True | Parameters for generating the live video (a JSON-escaped string) |
| urtcUseExternalApp | Boolean | False | Whether to use the user's own Agora RTC appId. If false, the platform's unified RTC appId is used (default: false) |
| urtcToken | String | False | User RTC authentication token (required when using the user's RTC app) |
| urtcUid | String | False | User RTC authentication UID (required when using the user's RTC app; must be purely numeric and fit in an unsigned 32-bit integer) |
| urtcAppId | String | False | User RTC APP ID (required when using the user's RTC app) |
| urtcChannelId | String | False | User RTC Channel ID (required when using the user's RTC app; length must not exceed 64 characters) |
| userData | String | False | User information (returned as is in the callback) |

# Request Example

{
  "param": "{\"version\":\"0.0.1\",\"recycle\":0,\"stream_mode\":true,\"sceneList\":[{\"digital_role\":{\"id\":1,\"face_feature_id\":\"0325_nina_s3_beauty\",\"name\":\"小李\",\"url\":\"https://dwg-aigc-paas-test.oss-cn-hangzhou.aliyuncs.com/materials/77/0325_nina_s3_beauty.zip\",\"position\":{\"x\":10,\"y\":60},\"scale\":1},\"tts_config\":{\"id\":\"nina\",\"name\":\"nina\",\"vendor_id\":3,\"language\":\"zh-CN\",\"pitch_offset\":0.0,\"speed_ratio\":1,\"volume\":400},\"tts_query\":{\"content\":\"商汤科技是一家行业领先的人工智能软件公司,以坚持原创,让AI引领人类进步为使命。\",\"ssml\":false},\"audios\":[{\"url\":\"http://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/test/softsugar.mp3\"}],\"backgrounds\":[{\"type\":0,\"name\":\"背景\",\"url\":\"https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/test/background.png\",\"rect\":[0,0,1080,1920],\"cycle\":true,\"start\":0,\"duration\":-1}],\"foregrounds\":[{\"type\":0,\"name\":\"前景1\",\"url\":\"https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/test/frontgroud.png\",\"rect\":[0,0,310,88],\"start\":0,\"duration\":100}],\"effects\":{\"version\":\"1.0\",\"beautify\":{}}}]}"
}
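Because `param` is a JSON-escaped string rather than a nested object, the script has to be serialized twice. The sketch below shows one way to do that in Python. The script is abbreviated, its values are copied from the request example above, and the role package URL is a placeholder, not a real resource.

```python
import json

# Abbreviated live script; values are taken from the request example above,
# except the role package URL and the TTS text, which are placeholders.
script = {
    "version": "0.0.1",
    "recycle": 0,
    "stream_mode": True,
    "sceneList": [
        {
            "digital_role": {
                "face_feature_id": "0325_nina_s3_beauty",
                "url": "https://example.com/role.zip",  # placeholder
            },
            "tts_config": {
                "id": "nina", "name": "nina", "vendor_id": 3, "language": "zh-CN",
                "pitch_offset": 0.0, "speed_ratio": 1, "volume": 400,
            },
            "tts_query": {"content": "Welcome to today's live broadcast.", "ssml": False},
        }
    ],
}

# First dumps: the script becomes a string. Second dumps: that string is
# escaped again when the outer request body is serialized.
body = {"param": json.dumps(script, ensure_ascii=False)}
request_body = json.dumps(body, ensure_ascii=False)
```

Building the script as a dictionary and letting `json.dumps` do the escaping avoids hand-written backslashes, which are easy to get wrong in a string of this length.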

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Exception |
| message | String | True | Detailed information about the exception |
| data | Object | False | Information about the created live session |
| - sessionId | String | True | Unique identifier for the session |
| - sessionToken | String | True | Token information used for microphone takeover |
| - status | Integer | True | Task status: 32: Video stream creation in progress, 35: Video stream created, 9: Exception, 97: System channel shortage, 98: Customer channel shortage |
| - rtcAppId | String | False | RTC APP ID (not available when status is 32) |
| - rtcChannelId | String | False | RTC Channel ID (not available when status is 32); length does not exceed 64 characters |
| - rtcUid | String | False | RTC UID (not available when status is 32); purely numeric, fits in an unsigned 32-bit integer |
| - rtcToken | String | False | RTC Token (not available when status is 32) |

# Response Example

{
  "code": 0,
  "message": "success",
  "data": {
    "sessionId": "sessionId",
    "sessionToken": "sessionToken",
    "status": 35,
    "rtcAppId": "rtcAppId",
    "rtcChannelId": "rtcChannelId",
    "rtcUid": "rtcUid",
    "rtcToken": "rtcToken"
  }
}
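When creation returns status 32, the client can either wait for the callback or query the status until it becomes 35. A minimal polling sketch follows. The function `wait_until_created`, its polling interval, and its retry limit are all illustrative assumptions; `get_status` stands for whatever call the client uses to read the current status, for example the Query Session Status interface.

```python
import time
from typing import Callable

STREAM_CREATING = 32  # video stream creation in progress
STREAM_CREATED = 35   # video stream created, live broadcasting can be started


def wait_until_created(
    get_status: Callable[[], int],
    interval_s: float = 2.0,   # assumed polling interval
    max_polls: int = 30,       # assumed retry limit
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the status leaves 32. Return True only if it reaches 35."""
    for _ in range(max_polls):
        status = get_status()
        if status == STREAM_CREATED:
            return True
        if status != STREAM_CREATING:
            # Any other status (9, 97, 98, ...) is treated as a failure here.
            return False
        sleep(interval_s)
    return False
```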

# Live Stream Status Monitoring

# Interface Description

Keeps the live stream alive. It is recommended to call this interface every 60 seconds. If no heartbeat is received for longer than the threshold time, the platform closes the live stream.
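One way to keep the heartbeat going is a small background thread. The sketch below is an assumption about client structure, not part of the API: `start_heartbeat` is a hypothetical helper, and `send` stands for a function that posts the session ID to the heartbeat URL.

```python
import threading
from typing import Callable


def start_heartbeat(send: Callable[[], None], interval_s: float = 60.0) -> threading.Event:
    """Call send() every interval_s seconds until the returned event is set."""
    stop = threading.Event()

    def loop() -> None:
        # Event.wait returns False on timeout and True once stop is set,
        # so the loop ends promptly when the caller stops the heartbeat.
        while not stop.wait(interval_s):
            send()

    threading.Thread(target=loop, daemon=True).start()
    return stop
```

The caller keeps the returned event and calls `stop.set()` when the live room is closed. Note that the first heartbeat is sent one interval after the thread starts, not immediately.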

# Request URL

POST /api/2dvh/v1/material/live/channel/heartbeat

# Request Header

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| sessionId | String | True | Unique session ID |

# Request Example

{
  "sessionId": "sessionId"
}

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Error |
| message | String | True | Detailed error message |
| data | Object | False | Data object, usually null in case of error |

# Response Example

{
  "code": 0,
  "message": "success"
}

# Start Live Stream

# Interface Description

Starts a digital human live stream room instance. Passing in the specified parameters starts the digital human stream push, which can then be pulled through a third-party RTC SDK product.

# Request URL

POST /api/2dvh/v1/material/live/channel/start

# Request Header

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| sessionId | String | True | Unique session ID |

# Request Example

{
  "sessionId": "sessionId"
}

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Error |
| message | String | True | Detailed error message |
| data | Boolean | False | Whether the live stream was successfully started |

# Response Example

{
  "code": 0,
  "message": "success",
  "data": true
}

# Live Takeover (Soon to be Discontinued, Please Use "Interruptible Live Takeover")

# Interface Description

Sends broadcast text or an audio file to the server. The digital human stops the predefined broadcast content and, based on the text or audio, generates TTS in real time, drives facial expressions and actions, and renders the video stream to the user.

# Request URL

POST /api/2dvh/v1/material/live/channel/command

# Request Headers

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| sessionId | String | True | The unique identifier of the session |
| param | String | True | Takeover content (a JSON-escaped string) |

# Request Example

{
  "sessionId": "sessionId",
  "param": "{\"tts_query\": {\"content\": \"An egg rolled on the ground. It rolled past the woods, through the garden, down a slope, and finally into a duck nest. The duck mother did not notice anything wrong and continued to incubate. One day, the egg cracked. The first egg hatched a little duck with blue spots, named Crayon. The second egg hatched a little duck with stripes, named Zebra. The third egg hatched a yellow little duck, named Moonlight. The fourth egg hatched a strange little duck with a blue-green body, constantly making 'gurgling' noises, so it was named Gurgling.\",\"ssml\": false},\"audio\": \"/data/common/song.mp3\"}"
}

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Exception |
| message | String | True | Detailed information about the exception |
| data | Boolean | False | Whether the takeover was successful |

# Response Example

{   
  "code": 0,
  "message": "success",
  "data": true
}

# Interruptible Live Takeover

# Interface Description

Sends broadcast text or an audio file to the server. The digital human stops the predefined broadcast content and, based on the text or audio, generates TTS in real time, drives facial expressions and actions, and renders the video stream to the user. The returned commandTraceId can be used to cancel the takeover.

# Request URL

POST /api/2dvh/v2/material/live/channel/command

# Request Headers

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| sessionId | String | True | The unique identifier of the session |
| param | String | True | Takeover content (a JSON-escaped string) |

# Request Example

{
  "sessionId": "sessionId",
  "param": "{\"tts_query\": {\"content\": \"An egg rolled on the ground. It rolled past the woods, through the garden, down a slope, and finally into a duck nest. The duck mother did not notice anything wrong and continued to incubate. One day, the egg cracked. The first egg hatched a little duck with blue spots, named Crayon. The second egg hatched a little duck with stripes, named Zebra. The third egg hatched a yellow little duck, named Moonlight. The fourth egg hatched a strange little duck with a blue-green body, constantly making 'gurgling' noises, so it was named Gurgling.\",\"ssml\": false},\"audio\": \"/data/common/song.mp3\"}"
}

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Exception |
| message | String | True | Detailed information about the exception |
| data | Object | True | |
| - isSuccess | Boolean | True | Whether the takeover was successful |
| - commandTraceId | String | False | Task trace ID, used to cancel the takeover |

# Response Example

{   
  "code": 0,
  "message": "success",
  "data": {
    "isSuccess": true,
    "commandTraceId": "commandTraceId"
  }
}

# Interruptible Live Takeover with Queueing

# Interface Description

A live takeover that supports queueing. If a takeover task is already in progress, the new takeover is added to the queue and played in order. The takeover currently in progress is at position 1.
The insertion position ranges from 1 to N, and the content is broadcast at the inserted position number + 1. For example, with insertion position 1 the content becomes the second item in the queue and is played after the current task finishes; the tasks queued after it shift back automatically.
A takeover joining the queue has two statuses (33: takeover queue initializing; 34: takeover joined the queue). Status 33 means that initialization (downloading files and so on) has exceeded the threshold (currently 10 s); later changes in queue status are then notified through an asynchronous callback.
Takeovers that contain large files, such as large image or audio files, take longer to download. The system currently completes all downloads before the takeover enters the queue and is reported as queued.
To keep processing reliable, it is recommended that the takeover queue length not exceed 15.
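The position rule can be made concrete with a small local model of the queue. This is only an illustration of the rule as described above; `insert_takeover` is a hypothetical helper, and treating a position below 1 like a missing one is an assumption, since that case is not specified.

```python
from typing import List, Optional


def insert_takeover(queue: List[str], trace_id: str, priority: Optional[int] = None) -> List[str]:
    """Return a new queue with trace_id inserted.

    queue[0] is the takeover currently playing (position 1).
    """
    new_queue = list(queue)
    if priority is None or priority < 1 or priority > len(new_queue):
        # Absent or beyond the current queue length: add to the end.
        new_queue.append(trace_id)
    else:
        # Position p is broadcast as item p + 1, which is list index p.
        new_queue.insert(priority, trace_id)
    return new_queue
```

With a queue of `["current", "a", "b"]`, inserting at position 1 gives `["current", "new", "a", "b"]`: the new item plays right after the current task, and `a` and `b` shift back.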

# Request URL

POST /api/2dvh/v1/material/live/channel/command/queue

# Request Headers

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| sessionId | String | True | The unique identifier of the session |
| priority | Integer | False | Desired position number in the queue. If absent or beyond the current queue length, the takeover is added to the end of the queue |
| param | String | True | Takeover content (a JSON-escaped string) |

# Request Example

{
  "sessionId": "sessionId",
  "priority": 1,
  "param": "{\"tts_query\": {\"content\": \"An egg rolled on the ground. It rolled past the woods, through the garden, down a slope, and finally into a duck nest. The duck mother did not notice anything wrong and continued to incubate. One day, the egg cracked. The first egg hatched a little duck with blue spots, named Crayon. The second egg hatched a little duck with stripes, named Zebra. The third egg hatched a yellow little duck, named Moonlight. The fourth egg hatched a strange little duck with a blue-green body, constantly making 'gurgling' noises, so it was named Gurgling.\",\"ssml\": false},\"audio\": \"/data/common/song.mp3\"}"
}

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Exception |
| message | String | True | Detailed information about the exception |
| data | Object | True | |
| - commandStatus | Integer | True | Takeover queue status (33: takeover queue initializing; 34: takeover joined the queue) |
| - commandTraceId | String | True | Task trace ID, used to cancel the takeover |

# Response Example

{   
  "code": 0,
  "message": "success",
  "data": {
    "commandStatus": 34,
    "commandTraceId": "commandTraceId"
  }
}

# Live Broadcast Takeover Reordering

# API Description

Adjusts the order of a takeover task in the takeover queue.

# Request URL

POST /api/2dvh/v1/material/live/channel/command/queue/reorder

# Request Headers

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| sessionId | String | True | Unique identifier for the session |
| commandTraceId | String | True | Takeover task trace ID (returned when the takeover was submitted) |
| priority | Integer | True | Desired position number in the queue |

# Request Example

{
  "sessionId": "sessionId",
  "commandTraceId": "commandTraceId",
  "priority": 1
}

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Error |
| message | String | True | Detailed error information |
| data | Object | True | |
| - isSuccess | Boolean | True | Whether the reordering was successful |

# Response Example

{   
  "code": 0,
  "message": "success",
  "data": {
    "isSuccess": true
  }
}

# Live Broadcast Takeover Cancellation

# API Description

Cancels a text or audio takeover request that has already been sent. To cancel several at once, separate the trace IDs with a semicolon (";"). Passing "-1" cancels everything, both the takeover in progress and those in the queue. If a text/audio takeover is in progress, all takeovers must be cancelled before a microphone takeover is started.

# Request URL

POST /api/2dvh/v1/material/live/channel/command/cancel

# Request Headers

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| sessionId | String | True | Unique identifier for the session |
| commandTraceId | String | True | Takeover task trace ID (returned when the takeover was submitted). To cancel several, separate them with ";". "-1" cancels all |

# Request Example

{
  "sessionId": "sessionId",
  "commandTraceId": "commandTraceId"
}
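The three forms of `commandTraceId` (a single ID, several IDs joined by ";", and "-1" for everything) can be produced by one helper. The sketch below is illustrative; `cancel_body` is a hypothetical name. It deliberately rejects an empty list so that cancelling everything only happens when the caller asks for it explicitly.

```python
import json
from typing import Iterable, Optional


def cancel_body(session_id: str, trace_ids: Optional[Iterable[str]] = None) -> str:
    """Build the cancellation request body.

    trace_ids=None cancels everything ("-1"); otherwise the IDs are joined with ';'.
    """
    if trace_ids is None:
        value = "-1"
    else:
        ids = list(trace_ids)
        if not ids:
            raise ValueError("pass None to cancel everything, or at least one trace ID")
        value = ";".join(ids)
    return json.dumps({"sessionId": session_id, "commandTraceId": value})
```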

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Error |
| message | String | True | Detailed error information |
| data | Object | True | |
| - isSuccess | Boolean | True | Whether the cancellation was successful |
| - commandFinishTime | String | False | Completion time (yyyy-MM-dd HH:mm:ss) |
| - message | String | False | Reason for failure |

# Response Example

{   
  "code": 0,
  "message": "success",
  "data": {
    "isSuccess": true
  }
}

# Live Broadcast Takeover Queue Inquiry

# API Description

Retrieves the whole takeover queue as a list of trace IDs sorted in broadcast order. The content currently playing, or about to play, comes first.

# Request URL

POST /api/2dvh/v1/material/live/channel/command/queue/list

# Request Headers

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| sessionId | String | True | Unique identifier for the session |

# Request Example

{
  "sessionId": "sessionId"
}

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Error |
| message | String | True | Detailed error information |
| data | Object | True | |
| - commandTraceIds | Array | False | Array of takeover task trace IDs |

# Response Example

{   
  "code": 0,
  "message": "success",
  "data": {
    "commandTraceIds": ["commandTraceId1","commandTraceId2","commandTraceId3","commandTraceId4"]
  }
}

# Live Chat Handoff

# Interface Description

This API facilitates the handoff of live chat scenarios for interactive digital humans. It does not support queuing, and handoff cannot occur if there's already an ongoing chat.

# Request URL

POST /api/2dvh/v2/material/live/channel/command/quick

# Request Header

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| sessionId | String | True | Unique identifier for the session |
| param | String | True | Handoff content (a JSON-escaped string) |

# Request Example

{
  "sessionId": "sessionId",
  "param": "{\"quick_response\": true,   \"order\": -1 }"
}

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Error |
| message | String | True | Detailed error message |
| data | Object | True | |
| - isSuccess | Boolean | True | Whether the handoff was successful |
| - commandTraceId | String | False | Trace ID for the handoff task |

# Response Example

{   
  "code": 0,
  "message": "success",
  "data": {
    "isSuccess": true,
    "commandTraceId": "commandTraceId"
  }
}

# Streaming Voice Takeover

# API Description

Note: This is a WebSocket API. Requests carry JSON data encoded as UTF-8.
Sends microphone audio. The digital human generates images driven by the real-time voice and renders the video stream to users. While the microphone is open, it picks up sound even when the user is not speaking.

# Request URL

WebSocket /api/live-takeover-cc/v1/material/live/channel/command/microphone?clientType={clientType}&sessionId={sessionId}&Authorization={Token}

# Request Headers

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| clientType | String | True | Connection client type: BROWSER: production side |
| sessionId | String | True | Unique session id returned after establishing connection |
| Authorization | String | True | Token; use the sessionToken returned when the live room was created |

# Request Example (Upstream Message)

# Connection Established

wss://host/api/live-takeover-cc/v1/material/live/channel/command/microphone?clientType=BROWSER&sessionId=123&Authorization=Bearer MTZiMjQzMzg4ZTYyMTlkNTBjZWU0ODAxODg5N2MyY2NmMmQzNzNhOC04YzNiLTRlMjEtYmY3Mi0yNzA2YzFkMjg4ODA

# Ready to Start

{
  "code": 0,
  "message": "begin"
}

# Voice Data

Voice data is sent as binary. The streaming audio must be PCM with a 16000 Hz sample rate, mono channel, and 16-bit sample depth.

# Sending Finished

{
  "code": 99,
  "message": "eof"
}
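The upstream message order (begin, binary voice data, eof) can be sketched as a generator that yields each frame in turn. This reads the audio requirement as 16000 Hz, 16-bit, mono PCM; the 100 ms chunk size and the name `upstream_frames` are assumptions made for illustration. A real client would presumably also wait for the server's "ready to receive" message before sending audio, but that ordering is not spelled out here, so it is left out.

```python
import json
from typing import Iterator, Union

SAMPLE_RATE_HZ = 16000
BYTES_PER_SAMPLE = 2  # 16-bit samples
CHANNELS = 1          # mono


def upstream_frames(pcm: bytes, chunk_ms: int = 100) -> Iterator[Union[str, bytes]]:
    """Yield the frames to send: a text 'begin', binary PCM chunks, a text 'eof'."""
    chunk_bytes = SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS * chunk_ms // 1000
    yield json.dumps({"code": 0, "message": "begin"})
    for offset in range(0, len(pcm), chunk_bytes):
        yield pcm[offset:offset + chunk_bytes]
    yield json.dumps({"code": 99, "message": "eof"})
```

Text frames are yielded as `str` and audio as `bytes`, so a WebSocket client can pick the text or binary send call based on the type of each item.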

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | Message type code |
| message | String | True | Detailed message |

# Response Example (Downstream Message)

# Authentication Failed

{
  "code": 1,
  "message": "Authentication failed"
}

# Server Ready to Receive

{
  "code": 2,
  "message": "Server ready to receive"
}

# Notification of Completion/Cancellation of a Takeover Initiated via HTTP (Text/Audio Takeover)

{
  "code": 18,
  "message": "Takeover/cancellation completed",
  "data": {
    "sessionId": "xxxxx",
    "commandTraceId": "yyyyyy",
    "timestamp": 1234567890
  }
}

# Unmatched or Algorithm Connection Interruption

{
  "code": 4,
  "message": "Stop sending voice data"
}

# Failure for Unknown Reasons

{
  "code": 5,
  "message": "Failed to send voice data"
}

# No Corresponding Live Information

{
  "code": 6,
  "message": "Server not ready to receive"
}

# Takeover Conflict: Rejected Because Another Form of Server Takeover Is in Progress

{
  "code": 7,
  "message": "Algorithm-side takeover conflict rejected"
}

# Notification That a Takeover Initiated via HTTP (Text/Audio Takeover) Has Started Playing

{
  "code": 19,
  "message": "Start playing takeover",
  "data": {
    "sessionId": "xxxxx",
    "commandTraceId": "yyyyyy",
    "timestamp": 1234567890
  }
}

# Notification That a Takeover Initiated via HTTP (Text/Audio Takeover) Has Been Queued

{
  "code": 20,
  "message": "Takeover queued successfully",
  "data": {
    "sessionId": "xxxxx",
    "commandTraceId": "yyyyyy",
    "timestamp": 1234567890
  }
}
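A client usually needs to branch on the downstream `code`. The sketch below collects the codes listed above into a lookup table and extracts the trace ID when one is present. Grouping codes 1, 4, 5, 6, and 7 as "stop sending" is this sketch's interpretation, and `handle_downstream` is a hypothetical helper.

```python
import json
from typing import Optional, Tuple

DOWNSTREAM_CODES = {
    1: "authentication failed",
    2: "server ready to receive",
    4: "stop sending voice data",
    5: "failed to send voice data",
    6: "server not ready to receive",
    7: "takeover conflict rejected",
    18: "takeover/cancellation completed",
    19: "takeover started playing",
    20: "takeover queued",
}

# Assumption: after any of these codes the client should stop streaming audio.
STOP_SENDING = {1, 4, 5, 6, 7}


def handle_downstream(raw: str) -> Tuple[int, str, bool, Optional[str]]:
    """Parse a downstream message into (code, label, should_stop, commandTraceId)."""
    msg = json.loads(raw)
    code = int(msg["code"])  # int() also accepts a numeric string
    label = DOWNSTREAM_CODES.get(code, "unknown")
    trace_id = (msg.get("data") or {}).get("commandTraceId")
    return code, label, code in STOP_SENDING, trace_id
```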

# Live NLP Q&A

# Interface Description

Before a live broadcast, multiple Q&A databases and Q&A scripts can be preset through the platform web interface. For each live broadcast, the ID of the Q&A database to use can be set (one live broadcast can be matched to only one Q&A database). During the broadcast, the matched Q&A database is searched using the questions asked by the audience. If an answer is hit, its content is returned, and the user can choose to play it by taking over the live broadcast.

# Request URL

POST /api/2dvh/v1/nlp/model/reply

# Request Headers

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| appId | String | True | Q&A database ID. Use the string-format ID of the Q&A database, not the auto-increment ID shown in the interface |
| device | String | True | Access device information, chosen according to the initiating terminal. Values: ios / android / windows / macos |
| language | String | True | Language. Only Chinese is currently supported; fixed value "zh" |
| query | String | True | Question content for the query |
| session | String | True | User-defined field; must consist only of digits, letters, and "-". Questions sent with the same session in a multi-round conversation are treated as continuous, so the best answer can be matched using context information |
| trace | String | True | User-defined field; must consist only of digits, letters, and "-". Used to track query information |

# Request Example

{
  "appId": "app_id",
  "device": "macos",
  "language": "zh",
  "query": "Hello",
  "session": "456tf6c2-846c-4b8b-8ga9-e866fgr678a2",
  "trace": "451ee6c2-838c-4b8b-8ba9-e86c2d1692c7"
}
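A request body that respects the field constraints above might be assembled like this. The helper `nlp_request` is hypothetical, and generating a fresh UUID for each `trace` is one convenient way to satisfy the "digits, letters, and -" rule, not a requirement of the API. The caller reuses the same `session` value across turns of one conversation.

```python
import re
import uuid

_ID_PATTERN = re.compile(r"^[0-9A-Za-z-]+$")  # digits, letters, and "-"
DEVICES = {"ios", "android", "windows", "macos"}


def nlp_request(app_id: str, query: str, session: str, device: str = "macos") -> dict:
    """Build a Q&A request; a new trace ID is generated for every call."""
    if device not in DEVICES:
        raise ValueError(f"unsupported device: {device}")
    if not _ID_PATTERN.match(session):
        raise ValueError("session may contain only digits, letters, and '-'")
    return {
        "appId": app_id,
        "device": device,
        "language": "zh",  # only Chinese is currently supported
        "query": query,
        "session": session,
        "trace": str(uuid.uuid4()),
    }
```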

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| answer | String | True | Answer to the question |
| device_id | String | True | Access device information |
| language | String | True | Language; only "zh" is currently supported |
| query | String | True | Question content for the query |
| session | String | True | Query session ID |
| trace | String | True | Query trace ID |
| error_message | String | True | Error message |

# Response Example

{
  "answer": "The bathroom is on your right",
  "device_id": "DeviceID",
  "error_message": "",
  "language": "LanguageCode",
  "query": "I want to go to the toilet",
  "session": "SessionID",
  "trace": "TraceID"
}

# Close Live Broadcast

# Interface Description

Closes a digital human live broadcast room instance that is in progress: the digital human stops streaming, the live broadcast is closed, and the related resources are released.

# Request URL

POST /api/2dvh/v1/material/live/channel/close

# Request Headers

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| sessionId | String | True | Unique identifier of the session |

# Request Example

{
   "sessionId": "sessionId"
}

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Exception |
| message | String | True | Detailed information about the exception |
| data | Boolean | False | Whether the live broadcast was successfully closed |

# Response Example

{
    "code": 0,
    "message": "success",
    "data": true
}

# Query Session Status

# Interface Description

Queries the running status of the live broadcast room instance identified by sessionId and returns its running status information.

# Request URL

GET /api/2dvh/v1/material/live/channel/stat?sessionId={sessionId}

# Request Headers

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| sessionId | String | True | Unique identifier of the session |

# Request Example

http://{domainname}/api/2dvh/v1/material/live/channel/stat?sessionId=8bfeb2bd58474547a87520643cc2d8df

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Exception |
| message | String | True | Detailed information about the exception |
| data | Object | False | Task information |
| - status | Integer | True | Task status: 2: Live, 32: Video stream being created, 35: Video stream created, 5: Live manually closed, 50: Other closures (for example, no heartbeat for a long time or a network link interruption), 9: Exception, 99: Initialization failed |
| - sessionToken | String | False | Token information for microphone takeover |
| - rtcAppId | String | False | RTC APP ID (available when status is 2 or 35) |
| - rtcChannelId | String | False | RTC Channel ID (available when status is 2 or 35) |
| - rtcUid | String | False | RTC UID (available when status is 2 or 35) |
| - rtcToken | String | False | RTC Token (available when status is 2 or 35) |

# Response Example

{
    "code": 0,
    "message": "success",
    "data": {
          "sessionToken": "sessionToken",
          "status": 35,
          "rtcAppId": "rtcAppId",
          "rtcChannelId": "rtcChannelId",
          "rtcUid": "rtcUid",
          "rtcToken": "rtcToken"
        }
}
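Client code is easier to read when the numeric statuses have names. The enum below simply restates the status values listed for this interface; the member names and the helper `rtc_fields_available` are illustrative choices, not identifiers defined by the platform.

```python
from enum import IntEnum


class LiveStatus(IntEnum):
    LIVE = 2
    MANUALLY_CLOSED = 5
    EXCEPTION = 9
    STREAM_CREATING = 32
    STREAM_CREATED = 35
    OTHER_CLOSURE = 50  # e.g. heartbeat lost or network link interrupted
    INIT_FAILED = 99


def rtc_fields_available(status: int) -> bool:
    """RTC fields are returned only when the status is 2 or 35."""
    return status in (LiveStatus.LIVE, LiveStatus.STREAM_CREATED)
```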

# Query Ongoing Session List

# Interface Description

Queries all ongoing live broadcast room instances belonging to the current user. Only instances in progress are returned; closed instances are not.

# Request URL

GET /api/2dvh/v1/material/live/channel/running/list

# Request Headers

None

# Request Parameters

None

# Request Example

None

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Exception |
| message | String | True | Detailed information about the exception |
| data | Object Array | False | Task information, one element per ongoing instance |
| - sessionId | String | True | Unique identifier of the session |
| - status | Integer | True | Task status: 2: Live, 32: Video stream being created, 35: Video stream created |

# Response Example

{
	"code": 0,
	"message": "success",
	"data": [{
			"sessionId": "sessionId",
			"status": 2
		},
		{
			"sessionId": "sessionId",
			"status": 32
		}
	]
}

# Query History Session List

# Interface Description

Queries all historical live broadcast room instances belonging to the current user. Only completed instances are returned; instances in progress are not.

# Request URL

POST /api/2dvh/v1/material/live/channel/history/list

# Request Headers

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| sessionId | String | False | Unique identifier of the live session |
| status | Integer | False | Task status: 5: Completed, 50: Other closures (for example, no heartbeat for a long time or a network link interruption), 9: Exception, 99: Initialization failed, 97: System channel shortage, 98: Customer channel shortage |
| pageNo | Integer | False | Current page number (default 1) |
| pageSize | Integer | False | Number of items per page (default 10) |
| sortName | String | False | Sort field name |
| sortValue | String | False | Sort order: asc, desc |

# Request Example

{
  "pageSize": 10,
  "pageNo": 1
}

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Exception |
| message | String | True | Detailed information about the exception |
| data | Object | False | Data object, usually empty in case of exceptions |
| - pagination | Object | True | Pagination information (refer to the common description) |
| - result | Object Array | True | Task list (refer to the description below) |

Live Task List

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| sessionId | String | True | Unique identifier of the live session |
| status | Integer | True | Task status: 5: Live manually closed, 50: Other closures, 9: Exception, 99: Initialization failed |
| startTime | String | True | Live start time (yyyy-MM-dd HH:mm:ss) |
| endTime | String | True | Live end time (yyyy-MM-dd HH:mm:ss) |

# Response Example

{
  "code": 0,
  "message": "success",
  "data": {
    "pagination": {
      "pageNo": 1,
      "numberPages": 1,
      "numberRecords": 2,
      "pageSize": 2,
      "startIndex": 0
    },
    "result": [
      {
        "sessionId": "sessionId",
        "startTime": "2023-02-17 16:53:26",
        "endTime": "2023-02-18 10:03:21",
        "status": 5
      },
      {
        "sessionId": "sessionId",
        "startTime": "2023-02-17 16:56:26",
        "endTime": "2023-02-17 17:43:19",
        "status": 9
      }
    ]
  }
}
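Walking through all history pages can be sketched as a generator. It assumes that `numberPages` in the `pagination` object is the total number of pages, which the field name and the response example suggest but which is not stated explicitly. `iter_history` is a hypothetical helper, and `fetch_page` stands for a function that posts `pageNo` and `pageSize` to this interface and returns the parsed response.

```python
from typing import Callable, Iterator


def iter_history(fetch_page: Callable[[int, int], dict], page_size: int = 10) -> Iterator[dict]:
    """Yield every task from the history list, requesting pages one by one."""
    page_no = 1
    while True:
        resp = fetch_page(page_no, page_size)
        if resp.get("code") != 0:
            raise RuntimeError(resp.get("message", "request failed"))
        data = resp.get("data") or {}
        yield from data.get("result", [])
        # Assumption: numberPages is the total page count.
        if page_no >= data.get("pagination", {}).get("numberPages", 0):
            break
        page_no += 1
```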

# Get RTC Token

# Interface Description

Gets an RTC token. 30 seconds before the token expires, the SDK triggers the token-privilege-will-expire callback. On receiving this callback, the client must obtain a new token from the server and pass it to the SDK by calling the renewToken method.

# Request URL

POST /api/2dvh/v1/material/rtc/token/audience

# Request Headers

Content-Type: application/json

# Request Parameters

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| sessionId | String | True | Unique identifier of the session |
| rtcChannelId | String | True | RTC Channel ID |

# Request Example

{
    "sessionId": "sessionId",
    "rtcChannelId": "rtcChannelId"
}

# Response Elements

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| code | Integer | True | 0 - Success, others - Exception |
| message | String | True | Detailed information about the exception |
| data | String | False | RTC token |

# Response Example

{
    "code": 0,
    "message": "success",
    "data": "007eJxTYGD8ItJX2zMjMu2BfJybs/H+9+66e3wXr2p+eTt877SnV3cqMKSZGKRamJomGRqnGZhYmhhamlikGJmaJ6WmGRuYpqQahHteTjmwgoHh8I7fcYwMjAwsQAziM4FJZjDJAiYVGCwNLM3NTdJSLVLMgMZYJFqapqalWZhZJpqnGSabp6YxMhgBADEzLqA="
}

# Live Script JSON Definition Explanation

# Introduction

The live script is the content definition that a 2D digital human needs for a live broadcast. It covers the character image used, the announcer's voice, the broadcast content, the background and foreground images of the live room, and the beauty parameters for the host.

# JSON Parameters Description
Nested fields are written as dotted paths (for example, digital_role.position.x is the x field of the position object inside digital_role).

| Name | Type | Example Value | Required | Description |
| --- | --- | --- | --- | --- |
| version | String | "0.0.2" | Yes | Version number of the live streaming JSON configuration file. |
| recycle | Integer | 0 | Yes | Number of live stream repetitions. 0 means infinite loop. A value greater than 0 is the number of loops: 1 means one playthrough, 6 means 6 loops. |
| stream_mode | Boolean | true | Yes | Whether to use streaming mode (quick start). The stream starts as soon as the minimum conditions are met, without waiting for all resources to be in place. |
| resolution | Int Array | [1080,1920] | No | Version 0.0.2 and later: Agora stream and internal blend canvas resolution. Version 0.0.1: Agora stream resolution. In both cases the maximum is 1080p and the minimum is 10*10. |
| bit_rate | Float | 3.5 | No | Agora stream bitrate in Mbps. Must be greater than 0 and no more than 7. The bitrate actually achievable depends on the current resolution. |
| live_type | String | "live", "interactive" | No | Type of the current live stream. If left blank, it defaults to "live" (standard live mode). "interactive" selects interactive mode, which supports the quick response (quick_response) definition. |
| sentence_break | Boolean | true | No | Whether to break sentences by punctuation. Enabled by default. If you prefer direct breaks instead of punctuation breaks, set this to false. |
| streaming_param | Object | | No | Suggestions for streaming parameters. |
| streaming_param.h264_profile | Integer | 100 | No | Agora PROFILE setting. Default is 100. Modifying it is generally not recommended. |
| streaming_param.key_frame_interval | Integer | 0 | No | Agora I-frame interval in seconds. Default is 0, meaning the I-frame interval is left unchanged. Modifying it is generally not recommended. |
| streaming_param.agora_live_mode | Boolean | false | No | Whether to use Agora LIVE mode. Default is false (RTC mode). true selects LIVE mode. |
| sceneList | Object Array | As follows | Yes | Scene list. Multiple scenes are not currently supported; only the first array element takes effect. |
| digital_role | Object | As follows | | |
| digital_role.id | Integer | 1 | No | Digital human ID |
| digital_role.face_feature_id | String | "1" | Yes | Digital human face feature ID |
| digital_role.name | String | "Xiao Li" | No | Digital human name |
| digital_role.url | String | "https://xxx/role.zip" | Yes | Digital human image zip package |
| digital_role.position | Object | As follows | No | Initial pixel position of the digital human image. The origin is the upper left corner of a 1080*1920 canvas, with x increasing to the right and y increasing downward. |
| digital_role.position.x | Integer | 0 | Yes | X-coordinate value |
| digital_role.position.y | Integer | 0 | Yes | Y-coordinate value |
| digital_role.scale | Float | 1.0 | No | Scale of the digital human image |
| digital_role.alpha | Boolean | false | No | Whether the digital human stream uses ALPHA transparent background mode. Default is false, meaning the transparent channel is not pushed. |
| tts_config | Object | As follows | Yes | TTS configuration |
| tts_config.qid | String | "8wfZav:AEA_Z10Mqp9GCwDGMrz8xIzi3VScxNzUtLCh" | No | Optional. qid can replace id/vendor_id, in which case id/vendor_id can be left blank, but id and qid cannot both be empty. |
| tts_config.id | String | "zh-CN-XiaoxuanNeural" | Yes | TTS speaker ID |
| tts_config.name | String | "Xiaoxuan" | Yes | TTS speaker name |
| tts_config.vendor_id | Integer | 4 | Yes | Vendor ID |
| tts_config.language | String | "zh-CN" | Yes | Language code |
| tts_config.pitch_offset | Float | 0.0 | Yes | Pitch. Higher values sound sharper, lower values sound deeper. Range [-60, 60] |
| tts_config.speed_ratio | Float | 1 | Yes | Speech rate. The higher the value, the slower the speech. Range [0.5, 2] |
| tts_config.volume | Integer | 100 | Yes | Volume. The higher the value, the louder the sound. Range [1, 400] |
| tts_query | Object | As follows | No | TTS speech synthesis |
| tts_query.content | String | "Dear audience, hello everyone! It is a great honor to be with you all at this wonderful moment. Welcome to today's live broadcast." | Yes | The text to be synthesized into speech. Must be at least 10 characters. Speakers of any language can synthesize English queries and queries in their own language. Chinese dialect speakers such as Cantonese and Shanghainese can also synthesize Chinese queries. |
| tts_query.ssml | Boolean | false | No | Whether to use SSML. If enabled, the query can use USSML and SSML. USSML is recommended. |
| audios | Object Array | As follows | No | Audio drivers. If both tts_query and audios exist, tts_query takes precedence. |
| audios.url | String | "https://xxx/audio.mp3" | Yes | URL of a driving audio file in mp3 format. The array supports multiple files. |
| quick_response | String Array | ["hello", "bye", "ni-hao"] | No | Quick replies (filler words), for interactive digital human applications only. Use the takeover interface to send the index into the quick reply array, starting from 0. -1 selects one at random. TTS text and an audio file cannot be used together; if both are present, TTS takes priority. Filler words should be either all TTS requests or all audio requests. Do not mix them. |
| quick_response.tts_query | Object Array | As follows | No | Request text for filler words. |
| quick_response.tts_query.content | String | "Hi" | Yes | Request text for the filler word. |
| quick_response.tts_query.ssml | Boolean | false | No | Whether to use SSML. |
| quick_response.audio | String | "https://xxx/audio.mp3" | No | Address of the takeover audio file for the filler word. |
| opening_words | String | "xxxxx" | No | Opening speech, for interactive digital human applications only. |
| backgrounds | Object Array | As follows | No | Background |
| backgrounds.type | Integer | 0 | Yes | 0: image, 1: video (mp4 format). No resolution requirements. Videos and images of different resolutions are handled by filling the short side and center-cropping the long side. |
| backgrounds.name | String | "Background" | Yes | Background name |
| backgrounds.url | String | "https://xxx/bg.png" | Yes | Background file URL |
| backgrounds.rect | Int Array | [0,0,1080,1920] | Yes | Initial position and size of the background, referencing a 1080*1920 canvas |
| backgrounds.cycle | Boolean | false | Yes | Effective for videos. false means single play; true means loop play |
| backgrounds.start | Integer | 0 | Yes | Background start time in ms |
| backgrounds.duration | Integer | -1 | Yes | Background duration in ms. The default value -1 means it lasts as long as the video. |
| foregrounds | Object Array | As follows | No | Foreground |
| foregrounds.type | Integer | 0 | Yes | 0: image |
| foregrounds.name | String | "Foreground" | Yes | Foreground name |
| foregrounds.url | String | "https://xxx/fg.png" | Yes | Foreground file URL, supporting png or jpg. |
| foregrounds.rect | Int Array | [0,0,1080,1920] | Yes | Initial position and size, referencing a 1080*1920 canvas |
| foregrounds.start | Integer | 0 | Yes | Foreground start time in ms |
| foregrounds.duration | Integer | -1 | Yes | Foreground duration in ms. The default value -1 means it lasts as long as the video. |
| effects | Object | As follows | No | Special effects |
| effects.version | String | "1.0" | Yes | Special effects engine version |
| effects.beautify | Object | As follows | No | Beautification |
| effects.beautify.whitenStrength | Float | 0.3 | No | Whitening effect. Range [0, 1.0], default 0.30. 0.0 means no whitening. |
| effects.beautify.whiten_mode | Integer | 0 | No | Whitening mode: 0 (pinkish white), 1 (natural white), 2 (natural white only for skin areas) |
| effects.beautify.reddenStrength | Float | 0.36 | No | Reddening effect. Range [0, 1.0], default 0.36. 0.0 means no reddening. |
| effects.beautify.smoothStrength | Float | 0.74 | No | Smoothing effect. Range [0, 1.0], default 0.74. 0.0 means no smoothing. |
| effects.beautify.smooth_mode | Integer | 0 | No | Smoothing mode: 0 (smoothing face area), 1 (smoothing entire image), 2 (fine smoothing face area) |
| effects.beautify.shrinkRatio | Float | 0.11 | No | Face slimming effect. Range [0, 1.0], default 0.11. 0.0 means no face slimming. |
| effects.beautify.enlargeRatio | Float | 0.13 | No | Eye enlargement effect. Range [0, 1.0], default 0.13. 0.0 means no eye enlargement. |
| effects.beautify.smallRatio | Float | 0.10 | No | Face reduction effect. Range [0, 1.0], default 0.10. 0.0 means no face reduction. |
| effects.beautify.narrowFace | Float | 0.0 | No | Narrow face effect. Range [0, 1.0], default 0.0. 0.0 means no narrow face effect. |
| effects.beautify.roundEyesRatio | Float | 0.0 | No | Round eyes effect. Range [0, 1.0], default 0.0. 0.0 means no round eyes effect. |
| effects.beautify.thinFaceShapeRatio | Float | 0.0 | No | Thin face shape effect. Range [0, 1.0], default 0.0. 0.0 means no thin face shape effect. |
| effects.beautify.chinLength | Float | 0.0 | No | Chin length. Range [-1, 1], default 0.0. [-1, 0] for shorter chin, [0, 1] for longer chin. |
| effects.beautify.hairlineHeightRatio | Float | 0.0 | No | Hairline height. Range [-1, 1], default 0.0. [-1, 0] for lower hairline, [0, 1] for higher hairline. |
| effects.beautify.appleMusle | Float | 0.0 | No | Apple muscle. Range [0, 1.0], default 0.0. 0.0 means no apple muscle effect. |
| effects.beautify.narrowNoseRatio | Float | 0.0 | No | Narrow nose. Range [0, 1.0], default 0.0. 0.0 means no narrow nose effect. |
| effects.beautify.noseLengthRatio | Float | 0.0 | No | Nose length. Range [-1, 1], default 0.0. [-1, 0] for shorter nose, [0, 1] for longer nose. |
| effects.beautify.profileRhinoplasty | Float | 0.0 | No | Profile rhinoplasty. Range [0, 1.0], default 0.0. 0.0 means no profile rhinoplasty effect. |
| effects.beautify.mouthSize | Float | 0.0 | No | Mouth size. Range [-1, 1], default 0.0. [-1, 0] for larger mouth, [0, 1] for smaller mouth. |
| effects.beautify.philtrumLengthRatio | Float | 0.0 | No | Philtrum length. Range [-1, 1], default 0.0. [-1, 0] for longer philtrum, [0, 1] for shorter philtrum. |
| effects.beautify.eyeDistanceRatio | Float | 0.0 | No | Eye distance adjustment. Range [-1, 1], default 0.0. [-1, 0] to reduce eye distance, [0, 1] to increase eye distance. |
| effects.beautify.eyeAngleRatio | Float | 0.0 | No | Eye angle. Range [-1, 1], default 0.0. [-1, 0] for left eye counterclockwise rotation, [0, 1] for left eye clockwise rotation. The right eye adjusts accordingly. |
| effects.beautify.openCanthus | Float | 0.0 | No | Open canthus. Range [0, 1.0], default 0.0. 0.0 means no open canthus effect. |
| effects.beautify.shrinkJawbone | Float | 0.0 | No | Shrink jawbone ratio. Range [0, 1.0], default 0.0. 0.0 means no shrink jawbone effect. |
| effects.beautify.shrinkRoundFace | Float | 0.0 | No | Round face slimming. Range [0, 1.0], default 0.0. 0.0 means no round face slimming effect. |
| effects.beautify.shrinkLongFace | Float | 0.0 | No | Long face slimming. Range [0, 1.0], default 0.0. 0.0 means no long face slimming effect. |
| effects.beautify.shrinkGoddessFace | Float | 0.0 | No | Goddess face slimming. Range [0, 1.0], default 0.0. 0.0 means no goddess face slimming effect. |
| effects.beautify.shrinkNaturalFace | Float | 0.0 | No | Natural face slimming. Range [0, 1.0], default 0.0. 0.0 means no natural face slimming effect. |
| effects.beautify.shrinkWholeHead | Float | 0.0 | No | Whole head shrink. Range [0, 1.0], default 0.0. 0.0 means no whole head shrink effect. |
| effects.beautify.contrastStrength | Float | 0.05 | No | Contrast. Range [0, 1.0], default 0.05. 0.0 means no contrast adjustment. |
| effects.beautify.saturationStrength | Float | 0.1 | No | Saturation. Range [0, 1.0], default 0.10. 0.0 means no saturation adjustment. |
| effects.beautify.sharpen | Float | 0.0 | No | Sharpen. Range [0, 1.0], default 0.0. 0.0 means no sharpening. |
| effects.beautify.clear | Float | 0.0 | No | Clarity. Range [0, 1.0], default 0.0. 0.0 means no clarity adjustment. |

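
The background rule "short side fills, long side is center-cropped" can be illustrated with a little arithmetic. The sketch below is only one plausible reading of that rule, not the platform's actual implementation: the function name `cover_crop` and the interpretation (scale uniformly until the target area is covered, then trim the overflow equally on both sides) are assumptions made for illustration.

```python
def cover_crop(src_w, src_h, dst_w=1080, dst_h=1920):
    """Return (scale, offset_x, offset_y) for a fill-then-center-crop fit.

    Assumed reading of the documented rule: scale the source uniformly so
    it covers the target area, then crop the overflow equally on both sides.
    Offsets are measured in pixels of the scaled source.
    """
    # The larger of the two ratios guarantees both sides reach the target size.
    scale = max(dst_w / src_w, dst_h / src_h)
    scaled_w = src_w * scale
    scaled_h = src_h * scale
    # Center the crop window inside the scaled source.
    offset_x = (scaled_w - dst_w) / 2
    offset_y = (scaled_h - dst_h) / 2
    return scale, offset_x, offset_y


# A landscape 1920x1080 source placed on the portrait 1080x1920 canvas:
# the height fills the canvas and the width is trimmed on both sides.
print(cover_crop(1920, 1080))
```

Under this reading, a source that already matches the canvas aspect ratio is only scaled and never cropped.
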
# JSON File Example
{
  "version": "0.0.1",
  "recycle": 0,
  "resolution": [1080,1920],
  "bit_rate": 5,
  "stream_mode": true,
  "sceneList": [
    {
      "digital_role": {
        "id": 1,
        "face_feature_id": "0325_nina_s3_beauty",
        "name": "Xiao Li",
        "url": "https://dwg-aigc-paas-test.oss-cn-hangzhou.aliyuncs.com/materials/77/0325_nina_s3_beauty.zip",
        "position": {
          "x": 10,
          "y": 60
        },
        "scale": 1
      },
      "tts_config": {
        "id": "zh-CN-XiaoxuanNeural",
        "name": "xiaoxuan",
        "vendor_id": 4,
        "language": "zh-CN",
        "pitch_offset": 0.0,
        "speed_ratio": 1,
        "volume": 400
      },
      "tts_query": {
        "content": "SenseTime is an industry-leading artificial intelligence software company whose mission is to persevere in originality and let AI lead human progress.",
        "ssml": false
      },
      "audios": [
        {
          "url": "http://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/test/softsugar.mp3"
        }
      ],
      "quick_response": [
        "I'm considering it, I'll get back to you later.",
        "Let me think about it and I'll reply to you later.",
        "I need time to carefully consider this issue.",
        "This issue is a bit complex, I'll need to reply to you later.",
        "I need some time to prepare a detailed response.",
        "Let me think about it for a moment, and then I'll reply to you.",
        "I'll give you a detailed response later, please wait a moment.",
        "I'll think carefully about this issue and then reply to you.",
        "I need some time to thoroughly understand this issue.",
        "I'll consider it and let you know my thoughts later."
      ],
      "opening_words": "Hello everyone, welcome to the live stream.",
      "backgrounds": [
        {
          "type": 0,
          "name": "Background",
          "url": "https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/test/background.png",
          "rect": [
            0,
            0,
            1080,
            1920
          ],
          "cycle": true,
          "start": 0,
          "duration": -1
        }
      ],
      "foregrounds": [
        {
          "type": 0,
          "name": "Foreground 1",
          "url": "http://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/test/frontgroud.png",
          "rect": [
            0,
            0,
            310,
            88
          ],
          "start": 0,
          "duration": 100
        }
      ],
      "effects": {
        "version": "1.0",
        "beautify": {
          "whitenStrength": 1.0,
          "whiten_mode": 0,
          "reddenStrength": 0.36,
          "smoothStrength": 1.0,
          "smooth_mode": 0,
          "shrinkRatio": 1.0,
          "enlargeRatio": 1.0,
          "smallRatio": 0.1,
          "narrowFace": 1.0,
          "roundEyesRatio": 0.0,
          "thinFaceShapeRatio": 0.0,
          "chinLength": 0.0,
          "hairlineHeightRatio": 0.0,
          "appleMusle": 0.0,
          "narrowNoseRatio": 0.0,
          "noseLengthRatio": 0.0,
          "profileRhinoplasty": 0.0,
          "mouthSize": 0.0,
          "philtrumLengthRatio": 0.0,
          "eyeDistanceRatio": 0.0,
          "eyeAngleRatio": 0.0,
          "openCanthus": 0.0,
          "brightEyeStrength": 0.0,
          "removeDarkCircleStrength": 0.0,
          "removeNasolabialFoldsStrength": 0.0,
          "whiteTeeth": 0.0,
          "shrinkCheekbone": 0.0,
          "thinnerHead": 0.0,
          "openExternalCanthus": 0.0,
          "shrinkJawbone": 0.0,
          "shrinkRoundFace": 0.0,
          "shrinkLongFace": 0.0,
          "shrinkGoddessFace": 0.0,
          "shrinkNaturalFace": 0.0,
          "shrinkWholeHead": 0.0,
          "contrastStrength": 0.05,
          "saturationStrength": 0.1,
          "sharpen": 0.0,
          "clear": 0.0
        }
      }
    }
  ]
}
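
Before submitting a live script, it can help to check it against the ranges listed in the parameter description. The following is a minimal sketch of such a check, not an official validator: the function name `validate_live_script` is hypothetical, it covers only a handful of the documented fields, and it assumes that "characters" can be counted with Python's `len()`.

```python
def validate_live_script(script):
    """Return a list of problems found in a live script dict (empty if none).

    Only a few documented constraints are checked; this is an illustrative
    subset, not the platform's own validation.
    """
    problems = []
    # Top-level fields marked "Yes" in the Required column.
    for key in ("version", "recycle", "stream_mode", "sceneList"):
        if key not in script:
            problems.append(f"missing required field: {key}")

    bit_rate = script.get("bit_rate")
    if bit_rate is not None and not (0 < bit_rate <= 7):
        problems.append("bit_rate must be greater than 0 and at most 7 (Mbps)")

    scenes = script.get("sceneList") or []
    if scenes:
        scene = scenes[0]  # only the first scene is effective
        tts_config = scene.get("tts_config", {})
        ranges = (("pitch_offset", -60, 60), ("speed_ratio", 0.5, 2), ("volume", 1, 400))
        for name, low, high in ranges:
            value = tts_config.get(name)
            if value is not None and not (low <= value <= high):
                problems.append(f"tts_config.{name} must be in [{low}, {high}]")
        tts_query = scene.get("tts_query")
        if tts_query is not None and len(tts_query.get("content", "")) < 10:
            problems.append("tts_query.content must be at least 10 characters")
    return problems
```

A script loaded with `json.load` can be passed in directly; an empty list means none of the checked constraints was violated, which does not guarantee that the platform will accept the script.
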

# Live Stream Takeover JSON Definition Explanation

# Introduction

Live stream takeover refers to the ability to stop the current preset live content of a 2D digital person during a live broadcast and switch to broadcasting content specified by the user, either through text or audio files. It also supports replacing the foreground/background images in the scene during the takeover period.

# JSON Parameters Explanation
Nested fields are written as dotted paths (for example, tts_query.content is the content field inside tts_query).

| Name | Type | Example Value | Required | Description |
| --- | --- | --- | --- | --- |
| tts_query | Object | | No | TTS (Text-to-Speech) synthesis |
| tts_query.content | String | "Dear audience, hello! It is a great honor to be here with you at this wonderful moment, welcome to today's live broadcast." | Yes | The text to be synthesized into speech. Must be at least 10 characters. |
| tts_query.ssml | Boolean | false | No | Whether to use SSML. When enabled, the query can use USSML and SSML. USSML is recommended. |
| audio | String | "https://xxx/audio.mp3" | No | The driving audio file in mp3 format. If both tts_query and audio exist, tts_query takes priority and audio is ignored. |
| foregrounds | Object Array | Content as follows | No | Foreground images. Requires the interruptible takeover service interface. |
| foregrounds.url | String | "https://xxx/fg.png" | Yes | Foreground image URL. Supports png or jpg. |
| foregrounds.rect | Int Array | [0,0,1080,1920] | Yes | Starting position and size, referencing a 1080*1920 canvas. |
| foregrounds.start | Integer | 0 | Yes | Time at which the foreground image appears, in milliseconds. If this time is later than the end of the TTS broadcast, the image is not displayed at all. If the image duration exceeds the TTS broadcast duration, the image is displayed until the takeover completes. If the image duration is shorter than the TTS broadcast duration, no image is shown for the remaining takeover time. |
| foregrounds.duration | Integer | -1 | Yes | Duration of the foreground image, in milliseconds. The default value -1 means it lasts for the entire takeover. |
| backgrounds | Object | Content as follows | No | Background images. Requires the interruptible takeover service interface. |
| backgrounds.url | String | "https://xxx/bg.png" | Yes | Background image URL. Supports png or jpg. |
| backgrounds.rect | Int Array | [0,0,1080,1920] | Yes | Starting position and size, referencing a 1080*1920 canvas. The image is stretched until the short side is full, and the long side is center-cropped. |
| backgrounds.start | Integer | 0 | Yes | Time at which the background image appears, in milliseconds. If this time is later than the end of the TTS broadcast, the image is not displayed at all. If the image duration exceeds the TTS broadcast duration, the image is displayed until the takeover completes. If the image duration is shorter than the TTS broadcast duration, no image is shown for the remaining takeover time. |
| backgrounds.duration | Integer | -1 | Yes | Duration of the background image, in milliseconds. The default value -1 means it lasts for the entire takeover. |

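
The interaction between start, duration, and the length of the takeover broadcast is easier to see as a small calculation. The sketch below is an interpretation of the rules described for start and duration, not platform code: the function name `visible_window` is hypothetical, the takeover length is assumed to be known in milliseconds, and a start time exactly equal to the takeover length is treated as "never shown".

```python
def visible_window(start_ms, duration_ms, takeover_ms):
    """Return (begin_ms, end_ms) during which a takeover image is visible,
    or None if it is never shown.

    takeover_ms is the length of the takeover broadcast (assumed known).
    """
    # An image scheduled at or after the end of the broadcast never appears.
    if start_ms >= takeover_ms:
        return None
    # -1 is the default: the image stays for the whole takeover.
    if duration_ms == -1:
        return (start_ms, takeover_ms)
    # Otherwise the image ends at its own deadline or at the end of the
    # takeover, whichever comes first.
    return (start_ms, min(start_ms + duration_ms, takeover_ms))


print(visible_window(0, -1, 5000))       # whole takeover
print(visible_window(0, 100, 5000))      # no image for the remaining time
print(visible_window(6000, 1000, 5000))  # starts too late, never shown
```
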
# JSON File Example
{
  "tts_query": {
    "content": "Dear viewers, hello everyone! We now interrupt with a news bulletin.",
    "ssml": false
  },
  "audio": "http://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/test/softsugar.mp3",
  "backgrounds": {
    "url": "https://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/test/background.png",
    "rect": [
      0,
      0,
      1080,
      1920
    ],
    "start": 0,
    "duration": -1
  },
  "foregrounds": [
    {
      "url": "http://dwg-aigc-paas.oss-cn-hangzhou.aliyuncs.com/test/frontgroud.png",
      "rect": [
        0,
        0,
        310,
        88
      ],
      "start": 0,
      "duration": 100
    }
  ]
}
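
Because tts_query takes priority and audio is ignored when both are present, a client may prefer to send only one of the two. The following sketch builds such a payload. The helper name `build_takeover` is hypothetical, and requiring at least one of text or audio is an assumption of this sketch rather than a documented rule.

```python
import json


def build_takeover(text=None, audio_url=None, ssml=False):
    """Build a takeover payload carrying either TTS text or an audio URL.

    If both are supplied, only the text is kept, mirroring the documented
    priority of tts_query over audio.
    """
    if text is None and audio_url is None:
        # Assumption of this sketch: a takeover needs something to play.
        raise ValueError("provide text or audio_url")
    payload = {}
    if text is not None:
        if len(text) < 10:
            raise ValueError("tts_query.content must be at least 10 characters")
        payload["tts_query"] = {"content": text, "ssml": ssml}
    else:
        payload["audio"] = audio_url
    return payload


# Serialize the payload as the JSON body of a takeover request.
print(json.dumps(build_takeover(text="We now interrupt with a news bulletin.")))
```

Foreground and background entries from the parameter description can be added to the returned dict before it is serialized.
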

# Live Streaming Status Callback Parameters

The system reports information such as the live streaming status by calling the live streaming status callback URL that you provide. To use this feature, contact the administrator and provide the callback URL when your account is created.
A callback is triggered when the status of the live room changes to one of the following: 35 (video stream creation completed), 5 (live broadcast actively closed), 50 (other closures, such as a long period without heartbeat or a network connection interruption), 9 (exception), or 99 (initialization failed; this arrives after the creation request has returned HTTP 200 with status 32, initialization in progress).
Callbacks are also triggered when a live takeover completes or is cancelled (34: takeover enqueued completed, 36: takeover playback started, 37: takeover playback completed, 39: takeover exception). Microphone takeover is excluded.
If the user has configured an AuthKey, authentication information such as a timestamp and signature is also returned.

The callback interface you implement must accept HTTP POST requests with Content-Type application/json.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| sessionId | String | True | Unique identifier of the live session |
| rtcAppId | String | False | RTC APP ID (available when status is 35) |
| rtcChannelId | String | False | RTC Channel ID (available when status is 35) |
| rtcUid | String | False | RTC UID (available when status is 35) |
| rtcToken | String | False | RTC Token (available when status is 35) |
| status | Integer | True | Status. 35: video stream creation completed; 5: live broadcast actively closed; 50: other closures (e.g., long time no heartbeat, network connection interruption); 9: exception; 99: initialization failed; 34: takeover enqueued completed; 36: takeover playback started; 37: takeover playback completed; 39: takeover exception |
| commandTraceId | String | False | Takeover task TraceId |
| commandFinishTime | String | False | Time of completion (yyyy-MM-dd HH:mm:ss) |
| taskResult | String | False | Error message |
| userData | String | False | User information |
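
A callback receiver only needs to accept a JSON POST and branch on the status field. The sketch below uses Python's standard `http.server` module. The names `summarize_callback`, `CallbackHandler`, and `serve`, the port 8080, and the empty 200 reply are all assumptions for illustration, since the response the platform expects is not described here. AuthKey signature checking is not shown.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

# Status codes as listed in the callback parameter description.
STATUS_MEANINGS = {
    35: "video stream creation completed",
    5: "live broadcast actively closed",
    50: "other closures",
    9: "exception",
    99: "initialization failed",
    34: "takeover enqueued completed",
    36: "takeover playback started",
    37: "takeover playback completed",
    39: "takeover exception",
}


def summarize_callback(raw_body):
    """Parse a callback body and return a small summary dict."""
    event = json.loads(raw_body)
    status = event["status"]
    summary = {
        "sessionId": event["sessionId"],
        "status": status,
        "meaning": STATUS_MEANINGS.get(status, "unknown status"),
    }
    if status == 35:
        # The RTC fields are only available once the video stream is ready.
        summary["rtc"] = {
            key: event.get(key)
            for key in ("rtcAppId", "rtcChannelId", "rtcUid", "rtcToken")
        }
    return summary


class CallbackHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        print(summarize_callback(self.rfile.read(length)))
        # Replying with an empty 200 is an assumption of this sketch.
        self.send_response(200)
        self.end_headers()


def serve(port=8080):
    """Run the receiver until interrupted (hypothetical entry point)."""
    HTTPServer(("0.0.0.0", port), CallbackHandler).serve_forever()
```

Keeping the parsing in `summarize_callback`, separate from the HTTP handler, lets the status handling be tested without starting a server.
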

The above represents the full capabilities that the platform can provide for live video streaming.

Last Updated: 9/25/2024, 3:07:44 PM