Multi-audio track transcoding and packaging

Use Intelligent Media Services (IMS) to transcode and package media files that have multiple audio tracks. The result is multi-language audio and video content that is compatible with a wide range of devices and players.

Transcoding and packaging workflow

[Figure: transcoding and packaging workflow]

Example of the master playlist (HLS manifest) of a packaged file:

#EXTM3U

# Audio stream definitions (multi-language)
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",NAME="Chinese",LANGUAGE="zh",DEFAULT=YES,URI="audio/chinese.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",NAME="English",LANGUAGE="en",DEFAULT=NO,URI="audio/english.m3u8"

# Video stream definitions (multi-bitrate)
#EXT-X-STREAM-INF:BANDWIDTH=400000,RESOLUTION=360x640,AUDIO="audio",CODECS="hvc1,mp4a.40.5"
video/360p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=900000,RESOLUTION=720x1280,AUDIO="audio",CODECS="hvc1,mp4a.40.5"
video/720p.m3u8
#EXT-X-STREAM-INF:BANDWIDTH=2000000,RESOLUTION=1080x1920,AUDIO="audio",CODECS="hvc1,mp4a.40.5"
video/1080p.m3u8
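
To check which audio tracks a packaged manifest exposes, the EXT-X-MEDIA lines can be read back programmatically. The following is a minimal sketch that assumes the manifest text is already loaded into a string. The function names parse_attributes and list_audio_tracks are hypothetical helpers, not part of IMS or any SDK, and the sample playlist is a shortened copy of the example above.

```python
import re

MASTER_PLAYLIST = """\
#EXTM3U
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",NAME="Chinese",LANGUAGE="zh",DEFAULT=YES,URI="audio/chinese.m3u8"
#EXT-X-MEDIA:TYPE=AUDIO,GROUP-ID="audio",NAME="English",LANGUAGE="en",DEFAULT=NO,URI="audio/english.m3u8"
#EXT-X-STREAM-INF:BANDWIDTH=400000,RESOLUTION=360x640,AUDIO="audio",CODECS="hvc1,mp4a.40.5"
video/360p.m3u8
"""

# KEY=VALUE pairs; VALUE is either a quoted string (may contain commas)
# or a bare token that ends at the next comma.
_ATTR_RE = re.compile(r'([A-Z0-9-]+)=("[^"]*"|[^,]*)')


def parse_attributes(attr_text):
    """Turn an HLS attribute list into a dict, stripping surrounding quotes."""
    return {key: value.strip('"') for key, value in _ATTR_RE.findall(attr_text)}


def list_audio_tracks(playlist_text):
    """Return the attributes of every EXT-X-MEDIA entry with TYPE=AUDIO."""
    tracks = []
    for line in playlist_text.splitlines():
        if line.startswith("#EXT-X-MEDIA:"):
            attrs = parse_attributes(line.split(":", 1)[1])
            if attrs.get("TYPE") == "AUDIO":
                tracks.append(attrs)
    return tracks


for track in list_audio_tracks(MASTER_PLAYLIST):
    print(track["NAME"], track["LANGUAGE"], track["URI"])
```

The quoted-string branch of the pattern comes first so that values such as CODECS, which contain commas, are kept whole.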

Prerequisites

Ensure Intelligent Media Services (IMS) is activated. For instructions, see Activate IMS.

Configuration

Basic IMS configuration

Transcoding template configuration

Configuration procedure

[Figure: configuration procedure]

Example requirements

Codec: H.264/H.265

Video resolution: 360p/540p/720p/1080p

Audio: HE-AAC at 64 kbps (default configuration).

Example configuration

Create transcoding templates for each required video resolution based on the following tables. For instructions, see Create a transcoding template.

Note

To perform Narrowband HD™ transcoding, create the corresponding templates based on the table, and then submit a ticket for Alibaba Cloud to upgrade your backend configuration.

H.264

| Transcoding template | Codec | Container format | Other settings |
| --- | --- | --- | --- |
| Video-360P | H.264 | m3u8(.ts) | Resolution (long edge fixed, short edge adaptive): 640*; disable audio; segment length: 5 s; configure other settings as needed. |
| Video-540P | H.264 | m3u8(.ts) | Resolution (long edge fixed, short edge adaptive): 960*; disable audio; segment length: 5 s; configure other settings as needed. |
| Video-720P | H.264 | m3u8(.ts) | Resolution (long edge fixed, short edge adaptive): 1280*; disable audio; segment length: 5 s; configure other settings as needed. |
| Video-1080P | H.264 | m3u8(.ts) | Resolution (long edge fixed, short edge adaptive): 1920*; disable audio; segment length: 5 s; configure other settings as needed. |
| Audio-64Kbps | HE-AAC | m3u8(.ts) | Disable video; segment length: 5 s. |

Note

The Audio-64Kbps template cannot be created in the console. Use the API or submit a ticket instead.

H.265

Note
  • Recommended: Use the fmp4 container format. It is the standard for Apple devices and is compatible with Safari.

  • Alternative: The .ts container format can also be used, but it is not compatible with Safari.

  • Console limitation: You cannot create a template with the fmp4 container format in the console. Instead, create the template with the m3u8(ts) container format. Alibaba Cloud will then upgrade the configuration in the backend.

| Transcoding template | Codec | Container format | Other settings |
| --- | --- | --- | --- |
| Video-360P | H.265 | m3u8(.fmp4) | Resolution (long edge fixed, short edge adaptive): 640*; disable audio; segment length: 5 s; configure other settings as needed. |
| Video-540P | H.265 | m3u8(.fmp4) | Resolution (long edge fixed, short edge adaptive): 960*; disable audio; segment length: 5 s; configure other settings as needed. |
| Video-720P | H.265 | m3u8(.fmp4) | Resolution (long edge fixed, short edge adaptive): 1280*; disable audio; segment length: 5 s; configure other settings as needed. |
| Video-1080P | H.265 | m3u8(.fmp4) | Resolution (long edge fixed, short edge adaptive): 1920*; disable audio; segment length: 5 s; configure other settings as needed. |
| Audio-64Kbps | HE-AAC | m3u8(.fmp4) | Disable video; segment length: 5 s. |

Note

The Audio-64Kbps template cannot be created in the console. Use the API or submit a ticket instead.

Multi-bitrate transcoding and packaging task

Multi-bitrate task submission

Call the SubmitMediaConvertJob operation to submit a transcoding task for your media files.

HlsGroupConfig parameters

| Parameter | Type | Description |
| --- | --- | --- |
| Type | string | The type of the data stream. Valid values: video (a video stream; only video-related settings are processed), audio (an audio stream; only audio-related settings are processed), hybrid (a hybrid stream; both audio-related and video-related settings are processed). |
| Bandwidth | string | The bandwidth. This parameter is optional. By default, the bitrate (in bps) is used. Valid only when Type is set to video or hybrid. |
| AudioGroup | string | The audio group referenced by the video stream. Valid only when Type is set to video. |
| SubtitleGroup | string | The subtitle group referenced by the video stream. Valid only when Type is set to video or hybrid. |
| Name | string | The NAME attribute of the output stream in the HLS manifest. Required when Type is set to audio or subtitle. |
| Group | string | The GROUP-ID attribute of the output stream in the HLS manifest. Valid only when Type is set to audio or subtitle. Defaults to the value of the Type parameter. |
| Language | string | The LANGUAGE attribute of the output stream in the HLS manifest. Valid only when Type is set to audio or subtitle. The value must conform to RFC 5646. |
| Default | boolean | Specifies whether this is the default stream. Valid only when Type is set to audio. |
| AutoSelect | boolean | Specifies whether the stream should be automatically selected. Valid only when Type is set to audio. |
| Forced | boolean | Specifies whether the stream is forced. Valid only when Type is set to audio. |
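
The "valid only when" constraints above can be checked on the client side before a job is submitted. The sketch below is one possible pre-flight check, not part of the IMS API. The function name check_hls_group_config is hypothetical, the key spellings follow the parameter list above, and accepting subtitle as a Type value is an assumption based on the parameter descriptions, because it is not listed among the valid values of Type.

```python
# Type values for which each optional parameter takes effect,
# transcribed from the HlsGroupConfig parameter descriptions.
ALLOWED_TYPES = {
    "Bandwidth": {"video", "hybrid"},
    "AudioGroup": {"video"},
    "SubtitleGroup": {"video", "hybrid"},
    "Group": {"audio", "subtitle"},
    "Language": {"audio", "subtitle"},
    "Default": {"audio"},
    "AutoSelect": {"audio"},
    "Forced": {"audio"},
}

# Assumption: "subtitle" is accepted because several descriptions mention it.
KNOWN_TYPES = {"video", "audio", "hybrid", "subtitle"}


def check_hls_group_config(config):
    """Return a list of human-readable problems; an empty list means no issue found."""
    stream_type = config.get("Type")
    if stream_type not in KNOWN_TYPES:
        return [f"unknown Type: {stream_type!r}"]

    problems = []
    if stream_type in {"audio", "subtitle"} and not config.get("Name"):
        problems.append("Name is required when Type is audio or subtitle")
    for key, types in ALLOWED_TYPES.items():
        if key in config and stream_type not in types:
            problems.append(f"{key} has no effect when Type is {stream_type}")
    return problems


print(check_hls_group_config({"Type": "video", "AudioGroup": "audio", "Default": True}))
```

Returning a list of messages instead of raising on the first problem makes it easier to report every misconfigured output in one pass.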

Use case 1: Transcode and package

{
  "Inputs": [
    {
      "Name": "video",
      "InputFile": {
        "Type": "OSS",
        "Media": "https://<your-bucket>.<public-endpoint>/<video_1_chinese.mp4>"
      }
    },
    {
      "Name": "EnglishAudio",
      "InputFile": {
        "Type": "OSS",
        "Media": "https://<your-bucket>.<public-endpoint>/<video_1_english.mp4>"
      }
    },
    {
      "Name": "JapaneseAudio",
      "InputFile": {
        "Type": "OSS",
        "Media": "https://<your-bucket>.<public-endpoint>/<video_1_japanese.mp4>"
      }
    }
  ],
  "OutputGroups": [
    {
      "GroupConfig": {
        "Type": "Hls",
        "OutputFileBase": {
          "Type": "OSS",
          "Media": "https://<your-bucket>.<public-endpoint>/<URI>/"
        },
        "ManifestName": "<m3u8_filename>"
      },
      "Outputs": [
        {
          "Name": "360P",
          "InputRef": "video",
          "OutputFileName": "video/360p/360p",
          "TemplateId": "Video-360P"
        },
        {
          "Name": "540P",
          "InputRef": "video",
          "OutputFileName": "video/540p/540p",
          "TemplateId": "Video-540P"
        },
        {
          "Name": "720P",
          "InputRef": "video",
          "OutputFileName": "video/720p/720p",
          "TemplateId": "Video-720P"
        },
        {
          "Name": "1080P",
          "InputRef": "video",
          "OutputFileName": "video/1080p/1080p",
          "TemplateId": "Video-1080P"
        },
        {
          "OutputFileName": "audio/chinese/chinese",
          "TemplateId": "Audio-64Kbps",
          "HlsGroupConfig": {
            "Name": "Chinese",
            "Type":"audio",
            "Language": "zh",
            "Autoselect": "TRUE",
            "Default": "TRUE"
          }
        },
        {
          "InputRef": "EnglishAudio",
          "OutputFileName": "audio/english/english",
          "TemplateId": "Audio-64Kbps",
          "HlsGroupConfig": {
            "Name": "English",
            "Type":"audio",
            "Language": "en",
            "Autoselect": "TRUE"
          }
        },
        {
          "InputRef": "JapaneseAudio",
          "OutputFileName": "audio/japanese/japanese",
          "TemplateId": "Audio-64Kbps",
          "HlsGroupConfig": {
            "Name": "Japanese",
            "Type":"audio",
            "Language": "ja",
            "Autoselect": "TRUE"
          }
        }
      ]
    }
  ]
}
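
When the number of resolutions or languages varies, a request body like the one above can be generated instead of written by hand. The following is a minimal sketch under stated assumptions: build_job_config and its parameters are hypothetical, the template IDs and file naming mirror the example above, and the sketch sets InputRef explicitly on every audio output, including the default track, which the example above leaves implicit. It only builds the JSON structure and does not call the service.

```python
import json


def build_job_config(base_url, video_object, audio_tracks, resolutions,
                     output_prefix, manifest_name):
    """Build a job configuration dict shaped like the example above.

    audio_tracks is a list of (name, language, object_key) tuples. An
    object_key of None means the track comes from the main video input.
    The first track in the list is marked as the default.
    """
    inputs = [{"Name": "video",
               "InputFile": {"Type": "OSS", "Media": f"{base_url}/{video_object}"}}]

    # One video-only output per resolution, all reading from the main input.
    outputs = [{"Name": f"{height}P",
                "InputRef": "video",
                "OutputFileName": f"video/{height}p/{height}p",
                "TemplateId": f"Video-{height}P"}
               for height in resolutions]

    # One audio-only output per language.
    for index, (name, language, object_key) in enumerate(audio_tracks):
        input_ref = "video"
        if object_key is not None:
            input_ref = f"{name}Audio"
            inputs.append({"Name": input_ref,
                           "InputFile": {"Type": "OSS",
                                         "Media": f"{base_url}/{object_key}"}})
        hls = {"Name": name, "Type": "audio", "Language": language,
               "Autoselect": "TRUE"}
        if index == 0:
            hls["Default"] = "TRUE"
        slug = name.lower()
        outputs.append({"InputRef": input_ref,
                        "OutputFileName": f"audio/{slug}/{slug}",
                        "TemplateId": "Audio-64Kbps",
                        "HlsGroupConfig": hls})

    return {
        "Inputs": inputs,
        "OutputGroups": [{
            "GroupConfig": {
                "Type": "Hls",
                "OutputFileBase": {"Type": "OSS",
                                   "Media": f"{base_url}/{output_prefix}/"},
                "ManifestName": manifest_name,
            },
            "Outputs": outputs,
        }],
    }


config = build_job_config(
    "https://example-bucket.example-endpoint",  # placeholder, not a real endpoint
    "video_1_chinese.mp4",
    [("Chinese", "zh", None), ("English", "en", "video_1_english.mp4")],
    [360, 720],
    "packaged",
    "master",
)
print(json.dumps(config, indent=2))
```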

Use case 2: Add audio tracks

Procedure:

  1. Specify two inputs: the new audio source, named "ExtraAudio", and the existing manifest, named "RefManifest". In Outputs, reference "ExtraAudio" to transcode it into an HLS audio stream.

  2. In GroupConfig, set InputRef in the ManifestExtend option to "RefManifest". The original manifest is then reused, and the new audio track is added to it.

{
  "Inputs": [
    {
      "Name": "ExtraAudio",
      "InputFile": {
        "Type": "OSS",
        "Media": "http://your-bucket.oss-region.aliyuncs.com/in/extra-audio.mp4"
      }
    },
    {
      "Name": "RefManifest",
      "InputFile": {
        "Type": "OSS",
        "Media": "http://your-bucket.oss-region.aliyuncs.com/in/manifest.m3u8"
      }
    }
  ],
  "OutputGroups": [
    {
      "GroupConfig": {
        "Type": "Hls",
        "OutputFileBase": {
          "Type": "OSS",
          "Media": "http://your-bucket.oss-region.aliyuncs.com/out/demo"
        },
        "ManifestName": "manifest",
        "ManifestExtend": {
          "InputRef": "RefManifest"
        }
      },
      "Outputs": [
        {
          "Name": "ExtraAudioOut",
          "InputRef": "ExtraAudio",
          "OutputFileName": "extra-audio",
          "TemplateId": "#YourAudioTemplateId",
          "HlsGroupConfig": {
            "Type": "audio",
            "Name": "Chinese",
            "Language": "zh-cn"
          }
        }
      ]
    }
  ]
}

Use case 3: Replace audio tracks

This use case builds on Use case 2. Use the Excludes option within ManifestExtend to exclude specific streams from the original manifest.

| Parameter | Type | Description |
| --- | --- | --- |
| Name | string | The NAME attribute of the stream to exclude. |
| Type | string | The TYPE attribute of the stream to exclude. Valid values: Audio, Subtitle. |
| Language | string | The LANGUAGE attribute of the stream to exclude. The value must conform to RFC 5646. |

{
  "Inputs": [
    {
      "Name": "ExtraAudio",
      "InputFile": {
        "Type": "OSS",
        "Media": "http://your-bucket.oss-region.aliyuncs.com/in/extra-audio.mp4"
      }
    },
    {
      "Name": "RefManifest",
      "InputFile": {
        "Type": "OSS",
        "Media": "http://your-bucket.oss-region.aliyuncs.com/in/manifest.m3u8"
      }
    }
  ],
  "OutputGroups": [
    {
      "GroupConfig": {
        "Type": "Hls",
        "OutputFileBase": {
          "Type": "OSS",
          "Media": "http://your-bucket.oss-region.aliyuncs.com/out/demo"
        },
        "ManifestName": "manifest",
        "ManifestExtend": {
          "InputRef": "RefManifest",
          "Excludes": [{
              "Language": "en",
              "Type": "Audio"
            }]
        }
      },
      "Outputs": [
        {
          "Name": "ExtraAudioOut",
          "InputRef": "ExtraAudio",
          "OutputFileName": "extra-audio",
          "TemplateId": "#YourAudioTemplateId",
          "HlsGroupConfig": {
            "Type": "audio",
            "Name": "Chinese",
            "Language": "zh-cn"
          }
        }
      ]
    }
  ]
}
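
To preview which tracks of an existing manifest an Excludes list would remove, the matching can be simulated locally. The sketch below is an illustration only and does not describe how the service is implemented. It assumes that a rule matches a track when every field the rule specifies equals the corresponding manifest attribute, compared without regard to case. Both the matching semantics and the function names are assumptions.

```python
# Maps Excludes fields to the EXT-X-MEDIA attributes they are compared against.
FIELD_TO_ATTRIBUTE = {"Name": "NAME", "Type": "TYPE", "Language": "LANGUAGE"}


def matches_exclude(track, rule):
    """True if every field set in the rule equals the track's attribute.

    Assumption: fields are combined with AND and compared case-insensitively.
    A rule that sets none of the known fields matches nothing.
    """
    specified = [key for key in FIELD_TO_ATTRIBUTE if key in rule]
    if not specified:
        return False
    return all(
        track.get(FIELD_TO_ATTRIBUTE[key], "").lower() == rule[key].lower()
        for key in specified
    )


def apply_excludes(tracks, excludes):
    """Return the tracks that no exclude rule matches."""
    return [t for t in tracks if not any(matches_exclude(t, r) for r in excludes)]


original_tracks = [
    {"TYPE": "AUDIO", "NAME": "Chinese", "LANGUAGE": "zh"},
    {"TYPE": "AUDIO", "NAME": "English", "LANGUAGE": "en"},
]
remaining = apply_excludes(original_tracks, [{"Language": "en", "Type": "Audio"}])
print([t["NAME"] for t in remaining])
```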

Querying task results

Call the GetMediaConvertJob operation to query the details of a transcoding task.

Callback notifications

Event type: MediaConvertComplete

This event cannot be configured in the console. Call the SetEventCallback operation instead.

Key callback parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| Name | String | Yes | The name of the main task. |
| JobId | String | Yes | The task ID. |
| Status | String | Yes | The task status. Success indicates that at least one output (subtask) succeeded. |
| TriggerSource | String | No | The source that triggered the task. API indicates that the task was submitted through an API call. |
| FinishTime | String | No | The time when the task was completed. The format is the same as that of EventTime. |
| UserData | String | No | The custom string specified when you submitted the task. This value is passed through and returned in the callback. |

Example

{
  "FinishTime": "2025-05-09T08:03:21Z",
  "JobId": "your-job-id",
  "Status": "Success",
  "TriggerSource": "IceWorkflow",
  "UserData": "{\"ImsSrc\":\"Workflow\",\"TaskId\":\"e89a955d88ca47f0b9b79c562e5c622f\"}"
}
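
A callback receiver typically needs the job ID, the outcome, and the pass-through UserData. The following sketch shows one way to extract them. It assumes that the receiver is handed a JSON object with the fields shown in the example above; the actual callback may wrap these fields in an outer envelope, which this sketch does not handle. The function name parse_media_convert_callback is hypothetical.

```python
import json


def parse_media_convert_callback(body):
    """Extract the key fields from a MediaConvertComplete callback body."""
    event = json.loads(body)

    # UserData is a free-form string. In the example it happens to hold JSON,
    # so try to decode it and fall back to the raw string otherwise.
    user_data = event.get("UserData")
    if user_data:
        try:
            user_data = json.loads(user_data)
        except json.JSONDecodeError:
            pass

    return {
        "job_id": event["JobId"],
        "succeeded": event["Status"] == "Success",
        "finish_time": event.get("FinishTime"),
        "user_data": user_data,
    }


example_body = json.dumps({
    "FinishTime": "2025-05-09T08:03:21Z",
    "JobId": "your-job-id",
    "Status": "Success",
    "TriggerSource": "IceWorkflow",
    "UserData": json.dumps({"ImsSrc": "Workflow",
                            "TaskId": "e89a955d88ca47f0b9b79c562e5c622f"}),
})
print(parse_media_convert_callback(example_body))
```

Because Success only indicates that at least one output succeeded, a receiver that needs per-output results can follow up with GetMediaConvertJob.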

Playing the multi-bitrate packaged video

Use ApsaraVideo Player to play the packaged multi-bitrate video.

Video translation and multi-bitrate packaging workflow

[Figure: video translation and multi-bitrate packaging workflow]

  1. Prepare inputs: Provide the source file.

  2. Translate: Translate the source file into the target languages, such as English and Japanese, to generate the corresponding audio or video files.

  3. Transcode and package: Call the SubmitMediaConvertJob operation to merge the multi-language content and transcode it into a standardized, multi-bitrate video.