Transcode an MP4 file with Intelligent Media Services (IMS) to include multiple audio tracks and assign a language to each track.
Workflow
Example of the output file structure:
Duration: 00:00:31.40, start: 0.000000, bitrate: 816 kb/s
Stream #0:0[0x1](und): Video: h264 (High) (avc1 / 0x31637661), yuv420p(tv, bt709, progressive), 960x540 [SAR 1:1 DAR 16:9], 663 kb/s, 25 fps, 25 tbr, 12800 tbn (default)
Stream #0:1[0x2](zho): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 46 kb/s (default)
Stream #0:2[0x3](eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 46 kb/s (default)
Stream #0:3[0x4](jpn): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 46 kb/s (default)
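To confirm that an output file carries the expected language tags, the stream lines can be scanned for audio entries. The following is a minimal sketch, assuming the listing above is ffprobe-style text in exactly this format; the function name `audio_languages` and the regular expression are illustrative and not part of IMS.

```python
import re

# Matches lines such as: Stream #0:1[0x2](zho): Audio: aac (LC) ...
# Group 1 = stream index, group 2 = language tag, group 3 = stream type.
STREAM_RE = re.compile(
    r"Stream #\d+:(\d+)(?:\[0x[0-9a-fA-F]+\])?\((\w+)\): (\w+):"
)


def audio_languages(probe_text: str) -> list[str]:
    """Return the language tags of all audio streams, in stream order."""
    langs = []
    for line in probe_text.splitlines():
        match = STREAM_RE.search(line)
        if match and match.group(3) == "Audio":
            langs.append(match.group(2))
    return langs


# Abbreviated copy of the stream lines shown above.
SAMPLE = """\
Stream #0:0[0x1](und): Video: h264 (High) (avc1 / 0x31637661), yuv420p, 960x540
Stream #0:1[0x2](zho): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo
Stream #0:2[0x3](eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo
Stream #0:3[0x4](jpn): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo
"""

if __name__ == "__main__":
    print(audio_languages(SAMPLE))
```

The video stream is skipped because its type is not `Audio`, so only the three audio language tags are returned.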
Prerequisites
You have activated Intelligent Media Services (IMS). For more information, see Activate the service.
Preparations
Basic IMS configuration
- Storage configuration: Bind an Object Storage Service (OSS) bucket to IMS. For more information, see Configure a storage address.
- Callback configuration: Configure an HTTP or Message Notification Service (MNS) callback to receive job status notifications. For more information about callback methods and events, see Callback events overview.
Transcoding templates
Procedure
Example requirements
Codec: H.264/H.265
Video resolution: 360p/540p/720p/1080p
Audio: HE-AAC at 64 kbps (default configuration).
Configuration example
Create transcoding templates for four video resolutions as described below. For instructions, see Create a transcoding template.
To use Narrowband HD™ transcoding, create a corresponding template based on the tables. Then, submit a ticket to request a backend configuration upgrade from Alibaba Cloud.
H.264
| Transcoding template | Codec | Container format | Other parameters |
| --- | --- | --- | --- |
| Video-360P | H.264 | mp4 | |
| Video-540P | H.264 | mp4 | |
| Video-720P | H.264 | mp4 | |
| Video-1080P | H.264 | mp4 | |
H.265
| Transcoding template | Codec | Container format | Other parameters |
| --- | --- | --- | --- |
| Video-360P | H.265 | mp4 | |
| Video-540P | H.265 | mp4 | |
| Video-720P | H.265 | mp4 | |
| Video-1080P | H.265 | mp4 | |
Submit a transcoding job
Call the SubmitMediaConvertJob API operation to submit a transcoding job.
Audio parameters
| Parameter | Type | Description |
| --- | --- | --- |
| InputRef | String | The name of the input stream for this audio track. This must match a |
| LanguageControl | String | Controls how the language tag is set for the output stream. Valid values: |
| Language | String | The ISO 639-2 language code for the audio track. |
| Remove | String | Whether to remove the audio stream. |
| Codec | String | The audio codec. |
| Profile | String | The audio encoding profile. |
| Bitrate | String | The bitrate of the output audio. |
| Samplerate | String | The sample rate. |
| Channels | String | The number of audio channels. |
| Volume | Object | The volume control settings. |
Scenario 1: Keep original audio
- The Inputs array specifies three sources: a main video file with default audio (video), a separate English audio file (EnglishAudio), and a separate Japanese audio file (JapaneseAudio).
- In OutputGroups.GroupConfig, "Type": "File" specifies the output as a single container file.
- Each track uses InputRef to specify its source input and LanguageControl to determine its language tagging logic.
{
  "Inputs": [
    {
      "Name": "video",
      "InputFile": {"Type": "OSS", "Media": "https://<your-bucket>.<public-endpoint>/<video-with-default-audio.mp4>"}
    },
    {
      "Name": "EnglishAudio",
      "InputFile": {"Type": "OSS", "Media": "https://<your-bucket>.<public-endpoint>/<english-audio.mp4>"}
    },
    {
      "Name": "JapaneseAudio",
      "InputFile": {"Type": "OSS", "Media": "https://<your-bucket>.<public-endpoint>/<japanese-audio.mp4>"}
    }
  ],
  "OutputGroups": [
    {
      "GroupConfig": {
        "Type": "File",
        "OutputFileBase": {
          "Type": "OSS",
          "Media": "https://<your-bucket>.<public-endpoint>/<output-path>/"
        }
      },
      "Outputs": [
        {
          "Name": "360P",
          "OutputFileName": "video/360p/360p",
          "TemplateId": "Video-360P",
          "OverrideParams": {
            "Audios": [
              {
                "InputRef": "video",
                "LanguageControl": "InputFirst"
              }, {
                "InputRef": "EnglishAudio",
                "LanguageControl": "Configured",
                "Language": "eng"
              }, {
                "InputRef": "JapaneseAudio",
                "LanguageControl": "Configured",
                "Language": "jpn"
              }
            ]
          }
        }
      ]
    }
  ]
}
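The example above defines a single 360P output, while the template tables list four resolutions. The following is a minimal sketch of how the Outputs array could be generated for all four templates with the same audio tracks. It assumes that the template names from the tables are usable as TemplateId values (as in the example) and that the output name and path follow the `video/360p/360p` pattern for the other resolutions; both are assumptions, and the helper names `build_audios` and `build_outputs` are hypothetical.

```python
import json

# Resolutions from the template tables. The naming pattern for anything
# other than 360P is an assumption extrapolated from the example.
RESOLUTIONS = ["360P", "540P", "720P", "1080P"]


def build_audios(default_input: str, extra_tracks: dict) -> list:
    """Build the Audios override: the default track keeps its input language
    tag, and each extra track gets an explicitly configured language."""
    audios = [{"InputRef": default_input, "LanguageControl": "InputFirst"}]
    for input_name, language in extra_tracks.items():
        audios.append({
            "InputRef": input_name,
            "LanguageControl": "Configured",
            "Language": language,
        })
    return audios


def build_outputs(audios: list) -> list:
    """Build one output entry per resolution, each with its own copy of audios."""
    outputs = []
    for res in RESOLUTIONS:
        path = res.lower()
        outputs.append({
            "Name": res,
            "OutputFileName": f"video/{path}/{path}",  # assumed pattern
            "TemplateId": f"Video-{res}",              # assumed to match template names
            "OverrideParams": {"Audios": [dict(track) for track in audios]},
        })
    return outputs


if __name__ == "__main__":
    audios = build_audios("video", {"EnglishAudio": "eng", "JapaneseAudio": "jpn"})
    print(json.dumps(build_outputs(audios), indent=2))
```

The resulting list can replace the Outputs array in the request body; the Inputs and GroupConfig sections stay as shown in the example.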
Scenario 2: Remove original audio
This configuration is similar to Scenario 1, but it omits the reference to the original video's audio track from the Audios array. As a result, the output file contains only the English and Japanese audio tracks.
{
  "Inputs": [
    {
      "Name": "video",
      "InputFile": {"Type": "OSS", "Media": "https://<your-bucket>.<public-endpoint>/<video-with-default-audio.mp4>"}
    },
    {
      "Name": "EnglishAudio",
      "InputFile": {"Type": "OSS", "Media": "https://<your-bucket>.<public-endpoint>/<english-audio.mp4>"}
    },
    {
      "Name": "JapaneseAudio",
      "InputFile": {"Type": "OSS", "Media": "https://<your-bucket>.<public-endpoint>/<japanese-audio.mp4>"}
    }
  ],
  "OutputGroups": [
    {
      "GroupConfig": {
        "Type": "File",
        "OutputFileBase": {
          "Type": "OSS",
          "Media": "https://<your-bucket>.<public-endpoint>/<output-path>/"
        }
      },
      "Outputs": [
        {
          "Name": "360P",
          "OutputFileName": "video/360p/360p",
          "TemplateId": "Video-360P",
          "OverrideParams": {
            "Audios": [
              {
                "InputRef": "EnglishAudio",
                "LanguageControl": "Configured",
                "Language": "eng"
              }, {
                "InputRef": "JapaneseAudio",
                "LanguageControl": "Configured",
                "Language": "jpn"
              }
            ]
          }
        }
      ]
    }
  ]
}
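When both variants are generated from the same track list, the Scenario 2 Audios array can be derived by filtering out the entry that references the main video input. This is a small sketch under that assumption; the helper name `drop_source_audio` is hypothetical, and the default input name `video` is taken from the example.

```python
def drop_source_audio(audios: list, source_input: str = "video") -> list:
    """Return a copy of an Audios list without tracks that reference source_input."""
    return [dict(track) for track in audios if track.get("InputRef") != source_input]


# Audios array from Scenario 1.
SCENARIO_1_AUDIOS = [
    {"InputRef": "video", "LanguageControl": "InputFirst"},
    {"InputRef": "EnglishAudio", "LanguageControl": "Configured", "Language": "eng"},
    {"InputRef": "JapaneseAudio", "LanguageControl": "Configured", "Language": "jpn"},
]

if __name__ == "__main__":
    print(drop_source_audio(SCENARIO_1_AUDIOS))
```

The original list is left unchanged, so the same source data can feed both the Scenario 1 and Scenario 2 requests.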
Scenario 3: Select audio by language
This example uses the AudioSelector parameter to select the audio track tagged jpn from the JapaneseFile input. The output audio track sets LanguageControl to InputFirst, which inherits the language tag from the input.
{
  "Inputs": [
    {
      "Name": "video",
      "InputFile": {"Type": "OSS", "Media": "https://<your-bucket>.<public-endpoint>/<video-with-default-audio.mp4>"}
    },
    {
      "Name": "EnglishAudio",
      "InputFile": {"Type": "OSS", "Media": "https://<your-bucket>.<public-endpoint>/<english-audio.mp4>"}
    },
    {
      "Name": "JapaneseFile",
      "InputFile": {"Type": "OSS", "Media": "https://<your-bucket>.<public-endpoint>/<multilingual-file.mp4>"},
      "AudioSelector": [{
        "Name": "JapaneseAudio",
        "Rule": "tag",
        "TagConfig": {"language": "jpn"}
      }]
    }
  ],
  "OutputGroups": [
    {
      "GroupConfig": {
        "Type": "File",
        "OutputFileBase": {
          "Type": "OSS",
          "Media": "https://<your-bucket>.<public-endpoint>/<output-path>/"
        }
      },
      "Outputs": [
        {
          "Name": "360P",
          "OutputFileName": "video/360p/360p",
          "TemplateId": "Video-360P",
          "OverrideParams": {
            "Audios": [
              {
                "InputRef": "video",
                "LanguageControl": "InputFirst"
              }, {
                "InputRef": "EnglishAudio",
                "LanguageControl": "Configured",
                "Language": "eng"
              }, {
                "InputRef": "JapaneseAudio",
                "LanguageControl": "InputFirst"
              }
            ]
          }
        }
      ]
    }
  ]
}
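In the examples, an InputRef value names either an entry in Inputs (such as video) or an entry in an input's AudioSelector (such as JapaneseAudio). A pre-submission check along those lines might look like the sketch below. The resolution rule is inferred from the examples rather than from a stated API contract, and the function name `unresolved_input_refs` is hypothetical.

```python
def unresolved_input_refs(job: dict) -> list:
    """Return InputRef values that match neither an input Name nor an
    AudioSelector Name. An empty list means every reference resolves."""
    known = set()
    for inp in job.get("Inputs", []):
        known.add(inp["Name"])
        for selector in inp.get("AudioSelector", []):
            known.add(selector["Name"])

    missing = []
    for group in job.get("OutputGroups", []):
        for output in group.get("Outputs", []):
            audios = output.get("OverrideParams", {}).get("Audios", [])
            for track in audios:
                ref = track.get("InputRef")
                if ref not in known:
                    missing.append(ref)
    return missing


# Trimmed version of the Scenario 3 request (file locations omitted).
JOB = {
    "Inputs": [
        {"Name": "video"},
        {"Name": "EnglishAudio"},
        {"Name": "JapaneseFile",
         "AudioSelector": [{"Name": "JapaneseAudio", "Rule": "tag",
                            "TagConfig": {"language": "jpn"}}]},
    ],
    "OutputGroups": [{
        "Outputs": [{
            "OverrideParams": {"Audios": [
                {"InputRef": "video", "LanguageControl": "InputFirst"},
                {"InputRef": "EnglishAudio", "LanguageControl": "Configured",
                 "Language": "eng"},
                {"InputRef": "JapaneseAudio", "LanguageControl": "InputFirst"},
            ]}
        }]
    }],
}

if __name__ == "__main__":
    print(unresolved_input_refs(JOB))
```

A misspelled reference, for example `JapaneseAduio`, would appear in the returned list, which makes typos visible before the job is submitted.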
Query the transcoding job
Call the GetMediaConvertJob API operation to retrieve the details of a transcoding job.
Callback event
Event type: MediaConvertComplete
This event is not configurable in the console. Configure it by calling the SetEventCallback API operation.
Key callback parameters
| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| Name | String | Yes | The name of the parent job. |
| JobId | String | Yes | The job ID. |
| Status | String | Yes | The job status. A value of |
| TriggerSource | String | No | The trigger source. |
| FinishTime | String | No | The completion time in UTC format: |
| UserData | String | No | Custom data specified when submitting the job, passed through and returned in the callback. |
Example
{
  "FinishTime": "2025-05-09T08:03:21Z",
  "JobId": "5d37357cb3a44d10ba33c52760c896cd",
  "Status": "Success",
  "TriggerSource": "IceWorkflow",
  "UserData": "{\"ImsSrc\":\"Workflow\",\"TaskId\":\"e89a955d88ca47f0b9b79c562e5c622f\"}"
}
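In the example, UserData is itself a JSON document encoded as a string, so a callback receiver needs a second decoding step to read it. The sketch below shows one way to handle that, assuming the callback body arrives as JSON text shaped like the example; the function name `parse_callback` is hypothetical. Because UserData is custom data, it may not always be JSON, so the code falls back to the raw string.

```python
import json


def parse_callback(body: str) -> dict:
    """Decode a callback body and, where possible, its nested UserData string."""
    event = json.loads(body)
    user_data = event.get("UserData")
    if isinstance(user_data, str):
        try:
            event["UserData"] = json.loads(user_data)
        except json.JSONDecodeError:
            pass  # Not JSON: keep the custom data as an opaque string.
    return event


# Callback body equivalent to the example above.
EXAMPLE_BODY = json.dumps({
    "FinishTime": "2025-05-09T08:03:21Z",
    "JobId": "5d37357cb3a44d10ba33c52760c896cd",
    "Status": "Success",
    "TriggerSource": "IceWorkflow",
    "UserData": json.dumps({
        "ImsSrc": "Workflow",
        "TaskId": "e89a955d88ca47f0b9b79c562e5c622f",
    }),
})

if __name__ == "__main__":
    event = parse_callback(EXAMPLE_BODY)
    print(event["JobId"], event["Status"], event["UserData"])
```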