Video concatenation combines multiple videos into a single file and transcodes it to the format you specify.
Feature introduction
Video concatenation merges multiple video clips into a single video and converts it to the format you specify.

Scenarios
Film and television production: In the production of movies, TV series, and short films, video concatenation is a core step. It helps editors integrate different shots and scenes to build a complete narrative structure.
Content creation: On short-video and social media platforms, content creators often use video concatenation to create vlogs, tutorials, or themed videos. This improves the appeal and visibility of their content.
Education and training: Teachers and trainers can concatenate different video clips to create instructional videos. This combines theory with practice to improve student understanding.
Sports event replays: In sports broadcasting, video concatenation is used to create highlight reels. This helps viewers review exciting moments from a game.
Usage
Prerequisites
Intelligent Media Management (IMM) must be activated. For more information, see Activate Product.
An IMM project must be bound. To bind the project in the Object Storage Service (OSS) console, see Step 1: Bind an IMM project. To bind the project by calling an API, see Bind an object storage bucket.
Video concatenation
You can perform video concatenation only through asynchronous processing using the Java, Python, or Go SDKs.
Java
Requires Java SDK version 3.17.4 or later.
import com.aliyun.oss.ClientBuilderConfiguration;
import com.aliyun.oss.OSS;
import com.aliyun.oss.OSSClientBuilder;
import com.aliyun.oss.common.auth.CredentialsProviderFactory;
import com.aliyun.oss.common.auth.EnvironmentVariableCredentialsProvider;
import com.aliyun.oss.common.comm.SignVersion;
import com.aliyun.oss.model.AsyncProcessObjectRequest;
import com.aliyun.oss.model.AsyncProcessObjectResult;
import com.aliyuncs.exceptions.ClientException;
import java.util.Base64;

public class Demo {
    public static void main(String[] args) throws ClientException {
        // The endpoint for your bucket's region.
        String endpoint = "https://oss-cn-hangzhou.aliyuncs.com";
        // Specify the region ID, for example, cn-hangzhou.
        String region = "cn-hangzhou";
        // Obtain credentials from environment variables. Before running this sample,
        // ensure that the OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are set.
        EnvironmentVariableCredentialsProvider credentialsProvider = CredentialsProviderFactory.newEnvironmentVariableCredentialsProvider();
        // Specify the bucket name.
        String bucketName = "examplebucket";
        // Specify the name of the concatenated video file.
        String targetObject = "dest.mp4";
        // Specify the name of the source video file.
        String sourceVideo = "src.mp4";
        // Specify the names of the video files to concatenate.
        String video1 = "concat1.mp4";
        String video2 = "concat2.mp4";

        // Create an OSSClient instance.
        // Call the shutdown method to release resources when the OSSClient instance is no longer in use.
        ClientBuilderConfiguration clientBuilderConfiguration = new ClientBuilderConfiguration();
        clientBuilderConfiguration.setSignatureVersion(SignVersion.V4);
        OSS ossClient = OSSClientBuilder.create()
                .endpoint(endpoint)
                .credentialsProvider(credentialsProvider)
                .clientConfiguration(clientBuilderConfiguration)
                .region(region)
                .build();

        try {
            // Encode the video file names with URL-safe Base64 (no padding).
            String video1Encoded = Base64.getUrlEncoder().withoutPadding().encodeToString(video1.getBytes());
            String video2Encoded = Base64.getUrlEncoder().withoutPadding().encodeToString(video2.getBytes());
            // Build the video processing style string and video concatenation parameters.
            String style = String.format("video/concat,ss_0,f_mp4,vcodec_h264,fps_25,vb_1000000,acodec_aac,ab_96000,ar_48000,ac_2,align_1/pre,o_%s/sur,o_%s,t_0", video1Encoded, video2Encoded);
            // Build the asynchronous processing instruction.
            String bucketEncoded = Base64.getUrlEncoder().withoutPadding().encodeToString(bucketName.getBytes());
            String targetEncoded = Base64.getUrlEncoder().withoutPadding().encodeToString(targetObject.getBytes());
            String process = String.format("%s|sys/saveas,b_%s,o_%s/notify,topic_QXVkaW9Db252ZXJ0", style, bucketEncoded, targetEncoded);
            // Create an AsyncProcessObjectRequest object.
            AsyncProcessObjectRequest request = new AsyncProcessObjectRequest(bucketName, sourceVideo, process);
            // Execute the asynchronous processing task.
            AsyncProcessObjectResult response = ossClient.asyncProcessObject(request);
            System.out.println("EventId: " + response.getEventId());
            System.out.println("RequestId: " + response.getRequestId());
            System.out.println("TaskId: " + response.getTaskId());
        } finally {
            // Shut down the OSSClient.
            ossClient.shutdown();
        }
    }
}
Python
Requires Python SDK version 2.18.4 or later.
# -*- coding: utf-8 -*-
import base64
import oss2
from oss2.credentials import EnvironmentVariableCredentialsProvider


def main():
    # Obtain credentials from environment variables. Before running this sample, ensure that the
    # OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are set.
    auth = oss2.ProviderAuthV4(EnvironmentVariableCredentialsProvider())
    # The endpoint for your bucket's region. For example, for the China (Hangzhou) region,
    # set the endpoint to https://oss-cn-hangzhou.aliyuncs.com.
    endpoint = 'https://oss-cn-hangzhou.aliyuncs.com'
    # Specify the region ID, for example, cn-hangzhou.
    region = 'cn-hangzhou'
    # Specify the bucket name, for example, examplebucket.
    bucket_name = 'examplebucket'
    bucket = oss2.Bucket(auth, endpoint, bucket_name, region=region)
    # Specify the name of the concatenated video file.
    target_object = 'out.mp4'
    # Specify the name of the source video file.
    source_video = 'emrfinal.mp4'
    # Specify the names of the video files to concatenate.
    video1 = 'osshdfs.mp4'
    video2 = 'product.mp4'

    # Build the video processing style string and video concatenation parameters.
    video1_encoded = base64.urlsafe_b64encode(video1.encode()).decode().rstrip('=')
    video2_encoded = base64.urlsafe_b64encode(video2.encode()).decode().rstrip('=')
    style = f"video/concat,ss_0,f_mp4,vcodec_h264,fps_25,vb_1000000,acodec_aac,ab_96000,ar_48000,ac_2,align_1/pre,o_{video1_encoded}/sur,o_{video2_encoded},t_0"

    # Build the asynchronous processing instruction.
    bucket_encoded = base64.urlsafe_b64encode(bucket_name.encode()).decode().rstrip('=')
    target_encoded = base64.urlsafe_b64encode(target_object.encode()).decode().rstrip('=')
    process = f"{style}|sys/saveas,b_{bucket_encoded},o_{target_encoded}/notify,topic_QXVkaW9Db252ZXJ0"
    print(process)

    # Execute the asynchronous processing task.
    try:
        result = bucket.async_process_object(source_video, process)
        print(f"EventId: {result.event_id}")
        print(f"RequestId: {result.request_id}")
        print(f"TaskId: {result.task_id}")
    except Exception as e:
        print(f"Error: {e}")


if __name__ == "__main__":
    main()
Go
Requires Go SDK version 3.0.2 or later.
package main

import (
	"encoding/base64"
	"fmt"
	"os"
	"strings"

	"github.com/aliyun/aliyun-oss-go-sdk/oss"
)

func main() {
	// Obtain credentials from environment variables. Before running this sample, ensure that the
	// OSS_ACCESS_KEY_ID and OSS_ACCESS_KEY_SECRET environment variables are set.
	provider, err := oss.NewEnvironmentVariableCredentialsProvider()
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}
	// Create an OSSClient instance.
	// Set yourEndpoint to the endpoint for your bucket's region. For example, for the China (Hangzhou) region,
	// use https://oss-cn-hangzhou.aliyuncs.com.
	// Set yourRegion to the corresponding region ID, for example, cn-hangzhou.
	client, err := oss.New("yourEndpoint", "", "", oss.SetCredentialsProvider(&provider), oss.AuthVersion(oss.AuthV4), oss.Region("yourRegion"))
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}
	// Specify the bucket name, for example, examplebucket.
	bucketName := "examplebucket"
	bucket, err := client.Bucket(bucketName)
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}
	// Specify the name of the concatenated video file.
	targetObject := "dest.mp4"
	// Specify the name of the source video file.
	sourceVideo := "src.mp4"
	// Specify the names of the video files to concatenate.
	video1 := "concat1.mp4"
	video2 := "concat2.mp4"
	// Build the video processing style string and video concatenation parameters.
	// Object names are URL-safe Base64-encoded with the trailing "=" padding removed.
	style := fmt.Sprintf("video/concat,ss_0,f_mp4,vcodec_h264,fps_25,vb_1000000,acodec_aac,ab_96000,ar_48000,ac_2,align_1/pre,o_%s/sur,o_%s,t_0", strings.TrimRight(base64.URLEncoding.EncodeToString([]byte(video1)), "="), strings.TrimRight(base64.URLEncoding.EncodeToString([]byte(video2)), "="))
	// Build the asynchronous processing instruction.
	process := fmt.Sprintf("%s|sys/saveas,b_%v,o_%v/notify,topic_QXVkaW9Db252ZXJ0", style, strings.TrimRight(base64.URLEncoding.EncodeToString([]byte(bucketName)), "="), strings.TrimRight(base64.URLEncoding.EncodeToString([]byte(targetObject)), "="))
	fmt.Printf("%#v\n", process)
	// Execute the asynchronous processing task.
	rs, err := bucket.AsyncProcessObject(sourceVideo, process)
	if err != nil {
		fmt.Println("Error:", err)
		os.Exit(-1)
	}
	fmt.Printf("EventId:%s\n", rs.EventId)
	fmt.Printf("RequestId:%s\n", rs.RequestId)
	fmt.Printf("TaskId:%s\n", rs.TaskId)
}
The response to an asynchronous processing request does not include the task result. To obtain the result, use Simple Message Queue (SMQ) (formerly MNS). For more information, see Message Notification.
Parameters
Operation: video/concat
The following tables describe the parameters.
Concatenation parameters
The sequence of pre and sur in the request string determines the concatenation order for video/concat:
/pre: The video file to prepend.
/sur: The video file to append.
Parameter | Type | Required | Description |
ss | int | No | The start time of the segment to concatenate from the prepended or appended video, in milliseconds. Valid values:
|
t | int | No | The duration of the segment to concatenate from the prepended or appended video, in milliseconds. Valid values:
|
o | string | Yes | An OSS object in the current bucket. The object name must be URL-safe Base64-encoded. |
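As a rough illustration of how the /pre and /sur segments combine with the o, ss, and t parameters, the following Python sketch builds the segment portion of a process string. The helper names `encode_name` and `segment` are hypothetical and not part of any SDK. Stripping the `=` padding mirrors what the SDK samples on this page do.

```python
import base64
from typing import Optional


def encode_name(name: str) -> str:
    """URL-safe Base64 of an object name, with '=' padding removed."""
    raw = base64.urlsafe_b64encode(name.encode("utf-8"))
    return raw.decode("ascii").rstrip("=")


def segment(kind: str, obj: str,
            ss: Optional[int] = None, t: Optional[int] = None) -> str:
    """Build one /pre or /sur segment; ss and t are in milliseconds."""
    if kind not in ("pre", "sur"):
        raise ValueError("kind must be 'pre' or 'sur'")
    parts = [kind, f"o_{encode_name(obj)}"]
    if ss is not None:
        parts.append(f"ss_{ss}")
    if t is not None:
        parts.append(f"t_{t}")
    return "/" + ",".join(parts)


# Segments are appended in the order they should appear in the request string.
segments = segment("pre", "concat1.mp4") + segment("sur", "concat2.mp4", t=0)
print(segments)
```

The resulting string is meant to be appended directly after the transcoding parameters of video/concat, as in the SDK samples.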
Transcoding parameters
Parameter | Type | Required | Description |
ss | int | No | The transcoding start time for the concatenated video, in milliseconds. Valid values:
|
t | int | No | The transcoding duration for the concatenated video, in milliseconds. Valid values:
|
f | string | Yes | The video container. Valid values:
|
vn | int | No | Specifies whether to disable the video stream. Valid values:
|
vcodec | string | Yes | The video codec. Valid values:
Note The mxf and flv containers do not support h265. |
fps | float | No | The frame rate. Defaults to the frame rate of the main video file specified by align. The frame rate must be between 0 and 240. |
fpsopt | int | No | The frame rate option. Valid values:
Note This parameter must be set together with fps. |
pixfmt | string | No | The pixel format. Defaults to the pixel format of the main video file specified by align. Valid values:
|
s | string | No | The resolution.
|
sopt | int | No | The resolution option. Valid values:
Note This parameter must be set together with s. |
scaletype | string | No | The scaling method. Valid values:
|
arotate | int | No | Specifies whether to adjust the resolution based on the video's orientation. Valid values:
|
g | int | No | The keyframe interval. Default: 150. Valid values: 1 to 100,000. |
vb | int | No | The video bitrate, in bits per second (bps). Valid values: 10,000 to 100,000,000. Note This parameter and crf are mutually exclusive. If neither is set, the system uses a default bitrate based on the output resolution. |
vbopt | int | No | The video bitrate option. Valid values:
Note This parameter must be set together with vb. |
crf | float | No | The rate control factor. Valid values: 0 to 51. A higher value results in lower quality. A value of 18 to 38 is recommended. |
maxrate | int | No | The peak bitrate, in bits per second (bps). Default: 0. Valid values: 10,000 to 100,000,000. Note This parameter must be set together with crf. |
bufsize | int | No | The buffer size, in bits. Default: 0. Valid values: 10,000 to 200,000,000. Note This parameter must be set together with crf. |
an | int | No | Specifies whether to disable the audio stream. Valid values:
|
acodec | string | Yes | The audio codec. Valid values:
Note Codec support varies by container. For example: mxf supports only pcm; mp4 does not support pcm; mov does not support flac or opus; asf and avi do not support opus; ts does not support flac, vorbis, amr, or pcm; and flv does not support flac, vorbis, amr, opus, or pcm. |
ar | int | No | The audio sample rate. Defaults to the sample rate of the main video file specified by align. Valid values:
Note Supported sample rates vary by codec. mp3 supports rates up to 48 kHz. opus supports 8 kHz, 12 kHz, 16 kHz, 24 kHz, and 48 kHz. ac3 supports 32 kHz, 44.1 kHz, and 48 kHz. amr supports only 8 kHz and 16 kHz. |
ac | int | No | The number of audio channels. Defaults to the number of audio channels in the main video file specified by align. Valid values: 1 to 8. Note The supported number of audio channels varies by codec. mp3 supports only mono and stereo channels. ac3 supports up to 6 channels (5.1). amr supports only mono channels. |
aq | int | No | The audio quality. Valid values: 0 to 100. Note This parameter is mutually exclusive with ab. If neither is set, the encoder's default bitrate is used. |
ab | int | No | The audio bitrate, in bits per second (bps). Valid values: 1,000 to 10,000,000. |
abopt | string | No | The audio bitrate option. Valid values:
Note This parameter must be set together with ab. |
align | int | No | The index of the main video file in the concatenation list. Its parameters are used as the default for transcoding. Default: 0 (the first video file in the list). |
adepth | int | No | The audio bit depth. Valid values: 16 or 24. Note This parameter applies only if the audio codec ( |
Video concatenation also uses the sys/saveas and notify parameters. For more information, see save as and notification.
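Several transcoding parameters are range-limited, mutually exclusive, or valid only together with another parameter. A minimal client-side sketch of those rules in Python might look like the following. The function `validate_transcode_params` is hypothetical, it covers only the constraints stated in the table above, and it assumes that range endpoints are inclusive.

```python
from typing import Dict, List


def validate_transcode_params(p: Dict[str, object]) -> List[str]:
    """Return a list of problems found in a dict of video/concat parameters."""
    errors: List[str] = []

    def in_range(key: str, lo: float, hi: float) -> None:
        # Assumption: both endpoints of each documented range are allowed.
        if key in p and not (lo <= p[key] <= hi):
            errors.append(f"{key} must be between {lo} and {hi}")

    for key in ("f", "vcodec", "acodec"):
        if key not in p:
            errors.append(f"{key} is required")

    in_range("fps", 0, 240)
    in_range("g", 1, 100_000)
    in_range("vb", 10_000, 100_000_000)
    in_range("crf", 0, 51)
    in_range("maxrate", 10_000, 100_000_000)
    in_range("bufsize", 10_000, 200_000_000)
    in_range("ac", 1, 8)
    in_range("aq", 0, 100)
    in_range("ab", 1_000, 10_000_000)

    for a, b in (("vb", "crf"), ("aq", "ab")):
        if a in p and b in p:
            errors.append(f"{a} and {b} are mutually exclusive")

    depends_on = (("fpsopt", "fps"), ("sopt", "s"), ("vbopt", "vb"),
                  ("abopt", "ab"), ("maxrate", "crf"), ("bufsize", "crf"))
    for dependent, required in depends_on:
        if dependent in p and required not in p:
            errors.append(f"{dependent} must be set together with {required}")

    if "adepth" in p and p["adepth"] not in (16, 24):
        errors.append("adepth must be 16 or 24")
    if p.get("vcodec") == "h265" and p.get("f") in ("mxf", "flv"):
        errors.append("the mxf and flv containers do not support h265")
    return errors


good = {"f": "mp4", "vcodec": "h264", "fps": 25, "vb": 1_000_000,
        "acodec": "aac", "ab": 96_000, "ar": 48_000, "ac": 2, "align": 1}
bad = {"f": "flv", "vcodec": "h265", "acodec": "aac",
       "vb": 1_000_000, "crf": 23, "fpsopt": 1}
print(validate_transcode_params(good))
print(validate_transcode_params(bad))
```

Checking parameters before submitting can save a round trip, because the result of an asynchronous task only arrives later through the notification.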
Media segmentation parameters
/segment: Segmentation parameters.
Parameter | Type | Required | Description |
f | string | Yes | The segment format. Valid values:
|
t | int | Yes | The segment duration, in milliseconds. Valid values: 0 to 3,600,000. |
Media segmentation supports only the mp4 and ts containers.
API reference
Concatenate videos into MP4
Concatenation details
Source files: pre.mov, example.mkv, and sur1.aac
Concatenation order and duration:
File name | Order | Duration |
pre.mov | 1 | Full video |
example.mkv | 2 | From 10 seconds to the end |
sur1.aac | 3 | The first 10 seconds |
Completion notification: Via MNS message.
Output specifications
Video codec: h264
Video frame rate: 25 fps
Video bitrate: 1 Mbps
Audio codec: aac
Audio: 48 kHz sampling rate, stereo
Audio bitrate: 96 Kbps
Output file
oss://outbucket/outobj.mp4
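To show how the details above map onto a process string, here is a minimal Python sketch that assembles one from the listed file names, output specifications, and destination. The helper `b64` is hypothetical. The topic name AudioConvert is the value that the Base64 topic string in the SDK samples decodes to, and the ss_10000 and align_1 values are taken from the example request for this scenario.

```python
import base64


def b64(text: str) -> str:
    """URL-safe Base64 without '=' padding, matching the SDK samples."""
    return base64.urlsafe_b64encode(text.encode("utf-8")).decode("ascii").rstrip("=")


# Output specifications: h264 at 25 fps and 1 Mbps; aac at 96 Kbps, 48 kHz, stereo.
transcode = ("video/concat,ss_10000,f_mp4,vcodec_h264,fps_25,vb_1000000,"
             "acodec_aac,ab_96000,ar_48000,ac_2,align_1")

# pre.mov is prepended in full; only the first 10 seconds of sur1.aac are appended.
pre = f"/pre,o_{b64('pre.mov')}"
sur = f"/sur,o_{b64('sur1.aac')},t_10000"

# Save the result to oss://outbucket/outobj.mp4 and publish a completion notification.
save = f"sys/saveas,b_{b64('outbucket')},o_{b64('outobj.mp4')}"
notify = f"notify,topic_{b64('AudioConvert')}"

process = f"{transcode}{pre}{sur}|{save}/{notify}"
print("x-oss-async-process=" + process)
```

The printed line has the shape of the request body sent with POST /example.mkv?x-oss-async-process, where example.mkv is the source object.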
Example request
// Concatenate pre.mov, example.mkv, and sur1.aac.
POST /example.mkv?x-oss-async-process HTTP/1.1
Host: video-demo.oss-cn-hangzhou.aliyuncs.com
Date: Fri, 28 Oct 2022 06:40:10 GMT
Authorization: OSS4-HMAC-SHA256 Credential=LTAI********************/20250417/cn-hangzhou/oss/aliyun_v4_request,Signature=a7c3554c729d71929e0b84489addee6b2e8d5cb48595adfc51868c299c0c218e
x-oss-async-process=video/concat,ss_10000,f_mp4,vcodec_h264,fps_25,vb_1000000,acodec_aac,ab_96000,ar_48000,ac_2,align_1/pre,o_cHJlLm1vdgo/sur,o_c3VyMS5hYWMK,t_10000|sys/saveas,b_b3V0YnVja2V0,o_b3V0b2JqLnthdXRvZXh0fQo/notify,topic_QXVkaW9Db252ZXJ0
Permissions
An Alibaba Cloud account has all permissions by default. In contrast, RAM users and RAM roles have no permissions by default. An Alibaba Cloud account or an administrator must grant them access using a RAM policy or a bucket policy.
API | Action | Description |
GetObject | | Downloads an object. |
| | When downloading an object, if you specify the object version through versionId, this permission is required. |
| | When downloading an object, if the object metadata contains X-Oss-Server-Side-Encryption: KMS, this permission is required. |
HeadObject | | Queries the metadata of an object. |
PutObject | | Uploads an object. |
| | Required if you specify object tags by using the |
| | Required if the |
CreateMediaConvertTask | | Permission to use IMM for media transcoding. |
Billing
The video concatenation process uses both Object Storage Service (OSS) and Intelligent Media Management (IMM), generating billable items from each service as follows:
OSS: The process calls the GetObject operation with the x-oss-async-process parameter for video concatenation, the HeadObject operation to retrieve object metadata, and the PutObject operation to upload the resulting video to a bucket. These operations generate the billable items listed below. For detailed pricing, see OSS Pricing:
API | Billable item | Description |
GetObject | GET requests | You are charged request fees based on the number of successful requests. |
| Outbound traffic over the Internet | If you call the GetObject operation by using a public endpoint, such as oss-cn-hangzhou.aliyuncs.com, or an acceleration endpoint, such as oss-accelerate.aliyuncs.com, you are charged fees for outbound traffic over the Internet based on the data size. |
| Retrieval of IA objects | If IA objects are retrieved, you are charged IA data retrieval fees based on the size of the retrieved IA data. |
| Retrieval of Archive objects in a bucket for which real-time access is enabled | If you retrieve Archive objects in a bucket for which real-time access is enabled, you are charged Archive data retrieval fees based on the size of retrieved Archive objects. |
| Transfer acceleration fees | If you enable transfer acceleration and use an acceleration endpoint to access your bucket, you are charged transfer acceleration fees based on the data size. |
PutObject | PUT requests | You are charged request fees based on the number of successful requests. |
| Storage fees | You are charged storage fees based on the storage class, size, and storage duration of the object. |
HeadObject | GET requests | You are charged request fees based on the number of successful requests. |
IMM: Generates the following billable items. For detailed pricing, see IMM billable items:
API | Billable item | Description |
CreateMediaConvertTask | ApsaraVideo Media Processing | Fees are calculated based on the resolution and duration (in seconds) of the concatenated output video. |
Considerations
Video concatenation supports only asynchronous processing (x-oss-async-process).
Anonymous access is denied.
If you rely on the default sample rate or number of audio channels, concatenation can fail because the defaults may be incompatible with the target container.
A single request can concatenate up to 11 videos.
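The audio-related considerations can be checked before a request is submitted. The Python sketch below is one way to do that. The function `preflight` is hypothetical, the codec limits are transcribed from the notes on the ar and ac parameters, sample rates are in Hz as in the SDK samples (ar_48000), and it is an assumption that the 11-video limit counts the source object together with the /pre and /sur files.

```python
from typing import List, Optional

# Limits transcribed from the ar and ac parameter notes (sample rates in Hz).
ALLOWED_SAMPLE_RATES = {
    "opus": {8000, 12000, 16000, 24000, 48000},
    "ac3": {32000, 44100, 48000},
    "amr": {8000, 16000},
}
MAX_SAMPLE_RATE = {"mp3": 48000}
MAX_CHANNELS = {"mp3": 2, "ac3": 6, "amr": 1}
MAX_VIDEOS = 11  # Assumption: the source object counts toward this limit.


def preflight(acodec: str, ar: Optional[int], ac: Optional[int],
              video_count: int) -> List[str]:
    """Return warnings for settings that the considerations flag as risky."""
    problems: List[str] = []
    if ar is None or ac is None:
        problems.append("ar or ac is unset; defaults may not suit the container")
    if ar is not None:
        allowed = ALLOWED_SAMPLE_RATES.get(acodec)
        if allowed is not None and ar not in allowed:
            problems.append(f"{acodec} does not support a sample rate of {ar} Hz")
        limit = MAX_SAMPLE_RATE.get(acodec)
        if limit is not None and ar > limit:
            problems.append(f"{acodec} supports sample rates up to {limit} Hz")
    if ac is not None and ac > MAX_CHANNELS.get(acodec, 8):
        problems.append(f"{acodec} supports at most {MAX_CHANNELS.get(acodec, 8)} channel(s)")
    if video_count > MAX_VIDEOS:
        problems.append(f"a request supports up to {MAX_VIDEOS} videos")
    return problems


print(preflight("aac", 48000, 2, video_count=3))
print(preflight("amr", 44100, 2, video_count=12))
```

Codecs without a documented limit, such as aac here, fall through to the general range of 1 to 8 channels and are not checked for sample rate.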