Common scenarios


Production parameters, advanced configurations, and SDK examples for Image-Text Matching in common scenarios.

Important
  • Both Script-to-Video and Image-Text Matching use the SubmitBatchMediaProducingJob API to submit a task. To differentiate between them based on parameters, see Parameter differences.

  • All media assets referenced by OSS URL in this API must be stored in the same region as the OpenAPI service endpoint that you call.

  • Supported regions: China (Shanghai), China (Beijing), China (Hangzhou), China (Shenzhen), US (Silicon Valley), and Singapore.

  • Replace all placeholders in the examples ([your-bucket], [your-region-id], [your-file-name], [your-file-path], and media asset IDs) with your actual values.

Note
  • Before you read this document, read the Batch video production guide to familiarize yourself with the concepts and workflow of Image-Text Matching in common scenarios.

  • Image-Text Matching provides two video generation modes:

    • Global Scripts

    • Storyboard Script

API reference

  • To submit a batch video production job that intelligently mixes multiple video, audio, and image assets, see SubmitBatchMediaProducingJob. Key API parameters are detailed in the InputConfig, EditingConfig, and OutputConfig sections below.

  • To get detailed information about a batch video creation job, see GetBatchMediaProducingJob.

InputConfig

InputConfig specifies parameters for video clips, voiceovers, background music, and stickers.

Parameter

Type

Description

Example

Required

Supported modes

MediaArray

List<String>

  • Specify and upload source assets. You can provide a list of media asset IDs or OSS URLs. The total video duration cannot exceed two hours.

  • Supported formats: Video formats.

["****b4549d46c88681030f6e****","****549d46c88b4681030f6e****"]

Either MediaArray or MediaSearchInput is required

Both

MediaSearchInput

MediaSearchInput

Intelligently searches for matching assets by specifying a search library and descriptive text.

{"LibSearchCondition":{"SearchLibs":["ims-default-search-lib","test-20"],"SearchText":"Alibaba Cloud assistant is learning how to livestream"}}

Both

TitleArray

List<String>

An array of titles. One title is randomly selected for each production.

Max 50 titles, each up to 50 characters long.

["Title 1","Title 2"]

No

Both

SubHeadingArray

List<SubHeading>

Multi-level subheading settings.

[{"Level":1,"TitleArray":["Level 1 subtitle 1","Level 1 subtitle 2"]},{"Level":3,"TitleArray":["Level 3 subtitle"]}]

No

Both

SpeechTextArray

List<String>

  • An array of voiceover scripts. One script is randomly selected for each production.

  • Max 50 scripts, each up to 1000 characters long.

  • Supports controlling speech synthesis using SSML.

    Important

    Currently, only <break>, <s>, <sub>, <w>, <phoneme>, and <say-as> are supported.

["Voiceover content 1","Voiceover content 2"]

No

Global Scripts

SceneInfo

SceneInfo

Scene configuration parameters.

Example: Global Scripts

Example: Storyboard Script

Yes

Storyboard Script

StickerArray

List<Sticker>

  • An array of stickers. One is randomly selected for each production. Max 50 stickers. Supports media asset IDs or OSS URLs.

  • Selection rule: If you provide 10 stickers and request 20 videos, the system will pick a random starting index (e.g., 3) and select stickers cyclically: 3, 4, 5, ..., 10, 1, 2, 3, ...

  • Supported formats: Image formats.

[{"MediaId":"****9d46c8b4548681030f6e****","X":10,"Y":100,"Width":300,"Height":300,"Opacity":0.6}]

No

Both

BackgroundMusicArray

List<String>

  • An array of background music tracks. One is randomly selected for each production. Max 50 tracks. Supports media asset IDs or OSS URLs.

  • Selection rule: Works the same as StickerArray.

  • Supported formats: Audio formats.

["****b4549d46c88681030f6e****","****549d46c88b4681030f6e****"]

No

Both

BackgroundImageArray

List<String>

  • An array of background images. One is randomly selected for each production. Max 50 images. Supports media asset IDs or OSS URLs.

  • Selection rule: Works the same as StickerArray.

  • Supported formats: Image formats.

["****b4549d46c88681030f6e****","****549d46c88b4681030f6e****"]

No

Both
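The cyclic selection rule described for StickerArray, BackgroundMusicArray, and BackgroundImageArray can be sketched in Java as follows. This is only an illustration of the documented behavior, not the service's implementation. The class name CyclicSelection and the select method are hypothetical, and the starting index is passed in explicitly (the service picks it at random) so that the result is reproducible.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch of the documented cyclic selection rule; not the service's code. */
final class CyclicSelection {

    /**
     * Returns the 1-based position of the item used for each output video,
     * starting at startIndex and wrapping around after itemCount.
     */
    static List<Integer> select(int itemCount, int videoCount, int startIndex) {
        if (itemCount < 1 || startIndex < 1 || startIndex > itemCount) {
            throw new IllegalArgumentException("startIndex must be within 1..itemCount");
        }
        List<Integer> picks = new ArrayList<>();
        for (int i = 0; i < videoCount; i++) {
            // Shift to 0-based, advance, wrap, then shift back to 1-based.
            picks.add((startIndex - 1 + i) % itemCount + 1);
        }
        return picks;
    }

    public static void main(String[] args) {
        // 10 stickers, 20 videos, starting index 3 (the example from the rule above).
        System.out.println(select(10, 20, 3));
        // prints [3, 4, 5, 6, 7, 8, 9, 10, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 1, 2]
    }
}
```

With 10 items and 20 videos, every item is used exactly twice regardless of the starting index, which is why the cyclic rule spreads assets evenly across a batch.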

MediaSearchInput

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| LibSearchCondition | LibSearchCondition | Configuration for search library conditions. | Yes |

LibSearchCondition

| Parameter | Type | Description | Example | Required |
| --- | --- | --- | --- | --- |
| SearchLibs | List<String> | A list of search libraries. | ["ims-default-search-lib"] | Yes |
| SearchText | String | Descriptive text for matching assets. Max 20 characters. | Ocean, coral reef, seals, dolphins, marine environment | Yes |

SceneInfo

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| Scene | String | The matching scene type. For common scenarios, set this to General. | Yes |
| ShotInfo | ShotInfo | Configuration for the storyboard. Applies only to Storyboard Script mode. | No |

ShotInfo

Note

This parameter applies only to Storyboard Script mode.

| Parameter | Type | Description | Required |
| --- | --- | --- | --- |
| ShotScripts | List<ShotScript> | An array of storyboard scripts. | Yes |

ShotScript

Note

This parameter applies only to Storyboard Script mode.

Parameter

Type

Description

Example

Required

ScriptText

String

The script text for a single scene, used to describe the scene's content for visual matching.

He is recently developing a new magic potion.

No

SpeechText

String

  • The voiceover script for a single scene, up to 100 characters long.

  • Supports controlling speech synthesis using SSML.

    Important

    Only <break>, <s>, <sub>, <w>, <phoneme>, and <say-as> are supported.

The old magician Danny is fiddling with strange instruments; he is recently developing a new magic potion.

No

Duration

Float

  • The duration of the scene in seconds. Must be ≥ 1 second.

  • This is only effective if no voiceover is provided. If a voiceover exists, the scene duration is automatically determined by the voiceover length.

5

No

Volume

Float

  • The volume of the input video for this scene. If set, it overrides the global volume set in EditingConfig.MediaConfig.Volume for this scene.

  • Value range: [0, 10.0]. Supports two decimal places.

0.5

No
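The Duration and Volume constraints in the ShotScript table can be checked on the client before a job is submitted. The following Java sketch is one possible way to do that. It is not part of the IMS SDK, and the class name ShotScriptCheck and its validate method are hypothetical. The SpeechText length limit is left out on purpose because this document does not say how SSML markup is counted.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative pre-submission check for one ShotScript entry; not part of the SDK. */
final class ShotScriptCheck {

    /**
     * Returns a list of findings. An empty list means the entry passes these checks.
     * Pass null for any field that is not set.
     */
    static List<String> validate(String scriptText, String speechText, Double duration, Double volume) {
        List<String> findings = new ArrayList<>();
        boolean hasVoiceover = speechText != null && !speechText.isEmpty();

        if (duration != null) {
            if (duration < 1.0) {
                findings.add("Duration must be >= 1 second");
            }
            if (hasVoiceover) {
                // Documented behavior: the voiceover length determines the scene duration.
                findings.add("Duration is ignored because SpeechText is set");
            }
        }
        if (volume != null && (volume < 0.0 || volume > 10.0)) {
            findings.add("Volume must be within [0, 10.0]");
        }
        return findings;
    }

    public static void main(String[] args) {
        // A scene without voiceover and a custom 5-second duration.
        System.out.println(validate("Visual script for the scene", null, 5.0, 0.5));
        // A scene with voiceover: the custom duration has no effect.
        System.out.println(validate("Visual script for the scene", "Voiceover text", 5.0, null));
    }
}
```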

Example: Global Scripts mode

{
  // Choose either MediaArray or MediaSearchInput
  "MediaArray": [
    "****9d46c886b45481030f6e****",
    "****c886810b4549d4630f6e****",
    "http://[your-bucket].oss-[your-region-id].aliyuncs.com/test1.mp4",
    "http://[your-bucket].oss-[your-region-id].aliyuncs.com/test2.png"
  ],
  // Choose either MediaArray or MediaSearchInput
  "MediaSearchInput": {
        "LibSearchCondition": {
            "SearchLibs": [
                "ims-default-search-lib",
                "test-20"
            ],
            "SearchText": "Alibaba Cloud assistant is learning how to livestream"
      }
  },
  "TitleArray": [
    "Freshippo opens a new location in Huilongguan",
    "A new Freshippo store opens"
  ],
  "SubHeadingArray": [
    {
      "Level": 1,
      "TitleArray": ["Subtitle 1", "Subtitle 2"]
    },
    {
      "Level": 3,
      "TitleArray": ["Level 3 subtitle"]
    }
  ],
  "SpeechTextArray": [
    "A new Freshippo store just opened in the nearby mall. It's the grand opening today, so I rushed over to check it out. The store isn't huge, but it's packed with people. Snacks and drinks are pretty cheap, and the checkout lines are super long. Come and see for yourself!",
    "A new Freshippo store just opened in the nearby mall. It's the grand opening today, so I rushed over to check it out.",
    "<speak>Today, our hero, table tennis legend <phoneme alphabet=\"ipa\" ph=\"mɑː lʊŋ\">Ma Long</phoneme>, is striving for the pinnacle of glory.</speak>"
  ],
  "Sticker": {
    "MediaId": "****b681034549d46c880f6e****",
    "X": 10,
    "Y": 100,
    "Width": 300,
    "Height": 300,
    "Opacity": 0.6
  },
  "StickerArray": [
    {
      "MediaId": "****9d46c8b4548681030f6e****",
      "X": 10,
      "Y": 100,
      "Width": 300,
      "Height": 300,
      "Opacity": 0.6
    },
    {
      "MediaURL": "http://[your-bucket].oss-[your-region-id].aliyuncs.com/test3.png",
      "X": 10,
      "Y": 100,
      "Width": 300,
      "Height": 300
    }
  ],
  "BackgroundMusicArray": [
    "****b4549d46c88681030f6e****",
    "****549d46c88b4681030f6e****",
    "http://[your-bucket].oss-[your-region-id].aliyuncs.com/test4.mp3"
  ],
  "BackgroundImageArray": [
    "****6c886b4549d481030f6e****",
    "****9d46c8548b4681030f6e****",
    "http://[your-bucket].oss-[your-region-id].aliyuncs.com/test1.png"
  ]
}
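The size limits stated for TitleArray (up to 50 titles of up to 50 characters each) and SpeechTextArray (up to 50 scripts of up to 1000 characters each) can be verified before the request is built. The Java sketch below shows one way to do this. The class InputArrayLimits is hypothetical and not part of the SDK, and it assumes that one character corresponds to one Unicode code point, which this document does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative client-side check of the documented array limits; not part of the SDK. */
final class InputArrayLimits {

    /** Returns human-readable problems; an empty list means the array is within limits. */
    static List<String> check(String name, List<String> values, int maxItems, int maxChars) {
        List<String> problems = new ArrayList<>();
        if (values.size() > maxItems) {
            problems.add(name + " has " + values.size() + " items, limit is " + maxItems);
        }
        for (int i = 0; i < values.size(); i++) {
            String value = values.get(i);
            // Assumption: the service counts Unicode code points, not UTF-16 units or bytes.
            int length = value.codePointCount(0, value.length());
            if (length > maxChars) {
                problems.add(name + "[" + i + "] has " + length + " characters, limit is " + maxChars);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> titles = Arrays.asList(
                "Freshippo opens a new location in Huilongguan",
                "A new Freshippo store opens");
        System.out.println(check("TitleArray", titles, 50, 50));
    }
}
```

For SpeechTextArray, call the same method with the limits 50 and 1000.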

Example: Storyboard Script mode

{
  // Choose either MediaArray or MediaSearchInput
  "MediaArray": ["****9d46c886b45481030f6e****", "****c886810b4549d4630f6e****"],
  // Choose either MediaArray or MediaSearchInput
  "MediaSearchInput": {
        "LibSearchCondition": {
            "SearchLibs": [
                "ims-default-search-lib",
                "test-20"
            ],
            "SearchText": "Alibaba Cloud assistant is learning how to livestream"
      }
  },
  "SceneInfo": {
    "Scene": "General", // General matching 
    "ShotInfo": {
      "ShotScripts": [
        {
          "ScriptText": "This is the visual script for the first scene",
          "SpeechText": "This is the voiceover for the first scene. The scene's duration will match the voiceover length."
        },
        {
          "ScriptText": "This is the visual script for the second scene. With no voiceover, you can set a custom duration.",
          "Duration": 5.0, // Can be set when there's no voiceover script.
          "Volume": 1.0 // Set the volume of video materials.
        },
        {
          "ScriptText": "This is the visual script for the third scene.",
          "SpeechText": "<speak>Voiceover supports SSML. The battle is <phoneme alphabet=\"py\" ph=\"zheng4 hao3\">fierce</phoneme>. Today, our hero, table tennis legend Ma Long, is striving for the pinnacle of glory. <s>In the quarter-finals against the formidable Togami Shunsuke, Ma Long showed no fear, giving his all in every rally.</s> His precise shots and calm judgment gave him the upper hand. In the end, Ma Long successfully defeated his opponent to advance to the semi-finals.<break time=\"1000ms\"/></speak>"
        }
      ]
    }
  },
  "TitleArray": [
    "Freshippo opens a new location in Huilongguan",
    "A new Freshippo store opens"
  ],
  "SubHeadingArray": [
    {
      "Level": 1,
      "TitleArray": ["Subtitle 1", "Subtitle 2"]
    },
    {
      "Level": 3,
      "TitleArray": ["Level 3 subtitle"]
    }
  ],
  "StickerArray": [
    {
      "MediaId": "****9d46c8b4548681030f6e****",
      "X": 10,
      "Y": 100,
      "Width": 300,
      "Height": 300
    },
    {
      "MediaURL": "http://[your-bucket].oss-[your-region-id].aliyuncs.com/test3.png",
      "X": 10,
      "Y": 100,
      "Width": 300,
      "Height": 300
    }
  ],
  "BackgroundMusicArray": [
    "****b4549d46c88681030f6e****",
    "****549d46c88b4681030f6e****",
    "http://[your-bucket].oss-[your-region-id].aliyuncs.com/test4.mp3"
  ],
  "BackgroundImageArray": [
    "****6c886b4549d481030f6e****",
    "****9d46c8548b4681030f6e****",
    "http://[your-bucket].oss-[your-region-id].aliyuncs.com/test1.png"
  ]
}
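Because only a subset of SSML tags is supported in SpeechText and SpeechTextArray, it can help to scan a voiceover script for other tags before submission. The following Java sketch does this with a simple regular expression. The class SsmlTagCheck is hypothetical, the scan is not a full XML parse, and treating the <speak> root element as accepted is an assumption based on the examples in this document.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative scan for SSML tags outside the documented list; not part of the SDK. */
final class SsmlTagCheck {

    // Documented tags, plus the <speak> root element used in the examples (assumption).
    private static final Set<String> SUPPORTED = new HashSet<>(
            Arrays.asList("speak", "break", "s", "sub", "w", "phoneme", "say-as"));

    // Matches the name of an opening or closing tag, for example "<break" or "</s".
    private static final Pattern TAG_NAME = Pattern.compile("</?([A-Za-z][A-Za-z0-9-]*)");

    /** Returns the tag names found in the text that are not in the supported list. */
    static Set<String> unsupportedTags(String speechText) {
        Set<String> unsupported = new LinkedHashSet<>();
        Matcher matcher = TAG_NAME.matcher(speechText);
        while (matcher.find()) {
            String name = matcher.group(1).toLowerCase(Locale.ROOT);
            if (!SUPPORTED.contains(name)) {
                unsupported.add(name);
            }
        }
        return unsupported;
    }

    public static void main(String[] args) {
        String ok = "<speak>Voiceover supports SSML.<break time=\"1000ms\"/></speak>";
        String bad = "<speak><emphasis>Not in the supported list</emphasis></speak>";
        System.out.println(unsupportedTags(ok));   // prints []
        System.out.println(unsupportedTags(bad));  // prints [emphasis]
    }
}
```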

EditingConfig

EditingConfig controls titles, volume, positioning, and other production settings. Leave empty to use defaults.

Note

Parameters are the same for both generation modes.

Parameter

Type

Description

Example

Required

MediaConfig

JSON

Configuration for input video assets.

Parameter example

No

TitleConfig

JSON

Configuration for titles.

Parameter example

No

SubHeadingConfig

JSON

Configuration for multi-level subheadings.

JSON fields:

Parameter example

No

SpeechConfig

JSON

Configuration for the voiceover.

Parameter example

No

BackgroundMusicConfig

JSON

Configuration for background music.

{"Volume":0.2}

No

BackgroundImageConfig

JSON

Background image configuration. Has no effect if a background image is specified in InputConfig.

{"SubType":"Blur","Radius":0.5}

No

ProcessConfig

JSON

Configuration for the mixing and editing process.

Parameter example

No

FECanvas

JSON

Canvas configuration for front-end preview.

{"Width": 1080,"Height": 1920}

No

ProduceConfig

JSON

Standard editing and production configuration. For fields, see EditingProduceConfig.

{"AutoRegisterInputVodMedia":true,"OutputWebmTransparentChannel":true,"CoverConfig":{"StartTime":3.3},"AudioChannelCopy":"left","PipelineId":"***d54a97cff4108b555b01166d4b***","MaxBitrate":5000,"KeepOriginMaxBitrate":false,"KeepOriginVideoMaxFps":false}

No

ProcessConfig

Parameter

Type

Description

Example

Required

SingleShotDuration

Float

Duration of each segmented shot (seconds) when long video assets are split.

5

No. Default value: 3.

EnableClipSplit

Boolean

Enables AI clip segmentation (splits long assets by scene changes). If true, SingleShotDuration is ignored.

false

No. Default value: false.

AllowVfxEffect

Boolean

Whether to add special effects.

true

No. Default value: false.

VfxEffectProbability

Float

Probability of applying an effect to each clip. Range: 0.0 to 1.0. Supports 2 decimal places.

0.6

No. Default value: 0.5.

VfxFirstClipEffectList

List<String>

  • If not empty, the effect for the first clip of the video will be chosen from this list.

  • If empty, a random effect is chosen from the following defaults: slightshow, starfieldshinee, starfieldshinee2, starsparkle, colorfulripples, starfield.

  • Effect examples: Special effect examples.

["slightshow","starfieldshinee"]

No

VfxNotFirstClipEffectList

List<String>

  • If not empty, effects for all clips other than the first will be chosen from this list.

  • If empty, a random effect is chosen from the following defaults: zoomslight, zoom, zoominout, slightshake.

  • Effect examples: Special effect examples.

["zoomslight","zoom"]

No

AllowTransition

Boolean

Whether to add transition effects.

true

No. Default value: false.

TransitionDuration

Float

Duration of transitions in seconds. If TransitionDuration > ClipDuration - 1, the transition for that clip will not be applied.

0.5

No. Default value: 0.5.

TransitionList

List<String>

A list of custom transitions. If AllowTransition is true, a random transition from this list will be used. For available transitions, see Transition effects. If this list is empty, a random transition is chosen from: linearblur, colordistance, crosshatch, dreamyzoom, doomscreentransition_up.

["directional", "linearblur"]

No

UseUniformTransition

Boolean

Whether to use a uniform transition throughout a single video.

true

No. Default value: true.

AllowFilter

Boolean

Whether to add custom filters.

false

No. Default value: false.

FilterList

List<String>

A list of custom filters. If AllowFilter is true, a random filter from this list is applied. For available filters, see Filters. If this list is empty, no filter is applied.

["m1", "m2"]

No

AllowDuplicateMatch

Boolean

Whether a matched clip can be reused.

false

No. Default value: false.

ImageDuration

Float

The duration for static image assets, in seconds.

2

No. Default value: 2.
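The TransitionDuration rule in the table above (a transition is skipped when TransitionDuration > ClipDuration - 1) can be expressed as a one-line predicate. The Java sketch below only illustrates that rule. The class TransitionRule is hypothetical, and the clip durations in main are made-up sample values.

```java
/** Illustrative sketch of the documented transition rule; not the service's code. */
final class TransitionRule {

    /**
     * Returns true if a transition of the given length would be applied to a clip.
     * Documented rule: the transition is skipped when
     * transitionDuration > clipDuration - 1 (all values in seconds).
     */
    static boolean transitionApplies(double clipDuration, double transitionDuration) {
        return !(transitionDuration > clipDuration - 1.0);
    }

    public static void main(String[] args) {
        double transitionDuration = 0.5; // the documented default
        double[] sampleClipDurations = {3.0, 1.5, 1.2};
        for (double clip : sampleClipDurations) {
            System.out.println(clip + "s clip: transition applied = "
                    + transitionApplies(clip, transitionDuration));
        }
    }
}
```

With the default values (SingleShotDuration of 3 and TransitionDuration of 0.5), every full-length shot is long enough to receive a transition; only clips shorter than 1.5 seconds are affected.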

Example

All EditingConfig parameters are optional. Default configuration:

{
  "MediaConfig": {
    "Volume": 0 // Input video assets are muted by default
  },
  "TitleConfig": {
    "Alignment": "TopCenter",
    "AdaptMode": "AutoWrap",
    "Font": "Alibaba PuHuiTi 2.0 95 ExtraBold",
    "SizeRequestType": "Nominal",
    "Y": 0.1, // Y-coordinate for portrait video
    "Y": 0.05, // Y-coordinate for landscape video
    "Y": 0.08 // Y-coordinate for square video
  },
  "SpeechConfig": {
    "Volume": 1,  // Voiceover uses original volume by default
    "SpeechRate": 0,
    "Voice": null,
    "Style": null,
    "CustomizedVoice": null, // Voice ID. If set, Voice and Style are ignored.
    "AsrConfig": {
      "Alignment": "TopCenter",
      "AdaptMode": "AutoWrap",
      "Font": "Alibaba PuHuiTi 2.0 65 Medium",
      "SizeRequestType": "Nominal",
      "Spacing": -1,
      "Y": 0.8, // Subtitle Y-coordinate for portrait video
      "Y": 0.9, // Subtitle Y-coordinate for landscape video
      "Y": 0.85 // Subtitle Y-coordinate for square video
    }
  },
  "SubHeadingConfig": {
    "1": {
      "Y": 0.3,
      "FontSize": 40
    },
    "3": {
      "Y": 0.5,
      "FontSize": 30
    }
  },
  "BackgroundMusicConfig": {
    "Volume": 0.2,   // Background music at 20% volume by default
    "Style": null
  },
  "ProcessConfig": {
    "SingleShotDuration": 3,      // Duration of segmented shots. Choose one: SingleShotDuration or EnableClipSplit.
    "EnableClipSplit": false,      // Whether to use AI clip segmentation. If true, SingleShotDuration is ignored.
    "AllowVfxEffect": false,	  // Whether to add special effects.
    "AllowTransition": false,	  // Whether to add transitions.
    "AllowDuplicateMatch": false // In image-text matching mode, whether to allow reuse of matched clips.
  }
}

TemplateConfig

TemplateConfig contains common parameters for batch video production. For detailed parameters and examples, see TemplateConfig.

OutputConfig

Note
  • OutputConfig specifies the output destination, naming, resolution, and video count.

  • Parameters apply to both generation modes.

Parameter

Type

Description

Example

Required

MediaURL

String

The output video URL, which must include the {index} placeholder.

Format: http://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name]_{index}.mp4

Example: http://example.oss-cn-shanghai.aliyuncs.com/example/example_{index}.mp4

Required if GeneratePreviewOnly is false and output is to OSS.

StorageLocation

String

The storage location for media assets output to ApsaraVideo VOD.

Format: [your-vod-bucket].oss-[your-region-id].aliyuncs.com

Example: outin-****6c886b4549d481030f6e****.oss-cn-shanghai.aliyuncs.com

Required if GeneratePreviewOnly is false and output is to VOD.

FileName

String

The output file name, which must include the {index} placeholder.

Format: [your-file-name]_{index}.mp4

Example: example_{index}.mp4

Required if GeneratePreviewOnly is false and output is to VOD.

GeneratePreviewOnly

Boolean

  • If true, the job only generates a preview timeline without actually producing a video. The output URL is not required.

  • After the job completes, you can query the result using GetBatchMediaProducingJob to get the editing project ID (projectId), then call GetEditingProject to retrieve the preview timeline.

false

No. Default value: false.

Count

Integer

The number of videos to output.

  • Global Scripts: up to 100.

  • Storyboard Script: up to 100.

10

No. Default value: 1.

MaxDuration

Float

The maximum duration for each output video, in seconds.

If a SpeechText is provided, this parameter is ignored, and the video duration matches the voiceover length.

If no SpeechText is provided, the video duration will not exceed this value.

20

No. Default value: 15.

FixedDuration

Float

The fixed duration for each output video. If set, the video duration will be adjusted to match this value.

Note:

  • Not supported in Storyboard Script mode.

  • Supported in Global Scripts mode only if SpeechTextArray is empty.

  • You can only choose one between FixedDuration and MaxDuration.

20

No. Default value: 15.

Width

Integer

The width of the output video in pixels.

1080

Yes

Height

Integer

The height of the output video in pixels.

1920

Yes

Video

JSONObject

Configuration for the output video stream, such as CRF and codec.

{"Crf": 27}

No
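The interaction between Count, MaxDuration, FixedDuration, and the generation mode in the table above can be summarized as a small set of checks. The Java sketch below is one way to encode them on the client. The class OutputConfigCheck, its Mode enum, and the check method are hypothetical names introduced for illustration and are not part of the SDK.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative check of the OutputConfig rules described above; not part of the SDK. */
final class OutputConfigCheck {

    enum Mode { GLOBAL_SCRIPTS, STORYBOARD_SCRIPT }

    /**
     * Returns human-readable problems; an empty list means no documented rule is violated.
     * Pass null for maxDuration or fixedDuration when the parameter is not set.
     */
    static List<String> check(Mode mode, boolean hasSpeechText, int count,
                              Double maxDuration, Double fixedDuration) {
        List<String> problems = new ArrayList<>();
        if (count > 100) {
            problems.add("Count exceeds 100");
        }
        if (maxDuration != null && fixedDuration != null) {
            problems.add("Set only one of MaxDuration and FixedDuration");
        }
        if (fixedDuration != null && mode == Mode.STORYBOARD_SCRIPT) {
            problems.add("FixedDuration is not supported in Storyboard Script mode");
        }
        if (fixedDuration != null && mode == Mode.GLOBAL_SCRIPTS && hasSpeechText) {
            problems.add("FixedDuration requires an empty SpeechTextArray");
        }
        return problems;
    }

    public static void main(String[] args) {
        // Global Scripts mode, no voiceover, fixed 20-second videos.
        System.out.println(check(Mode.GLOBAL_SCRIPTS, false, 10, null, 20.0));
        // Storyboard Script mode does not accept FixedDuration.
        System.out.println(check(Mode.STORYBOARD_SCRIPT, false, 10, null, 20.0));
    }
}
```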

Example

{
  "MediaURL": "http://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name]_{index}.mp4",
  "Count": 1,
  "MaxDuration": 15,
  "Width": 1080,
  "Height": 1920,
  "Video": {"Crf": 27},
  "GeneratePreviewOnly": false
}
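Two requirements for MediaURL are easy to miss: the URL must contain the {index} placeholder, and the OSS region in the URL must match the region of the service endpoint. The Java sketch below checks both. The class OutputUrlCheck is hypothetical, and the region is extracted with a regular expression that assumes the public OSS host format shown in this document ([your-bucket].oss-[your-region-id].aliyuncs.com); other host formats, such as internal endpoints, are not handled.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative check of an output MediaURL; not part of the SDK. */
final class OutputUrlCheck {

    // Assumes the host format <bucket>.oss-<region-id>.aliyuncs.com used in this document.
    private static final Pattern OSS_HOST =
            Pattern.compile("^https?://[^/]+\\.oss-([a-z0-9-]+)\\.aliyuncs\\.com/");

    /** Returns the region ID embedded in the OSS URL, or null if the URL does not match. */
    static String ossRegion(String url) {
        Matcher matcher = OSS_HOST.matcher(url);
        return matcher.find() ? matcher.group(1) : null;
    }

    /** True if the URL contains {index} and its OSS region equals the endpoint region. */
    static boolean isValid(String mediaUrl, String endpointRegionId) {
        return mediaUrl.contains("{index}") && endpointRegionId.equals(ossRegion(mediaUrl));
    }

    public static void main(String[] args) {
        String url = "http://example.oss-cn-shanghai.aliyuncs.com/example/example_{index}.mp4";
        System.out.println(ossRegion(url));              // prints cn-shanghai
        System.out.println(isValid(url, "cn-shanghai")); // prints true
        System.out.println(isValid(url, "cn-beijing"));  // prints false
    }
}
```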

SDK examples

Prerequisites

You have installed the IMS server SDK. For more information, see Get started.

Code example

This example uses the Global Scripts mode.

package com.example;

import com.alibaba.fastjson.JSONObject;
import com.aliyun.ice20201109.Client;
import com.aliyun.ice20201109.models.*;
import com.aliyun.teaopenapi.models.Config;

import java.util.*;

/**
 *  Required maven dependencies:
 *   <dependency>
 *      <groupId>com.aliyun</groupId>
 *      <artifactId>ice20201109</artifactId>
 *      <version>2.3.0</version>
 *  </dependency>
 *  <dependency>
 *      <groupId>com.alibaba</groupId>
 *      <artifactId>fastjson</artifactId>
 *      <version>1.2.9</version>
 *  </dependency>
 */

public class SmartMixBatchEditingService {

    static final String regionId = "[your-region-id]"; // Must be a supported region, for example cn-shanghai, cn-beijing, or cn-hangzhou
    static final String bucket = "[your-bucket]";
    private Client iceClient;

    public static void main(String[] args) throws Exception {
        SmartMixBatchEditingService smartMixBatchEditingService = new SmartMixBatchEditingService();
        smartMixBatchEditingService.initClient();
        smartMixBatchEditingService.runExample();
    }

    public void initClient() throws Exception {
        // An Alibaba Cloud account AccessKey has access to all API operations. We recommend that you use a Resource Access Management (RAM) user for API access or daily operations.
        // In this example, the AccessKey ID and AccessKey secret are obtained from the environment variables. For configuration method, see: https://help.aliyun.com/zh/sdk/developer-reference/v2-manage-access-credentials?spm=a2c4g.11186623.0.0.423350fbOTFdOB#2a38e5c14b4em
        com.aliyun.credentials.Client credentialClient = new com.aliyun.credentials.Client();

        Config config = new Config();
        config.setCredential(credentialClient);

        // To hard-code your AccessKey ID and AccessKey secret, use the following lines. However, we recommend that you do not hard-code your AccessKey ID and AccessKey secret for security concerns.
        // config.accessKeyId = <AccessKey ID created in step 2>;
        // config.accessKeySecret = <AccessKey Secret created in step 2>;
        config.endpoint = "ice." + regionId + ".aliyuncs.com";
        config.regionId = regionId;
        iceClient = new Client(config);
    }

    public void runExample() throws Exception {

        // Video materials
        List<String> mediaArray = Arrays.asList(
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-1.mp4",
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-2.mp4",
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-3.mp4",
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-4.mp4",
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-5.mp4",
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-6.mp4",
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-7.mp4",
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-8.mp4",
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-9.mp4",
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-10.mp4",
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-11.mp4",
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-12.mp4",
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-13.mp4",
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-14.mp4",
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-15.mp4",
            "http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-16.mp4"
        );

        // Voiceover script
        String speechText = "In the vast expanse of the deep blue, a vibrant picture unfolds. In the clear turquoise water, coral reefs, like underwater forests, are a kaleidoscope of colors...";

        // Video title
        String title = "Protect Our Blue Home";

        JSONObject inputConfig = new JSONObject();
        inputConfig.put("MediaArray", mediaArray);
        inputConfig.put("SpeechText", speechText);
        inputConfig.put("Title", title);

        // Number of videos to produce
        int produceCount = 4;

        // Output width and height, generating portrait video
        int outputWidth = 1080;
        int outputHeight = 1920;

        //// Output width and height, generating landscape video
        //int outputWidth = 1920;
        //int outputHeight = 1080;

        // Output OSS address, must include {index} placeholder
        String mediaUrl = "http://" + bucket + ".oss-" + regionId + ".aliyuncs.com/smart_mix/output_{index}.mp4";

        JSONObject outputConfig = new JSONObject();
        outputConfig.put("MediaURL", mediaUrl);
        outputConfig.put("Count", produceCount);
        outputConfig.put("Width", outputWidth);
        outputConfig.put("Height", outputHeight);

        // Submit one-click video production task
        SubmitBatchMediaProducingJobRequest request = new SubmitBatchMediaProducingJobRequest();
        request.setInputConfig(inputConfig.toJSONString());
        request.setOutputConfig(outputConfig.toJSONString());

        SubmitBatchMediaProducingJobResponse response = iceClient.submitBatchMediaProducingJob(request);
        String jobId = response.getBody().getJobId();
        System.out.println("Start smart mix batch job, batchJobId: " + jobId);

        // Poll task status until all are finished
        System.out.println("Waiting job finished...");
        int maxTry = 3000;
        int i = 0;
        while (i < maxTry) {
            Thread.sleep(3000);
            i++;
            GetBatchMediaProducingJobRequest getRequest = new GetBatchMediaProducingJobRequest();
            getRequest.setJobId(jobId);
            GetBatchMediaProducingJobResponse getResponse = iceClient.getBatchMediaProducingJob(getRequest);
            String status = getResponse.getBody().getEditingBatchJob().getStatus();
            System.out.println("BatchJobId: " + jobId + ", status:" + status);

            if ("Failed".equals(status)) {
                System.out.println("Batch job failed. JobInfo: " + JSONObject.toJSONString(getResponse.getBody().getEditingBatchJob()));
                throw new Exception("Produce failed. BatchJobId: " + jobId);
            }

            if ("Finished".equals(status)) {
                System.out.println("Batch job finished. JobInfo: " + JSONObject.toJSONString(getResponse.getBody().getEditingBatchJob()));
                break;
            }
        }
    }
}
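The polling loop in the example above ends silently if the job is still running after the last attempt. One possible refinement, sketched below, is to move the loop into a helper that reports a timeout explicitly. The class JobPoller is hypothetical and does not call the SDK: the job status is supplied through a Supplier so that the sketch is self-contained. The terminal status names Finished and Failed are taken from the example above; the status name Processing in main is a placeholder, not a documented value.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

/** Illustrative polling helper with an explicit timeout; not part of the SDK. */
final class JobPoller {

    /**
     * Polls statusSource until it returns "Finished" or "Failed", waiting
     * intervalMillis between attempts, and fails after maxTries attempts.
     */
    static String waitForTerminalStatus(Supplier<String> statusSource, int maxTries, long intervalMillis) {
        for (int attempt = 0; attempt < maxTries; attempt++) {
            String status = statusSource.get();
            if ("Finished".equals(status) || "Failed".equals(status)) {
                return status;
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("Interrupted while waiting for the job", e);
            }
        }
        throw new IllegalStateException("Job did not reach a terminal status after " + maxTries + " tries");
    }

    public static void main(String[] args) {
        // Canned status sequence standing in for repeated GetBatchMediaProducingJob calls.
        Iterator<String> statuses = Arrays.asList("Processing", "Processing", "Finished").iterator();
        System.out.println(waitForTerminalStatus(statuses::next, 10, 10L)); // prints Finished
    }
}
```

In the SDK example, the supplier would wrap the getBatchMediaProducingJob call and return getEditingBatchJob().getStatus().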

API input parameters

InputConfig

{
	"MediaArray": [
		"http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-1.mp4",
		"http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-2.mp4",
		"http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-3.mp4",
		"http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-4.mp4",
		"http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-5.mp4",
		"http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-6.mp4",
		"http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-7.mp4",
		"http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-8.mp4",
		"http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-9.mp4",
		"http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-10.mp4",
		"http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-11.mp4",
		"http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-12.mp4",
		"http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-13.mp4",
		"http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-14.mp4",
		"http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-15.mp4",
		"http://ice-document-materials.oss-cn-shanghai.aliyuncs.com/test_media/sea/sea-16.mp4"
	],
	"SpeechText": "In the vast expanse of the deep blue, a vibrant picture unfolds. In the clear turquoise water, coral reefs, like underwater forests, are a kaleidoscope of colors...",
	"Title":"Protect Our Blue Home"
}

OutputConfig

{
  "Count": 4,
  "Height": 1920,
  "Width": 1080,
  "MediaURL": "http://[your-bucket].oss-[your-region-id].aliyuncs.com/[your-file-path]/[your-file-name]_{index}.mp4"
}

Result examples

Portrait

Landscape

Editing logic and advanced configuration

Processing logic

Global Scripts mode:

  • If video assets are selected from a search library based on descriptive text, the text is used as a search query to intelligently find matching video clips.

  • If a long video is provided as input, it will first be segmented into shorter shots. The final video will be a combination of these shots. The default duration for each shot is 3 seconds, which can be customized using the SingleShotDuration parameter.

  • If no voiceover is provided, the system randomly selects and splices video clips to create a video of approximately 15 seconds.

  • If a voiceover is provided, the system intelligently matches visuals to the text and synchronizes them with the voiceover to produce multiple videos in a batch.

Storyboard Script mode:

  • If video assets are selected from a search library based on descriptive text, the text is used to intelligently search for and retrieve matching video clips.

  • In this mode, you do not set SpeechTextArray. Instead, you control the content, duration, and voiceover for each scene using SceneInfo.ShotInfo.ShotScripts.

  • Within a single scene, the system first tries to match and trim clips based on the ScriptText. If ScriptText is not provided but SpeechText is, the matching is based on the voiceover.

  • The duration of a scene is synchronized with either the voiceover length or a custom-defined duration.
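The per-scene rules for Storyboard Script mode can be read as two small decisions: which text drives visual matching, and which value determines the scene duration. The Java sketch below writes them out. It only illustrates the rules as described in this document, not the service's implementation. The class SceneResolution is hypothetical, and voiceoverSeconds stands for the length of the synthesized voiceover, which the client does not normally know in advance.

```java
/** Illustrative sketch of the per-scene rules in Storyboard Script mode; not the service's code. */
final class SceneResolution {

    /** Text used for visual matching: ScriptText first, SpeechText as the fallback. */
    static String matchingText(String scriptText, String speechText) {
        if (scriptText != null && !scriptText.isEmpty()) {
            return scriptText;
        }
        if (speechText != null && !speechText.isEmpty()) {
            return speechText;
        }
        return null; // nothing to match on
    }

    /**
     * Scene duration in seconds: the voiceover length when a voiceover exists,
     * otherwise the custom Duration. Pass null for a value that is not available.
     */
    static Double sceneDuration(Double voiceoverSeconds, Double customDuration) {
        return voiceoverSeconds != null ? voiceoverSeconds : customDuration;
    }

    public static void main(String[] args) {
        System.out.println(matchingText("Visual script", "Voiceover")); // prints Visual script
        System.out.println(matchingText(null, "Voiceover"));            // prints Voiceover
        System.out.println(sceneDuration(4.2, 5.0));                    // prints 4.2
        System.out.println(sceneDuration(null, 5.0));                   // prints 5.0
    }
}
```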

Advanced configuration

For advanced settings, see Logic and advanced configurations for batch one-click video creation.

References