AI agents: Getting started

This guide helps developers integrate AI agents with Alibaba Cloud Video on Demand (VOD). It provides a quick-start guide and an API documentation structure designed for Large Language Models (LLMs).

What you can achieve

With this document, an AI agent can:

  • Understand VOD core capabilities: Learn about core VOD features, such as media upload, transcoding, playback, and media asset management, by using the structured module overview.

  • Learn to call APIs: Use the llms.txt index to find module-specific documentation containing API operations, parameter descriptions, and usage examples that teach the AI agent how to call VOD APIs.

  • Understand authentication and authorization: Learn how to configure credentials for VOD API calls by understanding its supported authentication methods, such as AccessKey and STS temporary credentials.

  • Handle common errors: Learn to resolve issues independently by using the provided list of common error codes and troubleshooting methods.

Prerequisites

Before using the VOD API, complete the following steps:

  • Activate VOD: Activate Alibaba Cloud Video on Demand (VOD) in the Alibaba Cloud console.

  • Create an AccessKey: Create an AccessKey ID and AccessKey Secret in the RAM console. For security reasons, we recommend creating a dedicated RAM user for VOD API calls and granting it the AliyunVODFullAccess policy.

  • Install an SDK: Use an Alibaba Cloud SDK to call the VOD API. The POP product code for VOD is vod, and the API version is 2017-03-21.

Default parameters and conventions

Before calling the VOD API, you need to understand the following default values and conventions:

  • Default application ID: app-1000000. If the multi-application system is not enabled, all API calls are associated with the default application.

  • Default storage: If you do not specify a StorageLocation, files are uploaded to the default storage address.

  • Default transcoding template group: If you do not specify a TemplateGroupId and no workflow is associated, the default transcoding template group (a no-transcoding template group) is used.

  • API call protocol: Use HTTPS for all API calls to ensure secure data transfer.

  • Request signature: All API requests require signature verification. Requests are signed with HMAC-SHA1, and the SDKs handle signing automatically.
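
For agents that cannot use an SDK, the signing step can be sketched with the Python standard library. This document only states that HMAC-SHA1 is used; the canonicalization steps below follow the RPC-style signing scheme that Alibaba Cloud documents for its OpenAPI and are an assumption here, so verify them against the official signature documentation. The function names and the placeholder values are hypothetical.

```python
import base64
import hashlib
import hmac
from urllib.parse import quote


def percent_encode(value) -> str:
    # RFC 3986 style encoding: letters, digits, '-', '_', '.', '~' stay unescaped.
    return quote(str(value), safe="~")


def sign_rpc_request(params: dict, access_key_secret: str, method: str = "GET") -> str:
    """Return a Base64-encoded HMAC-SHA1 signature (assumed RPC-style scheme)."""
    # 1. Sort parameters by name and build the canonicalized query string.
    canonical = "&".join(
        f"{percent_encode(name)}={percent_encode(value)}"
        for name, value in sorted(params.items())
    )
    # 2. String to sign: METHOD & encoded "/" & encoded canonical query string.
    string_to_sign = f"{method}&{percent_encode('/')}&{percent_encode(canonical)}"
    # 3. The HMAC key is the AccessKey Secret followed by "&".
    digest = hmac.new(
        (access_key_secret + "&").encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha1,
    ).digest()
    return base64.b64encode(digest).decode("ascii")


# Placeholder values for illustration only.
example_params = {
    "Action": "GetVideoInfo",
    "Version": "2017-03-21",
    "VideoId": "example-video-id",
}
example_signature = sign_rpc_request(example_params, "example-access-key-secret")
```

In practice the SDKs perform this step for you; the sketch is mainly useful for debugging a SignatureDoesNotMatch error.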

llms.txt

The llms.txt file is an index of the VOD documentation that is optimized for Large Language Models (LLMs) and hosted on Alibaba Cloud OSS. The file reorganizes the official documentation by scenario, API, and sub-document path, and extracts a Common mistakes to avoid list to guide code generation. This allows a coding agent to load the file all at once and expand sections on demand.

The index file is available at the following URL:

https://ice-document-materials.oss-cn-shanghai.aliyuncs.com/vod/llms/llms.txt

Relationship with the official documentation: llms.txt is an index. The sub-documents, such as Media Upload/Upload from URL.md, are condensed versions of key information from the official documentation. Their content is kept consistent and synchronized with the official website by the VOD documentation team.
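
An agent might load the index and list its sub-documents roughly as follows. This is a minimal sketch: it assumes that entries in llms.txt are written as Markdown links and that relative sub-document paths resolve against the index URL. Neither detail is stated in this document, and the function names are hypothetical.

```python
import re
from urllib.parse import quote, urljoin
from urllib.request import urlopen

LLMS_TXT_URL = (
    "https://ice-document-materials.oss-cn-shanghai.aliyuncs.com/vod/llms/llms.txt"
)

# Assumption: index entries look like "[title](target)".
LINK_PATTERN = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")


def fetch_index(url: str = LLMS_TXT_URL, timeout: float = 10.0) -> str:
    """Download the index file and return it as text."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def extract_links(index_text: str) -> list:
    """Return (title, target) pairs for every Markdown link in the index."""
    return LINK_PATTERN.findall(index_text)


def resolve_link(target: str, base_url: str = LLMS_TXT_URL) -> str:
    """Turn a link target into an absolute URL, escaping spaces in relative paths."""
    if target.startswith(("http://", "https://")):
        return target
    return urljoin(base_url, quote(target))
```

A typical flow would call fetch_index() once, pass the text to extract_links(), and fetch only the sub-documents relevant to the current task.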

VOD modules

VOD features are organized into modules, each corresponding to a set of API operations. The following table lists these modules and provides links to their documentation, which are also indexed in llms.txt. These links are designed for direct consumption by AI agents.

| Module | Description | llms document link |
| --- | --- | --- |
| Media upload | Upload audio, video, images, and auxiliary media assets using the console, client-side SDK, server-side API, or a URL. | Media Upload Overview |
| Media asset management | Manage uploaded media assets. Perform operations like querying information, updating metadata, deleting assets, and setting status. | Media Asset Management Overview |
| Media processing | Processes audio and video files by using features such as transcoding, snapshot capture, animated image generation, and watermark composition. Supports custom transcoding template groups, workflow orchestration, and AI templates for smart review and smart cover generation. | Media Processing Overview |
| Audio and video playback | Plays audio and video content that has been uploaded and processed. Playback is available through the console, a Player SDK, or third-party players. | Audio and Video Playback |
| Media security | A security framework that prevents hotlinking, unauthorized downloads, and illegal distribution of audio and video content through mechanisms such as access restriction, URL authentication, video encryption, and digital watermarks. | Media Security Overview |
| Media review | Provides smart review and manual review capabilities. Smart review automatically identifies non-compliant content (such as pornographic, violent, and political content) in audio and video, and supports custom AI review templates. Manual review provides APIs for creating review tasks and submitting results. | Smart review |
| Video AI | Performs automated analysis and processing of audio and video content, including smart review, tag recognition, DNA comparison, and cover generation. | Video AI Overview |
| Cloud editing | Provides cloud editing capabilities. You can use APIs to create editing projects, manage materials, and perform video composition. | Media Production (Cloud Editing) |
| CDN distribution and acceleration | Configure accelerated domain names and obtain playback URLs and playback credentials to distribute and play audio and video. Supports secure playback features like CDN acceleration, URL authentication, and DRM encryption. | CDN Distribution and Acceleration |
| Event notification | Receive notifications about media processing events, such as upload or transcoding completion, via HTTP callbacks or Message Service (MNS). | Event Notification |
| Data statistics | Query usage, monitor resource consumption, and perform statistical analysis to understand service usage and resource utilization. | Data Monitoring |
| Multi-application system | Create multiple applications under a single Alibaba Cloud account to logically isolate media assets, configurations, and permissions. This supports application-level control over media upload, playback, media asset management, and message callbacks. | Multi-application System |
| Server-side SDK | Use SDKs for Java, Python, PHP, and C/C++ to call APIs for media upload, management, and processing. | Server-side SDK |
| Live-to-VOD | Records a live stream in real time and automatically stores it as an on-demand media asset for subsequent playback, management, and distribution. | Configure Live-to-VOD |
| Billing | Offers pay-as-you-go and subscription billing based on metrics such as storage capacity, traffic and bandwidth, transcoding duration, media management, and value-added services. | Billing Overview |
| Mini-series solution | A one-stop solution for mini-series production and operations based on VOD. It provides content production, media asset management, data insights, and efficient distribution and playback. | Mini-series Solution |
| Player SDK | An Alibaba Cloud-developed, full-platform audio and video playback tool for Web, Android, and iOS that provides stable and smooth on-demand and live streaming playback. | Player SDK Overview |
| AliPlayerKit | A low-code player UI framework for video services that offers extensible components and scenario-based solutions for quick integration with on-demand, live streaming, and other scenarios. | PlayerKits Overview |
| API reference | Provides OpenAPI for the entire media asset lifecycle, supporting operations such as upload, management, processing, distribution, and playback. | API Overview |

Media upload

VOD provides several methods for uploading media:

  • Server-side upload: Call the CreateUploadVideo operation to obtain an upload URL and credential, then upload the file using an SDK or via HTTP. This method is ideal for backend server uploads.

  • Client-side upload: Upload videos directly from the client by using an AccessKey or an STS temporary credential.

  • Upload from URL: Call the UploadMediaByURL operation and provide the source file URL. The VOD service automatically pulls and uploads the file. This method is ideal for bulk migrations or importing media from third-party URLs.

Key parameters

The following parameters are critical when calling CreateUploadVideo:

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| FileName | String | Yes | | The full path and file name of the source media file, including the extension (e.g., video_01.mp4). |
| Title | String | Yes | | The media title. Maximum 128 characters. |
| Description | String | No | | The description of the audio or video. Maximum length: 1,024 characters. |
| CateId | Long | No | | The category ID. You can find this ID in the console: Configuration Management > Media Asset Management Configuration > Category Management. |
| Tags | String | No | | Up to 16 comma-separated tags. Each tag can be a maximum of 32 characters. |
| TemplateGroupId | String | No | | The transcoding template group ID. If you specify this parameter, transcoding is automatically triggered after the upload is complete. You can find this in the console by navigating to Configuration Management > Media Processing > Transcoding Template Groups. |
| WorkflowId | String | No | | The workflow ID. If you specify this parameter, the workflow is automatically triggered after the upload is complete. If both WorkflowId and TemplateGroupId are specified, WorkflowId takes precedence. |
| StorageLocation | String | No | | The storage address. If not specified, the file is uploaded to the default storage address. You can find this in the console by navigating to Configuration Management > Media Asset Management Configuration > Storage. |
| CoverURL | String | No | | The URL of a custom video cover. |
| AppId | String | No | app-1000000 | The application ID. Specifies the application in a multi-application system. |
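
The limits listed for CreateUploadVideo can be checked on the client before a request is sent. The following sketch uses only the standard library and returns a plain parameter dictionary; the function name is hypothetical, and passing the dictionary to an SDK request object is left to the caller.

```python
def build_create_upload_video_params(
    file_name,
    title,
    description=None,
    tags=None,
    template_group_id=None,
    workflow_id=None,
    app_id="app-1000000",
):
    """Validate the documented limits and return CreateUploadVideo parameters."""
    if not file_name or "." not in file_name:
        raise ValueError("FileName must include a file extension")
    if not title or len(title) > 128:
        raise ValueError("Title is required and limited to 128 characters")
    if description is not None and len(description) > 1024:
        raise ValueError("Description is limited to 1,024 characters")

    tag_list = list(tags or [])
    if len(tag_list) > 16:
        raise ValueError("At most 16 tags are allowed")
    if any(len(tag) > 32 or "," in tag for tag in tag_list):
        raise ValueError("Each tag is limited to 32 characters and cannot contain commas")

    params = {"FileName": file_name, "Title": title, "AppId": app_id}
    if description is not None:
        params["Description"] = description
    if tag_list:
        params["Tags"] = ",".join(tag_list)
    # WorkflowId takes precedence over TemplateGroupId, so this sketch sends
    # only one of them to make the effective behavior explicit.
    if workflow_id:
        params["WorkflowId"] = workflow_id
    elif template_group_id:
        params["TemplateGroupId"] = template_group_id
    return params
```

Sending only one of WorkflowId and TemplateGroupId is a design choice of this sketch; the service also accepts both and applies the same precedence.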

Media asset management

The media asset management module is used to manage uploaded audio, video, and auxiliary media assets. Core operations include:

  • Query media asset information: GetVideoInfo (queries a single video), GetVideoInfos (queries multiple videos in bulk), SearchMedia (searches for media assets)

  • Update media asset information: UpdateVideoInfo (updates video information), UpdateImageInfos (updates image information)

  • Delete media assets: DeleteVideo (deletes videos), DeleteAttachedMedia (deletes auxiliary media assets)

  • Bulk operations: BatchGetMediaInfos (retrieves information for up to 20 media assets at a time)

Note

The media ID (VideoId, MediaId, or ImageId) is the unique identifier for managing media assets. When you upload a video, CreateUploadVideo returns a VideoId. When you upload an auxiliary media asset, CreateUploadAttachedMedia returns a MediaId.
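
Because BatchGetMediaInfos accepts at most 20 media assets per call, a longer list of IDs has to be split into batches first. A minimal sketch with a hypothetical helper name:

```python
def chunk_media_ids(media_ids, batch_size=20):
    """Split media IDs into lists of at most batch_size items, preserving order."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ids = list(media_ids)
    return [ids[start:start + batch_size] for start in range(0, len(ids), batch_size)]
```

Each returned list can then be passed to one BatchGetMediaInfos call; how the IDs are encoded in the request is defined in the API reference for that operation.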

Media processing

The media processing module provides capabilities such as audio and video transcoding, snapshot capture, and AI review.

  • Transcoding: Configure transcoding parameters by using a transcoding template group (AddTranscodeTemplateGroup). You can trigger automatic transcoding by specifying a TemplateGroupId during upload or by using a workflow. You can set parameters such as video codec (for example, H.264), resolution (for example, 640×360), and bitrate (for example, 400 kbps).

  • Snapshot capture: Configure snapshot parameters by using a snapshot template (AddVodTemplate with TemplateType set to Snapshot). It supports various types, including standard snapshots and sprites.

  • Smart review: Configure review items (such as pornographic, violent, and political content) and scopes (cover image, video content, and title text) by using an AI template (AddAITemplate with TemplateType set to AIMediaAudit). The review is automatically triggered after a video is uploaded. You can also call CreateAudit for manual review.

  • Smart cover: Automatically generate a video cover by using an AI template (with TemplateType set to AIImage).

Smart review parameters

When calling AddAITemplate to create an AI review template:

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| TemplateName | String | Yes | | The name of the AI template. Maximum length: 128 bytes. |
| TemplateType | String | Yes | | The template type: AIMediaAudit (smart review) or AIImage (smart cover). |
| TemplateConfig | String | Yes | | The template configuration as a JSON string. It includes AuditItem (review items such as terrorism and porn), AuditRange (review scopes such as image-cover, text-title, and video), and AuditAutoBlock (whether to automatically block content: yes/no). |
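
Since TemplateConfig must be passed as a JSON string rather than a nested object, it helps to serialize it explicitly. The sketch below assumes that AuditItem and AuditRange are JSON arrays of strings, which this document does not spell out; confirm the exact shape in the AddAITemplate reference. The function name is hypothetical.

```python
import json


def build_ai_audit_template_params(template_name, audit_items, audit_ranges, auto_block):
    """Return AddAITemplate parameters for a smart review (AIMediaAudit) template."""
    if len(template_name.encode("utf-8")) > 128:
        raise ValueError("TemplateName is limited to 128 bytes")
    config = {
        # Assumption: review items and scopes are sent as arrays of strings.
        "AuditItem": list(audit_items),
        "AuditRange": list(audit_ranges),
        "AuditAutoBlock": "yes" if auto_block else "no",
    }
    return {
        "TemplateName": template_name,
        "TemplateType": "AIMediaAudit",
        # TemplateConfig is a JSON string, not a nested object.
        "TemplateConfig": json.dumps(config, separators=(",", ":")),
    }


example_template = build_ai_audit_template_params(
    "default-review",
    audit_items=["terrorism", "porn"],
    audit_ranges=["image-cover", "text-title", "video"],
    auto_block=False,
)
```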

Distribution and playback

The distribution and playback module provides video playback URL retrieval and secure playback capabilities.

  • Get playback URLs: Call GetPlayInfo to get video playback URLs. You can specify the output format (such as MP4, FLV, or HLS) and definition.

  • Get playback credential: Call GetVideoPlayAuth to get a playback credential for encrypted playback (either HLS standard encryption or Alibaba Cloud proprietary encryption).

  • Domain name management: Call AddVodDomain to add an accelerated domain name, BatchStartVodDomain to enable it, and BatchStopVodDomain to disable it.

Domain configuration parameters

When calling AddVodDomain to add an accelerated domain name:

| Parameter | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| DomainName | String | Yes | | The accelerated domain name. Wildcard domain names are supported, such as *.example.com. |
| Sources | String | Yes | | The list of origin addresses as a JSON array. Format: [{"content":"1.1.1.1","type":"ipaddr","priority":"20","port":80}]. |
| Scope | String | No | domestic | The acceleration scope: domestic (Chinese mainland), overseas (regions outside the Chinese mainland, including Hong Kong, Macao, and Taiwan), or global (global acceleration). |
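
The Sources value is a JSON array serialized into a string, which is easy to get wrong when written by hand. The following sketch builds the AddVodDomain parameters from the format shown above; the helper names are hypothetical, and the example origin reuses the values from the documented format.

```python
import json

VALID_SCOPES = ("domestic", "overseas", "global")


def make_origin(content, origin_type="ipaddr", priority="20", port=80):
    """Describe one origin address using the documented field names."""
    return {"content": content, "type": origin_type, "priority": priority, "port": port}


def build_add_vod_domain_params(domain_name, origins, scope="domestic"):
    """Return AddVodDomain parameters with Sources serialized as a JSON string."""
    if not origins:
        raise ValueError("At least one origin address is required")
    if scope not in VALID_SCOPES:
        raise ValueError(f"Scope must be one of {VALID_SCOPES}")
    return {
        "DomainName": domain_name,
        "Sources": json.dumps(list(origins), separators=(",", ":")),
        "Scope": scope,
    }


example_domain = build_add_vod_domain_params("*.example.com", [make_origin("1.1.1.1")])
```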

Common errors and troubleshooting

| Error code | Description | Troubleshooting |
| --- | --- | --- |
| InvalidAccessKeyId.NotFound | The specified AccessKey ID does not exist. | Use aliyun configure to verify your AccessKey configuration, or check the AccessKey status in the RAM console. |
| SignatureDoesNotMatch | The signature does not match the calculated result. | Enable SDK debug logs to troubleshoot the signature issue: export ALIBABA_CLOUD_LOG_LEVEL=debug. |
| InvalidParameter | The parameter is invalid. | Check whether the request parameters meet the requirements (such as type, length, and whether they are required) by referring to the documentation for each API. |
| Forbidden.AccessDenied | Insufficient permissions. | Confirm that the RAM user has been granted the required VOD permissions, such as AliyunVODFullAccess. You can check the granted policies by running aliyun ram ListPoliciesForUser --UserName <user>. |
| ServiceUnavailable | The service is temporarily unavailable. | This indicates a temporary service issue. Retry the request with exponential backoff. |
| QuotaExceeded.UploadVideo | The number of uploaded videos has exceeded the quota. | Check your account's upload quota. You can submit a ticket to request a quota increase. |
| MediaNotFound | The media asset does not exist. | Confirm that the VideoId or MediaId is correct and that the media asset has not been deleted. |
| InvalidStatus.Media | The media asset is in a state that is invalid for this operation. | The current state, such as under review, blocks the operation. Call GetVideoInfo to check the current status before retrying. |
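
The exponential backoff recommended for ServiceUnavailable can be sketched as a small wrapper. VodApiError is a hypothetical exception type that stands in for whatever error object your SDK raises, and the delay values are illustrative defaults rather than documented recommendations.

```python
import random
import time


class VodApiError(Exception):
    """Hypothetical error type carrying the VOD error code."""

    def __init__(self, code, message=""):
        super().__init__(f"{code}: {message}")
        self.code = code


RETRYABLE_CODES = {"ServiceUnavailable"}


def call_with_backoff(operation, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Call operation(), retrying retryable errors with exponential backoff and jitter."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except VodApiError as error:
            # Give up on non-retryable codes and on the final attempt.
            if error.code not in RETRYABLE_CODES or attempt == max_attempts:
                raise
            # Double the delay on each attempt, capped at max_delay.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Add up to 50% random jitter so concurrent clients do not retry in lockstep.
            sleep(delay + random.uniform(0, delay / 2))
```

Errors such as InvalidParameter or Forbidden.AccessDenied are raised immediately, because retrying them without changing the request or the permissions cannot succeed.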