PixVerse - Image-to-video based on the first frame API reference

The PixVerse image-to-video model generates a smooth video from an input image and a text prompt.

Important

This document applies only to the China (Beijing) region. Use an API key from this region.

Activate the service

  1. Go to the Alibaba Cloud Model Studio console, search for PixVerse, find the PixVerse model card, and click Activate Now.

  2. In the pop-up window, confirm the activation and authorization.

Scope

To ensure successful invocations, the model, endpoint URL, and API key must all belong to the same region. Cross-region invocations will fail.

HTTP invocation

Image-to-video tasks take a long time to run (typically 1 to 5 minutes), so the API uses asynchronous invocation. The process consists of two core steps: create a task, then poll for the result. Follow these steps:

Step 1: Create a task to get a task ID

Beijing region: POST https://dashscope.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis

Note
  • After the task is created, use the returned task_id to query the result. The task_id is valid for 24 hours. Do not create duplicate tasks. Instead, use polling to retrieve the result.

  • If you are new to calling HTTP APIs, see Postman.

Request parameters

Image-to-video

Supported models: pixverse/pixverse-c1-it2v, pixverse/pixverse-v6-it2v, and pixverse/pixverse-v5.6-it2v.

curl --location 'https://dashscope.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis' \
    -H 'X-DashScope-Async: enable' \
    -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
    "model": "pixverse/pixverse-c1-it2v",
    "input": {
        "media": [
            {
                "type": "image_url",
                "url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20260121/zlpocv/wan-i2v-haigui.webp"
            }
        ],
        "prompt": "The camera slowly moves up from below the sea turtle. The sea turtle swims leisurely, and the details of its belly are clearly visible."
    },
    "parameters": {
        "resolution": "720P",
        "duration": 5,
        "audio": false,
        "watermark": true
    }
}'
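
The curl request above can also be sketched in Python. This is a minimal illustration using only the standard library, not an official SDK: the helper names build_create_request and create_task are hypothetical, and the endpoint, headers, and body fields are copied from the curl example.

```python
import json
import urllib.request

# Endpoint taken from the curl example (China (Beijing) region).
CREATE_URL = (
    "https://dashscope.aliyuncs.com/api/v1/services/aigc/"
    "video-generation/video-synthesis"
)


def build_create_request(api_key, model, image_url, prompt, parameters):
    """Build (without sending) the POST request that creates a task.

    Hypothetical helper: it mirrors the curl example field by field.
    """
    body = {
        "model": model,
        "input": {
            "media": [{"type": "image_url", "url": image_url}],
            "prompt": prompt,
        },
        "parameters": parameters,
    }
    headers = {
        "X-DashScope-Async": "enable",  # required: only async calls are supported
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    return urllib.request.Request(
        CREATE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def create_task(request, timeout=30):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

To use it, read the API key from the DASHSCOPE_API_KEY environment variable, pass the same model, image URL, prompt, and parameters as in the curl example to build_create_request, and hand the result to create_task. Note that urllib raises urllib.error.HTTPError for non-2xx responses, so an error body would have to be read from the exception.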

Image-to-video (multi-shot)

pixverse-c1

Supported model: pixverse/pixverse-c1-it2v.

Describe the multi-shot scenario in the prompt. Setting the shot_type parameter is not supported.

curl --location 'https://dashscope.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis' \
    -H 'X-DashScope-Async: enable' \
    -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
    "model": "pixverse/pixverse-c1-it2v",
    "input": {
        "media": [
            {
                "type": "image_url",
                "url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20260121/zlpocv/wan-i2v-haigui.webp"
            }
        ],
        "prompt": "Shot 1: The camera slowly moves up from below the sea turtle. The sea turtle swims leisurely, and the details of its belly are clearly visible. Shot 2: A long shot of the sea turtle surrounded by seaweed, which is also swaying from side to side."
    },
    "parameters": {
        "resolution": "720P",
        "duration": 8,
        "audio": false,
        "watermark": true
    }
}'

pixverse-v6

Supported model: pixverse/pixverse-v6-it2v.

Describe the multi-shot scenario in the prompt and set shot_type to multi to generate a multi-shot video. Audio is controlled separately by the audio parameter, which is set to false in this example.

curl --location 'https://dashscope.aliyuncs.com/api/v1/services/aigc/video-generation/video-synthesis' \
    -H 'X-DashScope-Async: enable' \
    -H "Authorization: Bearer $DASHSCOPE_API_KEY" \
    -H 'Content-Type: application/json' \
    -d '{
    "model": "pixverse/pixverse-v6-it2v",
    "input": {
        "media": [
            {
                "type": "image_url",
                "url": "https://help-static-aliyun-doc.aliyuncs.com/file-manage-files/zh-CN/20260121/zlpocv/wan-i2v-haigui.webp"
            }
        ],
        "prompt": "Shot 1: The camera slowly moves up from below the sea turtle. The sea turtle swims leisurely, and the details of its belly are clearly visible. Shot 2: A long shot of the sea turtle surrounded by seaweed, which is also swaying from side to side."
    },
    "parameters": {
        "resolution": "720P",
        "duration": 8,
        "shot_type": "multi",
        "audio": false,
        "watermark": true
    }
}'

Content-Type string (Required)

The content type of the request. Must be application/json.

Authorization string (Required)

Authenticates the request with a Model Studio API key. Example: Bearer sk-xxxx.

X-DashScope-Async string (Required)

Enables asynchronous processing. HTTP requests support only asynchronous calls, so the value must be enable.

Important

If this request header is missing, the error "current user api does not support synchronous calls" is returned.

Request body

model string (Required)

The model name. For model output specifications, see the Model list.

Valid values:

  • pixverse/pixverse-c1-it2v

  • pixverse/pixverse-v6-it2v

  • pixverse/pixverse-v5.6-it2v

Model selection

  • For dynamic scenarios such as fights, magic effects, and high-speed motion, use c1.

  • For general scenarios, use v6. Upgrade directly from v5.6 to v6.

input object (Required)

Basic input information, including prompts and media materials.

Properties

prompt string (Optional)

The text prompt describes the elements and visual features you want in the generated video.

Chinese and English are supported. Each Chinese character or letter counts as one character. Character encoding is UTF-8. Text exceeding the limit is automatically truncated.

  • pixverse/pixverse-c1-it2v: Up to 5,000 characters.

  • pixverse/pixverse-v6-it2v: Up to 5,000 characters.

  • pixverse/pixverse-v5.6-it2v: Up to 2,048 characters.
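
Because text over the limit is truncated rather than rejected, it can help to check the prompt length on the client before submitting. The following is a minimal Python sketch; the function name prompt_overflow is hypothetical, and it assumes the service counts characters the same way Python's len() counts Unicode code points.

```python
# Per-model prompt limits, copied from the list above.
PROMPT_LIMITS = {
    "pixverse/pixverse-c1-it2v": 5000,
    "pixverse/pixverse-v6-it2v": 5000,
    "pixverse/pixverse-v5.6-it2v": 2048,
}


def prompt_overflow(model, prompt):
    """Return how many characters exceed the model's limit (0 if it fits).

    Assumption: one Python code point equals one counted character.
    """
    limit = PROMPT_LIMITS[model]
    return max(0, len(prompt) - limit)
```

A non-zero result means the tail of the prompt would be dropped, so shorten the prompt before sending it.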

media array (Required)

A list of media materials specifying the images required for video generation.

Each array element is a media object containing the type and url fields.

Properties

type string (Required)

The type of media material. Valid value:

  • image_url: The URL of an image.

Material limits: Provide exactly one image.

If you provide multiple images, the system uses the last one. To ensure expected results, always provide only one image.

url string (Required)

The URL of the image file. The URL must be publicly accessible over the internet.

  • HTTP and HTTPS protocols are supported.

  • Example: https://xxx/xxx.jpg.

Image limits:

  • Format: JPG, PNG, or WEBP.

  • Resolution: Width and height cannot exceed 10,000 pixels.

  • File size: Up to 20 MB.
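
Some of the media rules above can be checked locally before a request is sent. The sketch below is illustrative only: check_media is a hypothetical helper, and it covers just the rules that do not require downloading the image (element count, type, and URL scheme). Format, pixel dimensions, and file size are not verified here.

```python
from urllib.parse import urlparse


def check_media(media):
    """Return a list of problems found in the media array (empty if none)."""
    problems = []
    if len(media) != 1:
        problems.append(f"expected exactly one image, got {len(media)}")
    for index, item in enumerate(media):
        if item.get("type") != "image_url":
            problems.append(f"media[{index}].type must be 'image_url'")
        # Only HTTP and HTTPS URLs are supported.
        scheme = urlparse(str(item.get("url") or "")).scheme.lower()
        if scheme not in ("http", "https"):
            problems.append(f"media[{index}].url must use http or https")
    return problems
```

An empty list means the array passed these basic checks. It does not guarantee that the URL is publicly reachable.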

parameters object (Optional)

Video processing parameters, such as resolution, duration, audio generation, and watermarking.

Properties

resolution string (Required)

Important

Resolution affects cost. Before calling the API, review the model pricing.

Specifies the resolution level of the generated video.

Valid values: 360P, 540P, 720P, and 1080P.

duration integer (Required)

Important

Duration affects cost. Billing is per second. Before calling the API, review the model pricing.

The duration of the generated video, in seconds.

  • pixverse/pixverse-c1-it2v: An integer from 1 to 15.

  • pixverse/pixverse-v6-it2v: An integer from 1 to 15.

  • pixverse/pixverse-v5.6-it2v:

    • When resolution is 360P, 540P, or 720P: Valid values are 5, 8, and 10.

    • When resolution is 1080P: Valid values are 5 and 8.
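
The duration rules above depend on both the model and the resolution, which a small lookup makes explicit. This is a sketch under the rules listed in this section; allowed_durations is a hypothetical helper, not part of any SDK.

```python
def allowed_durations(model, resolution):
    """Return the set of valid duration values (in seconds)."""
    if model in ("pixverse/pixverse-c1-it2v", "pixverse/pixverse-v6-it2v"):
        return set(range(1, 16))  # any integer from 1 to 15
    if model == "pixverse/pixverse-v5.6-it2v":
        if resolution in ("360P", "540P", "720P"):
            return {5, 8, 10}
        if resolution == "1080P":
            return {5, 8}
        raise ValueError(f"unsupported resolution: {resolution}")
    raise ValueError(f"unsupported model: {model}")
```

For example, a 10-second request is valid for pixverse/pixverse-v5.6-it2v at 720P but not at 1080P.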

audio boolean (Optional)

Important

The audio setting affects cost. Before calling the API, review the model pricing.

Specifies whether to generate a video with audio. If enabled, the model automatically generates matching background music or sound effects based on the video content.

  • false: Default. Outputs a silent video.

  • true: Outputs a video with audio.

watermark boolean (Optional)

Specifies whether to add a watermark. The watermark appears in the lower-right corner of the video and displays the fixed text "Generated by AI".

  • false: Default. Does not add a watermark.

  • true: Adds a watermark.

shot_type string (Optional)

Supported model: pixverse/pixverse-v6-it2v.

Specifies the shot type of the generated video: a single continuous shot or multiple shots.

  • single: Default. Generates a single-shot video.

  • multi: Multi-shot. The system performs intelligent shot segmentation.

Suggestion: The prompt parameter takes precedence over shot_type. For best results, ensure consistency between this parameter and the prompt description.

  • To consistently output a single-shot video: Set shot_type="single" and describe a single-shot scenario in the prompt.

  • To consistently output a multi-shot video: Set shot_type="multi" and describe a multi-shot scenario in the prompt.
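
Since shot_type is accepted only by pixverse/pixverse-v6-it2v, a client can guard against sending it to other models. The following sketch uses a hypothetical helper, with_shot_type, and treats a shot_type on any other model as a client-side error. That strictness is a design choice of this example, not documented server behavior.

```python
SHOT_TYPE_MODEL = "pixverse/pixverse-v6-it2v"


def with_shot_type(model, parameters, shot_type=None):
    """Return a copy of parameters, adding shot_type only when allowed."""
    result = dict(parameters)
    if shot_type is None:
        return result  # omit the field; the documented default is single
    if shot_type not in ("single", "multi"):
        raise ValueError(f"shot_type must be 'single' or 'multi', got {shot_type!r}")
    if model != SHOT_TYPE_MODEL:
        raise ValueError(f"shot_type is not supported by {model}")
    result["shot_type"] = shot_type
    return result
```

For the c1 model, leave shot_type unset and describe the shots in the prompt instead.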

seed integer (Optional)

The random number seed must be an integer in the range [0, 2147483647].

If not specified, a random seed is generated. A fixed seed improves reproducibility.

Because model generation is probabilistic, the same seed does not guarantee identical results.
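
One way to make use of the seed is to choose it on the client and log it, so that a retry can reuse the same value. This is a small sketch with a hypothetical helper, pick_seed. As noted above, reusing a seed improves reproducibility but does not guarantee identical output.

```python
import random

SEED_MAX = 2147483647  # upper bound of the documented range, equal to 2**31 - 1


def pick_seed(seed=None):
    """Validate a caller-supplied seed, or draw one from the documented range."""
    if seed is None:
        return random.randint(0, SEED_MAX)  # randint includes both endpoints
    if isinstance(seed, bool) or not isinstance(seed, int):
        raise ValueError("seed must be an integer")
    if not 0 <= seed <= SEED_MAX:
        raise ValueError(f"seed must be in [0, {SEED_MAX}]")
    return seed
```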

Response parameters

Successful response

Save the task_id to query the task status and result.

{
    "output": {
        "task_status": "PENDING",
        "task_id": "0385dc79-5ff8-4d82-bcb6-xxxxxx"
    },
    "request_id": "4909100c-7b5a-9f92-bfe5-xxxxxx"
}

Error response

Returned when task creation fails. See Error messages.

{
    "code": "InvalidApiKey",
    "message": "No API-key provided.",
    "request_id": "7438d53d-6eb8-4596-8835-xxxxxx"
}

output object

Task output information.

Properties

task_id string

The task ID. Valid for queries for 24 hours.

task_status string

The status of the task.

Enumeration values

  • PENDING

  • RUNNING

  • SUCCEEDED

  • FAILED

  • CANCELED

  • UNKNOWN: The task does not exist or its status is unknown.

request_id string

Unique request identifier for tracing and troubleshooting.

code string

Error code. Returned only for failed requests. See Error messages.

message string

Detailed error message. Returned only for failed requests. See Error messages.
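
The two response shapes above can be told apart by the presence of the top-level code field. The sketch below assumes the response has already been decoded from JSON into a dict. TaskCreationError and extract_task_id are hypothetical names introduced for illustration.

```python
class TaskCreationError(Exception):
    """Raised when the create-task response carries an error code."""


def extract_task_id(response):
    """Return the task_id from a create-task response, or raise on error."""
    if "code" in response:
        raise TaskCreationError(
            f"{response['code']}: {response.get('message', '')} "
            f"(request_id={response.get('request_id')})"
        )
    return response["output"]["task_id"]
```

Keeping request_id in the exception text makes it easier to trace a failed call later.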

Step 2: Query the result by task ID

Beijing region: GET https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}

Note
  • Polling suggestion: Video generation takes several minutes. Poll at a reasonable interval, such as every 15 seconds, to retrieve the result.

  • Task status flow: PENDING (queued) → RUNNING (processing) → SUCCEEDED (completed) / FAILED (failed).

  • task_id validity period: 24 hours. After this period, you can no longer query the result, and the API returns the task status as UNKNOWN.

  • RPS limit: The default limit for the query API is 20 requests per second (RPS). For higher-frequency polling or event notifications, configure an asynchronous task callback.

  • More operations: For batch queries and task cancellation, see Manage asynchronous tasks.
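
The polling advice above can be sketched as a simple loop. This is an illustration using the Python standard library, not an official client: fetch_task and poll_task are hypothetical names, the 10-minute max_wait is an assumed budget, and treating UNKNOWN as a stop condition is a choice made for this example.

```python
import json
import time
import urllib.request

TASK_URL = "https://dashscope.aliyuncs.com/api/v1/tasks/{task_id}"
# Statuses after which this sketch stops polling (an assumption for UNKNOWN).
STOP_STATUSES = {"SUCCEEDED", "FAILED", "CANCELED", "UNKNOWN"}


def fetch_task(task_id, api_key, timeout=30):
    """Issue one GET request for the task and return the decoded JSON."""
    request = urllib.request.Request(
        TASK_URL.format(task_id=task_id),
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def poll_task(fetch, task_id, interval=15.0, max_wait=600.0, sleep=time.sleep):
    """Call fetch(task_id) every `interval` seconds until a stop status.

    `waited` counts only sleep time, not the time spent in each request.
    """
    waited = 0.0
    while True:
        response = fetch(task_id)
        if response["output"]["task_status"] in STOP_STATUSES:
            return response
        if waited + interval > max_wait:
            raise TimeoutError(f"task {task_id} still running after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

A typical call would be poll_task(lambda tid: fetch_task(tid, api_key), task_id). Passing fetch and sleep as arguments keeps the loop testable without network access.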

Request parameters

Query task result

Replace {task_id} with the task_id value returned by the previous API call. The task_id is valid for queries for 24 hours.

curl -X GET https://dashscope.aliyuncs.com/api/v1/tasks/{task_id} \
--header "Authorization: Bearer $DASHSCOPE_API_KEY"

Headers

Authorization string (Required)

Authenticates the request with a Model Studio API key. Example: Bearer sk-xxxx.

Path parameters

task_id string (Required)

The ID of the task.

Response parameters

Task successful

{
    "request_id": "7df19cf7-d76c-4bb8-b4c5-xxxxxx",
    "output": {
        "task_id": "5abf2c85-ea81-4cbf-8918-xxxxxx",
        "task_status": "SUCCEEDED",
        "submit_time": "2026-03-20 11:48:50.499",
        "scheduled_time": "2026-03-20 11:48:50.551",
        "end_time": "2026-03-20 11:49:46.462",
        "orig_prompt": "The camera slowly moves up from below the sea turtle. The sea turtle swims leisurely, and the details of its belly are clearly visible.",
        "video_url": "https://media.pixverseai.cn/xxxx.mp4"
    },
    "usage": {
        "duration": 5,
        "shot_type": "single",
        "size": "992*944",
        "fps": 24,
        "video_count": 1,
        "audio": false,
        "SR": "720"
    }
}

Task failed

When a task fails, task_status is FAILED with an error code and message. See Error messages.

{
    "request_id": "e5d70b02-ebd3-98ce-9fe8-759d7d7b107d",
    "output": {
        "task_id": "86ecf553-d340-4e21-af6e-a0c6a421c010",
        "task_status": "FAILED",
        "code": "InvalidParameter",
        "message": "The size is not match xxxxxx"
    }
}

Task query expired

The task_id is valid for 24 hours. After this period, queries return a task_status of UNKNOWN, as in the following response.

{
    "request_id": "a4de7c32-7057-9f82-8581-xxxxxx",
    "output": {
        "task_id": "502a00b1-19d9-4839-a82f-xxxxxx",
        "task_status": "UNKNOWN"
    }
}

output object

Task output information.

Properties

task_id string

The task ID. Valid for queries for 24 hours.

task_status string

The status of the task.

Enumeration values

  • PENDING

  • RUNNING

  • SUCCEEDED

  • FAILED

  • CANCELED

  • UNKNOWN: The task does not exist or its status is unknown.

State transitions during polling:

  • PENDING → RUNNING → SUCCEEDED or FAILED.

  • The initial query status is usually PENDING or RUNNING.

  • When the status changes to SUCCEEDED, the response contains the generated video URL.

  • If the status is FAILED, check the error message and retry the task.
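
The status handling described above could be condensed into one function that either returns the video URL, signals that polling should continue, or raises. This is a sketch assuming the query response has been decoded into a dict. The function name and the choice of exception types are illustrative only.

```python
def video_url_or_raise(response):
    """Return video_url on success, None while in progress, else raise."""
    output = response["output"]
    status = output["task_status"]
    if status == "SUCCEEDED":
        return output["video_url"]
    if status in ("PENDING", "RUNNING"):
        return None  # not finished yet: keep polling
    if status == "FAILED":
        raise RuntimeError(f"{output.get('code')}: {output.get('message')}")
    if status == "CANCELED":
        raise RuntimeError("task was canceled")
    # UNKNOWN: the task does not exist or the task_id is older than 24 hours.
    raise LookupError(f"task status is {status}")
```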

submit_time string

The time when the task was submitted. The format is YYYY-MM-DD HH:mm:ss.SSS.

scheduled_time string

The time when the task was executed. The format is YYYY-MM-DD HH:mm:ss.SSS.

end_time string

The time when the task was completed. The format is YYYY-MM-DD HH:mm:ss.SSS.
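
The three timestamps share one format, so they can be parsed with a single strptime pattern and subtracted to measure how long a task took. The sketch below uses a hypothetical helper, elapsed_seconds. The responses shown in this document carry no time zone, so the values are treated as naive datetimes.

```python
from datetime import datetime

# strptime equivalent of YYYY-MM-DD HH:mm:ss.SSS (%f accepts 1 to 6 digits).
TIME_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_seconds(output):
    """Return the seconds between submit_time and end_time."""
    start = datetime.strptime(output["submit_time"], TIME_FORMAT)
    end = datetime.strptime(output["end_time"], TIME_FORMAT)
    return (end - start).total_seconds()
```

With the sample values from the successful response (submitted at 11:48:50.499, completed at 11:49:46.462), the result is about 55.963 seconds.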

video_url string

The video URL. Returned only when task_status is SUCCEEDED.

The video format is MP4 (H.264 encoding). The video link does not currently expire, but avoid relying on it for long-term storage. Download the video promptly.
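
To act on the advice to download promptly, the file can be streamed to local storage as soon as the task succeeds. This is a minimal standard-library sketch. download_video is a hypothetical helper, and the 60-second timeout is an arbitrary assumption.

```python
import shutil
import urllib.request


def download_video(video_url, dest_path, timeout=60):
    """Stream the video at video_url into dest_path and return the path."""
    with urllib.request.urlopen(video_url, timeout=timeout) as resp, \
            open(dest_path, "wb") as out:
        shutil.copyfileobj(resp, out)  # copies in chunks, not all in memory
    return dest_path
```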

orig_prompt string

The original input prompt, corresponding to the request parameter prompt.

code string

Error code. Returned only for failed requests. See Error messages.

message string

Detailed error message. Returned only for failed requests. See Error messages.

usage object

Statistics for the output information. Only successful results are counted.

Properties

duration integer

The total duration of the generated video, used for billing.

size string

The resolution of the generated video.

fps integer

The frame rate of the generated video.

SR string

The resolution level of the generated video.

audio boolean

Indicates whether the generated video contains audio.

video_count integer

The number of generated videos. The value is fixed at 1.

shot_type string

The shot type of the generated video.
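
In the sample response, size is a string such as "992*944". Assuming the two numbers are width and height separated by an asterisk (the order is inferred from the sample, not stated in this document), a small hypothetical helper can turn it into integers:

```python
def parse_size(size):
    """Split a size string like '992*944' into two integers.

    Assumption: the order is (width, height).
    """
    width, height = size.split("*")
    return int(width), int(height)
```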

request_id string

Unique request identifier for tracing and troubleshooting.

Error codes

If the model call fails and returns an error message, see Error messages for resolution.