Model Studio models


This guide shows how to integrate Alibaba Cloud Model Studio with Hologres. You can deploy models from Model Studio in Hologres by using an API key and call them through AI Functions. This approach lets you perform AI development without moving data out of the database.

Introduction

Alibaba Cloud Model Studio (Bailian) is a one-stop platform for large language model development and application. It integrates Qwen and mainstream third-party models, providing developers with OpenAI-compatible APIs, end-to-end model services, and visual application building capabilities. Model Studio provides ready-to-use model services that let you call the full Qwen model lineup without deploying or maintaining infrastructure.

Hologres is deeply integrated with Alibaba Cloud Model Studio (Bailian). After you deploy a Model Studio model in Hologres with an API key, you can invoke it through AI Functions and build AI applications without moving data out of your database.

Billing

  • Network fees: The supported regions for Model Studio are China (Beijing) and Singapore. Accessing Model Studio from a Hologres instance may incur network fees. Network fees are waived during the beta period, and the billing start date will be announced on the official website.

  • Model invocation fees: Model Studio charges these fees based on API call volume. For details, see model pricing and the Model Studio console.

Limitations

  • Supported instance versions: Hologres V4.0.18 and later, or V4.1.2 and later.

  • Supported regions:

    • China (Beijing, Ulanqab)

    • China (Hangzhou, Shanghai)

    • China (Shenzhen)

    • Singapore
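
The version requirement above can be checked programmatically before you attempt a deployment. The following is a minimal sketch, not part of Hologres: the function name is hypothetical, and it assumes that release lines newer than V4.1 are also supported, which the list above does not state explicitly.

```python
import re

# Minimum patch level per release line, taken from the Limitations list.
MIN_PATCH = {(4, 0): 18, (4, 1): 2}


def is_supported_version(version: str) -> bool:
    """Return True if a version string such as 'V4.0.18' meets the minimum."""
    match = re.fullmatch(r"[Vv]?(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    if (major, minor) in MIN_PATCH:
        return patch >= MIN_PATCH[(major, minor)]
    # Assumption: release lines newer than 4.1 qualify, older ones do not.
    return (major, minor) > (4, 1)


if __name__ == "__main__":
    for candidate in ("V4.0.17", "V4.0.18", "V4.1.1", "V4.1.2"):
        print(candidate, is_supported_version(candidate))
```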

Model list and parameters

Deploy a model

In the Hologres console, go to Instances, select your target instance, and click AI Model at the top of the instance details page. On the Models page, you can deploy a Model Studio model: select Alibaba Cloud Model Studio as the provider, and then configure the following key parameters:

  • Model Type: The Model Studio model you want to deploy. For details, see the Model list section below. Models that are not on this list are not supported.

  • API Key: The API key for Model Studio. Before you deploy a model, activate Alibaba Cloud Model Studio and obtain an API key, and then enter the key during deployment. For instructions on how to obtain an API key, see Get an API Key.

  • Model Parameter Settings: After selecting a model type, configure its parameters to meet your business needs. For details, see the Parameters section below. You can also configure the retry mechanism, as described in the Retry mechanism section.

Parameters

The supported parameters vary by model type, as listed below. For a complete description, see the Model Studio console and API documentation.

  • Text models:

    • max_tokens: The maximum number of tokens to return. For the model's token limit, see the official Model Studio documentation.

    • temperature: The sampling temperature, which controls the diversity of the generated output. Valid values: [0, 2.0).

    • top_p: The probability threshold for nucleus sampling. Valid values: (0, 1.0]. Both temperature and top_p control diversity. We recommend setting only one of them.

    • Qwen-Omni series: In addition to the common text parameters, this series supports modalities (specifies the output as text or audio), audio.voice (specifies the voice for audio output), and audio.format (specifies the audio format; wav is a supported format).

  • Translation models: Configure the following parameters to improve translation quality. For complete usage details, see Translation models.

    • source_lang: The source language. For details, see the language list.

    • terms: A JSON-formatted list of custom translation terms.

    • tm_list: The translation memory, which provides example source-target sentence pairs in JSON format.

    • domains: Text-based, domain-specific prompts.

    Example:

    {
      "extra_body": {
        "translation_options": {
          "source_lang": "zh", 
          "domains": "The sentence is from Ali Cloud IT domain. ", 
          "terms": [
            {"source": "生物传感器", "target": "biological sensor"},
            {"source": "身体健康状况", "target": "health status of the body"}
          ], 
          "tm_list":[
            {"source": "您可以通过如下方式查看集群的内核版本信息:", "target": "You can use one of the following methods to query the engine version of a cluster:"},
            {"source": "bla", "target": "bla"}
          ]
        }
      }
    }

  • Embedding models: The dimension parameter specifies the vector dimension. Only some models let you change it. For detailed usage instructions, see Vectorization models.

    • text-embedding-v4 supports 2,048, 1,536, 1,024 (default), 768, 512, 256, 128, and 64.

    • text-embedding-v3 supports 1,024 (default), 768, 512, 256, 128, and 64.

    • qwen3-vl-embedding supports 2,560 (default), 2,048, 1,536, 1,024, 768, 512, and 256.
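
The value ranges and structures above can be validated on the client side before you enter them in the deployment form. The following is a minimal sketch under stated assumptions: the helper names are hypothetical and not part of Hologres or Model Studio, the max_tokens check only requires a positive integer (the per-model token limit is not checked), and the dimension table covers only the three models listed above.

```python
import json

# Dimensions listed above for the models whose dimension can be changed.
EMBEDDING_DIMENSIONS = {
    "text-embedding-v4": {2048, 1536, 1024, 768, 512, 256, 128, 64},
    "text-embedding-v3": {1024, 768, 512, 256, 128, 64},
    "qwen3-vl-embedding": {2560, 2048, 1536, 1024, 768, 512, 256},
}


def check_text_params(temperature=None, top_p=None, max_tokens=None):
    """Validate text-model parameters and return only the ones that were set."""
    params = {}
    if temperature is not None:
        if not 0 <= temperature < 2.0:
            raise ValueError("temperature must be in [0, 2.0)")
        params["temperature"] = temperature
    if top_p is not None:
        if not 0 < top_p <= 1.0:
            raise ValueError("top_p must be in (0, 1.0]")
        params["top_p"] = top_p
    if max_tokens is not None:
        # Assumption: a positive integer; the model's own limit is not checked.
        if not isinstance(max_tokens, int) or max_tokens < 1:
            raise ValueError("max_tokens must be a positive integer")
        params["max_tokens"] = max_tokens
    return params


def build_translation_options(source_lang, domains=None, terms=None, tm_list=None):
    """Build the extra_body structure used in the translation example above.

    terms and tm_list are iterables of (source, target) pairs.
    """
    options = {"source_lang": source_lang}
    if domains is not None:
        options["domains"] = domains
    if terms:
        options["terms"] = [{"source": s, "target": t} for s, t in terms]
    if tm_list:
        options["tm_list"] = [{"source": s, "target": t} for s, t in tm_list]
    return {"extra_body": {"translation_options": options}}


def check_embedding_dimension(model_type, dimension):
    """Return the dimension if the listed model supports it, else raise."""
    allowed = EMBEDDING_DIMENSIONS.get(model_type)
    if allowed is None:
        raise ValueError(f"no configurable dimensions listed for {model_type}")
    if dimension not in allowed:
        raise ValueError(f"{model_type} does not support dimension {dimension}")
    return dimension


if __name__ == "__main__":
    body = build_translation_options(
        "zh", terms=[("生物传感器", "biological sensor")]
    )
    print(json.dumps(body, ensure_ascii=False, indent=2))
```

Because temperature and top_p both control diversity, a caller would typically pass only one of them to check_text_params.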

Retry mechanism

When you deploy a model, you can configure the retry mechanism for failed calls. The following parameters are available:

  • max_retries: The maximum number of retries. Default value: 2. Value range: [0, 100].

  • initial_retry_delay: The initial delay before a retry, in seconds. Default value: 0.5. Value range: [0.5, 8].

  • max_retry_delay: The maximum delay for a retry, in seconds. Default value: 8. Value range: [1, 60].

  • timeout: The timeout for each request, in seconds. Default value: 600. Value range: [1, 1200].
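
To see how these settings interact, the sketch below validates a configuration against the ranges above and computes the wait before each retry. It is illustrative only: the class and function names are hypothetical, and the doubling backoff capped at max_retry_delay is an assumption, because the text above does not state how the delay grows between retries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryConfig:
    """Retry settings with the default values listed above."""

    max_retries: int = 2
    initial_retry_delay: float = 0.5  # seconds
    max_retry_delay: float = 8.0      # seconds
    timeout: float = 600.0            # seconds, per request

    def __post_init__(self):
        # Enforce the documented value ranges.
        if not 0 <= self.max_retries <= 100:
            raise ValueError("max_retries must be in [0, 100]")
        if not 0.5 <= self.initial_retry_delay <= 8:
            raise ValueError("initial_retry_delay must be in [0.5, 8]")
        if not 1 <= self.max_retry_delay <= 60:
            raise ValueError("max_retry_delay must be in [1, 60]")
        if not 1 <= self.timeout <= 1200:
            raise ValueError("timeout must be in [1, 1200]")


def retry_delays(config: RetryConfig) -> list:
    """Return the wait in seconds before each retry.

    Assumption: the delay doubles after every attempt and is capped at
    max_retry_delay. The actual growth policy is not documented here.
    """
    return [
        min(config.initial_retry_delay * 2 ** attempt, config.max_retry_delay)
        for attempt in range(config.max_retries)
    ]


if __name__ == "__main__":
    print(retry_delays(RetryConfig()))
    print(retry_delays(RetryConfig(max_retries=6)))
```

Under this assumed policy, the worst-case time spent on one call is roughly the sum of the delays plus timeout multiplied by the number of attempts, which is worth keeping in mind when you raise max_retries.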

Model list

Model Studio offers models for text generation, translation, embedding, and multimodal tasks. The following table lists the model category, model_type, task type, input/output, notes, and cross-region support for each model.

| Model category | model_type | Task type | Input/output | Notes | Cross-region support |
| --- | --- | --- | --- | --- | --- |
| Text generation | qwen3.6-plus | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen3-max | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen3-max-2026-01-23 | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen3-max-preview | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen-max | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen-max-latest | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen-plus | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen-plus-latest | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen-flash | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen-long | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen-long-latest | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwq-plus | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwq-plus-latest | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | deepseek-v3.2 | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | deepseek-v3.2-exp | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | deepseek-v3.1 | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | deepseek-r1 | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | deepseek-r1-0528 | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | deepseek-v3 | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | deepseek-r1-distill-qwen-1.5b | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | deepseek-r1-distill-qwen-7b | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | deepseek-r1-distill-qwen-14b | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | deepseek-r1-distill-qwen-32b | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | kimi-k2-thinking | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | Moonshot-Kimi-K2-Instruct | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | glm-4.6 | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | glm-4.7 | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | glm-5 | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | MiniMax-M2.1 | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | MiniMax-M2.5 | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | MiniMax/MiniMax-M2.1 | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | MiniMax/MiniMax-M2.5 | chat/completions | Text input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen3-vl-235b-a22b-instruct | chat/completions | Image/video input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen3-vl-235b-a22b-thinking | chat/completions | Image/video input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen3-vl-32b-instruct | chat/completions | Image/video input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen3-vl-32b-thinking | chat/completions | Image/video input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen3-vl-8b-instruct | chat/completions | Image/video input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen3-vl-8b-thinking | chat/completions | Image/video input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen3-vl-plus | chat/completions | Image/video input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen3-vl-flash | chat/completions | Image/video input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen-vl-ocr | chat/completions | Image input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen-vl-ocr-latest | chat/completions | Image input, text output | Supports parameters: temperature, top_p, and max_tokens. | Yes |
| Text generation | qwen3-omni-flash | chat/completions | Text/image/audio/video input, text/audio output | Supports parameters: temperature, top_p, max_tokens, modalities, and audio. | Yes |
| Translation | qwen-mt-plus | translation | ai_translate | Supports parameters: source_lang, terms, tm_list, and domains. | Yes |
| Translation | qwen-mt-flash | translation | ai_translate | | Yes |
| Translation | qwen-mt-turbo | translation | ai_translate | | Yes |
| Translation | qwen-mt-lite | translation | ai_translate | | Yes |
| Embedding | text-embedding-v1 | embedding | ai_embed, text input, float[] output | Embedding dimensions: 1,536 | Yes |
| Embedding | text-embedding-v2 | embedding | ai_embed, text input, float[] output | Embedding dimensions: 1,536 | Yes |
| Embedding | text-embedding-v3 | embedding | ai_embed, text input, float[] output | Embedding dimensions: 1,024 (default), 768, 512, 256, 128, or 64 | Yes |
| Embedding | text-embedding-v4 | embedding | ai_embed, text input, float[] output | Embedding dimensions: 2,048, 1,536, 1,024 (default), 768, 512, 256, 128, or 64 | Yes |
| Embedding | tongyi-embedding-vision-plus | embedding | ai_embed, text/image/video input, float[] output | Embedding dimensions: 1,152. Video input is supported only in the China (Beijing) and Singapore regions. | Yes |
| Embedding | tongyi-embedding-vision-flash | embedding | ai_embed, text/image/video input, float[] output | Embedding dimensions: 768. Video input is supported only in the China (Beijing) and Singapore regions. | Yes |
| Embedding | multimodal-embedding-v1 | embedding | ai_embed, text/image/video input, float[] output | Embedding dimensions: 1,024. Video input is supported only in the China (Beijing) and Singapore regions. | Yes |
| Embedding | qwen3-vl-embedding | embedding | ai_embed, text/image/video input, float[] output | Embedding dimensions: 2,560 (default), 2,048, 1,536, 1,024, 768, 512, or 256 | Yes |
| Image generation and editing | qwen-image-2.0-pro | image-generation | Supports image editing and text-to-image generation. | | Yes |
| Image generation and editing | qwen-image-2.0-pro-2026-03-03 | image-generation | Supports image editing and text-to-image generation. | | Yes |
| Image generation and editing | qwen-image-2.0 | image-generation | Supports image editing and text-to-image generation. | | Yes |
| Image generation and editing | qwen-image-2.0-2026-03-03 | image-generation | Supports image editing and text-to-image generation. | | Yes |
| Image generation and editing | qwen-image-max | image-generation | Supports text-to-image generation. | | Yes |
| Image generation and editing | qwen-image-plus | image-generation | Supports text-to-image generation. | | Yes |
| Image generation and editing | qwen-image | image-generation | Supports text-to-image generation. | | Yes |
| Image generation and editing | qwen-image-edit | image-generation | Supports image editing. | | Yes |
| Image generation and editing | qwen-image-edit-plus | image-generation | Supports image editing. | | Yes |
| Image generation and editing | qwen-image-edit-max | image-generation | Supports image editing. | | Yes |
| Image generation and editing | wan2.7-image-pro | image-generation | Text-to-image (single images only); supports 4K HD output. | | Yes |
| Image generation and editing | wan2.7-image | image-generation | Supports text-to-image generation. | | Yes |
| Video generation | wan2.6-r2v | video-generation | Reference-to-video (r2v) generation | | Yes |
| Video generation | wan2.6-r2v-flash | video-generation | Reference-to-video (r2v) generation | | Yes |
| Video generation | wan2.6-t2v | video-generation | Text-to-video (t2v) generation | | Yes |
| Video generation | wan2.6-i2v-flash | video-generation | Image-to-video (i2v) generation from the first frame | | Yes |
| Video generation | wan2.6-i2v | video-generation | Image-to-video (i2v) generation from the first frame | | Yes |
| Video generation | wan2.2-kf2v-flash | video-generation | Keyframe-to-video (kf2v) generation from the first and last frames | | Yes |
| Video generation | wan2.7-t2v | video-generation | Text-to-video generation | | Yes |
| Video generation | wan2.7-i2v | video-generation | Multimodal input (text, image, video, or audio) to video output | | Yes |
| Video generation | wan2.7-r2v | video-generation | Multimodal input (text, image, video, or audio) to video output | | Yes |
| Video generation | wan2.7-videoedit | video-generation | Video generation and editing from multimodal input (text, image, or video) | | Yes |
| Video generation | happyhorse-1.0-t2v | video-generation | Text-to-video generation | Supported only in China (Hangzhou) and China (Shenzhen). | Yes |
| Video generation | happyhorse-1.0-i2v | video-generation | Image-to-video (i2v) generation from the first frame | Supported only in China (Hangzhou) and China (Shenzhen). | Yes |
| Video generation | happyhorse-1.0-r2v | video-generation | Reference-to-video (r2v) generation | Supported only in China (Hangzhou) and China (Shenzhen). | Yes |
| Video generation | happyhorse-1.0-video-edit | video-generation | Video editing | Supported only in China (Hangzhou) and China (Shenzhen). | Yes |
| Speech-to-text | fun-asr | speech-to-text | Speech recognition | | Yes |

Model usage

Once a model is deployed, you can use an AI Function in Hologres to call it. This lets you perform inference and build AI applications without moving data from your database. For usage instructions, see the AI Function List. For a best practice example, see Best Practice: Use AI Functions to Build a High-Performance Image Analysis System for Autonomous Driving.