This guide shows how to integrate Alibaba Cloud Model Studio with Hologres. You can deploy models from Model Studio in Hologres by using an API key and call them through AI Functions. This approach lets you perform AI development without moving data out of the database.
Introduction
Alibaba Cloud Model Studio (Bailian) is a one-stop platform for large language model development and application. It integrates Qwen and mainstream third-party models, providing developers with OpenAI-compatible APIs, end-to-end model services, and visual application building capabilities. Model Studio provides ready-to-use model services that let you call the full Qwen model lineup without deploying or maintaining infrastructure.
Hologres is deeply integrated with Alibaba Cloud Model Studio (Bailian). You can deploy Model Studio models in Hologres with an API key and invoke them through AI Functions. This lets you build AI applications without moving data out of your database.
Billing
- Network fees: The supported regions for Model Studio are China (Beijing) and Singapore. Accessing Model Studio from a Hologres instance may incur network fees. Network fees are waived during the beta period, and the billing start date will be announced on the official website.
- Model invocation fees: Model Studio charges these fees based on API call volume. For details, see model pricing and the Model Studio console.
Limitations
- Supported instance versions: Hologres V4.0.18 and later, or V4.1.2 and later.
- Supported regions:
  - China (Beijing, Ulanqab)
  - China (Hangzhou, Shanghai)
  - China (Shenzhen)
  - Singapore
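The version requirement above can be read as two release lines: V4.0 needs patch 18 or later, and V4.1 onward needs V4.1.2 or later. The following is a minimal sketch of that check, assuming the instance version is available as a string such as `V4.0.18`. The helper names `parse_version` and `supports_model_studio` are hypothetical and are not part of any Hologres API, and treating releases newer than V4.1 as supported is an assumption based on the phrase "and later".

```python
import re
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse strings such as 'V4.0.18' or '4.1.2' into (major, minor, patch)."""
    match = re.fullmatch(r"[Vv]?(\d+)\.(\d+)\.(\d+)", version.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return major, minor, patch


def supports_model_studio(version: str) -> bool:
    """Return True if the version meets the documented minimum.

    Assumption: on the V4.0 line the minimum is V4.0.18; on every later
    line the minimum is V4.1.2 (so V4.1.0 and V4.1.1 are excluded).
    """
    parsed = parse_version(version)
    if parsed[:2] == (4, 0):
        return parsed >= (4, 0, 18)
    return parsed >= (4, 1, 2)


if __name__ == "__main__":
    for candidate in ("V4.0.17", "V4.0.18", "V4.1.1", "V4.1.2"):
        print(candidate, supports_model_studio(candidate))
```

Comparing parsed tuples rather than raw strings avoids ordering mistakes such as treating "4.0.9" as newer than "4.0.18".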
Model list and parameters
Deploy a model
In the Hologres console, go to Instances. Select your target instance and click AI Model at the top of the instance details page. On the Models page, you can deploy a Model Studio model. Select Alibaba Cloud Model Studio as the provider and configure the following key parameters:
- Model Type: The Model Studio model you want to deploy. For details, see the Model list section below. Models that are not on this list are not supported.
- API Key: You must activate Alibaba Cloud Model Studio and obtain an API key before using the service. Enter this key during deployment. For instructions on how to obtain an API key, see Get an API Key.
- Model Parameter Settings: After selecting a model type, configure its parameters to meet your business needs. For details, see the Parameters section below. You can also configure the retry mechanism, as described in the Retry mechanism section.
Parameters
The supported parameters vary by model type, as listed below. For a complete description, see the Model Studio console and API documentation.
- Text models:
  - max_tokens: The maximum number of tokens to return. For the model's token limit, see the official Model Studio documentation.
  - temperature: The sampling temperature, which controls the diversity of the generated output. Valid values: [0, 2.0).
  - top_p: The probability threshold for nucleus sampling. Valid values: (0, 1.0]. Both temperature and top_p control diversity. We recommend setting only one of them.
  - Qwen-Omni series: In addition to the common text parameters, this series supports modalities (specifies the output as text or audio), audio.voice (specifies the voice for audio output), and audio.format (specifies the audio format; wav is a supported format).
- Translation models: Configure the following parameters to improve translation quality. For complete usage details, see Translation models.
  - source_lang: The source language. For details, see the language list.
  - terms: A JSON-formatted list of custom translation terms.
  - tm_list: The translation memory, which provides example source-target sentence pairs in JSON format.
  - domains: Text-based, domain-specific prompts.

  Example:

    {
      "extra_body": {
        "translation_options": {
          "source_lang": "zh",
          "domains": "The sentence is from Ali Cloud IT domain. ",
          "terms": [
            {"source": "生物传感器", "target": "biological sensor"},
            {"source": "身体健康状况", "target": "health status of the body"}
          ],
          "tm_list": [
            {"source": "您可以通过如下方式查看集群的内核版本信息:", "target": "You can use one of the following methods to query the engine version of a cluster:"},
            {"source": "bla", "target": "bla"}
          ]
        }
      }
    }

- Embedding models: dimension specifies the vector dimension. You can modify this parameter for only some models. For detailed usage instructions, see Vectorization models.
  - text-embedding-v4 supports 2,048, 1,536, 1,024 (default), 768, 512, 256, 128, and 64.
  - text-embedding-v3 supports 1,024 (default), 768, 512, 256, 128, and 64.
  - qwen3-vl-embedding supports 2,560 (default), 2,048, 1,536, 1,024, 768, 512, and 256.
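Because the parameter values are entered at deployment time, it can help to check them against the documented ranges beforehand. The following is a minimal sketch of such a check, not part of Hologres or Model Studio: the function names `check_text_params`, `check_embedding_dimension`, and `build_translation_options` are hypothetical, the positive-integer rule for max_tokens is an assumption, and the ranges and dimension lists are copied from the descriptions above.

```python
import json
import warnings

# Allowed dimensions copied from the list above; the first entry is the default.
EMBEDDING_DIMENSIONS = {
    "text-embedding-v4": (1024, 2048, 1536, 768, 512, 256, 128, 64),
    "text-embedding-v3": (1024, 768, 512, 256, 128, 64),
    "qwen3-vl-embedding": (2560, 2048, 1536, 1024, 768, 512, 256),
}


def check_text_params(max_tokens=None, temperature=None, top_p=None):
    """Range-check text-model parameters and return only the ones that were set."""
    params = {}
    if max_tokens is not None:
        # Assumption: must be a positive integer. The per-model upper limit
        # is documented by Model Studio and is not checked here.
        if not isinstance(max_tokens, int) or max_tokens < 1:
            raise ValueError("max_tokens must be a positive integer")
        params["max_tokens"] = max_tokens
    if temperature is not None:
        if not 0 <= temperature < 2.0:
            raise ValueError("temperature must be in [0, 2.0)")
        params["temperature"] = temperature
    if top_p is not None:
        if not 0 < top_p <= 1.0:
            raise ValueError("top_p must be in (0, 1.0]")
        params["top_p"] = top_p
    if temperature is not None and top_p is not None:
        warnings.warn("temperature and top_p are both set; setting only one is recommended")
    return params


def check_embedding_dimension(model, dimension=None):
    """Return the dimension to use, falling back to the model's default."""
    allowed = EMBEDDING_DIMENSIONS.get(model)
    if allowed is None:
        raise ValueError(f"{model} has no adjustable dimension in this sketch")
    if dimension is None:
        return allowed[0]
    if dimension not in allowed:
        raise ValueError(f"{model} does not support dimension {dimension}")
    return dimension


def build_translation_options(source_lang, domains=None, terms=None, tm_list=None):
    """Build the extra_body structure; terms and tm_list are (source, target) pairs."""
    options = {"source_lang": source_lang}
    if domains:
        options["domains"] = domains
    if terms:
        options["terms"] = [{"source": s, "target": t} for s, t in terms]
    if tm_list:
        options["tm_list"] = [{"source": s, "target": t} for s, t in tm_list]
    return {"extra_body": {"translation_options": options}}


if __name__ == "__main__":
    payload = build_translation_options(
        "zh",
        domains="The sentence is from Ali Cloud IT domain.",
        terms=[("生物传感器", "biological sensor")],
    )
    print(json.dumps(payload, ensure_ascii=False, indent=2))
```

The sketch only mirrors the documented ranges. The service performs its own validation, so treat a passing check as a sanity test rather than a guarantee that a deployment will succeed.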
Retry mechanism
When you deploy a model, you can configure the retry mechanism for failed calls. The following parameters are available:
- max_retries: The maximum number of retries. Default value: 2. Value range: [0, 100].
- initial_retry_delay: The initial delay before a retry, in seconds. Default value: 0.5. Value range: [0.5, 8].
- max_retry_delay: The maximum delay for a retry, in seconds. Default value: 8. Value range: [1, 60].
- timeout: The timeout for each request, in seconds. Default value: 600. Value range: [1, 1200].
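To illustrate how these four parameters might interact, the following sketch models a retry loop in Python. It is an illustration only: Hologres performs the retries itself, and the source does not state how the delay grows between attempts. The sketch assumes the delay doubles after each failed attempt and is capped at max_retry_delay. The `RetryConfig` class and `call_with_retries` function are hypothetical names.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class RetryConfig:
    """Retry settings with the documented defaults and value ranges."""
    max_retries: int = 2
    initial_retry_delay: float = 0.5
    max_retry_delay: float = 8.0
    timeout: float = 600.0

    def __post_init__(self):
        if not 0 <= self.max_retries <= 100:
            raise ValueError("max_retries must be in [0, 100]")
        if not 0.5 <= self.initial_retry_delay <= 8:
            raise ValueError("initial_retry_delay must be in [0.5, 8]")
        if not 1 <= self.max_retry_delay <= 60:
            raise ValueError("max_retry_delay must be in [1, 60]")
        if not 1 <= self.timeout <= 1200:
            raise ValueError("timeout must be in [1, 1200]")

    def delays(self):
        """Assumed schedule: double the delay per retry, capped at max_retry_delay."""
        return [
            min(self.initial_retry_delay * 2 ** attempt, self.max_retry_delay)
            for attempt in range(self.max_retries)
        ]


def call_with_retries(func, config=RetryConfig(), sleep=time.sleep):
    """Call func(timeout) once, then retry up to max_retries times on failure."""
    delays = config.delays()
    for attempt in range(config.max_retries + 1):
        try:
            return func(config.timeout)
        except Exception:
            if attempt == config.max_retries:
                raise  # retries exhausted; surface the last error
            sleep(delays[attempt])


if __name__ == "__main__":
    print(RetryConfig().delays())
    print(RetryConfig(max_retries=6).delays())
```

Under the doubling assumption, the defaults allow at most three attempts in total (one initial call plus two retries), and with a larger max_retries the wait between attempts stops growing once it reaches max_retry_delay.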
Model list
Model Studio offers models for text generation, translation, embedding, and multimodal tasks. The following table lists the model category, model_type, task type, input/output, notes, and cross-region support for each model.
| Model category | model_type | Task type | Input/output | Notes | Cross-region support |
| --- | --- | --- | --- | --- | --- |
| Text generation | qwen3.6-plus | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen3-max | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen3-max-2026-01-23 | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen3-max-preview | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen-max | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen-max-latest | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen-plus | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen-plus-latest | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen-flash | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen-long | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen-long-latest | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwq-plus | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwq-plus-latest | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | deepseek-v3.2 | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | deepseek-v3.2-exp | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | deepseek-v3.1 | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | deepseek-r1 | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | deepseek-r1-0528 | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | deepseek-v3 | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | deepseek-r1-distill-qwen-1.5b | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | deepseek-r1-distill-qwen-7b | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | deepseek-r1-distill-qwen-14b | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | deepseek-r1-distill-qwen-32b | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | kimi-k2-thinking | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | Moonshot-Kimi-K2-Instruct | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | glm-4.6 | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | glm-4.7 | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | glm-5 | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | MiniMax-M2.1 | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | MiniMax-M2.5 | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | MiniMax/MiniMax-M2.1 | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | MiniMax/MiniMax-M2.5 | chat/completions | Text input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen3-vl-235b-a22b-instruct | chat/completions | Image/video input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen3-vl-235b-a22b-thinking | chat/completions | Image/video input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen3-vl-32b-instruct | chat/completions | Image/video input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen3-vl-32b-thinking | chat/completions | Image/video input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen3-vl-8b-instruct | chat/completions | Image/video input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen3-vl-8b-thinking | chat/completions | Image/video input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen3-vl-plus | chat/completions | Image/video input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen3-vl-flash | chat/completions | Image/video input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen-vl-ocr | chat/completions | Image input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen-vl-ocr-latest | chat/completions | Image input, text output | Supports parameters (see Parameters) | Yes |
| Text generation | qwen3-omni-flash | chat/completions | Text/image/audio/video input, text/audio output | Supports parameters (see Parameters) | Yes |
| Translation | qwen-mt-plus | translation | ai_translate | Supports parameters (see Parameters) | Yes |
| Translation | qwen-mt-flash | translation | ai_translate | - | Yes |
| Translation | qwen-mt-turbo | translation | ai_translate | - | Yes |
| Translation | qwen-mt-lite | translation | ai_translate | - | Yes |
| Embedding | text-embedding-v1 | embedding | | Embedding dimensions: 1,536 | Yes |
| Embedding | text-embedding-v2 | embedding | | Embedding dimensions: 1,536 | Yes |
| Embedding | text-embedding-v3 | embedding | | Embedding dimensions: 1,024 (default), 768, 512, 256, 128, or 64 | Yes |
| Embedding | text-embedding-v4 | embedding | | Embedding dimensions: 2,048, 1,536, 1,024 (default), 768, 512, 256, 128, or 64 | Yes |
| Embedding | tongyi-embedding-vision-plus | embedding | | Embedding dimensions: 1,152. Video input is only supported in the China (Beijing) and Singapore regions. | Yes |
| Embedding | tongyi-embedding-vision-flash | embedding | | Embedding dimensions: 768. Video input is only supported in the China (Beijing) and Singapore regions. | Yes |
| Embedding | multimodal-embedding-v1 | embedding | | Embedding dimensions: 1,024. Video input is only supported in the China (Beijing) and Singapore regions. | Yes |
| Embedding | qwen3-vl-embedding | embedding | | Embedding dimensions: 2,560 (default), 2,048, 1,536, 1,024, 768, 512, or 256 | Yes |
| Image generation and editing | qwen-image-2.0-pro | image-generation | Supports image editing and text-to-image generation. | - | Yes |
| Image generation and editing | qwen-image-2.0-pro-2026-03-03 | image-generation | Supports image editing and text-to-image generation. | - | Yes |
| Image generation and editing | qwen-image-2.0 | image-generation | Supports image editing and text-to-image generation. | - | Yes |
| Image generation and editing | qwen-image-2.0-2026-03-03 | image-generation | Supports image editing and text-to-image generation. | - | Yes |
| Image generation and editing | qwen-image-max | image-generation | Supports text-to-image generation. | - | Yes |
| Image generation and editing | qwen-image-plus | image-generation | Supports text-to-image generation. | - | Yes |
| Image generation and editing | qwen-image | image-generation | Supports text-to-image generation. | - | Yes |
| Image generation and editing | qwen-image-edit | image-generation | Supports image editing. | - | Yes |
| Image generation and editing | qwen-image-edit-plus | image-generation | Supports image editing. | - | Yes |
| Image generation and editing | qwen-image-edit-max | image-generation | Supports image editing. | - | Yes |
| Image generation and editing | wan2.7-image-pro | image-generation | Text-to-image generation (single images only); supports 4K HD output. | - | Yes |
| Image generation and editing | wan2.7-image | image-generation | Supports text-to-image generation. | - | Yes |
| Video generation | wan2.6-r2v | video-generation | Reference-to-video (r2v) generation | - | Yes |
| Video generation | wan2.6-r2v-flash | video-generation | Reference-to-video (r2v) generation | - | Yes |
| Video generation | wan2.6-t2v | video-generation | Text-to-video (t2v) generation | - | Yes |
| Video generation | wan2.6-i2v-flash | video-generation | Image-to-video (i2v) generation from the first frame | - | Yes |
| Video generation | wan2.6-i2v | video-generation | Image-to-video (i2v) generation from the first frame | - | Yes |
| Video generation | wan2.2-kf2v-flash | video-generation | Keyframe-to-video (kf2v) generation from the first and last frames | - | Yes |
| Video generation | wan2.7-t2v | video-generation | Text-to-video generation | - | Yes |
| Video generation | wan2.7-i2v | video-generation | Multimodal input (text, image, video, or audio) to video output | - | Yes |
| Video generation | wan2.7-r2v | video-generation | Multimodal input (text, image, video, or audio) to video output | - | Yes |
| Video generation | wan2.7-videoedit | video-generation | Video generation and editing from multimodal input (text, image, or video) | - | Yes |
| Video generation | happyhorse-1.0-t2v | video-generation | Text-to-video generation | Only supported in China (Hangzhou) and China (Shenzhen). | Yes |
| Video generation | happyhorse-1.0-i2v | video-generation | Image-to-video (i2v) generation from the first frame | Only supported in China (Hangzhou) and China (Shenzhen). | Yes |
| Video generation | happyhorse-1.0-r2v | video-generation | Reference-to-video (r2v) generation | Only supported in China (Hangzhou) and China (Shenzhen). | Yes |
| Video generation | happyhorse-1.0-video-edit | video-generation | Video editing | Only supported in China (Hangzhou) and China (Shenzhen). | Yes |
| Speech-to-text | fun-asr | speech-to-text | Speech recognition | - | Yes |
Model usage
Once a model is deployed, you can use an AI Function in Hologres to call it. This lets you perform inference and build AI applications without moving data from your database. For usage instructions, see the AI Function List. For a best practice example, see Best Practice: Use AI Functions to Build a High-Performance Image Analysis System for Autonomous Driving.