AI models and deployment

This topic describes how to deploy and use the built-in models on the Hologres AI Node and lists the available models. To call a built-in model, you must first purchase an AI resource. After a model is deployed, you can call it using an AI function.

Prerequisites

Calling a built-in model on an AI Node requires an AI resource (GPU), which you must purchase first. For more information, see AI resource pricing and purchasing.

Model deployment

Deployment notes

  • You can select and deploy a model that fits your business scenario. Each model requires at least the minimum AI resources listed for it in the Available models table.

  • You can deploy multiple models on a single instance, but the total resources they use must not exceed your purchased AI resource quota. If your AI resources are insufficient, scale up the AI resources before you deploy more models.

  • Primary/secondary instances: You can deploy models and perform model-related operations, such as changing resources or deleting models, only on the primary instance. A secondary instance can view the models deployed on the primary instance and call them using an AI function.
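
The quota rule above amounts to a sum check across the models you plan to deploy. The following Python sketch illustrates that arithmetic using minimums from the Available models table. It is not a Hologres API: the `Resources` type, the `fits_quota` helper, and the quota figures are hypothetical, and it assumes the quota is tracked separately for vCPUs, memory, and GPU memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resources:
    """Hypothetical container for one model's resources or for a quota."""
    vcpus: int
    memory_gb: int
    gpu_memory_gb: int


def fits_quota(planned, quota):
    """Return True if the summed resources of all planned models fit the quota."""
    total_vcpus = sum(r.vcpus for r in planned)
    total_memory = sum(r.memory_gb for r in planned)
    total_gpu_memory = sum(r.gpu_memory_gb for r in planned)
    return (
        total_vcpus <= quota.vcpus
        and total_memory <= quota.memory_gb
        and total_gpu_memory <= quota.gpu_memory_gb
    )


# Minimums copied from the Available models table.
qwen3_8b = Resources(vcpus=7, memory_gb=30, gpu_memory_gb=32)
bge_base_zh = Resources(vcpus=7, memory_gb=30, gpu_memory_gb=12)
qwen3_1_7b = Resources(vcpus=7, memory_gb=30, gpu_memory_gb=8)

# Assumed quota, for illustration only.
quota = Resources(vcpus=32, memory_gb=128, gpu_memory_gb=48)

two_models_fit = fits_quota([qwen3_8b, bge_base_zh], quota)
three_models_fit = fits_quota([qwen3_8b, bge_base_zh, qwen3_1_7b], quota)
```

With the assumed quota, the first two models fit (44 GB of GPU memory in total), while adding the third pushes GPU memory to 52 GB and would require scaling up.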

Procedure

  1. Log on to the Hologres console and select a region in the upper-left corner.

  2. In the left-side navigation pane, click Instances, and then click the ID of the target instance.

  3. On the Instance Details page, click the AI Node tab.

  4. In the Models section, click Deploy Model.

  5. In the Deploy Model dialog box, enter a Model Name and select a Model Type. The system automatically populates the parameters in the Resource Configurations section based on the selected Model Type. Each model has a recommended minimum resource requirement. Allocate appropriate resources for the selected model to achieve optimal performance.

  6. After you complete the configuration, click OK to deploy the model.

    You can view the deployment status in the Models section and perform the following operations:

    • To adjust model configurations, click Modify Configuration in the Actions column of the target model.

    • To delete the model, click Delete in the Actions column of the target model.

      Note

      When you delete a model, Hologres does not check whether the model is in use. Proceed with caution.

Using a model

After you successfully deploy a model, you can call it using an AI function in Hologres. For more information, see AI function.

Available models

Hologres provides built-in models for various AI use cases. You can deploy models based on your business scenario and then call them using an AI function. The following table lists the available built-in models.

| Category | Model name | Minimum vCPUs | Minimum memory (GB) | Minimum GPUs | Minimum GPU memory (GB) | Supported versions | Notes |
| --- | --- | --- | --- | --- | --- | --- | --- |
| PDF conversion model | ds4sd/docling-models | 20 | 100 | 1 or more | 48 | Hologres V4.0 and later | |
| Text chunking | recursive-character-text-splitter | 15 | 30 | 0 | 0 | Hologres V3.2 and later | Select the vCPU specification based on your business workload. You do not need to specify the number of GPUs. |
| Multimodal model | Qwen/Qwen2.5-VL-3B-Instruct | 7 | 24 | 1 or more | 24 | Hologres V4.0 and later | |
| Multimodal model | Qwen/Qwen2.5-VL-7B-Instruct | 7 | 30 | 1 or more | 48 | Hologres V4.0 and later | |
| Multimodal model | Qwen/Qwen2.5-VL-32B-Instruct | 7 | 30 | 1 or more | 96 | Hologres V4.0 and later | |
| Image embedding model | clip-ViT-B-32-multilingual-v1 | 7 | 24 | 1 | 24 | Hologres V4.0 and later | Image patch size: 32×32; parameters: 88M; output vector dimension: 512 |
| LLM | Qwen/Qwen3-1.7B | 7 | 30 | 1 or more | 8 | Hologres V3.2 and later | |
| LLM | Qwen/Qwen3-4B | 7 | 30 | 1 or more | 16 | Hologres V3.2 and later | |
| LLM | Qwen/Qwen3-8B | 7 | 30 | 1 or more | 32 | Hologres V3.2 and later | |
| LLM | Qwen/Qwen3-14B | 7 | 30 | 1 or more | 48 | Hologres V3.2 and later | |
| LLM | Qwen/Qwen3-32B | 7 | 30 | 1 or more | 96 | Hologres V3.2 and later | |
| Sentiment classification | iic/nlp_structbert_sentiment-classification_chinese-base | 7 | 30 | 1 | 4 | Hologres V3.2 and later | |
| Text embedding model | iic/nlp_gte_sentence-embedding_chinese-base | 7 | 30 | 1 | 12 | Hologres V3.2 and later | Output vector dimension: 768 |
| Text embedding model | iic/nlp_gte_sentence-embedding_chinese-large | 7 | 30 | 1 | 16 | Hologres V3.2 and later | Output vector dimension: 1024 |
| Text embedding model | iic/nlp_gte_sentence-embedding_chinese-small | 7 | 30 | 1 | 8 | Hologres V3.2 and later | Output vector dimension: 512 |
| Text embedding model | Qwen/Qwen3-Embedding-0.6B | 7 | 30 | 1 | 8 | Hologres V3.2 and later | |
| Text embedding model | Qwen/Qwen3-Embedding-4B | 7 | 30 | 1 | 32 | Hologres V3.2 and later | |
| Text embedding model | Qwen/Qwen3-Embedding-8B | 7 | 30 | 1 | 48 | Hologres V3.2 and later | |
| Text embedding model | BAAI/bge-base-en-v1.5 | 7 | 30 | 1 | 12 | Hologres V3.2 and later | Output vector dimension: 768 |
| Text embedding model | BAAI/bge-base-zh-v1.5 | 7 | 30 | 1 | 12 | Hologres V3.2 and later | Output vector dimension: 768 |
| Text embedding model | BAAI/bge-large-en-v1.5 | 7 | 30 | 1 | 16 | Hologres V3.2 and later | Output vector dimension: 1024 |
| Text embedding model | BAAI/bge-large-zh-v1.5 | 7 | 30 | 1 | 16 | Hologres V3.2 and later | Output vector dimension: 1024 |
| Text embedding model | BAAI/bge-small-en-v1.5 | 7 | 30 | 1 | 8 | Hologres V3.2 and later | Output vector dimension: 384 |
| Text embedding model | BAAI/bge-small-zh-v1.5 | 7 | 30 | 1 | 8 | Hologres V3.2 and later | Output vector dimension: 512 |
| Image embedding model | clip-ViT-B-32 | 7 | 24 | 1 | 24 | Hologres V4.0 and later | Image patch size: 32×32; parameters: 88M; output vector dimension: 512 |
| Image embedding model | clip-ViT-L-14 | 7 | 24 | 1 | 24 | Hologres V4.0 and later | Image patch size: 14×14; parameters: 304M; output vector dimension: 768 |
| Image embedding model | clip-ViT-B-16 | 7 | 24 | 1 | 24 | Hologres V4.0 and later | Image patch size: 16×16; parameters: 88M; output vector dimension: 512 |
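
When shortlisting models, the version and GPU memory columns of the table can be applied as a simple filter. The Python sketch below shows one way to do this for a handful of rows copied from the table. The `MODELS` list and the `deployable` helper are hypothetical names used only for illustration, and the sketch checks only the Hologres version and GPU memory, not vCPUs or memory.

```python
# A few rows from the table:
# (model name, minimum GPU memory in GB, minimum Hologres version as (major, minor))
MODELS = [
    ("Qwen/Qwen3-1.7B", 8, (3, 2)),
    ("Qwen/Qwen3-8B", 32, (3, 2)),
    ("Qwen/Qwen2.5-VL-3B-Instruct", 24, (4, 0)),
    ("BAAI/bge-small-en-v1.5", 8, (3, 2)),
]


def deployable(instance_version, gpu_memory_gb):
    """Return the names of models whose version and GPU memory minimums are met.

    instance_version is a (major, minor) tuple, for example (3, 2) for V3.2.
    Tuples compare element by element, so (4, 0) >= (3, 2) is True.
    """
    return [
        name
        for name, min_gpu_memory, min_version in MODELS
        if instance_version >= min_version and gpu_memory_gb >= min_gpu_memory
    ]


# A V3.2 instance with 16 GB of GPU memory available (assumed figures).
candidates_v32 = deployable((3, 2), 16)

# A V4.0 instance with 24 GB of GPU memory available (assumed figures).
candidates_v40 = deployable((4, 0), 24)
```

In this sketch, the V3.2 instance can host only the two models that need 8 GB of GPU memory, while the V4.0 instance with 24 GB can also host Qwen/Qwen2.5-VL-3B-Instruct.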