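Configuration template: PAI Model Gallery

Last updated: 2025-03-28 07:43:20

Model Gallery provides a rich model library that covers AI scenarios such as large language models (LLM), AI-generated content (AIGC), computer vision (CV), natural language processing (NLP), and speech. You can train a model with one click (hyperparameter configuration is supported), compress it, evaluate it, and deploy it to quickly validate your business requirements. This topic describes how to use a model deployed on PAI as a model inference service in Alibaba Cloud Elasticsearch (ES).

Prerequisites

Step 1: Deploy a model in PAI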

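  1. This example deploys an embedding model. Go to the PAI Model Gallery page, click Scenario > Natural Language Processing > Embedding, and then select the bge-m3 general-purpose embedding model. Other models are deployed in the same way.

    Note

    We recommend that you deploy the service in the same region as ES and then configure the same VPC as the ES cluster. ES can then access the deployed service over the private network, which gives lower latency and a more stable connection.

  2. Click Deploy in the lower-right corner of the bge-m3 general-purpose embedding model card to open the model deployment page. To call the service through a VPC address, set Virtual Private Cloud (VPC) in the deployment configuration to the same VPC as the ES cluster.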


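    Parameter              Description
    Deployment method      The deployment method. Two methods are supported, vLLM accelerated deployment and FlagEmbedding, each with its own instance types.
    Basic information      The service name. You can specify a custom name.
    Resource deployment    The resource specifications. Choose them based on your business requirements.
    VPC                    The virtual private cloud. It must be the same VPC as the ES cluster.
    Service features       The service features. Configure them based on your business requirements.
    Service configuration  The service configuration. Configure it based on your business requirements.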

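  3. Configure the parameters on the model deployment page, and then click Deploy at the bottom of the page to deploy the model.

  4. In the left-side navigation pane, click Model Deployment > Elastic Algorithm Service (EAS).

    Note

    When the status of the model service is Running, you can call the model API.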


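  5. Click the model name in the Name/ID column to open the Overview tab of the model.

  6. Click View Invocation Information to view the URL and Token used to call the model.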
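
Before you register the service in ES, it can help to confirm that the URL and Token work on their own. The following is a minimal sketch that builds such a request with the Python standard library without sending it. The body shape `{"input": [...]}` and the `Bearer` header format are taken from the ES template in this topic and are assumed to be accepted by the deployed service. The function name `build_eas_request` and the placeholder values are hypothetical.

```python
import json
import urllib.request


def build_eas_request(url: str, token: str, texts: list[str]) -> urllib.request.Request:
    """Build a POST request for a deployed embedding service (it is not sent here)."""
    body = json.dumps({"input": texts}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            # Assumption: the service accepts the same Bearer header that the
            # ES template in this topic sends.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json;charset=utf-8",
        },
    )


# Placeholder values; replace them with the URL and Token from the invocation information.
req = build_eas_request("http://xxx.cn-hangzhou.pai-eas.aliyuncs.com/", "xxx", ["hello"])
# To send the request: urllib.request.urlopen(req, timeout=10)
```

If the request fails from a machine outside the VPC, that alone does not mean ES cannot reach the service, because ES may be using the private network address.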

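Step 2: Create a model inference service in Alibaba Cloud Elasticsearch for the model on PAI

Note

You can run the following code in Kibana of your Alibaba Cloud ES instance to create the model inference service.

The procedure for each task type is as follows:

text_embedding type
completion type

Syntax template for creating a text_embedding model: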

PUT _inference/text_embedding/pai_embedding
{
  "service":"alibaba-cloud-custom-model",
  "service_settings":{
    "secret_parameters":{
      "api_key":"<replace with your api_key>"
    },
    "url":"<replace with your service URL>",
    "path":{
      "<replace with your service path>":{
        "POST":{
          "headers":{
            "Authorization": "Bearer ${api_key}",
            "Content-Type": "application/json;charset=utf-8"
          },
          "request":{
            "format":"string",
            "content":"""
            {
              "input":${input}, 
              "embedding_type":"dense"
            }
            """
          },
          "response":{
            "json_parser":{
              "text_embeddings":"$.data[*].embedding"
            }
          }
        }
      }
    }
  }
}
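
The template mixes two kinds of placeholders: `${input}` appears without quotes, while `${api_key}` sits inside a quoted string. The sketch below is an illustrative model of why that matters, not the service's actual implementation. It assumes that list values are inserted as JSON and that string values are inserted as raw text without escaping. `render_template` is a hypothetical helper.

```python
import json
import re


def render_template(template: str, values: dict) -> str:
    """Replace each ${name} placeholder with its value.

    Strings are inserted as raw text, so the template must supply the
    surrounding quotes. Other values (lists, numbers) are inserted as JSON.
    """
    def substitute(match: re.Match) -> str:
        value = values[match.group(1)]
        return value if isinstance(value, str) else json.dumps(value)

    return re.sub(r"\$\{(\w+)\}", substitute, template)


content = '{"input": ${input}, "embedding_type": "dense"}'
rendered_body = render_template(content, {"input": ["hello", "world"]})
rendered_header = render_template("Bearer ${api_key}", {"api_key": "xxx"})
```

Under these assumptions, wrapping `${input}` in quotes would turn the array into an invalid JSON string, which is why the template leaves it bare.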

Example:

PUT _inference/text_embedding/pai_embedding
{
  "service":"alibaba-cloud-custom-model",
  "service_settings":{
    "secret_parameters":{
      "api_key":"xxx"
    },
    "url":"http://xxx.cn-hangzhou.pai-eas.aliyuncs.com",
    "path":{
      "/":{
        "POST":{
          "headers":{
            "Authorization": "Bearer ${api_key}",
            "Content-Type": "application/json;charset=utf-8"
          },
          "request":{
            "format":"string",
            "content":"""
            {
              "input":${input}, 
              "embedding_type":"dense"
            }
            """
          },
          "response":{
            "json_parser":{
              "text_embeddings":"$.data[*].embedding"
            }
          }
        }
      }
    }
  }
}
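
The `json_parser` setting tells ES where to find the vectors in the model service's response. The JSONPath `$.data[*].embedding` implies a response body with a `data` array whose items each carry an `embedding` field. The sketch below shows the equivalent extraction in plain Python. The sample payload and its three-dimensional vectors are made up for illustration.

```python
def extract_embeddings(upstream: dict) -> list[list[float]]:
    """Plain-Python equivalent of the JSONPath $.data[*].embedding."""
    return [item["embedding"] for item in upstream["data"]]


# Hypothetical upstream response; real vectors are much longer.
sample = {"data": [{"embedding": [0.1, 0.2, 0.3]}, {"embedding": [0.4, 0.5, 0.6]}]}
vectors = extract_embeddings(sample)
```

If your service returns a different shape, the path under `text_embeddings` needs to be adjusted to match it.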

Call the model:

POST _inference/text_embedding/pai_embedding
{
  "input":["hello", "world"]
}
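
The same call can be made from application code instead of Kibana. The following sketch builds the equivalent HTTP request with the Python standard library and does not send it. The ES address, the user name, and the password are placeholder assumptions, and `build_inference_request` is a hypothetical helper.

```python
import base64
import json
import urllib.request


def build_inference_request(es_url, endpoint_id, texts, user, password):
    """Build POST /_inference/text_embedding/<endpoint_id> with basic authentication."""
    credentials = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        f"{es_url.rstrip('/')}/_inference/text_embedding/{endpoint_id}",
        data=json.dumps({"input": texts}).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Basic {credentials}",
            "Content-Type": "application/json",
        },
    )


# Placeholder address and credentials; replace them with those of your ES instance.
es_req = build_inference_request(
    "http://localhost:9200", "pai_embedding", ["hello", "world"], "elastic", "changeme"
)
# To send the request: urllib.request.urlopen(es_req, timeout=30)
```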

Response:

{
  "text_embedding": [
    {
      "embedding": [
        -0.016567165,
        -0.015161497,
        ...
      ]
    },
    {
      "embedding": [
        -0.023222955,
        0.031465773,
        ...
      ]
    }
  ]
}
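
One common way to use the returned vectors is to compare them. The sketch below reads the `text_embedding` array in the shape shown above and computes the cosine similarity between the two vectors. The two-dimensional vectors are made-up stand-ins, because the real values are elided in the response above.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up stand-in with the same structure as the response above.
response = {"text_embedding": [{"embedding": [1.0, 0.0]}, {"embedding": [0.0, 1.0]}]}
first, second = (item["embedding"] for item in response["text_embedding"])
score = cosine_similarity(first, second)
```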

Syntax template for creating a completion model:

PUT _inference/completion/pai_deepseek
{
  "service":"alibaba-cloud-custom-model",
  "service_settings":{
    "secret_parameters":{
      "api_key":"<replace with your api_key>"
    },
    "url":"<replace with your service URL>",
    "path":{
      "<replace with your service path>":{
        "POST":{
          "headers":{
            "Authorization": "Bearer ${api_key}"
          },
          "request":{
            "format":"string",
            "content":"""
            {
              "prompt":"${prompt}",
              "max_tokens":"${max_tokens}"
            }
            """
          },
          "response":{
            "json_parser":{
              "completion_result":"$.choices[*].text"
            }
          }
        }
      }
    }
  },
  "task_settings":{
    "parameters":{
      "max_tokens":"300"
    }
  }
}
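
The template reads `${prompt}` and `${max_tokens}` from parameters, but only `max_tokens` is stored with the endpoint in `task_settings`. The call in this topic supplies `prompt` at request time. The sketch below models how the two sources could combine. That request-level values override stored defaults is an assumption, and `merge_parameters` is a hypothetical helper.

```python
def merge_parameters(endpoint_defaults: dict, request_parameters: dict) -> dict:
    """Combine stored defaults with per-request values.

    Assumption: a value given in the request wins over the stored default.
    """
    return {**endpoint_defaults, **request_parameters}


defaults = {"max_tokens": "300"}  # stored in task_settings when the endpoint is created
params = merge_parameters(defaults, {"prompt": "what is elastic search"})
```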

Example:

PUT _inference/completion/pai_deepseek
{
  "service":"alibaba-cloud-custom-model",
  "service_settings":{
    "secret_parameters":{
      "api_key":"xxx"
    },
    "url":"http://xxx.cn-hangzhou.pai-eas.aliyuncs.com",
    "path":{
      "/api/predict/xxx/v1/completions":{
        "POST":{
          "headers":{
            "Authorization": "Bearer ${api_key}"
          },
          "request":{
            "format":"string",
            "content":"""
            {
              "prompt":"${prompt}",
              "max_tokens":"${max_tokens}"
            }
            """
          },
          "response":{
            "json_parser":{
              "completion_result":"$.choices[*].text"
            }
          }
        }
      }
    }
  },
  "task_settings":{
    "parameters":{
      "max_tokens":"300"
    }
  }
}
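
The example keeps the host in `url` and the rest of the address in the `path` key. If the invocation information gives you a single full address (an assumption about how it is displayed), a small helper can split it into the two fields. `split_endpoint` is a hypothetical name, and the address below reuses the placeholder from the example.

```python
from urllib.parse import urlsplit


def split_endpoint(full_url: str) -> tuple[str, str]:
    """Split a full service address into the `url` and `path` template fields."""
    parts = urlsplit(full_url)
    base = f"{parts.scheme}://{parts.netloc}"
    path = parts.path or "/"  # fall back to "/" when the address has no path
    return base, path


base, path = split_endpoint(
    "http://xxx.cn-hangzhou.pai-eas.aliyuncs.com/api/predict/xxx/v1/completions"
)
```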

Call the model:

POST _inference/completion/pai_deepseek
{
  "input":"",
  "task_settings":{
    "parameters":{
      "prompt":"what is elastic search"
    }
  }
}

Response:

{
  "completion": [
    {
      "result": """ and how is it used?
Elastic Search is a search engine that's built on top of Lucene, a search engine framework. It's designed to store, search, and analyze text documents efficiently. The key feature of Elastic Search is its ability to index and query structured data in real-time. It can build indices from existing data sources like databases or APIs and then query that data as if it were in-memory. Elastic Search is often used in applications that require fast search and analytics, such as search engines, customer relationship management systems, and big data platforms.

How to use it:

1. Storing Data:

Elastic Search primarily uses the NoSQL document model to store data. Each document is represented as a JSON object, which makes it easy to work with structured data.

2. Indexing:

Elastic Search provides several ways to index data. The most common method is through the REST API, where you can upload data in bulk or in real-time. It also supports indexing data from databases, APIs, or even cloud storage.

3. Querying:

Elastic Search supports various querying mechanisms. It has a simple query syntax that allows you to search for specific fields within documents. It also supports aggregate queries, which allow you to perform aggregations like counts, sums, and averages over your data. Additionally, Elastic Search allows you to match phrases, ranges, and more.

4. Updating and Deleting Data:

Elastic Search allows you to update and delete documents from your index. This"""
    }
  ]
}
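
The generated text is returned under `completion[*].result`. The sketch below reads it from a response of that shape. The sample is shortened to the first sentence of the output above, and `completion_texts` is a hypothetical helper.

```python
def completion_texts(response: dict) -> list[str]:
    """Return the generated text of every item in a completion response."""
    return [item["result"] for item in response["completion"]]


# Shortened stand-in for the response above.
sample_response = {"completion": [{"result": " and how is it used?"}]}
texts = completion_texts(sample_response)
```

The output above starts by continuing the prompt and stops mid-sentence, which is likely because the `max_tokens` limit of 300 was reached. Raise `max_tokens` if you need longer answers.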
