AI Function

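Starting from V3.2, Hologres officially supports AI Functions and provides operators such as Embedding and RankLLM. You can call AI Functions directly in standard SQL, without a separate inference service, to support AI scenarios such as building enterprise knowledge bases and running inference over them.

Prerequisites

  1. Purchase AI resources.

  2. Deploy a model.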

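AI Function summary table

Hologres currently supports the AI Functions listed in the following table.

  • Each AI Function is assigned a default best-fit model based on the models you have deployed. You can query the system table to see the default model assigned to each AI Function.

  • When you call an AI Function, you can omit the model name, in which case the system uses the default model. To change the model mapped to an AI Function, update the system table. For details, see Change the model mapped to an AI Function.

  • Different models require different amounts of AI resources to deploy. Purchase a specification that fits your business needs.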

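AI Function          | Description                                                                                       | Supported models                                                                | Supported versions
---------------------|---------------------------------------------------------------------------------------------------|---------------------------------------------------------------------------------|-------------------
ai_embed             | Computes a fixed-dimension continuous vector for the given text.                                  | iic/nlp_gte_sentence-embedding_chinese series; Qwen/Qwen3-Embedding-XB series   | V3.2 and later
ai_rank              | Scores the relevance of the given text.                                                           | Qwen3 series large models. Qwen/Qwen3-32B is recommended.                       | V3.2 and later
ai_chunk             | Splits long text into chunks.                                                                     | recursive-character-text-splitter                                               | V3.2 and later
ai_gen               | Calls a large language model with a prompt and returns the inference result.                      | Qwen3 series large models. Qwen/Qwen3-32B is recommended.                       | V3.2 and later
ai_classify          | Classifies the input text according to the provided classification labels.                        | Qwen3 series large models. Qwen/Qwen3-32B is recommended.                       | V3.2 and later
ai_extract           | Extracts the specified label information from the input text.                                     | Qwen3 series large models. Qwen/Qwen3-32B is recommended.                       | V3.2 and later
ai_mask              | Masks the specified label information in the input text, using [MASKED] as the placeholder.       | Qwen3 series large models. Qwen/Qwen3-32B is recommended.                       | V3.2 and later
ai_fix_grammar       | Fixes grammatical errors in the input text.                                                       | Qwen3 series large models. Qwen/Qwen3-32B is recommended.                       | V3.2 and later
ai_summarize         | Generates a summary of a piece of text.                                                           | Qwen3 series large models. Qwen/Qwen3-32B is recommended.                       | V3.2 and later
ai_translate         | Translates the input text into the specified language.                                            | Qwen3 series large models. Qwen/Qwen3-32B is recommended.                       | V3.2 and later
ai_similarity        | Computes the similarity between two input texts.                                                  | Qwen3 series large models. Qwen/Qwen3-32B is recommended.                       | V3.2 and later
ai_analyze_sentiment | Performs sentiment analysis on the input text.                                                    | Qwen3 series large models. Qwen/Qwen3-32B is recommended.                       | V3.2 and later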

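Using AI Functions

ai_embed

  • Description: Computes a fixed-dimension continuous vector for the input text.

    SELECT ai_embed([model,] content)
  • Parameters

    • model: Optional. The name of the model used by the AI Function. By default, the system automatically assigns the best model based on the deployed models. To switch models, first deploy the target model and then update the system table configuration. For details, see Change the model mapped to an AI Function.

    • content: Required. The input text. Character types (CHAR, VARCHAR, TEXT) are supported.

  • Return value

    • If content is NULL or an empty string, NULL is returned.

    • The dimension of the returned vector depends on the model that is called. The supported models and their vector dimensions are as follows:

      Model name                                   | Vector dimension
      ---------------------------------------------|-----------------
      iic/nlp_gte_sentence-embedding_chinese-small | 512
      iic/nlp_gte_sentence-embedding_chinese-base  | 768
      iic/nlp_gte_sentence-embedding_chinese-large | 1024
      Qwen/Qwen3-Embedding-0.6B                    | 1024
      Qwen/Qwen3-Embedding-4B                      | 2560
      Qwen/Qwen3-Embedding-8B                      | 4096

  • Example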

    SELECT ai_embed('Hologres是阿里巴巴自主研发的一站式实时数仓引擎,支持海量数据实时写入、实时更新、实时加工、实时分析.');

    The following result is returned.

    ai_embed
    -------
    {-0.020090256,-0.009496426,-0.01584659,0.014317295,-0.04607893,...,-0.044190194,-0.02165721,-0.057956327}
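
Depending on the client driver, the returned vector may reach application code as a text literal of the form {v1,v2,...}; that is an assumption, not something this page states. Under that assumption, the following Python sketch parses such a literal and checks its length against the dimension table above. The helper names parse_vector and check_dimension are hypothetical.

```python
from typing import Dict, List

# Expected vector dimensions per model, copied from the table above.
EMBED_DIMS: Dict[str, int] = {
    "iic/nlp_gte_sentence-embedding_chinese-small": 512,
    "iic/nlp_gte_sentence-embedding_chinese-base": 768,
    "iic/nlp_gte_sentence-embedding_chinese-large": 1024,
    "Qwen/Qwen3-Embedding-0.6B": 1024,
    "Qwen/Qwen3-Embedding-4B": 2560,
    "Qwen/Qwen3-Embedding-8B": 4096,
}


def parse_vector(literal: str) -> List[float]:
    """Parse a '{v1,v2,...}' array literal into a list of floats."""
    body = literal.strip()
    if not (body.startswith("{") and body.endswith("}")):
        raise ValueError("expected a '{...}' array literal")
    inner = body[1:-1].strip()
    # float() accepts scientific notation such as 1.5420817E-4.
    return [float(item) for item in inner.split(",")] if inner else []


def check_dimension(model: str, vector: List[float]) -> bool:
    """Return True if the vector length matches the model's documented dimension."""
    return len(vector) == EMBED_DIMS[model]
```

A length mismatch usually means the function is mapped to a different model than the one the downstream vector column was sized for.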

ai_rank

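  • Description: Scores the relevance of the given document against a source sentence.

    SELECT ai_rank([model,] source_sentence, sentence_to_compare);
  • Parameters

    • model: Optional. The name of the model used by the AI Function. By default, the system automatically assigns the best model based on the deployed models. To switch models, first deploy the target model and then update the system table configuration. For details, see Change the model mapped to an AI Function.

    • source_sentence: Required. The source text. Character types (CHAR, VARCHAR, TEXT) are supported.

    • sentence_to_compare: Required. The sentence to compare with source_sentence. Character types (CHAR, VARCHAR, TEXT) are supported.

  • Return value

    • Returns a relevance score of the FLOAT type in the range [0, 1]. A larger value indicates higher relevance.

    • If either source_sentence or sentence_to_compare is NULL, 0 is returned.

  • Example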

    SELECT knowledge, ai_rank('阿里巴巴2024年营收是多少?', knowledge) AS score
      FROM (
          VALUES ('Amazon 2024年营收6380亿美元'), 
                 ('Alibaba 2024年营收9411.68亿元'), 
                 ('阿里巴巴2023年的营收8686.87亿元')
      ) AS knowledge_table(knowledge)
      ORDER BY score DESC;

    The following result is returned.

    knowledge	                  | score
    -----------------------------|-------
    Alibaba 2024年营收9411.68亿元	 |0.899999976
    阿里巴巴2023年的营收8686.87亿元 |0.200000003
    Amazon 2024年营收6380亿美元	   |0.100000001

ai_chunk

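  • Description: Splits long text into chunks.

    SELECT ai_chunk([model,] long_sentence[, chunk_size, chunk_overlap, separators])
  • Parameters

    Parameter     | Description
    --------------|------------
    model         | Optional. The name of the model used by the AI Function. By default, the system automatically assigns the best model based on the deployed models. To switch models, first deploy the target model and then update the system table configuration. For details, see Change the model mapped to an AI Function.
    long_sentence | Required. The source text to split. Character types (CHAR, VARCHAR, TEXT) are supported.
    chunk_size    | Optional. The length of each chunk. INT type. Default: 300.
    chunk_overlap | Optional. The overlap length between adjacent chunks, which keeps a sentence that is cut across two chunks from losing its meaning. INT type. Default: 50.
    separators    | Optional. The separators used to split chunks. TEXT[] type. Default: ["\n\n", "\n", " ", ""], which suits English text. For Chinese text, Chinese separators are recommended: ["\n\n", "\n", "。", "!", "?", ";", ",", " "].

  • Return value

    • Returns a TEXT[] value that contains the list of chunks.

    • If long_sentence is NULL, NULL is returned.

  • Example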

    SELECT ai_chunk('Hologres是阿里巴巴自主研发的一站式实时数仓引擎(Real-Time Data Warehouse),支持海量数据实时写入、实时更新、实时加工、实时分析,支持标准SQL(兼容PostgreSQL协议和语法,支持大部分PostgreSQL函数),支持PB级数据多维分析(OLAP)与即席分析(Ad Hoc),支持高并发低延迟的在线数据服务(Serving),支持多种负载的细粒度隔离与企业级安全能力,与MaxCompute、Flink、DataWorks深度融合,提供企业级离在线一体化全栈数仓解决方案。',40,10);

    The following result is returned.

    ai_chunk
    ---
    "{"Hologres是阿里巴巴自主研发的一站式实时数仓引擎(Real-Time Data","Warehouse),支持海量数据实时写入、实时更新、实时加工、实时分析,支持标","工、实时分析,支持标准SQL(兼容PostgreSQL协议和语法,支持大部分Po","语法,支持大部分PostgreSQL函数),支持PB级数据多维分析(OLAP)与","维分析(OLAP)与即席分析(Ad","Hoc),支持高并发低延迟的在线数据服务(Serving),支持多种负载的细粒度","支持多种负载的细粒度隔离与企业级安全能力,与MaxCompute、Flink、D","te、Flink、DataWorks深度融合,提供企业级离在线一体化全栈数仓解","线一体化全栈数仓解决方案。"}"

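To make the interaction between chunk_size and chunk_overlap concrete, here is a minimal Python sketch of fixed-window chunking with overlap. It is a deliberate simplification and not the recursive-character-text-splitter that ai_chunk uses: it ignores separators entirely, whereas the real splitter prefers to cut at separators, which is why the chunks in the output above vary in length. The function name chunk_with_overlap is hypothetical.

```python
from typing import List


def chunk_with_overlap(text: str, chunk_size: int = 300, chunk_overlap: int = 50) -> List[str]:
    """Split text into windows of chunk_size characters that share chunk_overlap characters."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks: List[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the current window already reaches the end of the text
        start += step
    return chunks


# Each chunk starts with the last character of the previous one (overlap of 1).
print(chunk_with_overlap("abcdefghij", chunk_size=4, chunk_overlap=1))
```

A larger overlap repeats more context at chunk boundaries at the cost of producing more chunks for the same text.
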
ai_gen

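  • Description: Calls a large language model with a prompt and returns the inference result.

    SELECT ai_gen([model,] text)
  • Parameters

    • model: Optional. The name of the model used by the AI Function. By default, the system automatically assigns the best model based on the deployed models. To switch models, first deploy the target model and then update the system table configuration. For details, see Change the model mapped to an AI Function.

    • text: The prompt to send to the model. Character types (CHAR, VARCHAR, TEXT) are supported.

  • Return value

    • Returns the model's answer to the prompt.

    • If text is NULL, NULL is returned.

    • If text is an empty string (""), an empty string ("") is returned.

  • Example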

    CREATE TABLE questions (
        question TEXT
    );
    
    INSERT INTO questions (question) VALUES
      ('什么是人工智能?'),
      ('如何提高英语口语水平?'),
      ('健康饮食有哪些注意事项?');
    
    SELECT
      question,
      ai_gen('请用 20 个字回答如下问题: ' || question) AS answer
    FROM
      questions;

    The following result is returned.

     question            |	answer
      --------------------|-------
    如何提高英语口语水平?	  |多说多练,模仿发音,积累词汇,勇于开口。
    健康饮食有哪些注意事项?	|均衡搭配,控制油盐糖,多蔬果,少加工,适量饮水,规律饮食。
    什么是人工智能?	        |人工智能是模拟人类智能的计算机系统,能学习、推理、感知和解决问题。

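ai_classify

  • Description: Classifies the input text according to the provided classification labels.

    SELECT ai_classify([model,] content, labels)
  • Parameters

    • model: Optional. The name of the model used by the AI Function. By default, the system automatically assigns the best model based on the deployed models. To switch models, first deploy the target model and then update the system table configuration. For details, see Change the model mapped to an AI Function.

    • content: Required. The text to classify. Character types (CHAR, VARCHAR, TEXT) are supported.

    • labels: Required. The list of classification labels expected in the output. The number of labels must be between 2 and 20. ARRAY type.

  • Return value

    • Returns the classification result, that is, the matching label. The return type is TEXT.

    • If content is NULL, NULL is returned.

    • If content is an empty string (""), NULL is returned.

    • An error is reported if the number of labels is out of range.

  • Example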

    CREATE TABLE product_detail(
        product_name TEXT,
        product_desc TEXT
    );
    INSERT INTO product_detail VALUES
    ('iphone','苹果手机'),
    ('p50','华为手机'),
    ('x200','vivo手机'),
    ('aaa','Dior连衣裙'),
    ('bbb','Dior裤子'),
    ('声声乌龙','茶颜悦色的奶茶'),
    ('夹心饼干','奥利奥的饼干');
    
    -- Classify the text with ai_classify
    SELECT
        product_name,
        ai_classify(product_desc, ARRAY['电子产品', '服装', '食品']) AS catalog
      FROM
         product_detail
      LIMIT 10;

    The following result is returned.

    product_name	|catalog
    --------------|------
    aaa	          |服装
    iphone	      |电子产品
    声声乌龙	      |食品
    p50	          |电子产品
    x200	        |电子产品
    bbb	          |服装
    夹心饼干	      |食品

ai_extract

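  • Description: Extracts the specified label information from the input text.

    SELECT ai_extract([model,] content, labels)
  • Parameters

    • model: Optional. The name of the model used by the AI Function. By default, the system automatically assigns the best model based on the deployed models. To switch models, first deploy the target model and then update the system table configuration. For details, see Change the model mapped to an AI Function.

    • content: Required. The input text. Character types (CHAR, VARCHAR, TEXT) are supported.

    • labels: Required. The labels to extract. The number of labels must be between 1 and 20. ARRAY type.

  • Return value

    • Returns the information extracted for each label, in JSON format.

    • If content is NULL or an empty string (""), NULL is returned.

    • An error is reported if the number of labels is out of range.

  • Example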

    CREATE TABLE users (
      user_id TEXT,
      resume TEXT
    );
    
    INSERT INTO users (user_id, resume) VALUES
      ('u001', '姓名:张三,男,28岁。邮箱:zhangsan@example.com,电话:13800138000。工作经验丰富。'),
      ('u002', '姓名:李四,女,35岁。电话:13900139000,邮箱:lisi@example.com。具有管理经验。'),
      ('u003', '姓名:王五,男,25岁。邮箱:wangwu@example.com。电话:13700137000。');
    
    SELECT
      user_id,
      ai_extract(resume, ARRAY['姓名','邮箱','电话','性别','年龄']) AS user_desc_obj
    FROM
      users;

    The following result is returned.

    user_id	|user_desc_obj
    --------|-------------
    u002	  |"{"姓名":"李四","年龄":"35岁","性别":"女","电话":"13900139000","邮箱":"lisi@example.com"}"
    u003	  |"{"姓名":"王五","年龄":"25岁","性别":"男","电话":"13700137000","邮箱":"wangwu@example.com"}"
    u001	  |"{"姓名":"张三","年龄":"28岁","性别":"男","电话":"13800138000","邮箱":"zhangsan@example.com"}"
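
Because ai_extract returns JSON, a client can decode the result into a dictionary. The Python sketch below assumes the value arrives as plain JSON text, without the outer quotes that appear in the displayed output above, and that a label the model could not find may simply be missing from the object; both are assumptions. The function name parse_extract_result is hypothetical.

```python
import json
from typing import Dict, List, Optional


def parse_extract_result(raw: Optional[str], labels: List[str]) -> Dict[str, Optional[str]]:
    """Decode an ai_extract result and return one entry per requested label."""
    if raw is None:
        # ai_extract returns NULL for NULL or empty input, so nothing was extracted.
        return {label: None for label in labels}
    data = json.loads(raw)
    # Missing labels map to None instead of raising KeyError.
    return {label: data.get(label) for label in labels}


raw = '{"姓名":"李四","年龄":"35岁","性别":"女","电话":"13900139000","邮箱":"lisi@example.com"}'
print(parse_extract_result(raw, ["姓名", "邮箱"]))
```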

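ai_mask

  • Description: Masks the specified label information in the input text. Masked information is replaced with the placeholder [MASKED].

    SELECT ai_mask([model,] content, labels)
  • Parameters

    • model: Optional. The name of the model used by the AI Function. By default, the system automatically assigns the best model based on the deployed models. To switch models, first deploy the target model and then update the system table configuration. For details, see Change the model mapped to an AI Function.

    • content: Required. The input text to mask. Character types (CHAR, VARCHAR, TEXT) are supported.

    • labels: Required. The labels of the information to mask. The number of labels must be between 1 and 20. ARRAY type.

  • Return value

    • Returns the masked text.

    • If content is NULL, NULL is returned.

    • If content is an empty string (""), an empty string is returned.

    • An error is reported if the number of labels is out of range.

  • Example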

    SELECT ai_mask(
      '用户王小明,身份证号:23030611111111,手机号:13888888888。',
      ARRAY['身份证', '手机号']); 

    The following result is returned.

    ai_mask
    -------
    用户王小明,身份证号:[MASKED],手机号:[MASKED]。
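
ai_classify, ai_extract, and ai_mask all report an error when the number of labels is out of range. A client can check the count before sending the query, as in the Python sketch below. The limits are taken from the parameter descriptions on this page; the function name validate_labels is hypothetical.

```python
from typing import Dict, List, Tuple

# (minimum, maximum) number of labels per function, as documented above.
LABEL_LIMITS: Dict[str, Tuple[int, int]] = {
    "ai_classify": (2, 20),
    "ai_extract": (1, 20),
    "ai_mask": (1, 20),
}


def validate_labels(function_name: str, labels: List[str]) -> None:
    """Raise ValueError if the label count is outside the documented range."""
    low, high = LABEL_LIMITS[function_name]
    if not low <= len(labels) <= high:
        raise ValueError(
            f"{function_name} expects between {low} and {high} labels, got {len(labels)}"
        )


validate_labels("ai_classify", ["电子产品", "服装", "食品"])  # passes silently
```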

ai_fix_grammar

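  • Description: Fixes grammatical errors in the input text.

    SELECT ai_fix_grammar([model,] content)
  • Parameters

    • model: Optional. The name of the model used by the AI Function. By default, the system automatically assigns the best model based on the deployed models. To switch models, first deploy the target model and then update the system table configuration. For details, see Change the model mapped to an AI Function.

    • content: Required. The input text whose grammar is to be fixed. Character types (CHAR, VARCHAR, TEXT) are supported.

  • Return value

    • Returns the corrected text.

    • If content is NULL, NULL is returned.

    • If content is an empty string (""), an empty string ("") is returned.

  • Example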

    SELECT ai_fix_grammar('He dont know what to did.');

    The following result is returned.

    ai_fix_grammar
    --------------
    He doesn't know what to do.

ai_summarize

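  • Description: Generates a summary of the input text.

    SELECT ai_summarize([model,] content[, max_words])
  • Parameters

    • model: Optional. The name of the model used by the AI Function. By default, the system automatically assigns the best model based on the deployed models. To switch models, first deploy the target model and then update the system table configuration. For details, see Change the model mapped to an AI Function.

    • content: Required. The input text. Character types (CHAR, VARCHAR, TEXT) are supported.

    • max_words: Optional. The maximum number of words the model should output. The model tries to keep its output close to this value. Default: 50. A value of 0 means no limit.

  • Return value

    • Returns the summary of the text.

    • If content is NULL, NULL is returned.

    • If content is an empty string (""), an empty string ("") is returned.

    • An error is reported if max_words is less than 0.

  • Example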

    SELECT ai_summarize('Hologres是阿里巴巴自主研发的一站式实时数仓引擎(Real-Time Data Warehouse),支持海量数据实时写入、实时更新、实时加工、实时分析,支持标准SQL(兼容PostgreSQL协议和语法,支持大部分PostgreSQL函数),支持PB级数据多维分析(OLAP)与即席分析(Ad Hoc),支持高并发低延迟的在线数据服务(Serving),支持多种负载的细粒度隔离与企业级安全能力,与MaxCompute、Flink、DataWorks深度融合,提供企业级离在线一体化全栈数仓解决方案。', 15);

    The following result is returned.

    ai_summarize
    ------------
    Hologres是阿里自主研发的实时数仓引擎,支持海量数据实时处理与多维分析。

ai_translate

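  • Description: Translates the input text into the specified language.

    SELECT ai_translate([model,] content, to_lang)
  • Parameters

    • model: Optional. The name of the model used by the AI Function. By default, the system automatically assigns the best model based on the deployed models. To switch models, first deploy the target model and then update the system table configuration. For details, see Change the model mapped to an AI Function.

    • content: Required. The text to translate. Character types (CHAR, VARCHAR, TEXT) are supported.

    • to_lang: Required. The code of the target language. For details, see ISO-639.

  • Return value

    • Returns the translated text.

    • If content is NULL, NULL is returned.

    • If content is an empty string (""), an empty string ("") is returned.

    • An error is reported if to_lang is invalid.

  • Example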

    SELECT ai_translate('Hologres是阿里巴巴自主研发的一站式实时数仓引擎(Real-Time Data Warehouse),支持海量数据实时写入、实时更新、实时加工、实时分析,支持标准SQL(兼容PostgreSQL协议和语法,支持大部分PostgreSQL函数),支持PB级数据多维分析(OLAP)与即席分析(Ad Hoc),支持高并发低延迟的在线数据服务(Serving),支持多种负载的细粒度隔离与企业级安全能力,与MaxCompute、Flink、DataWorks深度融合,提供企业级离在线一体化全栈数仓解决方案。', 'en');

    The following result is returned.

    ai_translate
    -----------
    Hologres is a self-developed one-stop real-time data warehouse engine by Alibaba, supporting real-time writing, real-time updating, real-time processing, and real-time analysis of massive data. It supports standard SQL (compatible with PostgreSQL protocol and syntax, supporting most PostgreSQL functions), supports multi-dimensional analysis (OLAP) and ad-hoc analysis at the PB-level, supports high-concurrency, low-latency online data services (Serving), supports fine-grained isolation for multiple workloads and enterprise-level security capabilities, and is deeply integrated with MaxCompute, Flink, and DataWorks, providing an enterprise-level fully stacked data warehouse solution that integrates online and offline processing.

ai_similarity

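  • Description: Computes the similarity between two input texts.

    SELECT ai_similarity([model,] text1, text2)
  • Parameters

    • model: Optional. The name of the model used by the AI Function. By default, the system automatically assigns the best model based on the deployed models. To switch models, first deploy the target model and then update the system table configuration. For details, see Change the model mapped to an AI Function.

    • text1, text2: The two texts to compare. Character types (CHAR, VARCHAR, TEXT) are supported.

  • Return value

    • Returns a FLOAT value in the range [0, 1]. A larger value indicates higher similarity: 0 means the texts are completely dissimilar, and 1 means the two texts are identical.

    • If either text1 or text2 is NULL, 0 is returned.

    • If both text1 and text2 are empty strings (""), 1 is returned.

    • If one of text1 and text2 is an empty string ("") and the other is a non-empty string, 0 is returned.

  • Example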

    CREATE TABLE products2 (
        product_name TEXT
    );
    
    INSERT INTO products2 (product_name) VALUES
      ('白色衬衫'), ('黑色西装裤'), ('休闲上衣'), ('运动外套'), ('白色连衣裙'),
      ('蓝牙耳机'), ('牛奶巧克力'), ('白色上衣'), ('男士T恤'), ('羽绒服');
    
    SELECT product_name FROM products2 
      ORDER BY ai_similarity(product_name, '白色上衣') DESC LIMIT 5;
    

    The following result is returned.

    product_name
    ----------
    白色上衣
    白色衬衫
    休闲上衣
    白色连衣裙
    男士T恤
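
The NULL and empty-string rules for ai_similarity can be written down as a small decision function. The Python sketch below mirrors only the documented edge cases and returns None when the inputs need an actual model call; it is an illustration of the rules, not the function's implementation, and the name similarity_edge_case is hypothetical.

```python
from typing import Optional


def similarity_edge_case(text1: Optional[str], text2: Optional[str]) -> Optional[float]:
    """Return the documented result for NULL/empty inputs, or None if a model call is needed."""
    if text1 is None or text2 is None:
        return 0.0  # either input is NULL
    if text1 == "" and text2 == "":
        return 1.0  # both inputs are empty strings
    if text1 == "" or text2 == "":
        return 0.0  # exactly one input is an empty string
    return None  # both inputs are non-empty: the score comes from the model


print(similarity_edge_case("", ""))          # 1.0
print(similarity_edge_case("白色上衣", None))  # 0.0
```

Note that a NULL input yields 0 rather than NULL, so rows with missing text sort last under ORDER BY ... DESC instead of being skipped.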

ai_analyze_sentiment

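  • Description: Performs sentiment analysis on the input text.

    SELECT ai_analyze_sentiment([model,] content)
  • Parameters

    • model: Optional. The name of the model used by the AI Function. By default, the system automatically assigns the best model based on the deployed models. To switch models, first deploy the target model and then update the system table configuration. For details, see Change the model mapped to an AI Function.

    • content: Required. The input text. Character types (CHAR, VARCHAR, TEXT) are supported.

  • Return value

    • Returns the sentiment label produced by the analysis, as a character type. The label content varies by model.

    • If content is NULL or an empty string (""), NULL is returned.

  • Example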

    -- With a large language model:
    SELECT ai_analyze_sentiment('洞房花烛夜,金榜提名时。');
    -- output example: positive
    
    -- With the iic/nlp_structbert_sentiment-classification_chinese-base model:
    SELECT ai_analyze_sentiment('清明时节雨纷纷,路上行人欲断魂。');
    -- output example: 负面

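AI Functions and models

View the mapping between AI Functions and models

Hologres provides the list_ai_function_infos system table for viewing the mapping between AI Functions and models. After you deploy a model in the Hologres console, this system table is automatically updated with the model mapped to each AI Function, and you can then call the model through the AI Function.

Note

Different AI Functions need specific types of models. For example, ai_embed is suited to embedding models, and ai_classify is suited to large language models. If only one type of model is deployed in the instance, some AI Functions may have no model assigned. An AI Function cannot be used until a matching model is deployed.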

SELECT * FROM list_ai_function_infos();

The following result is returned.

    function_name     |    model_name    
----------------------+------------------
 ai_embed             | my_gte_embedding
 ai_classify          | my_qwen32b
 ai_extract           | my_qwen32b
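
Since an AI Function without a mapped model cannot be used, it can help to check the mapping before running a pipeline. The Python sketch below takes rows shaped like the output above, as (function_name, model_name) pairs, and lists the functions that have no model. How an unassigned function appears in the system table (missing row, NULL, or empty string) is not stated here, so the sketch treats all three as unassigned; that is an assumption, and the name unassigned_functions is hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

# Function names from the AI Function summary table.
ALL_FUNCTIONS = [
    "ai_embed", "ai_rank", "ai_chunk", "ai_gen", "ai_classify", "ai_extract",
    "ai_mask", "ai_fix_grammar", "ai_summarize", "ai_translate",
    "ai_similarity", "ai_analyze_sentiment",
]


def unassigned_functions(rows: Iterable[Tuple[str, Optional[str]]]) -> List[str]:
    """Return the AI Functions that have no model mapped in the given rows."""
    # A row counts as assigned only if its model name is neither None nor empty.
    assigned = {name for name, model in rows if model}
    return [name for name in ALL_FUNCTIONS if name not in assigned]


rows = [
    ("ai_embed", "my_gte_embedding"),
    ("ai_classify", "my_qwen32b"),
    ("ai_extract", "my_qwen32b"),
]
print(unassigned_functions(rows))
```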

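Change the model mapped to an AI Function

Each AI Function has a default mapping to a deployed model. You can change the model mapped to an AI Function in the following ways. After the change, calls to the AI Function use the new model.

  • Global change

    SELECT set_ai_function_info('<function_name>', '<model_name>');

    Parameters

    • function_name: The name of the AI Function. You can find AI Function names in the AI Function summary table.

    • model_name: The name of a deployed model. You can view the names of deployed models on the AI node page of the Hologres console.

    Note

    An error is reported if the specified AI Function name or deployed model name does not exist.

    Example

    SELECT set_ai_function_info('ai_embed', 'my_gte_embedding');
  • Session-level change

    After a session-level change, the session-level configuration takes precedence over the global configuration (ai_function_info) when an AI Function calls a model.

    -- Takes effect only on the current connection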
    SET hg_experimental_ai_function_name_to_model_name_mapping='<function_name>:<model_name>[,<function_name1>:<model_name1>]';
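
The session-level value is a comma-separated list of function:model pairs. When the mapping is kept in application configuration, the statement can be assembled as in the Python sketch below. The function name build_session_mapping is hypothetical, and rejecting names that contain a comma, colon, or single quote is a conservative choice made for this sketch, not a documented rule.

```python
from typing import Dict


def build_session_mapping(mapping: Dict[str, str]) -> str:
    """Build the SET statement for a session-level function-to-model mapping."""
    if not mapping:
        raise ValueError("mapping must contain at least one function")
    for function_name, model_name in mapping.items():
        for part in (function_name, model_name):
            # Reject empty names and characters that would break the pair syntax or the SQL literal.
            if not part or any(ch in part for ch in ",:'"):
                raise ValueError(f"invalid name: {part!r}")
    pairs = ",".join(f"{fn}:{model}" for fn, model in mapping.items())
    return f"SET hg_experimental_ai_function_name_to_model_name_mapping='{pairs}';"


print(build_session_mapping({"ai_embed": "my_gte_embedding", "ai_classify": "my_qwen32b"}))
```

Run the generated statement on the same connection that issues the AI Function calls, since the setting applies only to that connection.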