AI Function

更新时间:
复制 MD 格式

AI functions let you call AI services directly from your database to perform tasks like text generation, conditional checks, sentiment analysis, and classification. This topic explains how to use these functions in AnalyticDB for MySQL.

Important

Because AI models are stochastic, AI functions can produce different output even for the same input.

  • Text functions

    • ai_filter: Checks input text for factual accuracy or compliance.

    • ai_translate: Translates input text into a specified language.

    • ai_classify: Analyzes the semantics of input text and classifies it into predefined categories or labels.

    • ai_extract: Extracts specified types of structured information, such as names, dates, addresses, and keywords, from unstructured text.

    • ai_sentiment: Analyzes the sentiment of input text and returns a sentiment label such as positive, negative, or neutral.

    • ai_similarity: Computes the semantic similarity between two texts.

    • ai_mask: Masks sensitive information in input text, such as ID card numbers, phone numbers, and email addresses.

    • ai_summarize: Generates a concise and accurate summary of long text.

    • ai_embed: Generates a text embedding.

    • ai_pdf_extract: Extracts information from a PDF document.

  • Audio functions

    • ai_audio_transcribe: Transcribes an audio clip into text. You can specify parameters such as the segmentation mode and target language.

    • ai_audio_embed: Converts an audio clip into an array of embedding vectors.

  • Aggregate functions

    • ai_agg: Aggregates and analyzes multiple rows of text data to generate a summary.

Prerequisites

  • ENI access is enabled for your AnalyticDB for MySQL Enterprise Edition, Basic Edition, Data Lakehouse Edition, or Data Warehouse Edition (Elastic Mode) cluster.

    Important
    • Log on to the AnalyticDB for MySQL console. On the Cluster Management > Cluster Information page, turn on the ENI network switch in the Network Information section.

    • Enabling or disabling the ENI network interrupts database connections for about two minutes, causing read and write operations to fail. Carefully evaluate this impact before proceeding.

  • The AnalyticDB for MySQL cluster kernel version is 3.2.8 or later.

    Note

    To view and update the minor version, go to the Configuration Information section on the Cluster Information page in the AnalyticDB for MySQL console. To upgrade a cluster that is already on the latest default baseline version, contact Alibaba Cloud Service Support on DingTalk (DingTalk ID: x5v_rm8wqzuqf).

Prerequisites

  1. Step 1: Activate the model operator

    Activate the model operator on the AI application purchase page.

  2. Step 2: Use AI functions

    Once activated, the model is automatically created in the backend. You can then call AI functions directly in your SQL queries.

    Example:

    select ai_filter("qwen3-max", "Is Zhejiang in China?")

Text functions

ai_filter

ai_filter(text)
ai_filter(model_name, text)
  • Description: Checks the input text for factual accuracy or compliance. Returns true if the text is credible, reasonable, or compliant with predefined rules; otherwise, returns false.

  • Parameters:

  • Return type: BOOLEAN.

Example 1

SELECT ai_filter("Is Zhejiang in China?")

Result

+---------------------------------+
| ai_filter("Is Zhejiang in China?")       |
+---------------------------------+
| 1                               | 

Example 2

SELECT ai_filter("qwen_plus_external", "Is Zhejiang in China?")

Result

+----------------------------------------------+
| ai_filter("qwen_plus_external", "Is Zhejiang in China?")       |
+----------------------------------------------+
| 1                                            | 

ai_translate

ai_translate(text, targetLang)
ai_translate(model_name, text, targetLang)
  • Description: Translates the input text into the specified language.

  • Parameters:

    • targetLang: The target language for translation. Data type: VARCHAR.

      Important

      The model must support the target language.

  • Return type: VARCHAR.

Example

SELECT ai_translate("AnalyticDB for MySQL is a data analytics platform based on a lakehouse architecture.", "cn") as translate_text

Result

+-------------------------------------------------------+
| translate_text                                        |
+-------------------------------------------------------+
| AnalyticDB for MySQL 是一个基于湖仓一体架构的数据分析平台。 | 

ai_classify

ai_classify(text, labels)
ai_classify(model_name, text, labels)
  • Description: Classifies the input text into one of the predefined categories based on semantic analysis.

  • Parameters:

    • labels: A list of predefined category labels. Data type: VARCHAR or Array<VARCHAR>.

  • Return type: VARCHAR.

Example 1

SELECT ai_classify('Last night, I tried a new pasta recipe in the kitchen. The sauce was rich, and my family loved it!', "['travel', 'cooking', 'reading', 'driving']")
AS predicted_label;

Result

+----------------------------------------------+
| predicted_label                              |
+----------------------------------------------+
| cooking                                      | 

Example 2

SELECT ai_classify('Last night, I tried a new pasta recipe in the kitchen. The sauce was rich, and my family loved it!', ARRAY['travel', 'cooking', 'reading', 'driving'])
AS predicted_label;

Result

+----------------------------------------------+
| predicted_label                              |
+----------------------------------------------+
| cooking                                      | 

ai_extract

ai_extract(text, labels)
ai_extract(model_name, text, labels)
  • Description: Extracts specified types of structured information, such as names, dates, addresses, and keywords, from unstructured text.

  • Parameters:

    • labels: A list of predefined labels for the information to extract. Data type: VARCHAR or Array<VARCHAR>.

  • Return type: VARCHAR.

Example

select ai_extract('Yesterday (June 15, 2024), I bought an iPhone 15 Pro on Taobao. Its titanium body and A17 chip are truly amazing!', "['product_name', 'date', 'key_feature']") as result

Result

+----------------------------------------------+
| result                                       |
+----------------------------------------------+
| product_name=iPhone 15 Pro, date=2024年6月15日, key_feature=titanium body and A17 chip                                         | 

ai_generate

ai_generate(text)
ai_generate(model_name, text)
  • Description: Generates text based on the provided input text (prompt).

  • Parameters:

  • Return type: VARCHAR.

Example

select ai_generate('Briefly introduce the TPC-H benchmark in one sentence.') as result

Result

+----------------------------------------------+
| result                                       |
+----------------------------------------------+
| TPC-H is a standard benchmark used to evaluate the decision support capabilities of a database. | 

ai_sentiment

ai_sentiment(text)
ai_sentiment(model_name, text)
  • Description: Analyzes the sentiment of the input text. Returns one of the following four labels: positive, negative, neutral, or mixed.

  • Parameters:

  • Return type: VARCHAR.

Example

SELECT ai_sentiment('These headphones have excellent sound quality and are comfortable to wear. Highly recommended!') AS sentiment;

Result

+----------------------------------------------+
| sentiment                                    |
+----------------------------------------------+
| positive                                     | 

ai_similarity

ai_similarity(text1, text2)
ai_similarity(model_name, text1, text2)
  • Description: Computes the semantic similarity between two texts. Returns a similarity score from 0 to 10, where a higher score indicates greater similarity.

  • Parameters:

    • model_name: The model name. Data type: VARCHAR.

    • text1 and text2: The input texts. Data type: VARCHAR.

  • Return type: VARCHAR.

Example

SELECT ai_similarity(
  'How do I reset my account password?',
  'I forgot my login password, how can I retrieve it?'
) AS result;

Result

+----------------------------------------------+
| result                                       |
+----------------------------------------------+
| 9.3                                          | 

ai_mask

ai_mask(text, labels)
ai_mask(model_name, text, labels)
  • Description: Masks sensitive information (such as ID numbers, phone numbers, and email addresses) in the input text.

  • Parameters:

    • labels: A list of labels that identify the sensitive information to mask. Data type: VARCHAR or Array<VARCHAR>.

  • Return type: VARCHAR.

Example

SELECT ai_mask(
  'Contact me: 1381234****, my email is user@example.com, and my ID number is 110101199003072316',
  "['phone', 'email', 'id_card']"
) AS result;

Result

+----------------------------------------------+
| result                                       |
+----------------------------------------------+
| Contact me: [MSKED], my email is [MSKED], and my ID number is [MSKED]   | 

ai_summarize

ai_summarize(text)
ai_summarize(model_name, text)
  • Description: Generates a concise and accurate summary of a long text.

  • Parameters:

  • Return type: VARCHAR.

Example

SELECT ai_summarize('Above the winding lotus pond, all you see are vast leaves. The leaves rise high out of the water, like the skirts of graceful dancers. Among the layers of leaves, a few white flowers are scattered—some opening softly, some shyly in bud—like pearls, like stars in a blue sky, like beauties fresh from a bath. A breeze passes and brings faint fragrance, like a distant song from a tall building. At that moment, leaves and flowers tremble slightly, and the ripple flashes across the pond like lightning. The leaves, packed shoulder to shoulder, now seem to hold a wave of deep green. Beneath them flows quiet water, hidden from view, while the leaves look all the more charming. Moonlight pours down like water, gently over the leaves and flowers. A thin mist floats above the pond. Leaves and flowers look as if washed in milk, as if wrapped in a light veil of dream. It is full moon, yet a pale cloud hangs in the sky, so the light is not too bright; but I find it just right—deep sleep is necessary, yet a light nap has its own flavor. The moonlight comes filtered through trees. Dense shrubs cast jagged dark shadows like ghosts, while the sparse silhouettes of willows look like paintings on the lotus leaves. The moonlight in the pond is uneven, but light and shadow form a harmonious rhythm, like a famous tune played on a violin.') AS result;

Result

+---------------------------------------------------------+
| result                                                  |
+---------------------------------------------------------+
| The moonlit lotus pond is serene and beautiful: vast lotus leaves resemble dancers' skirts, dotted with white flowers like pearls, stars, or fresh-bathed beauties. A gentle breeze carries a sweet fragrance, causing the leaves and flowers to tremble and create a green ripple. Underneath, hidden water enhances their charm. The moonlight, like flowing water, and a light mist create a dreamlike scene, making the leaves and flowers appear as if washed in milk. Though the moon is full, a thin cloud softens its light, casting mottled shadows from trees and shrubs, creating a harmonious interplay of light and shadow reminiscent of a famous violin melody. | 

ai_embed

ai_embed(text)
ai_embed(model_name, text)
ai_embed(model_name, text, options)
  • Description: Generates a text embedding.

  • Parameters:

    • options: Optional parameters. Data type: VARCHAR. The following parameters are supported.

      Parameter

      Description

      dimension

      The embedding dimension.

      type

      The embedding type. The model must support the specified type.

      • document (default)

      • query

  • Return type: Array.

Example

SELECT ai_embed("text_embedding_v4", "What should I have for lunch today?", "{'dimension':1024}")

Result

[-0.02406704,-0.09344522,-0.021460138,-0.059166607,0.027912581,...]

ai_pdf_extract

ai_pdf_extract(url, options)
ai_pdf_extract(model_name, url, options)
  • Description: Extracts information from a PDF document.

  • Parameters:

    • model_name: The model name. Data type: VARCHAR.

    • url: The URL of the source PDF file. Data type: VARCHAR.

    • options: Optional parameters. Data type: VARCHAR. The following parameters are supported.

      Parameter

      Description

      page_index

      The page number to extract from.

      role_arn

      (Optional) The role ARN of the account that owns the OSS file.

      This parameter is not required if you use the OSS service of the account where your AnalyticDB for MySQL instance resides.

      schema_only

      Specifies whether to include the file structure in the output.

      • true: The output includes the file structure.

      • false: The output does not include the file structure.

      components

      (Optional) Defines custom components to extract from the document.

      The components parameter is a JSON array, where each object has the following structure:

      Parameter

      Description

      name

      The name of the structure.

      type

      The type of the structure.

      instruction

      A natural language description of the content to extract for this component.

  • Return type: JSON.

Example

SELECT ai_pdf_extract(
  'qwen35_plus',
  'oss://my-bucket/invoices/invoice_001.pdf',
  '{"page_index": 1}'
);

Result

{"data":[{"content":[{"description":"Services\nCost of various services.","rate":"$50.00\n+Tax","qty":"10","amount":"$500.00"},{"description":"Consulting\nConsultation for your business.","rate":"$75.00\n+Tax","qty":"15","amount":"$1,125.00"},{"description":"Materials\nCost of materials and supplies to complete job.","rate":"$122.39\n+Tax","qty":"1","amount":"$122.39"}],"name":"table","type":"table"},{"content":{"billed_to":"Your Client\n1234 Clients Street\nCity, California\n90210\nUnited States\n1-888-123-8910","date_issued":"26/2/2021","invoice_number":"INV-10012","amount_due":"$1,699.48","due_date":"25/4/2021","subtotal":"$1,798.39","discount":"-$179.84","tax":"$80.93","total":"$1,699.48","deposit_requested":"$169.95","deposit_due":"$169.95"},"name":"form","type":"form"},{"content":{"filename":"stocklist.pdf","size":117249,"creation_date":"2026-03-09 06:37:58","modification_date":"2026-03-09 06:37:58","page_count":1,"url":"oss://test-oss-hk/ai/pdf_test/stocklist.pdf","page_index":1},"name":"metadata","type":"metadata"}]}

Audio functions

ai_audio_transcribe

This function supports only built-in models. For more information about how to create a model service, see Create an inference service.

ai_audio_transcribe(url)
ai_audio_transcribe(model_name, url)
ai_audio_transcribe(model_name, url, options)
  • Description: Transcribes an audio clip into text. You can specify parameters such as the diarization mode and target language. The function returns the transcribed text or a URL to the output file.

  • Parameters:

    • model_name: The model name. Data type: VARCHAR.

    • url: The URL of the input audio file. Data type: VARCHAR.

    • options: Optional. The parameters are provided as a JSON-formatted string. The supported parameters are as follows.

      Parameter

      Description

      Value

      Default

      language

      The target language for transcription.

      • cn (Chinese)

      • en (English)

      • ja (Japanese)

      • yue (Cantonese)

      • fspk (Mixed Chinese and English)

      cn

      diarization_mode

      The segmentation mode for the transcribed text.

      • word (by word)

      • sentence (by sentence)

      • speaker (by speaker)

      word (by word)

      output_type

      The output format.

      • JSON (returns the transcribed text directly as a JSON object)

      • URL (saves the output to Object Storage Service (OSS) and returns a URL to the text file)

      JSON

  • Return value: VARCHAR.

Example 1

SELECT ai_audio_transcribe("https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_female2.wav") as result

Result

+----------------------------------------------+
| result                                       |
+----------------------------------------------+
| {"TaskId": "xxxx", "Transcription": {"AudioInfo": {"Size": 128480, "Duration": 3834, "SampleRate": 16000, "Language": "cn"}, "Paragraphs": [{"ParagraphId": "1768447804076500000", "SpeakerId": "1", "Words": [{"Id": 10, "SentenceId": 1, "Start": 100, "End": 595, "Text": "Hello, "}, {"Id": 20, "SentenceId": 1, "Start": 596, "End": 841, "Text": "world, "}, {"Id": 30, "SentenceId": 1, "Start": 844, "End": 1588, "Text": "this is "}, {"Id": 40, "SentenceId": 1, "Start": 1588, "End": 2580, "Text": "Alibaba "}, {"Id": 50, "SentenceId": 1, "Start": 2580, "End": 3076, "Text": "Speech "}, {"Id": 60, "SentenceId": 1, "Start": 3076, "End": 3820, "Text": "Lab."}]}], "AudioSegments": [[100, 3820]]}}                                | 

Example 2

SELECT ai_audio_transcribe("tingwu", "https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_female2.wav", "{'diarization_mode':'sentence'}") as result

Result

+----------------------------------------------+
| result                                       |
+----------------------------------------------+
| {"TaskId": "xxxx", "Transcription": {"AudioInfo": {"Size": 128480, "Duration": 3834, "SampleRate": 16000, "Language": "cn"}, "Speakers": [{"SpeakerId": "1", "Sentences": [{"SentenceId": 1, "Start": 100, "End": 3820, "Text": "Hello, world, this is the Alibaba Speech Lab."}]}]}}| 

ai_audio_embed

This function generates an audio embedding and supports only built-in models. For more information about how to create a model service, see Create an inference service.

ai_audio_embed(url)
ai_audio_embed(model_name, url)
ai_audio_embed(model_name, url, options)
  • Description: Converts an audio clip into its corresponding embedding vector, which is returned as an array of floating-point numbers. You can specify start and end times to generate an embedding for a specific audio segment.

  • Parameters:

    • options: Optional. The parameters are provided as a JSON-formatted string. The supported parameters are as follows.

      Parameter

      Description

      Note

      access_id

      The Access Key ID for Object Storage Service (OSS).

      If you specify role_arn, you do not need to provide the access_id, access_secret, or token parameters.

      access_secret

      The Access Key Secret for Object Storage Service (OSS).

      token

      The token credential for Object Storage Service (OSS).

      role_arn

      The ARN of the role.

      start_time

      The start time of the audio clip.

      If not specified, defaults to the beginning of the audio.

      end_time

      The end time of the audio clip.

      If not specified, defaults to the end of the audio.

  • Return value: Array<FLOAT>.

Example

SELECT ai_audio_embed("https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_female2.wav")

Result

[0.4234241,0.3575562,0.022631245,-0.09829298,-0.40605977,-0.72465694,-1.8039973,2.4773192,0.91702485,0.15146817,0.7209677,0.78260005,0.8019302,2.7577147,-0.86063147,-0.8249354,0.28732437,0.9009849,0.8899265,-1.5320694,-2.0866642,0.5051579,1.3553369,2.0348034,-0.74032015,-0.66065377,0.2756823,0.15714298,-2.8250875,0.8162984,0.20878711,0.8285897,1.4863579,1.0645012,-1.2933203,0.6919463,0.87850314,1.2212433,-0.44420764,-1.0146322,-1.8209833,0.16996253,-1.534067,-0.10253409,0.07611574,2.4005992,-1.2903963,0.6613166,-0.2554945,0.97570425,-0.108424075,0.82061934,-0.18102498,-0.27137256,-1.2723781,-0.9188018,-0.027503261,-0.15346082,-0.9564006,-0.4618776,-1.077228,-0.27756396,0.28749034,0.2570488,-0.39753774,0.5270934,0.6970718,-1.2984067,2.4312186,1.485573,1.6031872,-0.6158852,-1.1512791,0.49305362,-1.1939837,0.72933173,0.2659372,2.5559616,-0.41886744,0.6304756,-0.6171451,-0.5282561,0.78426796,-0.57570314,-0.91166306,-0.531768,0.96598166,0.17423655,-0.9133532,-0.09258405,-0.26951456,-0.34667063,-0.21612991,-0.054430377,0.052798696,0.5198285,1.9431844,-0.5638291,-2.1043544,-0.47198293,0.21959576,0.5975169,0.3951007,0.28112686,1.792782,-0.15896009,-0.71152246,1.1540072,0.37563428,0.0016163457,0.6544563,0.043974634,-1.6375796,-0.3698572,1.5310435,-0.20282944,0.73641545,2.1867669,1.4115202,-0.8114418,-0.36265984,0.2191038,-0.8933491,0.6616433,1.545423,0.10970531,-0.109743424,0.44605017,-0.087265715,0.29077715,-2.4435875,-1.1484046,0.078253455,0.02283861,-0.29482847,2.0237377,-0.6009212,-1.2377249,0.05194488,-0.16573223,0.8370868,-0.43602425,-0.4706988,0.17996503,0.88799554,0.6562157,-0.7520023,0.64789253,0.36865348,-1.4820247,0.5299587,1.1884397,1.4351603,0.21073115,-0.1712549,2.9410155,1.7485802,-0.6828801,-0.4833641,-0.4477328,2.3307724,-0.35595444,-0.61682695,-0.5370858,-0.8068234,1.2195143,-1.0834758,0.45275012,1.6243625,-1.3629726,-0.2959109,0.05071621,-0.7280639,-0.1713935,-0.43650395,-0.2131698,0.25380868,-0.16288652,2.0921175,0.3555297,-0.10752705,0.5716714,-0.09943808,-1.9066161,0.12683412,0.8566219,-0.20323975,1.4260024,-0.28789783,0.32412016,0.88897985,0.86555415,0.5940666,0.6645551,0.046334613,-1.5640875,-0.3510718,0.55363727,-0.13503519,0.67008746,-0.9184686,0.025278697,1.0128921,-0.61241907,-0.01107134,-0.8309523,-0.51025873,1.1141272,0.28813183,-0.10634196,-0.9954202,-0.07171043,-0.5856978,-1.2660285,2.1327019,-0.60268104,-0.6227884,-0.42101067,0.059599742,-0.09640202,-1.258711,0.5054924,-1.1847625,-0.044398762,-0.98595804,0.8883682,0.6221085,-2.5484293,0.7249505,0.69930685,0.7739025,-0.8139478,-1.1988907,-1.0416493,2.0153732,-1.7091763,-0.5611238,-1.2147603,0.9113469,1.5113174,0.23810485,-1.702736,-0.3295935,-0.41867778,-1.0378691,-0.45600057,-0.43525052,-0.1078409,-1.2993969,0.12842774,0.026097976,-0.7705405,-1.0907317,0.28274077,1.236289,1.6190177,2.1874366,0.16072829,-0.33150536,2.218483,-1.1703843,0.10300327,1.1994884,0.48275462,-0.40795812,0.5020531,1.1787555,-0.082187966,0.6315653,0.36654752,-0.24940589,-0.8652801,1.6283739,0.41405886,0.6377814,0.08396838,-1.0169003,1.2100558,1.4457762,-0.07999261,-0.012102162,0.85055244,-0.09711141,-1.0452846,0.13768612,-2.0506873,-1.6474499,0.043265514,-1.0009454,-0.0111249415,-1.2523409,-0.080719866,-0.6187693,-1.398226,0.6425289,-0.4808641,-0.06030046,-0.10275636,-0.31625932,-1.5993032,-0.20966552,-1.4618409,0.34925935,-0.5034448,0.100028045,0.25327235,-1.078896,-0.23394233,1.2247928,-2.6050038,-0.71609926,-0.77765155,-1.2089496,0.8526703,-0.1358416,1.1074059,1.1545771,-0.94525933,0.41012967,0.9361201,-0.14788401,-0.29333082,1.5782444,1.1100405,-0.4074414,-0.3862537,-0.5779069,-0.88644946,0.2233385,1.3612705,1.2413827,-1.3625424,-1.3623037,0.3056319,-1.4446377,0.64613384,0.15064861,-0.61473364,-1.3611295,-0.1975697,-1.0701923,0.7591377,-1.2106745,-2.067824,0.45041704,-0.71582735,-1.743847,1.169414,2.0158787,0.4734838,-0.3133036,-1.9916989,1.1441987,0.9155275,-1.3003027,0.82898057,-0.7439868,1.2072865,0.46877453,0.6648313,0.80477613,1.6927507,0.5842916,0.36608973,2.259473,-1.1628797,-0.3311869,0.36989415,-0.25035658,-0.28012496,1.092324,-0.40238732,0.0046352614,-2.0625768,1.161326,0.92277956,-0.20316431,-0.15164377,1.1715052,-0.7067665,-0.8608931,0.079684004,-0.89916384,0.02488108,-0.57668805,-0.879138,-1.7274998,0.824049,1.5638052,0.28415555,-0.17347385,-2.085917,0.4987632,-0.031601395,-1.0190377,-0.7815816,0.69643223,-0.33574122,2.1242745,0.19785832,0.55690974,-1.3932099,-0.44952002,-0.7719798,1.1326215,0.43839702,0.996999,-0.55692834,1.4014084,-2.3939395,-1.0112343,-1.4143248,1.1514114,1.1233637,-0.3385779,-0.23665123,-0.100857966,0.4633971,1.6215613,-0.0692888,-0.031505972,1.0472811,0.57112235,0.6015763,0.07582237,0.52702487,-1.3809607,-1.7482765,0.38386008,0.99316126,0.95603,0.40644804,-1.5072054,-0.34419048,0.63205683,-1.0854999,-0.92245156,-0.2712947,-0.75696105,0.996232,-0.10738732,-0.4674776,-1.2149413,-1.5094053,1.6796608,0.21961057,-0.35295358,-1.2609407,-0.040009048,-0.38785484,0.7788784,-0.65823495,2.0559616,-1.0074826,1.2282485,1.2540467,0.4914942,-0.47057188,-0.47061247,-0.16255763,-0.6718562,-0.53630847,0.4804698,-0.3134068,0.6407026,-0.727981,0.0481851,0.06927338,0.8321921,-0.6639807,0.74932885,0.23291564,0.76362675,-1.2966217,0.8806557,-1.2141875,0.6996881,-0.8293652,0.9085288,1.8878758,0.11363957,0.148718,0.5030497,-1.0422761,0.08673843,-0.80342984,-0.8046266,-0.18026677,0.28900644,0.76534355,-1.3163859,-0.72775376,0.36529016,-0.3660175,0.31056792,0.052575216,-1.11831,0.7895246,1.1172394,-0.31374845,0.17143561,-0.42633826,0.16579832,-0.012790448,-1.4290546,-0.47322562,0.7557427,1.0487514,-0.14331971,0.30455515,0.6938429,0.5565799]

Aggregate functions

ai_agg

ai_agg(text)
ai_agg(model_name, text)
  • Description: The ai_agg aggregate function generates a summary by semantically aggregating and analyzing text from multiple rows. It collects text from all rows in a group and passes the content to an AI model for summarization.

  • Parameters:

    • model_name: The name of the model to use. If omitted, the default model is used. Data type: VARCHAR.

    • text: The text content to be aggregated. Data type: VARCHAR.

  • Return value: VARCHAR.

Example 1: Using the default model

SELECT product_name, ai_agg(comment) AS summary
FROM product_reviews
GROUP BY product_name;

Result

+--------------+------------------------------------------------------+
| product_name | summary                                              |
+--------------+------------------------------------------------------+
| Headphones A | Users generally praised the sound quality and comfort, but some reported insufficient battery life. |
+--------------+------------------------------------------------------+

Example 2: Specifying a model name

SELECT product_name, ai_agg('qwen-plus', comment) AS summary
FROM product_reviews
GROUP BY product_name;

Result

+--------------+------------------------------------------------------+
| product_name | summary                                              |
+--------------+------------------------------------------------------+
| Headphones A | Users generally praised the sound quality and comfort, but some reported insufficient battery life. |
+--------------+------------------------------------------------------+

Example 3: Aggregating data from the entire table

SELECT ai_agg(feedback) AS overall_summary
FROM customer_feedback;

Result

+------------------------------------------------------------+
| overall_summary                                            |
+------------------------------------------------------------+
| The overall feedback indicates high satisfaction with product quality, but highlights that logistics speed needs improvement. |
+------------------------------------------------------------+

Model list

The following models are currently available:

Brand

Model name

Type

Qwen

qwen3-max

text model

qwen3.6-plus

text model

qwen3.5-plus

text model

qwen3.5-flash

text model

DeepSeek

deepseek-v4-flash

text model

deepseek-v4-pro

text model

deepseek-v3.2

text model

Zhipu AI

glm-5.1

text model

glm-5

text model

Moonshot AI

kimi-k2.6

text model

MiniMax

MiniMax-M2.7

text model

Configure default models for AI functions

When calling an ai function without specifying a model name, the system uses a default model. You can change the default models for each model type by setting its corresponding parameter. For instructions on how to set parameters, see Config and Hint configuration parameters.

Model type

Default model

Parameter

TEXT_TO_TEXT (text model)

qwen-plus

AI_FUNCTION_TEXT_TO_TEXT_MODEL

TEXT_TO_EMBEDDING (text embedding model)

text-embedding-v2

AI_FUNCTION_TEXT_TO_EMBEDDING_MODEL

AUDIO_TO_TEXT (audio-to-text model)

tingwu

AI_FUNCTION_AUDIO_TO_TEXT_MODEL

AUDIO_TO_EMBEDDING (audio embedding model)

qwen2.5-vl-embedding

AI_FUNCTION_AUDIO_TO_EMBEDDING_MODEL

IMAGE_TO_TEXT (image model)

qwen35_plus

AI_FUNCTION_IMAGE_TO_TEXT_MODEL

Typical use cases

Use case 1: Text retrieval

Scenario: In applications such as RAG and customer service systems, retrieve results from a text library that semantically match a user's query.

Use the ai_embed function to convert the search query and the text in the library into embeddings. Then, calculate the cosine similarity and sort the results to retrieve the 10 most similar texts.

SELECT cosine_similarity(ai_embed("The semantic text to search for"), ai_embed(text)) AS cos
FROM text_table
ORDER BY cos
LIMIT 10

To further optimize vector search performance, see vector search.

Use case 2: Speaker recognition

image.png

Scenario: For meetings or daily conversations, transcribe recordings into text and distinguish between speakers in the dialogue.

To further analyze speaker information, you can segment the audio by speaker and then compare the embedding of each segment with those in a voice dataset. For the complete solution, see speaker recognition.

  • Transcription and segmentation: Use the AI_AUDIO_TRANSCRIBE function to transcribe audio into text and segment it by speaker.

    SELECT AI_AUDIO_TRANSCRIBE("oss://xxx", "{'diarization_mode':'speaker'}")
  • Speaker recognition: Use the AI_AUDIO_EMBED function to generate embeddings for the input audio and for audio in the library. Then, calculate the l2_distance and select the 10 closest matches.

    SELECT l2_distance(input_embedding, origin_embedding) AS ld
    FROM(
      SELECT ai_audio_embed("oss://xxx") AS input_embedding
      JOIN
      SELECT ai_audio_embed(url) AS origin_embedding
      FROM audio_table)
    ORDER BY ld
    LIMIT 10

Use case 3: Extract negative customer reviews

Scenario: Filter out negative feedback from user comments.

The customer_comment table stores daily customer comments. Use the ai_sentiment function to perform sentiment analysis on each comment and filter for all negative comments on January 1, 2026.

-- The qwen_plus_external model, which corresponds to the qwen-plus model in Model Studio, must be created beforehand.
SELECT name, comment
FROM customer_comment
WHERE ai_sentiment("qwen_plus_external", comment)='negative'
AND dt='20260101'