AI Function
AI functions let you call AI services directly from your database to perform tasks like text generation, conditional checks, sentiment analysis, and classification. This topic explains how to use these functions in AnalyticDB for MySQL.
Because AI models are stochastic, AI functions can produce different output even for the same input.
-
Text functions
-
ai_filter: Checks input text for factual accuracy or compliance.
-
ai_translate: Translates input text into a specified language.
-
ai_classify: Analyzes the semantics of input text and classifies it into predefined categories or labels.
-
ai_extract: Extracts specified types of structured information, such as names, dates, addresses, and keywords, from unstructured text.
-
ai_sentiment: Analyzes the sentiment of input text and returns a sentiment label such as positive, negative, or neutral.
-
ai_similarity: Computes the semantic similarity between two texts.
-
ai_mask: Masks sensitive information in input text, such as ID card numbers, phone numbers, and email addresses.
-
ai_summarize: Generates a concise and accurate summary of long text.
-
ai_embed: Generates a text embedding.
-
ai_pdf_extract: Extracts information from a PDF document.
-
-
Audio functions
-
ai_audio_transcribe: Transcribes an audio clip into text. You can specify parameters such as the segmentation mode and target language.
-
ai_audio_embed: Converts an audio clip into an array of embedding vectors.
-
-
Aggregate functions
-
ai_agg: Aggregates and analyzes multiple rows of text data to generate a summary.
-
Prerequisites
-
ENI access is enabled for your AnalyticDB for MySQL Enterprise Edition, Basic Edition, Data Lakehouse Edition, or Data Warehouse Edition (Elastic Mode) cluster.
Important-
Log on to the AnalyticDB for MySQL console. On the page, turn on the ENI network switch in the Network Information section.
-
Enabling or disabling the ENI network interrupts database connections for about two minutes, causing read and write operations to fail. Carefully evaluate this impact before proceeding.
-
-
The AnalyticDB for MySQL cluster kernel version is 3.2.8 or later.
NoteTo view and update the minor version, go to the Configuration Information section on the Cluster Information page in the AnalyticDB for MySQL console. To upgrade a cluster that is already on the latest default baseline version, contact Alibaba Cloud Service Support on DingTalk (DingTalk ID:
x5v_rm8wqzuqf).
Prerequisites
-
Step 1: Activate the model operator
Activate the model operator on the AI application purchase page.
-
Step 2: Use AI functions
Once activated, the model is automatically created in the backend. You can then call AI functions directly in your SQL queries.
Example:
select ai_filter("qwen3-max", "Is Zhejiang in China?")
Text functions
ai_filter
ai_filter(text)
ai_filter(model_name, text)
-
Description: Checks the input text for factual accuracy or compliance. Returns
trueif the text is credible, reasonable, or compliant with predefined rules; otherwise, returnsfalse. -
Parameters:
-
model_name: The model name. Data type: VARCHAR.
-
text: The input text. Data type: VARCHAR.
-
-
Return type: BOOLEAN.
Example 1
SELECT ai_filter("Is Zhejiang in China?")
Result
+---------------------------------+
| ai_filter("Is Zhejiang in China?") |
+---------------------------------+
| 1 |
Example 2
SELECT ai_filter("qwen_plus_external", "Is Zhejiang in China?")
Result
+----------------------------------------------+
| ai_filter("qwen_plus_external", "Is Zhejiang in China?") |
+----------------------------------------------+
| 1 |
ai_translate
ai_translate(text, targetLang)
ai_translate(model_name, text, targetLang)
-
Description: Translates the input text into the specified language.
-
Parameters:
-
model_name: The model name. Data type: VARCHAR.
-
text: The input text. Data type: VARCHAR.
-
targetLang: The target language for translation. Data type: VARCHAR.
ImportantThe model must support the target language.
-
-
Return type: VARCHAR.
Example
SELECT ai_translate("AnalyticDB for MySQL is a data analytics platform based on a lakehouse architecture.", "cn") as translate_text
Result
+-------------------------------------------------------+
| translate_text |
+-------------------------------------------------------+
| AnalyticDB for MySQL 是一个基于湖仓一体架构的数据分析平台。 |
ai_classify
ai_classify(text, labels)
ai_classify(model_name, text, labels)
-
Description: Classifies the input text into one of the predefined categories based on semantic analysis.
-
Parameters:
-
model_name: The model name. Data type: VARCHAR.
-
text: The input text. Data type: VARCHAR.
-
labels: A list of predefined category labels. Data type: VARCHAR or Array<VARCHAR>.
-
-
Return type: VARCHAR.
Example 1
SELECT ai_classify('Last night, I tried a new pasta recipe in the kitchen. The sauce was rich, and my family loved it!', "['travel', 'cooking', 'reading', 'driving']")
AS predicted_label;
Result
+----------------------------------------------+
| predicted_label |
+----------------------------------------------+
| cooking |
Example 2
SELECT ai_classify('Last night, I tried a new pasta recipe in the kitchen. The sauce was rich, and my family loved it!', ARRAY['travel', 'cooking', 'reading', 'driving'])
AS predicted_label;
Result
+----------------------------------------------+
| predicted_label |
+----------------------------------------------+
| cooking |
ai_extract
ai_extract(text, labels)
ai_extract(model_name, text, labels)
-
Description: Extracts specified types of structured information, such as names, dates, addresses, and keywords, from unstructured text.
-
Parameters:
-
model_name: The model name. Data type: VARCHAR.
-
text: The input text. Data type: VARCHAR.
-
labels: A list of predefined labels for the information to extract. Data type: VARCHAR or Array<VARCHAR>.
-
-
Return type: VARCHAR.
Example
select ai_extract('Yesterday (June 15, 2024), I bought an iPhone 15 Pro on Taobao. Its titanium body and A17 chip are truly amazing!', "['product_name', 'date', 'key_feature']") as result
Result
+----------------------------------------------+
| result |
+----------------------------------------------+
| product_name=iPhone 15 Pro, date=2024年6月15日, key_feature=titanium body and A17 chip |
ai_generate
ai_generate(text)
ai_generate(model_name, text)
-
Description: Generates text based on the provided input text (prompt).
-
Parameters:
-
model_name: The model name. Data type: VARCHAR.
-
text: The input text. Data type: VARCHAR.
-
-
Return type: VARCHAR.
Example
select ai_generate('Briefly introduce the TPC-H benchmark in one sentence.') as result
Result
+----------------------------------------------+
| result |
+----------------------------------------------+
| TPC-H is a standard benchmark used to evaluate the decision support capabilities of a database. |
ai_sentiment
ai_sentiment(text)
ai_sentiment(model_name, text)
-
Description: Analyzes the sentiment of the input text. Returns one of the following four labels:
positive,negative,neutral, ormixed. -
Parameters:
-
model_name: The model name. Data type: VARCHAR.
-
text: The input text. Data type: VARCHAR.
-
-
Return type: VARCHAR.
Example
SELECT ai_sentiment('These headphones have excellent sound quality and are comfortable to wear. Highly recommended!') AS sentiment;
Result
+----------------------------------------------+
| sentiment |
+----------------------------------------------+
| positive |
ai_similarity
ai_similarity(text1, text2)
ai_similarity(model_name, text1, text2)
-
Description: Computes the semantic similarity between two texts. Returns a similarity score from 0 to 10, where a higher score indicates greater similarity.
-
Parameters:
-
model_name: The model name. Data type: VARCHAR.
-
text1 and text2: The input texts. Data type: VARCHAR.
-
-
Return type: VARCHAR.
Example
SELECT ai_similarity(
'How do I reset my account password?',
'I forgot my login password, how can I retrieve it?'
) AS result;
Result
+----------------------------------------------+
| result |
+----------------------------------------------+
| 9.3 |
ai_mask
ai_mask(text, labels)
ai_mask(model_name, text, labels)
-
Description: Masks sensitive information (such as ID numbers, phone numbers, and email addresses) in the input text.
-
Parameters:
-
model_name: The model name. Data type: VARCHAR.
-
text: The input text. Data type: VARCHAR.
-
labels: A list of labels that identify the sensitive information to mask. Data type: VARCHAR or Array<VARCHAR>.
-
-
Return type: VARCHAR.
Example
SELECT ai_mask(
'Contact me: 1381234****, my email is user@example.com, and my ID number is 110101199003072316',
"['phone', 'email', 'id_card']"
) AS result;
Result
+----------------------------------------------+
| result |
+----------------------------------------------+
| Contact me: [MSKED], my email is [MSKED], and my ID number is [MSKED] |
ai_summarize
ai_summarize(text)
ai_summarize(model_name, text)
-
Description: Generates a concise and accurate summary of a long text.
-
Parameters:
-
model_name: The model name. Data type: VARCHAR.
-
text: The input text. Data type: VARCHAR.
-
-
Return type: VARCHAR.
Example
SELECT ai_summarize('Above the winding lotus pond, all you see are vast leaves. The leaves rise high out of the water, like the skirts of graceful dancers. Among the layers of leaves, a few white flowers are scattered—some opening softly, some shyly in bud—like pearls, like stars in a blue sky, like beauties fresh from a bath. A breeze passes and brings faint fragrance, like a distant song from a tall building. At that moment, leaves and flowers tremble slightly, and the ripple flashes across the pond like lightning. The leaves, packed shoulder to shoulder, now seem to hold a wave of deep green. Beneath them flows quiet water, hidden from view, while the leaves look all the more charming. Moonlight pours down like water, gently over the leaves and flowers. A thin mist floats above the pond. Leaves and flowers look as if washed in milk, as if wrapped in a light veil of dream. It is full moon, yet a pale cloud hangs in the sky, so the light is not too bright; but I find it just right—deep sleep is necessary, yet a light nap has its own flavor. The moonlight comes filtered through trees. Dense shrubs cast jagged dark shadows like ghosts, while the sparse silhouettes of willows look like paintings on the lotus leaves. The moonlight in the pond is uneven, but light and shadow form a harmonious rhythm, like a famous tune played on a violin.') AS result;
Result
+---------------------------------------------------------+
| result |
+---------------------------------------------------------+
| The moonlit lotus pond is serene and beautiful: vast lotus leaves resemble dancers' skirts, dotted with white flowers like pearls, stars, or fresh-bathed beauties. A gentle breeze carries a sweet fragrance, causing the leaves and flowers to tremble and create a green ripple. Underneath, hidden water enhances their charm. The moonlight, like flowing water, and a light mist create a dreamlike scene, making the leaves and flowers appear as if washed in milk. Though the moon is full, a thin cloud softens its light, casting mottled shadows from trees and shrubs, creating a harmonious interplay of light and shadow reminiscent of a famous violin melody. |
ai_embed
ai_embed(text)
ai_embed(model_name, text)
ai_embed(model_name, text, options)
-
Description: Generates a text embedding.
-
Parameters:
-
model_name: The model name. Data type: VARCHAR.
-
text: The input text. Data type: VARCHAR.
-
options: Optional parameters. Data type: VARCHAR. The following parameters are supported.
Parameter
Description
dimension
The embedding dimension.
type
The embedding type. The model must support the specified type.
-
document (default)
-
query
-
-
-
Return type: Array.
Example
SELECT ai_embed("text_embedding_v4", "What should I have for lunch today?", "{'dimension':1024}")
Result
[-0.02406704,-0.09344522,-0.021460138,-0.059166607,0.027912581,...]
ai_pdf_extract
ai_pdf_extract(url, options)
ai_pdf_extract(model_name, url, options)
-
Description: Extracts information from a PDF document.
-
Parameters:
-
model_name: The model name. Data type: VARCHAR.
-
url: The URL of the source PDF file. Data type: VARCHAR.
-
options: Optional parameters. Data type: VARCHAR. The following parameters are supported.
Parameter
Description
page_index
The page number to extract from.
role_arn
(Optional) The role ARN of the account that owns the OSS file.
This parameter is not required if you use the OSS service of the account where your AnalyticDB for MySQL instance resides.
schema_only
Specifies whether to include the file structure in the output.
-
true: The output includes the file structure.
-
false: The output does not include the file structure.
components
(Optional) Defines custom components to extract from the document.
The
componentsparameter is a JSON array, where each object has the following structure:Parameter
Description
name
The name of the structure.
type
The type of the structure.
-
METADATA: File metadata, such as the filename, size, and page count.
-
FORM: Structured form data, typically key-value pairs.
-
TABLE: Tabular data extracted from tables within the document.
-
CUSTOM: A custom component defined by a natural language instruction.
instruction
A natural language description of the content to extract for this component.
-
-
-
Return type: JSON.
Example
SELECT ai_pdf_extract(
'qwen35_plus',
'oss://my-bucket/invoices/invoice_001.pdf',
'{"page_index": 1}'
);
Result
{"data":[{"content":[{"description":"Services\nCost of various services.","rate":"$50.00\n+Tax","qty":"10","amount":"$500.00"},{"description":"Consulting\nConsultation for your business.","rate":"$75.00\n+Tax","qty":"15","amount":"$1,125.00"},{"description":"Materials\nCost of materials and supplies to complete job.","rate":"$122.39\n+Tax","qty":"1","amount":"$122.39"}],"name":"table","type":"table"},{"content":{"billed_to":"Your Client\n1234 Clients Street\nCity, California\n90210\nUnited States\n1-888-123-8910","date_issued":"26/2/2021","invoice_number":"INV-10012","amount_due":"$1,699.48","due_date":"25/4/2021","subtotal":"$1,798.39","discount":"-$179.84","tax":"$80.93","total":"$1,699.48","deposit_requested":"$169.95","deposit_due":"$169.95"},"name":"form","type":"form"},{"content":{"filename":"stocklist.pdf","size":117249,"creation_date":"2026-03-09 06:37:58","modification_date":"2026-03-09 06:37:58","page_count":1,"url":"oss://test-oss-hk/ai/pdf_test/stocklist.pdf","page_index":1},"name":"metadata","type":"metadata"}]}
Audio functions
ai_audio_transcribe
This function supports only built-in models. For more information about how to create a model service, see Create an inference service.
ai_audio_transcribe(url)
ai_audio_transcribe(model_name, url)
ai_audio_transcribe(model_name, url, options)
-
Description: Transcribes an audio clip into text. You can specify parameters such as the diarization mode and target language. The function returns the transcribed text or a URL to the output file.
-
Parameters:
-
model_name: The model name. Data type: VARCHAR.
-
url: The URL of the input audio file. Data type: VARCHAR. -
options: Optional. The parameters are provided as a JSON-formatted string. The supported parameters are as follows.Parameter
Description
Value
Default
language
The target language for transcription.
-
cn (Chinese)
-
en (English)
-
ja (Japanese)
-
yue (Cantonese)
-
fspk (Mixed Chinese and English)
cn
diarization_mode
The segmentation mode for the transcribed text.
-
word (by word)
-
sentence (by sentence)
-
speaker (by speaker)
word (by word)
output_type
The output format.
-
JSON (returns the transcribed text directly as a JSON object)
-
URL (saves the output to Object Storage Service (OSS) and returns a URL to the text file)
JSON
-
-
-
Return value: VARCHAR.
Example 1
SELECT ai_audio_transcribe("https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_female2.wav") as result
Result
+----------------------------------------------+
| result |
+----------------------------------------------+
| {"TaskId": "xxxx", "Transcription": {"AudioInfo": {"Size": 128480, "Duration": 3834, "SampleRate": 16000, "Language": "cn"}, "Paragraphs": [{"ParagraphId": "1768447804076500000", "SpeakerId": "1", "Words": [{"Id": 10, "SentenceId": 1, "Start": 100, "End": 595, "Text": "Hello, "}, {"Id": 20, "SentenceId": 1, "Start": 596, "End": 841, "Text": "world, "}, {"Id": 30, "SentenceId": 1, "Start": 844, "End": 1588, "Text": "this is "}, {"Id": 40, "SentenceId": 1, "Start": 1588, "End": 2580, "Text": "Alibaba "}, {"Id": 50, "SentenceId": 1, "Start": 2580, "End": 3076, "Text": "Speech "}, {"Id": 60, "SentenceId": 1, "Start": 3076, "End": 3820, "Text": "Lab."}]}], "AudioSegments": [[100, 3820]]}} |
Example 2
SELECT ai_audio_transcribe("tingwu", "https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_female2.wav", "{'diarization_mode':'sentence'}") as result
Result
+----------------------------------------------+
| result |
+----------------------------------------------+
| {"TaskId": "xxxx", "Transcription": {"AudioInfo": {"Size": 128480, "Duration": 3834, "SampleRate": 16000, "Language": "cn"}, "Speakers": [{"SpeakerId": "1", "Sentences": [{"SentenceId": 1, "Start": 100, "End": 3820, "Text": "Hello, world, this is the Alibaba Speech Lab."}]}]}}|
ai_audio_embed
This function generates an audio embedding and supports only built-in models. For more information about how to create a model service, see Create an inference service.
ai_audio_embed(url)
ai_audio_embed(model_name, url)
ai_audio_embed(model_name, url, options)
-
Description: Converts an audio clip into its corresponding embedding vector, which is returned as an array of floating-point numbers. You can specify start and end times to generate an embedding for a specific audio segment.
-
Parameters:
-
model_name: The model name. Data type: VARCHAR.
-
text: The input text. Data type: VARCHAR.
-
options: Optional. The parameters are provided as a JSON-formatted string. The supported parameters are as follows.Parameter
Description
Note
access_id
The Access Key ID for Object Storage Service (OSS).
If you specify
role_arn, you do not need to provide theaccess_id,access_secret, ortokenparameters.access_secret
The Access Key Secret for Object Storage Service (OSS).
token
The token credential for Object Storage Service (OSS).
role_arn
The ARN of the role.
start_time
The start time of the audio clip.
If not specified, defaults to the beginning of the audio.
end_time
The end time of the audio clip.
If not specified, defaults to the end of the audio.
-
-
Return value: Array<FLOAT>.
Example
SELECT ai_audio_embed("https://dashscope.oss-cn-beijing.aliyuncs.com/samples/audio/paraformer/hello_world_female2.wav")
Result
[0.4234241,0.3575562,0.022631245,-0.09829298,-0.40605977,-0.72465694,-1.8039973,2.4773192,0.91702485,0.15146817,0.7209677,0.78260005,0.8019302,2.7577147,-0.86063147,-0.8249354,0.28732437,0.9009849,0.8899265,-1.5320694,-2.0866642,0.5051579,1.3553369,2.0348034,-0.74032015,-0.66065377,0.2756823,0.15714298,-2.8250875,0.8162984,0.20878711,0.8285897,1.4863579,1.0645012,-1.2933203,0.6919463,0.87850314,1.2212433,-0.44420764,-1.0146322,-1.8209833,0.16996253,-1.534067,-0.10253409,0.07611574,2.4005992,-1.2903963,0.6613166,-0.2554945,0.97570425,-0.108424075,0.82061934,-0.18102498,-0.27137256,-1.2723781,-0.9188018,-0.027503261,-0.15346082,-0.9564006,-0.4618776,-1.077228,-0.27756396,0.28749034,0.2570488,-0.39753774,0.5270934,0.6970718,-1.2984067,2.4312186,1.485573,1.6031872,-0.6158852,-1.1512791,0.49305362,-1.1939837,0.72933173,0.2659372,2.5559616,-0.41886744,0.6304756,-0.6171451,-0.5282561,0.78426796,-0.57570314,-0.91166306,-0.531768,0.96598166,0.17423655,-0.9133532,-0.09258405,-0.26951456,-0.34667063,-0.21612991,-0.054430377,0.052798696,0.5198285,1.9431844,-0.5638291,-2.1043544,-0.47198293,0.21959576,0.5975169,0.3951007,0.28112686,1.792782,-0.15896009,-0.71152246,1.1540072,0.37563428,0.0016163457,0.6544563,0.043974634,-1.6375796,-0.3698572,1.5310435,-0.20282944,0.73641545,2.1867669,1.4115202,-0.8114418,-0.36265984,0.2191038,-0.8933491,0.6616433,1.545423,0.10970531,-0.109743424,0.44605017,-0.087265715,0.29077715,-2.4435875,-1.1484046,0.078253455,0.02283861,-0.29482847,2.0237377,-0.6009212,-1.2377249,0.05194488,-0.16573223,0.8370868,-0.43602425,-0.4706988,0.17996503,0.88799554,0.6562157,-0.7520023,0.64789253,0.36865348,-1.4820247,0.5299587,1.1884397,1.4351603,0.21073115,-0.1712549,2.9410155,1.7485802,-0.6828801,-0.4833641,-0.4477328,2.3307724,-0.35595444,-0.61682695,-0.5370858,-0.8068234,1.2195143,-1.0834758,0.45275012,1.6243625,-1.3629726,-0.2959109,0.05071621,-0.7280639,-0.1713935,-0.43650395,-0.2131698,0.25380868,-0.16288652,2.0921175,0.3555297,-0.10752705,0.5716714,-0.09943808,-1.9066161,0.12683412,0.8566219,-0.20323975,1.4260024,-0.28789783,0.32412016,0.88897985,0.86555415,0.5940666,0.6645551,0.046334613,-1.5640875,-0.3510718,0.55363727,-0.13503519,0.67008746,-0.9184686,0.025278697,1.0128921,-0.61241907,-0.01107134,-0.8309523,-0.51025873,1.1141272,0.28813183,-0.10634196,-0.9954202,-0.07171043,-0.5856978,-1.2660285,2.1327019,-0.60268104,-0.6227884,-0.42101067,0.059599742,-0.09640202,-1.258711,0.5054924,-1.1847625,-0.044398762,-0.98595804,0.8883682,0.6221085,-2.5484293,0.7249505,0.69930685,0.7739025,-0.8139478,-1.1988907,-1.0416493,2.0153732,-1.7091763,-0.5611238,-1.2147603,0.9113469,1.5113174,0.23810485,-1.702736,-0.3295935,-0.41867778,-1.0378691,-0.45600057,-0.43525052,-0.1078409,-1.2993969,0.12842774,0.026097976,-0.7705405,-1.0907317,0.28274077,1.236289,1.6190177,2.1874366,0.16072829,-0.33150536,2.218483,-1.1703843,0.10300327,1.1994884,0.48275462,-0.40795812,0.5020531,1.1787555,-0.082187966,0.6315653,0.36654752,-0.24940589,-0.8652801,1.6283739,0.41405886,0.6377814,0.08396838,-1.0169003,1.2100558,1.4457762,-0.07999261,-0.012102162,0.85055244,-0.09711141,-1.0452846,0.13768612,-2.0506873,-1.6474499,0.043265514,-1.0009454,-0.0111249415,-1.2523409,-0.080719866,-0.6187693,-1.398226,0.6425289,-0.4808641,-0.06030046,-0.10275636,-0.31625932,-1.5993032,-0.20966552,-1.4618409,0.34925935,-0.5034448,0.100028045,0.25327235,-1.078896,-0.23394233,1.2247928,-2.6050038,-0.71609926,-0.77765155,-1.2089496,0.8526703,-0.1358416,1.1074059,1.1545771,-0.94525933,0.41012967,0.9361201,-0.14788401,-0.29333082,1.5782444,1.1100405,-0.4074414,-0.3862537,-0.5779069,-0.88644946,0.2233385,1.3612705,1.2413827,-1.3625424,-1.3623037,0.3056319,-1.4446377,0.64613384,0.15064861,-0.61473364,-1.3611295,-0.1975697,-1.0701923,0.7591377,-1.2106745,-2.067824,0.45041704,-0.71582735,-1.743847,1.169414,2.0158787,0.4734838,-0.3133036,-1.9916989,1.1441987,0.9155275,-1.3003027,0.82898057,-0.7439868,1.2072865,0.46877453,0.6648313,0.80477613,1.6927507,0.5842916,0.36608973,2.259473,-1.1628797,-0.3311869,0.36989415,-0.25035658,-0.28012496,1.092324,-0.40238732,0.0046352614,-2.0625768,1.161326,0.92277956,-0.20316431,-0.15164377,1.1715052,-0.7067665,-0.8608931,0.079684004,-0.89916384,0.02488108,-0.57668805,-0.879138,-1.7274998,0.824049,1.5638052,0.28415555,-0.17347385,-2.085917,0.4987632,-0.031601395,-1.0190377,-0.7815816,0.69643223,-0.33574122,2.1242745,0.19785832,0.55690974,-1.3932099,-0.44952002,-0.7719798,1.1326215,0.43839702,0.996999,-0.55692834,1.4014084,-2.3939395,-1.0112343,-1.4143248,1.1514114,1.1233637,-0.3385779,-0.23665123,-0.100857966,0.4633971,1.6215613,-0.0692888,-0.031505972,1.0472811,0.57112235,0.6015763,0.07582237,0.52702487,-1.3809607,-1.7482765,0.38386008,0.99316126,0.95603,0.40644804,-1.5072054,-0.34419048,0.63205683,-1.0854999,-0.92245156,-0.2712947,-0.75696105,0.996232,-0.10738732,-0.4674776,-1.2149413,-1.5094053,1.6796608,0.21961057,-0.35295358,-1.2609407,-0.040009048,-0.38785484,0.7788784,-0.65823495,2.0559616,-1.0074826,1.2282485,1.2540467,0.4914942,-0.47057188,-0.47061247,-0.16255763,-0.6718562,-0.53630847,0.4804698,-0.3134068,0.6407026,-0.727981,0.0481851,0.06927338,0.8321921,-0.6639807,0.74932885,0.23291564,0.76362675,-1.2966217,0.8806557,-1.2141875,0.6996881,-0.8293652,0.9085288,1.8878758,0.11363957,0.148718,0.5030497,-1.0422761,0.08673843,-0.80342984,-0.8046266,-0.18026677,0.28900644,0.76534355,-1.3163859,-0.72775376,0.36529016,-0.3660175,0.31056792,0.052575216,-1.11831,0.7895246,1.1172394,-0.31374845,0.17143561,-0.42633826,0.16579832,-0.012790448,-1.4290546,-0.47322562,0.7557427,1.0487514,-0.14331971,0.30455515,0.6938429,0.5565799]
Aggregate functions
ai_agg
ai_agg(text)
ai_agg(model_name, text)
-
Description: The
ai_aggaggregate function generates a summary by semantically aggregating and analyzing text from multiple rows. It collects text from all rows in a group and passes the content to an AI model for summarization. -
Parameters:
-
model_name: The name of the model to use. If omitted, the default model is used. Data type: VARCHAR. -
text: The text content to be aggregated. Data type: VARCHAR.
-
-
Return value: VARCHAR.
Example 1: Using the default model
SELECT product_name, ai_agg(comment) AS summary
FROM product_reviews
GROUP BY product_name;
Result
+--------------+------------------------------------------------------+
| product_name | summary |
+--------------+------------------------------------------------------+
| Headphones A | Users generally praised the sound quality and comfort, but some reported insufficient battery life. |
+--------------+------------------------------------------------------+
Example 2: Specifying a model name
SELECT product_name, ai_agg('qwen-plus', comment) AS summary
FROM product_reviews
GROUP BY product_name;
Result
+--------------+------------------------------------------------------+
| product_name | summary |
+--------------+------------------------------------------------------+
| Headphones A | Users generally praised the sound quality and comfort, but some reported insufficient battery life. |
+--------------+------------------------------------------------------+
Example 3: Aggregating data from the entire table
SELECT ai_agg(feedback) AS overall_summary
FROM customer_feedback;
Result
+------------------------------------------------------------+
| overall_summary |
+------------------------------------------------------------+
| The overall feedback indicates high satisfaction with product quality, but highlights that logistics speed needs improvement. |
+------------------------------------------------------------+
Model list
The following models are currently available:
|
Brand |
Model name |
Type |
|
Qwen |
qwen3-max |
text model |
|
qwen3.6-plus |
text model |
|
|
qwen3.5-plus |
text model |
|
|
qwen3.5-flash |
text model |
|
|
DeepSeek |
deepseek-v4-flash |
text model |
|
deepseek-v4-pro |
text model |
|
|
deepseek-v3.2 |
text model |
|
|
Zhipu AI |
glm-5.1 |
text model |
|
glm-5 |
text model |
|
|
Moonshot AI |
kimi-k2.6 |
text model |
|
MiniMax |
MiniMax-M2.7 |
text model |
Configure default models for AI functions
When calling an ai function without specifying a model name, the system uses a default model. You can change the default models for each model type by setting its corresponding parameter. For instructions on how to set parameters, see Config and Hint configuration parameters.
|
Model type |
Default model |
Parameter |
|
TEXT_TO_TEXT (text model) |
qwen-plus |
AI_FUNCTION_TEXT_TO_TEXT_MODEL |
|
TEXT_TO_EMBEDDING (text embedding model) |
text-embedding-v2 |
AI_FUNCTION_TEXT_TO_EMBEDDING_MODEL |
|
AUDIO_TO_TEXT (audio-to-text model) |
tingwu |
AI_FUNCTION_AUDIO_TO_TEXT_MODEL |
|
AUDIO_TO_EMBEDDING (audio embedding model) |
qwen2.5-vl-embedding |
AI_FUNCTION_AUDIO_TO_EMBEDDING_MODEL |
|
IMAGE_TO_TEXT (image model) |
qwen35_plus |
AI_FUNCTION_IMAGE_TO_TEXT_MODEL |
Typical use cases
Use case 1: Text retrieval
Scenario: In applications such as RAG and customer service systems, retrieve results from a text library that semantically match a user's query.
Use the ai_embed function to convert the search query and the text in the library into embeddings. Then, calculate the cosine similarity and sort the results to retrieve the 10 most similar texts.
SELECT cosine_similarity(ai_embed("The semantic text to search for"), ai_embed(text)) AS cos
FROM text_table
ORDER BY cos
LIMIT 10
To further optimize vector search performance, see vector search.
Use case 2: Speaker recognition

Scenario: For meetings or daily conversations, transcribe recordings into text and distinguish between speakers in the dialogue.
To further analyze speaker information, you can segment the audio by speaker and then compare the embedding of each segment with those in a voice dataset. For the complete solution, see speaker recognition.
-
Transcription and segmentation: Use the
AI_AUDIO_TRANSCRIBEfunction to transcribe audio into text and segment it by speaker.SELECT AI_AUDIO_TRANSCRIBE("oss://xxx", "{'diarization_mode':'speaker'}") -
Speaker recognition: Use the
AI_AUDIO_EMBEDfunction to generate embeddings for the input audio and for audio in the library. Then, calculate thel2_distanceand select the 10 closest matches.SELECT l2_distance(input_embedding, origin_embedding) AS ld FROM( SELECT ai_audio_embed("oss://xxx") AS input_embedding JOIN SELECT ai_audio_embed(url) AS origin_embedding FROM audio_table) ORDER BY ld LIMIT 10
Use case 3: Extract negative customer reviews
Scenario: Filter out negative feedback from user comments.
The customer_comment table stores daily customer comments. Use the ai_sentiment function to perform sentiment analysis on each comment and filter for all negative comments on January 1, 2026.
-- The qwen_plus_external model, which corresponds to the qwen-plus model in Model Studio, must be created beforehand.
SELECT name, comment
FROM customer_comment
WHERE ai_sentiment("qwen_plus_external", comment)='negative'
AND dt='20260101'