关系型LLM查询

更新时间:
复制为 MD 格式

当您需要在数据库内直接对文本数据进行情感分析、实体提取、内容分类或基于内容的问答时,传统SQL难以胜任。PolarDB PostgreSQL的关系型LLM查询功能,允许您在SQL语句中直接调用大语言模型(LLM),无需构建复杂的数据导出和处理管道,即可实现库内Data+AI一体化分析,提升了AI应用的开发和数据处理效率。

适用范围

使用本功能前,请确保您的环境满足以下条件:

  • 集群限制:

    • 企业版:

      • 引擎:PostgreSQL 16(内核小版本2.0.16.10.11.0及以上)。

      • 节点:增加AI节点,并设置AI节点的连接数据库账号添加与管理AI节点

        说明
        • 若您在购买集群时已添加AI节点,则可以直接为AI节点设置连接数据库的账号。

        • AI节点的连接数据库账号需具有读写权限,以确保能够顺利读取和写入目标数据库。

    • 标准版:

      • 引擎:PostgreSQL 16(内核小版本2.0.16.10.11.0及以上)。

      • 地域:华北2(北京)。

  • 访问地址:使用集群地址连接PolarDB集群。

准备工作

为了更好地演示关系型LLM查询,先创建polar_ai扩展与电影(movies)和评论(reviews)两张表,并插入一些样本数据。

  1. 创建polar_ai扩展:在数据库中执行以下命令,创建功能扩展。

    CREATE EXTENSION IF NOT EXISTS polar_ai;
  2. 创建数据表结构:执行以下SQL语句创建moviesreviews表。

    CREATE TABLE movies (
        id SERIAL PRIMARY KEY,
        title TEXT NOT NULL,
        genre TEXT,
        release_year INT,
        director TEXT,
        rotten_tomatoes_link TEXT UNIQUE NOT NULL,
        movie_info TEXT
    );
    
    CREATE TABLE reviews (
        id SERIAL PRIMARY KEY,
        user_name TEXT NOT NULL,
        rating DECIMAL(2,1) CHECK (rating >= 0.0 AND rating <= 5.0),
        review_content TEXT,
        rotten_tomatoes_link TEXT NOT NULL,
        review_date DATE DEFAULT CURRENT_DATE
    );
  3. 为数据表填充样本数据:执行以下SQL语句插入电影和评论信息。

    INSERT INTO movies (title, genre, release_year, director, rotten_tomatoes_link, movie_info) VALUES
    ('The Shawshank Redemption', 'Drama', 1994, 'Frank Darabont', 'https://www.xxx.com/m/shawshank_redemption', 'A story of hope and friendship in prison.'),
    ('The Godfather', 'Crime', 1972, 'Francis Ford Coppola', 'https://www.xxx.com/m/godfather', 'Epic tale of power and family.'),
    ('Forrest Gump', 'Drama', 1994, 'Robert Zemeckis', 'https://www.xxx.com/m/forrest_gump', 'Life of a kind man through historical events.');
    
    INSERT INTO reviews (user_name, rating, review_content, rotten_tomatoes_link, review_date) VALUES
    ('Alice', 4.8, 'An inspiring masterpiece!', 'https://www.xxx.com/m/shawshank_redemption', '2024-01-15'),
    ('Bob', 4.6, 'One of the greatest films ever made.', 'https://www.xxx.com/m/shawshank_redemption', '2024-02-20'),
    ('Charlie', 4.9, 'Brilliant acting and direction.', 'https://www.xxx.com/m/godfather', '2024-01-10'),
    ('Diana', 4.7, 'Powerful family saga.', 'https://www.xxx.com/m/godfather', '2024-03-05'),
    ('Eve', 4.5, 'Mind-blowing concept!', 'https://www.xxx.com/m/forrest_gump', '2024-01-05');

配置LLM模型

在数据库中调用LLM前,您需要先注册模型服务并完成认证。

企业版集群

  1. 创建用于处理LLM请求和响应的辅助函数_polar4ai_tongyi_in_fn函数用于将输入文本构造成符合目标模型(如通义千问)API规范的JSON请求体。_polar4ai_tongyi_out_fn函数用于从模型返回的JSON响应中提取出核心内容。

    -- 创建请求构造函数
    CREATE OR REPLACE FUNCTION polar_ai._polar4ai_tongyi_in_fn(model_name text, content text)
        RETURNS jsonb
        LANGUAGE plpgsql
        AS $function$
        BEGIN
        RETURN ('{"model": "'|| model_name ||'","messages":[{"role": "user", "content": "'|| content ||'"}],"max_tokens": 100 }')::jsonb;
        END;
        $function$;
    
    -- 创建响应解析函数
    CREATE OR REPLACE FUNCTION polar_ai._polar4ai_tongyi_out_fn(model_id text, response_json jsonb)
        RETURNS jsonb
        AS $$ select (((response_json->>'choices')::jsonb->0)::jsonb->>'message')::jsonb as result $$
        LANGUAGE 'sql' IMMUTABLE;
  2. PolarDB中创建一个AI模型:将模型名称与LLM服务的接入点(Endpoint)及处理函数关联起来。执行AI_CreateModel函数。在创建模型的过程中,需对以下参数进行调整:

    说明

    • 模型调用地址model_url域名修改:固定为http://prod-db4ai-service-proxy-hangzhou.aliyun-inc.com/v1/chat/completions

    • 模型配置信息model_config中的token固定为空。

    • 模型输入转换函数model_in_transform_fn为上述步骤创建的ai_text_embedding_in_fn函数。

    • 模型输出转换函数model_out_transform_fn为上述步骤创建的ai_text_embedding_out_fn函数

    SELECT polar_ai.AI_CreateModel(
        'polar4ai/_polar4ai_tongyi', -- 自定义模型名称
        'http://prod-db4ai-service-proxy-hangzhou.aliyun-inc.com/v1/chat/completions',
        'Polar4AI',
        '问答模型',
        '_polar4ai_tongyi',
        '{"author_type": "token", "token": ""}',
        NULL,
        'polar_ai._polar4ai_tongyi_in_fn'::regproc,
        'polar_ai._polar4ai_tongyi_out_fn'::regproc
    );
  3. (可选)查看已创建的AI模型

    SELECT * from polar_ai._ai_models;
  4. 获取并绑定Token

    1. 请前往PolarDB控制台,目标集群详情页中数据库节点区域,找到AI节点,并单击查看记录节点Tokenimage

    2. 在数据库中执行以下命令,设置访问大模型服务所需的密钥。返回t代表执行成功,f代表执行失败。

      SELECT polar_ai.AI_SetModelToken('polar4ai/_polar4ai_tongyi', '<YOUR_API_KEY>');

标准版集群

  1. 创建用于处理LLM请求和响应的辅助函数_polar4ai_tongyi_in_fn函数用于将输入文本构造成符合目标模型(如通义千问)API规范的JSON请求体。_polar4ai_tongyi_out_fn函数用于从模型返回的JSON响应中提取出核心内容。

    -- 创建请求构造函数
    CREATE OR REPLACE FUNCTION polar_ai._polar4ai_tongyi_in_fn(model_name text, content text)
        RETURNS jsonb
        LANGUAGE plpgsql
        AS $function$
        BEGIN
            RETURN jsonb_build_object(
                'model', model_name,
                'messages', jsonb_build_array(
                    jsonb_build_object(
                        'role', 'system',
                        'content', 'You are a helpful assistant.'
                    ),
                    jsonb_build_object(
                        'role', 'user',
                        'content', content
                    )
                )
            );
        END;
        $function$;
    
    
    -- 创建响应解析函数
    CREATE OR REPLACE FUNCTION polar_ai._polar4ai_tongyi_out_fn(model_id text, response_json jsonb)
        RETURNS jsonb
        AS $$
            SELECT (response_json -> 'choices' -> 0 -> 'message')::jsonb;
        $$
        LANGUAGE 'sql' IMMUTABLE;
    
  2. PolarDB中创建一个AI模型:将模型名称与LLM服务的接入点(Endpoint)及处理函数关联起来。执行AI_CreateModel函数。在创建模型的过程中,需对以下参数进行调整:

    • 模型调用地址model_url域名修改:将通义千问的HTTP请求地址的替换为阿里云大模型服务平台百炼的私网连接(PrivateLink),即将https://dashscope.aliyuncs.com更改为https://vpc-cn-beijing.dashscope.aliyuncs.com

    • 模型配置信息model_config中的token固定为空。

    • 模型输入转换函数model_in_transform_fn为上述步骤创建的ai_text_embedding_in_fn函数。

    • 模型输出转换函数model_out_transform_fn为上述步骤创建的ai_text_embedding_out_fn函数

    SELECT polar_ai.AI_CreateModel(
        'polar4ai/_polar4ai_tongyi/qwen-plus', -- 自定义模型名称
        'https://vpc-cn-beijing.dashscope.aliyuncs.com/compatible-mode/v1/chat/completions',
        'Polar4AI',
        '问答模型',
        'qwen-plus',
        '{"author_type": "token", "token": ""}',
        NULL,
        'polar_ai._polar4ai_tongyi_in_fn'::regproc,
        'polar_ai._polar4ai_tongyi_out_fn'::regproc
    );
  3. (可选)查看已创建的AI模型

    SELECT * from polar_ai._ai_models;
  4. 绑定模型Token

    首次调用模型前,需先前往阿里云大模型服务平台百炼开通服务,并获取API Key。随后,执行如下SQL命令将您的API-KEY绑定到指定模型中。返回t代表执行成功,f代表执行失败。

    SELECT polar_ai.AI_SetModelToken('polar4ai/_polar4ai_tongyi/qwen-plus', '<YOUR_API_KEY>');

创建可复用的LLM查询函数

为了简化SQL查询并提高代码的可读性和可维护性,使用AI_CallModel创建自定义模型调用函数,将封装底层模型调用、提示词(Prompt)模板替换和错误处理逻辑,提供一个简洁的调用接口。

CREATE OR REPLACE FUNCTION public.llm(prompt text, movie_info text, review_content text,model_id text default '<上一步创建的模型名称>')
    RETURNS text
    AS $$ 
DECLARE
  message text;
BEGIN
  message := replace(prompt, '{movie info}', 'movie info:' ||  quote_literal(movie_info));
  message := replace(message, '{user review}', 'user review:' ||  quote_literal(review_content));  
  return (polar_ai.AI_CALLMODEL($4,message))::jsonb->>'content'; 
END;  
$$
LANGUAGE plpgsql IMMUTABLE;

基于上述封装的LLM函数,可以结合库内的数据,执行基于关系模型与大语言模型的混合查询和处理,显著提升了数据AI应用的灵活性。以下提供几个典型的查询范式案例,供参考与类比。

LLM投影查询(SELECT)

SELECT子句中调用LLM函数,对每一行数据进行处理,生成新的信息。例如,根据电影信息和用户评论,为用户生成个性化的电影推荐。

SELECT public.llm('Recommend movies for the user based on {movie info} and {user review}',
  m.movie_info, 
  r.review_content)
FROM reviews r
JOIN movies m ON r.rotten_tomatoes_link = m.rotten_tomatoes_link;

示例返回结果如下:

                                                    llm                                                                                                                                                            
--------------------------------------------------------------------------------------------------------------------------------------------
 Based on the movie description **"A story of hope and friendship in prison"** and your enthusiastic review **"An inspiring masterpiece!"**,
 it sounds like you're looking for films that share similar emotional depth, themes of resilience, and powerful human connections. Here are 
some **recommendations** that might resonate with you:                                                        +
 ### Recommended Movies                                                                                            
 #### 1. **The Shawshank Redemption (1994)**                                                                          
 - **Why You'll Like It**: This is often
 Based on the movie description "A story of hope and friendship in prison" and the user's enthusiastic review "One of the greatest films eve
r made," it sounds like you're referring to **The Shawshank Redemption** (1994). This film is widely celebrated for its powerful themes of h
ope, resilience, and enduring friendship.                                                                     +
 Here are some movies with similar themes or emotional depth that you might enjoy:                                                          
 ###  **Similar Movies to The Shawshank Redemption:**                                                                                     
 1
 Based on the movie description **"Epic tale of power and family"** and the user review **"Brilliant acting and direction,"** it sounds like
 you're looking for films that are **dramatic, character-driven, and feature strong performances**, with a focus on **family dynamics and th
emes of power or ambition**.                                                                                  +
 Here are some highly recommended movies that match this vibe:                                                                              
 ###  **1. House of Cards (TV Series)**                                                                                                   
 - **Why?**
 Based on the movie description "Epic tale of power and family" and the user review "Powerful family saga," it sounds like you're looking fo
r films that explore themes of **family dynamics, ambition, legacy, and power struggles**—likely with dramatic or epic undertones. Here ....

LLM筛选查询(WHERE)

WHERE子句中调用LLM函数,根据语义条件对行进行过滤。例如,筛选出不适合儿童观看的电影。

SELECT m.title
FROM Movies m
JOIN Reviews r ON r.rotten_tomatoes_link = m.rotten_tomatoes_link
WHERE 
public.llm('Analyze whether this movie would be suitable for kids based on {movie info} and {user review}, answer only Yes or No', 
m.movie_info, r.review_content) =
'No.';

示例返回结果如下:

     title     
---------------
 The Godfather

组合LLM查询(SELECT + WHERE)

该查询结合了两次大型语言模型的调用:一次在SELECT子句中生成主要输出,另一次在WHERE子句中根据额外的内容适用性标准过滤结果。其目的是通过依次应用多个大型语言模型函数,以优化最终输出。

SELECT llm('Recommend movies for the user based on {movie info} and {user review}',
  m. movie_info, 
  r. review_content) AS recommendations
FROM Movies m
JOIN Reviews r ON r. rotten_tomatoes_link = m. rotten_tomatoes_link
WHERE public.llm('Analyze whether this movie would be suitable for kids based on {movie info} and {user review}, answer only Yes or No', 
m. movie_info, r. review_content) = 'No.';

示例返回结果如下:

                                                            recommendations                                                                                                                                                           
--------------------------------------------------------------------------------------------------------------------------------------------
 Based on the movie description **"Epic tale of power and family"** and the user's review **"Brilliant acting and direction,"** it sounds li
ke you're looking for movies with strong character dynamics, intense storytelling, and high production quality. Here are some recommendation
s that match these themes:                   +
 ###  **Recommended Movies:**                  
 1. **Game of Thrones (TV Series)**                 
    - *Why?* A sprawling epic centered around power struggles, political intrigue, and
 Based on the movie description **"Life of a kind man through historical events"** and the user review **"Mind-blowing concept!"**, it sound
s like you're looking for thought-provoking, character-driven films that explore moral depth or unique perspectives in history. Here are som
e recommendations that align with this theme:+
 ###  **Recommended Movies:**                   
 #### 1. **The Counterfeiters (Die Fälscher) – 2007** 

LLM聚合查询(GROUP BY)

通过调用大语言模型为每部电影提供满意度评分,以便对整体客户反馈的平均情感进行定性评估。随后,使用平均函数对这些数值评分进行聚合。这种方法将定性数据整合为定量汇总指标。

SELECT m.title, AVG(llm('Rate a satisfaction score between 0(bad) and 5 (good) based on {movie info} and {user review}, answer only score',
  r.review_content, 
  m.movie_info)::float8
) as AverageScore
FROM reviews r
JOIN movies m 
ON r.rotten_tomatoes_link = m.rotten_tomatoes_link
GROUP BY m. title;

示例返回结果如下:

          title           | averagescore 
--------------------------+--------------
 The Shawshank Redemption |            5
 Forrest Gump             |            3
 The Godfather            |            5