Build a retrieval-augmented generation (RAG) chatbot using EAS and RDS PostgreSQL

更新时间:
复制 MD 格式

This topic describes how to use an RDS for PostgreSQL instance when you deploy a retrieval-augmented generation (RAG) service. It also explains the basic features of the RAG chatbot and the unique features of RDS for PostgreSQL.

Background information

EAS

Elastic Algorithm Service (EAS) is a model online service platform in PAI that supports deploying models as online inference services and AI-powered web apps. EAS provides features such as Auto Scaling and blue-green deployment, for high-concurrency, stable online services at a low resource cost. EAS also offers resource group management, version control, and a comprehensive O&M and monitoring system. For more information, see EAS overview.

RAG

While large language models (LLMs) have achieved remarkable success, their widespread application has revealed several inherent limitations:

  • Limited domain knowledge: LLMs are typically trained on large-scale, general-purpose datasets and struggle to provide in-depth, targeted processing for specialized vertical domains.

  • Outdated information: Due to the static nature of their training datasets, LLMs cannot access and learn the latest information and knowledge in real time.

  • Misleading outputs: Due to data biases and inherent model flaws, LLMs may produce outputs that seem plausible but are factually incorrect. This phenomenon is known as "model hallucination."

Retrieval-Augmented Generation (RAG) was developed to address these challenges and enhance the accuracy of large models. By integrating external knowledge bases, RAG significantly reduces model fabrications and improves the model's ability to access and apply the latest knowledge, enabling more personalized and precise LLM customization.

RDS for PostgreSQL overview

Alibaba Cloud Relational Database Service (RDS) supports the PostgreSQL engine. The main advantages of PostgreSQL are its full implementation of the SQL standard and its support for a wide variety of data types, such as JSON, IP, and geometric data. In addition to fully supporting features such as transactions, subqueries, Multi-Version Concurrency Control (MVCC), and data integrity, RDS for PostgreSQL also integrates important features such as high availability and backup and restore. This integration reduces your O&M workload. For more information about the advanced features of RDS for PostgreSQL, see RDS for PostgreSQL.

Workflow

EAS provides a comprehensive RAG solution with flexible parameter configurations. You can customize your own conversational system by using the RAG service through the WebUI or API calls. The core architecture of RAG consists of retrieval and generation:

  • For retrieval, EAS supports multiple vector databases, including the open-source Faiss, Alibaba Cloud Milvus, Elasticsearch, Hologres, OpenSearch, and RDS for PostgreSQL.

  • For generation, EAS supports a rich set of open-source models, such as Qwen, Llama, Mistral, and Baichuan, as well as ChatGPT API calls.

This solution uses RDS for PostgreSQL as an example to demonstrate how to build an LLM-based RAG chatbot with EAS and RDS for PostgreSQL. The process is as follows:

  1. Prepare the RDS for PostgreSQL vector database

    First, create an RDS for PostgreSQL instance and prepare the configuration items required to associate the instance when you deploy the RAG service.

  2. Deploy the RAG service and associate the RDS for PostgreSQL instance

    Deploy the RAG service on the EAS platform and associate the RDS for PostgreSQL instance.

  3. Use the RAG chatbot

    In the RAG chatbot, you can connect to RDS for PostgreSQL, upload your enterprise knowledge base files, and perform knowledge-based Q&A.

Prerequisites

A VPC, a vSwitch, and a security group have been created. For more information, see Create a VPC with an IPv4 CIDR block and Create a security group.

Notes

This solution is limited by server resource size and the default LLM token limit, which restricts conversation length. Use this guide to test basic RAG chatbot retrieval features.

Prepare the RDS for PostgreSQL vector database

Step 1: Create an RDS for PostgreSQL instance and a database

  1. Create an RDS for PostgreSQL instance.

    1. Go to the ApsaraDB RDS instance creation page.

    2. On the purchase page, configure the following key parameters. For more information about the other parameters, see Create an RDS for PostgreSQL instance.

      • Engine: Select PostgreSQL.

      • VPC: Select the VPC that you created.

      • Privileged Account: In the More Configurations section, select Set Now and then configure the database account and password.

    3. Follow the on-screen instructions to complete the payment and activation.

  2. Create a database.

    1. Click the name of the instance that you created. In the navigation pane on the left, click Database Management and then click Create Database.

    2. In the Create Database panel, configure the Database (DB) Name. For Authorized Account, select the privileged account that you created. For more information about the other parameters, see Create accounts and databases.

    3. After you configure the parameters, click Create.

Step 2: Prepare configuration items

  1. Obtain the database endpoint.

    On the instance details page for the RDS for PostgreSQL instance, select Database Connection from the navigation pane on the left. You can then view the internal and public endpoints and their corresponding port numbers.

    • Use the internal endpoint: The RAG service must be in the same VPC as the database instance.

    • Use the public endpoint: If EAS accesses RDS for PostgreSQL over the public network, EAS must have public network access. To ensure that the PostgreSQL instance can receive public requests from the EAS instance, you must apply for a public endpoint for the PostgreSQL instance. You must also add the relevant Elastic IP Address (EIP) or 0.0.0.0/0 to the whitelist. The steps are as follows:

      1. Apply for a public endpoint for the RDS for PostgreSQL instance. For more information, see Apply for or release a public endpoint.

      2. Enable public network access for EAS. You must attach a NAT Gateway and an EIP to the VPC that you will use when you deploy the RAG service. For more information, see Use the SNAT feature of an Internet NAT gateway to access the Internet.

        Note

        The RAG service can use the same VPC as the RDS for PostgreSQL instance or a different VPC.

      3. Add 0.0.0.0/0 or the EIP to the public access whitelist of the RDS for PostgreSQL instance. For more information, see Configure a whitelist.

  2. Obtain the privileged account and password.

    On the instance details page for the RDS for PostgreSQL instance, select Account Management from the navigation pane on the left. On this page, you can view the privileged account that you created. The password was set when you created the instance. If you forget the password, you can click Reset Password to change it.

Deploy the RAG service and associate the RDS for PostgreSQL instance

  1. Log on to the PAI console, select a workspace, and then click Go to EAS.

  2. On the Inference Service tab, click Deploy Service. In the Scenario-based Model Deployment section, click RAG-based Smart Dialogue Deployment.

  3. Configure the following key parameters. For more information about the other parameters, see Step 1: Deploy the RAG service.

    Parameter

    Description

    Basic Information

    Version

    Select LLM-Integrated Deployment.

    RAG Version

    Select pai-rag:0.3.4.

    Model Type

    Select qwen1.5-1.8b.

    Resource Information

    Deployment

    The system automatically recommends a suitable resource specification based on the selected model category. If you switch to a different resource specification, the model service might fail to start.

    Vector Database Settings

    Vector Database Type

    Select RDS PostgreSQL.

    Host address

    Set this to the internal or public endpoint of the RDS for PostgreSQL instance.

    Port

    Set this to the port number of the RDS for PostgreSQL instance, for example, 5432.

    Database

    Enter the name of the created database.

    Table Name

    Enter a new table name or an existing table name. If the table exists, its schema must meet the PAI-RAG requirements. For example, you can enter the name of a table that was automatically created when you previously deployed a RAG service using EAS.

    Account

    Enter the created privileged account.

    Password

    Enter the password for the privileged account.

    OSS Path

    Select an existing OSS storage folder in the current region. Knowledge base management is implemented by mounting the OSS path.

    VPC

    VPC

    • If you use the internal endpoint for the host, the RAG service must be configured with the same VPC as the RDS for PostgreSQL instance.

    • If you use the public endpoint for the host, you must configure a VPC for the RAG service. Ensure that the VPC has public network access. For more information, see Use the SNAT feature of an Internet NAT gateway to access the Internet. You must also add the attached EIP or 0.0.0.0/0 to the public access whitelist of the RDS for PostgreSQL instance. For more information, see Configure a whitelist.

    vSwitch

    Security Group Name

  4. After you configure the parameters, click Deploy.

Use the RAG chatbot

1. Check the vector database configuration

The basic usage of the RAG conversational system is as follows. For more information, see Large Model RAG Conversational System (v0.3.x).

  1. Click the name of the target RAG service, and then click View Web App in the upper-right corner of the page.

  2. Check whether the PostgreSQL vector database configuration is correct.

    The system automatically configures the `default` knowledge base and applies the vector database settings that you configured during the RAG service deployment. In the Vector Database Configuration section, verify that the PostgreSQL configuration is correct. You can modify the configuration items if needed and then click Update Knowledge Base.

2. Upload enterprise knowledge base files

On the Knowledge Base tab, click the File Management tab to upload knowledge base files. For more information about how to upload knowledge base files, see RAG knowledge base management.

After you upload the knowledge base, the system automatically stores the files in the vector database in the PAI-RAG format. If you upload a knowledge base file with the same name as an existing file, the original file is overwritten in all vector databases except Faiss. Supported file types include .html, .htm, .txt, .pdf, .pptx, .md, Excel (.xlsx or .xls), .jsonl, .jpeg, .jpg, .png, .csv, and Word (.docx), for example, rag_chatbot_test_doc.txt.

3. Perform knowledge-based Q&A

On the Chat tab, select a knowledge base and an intent. To use more tools, select Query Knowledge Base. You can then perform knowledge-based Q&A.

Support for unique RDS for PostgreSQL features

  1. Go to the RDS instance list, switch to the region where the instance is located, and click the instance name to go to the instance details page.

  2. In the navigation pane on the left, select Database Management. Then, in the Actions column for the target database, click SQL Query.

  3. Enter the Database Account and Database Password. These are the privileged account and password that you set when you created the RDS for PostgreSQL instance. Then, click Log On.

  4. After you log on, you can query the list of imported knowledge bases in the database instance. In the database tree on the left, expand the public schema of the testdb database. In the SQL editor, run a SELECT query on the vector storage table. The query results include columns such as id, text, metadata_, and node_id. The metadata_ column stores metadata, such as file path, file name, file type, file size, and creation date, in JSON format.