This topic describes how to create and use a RAGFlow knowledge base with Data Transmission Service (DTS).
Prerequisites
-
You have created a vector database that meets the following requirements:
Supported vector database | Requirements
AnalyticDB for PostgreSQL instance | None specified.
PolarSearch cluster | A PolarDB for MySQL cluster with the PolarSearch feature enabled.
Lindorm instance | The engine type must include Search Engine and Vector Engine.
PolarDB for PostgreSQL cluster | The PGVector plugin must be installed.
-
An OSS bucket with the Standard storage class has been created in the same region as the vector database. For the storage redundancy type, we recommend Zone-redundant Storage.
-
Region: This feature is available only in the China (Hangzhou), China (Shanghai), China (Beijing), China (Shenzhen), China (Hong Kong), Singapore, and Indonesia (Jakarta) regions.
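The region and OSS bucket requirements above can be expressed as a small pre-flight check. The following is a minimal sketch, not part of the DTS console or any SDK: the function name `check_prerequisites` is hypothetical, and the region IDs are assumed mappings for the region names listed above.

```python
# Region IDs are assumed to correspond to the region names listed in the prerequisites.
SUPPORTED_REGIONS = {
    "cn-hangzhou": "China (Hangzhou)",
    "cn-shanghai": "China (Shanghai)",
    "cn-beijing": "China (Beijing)",
    "cn-shenzhen": "China (Shenzhen)",
    "cn-hongkong": "China (Hong Kong)",
    "ap-southeast-1": "Singapore",
    "ap-southeast-5": "Indonesia (Jakarta)",
}


def check_prerequisites(vector_db_region, bucket_region, bucket_storage_class):
    """Return a list of problems; an empty list means all checks pass.

    Hypothetical helper: the caller supplies the values, for example
    after reading them from the console.
    """
    problems = []
    if vector_db_region not in SUPPORTED_REGIONS:
        problems.append(f"region {vector_db_region!r} is not in the supported list")
    if bucket_region != vector_db_region:
        problems.append("the OSS bucket must be in the same region as the vector database")
    if bucket_storage_class != "Standard":
        problems.append("the OSS bucket storage class must be Standard")
    return problems
```

For example, `check_prerequisites("cn-hangzhou", "cn-hangzhou", "Standard")` returns an empty list, whereas a bucket in a different region adds one entry to the result.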
Precautions
-
You cannot disable the public endpoint for a RAGFlow knowledge base after it is enabled.
-
A registered RAGFlow account is valid only for the corresponding RAGFlow knowledge base.
Billing
For more information, see Billing for Data Preparation.
Procedure
Create a RAGFlow knowledge base
-
Go to the RAGFlow knowledge base list page for the destination region.
-
Log on to the Data Transmission Service (DTS) console.
-
In the left-side navigation pane, click Data Preparation.
-
In the upper-left corner, select the region where the data preparation instance resides.
-
Click the RAGFlow knowledge base tab.
-
Click Create Knowledge Base to open the task configuration page.
-
Configure the RAGFlow knowledge base.
-
In the Deployment Scope section, enter an Instance Name for the RAGFlow knowledge base.
-
In the Network and Zone section, select a VPC, a Primary Zone and vSwitch, and a Secondary Zone and vSwitch for the RAGFlow knowledge base.
-
In the RAGFlow Knowledge Base Configuration section, enter the Number of Knowledge Base Services.
Note: In this example, the Configuration Plan is kept as Default.
-
In the Vector Database Configuration section, configure the vector database.
Note: If you select Import from Existing Instance, enter the Database Name, Database Schema Name, and Database Account of the existing instance.
ADB PostgreSQL
Set Engine to AnalyticDB for PostgreSQL. In the Database field, select the destination AnalyticDB for PostgreSQL instance and enter the Database Name, Database Schema Name, Database Account, and Password for that instance.
PolarSearch
Set Engine to PolarSearch. In the Database field, select a PolarDB for MySQL cluster with PolarSearch enabled, and enter the Database Account and Password for that cluster.
PolarDB PostgreSQL
Set Engine to PolarDB PostgreSQL. In the Database field, select the destination PolarDB for PostgreSQL cluster and enter the Database Name, Database Schema Name, Database Account, and Password for that cluster.
Lindorm
Set Engine to Lindorm. In the Database field, select the destination Lindorm instance and enter the Database Account and Password for that instance.
-
In the OSS Configuration section, select the destination bucket and enter the data storage path.
-
After you complete the configuration, click Buy Now on the right side of the page.
-
Return to the RAGFlow knowledge base list page and wait until the Status of the instance changes to Running.
Note: You can click the refresh icon in the upper-right corner to view the latest status of the RAGFlow knowledge base.
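If you track the instance status from a script instead of refreshing the page, the wait can be written as a polling loop. The following is only a sketch: this topic does not describe an API for reading the status, so `get_status` is a hypothetical function that you supply, and the timeout and interval values are arbitrary.

```python
import time


def wait_until_running(get_status, timeout_s=600.0, interval_s=10.0, sleep=time.sleep):
    """Poll get_status() until it returns "Running" or the timeout elapses.

    get_status is a hypothetical, caller-supplied function that returns the
    current Status string of the RAGFlow knowledge base.
    Returns True if "Running" was observed, otherwise False.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status() == "Running":
            return True
        if time.monotonic() >= deadline:
            return False
        # Wait before the next check to avoid polling too aggressively.
        sleep(interval_s)
```

The `sleep` parameter exists only so that the loop can be exercised without real delays.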
Configure an IP whitelist
-
In the Actions column of the target RAGFlow knowledge base, click Set up a white list.
-
In the Set up a white list panel, add IP addresses to the whitelist based on your access method.
Access Method | Example Scenario | IP whitelist
Internal network | The client and the RAGFlow knowledge base are in the same VPC. | The private IP address or CIDR block of the client.
Internet | The client is on your on-premises server. | The public IP address or CIDR block of the client.
Description (applies to both access methods):
-
Separate multiple IP addresses or CIDR blocks with commas (,).
-
To find the client's public IP address, run the curl ipinfo.io/ip (recommended) or curl ifconfig.me command.
-
Click Yes.
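Before you paste entries into the whitelist panel, you may want to validate them and join them with commas, as the panel expects. The following is a minimal sketch that uses only the Python standard library. The helper names `fetch_public_ip` and `build_whitelist` are hypothetical, and the lookup assumes that ipinfo.io/ip returns the address as plain text, as it does for the curl command.

```python
import ipaddress
import urllib.request


def fetch_public_ip(url="https://ipinfo.io/ip", timeout_s=5.0):
    """Return this machine's public IP address as reported by the lookup service."""
    with urllib.request.urlopen(url, timeout=timeout_s) as response:
        text = response.read().decode("utf-8").strip()
    # Raises ValueError if the response is not a valid IP address.
    return str(ipaddress.ip_address(text))


def build_whitelist(entries):
    """Validate IP addresses or CIDR blocks and join them with commas."""
    normalized = []
    for entry in entries:
        entry = entry.strip()
        if "/" in entry:
            # strict=True rejects CIDR blocks with host bits set, such as 10.0.0.5/24.
            value = str(ipaddress.ip_network(entry, strict=True))
        else:
            value = str(ipaddress.ip_address(entry))
        # Skip duplicates while preserving the original order.
        if value not in normalized:
            normalized.append(value)
    return ",".join(normalized)
```

For internet access, `build_whitelist([fetch_public_ip()])` produces a single-entry value. An invalid entry raises `ValueError` instead of being silently dropped.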
Log on to RAGFlow
-
In the Actions column of the target RAGFlow knowledge base, click Manage.
Note: You can also click Actions in the Login to Knowledge Base column and choose to log on over the internal network or the internet.
-
In the Endpoint section, click Login external network address or Login Intranet Address.
Note: To access the RAGFlow knowledge base over the internet, you must enable the public endpoint for the instance. For the steps, see Enable the public endpoint in the Appendix.
-
On the logon page, enter the email address and password for your account, and then click Login.
-
On the RAGFlow page, manage knowledge bases and perform other operations.
Note: For more information, see the official RAGFlow documentation.
(Optional) Network configuration
By default, RAGFlow cannot access external networks. To add model providers in RAGFlow, you must configure a NAT gateway for the VPC that hosts the vector database used by RAGFlow. This allows the RAGFlow knowledge base to access external models.
-
Connect over a private network (Alibaba Cloud Model Studio)
Accessing Alibaba Cloud Model Studio over a private network improves data transfer security and efficiency. You can use PrivateLink to establish a network connection between your VPC and Alibaba Cloud Model Studio. For detailed instructions, see Access Model Studio models and application APIs over a private network.
-
Connect over the internet
Configure a NAT gateway for the VPC that hosts the vector database used by RAGFlow to allow access to external models. For more information about NAT gateways, see Public NAT gateway.
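After you configure the network, a basic TCP reachability test can help confirm that outbound traffic leaves the VPC. The following is a sketch under assumptions: you run it on a host in the same VPC whose traffic uses the same NAT gateway (for example, an ECS instance), and the host name you pass is a placeholder for your model provider's endpoint. It checks reachability from that host only, not from RAGFlow itself.

```python
import socket


def can_reach(host, port=443, timeout_s=5.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

A result of `False` for a provider endpoint that is reachable from other networks suggests that the NAT gateway or its SNAT configuration should be reviewed.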
Appendix
Enable the public endpoint
-
In the Actions column of the target RAGFlow knowledge base, click Manage.
-
In the Endpoint section, click Open external network address.
-
In the Open external network address dialog box that appears, click OK.
-
Wait until the Status in the Basic Information section changes to Running, which indicates that the public endpoint is enabled.
Register a RAGFlow account
-
Go to the RAGFlow logon page for the target RAGFlow knowledge base.
-
On the RAGFlow logon page, click Register.
-
Enter the email address, name, and password for the account.
-
Click Continue.
A message appears at the top of the page, indicating that the account has been registered successfully.