RAG engine - PolarDB - Alibaba Cloud Help Center

This document provides instructions for the following RAG engine features on the knowledge platform: administrator environment configuration, knowledge base management, document import and parsing, retrieval tests, conversational Q&A, and skill retrieval. You can perform each operation using the console or through API calls.

Overview

Retrieval-Augmented Generation (RAG) is a core feature of the knowledge platform. It vectorizes document content and stores it in a knowledge base to enable precise retrieval based on semantic similarity. A large language model (LLM) then generates a response based on the retrieved results.

A typical RAG workflow includes the following steps:

Admin configuration: Configure Object Storage Service (OSS), the document engine, and AI models to initialize the platform.
Create a knowledge base: Create a knowledge base to organize your document collection and set the chunking parameters.
Import and parse: Upload documents and trigger parsing. The system splits the documents into text chunks and generates vector embeddings.
Retrieval and Q&A: Run a retrieval test to verify recall, and then create a conversational assistant for Q&A.
Skill retrieval: Use an AI assistant's built-in skill to retrieve information directly from the knowledge base without manual API calls.

Role permissions are as follows:

Actions	Required role
Query knowledge base, retrieve, Q&A	VIEWER or higher
Create knowledge base, upload or delete documents, trigger parsing	DEVELOPER or higher
Configure AI models, Object Storage Service (OSS), document engine, resource authorization	ADMIN
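The "or higher" wording in the table implies an ordering of VIEWER < DEVELOPER < ADMIN. The following is a minimal sketch of that check. The action identifiers are hypothetical names chosen for illustration, not identifiers the platform exposes.

```python
# Role ranking implied by the table above: VIEWER < DEVELOPER < ADMIN.
ROLE_RANK = {"VIEWER": 0, "DEVELOPER": 1, "ADMIN": 2}

# Hypothetical action names mapped to the minimum role from the table.
REQUIRED_ROLE = {
    "query_knowledge_base": "VIEWER",
    "retrieve": "VIEWER",
    "qa": "VIEWER",
    "create_knowledge_base": "DEVELOPER",
    "upload_document": "DEVELOPER",
    "delete_document": "DEVELOPER",
    "trigger_parsing": "DEVELOPER",
    "configure_models": "ADMIN",
    "configure_oss": "ADMIN",
    "configure_document_engine": "ADMIN",
    "authorize_resources": "ADMIN",
}


def is_allowed(role, action):
    """Return True if `role` meets or exceeds the role required for `action`."""
    return ROLE_RANK[role] >= ROLE_RANK[REQUIRED_ROLE[action]]


print(is_allowed("DEVELOPER", "retrieve"))       # True
print(is_allowed("DEVELOPER", "configure_oss"))  # False
```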

Administrator configuration

An administrator (ADMIN) only needs to perform this configuration once after the initial platform deployment. You can skip this section if the environment is already configured.

OSS configuration

The RAG engine stores source files, intermediate parsing artifacts, and chunks in Object Storage Service (OSS).

Console

In the left navigation pane, select RAG Engine > Service Configuration.

In the Object storage (OSS) configuration panel, configure the following parameters:

Parameter	Description	Example
Storage type	The type of object storage service.	`OSS`, `S3`, `MINIO`
Access key ID	Your access key ID.	`AKxxxxxxxx`
Secret access key	Your secret access key. It is stored server-side with AES-GCM encryption. To keep a previously saved key, leave this field empty.	********
Endpoint URL	The endpoint URL for Object Storage Service (OSS).	`https://oss-cn-hangzhou.aliyuncs.com`
Region	The Object Storage Service (OSS) region.	`cn-hangzhou`
Bucket	The default bucket name.	`my-rag-bucket`
Prefix path	An optional path prefix for all objects stored in the bucket.	`rag/`
Signature version	The signature version for authenticating requests.	`s3`, `v4`
Addressing style	The bucket addressing style.	`virtual`, `path`

Click Test Connection to verify the configuration.
Click Save Configuration. The system then automatically restarts the RAG engine service, which takes approximately 10 to 30 seconds.

API

Save the OSS configuration:

POST /api/oss/config
Content-Type: application/json

{
  "storage_type":       "OSS",
  "endpoint_url":       "https://oss-cn-hangzhou.aliyuncs.com",
  "region":             "cn-hangzhou",
  "bucket":             "my-rag-bucket",
  "access_key":         "AKxxx",
  "secret_key":         "xxxxxxx",
  "prefix_path":        "rag/",
  "signature_version":  "s3",
  "addressing_style":   "virtual"
}

Saving the configuration automatically restarts the RAG engine service, which takes approximately 10 to 30 seconds. After the restart completes, call the test endpoint to verify connectivity:

POST /api/oss/test
Content-Type: application/json

{
  "storage_type":       "OSS",
  "endpoint_url":       "https://oss-cn-hangzhou.aliyuncs.com",
  "region":             "cn-hangzhou",
  "bucket":             "my-rag-bucket",
  "access_key":         "AKxxx",
  "secret_key":         "xxxxxxx",
  "prefix_path":        "rag/",
  "signature_version":  "s3",
  "addressing_style":   "virtual"
}

Expected response: {"code": 0, "connected": true}.
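The two calls above can be scripted. The sketch below uses only the Python standard library to build the requests and check the expected response. The base URL is a hypothetical placeholder, and the use of a bearer token on these endpoints is an assumption carried over from the upload example later in this document. Sending is left commented out so the sketch has no side effects.

```python
import json
import urllib.request

# Hypothetical base URL; replace it with your platform host.
BASE_URL = "http://rag.example.internal"


def build_post(path, payload, token):
    """Build (without sending) a JSON POST request for an API path."""
    return urllib.request.Request(
        BASE_URL + path,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: these endpoints accept the same bearer token as uploads.
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def oss_connected(body):
    """True when the body matches the expected {"code": 0, "connected": true}."""
    return body.get("code") == 0 and body.get("connected") is True


oss_config = {
    "storage_type": "OSS",
    "endpoint_url": "https://oss-cn-hangzhou.aliyuncs.com",
    "region": "cn-hangzhou",
    "bucket": "my-rag-bucket",
    "access_key": "AKxxx",
    "secret_key": "xxxxxxx",
    "prefix_path": "rag/",
    "signature_version": "s3",
    "addressing_style": "virtual",
}

save_request = build_post("/api/oss/config", oss_config, token="<token>")
test_request = build_post("/api/oss/test", oss_config, token="<token>")

# To send the requests:
# urllib.request.urlopen(save_request)
# ...wait for the restart, then:
# with urllib.request.urlopen(test_request) as resp:
#     print(oss_connected(json.load(resp)))
```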

Document engine

The document engine stores chunks, vectors, and full-text indexes. The following two engine types are supported:

Scenario	Recommended engine
To avoid deploying additional services.	PostgreSQL
To use an existing OpenSearch service.	OpenSearch

Console

On the Service Configuration page, scroll to the Document engine configuration section.
Under Engine type at the top, select PostgreSQL or OpenSearch.

Switch to the corresponding tab and enter the connection parameters.

PostgreSQL tab

Parameter	Default	Description
Host / port	`localhost` / `5432`	The instance address for PostgreSQL.
Username / password	`postgres` / Leave empty if saved	Database credentials. The password is encrypted for storage.
Database	`_rag_doc_db`	The name of the dedicated database for the document engine.
Pool size / max overflow / pool timeout	20 / 10 / 30	Connection pool settings.
Full-text search engine	`gin`	The engine type for full-text indexing.
Full-text search language	`chinese`	The language for tokenization.
Hybrid mode	`weighted_fusion`	The fusion strategy for hybrid search results.
FTS top-N / vector top-N	100 / 100	The maximum number of results to retrieve from each retrieval path.

OpenSearch tab

Parameter	Default	Description
Host / port	`opensearch` / `9201`	The instance address for OpenSearch.
Username / password	`admin` / Leave empty if saved	Administrator credentials.

Click Test Connection to verify connectivity, and then click Save Configuration.
Under Engine type, click Switch and Restart. The system automatically restarts the RAG engine service to switch to the new engine.

Important

After you switch engines, existing knowledge bases can no longer access the chunk data stored in the previous engine. To migrate the data fully, you must re-parse all existing documents.

API

Save the PostgreSQL configuration:

POST /api/doc-engine/pg/config
Content-Type: application/json

{
  "host": "localhost",
  "port": 5432,
  "database": "_rag_doc_db",
  "username": "postgres",
  "password": "your_password"
}

Test PostgreSQL connectivity (you can pass an empty body to use the saved configuration):
```
POST /api/doc-engine/pg/test
Content-Type: application/json

{}
```

Switch the engine type (triggers a restart):

POST /api/doc-engine/engine-type
Content-Type: application/json

{ "engine_type": "postgresql" }

LLM and embedding models

The RAG engine requires an embedding model to vectorize chunks and a chat LLM for conversational Q&A.

Console

In the left navigation pane, select RAG Engine > Model configuration.
Add a model provider: In the Add from preset model providers section, find your target provider (such as Tongyi, OpenAI, or DeepSeek), click Add, and enter the API key and an optional base URL.
Import a custom model (Optional): Click Add custom model. Fill in the model provider, model name, API key, base URL, and model type (chat, embedding, rerank, image2text). Click Validate to confirm the model is available before clicking Add.
Set default models: In the Tenant default models section at the top of the page, specify the following:
- LLM: For example, qwen-plus.
- Embedding: For example, text-embedding-v3.
- VLM (Optional): For PDF image understanding, for example, qwen-vl-plus.
Click Save. The settings take effect immediately; no restart is required.

API

Set the API key for a model provider.

POST /api/ragflow/llm/set_api_key
Content-Type: application/json

{
  "llm_factory": "Tongyi-Qianwen",
  "api_key":     "sk-xxxxxxxxxxxxxxxxxxxxxx",
  "base_url":    "https://dashscope.aliyuncs.com/compatible-mode/v1"
}

To view a list of available model providers, call GET /api/ragflow/llm/factories.

Set the tenant default models.

POST /api/ragflow/set_tenant_info
Content-Type: application/json

{
  "tenant_id":  "<Get this from GET /api/ragflow/tenant_info>",
  "llm_id":     "qwen-plus@Tongyi-Qianwen",
  "embd_id":    "text-embedding-v3@Tongyi-Qianwen",
  "asr_id":     "",
  "img2txt_id": ""
}

Override default models at the user level (Optional).

PUT /api/ragflow/defaults
Content-Type: application/json

{ "llm_id": "gpt-4o@OpenAI" }
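The model IDs in the examples above appear to follow a `<model name>@<provider>` pattern, for example `qwen-plus@Tongyi-Qianwen`. The helpers below are a small sketch based on that inferred pattern. The format is read from the examples only and is not a documented contract.

```python
def make_model_id(model_name, factory):
    """Join a model name and provider in the pattern seen in the examples."""
    return f"{model_name}@{factory}"


def split_model_id(model_id):
    """Split 'name@provider' on the last '@'; raise if either part is missing."""
    name, sep, factory = model_id.rpartition("@")
    if not sep or not name or not factory:
        raise ValueError(f"expected '<model>@<provider>', got {model_id!r}")
    return name, factory


print(make_model_id("text-embedding-v3", "Tongyi-Qianwen"))
print(split_model_id("gpt-4o@OpenAI"))  # ('gpt-4o', 'OpenAI')
```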

Administrator self-check

After completing the configurations above, use the following API calls to verify the status of each component:

GET /api/ragflow/status        # Check if the RAG engine service is online
GET /api/ragflow/tenant_info   # Check if the default models are set
POST /api/oss/test             # Check if OSS is reachable (requires complete OSS configuration parameters)
POST /api/doc-engine/pg/test   # Check if the document engine is reachable (can pass an empty body {})

If all four calls are successful (connected: true or code: 0), you can start using RAG features.
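The pass criterion can be sketched as a small helper that takes the parsed JSON body of each check. The sample bodies below are illustrative, not captured platform output. The sketch takes a slightly stricter reading than the sentence above: when a body contains `connected`, that field must be true, and otherwise `code` must be 0.

```python
def check_passed(body):
    """Apply the self-check criterion to one parsed response body."""
    if "connected" in body:
        return body["connected"] is True
    return body.get("code") == 0


def failed_checks(results):
    """Return the names of the checks whose bodies did not pass."""
    return [name for name, body in results.items() if not check_passed(body)]


# Illustrative bodies only; real responses may carry more fields.
results = {
    "GET /api/ragflow/status": {"code": 0},
    "GET /api/ragflow/tenant_info": {"code": 0},
    "POST /api/oss/test": {"code": 0, "connected": True},
    "POST /api/doc-engine/pg/test": {"code": 1, "connected": False},
}

failed = failed_checks(results)
print(failed)  # ['POST /api/doc-engine/pg/test']
```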

Create a knowledge base

Console

In the left navigation pane, select RAG Engine > Knowledge Base Management.

In the upper-right corner of the page, click + New knowledge base. In the dialog box that appears, configure the following parameters:

Parameter	Value	Description
Name	`quickstart-kb`	The name can contain only letters, digits, and hyphens, which simplifies script references.
Description	Enter a brief business description.	This helps identify the knowledge base when multiple people are collaborating.
Chunking method	`General (Naive)`	This is the recommended method for general documents. Other options include `manual` (for long texts), `paper` (for research papers), `qa` (for Q&A pairs), and `table` (for tables).
Embedding model	Leave blank	The knowledge base uses the tenant's default model automatically, which prevents dimension mismatches.
Permission	`me` / `team`	`me`: Visible only to you. `team`: Shared with the entire tenant.

Click Create. The ID of the new entry is its dataset_id (for example, ds-abc123), which is required for subsequent API and Skill calls.

API

POST /api/ragflow/datasets
Content-Type: application/json

{
  "name":            "quickstart-kb",
  "description":     "My first knowledge base",
  "chunk_method":    "naive",
  "permission":      "me"
}

After the request succeeds, note the data.id from the response body (for example, ds-abc123). This ID is required for subsequent API calls.
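Before sending the request, it can help to validate the fields locally against the constraints listed in the console table. The sketch below builds the request body and rejects names that contain anything other than letters, digits, and hyphens. The function name is hypothetical, and the server may apply additional rules.

```python
import re

# Values taken from the parameter table and API example in this section.
CHUNK_METHODS = {"naive", "manual", "paper", "qa", "table"}
PERMISSIONS = {"me", "team"}
NAME_RE = re.compile(r"[A-Za-z0-9-]+")


def build_dataset_payload(name, description="", chunk_method="naive", permission="me"):
    """Validate the fields and return the JSON body for POST /api/ragflow/datasets."""
    if not NAME_RE.fullmatch(name):
        raise ValueError("name may contain only letters, digits, and hyphens")
    if chunk_method not in CHUNK_METHODS:
        raise ValueError(f"unknown chunk_method: {chunk_method!r}")
    if permission not in PERMISSIONS:
        raise ValueError(f"unknown permission: {permission!r}")
    return {
        "name": name,
        "description": description,
        "chunk_method": chunk_method,
        "permission": permission,
    }


payload = build_dataset_payload("quickstart-kb", "My first knowledge base")
print(payload)
```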

Import and parse documents

Upload files

Console

In the knowledge base list, click the folder icon for the target knowledge base to open the Document Management page.
Click Upload in the upper-right corner of the page. Drag and drop files into the upload box or click to select them. Supported formats include PDF, Word, Excel, PowerPoint, TXT, and Markdown.
Once the upload is complete, close the upload dialog. The new document appears in the document list, and its status is unprocessed.

API

curl -X POST "http://<host>/api/ragflow/datasets/<dataset_id>/documents" \
     -H "Authorization: Bearer <token>" \
     -F "file=@./product-manual.pdf" \
     -F "file=@./FAQ.md"

After you upload the documents, their status is unprocessed.
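For scripts that cannot shell out to curl, the same multipart body can be assembled with the Python standard library. The sketch below mirrors the curl call by repeating the `file` form field once per document. The helper name is hypothetical, and the generic `application/octet-stream` part type is an assumption, since the source does not state which part types the server expects.

```python
import uuid


def encode_multipart(files):
    """Encode (filename, bytes) pairs as multipart/form-data.

    Returns (body, content_type). Every part uses the form field name "file",
    matching the repeated -F "file=@..." arguments in the curl example.
    """
    boundary = uuid.uuid4().hex
    parts = []
    for filename, content in files:
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + content + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


body, content_type = encode_multipart([
    ("product-manual.pdf", b"%PDF-1.7 placeholder bytes"),
    ("FAQ.md", b"# FAQ\n"),
])
print(content_type)
```

Pass `body` as the request data and `content_type` as the `Content-Type` header, together with the `Authorization: Bearer <token>` header shown in the curl example.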

Trigger parsing

Console

On the Document Management page, select the documents to parse by checking them individually or using the Select All option.
At the top of the page, click Start Parsing.
Note
If no embedding model is configured, a warning appears. Click Configure to go to the knowledge base configuration page and add one.
During parsing, the document list automatically refreshes to show the progress. The document status changes as follows: unprocessed → parsing → completed (or failed).

API

POST /api/ragflow/datasets/<dataset_id>/documents/parse
Content-Type: application/json

{ "ids": ["doc-xxx", "doc-yyy"] }
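Parsing is asynchronous, so a script typically polls until a document reaches a terminal state. The sketch below shows one way to do that under stated assumptions. This section does not list a status endpoint, so `get_status` is a caller-supplied function, and the status strings are the ones shown in the console (the API may report different values).

```python
import time

# Terminal states as displayed in the console status flow.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_parsing(get_status, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll get_status() until it returns a terminal state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"document still '{status}' after {timeout} seconds")
        sleep(interval)


# Simulated status sequence; a real get_status would query the platform.
statuses = iter(["unprocessed", "parsing", "parsing", "completed"])
final = wait_for_parsing(lambda: next(statuses), sleep=lambda seconds: None)
print(final)  # completed
```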

View and modify chunks

Console

On the Document Management page, click the document icon for a completed document.
In the chunk drawer on the right, you can perform the following actions on each chunk:
- Click Edit to manually correct errors, such as OCR mistakes or incorrect paragraph merging.
- Use the toggle to enable or disable a chunk. Disable a low-quality chunk to temporarily exclude it from retrieval.
- Click Add Chunk to manually add content that was missed during parsing.

API

GET  /api/ragflow/datasets/<dataset_id>/documents/<doc_id>/chunks?page=1&size=20
POST /api/ragflow/datasets/<dataset_id>/documents/<doc_id>/chunks/switch

Retrieval test

Before creating an assistant, run a retrieval test to confirm that the expected documents are retrieved.

Console

In the navigation pane on the left, choose RAG Engine > Content Retrieval.
From the Select Knowledge Base drop-down list at the top of the page, select one or more knowledge bases.
Enter a question in the text box and click Retrieve.

In the parameters panel on the right, adjust the following retrieval parameters as needed:

Parameter	Default	Description
Top K	10	The number of chunks to return.
Similarity threshold	0.2	Filters out chunks with a score below this threshold.
Vector weight	0.3	For hybrid retrieval, this sets the weight of the vector score. The remaining weight is allocated to keyword matching.
Keyword	Off	When enabled, the system also performs BM25 keyword matching.
Metadata filtering	Empty	Narrows the retrieval scope by document type or tags.

Review the results. Each chunk displays a combined score, a vector score, and a keyword score.
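One plausible reading of the Vector weight parameter is a weighted sum: the vector score is multiplied by the weight, and the keyword score by the remainder. The sketch below illustrates that reading together with the threshold and Top K filters. It is an assumption about how the three displayed scores relate, not the engine's confirmed formula.

```python
def combined_score(vector_score, keyword_score, vector_weight=0.3):
    """Weighted sum: vector_weight for the vector score, the rest for keywords."""
    if not 0.0 <= vector_weight <= 1.0:
        raise ValueError("vector_weight must be between 0 and 1")
    return vector_weight * vector_score + (1.0 - vector_weight) * keyword_score


def select_chunks(chunks, threshold=0.2, top_k=10, vector_weight=0.3):
    """Score chunks, drop those below the threshold, and keep the best top_k."""
    scored = [
        (combined_score(c["vector"], c["keyword"], vector_weight), c["id"])
        for c in chunks
    ]
    kept = [item for item in scored if item[0] >= threshold]
    kept.sort(key=lambda item: item[0], reverse=True)
    return kept[:top_k]


chunks = [
    {"id": "a", "vector": 0.8, "keyword": 0.4},  # 0.3*0.8 + 0.7*0.4 = 0.52
    {"id": "b", "vector": 0.1, "keyword": 0.1},  # 0.10, below the 0.2 threshold
    {"id": "c", "vector": 0.2, "keyword": 0.9},  # 0.3*0.2 + 0.7*0.9 = 0.69
]
ranked = select_chunks(chunks)
print([chunk_id for _, chunk_id in ranked])  # ['c', 'a']
```

With the default weight of 0.3, keyword matches dominate the ranking, which is why chunk `c` outranks chunk `a` despite its lower vector score.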

Interpretation criteria:

Observation	Conclusion
The top 3 retrieved chunks answer the question.	The retrieval quality is good. You can proceed to create an assistant.
The top 3 chunks are not relevant, but correct results appear within the top 10.	Increase the Top K value or enable reranking.
None of the chunks are relevant.	Check the chunk quality or adjust the chunking parameters.
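When you evaluate many test questions, the criteria above can be applied mechanically. The sketch below takes a list of manual relevance judgments, one boolean per returned chunk in rank order, and returns the matching conclusion. Treating "the top 3 answer the question" as "at least one of the top 3 is relevant" is an interpretation, and the function name is hypothetical.

```python
def diagnose(relevance):
    """Map ranked relevance judgments (True or False per chunk) to a conclusion."""
    if any(relevance[:3]):
        return "good: proceed to create an assistant"
    if any(relevance[:10]):
        return "increase Top K or enable reranking"
    return "check chunk quality or adjust chunking parameters"


print(diagnose([True, False, False]))
print(diagnose([False, False, False, False, True]))
print(diagnose([False] * 10))
```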

API

POST /api/ragflow/retrieval
Content-Type: application/json

{
  "question": "What are the conditions for product returns?",
  "dataset_ids": ["<dataset_id>"],
  "top_k":    10,
  "similarity_threshold":     0.2,
  "vector_similarity_weight": 0.3,
  "highlight": true
}

The response for each chunk includes the following:

similarity: The combined score.
vector_similarity / term_similarity: The vector and keyword scores.
content_with_weight: The content, with highlight markup.
docnm_kwd / doc_id: The source document.

Assistants and Q&A

Create an assistant

Console

In the navigation pane on the left, choose RAG Engine > Assistant.

Click + New assistant, and configure the following parameters:

Basic information

Parameter	Description
Name	The display name of the assistant.
Avatar	Upload a custom avatar or keep the default.
Associated knowledge base	Select one or more knowledge bases.

Model parameters

Parameter	Suggested value	Description
Model	Tenant default	The model the assistant will use. Defaults to the tenant's model, but can be overridden with a specific one.
Temperature	0.1	A lower value makes the response adhere more strictly to the content in the knowledge base.
Top-p	0.3	Adjust either this parameter or Temperature, but not both.
Similarity threshold	0.2	Filters out retrieved chunks with a score below this threshold.
Vector weight	0.3	In hybrid search, this is the weight of the vector similarity score.
Top-n	6	The maximum number of chunks to pass to the LLM.
Show citations	Enabled	Displays citations in the response that link to the original source chunks.
Fallback response	Custom text	The response returned when no results are found in the knowledge base. This helps prevent model hallucination.

System prompt: Configure the system prompt. Use the {{knowledge}} variable to reference the retrieved results.

Click Create.

API

POST /api/ragflow/chats
Content-Type: application/json

{
  "name":        "Product Support Assistant",
  "dataset_ids": ["<dataset_id>"],
  "llm":  { "temperature": 0.1, "top_p": 0.3 },
  "prompt": {
    "similarity_threshold":     0.2,
    "vector_similarity_weight": 0.3,
    "top_n":            6,
    "show_quote":       true,
    "empty_response":   "I'm sorry, but I couldn't find any relevant information in the knowledge base.",
    "prompt":           "You are a product support assistant. Answer the user's questions concisely based only on the provided {{knowledge}}."
  }
}
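To make the role of the {{knowledge}} variable concrete, the sketch below shows how a system prompt template could be filled with retrieved chunks before it is sent to the LLM. The platform performs this substitution itself. The numbering and separator format used here are assumptions for illustration, and the function name is hypothetical.

```python
def render_system_prompt(template, chunks):
    """Replace the {{knowledge}} variable with numbered chunk texts."""
    knowledge = "\n\n".join(
        f"[{index}] {text}" for index, text in enumerate(chunks, start=1)
    )
    return template.replace("{{knowledge}}", knowledge)


template = (
    "You are a product support assistant. Answer the user's questions "
    "concisely based only on the provided {{knowledge}}."
)
# Illustrative chunk texts, not real knowledge base content.
retrieved = [
    "Returns are accepted within 30 days of delivery.",
    "Refunds are issued to the original payment method.",
]
rendered = render_system_prompt(template, retrieved)
print(rendered)
```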

Start a conversation

Console

In the assistant list, click an assistant's card to open the conversation view.
In the session list on the left, click + New conversation to start a new session, or select an existing one to continue.
In the input box, type your question and press Enter to send. Use Shift+Enter for a new line. The response streams in real-time.
At the end of each response, you can expand the citation markers to view the matched chunks and a link to the original source.

API

Create a session.

POST /api/ragflow/chats/<chat_id>/sessions
Content-Type: application/json

{ "name": "First conversation" }

Start a streaming Q&A.

POST /api/ragflow/chats/<chat_id>/completions
Content-Type: application/json

{
  "session_id": "<session_id>",
  "question":   "How long do returns take?",
  "stream":     true
}

Each frame in the streaming response has the following format:

data: {"code":0,"data":{"answer":"Typically within 7 business days...","reference":{...},"finish":false}}
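A client can consume the stream line by line. The sketch below parses frames in the format shown above and stops at the frame whose `finish` field is true. The sample lines are illustrative. The source does not state whether each `answer` holds the full text so far or only the newest fragment, so the sketch collects every value and leaves assembly to the caller.

```python
import json


def iter_frames(lines):
    """Yield the "data" object of each 'data: {...}' line, skipping other lines."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        body = json.loads(line[len("data:"):].strip())
        if body.get("code") != 0:
            raise RuntimeError(f"stream returned error code {body.get('code')}")
        yield body["data"]


# Illustrative frames; real frames also carry a populated "reference" object.
sample_lines = [
    'data: {"code":0,"data":{"answer":"Typically","reference":{},"finish":false}}',
    "",
    'data: {"code":0,"data":{"answer":"Typically within 7 business days.","reference":{},"finish":true}}',
]

answers = []
for frame in iter_frames(sample_lines):
    answers.append(frame["answer"])
    if frame.get("finish"):
        break

print(answers[-1])  # Typically within 7 business days.
```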

Retrieve with skills

The platform includes two preset skills that let an AI assistant access a knowledge base directly, without manual API calls.

Skill name	Capability	Applicable role
Knowledge Base Search Agent (polardb-kb-search-agent)	Read-only: List knowledge bases, list documents, and perform semantic search.	General users (prevents accidental operations)
Knowledge Base Agent (polardb-kb-agent)	Read-write: Manage knowledge bases, upload or delete documents, modify chunks, and search.	Administrators and operations staff

Import a skill

An administrator can enable the corresponding skill by navigating to Skill Management > Import Presets.

Usage

After you enable a skill, the AI assistant can retrieve information from the knowledge base. Simply ask questions in natural language, and the assistant will automatically search for relevant content and use it to generate a response.

Command reference

Semantic search:

# Basic search (uses the default configured knowledge base)
python3 scripts/search.py "How do I configure the database connection pool?"

# Specify knowledge bases and return more results
python3 scripts/search.py "deployment process" --dataset-ids ds-abc123,ds-xyz --top-k 10

# Include images (for questions about charts or architecture diagrams)
python3 scripts/search.py "system architecture diagram" --with-images --json

View knowledge bases:

python3 scripts/datasets.py list --json           # List all accessible knowledge bases
python3 scripts/datasets.py info ds-abc123        # View details

Check document and parsing status:

python3 scripts/list_documents.py ds-abc123 --run DONE --suffix pdf
python3 scripts/parse_status.py  ds-abc123        # Aggregate parsing progress

Management operations (polardb-kb-agent only):
```
python3 scripts/datasets.py create --name "new-kb" --chunk-method naive
python3 scripts/upload_documents.py ds-abc123 ./docs/*.pdf
python3 scripts/parse_documents.py ds-abc123 --all
python3 scripts/update_dataset.py ds-abc123 --chunk-method paper
```
Note
All skill scripts read the ONTOLOGY_BASE_URL and ONTOLOGY_API_KEY environment variables by default. These variables are automatically injected when you import the skill into the AI assistant, so you do not need to set them.

Use cases

Developer Q&A: Import internal design documents into a knowledge base to query architecture details and API specifications directly from an AI assistant.
Customer service SOP: Import FAQs into a knowledge base so customer service agents can retrieve standard responses using an AI assistant.
Ticket analysis: Import historical incident reports. When a new incident occurs, use an AI assistant to search for similar cases.
Bulk semantic analysis: Use the --json flag with your scripts to retrieve answers for a batch of questions and export the results for evaluation.
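For bulk semantic analysis, the search command can be assembled programmatically for each question. The sketch below builds the argument list using only the flags shown in the command reference. It does not run the script or parse its output, because the shape of the `--json` output is not described here.

```python
def build_search_command(question, dataset_ids=None, top_k=None, with_images=False):
    """Return the argv list for scripts/search.py with JSON output enabled."""
    command = ["python3", "scripts/search.py", question]
    if dataset_ids:
        command += ["--dataset-ids", ",".join(dataset_ids)]
    if top_k is not None:
        command += ["--top-k", str(top_k)]
    if with_images:
        command.append("--with-images")
    command.append("--json")
    return command


questions = ["deployment process", "How do I configure the database connection pool?"]
commands = [
    build_search_command(q, dataset_ids=["ds-abc123", "ds-xyz"], top_k=10)
    for q in questions
]
for command in commands:
    print(command)
    # To execute: subprocess.run(command, capture_output=True, text=True, check=True)
```

Passing the command as a list rather than a shell string avoids quoting problems when a question contains spaces or quotation marks.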