This document provides instructions for the following RAG engine features on the knowledge platform: administrator environment configuration, knowledge base management, document import and parsing, retrieval tests, conversational Q&A, and skill retrieval. You can perform each operation using the console or through API calls.
Overview
Retrieval-Augmented Generation (RAG) is a core feature of the knowledge platform. It vectorizes document content and stores it in a knowledge base to enable precise retrieval based on semantic similarity. A large language model (LLM) then generates a response based on the retrieved results.
A typical RAG workflow includes the following steps:
Admin configuration: Configure Object Storage Service (OSS), the document engine, and AI models to initialize the platform.
Create a knowledge base: Create a knowledge base to organize your document collection and set the chunking parameters.
Import and parse: Upload documents and trigger parsing. The system splits the documents into text chunks and generates vector embeddings.
Retrieval and Q&A: Run a retrieval test to verify recall, and then create a conversational assistant for Q&A.
Skill retrieval: Use an AI assistant's built-in skill to retrieve information directly from the knowledge base without manual API calls.
Role permissions are as follows:
Actions | Required role |
Query knowledge base, retrieve, Q&A | VIEWER or higher |
Create knowledge base, upload or delete documents, trigger parsing | DEVELOPER or higher |
Configure AI models, Object Storage Service (OSS), document engine, resource authorization | ADMIN |
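The "or higher" wording in the table suggests an ordered role hierarchy. The following minimal sketch assumes a strict order of VIEWER < DEVELOPER < ADMIN. The action keys are hypothetical labels invented for this example and are not platform identifiers.

```python
from enum import IntEnum


class Role(IntEnum):
    # Assumed ordering: a higher value includes all lower-level permissions.
    VIEWER = 1
    DEVELOPER = 2
    ADMIN = 3


# Minimum role per action, transcribed from the table above.
# The keys are hypothetical labels chosen for this sketch.
MIN_ROLE = {
    "query_kb": Role.VIEWER,
    "retrieve": Role.VIEWER,
    "chat": Role.VIEWER,
    "create_kb": Role.DEVELOPER,
    "upload_document": Role.DEVELOPER,
    "delete_document": Role.DEVELOPER,
    "parse": Role.DEVELOPER,
    "configure_models": Role.ADMIN,
    "configure_oss": Role.ADMIN,
    "configure_doc_engine": Role.ADMIN,
    "authorize_resources": Role.ADMIN,
}


def is_allowed(role: Role, action: str) -> bool:
    """Return True if the role meets the minimum role for the action."""
    return role >= MIN_ROLE[action]
```

For example, `is_allowed(Role.DEVELOPER, "retrieve")` is true, while `is_allowed(Role.VIEWER, "parse")` is false.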
Administrator configuration
An administrator (ADMIN) needs to perform this configuration only once, after the initial platform deployment. You can skip this section if the environment is already configured.
OSS configuration
The RAG engine stores source files, intermediate parsing artifacts, and chunks in Object Storage Service (OSS).
UI
In the left navigation pane, select RAG Engine > Service Configuration.
In the Object storage (OSS) configuration panel, configure the following parameters:
Parameter | Description | Example |
Storage type | The type of object storage service. | OSS, S3, MINIO |
Access key ID | Your access key ID. | AKxxxxxxxx |
Secret access key | Your secret access key. It is stored server-side with AES-GCM encryption. To keep a previously saved key, leave this field empty. | ******** |
Endpoint URL | The endpoint URL for Object Storage Service (OSS). | https://oss-cn-hangzhou.aliyuncs.com |
Region | The Object Storage Service (OSS) region. | cn-hangzhou |
Bucket | The default bucket name. | my-rag-bucket |
Prefix path | An optional path prefix for all objects stored in the bucket. | rag/ |
Signature version | The signature version for authenticating requests. | s3, v4 |
Addressing style | The bucket addressing style. | virtual, path |
Click Test Connection to verify the configuration.
Click Save Configuration. The system then automatically restarts the RAG engine service (approx. 10-30 seconds).
API
Save the OSS configuration:
POST /api/oss/config
Content-Type: application/json
{
  "storage_type": "OSS",
  "endpoint_url": "https://oss-cn-hangzhou.aliyuncs.com",
  "region": "cn-hangzhou",
  "bucket": "my-rag-bucket",
  "access_key": "AKxxx",
  "secret_key": "xxxxxxx",
  "prefix_path": "rag/",
  "signature_version": "s3",
  "addressing_style": "virtual"
}
Saving the configuration automatically restarts the RAG engine service (approx. 10-30 seconds). Once the restart is complete, call the test endpoint to verify connectivity:
POST /api/oss/test
Content-Type: application/json
{
  "storage_type": "OSS",
  "endpoint_url": "https://oss-cn-hangzhou.aliyuncs.com",
  "region": "cn-hangzhou",
  "bucket": "my-rag-bucket",
  "access_key": "AKxxx",
  "secret_key": "xxxxxxx",
  "prefix_path": "rag/",
  "signature_version": "s3",
  "addressing_style": "virtual"
}
Expected response:
{"code": 0, "connected": true}
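Before sending the request, a client may want to validate the enumerated fields and then interpret the test response. The sketch below is one way to do that. It assumes the values shown in the Example column (OSS/S3/MINIO, s3/v4, virtual/path) are the complete sets of allowed values, which this document does not confirm. The helper names are hypothetical.

```python
import json
from typing import Any, Dict

# Allowed values copied from the Example column; treating them as
# exhaustive is an assumption of this sketch.
ALLOWED = {
    "storage_type": {"OSS", "S3", "MINIO"},
    "signature_version": {"s3", "v4"},
    "addressing_style": {"virtual", "path"},
}


def build_oss_config(**fields: Any) -> Dict[str, Any]:
    """Validate the enumerated fields and return a body for POST /api/oss/config."""
    for name, allowed in ALLOWED.items():
        if fields.get(name) not in allowed:
            raise ValueError(f"{name} must be one of {sorted(allowed)}")
    return dict(fields)


def oss_connected(response_body: str) -> bool:
    """Return True only if the test response reports code 0 and connected true."""
    body = json.loads(response_body)
    return body.get("code") == 0 and body.get("connected") is True
```

The returned dictionary can be serialized with `json.dumps` and sent with any HTTP client.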
Document engine
The document engine stores chunks, vectors, and full-text indexes. The following two engine types are supported:
Scenario | Recommended engine |
To avoid deploying additional services. | PostgreSQL |
To use an existing OpenSearch service. | OpenSearch |
UI
On the Service Configuration page, scroll to the Document engine configuration section.
Under Engine type at the top, select PostgreSQL or OpenSearch.
Switch to the corresponding tab and enter the connection parameters.
PostgreSQL tab
Parameter | Default | Description |
Host / port | localhost / 5432 | The instance address for PostgreSQL. |
Username / password | postgres / leave empty if saved | Database credentials. The password is encrypted for storage. |
Database | _rag_doc_db | The name of the dedicated database for the document engine. |
Pool size / max overflow / pool timeout | 20 / 10 / 30 | Connection pool settings. |
Full-text search engine | gin | The engine type for full-text indexing. |
Full-text search language | chinese | The language for tokenization. |
Hybrid mode | weighted_fusion | The fusion strategy for hybrid search results. |
FTS top-N / vector top-N | 100 / 100 | The maximum number of results to retrieve from each retrieval path. |
OpenSearch tab
Parameter | Default | Description |
Host / port | opensearch / 9201 | The instance address for OpenSearch. |
Username / password | admin / leave empty if saved | Administrator credentials. |
Click Test Connection to verify connectivity, and then click Save Configuration.
Under Engine type, click Switch and Restart. The system automatically restarts the RAG engine service to switch to the new engine.
Switching engines prevents existing knowledge bases from accessing chunk data in the previous engine. For a full data migration, you must re-parse all existing documents.
API
Save the PostgreSQL configuration:
POST /api/doc-engine/pg/config
Content-Type: application/json
{
  "host": "localhost",
  "port": 5432,
  "database": "_rag_doc_db",
  "username": "postgres",
  "password": "your_password"
}
Test PostgreSQL connectivity (you can pass an empty body to use the saved configuration):
POST /api/doc-engine/pg/test
Content-Type: application/json
{}
Switch the engine type (triggers a restart):
POST /api/doc-engine/engine-type
Content-Type: application/json
{ "engine_type": "postgresql" }
LLM and embedding models
The RAG engine requires an embedding model to vectorize chunks and a chat LLM for conversational Q&A.
UI
In the left navigation pane, select RAG Engine > Model configuration.
Add a model provider: In the Add from preset model providers section, find your target provider (such as Tongyi, OpenAI, or DeepSeek), click Add, and enter the API key and an optional base URL.
Import a custom model (optional): Click Add custom model. Fill in the model provider, model name, API key, base URL, and model type (chat, embedding, rerank, or image2text). Click Validate to confirm that the model is available, and then click Add.
Set default models: In the Tenant default models section at the top of the page, specify the following:
LLM: For example, qwen-plus.
Embedding: For example, text-embedding-v3.
VLM (optional): For PDF image understanding. For example, qwen-vl-plus.
Click Save. The settings take effect immediately; no restart is required.
API
Set the API key for a model provider.
POST /api/ragflow/llm/set_api_key
Content-Type: application/json
{
  "llm_factory": "Tongyi-Qianwen",
  "api_key": "sk-xxxxxxxxxxxxxxxxxxxxxx",
  "base_url": "https://dashscope.aliyuncs.com/compatible-mode/v1"
}
To view a list of available model providers, call GET /api/ragflow/llm/factories.
Set the tenant default models.
POST /api/ragflow/set_tenant_info
Content-Type: application/json
{
  "tenant_id": "<Get this from GET /api/ragflow/tenant_info>",
  "llm_id": "qwen-plus@Tongyi-Qianwen",
  "embd_id": "text-embedding-v3@Tongyi-Qianwen",
  "asr_id": "",
  "img2txt_id": ""
}
Override default models at the user level (optional).
PUT /api/ragflow/defaults
Content-Type: application/json
{ "llm_id": "gpt-4o@OpenAI" }
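The model IDs in these requests (for example, qwen-plus@Tongyi-Qianwen and gpt-4o@OpenAI) appear to follow a `<model name>@<provider>` pattern. The sketch below composes and splits such IDs under that assumption. The pattern is inferred from the examples only, and both function names are hypothetical.

```python
from typing import Tuple


def make_model_id(model_name: str, provider: str) -> str:
    """Join a model name and a provider into the assumed '<model>@<provider>' form."""
    if not model_name or not provider:
        raise ValueError("model_name and provider must be non-empty")
    return f"{model_name}@{provider}"


def split_model_id(model_id: str) -> Tuple[str, str]:
    """Split on the last '@' so that a model name containing '@' still parses."""
    name, sep, provider = model_id.rpartition("@")
    if not sep or not name or not provider:
        raise ValueError(f"not a '<model>@<provider>' ID: {model_id!r}")
    return name, provider
```

For example, `split_model_id("text-embedding-v3@Tongyi-Qianwen")` returns the model name and the provider as a pair.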
Administrator self-check
After completing the configurations above, use the following API calls to verify the status of each component:
GET /api/ragflow/status # Check if the RAG engine service is online
GET /api/ragflow/tenant_info # Check if the default models are set
POST /api/oss/test # Check if OSS is reachable (requires complete OSS configuration parameters)
POST /api/doc-engine/pg/test # Check if the document engine is reachable (can pass an empty body {})
If all four calls are successful (connected: true or code: 0), you can start using RAG features.
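The self-check can be automated once the four response bodies have been collected. The sketch below only evaluates already-parsed responses and leaves the HTTP calls to the caller. It assumes that a body containing a `connected` field should be judged by that field, and that any other body should be judged by `code`. This is a slightly stricter reading of the success rule than the text states.

```python
from typing import Any, Dict, List


def failed_checks(responses: Dict[str, Dict[str, Any]]) -> List[str]:
    """Return the names of the checks whose parsed response does not indicate success."""
    failed = []
    for name, body in responses.items():
        if "connected" in body:
            # Connectivity tests: require connected to be exactly true.
            ok = body["connected"] is True
        else:
            # Status-style endpoints: require code 0.
            ok = body.get("code") == 0
        if not ok:
            failed.append(name)
    return failed
```

An empty list means that every component passed and the platform is ready for use.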
Create a knowledge base
Console
In the left navigation pane, choose RAG Engine > knowledge base management.
In the upper-right corner of the page, click + New knowledge base. In the dialog box that appears, configure the following parameters:
Parameter | Value | Description |
Name | quickstart-kb | The name can contain only letters, digits, and hyphens, which simplifies script references. |
Description | Enter a brief business description. | This helps identify the knowledge base when multiple people are collaborating. |
Chunking method | General (Naive) | This is the recommended method for general documents. Other options include manual (for long texts), paper (for research papers), qa (for Q&A pairs), and table (for tables). |
Embedding model | Leave blank | The knowledge base uses the tenant's default model automatically, which prevents dimension mismatches. |
Permission | me / team | me: Visible only to you. team: Shared with the entire tenant. |
Click Create. The ID of the new entry is its dataset_id (for example, ds-abc123), which is required for subsequent API and Skill calls.
API
POST /api/ragflow/datasets
Content-Type: application/json
{
  "name": "quickstart-kb",
  "description": "My first knowledge base",
  "chunk_method": "naive",
  "permission": "me"
}
After the request succeeds, note the data.id from the response body (for example, ds-abc123). This ID is required for subsequent API calls.
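A client can check the request body against the constraints described in the console parameters before calling the API. The sketch below assumes that the naming rule (letters, digits, and hyphens) and the listed chunking methods and permissions also apply to the API. The document states them for the console only, and the function name is hypothetical.

```python
import re
from typing import Dict

NAME_PATTERN = re.compile(r"[A-Za-z0-9-]+")
CHUNK_METHODS = {"naive", "manual", "paper", "qa", "table"}
PERMISSIONS = {"me", "team"}


def build_dataset_payload(
    name: str,
    description: str = "",
    chunk_method: str = "naive",
    permission: str = "me",
) -> Dict[str, str]:
    """Validate the fields and return a body for POST /api/ragflow/datasets."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError("name may contain only letters, digits, and hyphens")
    if chunk_method not in CHUNK_METHODS:
        raise ValueError(f"chunk_method must be one of {sorted(CHUNK_METHODS)}")
    if permission not in PERMISSIONS:
        raise ValueError(f"permission must be one of {sorted(PERMISSIONS)}")
    return {
        "name": name,
        "description": description,
        "chunk_method": chunk_method,
        "permission": permission,
    }
```

Calling `build_dataset_payload("quickstart-kb", "My first knowledge base")` reproduces the request body shown above.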
Import and parse documents
Upload files
Console
In the knowledge base list, click the folder icon for the target knowledge base to open the Document Management page.
Click Upload in the upper-right corner of the page. Drag and drop files into the upload box or click to select them. Supported formats include PDF, Word, Excel, PowerPoint, TXT, and Markdown.
Once the upload is complete, close the upload dialog. The new document appears in the document list, and its status is unprocessed.
API
curl -X POST "http://<host>/api/ragflow/datasets/<dataset_id>/documents" \
-H "Authorization: Bearer <token>" \
-F "file=@./product-manual.pdf" \
-F "file=@./FAQ.md"
After you upload the documents, their status is unprocessed.
Trigger parsing
Console
On the Document Management page, select the documents to parse by checking them individually or using the Select All option.
At the top of the page, click Start Parsing.
Note: If no embedding model is configured, a warning appears. Click Configure to go to the knowledge base configuration page and add one.
During parsing, the document list automatically refreshes to show the progress. The document status changes as follows: unprocessed → parsing → completed (or failed).
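A script that triggers parsing usually needs to wait until a document reaches a terminal status. The sketch below polls until the status is completed or failed. How the status is fetched is left to a caller-supplied function, because this section does not name a status endpoint. The function name and its parameters are hypothetical.

```python
import time
from typing import Callable

# Terminal statuses from the lifecycle described above.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_parsing(
    get_status: Callable[[], str],
    interval: float = 2.0,
    timeout: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status() until it returns a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"still '{status}' after {timeout} seconds")
        sleep(interval)
```

The `sleep` parameter exists so that the loop can be exercised in tests without real delays.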
API
POST /api/ragflow/datasets/<dataset_id>/documents/parse
Content-Type: application/json
{ "ids": ["doc-xxx", "doc-yyy"] }
View and modify chunks
Console
On the Document Management page, click the document icon for a completed document.
In the chunk drawer on the right, you can perform the following actions on each chunk:
Click Edit to manually correct errors, such as OCR mistakes or incorrect paragraph merging.
Use the toggle to enable or disable a chunk. Disabling a chunk temporarily excludes it from retrieval if it is low-quality.
Click Add Chunk to manually add content that was missed during parsing.
API
GET /api/ragflow/datasets/<dataset_id>/documents/<doc_id>/chunks?page=1&size=20
POST /api/ragflow/datasets/<dataset_id>/documents/<doc_id>/chunks/switch
Retrieval test
Before creating a conversational assistant, run a retrieval test to ensure documents are retrieved correctly.
UI
In the navigation pane on the left, choose RAG Engine > Content Retrieval.
From the Select Knowledge Base drop-down list at the top of the page, select one or more knowledge bases.
Enter a question in the text box and click Retrieve.
In the parameters panel on the right, adjust the following retrieval parameters as needed:
Parameter | Default | Description |
Top K | 10 | The number of chunks to return. |
Similarity threshold | 0.2 | Filters out chunks with a score below this threshold. |
Vector weight | 0.3 | For hybrid retrieval, this sets the weight of the vector score. The remaining weight is allocated to keyword matching. |
Keyword | Off | When enabled, the system also performs BM25 keyword matching. |
Metadata filtering | Empty | Narrows the retrieval scope by document type or tags. |
Review the results. Each chunk displays a comprehensive score, a vector score, and a keyword score.
Interpretation criteria:
Observation | Conclusion |
The top 3 retrieved chunks answer the question. | The retrieval quality is good. You can proceed to create a conversational assistant. |
The top 3 chunks are not relevant, but correct results appear within the top 10. | Increase the Top K value or enable rerank. |
None of the chunks are relevant. | Check the chunk quality or adjust the chunking parameters. |
API
POST /api/ragflow/retrieval
Content-Type: application/json
{
  "question": "What are the conditions for product returns?",
  "dataset_ids": ["<dataset_id>"],
  "top_k": 10,
  "similarity_threshold": 0.2,
  "vector_similarity_weight": 0.3,
  "highlight": true
}
The response for each chunk includes the following:
similarity: The comprehensive score.
vector_similarity / term_similarity: The vector and keyword scores.
content_with_weight: The content, with highlight markup.
docnm_kwd / doc_id: The source document.
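To see how the vector weight, similarity threshold, and Top K interact, it can help to recompute the ranking from the per-chunk scores. The sketch below assumes the comprehensive score is a linear blend, `weight * vector_similarity + (1 - weight) * term_similarity`. This matches the description of the vector weight but is not confirmed as the engine's exact formula. The function names are hypothetical.

```python
from typing import Any, Dict, List


def hybrid_score(vector_similarity: float, term_similarity: float,
                 vector_weight: float = 0.3) -> float:
    """Blend the two scores; the keyword score receives the remaining weight."""
    return vector_weight * vector_similarity + (1.0 - vector_weight) * term_similarity


def rank_chunks(chunks: List[Dict[str, Any]], vector_weight: float = 0.3,
                similarity_threshold: float = 0.2, top_k: int = 10) -> List[Dict[str, Any]]:
    """Score each chunk, drop those below the threshold, and keep the top_k best."""
    scored = [
        {**c, "similarity": hybrid_score(c["vector_similarity"],
                                         c["term_similarity"], vector_weight)}
        for c in chunks
    ]
    kept = [c for c in scored if c["similarity"] >= similarity_threshold]
    kept.sort(key=lambda c: c["similarity"], reverse=True)
    return kept[:top_k]
```

With the default weight of 0.3, a chunk with a strong keyword match can outrank one with a strong vector match, which is worth keeping in mind when tuning the weight.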
Assistants and Q&A
Create an assistant
Console
In the navigation pane on the left, choose RAG Engine > assistant.
Click + New assistant, and configure the following parameters:
Basic information
Parameter | Description |
Name | The display name of the assistant. |
Avatar | Upload a custom avatar or keep the default. |
Associated knowledge base | Select one or more knowledge bases. |
Model parameters
Parameter | Suggested value | Description |
Model | Tenant default | The model the assistant will use. Defaults to the tenant's model, but can be overridden with a specific one. |
Temperature | 0.1 | A lower value makes the response adhere more strictly to the content in the knowledge base. |
Top-p | 0.3 | Tune either this parameter or temperature. |
Similarity threshold | 0.2 | Filters out retrieved chunks with a score below this threshold. |
Vector weight | 0.3 | In hybrid search, this is the weight of the vector similarity score. |
Top-n | 6 | The maximum number of chunks to pass to the LLM. |
Show citations | Enabled | Displays citations in the response that link to the original source chunks. |
Fallback response | Custom text | The response returned when no results are found in the knowledge base. This helps prevent model hallucination. |
System prompt: Configure the system prompt. Use the {{knowledge}} variable to reference the retrieved results.
Click Create.
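The Top-n, Fallback response, and {{knowledge}} settings work together when a prompt is assembled. The sketch below illustrates one plausible way they could combine. It is not the platform's implementation, which this document does not describe. It assumes plain string substitution and a blank-line separator between chunks, and the function name is hypothetical.

```python
from typing import List, Optional


def render_system_prompt(template: str, chunks: List[str], top_n: int = 6) -> Optional[str]:
    """Fill {{knowledge}} with at most top_n chunks.

    Returns None when nothing was retrieved, so the caller can send the
    configured fallback response instead of calling the LLM.
    """
    if not chunks:
        return None
    knowledge = "\n\n".join(chunks[:top_n])
    return template.replace("{{knowledge}}", knowledge)
```

A caller would check for `None` first and return the fallback text in that case, which keeps the model from answering without any supporting chunks.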
API
POST /api/ragflow/chats
Content-Type: application/json
{
  "name": "Product Support Assistant",
  "dataset_ids": ["<dataset_id>"],
  "llm": { "temperature": 0.1, "top_p": 0.3 },
  "prompt": {
    "similarity_threshold": 0.2,
    "vector_similarity_weight": 0.3,
    "top_n": 6,
    "show_quote": true,
    "empty_response": "I'm sorry, but I couldn't find any relevant information in the knowledge base.",
    "prompt": "You are a product support assistant. Answer the user's questions concisely based only on the provided {{knowledge}}."
  }
}
Start a conversation
Console
In the assistant list, click an assistant's card to open the conversation view.
In the session list on the left, click + New conversation to start a new session, or select an existing one to continue.
In the input box, type your question and press Enter to send. Use Shift+Enter for a new line. The response streams in real-time.
At the end of each response, you can expand the citation markers to view the matched chunks and a link to the original source.
API
Create a session.
POST /api/ragflow/chats/<chat_id>/sessions
Content-Type: application/json
{ "name": "First conversation" }
Start a streaming Q&A.
POST /api/ragflow/chats/<chat_id>/completions
Content-Type: application/json
{ "session_id": "<session_id>", "question": "How long do returns take?", "stream": true }
Each frame in the streaming response has the following format:
data: {"code":0,"data":{"answer":"Typically within 7 business days...","reference":{...},"finish":false}}
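A client has to strip the `data:` prefix from each frame and decode the JSON payload. The sketch below does that for an iterable of text lines and stops at the first frame whose `finish` field is true. It assumes one JSON object per `data:` line and a non-zero `code` on errors. Whether `answer` carries the full text so far or only a delta is not stated here, so the sketch yields frames without merging them.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_frames(lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield the 'data' object of each streamed frame until finish is true."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything that is not a frame
        payload = json.loads(line[len("data:"):].strip())
        if payload.get("code") != 0:
            raise RuntimeError(f"stream returned an error frame: {payload}")
        data = payload.get("data")
        if not isinstance(data, dict):
            continue
        yield data
        if data.get("finish"):
            break
```

The function accepts any iterable of strings, so it can be fed from a streaming HTTP response or from a recorded transcript during testing.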
Retrieve with skills
The platform includes two preset skills that let an AI assistant access a knowledge base directly, without manual API calls.
Skill name | Capability | Applicable role |
Knowledge Base Search Agent (polardb-kb-search-agent) | Read-only: List knowledge bases, list documents, and perform semantic search. | General users (prevents accidental operations) |
Knowledge Base Agent (polardb-kb-agent) | Read-write: Manage knowledge bases, upload or delete documents, modify chunks, and search. | Administrators and operations staff |
Import a skill
An administrator can enable the corresponding skill by navigating to skill management > import presets.
Usage
After you enable a skill, the AI assistant can retrieve information from the knowledge base. Simply ask questions in natural language, and the assistant will automatically search for relevant content and use it to generate a response.
Command reference
Semantic search:
# Basic search (uses the default configured knowledge base)
python3 scripts/search.py "How do I configure the database connection pool?"
# Specify knowledge bases and return more results
python3 scripts/search.py "deployment process" --dataset-ids ds-abc123,ds-xyz --top-k 10
# Include images (for questions about charts or architecture diagrams)
python3 scripts/search.py "system architecture diagram" --with-images --json
View knowledge bases:
python3 scripts/datasets.py list --json # List all accessible knowledge bases
python3 scripts/datasets.py info ds-abc123 # View details
Check document and parsing status:
python3 scripts/list_documents.py ds-abc123 --run DONE --suffix pdf
python3 scripts/parse_status.py ds-abc123 # Aggregate parsing progress
Management operations (polardb-kb-agent only):
python3 scripts/datasets.py create --name "new-kb" --chunk-method naive
python3 scripts/upload_documents.py ds-abc123 ./docs/*.pdf
python3 scripts/parse_documents.py ds-abc123 --all
python3 scripts/update_dataset.py ds-abc123 --chunk-method paper
Note: All skill scripts read the ONTOLOGY_BASE_URL and ONTOLOGY_API_KEY environment variables by default. These variables are automatically injected when you import the skill into the AI assistant, so you do not need to set them.
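When a script runs outside the AI assistant, for example during local debugging, the two variables are not injected and must be present in the shell. The sketch below shows how a script could read them and fail early with a clear message. It is not taken from the skill scripts themselves. The function name is hypothetical, and trimming a trailing slash from the base URL is a choice made for this example.

```python
import os
from typing import Mapping, Optional, Tuple

REQUIRED_VARS = ("ONTOLOGY_BASE_URL", "ONTOLOGY_API_KEY")


def load_skill_env(environ: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Return (base_url, api_key) or raise if either variable is missing or empty."""
    env = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    # Trim a trailing slash so that paths can be appended consistently.
    return env["ONTOLOGY_BASE_URL"].rstrip("/"), env["ONTOLOGY_API_KEY"]
```

Passing a plain dictionary as `environ` makes the function easy to test without touching the real process environment.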
Use cases
Developer Q&A: Import internal design documents into a knowledge base to query architecture details and API specifications directly from an AI assistant.
Customer service SOP: Import FAQs into a knowledge base so customer service agents can retrieve standard responses using an AI assistant.
Ticket analysis: Import historical incident reports. When a new incident occurs, use an AI assistant to search for similar cases.
Bulk semantic analysis: Use the --json flag with your scripts to retrieve answers for a batch of questions and export the results for evaluation.
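For bulk analysis, one approach is to generate one search.py invocation per question and run each with `subprocess.run`. The sketch below only builds the argument lists, using the flags shown in the command reference. The structure of the JSON that the script prints is not documented here, so parsing is left to the caller. The function names are hypothetical.

```python
from typing import List, Optional, Sequence


def build_search_command(question: str,
                         dataset_ids: Optional[Sequence[str]] = None,
                         top_k: Optional[int] = None) -> List[str]:
    """Build the argv list for one scripts/search.py call with JSON output."""
    cmd = ["python3", "scripts/search.py", question]
    if dataset_ids:
        cmd += ["--dataset-ids", ",".join(dataset_ids)]
    if top_k is not None:
        cmd += ["--top-k", str(top_k)]
    cmd.append("--json")
    return cmd


def build_batch(questions: Sequence[str], **options) -> List[List[str]]:
    """Build one command per non-empty question, preserving the input order."""
    return [build_search_command(q, **options) for q in questions if q.strip()]
```

Passing each list to `subprocess.run(cmd, capture_output=True, text=True)` avoids shell quoting problems with questions that contain spaces or quotation marks.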