RAG engine

This document provides instructions for the following RAG engine features on the knowledge platform: administrator environment configuration, knowledge base management, document import and parsing, retrieval tests, conversational Q&A, and skill retrieval. You can perform each operation using the console or through API calls.

Overview

Retrieval-Augmented Generation (RAG) is a core feature of the knowledge platform. It vectorizes document content and stores it in a knowledge base to enable precise retrieval based on semantic similarity. A large language model (LLM) then generates a response based on the retrieved results.

A typical RAG workflow includes the following steps:

  1. Admin configuration: Configure Object Storage Service (OSS), the document engine, and AI models to initialize the platform.

  2. Create a knowledge base: Create a knowledge base to organize your document collection and set the chunking parameters.

  3. Import and parse: Upload documents and trigger parsing. The system splits the documents into text chunks and generates vector embeddings.

  4. Retrieval and Q&A: Run a retrieval test to verify recall, and then create a conversational assistant for Q&A.

  5. Skill retrieval: Use an AI assistant's built-in skill to retrieve information directly from the knowledge base without manual API calls.

Role permissions are as follows:

| Actions | Required role |
| --- | --- |
| Query knowledge bases, run retrieval, and use Q&A | VIEWER or higher |
| Create knowledge bases, upload or delete documents, and trigger parsing | DEVELOPER or higher |
| Configure AI models, Object Storage Service (OSS), the document engine, and resource authorization | ADMIN |

Administrator configuration

An administrator (ADMIN) only needs to perform this configuration once after the initial platform deployment. You can skip this section if the environment is already configured.

OSS configuration

The RAG engine stores source files, intermediate parsing artifacts, and chunks in Object Storage Service (OSS).

Console

  1. In the left navigation pane, select RAG Engine > Service Configuration.

  2. In the Object storage (OSS) configuration panel, configure the following parameters:

    | Parameter | Description | Example |
    | --- | --- | --- |
    | Storage type | The type of object storage service. | OSS, S3, MINIO |
    | Access key ID | Your access key ID. | AKxxxxxxxx |
    | Secret access key | Your secret access key. It is stored server-side with AES-GCM encryption. To keep a previously saved key, leave this field empty. | ******** |
    | Endpoint URL | The endpoint URL for Object Storage Service (OSS). | https://oss-cn-hangzhou.aliyuncs.com |
    | Region | The Object Storage Service (OSS) region. | cn-hangzhou |
    | Bucket | The default bucket name. | my-rag-bucket |
    | Prefix path | An optional path prefix for all objects stored in the bucket. | rag/ |
    | Signature version | The signature version for authenticating requests. | s3, v4 |
    | Addressing style | The bucket addressing style. | virtual, path |

  3. Click Test Connection to verify the configuration.

  4. Click Save Configuration. The system then automatically restarts the RAG engine service, which takes approximately 10 to 30 seconds.

API

  1. Save the OSS configuration:

    POST /api/oss/config
    Content-Type: application/json
    
    {
      "storage_type":       "OSS",
      "endpoint_url":       "https://oss-cn-hangzhou.aliyuncs.com",
      "region":             "cn-hangzhou",
      "bucket":             "my-rag-bucket",
      "access_key":         "AKxxx",
      "secret_key":         "xxxxxxx",
      "prefix_path":        "rag/",
      "signature_version":  "s3",
      "addressing_style":   "virtual"
    }
  2. Saving the configuration automatically restarts the RAG engine service, which takes approximately 10 to 30 seconds. After the restart is complete, call the test endpoint to verify connectivity:

    POST /api/oss/test
    Content-Type: application/json
    
    {
      "storage_type":       "OSS",
      "endpoint_url":       "https://oss-cn-hangzhou.aliyuncs.com",
      "region":             "cn-hangzhou",
      "bucket":             "my-rag-bucket",
      "access_key":         "AKxxx",
      "secret_key":         "xxxxxxx",
      "prefix_path":        "rag/",
      "signature_version":  "s3",
      "addressing_style":   "virtual"
    }

    Expected response: {"code": 0, "connected": true}.
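The two calls above can be sketched in Python with only the standard library. This is a minimal illustration, not the platform's client: the helper names, the base URL, and the optional Bearer token header (borrowed from the upload example later in this document) are assumptions. The request is built but not sent, so you can inspect it first.

```python
import json
import urllib.request


def build_oss_config(access_key, secret_key, bucket, endpoint_url, region,
                     storage_type="OSS", prefix_path="rag/",
                     signature_version="s3", addressing_style="virtual"):
    """Assemble the JSON body used by /api/oss/config and /api/oss/test."""
    return {
        "storage_type": storage_type,
        "endpoint_url": endpoint_url,
        "region": region,
        "bucket": bucket,
        "access_key": access_key,
        "secret_key": secret_key,
        "prefix_path": prefix_path,
        "signature_version": signature_version,
        "addressing_style": addressing_style,
    }


def build_request(base_url, path, body, token=None):
    """Create (but do not send) a JSON POST request for the given path."""
    headers = {"Content-Type": "application/json"}
    if token:
        # Assumption: the API accepts a Bearer token, as in the upload example.
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers=headers,
        method="POST",
    )


def is_connected(response_body):
    """Match the expected response {"code": 0, "connected": true}."""
    return response_body.get("code") == 0 and response_body.get("connected") is True
```

To send a request, pass it to `urllib.request.urlopen` and decode the body with `json.load`. Because saving triggers a restart, wait for the service to come back before calling the test endpoint.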

Document engine

The document engine stores chunks, vectors, and full-text indexes. The following two engine types are supported:

| Scenario | Recommended engine |
| --- | --- |
| You want to avoid deploying additional services. | PostgreSQL |
| You want to use an existing OpenSearch service. | OpenSearch |

Console

  1. On the Service Configuration page, scroll to the Document engine configuration section.

  2. Under Engine type at the top, select PostgreSQL or OpenSearch.

  3. Switch to the corresponding tab and enter the connection parameters.

    • PostgreSQL tab

      | Parameter | Default | Description |
      | --- | --- | --- |
      | Host / port | localhost / 5432 | The instance address for PostgreSQL. |
      | Username / password | postgres / leave empty if already saved | Database credentials. The password is encrypted for storage. |
      | Database | _rag_doc_db | The name of the dedicated database for the document engine. |
      | Pool size / max overflow / pool timeout | 20 / 10 / 30 | Connection pool settings. |
      | Full-text search engine | gin | The engine type for full-text indexing. |
      | Full-text search language | chinese | The language for tokenization. |
      | Hybrid mode | weighted_fusion | The fusion strategy for hybrid search results. |
      | FTS top-N / vector top-N | 100 / 100 | The maximum number of results to retrieve from each retrieval path. |

    • OpenSearch tab

      | Parameter | Default | Description |
      | --- | --- | --- |
      | Host / port | opensearch / 9201 | The instance address for OpenSearch. |
      | Username / password | admin / leave empty if already saved | Administrator credentials. |

  4. Click Test Connection to verify connectivity, and then click Save Configuration.

  5. Under Engine type, click Switch and Restart. The system automatically restarts the RAG engine service to switch to the new engine.

Important

After you switch engines, existing knowledge bases can no longer access the chunk data stored in the previous engine. To migrate the data in full, re-parse all existing documents.

API

  1. Save the PostgreSQL configuration:

    POST /api/doc-engine/pg/config
    Content-Type: application/json
    
    {
      "host": "localhost",
      "port": 5432,
      "database": "_rag_doc_db",
      "username": "postgres",
      "password": "your_password"
    }
  2. Test PostgreSQL connectivity (you can pass an empty body to use the saved configuration):

    POST /api/doc-engine/pg/test
    Content-Type: application/json
    
    {}
  3. Switch the engine type (triggers a restart):

    POST /api/doc-engine/engine-type
    Content-Type: application/json
    
    { "engine_type": "postgresql" }

LLM and embedding models

The RAG engine requires an embedding model to vectorize chunks and a chat LLM for conversational Q&A.

Console

  1. In the left navigation pane, select RAG Engine > Model Configuration.

  2. Add a model provider: In the Add from preset model providers section, find your target provider (such as Tongyi, OpenAI, or DeepSeek), click Add, and enter the API key and an optional base URL.

  3. Import a custom model (Optional): Click Add custom model. Fill in the model provider, model name, API key, base URL, and model type (chat, embedding, rerank, image2text). Click Validate to confirm the model is available before clicking Add.

  4. Set default models: In the Tenant default models section at the top of the page, specify the following:

    • LLM: For example, qwen-plus.

    • Embedding: For example, text-embedding-v3.

    • VLM (Optional): For PDF image understanding, for example, qwen-vl-plus.

  5. Click Save. The settings take effect immediately; no restart is required.

API

  1. Set the API key for a model provider.

    POST /api/ragflow/llm/set_api_key
    Content-Type: application/json
    
    {
      "llm_factory": "Tongyi-Qianwen",
      "api_key":     "sk-xxxxxxxxxxxxxxxxxxxxxx",
      "base_url":    "https://dashscope.aliyuncs.com/compatible-mode/v1"
    }

    To view a list of available model providers, call GET /api/ragflow/llm/factories.

  2. Set the tenant default models.

    POST /api/ragflow/set_tenant_info
    Content-Type: application/json
    
    {
      "tenant_id":  "<Get this from GET /api/ragflow/tenant_info>",
      "llm_id":     "qwen-plus@Tongyi-Qianwen",
      "embd_id":    "text-embedding-v3@Tongyi-Qianwen",
      "asr_id":     "",
      "img2txt_id": ""
    }
  3. Override default models at the user level (Optional).

    PUT /api/ragflow/defaults
    Content-Type: application/json
    
    { "llm_id": "gpt-4o@OpenAI" }
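The model identifiers in the examples above follow a `<model name>@<provider>` pattern (for example, `qwen-plus@Tongyi-Qianwen`). The following sketch builds the body for the tenant default call from that pattern. The pattern is inferred from the examples rather than stated as a rule, and the helper names are hypothetical.

```python
def make_model_id(model_name, provider):
    """Join a model name and provider as '<model name>@<provider>' (inferred pattern)."""
    if not model_name or not provider or "@" in provider:
        raise ValueError("model name and provider are required; provider must not contain '@'")
    return f"{model_name}@{provider}"


def split_model_id(model_id):
    """Split 'gpt-4o@OpenAI' into ('gpt-4o', 'OpenAI'), using the last '@' as separator."""
    name, sep, provider = model_id.rpartition("@")
    if not sep or not name or not provider:
        raise ValueError(f"not a '<model>@<provider>' identifier: {model_id!r}")
    return name, provider


def build_tenant_defaults(tenant_id, llm, embedding, provider, img2txt=None):
    """Build the body for POST /api/ragflow/set_tenant_info with one provider for all models."""
    return {
        "tenant_id": tenant_id,
        "llm_id": make_model_id(llm, provider),
        "embd_id": make_model_id(embedding, provider),
        "asr_id": "",
        "img2txt_id": make_model_id(img2txt, provider) if img2txt else "",
    }
```

Validating the identifier before sending the request catches a missing provider suffix early, instead of finding out when parsing or Q&A fails.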

Administrator self-check

After completing the configurations above, use the following API calls to verify the status of each component:

GET /api/ragflow/status        # Check if the RAG engine service is online
GET /api/ragflow/tenant_info   # Check if the default models are set
POST /api/oss/test             # Check if OSS is reachable (requires complete OSS configuration parameters)
POST /api/doc-engine/pg/test   # Check if the document engine is reachable (can pass an empty body {})

If all four calls are successful (connected: true or code: 0), you can start using RAG features.
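The self-check can be automated with a small script. The sketch below assumes a caller-supplied `call(method, path, body)` function that performs the HTTP request and returns the decoded JSON body. That function and the helper names are hypothetical. The success rule is one interpretation of the criterion above: if a response has a `connected` field, that field decides, otherwise `code: 0` counts as success.

```python
def component_ok(body):
    """Interpret one self-check response."""
    if "connected" in body:
        return body["connected"] is True
    return body.get("code") == 0


def self_check(call, oss_config):
    """Run the four checks and return a {path: bool} report.

    `call(method, path, body)` must return the decoded JSON response as a dict.
    `oss_config` is the complete OSS configuration required by /api/oss/test.
    """
    checks = [
        ("GET", "/api/ragflow/status", None),
        ("GET", "/api/ragflow/tenant_info", None),
        ("POST", "/api/oss/test", oss_config),
        ("POST", "/api/doc-engine/pg/test", {}),
    ]
    report = {}
    for method, path, body in checks:
        try:
            report[path] = component_ok(call(method, path, body))
        except Exception:
            # A transport error or malformed response counts as a failed check.
            report[path] = False
    return report


def all_ok(report):
    """True only if every component passed."""
    return all(report.values())
```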

Create a knowledge base

Console

  1. In the left navigation pane, select RAG Engine > Knowledge Base Management.

  2. In the upper-right corner of the page, click + New knowledge base. In the dialog box that appears, configure the following parameters:

    | Parameter | Value | Description |
    | --- | --- | --- |
    | Name | quickstart-kb | The name can contain only letters, digits, and hyphens, which simplifies script references. |
    | Description | Enter a brief business description. | This helps identify the knowledge base when multiple people are collaborating. |
    | Chunking method | General (Naive) | This is the recommended method for general documents. Other options include manual (for long texts), paper (for research papers), qa (for Q&A pairs), and table (for tables). |
    | Embedding model | Leave blank | The knowledge base uses the tenant's default model automatically, which prevents dimension mismatches. |
    | Permission | me / team | me: Visible only to you. team: Shared with the entire tenant. |

  3. Click Create. The new knowledge base appears in the list. Its ID is the dataset_id (for example, ds-abc123), which is required for subsequent API and skill calls.

API

POST /api/ragflow/datasets
Content-Type: application/json

{
  "name":            "quickstart-kb",
  "description":     "My first knowledge base",
  "chunk_method":    "naive",
  "permission":      "me"
}

After the request succeeds, note the data.id from the response body (for example, ds-abc123). This ID is required for subsequent API calls.
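Before calling the endpoint, a script can validate the name against the rule from the console (letters, digits, and hyphens only). The sketch below is a minimal illustration. The helper name is hypothetical, and treating manual, paper, qa, and table as valid `chunk_method` API values is an assumption based on the options listed for the console.

```python
import re

# Letters, digits, and hyphens only, per the naming rule in the console table.
NAME_PATTERN = re.compile(r"[A-Za-z0-9-]+")

# "naive" appears in the API example; the other values are assumed to match
# the console option names.
CHUNK_METHODS = {"naive", "manual", "paper", "qa", "table"}
PERMISSIONS = {"me", "team"}


def build_dataset_payload(name, description="", chunk_method="naive", permission="me"):
    """Validate the inputs and return the body for POST /api/ragflow/datasets."""
    if not NAME_PATTERN.fullmatch(name):
        raise ValueError(f"invalid knowledge base name: {name!r}")
    if chunk_method not in CHUNK_METHODS:
        raise ValueError(f"unknown chunk method: {chunk_method!r}")
    if permission not in PERMISSIONS:
        raise ValueError(f"unknown permission: {permission!r}")
    # The embedding model is omitted on purpose so that the tenant default is used.
    return {
        "name": name,
        "description": description,
        "chunk_method": chunk_method,
        "permission": permission,
    }
```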

Import and parse documents

Upload files

Console

  1. In the knowledge base list, click the folder icon for the target knowledge base to open the Document Management page.

  2. Click Upload in the upper-right corner of the page. Drag and drop files into the upload box or click to select them. Supported formats include PDF, Word, Excel, PowerPoint, TXT, and Markdown.

  3. Once the upload is complete, close the upload dialog. The new document appears in the document list, and its status is unprocessed.

API

curl -X POST "http://<host>/api/ragflow/datasets/<dataset_id>/documents" \
     -H "Authorization: Bearer <token>" \
     -F "file=@./product-manual.pdf" \
     -F "file=@./FAQ.md"

After you upload the documents, their status is unprocessed.
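For scripts that cannot shell out to curl, the same multipart upload can be built with the Python standard library. This is a sketch under assumptions: the helper names are hypothetical, each part is sent as `application/octet-stream`, and the field name `file` and Bearer token mirror the curl example. The request is built but not sent.

```python
import urllib.request
import uuid
from pathlib import Path


def encode_multipart(files, boundary=None):
    """Encode (filename, bytes) pairs as multipart/form-data under the field name 'file'.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for filename, content in files:
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            "Content-Type: application/octet-stream\r\n\r\n"
        ).encode("utf-8")
        parts.append(header + content + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def build_upload_request(host, dataset_id, token, paths):
    """Build the POST request that uploads the given local files to a knowledge base."""
    files = [(Path(p).name, Path(p).read_bytes()) for p in paths]
    body, content_type = encode_multipart(files)
    return urllib.request.Request(
        f"http://{host}/api/ragflow/datasets/{dataset_id}/documents",
        data=body,
        headers={"Authorization": f"Bearer {token}", "Content-Type": content_type},
        method="POST",
    )
```

Pass the result to `urllib.request.urlopen` to perform the upload. For large files, a streaming HTTP client is a better fit, because this sketch reads each file fully into memory.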

Trigger parsing

Console

  1. On the Document Management page, select the documents to parse by checking them individually or using the Select All option.

  2. At the top of the page, click Start Parsing.

    Note

    If no embedding model is configured, a warning appears. Click Configure to go to the knowledge base configuration page and add one.

  3. During parsing, the document list automatically refreshes to show the progress. The document status changes as follows: unprocessed → parsing → completed (or failed).

API

POST /api/ragflow/datasets/<dataset_id>/documents/parse
Content-Type: application/json

{ "ids": ["doc-xxx", "doc-yyy"] }
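Parsing runs asynchronously, so a script usually polls until every document reaches a final status. The sketch below shows one way to do that. `get_status(doc_id)` is a hypothetical caller-supplied function, because this document does not describe a status endpoint, and the status strings are taken from the console flow. The API may report different values.

```python
import time

# Final statuses, as named in the console flow (assumed to match the API).
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_parsing(get_status, doc_ids, interval=2.0, timeout=600.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until every document is completed or failed.

    `get_status(doc_id)` returns the current status string for one document.
    Returns {doc_id: final_status}; raises TimeoutError if the deadline passes.
    """
    deadline = clock() + timeout
    pending = set(doc_ids)
    final = {}
    while pending:
        for doc_id in sorted(pending):
            status = get_status(doc_id)
            if status in TERMINAL_STATUSES:
                final[doc_id] = status
        pending.difference_update(final)
        if not pending:
            break
        if clock() >= deadline:
            raise TimeoutError(f"still parsing: {sorted(pending)}")
        sleep(interval)
    return final
```

Documents that end as failed are returned rather than raised, so the caller can decide whether to re-trigger parsing for them.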

View and modify chunks

Console

  1. On the Document Management page, click the document icon for a completed document.

  2. In the chunk drawer on the right, you can perform the following actions on each chunk:

    • Click Edit to manually correct errors, such as OCR mistakes or incorrect paragraph merging.

    • Use the toggle to enable or disable a chunk. Disable a low-quality chunk to temporarily exclude it from retrieval.

    • Click Add Chunk to manually add content that was missed during parsing.

API

GET  /api/ragflow/datasets/<dataset_id>/documents/<doc_id>/chunks?page=1&size=20
POST /api/ragflow/datasets/<dataset_id>/documents/<doc_id>/chunks/switch

Retrieval test

Before creating an assistant, run a retrieval test to confirm that the expected chunks are retrieved.

Console

  1. In the navigation pane on the left, choose RAG Engine > Content Retrieval.

  2. From the Select Knowledge Base drop-down list at the top of the page, select one or more knowledge bases.

  3. Enter a question in the text box and click Retrieve.

  4. In the parameters panel on the right, adjust the following retrieval parameters as needed:

    | Parameter | Default | Description |
    | --- | --- | --- |
    | Top K | 10 | The number of chunks to return. |
    | Similarity threshold | 0.2 | Filters out chunks with a score below this threshold. |
    | Vector weight | 0.3 | For hybrid retrieval, this sets the weight of the vector score. The remaining weight is allocated to keyword matching. |
    | Keyword | Off | When enabled, the system also performs BM25 keyword matching. |
    | Metadata filtering | Empty | Narrows the retrieval scope by document type or tags. |

  5. Review the results. Each chunk displays a combined score, a vector score, and a keyword score.

    Interpretation criteria:

    | Observation | Conclusion |
    | --- | --- |
    | The top 3 retrieved chunks answer the question. | The retrieval quality is good. You can proceed to create an assistant. |
    | The top 3 chunks are not relevant, but correct results appear within the top 10. | Increase the Top K value or enable reranking. |
    | None of the chunks are relevant. | Check the chunk quality or adjust the chunking parameters. |

API

POST /api/ragflow/retrieval
Content-Type: application/json

{
  "question": "What are the conditions for product returns?",
  "dataset_ids": ["<dataset_id>"],
  "top_k":    10,
  "similarity_threshold":     0.2,
  "vector_similarity_weight": 0.3,
  "highlight": true
}

The response for each chunk includes the following:

  • similarity: The combined score.

  • vector_similarity / term_similarity: The vector and keyword scores.

  • content_with_weight: The content, with highlight markup.

  • docnm_kwd / doc_id: The source document.
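To see how the vector weight, similarity threshold, and Top K interact, the following sketch recomputes a combined score from the two component scores. It assumes a simple linear weighting, in which the keyword score receives the weight left over from the vector weight. The engine's exact formula is not stated here, so treat this as an illustration rather than the engine's implementation. The function names are hypothetical.

```python
def hybrid_score(vector_similarity, term_similarity, vector_weight=0.3):
    """Linear fusion: the keyword score gets the remaining weight (1 - vector_weight)."""
    return vector_weight * vector_similarity + (1.0 - vector_weight) * term_similarity


def rank_chunks(chunks, top_k=10, similarity_threshold=0.2, vector_weight=0.3):
    """Score each chunk, drop those below the threshold, and keep the top_k best.

    Each chunk is a dict with 'vector_similarity' and 'term_similarity' keys,
    matching the response fields listed above.
    """
    scored = []
    for chunk in chunks:
        score = hybrid_score(chunk["vector_similarity"],
                             chunk["term_similarity"], vector_weight)
        if score >= similarity_threshold:
            scored.append({**chunk, "similarity": score})
    scored.sort(key=lambda c: c["similarity"], reverse=True)
    return scored[:top_k]
```

For example, with the default weight of 0.3, a chunk with a vector score of 0.8 and a keyword score of 0.4 gets 0.3 × 0.8 + 0.7 × 0.4 = 0.52. With a low vector weight, strong keyword matches dominate the ranking.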

Assistants and Q&A

Create an assistant

Console

  1. In the left navigation pane, select RAG Engine > Assistant.

  2. Click + New assistant, and configure the following parameters:

    Basic information

    | Parameter | Description |
    | --- | --- |
    | Name | The display name of the assistant. |
    | Avatar | Upload a custom avatar or keep the default. |
    | Associated knowledge base | Select one or more knowledge bases. |

    Model parameters

    | Parameter | Suggested value | Description |
    | --- | --- | --- |
    | Model | Tenant default | The model that the assistant uses. It defaults to the tenant's model, but you can override it with a specific model. |
    | Temperature | 0.1 | A lower value makes the response adhere more strictly to the content in the knowledge base. |
    | Top-p | 0.3 | Adjust either this parameter or Temperature, not both. |
    | Similarity threshold | 0.2 | Filters out retrieved chunks with a score below this threshold. |
    | Vector weight | 0.3 | In hybrid search, this is the weight of the vector similarity score. |
    | Top-n | 6 | The maximum number of chunks to pass to the LLM. |
    | Show citations | Enabled | Displays citations in the response that link to the original source chunks. |
    | Fallback response | Custom text | The response returned when no results are found in the knowledge base. This helps prevent model hallucination. |

    System prompt: Configure the system prompt. Use the {{knowledge}} variable to reference the retrieved results.

  3. Click Create.

API

POST /api/ragflow/chats
Content-Type: application/json

{
  "name":        "Product Support Assistant",
  "dataset_ids": ["<dataset_id>"],
  "llm":  { "temperature": 0.1, "top_p": 0.3 },
  "prompt": {
    "similarity_threshold":     0.2,
    "vector_similarity_weight": 0.3,
    "top_n":            6,
    "show_quote":       true,
    "empty_response":   "I'm sorry, but I couldn't find any relevant information in the knowledge base.",
    "prompt":           "You are a product support assistant. Answer the user's questions concisely based only on the provided {{knowledge}}."
  }
}
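The sketch below illustrates how `top_n`, `empty_response`, and the `{{knowledge}}` variable relate to each other. It is a conceptual model only. How the platform performs the substitution internally is not documented here, and the function name and the way chunks are joined are assumptions.

```python
def prepare_turn(prompt_template, chunks, empty_response, top_n=6):
    """Return (system_prompt, fallback) for one question.

    `chunks` are retrieved chunks, best first, each with a 'content_with_weight'
    field. If nothing was retrieved, the fallback text is returned and no
    prompt is built, so the LLM is never asked to answer without context.
    """
    selected = chunks[:top_n]
    if not selected:
        return None, empty_response
    # Assumption: chunks are joined with blank lines before substitution.
    knowledge = "\n\n".join(c["content_with_weight"] for c in selected)
    return prompt_template.replace("{{knowledge}}", knowledge), None
```

Keeping `top_n` small limits the prompt length, while the fallback path keeps the assistant from answering without support from the knowledge base.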

Start a conversation

Console

  1. In the assistant list, click an assistant's card to open the conversation view.

  2. In the session list on the left, click + New conversation to start a new session, or select an existing one to continue.

  3. In the input box, type your question and press Enter to send. Use Shift+Enter for a new line. The response streams in real time.

  4. At the end of each response, you can expand the citation markers to view the matched chunks and a link to the original source.

API

  1. Create a session.

    POST /api/ragflow/chats/<chat_id>/sessions
    Content-Type: application/json
    
    { "name": "First conversation" }
  2. Start a streaming Q&A.

    POST /api/ragflow/chats/<chat_id>/completions
    Content-Type: application/json
    
    {
      "session_id": "<session_id>",
      "question":   "How long do returns take?",
      "stream":     true
    }

    Each frame in the streaming response has the following format:

    data: {"code":0,"data":{"answer":"Typically within 7 business days...","reference":{...},"finish":false}}
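A client reads the stream line by line and decodes each `data:` frame. The sketch below parses frames in the format shown above from any iterable of lines, for example an open HTTP response. It assumes that a nonzero `code` signals an error and that the last frame sets `finish` to true. Whether `answer` is cumulative or incremental is not stated here, so the function returns each frame unchanged. The function name is hypothetical.

```python
import json


def iter_frames(lines):
    """Yield the 'data' object of each streaming frame until 'finish' is true.

    `lines` is any iterable of str or bytes lines, such as an HTTP response.
    """
    for raw in lines:
        line = raw.decode("utf-8") if isinstance(raw, bytes) else raw
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and anything else
        frame = json.loads(line[len("data:"):])
        if frame.get("code") != 0:
            # Assumption: a nonzero code indicates a server-side error.
            raise RuntimeError(f"stream error: {frame}")
        data = frame.get("data")
        if not isinstance(data, dict):
            continue
        yield data
        if data.get("finish"):
            break
```

A caller can print each frame's `answer` as it arrives and read `reference` from the final frame to display citations.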

Retrieve with skills

The platform includes two preset skills that let an AI assistant access a knowledge base directly, without manual API calls.

| Skill name | Capability | Applicable role |
| --- | --- | --- |
| Knowledge Base Agent (polardb-kb-agent) | Read-write: Manage knowledge bases, upload or delete documents, modify chunks, and search. | Administrators and operations staff |

Import a skill

An administrator can enable the corresponding skill by navigating to Skill Management > Import Presets.

Usage

After you enable a skill, the AI assistant can retrieve information from the knowledge base. Simply ask questions in natural language, and the assistant will automatically search for relevant content and use it to generate a response.

Command reference

  • Search:

    # Basic search (uses the default configured knowledge base)
    python3 scripts/search.py "How do I configure the database connection pool?"
    
    # Specify knowledge bases and return more results
    python3 scripts/search.py "deployment process" --dataset-ids ds-abc123,ds-xyz --top-k 10
    
    # Include images (for questions about charts or architecture diagrams)
    python3 scripts/search.py "system architecture diagram" --with-images --json
  • View knowledge bases:

    python3 scripts/datasets.py list --json           # List all accessible knowledge bases
    python3 scripts/datasets.py info ds-abc123        # View details
  • Check document and parsing status:

    python3 scripts/list_documents.py ds-abc123 --run DONE --suffix pdf
    python3 scripts/parse_status.py  ds-abc123        # Aggregate parsing progress
  • Management operations (polardb-kb-agent only):

    python3 scripts/datasets.py create --name "new-kb" --chunk-method naive
    python3 scripts/upload_documents.py ds-abc123 ./docs/*.pdf
    python3 scripts/parse_documents.py ds-abc123 --all
    python3 scripts/update_dataset.py ds-abc123 --chunk-method paper
    Note

    All skill scripts read the ONTOLOGY_BASE_URL and ONTOLOGY_API_KEY environment variables by default. These variables are automatically injected when you import the skill into the AI assistant, so you do not need to set them.

Use cases

  • Developer Q&A: Import internal design documents into a knowledge base to query architecture details and API specifications directly from an AI assistant.

  • Customer service SOP: Import FAQs into a knowledge base so customer service agents can retrieve standard responses using an AI assistant.

  • Ticket analysis: Import historical incident reports. When a new incident occurs, use an AI assistant to search for similar cases.

  • Bulk semantic analysis: Use the --json flag with your scripts to retrieve answers for a batch of questions and export the results for evaluation.
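The bulk semantic analysis use case can be sketched as a small driver around `scripts/search.py`. It uses only the flags shown in the command reference. The assumption that `--json` prints a single JSON document to standard output is not confirmed by this document, and the helper names are hypothetical.

```python
import json
import subprocess


def build_search_command(question, dataset_ids=None, top_k=None, with_images=False):
    """Build the argument list for scripts/search.py with JSON output enabled."""
    cmd = ["python3", "scripts/search.py", question]
    if dataset_ids:
        cmd += ["--dataset-ids", ",".join(dataset_ids)]
    if top_k is not None:
        cmd += ["--top-k", str(top_k)]
    if with_images:
        cmd.append("--with-images")
    cmd.append("--json")
    return cmd


def run_batch(questions, runner=subprocess.run, **options):
    """Run one search per question and return {question: parsed JSON output}.

    `runner` defaults to subprocess.run and can be replaced for testing.
    Assumption: the script writes one JSON document to stdout.
    """
    results = {}
    for question in questions:
        completed = runner(build_search_command(question, **options),
                           capture_output=True, text=True, check=True)
        results[question] = json.loads(completed.stdout)
    return results
```

Passing the question as a separate list element, instead of building a shell string, avoids quoting problems when a question contains spaces or quotation marks. The returned dictionary can be written to a file with `json.dump` for later evaluation.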