Build a RAG application with Dify and Tablestore


Enterprise Q&A systems often struggle to provide accurate answers. By building a RAG (retrieval-augmented generation) application that integrates Dify with Alibaba Cloud Tablestore as a vector database, you can achieve highly accurate knowledge retrieval and intelligent question answering.

How it works

Tablestore is a high-performance vector database that provides millisecond-level query responses. It supports hybrid search, combining both vector and full-text retrieval. A single table can store tens of billions of vector data points. Dify orchestrates the application and provides the user interface, simplifying the process of building and deploying RAG applications.

A RAG application implements intelligent Q&A through the following core process:

  1. Knowledge vectorization: Dify automatically splits enterprise documents into knowledge segments, converts them into vector representations, and stores them in a Tablestore vector database.

  2. Similarity search: When a user asks a question, the system quickly retrieves the most relevant knowledge segments from Tablestore.

  3. Augmented generation: The retrieved knowledge is combined with the user's question and fed into a large language model to generate an accurate and well-founded answer.
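
The similarity search in step 2 can be illustrated with a toy example. The following sketch assumes three made-up segment vectors and a made-up query vector, and ranks the segments by cosine similarity using awk. It only shows the idea; it is not how Tablestore implements vector retrieval, and the names segments and top_segment are hypothetical.

```shell
#!/bin/sh
# Toy data (assumed for illustration): one segment per line,
# an id followed by a 3-dimensional embedding.
segments='seg1 0.9 0.1 0.0
seg2 0.1 0.8 0.1
seg3 0.0 0.2 0.9'

# Hypothetical helper: print the id of the segment whose vector has the
# highest cosine similarity to the query vector passed as "$1".
top_segment() {
  printf '%s\n' "$segments" | awk -v q="$1" '
    BEGIN { n = split(q, qv, " ") }
    {
      dot = 0; qq = 0; ss = 0
      for (i = 1; i <= n; i++) {
        dot += qv[i] * $(i + 1)
        qq  += qv[i] * qv[i]
        ss  += $(i + 1) * $(i + 1)
      }
      sim = dot / (sqrt(qq) * sqrt(ss))
      if (NR == 1 || sim > best) { best = sim; id = $1 }
    }
    END { print id }'
}

top_segment '0.8 0.2 0.1'   # prints seg1
```

In the real application, the embedding model produces the vectors and Tablestore performs the ranking; the retrieved segments are then passed to the large language model together with the question.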


Prepare the deployment environment

Create a Tablestore instance

Tablestore serves as the vector database for the RAG application. It stores the vectorized representations of your documents and provides efficient similarity search.

  1. Log on to the Tablestore console. Click Create Instance, select VCU Mode (Formerly Reserved Mode), then click Activate Now.

  2. Set the instance parameters as described in the following list. Keep the default settings for the other configuration items.

    • Region: Select a region for the instance, such as China (Hangzhou).

    • VCUs: Set to 0 to reserve no computing resources for the instance.

    • Scalability: Select Enable.

    • Storage Type: Select High-performance.

  3. Click Buy Now > Pay > Subscribe to create the instance.

Create an ECS instance

The ECS instance serves as the runtime environment for the Dify application. It requires sufficient computing resources to support the Docker container cluster.

  1. Log on to the ECS console and click Create Instance.

  2. Create an ECS instance with the following configurations. You can keep the default settings for other parameters.

    • Billing Method: Select Pay-As-You-Go.

    • Region: Select a region for the instance. For network reasons, this tutorial uses the China (Hong Kong) region.

    • Network and Zone: Select the default Virtual Private Cloud (VPC) and zone.

    • Instance: Click All Instance Types, then search for and select ecs.e-c1m2.large.

      Note

      If this instance type is sold out, select another one.

    • Images: Select Public Image and choose Alibaba Cloud Linux (Alibaba Cloud Linux 3.2104 LTS 64-bit).

    • System Disks: Set the capacity of the ESSD Entry disk to 40 GiB.

    • Public IP Address: Select Assign Public IPv4 Address.

    • Bandwidth Billing Method: Select Pay-by-traffic to save costs.

    • Maximum Bandwidth: Select 5 Mbps or higher.

    • Security Group: Select Existing Security Group.

    • Logon Credential: Select Custom Password, set the logon name to root, and set your Logon Password. Keep the password in a secure location.

Deploy and access Dify

Step 1: Install the Docker environment

Dify uses a containerized deployment. You must first install Docker and Docker Compose on your ECS instance to manage container orchestration.

  1. Install the dnf-plugins-core plug-in.

    dnf -y install dnf-plugins-core
  2. Add the official Docker repository.

    dnf config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
  3. Install Docker Engine and Docker Compose.

    dnf -y install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
  4. Start the Docker service and enable it to start automatically on system boot.

    systemctl enable --now docker
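
After the steps above, a quick check can confirm that the tools are available before you continue. The following is a minimal sketch; missing_cmds is a hypothetical helper name and is not part of Docker.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every command in the argument
# list that cannot be found on the PATH (prints nothing if all exist).
missing_cmds() {
  for cmd in "$@"; do
    command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
  done
}

# Usage on the ECS instance (commented out so the block has no side effects):
#   missing_cmds docker        # empty output means docker was found
#   docker --version           # prints the installed Docker Engine version
#   docker compose version     # confirms the Compose plugin is installed
```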

Step 2: Deploy the Dify service

Next, get the Dify source code and configure environment variables, including the connection parameters for Tablestore as the vector database.

  1. Install the Git version control tool.

    yum -y install git
  2. Clone the Dify repository.

    git clone https://github.com/langgenius/dify.git
  3. Navigate to the Docker deployment directory.

    cd dify/docker
  4. Copy the environment file template.

    cp .env.example .env
  5. Edit the environment file.

    vi .env

    Modify the following configuration items:

    Parameter: VECTOR_STORE

    Description: The vector database type. Set this to tablestore.

    Parameters: TABLESTORE_ENDPOINT and TABLESTORE_INSTANCE_NAME

    Description: Go to the Tablestore console. In the Instances list, click the instance alias to go to the Instance Management page. Copy the instance name and endpoint. Select an endpoint based on your deployment:

    • If the ECS instance and Tablestore instance are in the same region, you can use either the public or VPC endpoint.

    • If the ECS instance and Tablestore instance are in different regions, you must use the public endpoint.

    Important

    By default, new Tablestore instances have public network access disabled. To use a public endpoint, go to the Tablestore console. On the Instance Management page, navigate to the Network Management tab. In the Allowed Network Type section, select Internet and click Configure to save the configuration.

    Parameters: TABLESTORE_ACCESS_KEY_ID and TABLESTORE_ACCESS_KEY_SECRET

    Description: Go to the AccessKey Management page to create an AccessKey for your Alibaba Cloud account. Save the AccessKey ID and the AccessKey secret, and use them as the values of these two parameters.

  6. Start the Dify container cluster.

    docker compose up -d

    A successful startup shows the following output:

     ✔ Network docker_default             Created                                                                                                                                                            0.1s
     ✔ Network docker_ssrf_proxy_network  Created                                                                                                                                                            0.1s
     ✔ Container docker-sandbox-1         Started                                                                                                                                                            0.8s
     ✔ Container docker-redis-1           Started                                                                                                                                                            1.0s
     ✔ Container docker-ssrf_proxy-1      Started                                                                                                                                                            1.3s
     ✔ Container docker-web-1             Started                                                                                                                                                            1.0s
     ✔ Container docker-db-1              Started                                                                                                                                                            0.9s
     ✔ Container docker-plugin_daemon-1   Started                                                                                                                                                            2.4s
     ✔ Container docker-api-1             Started                                                                                                                                                            2.4s
     ✔ Container docker-worker-1          Started                                                                                                                                                            2.3s
     ✔ Container docker-nginx-1           Started                                                                                                                                                            3.8s
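
Step 5 edits .env by hand in vi. As an alternative, the same settings could be applied from a script before you start the containers. The following sketch assumes that .env consists of KEY=value lines. The helpers set_env_var and endpoint_options are hypothetical, and the values in the usage comments are placeholders for the values that you copy from the console.

```shell
#!/bin/sh
# Hypothetical helper: set KEY=value in an env file, replacing an existing
# uncommented KEY= line or appending the pair if the key is absent.
# The file is rewritten through mktemp, so it ends up readable by its
# owner only, which suits a file that holds an AccessKey secret.
set_env_var() {
  key=$1 value=$2 file=$3
  tmp=$(mktemp) || return 1
  KEY="$key" VALUE="$value" awk -F= '
    $1 == ENVIRON["KEY"] { print ENVIRON["KEY"] "=" ENVIRON["VALUE"]; found = 1; next }
    { print }
    END { if (!found) print ENVIRON["KEY"] "=" ENVIRON["VALUE"] }
  ' "$file" > "$tmp" && mv "$tmp" "$file"
}

# Hypothetical helper: list the endpoint types that the rules in step 5
# allow (ECS region as the first argument, Tablestore region as the second).
endpoint_options() {
  if [ "$1" = "$2" ]; then
    echo "public vpc"
  else
    echo "public"
  fi
}

# Usage in dify/docker (commented out so the block has no side effects):
#   endpoint_options cn-hongkong cn-hangzhou   # prints: public
#   set_env_var VECTOR_STORE tablestore .env
#   set_env_var TABLESTORE_ENDPOINT '<endpoint copied from the console>' .env
#   set_env_var TABLESTORE_INSTANCE_NAME '<instance name>' .env
#   set_env_var TABLESTORE_ACCESS_KEY_ID '<AccessKey ID>' .env
#   set_env_var TABLESTORE_ACCESS_KEY_SECRET '<AccessKey secret>' .env
```

If you change .env after the containers are already running, run docker compose up -d again so that the affected containers are recreated with the new values.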

Step 3: Configure security group access rules

To access the Dify management interface from the internet, open the required port in the ECS security group.

  1. Go to the ECS console and open the security group rules for the target instance.

  2. Configure a security group rule: Set Action to Allow and Protocol Type to Web HTTP Traffic Access. For Destination, enter the Dify service port (default is 80).

    Set Priority to 1 and Source to 0.0.0.0/0 to allow access from any IPv4 address.

  3. Click OK to save the rule.
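
After the rule is saved, one way to confirm that the port is reachable from your local machine is to request the page with curl and look at the HTTP status code. The following is a minimal sketch; describe_status is a hypothetical helper, and server_ip is a placeholder for the public IP address of your ECS instance.

```shell
#!/bin/sh
# Hypothetical helper: interpret the HTTP status code that curl reports.
# curl prints 000 when it could not get a response at all.
describe_status() {
  case $1 in
    000)     echo "unreachable" ;;
    2??|3??) echo "reachable" ;;
    *)       echo "reachable, but the server returned an error" ;;
  esac
}

# Usage from your local machine (commented out; it needs network access):
#   code=$(curl -s -o /dev/null --max-time 5 -w '%{http_code}' "http://server_ip")
#   describe_status "$code"
```

An "unreachable" result usually points to the security group rule or to containers that are not running, whereas an error status means the request reached the server.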

Step 4: Access the Dify interface

In a browser, go to http://server_ip, where server_ip is the public IP address of your ECS instance. When you first access this page, you are redirected to an initialization page. Follow the prompts to set up an administrator account. After setup is complete, the system automatically logs you in.

The initialization page includes fields for Email, Username, and Password. The password must contain both letters and numbers and be at least 8 characters long. After filling in the details, click Setup.
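
The password rule stated above can be expressed as a small check. This is only a sketch of the rule as described here (letters, numbers, and at least 8 characters); check_password is a hypothetical helper, and Dify's own validation may apply additional constraints.

```shell
#!/bin/sh
# Hypothetical helper: report whether a candidate password satisfies the
# rule described above, or name the first requirement it fails.
check_password() {
  pw=$1
  if [ "${#pw}" -lt 8 ]; then echo "too short"; return; fi
  case $pw in *[A-Za-z]*) ;; *) echo "needs a letter"; return ;; esac
  case $pw in *[0-9]*)    ;; *) echo "needs a digit";  return ;; esac
  echo "ok"
}

check_password 'abc12345'   # prints ok
```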

Build the RAG application and verify the results

Step 1: Configure model services and an API key

The RAG application requires a large language model to understand questions and generate answers, and an Embedding model to convert text into vector representations.

  1. Install a model provider and configure its API key.

    1. On the Dify home page, click your user avatar, and then click Settings in the drop-down menu.

    2. On the Settings page, click Model Provider, select Tongyi, and click Install.

    3. After installation, in the list of models to be configured, click Setup.

      The Setup button is located in the API-KEY area on the model provider card (for example, Tongyi).

    4. Follow the on-screen prompts to get an API key from Alibaba Cloud Model Studio and enter it here. Then, click Save.

      In the configuration form, set Use International Endpoint to No.

  2. Configure the default system models.

    1. On the Model Provider page, click System Model Settings to the right of the Model List.

    2. Use the following recommended models, or select other models that suit your needs.

      • System Reasoning Model: Select qwen3-max-preview.

      • Embedding Model: Select text-embedding-v4.

      • Rerank Model: Select gte-rerank-v2.

      • Speech-to-Text Model: Select paraformer-realtime-v2.

      • Text-to-Speech Model: Select tts-1.

After configuring the models, press the ESC key to return to the Dify home page.

Step 2: Create a knowledge base

The knowledge base is a core component of the RAG application. It stores enterprise documents and provides intelligent retrieval.

  1. On the Dify home page, click Knowledge, and then click Create Knowledge.

  2. Click I want to create an empty Knowledge. In the Create an empty Knowledge dialog box, enter a name for the knowledge base (for example, Tablestore), and then click Create.

  3. On the knowledge base details page, click Add file. Select the sample file Model_Studio_Series_Mobile_Phone_Product_Introduction.docx, and then click Next to configure text segmentation and cleaning.

  4. Configure the processing parameters: set Index Method to High Quality and Retrieval Setting to Hybrid Search. Then click Save & Process. To check the processing status of the uploaded document, click Go to document.

    When the document is processed, the status column displays Available.

    You can go to the Tablestore console to view the knowledge data saved to the vector database.

    In the Data Management tab of the vector index table, each knowledge data record includes id (primary key), metadata, metadata_tags, page_content, and vector columns. This indicates that the knowledge document has been successfully parsed and written to the database as vector embeddings.

Step 3: Create an assistant and verify RAG

Finally, create a chatbot application. By comparing its responses before and after adding the knowledge base, you can verify the retrieval-augmented capability of the RAG application.

  1. On the Dify home page, click Studio, and then click Create from Blank.

  2. Select the Beginner-friendly template, then choose the Chatbot app type. Enter a name for the application and click Create.

    After creation, you are directed to the Orchestrate configuration interface.

  3. Test the chatbot by asking it a question, for example, "What are the Model Studio phone models?" At this stage, the assistant's answer may be inaccurate or lack detail.

    At this point, no Knowledge has been added on the Orchestrate page, and the prompt is empty. The assistant's response is based only on the general knowledge of the large model. It interprets "Model Studio" as the Alibaba Cloud large model service platform rather than the intended business domain, confirming the inaccuracy.

  4. On the application's Orchestrate page, click Add next to Knowledge, select the knowledge base you created earlier, and then click Restart to reset the debugging area as prompted.

  5. Ask the chatbot the same question again. Using the knowledge base, the assistant can provide an accurate and detailed answer.

    The Citations section below the assistant's reply shows the retrieved source documents. The knowledge base retrieval mode is configured as High Quality·Hybrid Search.

Going to production

Best practices

To ensure high availability for your Dify application in a production environment, we recommend deploying it on Container Service for Kubernetes (ACK). For a detailed implementation plan, see the solution Quickly deploy Dify to build AI applications and select the production deployment option (using ACK).

Risk management

  • Access control: Manage resource access with RAM users. Strictly control permission scopes, granting only the necessary operational permissions related to Tablestore and ECS.

  • Auditing: Enable ActionTrail to log all access and modification operations on critical resources, ensuring all actions are traceable.

  • Data backup: Periodically back up your Tablestore data to prevent data loss from accidental failures or human error.

Cost optimization

Based on your peak traffic patterns, configure reserved VCU resources for your Tablestore instance to optimize costs. Also, enable elastic capacity to handle sudden traffic surges. For more details, see VCU billing modes.

Clean up resources

After completing the tutorial, if you do not plan to deploy the application in a production environment, clean up the cloud resources promptly to avoid unnecessary charges.

  1. Release the ECS instance

    Go to the ECS console, find the target instance, click All Operations, and follow the on-screen instructions to release the instance.

  2. Delete the Tablestore data table

    Go to the Tablestore console. In the Instances list, click the alias of the target instance. On the Instance Management page, click Indexes to the right of the target data table, and delete the search index on the Indexes page. Then return to the data table list and delete the target data table.

  3. Release the Tablestore instance

    When the reserved VCU is set to 0 and there is no data in the instance, no fees are incurred. If you need to clean up all resources, release the Tablestore instance as follows:

    1. Go to the Resource Unsubscription page. For Product Name, select Tablestore Standard Instance, and click Search.

    2. In the Actions column of the target order, click Unsubscribe Resource. Read the unsubscription rules carefully, and then click Unsubscribe. Select a Reason For Unsubscription (such as The service is no longer required due to business changes.) and click OK. If prompted for identity verification, follow the on-screen instructions.

References