RAG applications

更新时间:
复制 MD 格式

Retrieval-augmented generation (RAG) extends LLM responses with your own data by retrieving relevant content from an external knowledge base before generating an answer. This reduces hallucination and lets LLMs answer questions about private or domain-specific data without fine-tuning.

PAI provides tools to build, deploy, and manage RAG applications. Deploy a RAG service on EAS with your choice of vector databases and LLMs, or use LangStudio to build RAG application flows for specialized domains.

Capabilities

Capability

Description

Deploy a RAG chatbot on EAS

Combine vector retrieval with LLM generation in a single service. Access the service through WebUI or API. WebUI lets you configure inference parameters and upload knowledge base files.

Build RAG flows in LangStudio

Design and deploy RAG application flows in a visual, flow-based environment. Tailor RAG solutions for specific domains such as finance and healthcare.

Connect to messaging platforms

Use AppFlow to link a PAI RAG service to third-party messaging platforms and build AI-powered chatbots and intelligent customer service agents.

Supported components

EAS-based RAG services support flexible configuration of both retrieval and generation components.

Component type

Supported options

Vector databases

Faiss, Elasticsearch, Hologres, OpenSearch, and RDS PostgreSQL

LLMs

Deploy models from Model Gallery, or connect to any LLM service that supports the OpenAI API.

Access methods

WebUI, API

Get started

Choose an approach based on how much control you need:

Approach

When to use

Documentation

LangStudio

You want a visual, flow-based interface to build domain-specific RAG applications.

Use LangStudio to create a DeepSeek- and RAG-based Q&A application flow for finance and healthcare

EAS scenario-based deployment

You need full control over which vector database and LLM to use.

Deploy and call a RAG chatbot service