ChatWithKnowledgeBaseStream
This service combines a knowledge base with a large language model (LLM) to provide intelligent Q&A. You can access the streaming interface using Server-Sent Events (SSE) or the Java asynchronous SDK.
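For readers consuming the SSE interface without an SDK, the following is a minimal sketch of generic SSE framing in Python. It assumes each event carries one JSON object in its `data:` field, which this page does not confirm; the helper name `iter_sse_data` is hypothetical and not part of any SDK.

```python
import json
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[dict]:
    """Yield the JSON payload of each SSE event found in `lines`.

    Generic SSE framing: `data:` lines accumulate until a blank line
    ends the event. Comment lines (starting with ':') and other fields
    (event:, id:, retry:) are ignored in this sketch.
    Assumption: every event's data is a single JSON object.
    """
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield json.loads("\n".join(buffer))
                buffer = []
            continue
        if line.startswith(":"):
            continue  # SSE comment, often used as a keep-alive
        if line.startswith("data:"):
            value = line[5:]
            # SSE drops one optional space after the colon.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)
    if buffer:
        # Lenient choice: flush a trailing event that lacks a final blank line.
        yield json.loads("\n".join(buffer))
```

The function accepts any iterable of text lines, so it can be fed from an HTTP client's line iterator or from a list of strings in a unit test.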
Operation description
Use this API to retrieve answers from a large language model based on content from a specified knowledge base. You can customize the request by configuring various parameters, including the database instance ID, knowledge retrieval parameters, and model inference parameters. The API includes a default system prompt template, and you can also specify a custom one.
DBInstanceId: Required. The ID of the database instance.
KnowledgeParams: Optional. Parameters for knowledge retrieval, such as retrieval content and the merge policy.
ModelParams: Required. Parameters for model inference, such as the message list and the model name.
PromptTemplate: Optional. A custom system prompt template.
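The required and optional parameters listed above can be sketched as a small request builder. This is an illustration only: `build_chat_request` is a hypothetical helper, the resulting dictionary is not a wire format or an SDK object, and other required fields from the parameter table (such as RegionId) are left out for brevity. The key used for the custom template follows the list above; the parameter table names this field PromptParams, so confirm which name your SDK expects.

```python
from typing import Optional


def build_chat_request(
    db_instance_id: str,
    model: str,
    messages: list,
    knowledge_params: Optional[dict] = None,
    prompt_template: Optional[str] = None,
) -> dict:
    """Assemble request parameters, enforcing the required ones.

    Hypothetical helper: it only mirrors the required/optional split
    described in the list above.
    """
    if not db_instance_id:
        raise ValueError("DBInstanceId is required")
    if not model or not messages:
        raise ValueError("ModelParams needs a model name and at least one message")

    request = {
        "DBInstanceId": db_instance_id,
        "ModelParams": {"Model": model, "Messages": messages},
    }
    # Optional parts are added only when the caller supplies them.
    if knowledge_params is not None:
        request["KnowledgeParams"] = knowledge_params
    if prompt_template is not None:
        # Assumption: key name taken from the list above.
        request["PromptTemplate"] = prompt_template
    return request
```

Leaving `knowledge_params` unset produces a request without KnowledgeParams, which the parameter table describes as a chat-only operation.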
RAM authorization
| Action | Access level | Resource type | Condition key | Dependent action |
| --- | --- | --- | --- | --- |
| gpdb:ChatWithKnowledgeBaseStream | get | *DBInstance | None | None |
Request syntax
POST / HTTP/1.1
Request parameters
|
Parameter |
Type |
Required |
Description |
Example |
| DBInstanceId |
string |
Yes |
The instance ID. Note: You can call the DescribeDBInstances operation to query the IDs of all AnalyticDB for PostgreSQL instances in a specified region. |
gp-xxxxxxxxx |
| RegionId |
string |
Yes |
The instance's region ID. |
cn-hangzhou |
| KnowledgeParams |
object |
No |
Parameters for knowledge retrieval. If omitted, the API performs a chat-only operation. |
|
| MergeMethod |
string |
No |
Specifies the method for merging results from multiple knowledge bases. Default:
|
"RRF" |
| MergeMethodArgs |
object |
No |
The arguments for the result merging method. |
|
| Rrf |
object |
No |
Parameters for the |
|
| K |
integer |
No |
The constant |
60 |
| Weight |
object |
No |
Parameters for the |
|
| Weights |
array |
No |
An array of weights for each |
|
|
number |
No |
The weight for a |
0.01 |
|
| RerankFactor |
number |
No |
Specifies the factor for reranking vector search results. The value must be greater than 1 and less than or equal to 5. Note
|
5.0 |
| RerankModel |
object |
No |
The rerank model to use. |
|
| Name |
string |
No |
The name of the rerank model. |
qwen3-rerank |
| Instruct |
string |
No |
An instruction for the rerank model. |
Given a web search query, retrieve relevant passages that answer the query |
| SourceCollection |
array<object> |
Yes |
An array of knowledge bases to search. |
|
|
array<object> |
No |
An item in the |
||
| Collection |
string |
Yes |
The name of the collection to search. |
cloud_index_adb_50943_prod |
| Namespace |
string |
No |
The namespace that contains the collection. Note: You can call the ListNamespaces operation to view available namespaces. |
ddstar_vector |
| NamespacePassword |
string |
Yes |
The password for the specified namespace. Note
This value is specified in the |
namespacePassword |
| QueryParams |
object |
No |
Parameters for the knowledge base query. |
|
| Filter |
string |
No |
A filter expression to apply to the search, similar to a SQL |
method_id='e41695f0-2851-40ac-b21d-dd337b60d71c' |
| GraphEnhance |
boolean |
No |
Specifies whether to enable knowledge graph enhancement. Default value: |
true |
| GraphSearchArgs |
object |
No |
The parameters for knowledge graph search. |
|
| GraphTopK |
integer |
No |
The number of top entities and relationship edges to return. Default value: |
60 |
| HybridSearch |
string |
No |
Specifies the hybrid search algorithm. If omitted, the system performs a basic score comparison of vector search and full-text retrieval results. Valid values:
|
Cascaded |
| HybridSearchArgs |
object |
No |
The arguments for the specified hybrid search algorithm. Supports
|
|
|
any |
No |
{"RRF":{"k":60}} |
||
| Metrics |
string |
No |
The distance metric for vector search. Valid values:
|
cosine |
| RecallWindow |
array |
No |
The recall window. Specifies a window of context to include around retrieved chunks. The value must be a two-element array Note
|
|
|
integer |
No |
An integer that specifies a bound of the recall window. The first element of the array represents the number of chunks to include before the retrieved chunk, and the second element represents the number of chunks to include after. Note
|
[-1,1] |
|
| RerankFactor |
number |
No |
The rerank factor. If specified, the system reranks the results from the vector search. The value must be greater than 1 and less than or equal to 5. Note
|
2.0 |
| RerankModel |
object |
No |
The rerank model to use. |
|
| Name |
string |
No |
The name of the rerank model. |
qwen3-rerank |
| Instruct |
string |
No |
An instruction for the rerank model. |
Given a web search query, retrieve relevant passages that answer the query |
| TopK |
integer |
No |
The number of top results to return from this collection. |
101 |
| UseFullTextRetrieval |
boolean |
No |
Specifies whether to use full-text retrieval for hybrid search. If |
true |
| TopK |
integer |
No |
The total number of top results to return after merging results from all collections. |
10 |
| PromptParams |
string |
No |
A template for the system prompt. It must include placeholders such as |
"参考以下知识回答问题:{{ text_chunks }}" |
| ModelParams |
object |
Yes |
An object that contains parameters for the large language model (LLM) call. |
|
| MaxTokens |
integer |
No |
The maximum number of tokens to generate. |
8192 |
| Messages |
array<object> |
Yes |
A list of messages in the conversation. |
|
|
object |
Yes |
An object representing a single message. |
||
| Content |
string |
Yes |
The content of the message. |
你是一个有帮助的助手。 |
| Role |
string |
Yes |
The role of the message author. Valid values:
|
user |
| Model |
string |
Yes |
The name of the LLM to use. For a list of available models, refer to the Model Studio documentation. |
qwen-plus |
| N |
integer |
No |
The number of candidate responses to generate. |
1 |
| PresencePenalty |
number |
No |
The presence penalty. A value between -2.0 and 2.0. |
1.0 |
| Seed |
integer |
No |
The random seed for sampling. |
42 |
| Stop |
array |
No |
A list of stop sequences. |
|
|
string |
No |
A stop sequence. |
"\n" |
|
| Temperature |
number |
No |
The sampling temperature. A value between 0 and 2. |
0.6 |
| Tools |
array<object> |
No |
A list of tools the model can call. |
|
|
array<object> |
No |
The details of a tool. |
||
| Function |
object |
No |
The function information. |
|
| Description |
string |
No |
A description of the function tool. |
获取天气。 |
| Name |
string |
No |
The name of the function tool. |
get_weather |
| Parameters |
any |
No |
The parameters of the function, described as a JSON Schema object. |
{"type": "object", ...} |
| TopP |
number |
No |
The nucleus sampling probability threshold. A value between 0 and 1. |
0.9 |
| IncludeKnowledgeBaseResults |
boolean |
No |
Specifies whether to include the retrieved knowledge base results in the response. Default value: |
false |
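The MergeMethod example value "RRF" and its K constant (example value 60) in the table above refer to reciprocal rank fusion. The sketch below shows the textbook formulation, in which each document scores the sum of 1 / (K + rank) over every ranked list that contains it. It is an assumption that the service follows this exact formula; the function name `rrf_merge` and its arguments are hypothetical.

```python
from collections import defaultdict


def rrf_merge(ranked_lists, k=60, top_k=10):
    """Fuse several ranked lists of document IDs with reciprocal rank fusion.

    ranked_lists: iterable of lists, each ordered best-first.
    k: smoothing constant (the table's K example value is 60).
    top_k: number of fused results to keep.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            # Each appearance contributes 1 / (k + rank) to the document's score.
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties are broken by ID so output is deterministic.
    ordered = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [doc_id for doc_id, _ in ordered[:top_k]]
```

A larger K flattens the difference between top and lower ranks, so documents that appear in several lists gain relative to a document ranked first in only one list.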
Response elements
|
Element |
Type |
Description |
Example |
|
object |
The response schema. |
||
| RequestId |
string |
The request ID. |
ABB39CC3-4488-4857-905D-2E4A051D0521 |
| MultiCollectionRecallResult |
object |
The retrieval results from multiple knowledge bases. |
|
| Entities |
array |
A list of retrieved entities. |
|
|
string |
The details of a retrieved entity. |
{'entities': []} |
|
| Matches |
array<object> |
A list of retrieved matches. |
|
|
array<object> |
A retrieved match. |
||
| Content |
string |
The document content. |
ADBPG向量数据库。 |
| FileName |
string |
The file name. |
a14b0221-e3f2-4cf2-96cd-b3c293510770.jpg |
| FileURL |
string |
The public URL of the retrieved image. By default, the URL is valid for 2 hours. You can use the |
http://dailyshort-sh.oss-cn-shanghai.aliyuncs.com/vod-8efba5/f06147795c6c71f080605420848d0302/0ca34d5743a84bf7c68f489a60715dac-ld.mp4 |
| Id |
string |
The unique ID of the vector record. Note: If this parameter is left empty, the database automatically generates a UUID. If you provide an ID that conflicts with an existing one, the existing record is updated with the data from the request. |
273e3fc7-8f56-4167-a1bb-d35d2f3b9043 |
| LoaderMetadata |
any |
Metadata from the document loader, captured during document ingestion. |
{"page":1} |
| Metadata |
object |
The user-defined metadata. |
|
|
any |
{"update_time":"1754446789199","is_publish":"1"} |
||
| RerankScore |
number |
The rerank score. |
0.12 |
| RetrievalSource |
integer |
The source of the match. |
0.12 |
| Score |
number |
The similarity score. The score is calculated based on the distance metric specified when the index was created ( |
10 |
| Vector |
array |
The vector data. |
|
|
number |
A value in the vector. |
[] |
|
| Relations |
array |
A list of relationship edges. |
|
|
string |
The details of a relationship edge. |
{'relations': []} |
|
| RequestId |
string |
The request ID. |
ABB39CC3-4488-4857-905D-2E4A051D0521 |
| Status |
string |
The status of the API call. Valid values:
|
success |
| Tokens |
integer |
The number of tokens consumed. |
42 |
| Usage |
object |
The number of tokens consumed for embedding. |
|
| EmbeddingTokens |
integer |
The number of tokens used for embedding. Note: A token is the smallest unit created by splitting the input text. A token can be a unit such as a word, a phrase, a punctuation mark, or a character. |
158 |
| ChatCompletion |
object |
The model response. |
|
| Choices |
array<object> |
The streaming output content. |
|
|
array<object> |
An item in the streaming output. |
||
| FinishReason |
string |
The reason the model stopped generating output. |
finish |
| Index |
integer |
The index of the choice. |
0 |
| Message |
object |
The response from the large language model (LLM). |
|
| Content |
string |
The message content. |
杭州的天气是晴天。 |
| Role |
string |
The role of the message author. Valid values:
|
user |
| ToolCalls |
array<object> |
The tool call responses. |
|
|
array<object> |
A tool call response. |
||
| Id |
string |
The ID of the tool call. |
"chatcmpl-c1bebafa-cc48-44e2-88c6-1a3572952f8e" |
| Function |
object |
Details of the function that the model wants to call. |
|
| Arguments |
string |
The arguments for the function call, generated by the model in JSON format. |
{"city":"hangzhou"} |
| Name |
string |
The name of the function to call. |
"get_weather" |
| Index |
integer |
The index of the tool in the |
1 |
| ReasoningContent |
string |
The model's chain of thought (CoT) content. |
逻辑推导过程 |
| Created |
integer |
The creation time, in Unix timestamp format. |
1758529748 |
| Id |
string |
The response ID. |
273e3fc7-8f56-4167-a1bb-d35d2f3b9043 |
| Model |
string |
The name of the model used. |
qwen-plus |
| Usage |
object |
The token usage statistics for the completion. |
|
| CompletionTokens |
integer |
The number of tokens in the generated response. |
42 |
| PromptTokens |
integer |
The number of tokens in the input prompt. |
42 |
| PromptTokensDetails |
object |
Details about the prompt token usage. |
|
| CachedTokens |
integer |
The number of prompt tokens served from the cache. |
24 |
| TotalTokens |
integer |
The total number of tokens. |
42 |
| Message |
string |
The response message. |
Successful |
| Status |
string |
The status of the request. Valid values:
|
success |
Examples
Success response
JSON format
{
  "RequestId": "ABB39CC3-4488-4857-905D-2E4A051D0521",
  "MultiCollectionRecallResult": {
    "Entities": [
      "{'entities': []}"
    ],
    "Matches": [
      {
        "Content": "ADBPG向量数据库。\n",
        "FileName": "a14b0221-e3f2-4cf2-96cd-b3c293510770.jpg",
        "FileURL": "http://dailyshort-sh.oss-cn-shanghai.aliyuncs.com/vod-8efba5/f06147795c6c71f080605420848d0302/0ca34d5743a84bf7c68f489a60715dac-ld.mp4",
        "Id": "273e3fc7-8f56-4167-a1bb-d35d2f3b9043",
        "LoaderMetadata": "{\"page\":1}\n",
        "Metadata": {
          "key": "{\"update_time\":\"1754446789199\",\"is_publish\":\"1\"}"
        },
        "RerankScore": 0.12,
        "RetrievalSource": 0.12,
        "Score": 10,
        "Vector": [
          0
        ]
      }
    ],
    "Relations": [
      "{'relations': []}"
    ],
    "RequestId": "ABB39CC3-4488-4857-905D-2E4A051D0521",
    "Status": "success",
    "Tokens": 42,
    "Usage": {
      "EmbeddingTokens": 158
    }
  },
  "ChatCompletion": {
    "Choices": [
      {
        "FinishReason": "finish",
        "Index": 0,
        "Message": {
          "Content": "杭州的天气是晴天。\n",
          "Role": "user",
          "ToolCalls": [
            {
              "Id": "\"chatcmpl-c1bebafa-cc48-44e2-88c6-1a3572952f8e\"\n",
              "Function": {
                "Arguments": "{\"city\":\"hangzhou\"}\n",
                "Name": "\"get_weather\"\n"
              },
              "Index": 1
            }
          ],
          "ReasoningContent": "逻辑推导过程"
        }
      }
    ],
    "Created": 1758529748,
    "Id": "273e3fc7-8f56-4167-a1bb-d35d2f3b9043\n",
    "Model": "qwen-plus\n",
    "Usage": {
      "CompletionTokens": 42,
      "PromptTokens": 42,
      "PromptTokensDetails": {
        "CachedTokens": 24
      },
      "TotalTokens": 42
    }
  },
  "Message": "Successful",
  "Status": "success"
}
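A response shaped like the JSON example above can be reduced to the fields most callers need. The sketch below is a hypothetical helper (`summarize_response` is not part of any SDK) and assumes the field names and nesting shown in the example. It also assumes, based on the sample values, that string fields may carry trailing newlines or embedded quotes that should be stripped.

```python
import json


def summarize_response(response: dict) -> dict:
    """Extract status, answer text, tool calls, and source files.

    Assumes the structure of the example response on this page;
    missing sections are treated as empty rather than as errors.
    """
    completion = response.get("ChatCompletion") or {}
    choices = completion.get("Choices") or []
    message = choices[0].get("Message", {}) if choices else {}

    tool_calls = []
    for call in message.get("ToolCalls") or []:
        function = call.get("Function") or {}
        tool_calls.append({
            # The sample wraps the name in quotes and a newline; strip both.
            "name": function.get("Name", "").strip().strip('"'),
            # Arguments arrive as a JSON-encoded string.
            "arguments": json.loads(function.get("Arguments") or "{}"),
        })

    recall = response.get("MultiCollectionRecallResult") or {}
    return {
        "status": response.get("Status"),
        "answer": (message.get("Content") or "").strip(),
        "tool_calls": tool_calls,
        "sources": [m.get("FileName") for m in recall.get("Matches") or []],
    }
```

Only the first entry of Choices is read here; a caller that sets N greater than 1 would need to loop over all choices instead.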
Error codes
See Error Codes for a complete list.
Release notes
See Release Notes for a complete list.