Access the vector database


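After you enable vector engine optimization, you can access the AnalyticDB for PostgreSQL vector database through SQL or the API. This topic describes how to use both access methods and provides sample code.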

Access the vector database through SQL

Java

The AnalyticDB for PostgreSQL vector database supports connections through the PostgreSQL or Greenplum JDBC driver. For details on connecting to a database over JDBC, see JDBC.

Make sure the PostgreSQL JDBC driver is available. If you use Maven, add the following dependency to the pom.xml file:

<dependency>
    <groupId>org.postgresql</groupId>
    <artifactId>postgresql</artifactId>
    <version>42.2.5</version>
</dependency>

The following sample code accesses the vector database from Java:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

public class GreenplumSample {
    public static void main(String[] args) {
        String url = "jdbc:postgresql://yourhost:yourport/yourdbname";
        String user = "yourusername";
        String password = "yourpassword";
        String query = "SELECT * FROM yourtable LIMIT 10";

        // try-with-resources closes the result set, statement, and connection
        // in reverse order, even when an exception is thrown.
        // JDBC 4 drivers register themselves, so Class.forName is not needed.
        try (Connection con = DriverManager.getConnection(url, user, password);
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(query)) {
            // Process the results
            while (rs.next()) {
                // Assumes the result set has at least one column readable as a string
                String resultColumn = rs.getString(1);
                System.out.println(resultColumn);
            }
        } catch (SQLException e) {
            e.printStackTrace();
        }
    }
}

Python

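You can connect to the AnalyticDB for PostgreSQL vector database with psycopg2 and import and query vector data from Python code. For details on connecting with psycopg2, see Python.

The following sample code accesses the vector database from Python: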

import psycopg2
from psycopg2 import pool

# Create the connection pool
connection_pool = pool.SimpleConnectionPool(
    minconn=1,
    maxconn=10,
    user='your_username',
    password='your_password',
    host='your_host',
    port='your_port',
    database='your_database'
)

# Check whether a connection is still alive
def is_connection_alive(conn):
    try:
        with conn.cursor() as cursor:
            cursor.execute("SELECT 1")
    except (psycopg2.OperationalError, psycopg2.InterfaceError):
        return False
    return True

# Get a live connection from the pool
def get_connection():
    conn = connection_pool.getconn()
    while not is_connection_alive(conn):
        # Discard the dead connection so that it does not occupy a pool slot
        connection_pool.putconn(conn, close=True)
        conn = connection_pool.getconn()
    return conn

# Return a connection to the pool
def release_connection(conn):
    connection_pool.putconn(conn)

# Run a statement on a pooled connection
def execute_query(query):
    conn = get_connection()
    result = None
    try:
        with conn.cursor() as cursor:
            cursor.execute(query)
            # description is set only when the statement returns rows (for example, SELECT)
            if cursor.description is not None:
                result = cursor.fetchall()
        conn.commit()
    except (psycopg2.DatabaseError, psycopg2.InterfaceError) as e:
        print(f"Error executing query: {e}")
        conn.rollback()
    finally:
        # Always hand the connection back, even after an error
        release_connection(conn)
    return result

# Sample statements
# Query
select_query = "SELECT * FROM your_table"
result = execute_query(select_query)
if result:
    print(result)

# Insert
insert_query = "INSERT INTO your_table (column1, column2) VALUES ('value1', 'value2')"
execute_query(insert_query)

# Update
update_query = "UPDATE your_table SET column1 = 'new_value' WHERE column2 = 'value2'"
execute_query(update_query)

# Delete
delete_query = "DELETE FROM your_table WHERE column1 = 'value1'"
execute_query(delete_query)
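
When vectors are written through SQL, it is usually safer to pass them as query parameters than to concatenate values into the statement, as the INSERT sample above does. The following is a minimal sketch, assuming the vector column is a PostgreSQL real[] array (an assumption; check your table definition). The helpers to_array_literal and build_insert, and the table and column names, are hypothetical. The sketch only builds the statement and its parameters, which could then be passed to cursor.execute(sql, params).

```python
import math

def to_array_literal(vector):
    """Format a sequence of numbers as a PostgreSQL array literal, e.g. '{0.1,0.2}'."""
    values = [float(v) for v in vector]
    if not values:
        raise ValueError("vector must not be empty")
    if any(math.isnan(v) or math.isinf(v) for v in values):
        raise ValueError("vector must contain only finite numbers")
    return "{" + ",".join(repr(v) for v in values) + "}"

def build_insert(table, vector_column, row_id, vector):
    """Return (sql, params) for a parameterized INSERT.

    The table and column names are hypothetical and must come from trusted
    code, because identifiers cannot be passed as query parameters.
    """
    sql = f"INSERT INTO {table} (id, {vector_column}) VALUES (%s, %s::real[])"
    return sql, (row_id, to_array_literal(vector))

sql, params = build_insert("your_table", "feature", "doc-1", [0.25, 0.5, 1])
print(sql)
print(params)
```

Keeping the values out of the SQL text means quoting is handled by the driver rather than by string formatting.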

C

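You can connect to the AnalyticDB for PostgreSQL vector database with the libpq library and import and query vector data from C code.

The following sample code accesses the vector database from C: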

#include <stdio.h>
#include <stdlib.h>
#include <libpq-fe.h>

int main(void) {
    const char *conninfo;
    PGconn *conn;
    PGresult *res;
    int nFields;
    int i, j;

    // Set the connection string
    conninfo = "dbname=yourdbname user=yourusername host=yourhostname port=yourport password=yourpassword";

    // Open the connection
    conn = PQconnectdb(conninfo);

    // Check the connection status
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "Connection failed: %s", PQerrorMessage(conn));
        PQfinish(conn);
        exit(1);
    }

    // Run a query
    res = PQexec(conn, "SELECT * FROM yourtablename LIMIT 10");

    // Check that a result set was returned
    if (PQresultStatus(res) != PGRES_TUPLES_OK) {
        fprintf(stderr, "SELECT did not return data: %s", PQerrorMessage(conn));
        PQclear(res);
        PQfinish(conn);
        exit(1);
    }

    // Get the number of fields
    nFields = PQnfields(res);

    // Print each row, separating fields with a tab
    for (i = 0; i < PQntuples(res); i++) {
        for (j = 0; j < nFields; j++) {
            printf("%s = %s\t", PQfname(res, j), PQgetvalue(res, i, j));
        }
        printf("\n");
    }

    // Free the result
    PQclear(res);

    // Close the connection
    PQfinish(conn);

    return 0;
}

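Access the vector database through the API

OpenAPI wraps the DDL and DML of AnalyticDB for PostgreSQL vector operations so that you can manage vector data through OpenAPI.

Prerequisites

Procedure

  1. Install the SDK.

  2. Initialize the client.

  3. Initialize the vector database.

  4. Create a namespace.

  5. Create a collection.

  6. Upload vector data.

  7. Retrieve vector data.

Install the SDK

For SDK download addresses, see SDK Reference.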

Java

If you use Maven, add the following dependencies to the pom.xml file:

<dependency>
    <groupId>com.aliyun</groupId>
    <artifactId>gpdb20160503</artifactId>
    <version>3.12.0</version>
</dependency>

<!-- Use dependency versions no lower than the following -->
<dependency>
    <groupId>com.aliyun</groupId>
    <artifactId>tea-openapi</artifactId>
    <version>0.3.1</version>
</dependency>
<dependency>
    <groupId>com.aliyun</groupId>
    <artifactId>tea</artifactId>
    <version>1.2.8</version>
</dependency>
<dependency>
    <groupId>com.aliyun</groupId>
    <artifactId>openapiutil</artifactId>
    <version>0.2.1</version>
</dependency>
<dependency>
    <groupId>com.aliyun</groupId>
    <artifactId>credentials-java</artifactId>
    <version>0.3.0</version>
</dependency>
<dependency>
    <groupId>com.aliyun</groupId>
    <artifactId>tea-util</artifactId>
    <version>0.2.21</version>
</dependency>

Go

Run the following commands to install the Go SDK:

go get github.com/alibabacloud-go/gpdb-20160503/v4/client
go get github.com/alibabacloud-go/darabonba-openapi/v2/client
go get github.com/alibabacloud-go/tea-utils/v2/service
go get github.com/alibabacloud-go/tea

Python 3

If you do not specify an SDK version, the latest version is installed:

pip install alibabacloud_gpdb20160503
pip install alibabacloud_tea_openapi

To install a specific SDK version (this topic uses alibabacloud_gpdb20160503 3.5.0 and alibabacloud_tea_openapi 0.3.8 as examples), run the following commands:

pip install alibabacloud_gpdb20160503==3.5.0
pip install alibabacloud_tea_openapi==0.3.8

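Initialize the client

Initialize the client used to call OpenAPI. The samples below read the following environment variables:

  • ALIBABA_CLOUD_ACCESS_KEY_ID: the AccessKey ID used to call OpenAPI.

  • ALIBABA_CLOUD_ACCESS_KEY_SECRET: the AccessKey secret used to call OpenAPI.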
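
Because every client sample reads these two variables, it can help to fail early with a clear message when one is missing. The following is a minimal sketch using only the Python standard library; the helper name load_credentials is hypothetical and is not part of the SDK.

```python
import os

REQUIRED_VARS = ("ALIBABA_CLOUD_ACCESS_KEY_ID", "ALIBABA_CLOUD_ACCESS_KEY_SECRET")

def load_credentials(env=None):
    """Return (access_key_id, access_key_secret); raise if either is missing or empty."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return tuple(env[name] for name in REQUIRED_VARS)

# Example with an explicit mapping instead of the real environment.
print(load_credentials({
    "ALIBABA_CLOUD_ACCESS_KEY_ID": "example-id",
    "ALIBABA_CLOUD_ACCESS_KEY_SECRET": "example-secret",
}))
```

Called without arguments, the helper reads os.environ, so the returned pair could be passed to the client configuration in place of the direct lookups.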

Java

import com.aliyun.gpdb20160503.Client;
import com.aliyun.teaopenapi.models.Config;

public static Client getClient() throws Exception {
    Config config = new Config();
    config.setAccessKeyId(System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID"));
    config.setAccessKeySecret(System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET"));
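    config.setRegionId("cn-beijing");    // Region of the instance; set this to match the target instance
    config.setEndpoint("gpdb.****uncs.com");   // Not required when accessing over a public IP address; otherwise configure it according to https://api.aliyun.com/product/gpdb
    config.setMaxIdleConns(200);  // Maximum number of connections; set this to the maximum concurrency of this client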
    return new Client(config);
}

Go

package main

import (
	"os"
	
	openapi "github.com/alibabacloud-go/darabonba-openapi/v2/client"
	gpdb20160503 "github.com/alibabacloud-go/gpdb-20160503/v4/client"
	"github.com/alibabacloud-go/tea/tea"
)

func CreateClient() (_result *gpdb20160503.Client, _err error) {
	config := &openapi.Config{
		AccessKeyId:     tea.String(os.Getenv("ALIBABA_CLOUD_ACCESS_KEY_ID")),
		AccessKeySecret: tea.String(os.Getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET")),
	}
	// For the endpoint, see https://api.aliyun.com/product/gpdb
	config.Endpoint = tea.String("gpdb.aliyuncs.com")
	_result = &gpdb20160503.Client{}
	_result, _err = gpdb20160503.NewClient(config)
	return _result, _err
}

Python 3

from alibabacloud_tea_openapi import models as open_api_models
from alibabacloud_gpdb20160503.client import Client
import os

ALIBABA_CLOUD_ACCESS_KEY_ID = os.environ['ALIBABA_CLOUD_ACCESS_KEY_ID']
ALIBABA_CLOUD_ACCESS_KEY_SECRET = os.environ['ALIBABA_CLOUD_ACCESS_KEY_SECRET']

def get_client():
    config = open_api_models.Config(
        access_key_id=ALIBABA_CLOUD_ACCESS_KEY_ID,
        access_key_secret=ALIBABA_CLOUD_ACCESS_KEY_SECRET
    )
    config.region_id = "cn-beijing"  # Region of the instance
    return Client(config)

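Initialize the vector database

Before you use vector search, initialize the knowledgebase database and the full-text search features.

The call samples are as follows. For parameter descriptions, see InitVectorDatabase.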

Java

import com.aliyun.gpdb20160503.models.*;
import com.aliyun.gpdb20160503.Client;
import com.google.gson.Gson;

public static void initVectorDatabase() throws Exception {
    String region = "cn-beijing";  // Region of the instance
    String instanceId = "gp-bp1c62r3l489****";  // Instance ID
    String managerAccount = "myaccount";  // Initial account of the instance
    String managerAccountPassword = "myaccount_password";  // Password of the initial account

    InitVectorDatabaseRequest request = new InitVectorDatabaseRequest();
    request.setRegionId(region);
    request.setDBInstanceId(instanceId);
    request.setManagerAccount(managerAccount);
    request.setManagerAccountPassword(managerAccountPassword);
    Client client = getClient();
    InitVectorDatabaseResponse response = client.initVectorDatabase(request);
    System.out.println(response.getStatusCode());
    System.out.println(new Gson().toJson(response.getBody()));
}


public static void main(String[] args) throws Exception {
    initVectorDatabase();
}

Go

package main

import (
	"fmt"

	gpdb20160503 "github.com/alibabacloud-go/gpdb-20160503/v4/client"
	util "github.com/alibabacloud-go/tea-utils/v2/service"
	"github.com/alibabacloud-go/tea/tea"
)

func initVectorDatabase() {
	client, _err := CreateClient()
	if _err != nil {
		panic(_err)
	}

	initVectorDatabaseRequest := &gpdb20160503.InitVectorDatabaseRequest{
		RegionId:               tea.String("cn-beijing"),
		DBInstanceId:           tea.String("gp-bp1c62r3l489****"),
		ManagerAccount:         tea.String("myaccount"),
		ManagerAccountPassword: tea.String("myaccount_password"),
	}
	runtime := &util.RuntimeOptions{}
	response, _err := client.InitVectorDatabaseWithOptions(initVectorDatabaseRequest, runtime)
	if _err != nil {
		panic(_err)
	}
	fmt.Printf("response is %#v\n", response.Body)
}

func main() {
	initVectorDatabase()
}

Python 3

from alibabacloud_gpdb20160503 import models as gpdb_20160503_models

def init_vector_database():
    region_id = "cn-beijing"  # Region of the instance
    dbinstance_id = "gp-bp1c62r3l489****"  # Instance ID
    manager_account = "myaccount"  # Initial account of the instance
    manager_account_password = "myaccount_password"  # Password of the initial account

    request = gpdb_20160503_models.InitVectorDatabaseRequest(
        region_id=region_id,
        dbinstance_id=dbinstance_id,
        manager_account=manager_account,
        manager_account_password=manager_account_password
    )
    response = get_client().init_vector_database(request)
    print(f"init_vector_database response code: {response.status_code}, body:{response.body}")

if __name__ == '__main__':
    init_vector_database()

# output: body:
# {
#    "Message":"success",
#    "RequestId":"FC1E0318-E785-1F21-A33C-FE4B0301B608",
#    "Status":"success"
# }

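Create a namespace

A namespace provides schema isolation. Before you use vectors, create at least one namespace or use the public namespace.

The call samples are as follows. For parameter descriptions, see CreateNamespace.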

Java

import com.aliyun.gpdb20160503.models.*;
import com.aliyun.gpdb20160503.Client;
import com.google.gson.Gson;

public static void createNamespace() throws Exception {
    String region = "cn-beijing";  // Region of the instance
    String instanceId = "gp-bp1c62r3l489****";  // Instance ID
    String managerAccount = "myaccount";  // Initial account of the instance
    String managerAccountPassword = "myaccount_password";  // Password of the initial account

    String namespace = "vector_test";  // Namespace to create
    String namespacePassword = "vector_test_password";  // Password of the namespace

    CreateNamespaceRequest request = new CreateNamespaceRequest();
    request.setRegionId(region);
    request.setDBInstanceId(instanceId);
    request.setManagerAccount(managerAccount);
    request.setManagerAccountPassword(managerAccountPassword);
    request.setNamespace(namespace);
    request.setNamespacePassword(namespacePassword);
    Client client = getClient();
    CreateNamespaceResponse response = client.createNamespace(request);
    System.out.println(response.getStatusCode());
    System.out.println(new Gson().toJson(response.getBody()));
}


public static void main(String[] args) throws Exception {
    createNamespace();
}

Go

package main

import (
	"fmt"

	gpdb20160503 "github.com/alibabacloud-go/gpdb-20160503/v4/client"
	util "github.com/alibabacloud-go/tea-utils/v2/service"
	"github.com/alibabacloud-go/tea/tea"
)

func createNamespace() {
	client, _err := CreateClient()
	if _err != nil {
		panic(_err)
	}

	createNamespaceRequest := &gpdb20160503.CreateNamespaceRequest{
		RegionId:               tea.String("cn-beijing"),
		DBInstanceId:           tea.String("gp-bp1c62r3l489****"),
		ManagerAccount:         tea.String("myaccount"),
		ManagerAccountPassword: tea.String("myaccount_password"),
		Namespace:              tea.String("vector_test"),
		NamespacePassword:      tea.String("vector_test_password"),
	}
	runtime := &util.RuntimeOptions{}
	response, _err := client.CreateNamespaceWithOptions(createNamespaceRequest, runtime)
	if _err != nil {
		panic(_err)
	}
	fmt.Printf("response is %#v\n", response.Body)
}

func main() {
	createNamespace()
}

Python 3

def create_namespace():
    region_id = "cn-beijing"  # Region of the instance
    dbinstance_id = "gp-bp1c62r3l489****"  # Instance ID
    manager_account = "myaccount"  # Initial account of the instance
    manager_account_password = "myaccount_password"  # Password of the initial account
    namespace = "vector_test"  # Namespace to create
    namespace_password = "vector_test_password"  # Password of the namespace

    request = gpdb_20160503_models.CreateNamespaceRequest(
        region_id=region_id,
        dbinstance_id=dbinstance_id,
        manager_account=manager_account,
        manager_account_password=manager_account_password,
        namespace=namespace,
        namespace_password=namespace_password
    )
    response = get_client().create_namespace(request)
    print(f"create_namespace response code: {response.status_code}, body:{response.body}")

if __name__ == '__main__':
    create_namespace()

# output: body:
# {
#    "Message":"success",
#    "RequestId":"78356FC9-1920-1E09-BB7B-CCB6BD267124",
#    "Status":"success"
# }

After the namespace is created, you can view the corresponding schema in the knowledgebase database of the instance.

SELECT schema_name FROM information_schema.schemata;

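Create a collection

A collection stores vector data and is isolated by namespace.

The call samples are as follows. For parameter descriptions, see Create a vector collection.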

Java

import com.aliyun.gpdb20160503.models.*;
import com.aliyun.gpdb20160503.Client;
import com.google.gson.Gson;

import java.util.HashMap;
import java.util.Map;

public static void createCollection() throws Exception {
    String region = "cn-beijing";  // Region of the instance
    String instanceId = "gp-bp1c62r3l489****";  // Instance ID
    String managerAccount = "myaccount";  // Initial account of the instance
    String managerAccountPassword = "myaccount_password";  // Password of the initial account
    String namespace = "vector_test";  // Existing namespace

    String collection = "document";  // Collection to create
    Map<String, String> metadata = new HashMap<>();
    metadata.put("title", "text");
    metadata.put("link", "text");
    metadata.put("content", "text");
    metadata.put("pv", "int");
    String fullTextRetrievalFields = "title,content";  // Full-text search fields
    Long dimension = 10L;  // Vector dimension

    CreateCollectionRequest request = new CreateCollectionRequest();
    request.setRegionId(region);
    request.setDBInstanceId(instanceId);
    request.setManagerAccount(managerAccount);
    request.setManagerAccountPassword(managerAccountPassword);
    request.setNamespace(namespace);
    request.setCollection(collection);
    request.setMetadata(new Gson().toJson(metadata));
    request.setFullTextRetrievalFields(fullTextRetrievalFields);
    request.setDimension(dimension);
    request.setParser("zh_cn");
    Client client = getClient();
    CreateCollectionResponse response = client.createCollection(request);
    System.out.println(response.getStatusCode());
    System.out.println(new Gson().toJson(response.getBody()));
}


public static void main(String[] args) throws Exception {
    createCollection();
}

Go

package main

import (
	"fmt"

	gpdb20160503 "github.com/alibabacloud-go/gpdb-20160503/v4/client"
	util "github.com/alibabacloud-go/tea-utils/v2/service"
	"github.com/alibabacloud-go/tea/tea"
)

func createCollection() {
	client, _err := CreateClient()
	if _err != nil {
		panic(_err)
	}

	createCollectionRequest := &gpdb20160503.CreateCollectionRequest{
		RegionId:                tea.String("cn-beijing"),
		DBInstanceId:            tea.String("gp-bp1c62r3l489****"),
		ManagerAccount:          tea.String("myaccount"),
		ManagerAccountPassword:  tea.String("myaccount_password"),
		Namespace:               tea.String("vector_test"),
		Collection:              tea.String("document"),
		Dimension:               tea.Int64(3),
		Parser:                  tea.String("zh_cn"),
		FullTextRetrievalFields: tea.String("title,content"),
		Metadata:                tea.String("{\"title\":\"text\",\"content\":\"text\",\"response\":\"int\"}"),
	}
	runtime := &util.RuntimeOptions{}
	response, _err := client.CreateCollectionWithOptions(createCollectionRequest, runtime)
	if _err != nil {
		panic(_err)
	}
	fmt.Printf("response is %#v\n", response.Body)
}

func main() {
	createCollection()
}

Python 3

def create_collection():
    region_id = "cn-beijing"  # Region of the instance
    dbinstance_id = "gp-bp1c62r3l489****"  # Instance ID
    manager_account = "myaccount"  # Initial account of the instance
    manager_account_password = "myaccount_password"  # Password of the initial account
    namespace = "vector_test"  # Existing namespace
    collection = "document"  # Collection to create
    metadata = '{"title":"text", "content": "text", "page":"int"}'
    full_text_retrieval_fields = "title,content"  # Full-text search fields
    dimension = 8  # Vector dimension

    request = gpdb_20160503_models.CreateCollectionRequest(
        region_id=region_id,
        dbinstance_id=dbinstance_id,
        manager_account=manager_account,
        manager_account_password=manager_account_password,
        namespace=namespace,
        collection=collection,
        metadata=metadata,
        full_text_retrieval_fields=full_text_retrieval_fields,
        dimension=dimension
    )
    response = get_client().create_collection(request)
    print(f"create_collection response code: {response.status_code}, body:{response.body}")

if __name__ == '__main__':
    create_collection()

# output: body:
# {
#    "Message":"success",
#    "RequestId":"7BC35B66-5F49-1E79-A153-8D26576C4A3E",
#    "Status":"success"
# }

After the collection is created, you can view the corresponding table in the knowledgebase database of the instance.

SELECT tablename FROM pg_tables WHERE schemaname='vector_test';
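
In the samples above, the metadata parameter is a JSON string that maps field names to types, and the full-text search fields are a comma-separated list of names. The following sketch builds both values from Python data structures instead of hand-written strings. It assumes, as in the samples, that every full-text field is also a metadata key; the helper name build_collection_params is hypothetical and is not part of the SDK.

```python
import json

def build_collection_params(metadata_fields, full_text_fields):
    """Return (metadata_json, full_text_retrieval_fields) for a collection request."""
    unknown = [name for name in full_text_fields if name not in metadata_fields]
    if unknown:
        raise ValueError(f"full-text fields not defined in metadata: {unknown}")
    return json.dumps(metadata_fields), ",".join(full_text_fields)

metadata, fields = build_collection_params(
    {"title": "text", "content": "text", "page": "int"},
    ["title", "content"],
)
print(metadata)
print(fields)
```

The two returned strings correspond to the metadata and full_text_retrieval_fields arguments in the Python sample.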

Upload vector data

Upload the prepared embedding vectors to the corresponding collection.

The call samples are as follows. For parameter descriptions, see UpsertCollectionData.

Java

import com.aliyun.gpdb20160503.models.*;
import com.aliyun.gpdb20160503.Client;
import com.google.gson.Gson;

import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public static void upsertCollectionData() throws Exception {
    String region = "cn-beijing";  // Region of the instance
    String instanceId = "gp-bp1c62r3l489****";  // Instance ID
    String namespace = "vector_test";  // Existing namespace
    String namespacePassword = "vector_test_password";  // Password of the namespace
    String collection = "document";  // Existing collection

    UpsertCollectionDataRequest request = new UpsertCollectionDataRequest();
    request.setRegionId(region);
    request.setDBInstanceId(instanceId);
    request.setNamespace(namespace);
    request.setNamespacePassword(namespacePassword);
    request.setCollection(collection);
    request.setRows(getRows());
    Client client = getClient();
    UpsertCollectionDataResponse response = client.upsertCollectionData(request);
    System.out.println(response.getStatusCode());
    System.out.println(new Gson().toJson(response.getBody()));
}

public static List<UpsertCollectionDataRequest.UpsertCollectionDataRequestRows> getRows() {
    List<UpsertCollectionDataRequest.UpsertCollectionDataRequestRows> rows = new ArrayList<>();
    UpsertCollectionDataRequest.UpsertCollectionDataRequestRows row = new UpsertCollectionDataRequest.UpsertCollectionDataRequestRows();
    Map<String, String> metadata = new HashMap<>();
    metadata.put("title", "测试文档");
    metadata.put("content", "测试内容");
    metadata.put("link", "http://127.0.0.1/document1");
    metadata.put("pv", "1000");
    row.setMetadata(metadata);
    row.setVector(Arrays.asList(0.2894745251078251,0.5364747050266715,0.1276845661831275,0.22528871956822372,0.7009319238651552,0.40267406135256123,0.8873626696379067,0.1248525955774931,0.9115507046412368,0.2450859133174706));
    rows.add(row);
    return rows;
}


public static void main(String[] args) throws Exception {
    upsertCollectionData();
}

Go

package main

import (
	"fmt"

	gpdb20160503 "github.com/alibabacloud-go/gpdb-20160503/v4/client"
	util "github.com/alibabacloud-go/tea-utils/v2/service"
	"github.com/alibabacloud-go/tea/tea"
)

func upsertCollectionData() {
	client, _err := CreateClient()
	if _err != nil {
		panic(_err)
	}

	rows0Metadata := map[string]*string{
		"title":    tea.String("测试文档"),
		"content":  tea.String("测试内容"),
		"response": tea.String("1"),
	}
	rows0 := &gpdb20160503.UpsertCollectionDataRequestRows{
		Metadata: rows0Metadata,
		Id:       tea.String("0CB55798-ECF5-4064-B81E-FE35B19E01A6"),
		Vector:   []*float64{tea.Float64(0.2894745251078251), tea.Float64(0.5364747050266715), tea.Float64(0.1276845661831275)},
	}
	upsertCollectionDataRequest := &gpdb20160503.UpsertCollectionDataRequest{
		RegionId:          tea.String("cn-beijing"),
		Rows:              []*gpdb20160503.UpsertCollectionDataRequestRows{rows0},
		DBInstanceId:      tea.String("gp-bp1c62r3l489****"),
		Collection:        tea.String("document"),
		Namespace:         tea.String("vector_test"),
		NamespacePassword: tea.String("vector_test_password"),
	}
	runtime := &util.RuntimeOptions{}
	response, _err := client.UpsertCollectionDataWithOptions(upsertCollectionDataRequest, runtime)
	if _err != nil {
		panic(_err)
	}
	fmt.Printf("response is %#v\n", response.Body)
}

func main() {
	upsertCollectionData()
}

Python 3

def upsert_collection_data():
    region_id = "cn-beijing"  # Region of the instance
    dbinstance_id = "gp-bp1c62r3l489****"  # Instance ID
    namespace = "vector_test"  # Existing namespace
    namespace_password = "vector_test_password"  # Password of the namespace
    collection = "document"  # Existing collection

    rows = []

    rows.append(gpdb_20160503_models.UpsertCollectionDataRequestRows(
        id="0CB55798-ECF5-4064-B81E-FE35B19E01A6",
        metadata={
            "page": 1,
            "content": "测试内容",
            "title": "测试文档"
        },
        vector=[0.2894745251078251, 0.5364747050266715, 0.14858841010401188, 0.42140750105351877,
                0.5780346820809248, 0.1145475372279496, 0.04329004329004329, 0.43246796493549741]
    ))

    request = gpdb_20160503_models.UpsertCollectionDataRequest(
        region_id=region_id,
        dbinstance_id=dbinstance_id,
        namespace=namespace,
        namespace_password=namespace_password,
        collection=collection,
        rows=rows,
    )
    response = get_client().upsert_collection_data(request)
    print(f"upsert_collection_data response code: {response.status_code}, body:{response.body}")

if __name__ == '__main__':
    upsert_collection_data()

# output: body:
# {
#    "Message":"success",
#    "RequestId":"8FEE5D1E-ECE8-1F2F-A17F-48039125CDC3",
#    "Status":"success"
# }

After the upload is complete, you can view the data in the knowledgebase database of the instance.

SELECT * FROM vector_test.document;
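
Each sample uploads vectors whose length equals the dimension used when the collection was created (10 in the Java sample, 3 in Go, 8 in Python). The following sketch checks that condition on the client side before upload and splits rows into smaller groups. The helper names and the batch size of 2 are hypothetical; this topic does not state a per-request row limit, so choose a size that suits your data.

```python
def check_dimensions(vectors, dimension):
    """Raise ValueError if any vector's length differs from the collection dimension."""
    for index, vector in enumerate(vectors):
        if len(vector) != dimension:
            raise ValueError(
                f"row {index}: expected {dimension} values, got {len(vector)}")

def batches(items, size):
    """Yield consecutive slices of at most `size` items."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield items[start:start + size]

vectors = [[0.1] * 8, [0.2] * 8, [0.3] * 8]
check_dimensions(vectors, 8)
for batch in batches(vectors, 2):
    # Each batch could be turned into request rows and sent in one upsert call.
    print(len(batch))
```

Checking lengths locally reports the offending row index before any request is sent.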

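Retrieve vector data

Prepare the query vector or the full-text search content, and then call the query API.

The call samples are as follows. For parameter descriptions, see QueryCollectionData.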

Java

import com.aliyun.gpdb20160503.models.QueryCollectionDataRequest;
import com.aliyun.gpdb20160503.models.QueryCollectionDataResponse;
import com.aliyun.gpdb20160503.Client;
import com.google.gson.Gson;

import java.util.Arrays;

public static void queryCollectionData() throws Exception {
    QueryCollectionDataRequest request = new QueryCollectionDataRequest();
    request.setDBInstanceId("gp-bp1c62r3l489****");
    request.setCollection("document");
    request.setNamespace("vector_test");
    request.setNamespacePassword("vector_test_password");
    request.setContent("测试");
    request.setFilter("pv > 10");
    request.setTopK(10L);
    request.setVector(Arrays.asList(0.7152607422256894,0.5524872066437732,0.1168505269851303,0.704130971473022,0.4118874999967596,0.2451574619214022,0.18193414783144812,0.3050522957905741,0.24846180714868163,0.0549715380856951));
    request.setRegionId("cn-beijing");

    Client client = getClient();
    QueryCollectionDataResponse response = client.queryCollectionData(request);
    System.out.println(response.getStatusCode());
    System.out.println(new Gson().toJson(response.getBody()));
}


public static void main(String[] args) throws Exception {
    queryCollectionData();
}

The following result is returned:

{
  "Matches": {
    "match": [
      {
        "Id": "0CB55798-ECF5-4064-B81E-FE35B19E01A6",
        "Metadata": {
          "title":"测试文档",
          "content":"测试内容",
          "link":"http://127.0.0.1/document1",
          "pv":"1000"
        },
        "Values": [
           0.2894745251078251,
           0.5364747050266715,
           0.1276845661831275,
           0.22528871956822372,
           0.7009319238651552,
           0.40267406135256123,
           0.8873626696379067,
           0.1248525955774931,
           0.9115507046412368,
           0.2450859133174706
        ]
      }
    ]
  },
  "RequestId": "ABB39CC3-4488-4857-905D-2E4A051D0521",
  "Status": "success"
}

Go

package main

import (
	"fmt"

	gpdb20160503 "github.com/alibabacloud-go/gpdb-20160503/v4/client"
	util "github.com/alibabacloud-go/tea-utils/v2/service"
	"github.com/alibabacloud-go/tea/tea"
)

func queryCollectionData() {
	client, _err := CreateClient()
	if _err != nil {
		panic(_err)
	}

	queryCollectionDataRequest := &gpdb20160503.QueryCollectionDataRequest{
		RegionId:          tea.String("cn-beijing"),
		DBInstanceId:      tea.String("gp-bp1c62r3l489****"),
		Collection:        tea.String("document"),
		Namespace:         tea.String("vector_test"),
		NamespacePassword: tea.String("vector_test_password"),
		Content:           tea.String("测试"),
		Filter:            tea.String("response > 0"),
		TopK:              tea.Int64(10),
		Vector:            []*float64{tea.Float64(0.7152607422256894), tea.Float64(0.5524872066437732), tea.Float64(0.1168505269851303)},
	}
	runtime := &util.RuntimeOptions{}
	response, _err := client.QueryCollectionDataWithOptions(queryCollectionDataRequest, runtime)
	if _err != nil {
		panic(_err)
	}
	fmt.Printf("response is %#v\n", response.Body)
}

func main() {
	queryCollectionData()
}

The following result is returned:

{
   "Matches": {
      "match": [
         {
            "Id": "0CB55798-ECF5-4064-B81E-FE35B19E01A6",
            "Metadata": {
               "content": "测试内容",
               "response": "1",
               "source": "3",
               "title": "测试文档"
            },
            "MetadataV2": {
               "content": "测试内容",
               "response": 1,
               "source": 3,
               "title": "测试文档"
            },
            "Score": 0.9132830731723668,
            "Values": {
               "value": [
                  0.28947452,
                  0.5364747,
                  0.12768456
               ]
            }
         }
      ]
   },
   "RequestId": "707D2202-61A6-53DF-AAD2-E8DE276CE292",
   "Status": "success"
}

Python 3

def query_collection_data():
    region_id = "cn-beijing"  # Region of the instance
    dbinstance_id = "gp-bp1c62r3l489****"  # Instance ID
    namespace = "vector_test"  # Existing namespace
    namespace_password = "vector_test_password"  # Password of the namespace
    collection = "document"  # Existing collection

    content = "test query"
    vector = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]

    request = gpdb_20160503_models.QueryCollectionDataRequest(
        region_id=region_id,
        dbinstance_id=dbinstance_id,
        namespace=namespace,
        namespace_password=namespace_password,
        collection=collection,
        top_k=5,
        content=content,
        vector=vector,
    )
    response = get_client().query_collection_data(request)
    print(f"query_collection_data response code: {response.status_code}, body:{response.body}")

if __name__ == '__main__':
    query_collection_data()

# output:
# query_collection_data response code: 200, body:{'Matches': {'match': [{'Id': '0CB55798-ECF5-4064-B81E-FE35B19E01A6', 'Metadata': {'source': 1, 'page': '1', 'title': '测试文档', 'content': '测试内容'}, 'Score': 0.7208109110736349, 'Values': {'value': [0.28947452, 0.5364747, 0.1485884, 0.4214075, 0.5780347, 0.114547536, 0.043290045, 0.7]}}]}, 'RequestId': '709E2C82-FE25-1722-9DBB-00AD0F85ABBB', 'Status': 'success'}
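
The response body nests the hits under Matches and match, each with an Id, a Score, and Metadata. The following sketch pulls out the fields most callers need. It assumes the body is available as a plain dict with the shape printed above (for example through the response model's to_map() method, if your SDK version provides it). The helper name summarize_matches is hypothetical, and the second match in the sample data is made up for illustration.

```python
def summarize_matches(body):
    """Return (id, score, title) tuples sorted by descending score."""
    matches = (body.get("Matches") or {}).get("match") or []
    rows = []
    for match in matches:
        metadata = match.get("Metadata") or {}
        rows.append((match.get("Id"), match.get("Score") or 0.0, metadata.get("title")))
    return sorted(rows, key=lambda row: row[1], reverse=True)

# Body shaped like the sample output above; the first entry is a made-up match.
body = {
    "Matches": {"match": [
        {"Id": "hypothetical-id", "Metadata": {"title": "other"}, "Score": 0.41},
        {"Id": "0CB55798-ECF5-4064-B81E-FE35B19E01A6",
         "Metadata": {"title": "测试文档", "content": "测试内容"},
         "Score": 0.7208109110736349},
    ]},
    "Status": "success",
}
for row in summarize_matches(body):
    print(row)
```

The sort is defensive: it keeps the highest-scoring hit first regardless of the order in which matches arrive.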

Related documentation

If you need to use clients in other languages, see the pgvector compatibility mode usage guide.