Data Collection SDK 2.0

DataCollectionClient manages document pushes for data collection in OpenSearch search applications. Use it to push behavioral data, user data, or item data — either as single synchronous requests or in buffered batches.

Namespace: OpenSearch\Client

SDK downloads and demos:

Choose a push method

DataCollectionClient provides two ways to push documents to the server:

| Method | How it works | Use when |
| --- | --- | --- |
| push | Sends documents synchronously in a single call | You have a ready JSON payload and want to push it immediately |
| add + commit | Buffers documents locally, then sends them all in one commit call | You want to accumulate multiple documents before sending |

Constructor

Instantiates a DataCollectionClient using an existing OpenSearchClient.

Interface definition

void OpenSearch\Client\DataCollectionClient::__construct(\OpenSearch\Client\OpenSearchClient $openSearchClient)

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| $openSearchClient | OpenSearch\Client\OpenSearchClient | Yes | The base client. Handles signature calculation, server communication, and result parsing. |

add

Adds a document to the local SDK client buffer. The document is not sent to the server until you call commit.

Call add multiple times to accumulate documents, then flush them all with a single commit call.

Interface definition

\OpenSearch\Generated\Common\OpenSearchResult OpenSearch\Client\DataCollectionClient::add(array $fields)

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| $fields | array | Yes | All fields of a document. Supports behavioral data, user data, or item data. For example: array("user_id" => "1021468", "bhv_type" => "click"). |

commit

Flushes all documents in the SDK client buffer to the server.

Important

The buffer is cleared before the request is sent, so a failed commit does not leave the documents in the buffer. If the server returns an error and you need to retry, add the documents again and then call commit again to avoid data loss.
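
The retry caveat in the note above can be sketched as follows. This is a minimal illustration, not SDK code: `BufferedPusher` and `FlakyPusher` are hypothetical stand-ins for the add/commit pair, and the boolean success flag is an assumption, since the real commit returns an OpenSearchResult whose error handling is not described here. The point is only that the caller keeps its own copy of the documents and adds them again before each attempt.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CommitRetrySketch {

    /** Hypothetical stand-in for the add/commit pair of DataCollectionClient. */
    interface BufferedPusher {
        void add(Map<String, Object> fields);
        boolean commit() throws Exception; // assumed: true when the batch was accepted
    }

    /**
     * Adds every retained document again before each attempt, because the
     * buffer is already empty by the time a failed commit returns.
     */
    static boolean commitWithRetry(BufferedPusher pusher,
                                   List<Map<String, Object>> docs,
                                   int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            for (Map<String, Object> doc : docs) {
                pusher.add(doc);
            }
            try {
                if (pusher.commit()) {
                    return true;
                }
            } catch (Exception e) {
                // Treat an exception like a failed attempt and try again.
            }
        }
        return false;
    }

    /** In-memory fake that fails a fixed number of times, for demonstration only. */
    static class FlakyPusher implements BufferedPusher {
        final List<Map<String, Object>> buffer = new ArrayList<>();
        final List<Map<String, Object>> delivered = new ArrayList<>();
        int failuresLeft;

        FlakyPusher(int failures) {
            this.failuresLeft = failures;
        }

        public void add(Map<String, Object> fields) {
            buffer.add(fields);
        }

        public boolean commit() {
            List<Map<String, Object>> batch = new ArrayList<>(buffer);
            buffer.clear(); // cleared before "sending", mirroring the note above
            if (failuresLeft > 0) {
                failuresLeft--;
                return false;
            }
            delivered.addAll(batch);
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("user_id", "1120021255");
        doc.put("bhv_type", "click");
        List<Map<String, Object>> docs = new ArrayList<>();
        docs.add(doc);

        FlakyPusher pusher = new FlakyPusher(2); // the first two commits fail
        boolean ok = commitWithRetry(pusher, docs, 3);
        System.out.println("committed=" + ok + ", delivered=" + pusher.delivered.size());
    }
}
```

In real code the retained list would hold the same field maps that were passed to add, and the retry count and any backoff between attempts are choices for the caller.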

Interface definition

\OpenSearch\Generated\Common\OpenSearchResult OpenSearch\Client\DataCollectionClient::commit(string $searchAppName, string $dataCollectionName, string $dataCollectionType)

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| $searchAppName | string | Yes | The name of the associated search application. |
| $dataCollectionName | string | Yes | The name of the data collection. The console displays this name when you enable the data collection feature. |
| $dataCollectionType | string | Yes | The data collection type. Set to BEHAVIOR. |

push

Synchronously sends a batch of documents to the server in a single call.

Interface definition

\OpenSearch\Generated\Common\OpenSearchResult OpenSearch\Client\DataCollectionClient::push(string $docJson, string $searchAppName, string $dataCollectionName, string $dataCollectionType)

Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| $docJson | string | Yes | A JSON-encoded array of documents to push. |
| $searchAppName | string | Yes | The name of the associated search application. |
| $dataCollectionName | string | Yes | The name of the data collection. The console displays this name when you enable the data collection feature. |
| $dataCollectionType | string | Yes | The data collection type. Set to BEHAVIOR. |
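
The shape of the $docJson payload, as used in the examples in this document, is a JSON array in which each element has a cmd of ADD and a fields object. The sketch below builds such a string from plain maps without a JSON library. The class and its helper methods (`quote`, `fieldsToJson`, `toDocJson`) are hypothetical names introduced for illustration, and the serializer handles only string and numeric values; a real project would more likely use an established JSON library.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

public class DocJsonSketch {

    /** Wraps a string in double quotes and escapes characters JSON requires. */
    static String quote(String s) {
        StringBuilder sb = new StringBuilder("\"");
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.append('"').toString();
    }

    /** Serializes one field map; numbers stay bare, everything else becomes a string. */
    static String fieldsToJson(Map<String, Object> fields) {
        StringJoiner sj = new StringJoiner(",", "{", "}");
        for (Map.Entry<String, Object> e : fields.entrySet()) {
            Object v = e.getValue();
            String value = (v instanceof Number) ? v.toString() : quote(String.valueOf(v));
            sj.add(quote(e.getKey()) + ":" + value);
        }
        return sj.toString();
    }

    /** Wraps each field map in {"cmd":"ADD","fields":{...}} and joins them into an array. */
    static String toDocJson(List<Map<String, Object>> docs) {
        StringJoiner sj = new StringJoiner(",", "[", "]");
        for (Map<String, Object> fields : docs) {
            sj.add("{\"cmd\":\"ADD\",\"fields\":" + fieldsToJson(fields) + "}");
        }
        return sj.toString();
    }

    public static void main(String[] args) {
        Map<String, Object> fields = new LinkedHashMap<>(); // keeps insertion order
        fields.put("user_id", "1120021255");
        fields.put("biz_id", 1365378);
        fields.put("bhv_type", "click");

        List<Map<String, Object>> docs = new ArrayList<>();
        docs.add(fields);
        System.out.println(toDocJson(docs));
    }
}
```

The resulting string is what would be passed as the first argument of push.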

Behavioral data fields

The following fields apply to behavioral data documents (documents where $dataCollectionType is BEHAVIOR).

| Field | Type | Description | Example |
| --- | --- | --- | --- |
| user_id | string | Unique user ID. | "1120021255" |
| biz_id | integer | Numeric business ID that maps to an OpenSearch application. | 1365378 |
| rn | string | The request_id value returned in the search response. Pass it back unchanged. | "156516585419723283227314" |
| trace_id | string | Set to Alibaba if the search result comes from OpenSearch. | "Alibaba" |
| trace_info | string | The ops_request_misc value returned in the search response. Pass it back unchanged. | "%7B%22request%5Fid%22..." |
| item_id | string | Primary key value from the primary table in the OpenSearch application. | "2223" |
| item_type | string | Item type, such as goods. | "goods" |
| bhv_type | string | Behavior type, such as click. | "click" |
| bhv_time | string | UNIX timestamp (in seconds) when the behavior occurred, passed as a string. | "1566475047" |
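
As a rough illustration of how these fields fit together, the sketch below assembles the field map for a click event. The class name, the `clickEvent` helper, and its parameter names are hypothetical. It assumes the caller has already read request_id and ops_request_misc from the search response and passes them through unchanged, and it formats bhv_time as a string of UNIX seconds.

```java
import java.time.Instant;
import java.util.LinkedHashMap;
import java.util.Map;

public class BehaviorFieldsSketch {

    /**
     * Builds the fields of one behavioral document for a click.
     * requestId and opsRequestMisc are copied verbatim from the search response.
     */
    static Map<String, Object> clickEvent(String userId,
                                          long bizId,
                                          String requestId,
                                          String opsRequestMisc,
                                          String itemId,
                                          String itemType,
                                          Instant when) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("user_id", userId);
        fields.put("biz_id", bizId);                 // numeric, not a string
        fields.put("rn", requestId);                 // request_id, unchanged
        fields.put("trace_id", "Alibaba");           // result came from OpenSearch
        fields.put("trace_info", opsRequestMisc);    // ops_request_misc, unchanged
        fields.put("item_id", itemId);
        fields.put("item_type", itemType);
        fields.put("bhv_type", "click");
        // UNIX timestamp in seconds, stored as a string.
        fields.put("bhv_time", String.valueOf(when.getEpochSecond()));
        return fields;
    }

    public static void main(String[] args) {
        Map<String, Object> fields = clickEvent(
            "1120021255",
            1365378L,
            "156516585419723283227314",
            "%7B%22request%5Fid%22%3A%22156516585419723283227314%22%2C%22scm%22%3A%2220140713.120006678..%22%7D",
            "2223",
            "goods",
            Instant.now());
        System.out.println(fields);
    }
}
```

A map built this way has the same shape as the one passed to add in the Java add and commit example.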

Examples

PHP: push documents using push

<?php
require_once("Config.inc.php");
use OpenSearch\Client\DataCollectionClient;
use OpenSearch\Generated\DataCollection\Command;

$searchAppName = "opensearch_app_name";
$dataCollectionName = "opened_data_collection_name";
$dataCollectionType = "BEHAVIOR";

$docs = json_encode(array(
    [
        "cmd" => Command::$__names[Command::ADD],
        "fields" => [
            "user_id"    => "1120021255",
            "biz_id"     => 1365378,
            // The request_id value from search results — return as-is.
            "rn"         => "156516585419723283227314",
            // Set to Alibaba when the result comes from OpenSearch.
            "trace_id"   => "Alibaba",
            // The ops_request_misc value from search results — return as-is.
            "trace_info" => "%7B%22request%5Fid%22%3A%22156516585419723283227314%22%2C%22scm%22%3A%2220140713.120006678..%22%7D",
            "item_id"    => "2223",
            "item_type"  => "goods",
            "bhv_type"   => "click",
            // UNIX timestamp in seconds.
            "bhv_time"   => "1566475047"
        ]
    ]
));

// $client is assumed to be the OpenSearchClient instance created in Config.inc.php.
$dataCollectionClient = new DataCollectionClient($client);
$ret = $dataCollectionClient->push($docs, $searchAppName, $dataCollectionName, $dataCollectionType);
print_r(json_decode($ret->result, true));

PHP: push documents using add and commit

<?php
require_once("Config.inc.php");
use OpenSearch\Client\DataCollectionClient;
use OpenSearch\Generated\DataCollection\Command;

$searchAppName = "opensearch_app_name";
$dataCollectionName = "opened_data_collection_name";
$dataCollectionType = "BEHAVIOR";

// $client is assumed to be the OpenSearchClient instance created in Config.inc.php.
$dataCollectionClient = new DataCollectionClient($client);

// Buffer a document locally. The document is not sent until commit() is called.
// Call add() multiple times to accumulate documents before flushing.
$dataCollectionClient->add([
    "user_id"    => "1120021255",
    "biz_id"     => 1365378,
    "rn"         => "156516585419723283227314",
    "trace_id"   => "Alibaba",
    "trace_info" => "%7B%22request%5Fid%22%3A%22156516585419723283227314%22%2C%22scm%22%3A%2220140713.120006678..%22%7D",
    "item_id"    => "2223",
    "item_type"  => "goods",
    "bhv_type"   => "click",
    "bhv_time"   => "1566475047"
]);

// Flush all buffered documents to the server.
$ret = $dataCollectionClient->commit($searchAppName, $dataCollectionName, $dataCollectionType);
print_r(json_decode($ret->result, true));

Java: push documents using push

package com.aliyun.opensearch.demo;

import com.aliyun.opensearch.DataCollectionClient;
import com.aliyun.opensearch.OpenSearchClient;
import com.aliyun.opensearch.sdk.generated.OpenSearch;
import com.aliyun.opensearch.sdk.generated.commons.OpenSearchResult;

public class PushDataCollectionDoc {
    private static String accesskey = "your ak";
    private static String secret = "your secret";
    private static String host = "your host";
    private static String searchAppName = "opensearch_app_name";
    private static String dataCollectionName = "opened_data_collection_name";
    private static String dataCollectionType = "BEHAVIOR";

    public static void main(String[] args) {
        OpenSearch opensearch = new OpenSearch(accesskey, secret, host);
        OpenSearchClient client = new OpenSearchClient(opensearch);
        DataCollectionClient dataCollectionClient = new DataCollectionClient(client);

        // JSON array of documents; item_id uses the same sample value as the other examples.
        String docJson = "[{\"cmd\":\"ADD\",\"fields\":{\"user_id\":\"1120021255\"," +
                         "\"biz_id\":1365378,\"rn\":\"156516585419723283227314\"," +
                         "\"trace_id\":\"Alibaba\"," +
                         "\"trace_info\":\"%7B%22request%5Fid%22%3A%22156516585419723283227314%22%2C%22scm%22%3A%2220140713.120006678..%22%7D\"," +
                         "\"item_id\":\"2223\",\"item_type\":\"goods\"," +
                         "\"bhv_type\":\"click\",\"bhv_time\":\"1566475047\"}}]";

        try {
            OpenSearchResult result = dataCollectionClient.push(
                docJson, searchAppName, dataCollectionName, dataCollectionType
            );
            System.out.println(result);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}

Java: push documents using add and commit

package com.aliyun.opensearch.demo;

import com.aliyun.opensearch.DataCollectionClient;
import com.aliyun.opensearch.OpenSearchClient;
import com.aliyun.opensearch.sdk.generated.OpenSearch;
import com.aliyun.opensearch.sdk.generated.commons.OpenSearchResult;
import java.util.HashMap;
import java.util.Map;

public class PushDataCollectionDoc {
    private static String accesskey = "your ak";
    private static String secret = "your secret";
    private static String host = "your host";
    private static String searchAppName = "opensearch_app_name";
    private static String dataCollectionName = "opened_data_collection_name";
    private static String dataCollectionType = "BEHAVIOR";

    public static void main(String[] args) {
        OpenSearch opensearch = new OpenSearch(accesskey, secret, host);
        OpenSearchClient client = new OpenSearchClient(opensearch);
        DataCollectionClient dataCollectionClient = new DataCollectionClient(client);

        Map<String, Object> fields = new HashMap<>();
        fields.put("user_id", "1120021255");
        fields.put("biz_id", 1365378);
        // The request_id value from search results, passed back unchanged.
        // It matches the request_id embedded in trace_info below.
        fields.put("rn", "156516585419723283227314");
        // Set to Alibaba when the result comes from OpenSearch.
        fields.put("trace_id", "Alibaba");
        // The ops_request_misc value from search results — return as-is.
        fields.put("trace_info", "%7B%22request%5Fid%22%3A%22156516585419723283227314%22%2C%22scm%22%3A%2220140713.120006678..%22%7D");
        fields.put("item_id", "2223");
        fields.put("item_type", "goods");
        fields.put("bhv_type", "click");
        // UNIX timestamp in seconds.
        fields.put("bhv_time", "1566475047");

        // Buffer a document locally. The document is not sent until commit() is called.
        dataCollectionClient.add(fields);

        try {
            OpenSearchResult result = dataCollectionClient.commit(
                searchAppName, dataCollectionName, dataCollectionType
            );
            System.out.println(result);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
}