Demo code for implementing scroll queries

Scroll queries let you retrieve large result sets that exceed the standard pagination limit. Each request returns a scroll ID that you pass to the next request, so you can iterate through all matching documents in batches.

Prerequisites

Before you begin, ensure that you have:

  • An OpenSearch High-Performance Search Edition application

  • The API endpoint for your application (available on the application details page in the OpenSearch console)

  • An AccessKey pair with the required permissions. Use a Resource Access Management (RAM) user's AccessKey pair rather than the Alibaba Cloud account's AccessKey pair. For more information, see Create a RAM user, Create an AccessKey pair, and Access authorization rules. If you use a RAM user's AccessKey pair, make sure that the required permissions are granted to the AliyunServiceRoleForOpenSearch role using your Alibaba Cloud account.

Set up environment variables

Store your credentials as environment variables rather than hardcoding them in source code.

  • Linux and macOS — Run the following commands. Replace <access_key_id> and <access_key_secret> with the AccessKey ID and AccessKey secret of your RAM user.

    export ALIBABA_CLOUD_ACCESS_KEY_ID=<access_key_id>
    export ALIBABA_CLOUD_ACCESS_KEY_SECRET=<access_key_secret>
  • Windows — Create an environment variable file, add ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET with their respective values, then restart Windows for the changes to take effect.
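
Before creating the client, it can help to fail fast when either variable is missing. The following is a minimal sketch of such a check. requireEnv is a hypothetical helper introduced here for illustration and is not part of the OpenSearch SDK.

```php
<?php
// Hypothetical helper (not part of the OpenSearch SDK): return the value of
// an environment variable, or throw if it is unset or empty.
function requireEnv(string $name): string
{
    $value = getenv($name);
    if ($value === false || $value === '') {
        throw new RuntimeException("Environment variable {$name} is not set.");
    }
    return $value;
}

// Usage: each call throws a RuntimeException if the variable is missing.
// $accessKeyId = requireEnv('ALIBABA_CLOUD_ACCESS_KEY_ID');
// $secret      = requireEnv('ALIBABA_CLOUD_ACCESS_KEY_SECRET');
```

Throwing early gives a clearer error than a failed signature check on the first API request.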

Create a configuration file

Create Config.inc.php with your application settings. This file is shared by both the push and query code.

<?php
// Import the Autoloader.
require_once("../OpenSearch/Autoloader/Autoloader.php");
use OpenSearch\Client\OpenSearchClient;

// Read credentials from environment variables.
$accessKeyId = getenv('ALIBABA_CLOUD_ACCESS_KEY_ID');
$secret      = getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET');

// Specify the API endpoint for your region.
// Find the endpoint on the application details page in the OpenSearch console.
$endPoint = '<region endPoint>';

// Specify the application name.
$appName = '<app name>';

// Specify the drop-down suggestion model name.
$suggestName = '<suggest name>';

// Enable debug mode.
$options = array('debug' => true);

// Create the OpenSearch client.
$client = new OpenSearchClient($accessKeyId, $secret, $endPoint, $options);

Implement scroll queries

The following code shows how to implement a scroll query using OpenSearch SDK for PHP V3.

Important

In OpenSearch SDK for PHP V3, scroll queries support sorting on a single field only, and the field must be of the INT type. Sort by the primary key field to prevent duplicate results caused by data updates during the query.

Step 1: Start the scroll query

Send the first request without a scroll ID. The response returns a scroll_id and the total number of matching documents (viewtotal). Set the scroll expiration to keep the search context alive long enough to complete all iterations.

<?php
header("Content-Type:text/html;charset=utf-8");

// Import the configuration file.
require_once("Config.inc.php");
use OpenSearch\Client\SearchClient;
use OpenSearch\Util\SearchParamsBuilder;

$searchClient = new SearchClient($client);

// Build the query parameters.
$params = new SearchParamsBuilder();

// Number of documents to return per batch. No offset is needed for scroll queries.
$params->setHits(1);
// Use the application name defined in Config.inc.php.
$params->setAppName($appName);
$params->setQuery("name:'Search'");

// Set the return format. Supported values: JSON, FULLJSON.
$params->setFormat("fulljson");

// Sort by the primary key field (INT type, required for scroll queries).
$params->addSort('id', SearchParamsBuilder::SORT_INCREASE);

// Apply a filter condition.
$params->setFilter('id>0');

// Specify the fields to return.
$params->setFetchFields(array('id', 'name', 'phone', 'int_arr', 'literal_arr', 'float_arr', 'cate_id'));

// Set the scroll expiration. The scroll ID from this request is valid for 3 minutes.
// The first request does not require a scroll ID.
$params->setScrollExpire('3m');

// Execute the first request.
$ret = $searchClient->execute($params->build())->result;
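
The first response is a JSON string, so it is worth checking that it actually contains a scroll ID before iterating. The sketch below shows one way to do this. parseScrollStart is a hypothetical helper, and the sample body is made up; it assumes only the result->scroll_id and result->viewtotal fields described above.

```php
<?php
// Hypothetical helper: extract scroll_id and viewtotal from a response body.
// Assumes a JSON string shaped like {"result": {"scroll_id": ..., "viewtotal": ...}}.
function parseScrollStart(string $body): array
{
    $decoded = json_decode($body);
    if ($decoded === null || !isset($decoded->result->scroll_id)) {
        throw new UnexpectedValueException('Response does not contain a scroll_id.');
    }
    return [
        'scroll_id' => (string) $decoded->result->scroll_id,
        'viewtotal' => (int) ($decoded->result->viewtotal ?? 0),
    ];
}

// Example with a made-up response body.
$sample = '{"status":"OK","result":{"scroll_id":"abc123","viewtotal":42}}';
$start  = parseScrollStart($sample);
echo $start['scroll_id'], ' ', $start['viewtotal'], PHP_EOL;
```

In the code above, the same helper could be applied to $ret before the iteration starts.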

Step 2: Iterate through results

Use the scroll_id from each response as input to the next request. Continue until all viewtotal documents are retrieved.

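// Decode the first response once and read the total number of matching documents.
$result = json_decode($ret)->result;
$total  = $result->viewtotal;

// With setHits(1), each request returns at most one document, so the loop runs $total times.
for ($i = 0; $i < $total; $i++) {
    // Pass the scroll ID from the previous response.
    $params->setScrollId($result->scroll_id);

    // Execute the next request and print the results.
    $ret = $searchClient->execute($params->build())->result;
    print_r($ret . '<br/><br/>');

    // Decode the new response to pick up the next scroll ID.
    $result = json_decode($ret)->result;
}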
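
The loop above relies on a batch size of 1. As a sketch of a loop that works for any batch size, the iteration can be driven by the number of documents collected so far. scrollAll and the $fetchBatch callback are hypothetical names introduced here. The callback stands in for the setScrollId, execute, and json_decode sequence, and the stub data is made up.

```php
<?php
// Hypothetical helper: drain a scroll query by calling $fetchBatch repeatedly.
// $fetchBatch takes the current scroll ID and returns
// ['scroll_id' => string, 'items' => array].
function scrollAll(callable $fetchBatch, string $scrollId, int $total): array
{
    $documents = [];
    while (count($documents) < $total) {
        $batch = $fetchBatch($scrollId);
        if (empty($batch['items'])) {
            break; // Stop early if no more documents are returned.
        }
        $documents = array_merge($documents, $batch['items']);
        $scrollId  = $batch['scroll_id'];
    }
    return $documents;
}

// Stub that serves three made-up documents in batches of two.
$data   = ['a', 'b', 'c'];
$offset = 0;
$stub   = function (string $scrollId) use (&$offset, $data): array {
    $items   = array_slice($data, $offset, 2);
    $offset += count($items);
    return ['scroll_id' => 'scroll-' . $offset, 'items' => $items];
};

$all = scrollAll($stub, 'scroll-0', count($data));
```

The early break guards against an endless loop if the total changes while the scroll is in progress.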

Key parameters

Parameter               Description                                      Example
setHits(n)              Number of documents to return per batch          1
setScrollExpire(t)      Validity period for the scroll ID, in minutes    3m
setScrollId(id)         Scroll ID from the previous response             json_decode($ret)->result->scroll_id
setFormat(f)            Return format. Supported values: JSON, FULLJSON  fulljson
addSort(field, order)   Sort field and order (single INT field only)     addSort('id', SearchParamsBuilder::SORT_INCREASE)
setFilter(condition)    Filter condition                                 'id>0'
setFetchFields(fields)  Fields to include in the response                array('id', 'name')