Scroll queries let you retrieve large result sets that exceed the standard pagination limit. Each request returns a scroll ID that you pass to the next request, so you can iterate through all matching documents in batches.
Prerequisites
Before you begin, ensure that you have:
An OpenSearch High-Performance Search Edition application
The API endpoint for your application (available on the application details page in the OpenSearch console)
An AccessKey pair with the required permissions. Use a Resource Access Management (RAM) user's AccessKey pair rather than the Alibaba Cloud account's AccessKey pair. For more information, see Create a RAM user, Create an AccessKey pair, and Access authorization rules. If you use a RAM user's AccessKey pair, make sure that the required permissions are granted to the AliyunServiceRoleForOpenSearch role using your Alibaba Cloud account.
Set up environment variables
Store your credentials as environment variables rather than hardcoding them in source code.
Linux and macOS: Run the following commands. Replace <access_key_id> and <access_key_secret> with the AccessKey ID and AccessKey secret of your RAM user.
export ALIBABA_CLOUD_ACCESS_KEY_ID=<access_key_id>
export ALIBABA_CLOUD_ACCESS_KEY_SECRET=<access_key_secret>
Windows: Create an environment variable file, add ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET with their respective values, then restart Windows for the changes to take effect.
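If a variable is missing, PHP's getenv() returns false, and that value would be passed on to the client without any warning. The following is a minimal sketch of a guard that fails early instead. The helper name requireEnv is hypothetical and is not part of the OpenSearch SDK.

```php
<?php
// Hypothetical helper (not part of the OpenSearch SDK): read a required
// environment variable and fail early if it is missing or empty.
function requireEnv(string $name): string
{
    $value = getenv($name);
    if ($value === false || $value === '') {
        throw new RuntimeException("Environment variable {$name} is not set.");
    }
    return $value;
}
```

With this helper in place, a call such as requireEnv('ALIBABA_CLOUD_ACCESS_KEY_ID') could stand in for a bare getenv() call wherever the credentials are read.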
Create a configuration file
Create Config.inc.php with your application settings. This file is shared by both the push and query code.
<?php
// Import the Autoloader.
require_once("../OpenSearch/Autoloader/Autoloader.php");
use OpenSearch\Client\OpenSearchClient;
// Read credentials from environment variables.
$accessKeyId = getenv('ALIBABA_CLOUD_ACCESS_KEY_ID');
$secret = getenv('ALIBABA_CLOUD_ACCESS_KEY_SECRET');
// Specify the API endpoint for your region.
// Find the endpoint on the application details page in the OpenSearch console.
$endPoint = '<region endPoint>';
// Specify the application name.
$appName = '<app name>';
// Specify the drop-down suggestion model name.
$suggestName = '<suggest name>';
// Enable debug mode.
$options = array('debug' => true);
// Create the OpenSearch client.
$client = new OpenSearchClient($accessKeyId, $secret, $endPoint, $options);
Implement scroll queries
The following code shows how to implement a scroll query using OpenSearch SDK for PHP V3.
In OpenSearch SDK for PHP V3, scroll queries support sorting on a single field only, and the field must be of the INT type. Sort by the primary key field to prevent duplicate results caused by data updates during the query.
Step 1: Start the scroll query
Send the first request without a scroll ID. The response returns a scroll_id and the total number of matching documents (viewtotal). Set the scroll expiration to keep the search context alive long enough to complete all iterations.
<?php
header("Content-Type:text/html;charset=utf-8");
// Import the configuration file.
require_once("Config.inc.php");
use OpenSearch\Client\SearchClient;
use OpenSearch\Util\SearchParamsBuilder;
$searchClient = new SearchClient($client);
// Build the query parameters.
$params = new SearchParamsBuilder();
// Number of documents to return per batch. No offset is needed for scroll queries.
$params->setHits(1);
// Use the application name defined in Config.inc.php.
$params->setAppName($appName);
$params->setQuery("name:'Search'");
// Set the return format. Supported values: JSON, FULLJSON.
$params->setFormat("fulljson");
// Sort by the primary key field (INT type, required for scroll queries).
$params->addSort('id', SearchParamsBuilder::SORT_INCREASE);
// Apply a filter condition.
$params->setFilter('id>0');
// Specify the fields to return.
$params->setFetchFields(array('id', 'name', 'phone', 'int_arr', 'literal_arr', 'float_arr', 'cate_id'));
// Set the scroll expiration. The scroll ID from this request is valid for 3 minutes.
// The first request does not require a scroll ID.
$params->setScrollExpire('3m');
// Execute the first request.
$ret = $searchClient->execute($params->build())->result;
Step 2: Iterate through results
Use the scroll_id from each response as input to the next request. Continue until all viewtotal documents are retrieved.
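The two values that drive the iteration, scroll_id and viewtotal, are read from the result object of the response body. As a minimal sketch, the snippet below decodes a body once and extracts both. The sample payload and the helper name readScrollState are hypothetical and for illustration only; real responses carry more fields and an opaque scroll ID.

```php
<?php
// Hypothetical response body for illustration only; real responses contain
// more fields and an opaque scroll ID.
$sampleResponse = '{"result":{"viewtotal":3,"scroll_id":"example-scroll-id"}}';

// Hypothetical helper (not part of the SDK): decode the body once and return
// the two values needed to drive the iteration.
function readScrollState(string $responseBody): array
{
    $decoded = json_decode($responseBody, true);
    if (!is_array($decoded) || !isset($decoded['result']['scroll_id'])) {
        throw new UnexpectedValueException('Response does not contain a scroll ID.');
    }
    return array(
        'scroll_id' => $decoded['result']['scroll_id'],
        'viewtotal' => (int) ($decoded['result']['viewtotal'] ?? 0),
    );
}

$state = readScrollState($sampleResponse);
```

Decoding once per response avoids calling json_decode repeatedly on the same string, and the explicit check makes a missing scroll ID visible immediately.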
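// Decode the first response once to read the total number of matching documents.
$total = json_decode($ret)->result->viewtotal;
// Each batch holds one document (setHits(1)), so $total requests retrieve everything.
for ($i = 0; $i < $total; $i++) {
    // Pass the scroll ID from the previous response.
    $params->setScrollId(json_decode($ret)->result->scroll_id);
    // Execute the next request and print the results.
    $ret = $searchClient->execute($params->build())->result;
    print_r($ret . '<br/><br/>');
}
Key parameters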
| Parameter | Description | Example |
|---|---|---|
| setHits(n) | Number of documents to return per batch | 1 |
| setScrollExpire(t) | Validity period for the scroll ID, in minutes | 3m |
| setScrollId(id) | Scroll ID from the previous response | json_decode($ret)->result->scroll_id |
| setFormat(f) | Return format. Supported values: JSON, FULLJSON | fulljson |
| addSort(field, order) | Sort field and order (single INT field only) | addSort('id', SearchParamsBuilder::SORT_INCREASE) |
| setFilter(condition) | Filter condition | 'id>0' |
| setFetchFields(fields) | Fields to include in the response | array('id', 'name') |
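
When setHits is larger than 1, the number of follow-up requests is usually smaller than viewtotal, so counting requests up to viewtotal would send more requests than needed. The following is a minimal sketch of a loop that stops once the collected document count reaches the total, or when a batch comes back empty. The helper scrollAll and the callback shape are hypothetical and not part of the SDK. The fake fetcher stands in for the search service, so the sketch runs without network access.

```php
<?php
// Hypothetical helper, not part of the SDK. $fetchBatch receives the current
// scroll ID and must return array('scroll_id' => string, 'documents' => array).
function scrollAll(string $firstScrollId, int $total, callable $fetchBatch): array
{
    $documents = array();
    $scrollId = $firstScrollId;
    while (count($documents) < $total) {
        $batch = $fetchBatch($scrollId);
        if (count($batch['documents']) === 0) {
            break; // Defensive stop: an empty batch means nothing more to read.
        }
        $documents = array_merge($documents, $batch['documents']);
        // Carry the scroll ID from this response into the next request.
        $scrollId = $batch['scroll_id'];
    }
    return $documents;
}

// Stand-in for the search service: serves five fake documents, two per batch.
$fakeData = array('d1', 'd2', 'd3', 'd4', 'd5');
$offset = 0;
$fakeFetch = function (string $scrollId) use (&$offset, $fakeData): array {
    $slice = array_slice($fakeData, $offset, 2);
    $offset += count($slice);
    return array('scroll_id' => 'scroll-' . $offset, 'documents' => $slice);
};

$all = scrollAll('scroll-0', count($fakeData), $fakeFetch);
```

In real use, the callback would call setScrollId with the ID it receives, execute the request, and map the decoded response into the same two-key array.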