DeepPageingIterator lets you page through large result sets without tracking scroll IDs. Each call to hasNext() and next() automatically manages the underlying scroll session, so you can focus on processing results rather than session state.
Prerequisites
Before you begin, make sure you have:
An OpenSearch application with at least one searchable table
A RAM user with the required permissions (see Access authorization rules)
The OpenSearch SDK for Java V4.0.0 added to your project
Set up credentials
Store your AccessKey pair as environment variables. Do not hardcode credentials in source code.
Linux and macOS
export ALIBABA_CLOUD_ACCESS_KEY_ID=<access_key_id>
export ALIBABA_CLOUD_ACCESS_KEY_SECRET=<access_key_secret>
Replace <access_key_id> and <access_key_secret> with the AccessKey ID and AccessKey secret of your RAM user.
Windows
Create an environment variable file and add the ALIBABA_CLOUD_ACCESS_KEY_ID and ALIBABA_CLOUD_ACCESS_KEY_SECRET variables, set to your AccessKey ID and AccessKey secret.
Restart Windows for the changes to take effect.
The AccessKey pair of an Alibaba Cloud account has access to all API operations. Use a Resource Access Management (RAM) user for API calls and routine O&M instead. For details on creating a RAM user, see Create a RAM user and Create an AccessKey pair. If you use a RAM user, make sure the AliyunServiceRoleForOpenSearch role has the required permissions. See AliyunServiceRoleForOpenSearch.
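The example later in this topic reads both variables with System.getenv, which returns null when a variable is unset. As a minimal sketch, you could fail fast with a clear message before building the client. The class name CredentialCheck and the helper requireEnv are illustrative and are not part of the OpenSearch SDK.

```java
import java.util.Map;

// Illustrative helper, not part of the OpenSearch SDK.
final class CredentialCheck {

    // Returns the value of the named variable, or throws if it is unset or blank.
    static String requireEnv(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.trim().isEmpty()) {
            throw new IllegalStateException("Environment variable " + name + " is not set");
        }
        return value;
    }

    public static void main(String[] args) {
        Map<String, String> env = System.getenv();
        try {
            String accessKeyId = requireEnv(env, "ALIBABA_CLOUD_ACCESS_KEY_ID");
            requireEnv(env, "ALIBABA_CLOUD_ACCESS_KEY_SECRET");
            // Never print the secret itself; report only that both values were found.
            System.out.println("Credentials loaded, AccessKey ID length: " + accessKeyId.length());
        } catch (IllegalStateException ex) {
            System.err.println(ex.getMessage());
        }
    }
}
```

Passing the environment in as a Map keeps the helper easy to test without changing real environment variables.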
Limitations
Scroll queries have the following constraints:
The aggregate, distinct, and rank clauses are not supported.
Sorting is supported on a single field only.
The start parameter in the config clause has no effect; the default value 0 is always used.
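If your application assembles queries dynamically, it may help to check these constraints before a scroll query is sent. The following is a hypothetical pre-flight check, assuming you track which clauses a query uses and how many sort fields it has. ScrollQueryCheck and violations are invented names; the SDK does not provide them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical pre-flight check; the OpenSearch SDK does not ship this class.
final class ScrollQueryCheck {

    // Clauses that scroll queries do not support, per the constraints listed above.
    private static final Set<String> UNSUPPORTED =
            new HashSet<>(Arrays.asList("aggregate", "distinct", "rank"));

    // Returns one message per violated constraint; an empty list means the query passes.
    static List<String> violations(Set<String> clausesUsed, int sortFieldCount) {
        List<String> problems = new ArrayList<>();
        for (String clause : clausesUsed) {
            if (UNSUPPORTED.contains(clause)) {
                problems.add("unsupported clause: " + clause);
            }
        }
        if (sortFieldCount > 1) {
            problems.add("sort uses " + sortFieldCount + " fields; only one is supported");
        }
        return problems;
    }

    public static void main(String[] args) {
        Set<String> clauses = new HashSet<>(Arrays.asList("query", "filter", "sort", "distinct"));
        // Reports the distinct clause and the second sort field.
        System.out.println(violations(clauses, 2));
    }
}
```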
How it works
The SDK builds the scroll session through a chain of objects:
OpenSearch: initialized with your credentials and API endpoint
OpenSearchClient: wraps the OpenSearch object
SearcherClient: wraps the OpenSearchClient object
Config: defines the application name, hits per page, return fields, and data format
SearchParams: holds the query, filter, and sort conditions, plus a DeepPaging object
DeepPageingIterator: drives the scroll loop; each next() call fetches the next page and advances the scroll ID automatically
Determine whether an error has occurred based on the error code and message, not the status field. See Error codes.
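To make the iterator's role concrete, here is a toy model of how an iterator can keep a scroll ID private while exposing only hasNext() and next(). It runs against an in-memory stand-in rather than the OpenSearch service and is not the SDK's implementation. ScrollModel, FakeBackend, Page, and PageIterator are invented for this sketch, and encoding the scroll ID as an offset is an assumption made purely for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Toy model only: an in-memory stand-in for a scroll backend, not the SDK's implementation.
final class ScrollModel {

    // One page of results plus the cursor for the following page (null when exhausted).
    static final class Page {
        final List<String> docs;
        final String nextScrollId;

        Page(List<String> docs, String nextScrollId) {
            this.docs = docs;
            this.nextScrollId = nextScrollId;
        }
    }

    // Fake backend: the scroll ID is simply the offset of the next page, encoded as a string.
    static final class FakeBackend {
        private final List<String> allDocs;
        private final int hits;

        FakeBackend(List<String> allDocs, int hits) {
            this.allDocs = allDocs;
            this.hits = hits;
        }

        Page fetch(String scrollId) {
            int from = scrollId == null ? 0 : Integer.parseInt(scrollId);
            int to = Math.min(from + hits, allDocs.size());
            String next = to < allDocs.size() ? String.valueOf(to) : null;
            return new Page(new ArrayList<>(allDocs.subList(from, to)), next);
        }
    }

    // Iterator that keeps the scroll ID private, so callers only ever see pages.
    static final class PageIterator implements Iterator<List<String>> {
        private final FakeBackend backend;
        private String scrollId;   // cursor for the next fetch; null before the first call
        private boolean exhausted; // set once the backend reports no further page

        PageIterator(FakeBackend backend) {
            this.backend = backend;
        }

        @Override
        public boolean hasNext() {
            return !exhausted;
        }

        @Override
        public List<String> next() {
            if (exhausted) {
                throw new NoSuchElementException();
            }
            Page page = backend.fetch(scrollId);
            scrollId = page.nextScrollId; // advance the cursor without involving the caller
            exhausted = (scrollId == null);
            return page.docs;
        }
    }

    public static void main(String[] args) {
        List<String> docs = new ArrayList<>();
        for (int i = 1; i <= 12; i++) {
            docs.add("doc-" + i);
        }
        PageIterator pages = new PageIterator(new FakeBackend(docs, 5));
        while (pages.hasNext()) {
            System.out.println(pages.next());
        }
    }
}
```

With 12 documents and 5 hits per page, the loop in main prints three pages of 5, 5, and 2 documents. The calling code never reads or stores a scroll ID.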
Implement iterative scroll queries
The following example uses OpenSearch SDK for Java V4.0.0. DeepPageingIterator handles scroll ID management automatically, so you do not need to pass a scroll ID between requests.
package com.aliyun.opensearch;
import com.aliyun.opensearch.sdk.dependencies.com.google.common.collect.Lists;
import com.aliyun.opensearch.sdk.generated.OpenSearch;
import com.aliyun.opensearch.sdk.generated.search.*;
import com.aliyun.opensearch.search.DeepPageingIterator;
import java.nio.charset.Charset;
public class testScrollIterator {
// Scroll queries do not support the aggregate, distinct, or rank clause,
// and support sorting on a single field only.
private static String appName = "Name of the OpenSearch application that you want to manage";
private static String tableName = "Name of the table to which data is to be uploaded";
private static String host = "Endpoint of the OpenSearch API in your region";
public static void main(String[] args) {
// Read credentials from environment variables.
// Set the environment variables before running this example.
String accesskey = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_ID");
String secret = System.getenv("ALIBABA_CLOUD_ACCESS_KEY_SECRET");
// Print the file encoding and default charset for debugging.
System.out.println(String.format("file.encoding: %s", System.getProperty("file.encoding")));
System.out.println(String.format("defaultCharset: %s", Charset.defaultCharset().name()));
// Build the client chain: OpenSearch -> OpenSearchClient -> SearcherClient.
OpenSearch openSearch = new OpenSearch(accesskey, secret, host);
OpenSearchClient serviceClient = new OpenSearchClient(openSearch);
SearcherClient searcherClient = new SearcherClient(serviceClient);
// Configure the query: application name, hits per page, return fields, and data format.
Config config = new Config(Lists.newArrayList(appName));
// The start parameter has no effect on scroll queries; the default value 0 is always used.
config.setStart(0);
// Return 5 documents per page.
config.setHits(5);
// Supported formats: JSON and FULLJSON.
config.setSearchFormat(SearchFormat.FULLJSON);
config.setFetchFields(Lists.newArrayList("id", "name", "phone", "int_arr", "literal_arr", "float_arr", "cate_id"));
// Note: Set the rerank_size parameter via the setReRankSize method of the Rank class.
// Define the query, filter, and sort conditions.
SearchParams searchParams = new SearchParams(config);
// To search across multiple index fields, specify all fields in one setQuery call.
// Multiple setQuery calls overwrite each other; only the last one takes effect.
searchParams.setQuery("name:'opensearch'");
searchParams.setFilter("cate_id<=3");
Sort sorter = new Sort();
// Sort by the id field in descending order.
sorter.addToSortFields(new SortField("id", Order.DECREASE));
searchParams.setSort(sorter);
// Attach a DeepPaging object to enable scroll queries.
DeepPaging deep = new DeepPaging();
searchParams.setDeepPaging(deep);
// Create the iterator. It manages scroll IDs automatically.
DeepPageingIterator pagesIterator = new DeepPageingIterator(searcherClient, searchParams);
// Set the interval between page fetches, in milliseconds.
// The default is 100 ms. Adjust based on your throughput needs.
pagesIterator.setPagingIntervals(80);
// Iterate through all pages.
// Check error codes and messages to detect failures, not the status field.
try {
System.out.println("test");
while (pagesIterator.hasNext()) {
System.out.println("Debugging information:" + pagesIterator.next());
}
} catch (Exception ex) {
System.out.println("Error message:" + ex.getMessage());
}
}
}
Key parameters
| Parameter | Method | Description | Default |
|---|---|---|---|
| Hits per page | config.setHits(n) | Number of documents returned per page | — |
| Data format | config.setSearchFormat(...) | Return format: JSON or FULLJSON | — |
| Return fields | config.setFetchFields(...) | List of fields included in each result | — |
| Query | searchParams.setQuery(...) | Query clause; specify all index fields in a single call | — |
| Filter | searchParams.setFilter(...) | Filter condition applied to results | — |
| Sort | searchParams.setSort(...) | Sort field and direction; scroll queries support one field only | — |
| Paging interval | pagesIterator.setPagingIntervals(ms) | Delay between page fetches, in milliseconds | 100 ms |
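Hits per page and the paging interval together put a floor on how long a full scan takes. The sketch below estimates the page count and the minimum total wait, assuming the interval is applied once between consecutive page fetches and ignoring network and query time. PagingEstimate and its methods are illustrative names, not SDK APIs.

```java
// Illustrative arithmetic only; not part of the OpenSearch SDK.
final class PagingEstimate {

    // Number of pages needed to return totalDocs documents, rounding up.
    static long pageCount(long totalDocs, int hitsPerPage) {
        if (hitsPerPage <= 0) {
            throw new IllegalArgumentException("hitsPerPage must be positive");
        }
        return (totalDocs + hitsPerPage - 1) / hitsPerPage;
    }

    // Minimum time spent waiting between fetches, assuming one interval between each pair of pages.
    static long minWaitMillis(long totalDocs, int hitsPerPage, long intervalMillis) {
        long pages = pageCount(totalDocs, hitsPerPage);
        return Math.max(0, pages - 1) * intervalMillis;
    }

    public static void main(String[] args) {
        // Values from the example above: 5 hits per page and an 80 ms interval.
        long pages = pageCount(1000, 5);
        long waitMs = minWaitMillis(1000, 5, 80);
        System.out.println(pages + " pages, at least " + waitMs + " ms of waiting");
    }
}
```

For 1,000 matching documents, this yields 200 pages and at least 15,920 ms of waiting under the stated assumption. Raising hits per page reduces the number of round trips for the same result set.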