A/B test

OpenSearch provides the A/B test feature to help you evaluate and debug various algorithms. Before you apply a new configuration to all your online traffic, you can run an A/B test to validate its effects on a small portion of traffic. This prevents untested changes from negatively impacting your online services. The A/B test feature allows you to create tests for query analysis, rough sort, and fine sort configurations.

Usage

1. Basic workflow for configuring an A/B test

To obtain more comprehensive test metrics, we recommend enabling click data collection before you configure an A/B test. If you are using the A/B test feature for the first time, follow the steps below to configure and launch a test:

1. Start the test creation process:

In the OpenSearch console, navigate to Feature Extensions > A/B Test in the left-side navigation pane. Click Create to start creating an A/B test.

2. Create a test group:

A test group allows you to run tests on query analysis, rough sort, fine sort, and category prediction. The test group name is for display purposes only and can be modified. The name cannot exceed 30 characters. The Test Application field displays the current application name. After you enter the information, click Next to proceed to the test configuration.

3. Create a test:

After a test group is created, click "Create Test" to add specific tests to the group. You can create a maximum of 20 tests in each test group, with a maximum of 10 tests running online simultaneously. By default, the page includes a baseline test named Online Performance (Baseline) with ID 0 that receives 100% of the traffic.

3.1 Test name: Define a custom name for the test, which cannot exceed 30 characters. After you enter the test name, the configuration area for Sort policy types and policies appears on the right.

3.2 Add a configuration:

You can configure tests for query analysis, rough sort, and fine sort by selecting existing rules in the configuration dialog box.

When you select a query analysis type and policy:

  • If you select "Custom", the available options are all the query analysis rules that you have created for the current application.

  • Selecting "Use Default Online Configuration" applies the existing online logic and excludes this configuration item from the test.

When you select a sort policy type and policy:

The same logic applies to sort policy types and policies. However, you can select only sort policies in the "Published" state, because published policies cannot be modified. Policies in the "Configuring" state can still be edited, and editing one during a test would change the online A/B test results.

3.3 Test traffic: The minimum traffic allocation for a test is 1%. For a single scenario, the total traffic allocated to all online tests within the same test group must be less than or equal to 100%.
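The limits above can be checked before you enter an allocation plan in the console. The following is a minimal sketch, not part of the OpenSearch SDK: the class name, method name, and test names are hypothetical, and it assumes the map contains only the tests that will run online in one test group, with the baseline receiving whatever traffic is left over.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper that mirrors the console limits described above.
// It is not part of the OpenSearch SDK.
class TrafficAllocationCheck {
    static final int MAX_ONLINE_TESTS = 10;     // at most 10 tests running online per group
    static final int MIN_PERCENT_PER_TEST = 1;  // each test needs at least 1% of traffic
    static final int MAX_TOTAL_PERCENT = 100;   // online tests together may not exceed 100%

    /** Returns null when the plan passes, otherwise a description of the first problem found. */
    static String validate(Map<String, Integer> percentByOnlineTest) {
        if (percentByOnlineTest.size() > MAX_ONLINE_TESTS) {
            return "too many online tests: " + percentByOnlineTest.size();
        }
        int total = 0;
        for (Map.Entry<String, Integer> e : percentByOnlineTest.entrySet()) {
            if (e.getValue() < MIN_PERCENT_PER_TEST) {
                return "test '" + e.getKey() + "' is below the 1% minimum";
            }
            total += e.getValue();
        }
        if (total > MAX_TOTAL_PERCENT) {
            return "total allocation is " + total + "%, which exceeds 100%";
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Integer> plan = new LinkedHashMap<>();
        plan.put("new_fine_sort", 10);       // hypothetical test names
        plan.put("new_query_analysis", 5);
        System.out.println(validate(plan));  // null means the plan passes these checks
    }
}
```

The console enforces these limits itself. A check like this is useful only if you plan allocations outside the console, for example in a review document or a script.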

4. Complete the test group creation:

After configuring the test, click "Next", and then click "Complete" to return to the A/B test homepage. A message indicating that the test group was created successfully is displayed, along with recommendations for the next steps. The newly configured test group is in the "Pending" state.

5. Start testing:

After the test group is created, find the group and click "Start Testing" in the Actions column. The status of the test group changes to "Testing".

6. Activate the A/B test:

After you enable the A/B test feature and configure a test in the console, you must specify the abtest parameter in your search queries to activate the test online. The abtest parameter has two main parts: scene_tag and flow_divider. On the Search Test page in the console, you can enter the abtest value in the Parameters section to apply the test.

Example request URL:

/v3/openapi/apps/160029126/search?query=query=default:'Shenzhen'&&config=start:0,hit:10,format:fulljson&abtest=scene_tag:test_1,flow_divider:123456
  • scene_tag: The name of the test group. After creating one or more test groups in the console, set this parameter to the name of one of the test groups. Test traffic is then routed to the various tests within that group.

  • flow_divider: Required. The backend system hashes this value to distribute user query traffic among different tests according to the traffic allocation you configured in the console. We recommend using a unique user ID for this value. If a user ID is not available, you can use a device ID or IP address.
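To make the role of flow_divider concrete, here is an illustrative sketch of hash-based traffic splitting. It is not the OpenSearch implementation: the actual hash function and bucket count are not documented, so the CRC32 hash, the 100 buckets, the class name, and the test names below are all assumptions. The point it shows is that a given flow_divider value always lands in the same test, which is why a stable user ID is the recommended value.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.CRC32;

// Illustrative only: the real hash function and bucket count used by OpenSearch are not documented.
class FlowDividerBucketing {
    /** Maps a flow_divider value to a stable bucket in the range [0, 100). */
    static int bucketOf(String flowDivider) {
        CRC32 crc = new CRC32();
        crc.update(flowDivider.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % 100);
    }

    /** Walks the cumulative percentage ranges; buckets not covered by any test go to the baseline. */
    static String assign(String flowDivider, Map<String, Integer> percentByTest) {
        int bucket = bucketOf(flowDivider);
        int upper = 0;
        for (Map.Entry<String, Integer> e : percentByTest.entrySet()) {
            upper += e.getValue();
            if (bucket < upper) {
                return e.getKey();
            }
        }
        return "baseline";
    }

    public static void main(String[] args) {
        Map<String, Integer> allocation = new LinkedHashMap<>();
        allocation.put("test_a", 10);  // hypothetical tests with 10% and 20% of traffic
        allocation.put("test_b", 20);
        // The same flow_divider value is always routed to the same test.
        System.out.println(assign("123456", allocation));
        System.out.println(assign("123456", allocation));
    }
}
```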

Note:

  • If you use an SDK to access OpenSearch, you do not need to encode the values for scene_tag and flow_divider if they do not contain spaces or punctuation. You can just call the corresponding interface. If the values contain punctuation, you must encode them first. For more information, see the "Practical Example" section.

  • If you access OpenSearch by making API calls, the values of scene_tag and flow_divider must be URL-encoded. The final format for the abtest parameter passed to OpenSearch is abtest=urlencode(scene_tag:urlencode($scene),flow_divider:urlencode($value)), where urlencode is a URL encoding function.

  • For more information, see the A/B testing FAQ.
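The two-level encoding described in the note for API calls can be sketched with the JDK alone. The class and method names below are hypothetical and not part of the OpenSearch SDK. The sketch assumes Java 10 or later for URLEncoder.encode(String, Charset). Note that URLEncoder performs form encoding, so a space becomes + rather than %20; replace it afterwards if your values can contain spaces.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds the abtest parameter in the form
// abtest=urlencode(scene_tag:urlencode($scene),flow_divider:urlencode($value)).
class AbtestParam {
    static String enc(String s) {
        return URLEncoder.encode(s, StandardCharsets.UTF_8);
    }

    static String build(String sceneTag, String flowDivider) {
        // Inner pass: encode each sub-parameter value on its own.
        String inner = "scene_tag:" + enc(sceneTag) + ",flow_divider:" + enc(flowDivider);
        // Outer pass: encode the whole abtest value, including the ':' and ',' separators.
        return "abtest=" + enc(inner);
    }

    public static void main(String[] args) {
        System.out.println(build("test_1", "123456"));
        // prints abtest=scene_tag%3Atest_1%2Cflow_divider%3A123456
    }
}
```

Because the outer pass encodes the percent signs produced by the inner pass, a non-ASCII flow_divider value ends up double-encoded (for example, %E5 becomes %25E5).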

2. Manage test groups and tests

Manage test groups

After you create test groups and tests, a list of test groups is displayed on the A/B test homepage. The list includes the Test group name, Status, Creation time, Last modified time, and Actions columns. You can perform the following management operations on the created test groups:

1. Start testing:

You can start test groups that are in the "Pending" or "Stopped" state.

2. Stop testing:

You can stop test groups that are in the "Testing" state.

3. Delete a test group:

You can delete any test group from the A/B test homepage.

Manage tests

After you create test groups and tests, a list of corresponding tests appears on the test group details page. On the A/B test homepage, click Details in the Actions column of a test group to go to its details page. The test list displays the test name, query analysis type and policy, sort policy type and policy, traffic allocation, and actions. You can perform basic management operations on the created tests:

  1. Edit a test: You can modify the test name, configuration, and traffic allocation.

  2. Delete a test: When a test is deleted, its configuration information is removed, and the test is no longer effective online.

  3. Whitelist: To route a specific flow_divider value to a particular test so that you can evaluate that test's search performance more easily, use the whitelist feature. On the whitelist configuration page, enter a flow_divider value, click Add, and then click Save.

3. A/B test group details

Behavioral data status

After you create a test group, go to the test group details page. The behavioral data status can be one of the following:

  • Not Activated: No behavioral data is being uploaded for the current application.

  • Activated with No Data: Behavioral data collection is enabled for the current application, but no data has been received.

  • Data Abnormal: The quality check found that the current behavioral data is unreliable due to a high number of issues.

Test group status

After you create a test group, go to the test group details page. The status of the test group can be one of the following:

  • Pending: The test group is ready to start. This status is used whether the group has never been run or has been previously stopped.

  • Testing: The test group has been started in the console. The number of days elapsed since the test group was started is displayed.

  • Stopped: The test group has been stopped. The accumulated test duration is the total number of days the test group was actively running, excluding periods when it was stopped.
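The start and stop rules from the "Manage test groups" section can be summarized as a small state machine. This is a sketch for illustration only: the enum and method names are hypothetical and do not correspond to any OpenSearch API. It models only Start Testing and Stop Testing. Deletion, which the console allows from the A/B test homepage for any group, is left out.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical model of the test group states shown in the console.
enum GroupStatus {
    PENDING, TESTING, STOPPED;

    /** States reachable through the Start Testing and Stop Testing actions. */
    Set<GroupStatus> nextStates() {
        switch (this) {
            case PENDING: return EnumSet.of(TESTING);  // Start Testing
            case TESTING: return EnumSet.of(STOPPED);  // Stop Testing
            case STOPPED: return EnumSet.of(TESTING);  // Start Testing again
            default:      return EnumSet.noneOf(GroupStatus.class);
        }
    }

    boolean canStart() { return nextStates().contains(TESTING); }

    boolean canStop() { return nextStates().contains(STOPPED); }

    public static void main(String[] args) {
        for (GroupStatus s : values()) {
            System.out.println(s + ": canStart=" + s.canStart() + ", canStop=" + s.canStop());
        }
    }
}
```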

Data statistics

From the A/B test group list page, click "Details", or from the application menu, navigate to "Report Statistics > A/B test report" to go to the test group details page. On the Data Statistics page, you can view the A/B test data report.

The data in the console is available on a T+1 basis. For example, if you run a test today, you can see the results in the console tomorrow.

The console combines the core metrics comparison page with the detailed metrics data table. You can select the metric type from the drop-down list to view the corresponding metrics. Core metrics are presented as a line chart for a more intuitive view, showing the previous day's data by default. You can select multiple tests from the test selection drop-down menu on the page to compare their data.

The available core metrics include search PV, search UV, zero-result rate, average search PV per user, exposure count, search queries, and average search queries per user.

  • Note: The A/B test report for a given day is available at 8:00 AM the following day. If you stop an A/B test during the day, the data collected before the test was stopped still appears in the report on the following day.

Practical example

An e-commerce product uses OpenSearch for two types of search queries:

  • Type 1: Search traffic from end-users searching for products by keyword. The query format is:

query=config=format:fulljson&&query=default:'infant formula'&&sort=price

  • Type 2: Traffic from calls made by other internal services. The query format is:

query=config=format:fulljson&&query=cat_id:'1'|'2'|'3'&&sort=timestamp

For the first type of traffic, the user wants to run an A/B test by splitting traffic based on end-user member IDs to compare the effectiveness of different sort expressions, category prediction models, or query analysis rules. The user configures the test as follows:

1. Create a test group and tests in the A/B test feature of the console. When creating the test group, name the test group user_search.

2. Set the A/B test parameters in the query. Since the test group in the console is named user_search, the query for this use case should include the parameters scene_tag:user_search and flow_divider:xxxx, where xxxx is the end-user's member ID.

2.1 By using an SDK (The following example uses the Java SDK. The PHP SDK usage is similar.):

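// Create the client objects. accesskey, secret, and host are your own credentials and endpoint.
OpenSearch opensearch = new OpenSearch(accesskey, secret, host);
OpenSearchClient serviceClient = new OpenSearchClient(opensearch);
SearcherClient searcherClient = new SearcherClient(serviceClient);

// Build the search request.
SearchParams searchParams = new SearchParams();

searchParams.setQueryString("default:'infant formula'");
searchParams.setFormat("json");
searchParams.addSort("price", "-");
// Attach the A/B test parameters: scene_tag is the test group name, and flow_divider is the end-user's member ID.
searchParams.setAbtest(new Abtest().setSceneTag("user_search").setFlowDivider("Zhang San"));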

SDK versions: aliyun-sdk-opensearch-3.4.1 (Java), opensearch-sdk-php-3.3.0 (PHP).

2.2 By using the API

i. query=config=format:fulljson&&query=default:'infant formula'&&sort=-price&abtest=scene_tag:user_search,flow_divider:%e5%bc%a0%e4%b8%89

Note: The values of the scene_tag and flow_divider sub-parameters of abtest are URL-encoded here.

ii. URL-encode the value of each parameter in the request (i.e., query, sort, abtest):

query=config%3dformat%3afulljson&&query%3ddefault%3a'infant%20formula'

3. After you complete these configurations, the A/B test takes effect for the first type of traffic.

Business operations report

Interface

To open the A/B test statistics report, navigate to Feature Extensions > A/B Test and click "Report Statistics".

Alternatively, navigate directly to "Report Statistics > A/B test report".

The report page analyzes data across five dimensions: core metrics, traffic metrics, behavioral metrics, conversion metrics, and user analysis metrics. You can filter the data by date range and test.

For descriptions of the metrics in the A/B test report, click here.