This document describes how to use the custom sorting model feature in OpenSearch Industry Algorithm Edition.
Procedure
-
In Feature Management, create the required field features. This example uses the system_item table. If the features you need are not in this table, you must first register the external MaxCompute table that contains them. For the title field, create three features: its raw value (custom_title), a lookup feature built after tokenization (custom_title_match), and the token count after tokenization (custom_title_len). You can add other features based on your business requirements. The features described in this section are examples for building a click-through rate (CTR) model.
Similarly, for the description field, create custom_description, custom_desc_match, and custom_desc_len; for the brand_name field, create custom_tags and custom_tags_match; for the category_name field, create custom_category.
-
Combine the built-in features from the system_internal table with the field features that you created in the previous step. This process is called feature engineering.
To register features in bulk for a typical CTR model, call the CreateFunctionResource API operation. In the request, set the ResourceType parameter to feature_generator and set the Data parameter to the following JSON array.
Note: Before you send the request, make sure that you have created every input feature prefixed with custom_, as described in Step 1.
[
{
"input": {
"features": [
{
"type": "user",
"name": "system_raw_q_ultra"
},
{
"type": "item",
"name": "system_item_id"
}
]
},
"generator": "combo",
"output": "comb_q_nid"
},
{
"input": {
"features": [
{
"type": "user",
"name": "system_user_id"
},
{
"type": "item",
"name": "system_item_id"
}
]
},
"generator": "combo",
"output": "comb_uid_nid"
},
{
"input": {
"features": [
{
"type": "user",
"name": "system_user_id"
},
{
"type": "item",
"name": "custom_tags"
}
]
},
"generator": "combo",
"output": "comb_uid_tags"
},
{
"input": {
"features": [
{
"type": "user",
"name": "system_raw_q_ultra"
},
{
"type": "item",
"name": "custom_tags"
}
]
},
"generator": "combo",
"output": "comb_q_tags"
},
{
"input": {
"features": [
{
"type": "user",
"name": "system_exp_time"
}
]
},
"generator": "id",
"output": "exp_time"
},
{
"input": {
"features": [
{
"type": "user",
"name": "system_terms2"
}
]
},
"generator": "id",
"output": "terms2"
},
{
"input": {
"features": [
{
"type": "user",
"name": "system_raw_q_ultra"
}
]
},
"generator": "id",
"output": "raw_q_ultra"
},
{
"input": {
"features": [
{
"type": "user",
"name": "system_user_id"
}
]
},
"generator": "id",
"output": "user_id"
},
{
"input": {
"features": [
{
"type": "item",
"name": "system_item_id"
}
]
},
"generator": "id",
"output": "item_id"
},
{
"input": {
"features": [
{
"type": "item",
"name": "custom_description"
}
]
},
"generator": "id",
"output": "description"
},
{
"input": {
"features": [
{
"type": "item",
"name": "custom_desc_len"
}
]
},
"generator": "id",
"output": "desc_len"
},
{
"input": {
"features": [
{
"type": "item",
"name": "custom_title"
}
]
},
"generator": "id",
"output": "title"
},
{
"input": {
"features": [
{
"type": "item",
"name": "custom_title_len"
}
]
},
"generator": "id",
"output": "title_len"
},
{
"input": {
"features": [
{
"type": "item",
"name": "custom_category"
}
]
},
"generator": "id",
"output": "category"
},
{
"input": {
"features": [
{
"type": "item",
"name": "custom_tags"
}
]
},
"generator": "id",
"output": "tags"
},
{
"input": {
"features": [
{
"type": "item",
"name": "system_all_nid_ctr_30"
}
]
},
"generator": "id",
"output": "all_nid_ctr_30"
},
{
"input": {
"features": [
{
"type": "item",
"name": "system_all_nid_ctr_7"
}
]
},
"generator": "id",
"output": "all_nid_ctr_7"
},
{
"input": {
"features": [
{
"type": "item",
"name": "system_all_nid_ctr_1"
}
]
},
"generator": "id",
"output": "all_nid_ctr_1"
},
{
"input": {
"features": [
{
"type": "item",
"name": "system_all_nid_pv_30"
}
]
},
"generator": "id",
"output": "all_nid_pv_30"
},
{
"input": {
"features": [
{
"type": "item",
"name": "system_all_nid_pv_7"
}
]
},
"generator": "id",
"output": "all_nid_pv_7"
},
{
"input": {
"features": [
{
"type": "item",
"name": "system_all_nid_pv_1"
}
]
},
"generator": "id",
"output": "all_nid_pv_1"
},
{
"input": {
"features": [
{
"type": "item",
"name": "system_all_nid_ipv_30"
}
]
},
"generator": "id",
"output": "all_nid_ipv_30"
},
{
"input": {
"features": [
{
"type": "item",
"name": "system_all_nid_ipv_7"
}
]
},
"generator": "id",
"output": "all_nid_ipv_7"
},
{
"input": {
"features": [
{
"type": "item",
"name": "system_all_nid_ipv_1"
}
]
},
"generator": "id",
"output": "all_nid_ipv_1"
},
{
"input": {
"features": [
{
"role": "map",
"type": "item",
"name": "custom_title_match"
},
{
"role": "key",
"type": "user",
"name": "system_terms2"
}
]
},
"generator": "lookup",
"output": "term_title_match"
},
{
"input": {
"features": [
{
"role": "map",
"type": "item",
"name": "custom_desc_match"
},
{
"role": "key",
"type": "user",
"name": "system_terms2"
}
]
},
"generator": "lookup",
"output": "term_desc_match"
},
{
"input": {
"features": [
{
"role": "map",
"type": "item",
"name": "custom_tags_match"
},
{
"role": "key",
"type": "user",
"name": "system_terms2"
}
]
},
"generator": "lookup",
"output": "term_tags_match"
},
{
"input": {
"features": [
{
"role": "map",
"type": "item",
"name": "system_qterm_match_decay"
},
{
"role": "key",
"type": "user",
"name": "system_terms2"
}
]
},
"generator": "lookup",
"output": "term_os_kw_match"
},
{
"input": {
"features": [
{
"type": "item",
"name": "system_query_cnt"
}
]
},
"generator": "id",
"output": "opensearch_query_cnt"
},
{
"input": {
"features": [
{
"type": "item",
"name": "system_qterm_cnt"
}
]
},
"generator": "id",
"output": "opensearch_qterm_cnt"
},
{
"input": {
"features": [
{
"role": "map",
"type": "item",
"name": "system_query_ctr_decay"
},
{
"role": "key",
"type": "user",
"name": "system_raw_q_ultra"
}
]
},
"generator": "lookup",
"output": "os_q_ctr_decay"
},
{
"input": {
"features": [
{
"role": "map",
"type": "item",
"name": "system_qterm_ctr_decay"
},
{
"role": "key",
"type": "user",
"name": "system_terms2"
}
]
},
"generator": "lookup",
"output": "os_term_ctr_decay"
},
{
"input": {
"features": [
{
"role": "map",
"type": "item",
"name": "system_query_ctr_decay"
},
{
"role": "key",
"type": "user",
"name": "system_raw_q_ultra"
}
]
},
"generator": "lookup",
"output": "os_q_ctr_decay_nokey"
},
{
"input": {
"features": [
{
"role": "map",
"type": "item",
"name": "system_qterm_ctr_decay"
},
{
"role": "key",
"type": "user",
"name": "system_terms2"
}
]
},
"generator": "lookup",
"output": "os_term_ctr_decay_nokey"
},
{
"input": {
"features": [
{
"type": "item",
"name": "system_query_seq_decay"
}
]
},
"generator": "id",
"output": "os_q_seq_decay"
},
{
"input": {
"features": [
{
"type": "item",
"name": "system_qterm_seq_decay"
}
]
},
"generator": "id",
"output": "os_term_seq_decay"
},
{
"input": {
"features": [
{
"role": "query",
"type": "user",
"name": "system_terms2"
},
{
"role": "title",
"type": "item",
"name": "system_qterm_seq_decay"
}
],
"method": "query_common_ratio"
},
"generator": "overlap",
"output": "os_qterm_q_common_ratio"
},
{
"input": {
"features": [
{
"role": "query",
"type": "user",
"name": "system_terms2"
},
{
"role": "title",
"type": "item",
"name": "system_qterm_seq_decay"
}
],
"method": "title_common_ratio"
},
"generator": "overlap",
"output": "os_qterm_title_common_ratio"
},
{
"input": {
"features": [
{
"role": "query",
"type": "user",
"name": "system_terms2"
},
{
"role": "title",
"type": "item",
"name": "custom_title"
}
],
"method": "query_common_ratio"
},
"generator": "overlap",
"output": "title_q_common_ratio"
},
{
"input": {
"features": [
{
"role": "query",
"type": "user",
"name": "system_terms2"
},
{
"role": "title",
"type": "item",
"name": "custom_title"
}
],
"method": "title_common_ratio"
},
"generator": "overlap",
"output": "title_title_common_ratio"
},
{
"input": {
"features": [
{
"role": "query",
"type": "user",
"name": "system_terms2"
},
{
"role": "title",
"type": "item",
"name": "custom_description"
}
],
"method": "query_common_ratio"
},
"generator": "overlap",
"output": "desc_q_common_ratio"
},
{
"input": {
"features": [
{
"role": "query",
"type": "user",
"name": "system_terms2"
},
{
"role": "title",
"type": "item",
"name": "custom_description"
}
],
"method": "title_common_ratio"
},
"generator": "overlap",
"output": "desc_title_common_ratio"
},
{
"input": {
"features": [
{
"type": "user",
"name": "system_term_seq_length"
}
],
"dimension": 1
},
"generator": "raw",
"output": "term_seq_length"
}
]
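The entries in the array above follow a small number of repeating shapes, so the Data value can also be assembled in code rather than written by hand. The following is a minimal sketch, not an official SDK: the helper names (id_feature, combo_feature, lookup_feature, custom_inputs) and the created_in_step_1 set are hypothetical and exist only for illustration. The sketch builds the JSON only and does not call CreateFunctionResource.

```python
import json


def id_feature(feature_type, name, output):
    """Build an "id" generator entry for a single input feature."""
    return {
        "input": {"features": [{"type": feature_type, "name": name}]},
        "generator": "id",
        "output": output,
    }


def combo_feature(user_name, item_name, output):
    """Build a "combo" generator entry from a user feature and an item feature."""
    return {
        "input": {
            "features": [
                {"type": "user", "name": user_name},
                {"type": "item", "name": item_name},
            ]
        },
        "generator": "combo",
        "output": output,
    }


def lookup_feature(map_name, key_name, output):
    """Build a "lookup" generator entry: an item-side map and a user-side key."""
    return {
        "input": {
            "features": [
                {"role": "map", "type": "item", "name": map_name},
                {"role": "key", "type": "user", "name": key_name},
            ]
        },
        "generator": "lookup",
        "output": output,
    }


def custom_inputs(entries):
    """Collect every input feature name that starts with custom_."""
    return {
        feature["name"]
        for entry in entries
        for feature in entry["input"]["features"]
        if feature["name"].startswith("custom_")
    }


# A few entries copied from the configuration above.
entries = [
    combo_feature("system_raw_q_ultra", "system_item_id", "comb_q_nid"),
    id_feature("item", "custom_title", "title"),
    lookup_feature("custom_title_match", "system_terms2", "term_title_match"),
]

# Assumption: the custom_ features you actually created in Step 1.
created_in_step_1 = {"custom_title", "custom_title_match", "custom_title_len"}

# Any name left in `missing` must be created before the request is sent.
missing = custom_inputs(entries) - created_in_step_1

# String to pass as the Data parameter.
data = json.dumps(entries, indent=2)
```

Checking `missing` before sending the request is one way to enforce the note about custom_ inputs; an empty set means every referenced custom feature exists.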
After the features are created, you can edit them on the Feature Management page.
Next, specify in the model code how these features are used. Start from the Quick Start code and modify its list of features. Typically, you list the features in embedding_columns.
On the custom sorting model configuration page, specify a name for the model description, such as model1. In the Python file, define the self.embedding_columns variable and pass the output fields from your feature configuration, such as comb_q_nid, comb_uid_nid, and comb_uid_tags, as a list of embedding features to the model.
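As a rough illustration of the step above, the snippet below shows only the embedding_columns assignment. The class name CustomRankModelSketch is a hypothetical stand-in; the real base class, constructor arguments, and other required methods come from the Quick Start template and are not reproduced here.

```python
class CustomRankModelSketch:
    """Hypothetical stand-in for the model class in the Quick Start template."""

    def __init__(self):
        # Output names from the feature configuration in this section.
        # Extend this list with any other outputs you want embedded.
        self.embedding_columns = [
            "comb_q_nid",
            "comb_uid_nid",
            "comb_uid_tags",
        ]


model = CustomRankModelSketch()
```

Each name in the list must match an "output" value in the feature configuration exactly, because that is the only link between the two descriptions.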
-
Specify the feature description and model description to use in your custom sorting model.
In the navigation pane, choose custom sorting model. On the model creation page, select the target application and enter a model name, for example, rank_model. The name must be 1 to 30 characters in length, start with a letter, and contain only uppercase letters, lowercase letters, digits, and underscores. Set the model type to custom sorting model, enable scheduled training if needed, and select the feature description (for example, fg1) and the model description (for example, model1) that you created.
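The model name rule can be checked locally before you submit the form. The following sketch encodes the rule as a regular expression; the function name is_valid_model_name is an assumption made for this example and is not part of any OpenSearch API.

```python
import re

# One leading letter, then up to 29 letters, digits, or underscores
# (1 to 30 characters in total).
MODEL_NAME_PATTERN = re.compile(r"[A-Za-z][A-Za-z0-9_]{0,29}")


def is_valid_model_name(name):
    """Return True if name satisfies the model naming rule described above."""
    return MODEL_NAME_PATTERN.fullmatch(name) is not None
```

For example, is_valid_model_name("rank_model") returns True, while a name that starts with a digit or is longer than 30 characters is rejected.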
-
After the model trains successfully, you can use it in Cava in the same way as a CTR model. For more information, see custom sorting model. Before you deploy the model, validate its performance by running an A/B test.