EasyRec
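The EasyRec Processor built into EAS deploys recommendation models trained with EasyRec or TensorFlow as scoring services, and it can integrate feature engineering. By jointly optimizing feature engineering and the TensorFlow model, EasyRec Processor delivers high-performance scoring. This topic describes how to deploy and call an EasyRec model service.
Background information
The following figure shows the functional architecture of a recommendation engine based on EasyRec Processor: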

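EasyRec Processor consists mainly of the following modules:
Item Feature Cache: caches features from the FeatureStore (Redis or Hologres) in memory, which reduces the network overhead and load caused by requests to the FeatureStore. The item feature cache also supports incremental updates, such as updates to real-time features.
Feature Generator: the feature engineering (FG) module uses the same implementation offline and online, which keeps offline and online feature processing consistent. The implementation draws on the feature engineering solution developed at Taobao.
TFModel: the TensorFlow model loads the Saved_Model exported by EasyRec and works with Blade to optimize inference on CPUs and GPUs.
Feature tracking and incremental model update modules: typically used in real-time training scenarios. For details, see Real-time training.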
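The behavior of the Item Feature Cache described above (periodic full reloads plus incremental updates) can be illustrated with a small in-memory sketch. This is not the processor's implementation: the class name `ItemFeatureCache`, the loader `fake_store`, and the feature strings are hypothetical, and the loader stands in for a read from Redis or Hologres.

```python
import time
from typing import Callable, Dict, Optional


class ItemFeatureCache:
    """Hypothetical in-memory cache: full reload every period, plus incremental updates."""

    def __init__(self, load_all: Callable[[], Dict[str, str]], period_minutes: float,
                 clock: Callable[[], float] = time.monotonic):
        self._load_all = load_all            # stands in for a read from the feature store
        self._period_s = period_minutes * 60
        self._clock = clock                  # injectable so the example can fake time
        self._features: Dict[str, str] = {}
        self._loaded_at: Optional[float] = None

    def _refresh_if_stale(self) -> None:
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._period_s:
            self._features = dict(self._load_all())
            self._loaded_at = now

    def get(self, item_id: str) -> Optional[str]:
        """Serve from memory; only a stale cache triggers a feature store read."""
        self._refresh_if_stale()
        return self._features.get(item_id)

    def update(self, item_id: str, features: str) -> None:
        """Incremental update, for example when a real-time feature changes."""
        self._features[item_id] = features


store_reads = []

def fake_store() -> Dict[str, str]:
    store_reads.append(1)                    # count how often the store is hit
    return {"item_1": "price:10", "item_2": "price:20"}

now = [0.0]
cache = ItemFeatureCache(fake_store, period_minutes=2880, clock=lambda: now[0])
first = cache.get("item_1")                  # first access loads everything
second = cache.get("item_2")                 # served from memory
cache.update("item_1", "price:12")           # incremental update
updated = cache.get("item_1")
now[0] += 2880 * 60                          # one full period later
reloaded = cache.get("item_1")               # the periodic reload replaces the entry
print(first, second, updated, reloaded, len(store_reads))
```

In this sketch four lookups cost only two reads from the store, which is the effect the cache is meant to have on FeatureStore traffic.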
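Limits
Only the general-purpose instance family g7 with Intel CPUs is supported. For details, see General-purpose instance families.
Step 1: Deploy the service
When you deploy an EasyRec model service with the eascmd client, set the processor type to easyrec. For how to deploy services with the client tool, see Service deployment: EASCMD & DSW. A sample service configuration file follows: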
# Business date passed as the first argument.
bizdate=$1
cat << EOF > echo.json
{
  "name": "ali_rec_rnk",
  "metadata": {
    "instance": 2,
    "rpc": {
      "enable_jemalloc": 1,
      "max_queue_size": 100
    }
  },
  "cloud": {
    "computing": {
      "instance_type": "ecs.g7.large",
      "instances": null
    }
  },
  "model_config": {
    "remote_type": "hologres",
    "url": "postgresql://<AccessKeyID>:<AccessKeySecret>@<host>:<port>/<database>",
    "tables": [{"name": "<schema>.<table_name>", "key": "<index_column_name>", "value": "<column_name>"}],
    "period": 2880,
    "fg_mode": "tf",
    "outputs": "probs_ctr,probs_cvr"
  },
  "model_path": "",
  "processor": "easyrec",
  "storage": [
    {
      "mount_path": "/home/admin/docker_ml/workspace/model/",
      "oss": {
        "path": "oss://easyrec/ali_rec_sln_acc_rnk/20221122/export/final_with_fg"
      }
    }
  ]
}
EOF
# Deploy the service.
eascmd create echo.json
# eascmd -i <AccessKeyID> -k <AccessKeySecret> -e <endpoint> create echo.json
# Update the service.
eascmd update ali_rec_rnk -s echo.json
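A hand-written JSON heredoc is easy to break with a stray quote or a trailing comma. As an alternative, the configuration can be built as a Python dictionary and serialized with the standard `json` module, which always emits valid JSON. The sketch below mirrors the sample values above; the function name `build_service_config` and its `oss_path` parameter are hypothetical.

```python
import json


def build_service_config(oss_path: str) -> dict:
    """Return the sample EasyRec service configuration as a dictionary (hypothetical helper)."""
    return {
        "name": "ali_rec_rnk",
        "metadata": {
            "instance": 2,
            "rpc": {"enable_jemalloc": 1, "max_queue_size": 100},
        },
        "cloud": {"computing": {"instance_type": "ecs.g7.large", "instances": None}},
        "model_config": {
            "remote_type": "hologres",
            "url": "postgresql://<AccessKeyID>:<AccessKeySecret>@<host>:<port>/<database>",
            "tables": [{"name": "<schema>.<table_name>",
                        "key": "<index_column_name>",
                        "value": "<column_name>"}],
            "period": 2880,
            "fg_mode": "tf",
            "outputs": "probs_ctr,probs_cvr",
        },
        "model_path": "",
        "processor": "easyrec",
        "storage": [{
            "mount_path": "/home/admin/docker_ml/workspace/model/",
            "oss": {"path": oss_path},
        }],
    }


config = build_service_config(
    "oss://easyrec/ali_rec_sln_acc_rnk/20221122/export/final_with_fg")
text = json.dumps(config, indent=2)
# Round-trip check: the serialized text parses back to the same structure.
assert json.loads(text) == config
print(text)
```

Redirecting the printed output to echo.json gives a file that can be passed to `eascmd create` in the same way as the heredoc version.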
The key parameters are described below. For the other parameters, see the descriptions of all service model parameters.
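Parameter | Required | Description | Example |
processor | Yes | The EasyRec processor. | "processor": "easyrec" |
fg_mode | Yes | The feature engineering mode. This topic uses tf (the service includes FG) and bypass (the service does not include FG). | "fg_mode": "tf" |
period | Yes | The interval at which item features are periodically updated, in minutes. | "period": 2880 |
remote_type | Yes | The item feature store. Currently supported: hologres and redis. | "remote_type": "hologres" |
tables | No | The item feature tables. Required when remote_type is hologres. Each entry contains name (the table, as <schema>.<table_name>), key (the index column), and value (the column to read). Input data can be read from multiple tables. | "tables": [{"name":"<schema>.<table_name>","key":"<index_column_name>","value": "<column_name>"}] |
url | No | The endpoint of Hologres or Redis. For Redis on Alibaba Cloud, use the proxy-mode endpoint in the virtual private cloud (VPC). | "url": "postgresql://<AccessKeyID>:<AccessKeySecret>@<host>:<port>/<database>" |
sep | No | The separator between features. Required when remote_type is redis. | |
prefix | No | The prefix of item features. Required when remote_type is redis. | |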
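The conditional requirements in the table (tables for hologres, sep and prefix for redis) are easy to overlook, so it can help to check a model_config section before deployment. The following is a minimal sketch of such a check; `check_model_config` is a hypothetical helper, not part of eascmd or any SDK, and it only encodes the rules stated in the table above.

```python
from typing import Dict, List

# Parameters the table marks as always required inside model_config.
ALWAYS_REQUIRED = ("fg_mode", "period", "remote_type")
# Parameters that are needed only for a particular remote_type.
NEEDED_BY_REMOTE_TYPE = {"hologres": ("tables",), "redis": ("sep", "prefix")}


def check_model_config(model_config: Dict) -> List[str]:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    for key in ALWAYS_REQUIRED:
        if key not in model_config:
            problems.append("missing required parameter: " + key)
    remote_type = model_config.get("remote_type")
    if remote_type is not None and remote_type not in NEEDED_BY_REMOTE_TYPE:
        problems.append("unsupported remote_type: %r" % (remote_type,))
    for key in NEEDED_BY_REMOTE_TYPE.get(remote_type, ()):
        if key not in model_config:
            problems.append("%s is needed when remote_type is %s" % (key, remote_type))
    return problems


hologres_config = {
    "remote_type": "hologres",
    "fg_mode": "tf",
    "period": 2880,
    "tables": [{"name": "<schema>.<table_name>",
                "key": "<index_column_name>",
                "value": "<column_name>"}],
}
redis_config = {"remote_type": "redis", "fg_mode": "tf", "period": 2880}

hologres_problems = check_model_config(hologres_config)
redis_problems = check_model_config(redis_config)
print(hologres_problems)
print(redis_problems)
```

Here the Hologres configuration passes, while the Redis configuration is reported as missing sep and prefix.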
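Step 2: Call the service
After the EasyRec model service is deployed, go to the PAI EAS Model Online Service page, find the service that you want to call, and click Invocation Method in the Service Type column to view the endpoint and token of the service.
The input and output of an EasyRec model service are in Protobuf format. The calling method depends on whether the service includes FG:
With FG:
fg_mode=tf
Use the Java SDK
Add the dependency: add the following content to pom.xml.
<dependency>
    <groupId>com.aliyun.openservices.eas</groupId>
    <artifactId>eas-sdk</artifactId>
    <version>{version}</version>
</dependency>
Sample code: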
import java.util.Map;

import com.aliyun.openservices.eas.predict.http.*;
import com.aliyun.openservices.eas.predict.request.EasyRecRequest;
// PredictProtos (PBResponse, Results) is the protobuf class shipped with the SDK;
// import it from the package that your SDK version provides.

// separator, userFeatures, contextFeatures, and itemIdStr are String values prepared by the caller.
PredictClient client = new PredictClient(new HttpConfig());
// When you call the service through the general gateway, use the endpoint that starts with your user UID.
// You can find it in the invocation information of the service in the PAI-EAS console.
client.setEndpoint("xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com");
client.setModelName("ali_rec_rnk");
// Replace with the token of the service.
client.setToken("atxjzk****");

EasyRecRequest easyrecRequest = new EasyRecRequest(separator);
easyrecRequest.appendUserFeatureString(userFeatures);
easyrecRequest.appendContextFeatureString(contextFeatures);
easyrecRequest.appendItemStr(itemIdStr, ",");

// Print the scores of each output.
PredictProtos.PBResponse response = client.predict(easyrecRequest);
for (Map.Entry<String, PredictProtos.Results> entry : response.getResultsMap().entrySet()) {
    String key = entry.getKey();
    PredictProtos.Results value = entry.getValue();
    System.out.println("key: " + key);
    for (int i = 0; i < value.getScoresCount(); i++) {
        System.out.format("value: %.6g\n", value.getScores(i));
    }
}

// Fetch the features produced by FG so that they can be checked against the offline features.
// Setting DebugLevel to 1 returns the generated features.
easyrecRequest.setDebugLevel(1);
PredictProtos.PBResponse debugResponse = client.predict(easyrecRequest);
Map<String, String> genFeas = debugResponse.getGenerateFeaturesMap();
for (String itemId : genFeas.keySet()) {
    System.out.println(itemId);
    System.out.println(genFeas.get(itemId));
}
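The debug output above is meant for checking online features against offline ones. A minimal sketch of that comparison follows, assuming both sides are available as a map from item ID to one feature string in the same format. The helper `diff_features` and the sample feature strings are hypothetical; the real format of the generated features depends on the FG configuration.

```python
from typing import Dict, List


def diff_features(online: Dict[str, str], offline: Dict[str, str]) -> Dict[str, List[str]]:
    """Group item IDs by how their online and offline feature strings differ."""
    report: Dict[str, List[str]] = {"only_online": [], "only_offline": [], "mismatch": []}
    for item_id in sorted(set(online) | set(offline)):
        if item_id not in offline:
            report["only_online"].append(item_id)
        elif item_id not in online:
            report["only_offline"].append(item_id)
        elif online[item_id].strip() != offline[item_id].strip():
            report["mismatch"].append(item_id)
    return report


# Made-up feature strings, used only to exercise the comparison.
online = {"1001": "f1:0.5 f2:red", "1002": "f1:0.7 f2:blue", "1003": "f1:0.1 f2:red"}
offline = {"1001": "f1:0.5 f2:red", "1002": "f1:0.9 f2:blue", "1004": "f1:0.3 f2:green"}
report = diff_features(online, offline)
print(report)
```

An exact string comparison is the strictest possible check. If float formatting differs between the offline and online pipelines, a per-feature comparison with a tolerance would be a reasonable next step.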
Use the Python SDK
The Python SDK is intended only for debugging the service. For details, see the Python SDK instructions. For production use, the Java client is recommended.
pip install eas-prediction
# python -m eas_prediction.easyrec_client_demo --help
python -m eas_prediction.easyrec_client_demo \
    --endpoint 1301055xxxxxxxxx.cn-hangzhou.pai-eas.aliyuncs.com \
    --service_name ali_rec_rank \
    --token MmQ3Yxxxxxxxxxxx \
    --table_schema data/test/client/user_table_schema \
    --table_data data/test/client/user_table_data \
    --item_lst data/test/client/item_lst
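Where:
--endpoint: the endpoint that starts with your user UID. To obtain it, go to the PAI EAS Model Online Service page, find the service that you want to call, and click Invocation Method in the Service Type column.
--token: the token of the service. You can obtain it in the Invocation Method dialog box.
Without FG:
fg_mode=bypass
The service is called in the same way as a regular TensorFlow model. For details, see Request format.
You can also build service requests yourself. For details, see Request format.
Request format
Except for Python, clients in other languages must generate the prediction request code manually from the .proto file. If you want to build service requests yourself, use the following protobuf definition to generate the code:
Write the request definition file, for example tf_predict.proto.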
syntax = "proto3";

package tensorflow.eas;

option cc_enable_arenas = true;

enum ArrayDataType {
  // Not a legal value for DataType. Used to indicate a DataType field
  // has not been set.
  DT_INVALID = 0;

  // Data types that all computation devices are expected to be
  // capable to support.
  DT_FLOAT = 1;
  DT_DOUBLE = 2;
  DT_INT32 = 3;
  DT_UINT8 = 4;
  DT_INT16 = 5;
  DT_INT8 = 6;
  DT_STRING = 7;
  DT_COMPLEX64 = 8;  // Single-precision complex
  DT_INT64 = 9;
  DT_BOOL = 10;
  DT_QINT8 = 11;     // Quantized int8
  DT_QUINT8 = 12;    // Quantized uint8
  DT_QINT32 = 13;    // Quantized int32
  DT_BFLOAT16 = 14;  // Float32 truncated to 16 bits. Only for cast ops.
  DT_QINT16 = 15;    // Quantized int16
  DT_QUINT16 = 16;   // Quantized uint16
  DT_UINT16 = 17;
  DT_COMPLEX128 = 18;  // Double-precision complex
  DT_HALF = 19;
  DT_RESOURCE = 20;
  DT_VARIANT = 21;  // Arbitrary C++ data types
}

// Dimensions of an array
message ArrayShape {
  repeated int64 dim = 1 [packed = true];
}

// Protocol buffer representing an array
message ArrayProto {
  // Data Type.
  ArrayDataType dtype = 1;

  // Shape of the array.
  ArrayShape array_shape = 2;

  // DT_FLOAT.
  repeated float float_val = 3 [packed = true];

  // DT_DOUBLE.
  repeated double double_val = 4 [packed = true];

  // DT_INT32, DT_INT16, DT_INT8, DT_UINT8.
  repeated int32 int_val = 5 [packed = true];

  // DT_STRING.
  repeated bytes string_val = 6;

  // DT_INT64.
  repeated int64 int64_val = 7 [packed = true];

  // DT_BOOL.
  repeated bool bool_val = 8 [packed = true];
}

// PredictRequest specifies which TensorFlow model to run, as well as
// how inputs are mapped to tensors and how outputs are filtered before
// returning to user.
message PredictRequest {
  // A named signature to evaluate. If unspecified, the default signature
  // will be used
  string signature_name = 1;

  // Input tensors.
  // Names of input tensor are alias names. The mapping from aliases to real
  // input tensor names is expected to be stored as named generic signature
  // under the key "inputs" in the model export.
  // Each alias listed in a generic signature named "inputs" should be provided
  // exactly once in order to run the prediction.
  map<string, ArrayProto> inputs = 2;

  // Output filter.
  // Names specified are alias names. The mapping from aliases to real output
  // tensor names is expected to be stored as named generic signature under
  // the key "outputs" in the model export.
  // Only tensors specified here will be run/fetched and returned, with the
  // exception that when none is specified, all tensors specified in the
  // named signature will be run/fetched and returned.
  repeated string output_filter = 3;
}

// Response for PredictRequest on successful run.
message PredictResponse {
  // Output tensors.
  map<string, ArrayProto> outputs = 1;
}
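In the definition above, an ArrayProto carries a tensor as a dtype, a shape, and a flat list of values in the field that matches the dtype. The sketch below shows that layout with plain Python dictionaries instead of generated protobuf classes, so it runs without compiling the .proto file. The helper names and the input aliases `user_emb` and `item_id` are hypothetical, and row-major flattening is assumed, as is usual for TensorFlow tensors.

```python
from typing import Any, Dict, Iterable, List

# Enum values copied from ArrayDataType in the definition above.
DT_FLOAT = 1
DT_INT64 = 9
# The repeated field that holds the values for each dtype.
VALUE_FIELD = {DT_FLOAT: "float_val", DT_INT64: "int64_val"}


def shape_of(nested: Any) -> List[int]:
    """Infer the dimensions of a regular nested list."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0] if nested else None
    return shape


def flatten(nested: Any) -> List[Any]:
    """Flatten a nested list in row-major order."""
    if not isinstance(nested, list):
        return [nested]
    flat: List[Any] = []
    for element in nested:
        flat.extend(flatten(element))
    return flat


def to_array_proto(nested: Any, dtype: int) -> Dict[str, Any]:
    """Mirror the ArrayProto message: dtype, array_shape.dim, and one value field."""
    return {
        "dtype": dtype,
        "array_shape": {"dim": shape_of(nested)},
        VALUE_FIELD[dtype]: flatten(nested),
    }


def build_request(inputs: Dict[str, Dict[str, Any]],
                  output_filter: Iterable[str] = (),
                  signature_name: str = "") -> Dict[str, Any]:
    """Mirror the PredictRequest message; an empty signature_name selects the default."""
    return {
        "signature_name": signature_name,
        "inputs": inputs,
        "output_filter": list(output_filter),
    }


request = build_request(
    inputs={
        "user_emb": to_array_proto([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], DT_FLOAT),
        "item_id": to_array_proto([101, 102], DT_INT64),
    },
    output_filter=["probs_ctr"],
)
print(request["inputs"]["user_emb"])
```

With classes generated from the .proto file, the same values would be assigned to the corresponding message fields before the request is serialized and sent.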