EasyRec

Last updated: 2023-11-01 15:22:47

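The EasyRec Processor built into EAS deploys recommendation models trained with EasyRec or TensorFlow as scoring services, and it can integrate feature engineering. By jointly optimizing feature engineering and the TensorFlow model, EasyRec Processor delivers high-performance scoring. This topic describes how to deploy and call an EasyRec model service.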

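Background information

The following figure shows the functional architecture of a recommendation engine built on EasyRec Processor:

[Figure: functional architecture of a recommendation engine based on EasyRec Processor]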

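EasyRec Processor consists mainly of the following modules:

  • Item Feature Cache: caches features from the FeatureStore (Redis or Hologres) in memory, which reduces the network overhead and load caused by requests to the FeatureStore. The Item feature cache also supports incremental updates, such as updates of real-time features.

  • Feature Generator: the feature engineering (FG) module uses a single implementation for offline and online processing, which keeps feature processing consistent between the two. The implementation draws on the feature engineering solution developed at Taobao.

  • TFModel: TensorFlow loads the Saved_Model exported by EasyRec, and Blade is used to optimize model inference on CPUs and GPUs.

  • Feature tracking and incremental model update module: typically used in real-time training scenarios. For details, see Real-time training.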
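To make the Item Feature Cache idea concrete, the following is a minimal Python sketch of an in-memory cache that does a periodic full reload and applies incremental updates in between. It is an illustration under stated assumptions, not the EasyRec implementation: `ItemFeatureCache`, `fetch_all`, `fetch_since`, and `period_seconds` are hypothetical names, and the two fetch callables stand in for reads against Redis or Hologres.

```python
from typing import Callable, Dict, Iterable, Optional, Tuple

# One row from the feature store: (item_id, features, update_timestamp).
Row = Tuple[str, Dict[str, str], int]


class ItemFeatureCache:
    """Hypothetical in-memory item feature cache.

    fetch_all and fetch_since are stand-ins for FeatureStore reads;
    they are not part of any EasyRec or EAS API.
    """

    def __init__(
        self,
        fetch_all: Callable[[], Iterable[Row]],
        fetch_since: Callable[[int], Iterable[Row]],
        period_seconds: int,
    ) -> None:
        self._fetch_all = fetch_all
        self._fetch_since = fetch_since
        self._period = period_seconds
        self._features: Dict[str, Dict[str, str]] = {}
        self._last_full_load: Optional[int] = None
        self._watermark = 0  # newest update timestamp seen so far

    def refresh(self, now: int) -> None:
        """Reload everything once per period, otherwise apply only the delta."""
        full = (
            self._last_full_load is None
            or now - self._last_full_load >= self._period
        )
        if full:
            rows = list(self._fetch_all())
            self._features = {item_id: dict(feats) for item_id, feats, _ in rows}
            self._last_full_load = now
        else:
            rows = list(self._fetch_since(self._watermark))
            for item_id, feats, _ in rows:
                self._features.setdefault(item_id, {}).update(feats)
        self._watermark = max([self._watermark] + [ts for _, _, ts in rows])

    def get(self, item_id: str) -> Optional[Dict[str, str]]:
        """Serve features from memory without touching the feature store."""
        return self._features.get(item_id)
```

Scoring requests would then read only from `get`, so the feature store is contacted on the refresh schedule rather than once per request.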

Limits

Only the general-purpose instance family g7 with Intel CPUs is supported. For details, see General-purpose instance families.

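Step 1: Deploy the service

When you deploy an EasyRec model service with the eascmd client, set the processor type to easyrec. For information about how to deploy services with the client tool, see Service deployment: EASCMD & DSW. The following is a sample service configuration file: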

# Business date passed as the first argument.
bizdate=$1
cat << EOF > echo.json
{
  "name": "ali_rec_rnk",
  "metadata": {
    "instance": 2,
    "rpc": {
      "enable_jemalloc": 1,
      "max_queue_size": 100
    }
  },
  "cloud": {
    "computing": {
      "instance_type": "ecs.g7.large",
      "instances": null
    }
  },
  "model_config": {
    "remote_type": "hologres",
    "url": "postgresql://<AccessKeyID>:<AccessKeySecret>@<host>:<port>/<database>",
    "tables": [{"name": "<schema>.<table_name>", "key": "<index_column_name>", "value": "<column_name>"}],
    "period": 2880,
    "fg_mode": "tf",
    "outputs": "probs_ctr,probs_cvr"
  },
  "model_path": "",
  "processor": "easyrec",
  "storage": [
    {
      "mount_path": "/home/admin/docker_ml/workspace/model/",
      "oss": {
        "path": "oss://easyrec/ali_rec_sln_acc_rnk/20221122/export/final_with_fg"
      }
    }
  ]
}

EOF
# Deploy the service.
eascmd create echo.json
# eascmd -i <AccessKeyID> -k <AccessKeySecret> -e <endpoint> create echo.json
# Update the service.
eascmd update ali_rec_rnk -s echo.json
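The configuration file can also be generated programmatically, so that the JSON is always syntactically valid. The following is a minimal Python sketch that mirrors the sample above. `build_service_config`, `render_service_config`, and the `bizdate` parameter are hypothetical; in particular, the sketch assumes that the date segment of the OSS export path (20221122 in the sample) is the business date, which the source does not state.

```python
import json


def build_service_config(bizdate: str) -> dict:
    """Return the sample EasyRec service description as a plain dict.

    Assumption: the date directory in the OSS export path is the business
    date. Angle-bracket placeholders must be replaced with real values.
    """
    return {
        "name": "ali_rec_rnk",
        "metadata": {
            "instance": 2,
            "rpc": {"enable_jemalloc": 1, "max_queue_size": 100},
        },
        "cloud": {
            "computing": {"instance_type": "ecs.g7.large", "instances": None}
        },
        "model_config": {
            "remote_type": "hologres",
            "url": "postgresql://<AccessKeyID>:<AccessKeySecret>@<host>:<port>/<database>",
            "tables": [
                {
                    "name": "<schema>.<table_name>",
                    "key": "<index_column_name>",
                    "value": "<column_name>",
                }
            ],
            "period": 2880,
            "fg_mode": "tf",
            "outputs": "probs_ctr,probs_cvr",
        },
        "model_path": "",
        "processor": "easyrec",
        "storage": [
            {
                "mount_path": "/home/admin/docker_ml/workspace/model/",
                "oss": {
                    "path": f"oss://easyrec/ali_rec_sln_acc_rnk/{bizdate}/export/final_with_fg"
                },
            }
        ],
    }


def render_service_config(bizdate: str) -> str:
    """Serialize the service description to JSON text."""
    return json.dumps(build_service_config(bizdate), indent=2, ensure_ascii=False)
```

Writing the result of `render_service_config("20221122")` to echo.json produces a file that can be passed to `eascmd create echo.json` as in the sample above.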

The key parameters are described below. For other parameters, see the description of all parameters related to service models.

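Parameter: processor
Description: The EasyRec processor.
Example: "processor": "easyrec"

Parameter: fg_mode
Description: The feature engineering mode. Valid values:

  • tf: TensorFlow mode, which uses FG. FG is embedded into the TensorFlow computation graph as TF operators and the graph is then optimized, which yields higher performance.

  • bypass: FG is not used, and only the TensorFlow model is deployed.

    • Suitable for scenarios with custom feature processing.

    • In this mode, you do not need to configure the Item feature parameters, that is, the parameters listed after fg_mode here.

Example: "fg_mode": "tf"

Parameter: period
Description: The interval at which Item features are periodically updated, in minutes.
Example: "period": 2880

Parameter: remote_type
Description: The storage for Item features. Supported values:

  • hologres: reads and writes data through a SQL interface. Suitable for storing and querying massive amounts of data.

  • redis: reads and writes data through a key-value interface. Suitable for high-speed reads and writes and for caching.

Example: "remote_type": "hologres"

Parameter: tables
Description: The Item feature tables. Required when remote_type is hologres. Each table accepts the following parameters:

  • key: required. The name of the item_id column.

  • name: required. The name of the feature table.

  • value: optional. The names of the columns to load. Separate multiple column names with commas (,).

  • condition: optional. A WHERE sub-clause used to filter Items, for example style_id<10000.

  • timekey: optional. Used for incremental Item updates. Specifies the timestamp or integer value of the update. Supported formats: timestamp and int.

  • static: optional. Marks the features as static, so they are not updated periodically.

Input data can be read from multiple tables, configured as "tables": [{}, {}]. If several tables contain the same column, later tables override earlier ones.

Example: "tables": [{"key": "goods_id", "name": "public.ali_rec_item_feature"}]

Parameter: url
Description: The endpoint of Hologres or Redis. If you use Redis on Alibaba Cloud, use the proxy-mode endpoint in the VPC.
Example: "url": "postgresql://LTAIXXXXX:J6geXXXXXX@hgprecn-cn-xxxxx-cn-hangzhou-vpc.hologres.aliyuncs.com:80/bigdata_rec"

Parameter: sep
Description: The separator between features. Required when remote_type is redis.
Example: "sep":","

Parameter: prefix
Description: The prefix of Item features. Required when remote_type is redis.
Example: "prefix": "itm_"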

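The dependencies between these parameters can be checked before deployment. The following is a minimal Python sketch that encodes only the rules stated in the parameter descriptions: fg_mode is tf or bypass, bypass needs no Item feature parameters, tables (with key and name) is needed for hologres, and sep and prefix are needed for redis. `validate_model_config` is a hypothetical helper, not part of eascmd or EasyRec, and it assumes tables is given as a list.

```python
from typing import Any, Dict, List

FG_MODES = ("tf", "bypass")
ITEM_STORES = ("hologres", "redis")


def validate_model_config(model_config: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a model_config dict (empty if none)."""
    problems: List[str] = []

    fg_mode = model_config.get("fg_mode")
    if fg_mode not in FG_MODES:
        problems.append(f"fg_mode must be one of {FG_MODES}, got {fg_mode!r}")
    if fg_mode == "bypass":
        # Without FG, the Item feature parameters are not needed.
        return problems

    remote_type = model_config.get("remote_type")
    if remote_type not in ITEM_STORES:
        problems.append(
            f"remote_type must be one of {ITEM_STORES}, got {remote_type!r}"
        )

    if remote_type == "hologres":
        tables = model_config.get("tables")
        if not isinstance(tables, list) or not tables:
            problems.append(
                "tables must be a non-empty list when remote_type is hologres"
            )
        else:
            for index, table in enumerate(tables):
                for field in ("key", "name"):
                    if not isinstance(table, dict) or not table.get(field):
                        problems.append(
                            f"tables[{index}] is missing required field {field!r}"
                        )

    if remote_type == "redis":
        for field in ("sep", "prefix"):
            if field not in model_config:
                problems.append(f"{field} must be set when remote_type is redis")

    return problems
```

For example, passing the model_config section of a service description and failing fast when the returned list is not empty catches a missing prefix or an incomplete table entry before the service is created.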
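Step 2: Call the service

After the EasyRec model service is deployed, go to the PAI EAS Model Online Service page and click Invocation Information in the Service Type column of the service that you want to call to view the endpoint and token of the service.

The input and output of an EasyRec model service are in Protobuf format. There are two calling methods, depending on whether FG is included: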

  • With FG: fg_mode=tf

    Use the Java SDK

    Add the dependency: add the following content to pom.xml.

    <dependency>
      <groupId>com.aliyun.openservices.eas</groupId>
      <artifactId>eas-sdk</artifactId>
      <version>{version}</version>
    </dependency>

    Sample code:

    import java.util.Map;

    import com.aliyun.openservices.eas.predict.http.*;
    import com.aliyun.openservices.eas.predict.request.EasyRecRequest;

    PredictClient client = new PredictClient(new HttpConfig());
    // When you access the service through the regular gateway, use the endpoint that starts with
    // your user UID. You can find it in the invocation information of the service in the PAI-EAS console.
    client.setEndpoint("xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com");
    client.setModelName("ali_rec_rnk");
    // Replace with the token of your service.
    client.setToken("atxjzk****");

    // separator, userFeatures, contextFeatures, and itemIdStr are supplied by the caller.
    EasyRecRequest easyrecRequest = new EasyRecRequest(separator);
    easyrecRequest.appendUserFeatureString(userFeatures);
    easyrecRequest.appendContextFeatureString(contextFeatures);
    easyrecRequest.appendItemStr(itemIdStr, ",");

    PredictProtos.PBResponse response = client.predict(easyrecRequest);

    // Print the scores returned for each result key.
    for (Map.Entry<String, PredictProtos.Results> entry : response.getResultsMap().entrySet()) {
        String key = entry.getKey();
        PredictProtos.Results value = entry.getValue();
        System.out.print("key: " + key);
        for (int i = 0; i < value.getScoresCount(); i++) {
            System.out.format("value: %.6g\n", value.getScores(i));
        }
    }

    // Retrieve the features generated by FG so that they can be compared with the offline features.
    // Setting the debug level to 1 returns the generated features.
    easyrecRequest.setDebugLevel(1);
    PredictProtos.PBResponse debugResponse = client.predict(easyrecRequest);
    Map<String, String> genFeas = debugResponse.getGenerateFeaturesMap();
    for (String itemId : genFeas.keySet()) {
        System.out.println(itemId);
        System.out.println(genFeas.get(itemId));
    }

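    Use the Python SDK

    The Python SDK is intended only for debugging the service. For details, see the Python SDK documentation. In production, we recommend the Java client.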

    pip install eas-prediction
    # python -m eas_prediction.easyrec_client_demo --help
    python -m eas_prediction.easyrec_client_demo \
      --endpoint 1301055xxxxxxxxx.cn-hangzhou.pai-eas.aliyuncs.com \
      --service_name ali_rec_rank \
      --token MmQ3Yxxxxxxxxxxx \
      --table_schema data/test/client/user_table_schema \
      --table_data data/test/client/user_table_data \
      --item_lst data/test/client/item_lst

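    Where:

    • --endpoint: the endpoint that starts with your user UID. To obtain it, go to the PAI EAS Model Online Service page and click Invocation Information in the Service Type column of the service that you want to call.

    • --token: the token of the service. You can obtain it in the Invocation Information dialog box.

  • Without FG: fg_mode=bypass

    The calling method is the same as for a regular TensorFlow model. For details, see Request format.

You can also construct service requests yourself. For details, see Request format.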

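Request format

Except for Python, clients in other languages must manually generate the prediction request code from the .proto file. If you want to construct service requests yourself, you can use the following Protobuf definition to generate the code:

  • Write the request definition file, for example tf_predict.proto.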

    syntax = "proto3";
    
    package tensorflow.eas;
    option cc_enable_arenas = true;
    
    enum ArrayDataType {
      // Not a legal value for DataType. Used to indicate a DataType field
      // has not been set.
      DT_INVALID = 0;
    
      // Data types that all computation devices are expected to be
      // capable to support.
      DT_FLOAT = 1;
      DT_DOUBLE = 2;
      DT_INT32 = 3;
      DT_UINT8 = 4;
      DT_INT16 = 5;
      DT_INT8 = 6;
      DT_STRING = 7;
      DT_COMPLEX64 = 8;  // Single-precision complex
      DT_INT64 = 9;
      DT_BOOL = 10;
      DT_QINT8 = 11;     // Quantized int8
      DT_QUINT8 = 12;    // Quantized uint8
      DT_QINT32 = 13;    // Quantized int32
      DT_BFLOAT16 = 14;  // Float32 truncated to 16 bits.  Only for cast ops.
      DT_QINT16 = 15;    // Quantized int16
      DT_QUINT16 = 16;   // Quantized uint16
      DT_UINT16 = 17;
      DT_COMPLEX128 = 18;  // Double-precision complex
      DT_HALF = 19;
      DT_RESOURCE = 20;
      DT_VARIANT = 21;  // Arbitrary C++ data types
    }
    
    // Dimensions of an array
    message ArrayShape {
      repeated int64 dim = 1 [packed = true];
    }
    
    // Protocol buffer representing an array
    message ArrayProto {
      // Data Type.
      ArrayDataType dtype = 1;
    
      // Shape of the array.
      ArrayShape array_shape = 2;
    
      // DT_FLOAT.
      repeated float float_val = 3 [packed = true];
    
      // DT_DOUBLE.
      repeated double double_val = 4 [packed = true];
    
      // DT_INT32, DT_INT16, DT_INT8, DT_UINT8.
      repeated int32 int_val = 5 [packed = true];
    
      // DT_STRING.
      repeated bytes string_val = 6;
    
      // DT_INT64.
      repeated int64 int64_val = 7 [packed = true];
    
      // DT_BOOL.
      repeated bool bool_val = 8 [packed = true];
    }
    
    // PredictRequest specifies which TensorFlow model to run, as well as
    // how inputs are mapped to tensors and how outputs are filtered before
    // returning to user.
    message PredictRequest {
      // A named signature to evaluate. If unspecified, the default signature
      // will be used
      string signature_name = 1;
    
      // Input tensors.
      // Names of input tensor are alias names. The mapping from aliases to real
      // input tensor names is expected to be stored as named generic signature
      // under the key "inputs" in the model export.
      // Each alias listed in a generic signature named "inputs" should be provided
      // exactly once in order to run the prediction.
      map<string, ArrayProto> inputs = 2;
    
      // Output filter.
      // Names specified are alias names. The mapping from aliases to real output
      // tensor names is expected to be stored as named generic signature under
      // the key "outputs" in the model export.
      // Only tensors specified here will be run/fetched and returned, with the
      // exception that when none is specified, all tensors specified in the
      // named signature will be run/fetched and returned.
      repeated string output_filter = 3;
    }
    
    // Response for PredictRequest on successful run.
    message PredictResponse {
      // Output tensors.
      map<string, ArrayProto> outputs = 1;
    }
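To show how the fields of ArrayProto and PredictRequest fit together, the following minimal Python sketch builds plain dicts that mirror the message layout. It deliberately does not use generated Protobuf classes, so `make_array` and `make_predict_request` are hypothetical helpers for illustration only. The sketch also assumes that values are passed as a flat list whose length equals the product of the shape dimensions; the definition above does not state this explicitly.

```python
from math import prod
from typing import Any, Dict, Mapping, Sequence

# Numeric values copied from the ArrayDataType enum in the definition above.
DT_FLOAT, DT_DOUBLE, DT_INT32, DT_UINT8, DT_INT16, DT_INT8 = 1, 2, 3, 4, 5, 6
DT_STRING, DT_INT64, DT_BOOL = 7, 9, 10

# Which repeated field of ArrayProto carries the values for each dtype,
# following the comments in the ArrayProto message.
VALUE_FIELD = {
    DT_FLOAT: "float_val",
    DT_DOUBLE: "double_val",
    DT_INT32: "int_val",
    DT_INT16: "int_val",
    DT_INT8: "int_val",
    DT_UINT8: "int_val",
    DT_STRING: "string_val",
    DT_INT64: "int64_val",
    DT_BOOL: "bool_val",
}


def make_array(dtype: int, shape: Sequence[int], values: Sequence[Any]) -> Dict[str, Any]:
    """Build a dict shaped like ArrayProto from a flat list of values."""
    if dtype not in VALUE_FIELD:
        raise ValueError(f"no value field for dtype {dtype}")
    if prod(shape) != len(values):
        raise ValueError(f"shape {list(shape)} does not match {len(values)} values")
    if dtype == DT_STRING:
        # string_val is declared as repeated bytes, so encode text first.
        values = [v.encode("utf-8") if isinstance(v, str) else v for v in values]
    return {
        "dtype": dtype,
        "array_shape": {"dim": list(shape)},
        VALUE_FIELD[dtype]: list(values),
    }


def make_predict_request(
    inputs: Mapping[str, Dict[str, Any]],
    output_filter: Sequence[str] = (),
    signature_name: str = "",
) -> Dict[str, Any]:
    """Build a dict shaped like PredictRequest."""
    return {
        "signature_name": signature_name,
        "inputs": dict(inputs),
        "output_filter": list(output_filter),
    }
```

For example, `make_predict_request({"x": make_array(DT_INT32, [2, 2], [1, 2, 3, 4])}, ["probs_ctr"])` describes a request with one 2x2 integer input and a single requested output. The input alias `x` is made up for the example. A real client would set the same fields on the classes generated from the .proto file.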