EasyRec Processor (recommendation scoring service)

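The EasyRec Processor built into EAS deploys recommendation models trained with EasyRec or TensorFlow as scoring services, and it can integrate feature engineering. By optimizing feature engineering and the TensorFlow model together, EasyRec Processor delivers a high-performance scoring service. This topic describes how to deploy and call an EasyRec model service.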

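Background information

EasyRec Processor is an inference service written against the PAI-EAS Processor specification (see Develop custom Processors in C or C++). It is used in two scenarios:

Deep learning models trained with Feature Generator (FG) and EasyRec. EasyRec Processor caches item features in memory and optimizes feature transformation and inference, which substantially improves scoring performance. On top of this setup, you can use FeatureStore to manage online features and real-time features. The recommendation solution customization in the PAI-Rec development platform generates the related code and connects training, feature transformation, and inference optimization. Combined with the PAI-Rec engine, it lets you deploy models and bring them online quickly. Many customers have adopted this solution to reduce costs and improve development efficiency.
Models trained with EasyRec or TensorFlow that do not use Feature Generator can also be served.

The architecture of a recommendation engine based on EasyRec Processor is shown in the following figure:

EasyRec Processor mainly consists of the following modules:

Item Feature Cache: caches features from FeatureStore in memory, which reduces the network overhead and load caused by requests to FeatureStore. The item feature cache also supports incremental updates, for example updates of real-time features.
Feature Generator: the feature engineering module (FG) uses the same implementation offline and online, which keeps feature processing consistent between the two. The implementation draws on the feature engineering solution accumulated at Taobao.
TFModel: the TensorFlow model loads the Saved_Model exported by EasyRec and uses Blade to optimize inference on CPUs and GPUs.
Feature tracking and incremental model update module: typically used in real-time training scenarios. For details, see Real-time training.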
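To make the Item Feature Cache idea concrete, here is a minimal Python sketch of an in-memory cache that reloads on a fixed period and accepts incremental updates. It is only an illustration of the behavior described above, not the processor's actual implementation; the class name, the `loader` callback, and the `clock` parameter are all hypothetical.

```python
import time
from typing import Callable, Dict, Optional


class ItemFeatureCache:
    """Toy in-memory item feature cache (illustrative only)."""

    def __init__(self, loader: Callable[[], Dict[str, dict]],
                 period_minutes: float,
                 clock: Callable[[], float] = time.time):
        self._loader = loader              # hypothetical: returns {item_id: features}
        self._period_s = period_minutes * 60
        self._clock = clock                # injectable so the refresh logic is testable
        self._data: Dict[str, dict] = {}
        self._loaded_at: Optional[float] = None

    def _refresh_if_due(self) -> None:
        # Reload everything on first use and whenever the period has elapsed.
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._period_s:
            self._data = dict(self._loader())
            self._loaded_at = now

    def get(self, item_id: str) -> Optional[dict]:
        self._refresh_if_due()
        return self._data.get(item_id)

    def apply_increment(self, updates: Dict[str, dict]) -> None:
        # Merge changed features (for example real-time ones) without a full reload.
        for item_id, feats in updates.items():
            self._data.setdefault(item_id, {}).update(feats)
```

Serving reads from memory, so the remote store is only contacted once per period plus whatever incremental updates arrive in between.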

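Limits

Only the general-purpose instance families g6, g7, and g8 are supported (Intel CPUs only). Supported GPU models include T4, A10, GU30, 3090, and 4090. For details, see General-purpose instance families (g series).

Version list

EasyRec Processor is still under active development. We recommend that you deploy inference services with the latest version, which provides more features and better inference performance. Released versions: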

Processor name	Release date	TensorFlow version	New features
easyrec	20230608	2.10	Supports Feature Generator and Item Feature Cache. Supports Online Deep Learning. Supports Faiss vector recall. Supports GPU inference.
easyrec-1.2	20230721	2.10	Optimizes weighted category embedding.
easyrec-1.3	20230802	2.10	Supports loading item features from MaxCompute into the item feature cache.
easyrec-1.6	20231006	2.10	Automatic feature expansion. GPU placement optimization. Supports save_req to save requests to the model directory.
easyrec-1.7	20231013	2.10	Optimizes Keras model performance.
easyrec-1.8	20231101	2.10	Supports the cloud version of FeatureStore.
easyrec-kv-1.8	20231220	DeepRec (deeprec2310)	Supports DeepRec EmbeddingVariable.
easyrec-1.9	20231222	2.10	Fixes graph optimization issues for TagFeature and RawFeature.
easyrec-2.4	20240826	2.10	The FeatureStore C++ SDK supports FeatureDB. The FeatureStore C++ SDK supports STS tokens. Requests support the double (float64) type.

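Step 1: Deploy the service

When you deploy an EasyRec model service with the eascmd client, set the processor type to easyrec-{version}. For how to deploy services with the client tool, see Service deployment: EASCMD. Sample service configuration files follow.

Example with FG

Note: the following example deploys with a shell script, and the script contains the AccessKey ID and AccessKey secret in plaintext. This approach is simple and easy to follow, but it does not use PAI-FeatureStore and does not show how to load table data from MaxCompute to reduce the load on Hologres. To use PAI-FeatureStore and load data from MaxCompute, see Step 2: Create and deploy an EAS model service. That document deploys with a Python script, uses the DataWorks built-in object o, uses temporary STS credentials for better security, and sets load_feature_from_offlinestore to True.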

bizdate=$1
cat << EOF > echo.json
{
  "name":"ali_rec_rnk_with_fg",
  "metadata": {
    "instance": 2,
    "rpc": {
      "enable_jemalloc": 1,
      "max_queue_size": 100
    }
  },
  "cloud": {
    "computing": {
      "instance_type": "ecs.g7.large",
      "instances": null
    }
  },
  "model_config": {
    "remote_type": "hologres",
    "url": "postgresql://<AccessKeyID>:<AccessKeySecret>@<host>:<port>/<database>",
    "tables": [{"name":"<schema>.<table_name>","key":"<index_column_name>","value": "<column_name>"}],
    "period": 2880,
    "fg_mode": "tf",
    "outputs":"probs_ctr,probs_cvr"
  },
  "model_path": "",
  "processor": "easyrec-2.4",
  "storage": [
    {
      "mount_path": "/home/admin/docker_ml/workspace/model/",
      "oss": {
        "path": "oss://easyrec/ali_rec_sln_acc_rnk/20221122/export/final_with_fg"
      }
    }
  ]
}

EOF
# Deploy the service.
eascmd  create echo.json
# eascmd -i <AccessKeyID>  -k  <AccessKeySecret>   -e <endpoint> create echo.json
# Update the service.
eascmd update ali_rec_rnk_with_fg -s echo.json
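As an alternative to a hand-written heredoc, the configuration can be generated with a JSON serializer, which rules out syntax slips such as trailing commas and lets the credentials come from the environment instead of the script. The following is a sketch under stated assumptions: the function name and the environment variable names `HOLO_ACCESS_KEY_ID` and `HOLO_ACCESS_KEY_SECRET` are hypothetical, and the field values simply mirror the example above.

```python
import json
import os


def build_service_config(holo_host: str, holo_port: int, database: str,
                         table: str, key_column: str, oss_model_path: str) -> dict:
    # Hypothetical environment variable names; fall back to the doc's placeholders.
    ak_id = os.environ.get("HOLO_ACCESS_KEY_ID", "<AccessKeyID>")
    ak_secret = os.environ.get("HOLO_ACCESS_KEY_SECRET", "<AccessKeySecret>")
    return {
        "name": "ali_rec_rnk_with_fg",
        "metadata": {
            "instance": 2,
            "rpc": {"enable_jemalloc": 1, "max_queue_size": 100},
        },
        "cloud": {"computing": {"instance_type": "ecs.g7.large", "instances": None}},
        "model_config": {
            "remote_type": "hologres",
            "url": f"postgresql://{ak_id}:{ak_secret}@{holo_host}:{holo_port}/{database}",
            "tables": [{"name": table, "key": key_column}],
            "period": 2 * 24 * 60,  # minutes; 2880 = two days
            "fg_mode": "tf",
            "outputs": "probs_ctr,probs_cvr",
        },
        "model_path": "",
        "processor": "easyrec-2.4",
        "storage": [{
            "mount_path": "/home/admin/docker_ml/workspace/model/",
            "oss": {"path": oss_model_path},
        }],
    }


if __name__ == "__main__":
    cfg = build_service_config("<host>", 80, "<database>",
                               "<schema>.<table_name>", "<index_column_name>",
                               "oss://easyrec/ali_rec_sln_acc_rnk/20221122/export/final_with_fg")
    # json.dumps always emits valid JSON; redirect the output to echo.json.
    print(json.dumps(cfg, indent=2))
```

The printed JSON can be saved as echo.json and passed to `eascmd create` exactly as in the shell example.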

Example without FG

bizdate=$1
cat << EOF > echo.json
{
  "name":"ali_rec_rnk_no_fg",
  "metadata": {
    "instance": 2,
    "rpc": {
      "enable_jemalloc": 1,
      "max_queue_size": 100
    }
  },
  "cloud": {
    "computing": {
      "instance_type": "ecs.g7.large",
      "instances": null
    }
  },
  "model_config": {
    "fg_mode": "bypass"
  },
  "processor": "easyrec-1.9",
  "processor_envs": [
    {
      "name": "INPUT_TILE",
      "value": "2"
    }
  ],
  "storage": [
    {
      "mount_path": "/home/admin/docker_ml/workspace/model/",
      "oss": {
        "path": "oss://easyrec/ali_rec_sln_acc_rnk/20221122/export/final/"
      }
    }
  ],
  "warm_up_data_path": "oss://easyrec/ali_rec_sln_acc_rnk/rnk_warm_up.bin"
}

EOF
# Deploy the service.
eascmd  create echo.json
# eascmd -i <AccessKeyID>  -k  <AccessKeySecret>   -e <endpoint> create echo.json
# Update the service.
eascmd update ali_rec_rnk_no_fg -s echo.json

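The key parameters are described below. For other parameters, see JSON deployment parameters.

Parameter	Required	Description	Example
processor	Yes	The EasyRec Processor.	`"processor": "easyrec"`
fg_mode	Yes	The feature engineering mode. Valid values: tf: TensorFlow mode, with FG. FG is embedded in the TensorFlow graph as TF operators and the graph is optimized, which yields higher performance. bypass: FG is not used and only the TensorFlow model is deployed. Suitable for custom feature processing. In this mode, you do not need to configure the Item Feature Cache parameters or the parameters for Processor access to FeatureStore.	`"fg_mode": "tf"`
outputs	Yes	The names of the output variables predicted by the TF model, such as probs_ctr. Separate multiple names with commas. If you do not know the output variable names, run the TensorFlow command saved_model_cli to view them.	"outputs":"probs_ctr,probs_cvr"
save_req	No	Whether to save the request data to files in the model directory. The saved files can be used for warmup and performance testing. Valid values: true: yes. false (default): no. Set this to false in production, because saving requests affects performance.	"save_req": "false"
Item Feature Cache parameters
period	Yes	The interval, in minutes, at which the item feature cache is periodically refreshed. If item features are updated daily, set a value greater than one day (for example 2880; one day is 1440 minutes, so 2880 is two days). The features then do not need to be refreshed within the day, because the routine daily service update also refreshes them.	`"period": 2880`
remote_type	Yes	The data source of item features. Supported values: hologres: reads and writes data through the SQL interface; suitable for storing and querying massive data. none: the item feature cache is not used and item features are passed in the request; in this case set tables to [].	`"remote_type": "hologres"`
tables	No	The item feature tables. Required when remote_type is hologres. Each table has the following parameters: key: required, the item_id column name. name: required, the feature table name. value: optional, the columns to load; separate multiple column names with commas (,). condition: optional, a WHERE clause for filtering items, for example `style_id<10000`. timekey: optional, used for incremental item updates; specifies the update timestamp or integer value. Supported formats: timestamp and int. static: optional, marks static features that do not need periodic refresh. Item data can be read from multiple tables, configured as: `"tables": [{"key":"table1", ...},{"key":"table2", ...}]` If several tables contain the same column, the later table overrides the earlier one.	`"tables": { "key": "goods_id", "name": "public.ali_rec_item_feature" }`
url	No	The Hologres connection URL.	`"url": "postgresql://LTAI************@hgprecn-cn-xxxxx-cn-hangzhou-vpc.hologres.aliyuncs.com:80/bigdata_rec"`
Parameters for Processor access to FeatureStore
fs_project	No	The FeatureStore project name. Required when FeatureStore is used. For FeatureStore documentation, see Configure a FeatureStore project.	"fs_project": "fs_demo"
fs_model	No	The name of the FeatureStore model feature.	"fs_model": "fs_rank_v1"
fs_entity	No	The FeatureStore entity name.	"fs_entity": "item"
region	No	The region where FeatureStore resides.	"region": "cn-beijing"
access_key_id	No	The access_key_id used for FeatureStore.	"access_key_id": "xxxxx"
access_key_secret	No	The access_key_secret used for FeatureStore.	"access_key_secret": "xxxxx"
load_feature_from_offlinestore	No	Whether offline features are read directly from the FeatureStore OfflineStore. Valid values: True: yes, data is read from the FeatureStore OfflineStore. False (default): no, data is read from the FeatureStore OnlineStore.	"load_feature_from_offlinestore": True
INPUT_TILE: parameters for automatic feature expansion
INPUT_TILE	No	Enables automatic feature broadcasting. For a feature whose value is the same for every item in a request (for example user_id), you only need to pass one value. Benefits: smaller requests, less network transfer time, and less compute time. To enable it, set the INPUT_TILE environment variable to 2. Note: easyrec-1.3 and later support this optimization. When fg_mode=tf, the optimization is enabled automatically and you do not need to set the environment variable.	"processor_envs": [ { "name": "INPUT_TILE", "value": "2" } ]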
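The broadcast behavior behind INPUT_TILE can be pictured with a small Python sketch: features supplied as a single value are expanded to the number of items in the request, while per-item features pass through unchanged. This only models the idea on the client side for illustration; `tile_features` is a hypothetical helper, not part of the processor or any SDK.

```python
from typing import Dict, List


def tile_features(feeds: Dict[str, List], batch_size: int) -> Dict[str, List]:
    """Expand single-value features to batch_size, mimicking the broadcast idea."""
    tiled: Dict[str, List] = {}
    for name, values in feeds.items():
        if len(values) == batch_size:
            tiled[name] = list(values)               # already one value per item
        elif len(values) == 1:
            tiled[name] = list(values) * batch_size  # broadcast the shared value
        else:
            raise ValueError(
                f"feature {name!r} has {len(values)} values, expected 1 or {batch_size}")
    return tiled


if __name__ == "__main__":
    request = {"user_id": ["u0001"], "age": [18.0],
               "item_id": ["i0001", "i0002", "i0003"]}
    print(tile_features(request, batch_size=3))
```

With the optimization enabled, the request only has to carry the compact form, which is where the savings in request size and transfer time come from.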

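Inference optimization parameters of EasyRec Processor

Parameter	Required	Description	Example
TF_XLA_FLAGS	No	When a GPU is used, compiles and optimizes the model with XLA and fuses operators automatically.	"processor_envs": [ { "name": "TF_XLA_FLAGS", "value": "--tf_xla_auto_jit=2" }, { "name": "XLA_FLAGS", "value": "--xla_gpu_cuda_data_dir=/usr/local/cuda/" }, { "name": "XLA_ALIGN_SIZE", "value": "64" } ]
TF scheduling parameters	No	inter_op_parallelism_threads: the number of threads used to execute different operations. intra_op_parallelism_threads: the number of threads used inside a single operation. On a 32-core CPU, setting both to 16 generally gives good performance.	"model_config": { "inter_op_parallelism_threads": 16, "intra_op_parallelism_threads": 16 }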
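The tuning options above can be assembled programmatically before they are merged into the service configuration. The sketch below is illustrative only: `build_inference_tuning` is a hypothetical helper, and the "half the cores" rule is an assumption extrapolated from the single 32-core example, so the resulting thread counts should be validated with a load test.

```python
import os
from typing import Optional


def build_inference_tuning(cpu_cores: Optional[int] = None, use_gpu: bool = False) -> dict:
    cores = cpu_cores or os.cpu_count() or 1
    # Assumption: generalize "32 cores -> 16 threads" to half the available cores.
    threads = max(1, cores // 2)
    tuning = {
        "model_config": {
            "inter_op_parallelism_threads": threads,
            "intra_op_parallelism_threads": threads,
        }
    }
    if use_gpu:
        # XLA settings copied from the table above; they only apply on GPU instances.
        tuning["processor_envs"] = [
            {"name": "TF_XLA_FLAGS", "value": "--tf_xla_auto_jit=2"},
            {"name": "XLA_FLAGS", "value": "--xla_gpu_cuda_data_dir=/usr/local/cuda/"},
            {"name": "XLA_ALIGN_SIZE", "value": "64"},
        ]
    return tuning


if __name__ == "__main__":
    print(build_inference_tuning(cpu_cores=32))
```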

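Step 2: Call the service

After the EasyRec model service is deployed, go to the Elastic Algorithm Service (EAS) page and click Invocation Information in the Service Type column of the service to view its endpoint and token.

The PAI-Rec engine and the model scoring service are both deployed on PAI-EAS, so the services need a direct network connection. On the PAI-EAS instance page (see the following figure), click "VPC" in the upper-right corner and configure the same VPC, vSwitch, and security group. For details, see Configure a VPC in the console. If you use Hologres, configure the same VPC information for it as well.

The input and output of the EasyRec model service are both in Protobuf format, so the service cannot be tested on the PAI-EAS console. Use one of the following two calling methods, depending on whether FG is included:

With FG: `fg_mode=tf`

Use the EAS Java SDK

For Maven configuration, see the Java SDK usage instructions. Sample code for calling the service ali_rec_rnk_with_fg: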

import com.aliyun.openservices.eas.predict.http.*;
import com.aliyun.openservices.eas.predict.request.EasyRecRequest;

PredictClient client = new PredictClient(new HttpConfig());
// When accessing through the shared gateway, use the endpoint that starts with your user UID.
// You can find it in the service's invocation information in the EAS console.
client.setEndpoint("xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com");
client.setModelName("ali_rec_rnk_with_fg");
// Replace with the service token.
client.setToken("******");

// separator is the feature delimiter used in the feature strings below (\u0002, CTRL_B).
EasyRecRequest easyrecRequest = new EasyRecRequest(separator);
// userFeatures: user features. Features are separated by \u0002 (CTRL_B);
// a feature name and its value are separated by a colon (:).
//  user_fea0:user_fea0_val\u0002user_fea1:user_fea1_val
// For the feature value format, see: https://easyrec.readthedocs.io/en/latest/feature/rtp_fg.html
easyrecRequest.appendUserFeatureString(userFeatures);
// You can also add one user feature at a time:
// easyrecRequest.addUserFeature(String userFeaName, T userFeaValue).
// Type T of the feature value: String, float, long, int.

// contextFeatures: context features. Features are separated by \u0002 (CTRL_B);
// the feature name and its values, and the values themselves, are separated by a colon (:).
//   ctxt_fea0:ctxt_fea0_ival0:ctxt_fea0_ival1:ctxt_fea0_ival2\u0002ctxt_fea1:ctxt_fea1_ival0:ctxt_fea1_ival1:ctxt_fea1_ival2
easyrecRequest.appendContextFeatureString(contextFeatures);
// You can also add one context feature at a time:
// easyrecRequest.addContextFeature(String ctxtFeaName, List<Object> ctxtFeaValue).
// Types of ctxtFeaValue elements: String, Float, Long, Integer.

// itemIdStr: the list of item IDs to score, separated by commas (,).
easyrecRequest.appendItemStr(itemIdStr, ",");
// You can also add one item ID at a time:
// easyrecRequest.appendItemId(String itemId)

PredictProtos.PBResponse response = client.predict(easyrecRequest);

for (Map.Entry<String, PredictProtos.Results> entry : response.getResultsMap().entrySet()) {
    String key = entry.getKey();
    PredictProtos.Results value = entry.getValue();
    System.out.print("key: " + key);
    for (int i = 0; i < value.getScoresCount(); i++) {
        System.out.format("value: %.6g\n", value.getScores(i));
    }
}

// Retrieve the features produced by FG so that they can be compared with the offline features for consistency.
// Setting DebugLevel to 1 returns the generated features.
easyrecRequest.setDebugLevel(1);
PredictProtos.PBResponse debugResponse = client.predict(easyrecRequest);
Map<String, String> genFeas = debugResponse.getGenerateFeaturesMap();
for (String itemId : genFeas.keySet()) {
    System.out.println(itemId);
    System.out.println(genFeas.get(itemId));
}
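The feature-string layout described in the comments of the Java sample can be reproduced in a few lines of Python, which is handy for preparing test inputs. This is a sketch: both helper names are hypothetical, and the assumption that a context feature carries one value per item is inferred from the `ival0`, `ival1`, `ival2` naming in the sample.

```python
from typing import Dict, List

FEATURE_SEP = "\u0002"  # CTRL_B, separates features
KV_SEP = ":"            # separates a feature name from its value(s)


def build_user_feature_string(features: Dict[str, object]) -> str:
    # name:value pairs joined by CTRL_B, e.g. "user_id:u0001\x02age:18"
    return FEATURE_SEP.join(f"{name}{KV_SEP}{value}" for name, value in features.items())


def build_context_feature_string(features: Dict[str, List[object]]) -> str:
    # name:val0:val1:... joined by CTRL_B (assumed: one value per item)
    parts = []
    for name, values in features.items():
        parts.append(KV_SEP.join([name] + [str(v) for v in values]))
    return FEATURE_SEP.join(parts)


if __name__ == "__main__":
    print(repr(build_user_feature_string({"user_id": "u0001", "age": 18})))
    print(repr(build_context_feature_string({"pos": [1, 2, 3]})))
```

Because the colon doubles as the delimiter, values that themselves contain a colon would be ambiguous in this simple form.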

Use the EAS Python SDK

For environment setup, see the Python SDK usage instructions. In production, we recommend the Java client. Sample code:

from eas_prediction import PredictClient

from eas_prediction.easyrec_request import EasyRecRequest
from eas_prediction.easyrec_predict_pb2 import PBFeature
from eas_prediction.easyrec_predict_pb2 import PBRequest

if __name__ == '__main__':
    endpoint = 'http://xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com'
    service_name = 'ali_rec_rnk_with_fg'
    token = '******'

    client = PredictClient(endpoint, service_name)
    client.set_token(token)
    client.init()

    req = PBRequest()
    uid = PBFeature()
    uid.string_feature = 'u0001'
    req.user_features['user_id'] = uid
    age = PBFeature()
    age.int_feature = 12
    req.user_features['age'] = age
    weight = PBFeature()
    weight.float_feature = 129.8
    req.user_features['weight'] = weight

    req.item_ids.extend(['item_0001', 'item_0002', 'item_0003'])
    
    easyrec_req = EasyRecRequest()
    easyrec_req.add_feed(req, debug_level=0)
    res = client.predict(easyrec_req)
    print(res)

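Where:

endpoint: the endpoint that starts with your user UID. On the PAI EAS model online service page, click Invocation Information in the Service Type column of the service to obtain it.
service_name: the service name, available on the PAI EAS model online service page.
token: the service token, available in the Invocation Information dialog box.

Without FG: `fg_mode=bypass`

Use the Java SDK

For Maven configuration, see the Java SDK usage instructions. Sample code for calling the service ali_rec_rnk_no_fg: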

import java.util.List;

import com.aliyun.openservices.eas.predict.http.PredictClient;
import com.aliyun.openservices.eas.predict.http.HttpConfig;
import com.aliyun.openservices.eas.predict.request.TFDataType;
import com.aliyun.openservices.eas.predict.request.TFRequest;
import com.aliyun.openservices.eas.predict.response.TFResponse;

public class TestEasyRec {
    public static TFRequest buildPredictRequest() {
        TFRequest request = new TFRequest();
 
        // The shape must match the number of values: three items, so the shape is {3}.
        request.addFeed("user_id", TFDataType.DT_STRING,
                        new long[]{3}, new String[]{"u0001", "u0001", "u0001"});
        request.addFeed("age", TFDataType.DT_FLOAT,
                        new long[]{3}, new float[]{18.0f, 18.0f, 18.0f});
        // Note: if INPUT_TILE=2 is set, features whose values are all identical can be passed only once:
        //    request.addFeed("user_id", TFDataType.DT_STRING,
        //            new long[]{1}, new String[]{"u0001"});
        //    request.addFeed("age", TFDataType.DT_FLOAT,
        //            new long[]{1}, new float[]{18.0f});
        request.addFeed("item_id", TFDataType.DT_STRING,
                        new long[]{3}, new String[]{"i0001", "i0002", "i0003"});
        request.addFetch("probs");
        return request;
    }

    public static void main(String[] args) throws Exception {
        PredictClient client = new PredictClient(new HttpConfig());

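        // To use direct network connection, call setDirectEndpoint instead, for example:
        //   client.setDirectEndpoint("pai-eas-vpc.cn-shanghai.aliyuncs.com");
        // Direct connection must be enabled in the EAS console by providing the source vSwitch used to access EAS services.
        // Direct connection offers better stability and performance.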
        client.setEndpoint("xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com");
        client.setModelName("ali_rec_rnk_no_fg");
        client.setToken("");
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < 100; i++) {
            try {
                TFResponse response = client.predict(buildPredictRequest());
                // probs is the name of the model's output field. You can inspect the model's inputs and outputs with curl:
                //   curl xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com -H "Authorization:{token}"
                List<Float> result = response.getFloatVals("probs");
                System.out.print("Predict Result: [");
                for (int j = 0; j < result.size(); j++) {
                    System.out.print(result.get(j).floatValue());
                    if (j != result.size() - 1) {
                        System.out.print(", ");
                    }
                }
                System.out.print("]\n");
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Spend Time: " + (endTime - startTime) + "ms");
        client.shutdown();
    }
}

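Use the Python SDK

See the Python SDK usage instructions. Because Python performance is relatively poor, use it only for debugging the service and use the Java SDK in production. Sample code for calling the service ali_rec_rnk_no_fg: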

#!/usr/bin/env python

from eas_prediction import PredictClient
from eas_prediction import StringRequest
from eas_prediction import TFRequest

if __name__ == '__main__':
    client = PredictClient('http://xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com', 'ali_rec_rnk_no_fg')
    client.set_token('')
    client.init()
    
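    # Note: replace serving_default with the signature_name of your model; see the Python SDK usage instructions referenced above.
    req = TFRequest('serving_default')
    req.add_feed('user_id', [3], TFRequest.DT_STRING, ['u0001'] * 3)
    req.add_feed('age', [3], TFRequest.DT_FLOAT, [18.0] * 3)
    # Note: with the INPUT_TILE=2 optimization enabled, the features above can be passed as a single value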
    #   req.add_feed('user_id', [1], TFRequest.DT_STRING, ['u0001'])
    #   req.add_feed('age', [1], TFRequest.DT_FLOAT, [18.0])
    req.add_feed('item_id', [3], TFRequest.DT_STRING, 
        ['i0001', 'i0002', 'i0003'])
    for x in range(0, 100):
        resp = client.predict(req)
        print(resp)
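A common mistake when building feeds by hand is declaring a shape that does not match the number of values. A small pre-flight check catches this before the request is sent. The sketch below is plain Python and independent of the SDK; `check_feed` is a hypothetical helper name.

```python
from math import prod
from typing import Sequence


def check_feed(name: str, shape: Sequence[int], values: Sequence) -> None:
    """Raise ValueError if the declared shape does not match the value count."""
    expected = prod(shape)  # number of elements implied by the shape
    if expected != len(values):
        raise ValueError(
            f"feed {name!r}: shape {list(shape)} implies {expected} values, "
            f"got {len(values)}")


if __name__ == "__main__":
    check_feed("user_id", [3], ["u0001"] * 3)               # consistent
    check_feed("item_id", [3], ["i0001", "i0002", "i0003"])  # consistent
    try:
        check_feed("age", [5], [18.0] * 3)                   # inconsistent
    except ValueError as err:
        print(err)
```

Calling it right before each `add_feed` keeps the shape and the value list in sync, including the single-value form used with INPUT_TILE=2.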

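You can also build the service request yourself. For details, see Request format.

Request format

For clients in languages other than Python, you must generate the prediction request code from the .proto files yourself. If you want to build service requests on your own, use the following protobuf definitions to generate the code:

tf_predict.proto: request definition for TensorFlow models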

syntax = "proto3";

option cc_enable_arenas = true;
option go_package = ".;tf";
option java_package = "com.aliyun.openservices.eas.predict.proto";
option java_outer_classname = "PredictProtos";

enum ArrayDataType {
  // Not a legal value for DataType. Used to indicate a DataType field
  // has not been set.
  DT_INVALID = 0;

  // Data types that all computation devices are expected to be
  // capable to support.
  DT_FLOAT = 1;
  DT_DOUBLE = 2;
  DT_INT32 = 3;
  DT_UINT8 = 4;
  DT_INT16 = 5;
  DT_INT8 = 6;
  DT_STRING = 7;
  DT_COMPLEX64 = 8;  // Single-precision complex
  DT_INT64 = 9;
  DT_BOOL = 10;
  DT_QINT8 = 11;     // Quantized int8
  DT_QUINT8 = 12;    // Quantized uint8
  DT_QINT32 = 13;    // Quantized int32
  DT_BFLOAT16 = 14;  // Float32 truncated to 16 bits.  Only for cast ops.
  DT_QINT16 = 15;    // Quantized int16
  DT_QUINT16 = 16;   // Quantized uint16
  DT_UINT16 = 17;
  DT_COMPLEX128 = 18;  // Double-precision complex
  DT_HALF = 19;
  DT_RESOURCE = 20;
  DT_VARIANT = 21;  // Arbitrary C++ data types
}

// Dimensions of an array
message ArrayShape {
  repeated int64 dim = 1 [packed = true];
}

// Protocol buffer representing an array
message ArrayProto {
  // Data Type.
  ArrayDataType dtype = 1;

  // Shape of the array.
  ArrayShape array_shape = 2;

  // DT_FLOAT.
  repeated float float_val = 3 [packed = true];

  // DT_DOUBLE.
  repeated double double_val = 4 [packed = true];

  // DT_INT32, DT_INT16, DT_INT8, DT_UINT8.
  repeated int32 int_val = 5 [packed = true];

  // DT_STRING.
  repeated bytes string_val = 6;

  // DT_INT64.
  repeated int64 int64_val = 7 [packed = true];

  // DT_BOOL.
  repeated bool bool_val = 8 [packed = true];
}

// PredictRequest specifies which TensorFlow model to run, as well as
// how inputs are mapped to tensors and how outputs are filtered before
// returning to user.
message PredictRequest {
  // A named signature to evaluate. If unspecified, the default signature
  // will be used
  string signature_name = 1;

  // Input tensors.
  // Names of input tensor are alias names. The mapping from aliases to real
  // input tensor names is expected to be stored as named generic signature
  // under the key "inputs" in the model export.
  // Each alias listed in a generic signature named "inputs" should be provided
  // exactly once in order to run the prediction.
  map<string, ArrayProto> inputs = 2;

  // Output filter.
  // Names specified are alias names. The mapping from aliases to real output
  // tensor names is expected to be stored as named generic signature under
  // the key "outputs" in the model export.
  // Only tensors specified here will be run/fetched and returned, with the
  // exception that when none is specified, all tensors specified in the
  // named signature will be run/fetched and returned.
  repeated string output_filter = 3;
  
  // Debug flags
  // 0: just return prediction results, no debug information
  // 100: return prediction results, and save request to model_dir 
  // 101: save timeline to model_dir
  int32 debug_level = 100;
}

// Response for PredictRequest on successful run.
message PredictResponse {
  // Output tensors.
  map<string, ArrayProto> outputs = 1;
}
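When filling an ArrayProto by hand, each data type goes into a specific repeated field, as the field comments in the definition above indicate. The following Python sketch records that mapping and builds a plain dictionary with the same layout. It is for illustration only and does not use the generated protobuf classes; `array_proto_as_dict` is a hypothetical helper.

```python
from math import prod
from typing import List, Sequence

# Mapping taken from the field comments in ArrayProto.
VALUE_FIELD_BY_DTYPE = {
    "DT_FLOAT": "float_val",
    "DT_DOUBLE": "double_val",
    "DT_INT32": "int_val",
    "DT_INT16": "int_val",
    "DT_INT8": "int_val",
    "DT_UINT8": "int_val",
    "DT_STRING": "string_val",
    "DT_INT64": "int64_val",
    "DT_BOOL": "bool_val",
}


def array_proto_as_dict(dtype: str, shape: Sequence[int], values: List) -> dict:
    """Return a dict shaped like ArrayProto: dtype, array_shape.dim, one value field."""
    if dtype not in VALUE_FIELD_BY_DTYPE:
        raise ValueError(f"no value field is defined for {dtype}")
    if prod(shape) != len(values):
        raise ValueError("shape does not match the number of values")
    if dtype == "DT_STRING":
        # string_val is declared as bytes, so encode text values.
        values = [v.encode("utf-8") if isinstance(v, str) else v for v in values]
    return {
        "dtype": dtype,
        "array_shape": {"dim": list(shape)},
        VALUE_FIELD_BY_DTYPE[dtype]: list(values),
    }


if __name__ == "__main__":
    print(array_proto_as_dict("DT_STRING", [3], ["i0001", "i0002", "i0003"]))
```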

easyrec_predict.proto: request definition for TensorFlow models with FG

syntax = "proto3";

option cc_enable_arenas = true;
option go_package = ".;easyrec";
option java_package = "com.aliyun.openservices.eas.predict.proto";
option java_outer_classname = "EasyRecPredictProtos";

import "tf_predict.proto";

// context features
message ContextFeatures {
  repeated PBFeature features = 1;
}

message PBFeature {
  oneof value {
    int32 int_feature = 1;
    int64 long_feature = 2;
    string string_feature = 3;
    float float_feature = 4;
  }
}

// PBRequest specifies the request for aggregator
message PBRequest {
  // Debug flags
  // 0: just return prediction results, no debug information
  // 3: return features generated by FG module, string format, feature values are separated by \u0002,
  //    could be used for checking feature consistency and generating online deep learning samples
  // 100: return prediction results, and save request to model_dir
  // 101: save timeline to model_dir
  // 102: for recall models such as DSSM and MIND, not only return Faiss retrieved results
  //      but also return user embedding vectors.
  int32 debug_level = 1;

  // user features
  map<string, PBFeature> user_features = 2;

  // item ids, static(daily updated) item features 
  // are fetched from the feature cache resides in 
  // each processor node by item_ids
  repeated string item_ids = 3;

  // context features for each item, realtime item features
  //    could be passed as context features.
  map<string, ContextFeatures> context_features = 4;

  // embedding retrieval neighbor number.
  int32 faiss_neigh_num = 5;
}

// return results
message Results {
  repeated double scores = 1 [packed = true];
}

enum StatusCode {
  OK = 0;
  INPUT_EMPTY = 1;
  EXCEPTION = 2;
}

// PBResponse specifies the response for aggregator
message PBResponse {
  // results
  map<string, Results> results = 1;

  // item features
  map<string, string> item_features = 2;

  // fg generate features
  map<string, string> generate_features = 3;

  // context features
  map<string, ContextFeatures> context_features = 4;

  string error_msg = 5;

  StatusCode status_code = 6;

  // item ids
  repeated string item_ids = 7;

  repeated string outputs = 8;

  // all fg input features
  map<string, string> raw_features = 9;

  // output tensors
  map<string, ArrayProto> tf_outputs = 10;
}
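Since PBFeature is a oneof, exactly one of its four fields is set per feature. A client that accepts native Python values needs a rule for picking the field; one possible rule is sketched below. The helper name and the choice to route integers outside the int32 range to `long_feature` are assumptions made for illustration, not behavior documented by the SDK.

```python
INT32_MIN, INT32_MAX = -2**31, 2**31 - 1


def pb_feature_field(value) -> str:
    """Pick the PBFeature oneof field name for a Python value."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; reject it to avoid silent surprises.
        raise TypeError("bool has no matching PBFeature field")
    if isinstance(value, int):
        return "int_feature" if INT32_MIN <= value <= INT32_MAX else "long_feature"
    if isinstance(value, float):
        return "float_feature"
    if isinstance(value, str):
        return "string_feature"
    raise TypeError(f"unsupported feature type: {type(value).__name__}")


if __name__ == "__main__":
    for v in ("u0001", 12, 2**40, 129.8):
        print(v, "->", pb_feature_field(v))
```

With the Python SDK sample shown earlier, the returned name corresponds to the attribute assigned on the feature object, for example `uid.string_feature` or `age.int_feature`.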