How to integrate when using only EAS for inference, without EasyRec training (AIRec, Alibaba Cloud Help Center)

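This document walks through how to use EAS for model inference only, without training the model in EasyRec. The steps are based on EasyRec Processor (the recommendation scoring service).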

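Background

In some cases, a model has already been trained in another environment, and you want to serve it for inference on the high-performance service that EAS provides.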

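Step 1: Prepare the model

Make sure the model has been exported in TensorFlow SavedModel format and uploaded to an accessible storage location, such as OSS (Alibaba Cloud Object Storage Service).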
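Before uploading, it can help to sanity-check the export directory locally. The sketch below is only an illustration and assumes the usual SavedModel layout (a saved_model.pb or saved_model.pbtxt file plus a variables/ directory). The helper names check_saved_model_dir and saved_model_cli_command are hypothetical and not part of EAS or EasyRec. The second helper only builds the TensorFlow saved_model_cli command line for listing the model's signatures; it does not run it.

```python
import os
import tempfile


def check_saved_model_dir(export_dir):
    """Return a list of problems found in a local SavedModel export directory."""
    problems = []
    has_graph = any(
        os.path.isfile(os.path.join(export_dir, name))
        for name in ("saved_model.pb", "saved_model.pbtxt")
    )
    if not has_graph:
        problems.append("missing saved_model.pb (or saved_model.pbtxt)")
    if not os.path.isdir(os.path.join(export_dir, "variables")):
        problems.append("missing variables/ directory")
    return problems


def saved_model_cli_command(export_dir):
    """Build the saved_model_cli command that prints all signatures of the model."""
    return ["saved_model_cli", "show", "--dir", export_dir, "--all"]


# Demo on a throwaway directory that mimics the SavedModel layout.
demo_dir = tempfile.mkdtemp()
print(check_saved_model_dir(demo_dir))  # two problems: nothing exported yet
open(os.path.join(demo_dir, "saved_model.pb"), "wb").close()
os.mkdir(os.path.join(demo_dir, "variables"))
print(check_saved_model_dir(demo_dir))  # no problems left
print(" ".join(saved_model_cli_command(demo_dir)))
```

An empty list means the directory at least has the expected shape. It does not prove that the model loads or that its input and output names match what the service will be asked for.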

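Step 2: Configure the service deployment file

Create a service configuration file, for example echo.json, and fill in the parameters in the following format: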

bizdate=$1
cat << EOF > echo.json
{
  "name":"ali_rec_rnk_no_fg",
  "metadata": {
    "instance": 2,
    "rpc": {
      "enable_jemalloc": 1,
      "max_queue_size": 100
    }
  },
  "cloud": {
    "computing": {
      "instance_type": "ecs.g7.large",
      "instances": null
    }
  },
  "model_config": {
    "fg_mode": "bypass"
  },
  "processor": "easyrec-1.9",
  "processor_envs": [
    {
      "name": "INPUT_TILE",
      "value": "2"
    }
  ],
  "storage": [
    {
      "mount_path": "/home/admin/docker_ml/workspace/model/",
      "oss": {
        "path": "oss://easyrec/ali_rec_sln_acc_rnk/20221122/export/final/"
      }
    }
  ],
  "warm_up_data_path": "oss://easyrec/ali_rec_sln_acc_rnk/rnk_warm_up.bin"
}

EOF

Setting the fg_mode field to bypass means that FG is not used and only the TensorFlow model is deployed.
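As an alternative to a shell heredoc, the same configuration can be generated programmatically, which guarantees that the file is syntactically valid JSON. The following is a minimal sketch that mirrors the fields of the sample above. The function name build_service_config and its parameters are illustrative assumptions, not part of any EAS tooling.

```python
import json


def build_service_config(name, model_oss_path, warm_up_path, instance_count=2):
    """Assemble a bypass-mode service configuration as a plain dict."""
    return {
        "name": name,
        "metadata": {
            "instance": instance_count,
            "rpc": {"enable_jemalloc": 1, "max_queue_size": 100},
        },
        "cloud": {
            "computing": {"instance_type": "ecs.g7.large", "instances": None}
        },
        # bypass: no FG, only the TensorFlow model is deployed.
        "model_config": {"fg_mode": "bypass"},
        "processor": "easyrec-1.9",
        "processor_envs": [{"name": "INPUT_TILE", "value": "2"}],
        "storage": [
            {
                "mount_path": "/home/admin/docker_ml/workspace/model/",
                "oss": {"path": model_oss_path},
            }
        ],
        "warm_up_data_path": warm_up_path,
    }


config = build_service_config(
    "ali_rec_rnk_no_fg",
    "oss://easyrec/ali_rec_sln_acc_rnk/20221122/export/final/",
    "oss://easyrec/ali_rec_sln_acc_rnk/rnk_warm_up.bin",
)
text = json.dumps(config, indent=2)
json.loads(text)  # round-trip: raises if the text is not valid JSON
print(text)
```

Writing text to echo.json yields a file that can be passed to the deployment step. Python's None is serialized as JSON null, matching the "instances" field of the sample.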

The key parameters are described below. For the other parameters, see the full parameter reference for model services.

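Parameter	Required	Description	Example
processor	Yes	EasyRec Processor.	"processor": "easyrec"
fg_mode	Yes	Specifies the feature engineering mode. Valid values: tf: TensorFlow mode, with FG. FG is embedded in the TensorFlow graph as TF operators and the graph is optimized, which gives higher performance. bypass: FG is not used and only the TensorFlow model is deployed. Suitable for scenarios with custom feature processing. In this mode, the Item Feature Cache parameters and the parameters for Processor access to FeatureStore do not need to be configured.	"fg_mode": "tf"
outputs	Yes	Names of the output variables predicted by the TF model, such as probs_ctr. Separate multiple names with commas. If you do not know the output variable names, run TensorFlow's saved_model_cli command to view them.	"outputs":"probs_ctr,probs_cvr"
save_req	No	Whether to save the request data as files in the model directory. The saved files can be used for warmup and performance testing. Valid values: true: yes. false (default): no. Setting this to false is recommended in production; otherwise performance is affected.	"save_req": "false"
Item Feature Cache parameters
period	Yes	Interval at which the item feature cache is periodically refreshed, in minutes. If item features are updated daily, a value greater than one day is usually enough (for example 2880; one day is 1440 minutes, so 2880 means two days). Features then do not need to be refreshed within the day, because the daily routine service update also refreshes them.	"period": 2880
remote_type	Yes	Item feature data source. Currently supported: hologres: reads and writes data through the SQL interface; suitable for storing and querying massive data. none: the item feature cache is not used and item features are passed in the request; in this case tables should be set to [].	"remote_type": "hologres"
tables	No	Item feature tables. Required when remote_type is hologres. Contains the following parameters: key: required, the item_id column name. name: required, the feature table name. value: optional, the column names to load; separate multiple column names with commas (,). condition: optional, a WHERE clause for filtering items, for example style_id<10000. timekey: optional, used for incremental item updates; specifies the timestamp or integer value of the update. Supported formats: timestamp and int. static: optional, marks static features that do not need periodic updates. Input item data can be read from multiple tables, configured as: "tables": [{"key":"table1", ...},{"key":"table2", ...}] If several tables have duplicate columns, later tables override earlier ones.	"tables": { "key": "goods_id", "name": "public.ali_rec_item_feature" }
url	No	Hologres access address.	"url": "postgresql://LTAIXXXXX:J6geXXXXXX@hgprecn-cn-xxxxx-cn-hangzhou-vpc.hologres.aliyuncs.com:80/bigdata_rec"
Parameters for Processor access to FeatureStore
fs_project	No	FeatureStore project name. Required when FeatureStore is used. For FeatureStore documentation, see Configure a FeatureStore project.	"fs_project": "fs_demo"
fs_model	No	FeatureStore model feature name.	"fs_model": "fs_rank_v1"
fs_entity	No	FeatureStore entity name.	"fs_entity": "item"
region	No	Region where the FeatureStore product is located.	"region": "cn-beijing"
access_key_id	No	access_key_id of the FeatureStore product.	"access_key_id": "xxxxx"
access_key_secret	No	access_key_secret of the FeatureStore product.	"access_key_secret": "xxxxx"
load_feature_from_offlinestore	No	Whether offline features are read directly from the FeatureStore OfflineStore. Valid values: True: yes, data is read from the FeatureStore OfflineStore. False (default): no, data is read from the FeatureStore OnlineStore.	"load_feature_from_offlinestore": True
input_tile: parameters for automatic feature expansion
INPUT_TILE	No	Supports automatic feature broadcast. For a feature whose value is the same throughout a request (for example user_id), only one value needs to be passed. Benefits: smaller requests, less network transfer time, and less computation time. To enable it, set the INPUT_TILE environment variable to 2.	"processor_envs": [ { "name": "INPUT_TILE", "value": "2" } ]

Note:
easyrec-1.3 and later versions support this optimization.
When fg_mode=tf, this optimization is enabled automatically and the environment variable does not need to be set separately.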
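To make the INPUT_TILE behavior concrete, the sketch below shows conceptually what broadcasting a single-valued feature across a request means. It is only an illustration in plain Python, not the processor's actual implementation. The function name tile_feeds and the dict-based request representation are assumptions made for this example.

```python
def tile_feeds(feeds, batch_size):
    """Expand length-1 features to batch_size; keep full-length features as-is."""
    tiled = {}
    for name, values in feeds.items():
        if len(values) == 1:
            # Same value for every item in the request: repeat it.
            tiled[name] = list(values) * batch_size
        elif len(values) == batch_size:
            tiled[name] = list(values)
        else:
            raise ValueError(
                "feature %r has %d values, expected 1 or %d"
                % (name, len(values), batch_size)
            )
    return tiled


# user_id and age are sent once; item_id carries one value per candidate item.
request_feeds = {
    "user_id": ["u0001"],
    "age": [18.0],
    "item_id": ["i0001", "i0002", "i0003"],
}
print(tile_feeds(request_feeds, batch_size=3))
```

With the optimization enabled, the client sends the compact form (request_feeds above) and the expansion happens on the server side, which is where the savings in request size and transfer time come from.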

Step 3: Deploy the service

Use the eascmd tool to deploy the service configuration file created in the previous step.

# Run the deployment command.
eascmd create echo.json
# eascmd -i <AccessKeyID> -k <AccessKeySecret> -e <endpoint> create echo.json
# Run the update command.
eascmd modify ali_rec_rnk_no_fg -s echo.json

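Check the output log to confirm that the service was deployed successfully. After a successful deployment, you get the access address of the service.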
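If deployment is driven from a script, the eascmd invocations above can be assembled as argument lists and handed to a process runner. The sketch below only builds and prints the commands, using the same subcommands and flags shown above (create, modify, -s, -i, -k, -e). The helper names eascmd_create and eascmd_modify are hypothetical.

```python
import shlex


def eascmd_create(config_path, access_key_id=None, access_key_secret=None,
                  endpoint=None):
    """Build the argv for creating a service from a configuration file."""
    argv = ["eascmd"]
    if access_key_id and access_key_secret:
        argv += ["-i", access_key_id, "-k", access_key_secret]
    if endpoint:
        argv += ["-e", endpoint]
    return argv + ["create", config_path]


def eascmd_modify(service_name, config_path):
    """Build the argv for updating an existing service."""
    return ["eascmd", "modify", service_name, "-s", config_path]


def render(argv):
    """Render an argv list as a shell-safe command string."""
    return " ".join(shlex.quote(arg) for arg in argv)


print(render(eascmd_create("echo.json")))
print(render(eascmd_modify("ali_rec_rnk_no_fg", "echo.json")))
```

To execute a command, pass the list to subprocess.run with check=True on a machine where eascmd is installed and configured.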

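Step 4: Call the service

Calling the EasyRec model service

In bypass mode, call the service from Java or Python, following the EasyRec Processor request format.

Java example

For Maven configuration, see the Java SDK documentation. Sample code for calling the ali_rec_rnk_no_fg service: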

import java.util.List;

import com.aliyun.openservices.eas.predict.http.PredictClient;
import com.aliyun.openservices.eas.predict.http.HttpConfig;
import com.aliyun.openservices.eas.predict.request.TFDataType;
import com.aliyun.openservices.eas.predict.request.TFRequest;
import com.aliyun.openservices.eas.predict.response.TFResponse;

public class TestEasyRec {
    public static TFRequest buildPredictRequest() {
        TFRequest request = new TFRequest();
 
        // The shape must match the number of values passed (3 here).
        request.addFeed("user_id", TFDataType.DT_STRING,
                        new long[]{3}, new String[]{"u0001", "u0001", "u0001"});
        request.addFeed("age", TFDataType.DT_FLOAT,
                        new long[]{3}, new float[]{18.0f, 18.0f, 18.0f});
        // Note: if INPUT_TILE=2 is set, features whose values are all the same
        // only need to be passed once:
        //    request.addFeed("user_id", TFDataType.DT_STRING,
        //            new long[]{1}, new String[]{"u0001"});
        //    request.addFeed("age", TFDataType.DT_FLOAT,
        //            new long[]{1}, new float[]{18.0f});
        request.addFeed("item_id", TFDataType.DT_STRING,
                        new long[]{3}, new String[]{"i0001", "i0002", "i0003"});
        request.addFetch("probs");
        return request;
    }

    public static void main(String[] args) throws Exception {
        PredictClient client = new PredictClient(new HttpConfig());

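        // To use the direct connection feature, call setDirectEndpoint, for example:
        //   client.setDirectEndpoint("pai-eas-vpc.cn-shanghai.aliyuncs.com");
        // Direct connection must be enabled in the EAS console, where you provide
        // the source vSwitch used to access the EAS service.
        // Direct connection offers better stability and performance.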
        client.setEndpoint("xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com");
        client.setModelName("ali_rec_rnk_no_fg");
        client.setToken("");
        long startTime = System.currentTimeMillis();
        for (int i = 0; i < 100; i++) {
            try {
                TFResponse response = client.predict(buildPredictRequest());
                // probs is the name of the model's output field. You can use curl to view the model's inputs and outputs:
                //   curl xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com -H "Authorization:{token}"
                List<Float> result = response.getFloatVals("probs");
                System.out.print("Predict Result: [");
                for (int j = 0; j < result.size(); j++) {
                    System.out.print(result.get(j).floatValue());
                    if (j != result.size() - 1) {
                        System.out.print(", ");
                    }
                }
                System.out.print("]\n");
            } catch (Exception e) {
                e.printStackTrace();
            }
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Spend Time: " + (endTime - startTime) + "ms");
        client.shutdown();
    }
}

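Python example

See the Python SDK documentation. Because Python performance is relatively poor, use it only for debugging the service; use the Java SDK in production. Sample code for calling the ali_rec_rnk_no_fg service: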

#!/usr/bin/env python

from eas_prediction import PredictClient
# TFRequest is the request wrapper exported by the eas_prediction package.
from eas_prediction import TFRequest

if __name__ == '__main__':
    client = PredictClient('http://xxxxxxx.vpc.cn-hangzhou.pai-eas.aliyuncs.com', 'ali_rec_rnk_no_fg')
    client.set_token('')
    client.init()

    req = TFRequest()
    req.add_feed('user_id', [3], TFRequest.DT_STRING, ['u0001'] * 3)
    req.add_feed('age', [3], TFRequest.DT_FLOAT, [18.0] * 3)
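    # Note: with the INPUT_TILE=2 optimization enabled, the features above only need one value each
    #   req.add_feed('user_id', [1], TFRequest.DT_STRING, ['u0001'])
    #   req.add_feed('age', [1], TFRequest.DT_FLOAT, [18.0])
    # The shape must match the number of values passed (3 here).
    req.add_feed('item_id', [3], TFRequest.DT_STRING,
        ['i0001', 'i0002', 'i0003'])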
    for x in range(0, 100):
        resp = client.predict(req)
        print(resp)

You can also build the service request yourself. For details, see the request format documentation.
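Whichever way a request is built, a common mistake is declaring a shape that does not match the number of values supplied. A small client-side check can catch this before the request is sent. The sketch below is an assumption-labeled helper in plain Python; check_feed is a hypothetical name and not part of the EAS SDK.

```python
from functools import reduce
from operator import mul


def check_feed(name, shape, values):
    """Raise ValueError if the declared shape does not match len(values)."""
    expected = reduce(mul, shape, 1)  # product of all dimensions
    if expected != len(values):
        raise ValueError(
            "feed %r: shape %s implies %d values, got %d"
            % (name, list(shape), expected, len(values))
        )
    return True


items = ["i0001", "i0002", "i0003"]
print(check_feed("item_id", [3], items))  # shape and values agree

try:
    check_feed("item_id", [5], items)  # declared 5 values, supplied 3
except ValueError as err:
    print("rejected:", err)
```

Calling such a check right before each add_feed call keeps shape errors on the client, where they are easier to diagnose than a failed prediction.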

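Step 5: Monitor and optimize

After the service is deployed, run performance tests and tune the service's performance and stability based on the results.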
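One simple way to start a performance test is to time repeated calls from the client and look at the latency distribution rather than only the total time. The sketch below is a minimal, SDK-independent illustration. time_calls and summarize_latencies are hypothetical helper names, and the stand-in workload would be replaced by a real prediction call such as client.predict(req).

```python
import statistics
import time


def time_calls(call, n):
    """Invoke call() n times and return per-call latencies in milliseconds."""
    latencies = []
    for _ in range(n):
        start = time.perf_counter()
        call()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies


def summarize_latencies(latencies_ms):
    """Return count, mean, and nearest-rank p50/p99 of the given latencies."""
    ordered = sorted(latencies_ms)
    n = len(ordered)

    def percentile(p):
        # Nearest-rank percentile using integer ceiling division.
        rank = max(1, (p * n + 99) // 100)
        return ordered[rank - 1]

    return {
        "count": n,
        "mean_ms": statistics.mean(ordered),
        "p50_ms": percentile(50),
        "p99_ms": percentile(99),
    }


# Stand-in workload; replace the lambda with a real prediction call.
stats = summarize_latencies(time_calls(lambda: sum(range(1000)), 50))
print(stats)
```

Client-side numbers include network time, so they are best read alongside the service-side metrics when deciding what to tune.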

Conclusion

By following the steps above, you can use EAS for model inference without training the model in EasyRec.