ALS Matrix Factorization

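The Alternating Least Squares (ALS) algorithm factorizes a sparse matrix and estimates the values of its missing entries, which yields a basic trained model. Within the taxonomy of collaborative filtering (CF), ALS is a User-Item CF algorithm: it takes both users and items into account, and is also called hybrid CF.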
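The alternating scheme can be illustrated with a minimal NumPy sketch. It is not the implementation behind this component. The function name `als_factorize`, the zero-based user and item indices, the plain ridge penalty, and the random initialization are all assumptions made for illustration; only the default values (10 factors, 10 iterations, regularization 0.1) are taken from the parameter table of this page.

```python
import numpy as np


def als_factorize(ratings, n_users, n_items, rank=10, iterations=10, reg=0.1, seed=0):
    """Factorize observed (user, item, rating) triples into U and V.

    U has shape (n_users, rank), V has shape (n_items, rank), and the
    estimate for any (user, item) pair, observed or missing, is U[user] @ V[item].
    """
    rng = np.random.default_rng(seed)
    U = rng.random((n_users, rank)) * 0.1
    V = rng.random((n_items, rank)) * 0.1

    # Index the sparse observations by user and by item.
    by_user = [[] for _ in range(n_users)]
    by_item = [[] for _ in range(n_items)]
    for u, i, r in ratings:
        by_user[u].append((i, r))
        by_item[i].append((u, r))

    def solve_side(target, fixed, groups):
        # With `fixed` held constant, each row of `target` is the solution
        # of an independent ridge-regularized least-squares problem.
        for idx, obs in enumerate(groups):
            if not obs:
                continue  # no observations: keep the current row
            cols = [j for j, _ in obs]
            y = np.array([r for _, r in obs], dtype=float)
            F = fixed[cols]                       # shape (n_obs, rank)
            A = F.T @ F + reg * np.eye(rank)
            target[idx] = np.linalg.solve(A, F.T @ y)

    for _ in range(iterations):
        solve_side(U, V, by_user)   # fix item factors, update user factors
        solve_side(V, U, by_item)   # fix user factors, update item factors
    return U, V


# Six observed entries of a 3 x 3 rating matrix; the other three are missing.
observed = [(0, 0, 1.0), (0, 1, 2.0), (1, 0, 2.0),
            (1, 2, 2.0), (2, 1, 6.0), (2, 2, 3.0)]
U, V = als_factorize(observed, n_users=3, n_items=3, rank=2, iterations=50, reg=0.01)
print(np.round(U @ V.T, 2))  # dense estimate, including the missing entries
```

Each half-step has a closed-form solution because the objective is quadratic in one factor matrix once the other is fixed, which is why the method alternates instead of optimizing both sides at once.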

Limitations

The supported computing resources are MaxCompute and Flink.

Configure component parameters in the visual interface

  • Input ports

    | Input port (left to right) | Data type | Recommended upstream component | Required |
    |---|---|---|---|
    | Data | | | |

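  • Component parameters

    | Tab | Parameter | Description |
    |---|---|---|
    | Field settings | user column name | The name of the user ID column in the input data source. The column must be of type BIGINT. |
    | Field settings | item column name | The name of the item column in the input data source. The column must be of type BIGINT. |
    | Field settings | Rating column name | The name of the column in the input data source that holds each user's rating of an item. The column must be numeric. |
    | Parameter settings | Number of factors | Default: 10. Value range: (0,+∞). |
    | Parameter settings | Number of iterations | Default: 10. Value range: (0,+∞). |
    | Parameter settings | Regularization coefficient | Default: 0.1. Value range: (0,+∞). |
    | Parameter settings | Checkbox | Specifies whether to use the implicit preference model. |
    | Parameter settings | Implicit preference coefficient | Default: 40. Value range: (0,+∞). |
    | Parameter settings | Output table lifecycle | The lifecycle of the output model table, in days. |
    | Execution tuning | Number of nodes | Value range: 1 to 9999. |
    | Execution tuning | Memory size per node | Value range: 1024 MB to 64 × 1024 MB (65536 MB). |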

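  • Output ports

    | Output port (left to right) | Data type | Downstream component |
    |---|---|---|
    | user factor table | | ALS scoring |
    | item factor table | | ALS scoring |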
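The value ranges in the component parameters table can be checked before a job is submitted. The following is a minimal sketch of such a check. The `AlsParams` class and all of its field names are hypothetical and are not part of any PAI SDK. Treating the number of factors and the number of iterations as integers is also an assumption, since the table only states the range (0,+∞).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AlsParams:
    """Hypothetical container mirroring the component parameters table."""
    user_col: str
    item_col: str
    rating_col: str
    num_factors: int = 10
    num_iterations: int = 10
    regularization: float = 0.1
    implicit_prefs: bool = False
    alpha: float = 40.0                        # implicit preference coefficient
    lifecycle_days: Optional[int] = None       # no range is documented
    num_nodes: Optional[int] = None
    memory_mb_per_node: Optional[int] = None

    def validate(self) -> List[str]:
        """Return one message per value outside its documented range."""
        errors = []
        if self.num_factors <= 0:
            errors.append("num_factors must be in (0, +inf)")
        if self.num_iterations <= 0:
            errors.append("num_iterations must be in (0, +inf)")
        if self.regularization <= 0:
            errors.append("regularization must be in (0, +inf)")
        if self.alpha <= 0:
            errors.append("alpha must be in (0, +inf)")
        if self.num_nodes is not None and not 1 <= self.num_nodes <= 9999:
            errors.append("num_nodes must be in 1..9999")
        if (self.memory_mb_per_node is not None
                and not 1024 <= self.memory_mb_per_node <= 64 * 1024):
            errors.append("memory_mb_per_node must be in 1024..65536 MB")
        return errors


params = AlsParams(user_col="user_id", item_col="item_id", rating_col="rating",
                   num_nodes=4, memory_mb_per_node=4096)
print(params.validate())
```

Returning a list of messages rather than raising on the first problem lets a caller report every out-of-range value in one pass.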

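Example

For example, if you use the following data as input to the ALS algorithm template, you obtain the user factors and item factors shown in the output tables: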

  • Input data source

    | user_id | item_id | rating |
    |---|---|---|
    | 10944750 | 13451 | 0 |
    | 10944751 | 13452 | 1 |
    | 10944752 | 13453 | 2 |
    | 10944753 | 13454 | 2 |
    | 10944754 | 13455 | 4 |
    | ... | ... | ... |

  • Output user factor table

    | user_id | factors |
    |---|---|
    | 8528750 | [0.026986524,0.03350178,0.03532385,0.019542359,0.020429865,0.02046867,0.022253247,0.027391396,0.018985065,0.04889483] |
    | 282500 | [0.116156064,0.07193632,0.090851225,0.017075706,0.025412979,0.047022138,0.12534861,0.05869226,0.11170533,0.1640192] |
    | 4895250 | [0.038429666,0.061858658,0.04236993,0.055866677,0.031814687,0.0417443,0.012085311,0.0379342,0.10767074,0.028392972] |
    | ... | ... |

  • Output item factor table

    | item_id | factors |
    |---|---|
    | 24601 | [0.0063337763,0.026349949,0.0064828005,0.01734504,0.022049638,0.0059205987,0.008568814,0.0015981696,0.0,0.013601779] |
    | 26699 | [0.0027524426,0.0043066847,0.0031336215,0.00269448,0.0022347474,0.0020477585,0.0027995422,0.0025390312,0.0033011117,0.003957773] |
    | 20751 | [0.03902271,0.050952066,0.032981463,0.03862796,0.048720762,0.027976315,0.02721664,0.018149626,0.0149896275,0.026251089] |
    | ... | ... |
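In a matrix factorization model, the estimated rating for a user-item pair is the inner product of the two factor vectors. The sketch below applies that to one row from each output table above. It assumes the `factors` column is available as a bracketed, comma-separated string, and that the downstream scoring step is a plain dot product; neither is confirmed by this page. The helper names `parse_factors` and `predict_score` are hypothetical.

```python
import json
from typing import List


def parse_factors(text: str) -> List[float]:
    """Parse a factors cell such as "[0.1,0.2]" into a list of floats."""
    return [float(x) for x in json.loads(text)]


def predict_score(user_factors: List[float], item_factors: List[float]) -> float:
    """Return the inner product of a user factor vector and an item factor vector."""
    if len(user_factors) != len(item_factors):
        raise ValueError("factor vectors must have the same length")
    return sum(u * v for u, v in zip(user_factors, item_factors))


# Sample rows copied from the output tables (user 8528750, item 24601).
user_8528750 = parse_factors(
    "[0.026986524,0.03350178,0.03532385,0.019542359,0.020429865,"
    "0.02046867,0.022253247,0.027391396,0.018985065,0.04889483]")
item_24601 = parse_factors(
    "[0.0063337763,0.026349949,0.0064828005,0.01734504,0.022049638,"
    "0.0059205987,0.008568814,0.0015981696,0.0,0.013601779]")

score = predict_score(user_8528750, item_24601)
print(score)
```

Both vectors have 10 components, which matches the default number of factors.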