Vector retrieval evaluation

The Vector-based Recall Evaluation component measures the accuracy of vector retrieval systems by calculating hit rate across high-dimensional embedding spaces. Use it to evaluate u2i (user-to-item) and i2i (item-to-item) recalls in recommendation systems and information retrieval pipelines.

How it works

The component supports two recall types:

  • u2i: Uses user vectors to recall the top K items most similar to a given user.

  • i2i: Uses item vectors to recall the top K items most similar to a given item.

Hit rate is calculated per trigger. Given a trigger (a user vector for u2i, an item vector for i2i) and its set M of relevant items from the true sequence table, the component retrieves the top K items most similar to the trigger. Let N be the set of retrieved items that also appear in M. The hit rate equals |N| / |M|.

The component also outputs the recalled items that fall outside M, along with their distance values, to support bad case analysis.
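The per-trigger calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the component's implementation: the function name `hit_rate_for_trigger`, its arguments, and the handling of an empty relevant set are assumptions made for the example.

```python
def hit_rate_for_trigger(topk_ids, topk_dists, relevant_ids):
    """Return (hitrate, bad_ids, bad_dists) for one trigger.

    topk_ids / topk_dists: the K recalled item IDs and their distance values.
    relevant_ids: the ground-truth set M for this trigger.
    """
    relevant = set(relevant_ids)
    if not relevant:
        # Assumption: a trigger with no relevant items scores 0.0.
        return 0.0, list(topk_ids), list(topk_dists)

    # N: recalled items that are also in M.
    hits = {item for item in topk_ids if item in relevant}

    # Bad cases: recalled items outside M, kept with their distances.
    bad = [(i, d) for i, d in zip(topk_ids, topk_dists) if i not in relevant]
    bad_ids = [i for i, _ in bad]
    bad_dists = [d for _, d in bad]

    return len(hits) / len(relevant), bad_ids, bad_dists


rate, bad_ids, bad_dists = hit_rate_for_trigger(
    topk_ids=[11, 22, 33, 44, 55],
    topk_dists=[0.9, 0.8, 0.7, 0.6, 0.5],
    relevant_ids=[22, 44, 66, 77],
)
print(rate, bad_ids, bad_dists)
```

Here two of the four relevant items are recalled, so the hit rate is 0.5, and the three remaining recalled items are reported as bad cases.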

The component runs in standalone or distributed mode and evaluates recall in four steps:

  1. All workers load the user or item embedding table and build the indexes required by k-nearest neighbor (KNN) search.

  2. Workers look up the true sequence table in batches and return the top K nearest neighbors.

  3. The component computes hit rate by comparing the sequence values in the true sequence table against the top K results.

  4. Results are aggregated and written to MaxCompute tables.
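The four steps can be approximated on a single machine as follows. This sketch makes several assumptions that the text above does not confirm: it uses an exact brute-force search in place of the component's KNN index, it uses inner products as the similarity measure, it computes the overall hit rate as the mean of the per-trigger values, and it assumes every trigger has at least one relevant item. All function and variable names are hypothetical.

```python
import heapq


def inner_product(a, b):
    return sum(x * y for x, y in zip(a, b))


def top_k(query, item_emb, k):
    """Exact top-K by inner product (a stand-in for a real KNN index)."""
    scored = ((inner_product(query, vec), item_id)
              for item_id, vec in item_emb.items())
    return [item_id for _, item_id in heapq.nlargest(k, scored)]


def evaluate(user_emb, item_emb, true_seq, k, batch_size=2):
    """Return (per-trigger hit rates, overall hit rate) for u2i recall."""
    details = {}
    triggers = list(true_seq)
    # Step 2: look up the true sequence table in batches.
    for start in range(0, len(triggers), batch_size):
        for trigger in triggers[start:start + batch_size]:
            recalled = top_k(user_emb[trigger], item_emb, k)
            relevant = set(true_seq[trigger])
            # Step 3: compare the ground truth against the top K results.
            details[trigger] = len(relevant.intersection(recalled)) / len(relevant)
    # Step 4 (assumption): aggregate as the mean of per-trigger hit rates.
    total = sum(details.values()) / len(details)
    return details, total


# Step 1: load the embedding tables (tiny in-memory stand-ins here).
item_emb = {1: [1.0, 0.0], 2: [0.0, 1.0], 3: [0.7, 0.7], 4: [-1.0, 0.0]}
user_emb = {100: [1.0, 0.0], 200: [0.0, 1.0]}
true_seq = {100: [1, 4], 200: [2, 3]}

details, total = evaluate(user_emb, item_emb, true_seq, k=2)
print(details, total)
```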

Important

When evaluating recall precision, the embedding table and the true sequence table must come from different points in time. Collect embedding data at time T and true sequence data at time T+1. If both tables come from the same point in time, the measured hit rate is inflated.

Input

Item embedding table

Stores item vectors, typically generated by a training algorithm such as GraphSAGE.

| Column | Type | Example |
| --- | --- | --- |
| item id | bigint | 23456677 |
| item embeddings | string | 0.1,0.2,0.3,... |

User embedding table

Stores user vectors, typically generated by a training algorithm such as GraphSAGE. Required only for u2i recalls.

| Column | Type | Example |
| --- | --- | --- |
| user id | bigint | 12345 |
| user embeddings | string | 0.1,0.2,0.3,... |

True sequence table

The ground truth table that maps each trigger to its set of relevant items. For u2i recalls, the trigger id column maps to the user id column. For i2i recalls, it maps to the item id column.

| Column | Type | Example |
| --- | --- | --- |
| trigger id | bigint | 12345 |
| item ids | string | 23456677,2233445,6837292,... |
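The string columns in the input tables are comma-separated lists, so a quick local sanity check before running the component might look like the sketch below. The helper names `parse_embedding` and `parse_true_seq` are hypothetical, and the check that the vector length equals `emb_dim` is an assumption about what constitutes valid input.

```python
def parse_embedding(row_id, emb_str, emb_dim):
    """Parse an embedding row such as (23456677, "0.1,0.2,0.3")."""
    vec = [float(x) for x in emb_str.split(",")]
    if len(vec) != emb_dim:
        raise ValueError(
            f"id {row_id}: expected {emb_dim} values, got {len(vec)}"
        )
    return row_id, vec


def parse_true_seq(trigger_id, item_ids_str):
    """Parse a true sequence row such as (12345, "23456677,2233445")."""
    item_ids = [int(x) for x in item_ids_str.split(",") if x.strip()]
    return trigger_id, item_ids


print(parse_embedding(23456677, "0.1,0.2,0.3", emb_dim=3))
print(parse_true_seq(12345, "23456677,2233445,6837292"))
```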

Output

total_hitrate table

Contains the overall hit rate across all triggers.

| Column | Type | Example |
| --- | --- | --- |
| hitrate | double | 0.4 |

hitrate_details table

Contains one row per trigger, matching the row count of the true sequence table.

| Column | Type | Description |
| --- | --- | --- |
| id | bigint | For u2i recalls: user_id. For i2i recalls: item_id. |
| topk_ids | string | IDs of the top K recalled items, comma-separated. |
| topk_dists | string | Distance values corresponding to each item in topk_ids. |
| hitrate | double | Hit rate for this trigger. |
| bad_ids | string | Recalled items that are not in the relevant set M. |
| bad_dists | string | Distance values corresponding to each item in bad_ids. |
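For bad case analysis, it can help to pair each entry in bad_ids with its distance value. The sketch below assumes that bad_ids and bad_dists are comma-separated in the same way as topk_ids, which the table does not state explicitly. The sample row and the helper name `bad_cases` are made up for illustration.

```python
def bad_cases(row):
    """Return [(item_id, distance), ...] for one hitrate_details row."""
    ids = [int(x) for x in row["bad_ids"].split(",") if x]
    dists = [float(x) for x in row["bad_dists"].split(",") if x]
    if len(ids) != len(dists):
        raise ValueError("bad_ids and bad_dists have different lengths")
    return list(zip(ids, dists))


# Hypothetical row; real rows come from the hitrate_details output table.
sample_row = {
    "id": 12345,
    "topk_ids": "11,22,33",
    "topk_dists": "0.9,0.8,0.7",
    "hitrate": 0.5,
    "bad_ids": "11,33",
    "bad_dists": "0.9,0.7",
}
print(bad_cases(sample_row))
```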

Component parameters

| Group | Parameter | Type | Description |
| --- | --- | --- | --- |
| Input | item_emb_table | string | The item embedding table. |
| Input | true_seq_table | string | The ground truth table. For u2i recalls: users and user-relevant items. For i2i recalls: items and item-relevant items. |
| Input | user_emb_table | string (optional) | The user embedding table. Required only for u2i recalls. |
| Output | total_hitrate | string | Output table for total hit rate values. |
| Output | hitrate_details | string | Output table for per-trigger hit rate details. |
| Parameters | recall_type | string | Recall type: u2i or i2i. |
| Parameters | emb_dim | int | Embedding dimension of the embedding table. |
| Parameters | k | int | Number of items to recall (top K). |
| Parameters | metric | int (optional, default: 1) | Similarity metric. 0 uses L2 distance and returns the top K items with the shortest distance. 1 uses inner products and returns the top K items with the greatest inner product values. |
| Parameters | strict | bool (optional, default: False) | When True, computes similarity without approximation. This eliminates minor deviations in the hit rate calculation but significantly increases computation time. |
| Parameters | lifecycle | int (optional, default: 7) | Retention period for output tables, in days. |
| Tuning | batch_size | int (optional, default: 1024) | Number of samples processed per batch. Reduce this value if workers run out of memory. |
| Tuning | worker_count | int (optional, default: 1) | Number of workers. Increase this value for large input tables or when a single worker is not fast enough. |
| Tuning | worker_memory | int (optional, default: 20000) | Memory allocated to each worker, in MB. |
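The two metric values rank candidates differently, and the choice can change which items are recalled. The following sketch illustrates the ordering rule with an exact search over three toy vectors. The function `rank` and the data are hypothetical, and the code does not reflect how the component builds or queries its index.

```python
import math


def inner(a, b):
    return sum(x * y for x, y in zip(a, b))


def rank(query, items, k, metric=1):
    """Return the top K item IDs under the given metric value."""
    if metric == 0:
        # L2 distance: shortest distance first.
        ordered = sorted(items, key=lambda i: math.dist(query, items[i]))
    elif metric == 1:
        # Inner product: greatest value first.
        ordered = sorted(items, key=lambda i: inner(query, items[i]),
                         reverse=True)
    else:
        raise ValueError("metric must be 0 or 1")
    return ordered[:k]


items = {1: [1.0, 0.0], 2: [3.0, 0.0], 3: [0.0, 1.0]}
query = [1.0, 0.0]

print(rank(query, items, k=2, metric=0))  # nearest by L2 distance
print(rank(query, items, k=2, metric=1))  # largest inner product
```

With this data, item 2 has a long vector pointing in the same direction as the query, so it ranks first under inner product but is not among the two nearest items under L2 distance.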

Sample command

The following command evaluates u2i recall hit rate for the top 5 recalled items over 10-dimensional embeddings. It uses inner products as the similarity metric and leaves strict mode disabled. A single worker with 20,000 MB of memory processes 1,024 samples per batch, and the output tables are retained for 7 days.

pai -name hitrate_gl_ext
    -Ditem_emb_table='item_emb_table'
    -Duser_emb_table='user_emb_table'
    -Dtrue_seq_table='true_seq_table'
    -Dhitrate_details='hitrate_details'
    -Dtotal_hitrate='total_hitrate'
    -Drecall_type='u2i'
    -Dk=5
    -Demb_dim=10
    -Dmetric=1
    -Dstrict=False
    -Dbatch_size=1024
    -Dworker_count=1
    -Dworker_memory=20000
    -Dlifecycle=7;
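To generate the same command for other configurations, a small helper can assemble the text from a parameter dictionary. The sketch below builds an i2i variant, assuming that an i2i run only needs recall_type set to 'i2i' and can omit user_emb_table, since that table is required only for u2i recalls. The helper `build_hitrate_command` is hypothetical and only formats text; it does not submit anything.

```python
def build_hitrate_command(params):
    """Format a pai command in the same style as the sample above."""
    lines = ["pai -name hitrate_gl_ext"]
    for name, value in params.items():
        # String values are quoted; numbers and booleans are written bare.
        rendered = f"'{value}'" if isinstance(value, str) else str(value)
        lines.append(f"    -D{name}={rendered}")
    return "\n".join(lines) + ";"


i2i_params = {
    "item_emb_table": "item_emb_table",
    "true_seq_table": "true_seq_table",
    "hitrate_details": "hitrate_details",
    "total_hitrate": "total_hitrate",
    "recall_type": "i2i",
    "k": 5,
    "emb_dim": 10,
    "metric": 1,
    "strict": False,
    "lifecycle": 7,
}
command = build_hitrate_command(i2i_params)
print(command)
```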