Cold start solutions - Platform for AI (PAI) - Alibaba Cloud Help Center

Why cold start is a problem

Recommendation systems typically use collaborative filtering, matrix factorization, or deep learning models to generate candidate sets, all of which rely on a user-item interaction matrix. As new users and items are constantly added, the system lacks the historical interactions it needs to recommend accurately to new users or to surface new items to the right users. This is the cold start problem. Existing algorithms at the recall, coarse-ranking, and fine-ranking stages depend heavily on accumulated user behavior data, which is scarce for new entities. As a result, new items receive limited exposure, and the system cannot accurately model new users' interests.

For some businesses, promptly recommending new items and ensuring adequate exposure is vital for the platform ecosystem and long-term revenue. In the news industry, delayed exposure sharply diminishes a story's value. On user-generated content (UGC) platforms, poor exposure for new content discourages creators and reduces content quality. On dating platforms, insufficient attention for new users hinders growth and leads to stagnation.

The cold start problem is a critical challenge in recommendation systems. The following section describes common approaches to solving it.

How to solve the cold start problem

Cold start algorithms and strategies can be summarized by a four-word mnemonic: "Generalize, Fast, Transfer, Few".

Generalize: Map a new item to broader attributes or topics. For example, recommend a new product to users who liked items in the same category, moving up from "product" to "category." Recommend a new short video to the creator's followers, moving up from "short video" to "creator." Recommend a new article to users interested in the same topic, such as showing an article about the J-20 fighter jet to a military enthusiast, moving up from "news article" to "topic." This is essentially content-based recommendation. Generalizing to multiple higher-level concepts simultaneously often yields better results. For instance, a new product can be associated with its "category," "brand," "store," "style," and "color." Sometimes these concepts are inherent to the item—a merchant typically provides product attributes when listing. In other cases, they must be discovered, such as using an algorithm to classify an article as "military," "sports," or "beauty."

Another common technique matches user interests with items by calculating the distance or similarity between embedding vectors. Matrix factorization and deep learning models can generate these vectors, but conventional models still rely on interaction data and do not generalize well to cold start scenarios. Some models are specifically designed to generate embeddings for cold start entities, such as the one described in "In-Depth Analysis and Improvement of the DropoutNet Model for Cold Start Recommendations."
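
As a rough illustration of similarity-based matching, the sketch below ranks users for a new article by cosine similarity. It is a deliberate simplification and not the DropoutNet approach: the new item's vector is assumed to be the average of its attribute vectors, users are assumed to live in the same vector space, and every attribute name, user ID, and vector value is hypothetical toy data.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0


def cold_item_embedding(attributes, attribute_vectors):
    """Approximate a new item's vector by averaging its known attribute vectors."""
    known = [attribute_vectors[a] for a in attributes if a in attribute_vectors]
    if not known:
        raise ValueError("no known attributes for this item")
    return np.mean(known, axis=0)


def rank_users(item_vec, user_vectors, top_k=2):
    """Return the top_k (user_id, similarity) pairs, most similar first."""
    scored = [(uid, cosine_similarity(vec, item_vec)) for uid, vec in user_vectors.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:top_k]


# Hypothetical attribute and user vectors (toy values, assumed to share one space).
attribute_vectors = {
    "topic:military": np.array([1.0, 0.0, 0.0]),
    "topic:sports": np.array([0.0, 1.0, 0.0]),
}
user_vectors = {
    "u_sports_fan": np.array([0.1, 0.9, 0.0]),
    "u_military_fan": np.array([0.9, 0.1, 0.0]),
}

# A brand-new article with no interactions, described only by its topic.
new_article_vec = cold_item_embedding(["topic:military"], attribute_vectors)
ranked = rank_users(new_article_vec, user_vectors)
print(ranked)
```

In a production system the vectors would come from a trained model, and an approximate nearest-neighbor index would typically replace the exhaustive loop.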

Although generalization appears simple, it has considerable depth. It uses an item's content and attributes to compensate for the lack of historical interactions. For example, you can leverage multimodal information such as images or videos for related recommendations. On a dating platform, you could score a new user's photos for attractiveness and recommend them to users who prefer that level of attractiveness.
Fast: Speed is a decisive advantage in recommendation. Cold start items have no historical user interactions by definition, so a natural solution is to collect interaction data faster and feed it back into the system. Conventional models and data are often updated daily, whereas real-time processing enables minute-level or even second-level updates. These methods are typically based on reinforcement learning or contextual bandit algorithms. For more details, see "Implementation and Application of the Contextual Bandit Algorithm in Recommendation Systems."
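
One well-known contextual bandit is LinUCB. The sketch below is a minimal disjoint variant, offered only as an illustration and not necessarily the algorithm used in the referenced article. The class name, item IDs, context vector, and the `alpha` exploration weight are all assumptions made for this example.

```python
import numpy as np


class LinUCB:
    """Disjoint LinUCB: one ridge-regression model per item (arm)."""

    def __init__(self, n_features, alpha=1.0):
        self.n_features = n_features
        self.alpha = alpha  # exploration strength (assumed hyperparameter)
        self.A = {}         # per item: identity + sum of x x^T over observed contexts
        self.b = {}         # per item: sum of reward * x over observed contexts

    def add_item(self, item_id):
        """Register an item; a new item starts with no evidence at all."""
        self.A[item_id] = np.eye(self.n_features)
        self.b[item_id] = np.zeros(self.n_features)

    def score(self, item_id, x):
        """Estimated reward plus an uncertainty bonus that shrinks with data."""
        A_inv = np.linalg.inv(self.A[item_id])
        theta = A_inv @ self.b[item_id]
        return float(theta @ x + self.alpha * np.sqrt(x @ A_inv @ x))

    def select(self, x):
        """Pick the item with the highest upper confidence bound for context x."""
        return max(self.A, key=lambda item_id: self.score(item_id, x))

    def update(self, item_id, x, reward):
        """Fold one observed interaction into the item's model immediately."""
        self.A[item_id] += np.outer(x, x)
        self.b[item_id] += reward * x


bandit = LinUCB(n_features=2, alpha=1.0)
bandit.add_item("old_item")
bandit.add_item("new_item")

context = np.array([1.0, 0.0])                   # hypothetical user/context features
bandit.update("old_item", context, reward=0.0)   # old item was shown and ignored
chosen = bandit.select(context)                  # the unseen item keeps a larger bonus
print(chosen)
```

Because each update is a cheap rank-one change, feedback can be applied as soon as it arrives, which is what makes this family of methods suitable for minute-level or second-level refreshes.
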
Transfer: Transfer learning uses data from different scenarios to build models, transferring knowledge from a source domain to a target domain. For example, when launching a new service with limited data, you can use data from an established service to build the initial model. Another example is a cross-border e-commerce platform: if a new country site has limited interaction data, you can train a model on a mature site's data and fine-tune it with the new site's small sample to significantly improve cold start performance. For transfer learning to be effective, the source and target domains must be sufficiently related—for instance, the sites should ideally sell a large number of overlapping products.
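
The pretrain-then-fine-tune workflow can be sketched as follows. This is a toy example under stated assumptions: a logistic regression stands in for the recommendation model, both sites are simulated with synthetic data, and the weight vectors, sample sizes, learning rates, and step counts are hypothetical values chosen only for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def log_loss(w, X, y):
    """Mean binary cross-entropy of weights w on (X, y)."""
    p = np.clip(sigmoid(X @ w), 1e-12, 1 - 1e-12)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))


def train(X, y, w_init=None, lr=0.1, steps=200):
    """Plain gradient descent; starts from w_init when given, else from zeros."""
    w = np.zeros(X.shape[1]) if w_init is None else w_init.copy()
    for _ in range(steps):
        grad = X.T @ (sigmoid(X @ w) - y) / len(y)
        w -= lr * grad
    return w


rng = np.random.default_rng(0)


def sample(n, true_w):
    """Synthetic click data: features X and binary labels drawn from true_w."""
    X = rng.normal(size=(n, len(true_w)))
    y = (rng.random(n) < sigmoid(X @ true_w)).astype(float)
    return X, y


# Assumed ground truth: the new site behaves similarly, but not identically.
w_source_true = np.array([2.0, -1.0, 0.5])
w_target_true = w_source_true + np.array([0.3, 0.2, -0.2])

X_src, y_src = sample(5000, w_source_true)   # mature site: plenty of data
X_tgt, y_tgt = sample(30, w_target_true)     # new site: small sample

# Step 1: train on the mature site. Step 2: fine-tune gently on the new site.
w_pretrained = train(X_src, y_src, lr=0.5, steps=300)
w_finetuned = train(X_tgt, y_tgt, w_init=w_pretrained, lr=0.05, steps=20)

loss_before = log_loss(w_pretrained, X_tgt, y_tgt)
loss_after = log_loss(w_finetuned, X_tgt, y_tgt)
print(loss_before, loss_after)
```

The small learning rate and few steps in the second stage keep the fine-tuned weights close to the pretrained ones, which is a common way to avoid overfitting the new site's small sample. Whether this beats training from scratch depends on how related the two domains actually are.
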
Few: Few-shot learning trains models with only a small amount of labeled data. A typical few-shot method is meta-learning. For more information, see "A Cold Start Recommendation Model Based on Meta-Learning."