Model descriptions

Text classification offers several models. If you are unsure which one to choose, start with the CNN model, which offers a good balance between speed and accuracy. The following descriptions can help you choose the model that best fits your scenario.

  • FastText classification model: This model is fast and requires minimal computing resources. It is suitable for large datasets with many class labels and for tasks that do not require deep semantic understanding.

  • CNN classification model: The CNN model is suitable for more complex scenarios than the FastText model. It can capture a wider and more detailed range of text features, which makes it ideal for tasks that require some semantic understanding. The CNN model generally performs better than the FastText model, but it takes longer to train.

  • Self-Attention classification model: Like the CNN model, the Self-Attention model is suitable for more complex scenarios than the FastText model and can capture a wider range of features. It is better than the CNN model at capturing long-range dependencies in text. This model is ideal for tasks that require some semantic understanding. Its training time and accuracy are similar to those of the CNN model.

  • Long text classification fusion model [Recommended]: This ensemble learning model was developed by Alibaba DAMO Academy. It combines mechanisms such as CNN, FastText, and Self-Attention. The model is suitable for all types of text classification scenarios and is particularly effective for long-form text, such as news articles and novels. This model has a long training time.

  • Short text classification fusion model: This model was developed by Alibaba DAMO Academy specifically for short text classification. It is suitable for scenarios where the text is shorter than 150 characters, such as text messages, microblogs, and comments. The model integrates traditional machine learning models, including Naive Bayes, FastText, support vector machines, and random forests. Its primary advantage is a fast training speed.

  • BERT few-shot classification: This model was developed by Alibaba DAMO Academy for text classification tasks with small sample sizes. The model uses a BERT model that is pre-trained on a large amount of unlabeled text. It is suitable for scenarios with limited annotated data. This model has long training and prediction times.

  • StructBERT classification model [Recommended]: This model is part of the AliceMind model system developed by Alibaba DAMO Academy. The model uses a StructBERT model that is pre-trained on a large amount of unlabeled text. It offers high accuracy but has a slow inference speed.

  • StructBERT few-shot classification: This model is based on StructBERT-base and is trained on natural language inference tasks using the Chinese portion of the XNLI dataset, which was created by translating English data.

    Scenarios: This model is designed for text classification tasks, especially in low-resource scenarios. Such scenarios include multi-level classification (up to three levels), many labels, and few training samples. This model has a high time overhead.

    Typical input example: The model supports hierarchical classification with up to three levels. The field names for the three levels are fixed as "Level 1 Label", "Level 2 Label", and "Level 3 Label". Each level is a single-label classification.
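The three-level label format described for StructBERT few-shot classification can be illustrated with a minimal sketch. Only the three field names ("Level 1 Label", "Level 2 Label", "Level 3 Label") and the single-label rule come from the description above. The dict representation, the "text" key, the sample labels, the `validate_record` helper, and the rule that a deeper level requires every shallower level are assumptions made for illustration, not the platform's actual input schema.

```python
# Field names for the three label levels, as fixed by the documentation.
LEVEL_FIELDS = ("Level 1 Label", "Level 2 Label", "Level 3 Label")


def validate_record(record):
    """Return the label path of a record as a list, or raise ValueError.

    Assumptions (not from the documentation): records are dicts, the text
    is stored under a hypothetical "text" key, and a deeper level may only
    be set when every shallower level is also set.
    """
    text = record.get("text")
    if not isinstance(text, str) or not text.strip():
        raise ValueError("record needs a non-empty 'text' field")

    path = []
    gap_seen = False
    for field in LEVEL_FIELDS:
        value = record.get(field)
        if value is None or value == "":
            gap_seen = True
            continue
        if not isinstance(value, str):
            # Each level is single-label, so a list of labels is rejected.
            raise ValueError(f"{field!r} must be a single string label")
        if gap_seen:
            raise ValueError(f"{field!r} is set but a shallower level is missing")
        path.append(value)

    if not path:
        raise ValueError("at least 'Level 1 Label' is required")
    return path


# Hypothetical sample record with all three levels filled in.
sample = {
    "text": "The new phone's battery lasts two days on a single charge.",
    "Level 1 Label": "Electronics",
    "Level 2 Label": "Mobile phones",
    "Level 3 Label": "Battery life",
}
label_path = validate_record(sample)
print(" > ".join(label_path))
```

A record with fewer levels simply omits the deeper fields. Under the assumptions above, a record that assigns a list of labels to one level is rejected, because each level is a single-label classification.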