This topic provides the release notes for the NLP Self-learning Platform.
February 2023
Project Type | Feature Name | Feature Description | Release Date | References |
Entity extraction | Self-learning model | Added the UIE few-shot entity extraction model. | 2023-02-16 | |
Relation extraction | Self-learning model | Added the UIE few-shot relation extraction model. | 2023-02-16 | |
Text classification | Self-learning model | Added the StructBERT few-shot classification model. | 2023-02-10 |
September 2022
Project Type | Feature Name | Feature Description | Release Date | References |
Relation extraction | Self-learning model | Added the StructBERT-split model. | 2022-09-01 | |
Relation extraction | Self-learning model | Added the StructBERT-cascade model. | 2022-09-01 | |
Text summarization (generative) | Pre-trained model | A generative summarization model based on PALM 2.0. This model is suitable for generating summaries or article titles. | 2022-09-07 | |
Product description generation (Chinese) | Pre-trained model | Generates product descriptions related to selling points based on a given product and a set of selling point keywords. | 2022-09-20 | |
Weather report welcome message generation (Chinese) | Pre-trained model | Generates in-vehicle startup welcome messages based on given weather information fields. | 2022-09-28 |
July 2022
Project Type | Feature Name | Feature Description | Release Date | References |
Contract extraction | Self-learning model | Extracts entities from contract text. This model has more than 20 built-in entity labels that do not require annotation, reducing the data annotation cost for model training to less than 20% of the original cost. | 2022-07-08 | |
Judicial documents (fact-finding) | Self-learning model | Extracts fact-finding entities from judicial documents. This model has more than 10 built-in entity labels that do not require annotation, reducing the data annotation cost for model training to less than 50% of the original cost. | 2022-07-08 |
June 2022
Project Type | Feature Name | Feature Description | Release Date | References |
Product review analysis - Incremental training | Self-learning model | Add custom labels to the pre-trained models for product review analysis in the e-commerce, automotive, and local life domains. Train the model with only the new labels to obtain a complete analysis model. | 2022-06-17 |
May 2022
Project Type | Feature Name | Feature Description | Release Date | References |
Caller and callee intent recognition in telemarketing scenarios | Pre-trained model | This model is suitable for outbound telemarketing calls. It recognizes the caller's intent (such as marketing, notification, or debt collection) and the callee's intent (such as unavailability, sentiment, or willingness to communicate) from the conversation content. It can be used for voice quality inspection. | 2022-05-06 | Caller and callee intent recognition in telemarketing scenarios |
Entity extraction | Rules engine upgrade | After a model is published, you can add and modify rules without retraining the model. | 2022-05-06 |
March 2022
Project Type | Feature Name | Feature Description | Release Date | References |
Entity extraction | Self-learning model | StructBERT series models now have accelerated inference, with an average response time (RT) reduction of 45%. For an average text length of 2,000 characters, the average RT is about 3 seconds. | 2022-03-14 | |
Text classification | Self-learning model | Added a StructBERT implementation. | 2022-03-18 | |
Dialogue classification | Self-learning model | Added a high-accuracy version (StructBERT implementation). | 2022-03-18 |
December 2021
Project Type | Feature Name | Feature Description | Release Date | References |
Purchase decision analysis for product reviews - Automotive | Pre-trained model | Analyzes purchase decision information from user reviews, such as purchase motivation, usage scenarios, feature requirements, and questions. This helps improve products, enhance user experience, segment user profiles, and conduct targeted marketing. The model includes 25 types of labels. | 2021-12-09 | |
Product review analysis service - Automotive | Pre-trained model | An analysis service for reviews in the automotive domain. It includes 71 types of property labels. For more information, see the referenced document. | 2021-12-09 | |
Entity extraction | Self-learning model | Added the Chinese StructBERT-CRF model. This model is suitable for datasets where labels have strong dependencies. | 2021-12-03 |
November 2021
Project Type | Feature Name | Feature Description | Release Date | References |
Purchase decision analysis for product reviews - E-commerce | Pre-trained model | Analyzes purchase decision information from user reviews, such as purchase motivation, usage scenarios, feature requirements, and questions. This helps improve products, enhance user experience, segment user profiles, and conduct targeted marketing. | 2021-11-24 | |
Entity extraction | Self-learning model | Added Chinese StructBERT. This is a distilled model based on Alibaba's self-developed StructBERT. It is pre-trained on a large amount of unlabeled corpus and is suitable for Chinese tasks with insufficient labeled data. The model is optimized for entity overlap issues. | 2021-11-19 | |
My Models page | Console update | Added the My Models page. On this page, you can query published self-learning models, call pre-trained models, view the number of purchased models, check the remaining balance of your resource plans, extend the validity period of models, and increase or decrease the number of models. | 2021-11-19 | / |
October 2021
Project Type | Feature Name | Feature Description | Release Date | References |
Product review analysis - E-commerce | Pre-trained model upgrade | Added 6 industries: automotive supplies, festive supplies, 3C digital accessories, hardware tools, pets, and flowers & plants. Updated 6 existing industries by adding property categories. For more information, see the referenced document. | 2021-10-19 | |
Dialogue classification | Self-learning model | Classifies entire dialogue texts by content type. Common scenarios include dialogue quality inspection, customer intent recognition, and telemarketing lead mining. For more information, see the referenced document. | 2021-10-12 |
September 2021
Project Type | Feature Name | Feature Description | Release Date | References |
Document structuring - Key-value information extraction | Pre-trained model | Extracts information that follows a key-value pattern from documents. This model performs well on documents with clear key-value information patterns, such as resumes, contracts, and reports. For more information, see the referenced document. | 2021-09-23 | |
Contract element extraction - General | Pre-trained model | Extracts common elements from contracts. It supports 26 general element fields. For more information, see the referenced document. | 2021-09-07 | |
Contract element extraction | Self-learning model | Custom-developed for contract extraction scenarios (such as Party A, Party B, and date) to extract key elements or elements with specific meanings from contracts. For more information, see the referenced document. | 2021-09-01 |
August 2021
Project Type | Feature Name | Feature Description | Release Date | References |
Bidding and bid-winning information extraction - Premium Edition | Pre-trained model upgrade | The bid-winning model was upgraded to support 7 new fields, including bidding agency and project owner. The model now supports the extraction of 36 fields. For more information, see the referenced document. | 2021-08-01 | Bidding and bid-winning information extraction - Premium Edition service |
Product review analysis - E-commerce | Pre-trained model upgrade | Added 3 industries: audio and video appliances, kitchen appliances, and kitchen/cooking utensils. Added property categories for 7 industries. For more information, see the referenced document. | 2021-08-05 |
July 2021
Project Type | Feature Name | Feature Description | Release Date | References |
Product review analysis - High-accuracy version | Self-learning model | Based on BERT, this model has slower training and prediction speeds but higher accuracy. It requires fewer computing resources and is suitable for large training datasets. For more information, see the referenced document. | 2021-07-07 | |
Product review analysis - E-commerce | Pre-trained model upgrade | Added 9 industries: cleaning tools, personal care, home decorations, daily home supplies, home textiles, maternity supplies, storage and organization, tableware, and toys. Added property categories for 6 industries. For more information, see the referenced document. | 2021-07-12 | |
Sentence pair classification | Self-learning model | Classifies a pair of sentences by content. It supports both single-label and multi-label classification and includes high-accuracy and high-performance versions. Common scenarios include determining semantic equivalence between two sentences, matching questions with answers, and contextual single-sentence classification. For more information, see the referenced document. | 2021-07-16 | |
Annotation feature upgrade | Frontend experience optimization | Added buttons for navigating to the previous or next item, support for modifying items during annotation, clearer interactions, more intuitive data displays, and other experience optimizations. | 2021-07-28 | / |
June 2021
Project Type | Feature Name | Feature Description | Release Date | References |
Bidding and bid-winning notice type classification service | Pre-trained model | This service can be used as a pre-processing step for the Bidding Analysis Service (Premium Edition) and the Bid-winning Analysis Service (Premium Edition) to distinguish between notice types. For more information, see the referenced document. | 2021-06-08 | |
Bidding and bid-winning information extraction - Premium Edition service | Pre-trained model | The Premium Edition supports more fields and provides higher accuracy than the Basic Edition. For more information, see the referenced document. | 2021-06-08 | Bidding and bid-winning information extraction - Premium Edition service |
May 2021
Project Type | Feature Name | Feature Description | Release Date | References |
Product review analysis - E-commerce | Pre-trained model upgrade | Added 7 industries: mother and baby, books, apparel and accessories, outdoor, sports, stationery, and wash and clean. Added property categories for 20 industries. For more information, see the referenced document. | 2021-05-20 | |
Pre-built datasets | Experience optimization | Added pre-built test datasets for self-learning algorithm modules such as text classification, entity extraction, short text matching, and relation extraction to help you get started quickly. | 2021-05-24 | / |
Entity extraction | Self-learning model upgrade | Upgraded the rules engine to support rule combinations, AND/OR logic, and complex expressions that combine rules with model extraction results. It also supports rule effect previews and is far more efficient than the previous version. For more information, see the referenced document. | 2021-05-24 |
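The rule combinations described for the 2021-05-24 rules engine upgrade (AND/OR logic, and expressions that combine rules with model extraction results) can be pictured with a small Python sketch. This is only an illustration of the idea: the helper names (`all_of`, `any_of`, `text_contains`, `model_label_is`), the `model_label` key, and the sample data are all hypothetical and do not reflect the platform's actual rule syntax.

```python
from typing import Callable, Dict

# A rule is any function that inspects a candidate entity and returns True or False.
Rule = Callable[[Dict], bool]

def all_of(*rules: Rule) -> Rule:
    """AND: the combined rule passes only if every sub-rule passes."""
    return lambda entity: all(rule(entity) for rule in rules)

def any_of(*rules: Rule) -> Rule:
    """OR: the combined rule passes if at least one sub-rule passes."""
    return lambda entity: any(rule(entity) for rule in rules)

def text_contains(keyword: str) -> Rule:
    """Hypothetical text rule: the entity text contains a keyword."""
    return lambda entity: keyword in entity["text"]

def model_label_is(label: str) -> Rule:
    """Hypothetical rule over a model result: the predicted label matches."""
    return lambda entity: entity.get("model_label") == label

# (model predicts ORG) AND (text contains "Ltd" OR text contains "Inc")
company_rule = all_of(
    model_label_is("ORG"),
    any_of(text_contains("Ltd"), text_contains("Inc")),
)

# Made-up candidates for illustration only.
candidates = [
    {"text": "Acme Ltd", "model_label": "ORG"},
    {"text": "Acme Park", "model_label": "LOC"},
    {"text": "Globex", "model_label": "ORG"},
]
matched = [c["text"] for c in candidates if company_rule(c)]
print(matched)
```

Because each combinator returns another rule, expressions can be nested to any depth, which is one way to model "complex expressions" without a dedicated parser.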
April 2021
Project Type | Feature Name | Feature Description | Release Date | References |
Emotion recognition service | Pre-trained model upgrade | Added a high-accuracy version. For more information, see the referenced document. | 2021-04-12 |
March 2021
Project Type | Feature Name | Feature Description | Release Date | References |
Suspected fraud detection in telemarketing dialogues | Pre-trained model | This model is suitable for outbound telemarketing calls. It identifies suspected fraud risks from conversation content and can be used for voice quality inspection. For more information, see the referenced document. | 2021-03-30 | Tutorial for suspected fraud detection service in telemarketing dialogues |
Product review analysis service - Local life | Pre-trained model | An analysis service for reviews in the local life domain. It currently supports the beauty, hairdressing, nail, and catering industries. For more information, see the referenced document. | 2021-03-29 | |
Emotion recognition service | Pre-trained model upgrade | Optimized the recognition of positive emotions and added three common business-related emotions: complaint, gratitude, and grievance. For more information, see the referenced document. | 2021-03-25 | |
Threat detection in telemarketing dialogues | Pre-trained model | This model is suitable for outbound telemarketing calls. It identifies threats (such as insults, complaints, and intimidation) from conversation content and can be used for voice quality inspection. For more information, see the referenced document. | 2021-03-24 | |
Bidding document parsing | Self-learning model | Launched the industry-specific algorithm for bidding document parsing. It supports platform model testing and custom model training. | 2021-03-24 | |
Entity extraction | Frontend experience optimization | Supports batch uploading of data for annotation from Excel files. | 2021-03-04 | / |
Text classification | Frontend experience optimization | Supports modifying items during annotation. Supports uploading items from files. Optimized the display of statistical results for multi-label datasets. | 2021-03-04 | / |
February 2021
Project Type | Feature Name | Feature Description | Release Date | References |
Product review analysis - E-commerce | Pre-trained model upgrade | Added the industry code 'all' to return labels for all industries, allowing customers to filter as needed. For more information, see the referenced document. | 2021-02-19 | |
Text classification | Pre-trained model | The test interface now supports uploading files for batch prediction. | 2021-01-31 | / |
All | Frontend experience optimization | During the training phase, you can delete specific model versions, cancel publishing, and perform other operations. | 2021-02-01 | / |
January 2021
Project Type | Feature Name | Feature Description | Release Date | References |
Entity extraction | Self-learning model upgrade | The returned result now includes `conf`, which indicates the confidence level of the extracted entity. | 2021-01-20 | / |
Keyword extraction and text summarization | Pre-trained model | Based on the TextRank algorithm, this model is suitable for extracting keywords or summaries from documents. For more information, see the referenced document. | 2021-01-25 | Tutorial for keyword extraction and text summarization service |
Industry classification for telemarketing dialogues | Pre-trained model | This model is suitable for outbound telemarketing calls. It classifies dialogues by industry and application scenario and can be used for voice quality inspection. For more information, see the referenced document. | 2021-01-31 | Tutorial for industry classification service for telemarketing dialogues |
All | Frontend experience optimization | The uploaded file name is now used as the default dataset name. The button for all model versions is now prominently displayed. The model metrics page now supports sorting. | 2021-01-31 | / |
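The `conf` field added to entity extraction results on 2021-01-20 can be used to discard low-confidence entities. The following Python sketch illustrates the idea under stated assumptions: the result layout (a list of dictionaries with `text`, `label`, and `conf` keys), the helper name `filter_by_conf`, and the 0.8 threshold are hypothetical and are not the documented response format.

```python
from typing import Dict, List

def filter_by_conf(entities: List[Dict], threshold: float = 0.8) -> List[Dict]:
    """Keep only entities whose `conf` value is at or above the threshold.

    Entities without a `conf` key are treated as confidence 0.0 and dropped.
    """
    return [e for e in entities if float(e.get("conf", 0.0)) >= threshold]

# Hypothetical sample result; the real response structure may differ.
sample_entities = [
    {"text": "Alibaba", "label": "ORG", "conf": 0.97},
    {"text": "Hangzhou", "label": "LOC", "conf": 0.55},
]

kept = filter_by_conf(sample_entities, threshold=0.8)
print([e["text"] for e in kept])
```

A suitable threshold depends on the dataset and on how costly false positives are, so it is worth tuning against labeled samples rather than fixing it in advance.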
November 2020
Project Type | Feature Name | Feature Description | Release Date | References |
Product review analysis | Pre-trained model upgrade | Optimized the apparel and luggage industries, increasing the extraction recall rate by about 25%. The micro F1-score for negative sentiment increased by 10%. Optimized for long reviews. Added the ability to extract normalized attribute sentiment words. For more information, see the referenced document. | 2020-11-09 |
October 2020
Project Type | Feature Name | Feature Description | Release Date | References |
Bidding and bid-winning information extraction service | Pre-trained model | Parses key elements from bidding and bid-winning documents. For more information, see the referenced document. | 2020-10-20 | Tutorial for bidding and bid-winning information extraction service |
Customer inquiry analysis service | Pre-trained model | This service is suitable for online chat scenarios between customer service and consumers in industries such as e-commerce. It analyzes consumer messages to determine intent, sentiment, emotion, points of interest, and fine-grained sentiment. For more information, see the referenced document. | 2020-10-23 | |
Dialogue knowledge extraction service | Pre-trained model | This service is suitable for online chat scenarios between customer service and consumers. It extracts customer service scripts and user questions, such as agent questions and customer answers, from conversations. This can be used for analyzing hot-spot user issues or building a customer service script library to optimize chatbots. For more information, see the referenced document. | 2020-10-30 |
September 2020
Project Type | Feature Name | Feature Description | Release Date | References |
Product title category prediction service | Pre-trained model | Predicts the category of a product based on its title. For more information, see the referenced document. | 2020-09-18 | |
User intent recognition service for telemarketing scenarios | Pre-trained model | Recognizes the intent of user responses to customer service agents in telemarketing scenarios. For more information, see the referenced document. | 2020-09-18 | Tutorial for user intent recognition service for telemarketing scenarios |
Live ASR garbled text detection service | Pre-trained model | For live streaming scenarios, detects text with poor readability that results from crosstalk in transcripts produced by Automatic Speech Recognition (ASR). For more information, see the referenced document. | 2020-09-29 | |
Pornography detection service for novels | Pre-trained model | Identifies whether Chinese novel content contains pornographic or obscene material. This service is suitable for novel content moderation. For more information, see the referenced document. | 2020-09-29 | |
Sentiment analysis (English) service | Pre-trained model | Predicts the sentiment expressed in social media short texts for e-commerce scenarios. For more information, see the referenced document. | 2020-09-30 | |
Sentiment analysis (Spanish) service | Pre-trained model | Predicts the sentiment expressed in social media short texts for e-commerce scenarios. For more information, see the referenced document. | 2020-09-30 | |
Sentiment analysis (Russian) service | Pre-trained model | Predicts the sentiment expressed in social media short texts for e-commerce scenarios. For more information, see the referenced document. | 2020-09-30 | |
Text embedding generation service | Pre-trained model | For more information, see the referenced document (available in Chinese). | 2020-09-30 |
August 2020
Project Type | Feature Name | Feature Description | Release Date | References |
Entity extraction | Prediction service framework upgrade | Upgraded the model prediction service framework for entity extraction, improving model prediction efficiency by more than 2 times. This upgrade only applies to newly trained models. | 2020-08-18 | / |
Text classification/Entity extraction | Launched the smart annotation module | The smart annotation module is now available for text classification and entity extraction tasks. You can use the pre-annotation and active learning features provided by the platform to reduce annotation workload, improve efficiency and quality, and use the data for model training. | 2020-08-13 | / |
Profanity detection service | Pre-trained model | Identifies whether a sentence contains profanity and extracts profane keywords. For more information, see the referenced document. | 2020-08-26 | |
Emotion recognition service | Pre-trained model | Recognizes the emotion in a sentence. It currently supports 8 types of emotions. For more information, see the referenced document. | 2020-08-26 | |
Hierarchical news classification service | Pre-trained model | Identifies the type of news from news data. For more information, see the referenced document. | 2020-08-26 | |
Product review analysis | Pre-trained model upgrade | Added two industries: robotic vacuum cleaners (appliances) and wash and clean (fast-moving consumer goods), increasing the number of supported industries from 37 to 39. Upgraded the model architecture, improving the accuracy of attribute opinion word extraction by 15%. For more information, see the referenced document. | 2020-08-27 |
July 2020
Project Type | Feature Name | Feature Description | Release Date | References |
All | Word document parsing optimization | Optimized the parsing of Word documents uploaded for annotation, resulting in more complete sentence parsing. | 2020-07-16 | / |
Judgment document parsing | Pre-trained model | A pre-trained model service for parsing judgment documents that can be called directly. For more information, see the referenced document. | 2020-07-09 |
June 2020
Project Type | Feature Name | Feature Description | Release Date | References |
Entity extraction/Resume extraction | Support for incremental training | Entity extraction and resume extraction models now support incremental training for faster and more efficient model iteration. | 2020-06-18 | / |
All | Time estimation for document parsing and model publishing | After you upload a document for annotation, upload a labeled dataset, or publish a model, the platform provides a time estimate and sends a text message and email notification upon completion. | 2020-06-12 | / |
May 2020
Project Type | Feature Name | Feature Description | Release Date | References |
Product review analysis | Pre-trained model upgrade | Upgraded the pre-trained model service for product review analysis. It now supports four features: attribute sentiment recognition, attribute sentiment word extraction, sentiment clause extraction, and full-sentence sentiment recognition. The number of supported industries increased from 24 in the Basic Edition to 37. For more information, see the referenced document. | 2020-05-30 | |
Resume extraction | Version 1.0 launched | A new project type. Based on a model trained on massive amounts of labeled data and a rules engine from within Alibaba, this feature provides high-accuracy Chinese and English resume extraction. The platform supports 27 common Chinese fields and 10 common English fields, such as name, phone number, email, work experience, and education. You can add and annotate data for other custom fields to train a custom model. No data annotation is required if you only use the resume extraction fields provided by the platform. | 2020-05-15 | / |
April 2020
Project Type | Feature Name | Feature Description | Release Date | References |
Resume extraction | Pre-trained model | A pre-trained model service for Chinese and English resume extraction that can be called directly. For more information, see the referenced document. | 2020-04-30 | Chinese service: Tutorial for resume extraction (Chinese) service; English service: Tutorial for resume extraction (English) service |
Product review analysis | Pre-trained model | A pre-trained model service for product review analysis that can be called directly. For more information, see the referenced document. | 2020-04-17 | |
Entity extraction | Phone number extraction module | Added a pre-built phone number extraction option to the advanced parameters for model creation. | 2020-04-03 | / |
March 2020
Project Type | Feature Name | Feature Description | Release Date | References |
Intelligent contract review | Version 1.0 launched | A new project type. Intelligently reviews contracts for risks, including logic errors, missing clauses, inconsistent elements, and legal risks. It also reviews the qualifications of the counterparty, including multi-dimensional potential risks, risk ratings, and basic information. | 2020-03-06 | / |
February 2020
Project Type | Feature Name | Feature Description | Release Date | References |
Entity extraction/Text classification/Product review analysis | Model training time estimation | The platform now provides an estimated training time when you train a model and sends a text message and email notification upon completion. | 2020-02-28 | / |
January 2020
Project Type | Feature Name | Feature Description | Release Date | References |
Entity extraction | Launched the entity extraction BERT model | The BERT model is suitable for few-shot datasets. For more information, see the referenced document. | 2020-01-23 | |
Text classification | Optimization for models with a very large number of categories | Reduced the long training time for models with a very large number of categories. | 2020-01-23 | / |
Entity extraction/Text classification/Sentiment analysis/Product review analysis | Data pre-processing | The platform provides several pre-built pre-processing rules to help organize data. For more information, see the referenced document. | 2020-01-17 | |
Short text matching | Version 1.0 launched | A new project type. Upload short text matching data to train a semantic matching model. When using the model, input two short texts to get a similarity score. | 2020-01-17 | / |
December 2019
Project Type | Feature Name | Feature Description | Release Date | References |
All | Added an asynchronous prediction API | This API supports offline calls for longer texts and files. It supports up to 10,000 characters and the following file formats: TXT, HTML, PDF, DOC, and DOCX. For more information, see the referenced document. | 2019-12-31 | |
Product review analysis | Version 1.0 launched | A new project type. Based on massive amounts of labeled data from Alibaba's e-commerce platform, this feature builds custom models for various industries to analyze product review text from multiple dimensions. No data annotation is required if you only use the review dimensions provided by the platform. | 2019-12-20 | / |
All | Increased monthly subscription options for models | Added multi-month subscription options for models. For more information, see the referenced document. | 2019-12-20 | |
Entity extraction | Launched the rules engine | The internal beta of the rules engine is now available to some users for free. You can configure rules to assist the model. For more information, see the referenced document. | 2019-12-13 |
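The limits stated for the asynchronous prediction API (up to 10,000 characters; TXT, HTML, PDF, DOC, and DOCX files) can be checked on the client before a request is submitted. The Python sketch below is a minimal illustration and does not call the API; the helper names are hypothetical, and it assumes that the character limit corresponds to Python's `len()` of the text, which the release note does not specify.

```python
from pathlib import Path

# Limits taken from the release note; how the service counts characters is an assumption.
MAX_CHARS = 10_000
SUPPORTED_EXTENSIONS = {".txt", ".html", ".pdf", ".doc", ".docx"}

def text_within_limit(text: str) -> bool:
    """Return True if the text does not exceed the stated character limit."""
    return len(text) <= MAX_CHARS

def file_format_supported(filename: str) -> bool:
    """Return True if the file extension is one of the listed formats (case-insensitive)."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS

print(text_within_limit("a" * 10_000))
print(file_format_supported("contract.DOCX"))
print(file_format_supported("notes.md"))
```

Checking inputs locally avoids submitting a long-running offline job that would be rejected for its size or format.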
November 2019
Project Type | Feature Name | Feature Description | Release Date | References |
Text classification | Model iteration and optimization | Updated and optimized 6 text classification models. For more information, see the referenced document. | 2019-11-01 | |
Relation extraction | Model iteration and optimization | Optimized the relation extraction model. This update integrates entity extraction, allowing the model to extract both entities and relations after training. | 2019-11-01 | / |
October 2019
Project Type | Feature Name | Feature Description | Release Date | References |
All | Support for RAM user authorization | Manage authorization for RAM users through RAM. For more information, see the referenced document. | 2019-10-25 | |
Text classification | Launched the text classification BERT model | The BERT model is suitable for few-shot datasets. For more information, see the referenced document. | 2019-10-23 |
September 2019
Project Type | Feature Name | Feature Description | Release Date | References |
All | Apsara Conference launch | The NLP Self-learning Platform was launched at the Apsara Conference. For more information, see the referenced document. | 2019-09-26 | |
All | Official commercialization | Announced the public cloud pricing plan and began official commercialization. For more information, see the referenced document. | 2019-09-23 | |
Relation extraction | Version 1.0 launched | A new project type. Extracts entities and their corresponding relations from text. | 2019-09-20 | / |
Sentiment analysis | Version 1.0 launched | A new project type. Analyzes and determines the positive or negative sentiment of text. | 2019-09-20 | / |
Key phrase extraction | Version 1.0 launched | A new project type. Extracts keywords and phrase labels from text. | 2019-09-20 | / |
Entity extraction | Annotation feature optimization | Added same-value annotation and offset fine-tuning features to the annotation page. For more information, see the referenced document. | 2019-09-06 |
August 2019
Project Type | Feature Name | Feature Description | Release Date | References |
All | Data center optimization | Supports viewing data distribution, quality inspection for uploaded datasets, and error correction feedback for model data. Added an entry point for submitting annotation requests. | 2019-08-30 | / |
Text classification | Model version iteration | Optimized the multi-class classification model to improve training speed and prediction efficiency. | 2019-08-30 | / |
July 2019
Project Type | Feature Name | Feature Description | Release Date | References |
All | Tutorial video released | Released a tutorial video to help users quickly understand how to use the platform. For more information, see the referenced document. | 2019-07-19 |
June 2019
Project Type | Feature Name | Feature Description | Release Date | References |
All | Self-learning Platform Version 1.0 launched | The Self-learning Platform entered public preview, supporting custom entity extraction and text classification algorithms. | 2019-06-10 | |
All | Model Hub optimization | Added a model version management module. For more information, see the referenced document. | 2019-06-05 | |
Entity extraction | Version 1.0 launched | A new project type. Extracts entities with specific meanings from text. | 2019-06-01 | / |
Text classification | Version 1.0 launched | A new project type. Classifies text into custom categories. | 2019-06-01 | / |