Create a real-time dataset through event preprocessing
Dataphin lets you create real-time datasets through event preprocessing: events are processed, and the results are used as the dataset's metrics. This topic describes how to create and configure a real-time dataset through event preprocessing.
Prerequisites
-
Before creating a real-time dataset, make sure you have created an event for real-time dataset development. For more information, see .
-
Before creating a real-time dataset, create the tag project that the dataset belongs to. For more information, see .
Important: To create a new real-time dataset through event preprocessing, you must add a real-time computing source to the project.
Procedure
-
On the Dataphin home page, click Tag in the top menu bar. The Asset Market section opens by default.
-
To enter the Add Real-time Dataset dialog box, follow these steps:
Click Workbench -> select Tag Project -> click Real-time Dataset -> click Add Dataset.
-
In the Add Real-time Dataset dialog box, select Event Preprocessing.
-
On the Add Event Preprocessing configuration page, fill in the basic information for the dataset.
Parameter
Description
Dataset Name
Enter the name of the dataset. The name can contain Chinese and English characters, numbers, and underscores (_), with a maximum of 64 characters.
Dataset Code
Enter a unique identifier for the real-time dataset. The identifier can contain Chinese and English characters, numbers, and underscores (_), with a maximum of 64 characters.
Owner
Select the owner of the real-time dataset.
Description
Enter a description of the real-time dataset, with a maximum of 1,000 characters.
-
Set up the Processing Logic for the real-time dataset.
Parameter
Description
Event List
Select the event for the dataset. For details about how to create events, see .
Primary Key
After you select the event, specify the primary key for the dataset.
Note: By default, only Character Type or Long Integer Type fields can be set as the primary key.
Aggregation Attribute
Select the fields to process, then choose a query function and a time window for each field. The system determines the return type automatically.
-
Query functions vary based on field type:
-
Long Integer Type: Count, Sum, Max, Min.
-
String: Count, Max, Min.
-
Time window options include: Last 10 minutes, Last 30 minutes, Last 1 hour, Last 6 hours, Last 12 hours, Custom.
To add more aggregation attributes, click Add.
Filter Condition
Apply filter conditions to the data as needed. Supported conditions include: Greater than or equal to, Greater than, Less than or equal to, Less than, Not empty, Empty, In range, Not in range, Or, And, Later than, Later than or equal to, Earlier than, Earlier than or equal to.
To add more filter conditions, click Add Filter Condition. When multiple filter conditions are configured, the following logical operations are supported: Or, And.
-
Or: Returns data that matches any of the specified conditions.
-
And: Returns data only when all specified conditions are met.
-
Click Publish to publish the real-time dataset.
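To make the Processing Logic settings more concrete, the following Python sketch shows how a primary key, an aggregation attribute, a time window, and And/Or filter conditions could combine. It is not Dataphin code and does not reflect how Dataphin computes results internally. The Event fields, the aggregate and matches helpers, the grouping by primary key, and the window boundary (events later than now minus the window, up to and including now) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Event:
    # Hypothetical event schema, for illustration only.
    user_id: str           # used as the primary key
    amount: int            # Long Integer field to aggregate
    occurred_at: datetime  # event time


# Query functions per field type, mirroring the lists in the procedure.
QUERY_FUNCTIONS = {
    "long": {"count": len, "sum": sum, "max": max, "min": min},
    "string": {"count": len, "max": max, "min": min},
}


def matches(event, conditions, logic="and"):
    """And: every condition must hold. Or: at least one must hold."""
    if not conditions:
        return True
    results = [condition(event) for condition in conditions]
    return all(results) if logic == "and" else any(results)


def aggregate(events, field, field_type, function, window, now,
              conditions=(), logic="and"):
    """Aggregate one field per primary key over the window ending at `now`."""
    func = QUERY_FUNCTIONS[field_type][function]  # KeyError if unsupported
    start = now - window  # assumed boundary: start < event time <= now
    grouped = {}
    for event in events:
        in_window = start < event.occurred_at <= now
        if in_window and matches(event, conditions, logic):
            grouped.setdefault(event.user_id, []).append(getattr(event, field))
    return {key: func(values) for key, values in grouped.items()}


now = datetime(2024, 1, 1, 12, 0)
events = [
    Event("u1", 30, now - timedelta(minutes=5)),
    Event("u1", 70, now - timedelta(minutes=25)),
    Event("u2", 10, now - timedelta(minutes=8)),
    Event("u1", 99, now - timedelta(hours=2)),  # outside both windows below
]

# Sum of amount per user over the last 10 and the last 30 minutes.
last_10 = aggregate(events, "amount", "long", "sum", timedelta(minutes=10), now)
last_30 = aggregate(events, "amount", "long", "sum", timedelta(minutes=30), now)

# Two filter conditions joined with Or: amount >= 50, or the user is u2.
filtered = aggregate(
    events, "amount", "long", "sum", timedelta(minutes=30), now,
    conditions=[lambda e: e.amount >= 50, lambda e: e.user_id == "u2"],
    logic="or",
)
```

In this sketch, last_10 holds 30 for u1 and 10 for u2, while last_30 holds 100 for u1 because the wider window also covers the event from 25 minutes earlier. Requesting "sum" with the "string" field type raises a KeyError, which mirrors the fact that Sum is listed only for Long Integer Type fields.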
What to do next
After you create and publish the real-time dataset, you can create real-time tags for it. For more information, see .