Configure the Amazon RDS for MySQL Input Component
The Amazon RDS for MySQL input component retrieves data from an Amazon RDS for MySQL data source. To synchronize data from an Amazon RDS for MySQL data source to another data source, configure this component to read from the source and then configure the destination data source for synchronization. This topic describes how to configure the Amazon RDS for MySQL input component.
Prerequisites
You have created an Amazon RDS for MySQL data source. For more information, see Create an Amazon RDS for MySQL data source.
The account used to configure the Amazon RDS for MySQL input component properties must have read-through permission for the data source. If the account lacks this permission, request data source permissions. For more information, see Request data source permissions.
Procedure
On the Dataphin homepage, in the top menu bar, click Development, and then click Data Integration.
On the Integration page, in the top menu bar, click Project. In Dev-Prod mode, select an environment.
In the navigation pane on the left, click Offline Integration. In the Offline Integration list, click the offline pipeline that you want to develop to open its configuration page.
In the upper-right corner of the page, click Component Library to open the Component Library panel.
In the left navigation pane of the Component Library panel, click Input. In the input component list on the right, locate the Amazon RDS for MySQL component and drag it to the canvas.
Click the
icon in the Amazon RDS for MySQL input component card to open the Amazon RDS for MySQL Input Configuration dialog box.In the Amazon RDS for MySQL Input Configuration dialog box, configure the parameters.
Parameter
Description
Step Name
The name of the Amazon RDS for MySQL input component. Dataphin automatically generates the step name. Modify it as needed. Naming rules are as follows:
Can contain only Chinese characters, letters, underscores (_), and digits.
Cannot exceed 64 characters.
Source Table Quantity
Select the source table quantity. Source table quantity includes Single Table and Multiple Tables:
Single Table: Applies to scenarios where business data from one table is synchronized to one destination table.
Multiple Tables: Applies to scenarios where business data from multiple tables is synchronized to the same destination table. When data from multiple tables is written to the same data table, the union algorithm is used.
For more information about union, see INTERSECT, UNION, and EXCEPT.
Datasource
In the data source drop-down list, all Amazon RDS for MySQL data sources in the current Dataphin instance are displayed, including those that you have read-through permissions for and those that you do not have read-through permissions for. If you select Single Table as the source table option, you can select only one data source. Click the
icon to copy the current data source name. If you select Multiple Tables as the source table option, you can select multiple data sources.For data sources without read-through permissions, click Request next to the data source to request read-through permissions. For more information, see Request data source permissions.
If you do not have an Amazon RDS for MySQL data source, click Create Data Source in the drop-down list to create one. For more information, see Create an Amazon RDS for MySQL data source.
Database (Optional)
Select the database where the table resides. If left blank, the database specified during data source registration is used.
If you select Multiple Tables for Source Table Quantity, select multiple databases. Click the
icon to view all selected databases in the Database List dialog box.Table Matching Method
Select General Rules or Database Regex.
NoteThis parameter is configurable only when you select Multiple Tables for Source Table Quantity.
Table
Select the source table:
If you select Single Table for Source Tables, you can enter a table name keyword to search, or enter the exact table name and click Exact Search. After you select a table, the system automatically checks the table status. Click the
icon to copy the name of the currently selected table.If you select Multiple Tables for Source Table Quantity, you can enter different expressions to add tables based on the table matching method.
General rule: Enter a table expression in the input box to filter for tables that have the same structure. The system supports enumeration, regex-like formats, and a combination of both. For example,
table_[001-100];table_102;.If you select Database Regex as the table matching method: enter a regular expression supported by the current database in the input box. The system will use this regular expression to match tables in the destination database, and when the task runs, it will dynamically match new tables based on the database regular expression to sync them.
After you enter the expression, click Exact Search to view the list of matched tables in the Confirm Match Details dialog box.
Split Key (Optional)
The system partitions data based on the configured split key field. Use this with concurrency configuration to achieve concurrent reads. You can use a column in the source data table as the split key. Additionally, use a primary key or an indexed column as the split key to ensure transfer performance.
ImportantWhen you select a date and time type, the system performs brute-force partitioning based on the total time range and concurrency by identifying the maximum and minimum values. This does not guarantee even distribution.
Input Filter (Optional)
Configure the filter conditions for extracting data. Details are as follows:
Configure static field: Extract the corresponding data. For example,
ds=20211111.Configure variable parameter: Extract a portion of data. For example,
ds=${bizdate}.
Output Fields
The Output Fields area displays all fields from the selected table and those that match the filter conditions. The following operations are supported:
Field Management: If you do not need to output certain fields to downstream components, delete the corresponding fields:
Single field deletion: To delete a few fields, click the
icon in the Actions column to delete redundant fields.Batch field deletion: To delete many fields, click Field Management. In the Field Management dialog box, select multiple fields, click the
left arrow icon to move the selected input fields to the unselected input fields, and then click OK to complete the batch deletion of fields.
Batch Add: Click Batch Add. Batch configuration is supported in JSON, TEXT, and DDL formats.
NoteAfter batch adding is complete, clicking OK will overwrite the configured field information.
Batch configure in JSON format. For example:
// Example: [{ "index": 1, "name": "id", "type": "int(10)", "mapType": "Long", "comment": "comment1" }, { "index": 2, "name": "user_name", "type": "varchar(255)", "mapType": "String", "comment": "comment2" }]NoteIndex indicates the column number of the specified object, name indicates the field name after import, and type indicates the field type after import. For example,
"index":3,"name":"user_id","type":"String"means to import the fourth column in the file, with the field name user_id and field type String.Batch configure in TEXT format. For example:
// Example: 1,id,int(10),Long,comment1 2,user_name,varchar(255),Long,comment2The row delimiter separates information for each field. The default is a line feed (\n). Line feeds (\n), semicolons (;), and periods (.) are supported.
The column delimiter separates field names from field types. The default delimiter is a half-width comma (,). It supports
','. The field type is optional and defaults to','.
Batch configure in DDL format. For example:
CREATE TABLE tablename ( user_id serial, username VARCHAR(50), password VARCHAR(50), email VARCHAR (255), created_on TIMESTAMP, );
Create Output Field: Click +Create Output Field. Fill in Column, Type, and Remarks as prompted on the page, and select the Mapping Type. After configuring the current row, click the
icon to save.
Click OK to complete the property configuration for the Amazon RDS for MySQL input component.