Configure code editor input widget
Use the code editor widget to configure items that system components do not support.
Procedure
-
On the Dataphin homepage, select Development > Data Integration from the top menu bar.
-
On the integration page, select Project from the top menu bar (Dev-Prod mode requires selecting the environment).
-
In the navigation pane on the left, click Batch Pipeline. Then, in the Batch Pipeline list, click the Offline Pipeline you want to develop to open its configuration page.
-
Click the Component Library icon in the upper right corner of the page to open the Component Library panel.
-
In the Component Library panel's left-side navigation pane, select Input. Find the Code Editor Input widget in the list on the right and drag it onto the canvas.
-
Click the
icon in the code editor input widget card to open the Script Mode Input Configuration dialog box. -
In the Script Mode Input Configuration dialog box, configure the parameters.
Parameter
Description
Step Name
The name of the code editor input widget. Dataphin generates a default name, which you can modify as needed. The naming conventions are as follows:
-
Can only contain Chinese characters, letters, underscores (_), and numbers.
-
Cannot exceed 64 characters.
Data Source Type
Select the data source type.
Datasource
Select the corresponding data source instance.
Generate Sync Configuration File
Click Generate Sync Configuration File. A JSON configuration template is generated in the Configuration File area based on the selected data source type. You can modify or remove configuration items as needed.
Configuration File
Click the
icon to expand the editor and view or write the configuration file in full screen.The template has three main parts:
-
tableandtables: Specifies one or more input tables. Usetablefor a single table andtablesfor multiple tables. Not applicable if the component does not involve tables. -
column: The input fields. Each field requires at least anameand atype. -
parameter: Sync configuration parameters. The configuration varies by data source. For details, see the documentation for the specific component.
{ "table":"tableName",// "tables":["tableName"] "column":[{ "name":"", "type":"" }], "parameter":{} } -
-
Click Confirm to finalize the property configuration for the Script Mode input widget.
Best Practices
The code editor widget uses a configuration JSON to set up offline integration tasks. Common scenarios include:
-
Configuring advanced component settings that are not available in the graphical interface.
-
Manually configuring tables and fields when metadata is unavailable, such as during errors or outages.
-
Handling special keywords in table and field names.
-
Using database functions.
-
Implementing sharding strategies.
-
Executing query SQL statements.
The following example shows how to sync MySQL multi-table JOIN result data. Other scenarios can be configured in a similar way.
One: configuration introduction
In the Script Mode Input Configuration dialog box, set the parameters.
|
Parameter |
Description |
|
Step Name |
Enter the name of the code editor input widget. |
|
Data Source Type |
Select MySQL. |
|
Datasource |
Select the corresponding data source instance. |
|
Generate Sync Configuration File |
Click Generate Sync Configuration File. A JSON configuration template is generated in the Configuration File area based on the selected data source type. You can modify or remove configuration items as needed. |
|
Configuration File |
The template has three main parts:
|
Two: widget usage
The examples below show usage across different scenarios.
Using query SQL
To sync multi-table JOIN results or other custom SQL query results, use the following configuration file content.
{
"table": "sff_id_name",
"parameter": {
"connection": [
{
"querySql": [
"select a.id, b.name from sff_id_name a join sff_id_name_1 b on a.id = b.id"
]
}
]
},
"column": [
{
"name": "id",
"type": "Long"
},
{
"name": "name",
"type": "String"
}
]
}
For complex SQL statements, use the querySql configuration item. This string array supports multiple SELECT statements with the same structure and must be nested within a connection. You can use multiple connection items to connect to multiple databases.
Sharding
Sharding distributes data across multiple databases and tables while maintaining a consistent table structure.
-
Using querySql
{ "table": "sff_id_name", "parameter": { "connection": [ { "querySql": [ "select a.id, b.name from sff_id_name a join sff_id_name_1 b on a.id = b.id" ] }, { "jdbcUrl": [ "jdbc:mysql://ip:3306/dataphin" ], "querySql": [ "select id,name from sff_id_name" ] } ] }, "column": [ { "name": "id", "type": "Long" }, { "name": "name", "type": "String" } ] } -
Directly Configure Table Name
{ "table": "sff_id_name1", "parameter": { "connection": [ { "table": [ "sff_id_name1","sff_id_name2" ] }, { "jdbcUrl": [ "jdbc:mysql://xxx:3306/dataphin1" ], "table": [ "sff_id_name3","sff_id_name4" ] }, { "jdbcUrl": [ "jdbc:mysql://xxx:3306/dataphin2" ], "table": [ "sff_id_name5","sff_id_name6" ] } ] }, "column": [ { "name": "id", "type": "Long" }, { "name": "name", "type": "String" } ] }
Using functions
You can use functions supported by the database, such as the MySQL upper function.
{
"tables":[
"sff_id_name_2"
],
"parameter":{
"where":"id>0",
"splitPk":"id"
},
"column":[
{
"name":"id",
"type":"Long"
},
{
"name":"upper(name)",
"type":"String"
}
]
}
Keywords
MySQL uses backticks (``) to identify keywords. For example, if the keyword is id, you can use `id`.
{
"tables":[
"sff_id_name_2"
],
"parameter":{
"where":"id>0",
"splitPk":"id"
},
"column":[
{
"name":"`id`",
"type":"Long"
},
{
"name":"upper(name)",
"type":"String"
}
]
}