The Lindorm compute engine can access external Java Database Connectivity (JDBC) data sources. You can use Spark SQL to perform data forwarding and federated computing between Lindorm and external data sources.
Prerequisites
The compute engine service is enabled for your Lindorm instance. For more information, see Activate the service.
Ensure that access permissions are enabled for the external data source and that it is in the same virtual private cloud (VPC) as the Lindorm instance.
Parameter settings
An external data source is mapped to a Catalog in the Lindorm compute engine. The Catalog acts as an identifier for the data source. You must specify a custom name for the Catalog. The following table describes the parameters for a Spark SQL job that uses hello_jdbc as the Catalog name.
Configuration item | Description | Example value |
spark.sql.catalog.hello_jdbc | Declares an external data source. The value must be | org.apache.spark.sql.execution.datasources.v2.jdbc.JDBCTableCatalog |
spark.sql.catalog.hello_jdbc.url | The JDBC URL of the external data source. The Lindorm compute engine currently supports MySQL as an external data source. The compute engine includes the `mysql-connector-java-8.0.27.jar` connector. | jdbc:mysql://192.168.11.2:3306 |
spark.sql.catalog.hello_jdbc.user | The username to access the external data source. | testUser |
spark.sql.catalog.hello_jdbc.password | The password to access the external data source. | testPassword**** |
For more information about configuration items, see Data Source Option.
Access data
Configure and submit Spark jobs for an external data source using one of the following methods. Then, use Spark SQL to access the data source.
Example 1: Read data from the external JDBC table `db0.mysql_tbl0`, join the data with the Hive table `hive_tbl0`, and write the result to the Hive table `hive_tbl1`.
INSERT INTO spark_catalog.default.hive_tbl1 SELECT A.col0, A.col1, A.col2, B.col1 FROM hello_jdbc.db0.mysql_tbl0 A JOIN spark_catalog.default.hive_tbl0 B ON A.col0 = B.col0;Example 2: Filter the Hive table `hive_tbl1` and write the result to the external JDBC table `db0.mysql_tbl1`.
INSERT INTO hello_jdbc.db0.mysql_tbl1 SELECT col0, col1, col2 FROM spark_catalog.default.hive_tbl1 WHERE col0 > 12345;
For more information, see JDBC To Other Databases.