Example: developing a Python computation task that uses a third-party library
Dataphin provides tenant-level third-party library management. Before you can use a Python third-party library in a computation task, you must install the required Python module through the third-party library feature.
Case description
This example uses the xlrd third-party library.
Procedure
Step 1: Install the Python module
- In the top menu bar on the Dataphin home page, click Management Center > System Settings.
- Open the Install Python Module dialog box:
  Click Python Third-Party Package > Python Module > Install Python Module.
- In the Install Python Module dialog box, set the following parameters:
| Parameter | Description |
| --- | --- |
| Module Name | Enter xlrd. |
| Python Version | Select Python 3.7. |
| Installation Method | Select Online Installation. |
- Click OK and wait for the xlrd module to finish installing.
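Once the installation finishes, a quick import check can confirm that the module is visible to Python. The following is a minimal sketch, not part of the Dataphin procedure: the helper name `is_module_available` is hypothetical, and it assumes the code runs in the same Python 3.7 environment that the module was installed into.

```python
import importlib.util


def is_module_available(module_name: str) -> bool:
    """Return True if a top-level module with this name can be found on the import path."""
    # find_spec returns None when no top-level module with this name exists.
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "xlrd" is the module installed in Step 1.
    if is_module_available("xlrd"):
        print("xlrd is available")
    else:
        print("xlrd is not installed in this Python environment")
```

The check only locates the module; it does not import it, so it has no side effects if the module is missing.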
Step 2: Create a Python computation task and import the third-party library
- In the top menu bar on the Dataphin home page, click Development > Data Development.
- Open the New PYTHON Task dialog box:
  Select the project (in Dev-Prod mode, also select the environment), click Script Task, click the New icon, and then choose PYTHON.
- In the New PYTHON Task dialog box, configure the following parameters:
| Parameter | Description |
| --- | --- |
| Task Name | Enter a name for the task, such as xlrd package test. |
| Schedule Type | Select One-Time Task. |
| Select Directory | Select the directory to store the task. |
| Use Template | Disabled by default. |
| Python Third-Party Package | Select the xlrd package installed in Step 1. |
| Description | Enter a brief description, for example, xlrd package test. |
- Click OK.
- In the Python task code editor, select Python 3.7 and write your code. The sample code is as follows:

    @resource_reference{"dataphin.xls"}  # Reference the dataphin.xls resource.
    # Import the xlrd module.
    import xlrd

    wb = xlrd.open_workbook('dataphin.xls')  # Open the Excel file.
    sh = wb.sheet_by_name('Sheet1')  # Locate the worksheet by name.
    # Traverse the worksheet and print every row.
    for i in range(sh.nrows):
        print(sh.row_values(i))

  Note: dataphin.xls refers to any .xls file uploaded to Dataphin. Replace the resource name with the name you specified during upload. For more information, see the referenced document.
- Save and submit the Python task on the code editor page:
  - Click the run code icon.
  - Click the submit code icon in the upper right corner of the page.
  - Enter remarks on the Submit Remarks page.
  - Click OK And Submit.
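The sample code in Step 2 fails with an xlrd error if the workbook has no worksheet named Sheet1. The following is a hedged sketch of a slightly more defensive variant. The helper names `read_sheet_rows` and `print_sheet` are hypothetical and not part of Dataphin or xlrd; the xlrd calls used (`open_workbook`, `sheet_names`, `sheet_by_name`, `nrows`, `row_values`) are the library's own. The Dataphin `@resource_reference` line from the sample is omitted here because it is not plain Python; in a Dataphin task it would presumably still be needed at the top of the script.

```python
def read_sheet_rows(workbook, sheet_name):
    """Return all rows of the named sheet as a list of lists.

    `workbook` is expected to behave like the object returned by
    xlrd.open_workbook(): it must provide sheet_names() and sheet_by_name().
    """
    available = workbook.sheet_names()
    if sheet_name not in available:
        # Fail early with a message that lists the sheets that do exist.
        raise ValueError(
            f"Sheet {sheet_name!r} not found; available sheets: {available}"
        )
    sheet = workbook.sheet_by_name(sheet_name)
    return [sheet.row_values(i) for i in range(sheet.nrows)]


def print_sheet(path, sheet_name):
    """Open an .xls file with xlrd and print every row of one sheet."""
    import xlrd  # third-party module installed in Step 1

    workbook = xlrd.open_workbook(path)
    for row in read_sheet_rows(workbook, sheet_name):
        print(row)
```

In a task, this could be called as `print_sheet('dataphin.xls', 'Sheet1')`, with the resource name replaced by the one specified during upload. Keeping the row-reading logic separate from the file-opening step also makes it possible to exercise `read_sheet_rows` without a real .xls file.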