Register data lineage using OpenAPI
Use OpenAPI to register data lineage from external systems. Once registered, you can view the lineage in the Dataphin Asset checklist.
Prerequisites
You must purchase the OpenAPI feature.
Limits
- Custom data lineage registration is in public preview. To use this feature, contact Dataphin technical support.
- The source of the asset objects in the registered data lineage must already be registered in Dataphin, meaning the assets must originate from existing projects, domains, or data sources. The data source types that support custom lineage registration include big data storage, relational databases, MSMQ, and custom data sources.
- You can register or delete both table-level and field-level data lineage in a single API call.
- Each call to the lineage registration API can register the lineage between only one pair of tables. For this pair of tables, you can also specify multiple groups of field lineage relationships, up to 100 groups. If you exceed this limit, the call fails.
- You can only delete data lineage registered through OpenAPI. Other data lineage relationships cannot be deleted. This operation is irreversible. Confirm before you proceed.
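Because a call that carries more than 100 groups of field lineage fails, it can help to check the count on the client before sending the request. The following is a minimal sketch of such a check. The function name `check_field_group_limit` and the shape of the list items are assumptions made for illustration and are not part of the Dataphin API.

```python
# Per-call limit on field lineage groups, as stated in the Limits section.
MAX_FIELD_GROUPS = 100


def check_field_group_limit(field_lineages):
    """Return the number of field lineage groups, or raise if over the limit.

    `field_lineages` is any list with one item per field lineage group
    (hypothetical shape; the real request structure is described in the
    request parameter table).
    """
    count = len(field_lineages)
    if count > MAX_FIELD_GROUPS:
        raise ValueError(
            f"{count} field lineage groups exceed the per-call limit of {MAX_FIELD_GROUPS}"
        )
    return count
```

Calling this helper before each registration request turns a failed API call into an early, local error.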
Permissions
Users with a super administrator, system administrator, or custom global role with the Metadata - Other Features - Register External Lineage permission can register data lineage using OpenAPI.
Lineage source
Dataphin supports three methods to obtain data lineage: automatic parsing, manual configuration, and API registration.
- Automatic parsing: The system automatically parses data lineage based on integration nodes, SQL compute nodes, and logical table tasks.
- Manual configuration: You can manually configure data lineage in a compute node. For more information, see Custom data lineage configuration.
- API registration: You can register data lineage using OpenAPI.
Scenarios
OpenAPI registration enables end-to-end data lineage, from first-mile extract, transform, and load (ETL) to last-mile business intelligence (BI). The lineage can include objects that already exist in Dataphin and objects that do not. For example, if a data source table has been ingested, the corresponding object may already exist.
- Complete upstream source system lineage to trace data issues back to their source for troubleshooting.
- Complete downstream consumption system associations to analyze the impact of data changes and assess asset value.
Register data lineage
Request parameters
| Name | Type | Required | Description | Example value |
| --- | --- | --- | --- | --- |
| OpTenantId | long | Yes | The tenant ID. | 30001011 |
|  | object | Yes | The command to register and add data lineage. | / |
|  | object | Yes | The source asset. | / |
| ReferenceType | string | Yes | The reference data type for the asset. Valid values: BY_GUID and BY_PROPERTY. | BY_GUID, BY_PROPERTY |
| Guid | string | No | The GUID of the asset. This parameter is required if ReferenceType is set to BY_GUID. | odps.300000001.project1.table1 |
| MetadataType | string | Yes | The asset type. Set this parameter as needed. | TABLE |
| MetadataSubType | string | No | The asset child class. Specify this parameter only when MetadataType is TABLE and ReferenceType is not BY_GUID. |  |
| Catalog | string | No | An asset property. For tables, the catalog for compute source tables or logical tables is `dataphin`. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | dataphin |
| Schema | string | No | An asset property. For tables, this is typically a project or business domain. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | project1, bizUnit1 |
| Env | string | No | The environment. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | DEV, PROD |
| Name | string | No | The asset name. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | table1 |
| ExtProperties | object | No | Extension properties. | / |
|  | object | Yes | The target asset. | / |
| ReferenceType | string | Yes | The reference data type for the asset. Valid values: BY_GUID and BY_PROPERTY. | BY_GUID, BY_PROPERTY |
| Guid | string | No | The GUID of the asset. This parameter is required if ReferenceType is set to BY_GUID. | odps.300000001.project1.table1 |
| MetadataType | string | Yes | The asset type. Set this parameter as needed. | TABLE |
| MetadataSubType | string | No | The asset child class. Specify this parameter only when MetadataType is TABLE and ReferenceType is not BY_GUID. |  |
| Catalog | string | No | An asset property. For tables, the catalog for compute source tables or logical tables is `dataphin`. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | dataphin |
| Schema | string | No | An asset property. For tables, this is typically a project or business domain. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | project1, bizUnit1 |
| Env | string | No | The environment. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | DEV, PROD |
| Name | string | No | The asset name. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | table1 |
| ExtProperties | object | No | Extension properties. | / |
|  | array | No | Detailed lineage relationships. For tables, this refers to field lineage. You can leave this empty if you do not want to add field lineage. | / |
|  | object | No | A child object lineage relationship. | / |
|  | object | Yes | A reference to the source asset. | / |
| ReferenceType | string | No | The reference data type for the asset. Valid values: BY_GUID and BY_PROPERTY. | BY_GUID, BY_PROPERTY |
| Guid | string | No | The GUID of the asset. This parameter is required if ReferenceType is set to BY_GUID. | odps.300000001.project1.table1.column1 |
| ParentGuid | string | No | The GUID of the parent asset. If the current object is a field, ParentGuid is the GUID of the table to which the field belongs. | odps.300000001.project1.table1 |
| MetadataType | string | No | The asset type. Set this parameter as needed. | COLUMN |
| Catalog | string | No | An asset property. For tables, the catalog for compute source tables or logical tables is `dataphin`. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | dataphin |
| Schema | string | No | An asset property. For tables, this is typically a project or business domain. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | project1, bizUnit1 |
| Env | string | No | The environment. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | DEV, PROD |
| Name | string | No | The asset name. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | column1 |
| ExtProperties | object | No | Extension properties. | / |
|  | object | Yes | A reference to the target asset. | / |
| ReferenceType | string | No | The reference data type for the asset. Valid values: BY_GUID and BY_PROPERTY. | BY_GUID, BY_PROPERTY |
| Guid | string | No | The GUID of the asset. This parameter is required if ReferenceType is set to BY_GUID. | odps.300000001.project1.table1.column1 |
| ParentGuid | string | No | The GUID of the parent asset. If the current object is a field, ParentGuid is the GUID of the table to which the field belongs. | odps.300000001.project1.table1 |
| MetadataType | string | No | The asset type. Set this parameter as needed. | COLUMN |
| Catalog | string | No | An asset property. For tables, the catalog for compute source tables or logical tables is `dataphin`. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | dataphin |
| Schema | string | No | An asset property. For tables, this is typically a project or business domain. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | project1, bizUnit1 |
| Env | string | No | The environment. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | DEV, PROD |
| Name | string | No | The asset name. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | column1 |
| ExtProperties | object | No | Extension properties. | / |
| IsDirect | boolean | No | Specifies whether the lineage is direct. Default value: true. | / |
| CheckAssetExist | boolean | No | Specifies whether to check if the asset exists. By default, the existence of the asset is not checked. | / |
| RelationProperties | object | No | Lineage relationship properties. | / |
| TenantId | long | No | The tenant ID. | 300001234 |
| UserId | string | No | The current user ID. | 300004567 |
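The two ways of referencing an asset, BY_GUID and BY_PROPERTY, use different subsets of the fields listed above. The following minimal sketch builds one asset reference of each kind as a Python dictionary. The helper names `asset_ref_by_guid` and `asset_ref_by_property` and the target table name `table2` are assumptions made for illustration. How these references nest inside the full request body is not shown here, because the names of the enclosing objects are not given in the table.

```python
def asset_ref_by_guid(guid, metadata_type="TABLE"):
    """Build an asset reference that identifies the asset by its GUID."""
    if not guid:
        # The parameter table states that Guid is required for BY_GUID.
        raise ValueError("Guid is required when ReferenceType is BY_GUID")
    return {
        "ReferenceType": "BY_GUID",
        "Guid": guid,
        "MetadataType": metadata_type,
    }


def asset_ref_by_property(catalog, schema, env, name, metadata_type="TABLE"):
    """Build an asset reference that identifies the asset by its properties."""
    return {
        "ReferenceType": "BY_PROPERTY",
        "MetadataType": metadata_type,
        "Catalog": catalog,
        "Schema": schema,
        "Env": env,
        "Name": name,
    }


# Example values follow the parameter table; "table2" is a hypothetical name.
source = asset_ref_by_guid("odps.300000001.project1.table1")
target = asset_ref_by_property("dataphin", "project1", "PROD", "table2")
```

Keeping the two reference styles in separate helpers makes it harder to send a BY_GUID reference without a Guid.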
Response parameters
| Name | Type | Description | Example value |
| --- | --- | --- | --- |
|  | object | The schema of the response. | / |
| RequestId | string | The ID of the request. | 82E78D6B-AA8F-1FEF-8AA3-5C9DA2A79140 |
| Message | string | Details about a backend response exception. | internal error |
| HttpStatusCode | integer | The HTTP status code. | 200 |
| Code | string | The backend response code. | OK |
| Success | boolean | Indicates whether the request was successful. | / |
Example
Example of a successful response
{
  "RequestId": "82E78D6B-AA8F-1FEF-8AA3-5C9DA2A79140",
  "Message": "internal error",
  "HttpStatusCode": 200,
  "Code": "OK",
  "Success": true
}
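A caller usually needs to decide whether a response like the one above indicates success. The following is a minimal sketch that parses the response body and inspects the documented fields. It assumes that Success is returned as a JSON boolean, as the response parameter table states. The function name `check_lineage_response` is hypothetical.

```python
import json


def check_lineage_response(body):
    """Return the RequestId of a successful response, or raise on failure.

    `body` is the raw JSON text of the response.
    """
    resp = json.loads(body)
    # Assumption: Success is a JSON boolean (see the response parameter table).
    if resp.get("Success") is True:
        return resp.get("RequestId")
    raise RuntimeError(
        f"request {resp.get('RequestId')} failed: "
        f"{resp.get('Code')} {resp.get('Message')}"
    )
```

Logging the returned RequestId makes it easier to reference a specific call when you contact technical support.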
Delete data lineage
Request parameters
| Name | Type | Required | Description | Example value |
| --- | --- | --- | --- | --- |
| OpTenantId | long | Yes | The tenant ID. | 30001011 |
|  | object | Yes | The command to delete registered data lineage. | / |
|  | object | Yes | The source of the data lineage. | / |
| ReferenceType | string | Yes | The reference data type for the asset. Valid values: BY_GUID and BY_PROPERTY. | BY_GUID, BY_PROPERTY |
| Guid | string | No | The GUID of the asset. This parameter is required if ReferenceType is set to BY_GUID. | odps.300000001.project1.table1 |
| MetadataType | string | Yes | The asset type. Set this parameter as needed. | TABLE |
| MetadataSubType | string | No | The asset child class. Specify this parameter only when MetadataType is TABLE and ReferenceType is not BY_GUID. |  |
| Catalog | string | No | An asset property. For tables, the catalog for compute source tables or logical tables is `dataphin`. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | dataphin |
| Schema | string | No | An asset property. For tables, this is typically a project or business domain. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | project1, bizUnit1 |
| Env | string | No | The environment. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | DEV, PROD |
| Name | string | No | The asset name. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | table1 |
| ExtProperties | object | No | Extension properties. | / |
|  | object | Yes | The target of the data lineage. | / |
| ReferenceType | string | Yes | The reference data type for the asset. Valid values: BY_GUID and BY_PROPERTY. | BY_GUID, BY_PROPERTY |
| Guid | string | No | The GUID of the asset. This parameter is required if ReferenceType is set to BY_GUID. | odps.300000001.project1.table1 |
| MetadataType | string | Yes | The asset type. Set this parameter as needed. | TABLE |
| MetadataSubType | string | No | The asset child class. Specify this parameter only when MetadataType is TABLE and ReferenceType is not BY_GUID. |  |
| Catalog | string | No | An asset property. For tables, the catalog for compute source tables or logical tables is `dataphin`. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | dataphin |
| Schema | string | No | An asset property. For tables, this is typically a project or business domain. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | project1, bizUnit1 |
| Env | string | No | The environment. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | DEV, PROD |
| Name | string | No | The asset name. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | table1 |
| ExtProperties | object | No | Extension properties. | / |
|  | array | No | Detailed lineage relationships. For tables, this refers to field lineage. | / |
|  | object | No | A child object lineage relationship. | / |
| IsDirect | boolean | No | Specifies whether the lineage is direct. Default value: true. | / |
|  | object | Yes | A reference to the source asset. | / |
| ReferenceType | string | No | The reference data type for the asset. Valid values: BY_GUID and BY_PROPERTY. | BY_GUID, BY_PROPERTY |
| Guid | string | No | The GUID of the asset. This parameter is required if ReferenceType is set to BY_GUID. | odps.300000001.project1.table1 |
| ParentGuid | string | No | The GUID of the parent asset. If the current object is a field, ParentGuid is the GUID of the table to which the field belongs. | odps.300000001.project1.table1 |
| MetadataType | string | No | The asset type. Set this parameter as needed. | COLUMN |
| Catalog | string | No | An asset property. For tables, the catalog for compute source tables or logical tables is `dataphin`. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | dataphin |
| Schema | string | No | An asset property. For tables, this is typically a project or business domain. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | project1, bizUnit1 |
| Env | string | No | The environment. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | DEV, PROD |
| Name | string | No | The asset name. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | column1 |
| ExtProperties | object | No | Extension properties. | / |
|  | object | Yes | A reference to the target asset. | / |
| ReferenceType | string | No | The reference data type for the asset. Valid values: BY_GUID and BY_PROPERTY. | BY_GUID, BY_PROPERTY |
| Guid | string | No | The GUID of the asset. This parameter is required if ReferenceType is set to BY_GUID. | odps.300000001.project1.table1 |
| ParentGuid | string | No | The GUID of the parent asset. If the current object is a field, ParentGuid is the GUID of the table to which the field belongs. | odps.300000001.project1.table1 |
| MetadataType | string | No | The asset type. Set this parameter as needed. | COLUMN |
| Catalog | string | No | An asset property. For tables, the catalog for compute source tables or logical tables is `dataphin`. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | dataphin |
| Schema | string | No | An asset property. For tables, this is typically a project or business domain. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | project1, bizUnit1 |
| Env | string | No | The environment. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | DEV, PROD |
| Name | string | No | The asset name. This parameter helps identify the asset by its properties when ReferenceType is BY_PROPERTY. This parameter is not required if ReferenceType is BY_GUID. | column1 |
| ExtProperties | object | No | Extension properties. | / |
| CascadeDeleteLineage | boolean | No | Specifies whether to automatically delete the object lineage after all detailed lineages are deleted. Default value: true. | / |
| TenantId | long | No | The tenant ID. | 300001234 |
| UserId | string | No | The current user ID. | 300004567 |
Response parameters
| Name | Type | Description | Example value |
| --- | --- | --- | --- |
|  | object | The schema of the response. | / |
| RequestId | string | The ID of the request. | 82E78D6B-AA8F-1FEF-8AA3-5C9DA2A79140 |
| Message | string | Details about a backend response exception. | internal error |
| HttpStatusCode | integer | The HTTP status code. | 200 |
| Code | string | The backend response code. | OK |
| Success | boolean | Indicates whether the request was successful. | / |
Example
Example of a successful response
{
  "RequestId": "82E78D6B-AA8F-1FEF-8AA3-5C9DA2A79140",
  "Message": "internal error",
  "HttpStatusCode": 200,
  "Code": "OK",
  "Success": true
}
Error codes
| Error code | HTTP status code | Error message | Description |
| --- | --- | --- | --- |
| DataphinOpenAPIRamUnAuthorized | 401 | You are not authorized to do this action, you should be authorized by RAM. | The API call is not authorized. You must grant permissions to the account using Resource Access Management (RAM). |
| DataphinOpenApiForbidden | 403 | Openapi request is forbidden, maytenantId invalid. | The API request was rejected. Confirm that the tenant ID is valid. |
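Client code can map these error codes to a suggested next step. The following is a minimal sketch that covers only the two codes listed above. The names `ERROR_HINTS` and `hint_for` and the fallback text are assumptions made for illustration.

```python
# Suggested actions for the error codes documented in the table above.
ERROR_HINTS = {
    "DataphinOpenAPIRamUnAuthorized": "Grant permissions to the account using RAM.",
    "DataphinOpenApiForbidden": "Confirm that the tenant ID is valid.",
}


def hint_for(code):
    """Return a suggested action for a documented error code."""
    # Codes not listed in the table fall back to a generic hint.
    return ERROR_HINTS.get(code, "Unlisted error code. Check the Message field.")
```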