Getting started with DTS DataBridge Agent


DataBridge Agent is a multi-source data collection and parsing tool from Alibaba Cloud Data Transmission Service (DTS). It integrates the core data collection and parsing capabilities of DTS for sources such as databases, webpages, and documents with the intelligent O&M features of DTS Insight. Use it to efficiently acquire and standardize various types of heterogeneous data from a single, unified platform.

Get started

  1. Create a DataBridge Agent instance:

    1. Log on to the Data Transmission Service (DTS) console.

    2. In the left-side navigation pane, click DataBridge Agent.

    3. On the overview page, click Create instance on the right. Enter an instance name and select a region, and then click Create.

      Note
      • Instance initialization typically takes 5 to 10 minutes.

      • During the invitation-only testing period, each Alibaba Cloud account can create up to 10 DataBridge Agent instances.

  2. Start using the agent: After the DataBridge Agent instance is created, click the instance or click New conversation to begin. For example, to learn what features DTS supports, you can ask the agent to analyze the page What is Data Transmission Service (DTS) (https://help.aliyun.com/en/dts/product-overview/what-is-dts).

(Optional) Configure access credentials

If you need to use DataBridge Agent to manage your DTS tasks, you must configure access credentials for the current DataBridge Agent instance.

Warning
  • If you only need to query configuration or monitoring information for DTS tasks, you can grant read-only permissions for DTS (AliyunDTSReadOnlyAccess).

  • If you grant management permissions for DTS (AliyunDTSFullAccess) to the RAM user, DataBridge Agent can perform high-risk operations on your command, such as modifying task parameters, stopping tasks, or releasing tasks. Proceed with caution.
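As a rough sketch of how the permissions above might be granted from the command line, the helper below maps an access level to the two system policy names mentioned in this section and attaches the chosen one through the RAM AttachPolicyToUser operation. It assumes the aliyun CLI is installed and that the active profile is allowed to manage RAM; the function names and the user name dts-agent-user are hypothetical.

```shell
# Map an access level to the system policy named in this section.
dts_policy_for() {
  case "$1" in
    readonly) echo "AliyunDTSReadOnlyAccess" ;;
    full)     echo "AliyunDTSFullAccess" ;;
    *)        echo "usage: dts_policy_for readonly|full" >&2; return 1 ;;
  esac
}

# Attach the chosen system policy to a RAM user.
# Assumption: the calling profile has permission to manage RAM policies.
attach_dts_policy() {
  user="$1"
  level="$2"
  policy="$(dts_policy_for "$level")" || return 1
  aliyun ram AttachPolicyToUser \
    --PolicyType System \
    --PolicyName "$policy" \
    --UserName "$user"
}

# Example (hypothetical user name, not run here):
#   attach_dts_policy dts-agent-user readonly
```

Defaulting to readonly and switching to full only when the agent must change tasks keeps the high-risk operations opt-in.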

  1. Run the following command:

    aliyun configure

  2. When prompted, enter the AccessKey ID, AccessKey Secret, and default region of a RAM user that has DTS management permissions:

    AccessKey ID xxx
    AccessKey Secret xxx
    Region ID xxx
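If the interactive prompts are inconvenient, the same profile can probably be written in one step with aliyun configure set. The sketch below assumes that the aliyun CLI in your environment supports AK mode and that the credentials are supplied through three environment variables; the variable names and the function name are choices made for this example, not something the agent requires.

```shell
# Write an AccessKey profile without interactive prompts.
# Credentials come from environment variables so they stay out of shell history.
configure_dts_profile() {
  profile="${1:-default}"   # "default" matches what plain `aliyun configure` edits
  if [ -z "${ALIBABA_CLOUD_ACCESS_KEY_ID:-}" ]; then
    echo "missing environment variable: ALIBABA_CLOUD_ACCESS_KEY_ID" >&2
    return 1
  fi
  if [ -z "${ALIBABA_CLOUD_ACCESS_KEY_SECRET:-}" ]; then
    echo "missing environment variable: ALIBABA_CLOUD_ACCESS_KEY_SECRET" >&2
    return 1
  fi
  if [ -z "${ALIBABA_CLOUD_REGION_ID:-}" ]; then
    echo "missing environment variable: ALIBABA_CLOUD_REGION_ID" >&2
    return 1
  fi
  aliyun configure set \
    --profile "$profile" \
    --mode AK \
    --region "$ALIBABA_CLOUD_REGION_ID" \
    --access-key-id "$ALIBABA_CLOUD_ACCESS_KEY_ID" \
    --access-key-secret "$ALIBABA_CLOUD_ACCESS_KEY_SECRET"
}

# Example (not run here):
#   export ALIBABA_CLOUD_ACCESS_KEY_ID=xxx
#   export ALIBABA_CLOUD_ACCESS_KEY_SECRET=xxx
#   export ALIBABA_CLOUD_REGION_ID=xxx
#   configure_dts_profile
```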

Core features

Data analysis

You can retrieve information from data sources such as webpages, documents, and databases to generate analytical reports.

Examples

Analyze the data on https://help.aliyun.com/en/dts/product-overview/what-is-dts. What features does Alibaba Cloud DTS support?
Analyze the data on https://help.aliyun.com/en/model-studio/models. What models does Alibaba Cloud Model Studio support?

Complex file parsing

DataBridge Agent includes a dedicated DTS skill (aliyun-dts-file-parser) to parse PDFs and other documents into Markdown for document content extraction, PDF-to-text conversion, and document analysis.

Example

Parse file https://xxx.oss-cn-beijing.aliyuncs.com/dts.pdf?Expires=xxx&OSSAccessKeyId=xxx&Signature=xxx
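A signed OSS URL stops working once its Expires timestamp has passed, so it can help to check the link before pasting it into a conversation. The following is a minimal sketch that assumes the URL uses the Expires, OSSAccessKeyId, and Signature query parameters shown in the example; URLs signed with other schemes use different parameter names and are not covered. The function names are hypothetical.

```shell
# Print the value of one query parameter from a URL (no URL-decoding).
url_param() {
  # $1 = URL, $2 = parameter name
  printf '%s\n' "${1#*\?}" | tr '&' '\n' | sed -n "s/^$2=//p" | head -n 1
}

# Verify that a signed URL has the expected parameters and has not expired,
# then print the prompt to send to the agent.
check_signed_url() {
  url="$1"
  case "$url" in
    *\?*) ;;
    *) echo "no query string: URL is not signed" >&2; return 1 ;;
  esac
  for name in Expires OSSAccessKeyId Signature; do
    if [ -z "$(url_param "$url" "$name")" ]; then
      echo "missing parameter: $name" >&2
      return 1
    fi
  done
  expires="$(url_param "$url" Expires)"
  case "$expires" in
    *[!0-9]*) echo "Expires is not a Unix timestamp" >&2; return 1 ;;
  esac
  if [ "$expires" -le "$(date +%s)" ]; then
    echo "signed URL has expired" >&2
    return 1
  fi
  echo "Parse file $url"
}
```

On success the function prints the prompt text to send to the agent; on failure it prints the reason to standard error and returns a non-zero status.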

DTS task management

DataBridge Agent has a built-in skill (aliyun-dts-task-manager) for managing your DTS tasks.

Note
  • When managing a DTS task, enter the DTS instance ID, not the task name.

  • You can only manage tasks for which the precheck has passed and a DTS instance has been purchased.

Example

Query information for DTS task xxx
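To cross-check what the agent reports, the same task can likely be queried directly with the aliyun CLI. This sketch assumes that the DescribeDtsJobDetail operation of the public DTS API and its DtsInstanceID and RegionId parameters are available in your CLI version; neither is described on this page, and the function name is hypothetical.

```shell
# Query task details by DTS instance ID (not by task name).
# Assumption: the active aliyun profile has at least AliyunDTSReadOnlyAccess.
describe_dts_instance() {
  instance_id="$1"
  region="$2"
  if [ -z "$instance_id" ] || [ -z "$region" ]; then
    echo "usage: describe_dts_instance <dts-instance-id> <region-id>" >&2
    return 1
  fi
  aliyun dts DescribeDtsJobDetail \
    --DtsInstanceID "$instance_id" \
    --RegionId "$region"
}

# Example (placeholders, not run here):
#   describe_dts_instance xxx xxx
```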

Appendix: DTS-exclusive skills

| Skill name | Description |
| --- | --- |
| aliyun-dts-file-parser | Uses Alibaba Cloud DTS to parse PDFs and other documents into Markdown. Used for document content extraction, PDF-to-text conversion, and document analysis. |
| aliyun-dts-task-manager | Manages Alibaba Cloud DTS tasks, including creation, monitoring, and modification. |
| aliyun-dts-web-fetch | Generates and executes a curl command for the /extract API based on the URL that you enter. It uses a progressive intelligent extraction strategy to automatically identify and switch to the optimal pipeline. Trigger words: extract, scrape webpage, curl extract, fetch, retrieve webpage content, dts fetch. |