Product tutorials

DataWorks provides several tutorials that walk you through the complete workflow: preparing your environment, collecting data, processing data, and displaying results. They help you understand the end-to-end process and learn the core features of DataWorks.

Comprehensive example: User profile analysis

This tutorial uses a case study of website user profile analysis to demonstrate the end-to-end process, which includes data integration, data warehouse construction in Data Studio, and data governance. With DataWorks, you can synchronize website user information and behavioral logs, clean them at a fine-grained level, and build a comprehensive user profile model.

  • Tutorial link: Comprehensive: Website user profile analysis

  • Compute engines used in the tutorial: You can choose MaxCompute, StarRocks, or EMR.

  • Modules involved: Data Integration, Data Studio, Operation Center, Data Quality, Data Map, DataService Studio, and DataAnalysis.
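The cleaning and profile-building steps described above can be sketched in plain Python. This is only an illustration of the idea, not the tutorial's implementation: the record layout, field names, and sample values below are hypothetical, and the actual tutorial runs on a compute engine such as MaxCompute rather than in local code.

```python
from collections import Counter

# Hypothetical sample inputs; the real tutorial data and schema may differ.
users = {
    "u001": {"gender": "F", "age_range": "20-30"},
    "u002": {"gender": "M", "age_range": "30-40"},
}
raw_logs = [
    "u001|/home|200",
    "u001|/cart|200",
    "u002|/home|404",
    "malformed line",
    "u001|/home|200",
]


def clean_logs(lines):
    """Keep only well-formed page views with HTTP status 200."""
    cleaned = []
    for line in lines:
        parts = line.split("|")
        if len(parts) != 3:
            continue  # drop records that do not match the expected layout
        user_id, page, status = parts
        if status != "200":
            continue  # drop failed requests
        cleaned.append((user_id, page))
    return cleaned


def build_profiles(user_table, views):
    """Join user attributes with per-user page-view counts."""
    counts = {}
    for user_id, page in views:
        counts.setdefault(user_id, Counter())[page] += 1
    result = {}
    for user_id, attrs in user_table.items():
        pages = counts.get(user_id, Counter())
        result[user_id] = {
            **attrs,
            "total_views": sum(pages.values()),
            "top_page": pages.most_common(1)[0][0] if pages else None,
        }
    return result


profiles = build_profiles(users, clean_logs(raw_logs))
```

Here `profiles` maps each user ID to its static attributes plus two derived behavioral fields. Users with no valid log records still receive a profile, with zero views and no top page.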

Tutorials for different modules

Data development and scheduling

Quickly experience an ETL workflow

DataWorks provides a collection of extract, transform, and load (ETL) workflow templates that illustrate the product's best practices. You can import a template into your workspace with one click to replicate the example and explore the product's capabilities.

  • Compute engines used in the tutorial: Different ETL templates use different compute engines, including MaxCompute, Function Compute, and PAI.

  • Modules involved: Data Integration and Data Development.
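For readers new to the term, the three ETL stages can be sketched with the Python standard library. This is a generic illustration under stated assumptions, not one of the DataWorks templates: the CSV content, the `orders` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for a real target engine.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for a real data source.
SOURCE_CSV = """order_id,amount,currency
1,10.50,usd
2,,usd
3,7.25,USD
"""


def extract(text):
    """Extract: read rows from CSV text as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: drop rows without an amount, then normalize types and casing."""
    records = []
    for row in rows:
        if not row["amount"]:
            continue  # skip incomplete records
        records.append(
            (int(row["order_id"]), float(row["amount"]), row["currency"].upper())
        )
    return records


def load(conn, records):
    """Load: write the cleaned records into the target table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER, amount REAL, currency TEXT)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(conn, transform(extract(SOURCE_CSV)))
total = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]
```

Keeping each stage in its own function mirrors how a workflow splits the work into separate nodes, so a stage can be rerun or replaced without touching the others.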

Data analysis and visualization

Query, analyze, and visualize data using public datasets

DataWorks provides a collection of official, real-world datasets in which sensitive data has been masked. Each dataset includes SQL queries for specific business scenarios. Select a public dataset that interests you, run the sample SQL, and then generate visual charts and reports from the analysis results to quickly explore DataWorks features.

  • Compute engines used in the tutorial: MaxCompute, Hologres, and EMR Spark.

  • Modules involved: DataAnalysis.

Analyze public datasets for big data and AI

This tutorial shows you how to use DataWorks and MaxCompute, the cloud-native big data computing service, to analyze public datasets for big data and AI, such as data from Taobao, Fliggy, AliMusic, GitHub, and TPC. It helps you quickly perform big data analysis and become familiar with the DataWorks interface and its basic data analysis capabilities.

  • Compute engines used in the tutorial: MaxCompute.

  • Modules involved: DataAnalysis.

Subscribe to, query, and analyze bill data

In Expenses and Costs, you can subscribe to different types of bill data, such as detailed bills by billing item and daily summary bills. After you subscribe, the bill data is regularly synchronized to MaxCompute. You can then use the data analysis feature of DataWorks to query and analyze the bill data, generate visual charts and reports from the results, and share your Alibaba Cloud consumption analysis report with other users.

  • Compute engines used in the tutorial: MaxCompute.

  • Modules involved: DataAnalysis.

Synchronize and analyze real-time GitHub data

This tutorial is based on the GitHub Archive public dataset. It shows how to use DataWorks to synchronize more than 20 types of GitHub event data, such as project and behavior events, to Hologres in real time for analysis. It also uses built-in DataV templates to quickly build a real-time data dashboard, so that you can follow real-time changes in GitHub data across dimensions such as developers, projects, and programming languages.

  • Compute engines used in the tutorial: Hologres.

  • Modules involved: Data Integration and Data Development.

Data warehouse modeling and configuration

Quickly model a data warehouse using data model templates

Many small and medium-sized enterprises (SMEs) find building data warehouse models challenging because the process requires specialized talent, long development cycles, and high costs. To address these problems, the DataWorks intelligent data modeling team worked with experienced data architects and drew on a decade of experience from millions of Alibaba Cloud users across many business scenarios. By combining this experience with Alibaba Group's technology, they provide best-practice industry models that cover retail, e-commerce, finance, manufacturing, and other fields.

  • Compute engines used in the tutorial: Applicable to all engines.

  • Modules involved: Data modeling.

Retail and e-commerce data modeling

The DataWorks intelligent data modeling product includes a data warehouse industry model template for retail and e-commerce, which you can import with one click. This tutorial uses a retail and e-commerce business scenario and the core model-building steps to help you understand dimensional modeling theory and the intelligent data modeling product.

  • Compute engines used in the tutorial: Applicable to all engines.

  • Modules involved: Data modeling.

Cloud-native integrated data warehouse

This tutorial combines products such as DataWorks, MaxCompute, and Hologres. It explains data warehouse capabilities in detail, including offline and real-time processing, analysis services, data modeling, data governance, and the data lakehouse architecture.

  • Compute engines used in the tutorial: DataWorks, MaxCompute, and Hologres.

  • Modules involved: Data Integration, DataAnalysis, Data modeling, and Data Governance Center.

Build and optimize a data warehouse

This tutorial is based on MaxCompute. It describes how to optimize a data warehouse in several stages, including data research, data domain partitioning, building a bus matrix, and defining statistical metrics.

  • Compute engines used in the tutorial: MaxCompute.

  • Modules involved: Data modeling.

Build an enterprise data warehouse based on AnalyticDB

This tutorial describes how to build an enterprise data warehouse based on AnalyticDB and perform operations such as O&M and metadata management.

  • Compute engines used in the tutorial: AnalyticDB for MySQL.

  • Modules involved: Data Integration, Data Development, Operation Center, and Data Map.

Build a retail and e-commerce data warehouse

This tutorial uses an experiment that builds a data warehouse for the retail and e-commerce industry as an example. It describes the technology selection, technical flow, and implementation process of DataWorks in data warehouse construction, which helps you gain a deeper understanding of Alibaba Cloud DataWorks.

  • Compute engines used in the tutorial: Applicable to all engines. This tutorial uses MaxCompute as an example.

  • Modules involved: Data Integration, Data modeling, Data Development, Operation Center, Data governance, and DataService Studio.

Data governance

Efficient data governance implementation guide

This tutorial walks you through the operational process of a governance owner. It shows you how to use the data governance planning feature to efficiently set and achieve data governance goals.

  • Compute engines used in the tutorial: MaxCompute and E-MapReduce.

  • Modules involved: Data Governance Center.