Data Development Overview

Last updated: 2026-06-02 20:38:49

Data development covers two areas: code development for building compute tasks (SQL, Shell, Python, MR, Spark) and standard modeling for creating logical data models.

Prerequisites

Activate Dataphin before you start. Real-time development also requires the real-time development value-added service.

Features

  • Standard Modeling: Design data models from a business perspective with standardization, normalization, and automation. Prevents metric calculation inconsistencies and visualizes data warehouse model construction. Submitting a logical table automatically generates an intelligence data engine scheduling task for data production.

    Note

    The Standard Modeling feature is available only in projects associated with the data section.

  • Data Processing: Build complex data models, synchronization tasks, and code tasks through coding.

  • Ad Hoc Query: Run thematic queries on business data using the logical model, without dealing with physical model complexities.

  • Dual Development Modes: Dataphin supports Basic and Dev-Prod modes:

    • Basic projects in the Basic data section support standard modeling. Basic projects in the Prod data section support only data processing and ad hoc query.

    • Dev projects support standard modeling, data processing, and ad hoc query. Prod projects support standard modeling and data processing.

  • Intelligent Editor: The Dataphin code editor provides code highlighting, intelligent code hinting, and permission verification to improve coding efficiency.
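The rules listed under Dual Development Modes can be read as a lookup table. The following is a minimal Python sketch of that reading, not a Dataphin API: the constant names, the `SUPPORTED` mapping, and the `supports` helper are all hypothetical labels introduced for illustration.

```python
from typing import Optional

STANDARD_MODELING = "standard modeling"
DATA_PROCESSING = "data processing"
AD_HOC_QUERY = "ad hoc query"

# Keys are (project type, data section type). The data section type only
# matters for Basic projects, so it is None for Dev and Prod projects.
# This mapping is an illustrative assumption, not part of Dataphin.
SUPPORTED = {
    ("Dev", None): {STANDARD_MODELING, DATA_PROCESSING, AD_HOC_QUERY},
    ("Prod", None): {STANDARD_MODELING, DATA_PROCESSING},
    ("Basic", "Prod"): {DATA_PROCESSING, AD_HOC_QUERY},
    # The text above names only standard modeling for this case; other
    # features are not stated there, so they are left out of the sketch.
    ("Basic", "Basic"): {STANDARD_MODELING},
}


def supports(project_type: str, feature: str,
             data_section: Optional[str] = None) -> bool:
    """Return True if the sketch lists the feature for this project type."""
    return feature in SUPPORTED.get((project_type, data_section), set())


if __name__ == "__main__":
    print(supports("Dev", AD_HOC_QUERY))
    print(supports("Prod", AD_HOC_QUERY))
    print(supports("Basic", STANDARD_MODELING, "Prod"))
```

Keying the table on both project type and data section keeps the two Basic cases separate, which is the distinction the list draws.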

Access data development

On the Dataphin home page, click Development in the top menu bar to open the Development page.

Interface overview

Note

The Standard Modeling feature is not supported in the Basic Development Edition or Agile Development Edition of Dataphin.

Area

Description

Global Search and Code Search

  • Global Search: Search for physical tables, logical tables, meta tables, standard definitions, functions, resources, or templates by keyword. The Development and Asset tabs filter results by category.

    • Project Switch: Switch between Dev and Prod projects in a Dev-Prod environment.

    • Perspective Switch: View objects from development or asset perspectives, each showing different object types.

      • Development Perspective: Includes physical tables, logical tables, meta tables, standard definitions, functions, resources, and templates.

      • Asset Perspective: Comprises physical tables, logical tables, and meta tables.

    • Object list: Displays matched objects. Supports global and project perspective switching. Filter by all or a specific object type. Click an Object Name to locate the object.

  • Code Search: Click the code search icon to open Code Search and locate compute tasks that contain specific code:

    • Code Search Input Box: Enter keywords to search the code of compute tasks. Use the icons in the input box to switch to multi-code input or to open Advanced Search.

    • Search Result List: Shows compute tasks matching the code. Clicking a task reveals its details.

    • Task Match Details: Displays code matches within compute tasks, including line numbers and match counts.

Note
  • Code search covers only tasks in the Submitted, Developing, or Published status.

  • A maximum of 50 tasks can be matched per code logic. Code search is project-specific.

  • Code search applies only to code submitted after the version upgrade on July 14, 2020.
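The limits in the note above can be pictured as filters applied before matching. The following is a rough Python sketch under stated assumptions, not how Dataphin implements code search: the `Task` record, its fields, and `search_code` are hypothetical, and treating code submitted on July 14, 2020 itself as searchable is an assumption.

```python
from dataclasses import dataclass
from datetime import date

SEARCHABLE_STATUSES = {"Submitted", "Developing", "Published"}
UPGRADE_DATE = date(2020, 7, 14)   # version upgrade date from the note
MAX_MATCHED_TASKS = 50             # cap on matched tasks from the note


@dataclass
class Task:
    """Hypothetical in-memory stand-in for a compute task."""
    name: str
    project: str
    status: str
    submitted_on: date
    code: str


def search_code(tasks, project, keyword):
    """Return up to MAX_MATCHED_TASKS tasks in one project whose code
    contains the keyword, with matching line numbers and a match count."""
    results = []
    for task in tasks:
        if task.project != project:                # search is project-specific
            continue
        if task.status not in SEARCHABLE_STATUSES:
            continue
        if task.submitted_on < UPGRADE_DATE:       # assumption: cutoff day included
            continue
        lines = [number
                 for number, line in enumerate(task.code.splitlines(), start=1)
                 if keyword in line]
        if lines:
            count = sum(line.count(keyword) for line in task.code.splitlines())
            results.append({"task": task.name, "lines": lines,
                            "match_count": count})
        if len(results) >= MAX_MATCHED_TASKS:
            break
    return results
```

Each result carries the line numbers and the match count, mirroring what Task Match Details displays.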

Project and Environment

  • Project: Displays the active project name. Click the icon to select a project classification (Prod, Dev, or Basic) from the dropdown, or search for a project by name.

  • Environment: Click the icon to switch between the Production and Development environments.

Note

Basic projects default to the production environment and do not differentiate between environments.

Feature Items

The data development section includes modules for standard modeling, data processing, recently opened objects, the recycle bin, ad hoc query, and run records.

  • Standard Modeling: Covers logical tables: logical dimension tables, logical fact tables, atomic metrics, business filters, metrics, and logical aggregate tables. For more information, see Data standardization and modeling.

  • Data Processing: Manages compute task capabilities: table management, compute tasks, compute task templates, resources, and functions.

    • Table Management: Manages offline physical tables and real-time compute tables used in compute task development.

      • Offline physical table: Set up and manage offline physical tables for compute task development. For more information, see Create offline physical table.

      • Real-time Compute Meta Table: Manages meta tables and mirror tables used in real-time task development.

        • Meta table: A cross-storage type table in Data Management. Create and manage input tables, output tables, and dimension tables for real-time task development. For more information, see Create and manage meta tables.

        • Mirror table: Maps streaming tables to offline tables for real-time tasks that integrate streaming and batch processing. Reference the mirror table to enable sync operations between streaming and offline tables. For more information, see Create mirror table.

    • Compute tasks: Create compute tasks in SQL, MR, Jar, Shell, Python, and Virtual formats. For more information, see Compute tasks.

    • Compute Task Templates: Create templates for offline and real-time compute tasks to improve development efficiency.

    • Resources: Store and manage files for code development, such as JAR, JSON, Python, and other resource files. For more information, see Upload resources and references.

    • Functions: Displays built-in functions and user-defined functions supported by the compute engine. Each function shows its name, type, command format, and description. Available functions depend on the compute source bound to the project.

      • Built-in Functions: Vary by compute engine. Check the system-displayed functions for details.

      • User-defined functions: Create custom functions by uploading JAR resources or by other methods. For more information, see Create user-defined functions.

  • Recently opened: Lists recently accessed compute nodes, functions, tables, and other objects.

  • Recycle Bin: Objects deleted from data development move to the recycle bin, where you can restore or permanently delete them.

  • Ad hoc query: Customize and run query statements and download data based on business needs. For more information, see Query and download data.

  • Run Records: Stores the past 15 days of ad hoc queries, compute task executions, logical data table previews, derived metric smoke tests, asset data previews, and OpenAPI data queries. For more information, see How to view and manage run records.

Object List Directory

Displays all objects in the data development section, both user-created and built-in.

Data Development Welcome Page

Outlines the workflow and development tools for data development. To create an object, click a path point or tool block, then click the icon.
