Historical Data Cleanup

更新时间:
复制 MD 格式

Use Historical Data Cleanup in DMS to periodically remove old data from large tables, free up storage, and improve query performance.

Prerequisites

  • The database is MySQL.

  • The database instance uses the Stable Change or Security Collaboration control mode.

Procedure

  1. Log in to DMS 5.0.

  2. In the top navigation bar, choose Database Development > Data Change > Historical Data Cleanup.

    Note

    In DMS simple mode, click the 2023-01-28_15-57-17.png icon in the upper-left corner, then select All Features > Database Development > Data Change > Historical Data Cleanup.

  3. On the Data Change ticket application page, configure the ticket parameters and click Submit.

    Key parameters:

    Parameter

    Description

    Database

    Select a database for which you have change permissions. Read-only or table-level permissions are not sufficient. View My Permissions.

    Deletion Settings

    Enter the Table Name, Time Field, Time Accuracy, Retention Period (Days), and Filter Condition (Nullable). The system automatically generates a cleanup script based on this information.

    Note
    • For logical tables, enter the logical table name.

    • The retention period defines when data is automatically deleted. For example, a 7-day retention period removes data older than 7 days.

    For example: If Table Name is api_call_record_11, Time Field is gmt_create, Retention Period is 7, and Filter Condition is status = 1 or status=2, the following SQL statement is generated: DELETE FROM `api_call_record_11` WHERE `gmt_create` < SUBDATE(CURDATE(),INTERVAL 7 DAY) AND (status = 1 or status=2);

    Schedule

    DMS removes data in batches based on the primary key or a non-null unique key. Schedule cleanup during off-peak hours at low frequency to minimize performance impact.

    Note
    • Actual execution may deviate up to one minute from the scheduled time.

    • The minimum interval for scheduled execution is one hour. By default, the task runs at 02:00 every day.

    Policy Configuration

    Specify an execution duration. The task pauses automatically after the specified duration to avoid impacting services during peak hours.

    • Execute Task Without End Time.

    • Specify End Time (Hours): Set a duration limit to prevent downstream sync pipelines (such as DTS or AnalyticDB) from being affected.

    After specifying the duration, you can enable Periodically Optimize Table (defragmentation), disabled by default. This runs OPTIMIZE TABLE after a specified number of cleanup cycles. Default: every 60 cycles.

    Note
    • The OPTIMIZE TABLE feature is supported only for RDS for MySQL and PolarDB for MySQL databases.

    • The OPTIMIZE TABLE operation is bound by the execution duration in the policy configuration. The OPTIMIZE TABLE operation stops when that duration ends.

    Change Stakeholder

    Specified stakeholders can view and collaborate on the ticket. Only administrators and DBAs can access it otherwise.

  4. After submitting the ticket, you can enable a replication lag check, set a threshold, or modify the SQL.

    • (Optional) Enable a replication lag check to prevent excessive lag from affecting primary/standby switchovers.

      In the Basic Information section, click chunk option to set a replication lag threshold in seconds. SQL execution is interrupted if the lag exceeds this threshold.

      Note

      Currently, this feature is supported only for ApsaraDB RDS for MySQL databases.

    • (Optional) Modify the SQL.

      The system runs an SQL precheck automatically. If the precheck fails, modify the SQL based on the failure reason and retry.

    Note

    Before submitting for approval, you can modify the batch execution and scheduling configurations. Once submitted, these settings cannot be changed.

  5. Click Submit. In Security Collaboration mode, the ticket is sent for approval per the configured rules. In Stable Change mode, it is automatically approved.

  6. After approval, the system generates a scheduled task and emails the ticket owner. In the Basic Information section, click View Scheduled Tasks to view scheduling details. You can also:

    • Pause Schedule

      Note

      To permanently disable the schedule, open the ticket details page, click Close Ticket in the upper-right corner, enter a reason, and click Submit.

    • Resume Schedule

      Note

      After a ticket is closed, you must submit a new ticket to resume the schedule.

    • Change Ticket Owner

      The submitter is the ticket owner by default. Only the owner can pause or resume the schedule, and execution notifications are sent exclusively to the owner.

  7. The system executes the cleanup SQL according to the schedule policy. View scheduling details and execution history in the ticket.

    Note

    If a cleanup task is already running at the scheduled time, a new task is not created. Configure the execution frequency accordingly.

FAQ

  • Q: Will running OPTIMIZE TABLE as part of a Historical Data Cleanup task impact business?

    A: It depends. With Lock-free Schema Change enabled, OPTIMIZE TABLE does not impact business. Without it, run OPTIMIZE TABLE during off-peak hours. To enable Lock-free Schema Change, see Enable lock-free schema change.

  • Q: How can I stop an OPTIMIZE TABLE operation that is taking too long?

    A: Navigate to the Ticket Details page and pause the task in the Execute section.