Instance generation mode: Immediately after deployment

To have a node generate and run instances immediately after it is deployed to the production environment, set its instance generation mode to Immediately After Deployment.

Background

After you deploy a node, you can view its latest configuration in Operation Center. Every night, DataWorks generates auto triggered instances for the next day's schedule based on the configuration of each auto triggered task. When you deploy a new or updated node to the production environment, the selected instance generation mode determines when the changes affect auto triggered instances and their dependencies.

In DataWorks, the instance generation mode provides two options that control when your changes take effect: Next Day and Immediately After Deployment.

  • Next Day: Node creation and update operations affect the auto triggered instances of the next day. If a task must run immediately after being deployed to the production environment, you can run a data backfill operation for the task.

  • Immediately After Deployment: Node creation and update operations take effect immediately. However, a time lag exists between deployment and when runnable instances are generated. This lag has different effects depending on the scenario. For more information, see Common scenarios for immediate instance generation.

Notes

  • If you set the instance generation mode to Immediately After Deployment, changes to rerun properties do not apply to instances that have already expired.

  • Nodes within a workflow cannot be individually configured for immediate generation. This option must be configured for the entire workflow on its scheduling configuration page.

  • Regardless of whether you choose Next Day or Immediately After Deployment in the scheduling configuration, the system generates all auto triggered instances for the next day between 23:30 and 24:00 daily. Tasks deployed during this period will not generate instances until the day after tomorrow.

  • Inconsistent instance generation modes for upstream and downstream tasks may create isolated nodes.

  • Time lag for immediate instance generation: To prevent unexpected behavior, a 10-minute lag is built into the immediate generation process. The scheduled time of a task must be at least 10 minutes after the deployment time for the task to run with the latest configuration.

  • Scope of immediate instance generation: Not all changes take effect immediately. For example, if you modify the data source for a node and then deploy it with immediate generation enabled, the change does not affect existing instances for that day. The daily auto triggered instances still run using the data source from before the change.

    Note

    You can run a data backfill operation on the task with the latest configuration. The data backfill process uses the latest task configuration.
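
The timing rules in these notes can be sketched in a few lines of Python. This is only an illustration, not DataWorks code: the function name `first_instance_date`, the `immediate` flag, and the choice to treat 23:30 as inclusive are all assumptions made for the example.

```python
from datetime import date, datetime, time, timedelta

# Assumption: the nightly generation window runs from 23:30 (inclusive) to midnight.
NIGHTLY_WINDOW_START = time(23, 30)


def first_instance_date(deployed_at: datetime, immediate: bool) -> date:
    """Earliest date whose auto triggered instances can reflect this deployment."""
    if deployed_at.time() >= NIGHTLY_WINDOW_START:
        # Deployed while the next day's instances are being generated:
        # instances appear only on the day after tomorrow, in either mode.
        return deployed_at.date() + timedelta(days=2)
    if immediate:
        # Immediately After Deployment: instances are generated on the same day.
        return deployed_at.date()
    # Next Day: the change affects the next day's instances.
    return deployed_at.date() + timedelta(days=1)


print(first_instance_date(datetime(2024, 1, 1, 12, 0), immediate=True))
print(first_instance_date(datetime(2024, 1, 1, 23, 45), immediate=True))
```

The sketch only answers which day is affected. Whether a same-day instance actually runs still depends on the 10-minute lag described above.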

Immediate instance generation

The immediate instance generation feature applies only to tasks scheduled for a future time. An instance runs as expected only if its scheduled time is at least 10 minutes after its deployment time.

  • When a new task is created, scheduled instances are generated for that day, but only instances whose scheduled time is in the future will actually run.

  • If you update the schedule time of a node to a past time, no instance is generated. If the schedule time is in the future, new instances are generated based on the new configuration and replace the previous instances.

    Note

    The scheduled time must be at least 10 minutes after the node deployment time for instances to be generated immediately.

image

Scheduled time falls within the normal execution window

  • Scenario 1: A new node generates runnable scheduled instances on the day it is created. If the scheduled time of the instance is at least 10 minutes after the deployment time, the instance is scheduled and runs normally. For more information, see Immediate instance generation after deploying a new node.

  • Scenario 2: After you update a node configuration, if the scheduled time of the instance is at least 10 minutes after the deployment time, the instance is scheduled and runs normally with the updated configuration. For more information, see Update the schedule of a deployed task.

  • Scenario 3: Changing the schedule time of a node affects its downstream dependencies. For more information, see Impact of schedule time changes on downstream dependencies.

Important

We recommend that you do not use this feature when modifying the schedule settings of production nodes. This feature may cause dependency changes, dependency inconsistencies, instance replacement, or instance deletion, which can make the dependencies complex for the current day. However, the task dependencies return to normal the next day.

Scheduled time falls within the dry-run window

If the scheduled time is before the deployment time, or less than 10 minutes after it, auto triggered instances are still generated, but the instances perform a dry run. The instance status is Expired Instance Generated in Real Time, and no actual code logic is executed. For more information, see Immediate instance generation after deploying a new node.

  • Scenario 1: The scheduled time is within 10 minutes after the deployment time. The instance status is Expired Instance Generated in Real Time.

    Example: Node A has a scheduled time of 09:05, and the deployment time is 09:00. Because the scheduled time is after the deployment time but the time difference is less than 10 minutes, node A generates a dry-run instance with the status Expired Instance Generated in Real Time.

  • Scenario 2: The scheduled time is before the deployment time. An instance with the status Expired Instance Generated in Real Time is immediately generated.

    Example: Node A has a scheduled time of 09:00, and the deployment time is 10:00. Because the scheduled time is before the deployment time, node A immediately generates a dry-run instance with the status Expired Instance Generated in Real Time.
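
The run versus dry-run decision described in the two windows above reduces to one comparison against the 10-minute lag. The following Python sketch shows that comparison; the function name `instance_outcome` and the returned labels are hypothetical and are not DataWorks identifiers.

```python
from datetime import datetime, timedelta

# The 10-minute lag built into immediate instance generation.
GENERATION_LAG = timedelta(minutes=10)


def instance_outcome(deployed_at: datetime, scheduled_at: datetime) -> str:
    """Classify a same-day instance as a normal run or a dry run."""
    if scheduled_at - deployed_at >= GENERATION_LAG:
        # Scheduled at least 10 minutes after deployment: normal execution window.
        return "runs normally"
    # Scheduled earlier than that, including any time before deployment:
    # the status is Expired Instance Generated in Real Time.
    return "dry run"


day = datetime(2024, 1, 1)  # arbitrary example date
# Deployed at 09:00, scheduled for 09:05 (dry-run Scenario 1).
print(instance_outcome(day.replace(hour=9), day.replace(hour=9, minute=5)))
# Deployed at 10:00, scheduled for 09:00 (dry-run Scenario 2).
print(instance_outcome(day.replace(hour=10), day.replace(hour=9)))
```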

Common scenarios for immediate instance generation

When instances are generated in Immediately After Deployment mode, instance execution and upstream and downstream dependency behavior vary by scenario as follows:

Immediate instance generation after deploying a new node

When a new task is deployed, instances are generated immediately on the same day. Whether these instances actually run depends on the schedule time of the task. For details, see the following table:

| Scenario | Description |
| --- | --- |
| The schedule time is in the future relative to when the instance takes effect | DataWorks generates runnable scheduled instances based on the schedule time and runs them. The Immediately After Deployment policy affects only the instance execution on the deployment day. Whether instances are replaced depends on whether the scheduled time is at least 10 minutes after the deployment time. For more information, see Background. |
| The schedule time is in the past relative to when the instance takes effect | DataWorks generates an expired dry-run instance with the status Expired Instance Generated in Real Time. This instance does not actually run. If you need to process the current day's data, you can run a data backfill operation for the previous business date. This operation also has a 10-minute time lag when generating instances. For more information, see Background. |

Example: Assume the task is deployed to the production environment at 12:00. The effective time for immediate instance generation is 12:10.

  • If the schedule time of the task is 12:10 or later, the task runs as scheduled.

  • If the schedule time of the task is before 12:10, the task performs a dry run, and the instance status is Expired Instance Generated in Real Time.

Update the schedule of a deployed task

After you update the schedule time of a production task and deploy it, instances from before and after the change may coexist on the same day, which can make dependencies complex. Unless necessary, we recommend that you do not use the Immediately After Deployment mode for deployed tasks. The following example illustrates a scenario where the schedule is changed from every 6 hours to daily.

Note

This scenario occurs only on the day when the task is deployed with immediate instance generation. The next day, instances are generated normally based on the configuration.

  • Case 1: The schedule is changed from every 6 hours to daily, and the daily schedule time is in the past.

    At 09:00, the schedule time is changed to a past time, from every 6 hours to daily at 08:00. The instance dependencies for that day are as follows:

    image
  • Case 2: The schedule is changed from every 6 hours to daily, and the daily schedule time is in the future.

    At 09:00, the schedule time is changed to a future time, from every 6 hours to daily at 18:00. The instance dependencies for that day are as follows:

    image
    • Instance generation: A daily instance A3 is generated based on the new configuration after 09:00.

    • Instance replacement: The new instance A3 replaces the original instances A3 and A4.

    • Instance retention: Hourly instances scheduled before 09:10 are retained.

Note
  • The schedule time is in the future: DataWorks replaces the previously generated future instances based on the latest schedule settings.

  • The schedule time is in the past: DataWorks retains instances before the effective time of the change and replaces or deletes instances after the effective time of the change.
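
The retention and replacement rules for Case 1 and Case 2 can be approximated with the sketch below. It assumes the every-6-hours instances A1 to A4 are scheduled at 00:00, 06:00, 12:00, and 18:00, and that the change takes effect 10 minutes after deployment. The helper `at`, the function `plan_same_day_instances`, and the result keys are hypothetical names chosen for illustration.

```python
from datetime import datetime, timedelta

GENERATION_LAG = timedelta(minutes=10)  # lag between deployment and effective time


def at(hour: int, minute: int = 0) -> datetime:
    """Hypothetical helper: a time of day on an arbitrary example date."""
    return datetime(2024, 1, 1, hour, minute)


def plan_same_day_instances(deployed_at, old_times, new_times):
    """Split same-day instances into retained, replaced or deleted, and generated."""
    effective_at = deployed_at + GENERATION_LAG
    return {
        # Old instances scheduled before the effective time are kept.
        "retained": [t for t in old_times if t < effective_at],
        # Old instances at or after the effective time are replaced or deleted.
        "replaced_or_deleted": [t for t in old_times if t >= effective_at],
        # New instances are generated only for schedule times still in the future.
        "generated": [t for t in new_times if t >= effective_at],
    }


every_6_hours = [at(0), at(6), at(12), at(18)]  # assumed times for A1..A4
case_1 = plan_same_day_instances(at(9), every_6_hours, [at(8)])   # daily at 08:00
case_2 = plan_same_day_instances(at(9), every_6_hours, [at(18)])  # daily at 18:00
print(case_2)
```

Under these assumptions, Case 2 retains the 00:00 and 06:00 instances and generates one new 18:00 instance in place of the original 12:00 and 18:00 instances. Case 1 generates no new instance because 08:00 is already in the past.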

After you change the schedule time and deploy the node, whether instances are regenerated and actually run on the same day depends on the schedule time settings and the time when the changes are deployed to the production environment. For more information, see Background.

Impact of schedule time changes on downstream dependencies

When instances are generated immediately after deployment, downstream tasks set their dependencies based on the latest schedule settings of the upstream task. The schedule can be daily, monthly, or hourly.

Note

For a production task whose schedule time is changed, downstream instances set dependencies for both newly generated instances and unreplaced old instances based on the latest schedule settings. For information about dependency behavior in hourly and minutely schedules, see Impact of schedule time changes on downstream dependencies. This scenario occurs only when the node version to be deployed has its instance generation mode set to Immediately After Deployment and includes a schedule time change.

The following examples illustrate this scenario:

  • Case 1: The upstream node is changed from every 6 hours to every 8 hours, with immediate instance generation selected.

    image
  • Case 2: The upstream node is changed from every 6 hours to daily at 16:00, with immediate instance generation selected.

    image

After you change the schedule time of a task and select immediate instance generation, downstream instances adjust their dependencies based on the latest schedule settings. They set dependencies on both the newly generated instances and the unreplaced old instances, which keeps the scheduling logic correct.

Inconsistent instance generation modes for upstream and downstream tasks

If both upstream and downstream nodes are newly created and their instance generation modes differ (for example, the upstream uses Next Day and the downstream uses Immediately After Deployment), isolated nodes are created. Isolated nodes are not automatically scheduled to run. If an isolated node has many downstream dependencies, this can lead to serious consequences.

image
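
A pre-deployment check for this situation could look like the following sketch. It is not a DataWorks API: the `NewNode` class, the mode strings `"next_day"` and `"immediate"`, and the `mismatched_edges` function are hypothetical, and the check only flags upstream and downstream pairs of new nodes whose modes differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NewNode:
    """Hypothetical description of a newly created node."""
    name: str
    mode: str  # assumed values: "next_day" or "immediate"


def mismatched_edges(nodes, edges):
    """Return (upstream, downstream) pairs whose instance generation modes differ."""
    by_name = {node.name: node for node in nodes}
    return [
        (up, down)
        for up, down in edges
        if by_name[up].mode != by_name[down].mode
    ]


nodes = [
    NewNode("A", "next_day"),
    NewNode("B", "immediate"),
    NewNode("C", "immediate"),
]
edges = [("A", "B"), ("B", "C")]  # A is upstream of B, B is upstream of C
print(mismatched_edges(nodes, edges))
```

Any pair the check reports is a candidate for isolated nodes, so aligning the modes before deployment avoids the problem.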