Group LLM data processing components

Group LLM data processing (DLC) components in Machine Learning Designer to run them as a single batch job. No intermediate data is written to storage between components, which reduces I/O overhead and speeds up your pipeline.

How it works

A group acts as a single execution unit that wraps multiple LLM-DLC components. When the group runs, all enclosed components execute in sequence without persisting intermediate results to storage. The group provides a unified set of configurations and produces a collective output consumed by downstream components.

image
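
The execution model described above can be pictured as a chain of in-memory steps. The following is a minimal sketch for illustration only, not how Machine Learning Designer is implemented: the functions `normalize_whitespace`, `remove_urls`, and `run_group` are hypothetical stand-ins for grouped components and the group itself.

```python
import re
from typing import Callable, Dict, List

Record = Dict[str, str]
Component = Callable[[Record], Record]


def normalize_whitespace(record: Record) -> Record:
    """Hypothetical stand-in for a text normalization component."""
    return {**record, "text": " ".join(record["text"].split())}


def remove_urls(record: Record) -> Record:
    """Hypothetical stand-in for a special-content cleaning component."""
    cleaned = re.sub(r"https?://\S+", "", record["text"])
    return {**record, "text": " ".join(cleaned.split())}


def run_group(components: List[Component], records: List[Record]) -> List[Record]:
    """Run every component in order, handing results along in memory.

    Nothing is written to storage between steps; only the final list is
    returned, which mirrors the single collective output of a group.
    """
    for component in components:
        records = [component(r) for r in records]
    return records


group_output = run_group(
    [normalize_whitespace, remove_urls],
    [{"text": "Hello   world  https://example.com  again"}],
)
print(group_output)
```

Run as separate jobs, each step would have to write its result to storage and the next step would read it back; here the intermediate list exists only in memory.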

Limitations

  • Currently, LLM-Risk Content Filtering (DLC) and LLM-Quality Predict (DLC) under the Large Model Data Preprocessing folder do not support grouping.

  • Some components do not support multi-node distributed operation. If a group contains such components, multi-node tasks fail. To check whether a component supports multi-node distributed operation, open the component's Tuning tab. If the Nodes parameter can be set to a value greater than 1, the component supports multi-node distributed operation.

image
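
The multi-node limitation amounts to a simple pre-flight check: a group can run on more than one node only if every enclosed component can. The sketch below assumes you have recorded, by hand from each component's Tuning tab, the largest value its Nodes parameter accepts. The `ComponentInfo` class, the `max_nodes` field, and the component names are hypothetical and are not part of any Designer API.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ComponentInfo:
    name: str
    max_nodes: int  # assumed: largest value the Nodes parameter accepts


def single_node_only(components: List[ComponentInfo]) -> List[str]:
    """Return the names of components that cannot run on more than one node."""
    return [c.name for c in components if c.max_nodes <= 1]


def validate_group(components: List[ComponentInfo], requested_nodes: int) -> None:
    """Raise ValueError if a multi-node run is requested but not supported."""
    blockers = single_node_only(components)
    if requested_nodes > 1 and blockers:
        raise ValueError(
            "Multi-node run would fail; single-node components: "
            + ", ".join(blockers)
        )


# Hypothetical group: component-b cannot set Nodes above 1.
example_group = [
    ComponentInfo("component-a", max_nodes=8),
    ComponentInfo("component-b", max_nodes=1),
]
print(single_node_only(example_group))
```

With this example group, `validate_group(example_group, 1)` passes, while requesting two or more nodes raises an error that names `component-b`.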

Configuration hierarchy

Group-level configurations take precedence over the corresponding settings on individual components. Within a group, settings are split across two levels:

Level | What to configure
Group level | Text fields, image fields, video fields, computing resources, data output paths
Component level | Tuning parameters specific to each component

Component-level tuning parameters are not overridden by the group. Configure them individually on each component.
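
One way to picture this precedence rule is as a merge in which the group supplies the shared settings and each component keeps its own tuning parameters. The sketch below is an illustration under assumed names, not Designer's configuration format: the keys in `GROUP_LEVEL_KEYS`, the `min_length` tuning parameter, and the output path are all hypothetical.

```python
from typing import Any, Dict

# Hypothetical key names for the settings that the group controls.
GROUP_LEVEL_KEYS = {
    "text_field",
    "image_field",
    "video_field",
    "resources",
    "output_path",
}


def effective_config(
    group_cfg: Dict[str, Any], component_cfg: Dict[str, Any]
) -> Dict[str, Any]:
    """Combine group and component settings for one component.

    Group values win for group-level keys. Every other key, such as a
    tuning parameter, is taken only from the component.
    """
    merged = dict(component_cfg)
    for key in GROUP_LEVEL_KEYS:
        if key in group_cfg:
            merged[key] = group_cfg[key]
    return merged


group_cfg = {"text_field": "content", "output_path": "oss://example-bucket/out/"}
component_cfg = {"text_field": "text", "min_length": 10}  # min_length: hypothetical tuning parameter

print(effective_config(group_cfg, component_cfg))
```

In this example the component's own `text_field` value is replaced by the group's, while `min_length` is left exactly as set on the component.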

Group output behavior

Grouped components do not produce individual outputs. The group emits a single collective output that combines the results of all enclosed components. In the example below, the connection originates from the center of the group, indicating that the output includes results from both LLM-Text Normalizer (DLC) and LLM-Clean Special Content (DLC)-1.

image

Create a group

Intelligent aggregation

Machine Learning Designer automatically detects canvas nodes that can be grouped. Click the aggregation icon to combine them into a group, then click the resource configuration icon to configure resources for the group.

image

Manual aggregation

Click the multi-select icon, or hold Shift and left-click, to select multiple components. Right-click an empty area of the canvas and choose Group Selected Nodes. Then click the resource configuration icon to configure resources for the group.

image

What's next