Accelerate image generation using DeepytorchLite nodes
Use DeepytorchLite nodes to accelerate image generation in ComfyUI workflows.
Overview of DeepytorchLite acceleration nodes
Deepytorch overview
DeepytorchLite nodes use Deepytorch, an acceleration library developed by Alibaba Cloud. They can increase inference speed by over 20% in most scenarios. The actual speed increase depends on your underlying compute resources, model type, and specific workflow configuration. The acceleration nodes have the following features:
Support for mainstream model structures: SD1.5, SD1.5-inpainting, SD2, SDXL, SDXL-turbo, SDXL-inpainting, SDXL-refiner, SVD, SDXL-lightning-nstep, and others.
Dynamic size support: Deepytorch is optimized for text-to-image scenarios with dynamic input sizes.
Support for ControlNet model acceleration.
High compatibility: DeepytorchLite is compatible with mainstream plug-ins.
Industry comparison
Acceleration framework | Kernel optimization capability | Dynamic size support | Compile-free capability | Engine file | Plug-in compatibility |
Deepytorch | 10% to 20%+ faster than Xformers | Supported | Supported | Very small | High compatibility |
TensorRT | Industry-leading optimization | Supported | Not supported. Models cannot be optimized on the fly, and compile times are long. | Large. A separate engine file is generated for each model. | Many plug-in restrictions |
Xformers | Excellent. Balances performance and compatibility. | Supported | Supported | None | High compatibility |
Overview of acceleration nodes
Node types:
(1) UNet acceleration and optimization: Optimizes the model using DeepytorchLite
Node type: DeepytorchLiteOptimize
Input/Output: model
(2) VAE acceleration and optimization: Optimizes the VAE using DeepytorchLite
Node type: DeepytorchLiteOptimizeVAE
Input/Output: vae
(3) ControlNet acceleration and optimization: Optimizes the ControlNet using DeepytorchLite
Node type: DeepytorchLiteOptimizeControlNet
Input/Output: controlnet
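To illustrate where these nodes sit in a graph, the following sketch inserts an optimizer node into a workflow written in ComfyUI's API format (a mapping from node ID to class_type and inputs, where a link is [source_node_id, output_index]). The node type names come from the list above. The input key names "model" and "vae", the helper insert_optimizer, and the checkpoint file name are assumptions for illustration, so verify them against the nodes in your installation.

```python
def insert_optimizer(workflow: dict, class_type: str, input_key: str, source: list) -> str:
    """Insert an optimizer node fed by `source` and repoint existing consumers to it.

    `workflow` is modified in place. Node IDs are assumed to be numeric strings.
    Returns the ID of the newly added node.
    """
    new_id = str(max(int(node_id) for node_id in workflow) + 1)
    source = list(source)
    # Repoint every input that currently reads from `source` to the new node's output 0.
    for node in workflow.values():
        for key, value in node["inputs"].items():
            if value == source:
                node["inputs"][key] = [new_id, 0]
    # Add the optimizer last so that its own input still reads from `source`.
    workflow[new_id] = {"class_type": class_type, "inputs": {input_key: source}}
    return new_id


# Abbreviated example graph: only the inputs relevant to the wiring are shown.
# CheckpointLoaderSimple emits MODEL at index 0, CLIP at index 1, VAE at index 2.
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "example.safetensors"}},  # hypothetical file name
    "2": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "steps": 20, "sampler_name": "euler"}},
    "3": {"class_type": "VAEDecode",
          "inputs": {"samples": ["2", 0], "vae": ["1", 2]}},
}

# Assumed input keys "model" and "vae", matching the Input/Output names listed above.
unet_id = insert_optimizer(workflow, "DeepytorchLiteOptimize", "model", ["1", 0])
vae_id = insert_optimizer(workflow, "DeepytorchLiteOptimizeVAE", "vae", ["1", 2])
```

After both calls, the sampler reads its model from the DeepytorchLiteOptimize node and the decoder reads its VAE from the DeepytorchLiteOptimizeVAE node, while each optimizer reads from the checkpoint loader. A ControlNet would be handled the same way with DeepytorchLiteOptimizeControlNet.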
Usage
After the environment is ready, a runnable workflow is available as shown in the following figure.

Double-click a blank area of the interface and search for Deepy. Three acceleration nodes appear in the search results. Insert the nodes into the workflow as needed.
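If the search returns fewer than three nodes, a quick check against the server's node registry can help. The sketch below assumes a ComfyUI server is reachable at http://127.0.0.1:8188 and that the three node types are registered under exactly the names listed earlier; both are assumptions, and the helper names are illustrative. It relies on ComfyUI's /object_info route, which returns a JSON object keyed by node class name.

```python
import json
from urllib.request import urlopen

# Node type names as listed in this document; assumed to match the registered class names.
EXPECTED_NODES = (
    "DeepytorchLiteOptimize",
    "DeepytorchLiteOptimizeVAE",
    "DeepytorchLiteOptimizeControlNet",
)


def missing_nodes(object_info: dict) -> list:
    """Return the expected DeepytorchLite node types that are absent from object_info."""
    return [name for name in EXPECTED_NODES if name not in object_info]


def fetch_object_info(base_url: str = "http://127.0.0.1:8188") -> dict:
    """Fetch the node registry from a running ComfyUI server (address is an assumption)."""
    with urlopen(f"{base_url}/object_info") as response:
        return json.load(response)
```

Calling missing_nodes(fetch_object_info()) against a running server returns an empty list when all three nodes are registered.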
The following shows the results of adding DeepytorchLite acceleration nodes.

Acceleration results
Example workflows:
deepytorch-demo-sdxl-controlnet.json
(1) SD1.5
Image generation configuration: SD1.5, 512 × 512, euler, 20 steps
Sampling time before acceleration: 2.17 s

Sampling time after acceleration: 1 s

(2) SDXL
Image generation configuration: SDXL, 1024 × 1024, euler, 20 steps
Sampling time before acceleration: 6.7 s

Sampling time after acceleration: 5.98 s

(3) Complex workflow
Image generation configuration: SDXL, 512 × 768, lora, controlnet, 30 steps
Sampling time before acceleration: 20.59 s

Sampling time after acceleration: 18.53 s
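To compare the three cases on a common scale, the sampling times above can be converted into a time reduction and a speedup factor. The following sketch uses only the figures reported in this section; the function and variable names are illustrative.

```python
# (before, after) sampling times in seconds, as reported above.
TIMINGS = {
    "SD1.5": (2.17, 1.0),
    "SDXL": (6.7, 5.98),
    "Complex workflow": (20.59, 18.53),
}


def time_reduction_percent(before: float, after: float) -> float:
    """Percentage of sampling time saved relative to the unaccelerated run."""
    return 100.0 * (before - after) / before


def speedup_factor(before: float, after: float) -> float:
    """How many times faster the accelerated run is (before / after)."""
    return before / after


for name, (before, after) in TIMINGS.items():
    reduction = time_reduction_percent(before, after)
    factor = speedup_factor(before, after)
    print(f"{name}: {reduction:.1f}% less sampling time, {factor:.2f}x speedup")
```

On these figures, the SD1.5 run spends about 54% less time sampling, while the SDXL and complex-workflow runs spend about 11% and 10% less, respectively.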
