Accelerate image generation using DeepytorchLite nodes

Use DeepytorchLite nodes to accelerate image generation in ComfyUI workflows.

Overview of DeepytorchLite acceleration nodes

Deepytorch overview

DeepytorchLite nodes use Deepytorch, an acceleration library developed by Alibaba Cloud. They can increase inference speed by over 20% in most scenarios. The actual speed increase depends on your underlying compute resources, model type, and specific workflow configuration. The acceleration nodes have the following features:

  • Support for mainstream model structures: DeepytorchLite nodes support a wide range of mainstream model structures, such as SD1.5, SD1.5-inpainting, SD2, SDXL, SDXL-turbo, SDXL-inpainting, SDXL-refiner, SVD, and SDXL-lightning-nstep.

  • Dynamic size support: Deepytorch is optimized for text-to-image scenarios with dynamic input sizes.

  • Support for ControlNet model acceleration.

  • High compatibility: DeepytorchLite is compatible with mainstream plug-ins.

Industry comparison

| Acceleration framework | Kernel optimization capability | Dynamic size support | Compile-free capability | Engine file | Plug-in compatibility |
| --- | --- | --- | --- | --- | --- |
| Deepytorch | 10% to 20%+ faster than Xformers | Supported | Supported | Very small | High compatibility |
| TensorRT | Industry-leading optimization | Supported | Not supported. It cannot be optimized in time and requires long compile times. | Large engine files. A separate engine file is generated for each model. | Many plug-in restrictions |
| Xformers | Excellent level. A balance between performance and compatibility. | Supported | Supported | None | High compatibility |

Overview of acceleration nodes

Node types:

(1) UNet acceleration and optimization: Optimizes the model using DeepytorchLite

Node type: DeepytorchLiteOptimize

Input/Output: model

(2) VAE acceleration and optimization: Optimizes the VAE using DeepytorchLite

Node type: DeepytorchLiteOptimizeVAE

Input/Output: vae

(3) ControlNet acceleration and optimization: Optimizes the ControlNet using DeepytorchLite

Node type: DeepytorchLiteOptimizeControlNet

Input/Output: controlnet
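
Because each acceleration node takes one object and returns an object of the same kind, it can be spliced between a loader and whatever consumes the loader's output. The following is a minimal sketch of that rewiring on a ComfyUI API-format workflow (a dictionary of node IDs to `class_type` and `inputs`, where a link is written as `[source_node_id, output_index]`). The helper name `insert_accel_node`, the checkpoint file name, the shortened KSampler inputs, and the input name `model` on the acceleration node are assumptions for illustration; check the input names against the nodes as installed.

```python
import copy


def insert_accel_node(workflow, source_id, output_index, class_type, input_name, new_id):
    """Return a copy of `workflow` with an acceleration node spliced in.

    Every input that was linked to output `output_index` of node `source_id`
    is re-pointed at output 0 of the new node, and the new node takes the
    original link as its only input. `input_name` is an assumed input name.
    """
    if new_id in workflow:
        raise ValueError(f"node id {new_id!r} is already used")

    patched = copy.deepcopy(workflow)
    old_link = [source_id, output_index]

    # Re-point every consumer of the original output at the new node.
    for node in patched.values():
        inputs = node.get("inputs", {})
        for name, value in list(inputs.items()):
            if value == old_link:
                inputs[name] = [new_id, 0]

    # Add the acceleration node itself, fed by the original output.
    patched[new_id] = {"class_type": class_type, "inputs": {input_name: old_link}}
    return patched


# Hypothetical, heavily shortened workflow: a checkpoint loader feeding a sampler.
workflow = {
    "1": {"class_type": "CheckpointLoaderSimple",
          "inputs": {"ckpt_name": "example-sd15.safetensors"}},
    "2": {"class_type": "KSampler",
          "inputs": {"model": ["1", 0], "steps": 20, "sampler_name": "euler"}},
}

patched = insert_accel_node(
    workflow, source_id="1", output_index=0,
    class_type="DeepytorchLiteOptimize", input_name="model", new_id="10",
)
print(patched["2"]["inputs"]["model"])
print(patched["10"])
```

The same helper would apply to `DeepytorchLiteOptimizeVAE` and `DeepytorchLiteOptimizeControlNet` by pointing it at the VAE or ControlNet loader output instead. The original workflow is left unmodified so the accelerated and unaccelerated variants can be compared.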

Usage

  1. After the environment is ready, a runnable workflow is available as shown in the following figure.

image

  2. Double-click a blank area of the interface and search for Deepy. Three acceleration nodes appear in the search results. Insert the nodes into the workflow as needed.

The following shows the results of adding DeepytorchLite acceleration nodes.

image

Acceleration results

Example workflows:

deepytorch-demo.json

deepytorch-demo-sdxl-controlnet.json
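
To confirm which acceleration nodes a downloaded workflow actually contains, the JSON can be inspected directly. This is a rough sketch that assumes the file is in one of the two common ComfyUI layouts: a UI export with a top-level `nodes` list whose entries carry a `type` field, or an API-format dictionary whose entries carry `class_type`. The function names are illustrative, and the layout of the example files above has not been verified.

```python
import json


def accel_node_types(workflow):
    """Return the sorted DeepytorchLite node type names found in a workflow dict."""
    if isinstance(workflow.get("nodes"), list):
        # UI export: {"nodes": [{"id": 1, "type": "KSampler", ...}, ...]}
        types = [node.get("type", "") for node in workflow["nodes"]]
    else:
        # API format: {"3": {"class_type": "KSampler", "inputs": {...}}, ...}
        types = [node.get("class_type", "")
                 for node in workflow.values() if isinstance(node, dict)]
    return sorted(t for t in types if t.startswith("DeepytorchLite"))


def load_workflow(path):
    """Read a workflow JSON file from disk."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# In-memory stand-ins for the two layouts (hypothetical content).
ui_style = {"nodes": [{"id": 1, "type": "KSampler"},
                      {"id": 2, "type": "DeepytorchLiteOptimize"}]}
api_style = {"1": {"class_type": "DeepytorchLiteOptimizeVAE", "inputs": {}},
             "2": {"class_type": "VAEDecode", "inputs": {}}}

print(accel_node_types(ui_style))
print(accel_node_types(api_style))
```

For a file on disk, `accel_node_types(load_workflow("deepytorch-demo.json"))` would report the acceleration nodes it uses, provided the file follows one of the two layouts.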

(1) SD1.5

Image generation configuration: SD1.5, 512 × 512, euler, 20 steps

Sampling time before acceleration: 2.17 s

image

Sampling time after acceleration: 1 s

image

(2) SDXL

Image generation configuration: SDXL, 1024 × 1024, euler, 20 steps

Sampling time before acceleration: 6.7 s

image

Sampling time after acceleration: 5.98 s

image

(3) Complex workflow

Image generation configuration: SDXL, 512 × 768, LoRA, ControlNet, 30 steps

Sampling time before acceleration: 20.59 s

image

Sampling time after acceleration: 18.53 s

image
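
The sampling times above can be turned into comparable figures with a little arithmetic. The sketch below computes, for each configuration, the speedup factor (time before divided by time after) and the share of sampling time saved. The timings are the ones reported in this section; the function and variable names are illustrative.

```python
# (before, after) sampling times in seconds, as reported above.
timings = {
    "SD1.5, 512 x 512, euler, 20 steps": (2.17, 1.00),
    "SDXL, 1024 x 1024, euler, 20 steps": (6.70, 5.98),
    "SDXL, 512 x 768, LoRA, ControlNet, 30 steps": (20.59, 18.53),
}


def speedup(before, after):
    """Speedup factor: how many times faster the accelerated run is."""
    return before / after


def time_saved_pct(before, after):
    """Percentage of the original sampling time that was saved."""
    return (before - after) / before * 100.0


for name, (before, after) in timings.items():
    print(f"{name}: {speedup(before, after):.2f}x faster, "
          f"{time_saved_pct(before, after):.1f}% less sampling time")
```

On these measurements the SD1.5 run is about 2.17x faster (roughly 54% less sampling time), while the two SDXL runs are about 1.12x and 1.11x faster (roughly 11% and 10% less sampling time), which illustrates how strongly the gain depends on the model and workflow.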