
Stable Diffusion 1.5, Stable Diffusion XL, and Flux.1 were the biggest text-to-image model releases to date, and for good reason. They had two things in common: versatility and a small size. The versatility ensured that users across all use cases, from hyper-realism to anime to painted styles, could make the most of these models. The small size ensured that they could be run on consumer-grade GPUs at reasonable speeds, which allowed them to proliferate far more rapidly than their competitors. For these reasons, each of these releases set off a chain reaction in the community that is still going off today.
Z-Image-Turbo is the latest model to hit these marks perfectly. This versatile model from Alibaba’s Tongyi-MAI team is truly the next generation of open-source text-to-image models, seemingly combining the remarkable prompt adherence of Black Forest Labs’ Flux.1 series with the sheer versatility of Stable Diffusion XL.
We are ecstatic to watch this release take off across the open-source community, and we want to show you how to run this model on DigitalOcean. In this quick tutorial, we will walk you through, step by step, how to run Z-Image-Turbo on a DigitalOcean Gradient GPU Droplet using ComfyUI. By leveraging DigitalOcean’s NVIDIA H200 GPUs, we can generate single 2048x2048 images in just 6 seconds!
Follow along for details!
To get started with Z-Image-Turbo, we need sufficient GPU compute. Any GPU on the DigitalOcean platform can run Z-Image-Turbo, a testament to how efficiently the model was built, but that doesn’t mean we should limit ourselves. Faster hardware means faster experimentation and more varied outputs. For those reasons, we recommend at least a single NVIDIA H200 for this model. Follow this tutorial for step-by-step instructions on setting up your environment to run AI/ML workloads on a GPU Droplet. Once your GPU Droplet has spun up and you have accessed it over SSH from your local machine, move on to the next section.
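If you prefer the command line to the cloud console, the doctl CLI can create the GPU Droplet for you. The snippet below is a minimal sketch: the region, size, and image slugs shown are placeholders, not confirmed values for H200 Droplets, so list the slugs available to your account first and substitute your own.

# List the size and image slugs available to your account
doctl compute size list
doctl compute image list --public

# Create the Droplet (the slugs below are placeholders -- replace them with values from the lists above)
doctl compute droplet create z-image-turbo \
  --region nyc2 \
  --size gpu-h200x1-141gb \
  --image gpu-h100x1-base \
  --ssh-keys <your-ssh-key-fingerprint>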
To actually start generating images, we need to set up ComfyUI. ComfyUI is the most popular open-source image generation tool, and the success of a text-to-image model often hinges on its adoption by the ComfyUI community.
To make getting started easy, here is a short script that will clone ComfyUI, install the required packages, download the model files, and then run the UI.
# Clone ComfyUI and enter the repository
git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI

# Install Python tooling and create an isolated virtual environment
apt update && apt install -y python3-venv python3-pip
python3 -m venv venv
source venv/bin/activate

# Install ComfyUI's Python dependencies
pip install -r requirements.txt

# Download the Z-Image-Turbo text encoder, VAE, and diffusion model weights
cd models/text_encoders
wget https://huggingface.co/Comfy-Org/z_image_turbo/resolve/main/split_files/text_encoders/qwen_3_4b.safetensors
cd ../vae
wget https://huggingface.co/Comfy-Org/z_image_turbo/resolve/main/split_files/vae/ae.safetensors
cd ../diffusion_models
wget https://huggingface.co/Comfy-Org/z_image_turbo/resolve/main/split_files/diffusion_models/z_image_turbo_bf16.safetensors
cd ../../

# Launch the ComfyUI server
python main.py
From here, take the output URL from the terminal and plug it into the Simple Browser of your SSH-attached Cursor or VS Code window. This can be done by pressing Command+Shift+P to open the command palette, selecting “Simple Browser: Show”, and entering the URL. We can then click the arrow button in the top-right corner of the window to open ComfyUI in our local browser.
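If you would rather skip the editor’s Simple Browser, you can forward ComfyUI’s port over SSH and open it directly in your local browser. This sketch assumes ComfyUI is listening on its default port, 8188, and that you connect as root; substitute your own user and Droplet IP.

# On your local machine: forward local port 8188 to ComfyUI on the Droplet
ssh -L 8188:127.0.0.1:8188 root@your-droplet-ip

# Then open http://127.0.0.1:8188 in your local browser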
Use the template embedded in the image below to open the correct workflow for generating images with Z-Image-Turbo.

The workflow JSON can also be found here.
Once that is done, click the run button to generate a copy of the above image! We can then edit the prompt, image width and height, and seed to modify the output of the pipeline.
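You can also queue generations without touching the browser by posting a workflow to ComfyUI’s HTTP API. The sketch below assumes you have exported the Z-Image-Turbo workflow from the UI in API format (saved here as workflow_api.json, a placeholder filename) and that the server is reachable on its default port, 8188; the exact node IDs and input names depend on your exported file.

# Wrap the exported API-format workflow in the {"prompt": ...} envelope
# that ComfyUI's /prompt endpoint expects, then queue it.
# workflow_api.json is a placeholder name for your own exported workflow.
jq -n --slurpfile wf workflow_api.json '{prompt: $wf[0]}' \
  | curl -s -X POST http://127.0.0.1:8188/prompt \
      -H "Content-Type: application/json" \
      -d @-

# The response includes a prompt_id; once generation finishes, you can
# retrieve the result metadata from /history/<prompt_id>.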
Above, we can see a collection of examples we generated with ComfyUI and Z-Image-Turbo. As we can see, the model is capable of strong prompt adherence, text generation, varied styles, and recognition of well-known characters. It is, in our view, the most powerful and versatile open-source image generation model we have seen.
Z-Image-Turbo is a legitimate achievement. It is the greatest step forward for open-source image generation since Flux.1 was released, and is arguably an even greater model. We cannot wait for the release of Z-Image-Base for fine-tuning and Z-Image-Edit for image editing. Alibaba has done an amazing job, and we encourage you to try Z-Image-Turbo today on DigitalOcean Gradient GPU Droplets!