
Stable Diffusion 1.5, Stable Diffusion XL, and Flux.1 were the biggest text-to-image model releases to date, and for good reason. They had two things in common: versatility and a small size. The versatility ensured that users across all use cases, from hyper-realism to anime to painted styles, could make the most of these models. The small size ensured that they could be run on consumer-grade GPUs at reasonable speeds, which allowed them to proliferate far more rapidly than their competitors. For these reasons, each of these releases set off a chain reaction in the community that is still going off today.
Z-Image-Turbo is the latest model to hit these marks perfectly. This versatile model from Alibaba’s Tongyi-MAI team is truly the next generation of open-source text-to-image models, seemingly combining the remarkable prompt adherence of Black Forest Labs’ Flux.1 series with the sheer versatility of Stable Diffusion XL.
We are ecstatic to watch this release take off across the open-source community, and we want to show you how to run this model on DigitalOcean. In this quick tutorial, we will walk you through, step by step, how to run Z-Image-Turbo on a DigitalOcean Gradient GPU Droplet using ComfyUI. By leveraging DigitalOcean’s NVIDIA H200 GPUs, we can generate single 2048x2048 images in just 6 seconds!
Follow along for details!
To get started with Z-Image-Turbo, we need sufficient GPU compute. Any GPU on the DigitalOcean platform can run Z-Image-Turbo, a testament to how efficiently the model was built, but that doesn’t mean we should limit ourselves. Faster hardware means faster experimentation and more varied outputs. For those reasons, we recommend at least a single NVIDIA H200 for this model. Follow this tutorial for step-by-step instructions on setting up your environment to run AI/ML workloads on a GPU Droplet. Once your GPU Droplet has spun up and you have accessed it over SSH from your local machine, move on to the next section.
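If you prefer the command line to the cloud console, the doctl CLI can create the GPU Droplet for you. The snippet below is a minimal sketch: the region, size, and image slugs shown are placeholders, not confirmed values for H200 Droplets, so list the slugs available to your account first and substitute your own.

# List the size and image slugs available to your account
doctl compute size list
doctl compute image list --public

# Create the Droplet (the slugs below are placeholders -- replace them with values from the lists above)
doctl compute droplet create z-image-turbo \
  --region nyc2 \
  --size gpu-h200x1-141gb \
  --image gpu-h100x1-base \
  --ssh-keys <your-ssh-key-fingerprint>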
To actually start generating images, we need to set up ComfyUI. ComfyUI is the most popular open-source image generation tool, and the success of a text-to-image model often hinges on its adoption by the ComfyUI community.
To make getting started easy, here is a short script that will clone ComfyUI, install the required packages, download the model files, and then run the UI.
# Clone ComfyUI and enter the repository
git clone https://github.com/comfyanonymous/ComfyUI
cd ComfyUI

# Install Python tooling and create an isolated virtual environment
apt update && apt install -y python3-venv python3-pip
python3 -m venv venv
source venv/bin/activate

# Install ComfyUI's Python dependencies
pip install -r requirements.txt

# Download the Z-Image-Turbo text encoder, VAE, and diffusion model weights
cd models/text_encoders
wget https://huggingface.co/Comfy-Org/z_image_turbo/resolve/main/split_files/text_encoders/qwen_3_4b.safetensors
cd ../vae
wget https://huggingface.co/Comfy-Org/z_image_turbo/resolve/main/split_files/vae/ae.safetensors
cd ../diffusion_models
wget https://huggingface.co/Comfy-Org/z_image_turbo/resolve/main/split_files/diffusion_models/z_image_turbo_bf16.safetensors
cd ../../

# Launch the ComfyUI server
python main.py
From here, take the output URL from the terminal and plug it into the Simple Browser of your SSH-attached Cursor or VS Code window. This can be done by pressing Command+Shift+P to open the command palette, selecting “Simple Browser: Show”, and entering the URL. We can then click the arrow button in the top-right corner of the window to open ComfyUI in our local browser.
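If you would rather skip the editor’s Simple Browser, you can forward ComfyUI’s port over SSH and open it directly in your local browser. This sketch assumes ComfyUI is listening on its default port, 8188, and that you connect as root; substitute your own user and Droplet IP.

# On your local machine: forward local port 8188 to ComfyUI on the Droplet
ssh -L 8188:127.0.0.1:8188 root@your-droplet-ip

# Then open http://127.0.0.1:8188 in your local browser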
Use the template embedded in the image below to open the correct workflow for generating images with Z-Image-Turbo.

The workflow JSON can also be found here.
Once that is done, click the run button to generate a copy of the above image! We can then edit the prompt, image width and height, and seed to modify the output of the pipeline.
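You can also queue generations without touching the browser by posting a workflow to ComfyUI’s HTTP API. The sketch below assumes you have exported the Z-Image-Turbo workflow from the UI in API format (saved here as workflow_api.json, a placeholder filename) and that the server is reachable on its default port, 8188; the exact node IDs and input names depend on your exported file.

# Wrap the exported API-format workflow in the {"prompt": ...} envelope
# that ComfyUI's /prompt endpoint expects, then queue it.
# workflow_api.json is a placeholder name for your own exported workflow.
jq -n --slurpfile wf workflow_api.json '{prompt: $wf[0]}' \
  | curl -s -X POST http://127.0.0.1:8188/prompt \
      -H "Content-Type: application/json" \
      -d @-

# The response includes a prompt_id; once generation finishes, you can
# retrieve the result metadata from /history/<prompt_id>.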
Above, we can see a collection of examples we generated with ComfyUI and Z-Image-Turbo. As we can see, the model is capable of strong prompt adherence, text generation, varied styles, and recognition of well-known characters. It is, in our view, the most powerful and versatile open-source image generation model we have seen.
Z-Image-Turbo is a legitimate achievement. It is the greatest step forward for open-source image generation since Flux.1 was released, and is arguably an even greater model. We cannot wait for the release of Z-Image-Base for fine-tuning and Z-Image-Edit for image editing. Alibaba has done an amazing job, and we encourage you to try Z-Image-Turbo today on DigitalOcean Gradient GPU Droplets!