Calling Text-to-Image Models

Preface

diffusers is an open-source Python library from Hugging Face for loading, running, and customizing diffusion models, in particular text-to-image models such as Stable Diffusion.

Its goal is to make state-of-the-art generative AI models simple, efficient, and composable.

Definition in one sentence

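diffusers = an "operating system" for diffusion models
It wraps the whole workflow, from model loading and sampling schedulers to image/video/audio generation and deployment optimization, so developers can use complex generative models much as they would call an API.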

It supports text-to-image, image-to-image, inpainting, ControlNet, LoRA fine-tuning, and video generation.

A simple example

Install the latest version of diffusers:

pip install git+https://github.com/huggingface/diffusers
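To confirm that the install worked, the package version can be read back from Python. This is a minimal sketch using only the standard library; the helper name `installed_version` is hypothetical, and the list of packages checked is an assumption based on the imports used in the example that follows.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package: str):
    """Return the installed version string of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Packages imported by the example in this post (assumed list).
for name in ("diffusers", "torch", "modelscope"):
    print(name, installed_version(name) or "not installed")
```

A source install from GitHub usually reports a development version string, so a missing package shows up as "not installed" rather than as an import error later on.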

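The following snippet shows how to use Qwen-Image-2512. Note that it imports DiffusionPipeline from the modelscope package, which has to be installed in addition to diffusers.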

from modelscope import DiffusionPipeline
import torch

model_name = "Qwen/Qwen-Image-2512"

# Load the pipeline
if torch.cuda.is_available():
    torch_dtype = torch.bfloat16
    device = "cuda"
else:
    torch_dtype = torch.float32
    device = "cpu"

pipe = DiffusionPipeline.from_pretrained(model_name, torch_dtype=torch_dtype).to(device)

# Generate image
prompt = '''A 20-year-old East Asian girl with delicate, charming features and large, bright brown eyes—expressive and lively, with a cheerful or subtly smiling expression. Her naturally wavy long hair is either loose or tied in twin ponytails. She has fair skin and light makeup accentuating her youthful freshness. She wears a modern, cute dress or relaxed outfit in bright, soft colors—lightweight fabric, minimalist cut. She stands indoors at an anime convention, surrounded by banners, posters, or stalls. Lighting is typical indoor illumination—no staged lighting—and the image resembles a casual iPhone snapshot: unpretentious composition, yet brimming with vivid, fresh, youthful charm.'''

negative_prompt = "低分辨率,低画质,肢体畸形,手指畸形,画面过饱和,蜡像感,人脸无细节,过度光滑,画面具有AI感。构图混乱。文字模糊,扭曲。"


# Generate with different aspect ratios
aspect_ratios = {
    "1:1": (1328, 1328),
    "16:9": (1664, 928),
    "9:16": (928, 1664),
    "4:3": (1472, 1104),
    "3:4": (1104, 1472),
    "3:2": (1584, 1056),
    "2:3": (1056, 1584),
}

width, height = aspect_ratios["16:9"]

image = pipe(
    prompt=prompt,
    negative_prompt=negative_prompt,
    width=width,
    height=height,
    num_inference_steps=50,
    true_cfg_scale=4.0,
    # Seed the generator on the selected device so the CPU fallback also works
    generator=torch.Generator(device=device).manual_seed(42),
).images[0]

image.save("example.png")
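The example picks a resolution by its ratio label. When you start from an arbitrary target size instead, such as a 1920x1080 wallpaper, one option is to choose the preset whose aspect ratio is closest. The following is a minimal sketch under that assumption: `closest_preset` is a hypothetical helper, not part of diffusers or modelscope, and the preset table is copied from the snippet above.

```python
import math

# Resolution presets copied from the example above: ratio label -> (width, height).
ASPECT_RATIOS = {
    "1:1": (1328, 1328),
    "16:9": (1664, 928),
    "9:16": (928, 1664),
    "4:3": (1472, 1104),
    "3:4": (1104, 1472),
    "3:2": (1584, 1056),
    "2:3": (1056, 1584),
}


def closest_preset(width: int, height: int) -> tuple[str, tuple[int, int]]:
    """Return (label, (width, height)) of the preset closest in aspect ratio.

    Ratios are compared in log space so that landscape and portrait
    deviations are treated symmetrically.
    """
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    target = math.log(width / height)
    label = min(
        ASPECT_RATIOS,
        key=lambda k: abs(math.log(ASPECT_RATIOS[k][0] / ASPECT_RATIOS[k][1]) - target),
    )
    return label, ASPECT_RATIOS[label]


label, (width, height) = closest_preset(1920, 1080)
print(label, width, height)
```

The returned `width` and `height` can be passed to `pipe(...)` exactly as in the example, and the generated image can then be resized to the original target size if needed.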