stable-diffusion.cpp

Inference of Stable Diffusion in pure C/C++

Features

Plain C/C++ implementation based on ggml, working in the same way as llama.cpp
16-bit, 32-bit float support
4-bit, 5-bit and 8-bit integer quantization support
Accelerated memory-efficient CPU inference
- Only requires ~2.3GB when using txt2img with fp16 precision to generate a 512x512 image
AVX, AVX2 and AVX512 support for x86 architectures
Original txt2img and img2img modes
Negative prompt
Sampling method
- Euler A
Supported platforms
- Linux
- Mac OS
- Windows

TODO

More sampling methods
GPU support
Make inference faster
- The current implementation of ggml_conv_2d is slow and has high memory usage
Continue reducing memory usage (by quantizing the weights of ggml_conv_2d)
stable-diffusion-webui style tokenizer (e.g., token weighting)
LoRA support
k-quants support
Cross-platform reproducibility (perhaps ensuring consistency with the original SD)

Usage

Get the Code

git clone --recursive https://github.com/leejet/stable-diffusion.cpp
cd stable-diffusion.cpp

If you have already cloned the repository, run the following commands to update it to the latest code.

cd stable-diffusion.cpp
git pull origin master
git submodule update

Convert weights

Download the original weights (.ckpt or .safetensors). For example:

curl -L -O https://huggingface.co/CompVis/stable-diffusion-v-1-4-original/resolve/main/sd-v1-4.ckpt
# curl -L -O https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors

Convert the weights to the ggml model format:

cd models
pip install -r requirements.txt
python convert.py [path to weights] --out_type [output precision]
# For example, python convert.py sd-v1-4.ckpt --out_type f16
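
To produce several precisions from one checkpoint, the convert step can be scripted. The sketch below only assembles the command lines shown above; the helper names are hypothetical, and the output file name pattern is an assumption inferred from the example path sd-v1-4-ggml-model-f16.bin, not taken from convert.py itself.

```python
from pathlib import Path

# Output types listed in the Quantization section of this README.
OUT_TYPES = ("f32", "f16", "q8_0", "q5_0", "q5_1", "q4_0", "q4_1")


def convert_command(weights, out_type):
    """Return the argv list for one convert.py run (run it from models/)."""
    if out_type not in OUT_TYPES:
        raise ValueError(f"unsupported --out_type: {out_type!r}")
    return ["python", "convert.py", str(weights), "--out_type", out_type]


def guessed_output_name(weights, out_type):
    """Guess the converted file name.

    Assumption: the '<stem>-ggml-model-<type>.bin' pattern is inferred from
    the example path used later in this README, not read from convert.py.
    """
    return f"{Path(weights).stem}-ggml-model-{out_type}.bin"


if __name__ == "__main__":
    # Print the commands instead of running them, so nothing is executed here.
    for out_type in ("f16", "q8_0"):
        print(" ".join(convert_command("sd-v1-4.ckpt", out_type)))
```

Each returned list can be passed to subprocess.run(..., cwd="models", check=True) once the requirements are installed.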

Quantization

You can specify the output model format with the --out_type parameter:

f16 for 16-bit floating-point
f32 for 32-bit floating-point
q8_0 for 8-bit integer quantization
q5_0 or q5_1 for 5-bit integer quantization
q4_0 or q4_1 for 4-bit integer quantization
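
To give a feel for what integer quantization does to the weights, here is a minimal sketch of a q8_0-style scheme. It assumes weights are grouped into blocks of 32 values that share one scale factor, which is how ggml's q8_0 format is generally described; it is an illustration in plain Python, not the code that convert.py or ggml actually runs, and the function names are hypothetical.

```python
BLOCK_SIZE = 32  # assumed number of weights sharing one scale


def quantize_q8_0_block(values):
    """Map one block of floats to (scale, list of ints in [-127, 127])."""
    if len(values) != BLOCK_SIZE:
        raise ValueError(f"expected {BLOCK_SIZE} values, got {len(values)}")
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        return 0.0, [0] * BLOCK_SIZE
    scale = amax / 127.0
    # Clamp to guard against floating-point overshoot at the block maximum.
    quants = [max(-127, min(127, int(round(v / scale)))) for v in values]
    return scale, quants


def dequantize_q8_0_block(scale, quants):
    """Recover approximate floats from a quantized block."""
    return [scale * q for q in quants]


if __name__ == "__main__":
    block = [float(i - 16) for i in range(BLOCK_SIZE)]
    scale, quants = quantize_q8_0_block(block)
    restored = dequantize_q8_0_block(scale, quants)
    worst = max(abs(a - b) for a, b in zip(block, restored))
    print(f"scale={scale:.6f} worst-case error={worst:.6f}")
```

The round-trip error per weight is bounded by half the scale, which is why lower bit widths (fewer integer levels per block) trade image quality for size.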

Build

mkdir build
cd build
cmake ..
cmake --build . --config Release

Using OpenBLAS

cmake .. -DGGML_OPENBLAS=ON
cmake --build . --config Release

usage: ./bin/sd [arguments]

arguments:
  -h, --help                         show this help message and exit
  -M, --mode [txt2img or img2img]    generation mode (default: txt2img)
  -t, --threads N                    number of threads to use during computation (default: -1).
                                     If threads <= 0, then threads will be set to the number of CPU physical cores
  -m, --model [MODEL]                path to model
  -i, --init-img [IMAGE]             path to the input image, required by img2img
  -o, --output OUTPUT                path to write result image to (default: .\output.png)
  -p, --prompt [PROMPT]              the prompt to render
  -n, --negative-prompt PROMPT       the negative prompt (default: "")
  --cfg-scale SCALE                  unconditional guidance scale: (default: 7.0)
  --strength STRENGTH                strength for noising/unnoising (default: 0.75)
                                     1.0 corresponds to full destruction of information in init image
  -H, --height H                     image height, in pixel space (default: 512)
  -W, --width W                      image width, in pixel space (default: 512)
  --sample-method SAMPLE_METHOD      sample method (default: "eular a")
  --steps  STEPS                     number of sample steps (default: 20)
  -s SEED, --seed SEED               RNG seed (default: 42, use random seed for < 0)
  -v, --verbose                      print extra info
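
When driving the binary from a script, it can help to assemble the argument list in one place. The following is a small sketch that uses only the flags from the help text above; the function names and the ./bin/sd default path are assumptions for illustration, and options left as None are omitted so the binary's own defaults apply.

```python
import subprocess


def build_sd_command(model, prompt, *, binary="./bin/sd", mode="txt2img",
                     init_img=None, output=None, negative_prompt=None,
                     cfg_scale=None, strength=None, height=None, width=None,
                     steps=None, seed=None, threads=None, verbose=False):
    """Build an argv list for the sd binary from the documented flags."""
    if mode not in ("txt2img", "img2img"):
        raise ValueError(f"unknown mode: {mode!r}")
    if mode == "img2img" and init_img is None:
        raise ValueError("img2img requires init_img (-i)")
    cmd = [binary, "--mode", mode, "-m", str(model), "-p", prompt]
    optional = [
        ("-i", init_img), ("-o", output), ("-n", negative_prompt),
        ("--cfg-scale", cfg_scale), ("--strength", strength),
        ("-H", height), ("-W", width), ("--steps", steps),
        ("-s", seed), ("-t", threads),
    ]
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]
    if verbose:
        cmd.append("-v")
    return cmd


def run_sd(*args, **kwargs):
    """Run the binary and raise CalledProcessError on a non-zero exit."""
    return subprocess.run(build_sd_command(*args, **kwargs), check=True)


if __name__ == "__main__":
    print(build_sd_command("../models/sd-v1-4-ggml-model-f16.bin",
                           "a lovely cat", steps=20, seed=42))
```

Passing the arguments as a list avoids shell quoting problems with prompts that contain spaces or quotes.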

txt2img example

./bin/sd -m ../models/sd-v1-4-ggml-model-f16.bin -p "a lovely cat"

Using formats of different precisions will yield results of varying quality.

[Figure: sample outputs for f32, f16, q8_0, q5_0, q5_1, q4_0 and q4_1]

img2img example

./output.png is the image generated by the txt2img example above.

./bin/sd --mode img2img -m ../models/sd-v1-4-ggml-model-f16.bin -p "cat with blue eyes" -i ./output.png -o ./img2img_output.png --strength 0.4
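
The --strength value controls how much of the init image survives. One common reading, taken from the original Stable Diffusion img2img script and assumed here rather than confirmed against this repository's code, is that the init image is noised part of the way and only a matching fraction of the sampling steps is run. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def img2img_schedule(steps, strength):
    """Split the sampling steps into denoised and skipped parts.

    Assumption: denoise_steps = int(steps * strength), as in the original
    Stable Diffusion img2img script.
    """
    if steps < 1:
        raise ValueError("steps must be at least 1")
    if not 0.0 <= strength <= 1.0:
        raise ValueError("strength must be in [0.0, 1.0]")
    denoise_steps = int(steps * strength)
    return {"denoise_steps": denoise_steps,
            "skipped_steps": steps - denoise_steps}


if __name__ == "__main__":
    # The example above: default 20 steps with --strength 0.4.
    print(img2img_schedule(20, 0.4))
```

Under this assumption, a strength of 1.0 runs every step from pure noise, which matches the help text's "full destruction of information in init image".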

Memory/Disk Requirements

precision                      f32     f16     q8_0    q5_0    q5_1    q4_0    q4_1
Disk                           2.7G    2.0G    1.7G    1.6G    1.6G    1.5G    1.5G
Memory (txt2img, 512 x 512)    ~2.8G   ~2.3G   ~2.1G   ~2.0G   ~2.0G   ~2.0G   ~2.0G
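
The table can be used to choose a model format for a given machine. The sketch below encodes the figures above (approximate, txt2img at 512 x 512) and returns the first format, in the table's left-to-right order, whose memory requirement fits a budget; the function name and the idea of a fixed budget are illustrative assumptions.

```python
# (disk GB, approximate txt2img memory GB at 512 x 512), copied from the table.
REQUIREMENTS = {
    "f32": (2.7, 2.8),
    "f16": (2.0, 2.3),
    "q8_0": (1.7, 2.1),
    "q5_0": (1.6, 2.0),
    "q5_1": (1.6, 2.0),
    "q4_0": (1.5, 2.0),
    "q4_1": (1.5, 2.0),
}


def pick_precision(memory_budget_gb):
    """Return the first listed precision that fits, or None if none does."""
    for precision, (_disk_gb, memory_gb) in REQUIREMENTS.items():
        if memory_gb <= memory_budget_gb:
            return precision
    return None


if __name__ == "__main__":
    for budget in (3.0, 2.5, 2.05, 1.0):
        print(budget, "->", pick_precision(budget))
```

Since memory use flattens out at about 2.0G from q5_0 downward, the 4-bit formats mainly save disk space rather than RAM.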
