ProFusion



Code for Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach.

ProFusion is a framework for customizing pre-trained large-scale text-to-image generation models; our examples use Stable Diffusion 2.

(figure: framework overview)

With ProFusion, you can generate an infinite number of creative images for a novel/unique concept from a single testing image, on a single GPU (~20 GB of memory is needed when fine-tuning with batch size 1).

(figure: example results)

Example

  • Install dependencies (we use a revised version of the original diffusers library);

      cd ./diffusers
      pip install -e .
      cd ..
      pip install accelerate==0.16.0 torchvision "transformers>=4.25.1" datasets ftfy tensorboard Jinja2 regex tqdm joblib
    
  • Initialize Accelerate;

      accelerate config
    
  • Download a model pre-trained on FFHQ;

  • Customize the model with a testing image, as shown in the notebook test.ipynb;

Train Your Own Encoder

If you want to train a PromptNet encoder for other domains or on your own dataset:

  • First, prepare an image-only dataset;

  • Then, run

      accelerate launch --mixed_precision="fp16" train.py \
            --pretrained_model_name_or_path="stabilityai/stable-diffusion-2-base" \
            --train_data_dir=./images_512 \
            --max_train_steps=80000 \
            --learning_rate=2e-05 \
            --output_dir="./promptnet" \
            --train_batch_size=8 \
            --promptnet_l2_reg=0.000 \
            --gradient_checkpointing
    
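The `--train_data_dir=./images_512` argument above suggests a flat folder of 512×512 images. A minimal sketch for preparing such a folder (the folder names and target size are assumptions based on the command above, not a script shipped with the repo):

```python
import os
from PIL import Image

def prepare_dataset(src_dir, dst_dir="images_512", size=512):
    """Center-crop every image in src_dir to a square and resize it
    to size x size, saving results into dst_dir for --train_data_dir."""
    os.makedirs(dst_dir, exist_ok=True)
    for name in sorted(os.listdir(src_dir)):
        path = os.path.join(src_dir, name)
        try:
            img = Image.open(path).convert("RGB")
        except OSError:
            continue  # skip files that are not readable images
        w, h = img.size
        side = min(w, h)  # largest centered square
        left, top = (w - side) // 2, (h - side) // 2
        img = img.crop((left, top, left + side, top + side))
        img = img.resize((size, size), Image.LANCZOS)
        img.save(os.path.join(dst_dir, os.path.splitext(name)[0] + ".png"))
```

Run it once over your raw images, then point `--train_data_dir` at the output folder.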

Citation

@misc{zhou2023enhancing,
  title={Enhancing Detail Preservation for Customized Text-to-Image Generation: A Regularization-Free Approach}, 
  author={Yufan Zhou and Ruiyi Zhang and Tong Sun and Jinhui Xu},
  year={2023},
  eprint={2305.13579},
  archivePrefix={arXiv},
  primaryClass={cs.CV}
}
