
Stable Diffusion 2.0-v Released

몽상꼴레 2022. 11. 25. 17:23

Stable Diffusion 2.0-v, the so-called v-prediction model, has been released.
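
For readers wondering what "v-prediction" refers to, here is a minimal sketch. It assumes the commonly cited definition of the velocity target, v = alpha * eps - sigma * x0 with alpha^2 + sigma^2 = 1, and assumes that this is the parameterization the 2.0-v model uses. The function names and toy numbers are made up for illustration and are not taken from the Stability-AI code.

```python
import math

def noisy_sample(x0, eps, alpha, sigma):
    """Forward diffusion step: x_t = alpha * x0 + sigma * eps (element-wise)."""
    return [alpha * x + sigma * e for x, e in zip(x0, eps)]

def v_target(x0, eps, alpha, sigma):
    """Velocity target the network would be trained to predict:
    v = alpha * eps - sigma * x0 (element-wise)."""
    return [alpha * e - sigma * x for x, e in zip(x0, eps)]

def x0_from_v(x_t, v, alpha, sigma):
    """Recover the clean sample from x_t and v: x0 = alpha * x_t - sigma * v.
    This identity relies on alpha**2 + sigma**2 == 1."""
    return [alpha * xt - sigma * vi for xt, vi in zip(x_t, v)]

# Toy values (hypothetical): a 3-element "image" and a fixed noise vector.
phi = 0.3                                  # angle that sets the noise level
alpha, sigma = math.cos(phi), math.sin(phi)
x0 = [0.5, -1.0, 2.0]
eps = [0.1, 0.7, -0.4]

x_t = noisy_sample(x0, eps, alpha, sigma)
v = v_target(x0, eps, alpha, sigma)
recovered = x0_from_v(x_t, v, alpha, sigma)  # matches x0 up to float rounding
```

The point of the sketch is only that predicting v is enough to recover the clean sample (and, by the same algebra, the noise), so it is an alternative to predicting the noise directly.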

 

Trained on CLIP-filtered image-text pairs from LAION-5B (5.85 billion pairs in total). Reportedly trained anew from scratch.

 

The improvements are as follows.

 

- Can generate output at resolutions of 2048x2048 or higher (when combined with the new 4x upscaler model)

- Depth-guided Stable Diffusion model

- Brand new text-guided inpainting model

 

 

The source code can be downloaded from the GitHub repository below.

https://github.com/Stability-AI/stablediffusion

 


 

The trained ckpt file can be downloaded from the link below (5.09 GB).

https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/768-v-ema.ckpt
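
If you would rather fetch the file from a script than from the browser, a rough sketch follows. It assumes the usual Hugging Face convention that a `/blob/` page URL has a matching `/resolve/` URL serving the raw file. The helper names `to_resolve_url` and `download` are hypothetical, and the actual download call is left commented out because the file is about 5 GB.

```python
import shutil
import urllib.request

# Page URL quoted in this post (the web viewer, not the raw file).
BLOB_URL = "https://huggingface.co/stabilityai/stable-diffusion-2/blob/main/768-v-ema.ckpt"

def to_resolve_url(blob_url: str) -> str:
    """Turn a Hugging Face '/blob/' page URL into the '/resolve/' raw-file URL.
    Assumption: the repo follows the standard blob/resolve URL layout."""
    return blob_url.replace("/blob/", "/resolve/", 1)

def download(url: str, dest: str) -> None:
    """Stream the response to disk so the whole file never sits in memory."""
    with urllib.request.urlopen(url) as resp, open(dest, "wb") as out:
        shutil.copyfileobj(resp, out)

raw_url = to_resolve_url(BLOB_URL)

# Uncomment to actually download (about 5 GB):
# download(raw_url, "768-v-ema.ckpt")
```

Note that a .ckpt file is a pickle, so only load checkpoints from sources you trust.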

 


 

 

So then, how did the folks who build easy-to-use WebUIs on top of this react? Worth a look, right?

 

Sure enough... our little chicks have already gotten to work~!

 

https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/5011

 


 

I'll tinker with it a bit once I'm back from my business trip.

 

 

References

 

https://medium.com/mlearning-ai/stable-diffusion-v2-0-released-this-is-massive-718072bc57e1

 
