[Feature] Support AnimateDiff, a popular text2animation method #1980
Merged
60 commits
15d09e4 first commit for animatediff (ElliotQi)
a39aa96 fix lint errors (ElliotQi)
955248f modify readme file and add readme_zh-CN.md (ElliotQi)
b46ec12 fix some typos in readme (ElliotQi)
016c86e delete test_animatediff.py (ElliotQi)
127c99a add some docstring (ElliotQi)
0aa9655 Merge branch 'open-mmlab:main' into animatediff (ElliotQi)
c562050 Merge branch 'open-mmlab:main' into animatediff (ElliotQi)
3609ac3 fix cross attention for 512*512 animation quality (ElliotQi)
800faae Merge branch 'open-mmlab:main' into animatediff (ElliotQi)
8a092b3 fix some initial setting for cpu load (ElliotQi)
6dd1920 add unittest samples (ElliotQi)
ec7e191 modify unittest codes (ElliotQi)
60cd955 remove duplicated unittest files (ElliotQi)
116f42a modify unittest codes for minimum memory (ElliotQi)
e0994cf modify test_unet3d resolution for minimum memory unittest (ElliotQi)
dda8716 modify test_unet_blocks3d input resolution for minimum memory unittest (ElliotQi)
87bb203 Merge branch 'open-mmlab:main' into animatediff (ElliotQi)
9e0b432 modify animatediff.py for gradio (ElliotQi)
55d381c add gradio app for animatediff (ElliotQi)
a10ede2 skip test with large memory (ElliotQi)
cde60c6 Merge branch 'main' into animatediff (ElliotQi)
276e051 Merge branch 'main' into animatediff (liuwenran)
4f54924 fix environment building (ElliotQi)
1de46df Merge branch 'animatediff' of github.com:ElliotQi/mmagic into animate… (ElliotQi)
b645f0c Merge branch 'main' into animatediff (ElliotQi)
485fdd2 Merge branch 'main' into animatediff (ElliotQi)
76cb637 fix merging conflict (ElliotQi)
d67f61a Merge branch 'open-mmlab:main' into animatediff (ElliotQi)
541c2e9 Add different style ckpt (ElliotQi)
b476929 Merge branch 'animatediff' of github.com:ElliotQi/mmagic into animate… (ElliotQi)
eaeb9a0 Merge branch 'main' into animatediff (liuwenran)
3a1de39 fix environment building (ElliotQi)
7168ee2 Merge branch 'animatediff' of github.com:ElliotQi/mmagic into animate… (ElliotQi)
3502e09 add new motion module (ElliotQi)
cf0e50c Merge branch 'open-mmlab:main' into animatediff (ElliotQi)
71102cf add prompts for all config files in README (ElliotQi)
e407e79 add image in README (ElliotQi)
ec15c1b fix sd ckpt auto downloading (ElliotQi)
7b51631 remove unused import in test code (ElliotQi)
55a2b15 align README_zh and README (ElliotQi)
8e33acc fix building error (ElliotQi)
e89b7a4 delete unused comments (ElliotQi)
f9b6f2a fix test memory (ElliotQi)
fa16d66 Merge branch 'main' into animatediff (ElliotQi)
c7a90e8 fix text_model error for later transformer version (ElliotQi)
df00be0 Merge branch 'main' into animatediff (ElliotQi)
67c29ca fix comment copyright (ElliotQi)
8cece90 add animatediff gradio README (ElliotQi)
db8023d modify some copyright in motion_module.py (ElliotQi)
178a2e8 modify README for better test guidance (ElliotQi)
cacfd29 fix inference without xformers and mimsave for higher version of imageio (ElliotQi)
28a3437 fix errors in different versions of imageio (ElliotQi)
7aed840 Merge branch 'main' into animatediff (ElliotQi)
9e30b43 add train tutorial and pretrained models (ElliotQi)
e5d611b fix some comments in README (ElliotQi)
08bf694 delete personal information (ElliotQi)
cda517c fix gradio sd selection (ElliotQi)
fbf49ec add some tips for run gradio (ElliotQi)
9bad0b9 add pretrained links (ElliotQi)
@@ -0,0 +1,61 @@
# config for model
stable_diffusion_v15_url = 'runwayml/stable-diffusion-v1-5'
models_path = '/home/AnimateDiff/models/'
randomness = dict(
    seed=[
        10917152860782582783, 6399018107401806238, 15875751942533906793,
        6653196880059936551
    ],
    diff_rank_seed=True)

diffusion_scheduler = dict(
    type='DDIMScheduler',
    beta_end=0.012,
    beta_schedule='linear',
    beta_start=0.00085,
    num_train_timesteps=1000,
    prediction_type='epsilon',
    set_alpha_to_one=True,
    clip_sample=False,
    thresholding=False,
    steps_offset=1)

model = dict(
    type='AnimateDiff',
    vae=dict(
        type='AutoencoderKL',
        from_pretrained=stable_diffusion_v15_url,
        subfolder='vae'),
    unet=dict(
        type='UNet3DConditionMotionModel',
        unet_use_cross_frame_attention=False,
        unet_use_temporal_attention=False,
        use_motion_module=True,
        motion_module_resolutions=[1, 2, 4, 8],
        motion_module_mid_block=False,
        motion_module_decoder_only=False,
        motion_module_type='Vanilla',
        motion_module_kwargs=dict(
            num_attention_heads=8,
            num_transformer_block=1,
            attention_block_types=['Temporal_Self', 'Temporal_Self'],
            temporal_position_encoding=True,
            temporal_position_encoding_max_len=24,
            temporal_attention_dim_div=1),
        subfolder='unet',
        from_pretrained=stable_diffusion_v15_url),
    text_encoder=dict(
        type='ClipWrapper',
        clip_type='huggingface',
        pretrained_model_name_or_path=stable_diffusion_v15_url,
        subfolder='text_encoder'),
    tokenizer=stable_diffusion_v15_url,
    scheduler=diffusion_scheduler,
    test_scheduler=diffusion_scheduler,
    data_preprocessor=dict(type='DataPreprocessor'),
    motion_module_cfg=dict(path=models_path + 'Motion_Module/mm_sd_v14.ckpt'),
    dream_booth_lora_cfg=dict(
        type='ToonYou',
        path=models_path + 'DreamBooth_LoRA/lyriel_v16.safetensors',
        steps=25,
        guidance_scale=7.5))
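The `diffusion_scheduler` block in this config sets a linear beta schedule over 1000 training timesteps. As a rough illustration of what those numbers imply, the sketch below computes the betas and the cumulative product of alphas in plain Python. It assumes that `beta_schedule='linear'` means evenly spaced betas between `beta_start` and `beta_end`, which is the usual convention but is not confirmed by this diff. The helper names `linear_betas` and `alphas_cumprod` are hypothetical and not part of the PR.

```python
def linear_betas(beta_start, beta_end, num_train_timesteps):
    """Evenly spaced betas (the assumed meaning of beta_schedule='linear')."""
    step = (beta_end - beta_start) / (num_train_timesteps - 1)
    return [beta_start + i * step for i in range(num_train_timesteps)]


def alphas_cumprod(betas):
    """Running product of (1 - beta_t): the signal fraction kept at step t."""
    out, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        out.append(running)
    return out


# Values copied from the diffusion_scheduler dict in the config.
betas = linear_betas(beta_start=0.00085, beta_end=0.012,
                     num_train_timesteps=1000)
acp = alphas_cumprod(betas)
print(f'first beta={betas[0]:.5f}, last beta={betas[-1]:.5f}')
print(f'alphas_cumprod: first={acp[0]:.5f}, last={acp[-1]:.3e}')
```

The cumulative product shrinks monotonically toward zero, so late timesteps are almost pure noise while early ones keep most of the signal.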
@@ -0,0 +1,62 @@
# config for model
stable_diffusion_v15_url = 'runwayml/stable-diffusion-v1-5'
models_path = '/home/AnimateDiff/models/'
randomness = dict(
    seed=[
        1572448948722921032, 1099474677988590681, 6488833139725635347,
        18339859844376517918
    ],
    diff_rank_seed=True)

diffusion_scheduler = dict(
    type='DDIMScheduler',
    beta_end=0.012,
    beta_schedule='linear',
    beta_start=0.00085,
    num_train_timesteps=1000,
    prediction_type='epsilon',
    set_alpha_to_one=True,
    clip_sample=False,
    thresholding=False,
    steps_offset=1)

model = dict(
    type='AnimateDiff',
    vae=dict(
        type='AutoencoderKL',
        from_pretrained=stable_diffusion_v15_url,
        subfolder='vae'),
    unet=dict(
        type='UNet3DConditionMotionModel',
        unet_use_cross_frame_attention=False,
        unet_use_temporal_attention=False,
        use_motion_module=True,
        motion_module_resolutions=[1, 2, 4, 8],
        motion_module_mid_block=False,
        motion_module_decoder_only=False,
        motion_module_type='Vanilla',
        motion_module_kwargs=dict(
            num_attention_heads=8,
            num_transformer_block=1,
            attention_block_types=['Temporal_Self', 'Temporal_Self'],
            temporal_position_encoding=True,
            temporal_position_encoding_max_len=24,
            temporal_attention_dim_div=1),
        subfolder='unet',
        from_pretrained=stable_diffusion_v15_url),
    text_encoder=dict(
        type='ClipWrapper',
        clip_type='huggingface',
        pretrained_model_name_or_path=stable_diffusion_v15_url,
        subfolder='text_encoder'),
    tokenizer=stable_diffusion_v15_url,
    scheduler=diffusion_scheduler,
    test_scheduler=diffusion_scheduler,
    data_preprocessor=dict(type='DataPreprocessor'),
    motion_module_cfg=dict(path=models_path + 'Motion_Module/mm_sd_v14.ckpt'),
    dream_booth_lora_cfg=dict(
        type='ToonYou',
        path=models_path +
        'DreamBooth_LoRA/majicmixRealistic_v5Preview.safetensors',
        steps=25,
        guidance_scale=7.5))
@@ -0,0 +1,61 @@
# config for model
stable_diffusion_v15_url = 'runwayml/stable-diffusion-v1-5'
models_path = '/home/AnimateDiff/models/'
randomness = dict(
    seed=[
        16931037867122267877, 2094308009433392066, 4292543217695451092,
        15572665120852309890
    ],
    diff_rank_seed=True)

diffusion_scheduler = dict(
    type='DDIMScheduler',
    beta_end=0.012,
    beta_schedule='linear',
    beta_start=0.00085,
    num_train_timesteps=1000,
    prediction_type='epsilon',
    set_alpha_to_one=True,
    clip_sample=False,
    thresholding=False,
    steps_offset=1)

model = dict(
    type='AnimateDiff',
    vae=dict(
        type='AutoencoderKL',
        from_pretrained=stable_diffusion_v15_url,
        subfolder='vae'),
    unet=dict(
        type='UNet3DConditionMotionModel',
        unet_use_cross_frame_attention=False,
        unet_use_temporal_attention=False,
        use_motion_module=True,
        motion_module_resolutions=[1, 2, 4, 8],
        motion_module_mid_block=False,
        motion_module_decoder_only=False,
        motion_module_type='Vanilla',
        motion_module_kwargs=dict(
            num_attention_heads=8,
            num_transformer_block=1,
            attention_block_types=['Temporal_Self', 'Temporal_Self'],
            temporal_position_encoding=True,
            temporal_position_encoding_max_len=24,
            temporal_attention_dim_div=1),
        subfolder='unet',
        from_pretrained=stable_diffusion_v15_url),
    text_encoder=dict(
        type='ClipWrapper',
        clip_type='huggingface',
        pretrained_model_name_or_path=stable_diffusion_v15_url,
        subfolder='text_encoder'),
    tokenizer=stable_diffusion_v15_url,
    scheduler=diffusion_scheduler,
    test_scheduler=diffusion_scheduler,
    data_preprocessor=dict(type='DataPreprocessor'),
    motion_module_cfg=dict(path=models_path + 'Motion_Module/mm_sd_v14.ckpt'),
    dream_booth_lora_cfg=dict(
        type='ToonYou',
        path=models_path + 'DreamBooth_LoRA/rcnzCartoon3d_v10.safetensors',
        steps=25,
        guidance_scale=7.5))
@@ -0,0 +1,62 @@
# config for model
stable_diffusion_v15_url = 'runwayml/stable-diffusion-v1-5'
models_path = '/home/AnimateDiff/models/'
randomness = dict(
    seed=[
        5658137986800322009, 12099779162349365895, 10499524853910852697,
        16768009035333711932
    ],
    diff_rank_seed=True)

diffusion_scheduler = dict(
    type='DDIMScheduler',
    beta_end=0.012,
    beta_schedule='linear',
    beta_start=0.00085,
    num_train_timesteps=1000,
    prediction_type='epsilon',
    set_alpha_to_one=True,
    clip_sample=False,
    thresholding=False,
    steps_offset=1)

model = dict(
    type='AnimateDiff',
    vae=dict(
        type='AutoencoderKL',
        from_pretrained=stable_diffusion_v15_url,
        subfolder='vae'),
    unet=dict(
        type='UNet3DConditionMotionModel',
        unet_use_cross_frame_attention=False,
        unet_use_temporal_attention=False,
        use_motion_module=True,
        motion_module_resolutions=[1, 2, 4, 8],
        motion_module_mid_block=False,
        motion_module_decoder_only=False,
        motion_module_type='Vanilla',
        motion_module_kwargs=dict(
            num_attention_heads=8,
            num_transformer_block=1,
            attention_block_types=['Temporal_Self', 'Temporal_Self'],
            temporal_position_encoding=True,
            temporal_position_encoding_max_len=24,
            temporal_attention_dim_div=1),
        subfolder='unet',
        from_pretrained=stable_diffusion_v15_url),
    text_encoder=dict(
        type='ClipWrapper',
        clip_type='huggingface',
        pretrained_model_name_or_path=stable_diffusion_v15_url,
        subfolder='text_encoder'),
    tokenizer=stable_diffusion_v15_url,
    scheduler=diffusion_scheduler,
    test_scheduler=diffusion_scheduler,
    data_preprocessor=dict(type='DataPreprocessor'),
    motion_module_cfg=dict(path=models_path + 'Motion_Module/mm_sd_v14.ckpt'),
    dream_booth_lora_cfg=dict(
        type='ToonYou',
        path=models_path +
        'DreamBooth_LoRA/realisticVisionV20_v20.safetensors',
        steps=25,
        guidance_scale=7.5))
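The four configs up to this point differ only in their seeds and in the DreamBooth/LoRA checkpoint file name. One possible way to avoid the repetition is sketched below: a small factory that copies a shared base dict and fills in the checkpoint path. This is only an illustration, not what the PR does. `BASE_MODEL` and `make_model_cfg` are hypothetical names, and the base dict is abbreviated (the `vae`, `unet`, `text_encoder` and scheduler entries are left out for brevity).

```python
import copy

MODELS_PATH = '/home/AnimateDiff/models/'

# Abbreviated shared settings; the real configs carry many more keys.
BASE_MODEL = dict(
    type='AnimateDiff',
    motion_module_cfg=dict(path=MODELS_PATH + 'Motion_Module/mm_sd_v14.ckpt'),
    dream_booth_lora_cfg=dict(
        type='ToonYou', path=None, steps=25, guidance_scale=7.5),
)


def make_model_cfg(lora_filename):
    """Return a fresh model dict pointing at the given LoRA checkpoint."""
    cfg = copy.deepcopy(BASE_MODEL)  # deep copy so nested dicts are not shared
    cfg['dream_booth_lora_cfg']['path'] = (
        MODELS_PATH + 'DreamBooth_LoRA/' + lora_filename)
    return cfg


lyriel = make_model_cfg('lyriel_v16.safetensors')
realistic = make_model_cfg('realisticVisionV20_v20.safetensors')
print(lyriel['dream_booth_lora_cfg']['path'])
print(realistic['dream_booth_lora_cfg']['path'])
```

MMEngine-style configs can also inherit from a shared file through `_base_`, which may be a more conventional route than a helper function; whether that fits this project's layout is not something the diff settles.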
@@ -0,0 +1,64 @@
# config for model
stable_diffusion_v15_url = 'runwayml/stable-diffusion-v1-5'
models_path = '/home/AnimateDiff/models/'
randomness = dict(
    seed=[
        13100322578370451493, 14752961627088720670, 9329399085567825781,
        16987697414827649302
    ],
    diff_rank_seed=True)

diffusion_scheduler = dict(
    type='DDIMScheduler',
    beta_end=0.012,
    beta_schedule='linear',
    beta_start=0.00085,
    num_train_timesteps=1000,
    prediction_type='epsilon',
    set_alpha_to_one=True,
    clip_sample=False,
    thresholding=False,
    steps_offset=1)

model = dict(
    type='AnimateDiff',
    vae=dict(
        type='AutoencoderKL',
        from_pretrained=stable_diffusion_v15_url,
        subfolder='vae'),
    unet=dict(
        type='UNet3DConditionMotionModel',
        use_inflated_groupnorm=True,
        unet_use_cross_frame_attention=False,
        unet_use_temporal_attention=False,
        use_motion_module=True,
        motion_module_resolutions=[1, 2, 4, 8],
        motion_module_mid_block=True,
        motion_module_decoder_only=False,
        motion_module_type='Vanilla',
        motion_module_kwargs=dict(
            num_attention_heads=8,
            num_transformer_block=1,
            attention_block_types=['Temporal_Self', 'Temporal_Self'],
            temporal_position_encoding=True,
            temporal_position_encoding_max_len=32,
            temporal_attention_dim_div=1),
        subfolder='unet',
        from_pretrained=stable_diffusion_v15_url),
    text_encoder=dict(
        type='ClipWrapper',
        clip_type='huggingface',
        pretrained_model_name_or_path=stable_diffusion_v15_url,
        subfolder='text_encoder'),
    tokenizer=stable_diffusion_v15_url,
    scheduler=diffusion_scheduler,
    test_scheduler=diffusion_scheduler,
    data_preprocessor=dict(type='DataPreprocessor'),
    motion_module_cfg=dict(path=models_path +
                           'Motion_Module/mm_sd_v15_v2.ckpt'),
    dream_booth_lora_cfg=dict(
        type='ToonYou',
        path=models_path +
        'DreamBooth_LoRA/realisticVisionV20_v20.safetensors',
        steps=25,
        guidance_scale=7.5))
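This is the only config in the diff that loads `mm_sd_v15_v2.ckpt`, and its `unet` block differs from the `mm_sd_v14.ckpt` configs in three fields. The sketch below makes that difference explicit with a small recursive dict comparison. The `changed_keys` helper is hypothetical and written for illustration; the two dicts hold only the fields that differ, with values copied from the configs.

```python
def changed_keys(old, new, prefix=''):
    """Return {dotted_key: (old_value, new_value)} for values that differ.

    A key missing on one side is reported with None for that side.
    """
    changes = {}
    for key in sorted(set(old) | set(new)):
        a, b = old.get(key), new.get(key)
        path = prefix + key
        if isinstance(a, dict) and isinstance(b, dict):
            changes.update(changed_keys(a, b, path + '.'))
        elif a != b:
            changes[path] = (a, b)
    return changes


# Only the unet fields that differ between the two motion-module versions.
unet_v14 = dict(
    motion_module_mid_block=False,
    motion_module_kwargs=dict(temporal_position_encoding_max_len=24))
unet_v15_v2 = dict(
    use_inflated_groupnorm=True,
    motion_module_mid_block=True,
    motion_module_kwargs=dict(temporal_position_encoding_max_len=32))

for key, (old, new) in changed_keys(unet_v14, unet_v15_v2).items():
    print(f'{key}: {old!r} -> {new!r}')
```

Because these settings have to match the architecture the checkpoint was trained with, swapping the motion module path without also changing them would likely fail at weight-loading time, though the diff itself does not show that failure.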
@@ -0,0 +1,80 @@
# config for model
stable_diffusion_v15_url = 'runwayml/stable-diffusion-v1-5'
models_path = '/home/AnimateDiff/models/'
randomness = dict(
    seed=[
        10788741199826055526, 6520604954829636163, 6519455744612555650,
        16372571278361863751
    ],
    diff_rank_seed=True)

val_prompts = [
    'best quality, masterpiece, 1girl, looking at viewer,\
        blurry background, upper body, contemporary, dress',
    'masterpiece, best quality, 1girl, solo, cherry blossoms,\
        hanami, pink flower, white flower, spring season, wisteria,\
        petals, flower, plum blossoms, outdoors, falling petals,\
        white hair, black eyes,',
    'best quality, masterpiece, 1boy, formal, abstract,\
        looking at viewer, masculine, marble pattern',
    'best quality, masterpiece, 1girl, cloudy sky,\
        dandelion, contrapposto, alternate hairstyle,'
]
val_neg_propmts = [
    '',
    'badhandv4,easynegative,ng_deepnegative_v1_75t,verybadimagenegative_v1.3,\
        bad-artist, bad_prompt_version2-neg, teeth',
    '',
    '',
]
diffusion_scheduler = dict(
    type='DDIMScheduler',
    beta_end=0.012,
    beta_schedule='linear',
    beta_start=0.00085,
    num_train_timesteps=1000,
    prediction_type='epsilon',
    set_alpha_to_one=True,
    clip_sample=False,
    thresholding=False,
    steps_offset=1)

model = dict(
    type='AnimateDiff',
    vae=dict(
        type='AutoencoderKL',
        from_pretrained=stable_diffusion_v15_url,
        subfolder='vae'),
    unet=dict(
        type='UNet3DConditionMotionModel',
        unet_use_cross_frame_attention=False,
        unet_use_temporal_attention=False,
        use_motion_module=True,
        motion_module_resolutions=[1, 2, 4, 8],
        motion_module_mid_block=False,
        motion_module_decoder_only=False,
        motion_module_type='Vanilla',
        motion_module_kwargs=dict(
            num_attention_heads=8,
            num_transformer_block=1,
            attention_block_types=['Temporal_Self', 'Temporal_Self'],
            temporal_position_encoding=True,
            temporal_position_encoding_max_len=24,
            temporal_attention_dim_div=1),
        subfolder='unet',
        from_pretrained=stable_diffusion_v15_url),
    text_encoder=dict(
        type='ClipWrapper',
        clip_type='huggingface',
        pretrained_model_name_or_path=stable_diffusion_v15_url,
        subfolder='text_encoder'),
    tokenizer=stable_diffusion_v15_url,
    scheduler=diffusion_scheduler,
    test_scheduler=diffusion_scheduler,
    data_preprocessor=dict(type='DataPreprocessor'),
    motion_module_cfg=dict(path=models_path + 'Motion_Module/mm_sd_v14.ckpt'),
    dream_booth_lora_cfg=dict(
        type='ToonYou',
        path=models_path + 'DreamBooth_LoRA/toonyou_beta3.safetensors',
        steps=25,
        guidance_scale=7.5))
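This config lists four seeds, four validation prompts and four negative prompts. The sketch below assumes that the i-th entries of the three lists are meant to be used together, which the diff does not state. It also shows a side effect of the backslash continuations inside the prompt strings: the indentation of the continued line becomes part of the string, so the sketch collapses whitespace before use. `normalize_prompt` and `pair_samples` are hypothetical helpers, and only two of the four prompts are reproduced.

```python
import re


def normalize_prompt(text):
    """Collapse whitespace runs left behind by backslash line continuations."""
    return re.sub(r'\s+', ' ', text).strip()


def pair_samples(seeds, prompts, negative_prompts):
    """Zip seeds with prompts, failing loudly if the lists are misaligned."""
    if not len(seeds) == len(prompts) == len(negative_prompts):
        raise ValueError('seeds and prompt lists must have equal length')
    for seed in seeds:
        if not 0 <= seed < 2**64:
            raise ValueError(f'seed {seed} does not fit in 64 unsigned bits')
    return [
        dict(seed=s,
             prompt=normalize_prompt(p),
             negative_prompt=normalize_prompt(n))
        for s, p, n in zip(seeds, prompts, negative_prompts)
    ]


seeds = [10788741199826055526, 6520604954829636163]
prompts = [
    'best quality, masterpiece, 1girl, looking at viewer,\
        blurry background, upper body, contemporary, dress',
    'best quality, masterpiece, 1girl, cloudy sky,\
        dandelion, contrapposto, alternate hairstyle,',
]
negative_prompts = ['', '']

samples = pair_samples(seeds, prompts, negative_prompts)
print(samples[0]['prompt'])
```

The length check is there because a silent `zip` would simply drop trailing entries if one list were shorter than the others.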