He also brought us IC-Light! I wonder why he's still contributing to open source... Surely all the big companies have made him huge offers. He's so talented
Wan 2.1 (along with Hunyuan and LTXV, in descending order of overall video quality, though each has unique strengths) works well, if slowly (LTXV excepted), for short videos on consumer hardware: single-digit seconds at their usual frame rates (16 fps for Wan, 24 for LTXV, I forget for Hunyuan). But this blows them entirely out of the water on the length it can handle, so if it manages that with coherence and quality across general prompts (and especially if it is competitive with Wan and Hunyuan on trainability for concepts it doesn't handle out of the box), it is potentially a radical game changer.
Wan 2.1 is solid, but you start to get pretty bad continuity/drift issues when generating more than 81 frames (about 5 seconds at 16 fps), whereas FramePack lets you generate a minute or more.
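If you want to sanity-check those numbers, the frame-to-duration math is trivial; here's a throwaway Python snippet using the frame rates quoted above (treat them as approximate, the models' docs are authoritative):

    # Rough clip-duration arithmetic for the frame counts discussed in this thread.
    # Frame rates are the ones quoted above (Wan ~16 fps, LTXV ~24 fps).
    FRAME_RATES = {"wan2.1": 16, "ltxv": 24}

    def clip_seconds(num_frames: int, fps: int) -> float:
        """Duration in seconds of a clip with num_frames frames at fps."""
        return num_frames / fps

    print(f"81 frames @ 16 fps ~= {clip_seconds(81, FRAME_RATES['wan2.1']):.1f} s")  # ~5.1 s
    print(f"60 s @ 16 fps needs {60 * FRAME_RATES['wan2.1']} frames")                # 960 frames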
Wow, the examples are fairly impressive, and the resources used to create them are practically trivial. It seems inference can be run on previous-generation consumer hardware. I'd like to see throughput stats for inference on a 5090 at some point, too.
That's a certified bop! ;) You should get elybeatmaker to do a remix!
This is the first decent video generation model that runs on consumer hardware. That's a big deal, and I expect ControlNet pose support soon, too.
https://github.com/Lightricks/LTX-Video
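For anyone who wants to poke at it locally, here's a rough sketch of driving it through Hugging Face diffusers (this assumes a recent diffusers release with the LTXPipeline integration; the prompt and the resolution/frame/step settings are placeholders, so check the repo README for the recommended values):

    # Minimal text-to-video sketch via the diffusers LTXPipeline integration
    # for Lightricks/LTX-Video. Settings below are illustrative, not tuned.
    import torch
    from diffusers import LTXPipeline
    from diffusers.utils import export_to_video

    pipe = LTXPipeline.from_pretrained("Lightricks/LTX-Video", torch_dtype=torch.bfloat16)
    pipe.to("cuda")

    frames = pipe(
        prompt="a corgi skateboarding through a neon-lit city at night",
        negative_prompt="worst quality, blurry, jittery",
        width=704,
        height=480,
        num_frames=121,          # roughly 5 s at LTXV's 24 fps
        num_inference_steps=50,
    ).frames[0]

    export_to_video(frames, "ltx_video.mp4", fps=24)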
It can leave LLMs behind...
'Cause LLMs don't dance, and if they don't dance, well, they're no friends of mine.
Edit: I didn't realize that this was actually a reference to Men Without Hats - The Safety Dance. I was referencing a different parody/allusion to that song!