This one took the long way 'round to get to the reveal. Image gen is almost always surprising, and frequently makes me laugh.

Starting full-size Flux image:

  • CavendishOPM
    16 days ago

    Yes, this is all local and self-hosted.

    On the hardware front, I’m running an Intel 12900K, an Nvidia 3090, and 64GB of RAM. The workflow is all done in ComfyUI. I start with a single image from the Flux 1 Dev model (text-to-image), then pass it to a new video model called WAN 2.1 for the image-to-video step.

    The Flux image takes about 2 minutes to generate, but I make them big: about 2.5 megapixels with an upscale/finetune pass. The video takes about 3-4 hours to output 10 seconds of 720p at 15fps. Any larger res or more frames and I get out-of-memory errors. After that, I run it through a script I wrote using a frame interpolator called RIFE to get it up to 30fps.
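
    For anyone curious what the 15fps to 30fps step amounts to, here is a rough Python sketch of 2x frame doubling. It is not the script mentioned above and does not call RIFE: `blend_midpoint` is a hypothetical stand-in that just averages neighboring frames, whereas RIFE estimates motion between them. The toy one-pixel frames are also made up for illustration.

```python
# Sketch of 2x frame interpolation (15 fps -> 30 fps).
# ASSUMPTION: blend_midpoint is a hypothetical placeholder for a real
# interpolator such as RIFE. It only averages pixels; RIFE does not work
# this way (it estimates motion), so treat this as a shape-of-the-loop demo.

def blend_midpoint(a, b):
    """Hypothetical placeholder: per-pixel average of two frames."""
    return [(x + y) / 2 for x, y in zip(a, b)]


def double_frame_rate(frames, midpoint=blend_midpoint):
    """Insert one synthesized frame between each pair of neighboring frames."""
    if not frames:
        return []
    out = [frames[0]]
    for prev, nxt in zip(frames, frames[1:]):
        out.append(midpoint(prev, nxt))  # the in-between frame
        out.append(nxt)                  # the original frame
    return out


# 10 seconds at 15 fps is 150 source frames.
source = [[float(i)] for i in range(10 * 15)]  # toy 1-pixel frames
doubled = double_frame_rate(source)
print(len(source), len(doubled))  # 150 299
```

    Doubling gives 2n - 1 frames rather than 2n, because there is no frame after the last one to interpolate toward. At 30fps that is still just under 10 seconds.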

    If you don’t need NSFW and don’t mind a Chinese company doing the video, I’ve gotten great image-to-video results from KlingAI in just a few minutes. The company behind Flux is also working on a local i2v model, and there’s another new one called HunyuanVideo that came out just a few days ago, but I haven’t tried it.

    • scrion@lemmy.world
      16 days ago

      Very interesting info - there is some really useful stuff in there I can use as an immediate starting point, so thank you for that!