cross-posted from: https://lemmy.ca/post/37011397

[email protected]

The popular open-source VLC video player was demonstrated on the floor of CES 2025 with automatic AI subtitling and translation, generated locally and offline in real time. Parent organization VideoLAN shared a video on Tuesday in which president Jean-Baptiste Kempf shows off the new feature, which uses open-source AI models to generate subtitles for videos in several languages.

  • @[email protected]
    English
    71 hour ago

    The nice thing is, now at least this can be used with live tv from other countries and languages.

    Imagine wanting to watch Japanese TV or Korean channels without having to bother with downloading, searching for, and syncing subtitles.

    • @[email protected]
      English
      534 minutes ago

      I prefer watching Mexican football announcers, and it would be nice to know what they’re saying. Though that might actually detract from the experience.

  • ZeroOne
    English
    326 minutes ago

    As long as the models are open source, I have no complaints.

  • @[email protected]
    English
    227 minutes ago

    When are we getting AMD’s FSR upscaling and frame-gen? Also, wouldn’t it make more sense for subtitles to use the Jellyfin approach?

    • @[email protected]
      English
      110 minutes ago

      I have an AMD card. Add VLC as a game in the drivers, and you can turn on AFMF (frame gen).

      If it doesn’t work, you could just turn it on system-wide in the display settings of the Adrenalin software (gear in the upper right corner, display/gaming).

      I think it requires at least a 6000 series GPU however.

      If you have a Samsung TV or other modern smart TV connected to a laptop, you can also turn on frame-gen using Auto Motion Plus, set to Custom.

      Judder Reduction 10 is double frames, so 24 FPS -> 48.

  • billwashere
    English
    92 hours ago

    This might be one of the few times I’ve seen AI being useful and not just slapped on something for marketing purposes.

  • @Thistlewick
    English
    114 hours ago

    Amazing. I can finally find out exactly what that nurse is yelling about while she gets railed by the local basketball team.

  • Clot
    English
    106 hours ago

    Will it be possible to export these AI subs?

  • @[email protected]
    English
    117 hours ago

    The technology is nowhere near being good, though. On synthetic tests, on the data it was trained and tweaked on, maybe, I don’t know.
    I co-run an event where we invite speakers from all over the world, and we tried every way to generate subtitles. All of them perform at the level of YouTube’s auto-generated ones. It’s better than nothing, but you can’t really rely on it.

    • @[email protected]
      English
      34 hours ago

      No, but I think it would be super helpful to synchronize subtitles that are not aligned to the video.

      • @[email protected]
        English
        43 hours ago

        This is already trivial. Bazarr has been doing it for all my subtitles for almost a decade.

    • @[email protected]
      English
      34 hours ago

      You haven’t been able to test it yet, and you’re already calling it nowhere near good 🤦🏻

      Like how should you know?!

      • @[email protected]
        English
        2
        edit-2
        2 hours ago

        Relax, they didn’t write a new way of doing magic, they integrated a solution from the market.
        I don’t know what the new BMW car they introduce this year is capable of, but I know for a fact it can’t fly.

  • @[email protected]
    English
    38 hours ago

    No such comment yet? I’ll be the first then.

    Oh no, AI bad, next thing they add is cryptocurrency mining!

    • katy ✨
      English
      167 hours ago

      ai for accessibility is nowhere near the same thing as crypto mining

  • @[email protected]
    English
    12517 hours ago

    This sounds like a great thing for deaf people and just in general, but I don’t think AI will ever replace anime fansub makers who have no problem throwing a wall of text on screen for a split second just to explain an obscure untranslatable pun.

    • @[email protected]
      English
      69 hours ago

      It’s unlikely to even replace good subtitles, fan or not. It’s just a nice thing to have for a lot of content though.

      • @[email protected]
        English
        4
        edit-2
        3 hours ago

        I have family members who can’t really understand spoken English because it’s a bit fast, and who can’t read English subtitles either because, again, they go by too fast for them.

        Sometimes you download a movie and all the Estonian subtitles are for an older release and they desynchronize. Sometimes you can barely even find synchronized English subtitles, so even that doesn’t work.

        This seems like a godsend, honestly.

        Funnily enough, of all the streaming services, I’m again going to have to commend Apple TV+ here. Their shit has Estonian subtitles. Netflix, Prime, etc, do not. Meaning if I’m watching with a family member who doesn’t understand English well, I’ll watch Apple TV+ with a subscription, and everything else is going to be pirated for subtitles. So I don’t bother subscribing anymore. We’re a tiny country, but for some reason Apple of all companies has chosen to acknowledge us. Meanwhile, I was setting up an Xbox for someone a few years ago, and Estonia just… straight up doesn’t exist. I’m not talking about language support - you literally couldn’t pick it as your LOCATION.

    • @[email protected]
      English
      1512 hours ago

      They are like the * in any Terry Pratchett (GNU) novel, sometimes a funny joke can have a little more spice added to make it even funnier

  • @[email protected]
    English
    16520 hours ago

    What’s important is that this is running on your machine locally, offline, without any cloud services. It runs directly inside the executable

    YES, thank you JB

    • @[email protected]
      English
      15322 hours ago

      I was just thinking, this is exactly what AI should be used for. Pattern recognition, full stop.

      • snooggums
        English
        6722 hours ago

        Yup, and if it isn’t perfect that is ok as long as it is close enough.

        Like getting name spellings wrong or mixing homophones is fine because it isn’t trying to be factually accurate.

        • @[email protected]
          English
          1316 hours ago

          I’d like to see this fix the most annoying part about subtitles: timing. Find a transcript or any subs on the Internet and have the AI align them with the audio properly.
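For the simplest kind of misalignment, where every line is off by the same amount, the fix is only a timestamp shift. Below is a minimal Python sketch of that case, assuming SubRip (.srt) timing lines of the form `HH:MM:SS,mmm --> HH:MM:SS,mmm`. The function names are made up for illustration, and working out the offset from the audio (the part an AI model would have to do) is not shown.

```python
import re

# Matches one SRT timestamp such as 00:01:02,345
TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def _shift_timestamp(match, offset_ms):
    """Return the matched timestamp moved by offset_ms, clamped at zero."""
    h, m, s, ms = (int(g) for g in match.groups())
    total = ((h * 60 + m) * 60 + s) * 1000 + ms + offset_ms
    total = max(total, 0)  # never produce a negative time
    h, rem = divmod(total, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def shift_srt(text, offset_ms):
    """Shift every timing line in an SRT document by a constant offset."""
    out = []
    for line in text.splitlines():
        # Only touch timing lines, so dialogue containing digits is left alone.
        if "-->" in line:
            line = TIMESTAMP.sub(lambda mo: _shift_timestamp(mo, offset_ms), line)
        out.append(line)
    return "\n".join(out)
```

Calling `shift_srt(srt_text, 2500)` delays every cue by 2.5 seconds, and a negative offset moves cues earlier. Subtitles that drift (fast in one stretch, slow in the next) need per-cue alignment against the audio, which this sketch does not attempt.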

          • @[email protected]
            English
            14 hours ago

            YES! I can’t stand when subtitles are misaligned to the video. If this AI tool could help with that, it would be super useful.

        • TJA!
          English
          3421 hours ago

          The problem is that now people will say they don’t need to create accurate subtitles because VLC is doing the job for them.

          Accessibility might suffer from that, because all subtitles are now just “good enough”

          • snooggums
            English
            1816 hours ago

            Regular old live broadcast closed captioning is pretty much ‘good enough’ and that is the standard I’m comparing to.

            Actual subtitles created ahead of time should be perfect because they have the time to double check.

          • @[email protected]
            English
            1020 hours ago

            Honestly though? If your audio is even half decent you’ll get like 95% accuracy. Considering a lot of media just wouldn’t have anything, that is a pretty fair trade off to me

            • @[email protected]
              English
              7
              edit-2
              16 hours ago

              From experience, AI translation is still garbage, especially for languages like Chinese, Japanese, and Korean, but if it only subtitles in the original language, such as creating English subtitles for English audio, then it is probably fine.

              • @[email protected]
                English
                213 hours ago

                That’s probably more due to lack of training than anything else. Existing models are mostly made by American companies and trained on English-language material. Naturally, the further you get from the model, the worse the result.

                • @[email protected]
                  English
                  312 hours ago

                  The issue is not the lack of training material; it doesn’t understand context and cultural references. Someone commented here that Crunchyroll’s AI subtitles turned the name “Asura Hall” into “asshole”.

          • @[email protected]
            English
            821 hours ago

            I have a feeling that if you care enough about subtitles you’re going to look for good ones, instead of using “ok” ai subs.

          • @[email protected]
            English
            2
            edit-2
            15 hours ago

            I imagine it would be not-exactly-simple-but-not-complicated to add a “threshold” feature. If the AI is less than X% certain, it can request human clarification.

            Edit: Derp. I forgot about the “real time” part. Still, as others have said, even a single botched word would still work well enough with context.
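For the offline (not real-time) case, the threshold idea could look roughly like the Python sketch below. It assumes each transcribed segment comes with a confidence score between 0 and 1. The `Segment` type, the `confidence` field, and the 0.8 default are all hypothetical and not taken from VLC or any particular speech model.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float       # seconds
    end: float         # seconds
    text: str
    confidence: float  # hypothetical score in [0, 1]; higher means more certain


def split_by_confidence(segments, threshold=0.8):
    """Separate segments that can be shown as-is from those needing review."""
    accepted = [s for s in segments if s.confidence >= threshold]
    needs_review = [s for s in segments if s.confidence < threshold]
    return accepted, needs_review
```

The `needs_review` list is what would be handed to a human. In a live setting there is nobody to ask in time, so the same split could only be used to mark uncertain lines, for example by styling them differently.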

            • snooggums
              English
              1
              edit-2
              16 hours ago

              That defeats the purpose of doing it in real time as it would introduce a delay.

    • @[email protected]
      English
      12
      edit-2
      20 hours ago

      Yeah, it’s pretty wonderful to see how far auto-generated transcription/captioning has come over the last couple of years. A wonderful victory for many communities with various disabilities.

  • TheRealKuni
    English
    3018 hours ago

    And yet they turned down having thumbnails for seeking because it would be too resource intensive. 😐

    • @[email protected]
      English
      119 hours ago

      Video decoding is resource intensive. We’re used to it, and we have hardware acceleration for some of it, but spewing something around 52 million pixels every second from a highly compressed data source is not cheap. I’m not sure how the two compare, but small LLM models are not that costly to run if you don’t factor their creation in.
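The "around 52 million pixels every second" figure matches 1080p video at 25 frames per second. That resolution and frame rate are an assumption here, since the comment does not state them. A quick check in Python:

```python
def pixels_per_second(width, height, fps):
    """Raw decoded pixel throughput for a given resolution and frame rate."""
    return width * height * fps


# Assumed source of the figure: 1920x1080 at 25 fps
print(pixels_per_second(1920, 1080, 25))  # 51840000
```

The same function gives 497,664,000 pixels per second for 3840x2160 at 60 fps, so the cost grows quickly with resolution and frame rate.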

    • @[email protected]
      English
      1410 hours ago

      I mean, it would. For example Jellyfin implements it, but it does so by extracting the pictures ahead of time and saving them. It takes days to do this for my library.

  • Phoenixz
    English
    46
    edit-2
    18 hours ago

    As VLC is open source, can we expect this technology to also be available for, say, Jellyfin, so that I can once and for all have subtitles done right?

    Edit: I think it’s great that vlc has this, but this sounds like something many other apps could benefit from

    • JustEnoughDucks
      English
      38 hours ago

      In the *arr suite, Bazarr has a plugin called Subgen which you can add, and you can set it to generate subtitles for your entire library if you want, or only for missing subtitles. The sync is spot on compared to 90% of what OpenSubtitles delivers. I sometimes re-gen them with this plugin just because OpenSubtitles is so constantly out of sync (e.g. in highly rated subtitles, 4 lines will be at breakneck pace, the next 10 will be super slow, and then everything is 3 seconds off).

      It isn’t in-player but it works. The downside is it is a larger model and takes ~20 minutes to generate a movie length of subtitles.

      • @[email protected]
        English
        510 hours ago

        Have there been any estimated minimum system requirements for this yet, since it runs locally?

        • @[email protected]
          English
          10
          edit-2
          9 hours ago

          It’s actually using whisper.cpp

          From the README:

          Memory usage

          Model    Disk       Mem
          tiny     75 MiB     ~273 MB
          base     142 MiB    ~388 MB
          small    466 MiB    ~852 MB
          medium   1.5 GiB    ~2.1 GB
          large    2.9 GiB    ~3.9 GiB

          Those are the model sizes
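To turn the figures quoted above into a quick "what fits on my box" check, here is a small Python sketch. The memory numbers are the approximate ones from the comment, converted to rough MB values. The helper name is made up, and the sketch ignores whatever the player, OS, and video decoding need on top of the model.

```python
# Approximate runtime memory per model, in MB, from the figures quoted above.
# "large" is listed as ~3.9 GiB, which is about 4190 MB.
MODEL_MEMORY_MB = [
    ("tiny", 273),
    ("base", 388),
    ("small", 852),
    ("medium", 2100),
    ("large", 4190),
]


def largest_model_that_fits(available_mb):
    """Return the biggest model whose quoted memory fits, or None if none do."""
    best = None
    for name, mem_mb in MODEL_MEMORY_MB:  # ordered from smallest to largest
        if mem_mb <= available_mb:
            best = name
    return best
```

With 1000 MB to spare this picks `small`, and with less than 273 MB it returns `None`.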

          • @[email protected]
            English
            26 hours ago

            Oh wow, those are pretty tiny memory requirements for a decent modern system! That’s actually very impressive! :D

            Many people can probably even run this on older media servers or even just a plain NAS! That’s awesome! :D

    • @[email protected]
      English
      2018 hours ago

      crunchyroll is currently using AI subtitles. it’s obvious because when someone says “mothra. Funky…” it captions “mother fucker”

      • @[email protected]
        English
        1518 hours ago

        That explains why their subtitles have seemed worse to me lately. Every now and then I see something obviously wrong and wonder how it got by anyone who looked at it. Now I know why. No one looked at it.

        • @[email protected]
          English
          1718 hours ago

          my wife and I love laughing at the dumbass mistakes it makes.

          some character’s name is Asura Halls?

          instead of “That’s Asura Halls!” you get “That asshole!”

          but if I was actually hearing impaired I’d be really pissed that I’m being treated as second class even though Sony still took my money like everyone else.