• moosetwin
    1924 hours ago

    I don’t mind the idea, but I’d be curious where the training data comes from. You can’t just train the models on users’ (unsubtitled) videos, because you need subtitles to know whether the output is right or wrong. I checked their Twitter post, but it didn’t seem to help.

      • @[email protected]
        1220 hours ago

        They may have to give it some special training to be able to understand audio mixed by the Chris Nolan school of wtf are they saying.

        • @[email protected]
          318 hours ago

          No, if you have a center track you can just use that. Volume isn’t a problem for a computer either, since it’s reading the audio data directly rather than listening through physical speakers.
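
          To illustrate the point about the center track: in a standard 5.1 mix the channel order is FL, FR, FC, LFE, BL, BR, so dialogue usually sits in channel index 2. Below is a minimal sketch of pulling that channel out of a 16-bit PCM WAV file using only the Python standard library; `extract_center_channel` is a hypothetical helper name, and real pipelines would more likely use ffmpeg’s channel-mapping filters.

          ```python
          import struct
          import wave

          # Standard 5.1 channel order: FL, FR, FC, LFE, BL, BR.
          # The center (dialogue) channel is index 2.
          CENTER_INDEX = 2
          NUM_CHANNELS = 6

          def extract_center_channel(path_in: str, path_out: str) -> None:
              """Copy only the center channel of a 6-channel 16-bit WAV
              into a mono WAV file."""
              with wave.open(path_in, "rb") as src:
                  assert src.getnchannels() == NUM_CHANNELS, "expected a 5.1 (6-channel) file"
                  assert src.getsampwidth() == 2, "sketch assumes 16-bit PCM"
                  rate = src.getframerate()
                  raw = src.readframes(src.getnframes())

              # Interleaved 16-bit little-endian samples: one value per channel per frame.
              samples = struct.unpack("<%dh" % (len(raw) // 2), raw)
              # Take every 6th sample, starting at the center channel's offset.
              center = samples[CENTER_INDEX::NUM_CHANNELS]

              with wave.open(path_out, "wb") as dst:
                  dst.setnchannels(1)
                  dst.setsampwidth(2)
                  dst.setframerate(rate)
                  dst.writeframes(struct.pack("<%dh" % len(center), *center))
          ```

          Feeding only this mono track to a speech-to-text model sidesteps the music and effects that get mixed into the left/right and surround channels.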

          • @[email protected]
            13 hours ago

            I took the other comment as a joke but this is accurate and interesting additional information!

    • @[email protected]
      922 hours ago

      I hope they’re using OpenSubtitles, or one of the many academic speech-to-text datasets that exist.