@[email protected] to

Science [email protected]English • 2 years ago

Using AI to Write Science Fiction Visual Storytelling

-4

This video demonstrates AI-generated content made with a combination of these three free tools.

Who needs humans when you have AI :p

Dr. Jenkem
English
4•2 years ago
Yeah, or LLMs succumb to model collapse and just keep getting worse until the fad eventually dies.
- @DudeWTF
  English
  2•2 years ago
  It is easy to have too many cooks in the kitchen, but that is an easy problem to solve. Model decay is not a real problem if you understand how an LLM works. Overtraining is like burning a big dinner and ruining a meal. One doesn’t stop cooking forever, or burn down the house and quit. You just cook another meal next time. If your model was trained on 100 trillion tokens, you’re likely to try your very best to salvage your massive ruined dish, but in the end, it doesn’t matter. You can easily tweak the recipe for next time. Models have no persistent memory. Context can be turned into data and used for training, but it is a totally separate thing that is unrelated to the model itself.
  
  As an oversimplification, an LLM is just a large database of categories mixed with a massive amount of language data that enables a statistical calculation of what word should come next. Everything else is censoring algorithms and illusions embedded into how humans use language. Really, this is a tool to access culture through language, and in the case of larger models, the culture embedded into many different human languages.
  
  This is as much of a “fad” now as the internet was in the late ’90s, and it is on par with that change. LLMs are no fad. This is a tool as disruptive as the public internet. For instance, in 10 years, Google will be a relic of the past; AI will completely replace it. Education will also completely change, since entirely individualized education becomes possible. Psychology will change, as an LLM can be tuned to address and help with many human social issues. This will change everything because it already exists in the open source space.
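As a rough illustration of the "statistical calculation of what word should come next" described in this comment, here is a toy bigram counter in Python. It is a deliberately tiny stand-in and not how a real LLM works internally (those are neural networks operating on tokens); the corpus and the `next_word` helper are invented for this sketch.

```python
from collections import Counter, defaultdict

# Toy corpus, invented for this sketch.
corpus = "the cat sat on the mat . the cat ate the fish .".split()

# Count how often each word follows each other word.
following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1

def next_word(word):
    """Return the most frequent follower of `word`, or None if unseen."""
    counts = following.get(word)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

print(next_word("the"))  # the word seen most often after "the" in the corpus
```

Note that nothing here persists between calls beyond the counts themselves, which loosely mirrors the point that the model and the conversation context are separate things.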
  - Dr. Jenkem
    English
    2•2 years ago
    If we assume LLMs are as revolutionary as you are suggesting, then how is model collapse an easy problem to solve? If Google is a relic of the past and the internet is filled with AI-generated content, then where will the training data come from? We can’t replace human-generated content with AI-generated content without an inevitable model collapse.
    
    Oh and btw, good luck with differentiating between human generated and AI generated. Already, social media sites are being cluttered with AI generated content, Amazon book publishing being cluttered with shit tier LLM generated “books” (cheap imitations), and if academia goes this way, and entertainment too, as many speculate, there’s hardly anything left.
    - FaceDeer
      0•2 years ago
      
      > Oh and btw, good luck with differentiating between human generated and AI generated.
      
      One easy way to do this is to check if it was generated before 2023. There isn’t much AI-generated content from before then.
      
      > Amazon book publishing being cluttered with shit tier LLM generated “books”
      
      So filter the books based on how “shit tier” they are.
      
      In the end, what’s needed to train AIs is good content. If some of that good content is itself AI-generated, who cares? You need to be selective in how you pick training material anyway.
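The two checks suggested in this comment (a date cutoff for provenance and a quality bar for selection) could look something like the sketch below. The `Document` record, its `quality` score, and the 0.7 threshold are all assumptions made for illustration; how one would actually score quality is the hard part and is not shown.

```python
from dataclasses import dataclass
from datetime import date

CUTOFF = date(2023, 1, 1)  # assumed reading of "generated before 2023"

@dataclass
class Document:
    text: str
    published: date
    quality: float  # hypothetical score in [0, 1]; how to compute it is not shown

def predates_cutoff(doc):
    """Cheap provenance check: was this published before the cutoff?"""
    return doc.published < CUTOFF

def select_for_training(docs, min_quality=0.7):
    """Keep documents that clear the quality bar, whoever or whatever wrote them."""
    return [d for d in docs if d.quality >= min_quality]

docs = [
    Document("old forum post", date(2019, 5, 1), 0.8),
    Document("new low-effort book", date(2024, 2, 1), 0.1),
    Document("new well-reviewed essay", date(2024, 3, 1), 0.9),
]
kept = select_for_training(docs)
print([d.text for d in kept])
print([predates_cutoff(d) for d in docs])
```

The design choice here follows the comment's argument: quality decides what is kept, while the date is only a label for material that is very unlikely to be AI-generated.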
      - Dr. Jenkem
        English
        1•2 years ago
        LLMs need updated training data to stay relevant.
        
        And how exactly are you going to curate high quality data when it’s on the order of terabytes or even petabytes?
        
        FaceDeer
        0•2 years ago
        
        > LLMs need updated training data to stay relevant.
        
        Yes. So add relevant new data along with the older stuff. The problem is not that AI-generated content is magically “poison” somehow. Model collapse happens when you lose rare data from repeated generations of training data generated by AIs.
        
        A simple way to imagine it is training an AI by showing it random coloured marbles out of a bucket and then asking it to fill the next AI’s bucket with new marbles to train on. If there’s just one single blue marble in the first bucket then it’s easily possible that the AI will fail to put a blue marble in the second bucket, after which there will never be a blue marble again if that’s all that subsequent AIs have to train off of. But if each time you train a new AI you reuse half the marbles from the first bucket again, you can have that blue marble show back up again in future AIs.
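The marble analogy can be turned into a small simulation. The sketch below is only a toy: the "AI" is modeled as resampling marbles with replacement from its training bucket, and the bucket size, colours, and reuse rates are assumptions chosen for illustration, not measurements of any real training run.

```python
import random

def next_bucket(bucket, original, reuse_fraction, rng):
    """Fill the next AI's bucket.

    A `reuse_fraction` share is drawn (without replacement) from the original
    bucket; the rest is "generated" by sampling with replacement from the
    current bucket, which is where rare colours can drop out.
    """
    size = len(original)
    n_reused = int(size * reuse_fraction)
    reused = rng.sample(original, n_reused)
    generated = rng.choices(bucket, k=size - n_reused)
    return reused + generated

def blue_history(reuse_fraction, generations=30, seed=0):
    """Return, per generation, whether a blue marble is in the bucket."""
    rng = random.Random(seed)
    original = ["red"] * 60 + ["green"] * 39 + ["blue"]  # one rare marble
    bucket = list(original)
    history = []
    for _ in range(generations):
        bucket = next_bucket(bucket, original, reuse_fraction, rng)
        history.append("blue" in bucket)
    return history

for fraction in (0.0, 0.5):
    seen = blue_history(fraction)
    print(f"reuse={fraction}: blue present in {sum(seen)} of {len(seen)} generations")
```

With `reuse_fraction=0.0`, a blue marble that goes missing can never come back, because later buckets are built only from the previous one. With any reuse from the original bucket, it has a fresh chance to reappear in every generation.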
        
        Dr. Jenkem
        English
        1•2 years ago
        If LLMs are as revolutionary as the zealots believe, then there will exist fewer and fewer blue marbles in the universe with each iteration. So either the bucket gets smaller or the ratio of blue marbles gets smaller.
        
        FaceDeer
        1•2 years ago
        I said:
        
        > But if each time you train a new AI you reuse half the marbles from the first bucket again, you can have that blue marble show back up again in future AIs.
        
        The original bucket containing the blue marble isn’t going anywhere. It still exists. The blue marble will always be available to mix into future AIs. All you have to do is make sure you’re using some historical data (or otherwise guaranteed “human-generated”) along with whatever new unvetted stuff you’re using.
        
        Dr. Jenkem
        English
        1•2 years ago
        So then you’re back to locking LLMs to the year 2023. Their usefulness is severely limited if you can’t train them on new data.
- FaceDeer
  0•2 years ago
  Why would they? There’s plenty of non-AI-generated material to train them off of and it’s something that future trainers will watch out for.
  - Dr. Jenkem
    English
    1•2 years ago
    Sure, there may be a lot, but it’s still finite. And already, social media is being filled with AI-generated content. If the trend continues, human-generated content will be dwarfed by AI-generated content. And it’s not going to be a simple process to distinguish between the two.
    - FaceDeer
      1•2 years ago
      Infinite training data isn’t required.
      
      It’s actually fine to include some AI-generated data in your training set. “Model collapse” happens when you train on only AI-generated content and end up losing some of the less-common outputs. Without the less-common cases in the training data, each generation of AI has less diverse information to learn from. If you make sure the training set is diverse enough then it should be fine.
      
      If all else fails, just make sure a lot of your data is from before 2023.
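One hedged way to put a number on "diverse enough" is to measure how spread out a training set is across categories. The sketch below uses Shannon entropy over category labels; the labels, the example sets, and the idea of comparing against a human-written baseline are assumptions for illustration, not a description of any trainer's actual process.

```python
import math
from collections import Counter

def shannon_entropy(labels):
    """Entropy in bits of the label distribution; higher means more diverse."""
    counts = Counter(labels)
    total = len(labels)
    return sum((c / total) * math.log2(total / c) for c in counts.values())

human_baseline = ["common"] * 90 + ["uncommon"] * 9 + ["rare"]
ai_only = ["common"] * 100            # the less-common cases have dropped out
mixed = ai_only + human_baseline      # AI output plus the original human data

for name, data in [("baseline", human_baseline), ("ai_only", ai_only), ("mixed", mixed)]:
    print(name, round(shannon_entropy(data), 3), sorted(set(data)))
```

In this toy setup the AI-only set has zero entropy, while the mixed set is less diverse than the baseline but still contains the rare case.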
      - Dr. Jenkem
        English
        1•2 years ago
        I think you misunderstand the problem. Sure, it starts with small amounts of output fed into the input, but as it continues to generate large amounts of output, over time more and more of the output makes it into the input.
        
        And again, limiting LLMs to pre-2023 training data ensures they never get smarter. Human knowledge expands while LLMs are at best locked into a constant state of 2023 knowledge.
        
        FaceDeer
        1•2 years ago
        
        > Sure it starts with small amounts of output fed into the input, but as it continues to generate large amounts of output, over time, more and more of the output makes it into the input.
        
        Not inevitably. You’re assuming that each “generation” of AI is being trained on a data set that’s just blindly harvested. AI trainers are already spending a huge amount of effort curating their training sets; it’s become quite apparent that the quality of the training set is important and you can’t just dump a giant raw pile of everything into it to get good results. This would just be another thing for them to consider.
        
        Dr. Jenkem
        English
        1•2 years ago
        To a certain extent, yes, the training data is blindly being dumped in. There’s no way terabytes of training data are being manually reviewed for accuracy. If for no other reason, it doesn’t make economic sense to do so. It’s simply not feasible for humans to manually curate all of that data, and even if they did, human error still exists.
        
        FaceDeer
        1•2 years ago
        Your disbelief doesn’t mean it’s not happening. The data sources that go into AIs are indeed curated selectively. Honestly, what do you think happens, a webcrawler is told to just “go nuts” and whatever random data it spits out gets fed right in? Trainers pick their sources carefully. They deduplicate it, they format it, they do a lot of work on it.
        
        Perfection is not required. Human error is fine in manageable amounts.
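The "deduplicate it, format it" steps mentioned above can be sketched in a few lines. This is a minimal example under stated assumptions: `normalize` and `deduplicate` are hypothetical helpers, and exact-match hashing after whitespace and case normalization is a simplification; real pipelines would likely need to catch near-duplicates too.

```python
import hashlib
import re

def normalize(text):
    """Format step: collapse whitespace and lowercase for comparison."""
    return re.sub(r"\s+", " ", text).strip().lower()

def deduplicate(texts):
    """Drop exact duplicates (after normalization), keeping first occurrences."""
    seen = set()
    unique = []
    for text in texts:
        digest = hashlib.sha256(normalize(text).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique

raw = ["Hello   world", "hello world", "Something else"]
print(deduplicate(raw))
```

Because this runs automatically, it speaks to the scale objection: nobody has to read terabytes by hand for this kind of filtering to happen.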
