@[email protected] to

[email protected]English • 6 months ago

We have to stop ignoring AI’s hallucination problem

www.theverge.com

529

AI might be cool, but it’s also a big fat liar.

@[email protected]
English
6•6 months ago

they do not understand why those things are true.

Some researchers compared ChatGPT 3 and 4 on the same questions. One of the questions was about stacking items in a stable way. ChatGPT 3 just, in line with what you are saying about “without understanding”, listed the items and said to place them one on top of the other. There’s no way it would have worked.

ChatGPT 4, however, said that you should put the book down first, arrange the eggs in a 3 x 3 grid on top of the book, hold them in place with the laptop so they don’t roll around, then stand the bottle upright on top of the laptop, and then balance the nail on top of the bottle, even noting that you have to put the flat end of the nail down. This sounds a lot like understanding to me and not just rolling the dice hoping to be correct.

Yes, AI confidently gets stuff wrong. But let’s all note that there is a whole subreddit dedicated to people being confidently wrong. One doesn’t need to go any further than Lemmy to see people confidently claiming to know the truth about shit they should know is outside of their actual knowledge. We’re all guilty of this. Including refusing to learn when we are wrong. Additionally, the argument that they can’t learn doesn’t make sense because models have definitely become better.

Now I’m not saying AI is conscious, I really don’t know, but all of your shortcomings you’ve listed humans are guilty of too. So using them as examples of why its output is always just a hallucination, or why our thoughts are not, doesn’t seem to hold much water to me.
- @[email protected]
  English
  14•6 months ago
  
  the argument that they can’t learn doesn’t make sense because models have definitely become better.
  
  They have to be either trained with new data or their internal structure has to be improved. It’s an offline process, meaning they don’t learn through chat sessions we have with them (if you open a new session it will have forgotten what you told it in a previous session), and they can’t learn through any kind of self-directed research process like a human can.
  
  all of your shortcomings you’ve listed humans are guilty of too.
  
  LLMs are sophisticated word generators. They don’t think or understand in any way, full stop. This is really important to understand about them.
  - @[email protected]
    English
    1•6 months ago
    
    They have to be either trained with new data or their internal structure has to be improved. It’s an offline process, meaning they don’t learn through chat sessions we have with them (if you open a new session it will have forgotten what you told it in a previous session), and they can’t learn through any kind of self-directed research process like a human can.
    
    Most human training is done through the guidance of another. Additionally, most of this training is done through an automated process where some computer is just churning through data. And while you are correct that the context does not carry over from one session to the next, you can in fact teach it something and it will maintain it during the session. Moving to a new session is just like talking to a completely different person, and you’re basically arguing “well, I explained this one thing to another human, and this human doesn’t know it... so how can you claim it’s thinking?” And just imagine the disaster that would happen if you allowed it to be trained by anyone on the web. It would be spitting out memes, racism, and right wing propaganda within days. lol
    
    They don’t think or understand in any way, full stop.
    
    I just gave you an example where this appears to be untrue. There is something that looks like understanding going on. Maybe it’s not, I’m not claiming to know, but I have not seen a convincing argument as to why. Saying “full stop” instead of an actual argument as to why just indicates to me that you are really saying “stop thinking.” And I apologize but that’s not how I roll.
    - @[email protected]
      English
      11•6 months ago
      
      Most human training is done through the guidance of another
      
      Let’s take a step back and not talk about training at all, but about spontaneous learning. A baby learns about the world around it by experiencing things with its senses. They learn a language, for example, simply by hearing it and making connections - getting corrected when they’re wrong, yes, but they are not trained in language until they’ve already learned to speak it. And once they are taught how to read, they can then explore the world through signs, books, the internet, etc. in a way that is often self-directed. More than that, humans are learning at every moment as they interact with the world around them and with the written word.
      
      An LLM is a static model created through exposure to lots and lots of text. It is trained and then used. To add to the model requires an offline training process, which produces a new version of the model that can then be interacted with.
      
      you can in fact teach it something and it will maintain it during the session
      
      It’s still not learning anything. LLMs have what’s known as a context window that is used to augment the model for a given session. It’s still just text that is used as part of the response process.
      
      They don’t think or understand in any way, full stop.
      
      I just gave you an example where this appears to be untrue. There is something that looks like understanding going on.
      
      You seem to have ignored the preceding sentence: “LLMs are sophisticated word generators.” This is the crux of the matter. They simply do not think, much less understand. They are simply taking the text of your prompts (and the text from the context window) and generating more text that is likely to be relevant. Sentences are generated word-by-word using complex math (heavy on linear algebra and probability) where the generation of each new word takes into account everything that came before it, including the previous words in the sentence it’s a part of. There is no thinking or understanding whatsoever.
      
      This is why [email protected] said in the original post to this thread, “They hallucinate all answers. Some of those answers will happen to be right.” LLMs have no way of knowing if any of the text they generate is accurate for the simple fact that they don’t know anything at all. They have no capacity for knowledge, understanding, thought, or reasoning. Their models are simply complex networks of words that are able to generate more words, usually in a way that is useful to us. But often, as the hallucination problem shows, in ways that are completely useless and even harmful.
      - @[email protected]
        English
        2•6 months ago
        
        An LLM is a static model created through exposure to lots and lots of text. It is trained and then used. To add to the model requires an offline training process, which produces a new version of the model that can then be interacted with.
        
        But this is a deliberate decision, not an inherent limitation. The model could get feedback from the outside world; in fact, this is how it’s trained (well, data is fed back into the model to update it). Of course we are limiting it to words, rather than the whole slew of inputs that a human gets. But keep in mind we have things like music and image generation AI as well. So it’s not like it can’t also be trained on these things. Again, deliberate decision rather than inherent limitation.
        
        We both even agree it’s true that it can learn from interacting with the world, you just insist that because it isn’t persisting, that doesn’t actually count. But it does persist, just not the new inputs from users. And this is done deliberately to protect the models from what would inevitably happen. That being said, it’s also been fed arguably more input than a human would get in their whole life, just condensed into a much smaller period of time. So if it’s about “total input”, then the AI is going to win, hands down.
        
        You seem to have ignored the preceding sentence: “LLMs are sophisticated word generators.”
        
        I’m not ignoring this. I understand that it’s the whole argument, it gets repeated around here enough. Just saying it doesn’t make it true, however. It may be true, again I’m not sure, but simply stating and saying “full stop” doesn’t amount to a convincing argument.
        
        They simply do not think, much less understand.
        
        It’s not as open and shut as you wish it to be. If anyone is ignoring anything here, it’s you ignoring the fact that it went from basically just, as you said, randomly stacking objects it was told to stack stably, to actually doing so in a way that could work and describing why you would do it that way. Additionally, there is another case where they asked ChatGPT 4 to draw a unicorn using an obscure programming language. And you know what? It did it. It was rudimentary, but it was clearly a unicorn. This is something that wasn’t trained on images at all. They even messed with the code, turning the unicorn around and removing the horn, fed it back in, and then asked it to replace the horn, and it put it back on correctly. It seemed to understand not only what a unicorn looked like, but which part was the horn and where it should go once it had been removed.
        
        So saying it can just “generate more words” is something you could accuse us of as well, or it’s possibly just overly reductive of what it’s capable of even now.
        
        But often, as the hallucination problem shows, in ways that are completely useless and even harmful.
        
        There are all kinds of problems with human memory, where we imagine things all of the time. Have you ever taken acid? If so, you would have seen how unreliable our brains are at interpreting reality. And you want to really trip? Eyewitness testimony is basically garbage. I exaggerate a bit, but it has so many flaws, with people remembering things that didn’t happen, and it’s so easy to create false memories, that it shouldn’t be as convincing as it is. Hell, it can even be harmful by convicting an innocent person.
        
        Every shortcoming you’ve used to claim AI isn’t real thinking is something shared with us. It might just be inherent to intelligence to be wrong sometimes.
        
        @[email protected]
        English
        2•6 months ago
        It’s exciting either way. Maybe it’s equivalent to a certain lobe of the brain, and we’re judging it for not being integrated with all the other parts.
  - @[email protected]
    English
    1•6 months ago
    You are just wrong
- @[email protected]
  English
  2•6 months ago
  A source link to what you’re referring to would be nice.
  - @[email protected]
    English
    1•6 months ago
    https://www.businessinsider.com/chatgpt-open-ai-balancing-task-convinced-microsoft-agi-closer-2023-5
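
For anyone following the thread who wants to see what "generated word-by-word" and "context window" mean mechanically, here is a minimal Python sketch. It is a toy under stated assumptions, not how a real LLM is built: `VOCAB`, `WEIGHTS`, `next_word_probs`, and `generate` are made-up names, the weights are random numbers rather than trained ones, and the scoring rule (sum the scores of every context word, weighted by recency) is invented purely for illustration. It only shows the shape of the loop: frozen weights, a probability distribution over the next word that depends on everything before it, and a random draw from that distribution.

```python
import math
import random

# Hypothetical toy vocabulary; a real model has tens of thousands of tokens.
VOCAB = ["the", "book", "eggs", "laptop", "bottle", "nail", "on", "."]


def make_weights(seed=0):
    """Build a frozen len(VOCAB) x len(VOCAB) score table.

    Stand-in for trained weights: here they are just random numbers.
    """
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in VOCAB] for _ in VOCAB]


WEIGHTS = make_weights()  # never modified during generation


def next_word_probs(context):
    """Score each candidate word against ALL context words, then softmax.

    Raises ValueError if the context contains a word outside VOCAB.
    """
    scores = [0.0] * len(VOCAB)
    for position, word in enumerate(context):
        row = WEIGHTS[VOCAB.index(word)]
        recency = 1.0 / (len(context) - position)  # nearer words count more
        for j, weight in enumerate(row):
            scores[j] += recency * weight
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def generate(prompt, n_words, seed=0):
    """Append n_words sampled words to a copy of the prompt.

    Each draw sees the prompt plus every word generated so far.
    """
    rng = random.Random(seed)
    context = list(prompt)
    for _ in range(n_words):
        probs = next_word_probs(context)
        context.append(rng.choices(VOCAB, weights=probs, k=1)[0])
    return context


if __name__ == "__main__":
    print(" ".join(generate(["the", "book"], 6)))
```

In this sketch a "session" is nothing more than the growing `context` list. Starting a new session means starting a fresh list, while `WEIGHTS` stays exactly as it was, which is the distinction the thread is arguing over. Whether that loop amounts to understanding is the open question here; the code takes no side on it.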
