@[email protected] to [email protected] • English • 9 months ago

ChatGPT Answers Programming Questions Incorrectly 52% of the Time: Study

cross-posted to:
[email protected]

788

To make matters worse, programmers in the study would often overlook the misinformation.

The research from Purdue University, first spotted by news outlet Futurism, was presented earlier this month at the Computer-Human Interaction Conference in Hawaii and looked at 517 programming questions on Stack Overflow that were then fed to ChatGPT.

“Our analysis shows that 52% of ChatGPT answers contain incorrect information and 77% are verbose,” the new study explained. “Nonetheless, our user study participants still preferred ChatGPT answers 35% of the time due to their comprehensiveness and well-articulated language style.”

Disturbingly, programmers in the study didn’t always catch the mistakes being produced by the AI chatbot.

“However, they also overlooked the misinformation in the ChatGPT answers 39% of the time,” according to the study. “This implies the need to counter misinformation in ChatGPT answers to programming questions and raise awareness of the risks associated with seemingly correct answers.”

@[email protected]
15•9 months ago
Removed by mod
- @[email protected]
  6•9 months ago
  This is a common misunderstanding of what it means to discover new things. New things are just remixes of old things. For example, AI has discovered new matrix multiplication algorithms, protein foldings, drugs, chess/go/poker strategies, and much more that are all far superior to anything humans have ever come up with in these fields. In all these cases, the AI was just combining old things in new ways. Even Einstein was just combining old things in new ways. There is exactly zero chance that AI will all of a sudden quit making new discoveries.
  - @[email protected]
    7•9 months ago
    It’s also “discovered” multitudes more that are complete nonsense.
    - @[email protected]
      2•9 months ago
      Yeah, that’s the nature of discovery. Humans also “discover” tons of things, like chess strategies, that are complete nonsense. Over time, we discard the most nonsensical ones and keep the good ones as best we can. It just turns out that machines do this process far faster and more efficiently. That’s why nobody thinks humans are going to surpass AI at chess, go, poker, protein folding, matrix multiplication algorithm creation, and a whole bunch of other things.
      - @[email protected]
        2•9 months ago
        Can you provide a source for the claim that all these discoveries are “far superior” to what humans have discovered? I struggle to see how a discovery can be “superior”. Isn’t the crucial aspect how the discovery is classified and dealt with?
        
        @[email protected]
        1•9 months ago
        I mean that in these fields, it is superior. The greatest chess player is an AI. The greatest Go player is an AI. The greatest poker player… As far as matrix multiplication goes, there are numerous examples of mathematicians being stuck at a certain level of efficiency and then AI coming through and finding more efficient methods for given matrix sizes. Drug creation and protein folding are similar. The list goes on and on. I wasn’t comparing discoveries across fields; I’m just saying that in clearly measurable, specific fields, AI has objectively surpassed humans, and it has become pretty routine for this to be the case.
        
        All these things I’ve mentioned are easily searchable, but if you still want sources after my clarification of my meaning let me know, and I’ll find some.
  - @[email protected]
    6•9 months ago
    Just a slight correction. ML/AI has aided in all sorts of discoveries, but GenAI is a “remixing of existing concepts”. I don’t believe I’ve read anything about GenAI discovering new ways to do things, nor do its underlying principles really enable that.
    - @[email protected]
      2•9 months ago
      Yes, ML/AI has; you are correct. As far as the capabilities of GenAI go, we have not even begun to scratch the surface of understanding how all its emergent abilities arise, and nobody has any idea where they will max out. All we know is that it is finding some patterns that humans also find over time, as well as many patterns that humans have not been able to find. The chances that it continues to find more and more complex patterns that we have not found are much higher than the chances that we are currently at the limit of its ability.
      
      Maybe it won’t be transformers that lead to breakthroughs; it may be some completely different architecture such as Mamba/state space, but there is a good chance that transformers are a step in the direction of discovering something better.
  - @[email protected]
    3•9 months ago
    Removed by mod
    - @[email protected]
      1•9 months ago
      I didn’t say LLMs made these discoveries. They didn’t. AI made those discoveries. Yes, it is true that humans made AI, so in a way, humans made the discoveries, but if that is your take, then it is impossible for AI to ever make any discovery. Really, if we take this way of thinking to its natural conclusion, then even humans can never make discoveries, only the universe can make discoveries, since humans are a result of the universe “universing”. It is arbitrary to try to credit humans with anything that happens further down their evolution.
      
      Humans tried for a long time to get good at chess, and AI came along and made the absolute best chess players utterly irrelevant, even if we give a team of the world’s best chess players an endless clock and the AI a single minute for the entire game. That was 20 years ago. This is happening in more and more fields and showing no sign of stopping. We don’t know yet if discoveries will come from future LLMs like they have from other forms of AI, but we do know that with each generation, more and more complex patterns are being identified and utilized by LLMs. Three years ago the best LLMs would have scored single digits on an IQ test; now they score triple digits. It is laughable to think that anyone knows where the current rapid trajectory will stop for this new technology, and much more laughable to think we are already at the end.
      - @[email protected]
        2•9 months ago
        Removed by mod
        
        @[email protected]
        1•9 months ago
        
        if this is your take, then lot of keyboard made a lot of discovery.
        
        This is literally my point. It is arbitrary to decide that all the good ideas came from “humans”. If we are going to give all credit for anything AI produces to humans, then it only seems fair to give all credit for human things to our common ancestors with chimpanzees, because if it were not for their clever ideas, we would never have been here. But wait, we can’t stop there, because we have to give credit to the original single-celled life forms, and eventually, back to the universe itself (like I mentioned before).
        
        Look, I totally get the desire to glorify humans and think that we have something special that machines don’t/can’t have. It kinda sucks to think that we are not so special, and potentially extremely inferior to what is right around the corner. We can’t let that primal ego desire cloud our judgement, though. Our brains are physical machines doing calculations. There is no magical difference in our calculations that lets us make discoveries while machines cannot.
        
        Imagine you teach your little brother how to play chess, and then your brother thinks about it a bunch and comes up with a bunch of new strategies and starts to kick your butt every time, and eventually starts crushing tournaments. Sure, you can cling to the fact that you taught him how to play, and you can go around telling everyone how “you” are winning all these tournaments because your brother is actually winning them, but it doesn’t change the fact that your brother is the one with the secret sauce that you simply are unable to comprehend.
        
        Your whole point is that if people do it, then it is some special discovery thing, but if computers do it, then it is just computational brute force. There is actually no difference between the two, it is just two different ways of wording the same process. We made programs that could understand the rules, and then it went further and in the same direction that we were trying to go.
        
        As far as continuing indefinitely because we are on a trajectory goes, sure, we will eventually hit some intelligence plateaus, but we are nowhere near that point. Why can I say this with such certainty? Because we have things that we know will work that we haven’t gotten around to combining yet. Some of this gets a bit technical, but a nice way to think of it is this: right now, we are mainly using hardware designed to generate general graphics that we have hijacked for machine learning. The usual speedup when we go from generalized hardware to specialized hardware is about 5 orders of magnitude (100,000x). That kind of gain has huge implications in the AI/ML world. This is just one of many known improvements on the horizon, but it is one of the simplest to wrap your head around. I don’t know how familiar you are with things like crewAI or autogen, but they are phenomenal; they absolutely crush all of the greatest base LLMs, but they are still a bit slow due to how many LLM calls they take. When we have a 100,000x speedup (which is pretty much guaranteed), everyone will be able to use enormous agent frameworks like this in an instant.
        
        I understand wanting to see humans as having a monopoly on “intelligence”, but quite frankly that era is coming to an end. It may be a bumpy ride, but the sooner humans learn to adjust to this new world, the better. I don’t think it is something that someone can really make someone else see, but once you do see it, it is very obvious. I suggest you check out the cutting-edge agent stuff out there and then imagine that the most impressive stuff will be routinely done from a single prompt in an instant. Then, on top of that, consider that the base LLMs that we have now are the worst there will ever be. We are in for a very wild ride.
        
        @[email protected]
        2•9 months ago
        Removed by mod
        
        @[email protected]
        1•9 months ago
        The 5 orders of magnitude gained in going from general-purpose computers to ASICs is standard knowledge; you learn it in the first year of any comp sci program. You can find it all over, for example.
        
        The main thing that you are missing is that the human mind also brute forces to come up with ideas. There isn’t a difference. We don’t have some super magical mystical human thing that sets us apart.
        
        A way to imagine how it can be possible for a computer to have thoughts and ideas just like humans is this: Imagine you take a human brain and you switch out one neuron for an electrical part, and you leave the rest of the brain as it is. Can that brain have thoughts and ideas like a human? Obviously, yes. What if you switch out another one? And another. If each electrical neuron is doing the same thing as the original one, then eventually you could switch out the entire brain and have an entirely computer brain doing exactly what a human does. At what point would you say that this machine is no longer doing what a human does and just “Brute forcing” ideas?
        
        I totally get that right now, with lots of jobs at risk, many people are really concerned with holding onto the idea that humans have a monopoly on thinking and thoughts. I think it’s important to not let what we want to be true interfere with our analysis of what is true.
        
        @[email protected]
        1•9 months ago
        Removed by mod
        
        @[email protected]
        1•9 months ago
        I do just want to add that my conclusion is that I, as a human, am not uniquely special for having the ability to have thoughts, ideas, and come up with new things. This point of view is inherently a massive blow to the human ego. It simply doesn’t make any sense to hold such a view if one’s ego is what is controlling the judgment. The same can not be said about the opposite viewpoint.
        
        @[email protected]
        1•9 months ago
        Removed by mod
