@[email protected] to

[email protected] • English • 9 months ago

Recent AI failures are cracks in the magic

www.theintrinsicperspective.com

Trouble in trillions-land

Lvxferre
17•9 months ago
As I often mention when this subject pops up: while the current statistics-based generative models might see some application, I believe that they’ll eventually be replaced by better models that are actually aware of what they’re generating, instead of simply reproducing patterns, with the current models being seen as “that cute 20s toy”.

In text generation (currently dominated by LLMs), for example, this means that the main “bulk” of the model would do three things:

1. convert input tokens into sememes (units of meaning)
2. perform logic operations with the sememes
3. convert sememes back into tokens for the output

Because, as it stands, LLMs are only chaining tokens. They might do this in an incredibly complex way, but that’s it. That’s obvious when you look at the “hallucinations” that LLM-fuelled bots output - they aren’t the result of some internal error, they’re simply an undesired product of a model that sometimes outputs desirable stuff too.
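
To make the three steps above a bit more concrete, here is a toy sketch in Python. Everything in it is made up for illustration: the `LEXICON` and `SURFACE` tables, the sememe names and the single rule in `RULES` are hypothetical stand-ins, and a real model of this kind (if it can be built at all) would look nothing like a handful of lookup tables. It only shows the shape of the idea: the logic runs on units of meaning, not on tokens.

```python
# Toy illustration of the tokens -> sememes -> logic -> tokens idea.
# LEXICON, SURFACE, RULES and all sememe names are hypothetical.

# Step 1 table: several tokens can map to the same unit of meaning.
LEXICON = {
    "socrates": "SOCRATES",
    "human": "HUMAN",
    "man": "HUMAN",
    "mortal": "MORTAL",
}

# Step 3 table: one surface token per sememe.
SURFACE = {"SOCRATES": "Socrates", "HUMAN": "human", "MORTAL": "mortal"}

# Step 2 knowledge: (premise, conclusion) means "every premise is conclusion".
RULES = {("HUMAN", "MORTAL")}


def to_sememes(tokens):
    """Step 1: map input tokens to sememes, dropping unknown tokens."""
    return [LEXICON[t.lower()] for t in tokens if t.lower() in LEXICON]


def infer(subject, category):
    """Step 2: from 'subject is category', derive new facts about subject."""
    return [
        (subject, conclusion)
        for premise, conclusion in sorted(RULES)
        if premise == category
    ]


def to_tokens(fact):
    """Step 3: render a (subject, category) fact back into tokens."""
    subject, category = fact
    return [SURFACE[subject], "is", SURFACE[category]]


def answer(sentence):
    """Run the whole pipeline on a sentence of the form 'X is a Y'."""
    sememes = to_sememes(sentence.split())
    if len(sememes) != 2:
        return []  # the toy only understands one subject and one category
    subject, category = sememes
    return [" ".join(to_tokens(fact)) for fact in infer(subject, category)]
```

Both `answer("Socrates is a man")` and `answer("Socrates is a human")` return `["Socrates is mortal"]`, because "man" and "human" collapse into the same sememe before any logic happens.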

Sub “tokens” and “sememes” with “pixels” and “objects” and this probably holds true for image generating models, too. Probably.

Now, am I some sort of genius for noticing this? Probably not; I’m just some nobody with a chimp avatar, rambling in the Fediverse. Odds are that people behind those tech giants already noticed the same ages ago, and at least some of them reached the same conclusion - that better gen models need more awareness. If they are not doing this already, it means that this shit would be painfully expensive to implement, so the “better models” that I mentioned at the start will probably not appear too soon.

Most cracks will stay there; Google will hide them with an obnoxious band-aid, OpenAI will leave them in plain daylight, but the magic trick will still not be perfect, at least in the foreseeable future.

And some might say “use MOAR processing power!”, or “input MOAR training data!”, in the hopes that the current approach will “magically” fix itself. For those, imagine yourself trying to drain the Atlantic with a bucket: does it really matter if you use more buckets, or larger buckets? Brute-forcing problems only goes so far.

Just my two cents.
- @[email protected]
  7•
  edit-2
  9 months ago
  I agree 100%, and I think Zuckerberg’s attempt to create a general AI based on LLMs, running on a massive 340,000 Nvidia H100 GPUs, sounds stupid. Unless there’s a lot more to their attempt, it’s doomed to fail.
  
  I suppose the idea is something about achieving critical mass, but it’s pretty obvious that this is far from the only factor missing to achieve general AI.
  
  I still think it’s impressive what they can do with LLMs, and it seems to be a pretty huge step forward. But it’s taken about 40 years to get here from when we first had decent “pattern recognition”, so the next step could be another 40 years?
  - Lvxferre
    7•9 months ago
    I think that Zuckerberg’s attempt is a mix of publicity stunt and “I want [you] to believe!”. Trying to reach AGI through a large enough LLM sounds silly, on the same level as “ants build, right? If we gather enough ants, they’ll build a skyscraper! Chrust me.”
    
    In fact I wonder if the opposite direction wouldn’t be a bit more feasible - start with some extremely primitive AGI, then “teach” it Language (as a skill) and a language (like Mandarin or English or whatever).
    
    I’m not sure how many years it’ll take for an AGI to pop up. 100 years perhaps, but I’m just guessing.
- @[email protected]
  1•
  edit-2
  9 months ago
  That’s a huge oversimplification of the way LLMs work. They’re not statistical in the way a Markov chain is. They use neural networks, which are a decent analogy for the human brain. The way the synapses between neurons are wired is obviously different, and the way the neurons are triggered and the types of signals they can send to other neurons is obviously different. But overall, similar capabilities can in theory be achieved with either method. If you’re going to call neural networks statistics based, you might as well call the human brain statistics based as well.
  - Lvxferre
    1•9 months ago
    
    > That’s a huge oversimplification of the way LLMs work.
    
    I’m sticking to what matters for the sake of the argument. Anyone who wants to inform themself further has a plethora of online resources to do so.
    
    > They’re not statistical in the way a Markov chain is.
    
    Implied: “you’re suggesting that they work like Markov chains, they don’t.”
    
    At no point did I mention or even imply Markov chains. My usage of the verb “to chain” is clearly vaguer within that context; please do not put words in my mouth.
    
    > They use neural networks, which are a decent analogy for the human brain. The way the synapses between neurons are wired is obviously different, and the way the neurons are triggered and the types of signals they can send to other neurons is obviously different. But overall, similar capabilities can in theory be achieved with either method.
    
    I don’t disagree with the conclusion (i.e. I believe that neural networks can achieve human-like capabilities), but the argument itself is such fallacious babble (false equivalence) that I’m not bothering further with your comment.
    
    And it’s also an “ackshyually” given this context dammit. I’m not talking about the bloody neural network, but how it is used.
    - @[email protected]
      1•
      edit-2
      9 months ago
      No need to get offended. Maybe I misunderstood the intent behind your original message. I think you made a lot of good points.
      
      I brought up the Markov chain because a common misconception I’ve seen on the Internet and in real life is that LLMs work pretty much the same as Markov chains under the hood. And I saw no mention of neural networks in your original comment.
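
For anyone unsure what the Markov chain comparison in this thread refers to, here is a minimal sketch of a word-level Markov chain text generator in Python. The function names, the `seed` parameter and the sample sentence are assumptions made for illustration. The defining trait is that each step looks only at the current word and picks one of the words that followed it in the source text. An LLM, by contrast, conditions on its whole context window through a trained neural network rather than a lookup table, which is the distinction the reply above is drawing.

```python
import random
from collections import defaultdict


def build_chain(text):
    """Map each word to the list of words that follow it in the text."""
    words = text.split()
    chain = defaultdict(list)
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain


def generate(chain, start, length, seed=0):
    """Walk the chain; each step depends only on the current word."""
    rng = random.Random(seed)  # seeded so repeated runs give the same text
    out = [start]
    for _ in range(length - 1):
        followers = chain.get(out[-1])
        if not followers:
            break  # dead end: this word never had a successor
        out.append(rng.choice(followers))
    return " ".join(out)


if __name__ == "__main__":
    # Made-up sample text, only for demonstration.
    chain = build_chain("the cat sat on the mat")
    print(generate(chain, "the", 6))
```

With the sample sentence, "the" can be followed by either "cat" or "mat", so the output varies with the seed, while a start word like "cat" always yields "cat sat on" for a length of 3 because each of those words has a single successor.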
