How does Lemmy feel about "open source" machine learning, akin to the Fediverse vs Social Media?

brucethemoose@lemmy.world · 8 months ago

scarabine · 8 months ago

I’m most excited where it’s most open. Clear training process, legal data sets, fully open code bases, published reports, etc. I think we’re going to see the local models boom in sophistication once that’s more common.

Do you know of any good local models that fit that kind of description?

ArchRecord@lemm.ee · 8 months ago

I don’t know of any super high-quality ones that run well, but the Open Assistant project (now archived) collected responses from voluntary participants (myself included) to build what is now considered a very high-quality dataset of chat conversation pairs. It’s truly open source, and all of it was voluntarily submitted instead of scraped.

The models are reasonable for fine-tuning, but aren’t very good compared to newer models from large companies.

brucethemoose@lemmy.world · 8 months ago

Cutting edge ones? Unfortunately, rarely. Right now there’s a sliding scale between “open and transparent” and “smart and performant” because they’re just so darn expensive to train.

I think some of the closest ones to your requirements are Nvidia’s research models, excluding Mistral Nemo, which isn’t as well documented (as it’s really a Mistral model). And you can see that a lot of the open “alternative” efforts like RWKV, OpenLLaMA and such are severely underfunded and undertrained.

The datasets are there, the highly optimized implementations are getting there, and a lot of models have detailed papers and fully open codebases. The pieces are there, but the cost of actually training them is just too much to deal with most of the time.

Another factor is that “closed” datasets like whatever Mistral, Facebook, Cohere and such use do seem to have an edge.

brucethemoose@lemmy.world · 8 months ago

This one is brand new: https://github.com/allenai/OLMoE