It's not the same as slapping an open source license on a binary, because unencumbered weights are so much more generally useful than your typical program binary. Weights are fine-tunable and embeddable into a wide range of software.
To consider just the power of fine-tuning: all of the press DeepSeek have received is over their R1 model, a relatively tiny fine-tune of their open source V3 model. The vast majority of the compute and data pipeline work to build R1 was already complete in V3, while that final fine-tuning step from V3 to R1 is achievable even for an enthusiastic, dedicated individual. (And there are many interesting ways of doing it.)
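To make that concrete, here is a minimal sketch of what individual-scale fine-tuning can look like, using the Hugging Face transformers, datasets and peft libraries. The model name and data file are illustrative placeholders, not anything DeepSeek-specific:

    # LoRA fine-tuning sketch: trains small adapter matrices instead of the
    # full weights, which is what makes consumer-GPU fine-tuning feasible.
    import torch
    from datasets import load_dataset
    from peft import LoraConfig, get_peft_model
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    model_name = "Qwen/Qwen2.5-0.5B"  # any small open-weights causal LM works
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name,
                                                 torch_dtype=torch.bfloat16)
    model = get_peft_model(model, LoraConfig(
        r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
        task_type="CAUSAL_LM"))

    # "my_corpus.txt" is a stand-in for whatever text you want to tune on.
    data = load_dataset("text", data_files={"train": "my_corpus.txt"})["train"]
    data = data.map(lambda x: tokenizer(x["text"], truncation=True,
                                        max_length=512),
                    remove_columns=["text"])

    Trainer(
        model=model,
        args=TrainingArguments(output_dir="out",
                               per_device_train_batch_size=1,
                               gradient_accumulation_steps=8,
                               num_train_epochs=1),
        train_dataset=data,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    ).train()
    model.save_pretrained("out/lora-adapter")  # only the small adapter is written

None of this requires access to the original training pipeline; the open weights alone are enough.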
The insistence every time open sourced model weights come up that it is not "truly" open source is tiring. There is enormous value in open source weights compared to closed APIs. Let us call them open source weights. What you want can be "open source data" or somesuch.
> The insistence every time open sourced model weights come up that it is not "truly" open source is tiring. There is enormous value in open source weights compared to closed APIs. Let us call them open source weights. What you want can be "open source data" or somesuch.
Agree that there is more value in open source weights than closed APIs, but what I really want to enable is people learning how to create their own models from scratch. FOSS to me means being able to learn from other projects how to build the thing yourself, and I wrote about why this is important to me here: https://news.ycombinator.com/item?id=42878817
It's not a puritan view but a purely practical one. Many companies started using FOSS as a marketing label (like what Meta does), and as someone who probably wouldn't be a software developer without being able to learn from FOSS, it fucking sucks that the ML/AI ecosystem is seemingly OK with the term being hijacked.
It's not just a marketing label. The term is not being hijacked. Open source models, open source weights, the license chosen: these are all extremely valuable concepts.
The thing you want, open source model data pipelines, is a different thing. Its existence in no way invalidates the concept of an open source model. Nothing has been hijacked.
We call software FOSS when you can compile (if needed) and build the project yourself, locally, granted you have the resources available. If parts that aren't FOSS are attached to the project somehow, we call it "Open Core" or similar. You wouldn't call a software project FOSS if the only thing under a FOSS license were the binary itself, or some other output; we require at least the code to be FOSS for it to be considered FOSS.
Meta/Llama probably started the trend, and to this day they say "The open-source AI models" and "Llama is the leading open source model family", which is grossly misleading.
You cannot download the Llama models or weights without signing a license agreement; you're not allowed to use them for anything you want; you need to add a disclaimer to anything that uses Llama (which almost the entire ecosystem breaks, as they seemingly missed this when they signed the agreement); and so on. To me, that goes directly against what FOSS means.
If you cannot reproduce the artifact yourself (again, granted you have the resources), you'd have a really hard time convincing me that that is FOSS.
The data pipeline to build the weights is the source. The weights are a binary. The term is being hijacked. Just call it open weights, not open source models. The source for the models is not available. The weights are openly available.
Please refrain from using "open" or "open source" as a synonym for "free software." These terms originate from different perspectives and values. The free software movement advocates for your freedom in computing, grounded in principles of justice. The open source approach, on the other hand, does not promote a set of values in the same way. When discussing open source views, it's appropriate to use that term. However, when referring to our views, our software, or our movement, please use "free software" or "free (libre) software" instead. Using "open source" in this context can lead to misunderstandings, as it implies our views are similar to those of the open source movement.
Your concern about Meta's license is fair, I have no useful opinion on that. I certainly wish they would use a freer license, though I am loath to look a gift horse in the mouth.
My concern in this thread is people rejecting the concept of open source model weights as not "true" open source, because there is more that could be open sourced. It discounts a huge amount of value model developers provide when they open source weights. You are doing a variant of that here by trying to claim a narrow definition of "free software". I don't have any interest in the FSF definition.
I'm in favor of FOSS, and I'd like to see more truly open models for ideological reasons, but I don't see a lot of practical value for individuals in open-sourcing the process. You still can't build one yourself. How does it help to know the steps when creating a base model still costs upwards of tens of millions of dollars?
It seems to me that open source weights enable everything the FOSS community is practically capable of doing.
> How does it help to know the steps when creating a base model still costs upwards of tens of millions of dollars?
You can still learn web development even though you don't have tens of thousands of users and a large fleet of distributed servers. Thanks to FOSS, it's trivial to go through GitHub and find projects you can learn a bunch from, which is exactly what I did when I started out.
With LLMs, you don't have a lot of options. Sure, you can download and fine-tune the weights, but what if you're interested in how the weights are created in the first place? Some companies are doing a good job (like the folks building OLMo) of creating those resources, but the others seem to just want to use FOSS because it's good marketing vs. OpenAI et al.
Learning resources are nice, but I don't think it's analogous to web dev. I can download nginx and make a useful website right now, no fleet of servers needed. I can even get it hosted for free. Making a useful LLM absolutely, 100% requires huge GPU clusters. There is no entry level, or rather that is the entry level. Because of the scale requirements, FOSS model training frameworks (see GPT-NeoX) are only helpful for large, well-funded labs. It's also difficult to open-source training data, because of copyright.
Finetuning weights and building infrastructure around that involves almost all the same things as building a model, except it's actually possible. That's where I've seen most small-scale FOSS development take place over the last few years.
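As a sketch of what that small-scale work tends to look like in practice (checkpoint name purely illustrative), most of it starts from a few lines like these and grows infrastructure outward:

    # Minimal local inference over open weights with the transformers pipeline.
    from transformers import pipeline

    generate = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")
    out = generate("Explain what open weights are, in one sentence:",
                   max_new_tokens=64)
    print(out[0]["generated_text"])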
> Finetuning weights and building infrastructure around that involves almost all the same things as building a model
Those are two completely different roles? One is mostly around infrastructure and the other is actual ML. There are people who know both, I'll give you that, but I don't think that's the default or even common. Fine-tuning is trivial compared to building your own model and deployments/infrastructure is something else entirely.