It's not the same as slapping an open source license on a binary, because unencumbered weights are so much more generally useful than your typical program binary. Weights are fine-tunable and embeddable into a wide range of software.
To consider just the power of fine-tuning: all of the press DeepSeek have received is over their R1 model, a relatively tiny fine-tune of their open source V3 model. The vast majority of the compute and data pipeline work to build R1 was already complete in V3, while that final fine-tuning step from V3 to R1 is achievable even for an enthusiastic, dedicated individual. (And there are many interesting ways of doing it.)
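To make that concrete, here is a minimal sketch of what individual-scale fine-tuning can look like, using the Hugging Face transformers, datasets and peft libraries. The model name and data file are illustrative placeholders, not anything DeepSeek-specific:

    # LoRA fine-tuning sketch: trains small adapter matrices instead of the
    # full weights, which is what makes consumer-GPU fine-tuning feasible.
    import torch
    from datasets import load_dataset
    from peft import LoraConfig, get_peft_model
    from transformers import (AutoModelForCausalLM, AutoTokenizer,
                              DataCollatorForLanguageModeling, Trainer,
                              TrainingArguments)

    model_name = "Qwen/Qwen2.5-0.5B"  # any small open-weights causal LM works
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name,
                                                 torch_dtype=torch.bfloat16)
    model = get_peft_model(model, LoraConfig(
        r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"],
        task_type="CAUSAL_LM"))

    # "my_corpus.txt" is a stand-in for whatever text you want to tune on.
    data = load_dataset("text", data_files={"train": "my_corpus.txt"})["train"]
    data = data.map(lambda x: tokenizer(x["text"], truncation=True,
                                        max_length=512),
                    remove_columns=["text"])

    Trainer(
        model=model,
        args=TrainingArguments(output_dir="out",
                               per_device_train_batch_size=1,
                               gradient_accumulation_steps=8,
                               num_train_epochs=1),
        train_dataset=data,
        data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
    ).train()
    model.save_pretrained("out/lora-adapter")  # only the small adapter is written

None of this requires access to the original training pipeline; the open weights alone are enough.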
The insistence every time open sourced model weights come up that it is not "truly" open source is tiring. There is enormous value in open source weights compared to closed APIs. Let us call them open source weights. What you want can be "open source data" or somesuch.
> The insistence every time open sourced model weights come up that it is not "truly" open source is tiring. There is enormous value in open source weights compared to closed APIs. Let us call them open source weights. What you want can be "open source data" or somesuch.
Agree that there is more value in open source weights than closed APIs, but what I really want to enable is people learning how to create their own models from scratch. FOSS to me means being able to learn from other projects how to build the thing yourself, and I wrote about why this is important to me here: https://news.ycombinator.com/item?id=42878817
It's not a puritan view but a purely practical one. Many companies started using FOSS as a marketing label (like what Meta does), and as someone who probably wouldn't be a software developer without being able to learn from FOSS, it fucking sucks that the ML/AI ecosystem is seemingly OK with the term being hijacked.
It's not just a marketing label. The term is not being hijacked. Open source models, open source weights, the license chosen: these are all extremely valuable concepts.
The thing you want, open source model data pipelines, is a different thing. Its existence in no way invalidates the concept of an open source model. Nothing has been hijacked.
We call software FOSS when you can compile (if needed) and build the project yourself, locally, granted you have the resources available. If parts that aren't FOSS are attached to the project somehow, we call it "Open Core" or similar. You wouldn't call a software project FOSS if the only thing under a FOSS license were the binary itself, or some other output; we require at least the code to be FOSS for it to be considered FOSS.
Meta/Llama probably started the trend, and to this day they say "The open-source AI models" and "Llama is the leading open source model family", which is grossly misleading.
You cannot download the Llama models or weights without signing a license agreement; you're not allowed to use them for anything you want; you need to add a disclaimer to anything that uses Llama (which almost the entire ecosystem breaks, as they seemingly missed this when they signed the agreement); and so on. To me, that goes directly against what FOSS means.
If you cannot reproduce the artifact yourself (again, granted you have the resources), you'd have a really hard time convincing me that that is FOSS.
The data pipeline to build the weights is the source. The weights are a binary. The term is being hijacked. Just call it open weights, not open source models. The source for the models is not available. The weights are openly available.
Please refrain from using "open" or "open source" as a synonym for "free software." These terms originate from different perspectives and values. The free software movement advocates for your freedom in computing, grounded in principles of justice. The open source approach, on the other hand, does not promote a set of values in the same way. When discussing open source views, it's appropriate to use that term. However, when referring to our views, our software, or our movement, please use "free software" or "free (libre) software" instead. Using "open source" in this context can lead to misunderstandings, as it implies our views are similar to those of the open source movement.
Your concern about Meta's license is fair, I have no useful opinion on that. I certainly wish they would use a freer license, though I am loath to look a gift horse in the mouth.
My concern in this thread is people rejecting the concept of open source model weights as not "true" open source, because there is more that could be open sourced. It discounts a huge amount of value model developers provide when they open source weights. You are doing a variant of that here by trying to claim a narrow definition of "free software". I don't have any interest in the FSF definition.
I'm in favor of FOSS, and I'd like to see more truly open models for ideological reasons, but I don't see a lot of practical value for individuals in open-sourcing the process. You still can't build one yourself. How does it help to know the steps when creating a base model still costs upwards of tens of millions of dollars?
It seems to me that open source weights enable everything the FOSS community is practically capable of doing.
> How does it help to know the steps when creating a base model still costs upwards of tens of millions of dollars?
You can still learn web development even though you don't have tens of thousands of users and a large fleet of distributed servers. Thanks to FOSS, it's trivial to go through GitHub and find projects you can learn a bunch from, which is exactly what I did when I started out.
With LLMs, you don't have a lot of options. Sure, you can download and fine-tune the weights, but what if you're interested in how the weights are created in the first place? Some companies are doing a good job (like the folks building OLMo) of creating those resources, but the others seem to just want to use FOSS because it's good marketing vs. OpenAI et al.
Learning resources are nice, but I don't think it's analogous to web dev. I can download nginx and make a useful website right now, no fleet of servers needed. I can even get it hosted for free. Making a useful LLM absolutely, 100% requires huge GPU clusters. There is no entry level, or rather that is the entry level. Because of the scale requirements, FOSS model training frameworks (see GPT-NeoX) are only helpful for large, well-funded labs. It's also difficult to open-source training data, because of copyright.
Finetuning weights and building infrastructure around that involves almost all the same things as building a model, except it's actually possible. That's where I've seen most small-scale FOSS development take place over the last few years.
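As a sketch of what that small-scale work tends to look like in practice (checkpoint name purely illustrative), most of it starts from a few lines like these and grows infrastructure outward:

    # Minimal local inference over open weights with the transformers pipeline.
    from transformers import pipeline

    generate = pipeline("text-generation", model="Qwen/Qwen2.5-0.5B-Instruct")
    out = generate("Explain what open weights are, in one sentence:",
                   max_new_tokens=64)
    print(out[0]["generated_text"])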
> Finetuning weights and building infrastructure around that involves almost all the same things as building a model
Those are two completely different roles? One is mostly around infrastructure and the other is actual ML. There are people who know both, I'll give you that, but I don't think that's the default or even common. Fine-tuning is trivial compared to building your own model and deployments/infrastructure is something else entirely.