You act like that's a "gotcha" instead of a normal thing. All they mean by that [0] is that they can't mathematically prove their developers/tasks/tools are representative of the majority of worldwide developers/tasks/tools.
You're demanding an unreasonable level of investment for anyone to "prove a negative."
The burden of proof lies on the people claiming zillion-fold boosts in productivity across "enough" places that they don't really define. This is especially true because they could profit in the process, as opposed to other people burning money to prove a point.
[0] https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...