This article is so dumb. It totally ignores the memory price explosion that will make laptops with large, fast memory unfeasible for years, and it states stuff like this:
> How many TOPS do you need to run state-of-the-art models with hundreds of millions of parameters? No one knows exactly. It’s not possible to run these models on today’s consumer hardware, so real-world tests just can’t be done.
We know exactly the performance needed for a given responsiveness. TOPS is just a measurement, independent of the type of hardware it runs on.
The fewer TOPS, the slower the model runs, so the user experience suffers. Memory bandwidth and latency play a huge role too. And context: increase the context and the LLM becomes much slower.
We don't need to wait for consumer hardware to know how much is needed. We can calculate that for a given situation.
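To illustrate, here's a back-of-the-envelope sketch of that calculation. It assumes the common rule of thumb that token generation (decode) is memory-bandwidth-bound (the whole model is streamed from memory once per token) while prompt processing (prefill) is compute-bound at roughly 2 ops per parameter per token. All the concrete numbers (model size, quantization, bandwidth, TOPS, utilization) are illustrative assumptions, not measurements:

```python
# Back-of-the-envelope estimate of local LLM responsiveness.
# Assumption: decode is memory-bandwidth-bound, prefill is TOPS-bound.

def decode_tokens_per_sec(params_billions, bytes_per_param, mem_bandwidth_gbs):
    """Each generated token streams the whole model from memory once."""
    model_size_gb = params_billions * bytes_per_param
    return mem_bandwidth_gbs / model_size_gb

def prefill_tokens_per_sec(params_billions, tops, utilization=0.3):
    """Prefill needs ~2 ops per parameter per token; assume 30% utilization."""
    ops_per_token = 2 * params_billions * 1e9
    return (tops * 1e12 * utilization) / ops_per_token

# Illustrative example: 8B-parameter model, 4-bit quantized (~0.5 bytes/param),
# 100 GB/s laptop memory bandwidth, 40 TOPS NPU.
print(decode_tokens_per_sec(8, 0.5, 100))   # 25 tokens/s generation
print(prefill_tokens_per_sec(8, 40))        # 750 tokens/s prompt processing
```

So for a target responsiveness (say, 20+ tokens/s generation) you can work backwards to the bandwidth and TOPS a machine needs, no real-world test required.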
It also pretends small models aren't useful at all.
I think the massive cloud investments will unfortunately pull pressure away from local AI. That trend makes local memory expensive, and all those cloud billions have to be made back, so all the vendors are pushing their cloud subscriptions. I'm sure some functions will be local, but the brunt of it will be cloud, sadly.