The thing that is supposed to happen next is high-bandwidth flash. In theory, it could let laptops run the larger models without being extortionately costly, by loading weights directly from flash into the GPU (not by executing in flash).
But I haven't seen figures for the actual bandwidth yet, and no doubt it will be expensive to start with. The underlying flash technology has much higher read latency than DRAM, so it's not really clear (to me, at least) whether they can deliver the speeds needed to remove the need to cache in VRAM just by increasing parallelism.
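To put rough numbers on the parallelism question, here is a back-of-envelope sketch using Little's law (requests in flight = throughput x latency). Every figure in it is a placeholder I made up for illustration, not a measured or vendor-quoted number, and `reads_in_flight` is just a hypothetical helper name.

```python
def reads_in_flight(bandwidth_bytes_per_s, latency_s, read_size_bytes):
    """Little's law: concurrency = throughput (reads/s) * latency (s)."""
    reads_per_s = bandwidth_bytes_per_s / read_size_bytes
    return reads_per_s * latency_s

# Illustrative assumptions only, not real product figures.
target_bw = 500e9       # assumed target bandwidth: 500 GB/s
flash_latency = 50e-6   # assumed flash read latency: 50 microseconds
dram_latency = 100e-9   # assumed DRAM access latency: 100 nanoseconds
read_size = 4096        # assumed bytes per read request

flash_inflight = reads_in_flight(target_bw, flash_latency, read_size)
dram_inflight = reads_in_flight(target_bw, dram_latency, read_size)

print(f"flash: ~{flash_inflight:,.0f} reads in flight")
print(f"dram:  ~{dram_inflight:,.0f} reads in flight")
```

Under those made-up numbers, flash would need roughly 6,100 reads outstanding at all times versus about 12 for DRAM. The ratio is simply the latency ratio (500x here), which is why the question is whether the hardware can keep that many reads in flight, not whether the raw cells are fast enough.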
Video games have driven demand for hardware more than office work has. Sadly, games are already being scaled back, and more time is being spent on optimization instead of content, since consumers can't be expected to have the kind of RAM they normally would. Everyone will be forced to make do with whatever RAM they have for a long time.
That might not be the case. The kind of memory that will flood the second-hand market might not be the kind of memory we can stuff into laptops or even desktop systems.
I was musing this summer about whether I should get a refurbed Thinkpad P16 with 96GB of RAM to run VMs purely in memory. Now that 96GB of RAM costs as much as a second P16.
I feel you, so much. I was thinking of getting a second 64GB node for my homelab and I thought I’d save that money… now the RAM alone costs as much as the node, and I’m crying.
Lesson learned: you should always listen to that voice inside your head that says: “but I need it…” lol
I rebuilt a workstation after a failed motherboard a year ago. I was not very excited about being forced to replace it on a day's notice and cheaped out on the RAM (only got 32GB). This is like the third or fourth time I've taught myself the lesson not to pinch pennies when buying equipment/infrastructure assets. It's the second time the lesson was about RAM, so clearly I'm a slow learner.
By "we" do you mean consumers? No, "we" will get neither. This is an unexpected, irresistible opportunity to create a new class by controlling the technology that people are required to use and want to use (large genAI), with a comprehensive moat: financial, legislative and technological. Why make affordable devices that enable at least partial autonomy? Of course the focus will be on better remote operation (networking, on-device secure computation, advancing a narrative that equates local computation with extremism and sociopathy).