Not the OP, but my guess would be Threadrippers (or similar CPUs with lots of PCIe lanes), each hosting several GPUs. That's usually what you'd do for training AI in a home lab.
Server processors get you more bang for the buck... iff you're planning to run the hardware flat out for literally years. You save on power, but the up-front cost eats up most of those savings, so for a system that's mostly idle you wouldn't use them. On the other hand, any CPU with fewer PCIe lanes than a Threadripper can't run multiple GPUs at full bandwidth, and TRs are cheap enough that consolidating PSUs and chassis makes them worth it.
Not to mention that some approaches to training only work with multiple GPUs on the same motherboard, e.g. sharding a single model across GPUs without communication overhead killing the benefit.
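For a concrete picture, here's a minimal sketch of that kind of sharding in PyTorch, assuming a machine with two CUDA devices (the ShardedMLP class and the layer sizes are made up for illustration). The layers are split across the two local GPUs, so only the activations hop over the PCIe bus each forward pass, which is why having the GPUs on one board matters:

    import torch
    import torch.nn as nn

    # Hypothetical two-GPU model-parallel split: the first half of
    # the network lives on cuda:0, the second half on cuda:1. Only
    # the activations cross the PCIe bus between the halves.
    class ShardedMLP(nn.Module):
        def __init__(self):
            super().__init__()
            self.part0 = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU()).to("cuda:0")
            self.part1 = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU()).to("cuda:1")

        def forward(self, x):
            x = self.part0(x.to("cuda:0"))
            return self.part1(x.to("cuda:1"))  # activation hop: one PCIe transfer

    model = ShardedMLP()
    opt = torch.optim.SGD(model.parameters(), lr=1e-3)
    x = torch.randn(32, 4096)
    loss = model(x).sum()
    loss.backward()  # gradients flow back across the same link
    opt.step()

Do that over Ethernet instead of PCIe and the per-step transfer latency would dwarf the compute, which is the whole point of cramming the GPUs onto one motherboard.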