I think modern Intel/AMD CPUs have the same number of dTLB entries for all page sizes.
For example, with 3k TLB entries, a modern CPU can cover at most:
- 12MB with 4k page size
- 6GB with 2M page size
- 3TB with 1G page size
If the working set per core is bigger than the numbers above, you get 10-20% slower memory accesses due to the TLB miss penalty.
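
The arithmetic behind those figures is just entries times page size. Here is a rough sketch to check it, assuming "3k" means 3 * 1024 entries and that every entry can hold any page size; `tlb_reach` and `human` are made-up helper names, not part of any library:

```python
# TLB reach = number of entries * page size.
# Assumption: "3k entries" means 3 * 1024, usable for every page size.
KIB, MIB, GIB, TIB = 2**10, 2**20, 2**30, 2**40


def tlb_reach(entries: int, page_size: int) -> int:
    """Bytes of memory that can be mapped without a TLB miss."""
    return entries * page_size


def human(n: int) -> str:
    """Format a byte count using the largest binary unit that fits."""
    for unit, name in ((TIB, "TiB"), (GIB, "GiB"), (MIB, "MiB"), (KIB, "KiB")):
        if n >= unit:
            return f"{n / unit:g} {name}"
    return f"{n} B"


ENTRIES = 3 * 1024

for label, page_size in (("4k", 4 * KIB), ("2M", 2 * MIB), ("1G", 1 * GIB)):
    print(f"{label} pages: {human(tlb_reach(ENTRIES, page_size))}")
```

Swap in the entry count of your own CPU to see where your working set starts to exceed the reach for each page size.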