Reading this over, one striking thing is that in going from 14 nm -> 7 nm, they reduced core count from 12 to 8, and kept total cache per die approximately the same (256 MB L3 + various smaller caches -> 256 MB L2*). Looking at the die shot, there isn't an unusual amount of uncore, so this is either a MUCH smaller processor or a MUCH more complex core. Given that the Z series has never been particularly cost limited and has often pushed reticle size, and that by getting rid of the system controller costs will already be coming down... is this just a super-high transistor count SMT2 core?
Thinking about this a bit more, I suspect that the answer comes down to the two dies per package. The main distinction between on-die cores and off-die cores is the difference between L3 (single eviction) and L4 (double eviction), and the main use of the die-to-die and package-to-package interconnect is going to be handling the L4 data flow. I bet the numbers showed that going from 256MB of cache on die (available for active-use L2 + shared L3) to more didn't make much of a difference; that's a pretty big L3 chunk, after all, and even if the latency for L4 is way higher, I bet you're already well into marginal gains.
In this case, and assuming that the package is thermally limited, there's no real reason not to go to a smaller die. Even if a 16 core, 512MB cache monster fit on reticle, if there's not much performance gap between that and the actual dual 8 core, 256MB cache system, why not take a yield gain from smaller dice?
If this is the case, I'd expect this die to be way smaller than the Z15 processor... < 400 mm^2? Anyone have numbers?
Ah, that's a super interesting point. My vague recollection is that the size per bit for eDRAM is somewhere between a half and a third that of SRAM (isoprocess), so this would be doubling the effective cache area isoprocess... or probably somewhere between maintaining it and halving it with the move to 7 nm. It still looks like it might be less die space for cache, but not nearly as much as I was thinking at first.
Exactly. I'm left wondering if the move off of eDRAM was motivated by improved latency or just unavailability of the technology (or that it hadn't been updated to 7nm).
I just saw a tweet of a guy serving 200k connections on a $5 USD instance. I believe it because I've experienced it first hand several times. Serving static pages in 2021 is not a big deal, at all.
On the other hand, this is AnandTech, a huge site that's been going for years (decades?). For sure they get a lot of traffic, but I doubt they get something like 1M+ requests per second. Can't they just solve this problem already? Whoever is in charge of their tech stack is hardly doing their job; this is inexcusable.
I'm looking at our traffic logs and don't see any downturn in traffic. Normally when we get an outage it's pretty obvious. Did you actually get an error, or was it just not loading?
But even if it really was down, it won't be their first time with capacity problems. Unlike ServeTheHome, where they host and even assemble their own servers, I think AnandTech's team are now simply content writers, with Future Publishing handling everything on the hosting and software side along with ads.
But yes, I think there is something very wrong with the current trend in web development.
'I think AnandTech's team are now simply content writers, with Future Publishing handling everything on the hosting and software side along with ads.'
We're not an independent site anymore, so we have little control over how the site works and integrates into Future's back end (to be fair, compared to TH, we're not that integrated several years later because of a custom CMS and hosting setup, but it means Future won't put any $$$ into it until we transfer over).
Simply put, with the big publishers, writers are writers, publishers deal with the publishing.
Ars Technica has waxed and waned over time, but they have always been solid enough for me. They have always stood up to defend the individual in technology policy, and to call out how trends in law enforcement deceptively creep up on us. They can get squirrelly with the climate coverage once in a while. BUT even if you disagree with them -- you can bet that if you sincerely read their content, their stance is well grounded.
As a stable, long-term aggregator for open source and similar news and articles, I have been very happy with LXer. http://lxer.com/
I know we're not supposed to entertain this on HN, but I think this would make a good 'Ask HN' - let me know if you make one.
> I just saw a tweet of a guy serving 200k connections on a $5 USD instance. I believe it because I've experienced it first hand several times. Serving static pages in 2021 is not a big deal, at all.
You can do that on a $1/month instance if you are clever about it. It's all about static serving.
Heavy load is something that can be detected and handled - if you're maxing out your CPU/network, switch to low-res/low-quality images and minimal CSS, and do away with comments/recommendations. You could get that page down to 20kB and it would still be mostly the same.
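As a rough sketch of that degrade-under-load idea (not anything AnandTech or Future actually does; the file names and threshold are made up), on a Unix box:

```python
import os

# Hypothetical pre-rendered variants of the same article page.
FULL_PAGE = "article_full.html"   # hi-res images, comments, recommendations
LITE_PAGE = "article_lite.html"   # ~20 kB: text, low-res images, minimal CSS

# Rough heuristic: degrade once the 1-minute load average exceeds the core count.
LOAD_THRESHOLD = os.cpu_count() or 1

def pick_variant() -> str:
    """Return which variant to serve based on current system load (Unix only)."""
    load_1min, _, _ = os.getloadavg()
    return LITE_PAGE if load_1min > LOAD_THRESHOLD else FULL_PAGE

if __name__ == "__main__":
    print("Serving:", pick_variant())
```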
> Can't they just solve this problem already? Whoever is in charge of their tech stack is hardly doing their job; this is inexcusable.
A text+images news website should not be having this much trouble; it's embarrassing. That said, I'm sure there is something we don't understand - I would for sure like to speak with somebody from the site to understand it more.
They could; or they could just generate static pages and serve them from an nginx reverse proxy rather than fetching from databases, etc. The price is very different.
The NY Times does this for election-day results because they fully expect that to get hit hard. They re-publish a static page every few minutes.
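A minimal sketch of that publish-then-serve pattern; the paths and interval here are invented, and in practice nginx or any static file server would serve the output directory:

```python
import time
from pathlib import Path

# Hypothetical web root and cadence; a static server serves this directory.
WEB_ROOT = Path("/var/www/html")
REPUBLISH_INTERVAL = 120  # seconds between regenerations

def render_results_page() -> str:
    """Stand-in for the expensive part: query databases, fill templates, etc."""
    return f"<html><body><h1>Results as of {time.ctime()}</h1></body></html>"

def republish_forever() -> None:
    while True:
        tmp = WEB_ROOT / "results.html.tmp"
        tmp.write_text(render_results_page())
        # Atomic rename so readers never see a half-written page.
        tmp.replace(WEB_ROOT / "results.html")
        time.sleep(REPUBLISH_INTERVAL)

if __name__ == "__main__":
    republish_forever()
```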
Maybe they should hire IBM to invent a virtual L5 cache for the website and "magically" move it all into our L2s for "government-authorized nuclear bunker" level service.
Maybe they DID hire IBM and made all those things and still, after some years and a couple million dollars, the platform delivered can't handle more than 1,000 or so visitors.
Story time: in the early 2010s, Oracle managed to sell two (yes, one for redundancy, obviously) enormous mainframes to the French Civil Aviation Authority (DGAC) just to host their intranet portal. I don't remember exactly how much each cost, but IIRC it was a six-digit figure.
And the funniest part of that story: the mainframes were so heavy they couldn't fit in the existing server room, so they sat for months in a warehouse, unplugged, waiting for a new dedicated server room to be built. (I don't know whether said server room was eventually built, or whether the mainframes were ever actually used in the end.)
I visited the “server room” of the organisation responsible for e-healthcare (the software application in use by all public hospitals, doctors, and pharmacies in one state of India) as a consultant, and learned the following.
Their server room was on the second floor of an old, repurposed residential building!
The only ways to get there were a narrow winding staircase or an ancient metal elevator that creaked and was clearly made for tiny humans.
IBM managed to sell them two mainframes (Z10) to run their Java apps. The mainframes could not be brought in because of the elevator problem. I kid you not, they brought welders on site to dismantle the racks, and then weld them back together inside the “server room”!
Cooling for the “server room” with two mainframes, you ask? A few air conditioners of the kind you’d typically install at home, and a table fan (I don’t know why it was there).
They were concerned about performance issues with their Java applications, which boiled down to “the JVM runs out of memory about every day”.
Luckily, I could bail out of that nightmare and escape the whole situation.
I could only think about the govt employees in charge of purchasing who got to go on fancy vacations paid for by IBM. I obviously don’t know if that’s the case; it’s just my own amusing speculation to explain the irrational purchase.
I used to work in the room next to the machine room at the University of Illinois at Chicago where we had an 80s vintage IBM mainframe. Because everything in the machine room was pretty big (the units holding the disk drives were about the size of a washing machine), it still kind of breaks my brain to think that there were microchips inside them. I like to imagine that the CPUs were the size of paperbacks, and to see stories like this just kind of messes with my brain.
(At a previous employer, we got a tour of the data center where, along with racks and racks of commodity Linux servers, there was also an IBM mainframe. It was a sleek black piece of hardware about the size of a refrigerator, as I recall. Much prettier than those 80s machines, although is it really a mainframe without at least one 9-track tape drive?)
Chips yes, but microprocessors no. If it was a 3rd-gen design, each CPU in that IBM mainframe would have been a large circuit board with a whole bunch of chips and TTL parts, or racks of circuit boards. Third-gen mainframes were still being built in the 90s, and a single processing unit could contain thousands of chips. The 10-way IBM ES9000 9X2 contained over 5,000.
Bear in mind mainframes aren’t so much about fast execution as about massive parallelisation and I/O bandwidth.
Fair enough. We routinely had several thousand interactive users at a time on UICVM back in the 80s. We had dedicated machines (effectively their own minicomputers) that managed terminal sessions so we could use commodity ASCII terminals or PCs running Kermit instead of IBM 3270 terminals. It still blows me away how small everything is in the twenty-first century.
Not quite a revolutionary cache design: we built a similar "L2 is also L3" design 20 years ago in an x86 clone. Sadly the chip was never released, and of course all the patents have now expired.
On most modern CPUs with multiple sockets, it's slower to fetch an L2 line from the other socket than to fetch it from main memory. Was this also true on your system?
All of the work we do in distributed systems and making things redundant and robust and stable, every time I read something about an IBM mainframe I think: “Those IBM engineers figured it all out already!”
From the article:
“So What Has IBM Done That is So Revolutionary?
In the first paragraph, I mentioned that IBM Z is their big mainframe product – this is the big iron of the industry. It’s built better than your government-authorized nuclear bunker. These systems underpin the critical elements of society, such as infrastructure and banking. Downtime of these systems is measured in milliseconds per year, and they have fail safes and fail overs galore – with a financial transaction, when it is made, it has to be committed to all the right databases without fail, or even in the event of physical failure along the chain.
This is where IBM Z comes in. It’s incredibly niche, but has incredibly amazing design.”
Well they should, since they invented, and were first to market with, most of the innovations in the first place. They didn’t bother selling to startups like Google back in the day, and folks decided to recreate most of it on a cheaper basis, from what I understand.
1) How does a core know in which private cache of another core it can put the evicted data? Wouldn't that just remove data from that other core's cache that the other core needs?
2) Cores in today's processors can already read data from the private cache of another core (i.e., snooping), can they not?
> How does a core know in which private cache of another core it can put the evicted data?
Good question. There are many ways of handling this and I'm curious how IBM decided to implement this. Though it's unlikely they will divulge such internal implementation details.
> Wouldn't that just remove data from that other core's cache that the other core needs?
"This becomes important for cloud services (yes, IBM offers IBM Z in its cloud) where tenants do not need a full CPU, or for workloads that don’t scale exactly across cores."
> Cores in today's processors can already read data from the private cache of another core (i.e., snooping), can they not?
Depends on the exact CPU and its implementation. But yes, this idea of a "private cache" is pretty silly and misleading. Each cache is just a small subset of the system's memory. And all cores can issue memory reads which will eventually get the data from somewhere. Whether you're reading it from your own private cache, or some other core's private cache, a shared cache, or from DRAM, does not change the final result.
The only distinction here is in performance. Accessing a cache that is colocated to your core is much faster. Accessing a cache that is either shared or located elsewhere, is much slower. An approximate analogy would be using a us-east-1 ec2 instance to read from a us-east-1 RDS database, as opposed to a us-west-2 RDS database. The former database and its contents are certainly not "private", but they are a whole lot faster.
There's likely some 'recently used' state involved, plus they are probably only pushing dirty lines into other caches, to save the overhead of pushing them all the way out to memory on a miss/allocate.
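To make that concrete, here is a deliberately toy model of the "evict into a peer's L2, tag it as virtual L3" idea. This is purely my own illustration, not IBM's actual policy; the real hardware presumably tracks coherence state and saturation metrics, and only spills lines worth keeping.

```python
from collections import OrderedDict

L2_LINES = 4  # absurdly small L2 capacity, just for illustration

class Core:
    def __init__(self, name):
        self.name = name
        self.l2 = OrderedDict()  # line address -> "own" or "virtual_l3"

    def has_room(self):
        return len(self.l2) < L2_LINES

    def install(self, addr, peers=()):
        """Bring a line into this core's L2, evicting the LRU line if needed."""
        if not self.has_room():
            victim, kind = self.l2.popitem(last=False)  # evict least recently used
            if kind == "own":
                self.spill_to_peer(victim, peers)
        self.l2[addr] = "own"

    def spill_to_peer(self, addr, peers):
        """On eviction, try to park the line in a peer's L2, tagged as virtual L3."""
        for peer in peers:
            if peer.has_room():
                peer.l2[addr] = "virtual_l3"
                return
        # No room anywhere: the line would go back to memory (not modeled here).

cores = [Core(f"core{i}") for i in range(2)]
for addr in range(6):           # core0 streams through lines and overflows its L2
    cores[0].install(addr, peers=[cores[1]])
print({c.name: dict(c.l2) for c in cores})
# core0 keeps its four most recent lines as "own"; the two evicted lines now sit
# in core1's L2 tagged "virtual_l3" instead of going straight back to memory.
```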
What’s running on this hardware? Who is writing new software for it? Clearly a lot of people, because IBM has money to invest in this cutting-edge R&D, but I’ve never met anyone who says they code for IBM mainframes.
There are the usual finance/banking guys. But also a lot of ERPs - the corporate bread-and-butter HR, financials, CRM, ERM, SCM, etc. applications (I'm in this space, historically as a sysadmin/light developer, then perf testing, infra architect, tech team lead, and now as kind of an ops manager).
Over my 25 years and a few dozen projects, the mainframe was a significant portion of implementations, and was almost always at least on the table as a potential platform.
FWIW, what I always find fascinating is that after many an implementation on a "distributed platform", i.e. Unix, Linux, or Windows, as soon as any performance issue arises there are a lot of calls to "move it back to the mainframe". I don't necessarily agree with that myself, but a lot of big enterprise shops feel that the mainframe is an expensive pain, but a reliable, predictable, well-performing expensive pain.
> on "distributed platform", I. E. Unix Linux or Windows
Is the mainframe really "more reliable" or "more performant" than a well provisioned cluster of commodity hw?
I mean, after 20 years of optimising the SW stack, with monster Xeons, terabytes of memory, Fibre Channel, SSDs, and everything so cheap and accessible, is it really the tech or just corporate inertia keeping this industry alive? ;)
As I recall, there have been something like six attempts to move the airline scheduling systems off of mainframes, but the efforts keep failing to compete with the existing mainframes.
Friday night HN thoughts... (1) IBM charges for "MIPS", so faster processor, mo' money. (2) Because of the humungo margins in this space, they have a great R&D budget. (3) The marketing has to be IBM mainframes are the most bestest and "worth it", so every gen is ahead in some respects. (4) IBM makes a lot of money outsourcing legacy mainframe workloads, so faster systems keeps them ahead of the curve there.
Years ago, IBM had a partner offering mainframe Linux/Java "cloud servers". But that must not have added up. They will tell their existing customers Z is great for these workloads, but you don't see Google or Amazon buying them.
Linux runs fine on s390x. Red Hat ships and supports RHEL for it with the usual selection of packages. It's interesting to me because it's the "last" (common) big-endian architecture, so it's the last place available to test for endianness bugs.
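For anyone who hasn't been bitten by one: the classic endianness bug is code that serializes with the host's native byte order and assumes that order everywhere. A tiny self-contained illustration (my own, not tied to any particular project) whose "native" output only differs on a big-endian host like s390x:

```python
import struct
import sys

value = 0x01020304

# Buggy pattern: pack with native byte order ("=I") and assume the bytes on the
# wire are little-endian. Works on x86/arm64, silently breaks on s390x.
native_bytes = struct.pack("=I", value)

# Correct pattern: pin the byte order explicitly (">I" = big-endian/network order).
portable_bytes = struct.pack(">I", value)

print("host byte order:", sys.byteorder)
print("native:  ", native_bytes.hex())
print("portable:", portable_bytes.hex())
assert struct.unpack(">I", portable_bytes)[0] == value  # holds on any host
```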
It could be really interesting if any of the up-and-coming Arm processors could be made to work reliably in big-endian mode. There are some custom kernels floating around that run the Raspberry Pi in BE mode.
Thank you for that well written article. As a curious techie, but someone unfamiliar with the subject, I found it approachable and easy to understand despite the complex topic. Not an easy task -- well done!
Thanks! The Chief Architect on the project messaged me this morning, saying he showed his wife my video (https://www.youtube.com/watch?v=z6u_oNIXFuU) on the topic last night, and it sounds like she finally gets what he's been working on.
Most caches are going to have logic for tagging, eviction and invalidation for multiple other purposes, so this makes sense to do. There's some more accounting each core and L2 cache might have to do, but if the interconnect is up to the task, I'm not sure if this is that much more complex or an increase in latency compared to the old way. Just seems like a more efficient use of die area.
Clickbait doesn't apply if the actual content is worth reading. Plus in this case the title is descriptive of the contents, although it might fall foul of Betteridge's Law.
IBM even acquired Red Hat a couple of years ago, so yeah, they still exist and are trying to dominate in different domains, including but not limited to cloud, AI and Quantum Computing.
The IBM Z series is for mainframes, as the article explains. I think they're CISC. They also still make and develop POWER for supercomputing and scientific workstations (and a tiny niche "PC" market). Those supercomputers are dominating the charts, though.