Did IBM Just Preview the Future of Caches? (anandtech.com)
241 points by boyter on Sept 3, 2021 | 89 comments


Reading this over, one striking thing is that in going from 14 nm -> 7 nm, they reduced core count from 12 to 8, and kept total cache per die approximately the same (256 MB L3 + various smaller caches -> 256 MB L2*). Looking at the die shot, there isn't an unusual amount of uncore, so this is either a MUCH smaller processor or a MUCH more complex core. Given that the Z series has never been particularly cost limited and has often pushed reticle size, and that by getting rid of the system controller costs will already be coming down... is this just a super-high transistor count SMT2 core?


Thinking about this a bit more, I suspect that the answer comes down to the two dies per package. The main distinction between on-die cores and off-die cores is the difference between L3 (single eviction) and L4 (double eviction), and the main use of die-to-die and package-to-package interconnect is going to be handling the L4 data flow. I bet the numbers showed that going from 256MB of cache on die (available for active-use L2 + shared L3) to more didn't make much of a difference; that's a pretty big L3 chunk, after all, and even if the latency for L4 is way higher I bet you're already well into marginal gains.

In this case, and assuming that the package is thermally limited, there's no real reason not to go to a smaller die. Even if a 16 core, 512MB cache monster fit on reticle, if there's not much performance gap between that and the actual dual 8 core, 256MB cache system, why not take a yield gain from smaller dice?
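
To make that concrete, here's a toy average-memory-access-time model (just a sketch; every hit rate and cycle count below is invented for illustration, none of it is IBM's data):

    # Toy AMAT model: each level is (hit_rate, latency_cycles), ending
    # with memory, which always hits. All numbers are made up.
    def amat(levels):
        reach = 1.0   # fraction of accesses that get this far down the hierarchy
        total = 0.0
        for hit_rate, latency in levels:
            total += reach * latency      # accesses reaching this level pay its latency
            reach *= 1.0 - hit_rate       # only misses continue downward
        return total

    # Doubling a last-level cache usually buys only a few points of hit
    # rate once the hot working set already fits.
    base    = amat([(0.95, 19), (0.70, 60), (0.60, 130), (1.0, 300)])
    doubled = amat([(0.95, 19), (0.74, 60), (0.64, 130), (1.0, 300)])
    print(f"{base:.1f} vs {doubled:.1f} cycles per access")  # ~25.8 vs ~25.1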

If this is the case, I'd expect this die to be way smaller than the Z15 processor... < 400 mm^2? Anyone have numbers?


Followup: Looks like 530 mm^2 [1], so quite a bit smaller than Z15 but not by a factor of two; consistent with nullc's eDRAM observation.

[1]: https://wccftech.com/ibm-z-next-gen-processor-detailed-telum...


FWIW, the article this thread links to says 530mm2.


> so this is either a MUCH smaller processor or a MUCH more complex core.

According to the video presentation [1], each processor also contains an AI accelerator with more than 1000 compute engines, for >6 TFLOPS

[1] https://www.ibm.com/blogs/systems/ibm-telum-processor-the-ne...


IIRC that L3 was eDRAM, so it's a bigger change than you might imagine.


Ah, that's a super interesting point. My vague recollection is that the size per bit for eDRAM is somewhere between a half and a third of that for SRAM (isoprocess), so this would be doubling the effective cache area isoprocess... or probably somewhere between maintaining it and halving it with the move to 7 nm. Still looks like it might be less die space for cache, but not nearly as much as I was thinking at first.
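
Rough arithmetic behind that guess (the 2.5x cell-density ratio and the 2x process shrink are assumptions, not measured values):

    # Back-of-the-envelope cache area comparison; all factors assumed.
    sram_cell  = 1.0                # relative area of an SRAM bit cell
    edram_cell = sram_cell / 2.5    # eDRAM assumed ~2-3x denser, take 2.5x
    shrink_14_to_7 = 0.5            # assume ~2x density gain from 14nm -> 7nm

    old_l3 = 256 * edram_cell                  # 256 MB eDRAM L3 at 14nm (arbitrary units)
    new_l2 = 256 * sram_cell * shrink_14_to_7  # 256 MB SRAM L2 at 7nm

    print(f"old eDRAM L3: {old_l3:.0f}, new SRAM L2: {new_l2:.0f}")
    # -> 102 vs 128: the shrink pays back most of the SRAM density penalty.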


Exactly. I'm left wondering if the move off of eDRAM was motivated by improved latency or just unavailability of the technology (or that it hadn't been updated to 7nm).


Unavailability. There simply is no process for making the deep trench capacitors that yields at <14nm.


Was eDRAM ever used for L2? I thought its latency put it firmly in L3/L4 camp.


Z12 had a small (1MB) L2, and a 48MB shared eDRAM L3.

In this new chip the L2 replaces the prior L2 and L3.

Edit: similar deal for Z14: 4MB/4MB L2 + 256MB eDRAM L3.


Down because of too much traffic...

Completely tangential rant:

I just saw a tweet from a guy serving 200k connections on a $5 USD instance. I believe it because I've experienced it first-hand several times. Serving static pages in 2021 is not a big deal, at all.

On the other hand, this is AnandTech, a huge site that's been going for years (decades?). For sure they get a lot of traffic, but I doubt they get something like 1M+ requests per second. Can't they just solve this problem already? Whoever is in charge of their tech stack is hardly doing their job; this is inexcusable.


I'm looking at our traffic logs and don't see any downturn in traffic. Normally when we get an outage it's pretty obvious. Did you actually get an error, or was it just not loading?


504 Gateway Timeout

Several times over the course of like an hour. It was a message from Cloudflare, saying that your site took too long to reply.


Well it wasn't down for me.

But even if it was really down, it wouldn't be their first time with capacity problems. Unlike ServeTheHome, which hosts and even assembles its own servers, I think AnandTech's team are now simply content writers, with Future Publishing handling everything on the hosting and software side along with ads.

But yes, I think there is something very wrong with the current trend of web development.


'I think AnandTech's team are now simply content writers, with Future Publishing handling everything on the hosting and software side along with ads.'

We're not an independent site anymore, so we have little control over how the site works and integrates into Future's back end (to be fair, compared to TH, we're not that integrated several years later because of a custom CMS and hosting setup, but it means Future won't put any $$$ into it until we transfer over).

Simply put, with the big publishers, writers are writers, publishers deal with the publishing.


AnandTech hasn't had Anand for a very long time, you are right to point out.

We had it really good when he was in charge. In-depth technical dives from a curious position. https://imgur.com/a/MFAypUc

I believe Phoronix is still good. https://www.phoronix.com/scan.php?page=home

Ars Technica has waxed and waned over time, but they have always been solid enough for me. They have always been ready to defend the individual in technology policy, and to cover how trends in law enforcement deceptively catch up. They can get squirrelly with the climate coverage once in a while. But even if you disagree with them, you can bet that if you sincerely read their content, their stance is well grounded.

As a stable and long term aggregator for open source and similar news and articles I have been very happy with lxer. http://lxer.com/


What? He was talking about site layout and uptime, not content.


> Completely tangential rant:

I know we're not supposed to entertain this on HN, but I think this would make a good 'Ask HN'; let me know if you make one.

> I just saw a tweet from a guy serving 200k connections on a $5 USD instance. I believe it because I've experienced it first-hand several times. Serving static pages in 2021 is not a big deal, at all.

You can do that on a $1/month instance if you are clever about it. It's all about static serving.

Heavy load is something that can be detected and handled - if you're maxing out your CPU/network, switch to low-res/low-quality images, minimal CSS, and do away with comments/recommendations. You could get that page down to 20kB and it would be mostly the same.
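
That switch can be as simple as a threshold check (a sketch; the threshold and the page file names are stand-ins):

    # Load-adaptive degradation: above a CPU threshold, serve the
    # stripped-down variant of the page. File names are hypothetical.
    import psutil

    def pick_page():
        return "lite.html" if psutil.cpu_percent(interval=0.1) > 80 else "full.html"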

> Can't they just solve this problem already? Whoever is in charge of their tech stack is hardly doing their job; this is inexcusable.

A text+images news website should not be having this much trouble, it's embarrassing. That said, I'm sure there is something we do not understand - I would for sure like to speak with somebody from the site to understand it more.


Maybe they should use some kind of caching service like Cloudflare?


They could; or they could just generate static pages and serve them from an nginx reverse proxy rather than fetching from databases, etc. The price is very different.

The NY Times does this for election day results because they fully expect that to get hit hard. They re-publish a static page every few minutes
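
A minimal sketch of that publish-every-few-minutes pattern (the path, interval, and render_results stand-in are all made up); the web server then only ever touches a static file:

    # Re-render a page to static HTML on a timer, so the request path
    # never touches a database. render_results() is a placeholder.
    import os, tempfile, time

    def render_results() -> str:
        return "<html><body>latest results here</body></html>"

    def publish(path="/var/www/html/results.html", interval=120):
        while True:
            fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path))
            with os.fdopen(fd, "w") as f:
                f.write(render_results())
            os.replace(tmp, path)   # atomic swap: readers never see a half-written page
            time.sleep(interval)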


Same reason you can sometimes view Hacker News without being logged in when it is seemingly down.


Maybe they should hire IBM to invent a virtual L5 cache for the website and "magically" move it all into our L2s for "government-authorized nuclear bunker" level service.


Maybe they DID hire IBM, who made all those things, and still, after some years and a couple million dollars, the delivered platform can't handle more than 1,000 or so visitors.


Story time: in the early 2010s, Oracle managed to sell two (yes, one for redundancy, obviously) enormous mainframes to the French Civil Aviation Authority (DGAC) just to host their intranet portal. I don't remember how much each cost exactly, but IIRC it was a six-digit figure.

And the funniest part of that story: the mainframes were so heavy they couldn't fit in the existing server room, so they were parked for months in a warehouse, unplugged, waiting for some new dedicated server room to be built. (I don't know if said server room was eventually built, or if the mainframes were actually used in the end.)


Oh oh I got one.

I visited the “server room” of the organisation responsible for e-healthcare (the software application in use by all public hospitals, doctors, and pharmacies in one state of India) as a consultant, and learned the following.

Their server room is in an old repurposed residential building on the second floor!

The only ways to get there were either through narrow winding steps, or an ancient metal elevator that creaks and is also made for tiny humans.

IBM managed to sell them two mainframes (z10) to run their Java apps. The mainframes could not be brought in because of the elevator problem. I kid you not, they brought welders on site to dismantle the racks, and then weld them back together again inside the “server room”!

Cooling for the “server room” with two mainframes, you ask? A few air conditioners of the kind you’d typically install at home, and a table fan (I don’t know why it was there).

They were concerned about performance issues on their Java applications, which were simply “the JVM runs out of memory about every day”.

I luckily could bail out of that nightmare and escaped the whole situation.

I could only think about the govt employees in charge of purchasing who got to go on fancy vacations paid for by IBM. I obviously don’t know if that’s the case; it’s just my own amusing speculation to explain the irrational purchase.


Did you consider that you just wrote that big rant without checking if the site was actually down or if it was just you?


I used to work in the room next to the machine room at the University of Illinois at Chicago where we had an 80s vintage IBM mainframe. Because everything in the machine room was pretty big (the units holding the disk drives were about the size of a washing machine), it still kind of breaks my brain to think that there were microchips inside them. I like to imagine that the CPUs were the size of paperbacks, and to see stories like this just kind of messes with my brain.

(At a previous employer, we got a tour of the data center where, along with racks and racks of commodity Linux servers, there was also an IBM mainframe. It was a sleek black piece of hardware about the size of a refrigerator, as I recall. Much prettier than those 80s machines, although is it really a mainframe without at least one 9-track tape drive?)


Chips yes, but microprocessors no. If it was a 3rd gen design, the CPUs in that IBM mainframe would have been a large circuit board with a whole bunch of chips and TTLs, or racks of circuit boards. Third gen mainframes were still being built in the 90s, and a single processing unit could contain thousands of chips. The 10-way IBM ES9000 9X2 contained over 5,000.

Bear in mind mainframes aren’t about fast execution so much, they’re about massive parallelisation and I/O bandwidth.


Fair enough. We routinely had several thousand interactive users at a time on UICVM back in the 80s. We had dedicated machines (effectively their own minicomputers) that managed terminal sessions so we could use commodity ASCII terminals or PCs running Kermit instead of IBM 3270 terminals. It still blows me away how small everything is in the twenty-first century.


Dave Jones disassembled a mainframe CPU from a 90s vintage machine. It was the size of a paperback. Check his channel on YouTube or Odysee.


Not quite a revolutionary cache design; we built a similar "L2 is also L3" design 20 years ago in an x86 clone. Sadly the chip was never released, and of course all the patents have now expired.


We being who?


I probably shouldn't say :-) but those expired patents likely belong to AMD these days


The entire point of the patent system is to encourage disclosure. Patents are by their very nature public information.


If it was 20 years ago the patents shouldn't last much longer...


Cyrix?


On most modern CPUs with multiple sockets, it's slower to fetch an L2 line from the other socket than to fetch it from main memory. Was this also true on your system?


These were separate CPUs, each with an L2, on the same die; we had a cache-line-wide bus directly across the die.


Flexible use of L2 cache slices has been heavily explored in academia. Here is one example:

http://www.cs.wisc.edu/multifacet/papers/ieeemicro08_virtual...


For all of the work we do in distributed systems, making things redundant and robust and stable, every time I read something about an IBM mainframe I think: “Those IBM engineers figured it all out already!”

From the article:

“ So What Has IBM Done That is So Revolutionary? In the first paragraph, I mentioned that IBM Z is their big mainframe product – this is the big iron of the industry. It’s built better than your government-authorized nuclear bunker. These systems underpin the critical elements of society, such as infrastructure and banking. Downtime of these systems is measured in milliseconds per year, and they have fail safes and fail overs galore – with a financial transaction, when it is made, it has to be committed to all the right databases without fail, or even in the event of physical failure along the chain.

This is where IBM Z comes in. It’s incredibly niche, but has incredibly amazing design.”


Yeah, it's interesting how each mainframe works like its own small distributed system while in reality being a quite centralized one.


Well, they should, since they invented, and were first to market with, most of the innovations in the first place. They didn't bother selling to startups like Google back when, and folks decided to recreate most of it on a cheaper basis, from what I understand.


Two questions:

1) How does a core know in which private cache of another core it can put the evicted data? Wouldn't that just remove data from that other core's cache that the other core needs?

2) Cores in today's processors can already read data from the private cache of another core (i.e., snooping), can they not?


> How does a core know in which private cache of another core it can put the evicted data?

Good question. There are many ways of handling this and I'm curious how IBM decided to implement this. Though it's unlikely they will divulge such internal implementation details.

> Wouldn't that just remove data from that other core's cache that the other core needs?

"This becomes important for cloud services (yes, IBM offers IBM Z in its cloud) where tenants do not need a full CPU, or for workloads that don’t scale exactly across cores."

> Cores in today's processors can already read data from the private cache of another core (i.e., snooping), can they not?

Depends on the exact CPU and its implementation. But yes, this idea of a "private cache" is pretty silly and misleading. Each cache is just a small subset of the system's memory. And all cores can issue memory reads which will eventually get the data from somewhere. Whether you're reading it from your own private cache, or some other core's private cache, a shared cache, or from DRAM, does not change the final result.

The only distinction here is in performance. Accessing a cache that is colocated to your core is much faster. Accessing a cache that is either shared or located elsewhere, is much slower. An approximate analogy would be using a us-east-1 ec2 instance to read from a us-east-1 RDS database, as opposed to a us-west-2 RDS database. The former database and its contents are certainly not "private", but they are a whole lot faster.
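
A toy model of that point (the structures and latencies here are invented): wherever the load is satisfied, the value is identical, and only the cost changes.

    # Resolve a load by probing caches in order of distance. Same value
    # regardless of source; only the (invented) latency differs.
    LATENCY = {"own_l2": 19, "peer_l2": 45, "shared_l3": 60, "dram": 300}

    def load(addr, own_l2, peer_l2s, shared_l3, dram):
        probes = [("own_l2", own_l2)]
        probes += [("peer_l2", p) for p in peer_l2s]
        probes.append(("shared_l3", shared_l3))
        for name, store in probes:
            if addr in store:
                return store[addr], LATENCY[name]
        return dram[addr], LATENCY["dram"]     # memory always has the data

    dram = {0x1000: 42}
    print(load(0x1000, {}, [{0x1000: 42}], {}, dram))  # (42, 45): from a peer's "private" L2
    print(load(0x1000, {}, [], {}, dram))              # (42, 300): same value, from DRAM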


I found the easiest way to think about it, back when we were doing caches in computer architecture, was in terms of abstraction.

The CPUs have "read memory" and "write memory" as the only operations programs can see.

The cache policies operate asynchronously (because multiple cores are executing simultaneously) to make those happen in the fastest way possible.

There's an endless variety of schemes and ways to design it, with the one huge requirement that it must ALWAYS be accurate.

But as long as you can satisfy that, you can do some goofy, crazy stuff. Might be faster, might be slower, but it will work.
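
One way to picture that requirement, as a toy check rather than any real verification method: whatever the scheme does internally, its reads must always match a flat reference memory.

    # Toy correctness harness: a cache model, however goofy its policy,
    # must behave exactly like flat memory as far as the program can tell.
    import random

    class FlatCache:                 # stand-in for an arbitrary cache scheme
        def __init__(self): self.d = {}
        def write(self, a, v): self.d[a] = v
        def read(self, a): return self.d[a]

    def check(cache, ops=10_000):
        reference = {}               # the ground truth: plain memory
        for _ in range(ops):
            addr = random.randrange(64)
            if random.random() < 0.5:
                val = random.randrange(1000)
                reference[addr] = val
                cache.write(addr, val)
            elif addr in reference:
                assert cache.read(addr) == reference[addr], "coherence bug!"

    check(FlatCache())               # passes; a buggy policy would trip the assert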


There's likely some 'recently used' state, plus they are probably only pushing dirty lines into other caches, to save the overhead of pushing them all the way to memory on some misses/allocates.
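
Sketched out, that policy might look like this (pure speculation about Telum; the has_idle_way / insert / writeback interfaces are hypothetical):

    # Spill-vs-writeback on eviction: clean lines can just be dropped,
    # since memory already holds their data; only dirty lines are worth
    # pushing into a neighbour's L2. Entirely speculative.
    from dataclasses import dataclass

    @dataclass
    class Line:
        addr: int
        dirty: bool
        set_index: int

    def evict(line, peer_l2s, memory):
        if not line.dirty:
            return "dropped"                       # memory copy is still current
        for peer in peer_l2s:
            if peer.has_idle_way(line.set_index):  # hypothetical occupancy query
                peer.insert(line)                  # line becomes "virtual L3" for the chip
                return "spilled"
        memory.writeback(line)                     # no taker: pay the full trip to DRAM
        return "written back"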


What’s running on this hardware? Who is writing new software for it? Clearly a lot of people, because they have money to invest in this cutting-edge R&D, but I’ve never met anyone who says they code for IBM mainframes.


Yo (kind of:)

There are the usual finance / banking guys, but also a lot of ERPs: the corporate bread-and-butter HR, financials, CRM, ERM, SCM, etc. applications (I'm in this space, historically as sysadmin / light developer, then perf testing, infra architect, tech team lead, now as kind of an ops manager).

Over my 25 years and a few dozen projects, the mainframe was a significant portion of implementations, and almost always at least on the table as a potential platform.

FWIW, what I always find fascinating, though, is that after many an implementation on a "distributed platform", i.e. Unix, Linux, or Windows, as soon as any performance issue arises, there are a lot of calls to "move it back to the mainframe". I don't necessarily agree with that myself, but a lot of big enterprise shops feel that the mainframe is an expensive pain, but a reliable, predictable, well-performing expensive pain.


> on "distributed platform", I. E. Unix Linux or Windows

Is the mainframe really "more reliable" or "more performant" than a well provisioned cluster of commodity hw?

I mean, after 20 years of optimising the SW stack, with monster Xeons and terabytes of memory and Fibre Channel and SSDs and everything so cheap and accessible. Is it really tech or just corporate inertia keeping this industry alive? ;)


As I recall, there have been something like six attempts to move the airline scheduling systems off of mainframes, but the efforts keep failing to compete with the existing mainframes.


Friday night HN thoughts... (1) IBM charges for "MIPS", so faster processor, mo' money. (2) Because of the humungo margins in this space, they have a great R&D budget. (3) The marketing has to be IBM mainframes are the most bestest and "worth it", so every gen is ahead in some respects. (4) IBM makes a lot of money outsourcing legacy mainframe workloads, so faster systems keeps them ahead of the curve there.

Years ago, IBM had a partner offering mainframe Linux/Java "cloud servers". But that must not have added up. They will tell their existing customers Z is great for these workloads, but you don't see Google or Amazon buying them.


Linux runs fine on s390x. Red Hat ships and supports RHEL for it with the usual selection of packages. It's interesting for me because it's the "last" (common) big-endian architecture, so it's the last place available to test for endianness bugs.
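
The sort of bug it flushes out, as a minimal illustration (Python here, but the same trap exists in any language): code that reinterprets bytes without naming a byte order.

    # Reinterpreting raw bytes without an explicit byte order gives
    # different answers on big- vs little-endian hosts.
    import struct, sys

    raw = bytes([0x00, 0x00, 0x00, 0x2A])
    print(sys.byteorder)                  # 'big' on s390x, 'little' on x86/arm
    print(struct.unpack("=I", raw)[0])    # native order: 42 on BE, 704643072 on LE
    print(struct.unpack(">I", raw)[0])    # explicit big-endian: 42 everywhere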


It could be really interesting if any of the up-and-coming ARM processors could be made to work reliably in big-endian mode. There are some custom kernels floating around that run the Raspberry Pi in BE mode.


I wish IBM had competitors in the space, so we might actually get to use them without taking on ridiculous amounts of IBM lock-in.

The sheer amount of engineering that goes into these things is amazing.


I know that when we got the tour of the data center at Blue Cross/Blue Shield, they had an IBM mainframe in the room.

It's worth noting that with modern computing systems, you could be deploying to a z/OS system without even necessarily being aware of it.


Lots of Linux distros support IBM mainframes, Debian for example:

https://www.debian.org/ports/s390/


To some extent, the stock market runs on mainframes. I have worked alongside some mainframe programmers, but never touched the things myself.


Applications where the growing list of Intel and AMD security holes is unacceptable. Secure virtualization, for example.

POWER9 has slightly lower performance than amd64 (don't know about POWER10), but security is the main value proposition.

It would be interesting though to see what would happen if security researchers got the same cheap access to these platforms as to amd64.


The article is about IBM Z, not POWER.

Both IBM Z and POWER processors are vulnerable to a variety of spectre-style attacks.


I thought the wording in the article sounded familiar, seems like this is basically the article in video form: https://www.youtube.com/watch?v=z6u_oNIXFuU


TechTechPotato is Dr. Ian Cutress, the author of the article.


Hello! :)


Thank you for that well written article. As a curious techie, but someone unfamiliar with the subject, I found it approachable and easy to understand despite the complex topic. Not an easy task -- well done!


Thanks! The Chief Architect on the project messaged me this morning, saying he showed his wife my video (https://www.youtube.com/watch?v=z6u_oNIXFuU) on the topic last night and it sounds like she finally gets what he's been working on


Most caches are going to have logic for tagging, eviction and invalidation for multiple other purposes, so this makes sense to do. There's some more accounting each core and L2 cache might have to do, but if the interconnect is up to the task, I'm not sure if this is that much more complex or an increase in latency compared to the old way. Just seems like a more efficient use of die area.


Put in the cycle times for a memory fetch, if you're going to say IBM's L4 cache can be 19 cycles.


mmm interesting


Can we please avoid clickbait titles?


Clickbait doesn't apply if the actual content is worth reading. Plus in this case the title is descriptive of the contents, although it might fall foul of Betteridge's Law.


IBM still exists and makes processors? What architecture? Target market? Are they any good?


Yes: https://en.wikipedia.org/wiki/IBM_POWER_microprocessors

They're definitely good. They're what the Blackbird and FSF-RYF-certified Talos use: https://www.raptorcs.com/content/base/products.html

And they have another architecture (s390x) for mainframes too.


IBM even acquired Red Hat a couple of years ago, so yeah, they still exist and are trying to dominate in different domains, including but not limited to cloud, AI and Quantum Computing.


IBM is the largest non-consumer security company in the world (I think McAfee is biggest on the consumer side).


The IBM Z series is for mainframes, as the article explains. I think they're CISC. They also still make and develop POWER for supercomputing and scientific workstations (and a tiny niche "PC" market). Those supercomputers are dominating the charts, though.


POWER hardware is pretty insane, I wish it were more popular for "normal" workloads.


Wait, is the Z series POWER, or is it s390x?


The z series is s390x.


IBM just refers to it as z/Architecture now, the 64-bit successor of S/390, which looks like it ended in '98; no relation to POWER.


"s390x" is the Linux nomenclature for this. Probably wasn't selected by IBM marketing.


Seems like z/Architecture was backward compatible with S/390 up until the IBM z13 (launched 2015), according to the wiki.

"z13s introduce a new vector architecture and are the last z Systems servers to support running an operating system in ESA/390 architecture mode"


The operating system now needs to natively support the 64-bit z/Architecture, but application compatibility still goes all the way back to System/360.


A huge chunk of the OS is still 24-bit code, then 31-bit, and a little 64-bit.

The removal of backwards compatibility meant you couldn't set up an LPAR or VM with a 31-bit OS anymore.


POWER10 looks like a bit of a monster, obviously we plebs aren't allowed to buy one...


Does POWER have any applications outside of the iSeries and the expensive-as-hell Raptor workstations?



I wouldn't be surprised if IBM make decent money just off people using them as leverage for a better deal from an x86 company.

POWER is definitely used, but no idea of anywhere public.

Raptor unfortunately won't be doing a POWER10 box for a while if at all.


Summit, the current #2 supercomputer on the Top500 list, uses POWER9 CPUs.


Yes; probably POWER10; mainframes and financial databases (see article); no idea.



