As is traditional with most explanations of HTTP caching, it doesn't mention the Vary header, although apparently some CDNs (e.g. Cloudflare) straight up ignore it for some reason [0].
I would say "Vary" is the wrong way to solve that problem. The issue is that there can easily be a bunch of stupid inconsequential differences between Accept headers, far beyond simply asking for type x versus type y: slightly different priorities, a different order, an extra MIME type in the list, some irrelevant format nobody uses put first just in case, etc.
An optimal solution would work like this: the response lists which alternate content types can be returned for that endpoint, and the cache considers the Accept header. If it sees a type from the alternates list higher in the Accept priority than whatever it has in cache, it forwards the request to the server. Once it has all the alternatives in cache, it serves them according to the Accept header without hitting the server.
The closest existing header to the above would be the Link header, if you give it rel=alternate and set type to the MIME type. It's not clear what the href would be, since it usually points to a different document, but here we want the same URL with a different MIME type. So clearly this would be an abuse of the header, but it could work.
Sure, except I doubt most people want to uglify all their URLs with extensions for occasional alternates. Plus, if the URL with the extension gets passed around instead of the original (as would inevitably happen), you're back to square one.
I had thought about recommending that people just use an alternate link as intended, to point to an alternate format. I think that would work best, since it uses existing web standards as intended, but it has the downside of initially serving the original format regardless of the requested content type.
Good call! Honestly I just wanted to wrap it up before the holidays, but you’re right that a small section on Vary would have been useful.
Things like non-conforming caching services made me punt actual suggestions to a later article, as I wasn’t sure how my sense of the RFC interacted with the real world. HTTP Caching Tests seems like a great resource for this, but only includes Fastly out of the big providers, and it seems to be doing okay with Vary. https://cache-tests.fyi/
Updated the article with some information on the `Vary` and `No-Vary-Search` headers. I’ve left out the details of how revalidation works with `Vary` since I haven’t been able to reconcile yet what the spec seems to encourage vs what the tests on cache-tests.fyi suggest is conformant behavior.
> the cache MUST NOT use that stored response without revalidation unless all the presented request header fields nominated by that Vary field value match those fields in the original request
You’ll find that some have creative readings of MUST NOT.
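For reference, a simplified Python sketch of the check that quoted requirement describes. `vary_matches` is a made-up helper, and the normalization here (case-insensitive field names, collapsed whitespace) is only a small part of what the spec allows; it does not combine repeated field lines or apply per-field syntax rules.

```python
def vary_matches(vary: str, stored_request: dict[str, str],
                 new_request: dict[str, str]) -> bool:
    """True if a stored response may be reused for new_request.

    vary is the Vary field value of the stored response; the two dicts
    hold the request headers of the original and the incoming request.
    """
    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    if "*" in names:
        return False  # "Vary: *" never matches a later request

    def normalize(headers: dict[str, str]) -> dict[str, str]:
        # Lowercase field names and collapse runs of whitespace in values.
        return {k.lower(): " ".join(v.split()) for k, v in headers.items()}

    old, new = normalize(stored_request), normalize(new_request)
    # A field absent from one request only matches if absent from the other,
    # which dict.get() returning None on both sides captures.
    return all(old.get(name) == new.get(name) for name in names)
```

A cache that skips this comparison and keys only on the URL is the kind of creative reading in question: it would hand a stored response to a request whose nominated headers differ.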
[0] https://news.ycombinator.com/item?id=38346382