The original Docker (and the current Podman) created each layer as an overlay filesystem, so each layer was essentially an ephemeral container. If a build failed, you could just run the last successful layer with a shell and see what was wrong.
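With the classic builder that was a one-liner, because every successful step printed an intermediate image ID (the ID below is a placeholder):

    # the classic builder prints " ---> <image-id>" after each step
    DOCKER_BUILDKIT=0 docker build .
    # drop into the last successful layer with a shell
    docker run --rm -it <image-id> sh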
More importantly, the layers were represented as directories on the host system. So when you wanted to run something in the final container, Docker just needed to reassemble it.
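You can still see this on a host using the overlay2 storage driver, where inspecting an image shows the layer directories directly:

    # prints LowerDir/UpperDir/MergedDir paths under /var/lib/docker/overlay2
    docker image inspect --format '{{json .GraphDriver.Data}}' some-image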
Buildkit has broken all of that. Building is now done, essentially, in a separate system; the "docker buildx" command talks to it over a socket, transmits the context, and gets the result back as an OCI image that it then needs to unpack.
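The round trip is visible in the CLI itself; for example:

    # the result leaves the builder as an OCI tarball...
    docker buildx build --output type=oci,dest=image.tar .
    # ...or has to be explicitly loaded back into the docker image store
    docker buildx build --load -t myimage .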
This is an entirely useless step, and it breaks caching all the time. If you build two images that differ only slightly, the host still gets two full OCI artifacts, even though the two images share most of their layers.
It looks like their Bazel infrastructure optimized it by moving caching down to the file level.
Buildkit didn't break anything here, except that each individual build step is no longer exposed as a runnable image in docker.
That was unfortunate, but these days you can actually have buildkit run a command in that filesystem, and buildx now even exposes a DAP interface.
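For example, the experimental debug mode (flag names may vary between buildx versions) can drop you into a shell inside the build filesystem:

    # requires a recent buildx; gated behind the experimental switch
    BUILDX_EXPERIMENTAL=1 docker buildx debug --invoke sh build .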
Buildkit is still a separate system, unlike the old builder. So you get that extra step of importing the result back.
And since it's a separate system, there are also these strange limitations. For example, I can't cache pre-built images in an NFS directory and then just push them into the Buildkit context. There's simply no command for it. Buildkit can only pull them from a registry.
> Buildkit is far more efficient than the old model.
I understand why. I tried to debug it, and simply getting it running under a debugger is an adventure.
So far, I've found that switching to podman + podman-compose is a better solution. At least my brain is good enough to understand them completely, and to contribute fixes if needed.
Buildkit is integrated into dockerd the same way the old builder was.
If you want a newer Buildkit, you'll need to run it separately, of course.
I'm not quite sure I understand what you are trying to do with NFS there.
But you can definitely export the cache to a local filesystem and import it with cache-from.
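Something along these lines (the paths are examples; a mounted NFS directory works the same way):

    # write the build cache out to a directory
    docker buildx build --cache-to type=local,dest=/path/to/cache,mode=max .
    # read it back on the next build
    docker buildx build --cache-from type=local,src=/path/to/cache .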
You can also provide named contexts.
"Buildkit can only pull them from a registry" is just plain false.
> Buildkit is integrated into dockerd the same way the old builder was. If you want a newer Buildkit, you'll need to run it separately, of course.
I don't think that the older builder created special containers for itself?
> I'm not quite sure I understand what you are trying to do with NFS there. But you can definitely export the cache to a local filesystem and import it with cache-from.
Which is dog-slow, because it squirts the cache through a socket. I have an NFS disk that could cache the data directly; this was just one of my attempts to make it go faster.
> You can also provide named contexts.
Which can only refer to images that were built inside this particular buildkit instance or are pullable from a registry.
This is really all I want: a way to quickly reuse the previous state saved in some format in the GitHub Actions cache, NFS, or other storage.
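The closest existing knobs are the cache backends, e.g. (type=gha only works inside a GitHub Actions runner):

    # GitHub Actions cache
    docker buildx build --cache-from type=gha --cache-to type=gha,mode=max .
    # plain directory, e.g. on an NFS mount
    docker buildx build --cache-to type=local,dest=/mnt/nfs/buildcache .

but they all stream the state through the builder's socket instead of reading the storage directly.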