
Have they ever asked the customers why they prefer scraping over the data deltas?


I would bet the answer is that it is easier to write a script that simply downloads everything it can (foreach <a href=>: download and recurse) than to look into which sites provide data dumps and how to use them.
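For concreteness, a minimal sketch of that kind of naive crawler (the same-origin restriction and URL handling here are my own illustrative choices, not from any particular scraper):

    # Naive "download everything reachable" crawler sketch.
    # Assumes requests and beautifulsoup4 are installed.
    from urllib.parse import urljoin, urlparse

    import requests
    from bs4 import BeautifulSoup

    def crawl(start_url: str) -> dict[str, str]:
        """Fetch every same-origin page reachable from start_url."""
        origin = urlparse(start_url).netloc
        seen: set[str] = set()
        pages: dict[str, str] = {}
        stack = [start_url]
        while stack:
            url = stack.pop()
            if url in seen:
                continue
            seen.add(url)
            resp = requests.get(url, timeout=10)
            if resp.status_code != 200:
                continue
            pages[url] = resp.text
            # foreach <a href=...>: resolve, and queue same-origin links
            for a in BeautifulSoup(resp.text, "html.parser").find_all("a", href=True):
                link = urljoin(url, a["href"])
                if urlparse(link).netloc == origin and link not in seen:
                    stack.append(link)
        return pages

Ten lines of logic and no reading of anyone's docs, which is exactly the appeal.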


So the solution would be an edge-cached site, structured exactly like the full site, serving just the deltas since a periodic checkpoint?

The crawler still crawls, but can rest assured it has all the info from base+delta, as if it had recrawled everything?
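Something like this on the consumer side, perhaps. The /deltas endpoint, its "since" parameter, and the response shape are hypothetical, just to illustrate the base+delta idea:

    # Hypothetical delta-sync client. The endpoint, the "since" cursor,
    # and the JSON shape {"pages": [...], "deleted": [...], "cursor": ...}
    # are illustrative assumptions, not a real API.
    import requests

    def sync_deltas(base_url: str, mirror: dict[str, str], since: float) -> float:
        """Apply changes since `since` to an existing full mirror; return new cursor."""
        resp = requests.get(f"{base_url}/deltas", params={"since": since}, timeout=10)
        resp.raise_for_status()
        delta = resp.json()
        for page in delta["pages"]:      # pages added or modified since the cursor
            mirror[page["url"]] = page["html"]
        for url in delta["deleted"]:     # pages removed since the cursor
            mirror.pop(url, None)
        return delta["cursor"]           # pass this as `since` on the next sync

The catch is that the crawler has to trust the deltas are complete; one missed change and the mirror silently diverges until the next full crawl.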



