
> It comes down to how you plan to shard data and distribute queries when data doesn't fit on a single node.

A problem everybody would love to have but pretty much nobody actually has.



> A problem everybody would love to have

Except the people who do have it and need to keep their business running off of one postgres instance.


You can have data that fits on one machine and still run multiple instances of postgres in a failover configuration, which will probably cover just about everyone (depending on your filesystem, disk for a single instance is essentially infinite, so I'm not actually sure what bottleneck would motivate you to scale beyond this configuration).
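
To make "failover configuration" concrete, here is a naive sketch of just the promotion step, assuming PostgreSQL 12+ for pg_promote() and hypothetical hostnames. Real failover managers also handle fencing, split-brain prevention, and rerouting clients, which is most of the hard part.

    # Naive sketch only: if the primary is unreachable, promote the standby.
    # Hostnames/DSNs are hypothetical; pg_promote() needs PostgreSQL 12+.
    import psycopg2

    PRIMARY_DSN = "host=db-primary dbname=postgres user=postgres"  # hypothetical
    STANDBY_DSN = "host=db-standby dbname=postgres user=postgres"  # hypothetical

    def primary_is_up():
        try:
            with psycopg2.connect(PRIMARY_DSN, connect_timeout=3) as conn:
                with conn.cursor() as cur:
                    cur.execute("SELECT 1")
            return True
        except psycopg2.OperationalError:
            return False

    def promote_standby():
        conn = psycopg2.connect(STANDBY_DSN)
        conn.autocommit = True
        with conn.cursor() as cur:
            cur.execute("SELECT pg_is_in_recovery()")
            if cur.fetchone()[0]:            # only promote an actual standby
                cur.execute("SELECT pg_promote()")
        conn.close()

    if not primary_is_up():
        promote_standby()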


> I'm not actually sure what bottleneck would motivate you to scale beyond this configuration

It's usually not that the data doesn't fit on one machine but that load on the database exceeds what one machine can serve. A failover configuration might enable you to use the spares for some read operations and take a little load off the primary, but you lose ACID semantics when you do that and it generally doesn't help you for long.
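
For illustration, a minimal sketch of that read-offloading pattern with hypothetical DSNs: writes always go to the primary, reads can go to a standby, and because streaming replication is asynchronous by default the standby may return slightly stale data, which is the consistency trade-off mentioned above.

    # Sketch only; DSNs are hypothetical. Reads served by the standby can lag
    # behind the primary under asynchronous streaming replication.
    import psycopg2

    PRIMARY_DSN = "host=db-primary dbname=app user=app"  # hypothetical
    REPLICA_DSN = "host=db-replica dbname=app user=app"  # hypothetical

    def run_write(sql, params=()):
        with psycopg2.connect(PRIMARY_DSN) as conn:      # writes: primary only
            with conn.cursor() as cur:
                cur.execute(sql, params)

    def run_read(sql, params=()):
        with psycopg2.connect(REPLICA_DSN) as conn:      # reads: standby, may be stale
            with conn.cursor() as cur:
                cur.execute(sql, params)
                return cur.fetchall()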


There is a reason the world moved on from failover architectures.

a) At some point you will have more data or more users than one instance can handle. And instead of simply adding another node, you need to throttle usage in order to do a rolling upgrade, which is far easier said than done and involves impact to the business.

b) With distributed databases you are constantly testing that everything works in Dev, Test etc. environments. With failover you really only test it every now and again, usually before you deploy to Production. And in most companies, which are hopeless at this, the testing is guaranteed to be inadequate.

c) Vendors lie. They promise that failover will just work but in my experience it very often doesn't. Which is another reason why b) is so important: you need to validate their claims.


I mean, I don't like that Postgres is not infinitely scalable, but the whole point is that (a) is not generally true--most companies could probably get by with a single machine's worth of data, or rather, if they have more than one machine's worth of data, those systems probably aren't talking to each other such that they need to be on the same box. Regarding (b) and (c), do you not need to test failure conditions for distributed databases too (this isn't rhetorical, I've only ever used cloud providers' offerings)?


> most companies could probably get by with a single machine's worth of data, or rather if they have more than one machine's worth of data those systems probably aren't talking to each other such that they need to be on the same box

one of the things I would like to see in my lifetime is for it to be easier to "run the whole enterprise" from one box. Sure, it will probably be seriously underpowered and I can't do all things at once, but for most small to mid-size companies, it should be possible to run all our "code" from one machine.

I think of this as some kind of development or pre-qa environment. It really shouldn't be that big of an ask...

I am thinking most, if not all, companies would be able to fit their entire enterprise on a Supermicro A1+ server with two 96-core processors. Sure, there is no machine in the world that can fit all of YouTube's videos, but there is no reason why we can't have YouTube, with a limited set of non-production data, running from just one box. Thoughts?


> one of the things I would like to see in my lifetime is for it to be easier to "run the whole enterprise" from one box. Sure, it will probably be seriously underpowered and I can't do all things at once, but for most small to mid-size companies, it should be possible to run all our "code" from one machine.

I think that's the mainframe idea. There's probably some interesting philosophical question in there about whether or not a data center is just a big mainframe. It sort of feels like it verges on semantics.

> Sure, there is no machine in the world that can fit all of YouTube's videos, but there is no reason why we can't have YouTube, with a limited set of non-production data, running from just one box. Thoughts?

I'm not sure what you mean exactly, but if you're streaming YouTube's traffic through one box (even if that box isn't directly connected to the disks on which the video is stored) you'll run into I/O bottlenecks--such a machine would need to push terabytes per second, which is probably not trivial. Moreover, having a single machine that can handle YouTube's peak traffic probably means you're underutilizing it most of the time.
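
Back-of-envelope, with made-up numbers purely for illustration (not YouTube's real traffic), just to get an order of magnitude for the egress that one box would have to sustain:

    # Hypothetical figures only, for a rough order-of-magnitude estimate.
    concurrent_viewers = 10_000_000   # made up
    avg_bitrate_mbps = 5              # made-up average stream bitrate
    total_tbps = concurrent_viewers * avg_bitrate_mbps / 1_000_000
    print(f"{total_tbps:.0f} Tbit/s ~= {total_tbps / 8:.1f} TB/s")  # 50 Tbit/s ~= 6.2 TB/s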


Sorry I should have been clearer. I didn't mean to replace production. I meant like somewhere I can change maybe a few lines of code and run the entire application end to end.

Basically, somewhere I can run something and feel safe knowing this is production code except this one change I have made.


You have a misconception about the workload of your typical database server. It's not about the amount of data it's storing, it's about

1. compute and memory bandwidth to serve complicated queries

2. IO

You can't scale memory bandwidth beyond some pretty low limit on one machine. You can't scale IO bandwidth beyond some limit. To give you an example, I've seen database servers with 20GB of data being so overloaded by compute requirements of complex queries that they needed to be scaled horizontally.
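
One way to see whether a server is compute-bound rather than storage-bound, sketched with a hypothetical DSN: rank statements by execution time in pg_stat_statements. This assumes the extension is installed and PostgreSQL 13+, where the column is total_exec_time.

    # Assumes pg_stat_statements is installed; the DSN is hypothetical.
    import psycopg2

    with psycopg2.connect("host=db-primary dbname=app user=app") as conn:
        with conn.cursor() as cur:
            cur.execute("""
                SELECT query, calls, total_exec_time
                FROM pg_stat_statements
                ORDER BY total_exec_time DESC
                LIMIT 10
            """)
            for query, calls, total_ms in cur.fetchall():
                print(f"{total_ms:12.1f} ms  {calls:10d} calls  {query[:60]}")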


But, like, this server has two 96-core processors. Moreover, we are talking about a development environment for a single developer and maybe one or two users to try out their changes. It should be good enough, no?

I mean, I expect things to be slower, I guess, but to test for correctness and spec?


Interestingly, when a place does get to the point where the single instance has capacity issues (after upgrading to EPYC and lots of flash drives) then other non-obvious stuff shows up too.

For example, at one place just over a year ago they were well into this territory. One of the weird problems for them was with pgBadger's memory usage (https://github.com/darold/pgbadger). That's written in Perl, which doesn't seem to do garbage collection well. So even on a reporting node with a few hundred GB of RAM, it could take more than 24 hours to do a "monthly" reporting run to analyse PG usage for the time period.

There wasn't a solution in place at the time I left, so they're probably still having the issue... ;)


I consult for a lot of companies and I have never heard of or seen a database that wasn't horizontally scaled.

It's not for scalability reasons, it's for high availability.

Which, as cloud adoption has increased and server uptime has decreased, is even more important.


Some of these arguments and “common knowledge” things are getting old. Everybody scaled up twenty years ago - hell, Amazon used to brag that they used an HP Superdome or whatever.

Anyone with dogmatic opinions about this stuff needs to be taken with a grain of salt. If you scale out PeopleSoft, your accounting system will exceed the value of your company. If you’re worried about webscaling your random app, that’s more navel-gazing than accomplishing anything! :)


Why shard when you can just replicate?


Because replica failover is rarely seamless (and often doesn't actually work at all, IME).


Instinctively that's surprising... replica failover should be far simpler technically, shouldn't it?


No? Replication tends to be a bodged-on mess throughout, full of undertested edge cases, of which failover is definitely one. If you build the system so that nodes joining and leaving is a natural and normal part of operation, well, it naturally works a lot better.
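
A minimal sketch of that idea, with hypothetical node names: a consistent-hash ring, where a node joining or leaving only reassigns the keys adjacent to it instead of reshuffling everything. It ignores replication factor and the actual movement of data.

    # Toy consistent-hash ring; node names are hypothetical, and nothing here
    # moves data -- it only decides which node owns which key.
    import bisect
    import hashlib

    def _hash(value):
        return int(hashlib.sha1(value.encode()).hexdigest(), 16)

    class HashRing:
        def __init__(self, nodes, vnodes=64):
            self._ring = []                      # sorted (hash, node) pairs
            for node in nodes:
                self.add_node(node, vnodes)

        def add_node(self, node, vnodes=64):
            for i in range(vnodes):
                self._ring.append((_hash(f"{node}#{i}"), node))
            self._ring.sort()

        def remove_node(self, node):
            self._ring = [(h, n) for h, n in self._ring if n != node]

        def node_for(self, key):
            idx = bisect.bisect_left(self._ring, (_hash(key),))
            if idx == len(self._ring):           # wrap around the ring
                idx = 0
            return self._ring[idx][1]

    ring = HashRing(["pg-shard-1", "pg-shard-2", "pg-shard-3"])  # hypothetical
    print(ring.node_for("customer:42"))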



