
HA on RDS uses synchronous replication - you won’t lose data on automated failover under any normal circumstances.


Ok that's fine


This is the c-store paper, which evolved into Vertica: http://www.cs.umd.edu/~abadi/vldb.pdf . It's very readable, worth a look.

edit: column stores existed before c-store, but c-store did some very nifty stuff around integrating compression awareness into the query executor


Thank you, I will read the paper!


> If you have a build farm / CI machines, don't use swap. With swap, if a user schedules too many compiles at once, machine will slow to a halt and become kinda-dead, not quite tripping dead timer, but not making any progress either. Instead, set up the OOM priority on the users processes so they are killed first. If OOM hits, clang is killed, build process fails, and we can go on.

This doesn't really work that well. It's true that if you enable swap and have significant memory pressure for any extended period your machine will grind to a halt, but this will _also_ happen if you don't use swap and rely on the Linux OOM killer.

Indeed, despite the lack of swap, as part of trying to avoid OOM killing applications, Linux will grind the hell out of your disk - because it will drop executable pages out of RAM to free up space, then read them back in again on demand. As memory pressure increases, the period of time between dropping the page and reading it back in again becomes very short, and all your applications run super slowly.

An easy solution to this is a userspace OOM-kill daemon like https://facebookmicrosites.github.io/oomd/ . This works on pressure stall information (PSI), so it knows when your system is genuinely struggling to free up memory.
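
To give a flavour of the signal these daemons key off, here's a minimal sketch of reading the kernel's PSI file - this isn't oomd's actual logic, and the threshold is an arbitrary example:

    # Rough sketch of the idea behind PSI-based OOM daemons like oomd/earlyoom.
    # Reads /proc/pressure/memory and flags when the system is spending a large
    # share of its time stalled on memory. Threshold is illustrative only.

    import time

    PSI_PATH = "/proc/pressure/memory"   # needs a kernel with PSI enabled (>= 4.20)
    FULL_AVG10_THRESHOLD = 20.0          # example threshold, in percent

    def read_full_avg10():
        """'full' avg10: % of the last 10s in which all non-idle tasks
        were stalled waiting on memory."""
        with open(PSI_PATH) as f:
            for line in f:
                if line.startswith("full"):
                    fields = dict(kv.split("=") for kv in line.split()[1:])
                    return float(fields["avg10"])
        return 0.0

    while True:
        stall = read_full_avg10()
        if stall > FULL_AVG10_THRESHOLD:
            # A real daemon would pick a victim (by cgroup, oom_score_adj, etc.)
            # and SIGKILL it here; this just reports.
            print(f"memory pressure high: full avg10={stall:.1f}%")
        time.sleep(5)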

On the historical fleets I've worked on pre-oomd/PSI, a reasonable solution was to enable swap (along with appropriate cgroups), but target only allowing brief periods of swapin/swapout - there's a rough sketch of that check after the list below. This gives you two advantages:

* allows you to ride out brief periods of memory overconsumption

* allows genuinely rarely accessed memory to be swapped out, giving you more working space compared to having no swap
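
For the "brief periods of swapin/out" target, something like this is enough to tell a blip from sustained thrashing - just a sketch against Linux's /proc/vmstat counters, with a made-up interval and threshold:

    # Sample the pswpin/pswpout counters and alert on sustained swap traffic.
    # Numbers are examples, not recommendations.

    import time

    def swap_page_counters():
        """Pages swapped (in, out) since boot, from /proc/vmstat."""
        counters = {}
        with open("/proc/vmstat") as f:
            for line in f:
                key, value = line.split()
                counters[key] = int(value)
        return counters.get("pswpin", 0), counters.get("pswpout", 0)

    INTERVAL = 10          # seconds between samples
    RATE_THRESHOLD = 500   # combined pages/second - example only

    prev_in, prev_out = swap_page_counters()
    while True:
        time.sleep(INTERVAL)
        cur_in, cur_out = swap_page_counters()
        rate = ((cur_in - prev_in) + (cur_out - prev_out)) / INTERVAL
        if rate > RATE_THRESHOLD:
            print(f"sustained swap traffic: {rate:.0f} pages/s")  # alert/act here
        prev_in, prev_out = cur_in, cur_out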


Eh, I’ve never seen a machine actually use any notable amount of swap and not be functionally death spiraling.

I’m sure someone somewhere is able to use swap and not have the machine death spiral, but from desktop to servers? It’s never been me.

I always disable swap for this reason, and it’s always been the better choice. Not killing something off ASAP when you get to that point is a losing bargain.


FreeBSD isn't Linux, but I've had FreeBSD machines fill their swap and work just fine for months. I had one machine that had a ram issue and started up with a comically small amount of ram (maybe 4 mb instead of 256 mb... It was a while ago) and just ran a little slow, but it was lightly loaded.

I've also had plenty of machines that fill the swap and then processes either crash when malloc fails, or the kernel kills some stuff (sometimes the wrong thing), or things just hang. Measuring memory pressure is tricky; a small swap partition (I like 512 MB, but limit to 2x RAM if you're running vintage/exotic hardware that's got less than 256 MB) gives you some room to monitor and react to memory usage spikes without instantly falling over, but also without thrashing for long.

You should monitor (or at least look at) both swap used % and pages/second. If pages/second is low, you're probably fine even with a high % used, and you can take your time to figure out the issue; if pages/second is high, you'd better find it quickly.
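
If you want to script that check rather than eyeball it, here's a rough sketch using the third-party psutil library, which should work on both Linux and FreeBSD - note psutil reports cumulative swap traffic in bytes, so pages are approximated, and the thresholds are made up:

    # Check both "how full is swap" and "how fast are we paging", per the
    # comment above. Assumes psutil (pip install psutil) and a 4 KiB page size.

    import time
    import psutil

    PAGE_SIZE = 4096
    INTERVAL = 10  # seconds

    before = psutil.swap_memory()
    time.sleep(INTERVAL)
    after = psutil.swap_memory()

    pages_per_sec = ((after.sin - before.sin) + (after.sout - before.sout)) / PAGE_SIZE / INTERVAL
    used_pct = after.percent

    if pages_per_sec > 1000:          # example threshold
        print(f"swapping hard ({pages_per_sec:.0f} pages/s) - find the culprit now")
    elif used_pct > 80:               # example threshold
        print(f"swap {used_pct:.0f}% used but quiet - investigate at leisure")
    else:
        print("swap looks fine")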


The issue is specific to Linux. I’ve had Solaris and SunOS boxes (years ago) also do fine.


Don't mistake "every machine I've seen death spiraling was using swap" for "every machine using swap is death spiraling". Notably, how many machines did you never have to look at, because the swap was doing just fine?


That I’ve administered? None under any significant load!

I finally disabled it even on the lab Raspberry Pis, and on an SBC I use to rclone 20+ TB NVR archives, due to performance problems it was causing.

It’s a pretty consistent signal actually - if I look at a machine and it’s using any swap, it’s probably gotten wonky in the recent past.


Apologies. I forgot I had posted something. :(

I am a little surprised that every machine you admin has had issues related to swap. Feels high.

For the ones that are now using swap and likely went wonky before, how many would have crashed due to said wonkiness?


There are plenty of workloads which sometimes just spike.

Batch processes, for example.

With proper monitoring you can actually act on it yourself, instead of just restarting, which just leads to an OOM loop.


If you pushed something to swap, you didn’t have enough RAM to run everything at once. Or you have some serious memory leaks or the like.

If you can take the latency hit to load what was swapped out back in, and don’t care that it wasn’t ready when you did the batch process, then hey, that’s cool.

What I’ve had happen way too many times is something like the ‘colder’ data paths on a database server get pushed out under memory pressure, but the memory pressure doesn’t abate (and rarely will it push those pages back out of swap for no reason) before those cold paths get called again, leading to slowness, leading to bigger queues of work and more memory pressure, leading to doom loops of maxed out I/O, super high latency, and ‘it would have been better dead’.

These death spirals are particularly problematic because they're not 'dead yet' (and may never be so dead that they won't, for instance, accept TCP connections), so they de facto kill services in ways that are harder to detect and repair, and take way longer to do so, than if they'd just flat out died.

Certainly won’t happen every time, and if your machine never gets so loaded and always has time to recover before having to do something else, then hey maybe it never doom spirals.


I try to avoid swap for latency critical things.

I do a lot of CI/CD where we just have weird load, and it would be a waste of money/resources to just shell out for the max memory.

Another example would be something like Prometheus: when it crashes and reads the WAL, memory spikes.

Also, it's probably an unsolved issue to tell applications how much memory they're actually allowed to consume. Java has direct buffers, heap, etc.

I have plenty of workloads where I prefer to get an alert and act on that, instead of handling broken builds etc.


I think the key here is what you mean by using swap. Having a lot of data swapped out is not bad in and of itself - if the machine genuinely isn't using those pages much, then now you have more space available for everything else.

What's bad is a high frequency of moving pages in and out of swap. This is something that can cause your machine to be functionally unavailable. But it is important to note that you can easily trigger somewhat-similar behaviour even with swap disabled, per my previous comment. I've seen machines without swap go functionally unavailable for > 10 minutes when they get low on RAM - with the primary issue being that they were grinding on disk reloading dropped executable pages.

I agree that in low memory situations killing off something ASAP is often the best approach, my main point here is that relying on the Linux OOM killer is not a good way to kill something off ASAP. It kills things off as a last resort after trashing your machine's performance - userspace OOM killers in concert with swap typically give a much better availability profile.


100% agree.

In a situation where a bunch of memory is being used by something that is literally not needed and won’t be needed in a hurry, then it’s not a big deal.

In my experience though, it’s just a landmine waiting to explode: someone will touch it and bam, you have a useless and often difficult-to-fix machine, usually at the most inconvenient time. But I also don’t keep things running that aren’t necessary.

If someone puts swap on something with sufficiently high performance, then obviously this is less of a concern too. Have a handful of extra NVMe or fast SSD lying around? Then ok.

I tend to be using those already though for other things (and sometimes maxing those out, and if I am, almost always when I have max memory pressure), so meh.

I’ve had better experience having it fail early and often so I can fix the underlying issue.


When I re-enabled swap on my desktop (after running without swap for years assuming that would avoid the death spiral, only to find out it was almost always worse, because there was no spiral: the whole system just froze almost immediately), it would frequently hold about 25% of my RAM capacity with the system working perfectly fine. That's probably more an indication of how much memory many desktop apps hold onto without actually using it than anything else, but it was useful. In my experience, if you want a quick kill in low-memory situations you need to run something like earlyoom to kill the offending process before the kernel desperately tries to keep things running by swapping out code pages and slowing the system to a crawl.


It's only one datapoint, but at this very moment a server at work is using a notable amount of swap, 1.5 GiB to be more precise, while functioning perfectly normally.

    $ free -h
                  total        used        free      shared  buff/cache   available
    Mem:          3.9Gi       1.7Gi       573Mi       180Mi       1.6Gi       1.7Gi
    Swap:         4.0Gi       1.5Gi       2.5Gi


I wish you luck! The only time that’s happened for me was with memory leaks, and it didn’t go very long before death spiraling. But if you’re comfortable with it, enjoy.


It's still working just fine, with approximately the same amount of swap still in use.


> Eh, I’ve never seen a machine actually use any notable amount of swap and not be functionally death spiraling.

For my low-end notebook with solid-state storage I set the kernel's swappiness setting to 100 and this problem got magically fixed. It's rock-solid now.

I don't know how it works but it does.


It's pretty common for me to see a gig or two in swap, never really wanted back, and that RAM used for disk caching instead.


I think "Linux drops will drop executable pages without swap" is a symptom of machines with small amount of memory, say 4G or less. So it is pretty outdated for regular servers, and probably only relevant when you are saving money by buying tiny VMS.

Those build servers had at least 64GB of RAM, while the executables were less than 1GB (our entire SDK install was ~2.5GB, and it had much more stuff than just clang). So a machine would need to be finely balanced on memory pressure: high enough to cause clang to be evicted, but low enough to avoid the OOM killer's wrath.

I don't think this is very likely on machines with a decent amount of memory.


Fair enough - I've seen it more commonly in smaller machines, but they're also more common in the fleets I've observed (and the ones that are more likely to run close to the edge memory-wise). I have also seen it in systems with up to 32GB of RAM, so it's by no means a non-issue in systems that are at least somewhat larger. The observation that oomd/earlyoom + swap is a better solution than no swap still generally applies.


just a note that in newer versions of PG I believe partition changes no longer require an access exclusive lock on the parent table, which I'm looking forward to when we upgrade...


As they note, table lock acquisition can be super painful. It's possible to write a script that uses pg_stat_activity to detect that the table alteration is waiting on a lock and kill the conflicting work. For some workloads the kill is unacceptable, but on highly active systems killing off a few operations is usually much less damaging than letting sessions pile up on a lock wait.
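
A rough sketch of the kind of script I mean, using psycopg2 - the connection string and the way the DDL session is identified are placeholders, and you need to be sure you're happy killing the blockers' work:

    # Find our ALTER TABLE session if it is stuck waiting on a lock, and
    # terminate whatever is holding the locks it needs.

    import time
    import psycopg2

    DSN = "dbname=mydb user=admin"   # placeholder connection string
    DDL_MARKER = "ALTER TABLE%"      # placeholder way to spot our own DDL

    conn = psycopg2.connect(DSN)
    conn.autocommit = True

    with conn.cursor() as cur:
        while True:
            cur.execute(
                """
                SELECT pid, pg_blocking_pids(pid)
                FROM pg_stat_activity
                WHERE query LIKE %s
                  AND wait_event_type = 'Lock'
                """,
                (DDL_MARKER,),
            )
            for ddl_pid, blockers in cur.fetchall():
                for blocker in blockers:
                    cur.execute("SELECT pg_terminate_backend(%s)", (blocker,))
                    print(f"terminated backend {blocker} blocking DDL pid {ddl_pid}")
            time.sleep(1)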

Personally I would love it if it were possible to set a mode on postgres whereby access exclusive locks break other locks and cause the transactions holding them to roll back (similar to what happens with vacuums).


Particularly in the dynamo case, you're working outside of a common buffer pool. One of the key benefits of normalization in a typical db is that you can fit more stuff into memory if you normalize - dynamo renders that point largely moot.


That's only one benefit of normalization.

In our case many of our objects had redundant data (aka denormalized), so updates required multiple calls to the DynamoDB service. By normalizing we saw throughput gains in our application and reduced service calls by making fewer trips. Additionally, we had conflated a couple of our domain-specific concepts in the data model, and by splitting what was actually two independent entities that had been modeled as one, we reduced the absolute record count.

I describe these optimizations as "making the data smaller" and "normalization".


To confirm, RDS does support in-place major version upgrades for Postgres.


The query hint thing is a serious issue. I get the concern that people won't report issues and so on, but on heavily loaded DBs, having no decent operational response to a bad plan change is really scary. In an ideal world, something like Oracle's plan stability would be pretty helpful.


> That's a language design issue.

Agreed. And, also, who cares? Yes, Java is a mediocre language and way too verbose. I spend a bunch of time in Java and I don't notice the worst of that because I (like literally every Java dev I know) use an IDE. You shouldn't separate a language from the standard tooling it comes with/enables - they're intertwined.


For large codebases, the code exploration features are priceless. Call hierarchy is an absurd productivity booster.


Using this right now to break apart a 5k-line file. Being able to see the call hierarchy of all the functions allows me to move over the functions with fewer dependencies first and slowly work towards the ones that are heavily intertwined. I can see it all at a glance without having to jump from one function to another over and over.

