Systemd as tragedy

jimrandomh · on Jan 29, 2019

Back in 2016, systemd started killing user processes on logout (rather than send them the SIGHUP signal, as POSIX says should happen). This caused problems for programs like nohup, screen and tmux, which deliberately keep running. Systemd's response was to say that they should incorporate systemd's library, and use systemd's new daemonization API. As far as I know, none of them did.

Two years later, you can find hundreds of support requests across the internet, from frustrated users who are having their sessions killed by systemd.

Bugs are annoying, but that's life. On the other hand, when you're an impacted user who's lost work, and researching the bug leads you to a years-old discussion in which someone is actively denying that the bug exists and refusing to fix it, that's infuriating. I don't think systemd's developers deserve the trust that maintaining a core piece of infrastructure requires; they don't seem to care enough about whether they've broken things.

poettering · on Jan 29, 2019

You know, because we knew this would be controversial we made sure it was both a compile-time option and a runtime option. Yes, both default to on upstream, but that's just upstream. We made it very easy and supported for downstream distros to switch between opt-out and opt-in of this option for their users. We have encouraged distributions to leave it on, but we were fully aware that for compatibility reasons this is something downstreams likely wanted to turn off, and most compat-minded distros did, as we expected.

Now I am used to taking blame for apparently everything that ever went wrong on Linux, but you might as well blame your downstream distros for this as much as you want to blame us upstream about this, as it's up to them to pick the right compile-time options matching their userbase and requirements in compatibility, and if they didn't do that to your liking, then maybe you should complain to them first.

(And yes, I still consider it a weakness of UNIX that "logout" doesn't really mean "logout", but just "maybe, please, if you'd be so kind, i'd like to exit, but not quite". I mean, that's not how you build a secure system. We fixed that really, fully knowing it would depart from UNIX tradition, but that's why we made it both compile-time and runtime configurable)

(Also, nobody has to "incorporate" systemd's library to avoid the automatic clean-up. In fact, there's no library we provide that could do that. What was requested though is to either run things as child of systemd --user or just register a separate PAM session, neither of which requires any systemd-specific library.)

Lennart

inferiorhuman · on Jan 29, 2019

> Now I am used to taking blame for apparently everything that ever went wrong on Linux, but you might as well blame your downstream distros for this as much as you want to blame us upstream about this, as it's up to them to pick the right compile-time options matching their userbase and requirements in compatibility, and if they didn't do that to your liking, then maybe you should complain to them first.

It's up to you as a systemd developer to pick sane defaults. Claiming that it's okay to introduce opt-out breaking changes upstream and then abdicate responsibility is quite a bit like walking around while waving your hands and arms around and then blaming whoever you hit for walking into you.

poettering · on Jan 29, 2019

Well. What is a distro for then if not for picking the most highlevel of defaults suitable for them?

inferiorhuman · on Jan 29, 2019

> Well. What is a distro for then if not for picking the most highlevel of defaults suitable for them?

IOW the distros maintainers made a mistake by picking systemd? Agreed.

sfilargi · on Jan 29, 2019

You are right. Distros failed us completely by choosing systemd.

izacus · on Jan 29, 2019

Killing software that might be running after a valid login session is a sane default.

inferiorhuman · on Jan 29, 2019

And that's what SIGHUP is for. The process will exit by default. If that's not the desired behavior a handler can be registered. Killing things that are explicitly designed to run after logout is a piss poor default.
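A minimal sketch of the mechanism described here, in Python. The handler name and the `received` list are illustrative and not taken from any of the programs discussed. The default action for SIGHUP is to terminate the process; a program that wants to keep running registers its own handler instead.

```python
import os
import signal

received = []  # records each signal number delivered to this process

def on_hangup(signum, frame):
    # Hypothetical handler: note the hangup and carry on instead of exiting.
    received.append(signum)

# Replace the default SIGHUP action (terminate) with our handler.
signal.signal(signal.SIGHUP, on_hangup)

# Simulate a hangup by signalling ourselves; the process keeps running.
os.kill(os.getpid(), signal.SIGHUP)
```

Without the `signal.signal(...)` call, the same `os.kill` would end the process, which is the "exit by default" behaviour mentioned above.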

poettering · on Jan 29, 2019

We send SIGHUP btw. The kernel's own sending of SIGHUP is bound to the TTY concept btw, which is specific to TTY logins only, not graphical ones.

That said the question is not so much about who sends what, but more about whether a secure system should allow user code to escape lifecycle management or whether logging out means logging out and giving up all resources.

tyingq · on Jan 29, 2019

I get what you're saying. However, I'd probably apply the kernel rule of "when maintaining the kernel, do not do something which breaks user programs/applications". Yes, this isn't the kernel, but it's comparable in being a core function that heavily affects userland stuff.

pas · on Jan 30, 2019

Sometimes the ole way o' logging out is just insecure. And there is no way to conjure up a new backward compatible and secure way. cgroups work well, especially because they are not opt-in. That means daemonizing programs either have to set themselves up as a system service or start a new logind scope (or PAM session, etc., which translates to escaping the cgroup, which requires user approval to remain secure).

inferiorhuman · on Jan 29, 2019

> more about whether a secure system should allow user code to escape lifecycle management

Please stop trotting that tired old line out. It is simply untrue. Systemd does the exact opposite of providing increased security. If nothing else the greatly increased surface area of systemd makes for a less secure system.

The pwnie articulates a number of other ways in which your code and your behavior are actively reducing the security of Linux.

onlydeadheroes · on Jan 30, 2019

I know right, I run openvpn as user nobody and I keep thinking that nobody user better stay logged in!

v_lisivka · on Jan 29, 2019

If you created a problem, it's your duty to provide a workaround or a solution to the problem. Why not provide systemd specific version of `nohup` for such cases and encourage users to use it instead of old and insecure version?

tyingq · on Jan 29, 2019

This. There's a reason the de facto way to keep running post logout was named "nohup". This wasn't some deep dark unknown secret behaviour that was broken.

zbentley · on Jan 30, 2019

It was called that because connected pty devices could hang up. Whether hanging up due to intentional logout or actually hanging up the modem was, and is, left as an exercise to the user. Unless we try to disambiguate it via login/pty manager programs, that is.

v_lisivka · on Jan 29, 2019

Because 1) maintainer can be overloaded, so (s)he will stick to defaults, 2) maintainer needs a logical reason to change default setting to something else, which is not obvious in most cases. Maintainer is not a QA team.

zpallin · on Jan 29, 2019

Look, it's everyone's responsibility, this doesn't just fall on Systemd. While it's clear that Systemd made some difficult changes to how user processes operate, it still performed the due diligence of providing the original behavior as configurations. They should reconfigure their tools. If they're not doing that, then it's not necessarily Systemd's fault that things don't work for sysadmins trying to use their tools.

youdontknowtho · on Jan 29, 2019

Wait a minute. Why isn't it the distro's responsibility to choose the most compatible defaults?

bjourne · on Jan 29, 2019

Isn't it more efficient if 1 upstream picks the sane defaults rather than N distros? The situation was exactly the same when PulseAudio was introduced in Ubuntu. Audio broke for a huge amount of users and according to upstream it was because they had configured it wrongly...

IMO, it is part and parcel of designing great software that you pick as universally agreeable defaults as possible.

darkpuma · on Jan 29, 2019

It's the responsibility of both to pick sane defaults. When the software developer picks insane defaults they are being antisocial, those distro packagers are people too and developers who pick insane defaults are causing unnecessary grief for packagers.

inferiorhuman · on Jan 29, 2019

If you smell shit while walking down the street, maybe someone dropped a deuce on the sidewalk. If you smell shit everywhere you go, maybe it's you, maybe you shat your pants. When you violate the principle of least astonishment you're creating a huge stink.

That you can configure systemd to behave in a less obnoxious manner is well beside the point. Systemd should be unobtrusive and predictable without any extra action on the part of the distribution folks or end users.

That the suggestion is to simply read the code or documentation is the height of arrogance considering how sloppy and insecure the systemd code is (parse error equals root privileges? come on…).

SahAssar · on Jan 29, 2019

Your argument assumes that systemd is simply meant to be an in-place compatible drop-in for what it replaces, which I don't think is something anyone would/should expect. If systemd was meant to behave the exact same way as the systems it is replacing then there wouldn't be much point to it. For those cases it sometimes will break things, and will sometimes have settings to follow previous behavior.

inferiorhuman · on Jan 29, 2019

There's plenty of room within the POSIX specs to address service management without requiring kernel integration, breaking userland tools, etc. When your init replacement manages to interfere with the kernel you've done something very, very wrong.

SahAssar · on Jan 29, 2019

Not sure if I missed something here but how has it interfered with the kernel? AFAIK it has broken some userland tools (which is bad in itself in most cases), but actually breaking kernelspace is not something I've heard of.

inferiorhuman · on Jan 30, 2019

https://igurublog.wordpress.com/2014/04/03/tso-and-linus-and...

> Yet just two days ago, we see Linus Torvalds (the creator of Linux and maintainer of the Linux kernel), launching into a tirade against – yes, you guessed it – systemd developers because of their atrocious response to a bug in systemd that is crashing the kernel and preventing it from being debugged. Linus is so upset with systemd developer Kay Sievers (gee, where I have heard that name before – oh, that’s right, he’s the moron who refused to fix udev problems) that Linus is threatening to refuse any further contributions from this Red Hat developer, not just because of this bug, but because of a pattern of this behavior – a problem for Kay because Red Hat is also foaming at the mouth to have their kernel-based, no doubt bug- and security-flaw-ridden D-Bus implementation included in our kernels. Other developers were so peeved that they suggested simply triggering a kernel panic and halting the system when systemd is so much as detected in use.

The key phrase there is:

> a bug in systemd that is crashing the kernel and preventing it from being debugged

Honestly though when you get Linus flaming your behavior you're doing something really wrong.

Per_Bothner · on Jan 30, 2019

_Honestly though when you get Linus flaming your behavior you're doing something really wrong._

Haven't been around here long, have you? :-)

yellowapple · on Feb 1, 2019

Likewise, of course, or you'd know that the tirades were more often than not in response to things that were indeed "really wrong" (at least by his standards).

inferiorhuman · on Jan 30, 2019

Yeah I know Linus likes to go on a good tear. But I'm not talking about flaming your code or design decisions, but flaming your behavior.

youdontknowtho · on Jan 31, 2019

from 2014. I'm only pointing it out to make it clear that the post wasn't recent. Not questioning anything else about it.

pas · on Jan 30, 2019

Some distros focus on user convenience some on security. Different defaults are required.

And sometimes security requires breaking compatibility.

jimrandomh · on Jan 29, 2019

There's a bug here, which impacts end users: a variety of programs which are clearly intended to persist in the background (nohup, tmux, etc) are failing to persist. This is a real bug. We care about it. I won't be satisfied until it appears that the bug is on track to be fixed, and a lot of other people won't either.

The options for fixing the bug are:

* nohup, tmux, emacs, etc all take dependencies on systemd and use the new systemd daemonization procedure. This is not a viable path because the maintainers of those utilities have refused (see https://github.com/tmux/tmux/issues/428), and because there are too many of them.

* Each distro separately works around the problem by maintaining forks of nohup, tmux, etc. This is not a viable solution because it's way too many forks; people will be finding broken distro+utility pairs forever.

* Each distro separately works around the problem by putting loginctl enable-linger in /etc/profile and setting KillUserProcesses=no. This would effectively be overruling systemd's decision. Some distros won't know they need to do this, and the github systemd repo becomes a trap.

* Or: systemd backs down and changes the defaults so that the old daemonization APIs work again.

If you have a fifth option, we'd all love to hear it. But the status quo is that there's a user-facing bug, and the bug is still there. Rather than make the case for it not being a bug, you're currently making the case for it being someone else's bug, but the "someone else" doesn't actually have the power to fix it. You are the only one with the power to fix this bug.
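To make the third option above concrete, here is a rough Python sketch of reading the `KillUserProcesses` setting from logind.conf-style text. It assumes the key sits in a `[Login]` section and that the absence of the key means "on", matching the upstream default described earlier in the thread. The function name and sample text are hypothetical, and a real system also merges drop-in files and compile-time defaults, which this does not model.

```python
import configparser

# Sample logind.conf-style contents: the opt-out a distro might ship.
# ASSUMPTION: illustrative only, not a statement about any real distro.
SAMPLE_LOGIND_CONF = """\
[Login]
KillUserProcesses=no
"""

def kills_user_processes(conf_text: str, default: bool = True) -> bool:
    """Return the KillUserProcesses value found in conf_text.

    Falls back to `default` when the section or key is missing.
    """
    parser = configparser.ConfigParser()
    parser.optionxform = str  # keep key names case-sensitive, as written
    parser.read_string(conf_text)
    return parser.getboolean("Login", "KillUserProcesses", fallback=default)

print(kills_user_processes(SAMPLE_LOGIND_CONF))
```

A distro that ships nothing gets the fallback, which is the point of contention: the behaviour a user sees depends on whether someone downstream remembered to write that one line.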

inferiorhuman · on Jan 29, 2019

> If you have a fifth option, we'd all love to hear it.

Replace systemd with something else.

NullPrefix · on Jan 31, 2019

There's literally nothing wrong with OpenRC

pas · on Jan 30, 2019

Devuan

apple4ever · on Jan 30, 2019

I don't understand the issue. systemd offers the option to override the default. It's literally a config. If it's such a big deal, why don't the distros just override it? It's a one-time change.

inferiorhuman · on Jan 29, 2019

> And yes, I still consider it a weakness of UNIX that "logout" doesn't really mean "logout", but just "maybe, please, if you'd be so kind, i'd like to exit, but not quite". I mean, that's not how you build a secure system.

As an aside this is the height of arrogance to suggest that the systemd is somehow a more secure alternative. Lest this be considered an empty ad hominem attack, let me quote the pwnie you won in 2017[1]:

> Where you are dereferencing null pointers, or writing out of bounds, or not supporting fully qualified domain names, or giving root privileges to any user whose name begins with a number, there's no chance that the CVE number will be referenced in either the change log or the commit message. But CVEs aren't really our currency any more, and only the lamest of vendors gets a Pwnie!

1: https://pwnies.com/archive/2017/winners/#lamestvendor

arpa · on Jan 29, 2019

> giving root privileges to any user whose name begins with a number

https://github.com/systemd/systemd/issues/6237

oh my god, what a spectacular issue. And, seriously, Poettering's response is basically "not my job" and "not a bug". And this person develops something that sits at the core of a modern linux system...

inferiorhuman · on Jan 29, 2019

> oh my god, what a spectacular issue. And, seriously, Poettering's response is basically "not my job" and "not a bug". And this person develops something that sits at the core of a modern linux system...

All the while Lennart claims that he's making Linux more secure. FFS.

Edit: I forgot about this

https://igurublog.wordpress.com/2014/04/03/tso-and-linus-and...

> He (Theodore Ts’o) goes on to describe how he previously had to neuter policykit’s security (rendering his system very vulnerable) just to get his system working, and how he has found systemd "very difficult sometimes to figure out".

And:

> As for Kay Sievers, maybe he should rename himself to Kay Sewers, because that’s exactly what he smells of. He told to IETF internet area director and previously DHCP working group co-chair “Tod Lemon” to lmgtfy when he asked about a systemd related git repository.

This gem sums it up perfectly though:

> Yet just two days ago, we see Linus Torvalds (the creator of Linux and maintainer of the Linux kernel), launching into a tirade against – yes, you guessed it – systemd developers because of their atrocious response to a bug in systemd that is crashing the kernel and preventing it from being debugged. Linus is so upset with systemd developer Kay Sievers (gee, where I have heard that name before – oh, that’s right, he’s the moron who refused to fix udev problems) that Linus is threatening to refuse any further contributions from this Red Hat developer, not just because of this bug, but because of a pattern of this behavior – a problem for Kay because Red Hat is also foaming at the mouth to have their kernel-based, no doubt bug- and security-flaw-ridden D-Bus implementation included in our kernels. Other developers were so peeved that they suggested simply triggering a kernel panic and halting the system when systemd is so much as detected in use.

the_why_of_y · on Jan 29, 2019

Only the root user can put such an invalid unit file into a directory where systemd will read it - what is the security impact exactly?

benchaney · on Jan 29, 2019

The security impact is that if you allow a user to choose their own username, and you use a standard POSIX specified way of verifying that the username is valid, and at any point in time you run a service as that user, an attacker can gain root privileges.
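The mismatch described here can be sketched in Python. The first pattern follows the POSIX portable filename character set with a leading hyphen disallowed. The second is an ASSUMED stricter rule (first character must be a letter or underscore) standing in for whatever check rejects digit-leading names; it is not copied from systemd's source. All names below are hypothetical.

```python
import re

# POSIX portable user name: letters, digits, '.', '_', '-',
# with '-' not allowed as the first character.
POSIX_PORTABLE_NAME = re.compile(r"[A-Za-z0-9._][A-Za-z0-9._-]*")

# ASSUMPTION: a stricter rule that refuses a leading digit.
STRICT_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*")

def classify(name: str) -> tuple[bool, bool]:
    """Return (accepted by the POSIX rule, accepted by the stricter rule)."""
    posix_ok = POSIX_PORTABLE_NAME.fullmatch(name) is not None
    strict_ok = STRICT_NAME.fullmatch(name) is not None
    return posix_ok, strict_ok

for candidate in ("alice", "0day", "-dash"):
    print(candidate, classify(candidate))
```

A name such as `0day` passes the first check and fails the second. The disagreement is harmless if the stricter component refuses to start the service, and dangerous if it silently falls back to running as root.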

inferiorhuman · on Jan 29, 2019

Or if you have a package that generates a service user whose name starts with a digit. Then you'll be running an arbitrary service as root, in which case any vulnerabilities become that much more serious. Or have things regressed so much with systemd that the standard is now to verify each and every thing you have the init system do?

The other problem is, of course, the utter lack of understanding Lennart demonstrates by being so dismissive and the increased potential for systemd to be hiding future security vulns.

youdontknowtho · on Jan 29, 2019

You know it's open source and that you could actually get involved? If you submit a pull request and it doesn't get merged you can take your concerns to the larger group.

As to the stuff mentioned in the pwnie. Those sound like great contributions that would be appreciated.

You could also take your concerns to the distro development group. If that doesn't work you could also customize your distro with a custom build of systemd.

If you still don't get satisfaction you can stop using it.

If you dislike how they do things you have options. Or, you could just be mean on a forum...

sametmax · on Jan 29, 2019

For what it's worth, systemd makes my life easier.

When I switch distro, it's almost always systemd, and not the system du jour, so I know how it works. Creating service files is a google query away, and makes common use cases a breeze, while advanced features that were hard to bash script yourself into, are now just a few options to type.

I understand that many people may have problems with systemd for their particular situation, but that's not my experience.

As a dumb user with a few laptops and servers that needs an occasional daemon, I'm glad systemd won. I know you get a lot of heat since it came out, so thank you for working on it.

nine_k · on Jan 30, 2019

Sure, systemd solves a number of real problems. This is good.

What is not as good: (1) systemd takes over or duplicates functionality not related directly to its primary purpose, and (2) is not solid enough to trust it in a number of cases, while (3) the developers' attitude does not give a lot of hope that the situation will materially improve.

(Of course, I run a distro without systemd.)

gerbilly · on Jan 29, 2019

> I still consider it a weakness of UNIX that "logout" doesn't really mean "logout"

Ok, but UNIX and its behaviour have evolved over forty years, and users have a certain set of expectations about it.

Also, it should be noted, systems like UNIX are cultural artifacts. The way they are is the result of forty years of back and forth debate and negotiation and eventually compromise.

I can't speak for all of them, but I think that people that are bothered by systemd are upset that all of history has been brushed aside to make place for the preferences of just a few influential developers.

Whether a feature like logout is "logical" or not is beside the point. Operating system design isn't just about logic, it's about serving users.

pas · on Jan 30, 2019

Yes, indeed, it's not about logic, as those same users cheer Linux instead of sticking with BSD, and then complain about not being UNIX enough.

toomim · on Jan 29, 2019

That was the point of OP's article. That it's hard to change.

sfilargi · on Jan 29, 2019

Completely agree. The problem is not upstream, but downstream. Distros should have done better job and chosen a better default system manager and not systemd.

You build your software the way you want and like. If others don’t like that it breaks POSIX they should stop using it instead of complaining. Or fork it.

majewsky · on Jan 29, 2019

> What was requested though is to either run things as child of systemd --user or just register a separate PAM session

When you run your screen or tmux below `systemd --user`, you still would have to `loginctl enable-linger`, no? I remember having to do that when I set up a PulseAudio server on a headless machine where I don't maintain an active session.
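For reference, a small Python sketch of the two commands being discussed. It only builds the argument lists and prints them, so nothing is started. It assumes `systemd-run` and `loginctl` are the tools in use; the helper names, session name and user name are hypothetical. Whether lingering is also needed in a given setup is the open question here, and the sketch does not settle it.

```python
import shlex

def scope_command(program: list[str]) -> list[str]:
    """Argument list that starts `program` in its own scope under the user manager."""
    return ["systemd-run", "--user", "--scope", *program]

def linger_command(user: str) -> list[str]:
    """Argument list that enables lingering for `user`."""
    return ["loginctl", "enable-linger", user]

# Print the commands rather than running them.
print(shlex.join(scope_command(["tmux", "new-session", "-s", "work"])))
print(shlex.join(linger_command("alice")))
```

Passing either list to `subprocess.run(...)` would execute it; printing first makes it easy to check what would be run on a machine you care about.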

jjolla · on Feb 10, 2019

> still consider it a weakness of UNIX that "logout" doesn't really mean "logout" ... I mean, that's not how you build a secure system

so, unix has been running for 20+ years laden with this security flaw? strange that nobody has been screaming out to plug it all this time.

this feels like you have a bee in your bonnet that it is not a very 'pure' logout by some interpretation of what a "logout" should be. imho, "logout" should mean what it has always meant in the past.

tasuki · on Jan 29, 2019

Lennart, thanks for the information. Mind explaining why you chose to kill user processes on logout as the default?

poettering · on Jan 29, 2019

I think my comment above explained that already.

belorn · on Jan 29, 2019

I think tasuki is asking you to elaborate a bit further on what kind of security issues you have solved by not using the SIGHUP signal. I would personally also like to hear more in-depth details, preferably with some examples of security vulnerabilities that were caused by that POSIX design choice.

poettering · on Jan 29, 2019

Well, this boils down to: in a modern operating system, is it good design that an unprivileged user who logs in once can consume arbitrary runtime resources uncontrolled, unbounded forever, even after logout just because they decided to mask SIGHUP? I think not, I think the system should default to behaviour where unprivileged processes are clearly lifecycle bound, and when the user's sessions end they end comprehensively. I mean, other OSes don't really allow this unprivileged either, for good reasons: the lifecycle of the unpriv user's processes should be controlled by privileged code, and clearly be defined by the act of logging in and logging out in its lifetime.

It's entirely OK if the admin then opts out specific users or even all users from this behaviour, i.e. if a privileged player decides to liberalize unbounded, unlifecycled resource consumption for unprivileged players. But a default where unprivileged code can just stick around uncontrolled and consume as much as it wants forever is just a strange choice security wise.

i.e. I think the fact that SIGHUP masking is unrestricted, i.e. is not subject to privilege checks is the problem really. Something is unpriv by default that should be priv by default. And that's pretty much what this option in systemd provides you with.
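The point about SIGHUP masking needing no privileges can be illustrated with a short Python sketch. It starts a child process that either ignores SIGHUP or keeps the default action, sends it a hangup, and reports whether the child is still alive. The function name, the one-second timeout and the 30-second sleep are arbitrary choices for the illustration; nothing here calls into systemd.

```python
import signal
import subprocess
import sys

# Source of the child process; {disposition} is SIG_IGN or SIG_DFL.
CHILD_TEMPLATE = (
    "import signal, time\n"
    "signal.signal(signal.SIGHUP, signal.{disposition})\n"
    "print('ready', flush=True)\n"
    "time.sleep(30)\n"
)

def survives_hangup(ignore: bool) -> bool:
    """Start a child, send it SIGHUP, and report whether it is still running."""
    disposition = "SIG_IGN" if ignore else "SIG_DFL"
    child = subprocess.Popen(
        [sys.executable, "-c", CHILD_TEMPLATE.format(disposition=disposition)],
        stdout=subprocess.PIPE,
    )
    try:
        child.stdout.readline()           # wait until the disposition is set
        child.send_signal(signal.SIGHUP)  # simulate the hangup
        try:
            child.wait(timeout=1.0)
            return False                  # child exited: the hangup ended it
        except subprocess.TimeoutExpired:
            return True                   # still running: the hangup was ignored
    finally:
        if child.poll() is None:
            child.kill()                  # clean up the surviving child
        child.wait()
        child.stdout.close()
```

Calling `survives_hangup(True)` as an ordinary user shows the child outliving the signal, which is essentially what `nohup` arranges before running a command; no privilege check is involved at any step.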

mfritsche · on Jan 29, 2019

> Well, this boils down to: in a modern operating system, is it good design that an unprivileged user who logs in once can consume arbitrary runtime resources uncontrolled, unbounded forever, even after logout just because they decided to mask SIGHUP?

This was well known and accounted for where necessary. You considered everyone else to be wrong about the issue and went ahead and fixed it according to your opinion. Don't be surprised that a considerable portion of "everyone" doesn't agree with you.

pas · on Jan 30, 2019

> This was well known and accounted for where necessary.

Could you please explain that in a bit more detail?

dooglius · on Jan 29, 2019

> is it good design that an unprivileged user who logs in once can consume arbitrary runtime resources uncontrolled, unbounded forever

An unprivileged user can still do this by setting up an intermediary box that keeps a persistent ssh session open. Incidentally, this is exactly what I plan to do if I ever need to ssh into a server with KillUserProcesses=yes.

> other OSes don't really allow this unprivileged either

On Windows, if I remote desktop from a laptop into a desktop, and start a web server, then shut down the laptop, the server stays running. On iOS if I start drafting an email, and reboot my phone, I don't lose my work. On ChromeOS, my tabs will stick around after a system crash. The world is moving toward processes being _more_ persistent, not less.

arwineap · on Jan 29, 2019

Windows has a different concept for services and processes. All of your processes are killed when you logout

pas · on Jan 30, 2019

If you already have a middle box, then great, but usually malware (eg a nasty Chrome extension) likes to stick around to snoop on user activity. (Preferably on all user activity, forever.)

inferiorhuman · on Jan 30, 2019

> If you already have a middle box, then great, but usually malware (eg a nasty Chrome extension) likes to stick around to snoop on user activity. (Preferably on all user activity, forever.)

Well I'm certainly seeing why people get so frustrated with systemd junkies. Killing a "rogue" Chrome extension doesn't provide any meaningful form of security. There's no privilege escalation in play here. Whatever snooping it could do with you logged out could be done when you're logged in. Snooping on all users? Yeah, not going to happen without privilege escalation (which systemd will happily provide). So while systemd introduced this obnoxious behavior that broke all sorts of commonly used utilities no benefit was gained (except perhaps reinventing the wheel).

Meanwhile if you're worried about security don't forget that systemd has introduced a number of denial-of-service vectors (including one that results in a kernel panic) as well as an actual privilege escalation bug (which, in a fit of irony, could've been mitigated significantly by respecting return value tradition of zero = success). Take a look at the privilege escalation bug remedy, the vuln was due entirely to breathtakingly sloppy code. I'm ignoring the whole dereferencing unchecked pointers thing because that's such laughably bad practice I don't even know where to begin. Then take a look at Lennart's response and his unwillingness to mention CVEs anywhere.

The end result is that you have a combination of: breaking changes offering zero benefit, sloppy code resulting in reduced security, and a complete absence of any sort of security culture. Lennart, IBM, and systemd can claim all sorts of things (perhaps there really is a value in moving away from shell scripts) but security? No. There is absolutely ZERO merit to any claim that systemd increases security. The lack of security culture and defensive coding that permeates systemd all but guarantee future vulnerabilities.

Edit:

But wait! There's more!

https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2017-9445

Systemd is also remotely exploitable. Sure, no program is perfect, but most programs strive to decrease the attack surface where systemd strives to increase it.

jimrandomh · on Jan 29, 2019

> Well, this boils down to: in a modern operating system, is it good design that an unprivileged user who logs in once can consume arbitrary runtime resources uncontrolled, unbounded forever, even after logout just because they decided to mask SIGHUP? I think not, I think the system should default to behaviour where unprivileged processes are clearly lifecycle bound

If there were some way to design this so that nohup would give a permission denied error on start and tmux would give one on detach, rather than die on logout when it's too late to display a warning, that would be a lot better. There may not be a feasible way to do this, but it would solve a key part of this problem, which is that people don't find out about this behavior until something has already gone wrong, and don't find out that systemd is responsible for the behavior until after they've gotten frustrated enough to be mad about it.

belorn · on Jan 29, 2019

From a hosting perspective I understand the issue being addressed but I don't see specific problems being solved. For example, if I lock out a compromised account by locking the unix user, they can still be logged in with running processes, which I then need to manually address and kill, and they can also have cron jobs which restart them. Services like Apache (with mpm_itk) will still change user-id to those locked users. There is no general system-wide method to declare that a user and all its connected aspects should stop being available, and therefore a compromised account must currently be handled rather individually.

What I see most companies in this industry do is use per-user virtual machines to address the issue, which completely bypasses the question about logged in and logged out. It would be interesting if the intention in current development is to give us administrators more options here and allow for cleaner handling of compromised accounts.

mruts · on Jan 29, 2019

But the point is that Linux and Unix aren't modern operating systems. They're ancient, and built upon decades and decades of work by hundreds of thousands of developers. You can't just decide to break norms handed down through the decades.

And I don't think anyone really had a problem with the old default of letting things run. Worse is better, after all. Pursuit of a perfect system will just make things too complicated, too brittle, and too obtuse.

I like Unix because it doesn't try to solve every problem. It's a libertarian operating system, if you will. Sometimes this causes problems, sure, but if the system is simple and liberal, you can always fix them without much effort.

m4r35n357 · on Jan 29, 2019

They show that you do not understand what "log out" is for in Unix.

yellowapple · on Jan 30, 2019

I feel like the kernel policy of "don't break userspace" would be a valuable one for y'all to adopt.

loudtieblahblah · on Jan 29, 2019

> You know, because we knew this would be controversial we made sure it was both a compile-time option and a runtime option.

This is standard from you. You knock the glass on the floor and blame the maid service for not cleaning up after you.

It's everyone's fault but yours.

> And yes, I still consider it a weakness of UNIX that "logout" doesn't really mean "logout", but just "maybe, please, if you'd be so kind, i'd like to exit, but not quite".

Oh how hyperbolic. Nuances and caveats in terminology are not a weakness.

I don't see why you're splitting hairs over this but can't be bothered to care about your UID numbering bug.

Or the fact that systemd-resolved is responsible for DNS leaking on VPNs.

But yes, tell me more about how a functionality that enables terminal multiplexers is a "weakness".

>Now I am used to taking blame for apparently everything that every went wrong on Linux,

It's because of your smarmy arrogance.

You break POSIX compliance, which has a real world effect in multiple areas and you accept bug reports with the humility of Donald Trump being interviewed by MSNBC.

Then when you retreat into your safe space, you play victim to the situation you created.

You talk of Linux culture toxicity, smearing the likes of Linus Torvalds, while essentially being the metaphorical sibling putting your finger in people's face repeating "I'm not touching you" over and over. Then you act attacked when someone claps back.

You're a cry bully hiding behind a veneer of professionalism acceptable to Red Hat's HR department, which enables you to mark one more bug as "wontfix"; your attitude, your arrogance, your conceit that things which are not broken in fact are, so you can provide solutions no one asked for and no one benefits from.

ownagefool · on Jan 29, 2019

To be fair, at least poettering presented an argument and is responsible for software that helps a whole bunch of us get things done.

You're just kind of yelling, and it diminishes any point you may have made.

zeveb · on Jan 29, 2019

After a while, anyone who deals with Lennart just starts yelling, because he is impossible to reason with. He's very intelligent, and absolutely convinced that his is the One True Correct Right Way. It doesn't matter that hundreds or thousands of voices oppose him; I don't think it would matter if every single human being on earth opposed him.

What makes it worse is that he's often not completely wrong. Linux did need something like PulseAudio, something like Avahi and something like systemd. But his reach exceeds his grasp (which probably applies to us all, as I've found on my own projects), which leads to the well-known problems of PulseAudio & systemd.

I don't actually want him to quit the Linux world. But I wish he would scale back his ambitions just a tad, and consider that maybe — just maybe — other people have some good points, and valid concerns.

And also Windows/DOS are not terribly good design exemplars.

SahAssar · on Jan 29, 2019

I get what you are saying, but

> It doesn't matter that hundreds or thousands of voices oppose him; I don't think it would matter if every single human being on earth opposed him.

makes it seem like everyone that uses systemd hates it or sees the same flaws as you or the other people yelling.

I and many others started admin'ing during or slightly before the systemd transition (ubuntu14->16 and rhel6->7) and have found it a much easier path to running services in a sane way than before. It was certainly possible before it, but with systemd I can do it a lot better and easier than I would have been able with previous inits.

For every person saying that systemd made things worse I expect there to be 10 silent sysadmins that appreciate what it did. I have no evidence of that, but that is my experience.

loudtieblahblah · on Jan 29, 2019

It does a lot more than manages services.

It breaks screen and tmux functionality, leaks DNS when connected to a VPN, and is riddled with "wontfix" security vulnerabilities stemming from a refusal to be POSIX compliant.

Systemd replaced udev for crying out loud.

SahAssar · on Jan 29, 2019

That might be true and still not contradict what I said. A lot of the systemd critics still seem to not see what it actually did for most people using it. You're free to hate it and some of that is certainly justified, but don't assume that the contrary opinion is based on uneducated or misguided opinions.

Most of what I see/use of systemd I like. Some of it I don't, and some of it is a dumpsterfire. I think I could say the same or worse for any ambitious software project.

As for the security issues I certainly place those in the dumpsterfire category and I'd like for the systemd team to handle them better.

inferiorhuman · on Jan 30, 2019

You know what? Systemd generally works for me. Sure there's teeth gnashing at having all my userland tools upended. I've frustration at the unit file specs. But it mostly works.

That, however, does not mean that systemd is anything other than a giant fucking dumpster fire. Looking at how Lennart interacts with other Linux devs, how he reacts to bug and security reports, looking at the lack of code review and the shoddy design decisions that get baked into systemd… it appears as if systemd mostly works through sheer luck. That sort of approach may be acceptable when you're talking GNU vs X emacs, but it's absolutely the wrong approach to such a critical piece of software.

The other thing I'm missing is any improvement. All of this upheaval has been for what? Assuaging Lennart's ego? Not good enough.

> You're free to hate it and some of that is certainly justified, but don't assume that the contrary opinion is based on uneducated or misguided opinions.

When the article being discussed consistently wrongly characterizes and dismisses technical arguments against systemd I think it's fair to say it's a bit more than misguided.

> As for the security issues I certainly place those in the dumpsterfire category and I'd like for the systemd team to handle them better.

Yeah, no. Security as an afterthought is a bad approach in general but it's even worse when you're talking about low level bits like PID 1, the kernel, boot loader, etc. This right here is enough reason to run, screaming far far away from systemd.

You know the best part though? I've had plenty of frustration with upstart (especially with features they've decided to remove over the years). None of this compares to the heavy handed, anti-social bullshit that seems to engulf systemd. Hell, I recently bought a replacement laptop. I even entertained the idea of a Linux machine. Systemd and its effect on Linux on the desktop was one of the top reasons I went with another MacBook Pro.

apple4ever · on Jan 31, 2019

I agree. I love systemd as compared to the other ways (though I think launchd is pretty nice too).

Asooka · on Jan 29, 2019

So you made the default the worst possible option, because... why exactly? And now that the problem is apparent, you haven't changed the default because...? I don't know what goes through your and the rest of the systemd's team's heads, but good software engineering it is not.

vesak · on Jan 29, 2019

You've done great work as a whole, as you probably know. Try not to let the lowlifes get to you.

MereInterest · on Jan 29, 2019

Absolutely. I can understand implementing this feature for some special cases, like containers that should clear all hint of a user away on log off. It should never have been the default, and breaks an entire category of software. In my standard .bashrc file, I have the following snippet to warn me if I am on a system with that stupid setting enabled.

    if which loginctl > /dev/null && loginctl >& /dev/null; then
        if loginctl show-user | grep KillUserProcesses | grep -q yes; then
            echo "systemd is set to kill user processes on logoff"
            echo "This will break screen, tmux, emacs --daemon, nohup, etc"
            echo "Tell the sysadmin to set KillUserProcesses=no in /etc/systemd/logind.conf"
        fi
    fi

kokada · on Jan 29, 2019

Thanks, now I know why Emacs daemon keeps delaying my restarts in the system (just discovered that NixOS defaults KillUserProcesses to false).

I'm turning this on: for me it does not make sense for a user service (yeah, I run emacs as a user's systemd service) to keep running after I log out of my system.

P.S.: And the fact that for some people this behavior makes sense is why I think Lennart's decision to make this an option makes sense.

MereInterest · on Jan 30, 2019

I'm glad that it helped resolve your issue, though I still don't think it was an appropriate choice for a default. I tend to do most of my work on a remote server, using tmux and emacs daemon to pick up right where I left off in the case of a dropped connection. That systemd would terminate my process when I explicitly requested it not to be is very abnormal.

pas · on Jan 30, 2019

You haven't requested systemd, you started a user scope, and haven't started a service for what you need.

POSIX is nice, but rather lacking in certain aspects, such as security and administration-friendliness. cgroups help with both, but people have to understand them and use them well.

MereInterest · on Jan 30, 2019

Handling and ignoring SIGHUP is the explicit way to indicate that a program should not be terminated. That systemd invented a new category and then ex post facto declared that everybody else was wrong for not using it is ridiculous. Systemd changing behavior such that I must "Simon says nohup" is completely asinine.
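
For readers who have not seen the mechanism, here is a minimal sketch of it in plain shell. It assumes nothing about systemd: the two `sleep` jobs are hypothetical stand-ins for real programs, and the script only shows the POSIX-level difference between the default SIGHUP disposition and an ignored one.

```shell
#!/bin/sh
# Illustration only: two stand-in jobs, one with the default SIGHUP
# disposition and one that ignores SIGHUP the way nohup(1) arranges.

sleep 5 >/dev/null 2>&1 &
default_pid=$!

# An ignored signal stays ignored across exec, so sleep inherits it.
( trap '' HUP; exec sleep 5 ) >/dev/null 2>&1 &
ignoring_pid=$!

sleep 1                                   # let both jobs start
kill -HUP "$default_pid" "$ignoring_pid"
sleep 1                                   # let the signal be delivered

# Reap the first job; a death by signal N is reported as 128 + N.
default_status=0
wait "$default_pid" || default_status=$?

# kill -0 delivers nothing; it only checks that the PID still exists.
if kill -0 "$ignoring_pid" 2>/dev/null; then
    ignoring_state=alive
else
    ignoring_state=gone
fi

echo "default job exit status: $default_status"
echo "ignoring job: $ignoring_state"

kill -KILL "$ignoring_pid" 2>/dev/null    # clean up the survivor
```

The first job should be reported as killed by the signal and the second as still alive, provided the script itself was not started with SIGHUP already ignored.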

jimrandomh · on Jan 29, 2019

Systemd developers, if you're reading this: this isn't the sort of bug where people grumble for awhile and then get over it, because things are still broken, and the workaround being circulated (KillUserProcesses=no) doesn't fully work. (https://github.com/systemd/systemd/issues/8486) As long as people continue to encounter this issue anew--and they still are--people will be angry at the systemd maintainers.

avar · on Jan 29, 2019

The bug you've linked to was closed[1] by the reporter with "Thanks for the clarification guys. Much appreciated!", after it was pointed out to them that something they were trying to do with "KillUserProcesses=no" was better done in another way.

1. Edit: Not literally closed by the reporter. Lennart Poettering closed it, "closed by the reporter" as in "the issue was resolved to the reporter's satisfaction".

dvfjsdhgfv · on Jan 29, 2019

> The bug you've linked to was closed by the reporter

Are we reading the same bug report? The one I'm looking at was closed by the creator of Systemd.

MertsA · on Jan 29, 2019

That comes down entirely to how systemd is configured. If you don't like what your chosen distro has picked as the default then complain to them. systemd didn't force anyone's hand on the subject, they just added the feature. It's a pretty natural design choice IMHO. When I want to log out, I don't want to let some hung up daemon keep running just because it wasn't able to process the SIGHUP sent to it.

How else do you propose to make sure that when I log off my ssh-agent is really terminated and not just locked up with my keys still in memory? The POSIX approach is insufficient, there's no way to know if a process received a signal and chose to ignore it and keep running or if it received a signal but it was deadlocked and kept running.

zrm · on Jan 29, 2019

The problem is that you're breaking compatibility by changing the default. It's one thing to add a feature that can solve a problem. It's something else to break existing programs that don't use it.

If you're not going to evaluate each individual program to determine whether the new behavior is appropriate then it should be opt-in rather than opt-out. Then ssh-agent and anything else that knows it should be forcefully killed can opt-in without breaking other innocent programs.

deno · on Jan 29, 2019

So you think backwards compatibility is so important that we should keep old BROKEN and INSECURE behavior just for the sake of not inconveniencing a few power users who have the technical knowledge to override it? Instead, the few who complain loudest should be catered to and regular users left to the wolves…

I think some people sometimes lack any perspective on the topic.

wokwokwok · on Jan 29, 2019

Yes.

I’m not being emotional about it, just irritated.

Systemd has tangibly caused me to lose work with tmux; I appreciate there are root causes for this, but frankly, if some piece of someone’s code does that, for whatever reason that is beyond my control to immediately stop using it...

...it feels justified to be annoyed.

How do you suggest an alternative meaningful response would look?

Create my own distribution?

What tangible and meaningful alternatives do I have other than encouraging people not to use systemd?

deno · on Jan 29, 2019

> Create my own distribution?

> What tangible and meaningful alternatives do I have other than encouraging people not to use systemd?

Sure, if you think you can actually “test every single program and make everything opt-in.” I think you will however find that making everyone happy and having new features are just simply contradictory by the very definition. At some point you will want new stuff and you’ll have to break something.

The best you could do is adopt BSD’s model and fork tmux and other userland and ship outdated/patched versions. It’s a ton of work, of course.

I am not actually seriously suggesting you create your own distro, after all you can probably just fix the annoying issue with systemd and move on with your life, and Systemd actually makes it easy for you by making it a configuration switch and supporting the non-default workflow.

I am simply suggesting you put yourself in the position of someone that has to make those decisions and really think about it from that perspective. Everything’s always a trade off.

inferiorhuman · on Jan 29, 2019

> I am not actually seriously suggesting you create your own distro, after all you can probably just fix the annoying issue with systemd and move on with your life, and Systemd actually makes it easy for you by making it a configuration switch and supporting the non-default workflow.

Given the extraordinary scope of systemd, what happens with the next major issue? Having to perpetually work around poorly designed software is infuriating.

> I am simply suggesting you put yourself in the position of someone that has to make those decisions and really think about it from that perspective. Everything’s always a trade off.

Why should the onus be on the end user? Perhaps the distributions should be making choices that are less antagonistic of their users (e.g. upstart instead of systemd).

You're right about the tradeoffs though, and one of the tradeoffs for buying into systemd is angry users.

deno · on Jan 29, 2019

> Given the extraordinary scope of systemd, what happens with the next major issue? Having to perpetually work around poorly designed software is infuriating.

Systemd doesn’t break stuff if they just feel like it. Everything is compatible if it can be, for example you can still run /etc/init.d scripts and manage them through systemd on Debian. Lingering processes are also still supported! It’s a configuration switch that most distros decided to turn on by default, because...

> Why should the onus be on the end user? Perhaps the distributions should be making choices that are less antagonistic of their users (e.g. upstart instead of systemd).

... it’s a net benefit to most users. It’s only “antagonistic” to a particular subset of powerusers perfectly capable of working around the issue but somehow more motivated to loudly complain about it on the Internet.

> You're right about the tradeoffs though, and one of the tradeoffs for buying into systemd is angry users.

Fair deal if it helps with even 0.1% desktop market share.

erik_seaberg · on Jan 29, 2019

> particular subset of powerusers perfectly capable of working around the issue

What is the actual workaround? Is there a patch that unbreaks nohup by passing cwd and env to systemd-run --user or something?

deno · on Jan 29, 2019

Take a look at this: https://github.com/tmux/tmux/issues/428

erik_seaberg · on Jan 29, 2019

I see arguing but no consensus on what ought to be done.

My use case: I run a shell pipeline that will probably take all weekend to finish. On a POSIX box I start it with nohup. What do I do on a systemd box? Does nohup need a patch that doesn't exist yet?

MertsA · on Jan 30, 2019

There's a couple ways to work around the issue, you can just configure systemd to not kill processes that were in the user scope when the user scope is closed in which case it behaves exactly as it did before. Or if you want to keep systemd cleaning up hung applications but not e.g. some script that you typically ran with nohup you can just use systemd-run instead.

https://www.freedesktop.org/software/systemd/man/systemd-run...

In particular you'd probably want --user so that it runs it under your user instance of systemd and --scope so that it's all run under a scope for that command instead of just a transient service. For most uses of nohup you could literally just make it an alias for systemd-run --user --scope instead.
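
A rough sketch of that idea as a shell helper follows. All function names are hypothetical, and two things are this sketch's own assumptions rather than anything stated above: it keeps `nohup` in both branches (a scope moves the job to a different cgroup but does not by itself change how the job reacts to SIGHUP), and it probes for a reachable user manager with `systemctl --user show-environment`. Depending on configuration, the user manager may itself stop when the last session ends unless lingering is enabled for the account.

```shell
#!/bin/sh
# Sketch of a "detached job" launcher. The function names are hypothetical.

# Print the command prefix to use. $1 is "yes" when a systemd user
# manager is reachable, anything else otherwise.
launcher_prefix() {
    if [ "$1" = "yes" ]; then
        # Run the job in its own transient scope, outside the session scope.
        echo "systemd-run --user --scope nohup"
    else
        echo "nohup"
    fi
}

# Best-effort probe: is systemd-run installed and does the user manager answer?
user_manager_available() {
    if command -v systemd-run >/dev/null 2>&1 &&
       systemctl --user show-environment >/dev/null 2>&1; then
        echo yes
    else
        echo no
    fi
}

# Start "$@" in the background with its output in a log file.
run_detached() {
    prefix=$(launcher_prefix "$(user_manager_available)")
    # $prefix is left unquoted on purpose so that it splits into words.
    $prefix "$@" >"${DETACHED_LOG:-detached.log}" 2>&1 &
}
```

Usage would look like `run_detached sh -c 'long_pipeline | gzip > out.gz'`, where `long_pipeline` is a placeholder for the weekend-long job.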

inferiorhuman · on Jan 30, 2019

I expect that the formal answer is that you should be running that within the service framework (be it systemd or other). My answer is: if you want POSIX-like behavior don't run it on Linux.

rauhl · on Jan 29, 2019

SIGHUP isn’t broken & insecure: it works, and it is secure. Processes which don’t want to handle the hangup signal are terminated, and processes which want to ignore it do.

MertsA · on Jan 30, 2019

But this just isn't the case. If something stays around after receiving SIGHUP, it was probably because that application intended to do so but it could also just be a hung up application that one way or another is going to stay around until it's killed. Sending a signal doesn't give you any sort of feedback to see if you're waiting for the application to close or if the application shouldn't be closed. Signals alone are insufficient.
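
To make the "no feedback" point concrete, here is a small shell sketch with two hypothetical stand-in jobs. One ignores SIGHUP on purpose; the other wants to exit on SIGHUP but is stuck in a long operation. An outside observer using only signals sees the same thing for both.

```shell
#!/bin/sh
# Both jobs below are hypothetical stand-ins. From the outside they look
# identical after SIGHUP, although only one of them is running on purpose.

# Job 1 deliberately ignores SIGHUP, as a nohup'd job would.
( trap '' HUP; exec sleep 5 ) >/dev/null 2>&1 &
deliberate_pid=$!

# Job 2 wants to exit on SIGHUP, but the shell only runs the trap after the
# current foreground command returns, so it is stuck until sleep finishes.
( trap 'exit 0' HUP; sleep 5; : ) >/dev/null 2>&1 &
stuck_pid=$!

sleep 1                                   # let both jobs start
kill -HUP "$deliberate_pid" "$stuck_pid"
sleep 1                                   # let the signal be delivered

# kill -0 sends no signal; it only reports whether the PID still exists.
probe() {
    if kill -0 "$1" 2>/dev/null; then echo running; else echo gone; fi
}

deliberate_state=$(probe "$deliberate_pid")
stuck_state=$(probe "$stuck_pid")
echo "deliberate: $deliberate_state, stuck: $stuck_state"

kill -KILL "$deliberate_pid" "$stuck_pid" 2>/dev/null   # clean up
```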

deno · on Jan 29, 2019

Tell me more about this perfect world with no bugs and nondeterministic behavior.

mruts · on Jan 29, 2019

Well, there are some pretty severe restrictions on the type of code you can put into signal handlers. Only atomic operations are allowed. And, in my experience, almost all applications react appropriately to signals.

MertsA · on Jan 30, 2019

>Well, there are some pretty severe restrictions on the type of code you can put into signal handlers.

Err... Maybe I'm missing something but I don't believe that's the case. There's a lot of things that you shouldn't do inside of a signal handler that will exhibit undefined behavior, but it's not like the kernel puts any restrictions on what the application can do inside of a signal handler. If an application wants to make SIGHUP just call whatever existing application exit logic they already have, they can. It's a terrible idea because if the application was signalled in the middle of some library call then it's anyone's guess as to whether or not it's just going to crash but that doesn't mean that you can't do it.

I think you're underestimating the difficulty of gracefully shutting down an application in a signal handler. If it's waiting for the application to finish some operation it's stuck in it'll just do the exact same thing as using nohup and there's no way to know that outside of the application.

zrm · on Jan 30, 2019

If an application is handling SIGHUP then it presumably intends to continue running. If it used systemd-run instead, it could still get into a bad state at any point thereafter and you have the same problem. Even using a watchdog couldn't fix every buggy application, because there are ways for an application to crash or misbehave yet continue to send the watchdog notification. We still haven't solved the halting problem.

Meanwhile if the process isn't handling SIGHUP then there is little chance of undefined behavior in the default handler, which merely terminates the process immediately.

MertsA · on Jan 31, 2019

>If an application is handling SIGHUP then it presumably intends to continue running.

That's not correct, for stuff running in the user's scope more often than not a SIGHUP handler is just to gracefully exit the application. I.E. close any open files, finish any writes in process, etc.

But also, you don't know what the SIGHUP handler does to begin with. That's the crux of the problem. Outside of the process the SIGHUP handler is just a black box.

>If it used systemd-run instead, it could still get into a bad state at any point thereafter and you have the same problem.

No, if it was started with systemd-run there's no SIGHUP sent to it in the first place. Reaping applications that won't close in the user scope isn't about preventing them from breaking in the first place, it's just sweeping up the broken pieces so that it doesn't break the next user scope because it's still holding some exclusive lock on something.

It's like putting the user session into its own container. It doesn't fix anything, it just keeps the breakage contained to the user's scope so that when you log out, it really does shut down that "container".
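
As an illustration of the "graceful exit" kind of SIGHUP handler described above, here is a minimal shell sketch. The worker and its state file are hypothetical; the handler saves and then exits, rather than keeping the job alive.

```shell
#!/bin/sh
# Illustration only: a stand-in worker whose SIGHUP handler saves state and
# then exits. The state file is a throwaway temp file.

state_file=$(mktemp)

(
    save_and_exit() {
        echo "saved" >"$state_file"   # stand-in for flushing real data
        exit 0
    }
    trap save_and_exit HUP
    # Work in one-second slices; a pending trap runs between slices.
    i=0
    while [ "$i" -lt 10 ]; do
        sleep 1
        i=$((i + 1))
    done
) >/dev/null 2>&1 &
worker_pid=$!

sleep 1                               # let the worker install its trap
kill -HUP "$worker_pid"

worker_status=0
wait "$worker_pid" || worker_status=$?
saved_state=$(cat "$state_file")
rm -f "$state_file"

echo "worker exit status: $worker_status, state file: $saved_state"
```

Nothing outside the worker can tell this handler apart from one that ignores the signal until the worker actually exits, which is the black-box problem mentioned above.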

zrm · on Feb 4, 2019

> That's not correct, for stuff running in the user's scope more often than not a SIGHUP handler is just to gracefully exit the application. I.E. close any open files, finish any writes in process, etc.

That's essentially the same thing, and the application would have to do something similar to protect itself.

Suppose the user would lose data if the application doesn't exit gracefully, but this may take a variable amount of time depending on how much unsaved data there is, current load on the machine, etc. So it handles SIGHUP, continues running to save its state, but hasn't finished before systemd kills it.

To prevent this it would have to use systemd-run to preserve itself long enough to finish saving its state, and we're back to square one again. Or it doesn't do that and the user loses data.
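
The scenario can be sketched in shell under some loud assumptions: `stop_with_grace` is a hypothetical ask-then-force helper written for this example, not systemd's code, and the worker's three-second "save" and the one-second grace period are made-up numbers chosen so that the save loses the race.

```shell
#!/bin/sh
# Illustration only. stop_with_grace is a hypothetical helper, not systemd's
# implementation: send SIGHUP, wait a bounded time, then send SIGKILL.
# It is meant for direct children of the calling shell, which reaps them
# while it waits on the foreground sleep.

stop_with_grace() {
    # $1: PID, $2: grace period in whole seconds; result in $stop_outcome
    stop_outcome=exited
    kill -HUP "$1" 2>/dev/null || return 0
    waited=0
    while [ "$waited" -lt "$2" ]; do
        sleep 1
        kill -0 "$1" 2>/dev/null || return 0
        waited=$((waited + 1))
    done
    if kill -KILL "$1" 2>/dev/null; then
        stop_outcome=killed
    fi
}

state_file=$(mktemp)

# Worker whose "save" takes about 3 seconds once the handler starts.
(
    trap 'sleep 3; echo saved >"$state_file"; exit 0' HUP
    while :; do sleep 1; done
) >/dev/null 2>&1 &
worker_pid=$!

sleep 1                               # let the worker install its trap
stop_with_grace "$worker_pid" 1       # grace period shorter than the save

wait "$worker_pid" 2>/dev/null        # reap the worker
saved_state=$(cat "$state_file")
rm -f "$state_file"

echo "outcome: $stop_outcome, state file: '$saved_state'"
```

With these numbers the worker is force-killed before it writes anything, so the state file stays empty.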

deno · on Jan 29, 2019

When they work, sure. And when they don’t the user is wondering why their laptop is playing sounds when they’re logged out. Systemd’s solution is the right one from technical POV. No need to hope applications cooperate when you can just ask the kernel to make sure they do.

EpicEng · on Jan 29, 2019

>I think some people sometimes lack any perspective on the topic.

Apparently you think Linus is one of those who "lack perspective"?

http://lkml.iu.edu/hypermail/linux/kernel/1711.2/01701.html

I get that systemd isn't the kernel, but it's close enough. There are many who would agree that breaking existing behavior in the name of security isn't wise. I have also not yet seen anyone point out specific security issues this solved. Unix has worked this way for a long time.

deno · on Jan 29, 2019

User launches voice chat, logs out, application stays around and listens in on the user or other users. Just one example. Having programs running despite being logged out is unintuitive and wrong. Most users do not know or care about going into a task manager. And if you want Linux to ever have a chance to succeed on desktop, they shouldn’t have to.

As to the Linus’ post, if you want to argue that there wasn’t enough notice about this change, then that’s fine, but this isn’t what anyone here is arguing.

Also it’s a configuration switch, any distribution could have decided to revert it or postpone it at their choosing.

dooglius · on Jan 29, 2019

What on earth is broken or insecure about not killing processes?

deno · on Jan 29, 2019

You watch porn, log out, but mpv is somehow stuck and still playing. Broken enough?

eadmund · on Jan 29, 2019

This, right here is an example of what those who oppose systemd mean when we say that it's monolithic.

What gives the init system the right or the duty to reach down into a user's processes and determine[0] that they are stuck (versus running appropriately, as e.g. the user indicated with nohup(1))? Why is it the init system's job to handle that?

That's just not its job. If I wanted to run some sort of misbehaved-process killer, I could. Or, y'know, not running misbehaving processes. Ideally, that would include not running misbehaving processes like anything from the systemd project.

0: or, as in systemd's case, blindly assume

the_why_of_y · on Jan 29, 2019

KillUserProcesses is enforced not by systemd (PID 1) but by systemd-logind.

deno · on Jan 29, 2019

> What gives the init system the right or the duty to reach down into a user's processes and determine[0] that they are stuck (versus running appropriately, as e.g. the user indicated with nohup(1))? Why is it the init system's job to handle that?

If this behavior was mandated by some other piece of software named FluffyUnicorn and had nothing to do with Lennart, but was still widely adopted just as systemd is, would you be ok with it?

It’s in systemd because it makes sense to be there. Systemd already groups services into cgroups so it makes sense to also do that for user sessions.

> That's just not its job. If I wanted to run some sort of misbehaved-process killer, I could. Or, y'know, not running misbehaving processes. Ideally, that would include not running misbehaving processes like anything from the systemd project.

So toggle a configuration switch on your system. What you are actually trying to do is to FORCE this bad and confusing behavior as a DEFAULT on regular users that have no need or want for it.
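
On the cgroup grouping mentioned above: a process can see which group it was placed in by reading `/proc/self/cgroup`. The sketch below assumes the unified (cgroup v2) hierarchy, where that file has a line starting with `0::`. The helper names and the category labels are this example's own, and the sample path in the usage note is illustrative, not taken from a real machine.

```shell
#!/bin/sh
# Hypothetical helpers for inspecting cgroup membership (cgroup v2 layout).

# Read /proc/<pid>/cgroup-style text on stdin and print the last path
# component of the unified-hierarchy ("0::") entry.
cgroup_leaf() {
    sed -n 's|^0::.*/\([^/]*\)$|\1|p'
}

# Classify a leaf name; the categories are this sketch's own labels.
classify_leaf() {
    case "$1" in
        session-*.scope) echo "login session scope" ;;
        *.scope)         echo "other scope" ;;
        *.service)       echo "service" ;;
        *)               echo "unknown" ;;
    esac
}
```

For example, `classify_leaf "$(cgroup_leaf </proc/self/cgroup)"` classifies the current shell, and feeding `cgroup_leaf` a line such as `0::/user.slice/user-1000.slice/session-3.scope` yields `session-3.scope`.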

zrm · on Jan 29, 2019

> If this behavior was mandated by some other piece of software named FluffyUnicorn and had nothing to do with Lennart, but was still widely adopted just as systemd is, would you be ok with it?

If this behavior was mandated by some other piece of software, it wouldn't be as widely adopted as systemd is.

That's the true problem with systemd. It tries to do everything and does 80% of it well enough that many people use it, but then is too complex and integrated with itself to easily identify and carve out the problematic bits and replace them with third party alternatives.

deno · on Jan 29, 2019

> If this behavior was mandated by some other piece of software, it wouldn't be as widely adopted as systemd is.

So your argument is that this is forced on people because of systemd’s political power?

There’s a configuration option to reverse this behavior, it’s not hidden away somewhere, it’s been widely publicized. Any distro could have flipped the switch and easily reverted to preserve backwards compatibility, but none did. This is because this change is a net benefit to the majority of users.

> That's the true problem with systemd. It tries to do everything and does 80% of it well enough that many people use it, but then is too integrated with itself to easily identify and carve out the problematic bits

Again, you don’t need to fork systemd to change this behavior. If that was the case I would understand the criticism. But that is not the case. The alternative workflow is perfectly well supported. All we’re arguing about is the defaults. Systemd developers go out of their way to not break things.

You’re arguing for making up some abstraction layers for plug-n-play components that no one is demanding, and would probably never be used. Modularity has a cost, and not only that, but you also have to know where to draw the line between core and addon.

And if systemd actually did all of that, I’m pretty sure all those habitual complainers would just argue that it’s over-engineered and should have been kept simple. You can’t win with the peanut gallery.

zrm · on Jan 29, 2019

> Any distro could have flipped the switch and easily reverted to preserve backwards compatibility, but none did.

No, many of them did. The problem is that this is not the only such issue, and distribution maintainers don't have unlimited time and resources to re-evaluate every individual default chosen by upstream, so most of the upstream defaults end up in the distributions. The distributions can fix this once you identify the problem, as e.g. Debian has done, but "you can change it" is no argument for a bad default, because changing it is work, and in the meantime things are broken.

> Again, you don’t need to fork systemd to change this behavior. If that was the case I would understand the criticism. But that is not the case. The alternative workflow is perfectly well supported. All we’re arguing about is the defaults.

If the defaults weren't important then why are you arguing about them?

> Systemd developers go out of their way to not break things.

Yet tmux and screen are broken on the distributions that use upstream's default.

> You’re arguing for making up some abstraction layers for plug-n-play components that no one is demanding, and would probably never be used. Modularity has a cost, and not only that, but you also have to know where to draw the line between core and addon.

You say that as if it wasn't the way everything works in many other init systems. The init system doesn't typically have a DNS server, you can use dnsmasq or BIND or unbound or djbdns or whatever you like. It doesn't have its own cron, there are many choices and you can choose any of them.

And just drawing any hard lines would help. Even if you had to replace two modular components to replace one thing, or one component that does two things when it should be one, that's certainly a lot more feasible than having to understand and touch thirty integrated pieces to replace one component.

deno · on Jan 29, 2019

> The problem is that this is not the only such issue, and distribution maintainers don't have unlimited time and resources to re-evaluate every individual default chosen by upstream, so most of the upstream defaults end up in the distributions.

Well they should. Otherwise, what’s the point of them?

> Yet tmux and screen are broken on the distributions that use upstream's default.

Of their own volition. And btw, distributions could patch them to work with systemd. None of this is systemd’s fault. Since when is it upstream’s job to make sure downstream properly integrates their software?

> The init system doesn't typically have a DNS server

There’s no DNS server in systemd core. It just lives under the same umbrella. Do you know FreeBSD has DNS server in the same repo as kernel? Does it mean it has a DNS server in the kernel? You know perfectly well that this is just plain false.

> It doesn't have its own cron, there are many choices and you can choose any of them.

Why would you need “many choices” for a simple timer? What are you going to do, invent new type of time?

Anyway, you’re completely ignoring the other perspective on this. Because old style init did so little and so poorly, cron used to be a de facto service manager. Also don’t forget inetd. So you had duplicated, poorly implemented, but nevertheless, redundant functionality in several separate systems. How is systemd’s approach not both less complex and much more sane?

> And just drawing any hard lines would help. Even if you had to replace two modular components to replace one thing, or one component that does two things when it should be one, that's certainly a lot more feasible than having to understand and touch thirty integrated pieces to replace one component.

Why? If you can’t point to where the line is then what’s the point. It’s like saying you want cars to be more modular, so let’s just arbitrarily invent a “motor carriage[1].”

You could replace the engine without the coach, wouldn’t that be swell?

Anyway most of systemd’s components communicate over a common system bus. You could provide alternatives just by speaking the same API.

[1] Sorry, I’m not a native speaker; I mean this: https://en.wikipedia.org/wiki/Coach_(carriage) but with an engine instead of horse

zrm · on Jan 29, 2019

> Well they should. Otherwise, what’s the point of them?

If the distribution is supposed to micromanage everything from upstream then what's the point of upstream?

> Of their own volition. And btw, distributions could patch them to work with systemd. None of this is systemd’s fault. Since when is it upstream’s job to make sure downstream properly integrates their software?

Since when does everything have to integrate with the init system at all?

> There’s no DNS server in systemd core. It just lives under the same umbrella.

It isn't a matter of which repository it's in, it's a matter of how much work it is to swap it out. Can I just run dnsmasq or dnscache and change an IP address somewhere, or do I actually have to change the code because it's expecting something more than a general purpose DNS resolver?

> Why would you need “many choices” for a simple timer? What are you going to do, invent new type of time?

An existing implementation has poor code quality and I can do better, but my new implementation is less feature complete, so some people prefer the one with more features while others prefer the one that has fewer bugs and uses less memory etc. etc.

> Because old style init did so little and so poorly, cron used to be a de facto service manager. Also don’t forget inetd.

Which they still are, because they're still there and there is nothing stopping people from using them in that way as ever.

But runit et al don't require that either, so let's not pretend that there is no third way.

> Why? If you can’t point to where the line is then what’s the point.

Your argument was that it's hard to know where to draw lines. But it's more important that you draw them somewhere than the specific place where you choose to draw them. Otherwise everything mushes together into a single piece of spaghetti that can't be disentangled from itself.

> Anyway most of systemd’s components communicate over a common system bus. You could provide alternatives just by speaking the same API.

Where are the RFCs for these APIs, so that I can write my application against the spec and be assured that it will continue to work against future versions of the software on the other end?

deno · on Jan 29, 2019

If you don’t like systemd so much then write something better. I mean you’ll find literally anything to dislike about it, I don’t get it. You can still use cron or rsyslog if you like. Or don’t use systemd. This is stupid. I’m done. The default makes sense for 99.99999% of users, literally the only point I was trying to make.

zrm · on Jan 30, 2019

> If you don’t like systemd so much then write something better.

Writing something better doesn't get rid of the dependencies other projects now have on pieces of systemd, which pieces then have dependencies on other pieces until you need the whole thing.

> I mean you’ll find literally anything to dislike about it, I don’t get it.

This thread is about one specific complaint: It has too many interdependencies without well-specified stable interfaces between them, and actively encourages things to take on more of them, as with replacing SIGHUP handling with systemd-run.

> The default makes sense for 99.99999% of users, literally the only point I was trying to make.

This doesn't make any sense. Most applications don't handle SIGHUP and are terminated by the default handler. Applications that do handle it continue to run. If they used systemd-run instead they would also continue to run. Where is the benefit from forcing applications to do something systemd-specific and breaking existing things that don't?
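
To make the SIGHUP behaviour concrete, here is a rough Python sketch (not taken from systemd or from any tool mentioned in this thread; `sighup_outcome` is a made-up helper). It forks a child, sends it SIGHUP, and reports whether the child survived. It assumes a POSIX system where `os.fork` is available.

```python
import os
import signal
import time


def sighup_outcome(ignore_sighup: bool) -> str:
    """Fork a sleeping child, send it SIGHUP, and report what happened to it."""
    ready_r, ready_w = os.pipe()
    pid = os.fork()
    if pid == 0:  # child process
        try:
            os.close(ready_r)
            # SIG_IGN is what nohup sets up before exec'ing the real command;
            # SIG_DFL is what most programs run with.
            disposition = signal.SIG_IGN if ignore_sighup else signal.SIG_DFL
            signal.signal(signal.SIGHUP, disposition)
            os.write(ready_w, b"x")  # tell the parent the disposition is set
            time.sleep(30)
        finally:
            os._exit(0)

    os.close(ready_w)
    os.read(ready_r, 1)  # wait until the child is ready
    os.close(ready_r)
    os.kill(pid, signal.SIGHUP)

    # Poll for up to one second to see whether the child went away.
    deadline = time.monotonic() + 1.0
    while time.monotonic() < deadline:
        done_pid, status = os.waitpid(pid, os.WNOHANG)
        if done_pid != 0:
            if os.WIFSIGNALED(status) and os.WTERMSIG(status) == signal.SIGHUP:
                return "killed by SIGHUP"
            return "exited for another reason"
        time.sleep(0.05)

    # Still running after the hangup: clean it up ourselves.
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)
    return "survived"
```

With the default disposition the child should be reported as killed by SIGHUP; with the signal ignored it should be reported as having survived. Neither case needs anything systemd-specific.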

zeveb · on Jan 29, 2019

> What you are actually trying to do is to FORCE

It's a rule: if you're advocating systemd, you don't get to accuse anyone else of forcing anything.

deno · on Jan 29, 2019

What do you disagree with in that sentence? There are defaults, distros have defaults, they’re the subject of this discussion. Anyone arguing for any default is likely dictating the de facto behavior for the majority of nontechnical users, which is the majority of users, period.

dooglius · on Jan 29, 2019

If I've nohup'd mpv or put it in a tmux shell, then that is the behavior I want. For instance, if I ssh into a controller for a home entertainment system to kick off a video, then this would be exactly what I want.

deno · on Jan 29, 2019

Then you can toggle one simple configuration switch, instead of forcing confusing behavior on the other 99% of users that don’t want or need it.

Take a step back and consider if say Windows did it like that, wouldn’t you agree it is broken?

dooglius · on Jan 29, 2019

> Then you can toggle one simple configuration switch

Only if I have root permissions (granted, I probably wouldn't be watching porn on a machine I wasn't admin on but that was just an example application).

> instead of forcing confusing behavior on the other 99% of users that don’t want or need it

Who is forcing users to run programs with nohup or tmux shells?

> Take a step back and consider if say Windows did it like that, wouldn’t you agree it is broken?

I'm pretty sure Windows does do it like this; if I were to remote desktop into a Windows box and start playing a video, it should keep playing even if I disconnect, reconnect, and log back in. It does this for normal applications, at least, though videos are a special enough case where it might be accelerating with the remote GPU.

MertsA · on Jan 31, 2019

>Only if I have root permissions (granted, I probably wouldn't be watching porn on a machine I wasn't admin on but that was just an example application).

It doesn't take root to do so, in most cases you probably still want to run the transient scope under your user so you'd use systemd-run --user in order to create it not with the main system instance of systemd but with the user level instance of it.

>I'm pretty sure Windows does do it like this

No it doesn't, as for your remote desktop example you can have the exact same behavior on Linux with systemd reaping user scopes by just using a VNC server. Windows is different in that when logging off it won't allow you to while an application is still running. It gives you the choice to either stop and go back to whatever application isn't closing (because you have unsaved work or something) or to kill it.

flukus · on Jan 31, 2019

> It doesn't take root to do so, in most cases you probably still want to run the transient scope under your user so you'd use systemd-run --user in order to create it not with the main system instance of systemd but with the user level instance of it.

If a non-root user can do it and leave a program running then doesn't that invalidate all that BS about security?

MertsA · on Feb 1, 2019

None of this is about trying to prevent the user from using resources. The user is the one who is logging out in the first place. If the user wants to terminate all of their processes except for one daemon they can do that. The security benefits aren't the primary benefit, security wise all you gain is that after you log out there's no chance that anything with any sensitive information is still hanging around. I mentioned ssh-agent as an example but you could also have stuff like maybe chrome didn't close on SIGHUP and as a result maybe this makes your saved passwords accessible to someone who can dump the RAM later by getting physical access to it. It definitely helps security but it's not really that big of a deal.

Ironically enough when I went to Google to search for an example the result that came up was my comments on HN on the same subject from a year and a half ago.

https://news.ycombinator.com/item?id=14735145

Here's a great example of the kind of real life breakage that reaping the user scope on logout actually fixes.

https://bugs.freedesktop.org/show_bug.cgi?id=94508

deno · on Jan 29, 2019

> Only if I have root permissions (granted, I probably wouldn't be watching porn on a machine I wasn't admin on but that was just an example application).

If you’re not an admin you probably prefer the systemd default. OTOH if you do need to run tmux between sessions you probably have root as well.

> Who is forcing users to run programs with nohup or tmux shells?

You’re forcing confusing behavior (media playing despite logging out) on unsuspecting users. This is unintuitive to nontechnical users, and just “wrong” to most that know the reasons behind it. I haven’t heard any good technical argument for keeping this behavior, only that it should remain like that because a minority is used to it. Though you’re welcome to change my mind.

> I'm pretty sure Windows does do it like this; if I were to remote desktop into a Windows box and start playing a video, it should keep playing even if I disconnect, reconnect, and log back in.

If you connect and disconnect you are not necessarily logging out, it’s equivalent to locking the session, which does keep music playing on Linux/systemd, and btw even offers MPRIS2-based media control right on the lockscreen, at least for Plasma.

Also it can pause the music if you log in concurrently as a different user. This is because systemd (and PolKit) have a very sophisticated seat management built in. For example it treats you differently if you log in remotely or have a seat right at the console. It can offer different authentication mechanisms and permissions (e.g. you need root/admin to shutdown the machine remotely, but don’t if you’re physically at it). All of this is possible and configurable thanks to the work of Lennart and others.

The question at hand is only whether you make the default the behavior that makes sense to 99% of regular users or to the few loudest.

pas · on Jan 30, 2019

complain to distros then. (systemd set the secure default, even if that breaks backward comp, as usually upstreams do, when it comes to security.)

or better yet, read the release notes, it likely mentions this breaking change. (if not, that's a bug.)

zrm · on Jan 30, 2019

> systemd set the secure default, even if that breaks backward comp, as usually upstreams do, when it comes to security.

Breaking compatibility is generally avoided to the utmost. Even security-sensitive things like TLS continue to support older, less secure versions to retain compatibility with peers that haven't been upgraded yet, much to the chagrin of everyone when they screw up the version negotiation, but better than the chicken and egg problem where nobody can upgrade until everybody has.

But the other point is that the claimed security improvement doesn't actually seem to be there in this case. They haven't made it so you can't have a program continue to run after the end of the current session, they've only changed what you have to do to make that happen, thereby breaking everything that did it the traditional way.

semi-extrinsic · on Jan 29, 2019

If only there was a way for the system init program to identify and keep a list of processes it has spawned, you could imagine like a unique numerical Process ID, and then if there was a program that could check the Process Status, and another that could kill the process identified by this... PID with increasing levels of aggressiveness...

wbl · on Jan 29, 2019

PIDs get reused so this doesn't work well.

MertsA · on Jan 29, 2019

He's sarcastically alluding to systemd's approach at solving this.

wbl · on Jan 29, 2019

If a process doesn't handle SIGHUP it dies. So all the daemon has to do in that case is nothing.

MertsA · on Jan 29, 2019

If a process doesn't set its own SIGHUP handler it dies. If it does in order to gracefully handle shutting down but it's deadlocked then there's no feedback as to whether or not the process actually finished handling the signal.
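
A minimal Python sketch of the "ask, wait a bounded time, then force it" pattern that a supervisor can use when a handler might hang. This is an illustration only, not systemd's code; `stop_with_escalation`, `spawn_stuck_child` and the timeout value are made up here, and the stuck child just simulates a deadlocked handler by sleeping inside it.

```python
import signal
import subprocess
import sys

# Child program whose SIGHUP handler never finishes, standing in for a
# cleanup path that has deadlocked.
STUCK_CHILD = r"""
import signal, time
signal.signal(signal.SIGHUP, lambda signum, frame: time.sleep(60))
print("ready", flush=True)
time.sleep(60)
"""


def spawn_stuck_child() -> subprocess.Popen:
    """Start the simulated stuck process and wait until its handler is set."""
    proc = subprocess.Popen(
        [sys.executable, "-c", STUCK_CHILD],
        stdout=subprocess.PIPE,
        text=True,
    )
    proc.stdout.readline()  # blocks until the child prints "ready"
    return proc


def stop_with_escalation(proc: subprocess.Popen, grace_seconds: float) -> str:
    """Send SIGHUP, wait up to grace_seconds, then fall back to SIGKILL."""
    proc.send_signal(signal.SIGHUP)
    try:
        proc.wait(timeout=grace_seconds)
        return "exited after SIGHUP"
    except subprocess.TimeoutExpired:
        pass
    proc.kill()  # SIGKILL cannot be caught, blocked or ignored
    proc.wait()
    return "needed SIGKILL"
```

The signal alone gives no feedback; the bounded wait is what tells the supervisor whether the process actually went away.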

inferiorhuman · on Jan 29, 2019

So the answer to your hypothetical deadlock is to break everything else? What kind of complex and graceful shutdown does ssh-agent really need?

MertsA · on Jan 30, 2019

>So the answer to your hypothetical deadlock is to break everything else?

It's not a hypothetical situation, everyone on here has seen applications hang and have to be terminated. SIGHUP handlers are no different in this regard.

>What kind of complex and graceful shutdown does ssh-agent really need?

That's a straw man argument, and the whole point of SIGHUP in the first place instead of just some "persistence" bit set per process is because for real world applications it's not as simple as just kill -9 to stop a process. But for ssh-agent in particular it needs to go through and unlink the socket that it binds to on startup. More to the point it also has to go through and close every PKCS11 provider that is registered which means calling functions that aren't even in openssh to begin with so who knows if some PKCS11 provider will hang during that.

DyslexicAtheist · on Jan 29, 2019

wasn't GP specifically mentioning user processes and not system daemons? e.g. for daemons it's perfectly expected behavior to not shut down on SIGHUP. Apache, and other system daemons would re-read configuration files when receiving SIGHUP (as a way to reduce downtime during config updates).
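
For anyone who hasn't seen the reload-on-SIGHUP convention, a small Python sketch of the usual shape (a toy, not Apache's implementation; the `Daemon` class and `load_config` callback are invented for illustration): the handler only sets a flag, and the main loop re-reads the configuration.

```python
import signal


class Daemon:
    """Toy daemon that re-reads its configuration when it receives SIGHUP."""

    def __init__(self, load_config):
        self._load_config = load_config
        self.config = load_config()
        self._reload_requested = False
        # Must be called from the main thread in Python.
        signal.signal(signal.SIGHUP, self._on_sighup)

    def _on_sighup(self, signum, frame):
        # Do as little as possible in the handler; the main loop does the work.
        self._reload_requested = True

    def run_once(self):
        """One iteration of the main loop: reload if a SIGHUP came in."""
        if self._reload_requested:
            self._reload_requested = False
            self.config = self._load_config()
```

A real daemon would call something like `run_once` from its event loop; the point is only that SIGHUP here means "reload", not "exit".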

inferiorhuman · on Jan 29, 2019

> How else do you propose to make sure that when I log off my ssh-agent is really terminated and not just locked up with my keys still in memory?

Perhaps with a signal handler?

pas · on Jan 30, 2019

That was the nice and friendly POSIX way, turns out it's really convenient for malware to stick around that way. Now user session isolation and termination works (cgroups), but it of course breaks backward comp.
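
For what the cgroup tracking looks like from the process side: on Linux each process lists its cgroup membership in /proc/<pid>/cgroup, and logind places login sessions in a session-N.scope under the user's slice. A small Python sketch that pulls the session scope out of that text; the sample line and the function name are made up for illustration, not output from a real box.

```python
import re
from typing import Optional

# Hypothetical contents of /proc/self/cgroup on a cgroup-v2 system.
SAMPLE = "0::/user.slice/user-1000.slice/session-3.scope\n"


def session_scope(cgroup_text: str) -> Optional[str]:
    """Return the logind session scope named in cgroup_text, if any."""
    for line in cgroup_text.splitlines():
        # Each line has the form "hierarchy-ID:controller-list:cgroup-path".
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        match = re.search(r"(session-[^/]+\.scope)", parts[2])
        if match:
            return match.group(1)
    return None
```

A process that double-forks and ignores SIGHUP still shows up in the same scope, which is why membership can be used to find it at logout.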

cout · on Jan 29, 2019

I agree that on Linux-based systems, SIGHUP is a reasonable mechanism for killing processes when a user closes an ssh session, and that ignoring SIGHUP is a reasonable way to avoid getting terminated.

I disagree that POSIX says that processes should expect a SIGHUP when a user logs out (SIGHUP means the controlling terminal was closed). I am not at all a POSIX expert, so please correct me if I misunderstand, but afaict POSIX explicitly does not specify what happens to the controlling terminal when a user logs out (http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_...):

> POSIX.1 does not specify how controlling terminal access is affected by a user logging out (that is, by a controlling process terminating). 4.2 BSD uses the vhangup() function to prevent any access to the controlling terminal through file descriptors opened prior to logout. System V does not prevent controlling terminal access through file descriptors opened prior to logout (except for the case of the special file, /dev/tty). Some implementations choose to make processes immune from job control after logout (that is, such processes are always treated as if in the foreground); other implementations continue to enforce foreground/background checks after logout. Therefore, a Conforming POSIX.1 Application should not attempt to access the controlling terminal after logout since such access is unreliable. If an implementation chooses to deny access to a controlling terminal after its controlling process exits, POSIX.1 requires a certain type of behavior (see Controlling Terminal ).

newnewpdro · on Jan 29, 2019

There is no NOHUP signal, you're referring to SIGHUP.

See the enable-linger option for loginctl and KillUserProcesses for logind.conf. KillUserProcesses was set to default enabled on 4/9/2016, prior to that it didn't happen, but was configurable if desired. So you were always able to change the config to restore the previous behavior from the moment the default turned it on.

Edit:

Here is the commit where it happened

https://github.com/systemd/systemd/commit/97e5530cf2076a2b4f...

JdeBP · on Jan 30, 2019

> So you were always able to change the config to restore the previous behavior from the moment the default turned it on.

No, you were not.

The thing that people are missing here is that neither of the systemd-logind behaviours, with KillUserProcesses=yes or KillUserProcesses=no, is the long-standing behaviour of kernel login sessions all of the way back to 7th Edition that nohup, tmux, screen, emacs --daemon, mosh-server, deluged, and more all interoperate with.

The behaviour of kernel login sessions is that end of login session is a HUP signal to the session leader, and that termination of the entire TTY login service (such as at system shutdown) is a TERM signal to everything followed by a KILL signal to everything then remaining.

The systemd-logind session behaviour with KillUserProcesses=no is no signals at all at the end of the login session, and at termination of the TTY login service both HUP and TERM signals together then KILL signals, to everything.

The systemd-logind session behaviour with KillUserProcesses=yes is both HUP and TERM signals together then KILL signals, to everything, both at login session termination and at TTY login service stop.

As I pointed out years ago, the fix is to make systemd-logind use KillUnit at hangup and StopUnit at service termination, actually providing the conventional behaviour which it currently does not in any mode and addressing the original problems (with some background GNOME utilities in a login session that were never being sent a HUP signal at logout and would have exited had they been) that motivated this whole mechanism in the first place.

* https://news.ycombinator.com/item?id=12335128

* https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=825394#221

* https://news.ycombinator.com/item?id=11798604

mirimir · on Jan 29, 2019

I just checked a few Debian stretch boxes that I set up, and "KillUserProcesses=no" is set on them all. And until a few minutes ago, I didn't even know to check.

So how can it be the default?

shittyadmin · on Jan 29, 2019

If you comment out that line it'll be on by default - Debian fixed it for you with their own default configuration file, because 99% of their users would only be annoyed by it.
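
Roughly, the rule being described is "an explicit setting wins, otherwise the compiled-in default applies". A hedged Python sketch of just that rule, not systemd's actual parser: it ignores drop-in directories and everything else logind really does, and `compiled_default` is a stand-in parameter for whatever was chosen at build time.

```python
import configparser


def effective_kill_user_processes(conf_text: str, compiled_default: bool) -> bool:
    """Return the KillUserProcesses value implied by a logind.conf-style text."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep option names case-sensitive
    parser.read_string(conf_text)
    # Lines starting with '#' are comments, so a commented-out setting
    # is simply absent and the default takes over.
    if parser.has_option("Login", "KillUserProcesses"):
        return parser.getboolean("Login", "KillUserProcesses")
    return compiled_default
```

So a file containing `KillUserProcesses=no` yields False regardless of the build, while the same line commented out yields whatever the build default was.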

This is why we have distro vendors, to build a system that works in the real world with software from developers with opinions that... differ to say the least.

coryrc · on Jan 29, 2019

Debian maintainers make many improvements to upstream and only rarely mess up (ssh key generation).

jimrandomh · on Jan 29, 2019

I meant SIGHUP. Edited.

cryptonector · on Jan 29, 2019

Eleven years earlier when SMF was added to what would eventually be Solaris 10, we had this same problem. Some of us had to drop everything to fix "bugs" in cron, sshd, ... introduced by SMF.

Systemd is basically SMF, done poorly, because NIH.

jesuslop · on Jan 29, 2019

Is there a daemonization API as such? I think there was only the "way of doing" shown in man 7 daemon.

JdeBP · on Jan 29, 2019

The systemd people have their own version of that manual page.

* https://freedesktop.org/software/systemd/man/daemon.html

IBM was explaining what to do back in 1995.

* http://jdebp.eu./FGA/unix-daemon-design-mistakes-to-avoid.ht...

masonic · on Jan 29, 2019

> killing user processes on logout

By "killing", do you mean some other signal than (or in addition to) SIGHUP? Does it send SIGKILL?

seba_dos1 · on Jan 30, 2019

That's the whole issue here. It does.

JdeBP · on Jan 29, 2019

Also two years ago, I explained how one could make this work, by having logind use KillUnit at hangup and StopUnit at shutdown.

* https://news.ycombinator.com/item?id=12335128

* https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=825394#221

majewsky · on Jan 29, 2019

> killing user processes on logout (rather than send them the SIGHUP signal, as POSIX says should happen)

TIL what nohup(1) is for.

zbentley · on Jan 30, 2019

Sort of. While it's debatable when SIGHUP should be sent as part of controlled system logout/whatever, the signal itself was originally used upon abrupt disconnection (hang up) of the controlling terminal of a program.

xyzzyz · on Jan 29, 2019

> Systemd's response was to say that they should incorporate systemd's library, and use systemd's new daemonization API.

By "use systemd's new daemonization API" you mean, instead of

$ screen

systemd asks you to write

$ systemd-run --scope --user screen

instead. Annoying to have to learn a new thing, but hardly the unbearable burden.

> On the other hand, when you're an impacted user who's lost work, and researching the bug leads you to a years-old discussion in which someone is actively denying that the bug exists and refusing to fix it, that's infuriating.

Because it's a bug for some, and intended behavior for others. Look, you make it as if they introduced a bug on purpose to screw with some people. It's clearly not the case, there was a specific tradeoff involved.

shakna · on Jan 29, 2019

> Because it's a bug for some, and intended behavior for others. Look, you make it as if they introduced a bug on purpose to screw with some people. It's clearly not the case, there was a specific tradeoff involved.

They broke userland.

It doesn't matter what tradeoff they made - they went against POSIX behaviour, and as a result, broke numerous utilities, both past and future.

Let's say that again - systemd introduced breaking behaviour on userland, against POSIX, and instead of backing down and allowing for expected and specified behaviour, they said it's everyone else's problem.

That is neither professional, nor responsible.

When you make a mistake, a mistake that breaks the behaviour of POSIX, and POSIX utilities like _cron_, you apologise, and fix the problem.

You don't turn around and say that all the sysutils should incorporate your new idea.

poettering · on Jan 29, 2019

First of all, as mentioned above, we made this compile-time as well as runtime-configurable, so that downstream distros can choose whether they want to make this opt-in or opt-out. Hence blame your distros if they picked it in a way you didn't like.

Moreover, this doesn't affect cron at all. Cron creates its own PAM session for each job it runs which means those jobs are independent from any real login session (i.e. ssh, graphical, tty login), and thus also don't get cleaned up by them.

This affected stuff that is forked off a login session and then stays around as an "orphan" if you will, i.e. with all session resources released, except for these processes that try hard to avoid clean-up (usually by double forking + detaching explicitly from any TTY/ignoring SIGHUP).

MereInterest · on Jan 29, 2019

As many, many others have stated, ignoring SIGHUP is not a way to "avoid clean-up". It is the explicit and intended method that a program should use to indicate that it should not be cleaned up.

youdontknowtho · on Jan 29, 2019

This has more to do with feelings about you and the perception of you as a "bad guy" than it does about the technical discussion.

I tend to agree with the idea that the choice of defaults belongs to the distros. If the distros are deferring to the upstream project on default settings for a critical system component then they need to be more thorough and validate what they are shipping.

v_lisivka · on Jan 29, 2019

Maintaining of all these special cases requires lot of knowledge. If maintainer is responsible for just systemd package, then it's not a problem, but when number of packages per maintainer is measured in hundreds, maintainer will stick to defaults, unless users will complain loudly enough to sacrifice whole working day on the problem.

Redoubts · on Jan 29, 2019

> Maintaining of all these special cases requires lot of knowledge.

Distro maintainers need to have a lot of knowledge about their init system. There's no way out of that. It's probably something everyone should know a little about as well.

inferiorhuman · on Jan 29, 2019

> Distro maintainers need to have a lot of knowledge about their init system. There's no way out of that. It's probably something everyone should know a little about as well.

Then maybe the init system should be simpler and not attempt to ingratiate itself with UEFI or attempt to replace su, sudo, syslogd, netcat, resolvconf, etc.

zbentley · on Jan 30, 2019

> They broke userland.

That alludes to kernel development, which systemd is largely uninvolved with. A userland program chosen by various distributions failed to support conventions from a different userland program. That's all. Were the programs involved fundamental and highly important to many users' experience? Sure. Is busting out "you broke userland" like some magical shibboleth a useful means of conveying your unhappiness that your distribution maintainers chose to replace a widely-depended-upon program with a different program? I think not.

> they went against POSIX behaviour

Which? There's "tradition" and "specified behaviour". Both are important in different situations and in different degrees.

> You don't turn around and say that all the sysutils should incorporate your new idea.

Why not? They're no more privileged by the POSIX specification, or by the user/kernel -space divide than any other program.

pas · on Jan 30, 2019

POSIX was broken first. It's insecure by default.

Intel, the kernel, even Chrome broke my userland by mitigating Spectre.

It happens.

CRON was and is run as a system service, in its own scope. If you run your own cron instance, but forgot to set it up as a system service, yeah, it gets cleaned up as you exit your shell/session/scope.

xyzzyz · on Jan 29, 2019

> They broke userland.

So? "We don't break userland" is a Linux kernel thing. Systemd is not kernel, it's userland, and userland things break other userland things all the time. They already broke lots of existing stuff when they replaced /etc/init.d/ scripts with systemd definition files, should systemd also have not done that?

> It doesn't matter what tradeoff they made - they went against POSIX behaviour, and as a result, broke numerous utilities, both past and future.

Linux is not POSIX, so I don't see how that's relevant. For what it's worth, I don't even know what part of POSIX it broke. Care to enlighten me?

jimrandomh · on Jan 29, 2019

Right; the Linux kernel has a "we don't break userland" policy, systemd doesn't. That's a selling point for the Linux kernel, and a strike against systemd. Both systemd and the Linux kernel are infrastructure projects which, if they're doing their jobs well, will never cause me problems so I get to ignore them. Systemd has been causing other people problems, and doesn't seem to understand that in the role they're trying to fill, preventing that from happening is their first and most important responsibility.

majewsky · on Jan 29, 2019

Like it or not, the Linux kernel is clearly the outlier in terms of backwards compatibility. For example, Postgres changes their data format in most non-bugfix releases. Would you consider that "a strike against" Postgres?

dooglius · on Jan 29, 2019

They provide an upgrade process that makes this invisible to the end user, so it's not a fair comparison. If it started deleting tables when I exit a session, that would definitely be a strike against it.

SahAssar · on Jan 29, 2019

Postgres has session-bound resources, and in most cases no way to disable those from being deleted when exiting a session. For example in postgres you can't persist a prepared statement, but you can of course persist data within a table. Any function running will be killed when you exit (or at least not complete since the transaction is cancelled).

IMO when a user has logged out and has not had the permissions/foresight to set up a task in the system to run without a session it should be killed.

I get that this has not been the default behavior in linux/UNIX, but to me it seems like the sensible one.

And that's before we ever argue about the possibility to turn it off.

kokada · on Jan 29, 2019

Systemd offers both a compile-time and a runtime option to turn this off, so it is a fair comparison.

shakna · on Jan 29, 2019

I think you're completely missing the point.

If you ruin everyone else's day, and change behaviour everyone else is expecting, then it's probably your own fault.

Approaching it as if everyone should simply change and do what you want, is the height of arrogance. You are generating work for others. And in this particular case, not only are you generating work for others, you are eradicating a category of software.

When a distribution adopts systemd, they let everyone know how things are changing, and slowly transition things over, releasing when stable.

We know systemd replaces init.d. It was difficult, but distributions using systemd got over that hurdle, but it did take time.

However, this is not the same.

Yes, systemd is userland, however it is also PID 1. It is a layer between most userland and the kernel, and so needs to reflect the responsibility of its position.

Ignoring how NOHUP is supposed to be interpreted, is a _bad idea_, and yes, a violation of POSIX, specifically signals (SIGHUP and nohup), and how they are supposed to be handled.

Moreso, it greatly heightens the difficulty of many utilities that are expected to work.

Why should cron (all implementations of cron), suddenly need to rely on another userland library to maintain its function?

You just broke most Linux automation. Across an entire industry.

Why should screen (all implementations of screen), suddenly need to rely on a userland library much bigger than most implementations, to continue its base function?

You just broke an entire category of background systems - including systems communicating with embedded hardware. You might have caused a factory-floor fault. Which could cause injury, or worse.

A breaking change of this level can cause industry-wide ramifications that are not just limited to the digital. Unexpected behaviour is exceptional, and should take time and considerable thought before occurring.

Systemd has responsibility that no other userland system has. It's PID 1.

If they're going to require a massive change in process behaviour, then they are going to require consultation, awareness within the industry, and transition time. They should be working with distributions, aware of the man-hours they're generating, before they put something in place.

BlackFly · on Jan 29, 2019

This discussion is very much apropos of what the article is talking about:

> The whole systemd battle, Rice said, comes down to a lot of disruptive change; that is where the tragedy comes in. Nerds have a complicated relationship to change; it's awesome when we are the ones creating the change, but it's untrustworthy when it comes from outside. Systemd represents that sort of externally imposed change that people find threatening. That is true even when the change isn't coming from developers like Poettering, who has shown little sympathy toward the people who have to deal with this change that has been imposed on them.

The posix violation is by design. If you think that posix dictates the wrong thing, then you will do something different and this is what Poettering has done. The fact that systemd has more or less been embraced by linux is an endorsement of his design philosophy, even if distributions reject specific features.

shakna · on Jan 29, 2019

I am not upset that there was divergence from POSIX.

Design choices are fine - I can understand why systemd takes a different approach.

What I don't like, and completely disagree with, is systemd not working with the community they directly affect to reduce disruption.

Like it or not, the product is an industry standard, and so will be held to industry expectations.

Rather than turning around and requiring everyone to change, they could have said, "Sorry, we're making changes, here are some preliminary patches that could help."

Or a timeline for a breaking change, wherein they can negotiate with others.

I don't have significant issues with systemd's software, though some reservations about quality. My main concern, and it has been since the beginning, is that systemd acts without thought or conscience to the effects that they might cause.

They lack the ability to be a team player, despite creating an environment where people depend on them.

systemd's adoption rate is an absolute credit to it. They have some very good design thoughts, and those working on it have done some excellent work.

However, it would be better if they communicated with the people they affect, rather than letting the community be an accidental QA team when things go wrong.

They do get this right sometimes, but that seems to be the exception, rather than the rule.

They approached the init.d situation calmly, and slowly. They worked with Debian, and Fedora and others to make sure it would work without interruption or loss of quality.

They approached the sigkill situation like they were a kid who just learned how to light a fire and wanted to burn the library down.

poettering · on Jan 29, 2019

You make plenty of assumptions there, in particular that there was no communication about the session killing thing. Turns out, however, there was. We informed downstreams about our intention and the reasons in detail, and we documented this for everybody else in NEWS. We also made sure there was an easy compile-time option to pick the default for this option, and then left the rest for the downstreams to decide: whether to default this to on or off, taking in the information they got from us and from the rest of the community. If you think they made the wrong decision, then complain to them really. But seriously, you really just assume we wouldn't talk to anyone, without actually having any idea what communication is really taking place.

rauhl · on Jan 29, 2019

> We informed downstreams about our intention and the reasons in detail, and we documented this for everybody else in NEWS.

From The Hitchhiker’s Guide to the Galaxy, regarding the plans to destroy the Earth:

‘But the plans were on display …’

‘On display? I eventually had to go down to the cellar to find them.’

‘That’s the display department.’

‘With a flashlight.’

‘Ah, well, the lights had probably gone.’

‘So had the stairs.’

‘But look, you found the notice, didn’t you?’

‘Yes,’ said Arthur, ‘yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying “Beware of the Leopard.”’

Back in the real world: you built & shipped a system whose defaults were and are broken, and now you blame others for not enabling the DONT_BE_WRONG setting. You might as well blame end users for not becoming fully-versed with your code before their first login.

It’s not the users’ fault. It’s not the distros’ fault. It’s yours, and your project’s, for shipping code which breaks the user experience.

I appreciate your vision. It’s a good one. You’re a smart guy. But have some humility! Have a sense of your own limitations, and those of the distros and users who will use your code. You’re a human being; the distros are made up of human beings; your end users are … human beings. Think of them.

Redoubts · on Jan 29, 2019

This is kind of a ridiculous reply. Is the only solution then to admit that Linux is "done"? Because it sounds like there's no room for change, even when change is communicated and multiple options to avoid it are provided.

brmgb · on Jan 29, 2019

> What I don't like, and completely disagree with, is systemd not working with the community they directly affect to reduce disruption.

> Rather than turning around and requiring everyone to change, they could have said, "Sorry, we're making changes, here are some preliminary patches that could help."

> Or a timeline for a breaking change, wherein they can negotiate with others.

But they did exactly that.

They contacted the tmux maintainers and asked if some modifications would be possible to accommodate the new option (see poettering's comment here: run things as child of systemd --user or just register a separate PAM session). If I remember correctly, it would not even have been the first special case in tmux; there already is one for OSX.

The discussion was actually progressing nicely until the anti-systemd crowd flooded it. I remember seeing posts in a lot of places urging people to comment on the bug report with specious arguments. The whole thing was kind of upsetting.

irishsultan · on Jan 29, 2019

They did that 6 days after releasing the version that broke tmux; that's hardly preparing for or negotiating.

youdontknowtho · on Jan 29, 2019

POSIX isn't a law. You don't "violate" POSIX. It's a standard for compatibility. You can choose to not be compatible with a standard when you think it makes sense. That's something that lots of projects do. You are using standards compliance as a moral cudgel.

Your argument is way too impassioned to be just technical. You just basically accused Lennart of hurting people with no evidence whatsoever.

This sort of stuff really doesn't help.

cassianoleal · on Jan 29, 2019

When there is a standard and someone doesn't follow it, it is said that the standard has been violated.

It follows that when someone implements functionality that doesn't follow POSIX, POSIX has been violated.

There's nothing wrong with the statement.

youdontknowtho · on Jan 30, 2019

He accused Lennart of hurting people with no proof. Is that reasonable?

cassianoleal · on Jan 30, 2019

Please point out where in my comment I make any reference to reasonability.

youdontknowtho · on Jan 31, 2019

Apologies for that part, then. I just don't see standards compliance like other people do. Personally, I don't see standards as things that imply some kind of morality. They are tools to accomplish a goal. Sometimes other goals may supersede their usefulness.

cassianoleal · on Feb 8, 2019

That is fair enough. I have not argued against your point of view. My comment was more on the linguistic side of things.

You criticised the parent's language saying that "you don't violate a standard" because it "isn't a law". I was just pointing out that you do indeed violate a standard because it's a standard, and saying that does not add any kind of moral or passion value - it's just using the language the way it's intended.

jodrellblank · on Jan 29, 2019

Aren't we just a few weeks after Rich Hickey's "you have no right to make demands of open source software" rant?

> Systemd has responsibility that no other userland system has. It's PID 1.

No, you have the responsibility to check what the software you are installing does, and if you don't approve, change it or reject it. Or, don't check, and deal with it.

Systemd developers do not owe you working POSIX, working cron, industry wide working Linux automation, screen, separate userland for everything. They don't owe you anything. If you don't like their thing, don't use their thing.

buster · on Jan 29, 2019

Although I very much like the "don't break userland" approach, I agree with you. Especially given that: 1. You can start your background process the systemd way (shown elsewhere in this thread). 2. You can configure the desired behavior. 3. Your distro has probably already configured it for you (Debian).
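For concreteness, the first two routes could look roughly like the sketch below. It assumes the `KillUserProcesses=` setting in the `[Login]` section of `logind.conf` and the `systemd-run --scope --user` invocation described in systemd's release notes. The helper names `kill_user_processes_enabled` and `scope_command` are hypothetical, and the upstream default is taken to be "on", as poettering describes at the top of the thread; a distro may have compiled in a different default.

```python
import configparser


def kill_user_processes_enabled(conf_text, upstream_default=True):
    """Report whether a logind.conf-style text enables KillUserProcesses.

    upstream_default is an assumption: it stands in for the compile-time
    default, which distros are free to change.
    """
    parser = configparser.ConfigParser(strict=False, interpolation=None)
    parser.optionxform = str  # keep the key's case (KillUserProcesses)
    parser.read_string(conf_text)
    if not parser.has_option("Login", "KillUserProcesses"):
        # Setting absent or commented out: fall back to the assumed default.
        return upstream_default
    # getboolean accepts yes/no, true/false, on/off and 1/0.
    return parser.getboolean("Login", "KillUserProcesses")


def scope_command(program_argv):
    """Build an argv that starts program_argv in its own user scope unit."""
    return ["systemd-run", "--scope", "--user", *program_argv]


if __name__ == "__main__":
    sample = "[Login]\n#KillUserProcesses=yes\nKillUserProcesses=no\n"
    print(kill_user_processes_enabled(sample))
    print(" ".join(scope_command(["tmux"])))
```

On a systemd machine the resulting argv could be handed to `subprocess.run`. Depending on the setup, lingering may also need to be enabled for the user with `loginctl enable-linger`.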

So it comes down to "something changed which is absolutely extremely important for me but I would rather discuss it for hours than take the few seconds to configure it". Especially since the new behavior is intended behavior and also has upsides for a lot of use cases.

So don't be ungrateful. Be happy that some people are really putting a lot of work behind the software you use daily FOR FREE and just configure the darn thing the way you like.

And last but not least, most people here (me included) are not in the position to complain so much about free software, unless they show some commitment to open source themselves.

Karunamon · on Jan 29, 2019

> If you don’t like their thing, don’t use their thing

Oh how I wish that was a course of action I could reasonably take in this instance...

cyphar · on Jan 29, 2019

> Annoying to have to learn a new thing, but hardly the unbearable burden.

The problem is now your scripts won't work on systems that don't use systemd. Shell scripts work on FreeBSD, but now you can't use them because they require systemd-specific code.

I am not necessarily anti-systemd in most respects (I like declarative definitions of services and less shell script hell), but the fact that they keep trying to get people (including container runtime developers like myself) to use _their_ API rather than the preexisting ones is fairly "anti-social".

poettering · on Jan 29, 2019

Aleksa,

I am not trying to get you to use our APIs. You are talking about the cgroups APIs again, if I am not mistaken? As I tried to explain again and again: if you want container runtimes to manage their own cgroups then just set Delegate=yes in the unit file of your manager, get your own cgroup subtree, and you can do below it whatever you want, you do not have to call into systemd ever. Not a single API call, no C call, no D-Bus call, nothing. You get your own kingdom if you set Delegate=yes, and systemd won't interfere with that. This is extensively documented.

I wished you'd actually listen to what I keep repeating to you. We tried to be really nice to container managers, knowing that they dislike systemd APIs, so we put a lot of work in making the delegation boundary clean, so that they can be entirely systemd agnostic beyond setting the Delegate=yes boolean in their unit file, but alas, we just keep hearing the same nonsense.

The LXC/LXD people btw did get this right: they manage their own cgroup subtree now, and systemd doesn't interfere, and they don't link to or do dbus calls into systemd either.
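As a rough illustration of what "manage your own subtree" might look like from the manager's side, the sketch below finds the process's own cgroup from the contents of `/proc/self/cgroup` and computes where a child group would live. It assumes the unified (cgroup v2) hierarchy mounted at `/sys/fs/cgroup` and a manager whose unit sets `Delegate=yes`; the function names and the `payload` group name are hypothetical.

```python
import posixpath

CGROUP_ROOT = "/sys/fs/cgroup"  # assumed cgroup v2 mount point


def own_cgroup_v2_path(proc_cgroup_text):
    """Return the cgroup v2 path found in /proc/<pid>/cgroup text, or None.

    The unified hierarchy shows up as a line of the form "0::<path>";
    cgroup v1 lines have a non-zero id and a controller list instead.
    """
    for line in proc_cgroup_text.splitlines():
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        hierarchy_id, controllers, path = parts
        if hierarchy_id == "0" and controllers == "":
            return path
    return None


def child_cgroup_dir(proc_cgroup_text, name, root=CGROUP_ROOT):
    """Compute the directory of a child group below the delegated subtree."""
    path = own_cgroup_v2_path(proc_cgroup_text)
    if path is None:
        raise RuntimeError("no cgroup v2 entry found")
    return posixpath.join(root, path.lstrip("/"), name)


if __name__ == "__main__":
    sample = "0::/system.slice/mgr.service\n"
    print(child_cgroup_dir(sample, "payload"))
```

A manager running with delegation could then create that directory and write PIDs into its `cgroup.procs` file, all through plain filesystem operations and without calling into systemd.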

cyphar · on Jan 29, 2019

> then just set Delegate=yes in the unit file of your manager

In runc we don't have a dedicated manager or long-running daemon. Yes, Docker and cri-o use Delegate=yes (so I am quite aware of this option) but that really doesn't help people who are using runc in their own user sessions or wrote their own wrapper and aren't aware of Delegate=yes.

I get that we are quite odd, and don't fit into a system-service model. After all of the back-and-forth with both you and Tejun (especially when it comes to "rootless" delegation -- which systemd only offers if you get a privileged user to delegate for you), I'm not sure that there's much I can do on this topic. I get that what I care about is not something you care about, but I would hope you accept that I'm not just being obstinate for the sake of it.

> Not a single API call, no C call, no D-Bus call, nothing.

Right, unless you need to set this up for someone else. And we have code that does this too -- I don't really recommend people use it, but it is necessary (and I'm pretty sure some folks at Red Hat use it based on how many bug reports they submit related to it).

Since systemd is managing the entire cgroupv2 tree (and the fact we can get around that for cgroupv1 appears to be seen as a design flaw by both you and Tejun), obviously we have to talk to systemd to do this type of thing. I just wish this wasn't the way it was done (and if cgroupv2 had a named cgroup concept -- which is what systemd needs for tracking services -- I would think that this wouldn't be such a pain-point).

I guess I'm just annoyed that we can't use "better rlimits" with "rootless" container runtimes because of all of this.

> I wished you'd actually listen to what I keep repeating to you.

I am listening, and I am aware of Delegate=yes and all of that history. But as I outlined above, I don't necessarily agree with it entirely. And unlike a lot of people around here, I don't think any of these pain-points are coming up because of malice or something stupid like that -- I just think we disagree on our priorities.

> We tried to be really nice to container managers, knowing that they dislike systemd APIs, so we put a lot of work in making the delegation boundary clean

Don't get me wrong -- I do appreciate that we have Delegate now (there was a period of several years where "systemd decided to reorganise the cgroup tree, un-containing my containers" happened on several occasions -- and Delegate solved those issues).

And from what I've heard from the LXC folks, you were quite reasonable about getting systemd to work inside LXC. Which is good to hear.

> The LXC/LXD people btw did get this right: they manage their own cgroup subtree now, and systemd doesn't interfere, and they don't link to or do dbus calls into systemd either.

We do basically the same thing. We just don't support cgroupv2.

stiff · on Jan 29, 2019

They changed a decades-old behavior many people rely on, and it must have been obvious from the start that people would lose work because of it.

nerdponx · on Jan 29, 2019

It's a bug because it violates the expectations of an uninformed user. You aren't given a warning about it, it's not documented in big bold letters anywhere, and it's also not POSIX compliant.

zorpner · on Jan 29, 2019

> Annoying to have to learn a new thing, but hardly the unbearable burden.

Rather, a breaking change to everyone's scripts and processes for zero benefit.

Tor3 · on Jan 29, 2019

Our scripts and tools work similarly on the four Unix systems we have in-house. Are you saying that it's OK that they don't work on Linux? Please do not forget that Linux is a POSIX system, basically a re-implementation of Unix, and until systemd it's been a fully compliant -nix system. Where I work we have transparently been able to deploy our products on all -nix, including Linux, since the nineties.

EDIT: My reply was supposed to be to xyzzyz's post below, not the one I apparently replied to. Sorry about that.

xyzzyz · on Jan 29, 2019

There's a benefit, you're just not seeing it. Again, do you think that the systemd developers decided to implement it just to screw with people? As I said, there's a specific trade-off involved here.

I agree that it might not be the most desirable default, but if that's the case, then the guilt also falls on the distribution maintainers, who either ignored the big bold letters in the changelog, or didn't bother to test everyone's standard workflows before pushing to stable.

inferiorhuman · on Jan 29, 2019

> Again, do you think that the systemd developers decided to implement it just to screw with people?

Based on Lennart's behavior, yes I do.

michaelmrose · on Jan 29, 2019

Instead of pretending the benefit is so obvious it doesn't require you to discuss it perhaps you could explain it.

dvfjsdhgfv · on Jan 29, 2019

Not the parent nor Systemd developers, but apparently they think it's the only way to make sure the user's session is cleaned up.

But frankly, 100% of people would be fine with it if the default was left at no instead of changing it to yes. It's all about giving users a choice when a new feature is introduced, something Systemd developers understand only partially.

zorpner · on Jan 29, 2019

> There's a benefit, you're just not seeing it.

Not to appeal to self-authority, but I have been maintaining production Linux systems in large-scale environments since the late 90s. If there were a benefit that outweighed the unnecessary breaking changes, I would see it, even if I didn't appreciate it. There isn't.

You should stop and think before you assume that other people are incompetent, both because it would make you a better interlocutor, and as a bonus it wouldn't violate HN's principle of charity.