Systemd as tragedy (lwn.net)
288 points by wyldfire on Jan 29, 2019 | 401 comments


Back in 2016, systemd started killing user processes on logout (rather than send them the SIGHUP signal, as POSIX says should happen). This caused problems for programs like nohup, screen and tmux, which deliberately keep running. Systemd's response was to say that they should incorporate systemd's library, and use systemd's new daemonization API. As far as I know, none of them did.

Two years later, you can find hundreds of support requests across the internet, from frustrated users who are having their sessions killed by systemd.

Bugs are annoying, but that's life. On the other hand, when you're an impacted user who's lost work, and researching the bug leads you to a years-old discussion in which someone is actively denying that the bug exists and refusing to fix it, that's infuriating. I don't think systemd's developers deserve the trust that maintaining a core piece of infrastructure requires; they don't seem to care enough about whether they've broken things.


You know, because we knew this would be controversial we made sure it was both a compile-time option and a runtime option. Yes the upstream default of both defaults to on, but that's just upstream. We made it very easy and supported for downstream distros to switch between opt-out and opt-in of this option for their users. We have encouraged distributions to leave it on, but we were fully aware that for compatibility reasons this is something downstreams likely wanted to turn off, and most compat-minded distros did, as we expected.

Now I am used to taking blame for apparently everything that ever went wrong on Linux, but you might as well blame your downstream distros for this as much as you want to blame us upstream about this, as it's up to them to pick the right compile-time options matching their userbase and requirements in compatibility, and if they didn't do that to your liking, then maybe you should complain to them first.

(And yes, I still consider it a weakness of UNIX that "logout" doesn't really mean "logout", but just "maybe, please, if you'd be so kind, i'd like to exit, but not quite". I mean, that's not how you build a secure system. We fixed that really, fully knowing it would depart from UNIX tradition, but that's why we made it both compile-time and runtime configurable)

(Also, nobody has to "incorporate" systemd's library to avoid the automatic clean-up. In fact, there's no library we provide that could do that. What was requested though is to either run things as child of systemd --user or just register a separate PAM session, neither of which requires any systemd-specific library.)

Lennart


> Now I am used to taking blame for apparently everything that ever went wrong on Linux, but you might as well blame your downstream distros for this as much as you want to blame us upstream about this, as it's up to them to pick the right compile-time options matching their userbase and requirements in compatibility, and if they didn't do that to your liking, then maybe you should complain to them first.

It's up to you as a systemd developer to pick sane defaults. Claiming that it's okay to introduce opt-out breaking changes upstream and then abdicate responsibility is quite a bit like walking around while waving your hands and arms around and then blaming whoever you hit for walking into you.


Well. What is a distro for then if not for picking the most high-level of defaults suitable for them?


> Well. What is a distro for then if not for picking the most high-level of defaults suitable for them?

IOW the distros maintainers made a mistake by picking systemd? Agreed.


You are right. Distros failed us completely by choosing systemd.


Killing software that might be running after a valid login session is a sane default.


And that's what SIGHUP is for. The process will exit by default. If that's not the desired behavior a handler can be registered. Killing things that are explicitly designed to run after logout is a piss poor default.
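To make the handler point concrete, here is a minimal Python sketch (not taken from any of the tools discussed; the names `on_hangup` and `demo` are made up for illustration). It registers a SIGHUP handler, sends the signal to itself, and keeps running instead of taking the default action of terminating.

```python
import os
import signal
import time

received = []  # signal numbers seen by the handler


def on_hangup(signum, frame):
    # Record the hangup instead of letting the default action end the process.
    received.append(signum)


def demo():
    """Install a SIGHUP handler, hang up on ourselves, and report what arrived."""
    received.clear()
    previous = signal.signal(signal.SIGHUP, on_hangup)
    try:
        os.kill(os.getpid(), signal.SIGHUP)  # simulate the hangup
        deadline = time.monotonic() + 1.0
        while not received and time.monotonic() < deadline:
            time.sleep(0.01)  # give the interpreter a chance to run the handler
    finally:
        signal.signal(signal.SIGHUP, previous)  # restore the earlier disposition
    return list(received)
```

Without the `signal.signal` call, the same `os.kill` would terminate the process, which is the default the comment describes.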


We send SIGHUP btw. The kernel's own sending of SIGHUP is bound to the TTY concept btw, which is specific to TTY logins only, not graphical ones.

That said the question is not so much about who sends what, but more about whether a secure system should allow user code to escape lifecycle management or whether logging out means logging out and giving up all resources.


I get what you're saying. However, I'd probably apply the kernel rule of "when maintaining the kernel, do not do something which breaks user programs/applications". Yes, this isn't the kernel, but it's comparable in being a core function that heavily affects userland stuff.


Sometimes the old way of logging out is just insecure. And there is no way to conjure up a new backward compatible and secure way. cgroups work well, especially because they are not opt-in. That means programs that daemonize either have to set themselves up as a system service or start a new logind scope (or PAM session, etc., which translates to escaping the cgroup, which requires user approval to remain secure).
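As a rough illustration of what "being in a session's cgroup" looks like from a process's point of view, the Python sketch below parses text in the format of `/proc/<pid>/cgroup` and pulls out a logind session scope if one is present. The sample path and the `session_scope` helper are assumptions for illustration; real layouts vary by distro and cgroup version.

```python
def session_scope(cgroup_text):
    """Return the session scope named in /proc/<pid>/cgroup text, or None."""
    for line in cgroup_text.splitlines():
        # Each line has the form "hierarchy-id:controllers:path".
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        for component in parts[2].split("/"):
            if component.startswith("session-") and component.endswith(".scope"):
                return component
    return None


# Illustrative sample only; a real process would read /proc/self/cgroup.
SAMPLE = "0::/user.slice/user-1000.slice/session-3.scope\n"
```

A process whose cgroup path contains a session scope is tied to that login session; a process started as a system service would show a path without one.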


> more about whether a secure system should allow user code to escape lifecycle management

Please stop trotting that tired old line out. It is simply untrue. Systemd does the exact opposite of providing increased security. If nothing else the greatly increased surface area of systemd makes for a less secure system.

The pwnie articulates a number of other ways in which your code and your behavior are actively reducing the security of Linux.


I know right, I run openvpn as user nobody and I keep thinking that nobody user better stay logged in!


If you created a problem, it's your duty to provide a workaround or a solution to the problem. Why not provide systemd specific version of `nohup` for such cases and encourage users to use it instead of old and insecure version?


This. There's a reason the de facto way to keep running post logout was named "nohup". This wasn't some deep dark unknown secret behaviour that was broken.


It was called that because connected pty devices could hang up. Whether hanging up due to intentional logout or actually hanging up the modem was, and is, left as an exercise to the user. Unless we try to disambiguate it via login/pty manager programs, that is.


Because 1) maintainer can be overloaded, so (s)he will stick to defaults, 2) maintainer needs a logical reason to change default setting to something else, which is not obvious in most cases. Maintainer is not a QA team.


Look, it's everyone's responsibility, this doesn't just fall on Systemd. While it's clear that Systemd made some difficult changes to how user processes operate, it still performed the due diligence of providing the original behavior as configurations. They should reconfigure their tools. If they're not doing that, then it's not necessarily Systemd's fault that things don't work for sysadmins trying to use their tools.


Wait a minute. Why isn't it the distro's responsibility to choose the most compatible defaults?


Isn't it more efficient if 1 upstream picks the sane defaults rather than N distros? The situation was exactly the same when PulseAudio was introduced in Ubuntu. Audio broke for a huge amount of users and according to upstream it was because they had configured it wrongly...

IMO, it is part and parcel of designing great software that you pick as universally agreeable defaults as possible.


It's the responsibility of both to pick sane defaults. When the software developer picks insane defaults they are being antisocial, those distro packagers are people too and developers who pick insane defaults are causing unnecessary grief for packagers.


If you smell shit while walking down the street, maybe someone dropped a deuce on the sidewalk. If you smell shit everywhere you go, maybe it's you, maybe you shat your pants. When you violate the principle of least astonishment you're creating a huge stink.

That you can configure systemd to behave in a less obnoxious manner is well beside the point. Systemd should be unobtrusive and predictable without any extra action on the part of the distribution folks or end users.

That the suggestion is to simply read the code or documentation is the height of arrogance considering how sloppy and insecure the systemd code is (parse error equals root privileges? come on…).


Your argument assumes that systemd is simply meant to be an in-place compatible drop-in for what it replaces, which I don't think is something anyone would/should expect. If systemd was meant to behave the exact same way as systems it is replacing then there wouldn't be much point of it. For those cases it sometimes will break things, and will sometimes have settings to follow previous behavior.


There's plenty of room within the POSIX specs to address service management without requiring kernel integration, breaking userland tools, etc. When your init replacement manages to interfere with the kernel you've done something very, very wrong.


Not sure if I missed something here but how has it interfered with the kernel? AFAIK it has broken some userland tools (which is bad in itself in most cases), but actually breaking kernelspace is not something I've heard of.


https://igurublog.wordpress.com/2014/04/03/tso-and-linus-and...

Yet just two days ago, we see Linus Torvalds (the creator of Linux and maintainer of the Linux kernel), launching into a tirade against – yes, you guessed it – systemd developers because of their atrocious response to a bug in systemd that is crashing the kernel and preventing it from being debugged. Linus is so upset with systemd developer Kay Sievers (gee, where I have heard that name before – oh, that’s right, he’s the moron who refused to fix udev problems) that Linus is threatening to refuse any further contributions from this Red Hat developer, not just because of this bug, but because of a pattern of this behavior – a problem for Kay because Red Hat is also foaming at the mouth to have their kernel-based, no doubt bug- and security-flaw-ridden D-Bus implementation included in our kernels. Other developers were so peeved that they suggested simply triggering a kernel panic and halting the system when systemd is so much as detected in use.

The key phrase there is:

a bug in systemd that is crashing the kernel and preventing it from being debugged

Honestly though when you get Linus flaming your behavior you're doing something really wrong.


_Honestly though when you get Linus flaming your behavior you're doing something really wrong._

Haven't been around here long, have you? :-)


Likewise, of course, or you'd know that the tirades were more often than not in response to things that were indeed "really wrong" (at least by his standards).


Yeah I know Linus likes to go on a good tear. But I'm not talking about flaming your code or design decisions, but flaming your behavior.


from 2014. I'm only pointing it out to make it clear that the post wasn't recent. Not questioning anything else about it.


Some distros focus on user convenience some on security. Different defaults are required.

And sometimes security requires breaking compatibility.


There's a bug here, which impacts end users: a variety of programs which are clearly intended to persist in the background (nohup, tmux, etc) are failing to persist. This is a real bug. We care about it. I won't be satisfied until it appears that the bug is on track to be fixed, and a lot of other people won't either.

The options for fixing the bug are:

* nohup, tmux, emacs, etc all take dependencies on systemd and use the new systemd daemonization procedure. This is not a viable path because the maintainers of those utilities have refused (see https://github.com/tmux/tmux/issues/428), and because there are too many of them.

* Each distro separately works around the problem by maintaining forks of nohup, tmux, etc. This is not a viable solution because it's way too many forks; people will be finding broken distro+utility pairs forever.

* Each distro separately works around the problem by putting loginctl enable-linger in /etc/profile and setting KillUserProcesses=no. This would effectively be overruling systemd's decision. Some distros won't know they need to do this, and the github systemd repo becomes a trap.

* Or: systemd backs down and changes the defaults so that the old daemonization APIs work again.
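For anyone trying to work out which behaviour a given machine has, the third option comes down to a single setting. The Python sketch below reads `KillUserProcesses` from logind.conf-style text. It is a simplification under stated assumptions: it ignores drop-in directories, and `compiled_default` stands in for whatever default the distro built in, which cannot be known from the file alone.

```python
import configparser


def kill_user_processes(conf_text, compiled_default=True):
    """Return the KillUserProcesses value set in logind.conf-style text.

    compiled_default is an assumption supplied by the caller: it is used when
    the file leaves the option unset or commented out.
    """
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    return parser.getboolean("Login", "KillUserProcesses", fallback=compiled_default)
```

For example, `kill_user_processes("[Login]\nKillUserProcesses=no\n")` reports that the old persist-after-logout behaviour is in effect, while a file with the line commented out falls back to whatever default was passed in.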

If you have a fifth option, we'd all love to hear it. But the status quo is that there's a user-facing bug, and the bug is still there. Rather than make the case for it not being a bug, you're currently making the case for it being someone else's bug, but the "someone else" doesn't actually have the power to fix it. You are the only one with the power to fix this bug.


> If you have a fifth option, we'd all love to hear it.

Replace systemd with something else.


There's literally nothing wrong with OpenRC


Devuan


I don't understand the issue. systemd offers the option to override the default. It's literally a config. If it's such a big deal, why don't the distros just override it? It's a one-time change.


> And yes, I still consider it a weakness of UNIX that "logout" doesn't really mean "logout", but just "maybe, please, if you'd be so kind, i'd like to exit, but not quite". I mean, that's not how you build a secure system.

As an aside this is the height of arrogance to suggest that systemd is somehow a more secure alternative. Lest this be considered an empty ad hominem attack, let me quote the pwnie you won in 2017[1]:

> Where you are dereferencing null pointers, or writing out of bounds, or not supporting fully qualified domain names, or giving root privileges to any user whose name begins with a number, there's no chance that the CVE number will referenced in either the change log or the commit message. But CVEs aren't really our currency any more, and only the lamest of vendors gets a Pwnie!

1: https://pwnies.com/archive/2017/winners/#lamestvendor


> giving root privileges to any user whose name begins with a number

https://github.com/systemd/systemd/issues/6237

oh my god, what a spectacular issue. And, seriously, Poettering's response is basically "not my job" and "not a bug". And this person develops something that sits at the core of a modern linux system...


> oh my god, what a spectacular issue. And, seriously, Poettering's response is basically "not my job" and "not a bug". And this person develops something that sits at the core of a modern linux system...

All the while Lennart claims that he's making Linux more secure. FFS.

Edit: I forgot about this

https://igurublog.wordpress.com/2014/04/03/tso-and-linus-and...

> He (Theodore Ts’o) goes on to describe how he previously had to neuter policykit’s security (rendering his system very vulnerable) just to get his system working, and how he has found systemd "very difficult sometimes to figure out".

And:

> As for Kay Sievers, maybe he should rename himself to Kay Sewers, because that’s exactly what he smells of. He told to IETF internet area director and previously DHCP working group co-chair “Tod Lemon” to lmgtfy when he asked about a systemd related git repository.

This gem sums it up perfectly though:

> Yet just two days ago, we see Linus Torvalds (the creator of Linux and maintainer of the Linux kernel), launching into a tirade against – yes, you guessed it – systemd developers because of their atrocious response to a bug in systemd that is crashing the kernel and preventing it from being debugged. Linus is so upset with systemd developer Kay Sievers (gee, where I have heard that name before – oh, that’s right, he’s the moron who refused to fix udev problems) that Linus is threatening to refuse any further contributions from this Red Hat developer, not just because of this bug, but because of a pattern of this behavior – a problem for Kay because Red Hat is also foaming at the mouth to have their kernel-based, no doubt bug- and security-flaw-ridden D-Bus implementation included in our kernels. Other developers were so peeved that they suggested simply triggering a kernel panic and halting the system when systemd is so much as detected in use.


Only the root user can put such an invalid unit file into a directory where systemd will read it - what is the security impact exactly?


The security impact is that if you allow a user to choose their own username, and you use a standard POSIX specified way of verifying that the username is valid, and at any point in time you run a service as that user, an attacker can gain root privileges.
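A small Python sketch of the mismatch being described, under loudly stated assumptions: `POSIX_NAME` is one reading of the POSIX portable character set for user names, and `STRICT_NAME` is a hypothetical stricter rule that rejects a leading digit. Neither is systemd's actual code. The point is the failure mode: when the two checks disagree, the safe reaction is to refuse to start, not to fall back to root.

```python
import re

# Portable filename character set; a hyphen may not come first.
POSIX_NAME = re.compile(r"[A-Za-z0-9._][A-Za-z0-9._-]*")
# Hypothetical stricter rule, for illustration only: no leading digit, no dots.
STRICT_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_-]*")


def resolve_user(name, validator=STRICT_NAME):
    """Return the user name to run as, or raise. Never falls back to root."""
    if not validator.fullmatch(name):
        raise ValueError(f"refusing to start: invalid user name {name!r}")
    return name
```

A name like `0day` passes the first pattern and fails the second, so a signup form using the first check and a service manager using the second will disagree about it.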


Or if you have a package that generates a service user that starts with a digit. Then you'll be running an arbitrary service as root in which case any vulnerabilities become that much more serious. Or have things regressed so much with systemd that the standard is now verify each and every thing you have the init system do?

The other problem is, of course, the utter lack of understanding Lennart demonstrates by being so dismissive and the increased potential for systemd to be hiding future security vulns.


You know it's open source and that you could actually get involved? If you submit a pull request and it doesn't get merged you can take your concerns to the larger group.

As to the stuff mentioned in the pwnie. Those sound like great contributions that would be appreciated.

You could also take your concerns to the distro development group. If that doesn't work you could also customize your distro with a custom build of systemd.

If you still don't get satisfaction you can stop using it.

If you dislike how they do things you have options. Or, you could just be mean on a forum...


For what it's worth, systemd makes my life easier.

When I switch distro, it's almost always systemd, and not the system du jour, so I know how it works. Creating service files is a google query away, and makes common use cases a breeze, while advanced features that were hard to script yourself in bash are now just a few options to type.

I understand that many people may have problems with systemd for their particular situation, but that's not my experience.

As a dumb user with a few laptops and servers that need an occasional daemon, I'm glad systemd won. I know you get a lot of heat since it came out, so thank you for working on it.


Sure, systemd solves a number of real problems. This is good.

What is not as good: (1) systemd takes over or duplicates functionality not related directly to its primary purpose, and (2) is not solid enough to trust it in a number of cases, while (3) the developers' attitude does not give a lot of hope that the situation will materially improve.

(Of course, I run a distro without systemd.)


> I still consider it a weakness of UNIX that "logout" doesn't really mean "logout"

Ok, but UNIX and its behaviour have evolved over forty years, and users have a certain set of expectations about it.

Also, it should be noted, systems like UNIX are cultural artifacts. The way they are is the result of forty years of back and forth debate and negotiation and eventually compromise.

I can't speak for all of them, but I think that people that are bothered by systemd are upset that all of history has been brushed aside to make place for the preferences of just a few influential developers.

Whether a feature like logout is "logical" or not is beside the point. Operating system design isn't just about logic, it's about serving users.


Yes, indeed, it's not about logic, as those same users cheer Linux instead of sticking with BSD, and then complain about not being UNIX enough.


That was the point of OP's article. That it's hard to change.


Completely agree. The problem is not upstream, but downstream. Distros should have done better job and chosen a better default system manager and not systemd.

You build your software the way you want and like. If others don’t like that it breaks POSIX they should stop using it instead of complaining. Or fork it.


> What was requested though is to either run things as child of systemd --user or just register a separate PAM session

When you run your screen or tmux below `systemd --user`, you still would have to `loginctl enable-linger`, no? I remember having to do that when I set up a PulseAudio server on a headless machine where I don't maintain an active session.
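For reference, the two commands in play here can be written down as follows. This Python sketch only builds the argument lists and does not run anything; the helper names and the user name in the usage note are made up. Whether a scope started this way survives logout without lingering enabled depends on the configuration, which is exactly the question above.

```python
import shlex


def scoped_command(argv):
    """Wrap a command so it starts in its own transient scope under the user manager."""
    return ["systemd-run", "--user", "--scope", *argv]


def linger_command(user):
    """Build the command that asks logind to keep a user's manager running after logout."""
    return ["loginctl", "enable-linger", user]


def as_shell(argv):
    """Render an argument list as a copy-pasteable shell line."""
    return shlex.join(argv)
```

For example, `as_shell(scoped_command(["tmux", "new-session", "-d"]))` gives the line to type for a detached tmux session, and `as_shell(linger_command("alice"))` gives the lingering command for a hypothetical user `alice`.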


> still consider it a weakness of UNIX that "logout" doesn't really mean "logout" ... I mean, that's not how you build a secure system

so, unix has been running for 20+ years laden with this security flaw? strange that nobody has been screaming out to plug it all this time.

this feels like you have a bee in your bonnet that it is not a very 'pure' logout by some interpretation of what a "logout" should be. imho, "logout" should mean what it has always meant in the past.


Lennart, thanks for the information. Mind explaining why you chose to kill user processes on logout as the default?


I think my comment above explained that already.


I think tasuki is asking you to elaborate a bit further on what kind of security issues you have solved by not using the SIGHUP signal. I would personally also like to hear more in-depth details, preferably with some examples of security vulnerabilities that were caused by that POSIX design choice.


Well, this boils down to: in a modern operating system, is it good design that an unprivileged user who logs in once can consume arbitrary runtime resources uncontrolled, unbounded forever, even after logout just because they decided to mask SIGHUP? I think not, I think the system should default to behaviour where unprivileged processes are clearly lifecycle bound, and when the user's sessions end they end comprehensively. I mean, other OSes don't really allow this unprivileged either, for good reasons: the lifecycle of the unpriv user's processes should be controlled by privileged code, and clearly be defined by the act of logging in and logging out in its lifetime.

It's entirely OK if the admin then opts out specific users or even all users from this behaviour, i.e. if a privileged player decides to liberalize unbounded, unlifecycled resource consumption for unprivileged players. But a default where unprivileged code can just stick around uncontrolled and consume as much as it wants forever is just a strange choice security wise.

i.e. I think the fact that SIGHUP masking is unrestricted, i.e. is not subject to privilege checks is the problem really. Something is unpriv by default that should be priv by default. And that's pretty much what this option in systemd provides you with.
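To show what "masking SIGHUP is unrestricted" means in practice, here is a minimal Python sketch (the function name `survives_hangup` is made up). It starts a child with SIGHUP either ignored or left at its default, sends the signal, and reports whether the child is still alive. An ignored disposition is inherited across exec, which is the mechanism nohup relies on, and nothing in it requires privilege.

```python
import signal
import subprocess
import sys


def survives_hangup(ignore):
    """Start a sleeping child, send it SIGHUP, and report whether it survived."""
    disposition = signal.SIG_IGN if ignore else signal.SIG_DFL
    child = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(30)"],
        # Runs in the child between fork and exec; the disposition set here
        # carries over into the new program.
        preexec_fn=lambda: signal.signal(signal.SIGHUP, disposition),
    )
    try:
        child.send_signal(signal.SIGHUP)
        try:
            child.wait(timeout=1.0)
            return False  # the child exited, so the hangup took effect
        except subprocess.TimeoutExpired:
            return True  # still running one second later
    finally:
        if child.poll() is None:
            child.kill()  # clean up the survivor
        child.wait()
```

This only covers SIGHUP. It says nothing about what a session manager may do afterwards to the processes left in a session.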


> Well, this boils down to: in a modern operating system, is it good design that an unprivileged user who logs in once can consume arbitrary runtime resources uncontrolled, unbounded forever, even after logout just because they decided to mask SIGHUP?

This was well known and accounted for where necessary. You considered everyone else to be wrong about the issue and went ahead and fixed it according to your opinion. Don't be surprised that a considerable portion of "everyone" doesn't agree with you.


> This was well known and accounted for where necessary.

Could you please explain that in a bit more detail?


> is it good design that an unprivileged user who logs in once can consume arbitrary runtime resources uncontrolled, unbounded forever

An unprivileged user can still do this by setting up an intermediary box that keeps a persistent ssh session open. Incidentally, this is exactly what I plan to do if I ever need to ssh into a server with KillUserProcesses=yes.

> other OSes don't really allow this unprivileged either

On Windows, if I remote desktop from a laptop into a desktop, and start a web server, then shut down the laptop, the server stays running. On iOS if I start drafting an email, and reboot my phone, I don't lose my work. On ChromeOS, my tabs will stick around after a system crash. The world is moving toward processes being _more_ persistent, not less.


Windows has a different concept for services and processes. All of your processes are killed when you log out.


If you already have a middle box, then great, but usually malware (eg a nasty Chrome extension) likes to stick around to snoop on user activity. (Preferably on all user activity, forever.)


> If you already have a middle box, then great, but usually malware (eg a nasty Chrome extension) likes to stick around to snoop on user activity. (Preferably on all user activity, forever.)

Well I'm certainly seeing why people get so frustrated with systemd junkies. Killing a "rogue" Chrome extension doesn't provide any meaningful form of security. There's no privilege escalation in play here. Whatever snooping it could do with you logged out could be done when you're logged in. Snooping on all users? Yeah, not going to happen without privilege escalation (which systemd will happily provide). So while systemd introduced this obnoxious behavior that broke all sorts of commonly used utilities no benefit was gained (except perhaps reinventing the wheel).

Meanwhile if you're worried about security don't forget that systemd has introduced a number of denial-of-service vectors (including one that results in a kernel panic) as well as an actual privilege escalation bug (which, in a fit of irony, could've been mitigated significantly by respecting return value tradition of zero = success). Take a look at the privilege escalation bug remedy, the vuln was due entirely to breathtakingly sloppy code. I'm ignoring the whole dereferencing unchecked pointers thing because that's such laughably bad practice I don't even know where to begin. Then take a look at Lennart's response and his unwillingness to mention CVEs anywhere.

The end result is that you have a combination of: breaking changes offering zero benefit, sloppy code resulting in reduced security, and a complete absence of any sort of security culture. Lennart, IBM, and systemd can claim all sorts of things (perhaps there really is a value in moving away from shell scripts) but security? No. There is absolutely ZERO merit to any claim that systemd increases security. The lack of security culture and defensive coding that permeates systemd all but guarantee future vulnerabilities.

Edit:

But wait! There's more!

https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2017-9445

Systemd is also remotely exploitable. Sure, no program is perfect, but most programs strive to decrease the attack surface where systemd strives to increase it.


> Well, this boils down to: in a modern operating system, is it good design that an unprivileged user who logs in once can consume arbitrary runtime resources uncontrolled, unbounded forever, even after logout just because they decided to mask SIGHUP? I think not, I think the system should default to behaviour where unprivileged processes are clearly lifecycle bound

If there were some way to design this so that nohup would give a permission denied error on start and tmux would give one on detach, rather than die on logout when it's too late to display a warning, that would be a lot better. There may not be a feasible way to do this, but it would solve a key part of this problem, which is that people don't find out about this behavior until something has already gone wrong, and don't find out that systemd is responsible for the behavior until after they've gotten frustrated enough to be mad about it.


From a hosting perspective I understand the issue being addressed but I don't see specific problems being solved. For example, if I lock out a compromised account by locking the unix user they can still be currently logged in with running processes which I then need to manually address and kill, and they can also have cron jobs which restarts them. Services like Apache (with mpm_itk) will still change user-id to those locked users. There is no general system-wide method to declare that a user and all its connected aspects should stop being available, and therefore a compromised account must currently be handled rather individually.

What I see most companies in this industry do is use per-user virtual machines to address the issue, which completely bypasses the question about logged in and logged out. It would be interesting if the intention in current development is to give us administrators more options here and allow for cleaner handling of compromised accounts.


But the point is that Linux and Unix aren't modern operating systems. They're ancient, and built upon decades and decades of work by hundreds of thousands of developers. You can't just decide to break norms handed down through the decades.

And I don't think anyone really had a problem with the old default of letting things run. Worse is better, after all. Pursuit of a perfect system will just make things too complicated, too brittle, and too obtuse.

I like Unix because it doesn't try to solve every problem. It's a libertarian operating system, if you will. Sometimes this causes problems, sure, but if the system is simple and liberal, you can always fix them without much effort.


They show that you do not understand what "log out" is for in Unix.


I feel like the kernel policy of "don't break userspace" would be a valuable one for y'all to adopt.


>You know, because we knew this would be controversial we made sure it was both a compile-time option and a runtime option.

This is standard from you. You knock the glass on the floor and blame the maid service for not cleaning up after you.

It's everyone's fault but yours.

>And yes, I still consider it a weakness of UNIX that "logout" doesn't really mean "logout", but just "maybe, please, if you'd be so kind, i'd like to exit, but not quite".

Oh how hyperbolic. Nuances and caveats in terminology are not a weakness.

I don't see why you're splitting hairs over this but can't be bothered to care about your UID numbering bug.

Or the fact systemd-resolved is responsible for DNS leaking on VPNs.

But yes, tell me more about how a functionality that enables terminal multiplexers is a "weakness"

>Now I am used to taking blame for apparently everything that ever went wrong on Linux,

It's because of your smarmy arrogance.

You break POSIX compliance, which has a real world effect in multiple areas and you accept bug reports with the humility of Donald Trump being interviewed by MSNBC.

Then when you retreat into your safe space, you play victim to the situation you created.

You talk of Linux culture toxicity, smearing the likes of Linus Torvalds, while essentially being the metaphorical sibling putting your finger in people's face repeating "I'm not touching you" over and over. Then you act attacked when someone claps back.

You're a cry bully hiding behind a veneer of professionalism acceptable for Red Hat's HR department which enables you to mark one more bug as "wontfix"; your attitude, your arrogance, your conceits that things not broken in fact, are so you can provide solutions no one asked for and no one benefits from.


To be fair, at least poettering presented an argument and is responsible for software that helps a whole bunch of us get things done.

You're just kind of yelling, and it diminishes any point you may have made.


After a while, anyone who deals with Lennart just starts yelling, because he is impossible to reason with. He's very intelligent, and absolutely convinced that his is the One True Correct Right Way. It doesn't matter that hundreds or thousands of voices oppose him; I don't think it would matter if every single human being on earth opposed him.

What makes it worse is that he's often not completely wrong. Linux did need something like PulseAudio, something like Avahi and something like systemd. But his reach exceeds his grasp (which probably applies to us all, as I've found on my own projects), which leads to the well-known problems of PulseAudio & systemd.

I don't actually want him to quit the Linux world. But I wish he would scale back his ambitions just a tad, and consider that maybe — just maybe — other people have some good points, and valid concerns.

And also Windows/DOS are not terribly good design exemplars.


I get what you are saying, but

> It doesn't matter that hundreds or thousands of voices oppose him; I don't think it would matter if every single human being on earth opposed him.

makes it seem like everyone that uses systemd hates it or sees the same flaws as you or the other people yelling.

I and many others started admin'ing during or slightly before the systemd transition (ubuntu14->16 and rhel6->7) and have found it a much easier path to running services in a sane way than before. It was certainly possible before it, but with systemd I can do it a lot better and more easily than I would have been able to with previous inits.

For every person saying that systemd made things worse I expect there to be 10 silent sysadmins that appreciate what it did. I have no evidence of that, but that is my experience.


It does a lot more than manages services.

It breaks screen and tmux functionality, leaks DNS when connected to a VPN, and is riddled with "wontfix" security vulnerabilities stemming from a refusal to be POSIX compliant.

Systemd replaced udev for crying out loud.


That might be true and still not contradict what I said. A lot of the systemd critics still seem to not see what it actually did for most people using it. You're free to hate it and some of that is certainly justified, but don't assume that the contrary opinion is based on uneducated or misguided opinions.

Most of what I see/use of systemd I like. Some of it I don't, and some of it is a dumpsterfire. I think I could say the same or worse for any ambitious software project.

As for the security issues I certainly place those in the dumpsterfire category and I'd like for the systemd team to handle them better.


You know what? Systemd generally works for me. Sure there's teeth gnashing at having all my userland tools upended. I've frustration at the unit file specs. But it mostly works.

That, however, does not mean that systemd is anything other than a giant fucking dumpster fire. Looking at how Lennart interacts with other Linux devs, how he reacts to bug and security reports, looking at the lack of code review and the shoddy design decisions that get baked into systemd… it appears as if systemd mostly works through sheer luck. That sort of approach may be acceptable when you're talking GNU vs X emacs, but it's absolutely the wrong approach to such a critical piece of software.

The other thing I'm missing is any improvement. All of this upheaval has been for what? Assuaging Lennart's ego? Not good enough.

> You're free to hate it and some of that is certainly justified, but don't assume that the contrary opinion is based on uneducated or misguided opinions.

When the article being discussed consistently wrongly characterizes and dismisses technical arguments against systemd I think it's fair to say it's a bit more than misguided.

> As for the security issues I certainly place those in the dumpsterfire category and I'd like for the systemd team to handle them better.

Yeah, no. Security as an afterthought is a bad approach in general but it's even worse when you're talking about low level bits like PID 1, the kernel, boot loader, etc. This right here is enough reason to run, screaming far far away from systemd.

You know the best part though? I've had plenty of frustration with upstart (especially with features they've decided to remove over the years). None of this compares to the heavy handed, anti-social bullshit that seems to engulf systemd. Hell, I recently bought a replacement laptop. I even entertained the idea of a Linux machine. Systemd and its effect on Linux on the desktop was one of the top reasons I went with another MacBook Pro.


I agree. I love systemd as compared to the other ways (though I think launchd is pretty nice too).


So you made the default the worst possible option, because... why exactly? And now that the problem is apparent, you haven't changed the default because...? I don't know what goes through your and the rest of the systemd's team's heads, but good software engineering it is not.


You've done great work as a whole, as you probably know. Try not to let the lowlifes get to you.


Absolutely. I can understand implementing this feature for some special cases, like containers that should clear all hint of a user away on log off. It should never have been the default, and breaks an entire category of software. In my standard .bashrc file, I have the following snippet to warn me if I am on a system with that stupid setting enabled.

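    # Warn at login if logind is configured to kill leftover user processes.
    # The second test checks that loginctl can actually reach logind.
    if command -v loginctl > /dev/null && loginctl > /dev/null 2>&1; then
        # With no user argument, show-user prints the manager's properties.
        if loginctl show-user | grep KillUserProcesses | grep -q yes; then
            echo "systemd is set to kill user processes on logoff"
            echo "This will break screen, tmux, emacs --daemon, nohup, etc"
            echo "Tell the sysadmin to set KillUserProcesses=no in /etc/systemd/logind.conf"
        fi
    fi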


Thanks, now I know why Emacs daemon keeps delaying my restarts in the system (just discovered that NixOS defaults KillUserProcesses to false).

I'm setting this to true: for me it does not make sense for a user service (yeah, I run emacs as a systemd user service) to keep running after I log out of my system.

P.S.: And the fact that for some people this behavior makes sense is why I think Lennart's decision to make this an option makes sense.


I'm glad that it helped resolve your issue, though I still don't think it was an appropriate choice for a default. I tend to do most of my work on a remote server, using tmux and emacs daemon to pick up right where I left off in the case of a dropped connection. That systemd would terminate my process when I explicitly requested it not to be is very abnormal.


You haven't requested systemd, you started a user scope, and haven't started a service for what you need.

POSIX is nice, but rather lacking in certain aspects, such as security and administration-friendliness. cgroups help with both, but people have to understand them and use them well.


Handling and ignoring SIGHUP is the explicit way to indicate that a program should not be terminated. That systemd invented a new category and then ex post facto declared that everybody else was wrong for not using it is ridiculous. Systemd changing behavior such that I must "Simon says nohup" is completely asinine.
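For anyone unfamiliar with the mechanism, here is a minimal sketch of the two SIGHUP dispositions in plain POSIX shell. Nothing in it is systemd-specific. The variable names (`ignorer`, `plain`, and so on) are made up for illustration, and the one-second sleeps are just a crude way to let the background processes start.

```shell
# Ignored signals stay ignored across exec, which is how nohup(1) works.
( trap '' HUP; exec sleep 10 ) &
ignorer=$!

# Default disposition: SIGHUP terminates the process.
sleep 10 &
plain=$!

sleep 1                          # crude: give both processes time to start
kill -HUP "$ignorer" "$plain"
sleep 1

# The process that opted out of SIGHUP should still be there.
if kill -0 "$ignorer" 2>/dev/null; then ignorer_alive=yes; else ignorer_alive=no; fi

# The other one was terminated by the signal, so wait reports failure.
plain_status=0
wait "$plain" || plain_status=$?

# Clean up the survivor; SIGHUP will not do it.
kill -KILL "$ignorer"
wait "$ignorer" 2>/dev/null || true
```

The first process outlives the hangup because it asked to, and the second does not because it never asked. That is the whole contract being argued about here.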


Systemd developers, if you're reading this: this isn't the sort of bug where people grumble for a while and then get over it, because things are still broken, and the workaround being circulated (KillUserProcesses=no) doesn't fully work. (https://github.com/systemd/systemd/issues/8486) As long as people continue to encounter this issue anew--and they still are--people will be angry at the systemd maintainers.


The bug you've linked to was closed[1] by the reporter with "Thanks for the clarification guys. Much appreciated!", after it was pointed out to them that something they were trying to do with "KillUserProcesses=no" was better done in another way.

1. Edit: Not literally closed by the reporter. Lennart Poettering closed it, "closed by the reporter" as in "the issue was resolved to the reporter's satisfaction".


> The bug you've linked to was closed by the reporter

Are we reading the same bug report? The one I'm looking at was closed by the creator of Systemd.


That comes down entirely to how systemd is configured. If you don't like what your chosen distro has picked as the default then complain to them. systemd didn't force anyone's hand on the subject, they just added the feature. It's a pretty natural design choice IMHO. When I want to log out, I don't want to let some hung up daemon keep running just because it wasn't able to process the SIGHUP sent to it.

How else do you propose to make sure that when I log off my ssh-agent is really terminated and not just locked up with my keys still in memory? The POSIX approach is insufficient, there's no way to know if a process received a signal and chose to ignore it and keep running or if it received a signal but it was deadlocked and kept running.


The problem is that you're breaking compatibility by changing the default. It's one thing to add a feature that can solve a problem. It's something else to break existing programs that don't use it.

If you're not going to evaluate each individual program to determine whether the new behavior is appropriate then it should be opt-in rather than opt-out. Then ssh-agent and anything else that knows it should be forcefully killed can opt-in without breaking other innocent programs.


So you think backwards compatibility is so important that we should keep old BROKEN and INSECURE behavior just for the sake of not inconveniencing a few power users with the technical knowledge to override it? Instead, the few who complain loudest should be catered to and regular users left to the wolves…

I think some people sometimes lack any perspective on the topic.


Yes.

I’m not being emotional about it, just irritated.

Systemd has tangibly caused me to lose work with tmux; I appreciate there are root causes for this, but frankly, if some piece of someone’s code does that, for whatever reason that is beyond my control to immediately stop using it...

...it feels justified to be annoyed.

How do you suggest an alternative meaningful response would look?

Create my own distribution?

What tangible and meaningful alternatives do I have other than encouraging people not to use systemd?


> Create my own distribution?

> What tangible and meaningful alternatives do I have other than encouraging people not to use systemd?

Sure, if you think you can actually “test every single program and make everything opt-in.” I think you will however find that making everyone happy and having new features are just simply contradictory by the very definition. At some point you will want new stuff and you’ll have to break something.

The best you could do is adopt BSD’s model and fork tmux and other userland and ship outdated/patched versions. It’s a ton of work, of course.

I am not actually seriously suggesting you create your own distro, after all you can probably just fix the annoying issue with systemd and move on with your life, and Systemd actually makes it easy for you by making it a configuration switch and supporting the non-default workflow.

I am simply suggesting you put yourself in the position of someone that has to make those decisions and really think about it from that perspective. Everything’s always a trade off.


> I am not actually seriously suggesting you create your own distro, after all you can probably just fix the annoying issue with systemd and move on with your life, and Systemd actually makes it easy for you by making it a configuration switch and supporting the non-default workflow.

Given the extraordinary scope of systemd, what happens with the next major issue? Having to perpetually work around poorly designed software is infuriating.

> I am simply suggesting you put yourself in the position of someone that has to make those decisions and really think about it from that perspective. Everything’s always a trade off.

Why should the onus be on the end user? Perhaps the distributions should be making choices that are less antagonistic of their users (e.g. upstart instead of systemd).

You're right about the tradeoffs though, and one of the tradeoffs for buying into systemd is angry users.


> Given the extraordinary scope of systemd, what happens with the next major issue? Having to perpetually work around poorly designed software is infuriating.

Systemd doesn’t break stuff if they just feel like it. Everything is compatible if it can be, for example you can still run /etc/init.d scripts and manage them through systemd on Debian. Lingering processes are also still supported! It’s a configuration switch that most distros decided to turn on by default, because...

> Why should the onus be on the end user? Perhaps the distributions should be making choices that are less antagonistic of their users (e.g. upstart instead of systemd).

... it’s a net benefit to most users. It’s only “antagonistic” to a particular subset of powerusers perfectly capable of working around the issue but somehow more motivated to loudly complain about it on Internet.

> You're right about the tradeoffs though, and one of the tradeoffs for buying into systemd is angry users.

Fair deal if it helps with even 0.1% desktop market share.


> particular subset of powerusers perfectly capable of working around the issue

What is the actual workaround? Is there a patch that unbreaks nohup by passing cwd and env to systemd-run --user or something?



I see arguing but no consensus on what ought to be done.

My use case: I run a shell pipeline that will probably take all weekend to finish. On a POSIX box I start it with nohup. What do I do on a systemd box? Does nohup need a patch that doesn't exist yet?


There are a couple of ways to work around the issue. You can configure systemd to not kill processes that were in the user scope when the user scope is closed, in which case it behaves exactly as it did before. Or, if you want to keep systemd cleaning up hung applications but not, e.g., some script that you typically ran with nohup, you can use systemd-run instead.

https://www.freedesktop.org/software/systemd/man/systemd-run...

In particular you'd probably want --user so that it runs it under your user instance of systemd and --scope so that it's all run under a scope for that command instead of just a transient service. For most uses of nohup you could literally just make it an alias for systemd-run --user --scope instead.
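A rough sketch of that substitution as a shell helper, assuming `systemd-run` accepts `--user` and `--scope` as its man page describes. The function names `detach_prefix` and `have_systemd_run` and the script name in the usage comment are hypothetical. The helper only builds the command prefix and does not launch anything itself.

```shell
# Print the wrapper to put in front of a long-running command.
# $1: "yes" if systemd-run should be used, anything else falls back to nohup.
detach_prefix() {
    if [ "$1" = "yes" ]; then
        # Flags as suggested in the comment above; see systemd-run(1).
        printf '%s\n' "systemd-run --user --scope"
    else
        printf '%s\n' "nohup"
    fi
}

# Hypothetical detection: only checks that the binary is on PATH,
# not that a user instance of systemd is actually running.
have_systemd_run() {
    if command -v systemd-run >/dev/null 2>&1; then
        echo yes
    else
        echo no
    fi
}

# Usage (not executed here; the script name is a placeholder):
#   $(detach_prefix "$(have_systemd_run)") ./weekend-pipeline.sh &
```

Whether this is a drop-in replacement for every use of nohup depends on how the command handles its terminal and on how the machine is configured, so treat it as a starting point.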


I expect that the formal answer is that you should be running that within the service framework (be it systemd or other). My answer is: if you want POSIX-like behavior don't run it on Linux.


SIGHUP isn’t broken & insecure: it works, and it is secure. Processes which don’t want to handle the hangup signal are terminated, and processes which want to ignore it do.


But this just isn't the case. If something stays around after receiving SIGHUP, it was probably because that application intended to do so but it could also just be a hung up application that one way or another is going to stay around until it's killed. Sending a signal doesn't give you any sort of feedback to see if you're waiting for the application to close or if the application shouldn't be closed. Signals alone are insufficient.
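To illustrate the point, here is a small sketch in POSIX shell. The helper name `hup_and_observe` and the variable names are made up, and a stopped process stands in for a "hung" one, which is an assumption made for the demo rather than a faithful model of a deadlock.

```shell
# Send SIGHUP, wait $2 seconds, and report what an outside observer can see.
# Note: kill -0 also succeeds for an exited but unreaped child, so this is
# only a rough liveness check.
hup_and_observe() {
    kill -HUP "$1" 2>/dev/null || { echo "no such process"; return; }
    sleep "$2"
    if kill -0 "$1" 2>/dev/null; then
        echo "still running"
    else
        echo "exited"
    fi
}

# A process that deliberately ignores SIGHUP.
( trap '' HUP; exec sleep 10 ) &
deliberate=$!

# Stand-in for a hung process: stopped, so it cannot act on the signal.
sleep 10 &
hung=$!
sleep 1                          # crude: let both processes start
kill -STOP "$hung"

seen_deliberate=$(hup_and_observe "$deliberate" 1)
seen_hung=$(hup_and_observe "$hung" 1)

# Clean up both processes.
kill -KILL "$deliberate" "$hung"
wait "$deliberate" "$hung" 2>/dev/null || true
```

Both calls report the same thing, so the sender cannot tell "chose to stay" from "could not respond" using the signal alone.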


Tell me more about this perfect world with no bugs and nondeterministic behavior.


Well, there are some pretty severe restrictions on the type of code you can put into signal handlers. Only atomic operations are allowed. And, in my experience, almost all applications react appropriately to signals.


>Well, there are some pretty severe restrictions on the type of code you can put into signal handlers.

Err... Maybe I'm missing something but I don't believe that's the case. There's a lot of things that you shouldn't do inside of a signal handler that will exhibit undefined behavior, but it's not like the kernel puts any restrictions on what the application can do inside of a signal handler. If an application wants to make SIGHUP just call whatever existing application exit logic they already have, they can. It's a terrible idea because if the application was signalled in the middle of some library call then it's anyone's guess as to whether or not it's just going to crash but that doesn't mean that you can't do it.

I think you're underestimating the difficulty of gracefully shutting down an application in a signal handler. If it's waiting for the application to finish some operation it's stuck in it'll just do the exact same thing as using nohup and there's no way to know that outside of the application.


If an application is handling SIGHUP then it presumably intends to continue running. If it used systemd-run instead, it could still get into a bad state at any point thereafter and you have the same problem. Even using a watchdog couldn't fix every buggy application, because there are ways for an application to crash or misbehave yet continue to send the watchdog notification. We still haven't solved the halting problem.

Meanwhile if the process isn't handling SIGHUP then there is little chance of undefined behavior in the default handler, which merely terminates the process immediately.


>If an application is handling SIGHUP then it presumably intends to continue running.

That's not correct, for stuff running in the user's scope more often than not a SIGHUP handler is just to gracefully exit the application, i.e. close any open files, finish any writes in progress, etc.

But also, you don't know what the SIGHUP handler does to begin with. That's the crux of the problem. Outside of the process the SIGHUP handler is just a black box.

>If it used systemd-run instead, it could still get into a bad state at any point thereafter and you have the same problem.

No, if it was started with systemd-run there's no SIGHUP sent to it in the first place. Reaping applications that won't close in the user scope isn't about preventing them from breaking in the first place, it's just sweeping up the broken pieces so that it doesn't break the next user scope because it's still holding some exclusive lock on something.

It's like putting the user session into its own container. It doesn't fix anything, it just keeps the breakage contained to the user's scope so that when you log out, it really does shut down that "container".


> That's not correct, for stuff running in the user's scope more often than not a SIGHUP handler is just to gracefully exit the application, i.e. close any open files, finish any writes in progress, etc.

That's essentially the same thing, and the application would have to do something similar to protect itself.

Suppose the user would lose data if the application doesn't exit gracefully, but this may take a variable amount of time depending on how much unsaved data there is, current load on the machine, etc. So it handles SIGHUP, continues running to save its state, but hasn't finished before systemd kills it.

To prevent this it would have to use systemd-run to preserve itself long enough to finish saving its state, and we're back to square one again. Or it doesn't do that and the user loses data.
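To make the scenario concrete, here is a minimal sketch of a worker that treats SIGHUP as "save and exit", written in POSIX shell. The names `state_file`, `worker`, and `saved` are illustrative, and the single `echo` stands in for whatever slow save a real application would do.

```shell
state_file=$(mktemp)

(
    # On SIGHUP: write out state, then exit cleanly.
    trap 'echo "state saved" > "$state_file"; exit 0' HUP
    # Bounded loop so the demo cannot run forever.
    i=0
    while [ "$i" -lt 10 ]; do
        sleep 1
        i=$((i + 1))
    done
) &
worker=$!

sleep 1                          # crude: let the subshell install its trap
kill -HUP "$worker"

# The shell runs the trap once the current sleep returns, so the save
# is not instantaneous even in this toy version.
worker_status=0
wait "$worker" || worker_status=$?

saved=$(cat "$state_file")
rm -f "$state_file"
```

If whatever sent the hangup follows it with SIGKILL before the trap has run, the write never happens, which is the data-loss case described above.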


When they work, sure. And when they don’t, the user is wondering why their laptop is playing sounds when they’re logged out. Systemd’s solution is the right one from a technical POV. No need to hope applications cooperate when you can just ask the kernel to make sure they do.


>I think some people sometimes lack any perspective on the topic.

Apparently you think Linus is one of those who "lack perspective"?

http://lkml.iu.edu/hypermail/linux/kernel/1711.2/01701.html

I get that systemd isn't the kernel, but it's close enough. There are many who would agree that breaking existing behavior in the name of security isn't wise. I have also not yet seen anyone point out specific security issues this solved. Unix has worked this way for a long time.


User launches voice chat, logs out, application stays around and listens on user/other users. Just one example. Having programs running despite being logged out is unintuitive and wrong. Most users do not know or care about going into a task manager. And if you want Linux to ever have a chance to succeed on desktop, they shouldn’t have to.

As to the Linus’ post, if you want to argue that there wasn’t enough notice about this change, then that’s fine, but this isn’t what anyone here is arguing.

Also it’s a configuration switch, any distribution could have decided to revert it or postpone it at their choosing.


What on earth is broken or insecure about not killing processes?


You watch porn, log out, but mpv is somehow stuck and still playing. Broken enough?


This, right here is an example of what those who oppose systemd mean when we say that it's monolithic.

What gives the init system the right or the duty to reach down into a user's processes and determine[0] that they are stuck (versus running appropriately, as e.g. the user indicated with nohup(1))? Why is it the init system's job to handle that?

That's just not its job. If I wanted to run some sort of misbehaved-process killer, I could. Or, y'know, not running misbehaving processes. Ideally, that would include not running misbehaving processes like anything from the systemd project.

0: or, as in systemd's case, blindly assume


KillUserProcesses is enforced not by systemd (PID 1) but by systemd-logind.


> What gives the init system the right or the duty to reach down into a user's processes and determine[0] that they are stuck (versus running appropriately, as e.g. the user indicated with nohup(1))? Why is it the init system's job to handle that?

If this behavior was mandated by some other piece of software named FluffyUnicorn and had nothing to do with Lennart, but was still widely adopted just as systemd is, would you be ok with it?

It’s in systemd because it makes sense to be there. Systemd already groups services into cgroups so it makes sense to also do that for user sessions.

> That's just not its job. If I wanted to run some sort of misbehaved-process killer, I could. Or, y'know, not running misbehaving processes. Ideally, that would include not running misbehaving processes like anything from the systemd project.

So toggle a configuration switch on your system. What you are actually trying to do is to FORCE this bad and confusing behavior as a DEFAULT on regular users that have no need or want for it.


> If this behavior was mandated by some other piece of software named FluffyUnicorn and had nothing to do with Lennart, but was still widely adopted just as systemd is, would you be ok with it?

If this behavior was mandated by some other piece of software, it wouldn't be as widely adopted as systemd is.

That's the true problem with systemd. It tries to do everything and does 80% of it well enough that many people use it, but then is too complex and integrated with itself to easily identify and carve out the problematic bits and replace them with third party alternatives.


> If this behavior was mandated by some other piece of software, it wouldn't be as widely adopted as systemd is.

So your argument is that this is forced on people because of systemd’s political power?

There’s a configuration option to reverse this behavior, it’s not hidden away somewhere, it’s been widely publicized. Any distro could have flipped the switch and easily reverted to preserve backwards compatibility, but none did. This is because this change is a net benefit to the majority of users.

> That's the true problem with systemd. It tries to do everything and does 80% of it well enough that many people use it, but then is too integrated with itself to easily identify and carve out the problematic bits

Again, you don’t need to fork systemd to change this behavior. If that was the case I would understand the criticism. But that is not the case. The alternative workflow is perfectly well supported. All we’re arguing about is the defaults. Systemd developers go out of their way to not break things.

You’re arguing for making up some abstraction layers for plug-n-play components that no one is demanding, and would probably never be used. Modularity has a cost, and not only that, but you also have to know where to draw the line between core and addon.

And if systemd actually did all of that, I’m pretty sure all those habitual complainers would just argue that it’s over-engineered and should have been kept simple. You can’t win with the peanut gallery.


> Any distro could have flipped the switch and easily reverted to preserve backwards compatibility, but none did.

No, many of them did. The problem is that this is not the only such issue, and distribution maintainers don't have unlimited time and resources to re-evaluate every individual default chosen by upstream, so most of the upstream defaults end up in the distributions. The distributions can fix this once you identify the problem, as e.g. Debian has done, but "you can change it" is no argument for a bad default, because changing it is work, and in the meantime things are broken.

> Again, you don’t need to fork systemd to change this behavior. If that was the case I would understand the criticism. But that is not the case. The alternative workflow is perfectly well supported. All we’re arguing about is the defaults.

If the defaults weren't important then why are you arguing about them?

> Systemd developers go out of their way to not break things.

Yet tmux and screen are broken on the distributions that use upstream's default.

> You’re arguing for making up some abstraction layers for plug-n-play components that no one is demanding, and would probably never be used. Modularity has a cost, and not only that, but you also have to know where to draw the line between core and addon.

You say that as if it wasn't the way everything works in many other init systems. The init system doesn't typically have a DNS server, you can use dnsmasq or BIND or unbound or djbdns or whatever you like. It doesn't have its own cron, there are many choices and you can choose any of them.

And just drawing any hard lines would help. Even if you had to replace two modular components to replace one thing, or one component that does two things when it should be one, that's certainly a lot more feasible than having to understand and touch thirty integrated pieces to replace one component.


> The problem is that this is not the only such issue, and distribution maintainers don't have unlimited time and resources to re-evaluate every individual default chosen by upstream, so most of the upstream defaults end up in the distributions.

Well they should. Otherwise, what’s the point of them?

> Yet tmux and screen are broken on the distributions that use upstream's default.

Of their own volition. And btw, distributions could patch them to work with systemd. None of this is systemd’s fault. Since when is it upstream’s job to make sure downstream properly integrates their software?

> The init system doesn't typically have a DNS server

There’s no DNS server in systemd core. It just lives under the same umbrella. Do you know FreeBSD has DNS server in the same repo as kernel? Does it mean it has a DNS server in the kernel? You know perfectly well that this is just plain false.

> It doesn't have its own cron, there are many choices and you can choose any of them.

Why would you need “many choices” for a simple timer? What are you going to do, invent new type of time?

Anyway, you’re completely ignoring the other perspective on this. Because old style init did so little and so poorly, cron used to be a de facto service manager. Also don’t forget inetd. So you had duplicated, poorly implemented, but nevertheless, redundant functionality in several separate systems. How is systemd’s approach not both less complex and much more sane?

> And just drawing any hard lines would help. Even if you had to replace two modular components to replace one thing, or one component that does two things when it should be one, that's certainly a lot more feasible than having to understand and touch thirty integrated pieces to replace one component.

Why? If you can’t point to where the line is then what’s the point. It’s like saying you want cars to be more modular, so let’s just arbitrarily invent a “motor carriage[1].”

You could replace the engine without the coach, wouldn’t that be swell?

Anyway most of systemd’s components communicate over a common system bus. You could provide alternatives just by speaking the same API.

[1] Sorry, I’m not a native speaker; I mean this: https://en.wikipedia.org/wiki/Coach_(carriage) but with an engine instead of horse


> Well they should. Otherwise, what’s the point of them?

If the distribution is supposed to micromanage everything from upstream then what's the point of upstream?

> Of their own volition. And btw, distributions could patch them to work with systemd. None of this is systemd’s fault. Since when is it upstream’s job to make sure downstream properly integrates their software?

Since when does everything have to integrate with the init system at all?

> There’s no DNS server in systemd core. It just lives under the same umbrella.

It isn't a matter of which repository it's in, it's a matter of how much work it is to swap it out. Can I just run dnsmasq or dnscache and change an IP address somewhere, or do I actually have to change the code because it's expecting something more than a general purpose DNS resolver?

> Why would you need “many choices” for a simple timer? What are you going to do, invent new type of time?

An existing implementation has poor code quality and I can do better, but my new implementation is less feature complete, so some people prefer the one with more features while others prefer the one that has fewer bugs and uses less memory etc. etc.

> Because old style init did so little and so poorly, cron used to be a de facto service manager. Also don’t forget inetd.

Which they still are, because they're still there and there is nothing stopping people from using them in that way as ever.

But runit et al don't require that either, so let's not pretend that there is no third way.

> Why? If you can’t point to where the line is then what’s the point.

Your argument was that it's hard to know where to draw lines. But it's more important that you draw them somewhere than the specific place where you choose to draw them. Otherwise everything mushes together into a single piece of spaghetti that can't be disentangled from itself.

> Anyway most of systemd’s components communicate over a common system bus. You could provide alternatives just by speaking the same API.

Where are the RFCs for these APIs, so that I can write my application against the spec and be assured that it will continue to work against future versions of the software on the other end?


If you don’t like systemd so much then write something better. I mean you’ll find literally anything to dislike about it, I don’t get it. You can still use cron or rsyslog if you like. Or don’t use systemd. This is stupid. I’m done. The default makes sense for 99.99999% of users, literally the only point I was trying to make.


> If you don’t like systemd so much then write something better.

Writing something better doesn't get rid of the dependencies other projects now have on pieces of systemd, which pieces then have dependencies on other pieces until you need the whole thing.

> I mean you’ll find literally anything to dislike about it, I don’t get it.

This thread is about one specific complaint: It has too many interdependencies without well-specified stable interfaces between them, and actively encourages things to take on more of them, as with replacing SIGHUP handling with systemd-run.

> The default makes sense for 99.99999% of users, literally the only point I was trying to make.

This doesn't make any sense. Most applications don't handle SIGHUP and are terminated by the default handler. Applications that do handle it continue to run. If they used systemd-run instead they would also continue to run. Where is the benefit from forcing applications to do something systemd-specific and breaking existing things that don't?


> What you are actually trying to do is to FORCE

It's a rule: if you're advocating systemd, you don't get to accuse anyone else of forcing anything.


What do you disagree with in that sentence? There are defaults, distros have defaults, they’re the subject of this discussion. Anyone arguing for any default is likely dictating the de facto behavior for majority of nontechnical users, which is the majority of users period.


If I've nohup'd mpv or put it in a tmux shell, then that is the behavior I want. For instance, if I ssh into a controller for a home entertainment system to kick off a video, then this would be exactly what I want.


Then you can toggle one simple configuration switch, instead of forcing confusing behavior on the other 99% of users that don’t want or need it.

Take a step back and consider if say Windows did it like that, wouldn’t you agree it is broken?


> Then you can toggle one simple configuration switch

Only if I have root permissions (granted, I probably wouldn't be watching porn on a machine I wasn't admin on but that was just an example application).

> instead of forcing confusing behavior on the other 99% of users that don’t want or need it

Who is forcing users to run programs with nohup or tmux shells?

> Take a step back and consider if say Windows did it like that, wouldn’t you agree it is broken?

I'm pretty sure Windows does do it like this; if I were to remote desktop into a Windows box and start playing a video, it should keep playing even if I disconnect, reconnect, and log back in. It does this for normal applications, at least, though videos are a special enough case where it might be accelerating with the remote GPU.


>Only if I have root permissions (granted, I probably wouldn't be watching porn on a machine I wasn't admin on but that was just an example application).

It doesn't take root to do so, in most cases you probably still want to run the transient scope under your user so you'd use systemd-run --user in order to create it not with the main system instance of systemd but with the user level instance of it.

>I'm pretty sure Windows does do it like this

No, it doesn't. As for your remote desktop example, you can have the exact same behavior on Linux, with systemd reaping user scopes, by just using a VNC server. Windows is different in that it won't let you log off while an application is still running. It gives you the choice to either stop and go back to whatever application isn't closing (because you have unsaved work or something) or to kill it.


> It doesn't take root to do so, in most cases you probably still want to run the transient scope under your user so you'd use systemd-run --user in order to create it not with the main system instance of systemd but with the user level instance of it.

If a non-root user can do it and leave a program running then doesn't that invalidate all that BS about security?


None of this is about trying to prevent the user from using resources. The user is the one who is logging out in the first place. If the user wants to terminate all of their processes except for one daemon they can do that. The security benefits aren't the primary benefit, security wise all you gain is that after you log out there's no chance that anything with any sensitive information is still hanging around. I mentioned ssh-agent as an example but you could also have stuff like maybe chrome didn't close on SIGHUP and as a result maybe this makes your saved passwords accessible to someone who can dump the RAM later by getting physical access to it. It definitely helps security but it's not really that big of a deal.

Ironically enough when I went to Google to search for an example the result that came up was my comments on HN on the same subject from a year and a half ago.

https://news.ycombinator.com/item?id=14735145

Here's a great example of the kind of real life breakage that reaping the user scope on logout actually fixes.

https://bugs.freedesktop.org/show_bug.cgi?id=94508


> Only if I have root permissions (granted, I probably wouldn't be watching porn on a machine I wasn't admin on but that was just an example application).

If you’re not an admin you probably prefer the systemd default. OTOH if you do need to run tmux between sessions you probably have root as well.

> Who is forcing users to run programs with nohup or tmux shells?

You’re forcing confusing behavior (media playing despite logging out) on unsuspecting users. This is unintuitive to nontechnical users, and just “wrong” to most that know the reasons behind it. I haven’t heard any good technical argument for keeping this behavior, only that it should remain like that because a minority is used to it. Though you’re welcome to change my mind.

> I'm pretty sure Windows does do it like this; if I were to remote desktop into a Windows box and start playing a video, it should keep playing even if I disconnect, reconnect, and log back in.

If you connect and disconnect you are not necessarily logging out, it’s equivalent to locking the session, which does keep music playing on Linux/systemd, and btw even offers MPRIS2-based media control right on the lockscreen, at least for Plasma.

Also it can pause the music if you log in concurrently as a different user. This is because systemd (and PolKit) have a very sophisticated seat management built in. For example it treats you differently if you log in remotely or have a seat right at the console. It can offer different authentication mechanisms and permissions (e.g. you need root/admin to shutdown the machine remotely, but don’t if you’re physically at it). All of this is possible and configurable thanks to the work of Lennart and others.

The question at hand is only whether you make the default the behavior that makes sense to 99% of regular users or to the few loudest.


complain to distros then. (systemd set the secure default, even if that breaks backward comp, as usually upstreams do, when it comes to security.)

or better yet, read the release notes, it likely mentions this breaking change. (if not, that's a bug.)


> systemd set the secure default, even if that breaks backward comp, as usually upstreams do, when it comes to security.

Breaking compatibility is generally avoided to the utmost. Even security-sensitive things like TLS continue to support older, less secure versions to retain compatibility with peers that haven't been upgraded yet, much to the chagrin of everyone when they screw up the version negotiation, but better than the chicken and egg problem where nobody can upgrade until everybody has.

But the other point is that the claimed security improvement doesn't actually seem to be there in this case. They haven't made it so you can't have a program continue to run after the end of the current session, they've only changed what you have to do to make that happen, thereby breaking everything that did it the traditional way.


If only there was a way for the system init program to identify and keep a list of processes it has spawned, you could imagine like a unique numerical Process ID, and then if there was a program that could check the Process Status, and another that could kill the process identified by this... PID with increasing levels of aggressiveness...


PIDs get reused so this doesn't work well.


He's sarcastically alluding to systemd's approach at solving this.


If a process doesn't handle SIGHUP it dies. So all the daemon has to do in that case is nothing.


If a process doesn't set its own SIGHUP handler it dies. If it does in order to gracefully handle shutting down but it's deadlocked then there's no feedback as to whether or not the process actually finished handling the signal.
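
The "no feedback" problem is usually handled with a timeout: ask politely, wait a bounded time, then escalate. Below is a rough Python sketch of that pattern, using SIGTERM followed by SIGKILL. The child that ignores SIGTERM stands in for a process stuck in its handler, and the one-second grace period and the function names are arbitrary choices for the example, not anything systemd itself uses.

```python
import subprocess
import sys

# Child that ignores SIGTERM, standing in for a process stuck in its handler.
STUBBORN = r"""
import signal, time
signal.signal(signal.SIGTERM, signal.SIG_IGN)
print("ready", flush=True)
time.sleep(60)
"""

def stop_with_escalation(proc, grace=1.0):
    """Send SIGTERM, then fall back to SIGKILL if the process outlives `grace` seconds."""
    proc.terminate()                  # sends SIGTERM
    try:
        proc.wait(timeout=grace)
        return "exited after SIGTERM"
    except subprocess.TimeoutExpired:
        proc.kill()                   # sends SIGKILL, which cannot be caught or ignored
        proc.wait()
        return "needed SIGKILL"

def demo():
    proc = subprocess.Popen([sys.executable, "-c", STUBBORN],
                            stdout=subprocess.PIPE, text=True)
    proc.stdout.readline()            # wait for the child to install its disposition
    result = stop_with_escalation(proc)
    proc.stdout.close()
    return result
```

The timeout is the only signal the supervisor gets that the handler did not finish, which is why the final step has to be a signal the process cannot block.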


So the answer to your hypothetical deadlock is to break everything else? What kind of complex and graceful shutdown does ssh-agent really need?


>So the answer to your hypothetical deadlock is to break everything else?

It's not a hypothetical situation, everyone on here has seen applications hang and have to be terminated. SIGHUP handlers are no different in this regard.

>What kind of complex and graceful shutdown does ssh-agent really need?

That's a straw man argument, and the whole point of SIGHUP in the first place instead of just some "persistence" bit set per process is because for real world applications it's not as simple as just kill -9 to stop a process. But for ssh-agent in particular it needs to go through and unlink the socket that it binds to on startup. More to the point it also has to go through and close every PKCS11 provider that is registered which means calling functions that aren't even in openssh to begin with so who knows if some PKCS11 provider will hang during that.
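
As a rough illustration of the kind of cleanup being described (this is not ssh-agent's actual code), here is a small Python sketch of a process that binds a Unix-domain socket and unlinks it from a SIGHUP handler. The socket name `agent.sock` and the helper names are made up for the example.

```python
import os
import signal
import socket
import tempfile
import time

def bind_agent_socket(directory):
    """Bind a Unix-domain socket at a hypothetical path inside `directory`."""
    path = os.path.join(directory, "agent.sock")
    sock = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    sock.bind(path)
    return sock, path

def install_cleanup(sock, path):
    """Install a SIGHUP handler that closes the socket and unlinks its path."""
    state = {"cleaned": False}

    def on_hup(signum, frame):
        sock.close()
        if os.path.exists(path):
            os.unlink(path)
        state["cleaned"] = True

    signal.signal(signal.SIGHUP, on_hup)
    return state

def demo():
    with tempfile.TemporaryDirectory() as tmp:
        sock, path = bind_agent_socket(tmp)
        state = install_cleanup(sock, path)
        os.kill(os.getpid(), signal.SIGHUP)      # simulate the hangup
        # The Python-level handler runs shortly after the signal arrives.
        for _ in range(1000):
            if state["cleaned"]:
                break
            time.sleep(0.001)
        return state["cleaned"] and not os.path.exists(path)
```

If the handler blocked on anything slow, for instance a call into a third-party provider, the socket would stay behind and the caller would have no way to tell, which is the hang scenario raised above.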


wasn't GP specifically mentioning user processes and not system daemons? e.g. for daemons it's perfectly expected behavior to not shut down on SIGHUP. Apache, and other system daemons would re-read configuration files when receiving SIGHUP (as a way to reduce downtime during config updates).
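
For reference, the reload-on-SIGHUP convention looks roughly like the Python sketch below. It is only a toy: the `Daemon` class, the JSON config file and the `workers` key are invented for the example, and real daemons differ in how they re-read their configuration.

```python
import json
import os
import signal
import tempfile
import time

class Daemon:
    """Toy daemon that re-reads its config file when it receives SIGHUP."""

    def __init__(self, config_path):
        self.config_path = config_path
        self.reload_requested = False
        self.config = self._load()
        signal.signal(signal.SIGHUP, self._on_hup)

    def _load(self):
        with open(self.config_path) as fh:
            return json.load(fh)

    def _on_hup(self, signum, frame):
        # Keep the handler tiny: only record that a reload was requested.
        self.reload_requested = True

    def tick(self):
        """One main-loop iteration: apply a pending reload, if any."""
        if self.reload_requested:
            self.reload_requested = False
            self.config = self._load()

def demo():
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "daemon.json")
        with open(path, "w") as fh:
            json.dump({"workers": 2}, fh)
        daemon = Daemon(path)
        with open(path, "w") as fh:
            json.dump({"workers": 4}, fh)
        os.kill(os.getpid(), signal.SIGHUP)      # ask the daemon to reload
        deadline = time.monotonic() + 2
        while not daemon.reload_requested and time.monotonic() < deadline:
            time.sleep(0.01)
        daemon.tick()
        return daemon.config["workers"]
```

Here SIGHUP never terminates the process at all, which is why "kill everything that survives SIGHUP" cannot be applied blindly to daemons.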


> How else do you propose to make sure that when I log off my ssh-agent is really terminated and not just locked up with my keys still in memory?

Perhaps with a signal handler?


That was the nice and friendly POSIX way, turns out it's really convenient for malware to stick around that way. Now user session isolation and termination works (cgroups), but it of course breaks backward comp.
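
For the curious, the cgroup membership mentioned here is visible from userspace in `/proc/<pid>/cgroup`. The Python sketch below parses that text and extracts the logind session scope, if there is one. The sample paths in the usage note are illustrative and assume the usual `user-UID.slice/session-ID.scope` layout; the function name is made up.

```python
import re

# Matches the session scope that logind creates for a login session.
SESSION_SCOPE = re.compile(r"/user-(\d+)\.slice/session-([^/]+)\.scope")

def session_of(cgroup_text):
    """Return (uid, session_id) if any line names a session scope, else None."""
    for line in cgroup_text.splitlines():
        # Each line has the form "hierarchy-ID:controller-list:cgroup-path".
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        match = SESSION_SCOPE.search(parts[2])
        if match:
            return int(match.group(1)), match.group(2)
    return None

def session_of_pid(pid):
    """Linux only: look up the session scope of a running process."""
    with open(f"/proc/{pid}/cgroup") as fh:
        return session_of(fh.read())
```

A process forked from a login shell would typically show something like `0::/user.slice/user-1000.slice/session-3.scope`, for which `session_of` returns `(1000, "3")`. Double forking does not change that path, which is what lets the session be cleaned up as a unit without tracking PIDs.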


I agree that on Linux-based systems, SIGHUP is a reasonable mechanism for killing processes when a user closes an ssh session, and that ignoring SIGHUP is a reasonable way to avoid getting terminated.

I disagree that POSIX says that processes should expect a SIGHUP when a user logs out (SIGHUP means the controlling terminal was closed). I am not at all a POSIX expert, so please correct me if I misunderstand, but afaict POSIX explicitly does not specify what happens to the controlling terminal when a user logs out (http://pubs.opengroup.org/onlinepubs/9699919799/xrat/V4_xbd_...):

> POSIX.1 does not specify how controlling terminal access is affected by a user logging out (that is, by a controlling process terminating). 4.2 BSD uses the vhangup() function to prevent any access to the controlling terminal through file descriptors opened prior to logout. System V does not prevent controlling terminal access through file descriptors opened prior to logout (except for the case of the special file, /dev/tty). Some implementations choose to make processes immune from job control after logout (that is, such processes are always treated as if in the foreground); other implementations continue to enforce foreground/background checks after logout. Therefore, a Conforming POSIX.1 Application should not attempt to access the controlling terminal after logout since such access is unreliable. If an implementation chooses to deny access to a controlling terminal after its controlling process exits, POSIX.1 requires a certain type of behavior (see Controlling Terminal ).


There is no NOHUP signal, you're referring to SIGHUP.

See the enable-linger option for loginctl and KillUserProcesses for logind.conf. KillUserProcesses was set to default enabled on 4/9/2016, prior to that it didn't happen, but was configurable if desired. So you were always able to change the config to restore the previous behavior from the moment the default turned it on.

Edit:

Here is the commit where it happened

https://github.com/systemd/systemd/commit/97e5530cf2076a2b4f...


> So you were always able to change the config to restore the previous behavior from the moment the default turned it on.

No, you were not.

The thing that people are missing here is that neither of the systemd-logind behaviours, with KillUserProcesses=yes or KillUserProcesses=no, is the long-standing behaviour of kernel login sessions all of the way back to 7th Edition that nohup, tmux, screen, emacs --daemon, mosh-server, deluged, and more all interoperate with.

The behaviour of kernel login sessions is that end of login session is a HUP signal to the session leader, and that termination of the entire TTY login service (such as at system shutdown) is a TERM signal to everything followed by a KILL signal to everything then remaining.

The systemd-logind session behaviour with KillUserProcesses=no is no signals at all at the end of the login session, and at termination of the TTY login service both HUP and TERM signals together then KILL signals, to everything.

The systemd-logind session behaviour with KillUserProcesses=yes is both HUP and TERM signals together then KILL signals, to everything, both at login session termination and at TTY login service stop.

As I pointed out years ago, the fix is to make systemd-logind use KillUnit at hangup and StopUnit at service termination, actually providing the conventional behaviour which it currently does not in any mode and addressing the original problems (with some background GNOME utilities in a login session that were never being sent a HUP signal at logout and would have exited had they been) that motivated this whole mechanism in the first place.

* https://news.ycombinator.com/item?id=12335128

* https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=825394#221

* https://news.ycombinator.com/item?id=11798604


I just checked a few Debian stretch boxes that I set up, and "KillUserProcesses=no" is set on them all. And until a few minutes ago, I didn't even know to check.

So how can it be the default?


If you comment out that line it'll be on by default - Debian fixed it for you with their own default configuration file, because 99% of their users would only be annoyed by it.

This is why we have distro vendors, to build a system that works in the real world with software from developers with opinions that... differ to say the least.
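
To illustrate the "commented out means built-in default" point, here is a small Python sketch that reads logind.conf-style text and reports the effective KillUserProcesses value. It is deliberately simplistic: it ignores drop-in directories, and the fallback of "yes" is an assumption based on the upstream default described in this thread, since a distro can compile in a different one.

```python
import configparser

# Assumption: upstream default as described in the thread; distro builds may differ.
UPSTREAM_DEFAULT = True

def kill_user_processes(conf_text, compiled_default=UPSTREAM_DEFAULT):
    """Return the effective KillUserProcesses setting for the given config text."""
    parser = configparser.ConfigParser()
    parser.optionxform = str              # keep option names case-sensitive
    parser.read_string(conf_text)
    # A missing or commented-out option falls back to the compiled-in default.
    return parser.getboolean("Login", "KillUserProcesses",
                             fallback=compiled_default)
```

With `"[Login]\nKillUserProcesses=no\n"` this returns False, while `"[Login]\n#KillUserProcesses=no\n"` returns whatever the compiled-in default is.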


Debian maintainers make many improvements to upstream and only rarely mess up (ssh key generation).


I meant SIGHUP. Edited.


Eleven years earlier when SMF was added to what would eventually be Solaris 10, we had this same problem. Some of us had to drop everything to fix "bugs" in cron, sshd, ... introduced by SMF.

Systemd is basically SMF, done poorly, because NIH.


Is there a daemonization API as such? I think there was only the "way of doing" shown in man 7 daemon.


The systemd people have their own version of that manual page.

* https://freedesktop.org/software/systemd/man/daemon.html

IBM was explaining what to do back in 1995.

* http://jdebp.eu./FGA/unix-daemon-design-mistakes-to-avoid.ht...


> killing user processes on logout

By "killing", do you mean some signal other than (or in addition to) SIGHUP? Does it send SIGKILL?


That's the whole issue here. It does.


Also two years ago, I explained how one could make this work, by having logind use KillUnit at hangup and StopUnit at shutdown.

* https://news.ycombinator.com/item?id=12335128

* https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=825394#221


> killing user processes on logout (rather than send them the SIGHUP signal, as POSIX says should happen)

TIL what nohup(1) is for.


Sort of. While it's debatable when SIGHUP should be sent as part of controlled system logout/whatever, the signal itself was originally used upon abrupt disconnection (hang up) of the controlling terminal of a program.


> Systemd's response was to say that they should incorporate systemd's library, and use systemd's new daemonization API.

By "use systemd's new daemonization API" you mean that, instead of

$ screen

systemd asks you to write

$ systemd-run --scope --user screen

Annoying to have to learn a new thing, but hardly an unbearable burden.
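
If you launch long-running tools from scripts, the same command line can be wrapped in a few lines of Python. This is only a sketch: the function names are made up, and falling back to running the command directly when systemd-run is not installed is a design choice for the example, not something systemd prescribes.

```python
import shutil
import subprocess

def scoped_command(command):
    """Build the argv that runs `command` in a transient user scope."""
    return ["systemd-run", "--scope", "--user", *command]

def run_detached(command):
    """Run `command` in its own scope when systemd-run exists, else run it directly."""
    if shutil.which("systemd-run") is None:
        # Assumption for this sketch: on systems without systemd there is
        # nothing to escape from, so run the command as-is.
        return subprocess.run(command)
    return subprocess.run(scoped_command(command))
```

For example, `scoped_command(["screen"])` yields the same argument list as the command shown above.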

> On the other hand, when you're an impacted user who's lost work, and researching the bug leads you to a years-old discussion in which someone is actively denying that the bug exists and refusing to fix it, that's infuriating.

Because it's a bug for some, and intended behavior for others. Look, you make it as if they introduced a bug on purpose to screw with some people. It's clearly not the case, there was a specific tradeoff involved.


> Because it's a bug for some, and intended behavior for others. Look, you make it as if they introduced a bug on purpose to screw with some people. It's clearly not the case, there was a specific tradeoff involved.

They broke userland.

It doesn't matter what tradeoff they made - they went against POSIX behaviour, and as a result, broke numerous utilities, both past and future.

Let's say that again - systemd introduced breaking behaviour on userland, against POSIX, and instead of backing down and allowing for expected and specified behaviour, they said it's everyone else's problem.

That is neither professional, nor responsible.

When you make a mistake, a mistake that breaks the behaviour of POSIX, and POSIX utilities like _cron_, you apologise, and fix the problem.

You don't turn around and say that all the sysutils should incorporate your new idea.


First of all, as mentioned above, we made this compile-time as well as runtime-configurable, so that downstream distros can choose whether they want to make this opt-in or opt-out. Hence blame your distros if you picked it in a way you didn't like.

Moreover, this doesn't affect cron at all. Cron creates its own PAM session for each job it runs which means those jobs are independent from any real login session (i.e. ssh, graphical, tty login), and thus also don't get cleaned up by them.

This affected stuff that is forked off a login session and then stays around as an "orphan", if you will, i.e. with all session resources released, except for these processes that try hard to avoid clean-up (usually by double forking + detaching explicitly from any TTY/ignoring SIGHUP).
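
For readers who have not seen it, the "double forking + detaching" technique mentioned here looks roughly like the Python sketch below. It is a stripped-down version of the traditional recipe: a real daemon would usually also redirect stdio and change directory, and the function names are made up for the example.

```python
import os
import signal

def detach(work):
    """Classic double fork: run `work()` in a grandchild with no controlling terminal."""
    pid = os.fork()
    if pid > 0:
        os.waitpid(pid, 0)            # reap the short-lived first child
        return
    # First child: become leader of a new session, dropping the controlling TTY.
    os.setsid()
    signal.signal(signal.SIGHUP, signal.SIG_IGN)
    if os.fork() > 0:
        os._exit(0)                   # first child exits; the grandchild is re-parented
    # Grandchild: not a session leader, so it cannot reacquire a terminal.
    try:
        work()
    finally:
        os._exit(0)

def demo():
    """Return (pid, session id) as seen from inside the detached grandchild."""
    read_end, write_end = os.pipe()

    def work():
        os.write(write_end, f"{os.getpid()} {os.getsid(0)}".encode())

    detach(work)
    os.close(write_end)
    pid, sid = map(int, os.read(read_end, 64).decode().split())
    os.close(read_end)
    return pid, sid
```

The grandchild ends up in a different session from the caller and is not that session's leader. None of this moves it out of the cgroup it was started in, which is why this traditional trick no longer escapes clean-up when KillUserProcesses is enabled.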


As many, many others have stated, ignoring SIGHUP is not a way to "avoid clean-up". It is the explicit and intended method that a program should use to indicate that it should not be cleaned up.


This has more to do with feelings about you and the perception of you as a "bad guy" than it does about the technical discussion.

I tend to agree with the idea that the choice of defaults belongs to the distros. If the distros are deferring to the upstream project on default settings for a critical system component then they need to be more thorough and validate what they are shipping.


Maintaining all these special cases requires a lot of knowledge. If a maintainer is responsible for just the systemd package, that's not a problem, but when the number of packages per maintainer is measured in hundreds, the maintainer will stick to defaults unless users complain loudly enough to justify sacrificing a whole working day on the problem.


> Maintaining all these special cases requires a lot of knowledge.

Distro maintainers need to have a lot of knowledge about their init system. There's no way out of that. It's probably something everyone should know a little about as well.


> Distro maintainers need to have a lot of knowledge about their init system. There's no way out of that. It's probably something everyone should know a little about as well.

Then maybe the init system should be simpler and not attempt to ingratiate itself with UEFI or attempt to replace su, sudo, syslogd, netcat, resolvconf, etc.


> They broke userland.

That alludes to kernel development, which systemd is largely uninvolved with. A userland program chosen by various distributions failed to support conventions from a different userland program. That's all. Were the programs involved fundamental and highly important to many users' experience? Sure. Is busting out "you broke userland" like some magical shibboleth a useful means of conveying your unhappiness that your distribution maintainers chose to replace a widely-depended-upon program with a different program? I think not.

> they went against POSIX behaviour

Which? There's "tradition" and "specified behaviour". Both are important in different situations and in different degrees.

> You don't turn around and say that all the sysutils should incorporate your new idea.

Why not? They're no more privileged by the POSIX specification, or by the user/kernel-space divide, than any other program.


POSIX was broken first. It's insecure by default.

Intel, the kernel, even Chrome broke my userland by mitigating Spectre.

It happens.

CRON was and is run as a system service, in its own scope. If you run your own cron instance, but forgot to set it up as a system service, yeah, it gets cleaned up as you exit your shell/session/scope.


> They broke userland.

So? "We don't break userland" is a Linux kernel thing. Systemd is not kernel, it's userland, and userland things break other userland things all the time. They already broke lots of existing stuff when they replaced /etc/init.d/ scripts with systemd definition files, should systemd also have not done that?

> It doesn't matter what tradeoff they made - they went against POSIX behaviour, and as a result, broke numerous utilities, both past and future.

Linux is not POSIX, so I don't see how that's relevant. For what it's worth, I don't even know what part of POSIX it broke. Care to enlighten me?


Right; the Linux kernel has a "we don't break userland" policy, systemd doesn't. That's a selling point for the Linux kernel, and a strike against systemd. Both systemd and the Linux kernel are infrastructure projects which, if they're doing their jobs well, will never cause me problems so I get to ignore them. Systemd has been causing other people problems, and doesn't seem to understand that in the role they're trying to fill, preventing that from happening is their first and most important responsibility.


Like it or not, the Linux kernel is clearly the outlier in terms of backwards compatibility. For example, Postgres changes their data format in most non-bugfix releases. Would you consider that "a strike against" Postgres?


They provide an upgrade process that makes this invisible to the end user, so it's not a fair comparison. If it started deleting tables when I exit a session, that would definitely be a strike against it.


Postgres has session-bound resources, and in most cases no way to disable those from being deleted when exiting a session. For example in postgres you can't persist a prepared statement, but you can of course persist data within a table. Any function running will be killed when you exit (or at least not complete since the transaction is cancelled).

IMO when a user has logged out and has not had the permissions/foresight to set up a task in the system to run without a session it should be killed.

I get that this has not been the default behavior in Linux/UNIX, but to me it seems like the sensible one.

And that's before we ever argue about the possibility to turn it off.


Systemd offers both a compile-time and a runtime option to turn this behavior off, so it is a fair comparison.


I think you're completely missing the point.

If you ruin everyone else's day, and change behaviour everyone else is expecting, then it's probably your own fault.

Approaching it as if everyone should simply change and do what you want, is the height of arrogance. You are generating work for others. And in this particular case, not only are you generating work for others, you are eradicating a category of software.

When a distribution adopts systemd, they let everyone know how things are changing, and slowly transition things over, releasing when stable.

We know systemd replaces init.d. It was difficult, but distributions using systemd got over that hurdle, but it did take time.

However, this is not the same.

Yes, systemd is userland, however it is also PID 1. It is a layer between most userland and the kernel, and so needs to reflect the responsibility of its position.

Ignoring how NOHUP is supposed to be interpreted, is a _bad idea_, and yes, a violation of POSIX, specifically signals (SIGHUP and nohup), and how they are supposed to be handled.

Moreover, it greatly heightens the difficulty of many utilities that are expected to work.

Why should cron (all implementations of cron) suddenly need to rely on another userland library to maintain its function?

You just broke most Linux automation. Across an entire industry.

Why should screen (all implementations of screen) suddenly need to rely on a userland library much bigger than most implementations, to continue its base function?

You just broke an entire category of background systems - including systems communicating with embedded hardware. You might have caused a factory-floor fault. Which could cause injury, or worse.

A breaking change of this level can cause industry-wide ramifications that are not just limited to the digital. Unexpected behaviour is exceptional, and should take time and considerable thought before occurring.

Systemd has responsibility that no other userland system has. It's PID 1.

If they're going to require a massive change in process behaviour, then they are going to require consultation, awareness within the industry, and transition time. They should be working with distributions, aware of the man-hours they're generating, before they put something in place.


This discussion is very much apropos of what the article is talking about:

> The whole systemd battle, Rice said, comes down to a lot of disruptive change; that is where the tragedy comes in. Nerds have a complicated relationship to change; it's awesome when we are the ones creating the change, but it's untrustworthy when it comes from outside. Systemd represents that sort of externally imposed change that people find threatening. That is true even when the change isn't coming from developers like Poettering, who has shown little sympathy toward the people who have to deal with this change that has been imposed on them.

The POSIX violation is by design. If you think that POSIX dictates the wrong thing, then you will do something different, and this is what Poettering has done. The fact that systemd has more or less been embraced by Linux is an endorsement of his design philosophy, even if distributions reject specific features.


I am not upset that there was divergence from POSIX.

Design choices are fine - I can understand why systemd takes a different approach.

What I don't like, and completely disagree with, is systemd not working with the community they directly affect to reduce disruption.

Like it or not, the product is an industry standard, and so will be held to industry expectations.

Rather than turning around and requiring everyone to change, they could have said, "Sorry, we're making changes, here are some preliminary patches that could help."

Or a timeline for a breaking change, wherein they can negotiate with others.

I don't have significant issues with systemd's software, though some reservations about quality. My main concern, and it has been since the beginning, is that systemd acts without thought or conscience to the effects that they might cause.

They lack the ability to be a team player, despite creating an environment where people depend on them.

systemd's adoption rate is an absolute credit to it. They have some very good design thoughts, and those working on it have done some excellent work.

However, it would be better if they communicated with the people they affect, rather than letting the community be an accidental QA team when things go wrong.

They do get this right sometimes, but that seems to be the exception, rather than the rule.

They approached the init.d situation calmly, and slowly. They worked with Debian, and Fedora and others to make sure it would work without interruption or loss of quality.

They approached the SIGKILL situation like they were a kid who just learned how to light a fire and wanted to burn the library down.


You make plenty of assumptions there, in particular that there was no communication about the session killing thing. Turns out, however, there was. We informed downstreams about our intention and the reasons in detail, and we documented this for everybody else in NEWS. We also made sure there was an easy compile-time option to pick the default for this option, and then left the rest for the downstreams to decide: whether to default this to on or off, taking in the information they got from us and from the rest of the community. If you think they made the wrong decision, then complain to them, really. But seriously, you just assume we wouldn't talk to anyone, without actually having any idea what communication is really taking place.


> We informed downstreams about our intention and the reasons in detail, and we documented this for everybody else in NEWS.

From The Hitchhiker’s Guide to the Galaxy, regarding the plans to destroy the Earth:

‘But the plans were on display …’

‘On display? I eventually had to go down to the cellar to find them.’

‘That’s the display department.’

‘With a flashlight.’

‘Ah, well, the lights had probably gone.’

‘So had the stairs.’

‘But look, you found the notice, didn’t you?’

‘Yes,’ said Arthur, ‘yes I did. It was on display in the bottom of a locked filing cabinet stuck in a disused lavatory with a sign on the door saying “Beware of the Leopard.”’

Back in the real world: you built & shipped a system whose defaults were and are broken, and now you blame others for not enabling the DONT_BE_WRONG setting. You might as well blame end users for not becoming fully-versed with your code before their first login.

It’s not the users’ fault. It’s not the distros’ fault. It’s yours, and your project’s, for shipping code which breaks the user experience.

I appreciate your vision. It’s a good one. You’re a smart guy. But have some humility! Have a sense of your own limitations, and those of the distros and users who will use your code. You’re a human being; the distros are made up of human beings; your end users are … human beings. Think of them.


This is kind of a ridiculous reply. Is the only solution then to admit that Linux is "done"? Because it sounds like there's no room for change, even when change is communicated and multiple options to avoid it are provided.


> What I don't like, and completely disagree with, is systemd not working with the community they directly affect to reduce disruption.

> Rather than turning around and requiring everyone to change, they could have said, "Sorry, we're making changes, here are some preliminary patches that could help."

> Or a timeline for a breaking change, wherein they can negotiate with others.

But they did exactly that.

They contacted the tmux maintainers and asked if some modifications would be possible to accommodate the new option (see Poettering's comment here: run things as a child of systemd --user or just register a separate PAM session). If I remember correctly, it would not even have been the first special case in tmux; there already is one for OSX.

The discussion was actually progressing nicely until the anti-systemd crowd flooded it. I remember seeing posts in a lot of places urging people to comment on the bug report with specious arguments. The whole thing was kind of upsetting.


They did that 6 days after releasing the version that broke tmux, that's hardly preparing for or negotiating.


POSIX isn't a law. You don't "violate" POSIX. It's a standard for compatibility. You can choose to not be compatible with a standard when you think it makes sense. That's something that lots of projects do. You are using standards compliance as a moral cudgel.

Your argument is way too impassioned to be just technical. You just basically accused Lennart of hurting people with no evidence whatsoever.

This sort of stuff really doesn't help.


When there is a standard and someone doesn't follow it, it is said that the standard has been violated.

It follows that when someone implements functionality that doesn't follow POSIX, POSIX has been violated.

There's nothing wrong with the statement.


He accused Lennart of hurting people with no proof. Is that reasonable?


Please point out where in my comment I make any reference to reasonability.


Apologies for that part, then. I just don't see standards compliance like other people do. Personally, I don't see standards as things that imply some kind of morality. They are tools to accomplish a goal. Sometimes other goals may supersede their usefulness.


That is fair enough. I have not argued against your point of view. My comment was more on the linguistic side of things.

You criticised the parent's language saying that "you don't violate a standard" because it "isn't a law". I was just pointing out that you do indeed violate a standard because it's a standard, and saying that does not add any kind of moral or passion value - it's just using the language the way it's intended.


Aren't we just a few weeks after Rich Hickey's "you have no right to make demands of open source software" rant?

Systemd has responsibility that no other userland system has. It's PID 1.

No, you have the responsibility to check what the software you are installing does, and if you don't approve, change it or reject it. Or, don't check, and deal with it.

Systemd developers do not owe you working POSIX, working cron, industry wide working Linux automation, screen, separate userland for everything. They don't owe you anything. If you don't like their thing, don't use their thing.


Although I very much like the "don't break userland" approach, I agree with you. Especially in light of the fact that 1. You can start your background process the systemd way (shown elsewhere in this thread) 2. You can configure the desired behavior 3. Your distro probably already has configured it for you (Debian)

So it comes down to "something changed which is absolutely extremely important for me but I would rather discuss it for hours than take the few seconds to configure it". Especially since the new behavior is intended behavior and also has upsides for a lot of use cases.

So don't be ungrateful. Be happy that some people are really putting a lot of work behind the software you use daily FOR FREE and just configure the darn thing the way you like.

And last but not least, most people here (me included) are not in the position to complain so much about free software, unless they show some commitment to open source themselves.


>If you don’t like their thing, don’t use their thing

Oh how I wish that was a course of action I could reasonably take in this instance...


> Annoying to have to learn a new thing, but hardly the unbearable burden.

The problem is now your scripts won't work on systems that don't use systemd. Shell scripts work on FreeBSD, but now you can't use them because they require systemd-specific code.

I am not necessarily anti-systemd in most respects (I like declarative definitions of services and less shell script hell), but the fact that they keep trying to get people (including container runtime developers like myself) to use _their_ API rather than the preexisting ones is fairly "anti-social".


Aleksa,

I am not trying to get you to use our APIs. You are talking about the cgroups APIs again, if I am not mistaken? As I tried to explain again and again: if you want container runtimes to manage their own cgroups then just set Delegate=yes in the unit file of your manager, get your own cgroup subtree, and you can do below it whatever you want, you do not have to call into systemd ever. Not a single API call, no C call, no D-Bus call, nothing. You get your own kingdom if you set Delegate=yes, and systemd won't interfere with that. This is extensively documented.

I wished you'd actually listen to what I keep repeating to you. We tried to be really nice to container managers, knowing that they dislike systemd APIs, so we put a lot of work in making the delegation boundary clean, so that they can be entirely systemd agnostic beyond setting the Delegate=yes boolean in their unit file, but alas, we just keep hearing the same nonsense.

The LXC/LXD people btw did get this right: they manage their own cgroup subtree now, and systemd doesn't interfere, and they don't link to or do dbus calls into systemd either.
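
For anyone wondering what "get your own cgroup subtree" looks like from the manager's side, here is a rough Python sketch, not anything taken from systemd or LXC. It assumes a cgroup v2 (unified) setup mounted at the conventional /sys/fs/cgroup, and the unit name `mymanager.service` is made up for illustration. A process can find the cgroup it was placed in by reading /proc/self/cgroup; with Delegate=yes on its unit, everything below that directory is its own to manage.

```python
from pathlib import PurePosixPath


def unified_cgroup_path(proc_cgroup_text):
    """Return the cgroup v2 path found in /proc/<pid>/cgroup text, or None."""
    for line in proc_cgroup_text.splitlines():
        # Each line has the form "hierarchy-ID:controller-list:cgroup-path".
        parts = line.split(":", 2)
        if len(parts) != 3:
            continue
        hierarchy_id, controllers, path = parts
        # The unified (v2) hierarchy is the entry with ID 0 and no controllers.
        if hierarchy_id == "0" and controllers == "":
            return path
    return None


def delegated_subtree_dir(proc_cgroup_text, mount_point="/sys/fs/cgroup"):
    """Map the v2 cgroup path onto the filesystem (mount point is an assumption)."""
    path = unified_cgroup_path(proc_cgroup_text)
    if path is None:
        return None
    return str(PurePosixPath(mount_point) / path.lstrip("/"))


# Hypothetical contents of /proc/self/cgroup for a service with Delegate=yes.
sample = "0::/system.slice/mymanager.service\n"
subtree = delegated_subtree_dir(sample)
```

On a real machine you would pass in the text of /proc/self/cgroup instead of `sample`, then create child cgroups under the resulting directory with plain mkdir calls.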


> then just set Delegate=yes in the unit file of your manager

In runc we don't have a dedicated manager or long-running daemon. Yes, Docker and cri-o use Delegate=yes (so I am quite aware of this option) but that really doesn't help people who are using runc in their own user sessions or wrote their own wrapper and aren't aware of Delegate=yes.

I get that we are quite odd, and don't fit into a system-service model. After all of the back-and-forth with both you and Tejun (especially when it comes to "rootless" delegation -- which systemd only offers if you get a privileged user to delegate for you), I'm not sure that there's much I can do on this topic. I get that what I care about is not something you care about, but I would hope you accept that I'm not just being obstinate for the sake of it.

> Not a single API call, no C call, no D-Bus call, nothing.

Right, unless you need to set this up for someone else. And we have code that does this too -- I don't really recommend people use it, but it is necessary (and I'm pretty sure some folks at Red Hat use it based on how many bug reports they submit related to it).

Since systemd is managing the entire cgroupv2 tree (and the fact we can get around that for cgroupv1 appears to be seen as a design flaw by both you and Tejun), obviously we have to talk to systemd to do this type of thing. I just wish this wasn't the way it was done (and if cgroupv2 had a named cgroup concept -- which is what systemd needs for tracking services -- I would think that this wouldn't be such a pain-point).

I guess I'm just annoyed that we can't use "better rlimits" with "rootless" container runtimes because of all of this.

> I wished you'd actually listen to what I keep repeating to you.

I am listening, and I am aware of Delegate=yes and all of that history. But as I outlined above, I don't necessarily agree with it entirely. And unlike a lot of people around here, I don't think any of these pain-points are coming up because of malice or something stupid like that -- I just think we disagree on our priorities.

> We tried to be really nice to container managers, knowing that they dislike systemd APIs, so we put a lot of work in making the delegation boundary clean

Don't get me wrong -- I do appreciate that we have Delegate now (there was a period of several years where "systemd decided to reorganise the cgroup tree, un-containing my containers" happened on several occasions -- and Delegate solved those issues).

And from what I've heard from the LXC folks, you were quite reasonable about getting systemd to work inside LXC. Which is good to hear.

> The LXC/LXD people btw did get this right: they manage their own cgroup subtree now, and systemd doesn't interfere, and they don't link to or do dbus calls into systemd either.

We do basically the same thing. We just don't support cgroupv2.


They changed a decades-old behavior many people rely on, and it must have been obvious from the start people will lose work because of it.


It's a bug because it violates the expectations of an uninformed user. You aren't given a warning about it, it's not documented in big bold letters anywhere, and it's also not POSIX compliant.


> Annoying to have to learn a new thing, but hardly the unbearable burden.

Rather, a breaking change to everyone's scripts and processes for zero benefit.


Our scripts and tools work similarly on the four Unix systems we have in-house. Are you saying that it's OK that they don't work on Linux? Please do not forget that Linux is a POSIX system, basically a re-implementation of Unix, and until systemd it's been a fully compliant -nix system. Where I work we have transparently been able to deploy our products on all -nix, including Linux, since the nineties.

EDIT: My reply was supposed to be to xyzzys's post below, not the one I apparently replied to.. sorry about that.


There's a benefit, you're just not seeing it. Again, do you think that the systemd developers decided to implement it just to screw with people? As I said, there's a specific trade-off involved here.

I agree that it might not be the most desirable default, but if that's the case, then the guilt also falls on the distribution maintainers, who either ignored the big bold letters in the changelog, or didn't bother to test everyone's standard workflows before pushing to stable.


> Again, do you think that the systemd developers decided to implement it just to screw with people?

Based on Lennart's behavior, yes I do.


Instead of pretending the benefit is so obvious it doesn't require you to discuss it perhaps you could explain it.


Not the parent nor Systemd developers, but apparently they think it's the only way to make sure the user's session is cleaned up.

But frankly, 100% of people would be fine with it if the default was left at no instead of changing it to yes. It's all about giving users a choice when a new feature is introduced, something Systemd developers understand only partially.


> There's a benefit, you're just not seeing it.

Not to appeal to self-authority, but I have been maintaining production Linux systems in large-scale environments since the late 90s. If there were a benefit that outweighed the unnecessary breaking changes, I would see it, even if I didn't appreciate it. There isn't.

You should stop and think before you assume that other people are incompetent, both because it would make you a better interlocutor, and as a bonus it wouldn't violate HN's principle of charity.


The benefit is, of course, clean up of orphan defunct processes. One might argue if this is outweighing the drawback of the change (it might not, but that’s what some distro maintainers chose to enable), but you shouldn’t suggest that they just broke you for no purpose, instead, you should stop and think before you assume that other people are incompetent, both because it would make you a better interlocutor, and as a bonus it wouldn't violate HN's principle of charity.


Your copy/paste doesn't apply to my comment, since I didn't assume you were incompetent, just that you'd made an overaggressive claim you didn't care to back up.

Of course, a defense of systemd's comically broken reaping behavior removes all necessity for assumption in this case. sysvinit at least consistently reaps on SIGCHLD -- systemd randomly reorders into the sd-event API and then does something random based on the order of receipt.


> Your copy/paste doesn't apply to my comment, since I didn't assume you were incompetent, just that you'd made an overaggressive claim you didn't care to back up.

Sorry, I assumed you're competent enough to figure it out, or at least look at the original sources where authors of the change explicitly explain the reason why they do it. Of course, since you assumed that they are incompetent, you didn't bother to do so, instead, completely uncharitably assumed that there's zero benefit for that.


I'm sorry to bring bad news, but there's indeed a benefit, you just don't see it.


Surely it can be articulated, then.


It was, many times, you can just google and educate yourself.


> This argument, he said, seems to be predicated on the notion that systemd is a single, monolithic binary.

Can we please stop misrepresenting the complaints against systemd? The only time I ever hear this "monolithic binary" argument is from systemd advocates. The actual complaint is about tightly coupling important features together. Not only does this make it difficult (often impossible) to replace individual components, when tight coupling happens at the (internal) protocol level, any replacement component necessarily has to implement a bunch of (sometimes unwanted) systemd baggage.

Busybox implements all of its features in a single monolithic binary, but it isn't a monolithic design that tightly couples those specific components together. Replacing one of busybox's components is often as simple as removing busybox's symlink and installing the replacement. This isn't even a "Unix philosophy" issue. Even inexperienced designers shouldn't have a hard time understanding why systemd is a monolithic design but busybox isn't.


https://suckless.org/sucks/systemd/ has items like "pid 1 does DNS". It's an incorrect complaint that exists in the wild, though it certainly isn't the basis of all accusations that it violates the Unix philosophy.



What baggage are you specifically referring to?


It runs its own logging system with non-standard interfaces and formats. It runs its own DNS resolver with non-standard behaviour. It maintains compatibility only with a narrow range of udev versions, which in turn maintain compatibility only with a narrow range of kernel versions. And all the d-bus interfaces between these pieces may change at any point without notice. So you can't replace any piece of it, because even if you provide your own component that implements one of the systemd d-bus interfaces, you've got no forward compatibility.


If there was a serious effort to replace/port parts of it, the needed internal APIs can be stabilized ( https://www.freedesktop.org/wiki/Software/systemd/InterfaceP... ).


> "It's software" so of course it's buggy, he said. The notion that systemd has to be perfect, unlike any other system, raises the bar too high.

systemd is a PID 1 program, which means it has to clear a higher bar. When trouble begins, you need tools to fix it, and if PID 1 has crashed, you are out of luck. If the system cannot boot into a shell, you need to fix it from the initrd shell, or boot another system to fix this one. It sucks.

The Linux kernel chases very high standards of reliability, because a kernel panic is even worse than a PID 1 crash. An init system should follow the same standards as Linux.


Have you ever had pid 1 (systemd or any other init) crash? For the last ~three years I've been paid to maintain high reliability algorithmic trading systems that ran systemd and a whole lot of other stuff, and systemd has never crashed on me. Lots of other stuff, including the kernel itself, has crashed.

The bar is higher for pid 1 - if I were designing systemd I would have made a tiny pid 1 that just did message-passing to a more complex secondary process that could be restarted, or something, just to be safe - but I think systemd has empirically cleared the bar.


Yep.

3AM, deep slumber, called out to look at a stricken server. Its problems included that systemd was frozen. Reluctantly I came to the conclusion that a restart was the only route forward. Cept, that is when you discover that the commands that have served you well for 2 decades don't work, as they are all wrappers for systemd, which has keeled over.

To this day, the `shutdown` man page, which I was checking in, makes no mention of how to resolve, tho in fairness the other commands (poweroff, halt, init) do. I discovered this after stumbling across https://github.com/systemd/systemd/issues/3282

If you find yourself stuck in the middle of the night, reading through docs to try and figure out how to recover a machine with a crashed systemd, then `systemctl reboot -ff` or equivalent is what you are now looking for, the `-ff` being the key to "JUST £&*(ing RESTART THE MACHINE!!!".

Experiences like that don't win you friends.


The worst thing about this is when stuff goes down, it does so at the least convenient time. Back in 2003 I was on a customer site who had a RH server and there was no internet connection available (as it was routed through the box) and my phone was a Treo 180G which had precisely fuck all useful internet on it. The company still exists and is in the middle of nowhere on the end of a shonky ADSL line and no mobile phone reception so the story hasn't improved.

If this happened to me today with systemd I'd be up shit creek without a paddle.


Did raising elephants not work (SysRq + R E I S U B)?


systemd disables the magic sysrq keys by default.
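
Whether a given SysRq key works comes down to the kernel.sysrq sysctl, which is a bitmask: 0 disables everything, 1 enables everything, and larger values enable only the listed groups of functions. Here is a small Python sketch (my own, using the bit values from the kernel's sysrq documentation) that tells you which of the R E I S U B keys a given value would allow. The helper name `allowed_reisub` is just something I made up.

```python
# Bit of the kernel.sysrq bitmask that each REISUB key depends on,
# as listed in the kernel's sysrq documentation.
REISUB_BITS = {
    "R": 4,    # keyboard control (take the keyboard out of raw mode)
    "E": 64,   # signalling of processes (SIGTERM to all)
    "I": 64,   # signalling of processes (SIGKILL to all)
    "S": 16,   # sync
    "U": 32,   # remount filesystems read-only
    "B": 128,  # reboot/poweroff
}


def allowed_reisub(sysrq_value):
    """Return the REISUB keys permitted by a kernel.sysrq value."""
    if sysrq_value == 0:
        return []              # SysRq fully disabled
    if sysrq_value == 1:
        return list("REISUB")  # all functions enabled
    return [key for key in "REISUB" if sysrq_value & REISUB_BITS[key]]
```

Feed it the integer from /proc/sys/kernel/sysrq on the box in question. A value of 16, for example, leaves you with sync and nothing else, which is not much use for an emergency reboot.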


I've had shutdown (for reboot) hang a few times after a systemd update, forcing me to cut the power. It's made me a bit paranoid, so I block the systemd package from having updates automatically installed, and every 6 months or so carefully manage update and reboot of each and every server ...

EDITs:

there's the classic case of the linux "debug" parameter: https://bugs.freedesktop.org/show_bug.cgi?id=76935

and the even more classic case of firmware loading events: https://lkml.org/lkml/2012/10/3/484

and while "all software has bugs" systemd really has the most annoying bugs (by virtue of trying to do everything core to the system) and always insists that they are features and we are backwards whiny geeks for complaining.


I’ve had Systemd completely stop responding before on numerous occasions on centos 7. As in can’t reboot or hangs rebooting or all commands hang.

Only recourse has been to reboot the instance from AWS dashboard.

I can’t get to the bottom of it because the tools don’t work when it’s down and there’s nothing there when it comes back up. I am not enjoying boiling to death in this pot of shit.

And then there’s the situation where it just won’t boot. I just fire up a new instance then because it’s easier than debugging it.


> Have you ever had pid 1 (systemd or any other init) crash?

No, I have not. But I have seen systemd gracefully fail to boot a system to login, with a good-looking colorful error message. It reminded me of "Keyboard is not found. Press F1 to run setup."


Oh, to be clear I'm real mad about how systemd fails boot if (say) one of your filesystems is unavailable and makes you log in with a root password to fix it.

But OP was asserting that systemd crashes under normal operation because its pid 1 is too fragile, which is very different. At scale I already expect that there's a chance a machine won't come back if I reboot it - it's annoying if I can't ssh in, but, well, I already lost a disk I care about and it won't return to service and I need to fix it anyway. (And it's an easy fix, just add "nofail" to fstab.) At scale I don't expect init to crash under normal operation.
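
If you want to find the fstab entries that can bite you this way before a reboot does, something like the following Python sketch would do as a first pass. It is only a rough check under my own assumptions: it skips the root filesystem, swap, and anything marked noauto, and flags every other entry that lacks "nofail". Whether a particular mount really should be allowed to fail is still your call.

```python
def mounts_without_nofail(fstab_text):
    """List mount points whose fstab entry lacks the 'nofail' option."""
    risky = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # not a complete fstab entry
        mount_point, fs_type = fields[1], fields[2]
        options = fields[3].split(",")
        # Assumption: root and swap are left alone, and noauto
        # entries are not mounted at boot in the first place.
        if mount_point == "/" or fs_type == "swap" or "noauto" in options:
            continue
        if "nofail" not in options:
            risky.append(mount_point)
    return risky


# Made-up fstab contents for illustration.
sample_fstab = """\
UUID=aaaa /        ext4 defaults        0 1
UUID=bbbb /data    ext4 defaults        0 2
UUID=cccc /backup  ext4 defaults,nofail 0 2
# swap below
UUID=dddd none     swap sw              0 0
"""
flagged = mounts_without_nofail(sample_fstab)
```

With the sample above, only /data gets flagged; on a real box you would pass in the contents of /etc/fstab.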


Yep. CentOS 6's upstart can be felled by generating a bunch of inotify events in /etc.

http://rachelbythebay.com/w/2014/11/24/touch/


I've never experienced an outright crash, but I've been bitten by [0] on some of my servers.

[0] https://github.com/systemd/systemd/issues/719


There was that time it was bricking computers by erasing UEFI variables, but I'll allocate equal blame between systemd and UEFI


Personally, I lay the blame for this issue squarely at the feet of various UEFI implementations which fail to boot when the system's EFI variables are, for whatever reason, wiped clean. The UEFI spec explicitly states that clearing all of the variables on a system must not result in an unbootable system.


You shouldn't, because the maintainer of the kernel subsystem concerned told us all that systemd wasn't to blame for it.

* https://news.ycombinator.com/item?id=15973577

* https://news.ycombinator.com/item?id=11152880


Actually, that was nothing to do with systemd. That was definitely a UEFI implementation issue. And systemd didn't delete anything, the user did - they ran:

  rm -rf --no-preserve-root /
https://lwn.net/Articles/674940/


The bug was in the kernel (it should not have allowed userspace to write arbitrary UEFI variables), but AFAIR it was exposed by systemd because it eagerly mounted the UEFI variable filesystem provided by the kernel into /sys/something/efivars.


Indeed, but again that was a firmware issue. systemd didn't delete the variables. And systemd was setting EFI variables, so consequently it needed it to be mounted as read/write.

The configuration files should have set that to read only after boot.

The kernel patch where this was fixed can be found here:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/lin...


Using systemd automount and NFS you can easily get pid1 unresponsive, hung in uninterruptible sleep forever.


No. Not mine. Not systemd. Not others. And I touched upon how rare this was in practice in my experience some years ago on Hacker News.

* https://news.ycombinator.com/item?id=8384251

But it does happen to other people.

* https://unix.stackexchange.com/questions/440229/

And there was one crash that made the headlines.

* https://news.ycombinator.com/item?id=12600413


Very recently I had this issue[0] as the result of a systemd upgrade, requiring the use of a recovery disk to downgrade to the previous version as the keyboard input had failed to be initialized.

[0] https://github.com/systemd/systemd/issues/11314


If this bug hits you in staging, no big problem, just don't promote that particular update to production. If this bug hits you in production, your lack of a staging environment is the bigger concern IMO.

(I have been hit by the same issue on my private notebook, but I have procedures in place to cleanly recover from failed upgrades on all systems, so it was not a big deal.)


"Yeah the software completely broke, but that's fine because you should be able to deal with that" does not make me feel better about the software in question.


The current debian testing version crashes with a NULL pointer segv in the kernel module. You need to downgrade to the previous version.


In what kernel module? There is no "the kernel module" in a systemd context.


There is. udev loads kernel modules. See eg. http://www.linuxfromscratch.org/lfs/view/development/chapter...


Waving in the direction of udev does not clarify what kernel module is supposedly the kernel module, which is what you were asked.


'rurban is one of our resident trolls - see also https://news.ycombinator.com/item?id=13364173


> Have you ever had pid 1 (systemd or any other init) crash? For the last ~three years I've been paid to maintain high reliability algorithmic trading systems that ran systemd and a whole lot of other stuff, and systemd has never crashed on me.

You see, that's the argument I hear a lot from Systemd advocates. The problem with anecdotal evidence is obvious. When you hear people opposing Systemd, practically all of them have some real-life issues with it, often related to functionality that would otherwise be non-essential (i.e. doesn't really need to be handled by PID 1). Of course if you don't have a particular problem, you don't feel it's important. That's precisely the attitude people resent.


> When you hear people opposing Systemd, practically all of them have some real-life issues with it

Yes, but a lot of people have real-life issues with it on their desktop of the form "It's too complicated." I'm asking specifically about real-life issues on production servers at scale. There will of course be tools that are poorly suited for a personal machine (even a personal server) but well suited for a team that wants to run a bunch of reliable servers.

For instance I would never be happy running RHEL on my desktop, but that doesn't mean RHEL is useless.


I can't quote any statistics but have the impression that a large part of non-Systemd crowd are old-time admins who maintain a large number of servers, myself included. When you break something on a desktop machine, that's easily fixable. When you need to deal with a large heterogeneous environment, you prefer to have things handled a bit more gracefully. Linus is a good example of a person who got this right.


This article is a bit of a joke. "It's software so of course it's buggy" isn't a great argument when you're replacing something that didn't suffer the same issues.

I just count my blessings that runit is widely packaged in every major distro because it can just happily sit on top of sysvinit, systemd, upstart, pretty much any init system and does things in a very simple shell script style, I really wasn't a fan of the weird ini-like format for systemd or several different tools I'm expected to learn just to read my (now binary) log files competently.

If you're sick of switching init systems constantly or don't want to have to write separate scripts for your linux box and your freebsd box even, I highly recommend checking runit out.

I'm sure I'll give it a serious shot eventually... in about 3 years once they work all the Poettering kinks out, just like PulseAudio. They're doing some cool things with cgroups and stuff, so I hope it gets there eventually.


I second using runit. We use runit to be able to use the same service definitions inside docker, on a VM or bare metal.

If you've ever tried to use systemd inside docker to bring up a couple of services, you would know the hoops you have to jump through to get it working.

(I understand that docker wasn't invented to run multiple services in the one container, but sometimes it can't be avoided and it simplifies app deployment vastly, e.g. using CI to test that your service actually starts up as per its definition: just run up a quick docker image with runit and a service definition file)


I've only seen supervisord as the root process in multi-purpose containers. Is there any significant gain to using systemd instead?


If you use systemd, you can use standard packages from your distro to run up services inside a container. That's basically the only reason I considered it.


>I'm sure I'll give it a serious shot eventually... in about 3 years once they work all the Poettering kinks out, just like PulseAudio.

Good luck. If you need anything more than "I play a three minute song" on Linux audio you need both some type of real time kernel and jack.


> When Lennart Poettering started thinking about the problem, he first looked at Upstart, which was an event-based system that was still based around scripts, but he concluded that he could do a better job.

I think I'd like systemd more if I had more confidence that he had done a better job.

I never really understood why Upstart didn't get more traction; was it just that the Canonical backing made other distros avoid it, or did it have drawbacks I never ran across?


I used upstart deeply around 2012-2014 in a commercial product based on Ubuntu. I remember two problems with it which had me waiting for Ubuntu to switch to systemd (unfortunately my employer ran out of money right when that happened). The first was that Upstart's model is upside-down: where systemd has units that depend on units, Upstart has jobs that are activated as other jobs complete. Whereas you would tell systemd that (say) Apache depends on syslog and ask it to try to start Apache, and therefore systemd would first try to start syslog, you tell Upstart that Apache can be started when syslog starts, and that you should start syslog, and then Apache's dependencies are met and Apache can start too. Instead of having a single target (like sysvinit runlevels or systemd's multi-user.target) that it works towards, Upstart just starts ... stuff ... until it's done. Upstart did have a runlevel concept, but as I recall it was an event that triggered jobs to start, not a target. See http://upstart.ubuntu.com/cookbook/#critique-of-dependency-b... for Upstart's defense of this approach.

(systemd is hardly unique in being dependency-based instead of event-based - but it is a strike against Upstart in particular. Also, while one of the stronger claims of why Upstart should work this way is it matches the evented mindset of udev, systemd can handle events from udev just fine, by just having it add additional targets - and in the end, systemd consumed udev.)
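
To make the "upside-down" point concrete, here is a toy Python sketch of the two models. It is a heavy simplification and not how either init system is implemented: `dependency_order` works backwards from a goal and pulls in whatever the goal requires, while `event_order` starts from an initial event and lets each started job emit a "started <job>" event that may trigger more jobs. The job and unit names, and the `cups` entry, are illustrative only.

```python
def dependency_order(requires, goal):
    """Dependency-based: start what the goal requires, then the goal itself."""
    order = []

    def visit(unit, stack=()):
        if unit in order:
            return
        if unit in stack:
            raise ValueError("dependency cycle at " + unit)
        for dep in requires.get(unit, []):
            visit(dep, stack + (unit,))
        order.append(unit)

    visit(goal)
    return order


def event_order(start_on, initial_event):
    """Event-based: every job names the event that triggers it."""
    order = []
    pending = [initial_event]
    while pending:
        event = pending.pop(0)
        for job, trigger in start_on.items():
            if trigger == event and job not in order:
                order.append(job)
                # Starting a job emits an event other jobs can react to.
                pending.append("started " + job)
    return order


# Dependency model: cups is defined but nothing in the goal requires it.
requires = {
    "multi-user.target": ["apache"],
    "apache": ["syslog"],
    "cups": [],
}
dep_result = dependency_order(requires, "multi-user.target")

# Event model: there is no goal, things start as events arrive.
start_on = {"syslog": "startup", "apache": "started syslog"}
event_result = event_order(start_on, "startup")
```

In the first model `cups` never starts because nothing on the path to the goal asks for it. In the second there is no goal at all, so the only way to know what will end up running is to follow the chain of events.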

The other was that it was hard to deal with its bugs. I had a job that wasn't activating. I spent days chasing it, including attaching a debugger to Upstart, and eventually gave up. I've used systemd extensively and have not had "Why isn't this service starting" bugs. (I've certainly had that confusion, but I've always been able to figure it out quickly.) systemd has in my experience been reliable and trustworthy. I'm not going to say that it's bug-free, or that I like its implementation, or that I have no complaints about it. I am going to say I haven't been frustrated by it - even when running very old distro versions with lots of known bug fixes upstream. That's very important for whether I'm happy with it or whether I'm going to switch back to sysvinit and find something else for service monitoring.


Hey thanks, that's good info and helps me understand the bigger picture. Very useful comment!


The problem is full process management, and in that regard, upstart was pretty much still just scripts. I will say this, systemd tries to solve the problem of tracking all children/grandchildren which is difficult to do with sysvinit or scripts by themselves.

Systemd also solves the problem of unified ways to start services across many distributions. Packaging things is easier. I don't like the systemd ini target format and wish something else had won out instead. With something like upstart, in theory you could rewrite the upstart daemon and use the same scripts, since the format/standard was much simpler.

I feel like there must have been a lot of bullying to get systemd across so many systems. It just seems weird to see so many major distros all decide to accept something so complex if it wasn't for Redhat cramming it into everything they maintained and helped fund.


> systemd tries to solve the problem of tracking all children/grandchildren which is difficult to do with sysvinit or scripts by themselves.

> Systemd also solves the problem of unified ways to start services across many distributions. Packaging things is easier.

> It just seems weird to see so many major distros all decide to accept something so complex

You answered your own question. Distributions adopted it because it solved many problems for them.


You know, he didn't say that. He said Red Hat adopted a complex system that solves some problems - implying that they may have some other motivations. If anything, the point seems to be that it is a very complex and unwelcome way to solve some problems - and perhaps unworthy of Linux.


I run Devuan on some boxes. Essentially Debian without systemd. No issues whatsoever. Completely problem free.


> I feel like there must have been a lot of bullying to get systemd across so many systems.

I followed the Debian discussions when they debated init changing.

It was not pretty, there were a lot of things, but I don't remember any bullying. It was simply a very long uphill battle from a sort of loose group against the majority of maintainers that wished to stop worrying about 99% of init script problems.

Systemd is not perfect, never was, but it gets shit done, whereas no other project does in this regard. (A lot of people wanted to tackle some parts of what systemd does, but that was late and insufficient.)

And, all in all, there is always room for a systemd2 in Rust, distros would switch to it in a heartbeat, if it would be better.


I think it largely was Canonical that was the issue. Every time they've pushed something that they designed/developed it always seemed to me that they were more attempting to solve their business model problem rather than a Linux architectural problem. (i.e. get people dependent on them rather than just providing a solution)

Even though Red Hat is a much larger organization, their contributions have never felt like power plays in the same way that Canonical's have. Perhaps it's just the way that Canonical tries to shove their stuff through while Red Hat seems to get more organic buy-in. For example, as controversial as Systemd was (mainly from sys admins and power users), it did have a level of buy-in from distro maintainers and developers which Upstart never really did. Canonical just tried to say 'we're doing this.'


> Even though Red Hat is a much larger organization, their contributions have never felt like power plays in the same way that Canonical's have.

Since when? I remember some particularly contentious things like EGCS and various kernel doodads, and of course systemd.


Sure, many things have been contentious... changes in general meet with resistance. That's not a uniquely Red Hat, or even Canonical, thing. But in the end, and this is just my impression as I'm no longer a major user of either company's distro, Red Hat seems more interested in working with (with the notable exception of one specific prolific individual) other developers and distros while Canonical doesn't. It shows in the results: most of Canonical's major initiatives don't even survive in their own distro for too long either because of user revolt or the other distros go in a different direction. (they lost me as a user because of a couple of them)

(If there are distro maintainers / package developers who disagree, I'd be interested to hear about it. Again, this is just my impression)


The answer is in this Google+ post:

https://web.archive.org/web/20140928104327/https://plus.goog...

   Scott James Remnant
    
   +Michael Hasselmann at the point that Kay, Lennart and
   I sat down and discussed all this stuff, I don't think
   Upstart was perceived as "shitty" at all. We'd had
   on/off discussions for ages, but the big one I remember
   was the LF Collab Summit in SF in April 2010.

   Hindsight certainly lends a different perspective, and
   I'd be the first person to say that Upstart doesn't
   work as intended. +Lennart Poettering makes a great
   point about mountall in a recent post, it was written
   because Upstart couldn't do the complex filesystem
   cases it was designed to be able to do; and I was very
   aware even at the time that was a failure that would
   need to be addressed.

   Had the CLA not been in place, the result of the LF
   Collab discussions would have almost certainly been
   contributions of patches from +Kay Sievers  and Lennart
   (after all, we'd all worked together on things like
   udev, and got along) that would have fixed all those
   design issues, etc.

   But the CLA prevented them from doing that (I won't
   sign the CLA myself, which is one reason I don't
   contribute since leaving Canonical - so I hold no
   grudges here), so history happened differently. After
   our April 2010 meeting, Lennart went away and wrote
   systemd, which was released in July 2010 if memory
   serves.

   So I don't think I can claim that the perceived
   shittiness of Upstart spawned systemd, because at the
   time it wasn't seen that way. I don't think I can even
   claim that it provoked Lennart in any way, init was an
   area all distributions were fiddling with, so it was
   inevitable anyway.

   I entirely agree with Kay and +Greg Kroah-Hartman  that
   it was the CLA that caused systemd to be written
   instead of Upstart.

   But I don't need that self-affirmation anyway :) I
   wrote Upstart, I got paid for it, I moved on to do
   other things, something else came along and replaced
   it. If Upstart hadn't been under the CLA, and systemd
   hadn't've happened, all my code would have long since
   been rewritten by now anyway.

   That's the nature of the software world, there's no
   point getting precious over things. Do your bit, have
   fun doing it, move on and let others do their bit,
   etc.


Indeed. Early systemd was a fucking dumpster fire and part of the reason I no longer use Linux.


What do you use nowadays? Do you feel it addresses things better?


Can’t speak to the parent but I have a similar feeling.

Personally I’ve been using OpenBSD quite heavily for personal projects. For work, my team transitioned from RHEL6 to FreeBSD, and while it has quirks (mostly around installing software) it is incredibly stable.

It’s a shame my chosen cloud provider doesn’t treat it like a first class citizen, but that’s fine since the community projects seem to work on making good images.

I still use Linux on my desktop and laptop, since I feel like the design of systemd is more suited to those roles (systemd’s design in general feels like Windows service activation), but I’ve been having issues that seem to be related to systemd. So it’s not winning me over.


Ditto.

I've switched to OpenBSD for nearly all personal work. We're still on RHEL/CentOS at work though.

I haven't figured out how to get xmonad running on OpenBSD yet or my workstation at work would be on it too.


Not the person you replied to but systemd was the last straw that made me switch all my servers to FreeBSD.

FreeBSD is much better at just not screwing everything up (particularly on OS updates). Sometimes I have to manually configure a new piece of hardware, but once I configure it it stays configured, and the way to configure it doesn't change from version to version.

On my new laptop I just didn't bother replacing windows. With "Bash on Windows" I can run all the unix programs I wanted to, but windows is handling all the session-management type stuff that systemd would do. I've found the system more reliable and better at responding to hardware changes. As much as there are horror stories of windows update breaking everything, I haven't experienced that myself (whereas I have had systemd updates leave systems non-booting).


I switched to Mac, ironically enough. I got tired of fiddling with my system and wanted to actually use it for stuff.


One of my "favorite" corner cases caused by systemd was that if an NIS user tried to login to the system, it would cause my X session to restart:

1. They changed the defaults for systemd-logind to disallow network access

2. When systemd-logind attempts to send a message to the NIS server, it instead hangs (or has a timeout greater than the systemd watchdog anyways)

3. The watchdog timer causes all systemd-logind services to be restarted, which then tears down all of the child processes. My X session is a child process.


I believe systemd solves a problem or two.

However, I believe the implementation is not great.

A couple specific observations:

* lots of binaries. /lib/systemd, the logs. This more than anything else is a tragedy. Binaries get in the way of viewing, diagnosing, understanding and changing things. It seems like premature optimization to me - I don't think speed requires it. I think the majority are completely unnecessary.

* copying from launchd? Commercial software vendors ship binaries. They are aligned with them. They resist disclosure, which protects from reverse engineering, lawsuits and user modifications. Does that align with linux?

* the config files are a disorganized mess. It is like /etc/init.d but poorly understood and then /etc/rc.d was lumped into the same directory and organization became chaos. I looked at one system here and in /etc/systemd there are 16 directories, 23 symbolic links and just 11 files. Try unraveling the interdependencies.

additionally the config files (which mimic windows config files) lack depth. Simple config files can have benefits, say if a GUI read and wrote them to change settings, but I don't think that applies here. What I noticed is that they cannot do anything, so additional logic requires an intermediate script or compiled binary elsewhere.

* it is responsible for too many functions. I bet it didn't solve so many in its first iteration, but sucked in more over time. It reminds me of the accounting program that grew to rule everything in Tron.

I wonder if these problems could be resolved. It's hard to remove complexity and add elegance later.


Launchd has been open source (Apache license) pretty much from the beginning: https://wiki.freebsd.org/launchd


Systemd as the new MCP, interesting. “Sark, prepare that daemon for immediate deresolution.”


> lots of binaries. /lib/systemd, the logs. This more than anything else is a tragedy. Binaries get in the way of viewing, diagnosing, understanding and changing things

would you prefer sh/python/perl/lua scripts?

some (the suckless group) view C and make && make install as the perfect way to observe and change things. (I like config files and runtime configurability.)


> Try unraveling the interdependencies.

These dependencies between services have been there the entire time. At least now they're clearly expressed.


I kinda like how a good chunk of the community complains we are "too monolithic", and the other side of the community complains we have "too many binaries". We can't win with you guys, can we?

Lennart


Multiple binaries can still be part of a monolithic system (just like multiple services can still be part of a monolithic system in the web services world). A well-architected system has strong separation of concerns (and well-defined, reasonable interfaces) between components.

If I’m writing a piece of code and I start wanting to rewrite the world around it, that’s a sign that I’m probably doing the wrong thing.

In the case of systemd, you actually have a decent case that a lot of things in Linux and the rest of the POSIX world could stand to be improved. A good, clean, portable, systems-focused approach could tackle all those things head on.


> "too monolithic" ... "too many binaries"

> We can't win

Juxtaposing two unrelated issues usually isn't a winning strategy.


In what universe does the number of binaries have anything to do with how monolithic something is?


You really don't understand the complaint?

It's that the plethora of binaries are too tightly coupled and are not at all modular or interchangeable, which is materially the same as being a monolith.

I think you are smart enough to know this, and are just feigning ignorance.


Not that it matters, but I'm really happy about systemd.

Sure I have to learn new things, sometimes stuff breaks, but that has always been the case. For me most problems were with upgrading Linux distributions.

Anyhow it's great to have less distribution specific stuff and less shell scripts. Thank you and all the other systemd devs.


I hate how the Unix philosophy argument is boiled down to "one vs many" binaries. It's about independence to the degree of reusability. There's nothing in systemd suite that's useful unless you're two feet in. You can say things about streams and text and binary formats, but those are distractions. Systemd tooling is tightly coupled to each other. It's a huge codebase, a huge system that shoddily replaces that which was historically done with less. And it's the biggest reason I don't run a Linux workstation anymore.


Basically said, systemd follows quite the opposite of the KISS principle which made Unix and Linux so great.


I really like systemd.

It's got tons of really great functionality.

systemd feels like it is well thought out and modern, like it's an integrated rethink and rebuild of lots of various software utilities that have evolved over many years.

I'm investing all my learning in systemd where appropriate instead of things I used previously like cron and supervisor.


I agree with that. But at the same time, systemd makes assumptions that I don’t think it should.

Case in point: I deal with several clusters that use Stanford Central LDAP for account info, and our UID numbers have gotten pretty high. So much so that it overlaps the range systemd uses for dynamic service UIDs.

The biggest annoyance is that the UID range is hard-coded, so I can’t provide an alternate range.

More details: https://github.com/systemd/systemd/issues/9843 (but please don’t spam the GitHub issue!)
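
The overlap described above is easy to check mechanically. Here is a minimal sketch, assuming the dynamic service UID range is 61184 through 65519 (the range I believe the systemd documentation gives for DynamicUser=; treat both constants as assumptions and verify them against your build). The function names and the sample UIDs are hypothetical.

```python
# Assumed bounds of systemd's dynamic service UID range (DynamicUser=).
# These constants are an assumption for illustration; verify them against
# the systemd version you actually run.
DYNAMIC_UID_MIN = 61184
DYNAMIC_UID_MAX = 65519


def overlaps_dynamic_range(uid: int) -> bool:
    """Return True if a directory-assigned UID falls inside the assumed range."""
    return DYNAMIC_UID_MIN <= uid <= DYNAMIC_UID_MAX


def conflicting_uids(uids):
    """Return the sorted subset of UIDs that collide with the assumed range."""
    return sorted(u for u in uids if overlaps_dynamic_range(u))


if __name__ == "__main__":
    # Hypothetical LDAP UIDs, for illustration only.
    sample = [1000, 61183, 61184, 65000, 65519, 65520, 300000]
    print(conflicting_uids(sample))
```

Running something like this over a directory export only tells you which accounts sit in the contested range; it does nothing about the range being hard-coded.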


For an open source Free software project, I find it really strange Poettering says this: "We don't recommend people to recompile their RPMs, the same way as we don't recommend changing the UID ranges..."

https://github.com/systemd/systemd/issues/9843#issuecomment-...


From the perspective of a software author, I am not at all surprised.

If you are a user reporting a bug, and you say "I am using the RPM package of systemd from CentOS 7.4", then I know which RPM to download to get exactly the software you are using.

But, if you say "I am using the RPM package of systemd from CentOS 7.4, plus my own patch", then things will be harder for me. Not only would I need to get your patch, to be completely safe, I would need to get your build environment. For example, which compiler you are using, and which version of the compiler.

Working off of a common set of pre-built packages makes problem reproduction a lot easier.


Do you remember when systemd would take usernames it couldn't parse (due to undocumented restrictions on valid usernames) in unit files and just translate them to uid 0?



Wish my customers could file bugs and feature requests with that kind of thoughtful detail.


Thank you for that! It did take an hour+ to research it and write it up. And it’s kind of depressing that there’s been no movement, but I do understand that I’m not really able to contribute anything on the development side.


What you're dealing with fits in the crux of my problem with systemd. They've made a series of assumptions about how things will be on a system, and if that doesn't match your reality, well terribly sorry, adjust your reality, even if it's not possible, or realistic.

Unfortunately, unlike before, you can't easily just cut systemd out of the mix.


I'd hate it less if it were more modular, such that I could swap out distinct pieces when and if needed.

I don't even understand the rationale for some of the integration. Like systemd-resolve. Why did it need to suck in the resolver? Now I can't edit resolv.conf because it overwrites it.

I can't even find a reasonable list of everything it does. Init, mounts, login, pluggable hardware, nspawn containers, logging, and...whatever else I'm forgetting.


systemd-resolved is, like many of the systemd components, optional.

If you're using it, it is either because you set it up yourself, or your distro set it up for you, or you're using NetworkManager with systemd-resolved as your DNS resolver (and even then, it is configurable and not the default at all).


Some years ago, I was developing a Linux distro for embedded systems. Some of these embedded systems come with old kernels, and you will never get a recent kernel for them.

Systemd is written in C and relies on kernel headers. Which means that they need to have a layer that deals with different versions of the kernel headers. I remember it was headers about network.

At the time I was working on this, I had to contribute to systemd to add support for my kernel version. Got a few patches accepted in their master. And things were working fine for a while.

Until they suddenly decided to drop <= 3.10 kernel support. Yep. After that I had to maintain patch sets for systemd, which is a great deal of work, given that my upstream was updating systemd quite often.

Now, my good people, explain to me how such a mess could have happened with a previous init based on shell?


I'm curious: Did systemd drop 3.10 support after the kernel developers stopped supporting it (in 2017, see http://lkml.iu.edu/hypermail/linux/kernel/1711.0/03167.html), or before that? If it was around the same time, then I could understand dropping support, since even the kernel developers had decided to move on.


You could have stayed on an older systemd version? Preferably one used by a major distro, such that someone else might backport security fixes etc for you.

Anyway the root cause here is bad vendor BSPs being stuck on old Linux versions.


GNU libc 2.26 (released 2017-08-02) requires Linux 3.2 or newer. It’s not like systemd is alone in requiring recent kernels.


It is my opinion that Lennart Poettering is the software equivalent of Thomas Midgley Jr., the inventor of both leaded gasoline and chlorofluorocarbons.[1]

This is the person behind PulseAudio, Avahi, and systemd. In any sane world, these software projects would have been stillborn. Sensible alternatives would have outcompeted them. Instead, Red Hat did their best to foist them upon the community. They promised that after the initial adoption, all the wrinkles would be ironed out. All three times this was not true.

I hate to appear so cynical, and I'm certain that Red Hat didn't intentionally pursue this strategy, but it sure is convenient for them that the harder Linux is to use, the more they can sell support contracts. Not so convenient for people who actually want to use Linux.

1. https://en.wikipedia.org/wiki/Thomas_Midgley_Jr.


> It is my opinion that Lennart Poettering is the software equivalent of Thomas Midgley Jr., the inventor of both leaded gasoline and chlorofluorocarbons.

Considering that Midgley Jr. is (indirectly) responsible for the suffering and potentially death of many humans and other beings, I submit that this (in spirit) triggers Godwin's Law. The debate is therefore over.


> Considering that Midgley Jr. is (indirectly) responsible for the suffering and potentially death of many humans and other beings

Poettering is directly responsible for the suffering of many human beings, as indicated by threads like this. It'd be difficult and unfair to accuse him of being even indirectly responsible for anyone's death, though.


>Sensible alternatives would have outcompeted them.

Of course, and they didn't, because there were no sensible alternatives on Linux. All these projects were clones of decently engineered macOS projects; before they were created, Linux people used some truly silly stuff like sysvinit.

I would use something better than pulse and systemd, but there is nothing better available.


> All these projects were clones of decently engineered macos projects

systemd is the third world copy of Solaris SMF, is it not?


Jack and runit.

They manage better performance in real world use cases with a fraction of the manpower invested in them. I wonder what will happen to the suite of projects supported by redhat now that they are owned by IBM however.


>Jack and runit.

Jack as a regular sound daemon? are you joking? Have you tried to run jack with several audio sources and audio outputs (including bt headset) without pre-configuration? It's not a pulse alternative, it's a low-latency daemon for professional stuff. Consumer oriented daemon should just work.

>runit

Does it do anything beyond service managing? Login, timers, bootloader, containers, logging facilities?

>runit starts /etc/runit/1

Wait, does it simply run sh scripts? No, thank you.


Which alternative to PulseAudio, Avahi, and Systemd exists for Linux that provide the same features?

I'm not aware of any audio daemon that is as feature rich and flexible as PulseAudio. None of the alternatives fit the same general purpose use case that is expected in a modern desktop that needs to compete with Mac OS and Windows 10.


If you use a modern Linux system with KDE, you can try removing PA. Odds are you won't notice that it is gone. Not saying that network audio is a bad feature per se, but in my ten years of using PA, I have never needed it. IMO, systemd and Avahi are fundamentally different from PA in that most users have no need for the latter but almost everyone for the former.


Because KDE has Phonon, an audio subsystem way more complex than PulseAudio (and in my experience, more bugged too).

Things like hotplug, Bluetooth devices and such shouldn't be managed by your desktop manager, unless you want things like hotplug and Bluetooth to only work in KDE.


Phonon is only an abstraction layer with API and ABI guarantees so every time a new technology comes and goes, all app developers wouldn't have to port their existing codebases.

Funny story:

During KDE 4.0 development, KDE introduced Solid library. Which abstracted HAL.

HAL was Linux's "Hardware Abstraction Layer".

So HAL developers mocked KDE and got some tshirts that said "KDE Abstracted my abstraction layer".

1-2 years later HAL was deprecated. And Solid got a new upower/udev backend and no KDE developer had to port away their code from HAL to anything else.

Phonon's situation is similar. Xine, GStreamer, VLC: these technologies come and go. KDE apps don't care.

Also Phonon can get a QuickTime backend when it's built for Mac or something else for Windows.


Phonon is a sound API, PulseAudio is a sound server. It is not the same thing.


> This is the person behind PulseAudio, Avahi, and systemd.

Huh, I didn't know he was behind Avahi. That... explains some things. Thanks.


You might be giving Lennart too much credit. As the author of the article points out, Lennart didn't invent PulseAudio and systemd so much as he ripped them off (poorly) from coreaudio and launchd. To compare him to any inventor seems too flattering.


Avahi is just a reimplementation of something Apple developed, too. Apple developed zeroconf networking, with their implementation of the standard being Bonjour. https://en.wikipedia.org/wiki/Avahi_(software)#Avahi_vs._Bon...


Also not a huge fan but it works better than a bunch of shell scripts.


Because runit openrc perp and s6 aren't a thing?



It works worse than launchd, which Lennart ripped off. My point here is that calling a ripoff artist an inventor is being far too flattering.

(And for the record I wouldn't call Torvalds an inventor either, even though he has more sense than Lennart.)


Don't you still have to resort to shell scripts for anything slightly complicated (mounting a drive on a specific network connection) and just have systemd invoke them?


So far this is a particular thing I have run into.

DevOps (yelling): no more shell scripts

Devs: systemd fstab is all kinds of fucked for nfsv4

Devs: Writes super complicated script to mount

Me/DevOps: Makes ansible script to make systemd mount paired with service that is basically a wrapper for... a shell script...


I hate systemd with a passion, however I understand the reason for it. I think that systemd is a bit of a Frankenstein by nature of having to pull udev and d-bus into it. So I guess it is a victim of having to deal with the system it is on. Oh well.

I think the only Unix system to get this right was likely Solaris via SMF.

https://en.m.wikipedia.org/wiki/Service_Management_Facility

WindowsNT also did it correctly.


I'd recommend everyone watch the video: https://www.youtube.com/watch?v=o_AIw9bGogo


There are parallels with city planning.

systemd reminds me of the efforts in some cities at clearing 'slums' to replace them with 'modern' apartment blocks.[1]

systemd seems like a similar technocratic, modernist effort aimed at UNIX.

Operating systems are kind of like cities. On the surface they are often inefficient and messy, and people often wish they could come in and make them logical and rational, but people live in those cities and neighbourhoods, and sometimes the postbox is where it is for a reason.

If Jane Jacobs[2] were around, she could write a book called: "The Death and Life of Great American Operating Systems"

[1] See: https://en.wikipedia.org/wiki/Pruitt%E2%80%93Igoe, https://en.wikipedia.org/wiki/Quarry_Hill,_Leeds, https://www.youtube.com/watch?time_continue=862&v=Dxr2tEGlWU...

[2] https://www.citymetric.com/fabric/against-modernist-nightmar...


If the current state is the tragedy, I'm terrified of the farce.

My favorite systemdisms:

* (already mentioned several times above; sorry) even detached and nohup'd processes will be killed if the systemd context that transitively led to the creation of the process is killed: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=825394

* mounts being in a separate "mount" namespace that is not accessible normally: http://www.volkerschatz.com/unix/advmount.html
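
On the first item: the behaviour is controlled by the KillUserProcesses= setting in the [Login] section of logind.conf. Below is a minimal sketch that reports what a single logind.conf text would do. It assumes the upstream compile-time default is "yes" (as stated near the top of this thread; your distro may well have flipped it), it ignores drop-in directories, and the helper name is hypothetical.

```python
import configparser

# Assumption: upstream compiles in KillUserProcesses=yes. Many distros change
# this compile-time default, so pass the right value for your system.
UPSTREAM_DEFAULT = True


def kills_user_processes(logind_conf_text: str, default: bool = UPSTREAM_DEFAULT) -> bool:
    """Return the effective KillUserProcesses= value for one logind.conf text.

    Only a single file is considered; drop-in directories are ignored.
    """
    parser = configparser.ConfigParser()
    parser.read_string(logind_conf_text)
    # Falls back to the default when the section or the option is absent,
    # which includes the case where the line is commented out with '#'.
    return parser.getboolean("Login", "KillUserProcesses", fallback=default)


if __name__ == "__main__":
    opted_out = "[Login]\nKillUserProcesses=no\n"
    commented = "[Login]\n#KillUserProcesses=yes\n"
    print(kills_user_processes(opted_out))   # explicit opt-out
    print(kills_user_processes(commented))   # falls back to the default
```

If this returns True for your effective configuration, a nohup'd or detached process started from a login session is a candidate for being killed at logout.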


Damn. So I'm just a simple Debian user. Systemd showed up, so I learned as much as I needed. And in some ways, I found it easier to use than init scripts.

Now I've been using Docker. And there is no systemd in containers. But of course, why would you want that? But the problem is that many packages now depend on systemd. Or at least, useful features do. So I'm back to fighting with scripts. It's funny, no?


Maybe you wouldn't want systemd, but a basic init system as PID 1 is wise for a Docker container. See https://github.com/Yelp/dumb-init
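
For anyone wondering what "a basic init" actually has to do about zombies, here is a minimal sketch of the reaping step only. It is not how dumb-init or any other tool is implemented; the function name is hypothetical, and a real init would also forward signals and run this in response to SIGCHLD.

```python
import os


def reap_zombies() -> list:
    """Collect every child that has already exited, without blocking.

    Returns the list of reaped PIDs. When run as PID 1 this also covers
    orphaned processes that were re-parented to it; in an ordinary process
    it only covers that process's own children.
    """
    reaped = []
    while True:
        try:
            # -1 means "any child"; WNOHANG makes the call return at once
            # instead of waiting for a child to exit.
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children left at all
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped.append(pid)
    return reaped
```

A process that never calls something like this (a typical application running as PID 1 in a container) leaves every exited child sitting in the process table as a zombie.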


Wow. Thanks.

I've been wondering about all those zombies.


I would suggest https://github.com/openSUSE/catatonit (self-plug -- I wrote it). dumb-init (and tini which is what "docker run --init" uses) don't handle zombies in the most ideal fashion possible, catatonit does. And it does so in less code.


Thanks. I use "docker run --init", and there are indeed zombies :(

I play a lot with shell scripts, and ps gets littered with grep and awk.


Why are you saying there's no systemd in containers? Systemd works perfectly both outside and inside containers, docker or any others.


I'm sure it does, however it's supposed to.

What I mean is that systemctl isn't available in containers:

   root@foo:~# systemctl status sshd
   bash: systemctl: command not found


Yeah, I know that I ought to be managing services in containers with systemctl in the host. I just haven't learned that yet.


That means only that you run an OS without systemd inside the container. Which you are not required to.


> And there is no systemd in containers. But of course, why would you want that?

I knew a guy who invented an orchestration system that involved shoving an init process into docker containers. Much sadness ensued.


Are you referring to dumb-init?

Is there a better way to reap zombies?


> Are you referring to dumb-init?

Nope.


For all those lamenting the death of standards, change for the sake of change, system breakage, bloat, and lack of user control: give FreeBSD a try (or any BSD, really).


systemd broke UNIX. Everything that worked before is now a minefield. Last week I had a SLES virtual machine refusing to boot, no network, just emergency shell. The reason was a forgotten line in fstab (without nofail option of course), so systemd couldn't mount some file system. Coming from Gentoo, this is just unbelievable.


Systemd correctly implements the "noauto" and "nofail" keywords in fstab(5) that were added in 2008 in the way they have been documented since 2010: if neither of these is specified for a filesystem, then that file system will be mounted at boot time and a failed attempt to mount that filesystem implies a boot failure.

  Util-linux-ng 2.14 Release Notes (09-Jun-2008)
  ==============================================
  mount(8) supports new "nofail" mount option.

https://github.com/karelzak/util-linux/commit/abe3d704b6aeb6...

The fact that previous init scripts gladly ignored these failures does not reflect negatively on systemd but on those init scripts.

Since 2011, FreeBSD has the same options with the same behavior, except that "nofail" is called "failok".

https://svnweb.freebsd.org/base?view=revision&revision=22283...

  Add a special mount option "failok" to indicate that the administrator wants
  the system to proceed to boot without bailing out into single user mode,
  even when the file system can not be successfully mounted.

By your argument, FreeBSD broke UNIX.
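
The rule above (an entry is boot-critical unless it carries "noauto" or "nofail") is simple enough to audit with a few lines of script. This is a minimal sketch under that stated rule only: the function name and the sample fstab are hypothetical, and it does not special-case swap entries or any other option.

```python
def boot_critical_mounts(fstab_text: str) -> list:
    """Return mount points whose mount failure would fail the boot.

    Follows the rule stated above: an entry is boot-critical unless its
    options include "noauto" or "nofail".
    """
    critical = []
    for line in fstab_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) < 4:
            continue  # malformed entry; a real tool should report it
        mount_point, options = fields[1], fields[3].split(",")
        if "noauto" in options or "nofail" in options:
            continue
        critical.append(mount_point)
    return critical


# Hypothetical fstab, for illustration only.
SAMPLE = """\
# device        mountpoint  type  options          dump pass
/dev/sda1       /           ext4  defaults         0 1
/dev/sdb1       /data       ext4  defaults,nofail  0 2
/dev/sdc1       /backup     ext4  noauto           0 0
"""

if __name__ == "__main__":
    print(boot_critical_mounts(SAMPLE))
```

Anything this lists that is not actually required for the machine to come up is a candidate for "nofail".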


systemd refusing to boot without required drives is actually a favorite feature of mine.

Ever since I had an RH AS server boot without /var mounted. It was truly lovely with a gigabyte of trash filling up / then trying to decide if it was worth merging data or just throwing it out.


I hate parallel service startup on servers.

Yes, it is faster. But when it breaks because of a bad dependency, it is hard to reason about the breakage from an initrd shell on a small remote console.

I value debuggability, reliability and manual fallback options on servers.


I hate writing systemd .service files.

I'm always confused by the Type (should it be simple? forking?) and my program ends up starting but not stopping or things like that.

I know I lack the knowledge to write a good .service file in one go, as I write one every other month at best and forget all the details in between. It would be the same with traditional init scripts. However, with init scripts, the way to debug is obvious: just add a '-x' somewhere and here we go, you have a detailed execution. I don't know the equivalent with systemd and I end up bashing my head against the wall trying everything blindly.


I particularly like how it ripped out the old logging system but completely failed to replace it properly. Local might be ok, but shipping to a central log server? Just save yourself the headache and use syslog or some other shipper like filebeat, etc. How can something that can't even do logging properly suddenly become the de facto standard?


> I still think NSA used RedHat to push systemd for potentially nefarious reasons.

Wow, just wow. Do you have any idea how insulting this is to the folks dedicating their lives to writing this GPL software?

You should try having an actual conversation with the authors some time. These are real people and long-time GNU/Linux users and advocates you're implicitly accusing of being NSA collaborators.


Red Hat's collaboration with NSA is well documented. I think given what we know of past actions it's at best extremely naive to not at least entertain it as a possibility.

Linus himself semi-openly admits he's been approached...

https://www.youtube.com/watch?v=7gRsgkdfYJ8


I like to share the following links to those who also don't like systemd.

http://without-systemd.org

Devuan (Debian without systemd):

http://devuan.org

More Linux Distros without systemd:

http://without-systemd.org/wiki/index.php/Linux_distribution...


Debian Stretch works nicely for me without systemd


Notable quotes

  22:10 "Unix is dead."
  25:28 "Change is awesome when we are the ones doing it."
  27:20 "No one needed to send death threats over a piece of software."
  27:30 "Contempt isn't cool."


From the article -- Windows NT had a strong service model from the beginning, he said, and Mac OS has one now in the form of launchd. Other systems had to play a catch-up game to get there.

Reminds me of the saying, "and if your friends all jumped off a cliff, would you jump off too?" which exasperated parents use to counsel their children that emulating others needs to be done responsibly.


There is a big difference here. To continue with the cliff analogy: Launchd ran fast enough to clear the cliff and was rewarded by a refreshing dive into the cool water; it was implemented in a competent manner and consequently didn't cause people to become upset. Systemd, on the other hand, can't run without stumbling; it hobbled off the cliff and bounced off every rock on the way down. It made lots of people pissed off because it interfered with their routines.

If systemd didn't cause problems for people, virtually no user would even know what it was, and consequently, virtually no user would hate it. If you ask your average OSX user what launchd is, they'll come up with a blank because launchd isn't causing trouble for them. If you ask your average linux desktop user what systemd is, a large portion of them will recognize the name, and many of them will chew your ear off out of frustration.

At the end of the day it's not about "adherence to philosophies" or any similarly dubious concepts. Whether or not software is hated is ultimately determined by whether or not it causes problems for people.


That's not the problem. Windows NT did have that right from the beginning. The problem is the second sentence, because it ignores that so too did Unix from the same timeframe. AT&T System 5 Release 4 had the Service Access Facility from 1988. AIX had the System Resource Controller from version 3.2. Other systems were not playing catch-up. They had this stuff too, at around the same time.

Benno Rice's explanation is actually ahistorical here. What actually happened is that the world spent a lot of time "cloning" existing Unix software during that time period, for reasons that we all know, and a lot of the time the clones were behind the times and did not catch up. By the time that Miquel van Smoorenburg cloned AT&T Unix init and rc for Minix, for example, what xe was cloning had already become years out of date.

* http://jdebp.eu./FGA/inittab-getty-is-history.html


We specifically use runit because we've had systemd issues in the past. Systemd may have matured now but at this point we're moving over to containers anyway, so it doesn't matter.


> "I'm not going to defend Lennart's approach to community interaction, but what I will say is you have to admire the willpower and determination of somebody who at work one day says 'I'm gonna write my own init system- actually no it's going to be a full on system layer for linux' [...] and then he gets it put into pretty much every major linux distribution."

Yeah, I suppose I also admire the willpower and determination of Genghis Khan.

Corbet is almost as dismissive as Lennart himself of the life-disruptive bugs real people encounter due to Lennart's approach to software engineering and project management. "It has bugs? It's software" This attitude is positively correlated with the amount of hate systemd (and pulseaudio) have gotten over the years.


If planes fell out of the sky due to buggy software as often as systemd crashes due to yet another bug, air travel would get much hate, too.


Exactly.

Generally speaking, people like things that work well and hate things that don't. Any other justification citing 'philosophies' or similar nonsense is a retroactive attempt to cast those feelings in a more intellectual light.

When somebody says "I dislike systemd because it violates the Unix philosophy" what they almost certainly actually mean is "systemd has caused me a lot of grief." You don't hear complaints about firefox or chromium violating the so called "Unix philosophy" because generally firefox or chromium will satisfy the overwhelming majority of users. That's why nobody uses Uzbl.

This is what systemd advocates (like the article's author) don't seem to get when they point out that systemd is a launchd ripoff so people who like launchd should like systemd as well. People like launchd because launchd doesn't cause them grief. People dislike systemd because systemd does. Design philosophies have jack shit to do with it.

Anyway the only real 'philosophy' Unix ever had was "worse is better" and systemd certainly seems to pay homage to that.


If systemd cost as much as a plane, it would have fewer bugs.

We can go back and forth with dumb comparisons all day but I don't really think we're advancing the conversation.


I disagree about the value of the conversation. Airplane software is meticulously tested because of the safety implications. If we can't have proper testing for a widely used, sprawling piece of software that runs with root privileges and PID 1 we may be better off without that piece of software.

Also this: the equally widespread SQLite seems to be programmed with a much better level of discipline and comes with much better testing. It isn't nearly as bug-ridden.


It's a good comparison because sqlite is used in many embedded aviation applications.

I do get frustrated with sqlite a lot because it's missing so much of what I expect in SQL (no ALTER tables, having to rewrite a lot of right joins into other types of joins, etc.) but I have never had a situation where I got frustrated with sqlite because it lost my data.


And still there are bugs in SQLite, despite unprecedented testing that many consider the best in the industry.


And the SQLite developers respond to bugs with a far more professional attitude. They apologise, they fix. They try to ensure it won't happen in the future. They do not make excuses.

They don't dismiss concerns - they validate them.


I really don't think the often-hated "not my problem" answers Poettering gives are that unprofessional, because in 99% of cases he points to the exact party whose problem it is, and mostly that is the distribution maintainers, which makes sense.


Except, defaults matter. If you ship something that violates decades of well-known behavior, it is at least very unfair (and probably a lot of other unpleasant adjectives) to expect downstream to enable the --suck-less flag to restore the expected behavior.

The friendly and not bull-headed thing to have done was to ship the new behavior, but enable the compatibility flag by default, rather than expect other people to clean up after systemd's messes.

This mentality is the main reason people bag on systemd all the time. If systemd does something to annoy you, it is automatically your fault for doing something Wrong or not enabling some obscure config file option. It's never systemd's fault when systemd unilaterally violates the principle of least surprise.


>> Except, defaults matter.

Do they?

Debian maintains a hella lot of patches for almost every package, for reasons I can hardly justify. Overriding a few systemd config files is nothing. But let's say systemd sets some sane defaults and everyone is happy. Or are they? Say, for instance, the default NTP server is "centos.pool.ntp.org". Oh, crap, Debian still can't use it.


The comparison is still dumb, because if you need a pilot for each autopilot to counter failure (which happens!), you would need a sysadmin for each systemd instance to counter failure. I'd say systemd, running mostly unsupervised, should be better than autopilots, which nowadays never run unsupervised.


If you add up all the engineering time used diagnosing systemd bugs at companies that use systemd distros, you might discover it’s enough money to run a small airline.


You have erroneously put Benno Rice's words into Jonathan Corbet's mouth.


Can I easily replace systemd on a distro that came with it? I wouldn't consider myself a Linux expert, but I got to a point where I could get around and started to understand the different init runlevels and how to set them, then everything was ripped out from under me and I no longer understand how to work everything anymore. Perhaps Linux boots faster now, I don't know, but it was really very frustrating I couldn't find a major user-friendly distro that worked the way I was used to.


The writing was on the wall for run-levels for a long time. Longer than Linux has existed, in fact.

* http://jdebp.eu./FGA/run-levels-are-history.html


The author goes into 'contempt culture', which is valid in any group of people (e.g. 'PHP haters'), but then tries to slip through the old trope of 'resistance to change'. That is a clever bait and switch: it positions change as 'always good' and refashions dissenters as anti-change. It is a time-tested PR strategy for pushing through unpopular changes that benefit a few, by positioning any 'dissent' as 'anti-change' and thus avoiding the meat of the discussion: the change in question.

We can agree on the problem and then on the solution, but any proposed solution can't just pass through 'without scrutiny' because a problem exists.

Systemd proponents since the beginning have positioned any criticism as 'motivated', 'greybeards', 'haters' or anti-change which tells you that there is no space for discussion here, only 'acceptance'. Can there be any legitimate criticism of systemd without this kind of politics, hypersensitivity and now posing as victims?


Also the whole "One solution for people who don't like it is to create their own alternative; that is a good way to find out just how much fun that task is."

This seems to imply that those criticizing it didn't have alternatives, which is just not true. However it is so monolithic that as soon as a tiny part is depended on by something else (e.g. gnome) alternatives quickly become a non-starter.

systemd seems to be getting rewarded for not playing well with others, which plays strongly into a lot of people's dislike of it.


I've used linux systems since '93. I waited until the last minute to adopt systemd. So far, for me at least, it's been great. It's a lot easier to express service dependencies, find problems when they occur. I hear a lot of hate, but in all, systemd seems to solve a ton of problems in a single place, which is something I think linux needed for a while.


My impression of systemd: shoddy version of launchd.


I recall there was a systemd bug that didn't mount LVM volumes correctly. Some people lost data; that's a real tragedy.


Surely if there's an issue that serious, you have a bug report number?


Here you go: [0], from 2017.

[0] https://github.com/systemd/systemd/issues/6066


So, fstab says to mount a filesystem, and systemd mounts the filesystem. The reporter explicitly says they want the filesystem mounted at boot time. The response is correct: there's no such thing as "mounted at boot time but not 'after system boot process completed'", because there's no magic "after system boot process completed" point at which that behavior could change. When the device becomes available it gets mounted.

sysvinit and typical distribution init scripts couldn't implement the behavior seemingly desired by the user, either.


... All I did was find the issue number.

The ticket follows the sensible approach: Document it, and close it as expected behaviour, and it did lead eventually to an LVM issue that did require a fix [0].

systemd did good here, above board, but there was a faff about it, because the expected behaviour of the audience differed from the reality of the situation.

[0] https://github.com/systemd/systemd/pull/6174



Farce, not tragedy


a lot of people shit on systemd because of the "SIGHUP" "bug"; however, what they don't realize is that there are more use cases where logging out a user SHOULD kill all of that user's processes.

I mean, without that behavior linux is not a multi-user system. The default is sane. The lack of "permissions" around it is a little bit bad, but since it is runtime configurable this should be enforced by the distribution, i.e. when installing the distro there should be a switch "this system is only used by a single user". it's basically the same as windows uac: while it looks bad for a single user, it's not that bad when you see computers as a tool which can have more than one user, with more than one "privilege".
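
For reference, the runtime switch in question is, as far as I know, the KillUserProcesses= setting in the [Login] section of logind.conf. Below is a minimal Python sketch of how an installer could read the effective value. The function name and the compiled_default parameter are made up for illustration (the latter stands in for the compile-time default), and drop-in directories are deliberately ignored.

```python
import configparser


def kill_user_processes(logind_conf_text: str, compiled_default: bool = True) -> bool:
    """Return the effective KillUserProcesses= value from logind.conf text.

    compiled_default is a stand-in for the compile-time default (an
    assumption of this sketch; upstream ships it enabled). Drop-ins such
    as logind.conf.d/*.conf are NOT considered here.
    """
    parser = configparser.ConfigParser()
    parser.read_string(logind_conf_text)
    # A commented-out or missing option means the built-in default applies.
    if not parser.has_option("Login", "KillUserProcesses"):
        return compiled_default
    # getboolean() accepts yes/no, true/false, on/off and 1/0.
    return parser.getboolean("Login", "KillUserProcesses")
```

A single-user install would then write KillUserProcesses=no under [Login], and a shared machine would leave the default alone.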

systemd just shows that linux is by far not ready for the masses, because a lot of behavior is undefined, like the uid/username stuff. usernames can be created as you wish, but most programs will misbehave if you create usernames that start with a number; it's a security nightmare.

and the problem is that there is no real process to actually get ALL the people together who can solve this, so it will never be fixed. there is only the posix standard, but not everything in it applies to linux and not everything in it is a good idea for linux.

systemd actually does a lot of things the right way, but of course it's not perfect, and probably at some point people will come up with something better.


more like abomination


I hate how he talks about knee-jerk reactions to change, because I don't think that's what's going on here. I remember I first saw systemd on a release of opensuse. I didn't think anything of it at first, except that I didn't really like the command line interface (systemctl) and I found the flags and options cumbersome. I often see new software in new releases (including the old HAL/DBUS layer) and didn't have the same reactions (although HAL had a lot of issues and was later removed or merged into dbus).

I've seen the BSD talk on this and I agree, having a system layer is helpful. It'd be nice if it was pluggable: NetworkManager (or others that have some standard messages you can send/get via dbus), consolekit OR logind, etc.

systemd does make it nice that I only have to write startup/shutdown scripts once for each distro, but I'm not happy with the layout of target files, the way mounts are handled, some of the weird race conditions I've found between systemd mount targets and fstab, etc.

systemd is modular, but the modules are still all part of the whole and are not easily replaceable. The same can be said of Docker after its modular refactor, but there are alternative implementations of the entire docker engine. Every attempt to create an alternative implementation of systemd has eventually gone unmaintained because systemd keeps getting more and more complex and engulfing more systems.

If it wasn't for distros like Void, Gentoo, Alpine, Slackware, et al., we'd no longer have a choice at all. There would be some things that simply couldn't be deployed on embedded systems because all of the dbus shims just wouldn't exist.

It's not that people are opposed to change, it's that there are legit concerns about some of the ways systemd works and is implemented, and the way it's been ham-fisted as a political move in a lot of ways.

Honestly, I don't think it will matter in a few years. I think the way things are going, eventually all services will be hosted via docker containers and it will be much easier to make Linux distros that have a tiny init layer that just launches a docker daemon and services. RancherOS already does this, with the init process being a container, which can be used to start up shell environment containers and other service containers.


>It's not that people are opposed to change, it's that there are legit concerns about some of the ways systemd works and is implemented, and the way it's been ham-fisted as a political move in a lot of ways.

I personally think the industry needs a lot more resistance to change when it comes to interfaces and other things humans have to understand.

I mean, I'm not talking about systemd in particular; I'm talking in general about how interfaces change over time and people don't seem to take into account the cognitive costs of that change. Sure, ss is better than netstat and ip is better than ifconfig... but how much of that 'better' could you have done in a way that didn't toss away the historical knowledge so many people have of those tools?

And really, sysadmin tools are the least of it; I mean, they are operated by professionals, so if you want to pay for retraining (or pay the costs associated with there being fewer of us)

People change customer facing interfaces to no benefit all the time, forcing people who are trying to do other things to put effort into re-learning their interface.

I mean, my point is that interface changes are expensive, and should not be undertaken without a really good argument that they bring more benefit than the cost of retraining.


First we need sane (secure, semantic, programmatic) interfaces that slowly become standards.

netstat? /proc files? ss? parsing text? wtf?

I mean, sure, why not, but at least don't call them interfaces. They are userland apps people like to script, because they are too lazy to use libnetlink (or libwhatever that uses the right kernel interface, if it exists at all).

That said, the recent gmail ui change made me reconsider Thunderbird again. And android looks different every year. sometimes it's better, sometimes it's worse. iptables, nftables. http1, http2 (and now 3 over UDP). change is the only constant.


You need to figure out what is wrong with a machine before you write a program to fix it; this is why it's important to be able to log in to something broken and nose around.

Text processing is not harder than figuring out what library to use this month.

These things change a lot... but they don't have to, and running things on computers would be easier/cheaper if they didn't.
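
To make the text-processing side of this concrete, here is a small Python sketch that decodes one row of /proc/net/tcp, one of the /proc files mentioned upthread. The function name is invented for illustration, only two state codes are mapped, and it assumes IPv4 on a little-endian host (the hex address is printed byte-reversed there).

```python
import socket
import struct

# Subset of the kernel's TCP state codes as they appear in /proc/net/tcp.
TCP_STATES = {"01": "ESTABLISHED", "0A": "LISTEN"}


def parse_proc_net_tcp_line(line: str):
    """Parse one data row of /proc/net/tcp into (local, remote, state)."""
    fields = line.split()
    local, remote, state = fields[1], fields[2], fields[3]

    def decode(endpoint: str) -> str:
        hex_addr, hex_port = endpoint.split(":")
        # Assumption: little-endian host, so the address bytes are reversed.
        addr = socket.inet_ntoa(struct.pack("<I", int(hex_addr, 16)))
        return f"{addr}:{int(hex_port, 16)}"

    # Unknown state codes are passed through unchanged.
    return decode(local), decode(remote), TCP_STATES.get(state, state)
```

It is a dozen lines, but it also hard-codes a column layout and a byte order, which is roughly the trade-off both sides of this subthread are arguing about.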


> systemd is modular, but the modules are still all part of the whole

The idea that you are searching for is coupling. Modular systems should aim to have low coupling and high cohesion.


Not everything runs well in a container. In fact with the lack of network understanding in most container implementations I would say there are many issues.

Additionally, stateless container design and keeping state in containers are opposing approaches, and implementations of both exist.


docker is currently being killed by the kubernetes community btw. It's a slow death in some regards (1+ years) but quick in regards to "it will replace anything in 10 years".


You know, there are so many "big" wrong decisions in systemd's design and assumptions. In particular, there seems to be an unhealthy focus on graphical desktops rather than headless multi-user servers; the default of SIGKILLing all the things when an ssh connection drops is but one example. And then there's the monolithic design (DNS in the init?).

But that said, does anyone know where on earth they came up with the command line ux? Like the names of the commands, and the parameters? I mean, they are like an April fool's joke...


My only issue is the binary log files. When you're writing out the logs to disk, just make sure it is in text! How hard can it be?? It creates unnecessary friction when dealing with logs from a crashed system.


Does this ever actually come up for all the people complaining about it? Where are these people working that they're not willing to install rsyslog which reads journald logs directly and will write plaintext, but who have this problem enough to warrant all the complaining?

Like, previously to have any logs at all you had to have a syslog daemon. This is not a new situation.
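
If installing a syslog daemon is not an option, the journal can still be turned into text after the fact. Here is a rough Python sketch that converts one line of `journalctl -o json` output into a syslog-style line. The function name is made up, the field names are the usual journal ones, and it assumes MESSAGE is a plain string, which is not guaranteed for non-UTF-8 payloads.

```python
import json
from datetime import datetime, timezone


def journal_json_to_text(json_line: str) -> str:
    """Format one `journalctl -o json` record as a syslog-like text line."""
    entry = json.loads(json_line)
    # __REALTIME_TIMESTAMP is microseconds since the epoch, as a string.
    usec = int(entry["__REALTIME_TIMESTAMP"])
    stamp = datetime.fromtimestamp(usec / 1_000_000, tz=timezone.utc)
    ident = entry.get("SYSLOG_IDENTIFIER", "unknown")
    pid = entry.get("_PID")
    tag = f"{ident}[{pid}]" if pid else ident
    host = entry.get("_HOSTNAME", "localhost")
    message = entry.get("MESSAGE", "")
    return f"{stamp.strftime('%Y-%m-%dT%H:%M:%SZ')} {host} {tag}: {message}"
```

For logs from a crashed system, journalctl can be pointed at the copied journal directory with -D and its JSON output piped through something like this.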


> Like, previously to have any logs at all you had to have a syslog daemon. This is not a new situation.

What's new is that now more and more-difficult-to-debug pieces are required. For what benefit?


This is a "now you have two problems" solution.


I just don’t have time to beta test an init system with more LoC than the 2.6 kernel plus GCC 4. For now Devuan is where we landed, but moving entirely away from Linux as Red Hat moves systemd into the kernel is the overall plan.


First as tragedy...


Systemd just makes things worse.

I remember seeing someone post one of the simplest init processes one could write. It was only about 100 lines of C.


It's much less than that, but it's sort of an uninteresting demo I think - you could pretty easily write an init process that just spawns systemd in a container, for instance, and run the whole system in a container. Or write an init process that spawns systemd as pid 2, runs prctl(PR_SET_CHILD_SUBREAPER) to make pid 2 responsible for reaping instead of pid 1, and then sits around carefully doing nothing. If systemd crashes, whatever, hopefully sshd etc. are still up.
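
A rough sketch of the pid 2 variant, in Python rather than C just to keep it short. The function names are invented for illustration; the constant 36 is PR_SET_CHILD_SUBREAPER from linux/prctl.h. It relies on the subreaper attribute surviving execve, so the child sets it on itself and then execs the real service manager. Linux only.

```python
import ctypes
import os

PR_SET_CHILD_SUBREAPER = 36  # value from <linux/prctl.h>


def become_subreaper() -> None:
    """Mark the calling process as a child subreaper (Linux only)."""
    libc = ctypes.CDLL(None, use_errno=True)
    rc = libc.prctl(PR_SET_CHILD_SUBREAPER, ctypes.c_ulong(1),
                    ctypes.c_ulong(0), ctypes.c_ulong(0), ctypes.c_ulong(0))
    if rc != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


def spawn_as_subreaper(argv):
    """Fork; the child becomes a subreaper and then execs argv.

    Orphaned descendants of that child get reparented to it instead of
    to pid 1. Returns the child's pid to the caller.
    """
    pid = os.fork()
    if pid == 0:
        try:
            become_subreaper()
            os.execvp(argv[0], argv)
        finally:
            # Only reached if prctl or exec failed.
            os._exit(127)
    return pid
```

The do-nothing pid 1 would call spawn_as_subreaper() once with the service manager's argv and then just block, since nothing gets reparented to it any more.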

Of course, if you do that then the question is why even bother with an init process. Patch the kernel to not treat pid 1 specially. If a process gets orphaned, don't reparent it to init, just reparent it to nonexistent pid 0. Have the kernel deal with the thing that the 10 lines of C would be doing (waiting on processes to terminate).

I think there are meaningful complaints about systemd but the fact that it's a complicated pid 1 is not one of those. If it were actually a problem someone would have written and productionalized one of the above two approaches.


It was code that even let one make simple files to start up one-shot processes or daemons. If a long-running daemon died, it would restart it. Plus it handled getting the status of orphaned processes, and a few other things.
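
The restart-on-death part is the easy bit. Here is a minimal sketch in Python (not the C code I'm remembering). The function name and the max_restarts bound are made up; a real supervisor would loop forever and add some backoff.

```python
import subprocess
import time


def supervise(argv, max_restarts=3, delay=0.0):
    """Run argv and restart it each time it exits.

    Stops after max_restarts restarts so the demo terminates; returns the
    list of exit codes observed, one per run.
    """
    exit_codes = []
    for _ in range(max_restarts + 1):
        proc = subprocess.Popen(argv)
        exit_codes.append(proc.wait())  # blocks until the daemon dies
        time.sleep(delay)               # crude pause before restarting
    return exit_codes
```

Everything beyond this (dependencies, ordering, logging, cgroups) is where the other hundred thousand lines go.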

Really the biggest issue with systemd is scope creep. Scope creep means more and more packages will have systemd as a hard dependency. For example, there are some aspects of gnome that will not work without systemd, but at least it's not entirely a hard dependency; it just means you lose features unless you patch gnome.


Yet OpenBSD has Gnome without systemd. So it is possible.


So does gentoo, but if you look at their docs, you may need to patch it if you want some things.


As a conspiracy theorist, I'm going to say that systemd is aimed towards destroying Linux and everything it represents.

They want to push the cloud, to better control us, take away power from us, leave us at their mercy.

The next step is, when everything that matters is on the cloud, they are going to replace Linux with something else that only they know how it works on the inside, and that's going to be the end of it, we'll be left only with paywalled APIs and services, but they'll say it's better for everyone and all of that. It's sad.


You're confusing systemd with docker and kubernetes.


I hate how accurate this is. The more I work on Kubernetes implementations, the more I feel like it's a tool that 99% of companies can't effectively operate. The attitude is very reminiscent of the mainframe era where ops didn't want mere users touching their precious systems.

I feel like as an industry we're wasting tons of cycles solving the wrong problem.


A well-written, thoughtful article, worth the read.

> Nerds have a complicated relationship to change; it's awesome when we are the ones creating the change, but it's untrustworthy when it comes from outside.


Yikes! How in the world did I lose 2 of my 3 karma points on this?! Somebody please explain and upvote? Linux user since the 90s but totally new to Hacker News. I'm good with constructive criticism, opinions, etc. but not a fan of downvotes without explanation.



