Why are there no decent Go or Rust alternatives to Ansible?
14 points by encryptluks2 on Feb 3, 2022 | 16 comments
I finally started automating some of my bash functions and scripts with Ansible. I have found chezmoi to be an incredible tool for managing dot files, but it isn't really made to manage files outside your home directory or to run scripts. While you can still do it, their FAQ recommends against it, and it becomes semi-obvious why.

I've worked with Ansible in the past both professionally and in my free time. I found it to be incredibly useful, but I have been burned many times by poor playbooks, documentation, etc.

Running a bunch of plays in Ansible can feel like you're submitting a job to a CI server. If you're used to running them in scripts, you'll notice that what used to take a second or two may now take 15 or 20 seconds.

What also baffled me is how slow it can be, especially when running modules on localhost. For file operations such as deleting some temporary paths, it is insane how much slower it is than a simple shell script.

chezmoi is different though. It is pretty much instantaneous. What is even more incredible is that with templates, includes and more, there is no noticeable slowdown.

Having used Hugo as an SSG and created Helm charts, I got used to Go templates. While I am not a huge fan of Go templating, the functions seem as powerful as Ansible's, even more so because it is fast.

Which brings me to the point of all this: why has no one replaced Ansible with a Go-based tool? Something like a hybrid between Ansible and chezmoi.

Are people just using CI now for automation and if so what can you recommend that is lightweight?



One of the appeals behind Ansible is its "agentless" workflow. The couple of minutes you lose waiting on execution time is nothing compared to the man-hours saved by not having to install and maintain agents on all your remote systems. When used properly, it also maintains idempotence, meaning that subsequent runs of your playbooks should not change anything that is already in the desired state. Each execution should have the same result and should not cause configuration drift.
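To make the idempotence point concrete, here is a minimal sketch in plain Python (not Ansible's actual implementation). The function name `ensure_line` and the demo file are made up for illustration. It checks the current state first and only acts when the desired state does not already hold, so a second run reports no change.

```python
import os
import tempfile


def ensure_line(path: str, line: str) -> bool:
    """Make sure `line` is present in the file at `path`.

    Returns True if the file was changed, False if it was already correct.
    (Hypothetical helper for illustration; not an Ansible API.)
    """
    content = ""
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            content = fh.read()
    if line in content.splitlines():
        return False  # desired state already holds, so do nothing
    with open(path, "a", encoding="utf-8") as fh:
        if content and not content.endswith("\n"):
            fh.write("\n")  # avoid gluing the new line onto the last one
        fh.write(line + "\n")
    return True


if __name__ == "__main__":
    demo = os.path.join(tempfile.mkdtemp(), "demo.conf")
    print(ensure_line(demo, "PermitRootLogin no"))  # first run changes the file
    print(ensure_line(demo, "PermitRootLogin no"))  # second run is a no-op
```

A naive shell one-liner that appends with `>>` would add the line again on every run, which is the kind of drift the comment above is talking about.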

I made a career out of shell scripting and learned most automation scripts are not written with the rigor other software is. Most shell scripts written by my peers would not pass a legitimate code review. Things like input validation, error handling and such are not typically considered by sysadmins writing automation scripts.

Tools like Ansible have these patterns built-in already, for the most part.


But it’s not agentless. It depends on ssh. And we discovered back in 1995 that parallel ssh to even a medium sized number of hosts doesn’t scale. In fact, it anti-scales.

There’s a reason why tools like Puppet and Chef have agents. It’s so that they can have a more scalable parallel UDP-based communications mechanism between them and their agents.

Heck, some tools even do a torrent-style mechanism for the lengthy stuff.

These are lessons that Ansible actively refused to learn. At least until recently.

And shells are variable across platforms and versions thereof. If you’re depending on your bash script running exactly the same way everywhere in the world, you’re going to be in a world of pain. There’s a reason why defensive bash scripting is hard and convoluted. And I don’t think that any of the Ansible playbooks have really been written to be sufficiently robust in this manner.

Moreover, all of these tools end up treating your infrastructure like pets, because you’re applying one set of operations on top of another on top of another, ad infinitum.

Instead, you want to treat your infrastructure like cattle, and ideally use immutable infrastructure. If you want to make any changes to the infrastructure, you don’t change the existing ones, instead you spin up new replacements and then tear down the old ones.

To make that easy, you want a robust and reliable build system to create OS images that have everything you want/need, and then you run them read-only. Logs and other things that might want/need local write access use a different partition.

And the further you go down that road, the less and less it looks like Puppet or Chef or Ansible or any of those kinds of tools will be helpful to you.

Of course instances themselves are only part of the equation. You also need to be able to configure network devices and all the other aspects of infrastructure in a similar immutable manner, and tools like Puppet or Chef or Ansible have never really been able to offer much in the way of configuring anything other than hosts/instances.

IMO, you’re asking the wrong questions. You instead want to know how you can deploy your static-built code onto your immutable infrastructure, and what kinds of tools are available to help you do that.


> These are lessons that Ansible actively refused to learn. At least until recently.

what changed?


They continued to get repeated complaints about scalability, and it became more and more clear to them that they would have to implement their own agent, in the vein of Puppet and Chef.

They also got complaints about Ansible scripts failing in weird ways because of all the different shell versions, and their scripts weren’t written in a sufficiently robust manner.

That was a pretty killer one-two punch.


Can you point to some sources about this?


Various private conversations that I had with folks at Opscode/Chef and various people I’ve spoken to from AnsibleWorks.

UT Austin was a user of Cobbler, and after leaving UT Austin, I learned Chef to completely redesign and redeploy the production service networks for a small startup here in Austin by the name of ihiji (which has since been bought by Control4). Many thanks to my good friend Matt Ray for that referral.

I have stayed in the Chef community and the broader DevOps community, working as a consultant for Momentum Software (before and after they got bought by VMware), and even leading up to the work I was doing as a consultant at Whole Foods (before they got bought by Amazon), I continued to have contacts and conversations with various people in this space.

So, nothing on paper. Just private conversations with various people over the years.


System automation is very “scripty”, so the traditional tools have always been in scripting languages (Chef, Ansible, Puppet, Salt), and you can have lots of other people's “code” running instead of doing it all yourself. That is sort of the appeal: enough other people use it that you don’t have to do everything yourself if you don’t want to.

There are solutions in Go; one that looked interesting to me was mgmt [1]. You can look into that if it is to your liking. There is a list of videos and blog posts about mgmt that you can take a look at.

Thing is, if you move away from the mainstream you will find less help and fewer pre-made solutions. Don’t overestimate the skill of the “average” programmer. For every brilliant programmer there is somebody out there who just googles code snippets and haphazardly bashes them together to try to solve a problem. It is easy to reach your limits, if not in skill then in time, and that is where a tool with a larger community can save your ass.

[1]: https://github.com/purpleidea/mgmt


There's also the issue of Ansible being very easily extensible. That would either have to go through some new compatibility layer with Golang or it would be lost. The plugin building story is not trivial at the moment.


I would agree with this. Ansible (and Salt) allow you to extend the functionality by:

1) Creating a Python script for your functionality
2) Dropping it in a predefined folder on the master server
3) Syncing all agents (if using agents)

In my experience Chef / Puppet are not so easily extensible, and anything using Golang or Rust would require a step to generate the binary code for your new functionality, and that is assuming the proposed tool would allow extensions to be picked up dynamically by adding them to a specified folder.


Take this as pure speculation, but my gut feeling says that this might be due to the tendency on the infrastructure/cloud side to move towards Kubernetes and similar solutions. Classic massive VM fleets are, in my opinion, something that has been decreasing and will continue to decrease, and perhaps there are no challenges appealing enough for someone to create a replacement for something mature that works and has a big company like Red Hat behind it. Again, pure speculation on my part.

I personally used Ansible a lot in my career until 2 years ago, when I moved back fully to the Kubernetes world. Yes, Ansible can be used there too, but I prefer to rely on more native solutions. Funnily enough, during the last months of that period I started to write, as a side project, a “clone” in Golang that introduced improvements and new features, but I finally abandoned it as I moved away from that VM/SSH world.


I wrote a simple tool, which is modeled after Puppet in terms of syntax. It was mostly an experiment to see which minimal set of features is enough to provide something useful, and I settled on a few primitives such as creating files, installing packages, and running commands.

Despite the minimalism it turned out to be more useful than expected:

https://github.com/skx/marionette/

I added extra things, such as the ability to pull Docker containers and clone git repositories, and despite it being single-host I'm using it to set up several virtual machines.

(In the past I toyed with another project which used a combination of ssh+scp, also written in golang - https://github.com/skx/deployr )
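The "few primitives" idea from the comment above can be sketched roughly as follows. This is an illustrative toy in Python, not how marionette works internally. The rule format, `apply_file`, `apply_shell`, `PRIMITIVES` and `apply_rules` are all hypothetical names. Package installation is left out because it depends on the platform's package manager.

```python
import os
import subprocess


def apply_file(rule: dict) -> bool:
    """Write rule['content'] to rule['path'] unless it already matches."""
    path, content = rule["path"], rule["content"]
    if os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            if fh.read() == content:
                return False  # already in the desired state
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(content)
    return True


def apply_shell(rule: dict) -> bool:
    """Run rule['command']; skip it when the optional 'unless' guard succeeds."""
    guard = rule.get("unless")
    if guard and subprocess.run(guard, shell=True).returncode == 0:
        return False
    subprocess.run(rule["command"], shell=True, check=True)
    return True


# Each rule type maps to one small function that returns True on change.
PRIMITIVES = {"file": apply_file, "shell": apply_shell}


def apply_rules(rules: list) -> list:
    """Apply rules in order and return the names of those that changed something."""
    changed = []
    for rule in rules:
        if PRIMITIVES[rule["type"]](rule):
            changed.append(rule["name"])
    return changed
```

A rule list here is just a list of dicts such as `{"name": "motd", "type": "file", "path": "/tmp/motd", "content": "hello\n"}`. A real tool would add a parser for its own syntax, dependency ordering between rules, and error reporting.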


disclaimer: ansible developer

I would agree, Ansible is not the best tool to 'remove some files from localhost'; a simple script or one-line shell command is much faster. Ansible is designed mostly to do 'remote' work in parallel and with many features that would just be 'overhead' for that scenario.

Ansible was not built for speed, it was built to make automation across many targets simple, auditable and reliable.

If you don't need any of those things, it will appear unwieldy. I would compare this to going through security checks, luggage handling and customs at an airport to cross the street.

FYI, you can build modules (remote execution plugins) in much more than Python: any scripting language works, and even binary modules are supported. I know of a group using Golang modules for their environment; I have not heard of people using Rust (yet).
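For anyone wondering what a language-agnostic module looks like: as far as I understand the Ansible developer documentation, a non-native module that contains the string `WANT_JSON` is called with the path to a JSON file holding its arguments, and it reports back by printing a JSON document to stdout. The sketch below shows that contract using only the standard library. It is written in Python for brevity, but the same shape should carry over to any language. The `path` parameter and the `run` function are assumptions made up for this example.

```python
#!/usr/bin/env python3
# WANT_JSON  (marker asking Ansible to pass the arguments as a JSON file)
import json
import os
import sys


def run(args_path: str) -> dict:
    """Read module arguments from a JSON file and ensure a file is absent."""
    with open(args_path, encoding="utf-8") as fh:
        args = json.load(fh)
    target = args.get("path")  # "path" is a hypothetical parameter name
    if not target:
        return {"failed": True, "msg": "missing required argument: path"}
    if os.path.isfile(target):
        os.remove(target)
        return {"changed": True, "path": target}
    return {"changed": False, "path": target}  # nothing to remove


if __name__ == "__main__" and len(sys.argv) == 2:
    # The result is reported as a single JSON document on stdout.
    result = run(sys.argv[1])
    print(json.dumps(result))
    sys.exit(1 if result.get("failed") else 0)
```

Check the current Ansible docs before relying on the details, since the exact contract for non-native and binary modules may differ between versions.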


One option is to just use Ansible to pre-bake the machine image with Packer and use cloud-init for "last mile" deployment. For VMware, govc is an awesome CLI to handle the machine launch. For testing locally with KVM machines, the multipass CLI works well on Linux, macOS and Windows, no cloud needed.


I had thought about this as well, but even when prebaking an image, let's say for deploying desktop machines, you still have the issue of needing to validate configuration changes once a machine is deployed.


I think that most of the overhead is due to how Ansible communicates with clients. A faster language would not change that.

chezmoi has a much simpler approach, so I don't think it's a good comparison.

There is a project called mitogen that implements a faster communication method. I don't know if it works with newer versions of Ansible, though.


Chef's Habitat project is in Rust: https://github.com/habitat-sh/habitat

I've never used it though.



