16 December 2021
I was recently onboarded on one of the largest projects we have at Padok. It has been running for almost two years, and we are now ten SREs interacting with several developer teams. What struck me during my onboarding was the number of tools I had to install: it took me at least a full day once we had all the dependencies sorted out (like getting access to all the different services). The process was actually not that painful, thanks to extensive documentation. However, along the way, I wondered: how can I manage all these tools whilst keeping my machine clean for side projects? What could we do to speed up the tool setup part of our onboarding?
I think these are questions we are especially concerned about as SREs, as we love customizing and automating our local environments.
Why we need tools
A tool extends our ability to perform specific tasks. More precisely, in the DevOps world, tools help us in several ways:
- Save time on repetitive tasks. Automating the tasks you perform daily is the most obvious reason to use a tool. In our DevOps team, we need to connect to multiple clusters and AWS accounts and switch between them daily. A combination of AWS profiles and a shell script using [fzf](<https://github.com/junegunn/fzf>) helps us do it with just one command and a few keystrokes.
- Ensure quality levels. A good example is enforcing best practices on your code through automatic linting and formatting checks, run in your CI or with pre-commit hooks. It saves you time during code reviews and makes sure you maintain your quality standards.
- Enforce uniform practices across a team. If everyone in your team uses the same tools, it is easier to put standard usage, onboarding, and maintenance procedures in place. If someone encounters a problem with a tool, only one person needs to fix it for everyone in the team. Hence, uniform tool adoption means better maintainability.
- Maintain security standards. For example, tools can make it simple to enforce strong authentication policies, so you can improve security without sacrificing usability.
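As an illustration of the first point, here is a minimal sketch of what such a profile switcher could look like. It is not our actual script: the function names `list_aws_profiles` and `awsp` are hypothetical, and it assumes that fzf is installed and that your profiles are declared as `[default]` or `[profile name]` sections in your AWS config file.

```shell
# Print the profile names declared in an AWS config file.
# Sections look like "[default]" or "[profile my-name]".
# $1 (optional): path to the config file.
list_aws_profiles() {
  config_file="${1:-${AWS_CONFIG_FILE:-$HOME/.aws/config}}"
  sed -n \
    -e 's/^\[profile \(.*\)]$/\1/p' \
    -e 's/^\[default]$/default/p' \
    "$config_file"
}

# Pick a profile interactively with fzf and export it for the current shell.
# Define this as a function in your shell configuration (not as a standalone
# script), otherwise the exported variable is lost when the script exits.
awsp() {
  selected="$(list_aws_profiles "$@" | fzf --prompt='AWS profile> ')" || return 1
  export AWS_PROFILE="$selected"
  echo "AWS_PROFILE set to $AWS_PROFILE"
}
```

Typing `awsp` then opens a fuzzy finder over your profiles, and the AWS CLI picks up the selected one through the `AWS_PROFILE` variable.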
How to choose the right tools
The first and perhaps most important part of keeping your workspace clean and healthy is to choose your tools carefully.
So what is a good tool?
It depends a lot on your specific context, and you need to consider a few things:
The time you spend. Especially on long tasks, or tasks you do regularly, a good tool will help you save time.
The time needed to set up and maintain the tool. The xkcd table "Is It Worth the Time?" sums it up very well.
Choosing a tool is always a tradeoff. For example, in my project, to improve security we decided to switch from SSH to SSM bastions. In the process, each of us (and the developers) had to update our connection scripts and manage the edge cases (different OS, configurations...). We gained in security but needed to prioritize effort on it accordingly.
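To put rough numbers on this tradeoff, you can compare the time a tool saves with the time it costs to set up and maintain. The helper below is only a back-of-the-envelope sketch: the function name `net_minutes_saved` and all the figures in the usage example are made up for illustration.

```shell
# Net time gained (in minutes) by adopting a tool over a given period.
# $1: minutes saved each time the task is run
# $2: number of runs per week
# $3: number of weeks considered
# $4: one-off setup time, in minutes
# $5: maintenance time per week, in minutes
net_minutes_saved() {
  echo $(( $1 * $2 * $3 - $4 - $5 * $3 ))
}
```

For instance, `net_minutes_saved 2 10 50 480 5` prints `270`: saving 2 minutes, 10 times a week, for 50 weeks, outweighs a full day of setup plus 5 minutes of weekly maintenance, but only just. A negative result means the tool costs more time than it saves over the period.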
The interactions with your existing tools. Be careful about how your ecosystem of tools behaves as a whole, including across different projects. For example, when working on multiple Kubernetes clusters or cloud provider accounts, it is critical to avoid working on the wrong one. I will give more details later in the article.
The adaptability of the tools you choose to the environments of your team members. Each of your coworkers has a different machine, OS, setup... so if you want them to adopt a tool, it needs to fit all of their setups. This is even more true for DevOps people, as we all have our favourite terminal emulator, shell, shell configuration, or text editor, which makes uniform tool adoption even more difficult.
Use as few tools as possible. Each time you add a new tool, you need to take into account the total time you will spend. It includes maintaining your tools and onboarding new team members if they need to also use them.
When you consider introducing a new tool in your environment, you need to ask yourself:
- Does it solve a task none of my other tools can? If not, even if the new tool is more performant than the old one(s), it might not be worth increasing the burden of your toolbox.
- Does it solve a task that one or more of my tools can no longer handle because they are reaching their limits? If so, maybe you can find a more powerful one that covers the different use cases.
Examples of solutions for managing your tools
Sometimes, your tasks require a certain number of tools that you cannot reduce, and which may become a burden for maintenance and/or onboarding. The solution is then to introduce yet another tool, to manage the others!
I will detail specific examples to manage your DevOps environment and deal with tool segregation and bootstrapping. This list is not intended to be exhaustive and only reflects my opinion on some of the solutions I tried.
direnv is a very useful and simple tool to avoid configuration conflicts and overlapping between projects. With a simple `.envrc` file located at the root of your project, direnv automatically loads and unloads the right environment variables when you enter or leave this directory.
A concrete example is to set a variable like `AWS_CONFIG_FILE` in direnv rather than in your `.zshrc`. It allows you to use separate AWS credentials configuration files according to the project you are currently working on.
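As a sketch, a project's `.envrc` could look like the following. The paths are hypothetical, and the `KUBECONFIG` line is an assumption added to show how the same approach can help you avoid working on the wrong Kubernetes cluster.

```shell
# .envrc, located at the root of the project.
# direnv evaluates it from this directory, so $PWD is the project root.

# Use an AWS configuration file dedicated to this project (hypothetical path).
export AWS_CONFIG_FILE="$PWD/.aws/config"

# Assumption: point kubectl at this project's clusters only (hypothetical path).
export KUBECONFIG="$PWD/.kube/config"
```

direnv refuses to load a new or modified `.envrc` until you approve it with `direnv allow`, which protects you from running untrusted code when entering a directory.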
While it is very handy for configuration separation, direnv only helps you with environment variables and does not solve the tool installation problem.
Custom scripts leave you considerable freedom to manage your tools and write new ones. You can easily automate your specific workflow and modify the scripts when it changes.
However, there are some points to be careful about when writing your custom scripts:
It will cost you time for maintenance. If your script calls external tools, changes to these tools might affect it. It truly is one more piece of software for you to maintain.
Think ahead if you need to share it: make sure your script will work on your collaborators' setups.
You probably also want to commit your script to a git repository so that everyone can participate and stay up-to-date.
Choose the right language.
Shell is generally a good and easy choice if you use Linux or macOS and need basic system administration features. However, it is not meant for building complex software.
Python will help you with more complex tasks thanks to its many packages and will have better compatibility with different systems. But you might need to install Python and external dependencies, while a shell is present by default on virtually any Linux or macOS system.
Golang might also be an option if you need speed, want to build a CLI tool easily, or miss static typing in Python. It definitely takes more effort to use if you are a beginner or only need a quick and simple script, but it is totally worth a try.
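Whatever the language, a shared script is easier to live with when it fails early and clearly on a collaborator's machine. Here is a minimal shell sketch of that idea, assuming your script depends on external commands; the function name `require_tools` is hypothetical.

```shell
# Check that every command given as argument is available in PATH.
# Prints the missing ones on stderr and returns 1 if any is missing.
require_tools() {
  missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="$missing $tool"
  done
  if [ -n "$missing" ]; then
    echo "Missing required tools:$missing" >&2
    return 1
  fi
}
```

Calling something like `require_tools aws kubectl fzf || exit 1` at the top of a script gives your teammates an explicit message rather than a confusing failure halfway through.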
My favourite option to keep my machine clean and bootstrap projects very fast is to work inside containers. It is unmatched for project segregation and, when well configured, it will only improve your efficiency.
You can probably achieve this kind of workflow with a simple makefile calling the usual `docker run` and `docker exec`. However, there are already tools for you to do it conveniently.
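For reference, the do-it-yourself variant could look like the sketch below, written here as shell functions rather than makefile targets. The image, the container name, and the function names are all hypothetical, and it assumes Docker is installed and that the image provides `sh` and a `sleep` that accepts `infinity`.

```shell
# Hypothetical defaults: override them through the environment if needed.
DEV_IMAGE="${DEV_IMAGE:-ubuntu:22.04}"
DEV_NAME="${DEV_NAME:-myproject-dev}"

# Start a long-lived container with the current project mounted in /workspace.
dev_up() {
  docker run -d --name "$DEV_NAME" \
    -v "$PWD":/workspace -w /workspace \
    "$DEV_IMAGE" sleep infinity
}

# Open an interactive shell inside the running container.
dev_shell() {
  docker exec -it "$DEV_NAME" sh
}

# Stop and remove the container once you are done.
dev_down() {
  docker rm -f "$DEV_NAME"
}
```

It works, but you quickly end up maintaining your own wrapper, which is exactly what the tools below do for you.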
If you don't need any graphical interface, Toolbox is a lightweight and straightforward tool. It is built on Podman and allows you to create toolbox containers and start a shell inside them very easily. You can also use your own custom images, which means you could distribute a container image to your team for everyone to work inside and have an incredibly standardized environment.
However, Toolbox is only supported on Linux at the moment, as Podman on Windows and macOS is still under development.
VS Code development containers
Visual Studio Code offers an approach similar to Toolbox, called Remote-Containers: it allows you to open your project in a container of your choice, which you can customize at will.
For most use cases, the whole configuration lives in a Dockerfile and in a JSON file for customizing the interactions between the container and VS Code (which ports to expose, which extensions to install in the container...). You can commit this configuration to git so that all your team members will be prompted to open the project inside a container with the same setup next time they launch VS Code.
The only downside I see is the mandatory use of VS Code, which might not be your favourite editor or your colleagues'.
I have been working on several projects using VS Code Remote-Containers (mostly for development, but I successfully tested their use in DevOps with my new project), and I found the experience very seamless overall.
GitPod and GitHub Codespaces
The next step in managing your environments with containers is to host them in the cloud. This is what web IDEs like GitPod or GitHub Codespaces offer, and they might become very popular in the future. Codespaces even became GitHub's Engineering team's default editor. These kinds of tools promise an incredible experience for reproducible and always accessible environments, and I am looking forward to trying them.
As Site Reliability Engineers, part of our job is to provide the best tools to the development teams, while keeping the use of these tools as simple and trouble-free as possible. For this reason, it is important for us to invest in the maintenance and onboarding of our tools and make sure we can keep our work environments as clean and efficient as possible.
Lastly, in my never-ending quest for better ways to improve my local environment, I was recently advised to take a look at NixOS, which I find very interesting, and so might you!