Operable

For this quadrant, we have chosen to present tools and technologies that improve day-to-day operations on Cloud Native infrastructures.

Adopt

13

Kustomize

Kustomize makes it easy to configure your complex deployments in Kubernetes.


Kustomize lets you manage resource configurations for Kubernetes using YAML files with an easy-to-use syntax. It also lets you manage complex configurations for multi-environment applications by layering configurations on top of one another.


In practice, Kustomize applies patches and overlays to a base configuration. This makes complex configurations simpler to manage than with Helm, where you must redefine a values file every time. And all this while keeping your code as DRY as possible!
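To make the base-plus-overlay idea concrete, here is a minimal Python sketch of a patch being merged onto a shared base. The `apply_overlay` function and the sample manifests are hypothetical and only mimic the principle. Kustomize's actual merge strategies treat lists and deletions in a more sophisticated way.

```python
from copy import deepcopy


def apply_overlay(base: dict, patch: dict) -> dict:
    """Return a copy of `base` with `patch` merged in recursively.

    Nested dicts are merged key by key; any other value in the patch
    (including lists) simply replaces the value from the base.
    """
    result = deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = apply_overlay(result[key], value)
        else:
            result[key] = deepcopy(value)
    return result


# Shared base, written once (hypothetical, simplified manifest).
base = {
    "kind": "Deployment",
    "metadata": {"name": "web"},
    "spec": {"replicas": 1, "template": {"image": "web:1.0"}},
}

# Each environment only declares what differs from the base.
overlays = {
    "staging": {"spec": {"replicas": 2}},
    "production": {"spec": {"replicas": 5, "template": {"image": "web:1.1"}}},
}

rendered = {env: apply_overlay(base, patch) for env, patch in overlays.items()}
```

Each environment file stays small because it only carries its differences, which is where the DRY benefit comes from.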


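Moreover, Kustomize's secret generator automatically creates a new Kubernetes Secret, with a content hash appended to its name, whenever you modify a data field in the configuration file. Because every version of a Secret gets its own name, workloads always reference the exact version they were deployed with, which keeps secrets consistent and simplifies your rollbacks.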
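The principle behind generated secrets can be sketched in a few lines of Python: the resource name is derived from the content, so any change to the data yields a new name. The `generated_name` helper is hypothetical, and the hash function and suffix length are assumptions chosen for illustration. Kustomize computes its own suffix differently.

```python
import hashlib
import json


def generated_name(base_name: str, data: dict) -> str:
    """Append a short hash of the content to the resource name."""
    # Sorting keys makes the hash independent of key order.
    canonical = json.dumps(data, sort_keys=True).encode("utf-8")
    suffix = hashlib.sha256(canonical).hexdigest()[:10]
    return f"{base_name}-{suffix}"


v1 = generated_name("db-credentials", {"password": "old"})
v2 = generated_name("db-credentials", {"password": "new"})

# Same content gives the same name; different content gives a new one.
same_as_v1 = generated_name("db-credentials", {"password": "old"})
```

Since the previous Secret still exists under its old name, rolling back a workload simply points it at the earlier name again.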


However, Kustomize is harder to grasp for people unfamiliar with Kubernetes resources and their declaration in YAML. It's also less flexible than Helm in terms of customization: for example, you cannot embed logic through templating.


We believe that Kustomize and Helm are two complementary tools for managing deployment configurations for Kubernetes. While Kustomize is ideal for managing configurations modularly, Helm offers a convenient way of managing more complex application packages.


By using Kustomize with Helm, teams can benefit from the advantages of both tools. For example, Helm can be used to manage complete application packages, while Kustomize can be used to customize specific configurations for deployment environments.


Kustomize is a powerful tool for managing deployment configurations for Kubernetes, with a simple syntax that perfectly complements Helm.

15

Renovate

Automate patch management of external dependencies (libraries) for infrastructure and applications


Patch Management is a significant challenge for platform security. Our infrastructures, as much as the applications deployed on them, use external components (dependencies, very often open source) that must be updated regularly to correct security flaws and bugs. With the rapid pace of updates and the growing number of dependencies, it can be difficult to keep all our dependencies up to date.


Renovate is an open-source dependency management tool that automates updating packages in your projects. It analyzes your dependency configuration files (such as package.json, pom.xml, or build.gradle) and automatically generates pull requests for necessary package updates.


Renovate is compatible with various package managers, including npm, yarn, pip, Maven, and NuGet, making it easily adaptable to different programming languages. It can also analyze the dependencies of your infrastructure, with support for Helm charts, Docker images, and Terraform modules. You can use it with major Git providers such as GitLab, GitHub, Bitbucket, and Azure DevOps.


Highly configurable, it adapts well to the development workflows of our projects. It can be integrated via CI tasks or, our preference, as a CronJob in a Kubernetes cluster, with Redis caching to optimize processing. This deployment mode lets us scale more easily by deploying several "instances" of Renovate (Kubernetes CronJobs) to spread the load and adapt its operation.


With daily execution and automatic merging of changes (when CI passes), we automate part of the correction of security flaws by applying patches automatically.
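The decision behind such an automatic merge can be sketched as follows. This is not Renovate's implementation: the function names are hypothetical, versions are assumed to be plain `MAJOR.MINOR.PATCH` strings, and limiting automerge to patch and minor updates is an assumption made for the example.

```python
def parse(version: str) -> tuple[int, int, int]:
    """Split a 'MAJOR.MINOR.PATCH' string into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def update_type(current: str, candidate: str) -> str:
    """Classify the jump from `current` to `candidate`."""
    cur, new = parse(current), parse(candidate)
    if new <= cur:
        return "none"
    if new[0] != cur[0]:
        return "major"
    if new[1] != cur[1]:
        return "minor"
    return "patch"


def should_automerge(current: str, candidate: str, ci_passed: bool,
                     allowed: tuple[str, ...] = ("patch", "minor")) -> bool:
    """Merge automatically only when CI passes and the update is low risk."""
    return ci_passed and update_type(current, candidate) in allowed
```

Major updates fall outside the allowed set, so they remain open as merge requests for a human to review.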


Renovate also enables us to track the evolution of our dependencies by providing us with an overview of the changes (new minor or major versions) that need to be taken into account to keep our dependencies up to date (via the open Merge Request/Pull Request list). We aim to ensure that we don't fall behind on major releases and always benefit from security updates in the long term.


Renovate allows us to reduce the burden of Patch Management and free up time to work on improvements that will bring more value to our customers' businesses.

16

Terraform

Today, Terraform is the leading Infrastructure as Code (IaC) tool on the market. It enables you to provision and manage resources on all Cloud Providers.

We've created hundreds of infrastructures on several cloud providers and used Terraform every time. An IaC tool is essential when moving to the cloud, as it facilitates collaboration and makes an infrastructure easier to operate.


Terraform shines through with a wide range of features:

  • Modules, to define sets of resources that meet a precise need and can be easily reused
  • Terraform state, to track the life cycle of each resource
  • Compatibility with multiple clouds and systems thanks to providers, for example AWS and OVH, but also GitHub

We know that cloud providers offer their own IaC tools, such as AWS CDK, CloudFormation, and ARM for Azure. But their vendor lock-in and lack of interoperability made us lean towards Terraform.


Even though Terraform is the leading IaC tool for managing infrastructure, its code base still needs a framework. Padok has converged on a WYSIWYG (What You See Is What You Get) pattern that helps standardize code and collaboration.


The points to remember are:

  • Organize Terraform states according to business needs
  • Create modules that meet a complex or reproducible need
  • Don't hesitate to refactor code as your infrastructure evolves

Tips for use 💡

And don't forget the tools for syntax quality and maintainability: terraform fmt, tflint, tfautomv, and terraform-docs.

 

Terraform is today's reference tool for building and maintaining infrastructure in the cloud. Tools such as Terragrunt further enhance its ability to manage infrastructure at scale by offering features that avoid code redundancy, following the DRY (Don't Repeat Yourself) principle.

12

Grafana

Grafana is an open-source dashboarding platform for all your cloud environments.


Grafana is an open-source dashboarding tool created by the eponymous company, Grafana Labs. It lets you create dynamic, customizable dashboards to monitor and analyze metrics related to your infrastructure.

 

A wide range of data sources can be connected to Grafana: time-series databases such as InfluxDB and Prometheus, Elasticsearch, and even the native monitoring services of your preferred Cloud Provider. Grafana's intuitive user interface then lets you combine the various data into real-time graphs, gauges, and bar charts, to name but a few.

 

Grafana is an essential monitoring tool for your Kubernetes clusters, and it is easy to install and configure thanks to its Helm chart. What's more, you can define your dashboards using ConfigMaps. If you don't want the responsibility of managing your visualization tool yourself, there is also a SaaS offering, Grafana Cloud.

 

With the rise of microservice architectures and the use of the cloud to create complete, complex environments, Grafana has become an essential tool for both operational and development teams.

14

Prometheus Operator

Prometheus Operator makes it easy to deploy and manage an entire technical stack around Prometheus to monitor a Kubernetes cluster.


Prometheus is the reference tool for monitoring Kubernetes architectures. Prometheus Operator makes its deployment and management particularly simple, while other ancillary, but no less necessary, components are added to improve the operability of your platform:

  • Alerting with Alertmanager
  • HTTP monitoring with Blackbox Exporter
  • Metrics visualization with Grafana

Installing the Prometheus suite takes a single command, and after just a few minutes you'll have access to all your cluster's metrics and much more. The operator manages Prometheus resources through Kubernetes CRDs, so you don't need to configure them in Prometheus itself: you simply declare them through the Kubernetes API.


We'll automate monitoring by adding these resources to our charts, and Prometheus will monitor each deployed application (metrics or HTTP monitoring, for example).


However, the operator does not solve the major problem with Prometheus: a consolidated, centralized view in multi-environment, multi-cluster architectures. You'll need to deploy components that bridge the gap between different deployments: a central Grafana and solutions like Thanos to increase data retention.


Even if other solutions exist, such as Datadog (for a fee), Prometheus remains the de facto community standard for Kubernetes. It will always be a good choice for operating your clusters.

17

Terragrunt

Terragrunt is a tool offered by Gruntwork to enhance Terraform and boost its ability to manage multi-module deployments.


Terraform is the current community standard for as-code deployment of cloud resources. It includes libraries (called "providers") for almost all the resources of the major Cloud Providers.


However, Terraform has its limitations, penalizing teams who need to manage a multi-module infrastructure or large infrastructures. Indeed, in such cases, it is often necessary to split Terraform deployment into several modules (sometimes also called layers) to simplify them or avoid collisions of Terraform states.


This can quickly become very complicated to manage, as it becomes necessary to:

  • Use remote states to share outputs between states
  • Duplicate or over-template backend configurations that are not natively configurable in Terraform
  • Manage common or environment-specific variables, for example, via files and symbolic links

Terragrunt sits on top of this to create and manage auto-generated Terraform workspaces. Its enhanced functionality provides a better link between layers while still relying on Terraform's proven deployment capabilities.
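The duplication problem described above comes down to deriving every backend configuration from a single definition instead of copying it into each layer. The Python sketch below illustrates that idea only. The bucket name, layer names, and `backend_config` helper are hypothetical, and Terragrunt expresses the equivalent in its own HCL configuration rather than in Python.

```python
def backend_config(bucket: str, environment: str, layer: str) -> dict:
    """Build the state backend settings for one layer of one environment.

    The state key is derived from the layer's position in the tree,
    so no two layers can collide on the same state file.
    """
    return {
        "bucket": bucket,
        "key": f"{environment}/{layer}/terraform.tfstate",
    }


environments = ["staging", "production"]
layers = ["network", "database", "application"]

# One definition, reused for every environment and layer.
configs = {
    (env, layer): backend_config("example-tfstate", env, layer)
    for env in environments
    for layer in layers
}
```

Adding a layer or an environment then means adding one entry to a list, not copying and editing another backend block.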


What's more, Terragrunt is configured using the same language as Terraform, HashiCorp Configuration Language (HCL), which has been extended to add the necessary functionality. This makes team training easier and reduces the feeling of having a new tool to master.


Today, other tools try to meet this need, but Terragrunt is our favorite because it achieves the result by adding only a very thin layer around Terraform.

Trial

20

Excalidraw+

Excalidraw+ is a SaaS virtual whiteboard solution. Its simplicity makes it possible to draw diagrams with the same ease as on a sheet of paper while retaining the ability to store and share them like a Google Doc.


In just 2 clicks, Excalidraw+ creates an unlimited blank page ("Scene") on which you can draw shapes as if on a board. Scenes are grouped into "Collections," to which you can assign team rights in a dedicated workspace.


Excalidraw+'s great strength lies in its simplicity. Only basic shapes (e.g., rectangles, circles, arrows, text boxes) and limited formatting capabilities (e.g., colors, 4 font sizes) are available in the default view.


The result is better day-to-day collaboration, based on many graphical representations and more up-to-date architecture diagrams, because the effort required to maintain them is minimized.


Excalidraw+ lets you make any kind of diagram and collaborate effectively at a distance with visual support. If you need to make a diagram and don't know where to do it, there's no need to hesitate 😉 You can try the free version at excalidraw.com.


However, the tool has the following limitations:

  • Coarse-grained rights management, e.g.:
    • You must be an administrator to manage rights
    • Only teams can be given rights to files
    • A user can only be a member of 6 teams at a time
    • Subfolders cannot be created
  • No SLAs displayed (even though the application is generally always available)
  • No choice of data storage location

These limitations justify putting it on "Trial" instead of "Adopt."

18

Atlantis

Atlantis is an application for automating the use of Terraform via pull requests. It provides a workflow for maintaining the consistency of an infrastructure defined in IaC.


Atlantis is an open-source tool that automates contributions to a Terraform code base, allowing you to execute the plan, apply, and import commands directly in the pull request. As a result, you can see the feedback directly in the comments. The application can be hosted anywhere and uses the webhook system of GitHub, GitLab, or Bitbucket.


This solves collaboration problems on large infrastructures and provides a history of modifications made.


More advanced workflows improve the developer experience (DevX) even further:

  • Autoplanning for each new commit or pull request, providing a quick overview of the state of the infrastructure, and validating the impact of changes made
  • Auto-merging to merge the pull request if all plans are functional
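Autoplanning relies on working out which Terraform projects a pull request touches. The sketch below shows one simple way to do that, by matching changed files against project directories. The `projects_to_plan` function and the paths are hypothetical. Atlantis has its own configurable detection rules, which this does not reproduce.

```python
from pathlib import PurePosixPath


def projects_to_plan(changed_files: list[str], project_dirs: list[str]) -> list[str]:
    """Return the project directories that contain at least one changed file."""
    impacted = set()
    for changed in changed_files:
        # `parents` lists every ancestor directory of the changed file.
        ancestors = PurePosixPath(changed).parents
        for project in project_dirs:
            if PurePosixPath(project) in ancestors:
                impacted.add(project)
    return sorted(impacted)


projects = ["layers/network", "layers/database"]
changed = ["layers/network/main.tf", "README.md"]
to_plan = projects_to_plan(changed, projects)
```

Only the network layer would be planned here, which keeps feedback on the pull request fast and focused.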

However, there is still room for improvement if this tool is to become a benchmark:

  • The server that manages Atlantis holds the credentials to access your infrastructure. Consequently, you need to instantiate several servers to separate access to several infrastructures, which can quickly become complex.
  • Its architecture severely limits it, and scaling it is not straightforward. Other more robust and mature solutions exist if you have large-scale infrastructure needs.

 

Atlantis is a promising tool, but its complex rights management and limited scalability are why we're putting it on "Trial." Interesting features on its roadmap, such as drift detection, are reason enough to keep it on the radar.

19

Custom Operators

Custom operators allow you to automate tasks in Kubernetes by adding functionality to its API.


Kubernetes is very popular as a container orchestrator. But first and foremost, it's an extensible API. You can add new resources to the Kubernetes API and extend its functionality by creating your own operators.


When creating your operator, you'll define Custom Resource Definitions (CRDs). The operator code relies on the Kubernetes reconciliation pattern to react each time a custom resource (an instance of a CRD) is added, modified, or deleted. This helps automate repetitive tasks (reducing your toil) and add custom application functionality. If you're using Kubernetes, you're probably already using operators such as cert-manager or ArgoCD daily.


In a SaaS environment, for example, each new customer requires the creation of a new tenant. With an operator, it's possible to automatically create all the necessary resources by declaring a new object in your Kubernetes cluster!
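The reconciliation pattern behind this can be sketched without any Kubernetes library: compare the declared objects with what actually exists, then create, update, or delete until both match. The `reconcile` function and the tenant data are hypothetical, and in-memory dictionaries stand in for the cluster. Real operators are typically built with a dedicated framework.

```python
def reconcile(desired: dict[str, dict], actual: dict[str, dict]) -> list[tuple[str, str]]:
    """Bring `actual` in line with `desired` and return the actions taken."""
    actions = []
    for name, spec in desired.items():
        if name not in actual:
            actual[name] = dict(spec)
            actions.append(("create", name))
        elif actual[name] != spec:
            actual[name] = dict(spec)
            actions.append(("update", name))
    # Anything that exists but is no longer declared gets removed.
    for name in list(actual):
        if name not in desired:
            del actual[name]
            actions.append(("delete", name))
    return actions


# Declared tenants (what the custom resources say).
desired = {"tenant-a": {"plan": "pro"}, "tenant-b": {"plan": "free"}}
# Existing tenants (what is really provisioned).
actual = {"tenant-b": {"plan": "pro"}, "tenant-c": {"plan": "free"}}

first_pass = reconcile(desired, actual)
second_pass = reconcile(desired, actual)
```

The second pass has nothing to do: reconciliation is idempotent, which is what makes it safe to run on every event.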


However, creating an operator can be complicated: you need to thoroughly understand how Kubernetes works and the lifecycle and different edge cases of what you want to automate. And testing all edge cases is no mean feat. 


It's important to note that you can write your operators in any programming language: Java, Rust... and even Ansible. We recommend Go: you'll find a wealth of resources to help you, and Red Hat's Operator SDK lets you bootstrap your code very efficiently.


Creating an operator can offer many benefits to DevOps teams working with Kubernetes. However, it can also be complex, requiring a certain amount of programming and Kubernetes expertise.

21

Terratest

Since infrastructure is the foundation of any robust application, it should be tested! Terratest is one of the few test libraries available for Terraform.


Terratest is the reference for testing Terraform code. Written in Go, this library lets you write unit tests for Terraform and Terragrunt.

Terratest allows you to deploy an infrastructure and run tests such as:

  • HTTPS calls to check whether load balancers are working properly
  • API requests to check the responses of an API gateway
  • SSH connections to bastion servers
  • Checks that a resource has the right status
  • Checks that terraform apply completes without errors
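Freshly deployed infrastructure is rarely ready straight away, so checks like these usually poll until the resource answers. Terratest itself is a Go library; the Python sketch below only illustrates the retry logic, with a hypothetical `wait_until_healthy` helper and a fake probe standing in for a real HTTPS call to a load balancer.

```python
import time
from typing import Callable


def wait_until_healthy(probe: Callable[[], int], retries: int = 5,
                       delay_seconds: float = 0.0) -> bool:
    """Call `probe` until it returns HTTP 200 or the retries run out."""
    for attempt in range(retries):
        if probe() == 200:
            return True
        if attempt < retries - 1:
            time.sleep(delay_seconds)
    return False


# Fake load balancer: unavailable twice, then healthy (illustrative only).
responses = iter([503, 503, 200])
became_healthy = wait_until_healthy(lambda: next(responses))

# A probe that never recovers exhausts its retries.
never_healthy = wait_until_healthy(lambda: 503, retries=3)
```

In a real test suite the delay would be a few seconds, and the infrastructure would be destroyed afterwards whatever the outcome.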

These tests are becoming essential for growing infrastructures, as errors can quickly appear due to Terraform or provider version changes, and above all, to guarantee the non-regression of existing functionalities.


We position it as a "Trial" because it is not yet an industry standard. In particular, it is not used by the open-source modules maintained by Cloud Providers. The main obstacles are:

  • Designing a test that validates the operation of an infrastructure is neither easy nor well documented
  • Making an environment available to run these tests generates additional costs, even if only for a short period of time

As a side note, we've also used it to test our Helm packages, and we're delighted!

Assess

26

Tacos

Tacos, or Terraform Automation and Collaboration Software, is a category of tools for managing large-scale Terraform code using a GitOps approach.


Terraform is widely used to build infrastructures, but maintaining a code base has its limits at large scale. It's difficult to:

  • Propose an efficient flow of contributions, from code review to tests and previews on a pull request
  • Monitor changes made on each layer 
  • Detect changes made to the infrastructure outside the code. This causes desynchronization between the code and the existing infrastructure.

Tacos is particularly useful for this type of problem, i.e., for large infrastructures with multiple environments and applications. 


Tacos tools include:

  • Automation workflows around IaC code such as Terraform, Terragrunt and Pulumi
  • Detection of desynchronization between code and existing systems thanks to regular, automated executions
  • A dashboard consolidating a history of changes and an estimate of the cost of the infrastructure
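Drift detection boils down to comparing what the code declares with what actually exists. Here is a minimal sketch, assuming both sides are available as dictionaries keyed by resource identifier. The `detect_drift` function and the sample resources are hypothetical and not taken from any specific Tacos product.

```python
def detect_drift(declared: dict[str, dict], observed: dict[str, dict]) -> dict[str, list[str]]:
    """Compare declared resources with observed ones and report differences."""
    declared_ids, observed_ids = set(declared), set(observed)
    return {
        # Declared in code but absent from the real infrastructure.
        "missing": sorted(declared_ids - observed_ids),
        # Present in the infrastructure but unknown to the code.
        "unmanaged": sorted(observed_ids - declared_ids),
        # Present on both sides with different attributes.
        "changed": sorted(
            rid for rid in declared_ids & observed_ids
            if declared[rid] != observed[rid]
        ),
    }


declared = {
    "bucket.assets": {"versioning": True},
    "vm.web": {"size": "small"},
}
observed = {
    "vm.web": {"size": "large"},   # resized by hand in the console
    "vm.debug": {"size": "small"}, # created outside the code
}

report = detect_drift(declared, observed)
```

Run on a schedule, a report like this is what surfaces changes made outside the code before they cause a surprising plan.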

Tacos tools solve these problems through a centralized interface and integrations with GitHub or GitLab.


Today's leaders in this field are Spacelift, Terraform Cloud, Scalr, and env0. They all have their advantages and disadvantages, but they all come with a hefty price tag of around 200 euros a month, because their free tiers do not allow you to manage large infrastructures efficiently.


Currently, there are almost no open-source options for Tacos. The only exception is Atlantis, but it's not a full-featured Tacos since it only handles workflows on GitHub, Bitbucket, and GitLab.


Another limitation of Tacos tools is that all solutions are currently hosted in the Cloud. This means they cannot be used for sensitive infrastructures whose owners must keep their information system (IS) fully private.


Tacos is a category of tools with a promising future, but one that lacks open-source alternatives today.

22

Crossplane

Crossplane is an infrastructure-as-code tool based on Kubernetes. It lets you create Cloud resources using Custom Resource Definitions.


Crossplane is an infrastructure-as-code (IaC) technology developed by Upbound. It enables infrastructure resources to be deployed using Kubernetes as a state manager. It works similarly to GCP's Config Connector or AWS Controllers for Kubernetes.


Crossplane is deployed as an operator in Kubernetes. To use it to manage your infrastructure, you'll need to deploy a dedicated provider as a Custom Resource Definition (CRD). The provider then deploys CRDs for each Cloud resource (for example, for AWS: an EC2 instance, a VPC, a Lambda...).


Combined with GitOps technologies such as ArgoCD, Crossplane can be transformed into a true Cloud self-service platform. Using YAML and Kubernetes attributes to define and link your entire infrastructure is highly intuitive. The minimum knowledge required to start using Crossplane is significantly lower than Terraform.


However, we note several drawbacks that prevent us from being entirely confident about using Crossplane in production:


  • Modifying an immutable field in a resource does not result in its replacement.
  • No native support for sharing information between resources managed by different providers
  • Dependence on a Kubernetes cluster to manage your infrastructure implies impeccable cluster management
  • Crossplane has no notion of "plan" or "dry-run," as found in other tools.
  • Importing existing resources is a poorly documented and risky process, even though it's a widespread use case.

Today, we use Crossplane to solve specific problems such as MySQL database configuration. We see it as a tool to watch, as it could become a serious competitor in the IaC field by addressing the problems we've mentioned. 

23

Hermes

Hermes is an open-source project from HashiCorp Labs that allows you to structure the management of your Google Docs documents within your Google Workspace organization.


Every organization faces the challenge of managing its documentation from a certain size onwards. The oral culture of the early days soon gives way to an initial tool for centralizing technical and organizational knowledge. 


Choosing the right tool for every stage of your company's life is almost impossible. The One Size Fits All approach is often favored because managing several tools is complex. But is this the right solution? Why not return to an old Unix adage: a program that does one thing but does it well?


Hermes is an open-source document management tool created by the HashiCorp Labs team. It enables you to manage documents within your organization by facilitating the lifecycle: document standard, drafting, proofreading, validation, invalidation, collaboration, and, most importantly, search.


Hermes could be a pragmatic choice if you're in a Google Workspace organization and use Drive to create and store Docs documents. Hermes' strengths: 

  • simple, functional interface
  • ability to create templates and document lifecycles
  • full-text search via Algolia's powerful engine

In short, Hermes does little but does it well.


The product is young and limited (only Google Workspace is supported), but it looks promising for a first release. It's not an official HashiCorp project, but the company's history suggests that it will evolve, thanks to the community, and that the tool will mature over time.


We recommend you try it or put it on your list of open-source projects to watch out for, as it could be an excellent long-term tool for pragmatically managing structured text documents.

24

Kubernetes Gateway API

Kubernetes Gateway API lets you manage access to Kubernetes services from outside the cluster with a role-oriented approach between Ops and developers.


When we want to expose Kubernetes services outside our cluster, we tend to use Ingress resources. We therefore deploy Ingress Controllers such as those offered by Nginx, Traefik or Kong, which will have their own annotations to direct traffic and manage the Ingresses attached to them. Generally speaking, the developers and Ops in charge of the cluster will be working on these same resources, which can sometimes cause disruptions.


In order to better separate the role of each in managing the exposure of application services, a new concept has recently emerged: the Kubernetes Gateway API. It enables Ops to set up a global gateway at the cluster level (cross-namespace), with an L4 or L7 load balancer as the entry point.


Developers are then free to create their own HTTPRoutes, containing their configurations, in their namespaces. It's worth noting that these resources natively provide more features, such as header-based matching and traffic weighting.
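To illustrate what header-based matching and traffic weighting mean in practice, here is a small Python sketch of a routing decision. The rule structure, header name, and backend names are hypothetical and deliberately simplified. It mimics the behaviour of such rules, not the HTTPRoute schema or any controller's implementation.

```python
import random


def pick_backend(headers: dict, rules: list[dict], rng: random.Random):
    """Return the backend chosen by the first rule whose headers all match."""
    for rule in rules:
        required = rule.get("headers", {})
        if all(headers.get(name) == value for name, value in required.items()):
            names = [backend["name"] for backend in rule["backends"]]
            weights = [backend["weight"] for backend in rule["backends"]]
            # Weighted random choice: a backend with weight 90 receives
            # roughly nine times the traffic of one with weight 10.
            return rng.choices(names, weights=weights, k=1)[0]
    return None


rules = [
    # Requests flagged as canary always reach the new version.
    {"headers": {"x-env": "canary"},
     "backends": [{"name": "web-v2", "weight": 1}]},
    # Everything else is split 90/10 between the two versions.
    {"backends": [{"name": "web-v1", "weight": 90},
                  {"name": "web-v2", "weight": 10}]},
]

rng = random.Random(42)  # seeded so the example is reproducible
counts = {"web-v1": 0, "web-v2": 0}
for _ in range(10_000):
    counts[pick_backend({}, rules, rng)] += 1
```

With an Ingress, this kind of split usually depends on controller-specific annotations, which is the friction these resources aim to remove.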


The Kubernetes Gateway API is still relatively new but is gaining popularity due to its ability to simplify route management in complex Kubernetes environments. It also offers greater visibility and control over gateways, making detecting errors and security issues easier.


At Padok, we believe this technology holds great promise for teams looking to simplify route management in Kubernetes environments. As it continues to mature, it should gain popularity and become the reference, possibly even replacing Ingress.


In fact, GCP has integrated it into its GKE service under the name GKE Gateway Controller, and it's in GA!

25

Pulumi

Pulumi is an Infrastructure as Code tool that uses languages such as Python, Go, and TypeScript. It offers many possibilities but is not yet the default choice for building your infrastructure as code.


Terraform may be the leader in IaC, but Pulumi is a serious contender, with an approach based on general-purpose languages such as Python or Go. The main advantage of this approach is that it makes it easier to write conditional code, a complex task in Terraform.
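The appeal of a general-purpose language is that conditions and loops are ordinary code. The sketch below shows the idea in plain Python, with dictionaries standing in for real resource objects. No Pulumi API is used here, and the `declare_resources` function and resource names are hypothetical.

```python
def declare_resources(environment: str) -> list[dict]:
    """Build the list of resources to create for one environment."""
    resources = [{"type": "bucket", "name": f"assets-{environment}"}]

    # A plain conditional expression decides how many instances to run.
    replicas = 3 if environment == "production" else 1
    for index in range(replicas):
        resources.append(
            {"type": "instance", "name": f"web-{environment}-{index}"}
        )

    # Alerting is only declared for production.
    if environment == "production":
        resources.append({"type": "alarm", "name": "web-production-errors"})
    return resources


staging = declare_resources("staging")
production = declare_resources("production")
```

The same logic can be unit-tested like any other function, which is part of what makes this approach attractive to development teams.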



Pulumi offers 2 features that set it apart from other IAC tools:

  • Native providers are automatically updated according to the official Cloud Providers API. So there's no need to wait for the provider to be manually updated following a new feature released on the official API before using it. This means you can quickly take advantage of the latest Cloud Provider features!
  • Secret management by encryption, which enables sensitive data to be written directly into the code. Data is encrypted with keys from providers such as AWS KMS, HashiCorp Vault, or GCP KMS, and decrypted just in time at runtime.

Using Pulumi is a good compromise for teams made up of developers only who want to stick to a familiar language. This is advantageous, as the maintenance processes and best practices in place guarantee code quality. 


However, Pulumi inherits the limitations of the language you choose. For example, managing dependencies with `node_modules` can become cumbersome as the code base grows.


In conclusion, if you have a specific need to use languages such as Go or Python across your complete stack, getting started with Pulumi will be simpler. However, Pulumi doesn't solve Terraform's fundamental problems, as it's still as complicated as ever to create, organize, and maintain IaC code. If you have an existing infrastructure managed with Terraform, we don't advise you to migrate to Pulumi!

Hold

27

Dependabot

Dependabot is a GitHub-native tool for managing security-critical updates of external dependencies in your applications.


Dependabot is a tool that keeps your applications' external dependencies (packages) up to date by automatically creating pull requests when an update is available.


Its strength is that it is natively available on GitHub. It can also be used with other Git providers such as GitLab and Azure DevOps, but integrating it there requires more effort.


By default, Dependabot takes a slightly different approach from Renovate, concentrating on security-critical updates to create less "noise" on projects. This may come in handy for applications with few developers to maintain them.


But for us, this is its only major difference. Like Renovate, it also analyzes most of your dependency files for Node.js, Spring Boot, Python, etc. Still, it is more limited regarding your infrastructure code, not supporting Helm, for example.


It can also be criticized for using an open-source CVE database (GitHub Advisory Database), sometimes leading to controversial security alerts that create a lot of noise.


We therefore prefer Renovate in most cases. Nevertheless, we're keeping an eye on Dependabot, which is regularly updated by GitHub and tends to improve over time.