You've heard of Kubernetes, the container orchestrator that is eating the world of distributed systems development, and therefore you've heard of Docker, the container engine that can build and run any application. But what is a container exactly? What are containers suitable for, and how do they work?
We often hear that a container is:
- A packaging format for applications;
- A lightweight virtual machine;
- A set of processes isolated from the rest of the system;
These answers are each correct to a degree: when packaged into a container, an application can run inside an isolated environment, which feels like being inside a virtual machine but uses fewer resources. However, containers are much more than that. Answers like these fail to provide a common vocabulary on which all actors of the software industry can rely during conversations that revolve around containers. This article aims to clarify what constitutes a container, a container image, and a container runtime, so that developers, operators, and technically-minded decision-makers have a common understanding of what container technology is.
What problem do containers solve?
To best understand what a software container is, one should first know why containers are useful. As Solomon Hykes, co-founder of Docker, explained in 2013, the concept comes from shipping containers: boxes with a standard shape, size, and locking mechanism used to ship goods around the world. Any shipping container can be moved around by the same cranes, ships, trains, and trucks because these only interact with the box itself, regardless of its contents. This separation of concerns allows for automation, which leads to higher reliability and lower costs.
The software industry has a myriad of boxes into which to put code. Boxes like Debian software packages (.deb files) or Java Archives (.jar archives) allow developers to package their applications so that they can easily be copied to and run on different machines. But these boxes have limitations: Java Archives can only hold Java code, and Debian packages only work on the Debian family of Linux distributions. Modern applications rely on a variety of programming languages and operating systems: the industry needs a box that is multilingual and multi-platform.
When putting code inside a box and shipping it to various computers, the main concern is that the code should behave in the same way on each platform, be it a developer's laptop, a staging server, or the public cloud. In practice, this is difficult because a software component's dependencies can reach far beyond its code. For example, a Django web application needs the Python code interpreter to run, but may also rely on system libraries, which can be different depending on the operating system. The code must be packaged with its entire environment to run reliably everywhere. In other words, every single dependency must fit into the application's box.
This idea is not new to the software industry. For many years now, operators have used virtual machines as this kind of box. Everything fits inside a VM: the code, its dependencies, and even the operating system. The critical insight here is to ship the application alongside the entire system. A developer's choice of distribution affects the behavior of their application, and swapping out these system libraries would change the behavior of the software. The system is part of any application a developer writes.
The issue with virtual machines is that they also contain virtual hardware. A developer should not decide how storage or networking is going to work, or what kind of processor to use. Such overreach would hinder the infrastructure provider's freedom to make hardware decisions based on where the application is being deployed and would break the separation of concerns between developers and operators. Think back to the metaphor of the shipping container. A shipping company should be free to use whichever train or warehouse they wish to move and store containers; the contents of a container should not be a factor when deciding what infrastructure to use. Another problem with virtual machines is that they use a lot of CPU and memory, and take a long time to boot. A virtual machine is still a machine. It is not a suitable unit for software delivery.
To sum up, to deploy software reliably and repeatably across computers, we need a box into which to put the application's code. This box should contain the entire system so that developers have a complete understanding of what they are shipping, but should not include machine details, because hardware decisions belong to operators. It should also not suffer from the performance cost of a virtual machine. Last but not least, this box must have a standard interface that stays the same regardless of what application is stored inside.
At the most fundamental level, a container should be a standard box for software.
DotCloud (now Docker, Inc.) released Docker as an attempt to leverage Linux namespacing to provide software engineers with a standard box. Linux namespaces are a feature of the Linux kernel that lets one set of programs see one set of computing resources while another set of programs sees a different one. This feature was present in the kernel long before Docker was released, but few developers want to work with raw kernel capabilities directly. Docker instead provided an interface that made almost no mention of Linux namespaces and aimed to be as easy to use as possible.
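To make the idea of namespaces a little more concrete, here is a minimal sketch that lists the namespaces the current process belongs to. It assumes a Linux machine where `/proc` is mounted, and the function name `current_namespaces` is only an illustration. On Linux, each entry under `/proc/self/ns` is a symbolic link whose target names one namespace.

```python
import os

NS_DIR = "/proc/self/ns"  # Linux-only; each entry is a symlink naming one namespace


def current_namespaces(ns_dir: str = NS_DIR) -> dict[str, str]:
    """Map each namespace type (e.g. 'pid', 'net') to its kernel identifier."""
    if not os.path.isdir(ns_dir):
        return {}  # not on Linux, or /proc is not mounted
    namespaces = {}
    for name in sorted(os.listdir(ns_dir)):
        try:
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # entry not readable in this environment
    return namespaces


if __name__ == "__main__":
    for name, identifier in current_namespaces().items():
        print(f"{name:>20}  {identifier}")
```

Two processes that print the same identifier for a given type share that namespace. A process started inside a container would typically print different identifiers for several of them than a process started directly on the host.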
Docker open-sourced three things that allowed their container tools to be widely adopted:
- A standard container format;
- Tools for developers to build containers;
- Tools for operators to run containers;
Docker's standard box for packaging software is a file archive inside of which developers can put all that their program needs (code, libraries, or any other file) as well as basic instructions for starting the application. This archive is called a container image. Once built, images are static: they never change. This immutability is essential because it allows developers to version their images, and operators not to worry about an image changing while the application it contains is running. Docker stores an image as a stack of read-only layers and gives each running container its own copy-on-write layer on top, so nothing a container writes ever alters the image; the details go beyond the scope of this article.
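One way to picture why immutability makes versioning easy is to identify an archive by a hash of its bytes, which is how container registries refer to image content (`sha256:…` digests). The sketch below is a simplification and not Docker's actual image layout: it packs a few files into a tar archive with fixed metadata and derives a digest from it. The helper names and file paths are hypothetical.

```python
import hashlib
import io
import tarfile


def build_archive(files: dict[str, bytes]) -> bytes:
    """Pack files into an in-memory tar archive with fixed metadata."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for path in sorted(files):  # stable order gives stable bytes
            info = tarfile.TarInfo(name=path)
            info.size = len(files[path])
            info.mtime = 0  # fixed timestamp so rebuilds are reproducible
            archive.addfile(info, io.BytesIO(files[path]))
    return buffer.getvalue()


def digest(archive: bytes) -> str:
    """Return a content digest in the 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(archive).hexdigest()


if __name__ == "__main__":
    v1 = build_archive({"app/main.py": b"print('hello')\n"})
    v2 = build_archive({"app/main.py": b"print('hello, world')\n"})
    print(digest(v1))  # identical content always yields this same digest
    print(digest(v2))  # any change to the content yields a different one
```

Because the digest depends only on the content, an operator who deploys a given digest knows exactly which bytes will run, and a changed application necessarily becomes a new version rather than a silent modification of the old one.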
To build container images with Docker, developers need to write a recipe called a Dockerfile. This recipe is a simple text file stored alongside a software component's code that provides instructions on how to package it in a standalone manner. Only data mentioned in the Dockerfile ends up in the final image, so developers have complete visibility over what the image contains. The steps described inside the Dockerfile should be as explicit as possible about dependencies, like the choice of Linux distribution or library versions, so as to make the build process as deterministic and repeatable as possible. Developers can collaborate across heterogeneous working environments and be confident that they always package their code in the same way. This confidence improves productivity: automated pipelines build container images just as reliably as developer laptops do.
This is what a Dockerfile looks like:
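For the Django application mentioned earlier, a minimal recipe might read as follows. It is rendered by a short Python function so that the text can be printed or written to a file named `Dockerfile`. The base image tag, the `requirements.txt` and `manage.py` file names, and port 8000 are assumptions made for illustration, not details of a real project.

```python
DOCKERFILE_TEMPLATE = """\
# Base image: pins the Linux distribution and the Python interpreter.
FROM {base_image}

# Work inside a dedicated directory in the image.
WORKDIR /app

# Install dependencies first so Docker can reuse this step from its
# build cache when only the application code changes.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code into the image.
COPY . .

# Document the port and give the instruction that starts the application.
# (runserver is Django's development server; a production image would
# likely start a dedicated application server instead.)
EXPOSE {port}
CMD ["python", "manage.py", "runserver", "0.0.0.0:{port}"]
"""


def render_dockerfile(base_image: str = "python:3.12-slim", port: int = 8000) -> str:
    """Return the text of a small Dockerfile for a hypothetical Django project."""
    return DOCKERFILE_TEMPLATE.format(base_image=base_image, port=port)


if __name__ == "__main__":
    print(render_dockerfile())
```

Each instruction names one explicit step: which system to start from, which files to copy, which commands to run at build time, and which command starts the program when the container runs.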
After building an image, developers can share it over the network, and operators can copy it to the machine where they want to run the packaged software. There, they can use Docker to open up the image and run the application that is inside, without needing to install anything onto the underlying system other than Docker itself. A container runtime is a program that takes a container image and runs the application found inside. Docker's container runtime uses Linux namespaces to create separate environments for each running container.
A program running inside a container sees no other program running on the machine. In reality, all containers share the underlying computer and operating system kernel. Still, the Linux kernel makes sure that applications running in separate containers know nothing about one another: each container has its own view of the filesystem, the network interfaces, the process tree, and the hostname, and control groups (cgroups) can cap the CPU and memory each one uses. This separation is purely logical, so it costs very little in performance. The files inside a container image serve as the read-only base of the filesystem for containers created from that image. Unless an operator explicitly mounts extra directories, programs running inside a container cannot see any other files. Code running inside a container behaves independently of the underlying system: the application is entirely separate from the infrastructure.
Remember the separation of concerns mentioned earlier. Docker allows developers to put anything their application might need inside a container, without needing to know how or where the container might run. Conversely, operators can deploy a container based on any image, regardless of its contents or how exactly developers built it. This separation of concerns has paved the way for massive amounts of automation: developers need only provide their code and a Dockerfile so that their software can be packaged, deployed, and run reliably by automated pipelines.
At Padok, we believe that developers and operators should write Dockerfiles together to make development environments as productive as possible. Introducing developers to the world of operations and making sure all technical teams communicate efficiently is the first step toward modern DevOps methodology.
The Open Container Initiative (OCI)
Following Docker's release, a large community emerged around the idea of using containers as the standard unit of software delivery. As more and more companies started using containers to package and deploy their software, Docker's container runtime could not meet every technical and business need that engineering teams had. In response, the community started developing new runtimes with different implementations and capabilities. Simultaneously, new tools for building container images aimed to improve on Docker's speed or ease of use. To make sure that all container runtimes could run images produced by any build tool, the community started the Open Container Initiative (OCI) to define industry standards around container image formats and runtimes.
Docker's original image format has become the OCI Image Specification, and various open-source build tools support it, including:
- BuildKit, an optimized rewrite of Docker's build engine;
- Podman, an alternative implementation of Docker's command-line tool;
- Buildah, a command-line alternative to writing Dockerfiles;
Given an OCI image, any container runtime that implements the OCI Runtime Specification can unpack the image and run its contents in an isolated environment. Docker donated its runtime, runc, to the OCI to serve as the reference implementation of the standard. Other open-source runtimes exist, including:
- Kata Containers, which use virtual machines for improved isolation. Docker’s use of Linux namespaces has some flaws which allow applications to escape their containers under certain circumstances. For specific use cases, like running untrusted workloads, stronger security guarantees are required; Kata Containers aim to make using VMs as simple as using Docker containers.
- gVisor, a.k.a. runsc, which focuses on security and efficiency. Released in 2018 by Google, gVisor stands halfway between machine virtualization and Linux namespacing. It runs containerized applications inside a sandbox that implements many Linux system calls in userspace. In other words, applications running inside the gVisor sandbox rarely interact with the underlying Linux kernel directly, reducing the attack surface untrusted workloads may exploit. This approach allows for increased security while not incurring the performance cost of running a virtual machine.
- Firecracker, a virtual machine monitor optimized for serverless workloads, which container tooling can use through integrations such as firecracker-containerd. This technology powers AWS Lambda and AWS Fargate. Building on the same virtualization techniques behind Google’s Chrome OS, Firecracker runs containerized applications inside MicroVMs: lightweight virtual machines optimized for running single applications instead of entire operating systems. This approach allows serverless computing providers to maximize the number of workloads they can run without compromising the isolation between their users’ programs.
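Whatever isolation technique a runtime chooses, the OCI Runtime Specification gives it the same input: a bundle, which is a directory holding the container's root filesystem and a `config.json` file describing the process to start. The sketch below builds a deliberately partial `config.json` as a Python dictionary. The field names follow the specification as far as I know it, but the values (the `demo` hostname, the `PATH`, the version string) are assumptions, and a real file usually carries more fields, such as mounts and capabilities. The `runc spec` command generates a complete one.

```python
import json


def minimal_runtime_config(args: list[str], hostname: str = "demo") -> dict:
    """Build a partial OCI runtime configuration for one process."""
    return {
        "ociVersion": "1.0.2",
        "process": {
            "terminal": False,
            "user": {"uid": 0, "gid": 0},
            "args": args,  # the command the runtime starts inside the container
            "env": ["PATH=/usr/local/bin:/usr/bin:/bin"],
            "cwd": "/",
        },
        # Root filesystem directory, relative to the bundle directory.
        "root": {"path": "rootfs", "readonly": True},
        "hostname": hostname,
        "linux": {
            # Namespaces the runtime should create for this container.
            "namespaces": [
                {"type": kind}
                for kind in ("pid", "network", "ipc", "uts", "mount")
            ]
        },
    }


if __name__ == "__main__":
    print(json.dumps(minimal_runtime_config(["/bin/sh"]), indent=2))
```

Because the file says what to run and which kinds of isolation to set up, but not how to implement them, a namespace-based runtime and a virtual-machine-based runtime can both honor the same configuration.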
Thanks to these open standards, the use of containers became widespread across the software industry, and new engineering solutions were needed to make practical use of containers at scale. One of these solutions was Kubernetes: a distributed platform for orchestrating containers on large clusters of machines. When Google released Kubernetes 1.0 in 2015, the individual nodes of the cluster used Docker's runtime to run containers and manage container images. In late 2016, developers introduced an abstraction between Kubernetes and the container runtime it uses: the Container Runtime Interface, or CRI for short.
To plug a new container runtime into Kubernetes, all that is needed is a small piece of code called a shim that translates requests made by Kubernetes into requests understandable by the runtime. In theory, each additional runtime would need a custom shim, but a generic one exists for all container runtimes that implement the OCI Runtime Specification. This shim is CRI-O, another open-source project created by the community. When combined, these standards and abstractions have a powerful effect: developers can ship any compliant container image to any compliant Kubernetes cluster.
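The translation role of a shim can be sketched as a small adapter. This is an illustration of the pattern only: the real CRI is a gRPC API with many more operations, and every name below (`Shim`, `RecordingRuntime`, `run_container`, the `/run/bundles` path) is hypothetical. The `create` and `start` methods mirror two lifecycle operations of the OCI Runtime Specification.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Runtime(Protocol):
    """What the hypothetical shim expects from any low-level runtime."""

    def create(self, container_id: str, bundle_path: str) -> None: ...

    def start(self, container_id: str) -> None: ...


@dataclass
class RecordingRuntime:
    """Stand-in runtime that records calls instead of starting processes."""

    calls: list[tuple[str, ...]] = field(default_factory=list)

    def create(self, container_id: str, bundle_path: str) -> None:
        self.calls.append(("create", container_id, bundle_path))

    def start(self, container_id: str) -> None:
        self.calls.append(("start", container_id))


@dataclass
class Shim:
    """Translates one orchestrator-level request into runtime-level calls."""

    runtime: Runtime
    bundle_root: str = "/run/bundles"  # hypothetical location of unpacked bundles

    def run_container(self, name: str) -> str:
        container_id = f"ctr-{name}"
        self.runtime.create(container_id, f"{self.bundle_root}/{container_id}")
        self.runtime.start(container_id)
        return container_id


if __name__ == "__main__":
    runtime = RecordingRuntime()
    Shim(runtime).run_container("web")
    for call in runtime.calls:
        print(call)
```

The orchestrator only ever talks to `Shim`, so swapping `RecordingRuntime` for any other object with the same two methods requires no change on the orchestrator's side, which is the property the CRI aims for.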
A container is a standard unit of software delivery that allows engineering teams to ship software reliably and automatically.
Container technology is a central part of the cloud-native landscape and is improving how software engineers practice their craft. Docker paved the way for the open-source community to build a thriving ecosystem. If a team does not work with containers yet, developers can begin building and running containers on their laptops today, and operators can deploy them to the cloud tomorrow.