14 October 2020
Docker introduced multi-stage builds in v17.05. They address one of the most challenging tasks when building a Docker image: keeping the size down. The feature helps you build small, optimized Docker images.
Back in the day, you often ended up with two Dockerfiles: one for development and one for production. The development file was bloated with artifacts that you did not need at runtime, while the production one was optimized and usually didn't include any of them. Keeping one Dockerfile per environment increased the maintenance cost.
Some experienced users built elaborate contraptions out of scripts to use the same Dockerfile in development and production while keeping the size down. Multi-stage builds let you optimize a single file and stay productive in both development and production.
To understand how multi-stage builds work, I'll show you an example using React and Nginx. The two are often used together with Docker: a build step compiles the React app, and Nginx serves the resulting files to the user.
To get started, you'll need:
- An IDE such as VS Code, Sublime Text, Atom, etc.
Check the list of the most used docker commands that you MUST know in order to do this tutorial.
Create a Multi-Stage Dockerfile
Let's build the following React App. This project is a small React app that helps you bootstrap projects with React & Redux. I've used it for many months to get started on the development of React applications quickly. Some of the dependencies are outdated, but it should show you what you can achieve with Multi staging.
Step 1 - Clone the repository
Start by cloning the repository. You can even clone your own React project and adapt the Dockerfiles shown below to get Docker Multi Staging working for your project.
git clone https://github.com/StanGirard/ReactBoilerplate && cd ReactBoilerplate
This React project uses a common Dockerfile to build and run the React app. It isn't optimized and is mostly used for development purposes.
Step 2 - Build the docker image
In the ReactBoilerplate folder, you'll find the following Dockerfile:
It is a reasonably common Dockerfile that copies the current folder into the image, installs the necessary dependencies, and runs the application. It is, however, not optimized for production. If you are using your own project, just put the Dockerfile at the root of your project.
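The Dockerfile itself is not reproduced here, so the following is only a minimal sketch of what a single-stage file of this kind might look like. The base image and the npm commands are taken from later steps of this tutorial; the /app working directory is inferred from the /app/build path used later, and the file name Dockerfile.sketch is hypothetical, chosen so that the repository's real Dockerfile is not overwritten.

```shell
# Write a hypothetical single-stage Dockerfile to Dockerfile.sketch.
cat > Dockerfile.sketch <<'EOF'
# Base image referenced later in this tutorial
FROM node:13.12.0-alpine

# Assumed working directory (inferred from the /app/build path used later)
WORKDIR /app

# Copy the project into the image (minus anything listed in .dockerignore)
COPY . ./

# Install the dependencies
RUN npm install --silent

# Start the development server
CMD ["npm", "start"]
EOF
```

Everything the development server needs, including Node.js and node_modules, ends up in the final image, which is why this layout tends to be large.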
We can optimize the size of the container by not copying all the files. In the ReactBoilerplate folder, you'll find a .dockerignore file containing:
This file tells Docker which files and folders to leave out of the build context, so the COPY instruction never brings them into the image. If you have more files or folders that you don't want to include in your image, don't forget to add them here. We do not copy node_modules because it is relatively big and would bloat our image. The build folder is of no use in this image and thus should not be included either.
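The contents of the file are not reproduced here. Based on the two folders named above, a minimal version might look like the sketch below. The output name .dockerignore.sketch is hypothetical and only avoids overwriting the repository's own file; Docker itself only reads a file named .dockerignore.

```shell
# Write a hypothetical ignore list with the two folders named in the text.
cat > .dockerignore.sketch <<'EOF'
node_modules
build
EOF
```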
Build the container
Run the following command to build the container:
docker build -f Dockerfile -t react-b:latest .
Run the container
You can run the container with the command:
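The exact command is not reproduced here. A plausible invocation, assuming the react-b:latest tag from the build step and assuming the development server listens on port 3000 (the port is not stated in this tutorial), could look like this:

```shell
# Image tag from the build command above.
IMAGE="react-b:latest"
# Assumption: the development server listens on port 3000.
PORT=3000

# Only attempt the run when the Docker CLI is installed.
if command -v docker >/dev/null 2>&1; then
  # -it attaches a terminal; --rm removes the container when it exits.
  docker run -it --rm -p "${PORT}:${PORT}" "${IMAGE}" \
    || echo "docker run failed (is the image built and the port free?)"
fi
```

If the port guess is right, the app should then be reachable at http://localhost:3000.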
Step 3 - Expose the app with Nginx
Our end goal is to produce an optimized image for production. The image we just created takes 369 MB of storage and uses 242.9 MB of RAM while idle. A multi-stage build should help keep both the image size and the RAM usage down. If you run your images inside a Kubernetes cluster, smaller images and lower RAM usage let you run more pods on fewer nodes.
Let's use the multi-stage feature. It allows us to define multiple stages in our Docker build. If you use several FROM instructions in your Dockerfile, only the last stage ends up in the final image.
Modify the existing Dockerfile
First, we need to modify the Dockerfile to optimize it for production purposes. We will replace CMD ["npm", "start"] at the end of the Dockerfile with RUN npm run build, and RUN npm install --silent with RUN npm install --silent --only=prod. The RUN npm run build step creates the build folder with the code optimized for production.
Here is the modified Dockerfile
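The modified file is not reproduced here. The sketch below applies the two replacements described above to an assumed single-stage layout; the WORKDIR and COPY lines are assumptions, and the file name Dockerfile.build.sketch is hypothetical.

```shell
# Write a hypothetical build-only Dockerfile to Dockerfile.build.sketch.
cat > Dockerfile.build.sketch <<'EOF'
FROM node:13.12.0-alpine

# Assumed working directory and copy step
WORKDIR /app
COPY . ./

# Install production dependencies only
RUN npm install --silent --only=prod

# Produce the optimized bundle in /app/build instead of starting a dev server
RUN npm run build
EOF
```

Note that --only=prod skips devDependencies, so this only works if the build tooling is listed under dependencies in package.json.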
Now we need to add a stage to our Dockerfile. We want to run our optimized production application inside an Nginx container. Inside our Dockerfile, we need to add a stage using the Nginx image and copy all the Nginx files.
First of all, create a file called nginx.conf and paste this code into it:
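The original configuration is not reproduced here. As a stand-in, here is a minimal sketch of a server block that serves a single-page app and falls back to index.html for client-side routes. It assumes the static files live in /usr/share/nginx/html and that the file will replace the image's default server configuration; both are assumptions, not the author's exact setup.

```shell
# Write a hypothetical nginx.conf; the quoted 'EOF' keeps $uri from being expanded by the shell.
cat > nginx.conf <<'EOF'
server {
    listen 80;

    location / {
        root   /usr/share/nginx/html;
        index  index.html;
        # Fall back to index.html so client-side routes survive a page refresh
        try_files $uri $uri/ /index.html;
    }
}
EOF
```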
Add a stage to your Dockerfile
Modify the existing file by adding a stage name at the end of the FROM line:
FROM node:13.12.0-alpine as builder
By changing the line above, we told Docker that this image is a named stage. We will be able to reference it in other stages, and we can copy specific files and folders from it with the instruction COPY --from=<stage_name>. Our multi-stage Dockerfile is starting to take shape. We need to add the Nginx stage, and we are done.
COPY --from=builder tells Docker to copy only the files inside the /app/build/ folder of the builder stage into our Nginx container.
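Putting the pieces together, a complete two-stage file might look like the sketch below. The builder stage uses the instructions named in this tutorial; the nginx:alpine tag, the target paths inside the Nginx image, and the assumption that nginx.conf contains a server block are mine, as is the hypothetical file name Dockerfile.multistage.sketch.

```shell
# Write a hypothetical two-stage Dockerfile to Dockerfile.multistage.sketch.
cat > Dockerfile.multistage.sketch <<'EOF'
# Stage 1: build the React app
FROM node:13.12.0-alpine as builder
WORKDIR /app
COPY . ./
RUN npm install --silent --only=prod
RUN npm run build

# Stage 2: serve the static files with Nginx
# Assumption: the nginx:alpine tag (the original tag is not shown)
FROM nginx:alpine
# Copy only the build output from the builder stage
COPY --from=builder /app/build /usr/share/nginx/html
# Assumption: nginx.conf holds a server block for conf.d
COPY nginx.conf /etc/nginx/conf.d/default.conf
EXPOSE 80
CMD ["nginx", "-g", "daemon off;"]
EOF
```

Only the second stage ends up in the final image; Node.js and node_modules stay behind in the builder stage.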
You can run the newly created container with:
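The exact command is not reproduced here. A plausible invocation, assuming the image was tagged react-b:latest as in the earlier build command and that Nginx listens on port 80 inside the container, could be the following; the host port 8080 is an arbitrary choice.

```shell
# Image tag from the earlier build command.
IMAGE="react-b:latest"
# Assumptions: Nginx listens on port 80 in the container; 8080 is an arbitrary host port.
HOST_PORT=8080
CONTAINER_PORT=80

# Only attempt the run when the Docker CLI is installed.
if command -v docker >/dev/null 2>&1; then
  # -d runs Nginx in the background; --rm removes the container once it stops.
  docker run -d --rm -p "${HOST_PORT}:${CONTAINER_PORT}" "${IMAGE}" \
    || echo "docker run failed (is the image built and the port free?)"
fi
```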
The newly created container only uses 6.4MB of RAM while idle and only takes 32 MB of storage. It is a huge difference.
|Metric||Normal||Multi Staging|
|RAM (idle)||242.9 MB||6.4 MB|
|Storage||369 MB||32 MB|
Step 4 - Go further
In this tutorial, we just scratched the surface of what you can do with the multi-stage features of Docker.
You can go further and implement many more features, such as:
- Stopping at a specific build stage when building the container.
- Using an external image as a stage, either a locally available one or one from a Docker registry.
- Using a previous stage as a new stage.
Stopping at a specific build stage
To build your container and stop at a specific stage, run this command:
docker build --target builder -f Dockerfile -t react-b:latest .
The --target option lets you name the stage at which the build should stop. It can be useful if your last stage is only needed in production, or if you want to stop at a testing stage. In this example, the build stops before anything is copied into the Nginx stage.
Use a previous stage as a new stage
Let's say that you have many stages in your file and need to use a previously created stage for the following one. All you need to do is:
FROM builder as build2
This tells Docker to use the stage builder as a source for this stage. It can be used for running tests on a previous stage.
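As an illustration of the testing use case, here is a sketch of a stage that builds on builder and runs the test suite. It assumes that package.json defines a test script and that the dev dependencies are installed (hence no --only=prod here); the stage name test and the file name Dockerfile.test.sketch are hypothetical.

```shell
# Write a hypothetical Dockerfile with a test stage derived from the builder stage.
cat > Dockerfile.test.sketch <<'EOF'
FROM node:13.12.0-alpine as builder
WORKDIR /app
COPY . ./
# Full install so that test tooling from devDependencies is available
RUN npm install --silent
RUN npm run build

# Reuse the builder stage, with its files and dependencies, as the base of a new stage
FROM builder as test
# Assumption: package.json defines a "test" script; CI=true is a common hint to run once
RUN CI=true npm test
EOF
```

Building with docker build --target test -f Dockerfile.test.sketch . would then fail whenever the tests fail.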
Use an external image as a new stage
You can tell Docker to copy files from a Docker image that is either available locally or stored in a remote Docker registry. To do so, all you need is:
COPY --from=nginx:latest /etc/nginx/nginx.conf /nginx.conf
Advantages & Drawbacks
Using multi-staged containers has many advantages, but some of its drawbacks may keep you away from this neat feature.
In our example above, the multi-stage image is more than ten times smaller than the original one. When you upload images to the cloud, the size difference can have a significant impact. Let's imagine that you upload your image to a registry such as ECR (Elastic Container Registry) from AWS.
- At home, I typically have a 20 Mbps upload speed, which would bring the upload time of the 369 MB image to about two and a half minutes.
- The tiny image would upload in roughly 13 seconds.
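The arithmetic behind these estimates can be checked with a few lines of shell. The sketch ignores protocol overhead and layer compression, and it uses integer division, so the results are rounded down.

```shell
# Upload speed in megabits per second, as quoted above.
UPLOAD_MBPS=20

for SIZE_MB in 369 32; do
  # 1 megabyte = 8 megabits, so seconds = MB * 8 / Mbps (integer division).
  SECONDS_NEEDED=$(( SIZE_MB * 8 / UPLOAD_MBPS ))
  echo "${SIZE_MB} MB -> about ${SECONDS_NEEDED} s"
done
# prints:
# 369 MB -> about 147 s
# 32 MB -> about 12 s
```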
A couple of minutes is not a long time to wait. But if your image carries everything needed to build, test, and run the application, its size grows significantly. Shipping only the runtime environment decreases the image size and saves you minutes and bandwidth.
If you are running a Kubernetes cluster, each pod needs to pull the image at every update; bigger images mean longer download times. If you have dozens of updates across hundreds of services by the end of the day, a reduced image size can significantly decrease your services' downtime.
Making sure your images are secure is not an easy job. In a single-stage image, you could end up shipping vulnerabilities brought in by build-time dependencies that are not needed at runtime. By removing unnecessary dependencies and reducing the exposure of your image, you reduce the security risks considerably.
Bandwidth is relatively cheap on every cloud provider, so the cost difference between a big and a small image is not that large. However, if your containers are downloaded thousands of times across multiple regions, the cost can factor into your choice.
Sometimes, some libraries are needed at runtime for the application to work. It can be tricky to include those libraries in a bare-bones base image while keeping the image as small as possible.
The advantages of a multi-stage build are numerous. However, it is only required in some specific use cases. In environments with automated CI/CD pipelines and multiple applications requiring Docker images, the size difference can significantly decrease your roll-out times.