How to speed up your GitLab CI?

In today's software industry, being able to turn ideas into code and push that code to production as fast as possible is of crucial importance.

Automating every part of software development, testing, and deployment cuts the toil of human intervention and therefore speeds the whole process up.

Having an efficient CI/CD workflow is an important part of this puzzle. At Padok we like to use GitLab CI, as it’s open-source, straightforward, quick to set up, and allows its runners to be hosted anywhere.

In this article, we will see how you can optimize and speed up your GitLab CI pipeline while keeping the bill as low as possible.


Host your own runners

There are two ways to run CI/CD jobs on GitLab:

  • Using the runners provided by the host of your repository (for example gitlab.com). Those usually come with limits on minutes and machine specs that grow with the plan you subscribe to.
  • Deploying your own runners. The GitLab runner is nothing more than an open-source piece of software that you can run pretty much anywhere. Once installed and configured, it will register on your GitLab server and listen for the jobs it has to run. The jobs can run in Kubernetes pods, in Docker containers on virtual machines or even as simple processes on the runner itself.

Deploying and hosting your own runners is a good way to tailor your GitLab CI setup to your own needs.

If you deploy the runners on Kubernetes (there is, of course, a Helm chart for that), you can specify the resources (CPU, memory, disk space, etc.) each job needs, so that it uses exactly the right amount.
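As a sketch, here is what that could look like with the official gitlab-runner Helm chart, using the Kubernetes executor’s resource settings (the URL, token, namespace, and figures below are placeholders to adapt):

```yaml
# values.yaml for the gitlab-runner Helm chart (a sketch; adapt to your setup)
gitlabUrl: https://gitlab.example.com/        # placeholder: your GitLab instance
runnerRegistrationToken: "<registration-token>"

runners:
  config: |
    [[runners]]
      [runners.kubernetes]
        namespace = "gitlab-runners"
        # Build pods will request (and be capped at) exactly these resources
        cpu_request = "500m"
        cpu_limit = "1"
        memory_request = "512Mi"
        memory_limit = "1Gi"
```

Jobs scheduled by this runner then run in pods that request exactly these resources, which makes it much easier for the cluster to pack them efficiently.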

Being able to deploy your own runners is one of the best features of GitLab CI. It would be a shame not to make the most of it!

Rewrite your Dockerfile to make the most of the Docker cache

This advice is specific to a kind of job that is present in most CI workflows: the build of a Docker image in which your application will be shipped.

If you don’t build container images, you can skip this part.

Docker has a built-in caching mechanism that allows you to build images faster by re-using layers from a former build.

One important thing to know is that, during the build process, Docker creates multiple intermediate images, each corresponding to an instruction in your Dockerfile. Before reusing a layer from a former build, Docker has to determine whether that layer is still valid.

It does so differently depending on the kind of instruction:

  • for a COPY, it checks whether the files being copied have changed since the last build (Docker compares their checksums)
  • for a RUN, it checks whether the command has changed
  • etc.

You should always write your Dockerfiles so that the instructions most likely to invalidate the cache come last. For example, copying your application code into the image should be one of the last steps.

If your Dockerfile looks something like this (both examples below are minimal sketches, assuming a simple Node.js app):
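```dockerfile
# A cache-unfriendly Dockerfile (a sketch, assuming a Node.js app)
FROM node:18

WORKDIR /app

# The application code changes on almost every commit...
COPY . .

# ...so this expensive step is re-run on every build
RUN npm install

CMD ["node", "index.js"]
```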

You should probably rewrite it so it looks like this:
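```dockerfile
# The same sketch, reordered so the expensive step stays cached
FROM node:18

WORKDIR /app

# Dependency manifests change rarely, so copy them first
COPY package.json package-lock.json ./

# This layer is reused from the cache as long as the manifests are unchanged
RUN npm install

# The frequently-changing application code comes last
COPY . .

CMD ["node", "index.js"]
```

Now a build triggered by a code-only change reuses the cached npm install layer instead of re-downloading every dependency.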

Reuse the Docker cache from a former build

If you are using a Kubernetes runner for your GitLab CI, each build job runs in a different container with its own Docker daemon. Therefore, you can’t reuse the cache as easily as explained in the previous part, since, as far as this Docker daemon knows, there was no former build.

What you can do is pull an image that you built earlier from your central image registry before building the new one, and use the --cache-from option.
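Here is a minimal sketch of such a job (the docker and dind image versions are arbitrary; CI_REGISTRY_IMAGE and CI_COMMIT_SHORT_SHA are variables predefined by GitLab):

```yaml
build:
  stage: build
  image: docker:24
  services:
    - docker:24-dind
  script:
    # Pull the previous image to seed the cache; don't fail if it doesn't exist yet
    - docker pull $CI_REGISTRY_IMAGE:latest || true
    # Build using the pulled image as a cache source, tagging the short sha and latest
    - >
      docker build
      --cache-from $CI_REGISTRY_IMAGE:latest
      --tag $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA
      --tag $CI_REGISTRY_IMAGE:latest
      .
    # Push both tags to the registry
    - docker push $CI_REGISTRY_IMAGE:$CI_COMMIT_SHORT_SHA
    - docker push $CI_REGISTRY_IMAGE:latest
```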

So let’s see in detail what the script section of this GitLab CI job does:

  • An image tagged latest is pulled from the registry. If this image doesn’t exist yet, the job shouldn’t fail; hence the || true.
  • A new build is run with the --cache-from option, pointing at the image pulled in the first step so its layers can be reused. The resulting image is tagged with both the commit’s short SHA and latest.
  • Both the short SHA and latest tags are pushed to the registry.

With this way of building your docker images, each build will use the cache from the former build.

One downside is that you will spend time pulling and pushing the latest image, but the time saved by reusing its cache is usually far greater.

Re-think how your jobs use the cache

Pulling and pushing the GitLab CI cache can sometimes take more time than running the job itself.

That’s why it’s worth thinking about its usage and disabling it when possible.

Ask yourself this question each time you write a new job: Do I need some files from the cache to run this task?

If not, just disable it or use another cache more suited to the task.

Be careful: If you defined the cache globally rather than for each job, you also have to explicitly disable it:
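For example (a sketch; the job names, paths, and scripts are illustrative):

```yaml
# A global cache, shared by every job by default
cache:
  key: $CI_COMMIT_REF_SLUG
  paths:
    - node_modules/

tests:
  stage: test
  script:
    - npm test          # needs node_modules/, so it keeps the global cache

deploy:
  stage: deploy
  cache: {}             # this job needs nothing from the cache, so disable it
  script:
    - ./deploy.sh
```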

In the example above, I have to specify an empty cache, cache: {}, so the global cache is not used for this job.

NB: Don’t mistake artifacts for cache. Artifacts are the results of a job (build output, test reports, etc.). Don’t enable the cache on a job just to be able to reuse its output in the next job: that’s what artifacts are for!
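For instance, passing build output to the next stage with artifacts might look like this (a sketch; paths and commands are illustrative):

```yaml
build:
  stage: build
  script:
    - make build
  artifacts:
    paths:
      - dist/           # the build output is saved as an artifact...
    expire_in: 1 week

test:
  stage: test
  script:
    - make test         # ...and downloaded here automatically, no cache involved
```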

Allow a single pipeline per ref and make jobs interruptible

As you might have noticed, most latency and flakiness in GitLab CI occurs when many jobs run simultaneously (if not, please take a moment to thank the person on your team responsible for the marvel of a CI setup you’ve got there!).

A solution for that — other than giving more compute power to the architecture on which your runners run — is to lower the number of triggered pipelines.

Of course, just cancelling pipelines for random commits could get you into trouble with the people who pushed them.

Nonetheless, there are pipelines that no one will miss if they do not complete: the non-HEAD pipelines. A non-HEAD pipeline is one that was triggered on the last commit of a branch before another push to that branch; the commit is no longer the last one, and thus no longer the HEAD.

To put it simply, we only want to run a pipeline on a commit if it’s the HEAD of the branch. If during the pipeline lifetime, a new commit is pushed on the branch, the pipeline should be stopped. The pipeline that is running on the new commit — which is the new HEAD — will continue as if nothing happened.

To do that, you first have to enable the feature in the GitLab web interface.

Go to Settings > CI/CD > General pipelines and check the box shown below:

[Screenshot: the “Auto-cancel redundant pipelines” checkbox]

With this box checked, the old pipeline will be stopped as soon as it has completed its current job.

If you want to go further, you can mark your jobs as interruptible, so the pipeline can be stopped even in the middle of a job. This is especially useful for test or build jobs that take more than a few minutes.
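Marking a job as interruptible is a one-line change (a sketch; the job itself is illustrative):

```yaml
unit-tests:
  stage: test
  interruptible: true   # may be cancelled mid-run if a newer pipeline starts on the same ref
  script:
    - make test
```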

Automatically rerun jobs that failed on known issues

Even with the best intentions, your GitLab CI may encounter issues that are beyond your control.

Be it network issues keeping you from downloading your dependencies, budget restrictions keeping you from scaling up your infrastructure, or something else entirely, sometimes the only advice you can give someone whose pipeline has failed is: “Just run it again.”

However, a lot of time may pass between the moment a pipeline fails and the moment the developer who owns it notices and reruns it. And it’s not only time: it’s also toil that could be automated.

The retry keyword allows you to automatically rerun jobs that fail for specific reasons, because there is no reason to do ourselves something that a machine can do better:
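For example (a sketch; the job is illustrative, but the failure reasons are standard values accepted by retry:when):

```yaml
integration-tests:
  stage: test
  script:
    - make integration-tests
  retry:
    max: 2                         # rerun at most twice
    when:
      - runner_system_failure      # e.g. the runner's machine had an issue
      - stuck_or_timeout_failure   # e.g. no runner picked up the job in time
```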

I hope those few tips are helpful and will cut some time off your CI pipelines. If you’re not already using GitLab CI, you might want to check out this article on how to deploy a Kubernetes app with GitLab pipelines. Or if you’re already familiar with GitLab CI, here’s how to use it to generate testing environments on the fly with Kubernetes.

Camil Sadiki


Camil is a Site Reliability Engineer (SRE) at Padok. He shares his expertise in DevOps technologies, such as Kubernetes, Helm, AWS, and GitLab, with our clients.
