Posted on 5 January 2023.
Let’s say you need to deploy multiple complex environments and plan to use Gitlab pipelines. How would you proceed? At Padok, we faced this very issue and solved it using a combination of multi-project dynamic pipelines, artifacts, and dependency relationships between Gitlab jobs.
What are we building here?
This article describes our solution to the question, the components we developed, and how we arranged them to work together. The objective here is to create the following pipeline structure:
Gitlab is a web-based source-control platform based on Git that is designed for team collaboration and productivity. It is a very powerful developer tool that lets your teams stay in control, helps you optimize your workflows to match your deployment SLAs, and gives you tons of options to create complex CI/CD pipelines.
This last part really got us to consider Gitlab CI as the solution to the problem at hand. Plus it is very popular already, with many examples in the literature to automate your DevOps lifecycle, deploy applications in Kubernetes, implement GitOps best practices, and many more!
How do you set up dynamic Gitlab pipelines?
First things first, we need a master job that can run downstream pipelines dynamically. We want to create as many environments as the user specifies in the input parameters: it can be only 1, 3 just like in this example, or more. The question: how do you trigger the create-env job multiple times? The answer: with dynamic Gitlab pipelines.
Dynamic pipelines in GitLab CI are generated programmatically based on certain conditions or parameters. This means that the steps and tasks in the pipeline are not fixed, but can change depending on the input or context.
For example, you could use them to automatically run a job that performs the same task hundreds of times, with each instance almost identical to the others except for subtle differences. Writing out each variation of the same job by hand would be tedious, bordering on impossible. Instead of writing thousands of lines of code, you can generate them with dynamic pipelines in Gitlab CI.
Dynamic pipelines help manage complex or large CI/CD pipelines, where the tasks and dependencies can vary depending on the context. They allow teams to automate and customize their CI/CD processes, making them more efficient and effective.
```yaml
# bootstrap-env/.gitlab-ci.yml
#
# bootstrap-env
# ├── .gitlab-ci.yml  <--
# ├── generate_templates.py
# └── requirements.txt

variables:
  ENVIRONMENTS:
    description: "User input: comma-separated list of environments"
    value: "dev,prod,staging"

stages:
  - templating
  - deployment

generate-templates:
  stage: templating
  image: python:3.10
  before_script:
    - pip install -r requirements.txt
  script:
    - python generate_templates.py --env $ENVIRONMENTS
  artifacts:
    paths:
      - environments.yml

deploy-envs:
  stage: deployment
  trigger:
    include:
      - artifact: environments.yml
        job: generate-templates
    strategy: depend
```
In this example, we use a custom Python script to generate a YAML configuration file for Gitlab CI/CD jobs. This is the first stage of the pipeline: templating. In the second stage, deployment, we use the generated file to deploy every environment initially requested by the user in the ENVIRONMENTS variable. As a result, the environments.yml configuration file will contain 3 jobs, each responsible for creating an environment: dev, prod, and staging.
Users can input the environments they need, and the templating stage will create as many jobs as there are environments to bootstrap.
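The generator script itself is not shown in this article, but here is a minimal sketch of what a script like generate_templates.py could look like. The job template and argument handling are assumptions; the real script can render whatever the deployment stage needs:

```python
# generate_templates.py -- hypothetical sketch of the templating script.
# Reads a comma-separated list of environments and writes environments.yml,
# containing one deployment job per environment.
import sys


def generate(environments: str) -> str:
    """Render one Gitlab CI job per environment name."""
    jobs = []
    for env in environments.split(","):
        env = env.strip()
        jobs.append(
            f"deploy-{env}:\n"
            f"  environment: {env}\n"
            f"  script:\n"
            f'    - echo "Creating environment {env}..."\n'
        )
    return "\n".join(jobs)


# Invoked in CI as: python generate_templates.py --env dev,prod,staging
if __name__ == "__main__" and "--env" in sys.argv:
    envs = sys.argv[sys.argv.index("--env") + 1]
    with open("environments.yml", "w") as handle:
        handle.write(generate(envs))
```

With the default input "dev,prod,staging", this emits three deploy jobs, one per environment, into environments.yml.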
Voilà! We have 3 child jobs for the 3 environments we asked for. But how do you make them actually create new environments?
How do you set up multi-project downstream Gitlab pipelines?
Second, we want every step of the target pipeline architecture to be in a dedicated Gitlab project. The problem is to tell Gitlab “can you trigger the pipeline of this other project to create a new environment for me please?”. Gitlab actually offers two different ways to achieve this: using a
trigger job, or calling the API.
We will need both in this example, because the maximum depth of downstream pipelines that can be triggered is 2, but here we need at least 3! Fortunately, triggering a pipeline with the API resets that counter, so you can use this technique to go as many levels down as you want. This is a cool trick to know, but it comes with its share of trade-offs.
On the plus side, both methods have similar behavior in terms of parent-child representation. As you would expect from a
trigger job, you can see the pipelines running in the parent project, and in the child project as well, with a link back to the original.
However, on the negative side, the POST command syntax can become a little overwhelming when lots of variables are involved and need to be passed down to the child pipeline.
In addition, a parent job that calls for a child pipeline with the API doesn't wait for that child pipeline to terminate. Instead, it exits with a success as soon as the curl command executes correctly. This can become a problem when jobs from later stages need these steps to be done. One workaround is to add a waiting loop after the API call.
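Such a waiting loop could look like the following sketch. The job name, the API_READ_TOKEN variable (a token with API read scope), and the 10-second polling interval are all assumptions, not taken from our actual pipelines:

```yaml
# Hypothetical sketch: trigger a child pipeline via the API, then poll
# its status until it reaches a finished state before exiting.
trigger-and-wait:
  script:
    # Trigger the child pipeline and capture its ID from the JSON response
    - >
      PIPELINE_ID=$(curl --request POST
      --form "token=$CI_JOB_TOKEN"
      --form "ref=$GITLAB_REF"
      "https://gitlab.com/api/v4/projects/$GITLAB_PROJECT_ID/trigger/pipeline"
      | jq -r '.id')
    # Poll until the child pipeline finishes; fail if it did not succeed
    - |
      while true; do
        STATUS=$(curl --header "PRIVATE-TOKEN: $API_READ_TOKEN" \
          "https://gitlab.com/api/v4/projects/$GITLAB_PROJECT_ID/pipelines/$PIPELINE_ID" \
          | jq -r '.status')
        case "$STATUS" in
          success) break ;;
          failed|canceled|skipped) exit 1 ;;
          *) sleep 10 ;;
        esac
      done
```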
```yaml
# bootstrap-env/generated-template.yml

deploy-staging:
  environment: staging
  variables:
    GITLAB_PROJECT_ID: 123456789  # the project ID of 'create-env'
    GITLAB_REF: main
  script:
    - >
      curl --request POST
      --form "token=$CI_JOB_TOKEN"
      --form "ref=$GITLAB_REF"
      --form "variables[ENVIRONMENT]=$CI_ENVIRONMENT_NAME"
      "https://gitlab.com/api/v4/projects/$GITLAB_PROJECT_ID/trigger/pipeline"
```
With this, each generated job triggers the “create-env” pipeline. We call the Gitlab API using
curl and specify important parameters such as
token, which is required to trigger multi-project pipelines. Additionally, we pass down relevant variables: in this example, we need an
ENVIRONMENT variable for later stages.
In turn, “create-env” triggers “add-resources” using:
```yaml
# create-env/.gitlab-ci.yml

stages:
  - resources

add-env-resources:
  stage: resources
  rules:
    - if: $CI_PIPELINE_SOURCE == "pipeline"
  trigger:
    project: "$GITLAB_GROUP/add-resources"
    branch: main
    strategy: depend

# [ ... ]
```
For multi-project pipelines, the
trigger keyword accepts a Gitlab project as a parameter, either as a simple string or with the project keyword. It is also worth mentioning that the
trigger:project syntax is for Premium Gitlab accounts exclusively. The
strategy: depend option makes the parent pipeline's status depend on its children's status.
Also, notice the use of
rules to let the job run only when another pipeline triggers it. This helps avoid unintentional environment creation on code pushes and merge requests… which can rapidly turn into very serious problems!
Further down in the architecture, multi-project downstream pipelines will work the same: “resource1” calls “add-resource1” using a
trigger logic, and so on.
That is a problem solved. However, things get tricky when artifacts are involved.
How do you pass artifacts around between Gitlab jobs?
Lastly, artifacts. We use them to convey information about the resources created throughout the procedure. In this example, resource2 needs information about resource1, and resource3 needs information about resource2. An artifact is created at the end of each add-resource# pipeline, and we fetch them at the parent level to pass down variables to other children when creating new resources.
Don’t hesitate to go check the first diagram at the top of this article: in this section, we focus on the right-most side including all resource-related repositories and pipelines.
One of the main challenges is the dependency graph between jobs: parent jobs have to wait for their children to exit successfully, and only then can they get the generated artifacts. Implementing a dependency strategy using
stages is relatively easy with jobs in the same project:
```yaml
# add-resources/.gitlab-ci.yml
#
# add-resources
# ├── .gitlab-ci.yml  <--
# ├── resource1.yml
# ├── resource2.yml
# └── resource3.yml

variables:
  ARTIFACT_RESOURCE3: "resource3-outputs.zip"
  ARTIFACT_RESOURCE2: "resource2-outputs.zip"
  ARTIFACT_RESOURCE1: "resource1-outputs.zip"

stages:
  - resource1
  - resource2
  - resource3

add-resource1:
  stage: resource1
  trigger:
    include:
      - local: "resource1.yml"
    strategy: depend

add-resource2:
  stage: resource2
  trigger:
    include:
      - local: "resource2.yml"
    strategy: depend

add-resource3:
  stage: resource3
  trigger:
    include:
      - local: "resource3.yml"
    strategy: depend
```
By default, all artifacts from previous stages are passed down to later stages in Gitlab, which can raise issues with storage, security, and pipeline speed. To alleviate this, you can also consider the
dependencies field which lets you specify exactly what artifacts the job requires.
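For instance, a deploy job could restrict itself to the artifacts of a single build job. Job names here are illustrative, not taken from the pipelines above:

```yaml
# Only fetch artifacts produced by 'build-app'; ignore every other job
deploy-app:
  stage: deployment
  dependencies:
    - build-app
  script:
    - ./deploy.sh
```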
On the other hand, when jobs are part of different projects in a multi-project downstream pipeline, the
stage approach is not enough, and you will have to use the needs keyword:
```yaml
# add-resources/resource2.yml
#
# add-resources
# ├── .gitlab-ci.yml
# ├── resource1.yml
# ├── resource2.yml  <--
# └── resource3.yml

variables:
  GITLAB_PROJECT: add-resource2
  GITLAB_REF: main

# Parse artifacts to get information needed for the resource2 pipeline
parse-resource1-artifact:
  before_script:
    - apt-get update && apt-get install -y zip jq
  needs:
    - project: "$GITLAB_GROUP/add-resource1"
      job: add-resource1
      ref: main
      artifacts: true
  script:
    - echo "Parsing resource1 artifact from \"$ARTIFACT_RESOURCE1\"..."
    - >
      RESOURCE1_ID=$(unzip -p $ARTIFACT_RESOURCE1 resource1_information.json | jq -r '.id')
      && echo $RESOURCE1_ID

# Triggers the creation of a new resource2 resource
trigger-resource2-pipeline:
  variables:
    RESOURCE1_ID: $RESOURCE1_ID
  trigger:
    project: "$GITLAB_GROUP/$GITLAB_PROJECT"
    branch: $GITLAB_REF
    strategy: depend
    forward:
      pipeline_variables: true
  needs:
    - parse-resource1-artifact
```
needs logic takes precedence over the
stage logic: if both fields are set, the pipeline ignores the stage order and only follows needs. According to the official documentation, the same goes for artifacts:
"When a job uses
needs, it no longer downloads all artifacts from previous stages by default, because jobs with
needs can start before earlier stages complete".
In this example, we make sure the first job retrieves the artifact from the previous step before running the actual pipeline: in
parse-resource1-artifact, we download the artifact and then extract some information.
For the sake of the example, we implemented a simplistic parsing logic, but the outputs can be custom JSON files, logs, Terraform states, Kubernetes API calls… you get the idea.
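The parsing does not have to happen in shell, either: the same extraction as the jq one-liner above can be done in a few lines of Python. A sketch, assuming the artifact archive contains a resource1_information.json file with an id field:

```python
# Sketch: read a resource ID out of a Gitlab artifact archive,
# equivalent to the `unzip -p ... | jq -r '.id'` one-liner above.
import json
import zipfile


def read_resource_id(archive_path: str,
                     member: str = "resource1_information.json") -> str:
    """Extract `member` from the artifact zip and return its 'id' field."""
    with zipfile.ZipFile(archive_path) as archive:
        info = json.loads(archive.read(member))
    return info["id"]
```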
That's all folks! We hope you learned something useful today. Feel free to reach out to us, we are always happy to discuss and learn from you. Cheers 🤍.