GitHub Actions Concurrency Group Trap

Author: Rafał Nowicki

Idea

As a developer, I've been exploring GitHub Actions as the primary automation tool for a new project. For the first month, I focused on fulfilling 80% of our team's automation needs. As the project grew, we incrementally improved the workflows and fixed edge cases. However, I recently realized that I didn't fully understand what was happening with the workflows.

Initial implementation

Initially, we used the concurrency feature in many workflows, but only one of them caused us trouble. That workflow ran on every merge to the main branch, the moment when we wanted to double-check that the merged changes passed all quality checks and to deploy them to our staging environment.

In general, the workflow looked like this:

Figure: Example workflow

The workflow had concurrency enabled, with a fixed group name and cancel-in-progress turned on.

concurrency:
  group: ci-master
  cancel-in-progress: true

This prevented runs in the same group from executing in parallel: when a new run was triggered, it cancelled the run that was still in progress.
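To show where the concurrency block sits, here is a minimal sketch of such a workflow. Only the `concurrency` settings come from our configuration. The workflow name, the job names, the `main` branch filter and both script paths are assumptions made up for illustration.

```yaml
# Sketch of a workflow that runs on merges to the main branch.
name: CI

on:
  push:
    branches:
      - main

# Only one run in the "ci-master" group is active at a time;
# a newer run cancels the run that is still in progress.
concurrency:
  group: ci-master
  cancel-in-progress: true

jobs:
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Hypothetical script standing in for the real quality checks.
      - run: ./scripts/run-checks.sh

  deploy-staging:
    needs: checks
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Hypothetical script standing in for the real staging deployment.
      - run: ./scripts/deploy-staging.sh
```

Because the `concurrency` key is set at the workflow level, a cancellation stops the whole run, including a deployment job that has already started.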

Figure: Example workflow got cancelled

We merged Feature A and then Feature B before the Feature A jobs had finished. GitHub Actions cancelled the Feature A run, so jobs from only one workflow run were executing at any time.

This behaviour was what we expected, because at the time all we cared about was that the latest commit was verified and deployed to the staging environment.

Better implementation

At some stage, we realized that we needed more control over the deployment script and had to prevent it from being cancelled mid-run. Cancelled deployment jobs sometimes left us with bad deployments and broken staging application instances.

The simplest idea was to drop the cancel-in-progress setting, so that new runs, and especially our deployment, would wait in the queue instead of cancelling the run in progress.

The workflow configuration then looked like this:

concurrency:
  group: ci-master

Does it really work?

The GitHub documentation is usually pretty clear, but the concurrency feature seemed simple enough that we assumed there was not much to read about.

The new configuration worked well at first. We expected it to group our runs and prevent them from executing in parallel, and after each new merge we expected every run waiting in the queue to be processed eventually.

Figure: Expected behavior

However, things became complicated when the queue grew. At the time of this article's publication (October 2023), the concurrency groups feature in GitHub Actions "silently" cancelled some of the queued runs. It appeared to keep at most one pending workflow run or job, the latest one, in the queue.

Figure: Unexpected behavior
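The behavior described above can be sketched as a timeline. The configuration is the one we used. The three merges A, B and C are hypothetical, and the comments describe what we observed rather than any guarantee.

```yaml
concurrency:
  group: ci-master
  # cancel-in-progress is omitted, so a run in progress is never cancelled.

# Three merges in quick succession (A, B and C are hypothetical):
#   merge A -> run A starts and is in progress
#   merge B -> run B is pending, waiting for run A
#   merge C -> run B is cancelled, run C becomes the pending run
# When run A finishes, run C starts. Run B never executes.
```

In other words, the group behaves like a queue with a single pending slot: each new arrival replaces whatever was waiting.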

Actions required

This situation pushed us to run workflows completely independently, without grouping or cancelling any of them. Verifying every commit had started to matter to us, and the deployment issues were addressed separately.

That caused more trouble, but that is another story.
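For readers who want every commit verified while still keeping deployments from overlapping, one possible arrangement is to scope the concurrency group to the deployment job only. This is a sketch under assumptions, not the setup we ended up with: the job names, the `deploy-staging` group name and the script paths are hypothetical.

```yaml
name: CI

on:
  push:
    branches:
      - main

jobs:
  # No concurrency group here, so checks run for every merge in parallel.
  checks:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      # Hypothetical script standing in for the real quality checks.
      - run: ./scripts/run-checks.sh

  deploy-staging:
    needs: checks
    runs-on: ubuntu-latest
    # Job-level concurrency: only deployments are serialized.
    concurrency:
      group: deploy-staging
      cancel-in-progress: false
    steps:
      - uses: actions/checkout@v4
      # Hypothetical script standing in for the real staging deployment.
      - run: ./scripts/deploy-staging.sh
```

A running deployment is not interrupted in this arrangement, but the single pending slot still applies: a deployment that is waiting can be replaced by a newer one.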

Summary

I've worked with many CI/CD tools, and silent cancellations like these struck me as a bit odd. However, if you read the documentation a bit more carefully, you will find this:

When a concurrent job or workflow is queued, if another job or workflow using the same concurrency group in the repository is in progress, the queued job or workflow will be pending. Any previously pending job or workflow in the concurrency group will be canceled. To also cancel any currently running job or workflow in the same concurrency group, specify cancel-in-progress: true.

Additionally, if you dig into the community discussions, you can find issues complaining about this behavior and suggesting a different approach. One of them has been open since 2021, and I hope it will be addressed at some point.