GitHub Actions Failure Notification Hacks

Authors
  • Rafał Nowicki

Abstract

A growing codebase can unexpectedly turn simple tasks with simple solutions into simple tasks that need sophisticated solutions. This article describes how we can and cannot implement failure notifications with GitHub Actions. Depending on how complicated your workflow is, you might find that GitHub Actions features are not as helpful as expected.

Requirements

Our team should be notified on a Slack channel whenever the main branch is unstable.

[Figure: Requirements]

Level 1: Very simple use case

Let's start simple. We want to build the application after every merge to the main branch.

To achieve that, we configure a workflow with a single job as follows.

[Figure: Simple scenario]

The first step is the standard GitHub Actions checkout, which makes sure our code has been fetched. The second step builds the application. The build could be defined inline in the ‘run’ block or in project scripts, depending on your stack; here I call a simple shell command to keep the snippet short. The third step uses an arbitrarily chosen Slack notification action from the GitHub Actions Marketplace.

Note the condition used for the notification. According to the documentation, a step with ‘if: failure()’ runs only when a previous step of the job has failed. Also notice that we won't be notified about cancelled jobs.

jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Build
        run: echo "Build code"
      - name: Slack Notification
        if: failure()
        uses: rtCamp/action-slack-notify@v2
        env:
          SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK }}
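
The snippet above leaves out the trigger. As a minimal sketch, assuming the default branch is named main and that a merge shows up as a push to it, the top of the workflow file could look like this (the workflow name is a placeholder):

```yaml
# Hypothetical workflow name, used only for illustration.
name: Build main

# Run on every push to main; merging a pull request produces such a push.
on:
  push:
    branches:
      - main
```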

So far so good: this workflow meets our requirement.
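
If cancelled runs should raise an alert as well, the condition could be widened. This is only a sketch, assuming the same Slack action as above; ‘cancelled()’ is a built-in status check function, just like ‘failure()’.

```yaml
- name: Slack Notification
  # Run when an earlier step failed or the run was cancelled.
  if: ${{ failure() || cancelled() }}
  uses: rtCamp/action-slack-notify@v2
  env:
    SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK }}
```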

Level 2: More complex scenario

We still want to build the application after merging to the main branch, but first we would like to run the tests. If the tests fail, we want to skip the build. Failing tests mean that our branch is unstable, so we want to be notified on Slack in that case too.

[Figure: Upgrading workflow]

To set up the dependency, we add the ‘needs’ key as follows.

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Test
        run: echo "Test"

  build:
    runs-on: ubuntu-latest
    needs:
      - test
    steps:
      - uses: actions/checkout@v4
      - name: Build
        run: echo "Build code"
      - name: Slack Notification
        if: failure()
        uses: rtCamp/action-slack-notify@v2
        env:
          SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK }}

At first glance everything seems fine, but let's analyze our cases.

tests pass, build passes

In the success scenario, no notification is sent, because the ‘if: failure()’ condition in the build job is not met.

tests pass, build fails

In the first failure case, we get a notification, as the ‘if: failure()’ condition is met.

tests fail, build ???

In the second failure case, thanks to our dependency setup, the build job will be skipped. When a job is skipped, none of its steps are executed. That means our notification step won't run.

As we can see, this setup cannot work. Moving the notification into a separate job that merely lists build in its ‘needs’ will not work either: without a job-level condition, that job is skipped together with build.

What would happen if the notification were a separate job that declares no dependency but keeps the failure condition?

notify:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Slack Notification
      if: failure()
      uses: rtCamp/action-slack-notify@v2
      env:
        SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK }}

My first thought was that this job would start once the workflow was marked as failed. Nothing could be further from the truth. A job without any dependency starts immediately, which means it starts before any other job has had a chance to fail. On top of that, a step-level ‘if: failure()’ only looks at the previous steps of its own job, so the notification step is simply skipped.

To the best of my knowledge, the only way to meet our requirement while keeping the notification as a step is to duplicate the Slack notification code in each job. The workflow could look like the following.

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Test
        run: echo "Test"
      - name: Slack Notification
        if: failure()
        uses: rtCamp/action-slack-notify@v2
        env:
          SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK }}

  build:
    runs-on: ubuntu-latest
    needs:
      - test
    steps:
      - uses: actions/checkout@v4
      - name: Build
        run: echo "Build code"
      - name: Slack Notification
        if: failure()
        uses: rtCamp/action-slack-notify@v2
        env:
          SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK }}

Level 3: Well-decomposed workflow

What if your workflow is a bit more complex? Some of my workflows consist of four, five or even more jobs. Some of them depend on each other and some use the matrix feature. I prefer to keep jobs short and simple. Is there a simple way to get a notification? Let's assume our workflow consists of three jobs, as follows.

[Figure: Decomposed workflow draft]
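
As a rough text version of that draft, assuming the three jobs are test, build and deploy chained one after another (the echo commands are placeholders for real scripts):

```yaml
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: echo "Test"

  build:
    runs-on: ubuntu-latest
    needs: test        # skipped when test fails
    steps:
      - uses: actions/checkout@v4
      - run: echo "Build code"

  deploy:
    runs-on: ubuntu-latest
    needs: build       # skipped when build fails or is skipped
    steps:
      - uses: actions/checkout@v4
      - run: echo "Deploy code"
```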

As you probably know, it is good practice to avoid code duplication. Making sure that a well-decomposed workflow covers every failure scenario can be a challenge, and GitHub Actions doesn't make it easy to implement failure notifications without duplicating code.

In addition, your notification step could be a bit more complex. You are probably setting a few more parameters there, such as the Slack channel, an icon, or a custom message.
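
For illustration, such a step might look like the sketch below. The variable names are assumed from the README of rtCamp/action-slack-notify and should be verified there; the channel, emoji and message values are made up.

```yaml
- name: Slack Notification
  if: failure()
  uses: rtCamp/action-slack-notify@v2
  env:
    SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK }}
    SLACK_CHANNEL: ci-alerts              # hypothetical channel name
    SLACK_ICON_EMOJI: ':rotating_light:'  # hypothetical icon
    SLACK_TITLE: Main branch is unstable!
    # Built from the default github context; links to the failed run.
    SLACK_MESSAGE: "Run failed: ${{ github.server_url }}/${{ github.repository }}/actions/runs/${{ github.run_id }}"
```

With this many settings, copying the step into every job becomes noticeably harder to maintain.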

What we can do is reduce the duplication with the GitHub Actions reusable workflows feature. Below is an example reusable workflow which, in our case, doesn't need any inputs.

name: Shared failure notification

on:
  workflow_call:

jobs:
  notify:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v4
      - name: Slack Notification
        uses: rtCamp/action-slack-notify@v2
        env:
          SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK }}
          SLACK_COLOR: '#ff0000'  # quoted, because a bare # starts a YAML comment
          SLACK_TITLE: Main branch is unstable!
          SLACK_USERNAME: bot
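
Should callers ever need to customize the message, the shared workflow could declare its inputs and secrets explicitly. This is a sketch only: the ‘title’ input is hypothetical, and declaring ‘SLACK_WEBHOOK’ under ‘secrets’ is just one way of passing it in.

```yaml
name: Shared failure notification

on:
  workflow_call:
    inputs:
      title:                      # hypothetical input
        type: string
        required: false
        default: Main branch is unstable!
    secrets:
      SLACK_WEBHOOK:
        required: true

jobs:
  notify:
    runs-on: ubuntu-latest
    steps:
      - name: Slack Notification
        uses: rtCamp/action-slack-notify@v2
        env:
          SLACK_WEBHOOK: ${{ secrets.SLACK_WEBHOOK }}
          SLACK_COLOR: '#ff0000'
          SLACK_TITLE: ${{ inputs.title }}
```

A caller would then pass the value through ‘with’ and the webhook through ‘secrets’ on the job that uses this workflow.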

And this is how you could apply the shared workflow.

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Test
        run: echo "Test"

  build:
    runs-on: ubuntu-latest
    needs:
      - test
    steps:
      - uses: actions/checkout@v4
      - name: Build
        run: echo "Build code"

  deploy:
    runs-on: ubuntu-latest
    needs:
      - build
    steps:
      - uses: actions/checkout@v4
      - name: Deploy
        run: echo "Deploy code"

  # A reusable workflow is called from a job, not from a step.
  notify:
    needs:
      - test
      - build
      - deploy
    # Job-level condition: true when any job listed in 'needs' has failed.
    if: ${{ failure() }}
    uses: ./.github/workflows/shared-failure-notification.yaml
    # Makes SLACK_WEBHOOK available to the called workflow.
    secrets: inherit

As a result, we keep the jobs from the earlier scenario while the notification logic itself lives in a single shared workflow. The code above can be summarized in a simple diagram like this.

[Figure: Final workflow with shared job]

This solution is not ideal, but it probably makes the most sense in the given circumstances. I wish it were easier to implement custom failure notifications in well-decomposed GitHub Actions workflows. It probably will be at some point. Fingers crossed.