In continuous integration, we often reuse the same workflow for multiple events. For example, the build workflow needs to run when we push a new commit, when someone creates a new pull request, or before we create a new release. However, with this approach, we end up running the same workflow multiple times for the same change. This probably matters little for public projects, but it wastes precious free minutes for private repositories.

Let us assume we have a build workflow like this:

name: Node.js CI

on:
  push:
    branches:
    - '*'
  pull_request:
    branches:
    - master

jobs:
  build:
    runs-on: ubuntu-latest

    strategy:
      matrix:
        node-version: [10.x, 12.x, 14.x, 15.x]

    steps:
    - uses: actions/[email protected]
    - name: Set up Node.js
      uses: actions/[email protected]
      with:
        node-version: ${{ matrix.node-version }}
    - run: npm cit

Then we create a feature branch. We develop something, commit, and push. When our commits land on the feature branch, they trigger our “Node.js CI” workflow. So far, so good. Sometime later, we decide that the feature is ready and create a new pull request. This PR will again trigger the “Node.js CI” workflow. Then we decide to amend something, create a new commit, and push it.

What happens now? Because we have made a push, this triggers our workflow (GitHub will display it as “Node.js CI (push)”). Because this push updates the pull request, this will run our workflow once again (GitHub will display it as “Node.js CI (pull_request)”). The workflow runs four jobs (for different Node.js versions). Therefore we have four wasted jobs running: the jobs running for pull_request are the same as the ones running for push.

So is there a way to avoid running duplicate jobs? To some extent.

What is the pull_request event anyway, and why can’t we get rid of it in favor of push? When pull requests come from the same repository, we can safely eliminate the pull_request trigger: you cannot create a pull request without pushing your commits into the source branch. However, if a pull request comes from a different repository, the picture looks different. The push event (if enabled) happens in the context of that other repository. The workflow running in that repository cannot trigger any other workflows in ours. Moreover, we cannot trust its outcome because the repository owner could have modified or even disabled the workflow.

The easiest workaround is to restrict the push event only to the main repository branch, like this:

on:
  push:
    branches:
    - master
  pull_request:
    branches:
    - master

(replace master with the name of your primary branch)

In this scenario, GitHub will trigger the push event only when you push something to master (or merge a pull request into it). This workaround obviously won’t work if your workflows need to run for all branches (for example, if you need to build and push a Docker image or build your frontend and deploy it with Netlify to the test site).

If you do need your workflows to run for every branch, you need to find another way.

There used to be a solution that relied on the fact that PRs coming from other users had a username: prefix in the branch name. However, this no longer works.

If we want to restrict the pull_request event to foreign repositories only, we need a way to tell whether the PR comes from a forked repository. Fortunately, this is easy. If you look at the official reference, you will notice two important fields: pull_request.head.repo.full_name and pull_request.base.repo.full_name. They contain the full name (user/repo) of the head and base repositories, respectively. If these names differ, the pull request comes from another repository.
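The comparison itself is simple enough to sketch as a small Bash function. This is only an illustration of the rule described above, not part of the workflow: the function name is_foreign_pr and the repository names are hypothetical, and the three arguments stand in for the event name and the two full_name fields.

```shell
#!/usr/bin/env bash

# is_foreign_pr EVENT_NAME HEAD_REPO BASE_REPO
# Prints "yes" when the event is a pull request whose head repository
# differs from its base repository, and "no" otherwise.
# (Hypothetical helper, named for this illustration only.)
is_foreign_pr() {
  local event_name="$1" head_repo="$2" base_repo="$3"
  if [[ "$event_name" == "pull_request" && "$head_repo" != "$base_repo" ]]; then
    echo "yes"
  else
    echo "no"
  fi
}

# Made-up repository names, for illustration only.
is_foreign_pr pull_request "contributor/project" "owner/project"  # prints "yes"
is_foreign_pr pull_request "owner/project" "owner/project"        # prints "no"
is_foreign_pr push "" ""                                          # prints "no"
```

A push event never counts as a foreign PR here, whatever the repository names are, because the event name is checked first.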

Now we can update our workflow:

on:
  push:
    branches:
    - '**'
  pull_request:
    branches:
    - '**'
jobs:
  build:
    runs-on: ubuntu-latest
    if: github.event_name != 'pull_request' || github.event.pull_request.head.repo.full_name != github.event.pull_request.base.repo.full_name

The if clause determines when the build job runs:

  • if the current event is not a pull request;
  • or, if the head and base repositories of the pull request differ.

Now let us complicate the task a bit.

Suppose you want to run the workflow when a new tag is created (for example, you need to build the project before creating a new release). We then modify the workflow:

on:
  push:
    branches:
    - '**'
    tags:
    - '**'
  pull_request:
    branches:
    - '**'

Since this example mentions Node.js, let us see what happens when we run npm version patch (this is a high-level overview, without diving deep into the details):

  1. It updates the version field of package.json.
  2. After that, it commits the changes.
  3. Finally, it creates a tag with git tag, matching the version in package.json.

Provided that your branch was in sync with the origin before running npm version, it is now one commit ahead.

Now, if we push our changes along with the new tag, here is what is going to happen:

  • GitHub will run our workflow for the push event.
  • GitHub will run the second copy of our workflow because it sees the new tag.

We need our workflow to run only once.

Is there a way to find out whether a tag points at the current commit? Fortunately, yes: we can use the git tag --points-at HEAD command. But first, we need to retrieve the list of all tags from the repository. Not a problem: git fetch --depth=1 origin +refs/tags/*:refs/tags/*.
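To see what git tag --points-at HEAD reports, here is a small sketch that can be run locally. It assumes git is installed and works in a throwaway repository created with mktemp; the tag name v1.0.1, the commit messages, and the demo identity are made up for the illustration.

```shell
#!/usr/bin/env bash

# Create a throwaway repository so the demo does not touch a real checkout.
repo="$(mktemp -d)"

# Wrapper that runs git inside the demo repository with a per-command
# identity, so no global Git configuration is required.
g() {
  git -C "$repo" \
    -c user.name=demo -c user.email=demo@example.com \
    -c commit.gpgsign=false -c init.defaultBranch=main "$@"
}

g init --quiet

# First commit: no tag points at it yet.
g commit --quiet --allow-empty -m "first commit"
untagged="$(g tag --points-at HEAD)"

# Second commit plus a tag, similar to what "npm version patch" leaves behind.
g commit --quiet --allow-empty -m "release commit"
g tag v1.0.1
tagged="$(g tag --points-at HEAD)"

echo "tags at untagged commit: '${untagged}'"
echo "tags at tagged commit:   '${tagged}'"

# Clean up the temporary repository.
rm -rf "$repo"
```

The first value is empty and the second is v1.0.1, which is the distinction the helper job relies on: an empty result means that no tag points at the commit being built.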

We cannot run a shell command from an if condition, so we need to create a helper job:

  prepare:
    runs-on: ubuntu-latest
    outputs:
      head_tag: ${{ steps.head_tag.outputs.head_tag }}
      foreign_pr: ${{ steps.fp.outputs.foreign_pr }}
    steps:
      - name: Checkout
        uses: actions/[email protected]

      - name: Retrieve tags
        run: git fetch --depth=1 origin +refs/tags/*:refs/tags/*

      - name: Check if Git tag exists
        id: head_tag
        run: |
          if [[ "${{ github.ref }}" == refs/heads/* ]]; then
            echo "::set-output name=head_tag::$(git tag --points-at HEAD)"
          else
            echo "::set-output name=head_tag::"
          fi

      - name: Check if this is a foreign PR
        id: fp
        run: |
          if [ "${{ github.event_name }}" == "pull_request" ] && [ "${{ github.event.pull_request.head.repo.full_name }}" != "${{ github.event.pull_request.base.repo.full_name }}" ]; then
            echo "::set-output name=foreign_pr::yes"
          else
            echo "::set-output name=foreign_pr::"
          fi

Then we need to make the build job depend on this prepare job and update the if condition:

  build:
    runs-on: ubuntu-latest
    needs: prepare
    if: "(github.event_name == 'push' && needs.prepare.outputs.head_tag == '') || (github.event_name == 'pull_request' && needs.prepare.outputs.foreign_pr == 'yes')"

The final (slightly optimized) version looks like this:

name: Node.js CI

on:
  push:
    branches:
      - '**'
    tags:
      - '**'
  pull_request:
    branches:
      - '**'

jobs:
  prepare:
    runs-on: ubuntu-latest
    outputs:
      head_tag: ${{ steps.check.outputs.head_tag }}
      foreign_pr: ${{ steps.check.outputs.foreign_pr }}

    steps:
      - name: Checkout
        uses: actions/[email protected]

      - name: Retrieve tags
        run: git fetch --depth=1 origin +refs/tags/*:refs/tags/*

      - name: Set output variables
        id: check
        run: |
          fpr="no"
          tag=""
          if [[ "${{ github.ref }}" == refs/heads/* ]]; then
            tag="$(git tag --points-at HEAD)"
          elif [[ "${{ github.ref }}" == refs/pull/* ]] && [ "${{ github.event.pull_request.head.repo.full_name }}" != "${{ github.event.pull_request.base.repo.full_name }}" ]; then
            fpr="yes"
          fi
          echo "::set-output name=foreign_pr::${fpr}"
          echo "::set-output name=head_tag::${tag}"

  build:
    runs-on: ubuntu-latest
    needs: prepare
    if: "(github.event_name == 'push' && needs.prepare.outputs.head_tag == '') || (github.event_name == 'pull_request' && needs.prepare.outputs.foreign_pr == 'yes')"
    strategy:
      matrix:
        node-version: [10.x, 12.x, 14.x, 15.x]

    steps:
      - uses: actions/[email protected]
      - name: Set up Node.js
        uses: actions/[email protected]
        with:
          node-version: ${{ matrix.node-version }}
      - run: npm cit

The “Set output variables” step sets both variables, which makes the code shorter. We rely on the fact that the ref’s prefix is enough to distinguish between events: refs/heads/ means a push event that affected a branch, refs/tags/ means a push event involving a tag, and refs/pull/ means a pull request.
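The prefix check can be tried outside a workflow with a small Bash function. This is a sketch of the same idea, not code taken from the workflow: the function name ref_kind and the sample refs (branch name, tag name, PR number) are hypothetical.

```shell
#!/usr/bin/env bash

# ref_kind REF
# Maps a Git ref, as found in github.ref, to the kind of event it implies.
# (Hypothetical helper, named for this illustration only.)
ref_kind() {
  case "$1" in
    refs/heads/*) echo "branch" ;;        # push to a branch
    refs/tags/*)  echo "tag" ;;           # push of a tag
    refs/pull/*)  echo "pull_request" ;;  # pull request ref
    *)            echo "unknown" ;;
  esac
}

# Sample refs, made up for the example.
ref_kind "refs/heads/feature/login"  # prints "branch"
ref_kind "refs/tags/v1.0.1"          # prints "tag"
ref_kind "refs/pull/42/merge"        # prints "pull_request"
```

A case statement is used instead of chained [[ ... ]] tests only because it reads more compactly with three prefixes; the matching rule is the same glob comparison.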

The (heavyweight) build job now runs when one of the following conditions is true:

  • a PR comes from a “foreign” repository;
  • a tag is pushed;
  • or, a push to a branch does not affect any tags.
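These conditions can be checked by hand with a Bash rendering of the build job's if expression. It is a sketch for reasoning about the cases, assuming the inputs are the event name and the two outputs of the prepare job; the function name should_build is hypothetical.

```shell
#!/usr/bin/env bash

# should_build EVENT_NAME HEAD_TAG FOREIGN_PR
# Mirrors the build job's condition: prints "run" or "skip".
# (Hypothetical helper, named for this illustration only.)
should_build() {
  local event_name="$1" head_tag="$2" foreign_pr="$3"
  if [[ "$event_name" == "push" && -z "$head_tag" ]]; then
    echo "run"
  elif [[ "$event_name" == "pull_request" && "$foreign_pr" == "yes" ]]; then
    echo "run"
  else
    echo "skip"
  fi
}

should_build push "" "no"           # branch push, no tag at HEAD: prints "run"
should_build push "v1.0.1" "no"     # branch push, tag at HEAD: prints "skip"
should_build push "" "no"           # tag push (head_tag stays empty): prints "run"
should_build pull_request "" "yes"  # PR from a fork: prints "run"
should_build pull_request "" "no"   # PR from the same repository: prints "skip"
```

The tag push and the untagged branch push look identical to this function because the prepare job leaves head_tag empty for any ref outside refs/heads/.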

Note that the prepare job will run for every event that triggers this workflow. However, it is lightweight and should not run longer than a few seconds.

GitHub Actions: How to Avoid Running the Same Workflow Multiple Times