The Gatekeeper: Continuous Integration

Last updated on 2026-02-15

Overview

Questions

What happens if I forget to run the tests before pushing?
How do I ensure my code works on Windows, Linux, and macOS?
How do I automate uv in the cloud?

Objectives

Create a GitHub Actions workflow file that lints and tests on every push.
Configure the astral-sh/setup-uv action for cached, high-performance CI.
Define a test matrix to validate code across multiple Python versions and operating systems.
Connect CI to the release pipeline from the Release Engineering episode.

The Limits of Local Hooks

In the Quality Assurance episode, we installed prek to run ruff before every commit. That is a good first line of defence, but it has gaps:

A collaborator can bypass hooks with git commit --no-verify.
Hooks only run on your machine, with your operating system and Python version.
If it works on your MacBook but breaks on a colleague’s Linux cluster, you will not find out until they complain.

Continuous Integration (CI) closes these gaps by running your test suite on a neutral server every time code is pushed. It is the “gatekeeper” that protects the main branch.

Figure: Flowchart showing a developer pushing code, GitHub Actions running checkout, install, lint, and test steps, then either allowing or blocking the merge.

Anatomy of a Workflow File

GitHub Actions reads YAML files from .github/workflows/. Each file describes when to run (on), what machine to use (runs-on), and what commands to execute (steps).

Let’s create our gatekeeper. Start by making the directory:

BASH

mkdir -p .github/workflows

Now create .github/workflows/ci.yml with the following content. This mirrors exactly what we did locally in the Quality Assurance episode: lint, then test.

YAML

name: CI

on:
  push:
    branches: [main]
  pull_request:

jobs:
  check:
    name: Lint and Test
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Code
        uses: actions/checkout@v4

      - name: Install uv
        uses: astral-sh/setup-uv@v5
        with:
          enable-cache: true

      - name: Set up Python
        run: uv python install 3.12

      - name: Install Dependencies
        run: uv sync --all-extras --dev

      - name: Lint
        run: uv run ruff check .

      - name: Format Check
        run: uv run ruff format --check .

      - name: Test
        run: uv run pytest --cov=src
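
The step list behaves like a short script: each command runs in order, and by default the job stops at the first one that exits with a non-zero code. A minimal Python sketch of that behaviour follows, assuming the same four commands as the workflow. The names `run_steps` and `runner` are hypothetical, introduced here only for illustration; they are not part of GitHub Actions or uv.

```python
import subprocess
from typing import Callable, Sequence

# The same commands the workflow runs, in the same order.
STEPS: list[list[str]] = [
    ["uv", "sync", "--all-extras", "--dev"],
    ["uv", "run", "ruff", "check", "."],
    ["uv", "run", "ruff", "format", "--check", "."],
    ["uv", "run", "pytest", "--cov=src"],
]


def _run(command: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(command).returncode


def run_steps(
    steps: Sequence[Sequence[str]] = STEPS,
    runner: Callable[[Sequence[str]], int] = _run,
) -> int:
    """Run steps in order; stop at the first non-zero exit code."""
    for command in steps:
        code = runner(command)
        if code != 0:
            print("FAILED:", " ".join(command))
            return code
    return 0
```

Calling `run_steps()` from the project root would reproduce the CI sequence locally, provided uv is installed. Passing a different `runner` makes it possible to try the control flow without executing anything.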

Callout

The astral-sh/setup-uv action installs uv and (with enable-cache: true) caches the downloaded packages between runs. Later runs reuse that cache instead of downloading every package again, which shortens the install step.

Challenge: Reading the Workflow

Before we push anything, make sure you understand the structure. Answer the following:

1. Which event triggers this workflow on a pull request?
2. What operating system does the job run on?
3. Why do we use ruff format --check instead of ruff format?

Solution

1. The pull_request trigger (under on:). With no branches filter, it fires whenever a pull request is opened, reopened, or updated with new commits, whatever its target branch.
2. ubuntu-latest (a Linux virtual machine hosted by GitHub).
3. --check exits with an error if files would be reformatted, without actually modifying them. In CI we want to detect problems, not silently fix them. The developer should run ruff format locally and commit the result.
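
The difference between checking and fixing can be sketched in a few lines of Python. This is a toy illustration, not how ruff works internally: `normalize` is a hypothetical stand-in formatter that only strips trailing whitespace, and `format_check` and `format_fix` are names invented for this example.

```python
def normalize(source: str) -> str:
    """Hypothetical stand-in formatter: strip trailing whitespace per line."""
    return "\n".join(line.rstrip() for line in source.split("\n"))


def format_check(source: str) -> int:
    """Check mode: report through the exit code, never modify the input."""
    return 1 if normalize(source) != source else 0


def format_fix(source: str) -> str:
    """Fix mode: return the rewritten text."""
    return normalize(source)


messy = "x = 1   \ny = 2\n"
print(format_check(messy))              # 1: the text would be reformatted
print(format_check(format_fix(messy)))  # 0: already formatted
```

A non-zero exit code is what makes the CI step fail, so check mode turns a formatting problem into a visible red mark rather than a silent rewrite.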

The Test Matrix

The workflow above runs on one OS with one Python version. That is better than nothing, but one of the biggest risks in scientific Python is compatibility.

A script might work on Linux but fail on Windows due to path separators (/ vs \).
Code might work on Python 3.12 but fail on 3.11 because it uses a feature added in 3.12 (like type statement syntax).
A filename like aux.py is perfectly legal on Linux but reserved on Windows.
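
The reserved-name problem can be caught before it reaches a Windows runner. The sketch below is a hypothetical helper (`is_windows_reserved` is not part of any library), and the set it checks is the commonly cited list of device names; Microsoft's documentation is the authority on the full list.

```python
from pathlib import PurePath

# Commonly cited Windows reserved device names (matched case-insensitively).
RESERVED = {"CON", "PRN", "AUX", "NUL"} | {
    f"{prefix}{n}" for prefix in ("COM", "LPT") for n in range(1, 10)
}


def is_windows_reserved(filename: str) -> bool:
    """Return True if the name before the first dot is a reserved device name."""
    base = PurePath(filename).name.split(".")[0]
    return base.upper() in RESERVED


print(is_windows_reserved("aux.py"))        # True
print(is_windows_reserved("auxiliary.py"))  # False
```

The extension does not rescue the name: aux.py is treated like AUX, which is why the check looks only at the part before the first dot.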

A Matrix Strategy tells GitHub to run the same job across every combination of parameters. We define the axes (Python versions, operating systems) and GitHub spins up one runner per combination.

Replace the contents of your ci.yml with the version below. The lint, format, and test steps remain identical; the job header gains a strategy block, and the Python install step now takes its version from the matrix.

YAML

name: CI

on:
  push:
    branches: [main]
  pull_request:

jobs:
  check:
    name: Test on ${{ matrix.os }} / Py ${{ matrix.python-version }}
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        python-version: ["3.11", "3.12"]
        os: [ubuntu-latest, windows-latest, macos-latest]

    steps:
      - uses: actions/checkout@v4

      - uses: astral-sh/setup-uv@v5
        with:
          enable-cache: true

      - name: Install Python ${{ matrix.python-version }}
        run: uv python install ${{ matrix.python-version }}

      - name: Install Dependencies
        run: uv sync --all-extras --dev

      - name: Lint
        run: uv run ruff check .

      - name: Format Check
        run: uv run ruff format --check .

      - name: Test
        run: uv run pytest --cov=src

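Two Python versions times three operating systems gives six jobs, which GitHub runs in parallel. Because fail-fast is set to false, every job runs to completion even if another one fails, so you see all failures at once. If any job fails, the pull request shows a failing check; merging is actually blocked only if the repository's branch protection rules require these checks.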
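
The expansion of the matrix is a Cartesian product of its axes. As a rough illustration in Python (a model of the idea, not GitHub's implementation), the six job names can be generated like this:

```python
from itertools import product

python_versions = ["3.11", "3.12"]
operating_systems = ["ubuntu-latest", "windows-latest", "macos-latest"]

# One job per (Python version, OS) pair, named like the workflow's name: field.
jobs = [
    f"Test on {os_name} / Py {version}"
    for version, os_name in product(python_versions, operating_systems)
]

for job in jobs:
    print(job)
print(len(jobs))  # 6
```

Adding a third Python version to the list would raise the count to nine, which is worth keeping in mind because every extra axis value multiplies the number of runners used.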

Figure: Diagram showing a git push triggering six parallel CI jobs across two Python versions and three operating systems, all feeding into a merge decision.

Challenge: The Windows Path Bug

Consider the following line in chemlib:

PYTHON

data_path = "src/chemlib/data/file.txt"

1. Why might this cause problems on the windows-latest runner?
2. Rewrite it using pathlib so it works on all three operating systems.
3. Which episode’s key lesson does this reinforce?

Solution

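1. Windows uses the backslash (\) as its native path separator. Python's own file functions usually accept forward slashes on Windows, but a hardcoded string stops matching as soon as it is compared with, or spliced into, paths that the operating system or pathlib produced with backslashes.

2. Use pathlib.Path:

PYTHON

from pathlib import Path

# Join the components with /; pathlib renders the right separator for each OS.
data_path = Path("src") / "chemlib" / "data" / "file.txt"

3. The very first episode (Writing Reproducible Python), where we introduced pathlib for cross-platform file handling.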
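
You can see both renderings on any machine with pathlib's pure path classes, which model each platform's rules without touching the file system. The following sketch assumes nothing beyond the standard library:

```python
from pathlib import PurePosixPath, PureWindowsPath

parts = ("src", "chemlib", "data", "file.txt")

posix_path = PurePosixPath(*parts)
windows_path = PureWindowsPath(*parts)

print(posix_path)               # src/chemlib/data/file.txt
print(windows_path)             # src\chemlib\data\file.txt
print(windows_path.as_posix())  # src/chemlib/data/file.txt

# A naive string comparison against the hardcoded literal fails for the
# Windows rendering, while comparing path objects succeeds.
hardcoded = "src/chemlib/data/file.txt"
print(str(windows_path) == hardcoded)              # False
print(windows_path == PureWindowsPath(hardcoded))  # True
```

Comparing path objects instead of strings is what keeps tests that assert on file locations passing on all three runners.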

Connecting CI to Releases

In the Release Engineering episode, we manually uploaded artifacts to TestPyPI with uvx twine. We also previewed an automated release job. Now that we understand how workflows are structured, let’s see the complete picture.

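Add a second job to the same ci.yml file. This job runs only when you push a version tag (e.g., v0.1.0) and only after the test matrix passes. Note that a push trigger limited to branches: [main] ignores tag pushes, so also add tags: ["v*"] under on: push: next to the branches filter; otherwise the workflow never starts for a tag.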

YAML

# Add this job under the existing jobs: key, indented to the same level as check:
release:
  needs: check
  if: github.event_name == 'push' && startsWith(github.ref, 'refs/tags/v')
  runs-on: ubuntu-latest
  permissions:
    id-token: write
    contents: read

  steps:
    - uses: actions/checkout@v4
    - uses: astral-sh/setup-uv@v5

    - name: Build
      run: uv build

    - name: Publish to TestPyPI
      uses: pypa/gh-action-pypi-publish@release/v1
      with:
        repository-url: https://test.pypi.org/legacy/

Key details:

needs: check: The release job waits for all six matrix jobs to pass. A broken build is never published.
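id-token: write: Enables OIDC Trusted Publishing. GitHub proves its identity to PyPI directly, so you never need to store an API token as a secret. For this to work, the repository and workflow must first be registered as a Trusted Publisher in the project's settings on TestPyPI.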
The tag filter: Only tags starting with v (like v0.1.0) trigger the release. Normal pushes to main run tests but do not publish.
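
The combined effect of the if: condition and needs: check can be modelled as a single decision function. This is a sketch of the logic only: `should_release` and its `check_results` parameter are hypothetical names, and the result strings stand in for the outcomes of the six matrix jobs.

```python
def should_release(event_name: str, ref: str, check_results: list[str]) -> bool:
    """Mirror the release job's if: condition plus its needs: check gate."""
    # GitHub's == and startsWith() ignore case for strings, hence casefold().
    is_tag_push = (
        event_name.casefold() == "push"
        and ref.casefold().startswith("refs/tags/v")
    )
    # needs: check only lets the job run when every matrix job succeeded.
    all_passed = all(result == "success" for result in check_results)
    return is_tag_push and all_passed


six_green = ["success"] * 6
one_red = ["success"] * 5 + ["failure"]

print(should_release("push", "refs/heads/main", six_green))   # False
print(should_release("push", "refs/tags/v0.2.0", six_green))  # True
print(should_release("push", "refs/tags/v0.2.0", one_red))    # False
```

Both conditions must hold at once: a green test matrix on main publishes nothing, and a version tag on a commit with a failing job publishes nothing either.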

Challenge: The Release Workflow

Walk through the following scenario:

1. You merge a pull request to main. Does the release job run?
2. You tag the merge commit with git tag v0.2.0 and git push --tags. What happens now?
3. Imagine the windows-latest / Py 3.11 job fails. Does the release still happen?

Solution

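1. No. The if: condition requires the ref to start with refs/tags/v. A push to main has the ref refs/heads/main, which does not match.
2. The tag push starts the workflow, provided the push trigger includes a tags filter matching v*. All six matrix jobs run. If they pass, the release job runs: it builds the wheel and sdist, then publishes to TestPyPI via OIDC.
3. No. The needs: check dependency means the release job is skipped when any matrix job fails. The tag remains in place. If the failure was transient, re-run the failed jobs from the Actions tab; if it needs a code fix, commit the fix and push a new tag.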

Key Points

Continuous Integration runs your test suite on a neutral server on every push, catching problems that local hooks miss.
astral-sh/setup-uv provides a cached, high-performance uv environment in GitHub Actions.
A Matrix Strategy tests across multiple operating systems and Python versions in parallel.
CI can gate releases: the release job uses needs: to ensure tests pass before publishing.