GitHub Actions workflow timeline showing jobs running in parallel lanes with sequential steps, demonstrating parallel job execution and dependency control

GitHub Actions Explained: Complete Guide to Workflow Structure

Master GitHub Actions workflows with our comprehensive guide covering workflow structure, execution flow, optimization strategies, and performance monitoring. Learn how to build faster, more reliable CI/CD pipelines.

Understanding GitHub Actions Workflow Anatomy

GitHub Actions has become an indispensable tool for automating software development workflows. From CI/CD pipelines to scheduled tasks, it empowers developers to build, test, and deploy their code efficiently. But what exactly happens when a workflow is triggered?

This post dives deep into the anatomy of a GitHub Actions workflow run, exploring its structure and how its components interact to execute your automation.

Core Concepts: Workflows, Jobs, Steps, and Actions

Before we dissect a workflow run, let's clarify the key building blocks:

  • Workflow: A configurable automated process made up of one or more jobs. Workflows are defined by a YAML file checked into your repository (typically in the .github/workflows directory). They can be triggered by various events like pushes, pull requests, or schedules.

  • Job: A set of steps that execute on the same runner. By default, jobs run in parallel. You can also configure jobs to run sequentially if one job depends on the successful completion of another. Jobs run in isolated environments, so sharing data between jobs requires artifacts or outputs.

  • Step: An individual task that can run commands or an action. Steps within a job are executed in sequence and share the same filesystem, making it easy to pass data between steps using files or environment variables.

  • Action: A reusable piece of code that performs a specific task. Actions can be custom code written for your repository, sourced from the GitHub Marketplace, or from public repositories.

  • Runner: The virtual machine or container that executes your jobs. GitHub provides hosted runners (Ubuntu, Windows, macOS) or you can use self-hosted runners on your own infrastructure. Each job runs on a fresh runner instance.
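Putting these pieces together, a minimal workflow file (the filename `ci.yml` below is just a convention) shows all five concepts at once:

```yaml
# .github/workflows/ci.yml
name: CI                # the Workflow

on: [push]              # triggered by push events

jobs:
  test:                 # a Job: runs on its own fresh runner
    runs-on: ubuntu-latest        # the Runner
    steps:
      - uses: actions/checkout@v4 # an Action from the Marketplace
      - name: Run tests           # a Step running a shell command
        run: echo "Running tests..."
```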

The Lifecycle of a Workflow Run

When an event triggers a workflow (e.g., a push to the main branch), GitHub Actions initiates a workflow run. Each workflow run is a single instance of your workflow executing.

1. Workflow Run Initialization

GitHub reads the workflow YAML file associated with the triggering event. It then creates a new workflow run entry and prepares to execute jobs.

2. Job Execution: Parallelism and Dependencies

GitHub Actions organizes jobs into a directed acyclic graph (DAG) based on their dependencies, then executes this graph efficiently.

  • Job Dependencies (needs): You define the execution order of jobs using the needs keyword. If jobB needs jobA, jobB will only start after jobA has completed successfully. Jobs without dependencies can start immediately.

  • Parallel Execution: GitHub Actions starts all jobs that have no pending dependencies simultaneously, each on its own runner (a virtual machine or container). As jobs complete, any dependent jobs become eligible to start. This creates efficient parallel execution while respecting the dependency graph.

    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
          - name: Build application
            run: echo "Building..."
    
      test:
        runs-on: ubuntu-latest
        needs: build # test job will only run if build job succeeds
        steps:
          - name: Run tests
            run: echo "Testing..."
    
      deploy:
        runs-on: ubuntu-latest
        needs: [build, test] # deploy job needs both build and test to succeed
        steps:
          - name: Deploy application
            run: echo "Deploying..."
    
  • Conditional Execution (if): Jobs (and steps) can be made conditional using an if expression. This allows you to skip them based on conditions such as the branch name, event type, or the outcome of previous jobs.

    jobs:
      publish_docs:
        runs-on: ubuntu-latest
        if: github.ref == 'refs/heads/main' && github.event_name == 'push'
        steps:
          - name: Publish documentation
            run: echo "Publishing docs for main branch..."
    

    If a job is skipped because its if conditional evaluates to false, any jobs that need it are also skipped, unless they override this with a status check function such as if: always(), which runs regardless of upstream outcomes, or if: failure(), which runs only when an upstream job failed.
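For instance, a hypothetical cleanup job can be forced to run even when the job it needs fails or is skipped:

```yaml
jobs:
  integration_tests:
    runs-on: ubuntu-latest
    if: github.event_name == 'push'   # skipped entirely for other events
    steps:
      - run: echo "Running integration tests..."

  cleanup:
    runs-on: ubuntu-latest
    needs: integration_tests
    if: always()   # runs whether integration_tests succeeded, failed, or was skipped
    steps:
      - run: echo "Cleaning up..."
```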

  • Sharing Data Between Jobs: Since jobs run on separate runners, they can't directly share files or variables. To pass data between jobs, you have two main options:

    • Artifacts: Upload files in one job and download them in another using actions/upload-artifact and actions/download-artifact. Artifacts persist after the workflow run completes (default retention: 90 days; configurable from 1 to 90 days for public repositories and from 1 to 400 days for private ones) and can be shared across multiple jobs within the same workflow run.
    jobs:
      build:
        runs-on: ubuntu-latest
        steps:
          - name: Build app
            run: |
              mkdir -p dist
              echo "build output" > dist/app.js
          - name: Upload build artifacts
            uses: actions/upload-artifact@v4
            with:
              name: build-files
              path: dist/
      
      test:
        runs-on: ubuntu-latest
        needs: build
        steps:
          - name: Download build artifacts
            uses: actions/download-artifact@v4
            with:
              name: build-files
              path: dist/
          - name: Test app
            run: test -f dist/app.js
    
    • Job Outputs: Pass simple string values between jobs using the outputs keyword. The upstream job defines outputs, and downstream jobs can access them via the needs context.
    jobs:
      build:
        runs-on: ubuntu-latest
        outputs:
          version: ${{ steps.get_version.outputs.version }}
        steps:
          - name: Get version
            id: get_version
            run: echo "version=1.2.3" >> $GITHUB_OUTPUT
      
      deploy:
        runs-on: ubuntu-latest
        needs: build
        steps:
          - name: Deploy version
            run: echo "Deploying version ${{ needs.build.outputs.version }}"
    

3. Step Execution: Sequential and Atomic

Within each job, steps are executed sequentially. Each step must complete before the next one begins.

  • Sequential Order: The order of steps in your workflow YAML file dictates their execution order.

  • Atomicity (Generally): Each step is generally treated as an atomic unit. If a step fails (e.g., a command exits with a non-zero status code), the job typically stops immediately, and subsequent steps in that job are not executed. The job itself is then marked as failed.

    • You can control this behavior using continue-on-error: true for a step, which allows the job to proceed even if that specific step fails.
    jobs:
      example_job:
        runs-on: ubuntu-latest
        steps:
          - name: First step
            run: echo "This is the first step."
          - name: Second step (might fail)
            run: exit 1
            continue-on-error: true # Job continues even if this step fails
          - name: Third step
            run: echo "This step runs regardless of the second step's outcome."
    

4. Run Attempts: Handling Flakiness and Retries

Sometimes, workflows or jobs might fail due to transient issues like network glitches or temporary unavailability of external services. GitHub Actions provides a mechanism to handle such scenarios: run attempts.

  • Workflow Run Attempts: If an entire workflow run fails, you can manually re-run it from the GitHub UI. This creates a new attempt for the same workflow run ID.

  • Job Re-runs: More granularly, you can re-run individual failed jobs within a workflow run. This is particularly useful if only a small part of your workflow failed due to a transient issue.

  • Automatic Retries (via Actions): While GitHub Actions doesn't have a built-in top-level retry mechanism for jobs in the YAML definition, you can implement retry logic within your steps using shell commands or by leveraging community actions designed for retries. For example, a step could use a script that attempts an operation multiple times before failing.
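One common approach is a small retry loop inside a run step. The helper below is a hypothetical sketch in plain bash (the retry function name and its arguments are our own, not a GitHub Actions feature):

```shell
#!/usr/bin/env bash
# Hypothetical retry helper: run a command up to $1 times, sleeping $2 seconds
# between attempts. Succeeds as soon as the command succeeds; returns 1 if
# every attempt fails.
retry() {
  local attempts=$1 delay=$2 n=1
  shift 2
  until "$@"; do
    if [ "$n" -ge "$attempts" ]; then
      echo "Failed after $n attempts: $*" >&2
      return 1
    fi
    echo "Attempt $n failed; retrying in ${delay}s..." >&2
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example: retry a flaky command up to 3 times with a 5-second pause
retry 3 5 echo "fetching dependencies..."
```

In a workflow, the function and its invocation would live inside a multi-line `run: |` block.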

  • Concurrency and max-parallel: For matrix strategy jobs (which run the same job across multiple configurations like different OS versions or Node.js versions), you can use strategy.max-parallel to limit how many jobs run at once. While not a direct retry, it can help manage resources and potentially reduce failures caused by overwhelming external services.
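As a sketch, a matrix job limited to two concurrent runners might look like this:

```yaml
jobs:
  test:
    runs-on: ${{ matrix.os }}
    strategy:
      max-parallel: 2          # at most two matrix jobs run at once
      matrix:
        os: [ubuntu-latest, macos-latest]
        node: [18, 20, 22]     # 2 x 3 = 6 job configurations in total
    steps:
      - name: Run tests on Node ${{ matrix.node }}
        run: echo "Testing on ${{ matrix.os }} with Node ${{ matrix.node }}"
```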

  • Workflow-level concurrency: You can use concurrency settings at the workflow level to ensure that only one run of a workflow (or group of runs) executes at a time, or to cancel in-progress runs when a new one is triggered for the same concurrency group. This isn't a retry mechanism, but it helps manage execution flow and prevents issues caused by simultaneous conflicting runs.

    name: CI
    
    on:
      push:
        branches: [ main ]
      pull_request:
        branches: [ main ]
    
    concurrency:
      group: ${{ github.workflow }}-${{ github.ref }}
      cancel-in-progress: true # Cancels previous runs in the same group
    
    jobs:
      # ... your jobs ...
    

A "run attempt" refers to each execution instance of a workflow or a specific job within that workflow. If you re-run a failed job, it becomes a new attempt for that job under the same parent workflow run.
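Steps can inspect which attempt they are on via the github.run_attempt context value, which is useful for, say, enabling extra logging on retries:

```yaml
steps:
  - name: Report attempt number
    run: echo "This is attempt ${{ github.run_attempt }} of run ${{ github.run_id }}"
  - name: Enable verbose logging on retries
    if: github.run_attempt != '1'   # run_attempt is a string in the context
    run: echo "DEBUG=1" >> $GITHUB_ENV
```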

5. Artifacts and Caching

As workflows become more complex, you'll often need to share build outputs between jobs, preserve files for later use, or speed up repeated operations. GitHub Actions provides two key mechanisms for this: artifacts for sharing and storing files, and caching for optimizing performance.

  • Artifacts: Jobs can produce artifacts (files or collections of files) that can be shared between jobs in the same workflow run (using actions/upload-artifact and actions/download-artifact) or stored for later download. Artifacts from one job are typically uploaded at the end of that job and downloaded at the beginning of a dependent job. Use artifacts for build outputs, test results, or any files you need to preserve or share.

  • Caching: Workflows can cache dependencies and other files to speed up subsequent runs. This is managed via actions/cache. Use caching for dependencies (like node_modules, Python packages) or intermediate build files that can be reused across workflow runs to reduce execution time.

    - name: Cache dependencies
      uses: actions/cache@v4
      with:
        path: ~/.npm
        key: ${{ runner.os }}-node-${{ hashFiles('**/package-lock.json') }}
        restore-keys: |
          ${{ runner.os }}-node-
    

6. Workflow Run Completion

Once all jobs have completed (successfully, failed, or skipped), the overall workflow run is marked with a final status (e.g., Success, Failure, Cancelled). Notifications can be configured based on these outcomes.
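For example, a final job can use a status check function to send a notification only when something failed (the job names and the WEBHOOK_URL secret below are placeholders):

```yaml
jobs:
  # ... build, test, deploy ...
  notify:
    runs-on: ubuntu-latest
    needs: [build, test, deploy]
    if: failure()   # runs only if at least one needed job failed
    steps:
      - name: Send failure notification
        run: |
          curl -X POST -H 'Content-Type: application/json' \
            -d '{"text": "Workflow ${{ github.workflow }} failed on ${{ github.ref }}"}' \
            "${{ secrets.WEBHOOK_URL }}"
```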

Visualizing the Flow

The best way to understand workflow execution is to visualize it as a timeline with parallel job lanes:

GitHub Actions workflow timeline showing jobs running in parallel lanes with sequential steps, color-coded by type
  • Jobs run in parallel lanes unless dependencies are specified with needs.
  • Steps within each job run sequentially, one after another in their lane.
  • Dependencies create waiting periods: the Deploy job waits for both Build and Test to complete.

Optimizing Workflows with CI Vitals

Now that you understand how GitHub Actions workflows operate, let's explore how to measure and optimize their performance using CI Vitals — three core metrics that reveal the true health of your CI/CD pipeline:

  • WET (Workflow Execution Time): How long your workflows take to complete
  • NFR (Noise-to-Fix Ratio): What percentage of failures are infrastructure noise vs. real bugs
  • POT (Pipeline Overhead Time): How much time is wasted on queuing, retries, and inefficiencies

When Your CI Vitals Degrade: Diagnostic Guide

🏎️ High WET (Slow Workflows): Where to Look

Symptoms: Developers complaining about slow builds, long feedback loops, context switching while waiting for results.

GitHub Actions Components to Investigate:

  1. Job Dependencies (needs)

    • Look for unnecessary sequential dependencies
    • Check if jobs that could run in parallel are artificially chained
    • Identify the critical path (longest sequence of dependent jobs)
  2. Step Order Within Jobs

    • Move fast, likely-to-fail steps (linting, quick tests) to the beginning
    • Ensure expensive operations (builds, integration tests) run only after quick checks pass
  3. Caching Strategy

    • Check cache hit rates in your workflow logs
    • Look for repeated dependency downloads that should be cached
    • Verify cache keys are specific enough to be useful but not so specific they never hit
  4. Artifact Usage

    • Identify jobs that rebuild the same artifacts
    • Look for opportunities to build once and share via artifacts

Quick Wins:

  • Remove unnecessary needs dependencies
  • Add caching for dependencies and build outputs
  • Parallelize independent jobs
  • Use artifacts to avoid redundant builds

🎯 High NFR (Unreliable Pipeline): Where to Look

Symptoms: Frequent "Re-run failed jobs" clicks, developers losing trust in the pipeline, failures that resolve themselves on retry.

GitHub Actions Components to Investigate:

  1. Step Failure Patterns

    • Review failed step logs for infrastructure vs. code issues
    • Look for network timeouts, runner provisioning failures, external service unavailability
    • Identify flaky tests that pass/fail inconsistently
  2. Retry Logic

    • Check if steps that interact with external services have appropriate retry mechanisms
    • Look for missing error handling in custom scripts
    • Verify that retries are only used for transient failures, not test failures
  3. External Dependencies

    • Identify steps that depend on external services (package registries, APIs, databases)
    • Look for missing fallback strategies or timeout configurations
    • Check for rate limiting issues with external services
  4. Runner Environment Issues

    • Review logs for runner provisioning failures
    • Check for resource constraints (memory, disk space)
    • Look for conflicts between parallel jobs using shared resources

Quick Wins:

  • Add retry logic to network-dependent steps
  • Use continue-on-error: true for non-critical steps (artifact uploads, notifications)
  • Implement proper timeout configurations
  • Isolate flaky external dependencies
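Timeouts can be set at both the job and step level with timeout-minutes; a sketch:

```yaml
jobs:
  integration_tests:
    runs-on: ubuntu-latest
    timeout-minutes: 15        # fail the whole job if it runs longer than this
    steps:
      - name: Call external API
        timeout-minutes: 2     # fail just this step after 2 minutes
        run: echo "calling external service..."
```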

🗑️ High POT (Wasted Time): Where to Look

Symptoms: Workflows that feel slower than they should be, lots of waiting time, frequent manual re-runs.

GitHub Actions Components to Investigate:

  1. Queue Times

    • Check workflow run logs for time spent waiting for runners
    • Look for peak usage times when runners are scarce
    • Review concurrency settings that might be causing unnecessary queuing
  2. Cache Misses

    • Analyze cache hit/miss ratios in workflow logs
    • Look for cache keys that are too specific or change too frequently
    • Identify large downloads that happen repeatedly
  3. Redundant Work

    • Find jobs that perform similar operations (multiple builds, repeated setup)
    • Look for steps that could be combined or eliminated
    • Check for unnecessary file operations or data transfers
  4. Retry Overhead

    • Count how often workflows are manually re-run
    • Measure time spent on failed attempts that eventually succeed
    • Identify patterns in retry behavior

Quick Wins:

  • Optimize cache keys for better hit rates
  • Combine related steps to reduce overhead
  • Use artifacts to eliminate redundant builds
  • Implement smarter concurrency controls

Measuring Your CI Vitals

To effectively optimize your workflows, you need to track these metrics over time:

  • WET: Monitor p75 and p90 execution times for your critical workflows
  • NFR: Track the ratio of infrastructure failures to legitimate test failures
  • POT: Measure queue times, retry frequency, and cache miss rates
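As a starting point, you can pull recent run durations from the GitHub REST API (the repos/{owner}/{repo}/actions/runs endpoint exposes run_started_at and updated_at per run) and compute percentiles yourself. The helper below is a minimal sketch of just the percentile step in plain shell (the p75 function name is our own):

```shell
#!/usr/bin/env bash
# Minimal sketch: 75th percentile of duration samples (in seconds).
# Feed it durations derived from the workflow-runs API, e.g.
# (updated_at - run_started_at) for each completed run.
p75() {
  printf '%s\n' "$@" | sort -n | awk '
    { a[NR] = $1 }
    END {
      idx = int(NR * 0.75); if (idx < 1) idx = 1
      print a[idx]
    }'
}

# Example: durations (seconds) for five recent runs
p75 120 95 310 180 140   # prints 140
```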

Key Takeaways

Understanding GitHub Actions anatomy empowers you to build efficient, reliable CI/CD pipelines:

  • Workflow Structure: Jobs run in parallel by default, steps run sequentially within jobs, and needs controls dependencies
  • Performance Optimization: Use caching for dependencies, artifacts for job data sharing, and matrix strategies for parallelization
  • Reliability Best Practices: Implement continue-on-error for cleanup jobs, set timeouts, and use conditional execution
  • Monitoring & Measurement: Track CI Vitals (WET, NFR, POT) to measure pipeline health and distinguish real failures from infrastructure noise

Understanding both GitHub Actions anatomy and CI Vitals gives you the complete picture: how your workflows work and how well they're performing. Use this knowledge to build pipelines that boost developer productivity instead of hindering it.


Want automated CI Vitals tracking for your GitHub Actions workflows? Cimatic provides instant insights with zero configuration required.

Ready to Optimize Your GitHub Actions Workflows?

Get early access to Cimatic's CI Vitals analytics designed specifically for GitHub Actions optimization.

Join Waitlist

Kamil Chmielewski

Software engineer with 20+ years of experience optimizing CI/CD pipelines. Creator of Cimatic, helping engineering teams build faster, more reliable development workflows.

Tags:

#github actions #github workflows #ci/cd #workflow optimization #devops #pipeline performance #automation #developer productivity