CI/CD Pipelines That Actually Make You Faster

December 26, 2025 · ScaledByDesign

Tags: ci-cd, devops, pipelines, engineering

The 45-Minute Pipeline

Your engineer pushes code. CI starts. Forty-five minutes later, the build fails because of a flaky test that has nothing to do with their change. They re-run it. Another 45 minutes. It passes. They merge. The whole cycle took 2 hours for a 10-line change.

This isn't CI/CD. This is continuous waiting.

Why Pipelines Get Slow

Typical slow pipeline breakdown:
  Install dependencies:    3 min  (downloads the internet every time)
  Lint + type check:       4 min  (checks everything, not just changes)
  Unit tests:              8 min  (runs all 2,400 tests serially)
  Integration tests:      12 min  (spins up databases, waits for seeds)
  E2E tests:              15 min  (launches browser, flaky selectors)
  Build:                   5 min  (no caching, rebuilds everything)
  Deploy to staging:       3 min

  Total: ~50 minutes
  With flaky retry: ~95 minutes

The Fast Pipeline Architecture

Principle 1: Parallelize Everything

# BEFORE: Sequential pipeline (50 minutes)
steps:
  - install
  - lint
  - type-check
  - unit-tests
  - integration-tests
  - e2e-tests
  - build
  - deploy
 
# AFTER: Parallel pipeline (12 minutes)
steps:
  - install (cached, 30 seconds)
  - parallel:
    - lint + type-check (3 min)
    - unit-tests (3 min, parallelized across 4 runners)
    - build (2 min, cached)
  - integration-tests (4 min, only affected services)
  - e2e-tests (5 min, only critical paths)
  - deploy (1 min, pre-built artifact)
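
To see where the time goes, it helps to add up both outlines. The sketch below is only arithmetic on the stage timings listed above; the function names are made up for illustration. A sequential pipeline costs the sum of its stages, while a parallel step costs only as much as its slowest job.

```python
# Stage durations in minutes, taken from the slow breakdown and the AFTER outline.
SEQUENTIAL = {
    "install": 3, "lint + type-check": 4, "unit-tests": 8,
    "integration-tests": 12, "e2e-tests": 15, "build": 5, "deploy": 3,
}

# Each entry is one step; a step with several jobs runs them in parallel.
PARALLEL = [
    {"install": 0.5},
    {"lint + type-check": 3, "unit-tests": 3, "build": 2},
    {"integration-tests": 4},
    {"e2e-tests": 5},
    {"deploy": 1},
]


def sequential_minutes(stages: dict[str, float]) -> float:
    """Every stage waits for the previous one, so durations add up."""
    return sum(stages.values())


def parallel_minutes(steps: list[dict[str, float]]) -> float:
    """A parallel step takes as long as its slowest job."""
    return sum(max(jobs.values()) for jobs in steps)


print(sequential_minutes(SEQUENTIAL))  # 50
print(parallel_minutes(PARALLEL))      # 13.5
```

With the timings as listed, the critical path comes out at 13.5 minutes, in the same ballpark as the rounded 12-minute headline. Most of the win comes from caching and narrower test scope; running lint, unit tests, and build side by side removes another 5 minutes from that step.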

Principle 2: Cache Aggressively

# Cache node_modules based on lock file hash
- name: Cache dependencies
  uses: actions/cache@v4
  with:
    path: node_modules
    key: deps-${{ hashFiles('package-lock.json') }}
 
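# Cache build output; restore-keys falls back to the most recent
# cache when sources change, so the build stays incremental
- name: Cache Next.js build
  uses: actions/cache@v4
  with:
    path: .next/cache
    key: nextjs-${{ hashFiles('**/*.ts', '**/*.tsx') }}
    restore-keys: |
      nextjs-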
 
# Cache Jest's transform cache for faster test startup
# (assumes Jest's cacheDirectory is set to .jest-cache)
- name: Cache Jest transforms
  uses: actions/cache@v4
  with:
    path: .jest-cache
    key: tests-${{ hashFiles('src/**/*.test.ts') }}
    restore-keys: |
      tests-
 
# Impact: Install goes from 3 min to 30 sec
# Build goes from 5 min to 45 sec (incremental)
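
The idea behind a hash-based cache key is easy to show outside of CI. The sketch below is an illustration only, not how `hashFiles` is implemented; `cache_key` is a hypothetical helper. Identical file contents produce the same key (a cache hit), and any edit produces a new key (a miss, so the cache gets rebuilt).

```python
import hashlib
from pathlib import Path


def cache_key(prefix: str, *files: Path) -> str:
    """Build a cache key from the contents of the given files.

    Same contents give the same key; any edit to a listed file gives a new key.
    """
    digest = hashlib.sha256()
    for path in sorted(files):  # sort so argument order does not change the key
        digest.update(path.read_bytes())
    return f"{prefix}-{digest.hexdigest()[:16]}"
```

Calling `cache_key("deps", Path("package-lock.json"))` before and after an `npm install` that changes the lock file would yield two different keys, which is exactly when the dependency cache should be refreshed.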

Principle 3: Only Test What Changed

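# Determine which files changed (push events only; actions/checkout needs a
# fetch-depth that includes the previous commit, for example fetch-depth: 0)
- name: Get changed files
  id: changes
  run: |
    # Join names onto one line; a multi-line value breaks the name=value output format
    FILES=$(git diff --name-only ${{ github.event.before }} HEAD | tr '\n' ' ')
    echo "changed=$FILES" >> "$GITHUB_OUTPUT"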
 
# Only run backend tests if backend files changed
- name: Backend tests
  if: contains(steps.changes.outputs.changed, 'src/api/')
  run: npm run test:api
 
# Only run frontend tests if frontend files changed
- name: Frontend tests
  if: contains(steps.changes.outputs.changed, 'src/components/')
  run: npm run test:components
 
# Only run E2E if critical paths changed
- name: E2E tests
  if: |
    contains(steps.changes.outputs.changed, 'src/app/checkout') ||
    contains(steps.changes.outputs.changed, 'src/app/auth')
  run: npm run test:e2e
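
The same selection logic can live in a small script, which is easier to test than workflow conditions. This is a minimal sketch assuming the path layout and npm scripts from the workflow above; `SUITES`, `suites_to_run`, and the example file names are hypothetical.

```python
# Path prefix -> npm script, mirroring the conditions in the workflow above.
SUITES = {
    "src/api/": "test:api",
    "src/components/": "test:components",
    "src/app/checkout": "test:e2e",
    "src/app/auth": "test:e2e",
}


def suites_to_run(changed_files: list[str]) -> list[str]:
    """Return the npm scripts whose path prefix matches a changed file."""
    selected = []
    for prefix, script in SUITES.items():
        hit = any(path.startswith(prefix) for path in changed_files)
        if hit and script not in selected:
            selected.append(script)
    return selected


print(suites_to_run(["src/api/users.ts", "README.md"]))  # ['test:api']
print(suites_to_run(["src/app/checkout/cart.tsx"]))      # ['test:e2e']
print(suites_to_run(["docs/intro.md"]))                  # []
```

One design difference: this sketch matches on path prefix, while `contains()` in the workflow matches a substring anywhere in the list, so a file like `docs/src/api/notes.md` would trigger the backend tests there but not here.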

Principle 4: Kill Flaky Tests

A flaky test is worse than no test. It erodes trust in the
entire test suite. Engineers start ignoring failures.

Detection:
  Track test pass/fail rate over 30 days.
  Any test that fails > 2% of the time WITHOUT code changes
  is flaky.

Policy:
  1. Flaky test detected → Quarantine immediately
  2. Quarantined tests run separately (not blocking)
  3. Owner has 1 week to fix or delete the test
  4. If not fixed in 1 week, test is deleted

Dashboard:
  Total tests:          2,400
  Passing:              2,385 (99.4%)
  Quarantined (flaky):  12 (0.5%)
  Disabled:             3 (0.1%)
  Flaky rate trend:     ↓ (was 3.2% last month)
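
The detection rule can be sketched in a few lines. The record format here is an assumption: each run is `(test_name, commit_sha, passed)`, already filtered to the last 30 days. A failure only counts against a test if the same commit also produced a pass, since that failure cannot be blamed on a code change.

```python
from collections import defaultdict

FLAKY_THRESHOLD = 0.02  # "fails > 2% of the time", from the detection rule


def find_flaky(runs: list[tuple[str, str, bool]]) -> dict[str, float]:
    """Return {test name: unexplained failure rate} for tests above the threshold."""
    by_test = defaultdict(lambda: defaultdict(list))
    for test, sha, passed in runs:
        by_test[test][sha].append(passed)

    flaky = {}
    for test, by_sha in by_test.items():
        total = sum(len(results) for results in by_sha.values())
        # Failures on a commit that also passed cannot be blamed on the code.
        unexplained = sum(
            results.count(False)
            for results in by_sha.values()
            if True in results
        )
        rate = unexplained / total
        if rate > FLAKY_THRESHOLD:
            flaky[test] = rate
    return flaky
```

A test that fails on every run of a commit is treated as genuinely broken rather than flaky, so it stays in the blocking suite. Anything this function returns is a candidate for quarantine under the policy above.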

The Pipeline Stages

Stage 1: Fast Feedback (< 3 minutes)

Runs on EVERY push. Must be fast. Must be reliable.

  ✓ Linting (ESLint)
  ✓ Type checking (TypeScript)
  ✓ Unit tests (affected files only)
  ✓ Security scanning (dependency audit)

If this stage fails, the developer knows within 3 minutes.
They can fix it while the code is still in their head.

Stage 2: Thorough Testing (< 10 minutes)

Runs on PR creation and updates.

  ✓ Full unit test suite (parallelized)
  ✓ Integration tests (affected services)
  ✓ Build verification
  ✓ Bundle size check
  ✓ Performance benchmarks (compared to main)

This is the quality gate. PRs can't merge without green.

Stage 3: E2E and Deploy (< 10 minutes)

Runs after merge to main.

  ✓ Critical path E2E tests (login, checkout, core flows)
  ✓ Build production artifact
  ✓ Deploy to staging
  ✓ Smoke tests against staging
  ✓ Deploy to production (if smoke tests pass)

Not every code path needs E2E. Test the 5-10 critical
user journeys that generate revenue.
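
Put together, the three stages amount to a small routing table plus a time budget per stage. The sketch below assumes event names `push` and `pull_request` and stage names of my own choosing; a merge to main is modeled as a push to the `main` branch.

```python
# Time budgets in minutes, from the stage headings above.
BUDGET_MINUTES = {"fast-feedback": 3, "thorough-testing": 10, "e2e-and-deploy": 10}


def stages_for(event: str, branch: str) -> list[str]:
    """Pick which stages to run for a CI trigger."""
    stages = []
    if event == "push":
        stages.append("fast-feedback")        # Stage 1: every push
        if branch == "main":
            stages.append("e2e-and-deploy")   # Stage 3: after merge to main
    elif event == "pull_request":
        stages.append("thorough-testing")     # Stage 2: PR opened or updated
    return stages


def over_budget(durations: dict[str, float]) -> list[str]:
    """Return the stages whose measured duration is not under its budget."""
    return [stage for stage, minutes in durations.items()
            if minutes >= BUDGET_MINUTES[stage]]
```

Checking `over_budget` on every run is a cheap way to notice when Stage 1 creeps past three minutes before anyone starts complaining about it.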

Measuring Pipeline Health

Weekly Pipeline Metrics:

Speed:
  Median pipeline time:      8 min (target: < 10 min)
  P95 pipeline time:         14 min (target: < 20 min)
  Time to first feedback:    2 min (target: < 3 min)

Reliability:
  Pipeline success rate:     94% (target: > 95%)
  Flaky test rate:           0.8% (target: < 1%)
  False positive rate:       0.3% (target: < 0.5%)

Throughput:
  Deploys per day:           8 (target: > 5)
  Lead time (commit → prod): 45 min (target: < 1 hour)
  Rollback time:             3 min (target: < 5 min)
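
The speed and reliability numbers are simple to compute from raw run records. This is a sketch under an assumed record format of `(duration_minutes, succeeded)` per pipeline run; the percentile uses the nearest-rank method, which is one of several common definitions.

```python
import math
from statistics import median


def percentile(values: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def pipeline_health(runs: list[tuple[float, bool]]) -> dict[str, float]:
    """Summarize (duration_minutes, succeeded) records for one week."""
    durations = [minutes for minutes, _ in runs]
    successes = sum(1 for _, ok in runs if ok)
    return {
        "median_min": median(durations),
        "p95_min": percentile(durations, 95),
        "success_rate": successes / len(runs),
    }
```

Tracking P95 alongside the median matters because the slow tail is what engineers remember: a pipeline with an 8-minute median and a 40-minute P95 still feels slow.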

The ROI Math

Team of 8 engineers, 12 PRs per day:

Before (50-min pipeline):
  Wait time per PR: 50 min × 1.3 avg runs (incl. retries) = 65 min
  Daily wait time: 65 min × 12 PRs = 780 min (13 hours)
  Context switching cost: ~30 min per wait × 12 PRs = 6 hours/day
  Total daily cost: 19 engineer-hours wasted

After (12-min pipeline):
  Wait time per PR: 12 min × 1.05 avg runs (incl. retries) ≈ 13 min
  Daily wait time: 13 min × 12 PRs = 156 min (2.6 hours)
  Context switching cost: minimal (can stay focused)
  Total daily cost: 2.6 engineer-hours

  Savings: 16.4 engineer-hours per day
  At $100/hour loaded cost: $1,640/day × 260 working days ≈ $426,000/year

  Investment to fix: 2-3 weeks of engineering time (~$40K)
  ROI: 10x in year one
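
The arithmetic above is worth rerunning with your own numbers. The sketch below reproduces it; the 260 working days per year is an assumption (52 weeks × 5 days) that matches the annual figure, and the function name is made up for illustration.

```python
PRS_PER_DAY = 12
HOURLY_COST = 100    # loaded cost in dollars, from the figures above
WORKING_DAYS = 260   # assumption: 52 weeks x 5 days


def daily_cost_hours(pipeline_min, runs_per_pr, context_switch_min):
    """Engineer-hours lost per day to waiting plus context switching."""
    wait_per_pr = round(pipeline_min * runs_per_pr)  # whole minutes, as above
    wait_hours = wait_per_pr * PRS_PER_DAY / 60
    switch_hours = context_switch_min * PRS_PER_DAY / 60
    return wait_hours + switch_hours


before = daily_cost_hours(50, 1.3, 30)  # 13 h waiting + 6 h context switching
after = daily_cost_hours(12, 1.05, 0)   # context switching treated as zero
saved_hours = before - after
annual_savings = saved_hours * HOURLY_COST * WORKING_DAYS

print(f"before: {before:.1f} h/day, after: {after:.1f} h/day")
# before: 19.0 h/day, after: 2.6 h/day
print(f"saved: {saved_hours:.1f} h/day, ${annual_savings:,.0f}/year")
# saved: 16.4 h/day, $426,400/year
```

Swap in your own pipeline time, retry rate, and PR volume. Even with half the PR count, the savings would likely still dwarf a few weeks of engineering time.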

A fast, reliable pipeline isn't a nice-to-have — it's the highest-leverage investment in engineering productivity. Every minute you shave off the pipeline compounds across every engineer, every PR, every day. Fix the pipeline, and everything else gets faster.
