Lab: Detecting and Preventing Secret Leaks in CI/CD Pipelines

Overview

Secret leaks are among the most common causes of CI/CD pipeline compromise. Exposed credentials such as API keys, database passwords, and cloud access tokens give attackers a direct path into production systems. According to GitGuardian’s 2025 State of Secrets Sprawl report, over 12 million new secrets were detected in public GitHub commits in a single year.

The problem is not that developers are careless. It is that modern software delivery involves dozens of configuration files, environment variables, and integration points where secrets can accidentally end up in version control. A single leaked AWS key can cost an organization tens of thousands of dollars in minutes.

This hands-on lab walks you through setting up a multi-layer secret detection strategy covering three critical checkpoints:

Pre-commit scanning — Catch secrets before they ever reach the repository.
In-pipeline scanning — Block pull requests and pushes that contain secrets.
Post-commit and runtime scanning — Detect secrets that slip through and trigger remediation.

By the end of this lab, you will have a working defense-in-depth setup using gitleaks, truffleHog, GitHub secret scanning, and custom detection rules.

Prerequisites

Before starting, ensure you have the following:

Git 2.30+ installed and configured.
Python 3.8+ with pip available.
Docker (optional, but recommended for containerized scanning).
A GitHub account with access to create repositories (required for the GitHub secret scanning exercise).
A terminal (macOS, Linux, or WSL on Windows).

No real secrets or cloud accounts are needed. We will use intentionally planted test secrets throughout this lab.

Environment Setup

Step 1: Create a Test Repository

Start by creating a fresh Git repository that we will use for all exercises:

mkdir secret-leak-lab && cd secret-leak-lab
git init
echo "# Secret Leak Detection Lab" > README.md
git add README.md
git commit -m "Initial commit"

Step 2: Plant Intentional Test Secrets

We need realistic (but fake) secrets in various locations to simulate a real-world scenario. Create the following files:

A .env file with database credentials:

cat > .env <<'EOF'
DB_HOST=localhost
DB_PORT=5432
DB_USER=admin
DB_PASSWORD=SuperSecret123!
DB_NAME=production_db
SECRET_KEY=a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6
EOF

A Python script with a hardcoded AWS key:

cat > deploy.py <<'EOF'
import boto3

# WARNING: These are intentionally fake credentials for testing
AWS_ACCESS_KEY_ID = "AKIAIOSFODNN7EXAMPLE"
AWS_SECRET_ACCESS_KEY = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"

def deploy_to_s3(bucket, file_path):
    s3 = boto3.client(
        's3',
        aws_access_key_id=AWS_ACCESS_KEY_ID,
        aws_secret_access_key=AWS_SECRET_ACCESS_KEY
    )
    s3.upload_file(file_path, bucket, file_path)
    print(f"Deployed {file_path} to s3://{bucket}")

if __name__ == "__main__":
    deploy_to_s3("my-app-bucket", "dist/app.zip")
EOF

A YAML config with a database connection string:

cat > config.yml <<'EOF'
app:
  name: my-application
  environment: production

database:
  url: "postgresql://admin:SuperSecret123!@db.example.com:5432/prod"
  pool_size: 10

redis:
  url: "redis://:MyRedisPassword@cache.example.com:6379/0"
EOF

A Dockerfile with an API key in an ENV instruction:

cat > Dockerfile <<'EOF'
FROM python:3.11-slim

WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt

# WARNING: Never do this in production
ENV API_KEY=sk-proj-abc123def456ghi789jkl012mno345pqr678stu901vwx234
ENV STRIPE_SECRET_KEY=sk_live_4eC39HqLyjWDarjtT1zdp7dc

COPY . .
CMD ["python", "app.py"]
EOF

Step 3: Verify the “Before” State

At this point, nothing prevents these secrets from being committed:

git add -A
git status

Git happily stages everything, secrets included. There are no hooks, no scanning, and no guardrails. This is the state most repositories start in. Our goal is to change that.

# Reset staging so we can test scanning before committing
git reset HEAD

Exercise 1: Pre-commit Scanning with gitleaks

Gitleaks is an open-source tool designed to detect hardcoded secrets in Git repositories. It can scan both the working directory and the commit history, and it can run as a pre-commit hook to block secrets before they are committed.

Install gitleaks

Choose your preferred installation method:

# macOS (Homebrew)
brew install gitleaks

# Docker
docker pull zricethezav/gitleaks:latest

# Go (from source)
go install github.com/zricethezav/gitleaks/v8@latest

Verify the installation:

gitleaks version

Run gitleaks Manually

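Scan the working directory for secrets. The test files are not committed yet, so pass --no-git to scan the files on disk rather than the commit history (newer releases also offer gitleaks dir for the same purpose):

gitleaks detect --source . --no-git -v

You should see output similar to the following abridged example. Rule IDs, entropy values, and the summary counts vary with the gitleaks version and ruleset, and some releases may skip well-known documentation placeholders: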

Finding:     AWS_ACCESS_KEY_ID = "AKIAIOSFODNN7EXAMPLE"
Secret:      AKIAIOSFODNN7EXAMPLE
RuleID:      aws-access-key-id
Entropy:     3.52
File:        deploy.py
Line:        4

Finding:     AWS_SECRET_ACCESS_KEY = "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
Secret:      wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY
RuleID:      aws-secret-access-key
Entropy:     4.71
File:        deploy.py
Line:        5

Finding:     DB_PASSWORD=SuperSecret123!
Secret:      SuperSecret123!
RuleID:      generic-credential
Entropy:     3.40
File:        .env
Line:        4

Finding:     STRIPE_SECRET_KEY=sk_live_4eC39HqLyjWDarjtT1zdp7dc
Secret:      sk_live_4eC39HqLyjWDarjtT1zdp7dc
RuleID:      stripe-secret-key
Entropy:     4.20
File:        Dockerfile
Line:        9

12:14PM INF 6 commits scanned.
12:14PM WRN leaks found: 6

Gitleaks correctly identifies AWS keys, generic credentials, Stripe keys, and more. Each finding includes the file, line number, and the detection rule that triggered.
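The Entropy value in each finding is a randomness score: strings that look random score higher than dictionary words. As a rough sketch, assuming the score is a Shannon entropy measured in bits per character (the function name is illustrative, and gitleaks' exact calculation may differ), it could be computed like this:

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    # Sum -p * log2(p) over the relative frequency p of each distinct character.
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # Compare a dictionary word with the fake AWS values planted in deploy.py.
    for candidate in (
        "password",
        "AKIAIOSFODNN7EXAMPLE",
        "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    ):
        print(f"{candidate!r}: {shannon_entropy(candidate):.2f}")
```

A scanner can combine a score like this with a regex match to discard low-randomness strings that merely look like a key.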

Set Up gitleaks as a Pre-commit Hook

The pre-commit framework makes it easy to run gitleaks automatically before every commit. First, install pre-commit:

pip install pre-commit

Create a .pre-commit-config.yaml file in your repository root:

repos:
  - repo: https://github.com/gitleaks/gitleaks
    rev: v8.21.2
    hooks:
      - id: gitleaks

Install the hook:

pre-commit install

Test the Hook: Commit a Secret (Blocked)

git add deploy.py
git commit -m "Add deployment script"

The commit is blocked:

Detect hardcoded secrets.................................................Failed
- hook id: gitleaks
- exit code: 1

12:15PM WRN leaks found: 2

The pre-commit hook stops the commit entirely. The secret never makes it into the Git history.

Test the Hook: Commit Clean Code (Passes)

Create a file with no secrets:

cat > utils.py <<'EOF'
def format_date(date_obj):
    return date_obj.strftime("%Y-%m-%d")

def sanitize_input(user_input):
    return user_input.strip().replace("<", "&lt;").replace(">", "&gt;")
EOF

git add utils.py
git commit -m "Add utility functions"

Output:

Detect hardcoded secrets.................................................Passed
[main abc1234] Add utility functions
 1 file changed, 5 insertions(+)

Clean code passes through without issue.
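To make the blocked and passed outcomes above more concrete, here is a minimal sketch of what a pattern-based pre-commit check does. It is not how gitleaks is implemented: the two patterns, the rule names, and the helper functions are illustrative assumptions, and real rulesets are much larger.

```python
import re
from typing import Dict, List, Tuple

# Illustrative patterns only; these are assumptions, not gitleaks' actual rules.
PATTERNS = {
    "aws-access-key-id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "stripe-live-key": re.compile(r"\bsk_live_[0-9a-zA-Z]{24}\b"),
}


def scan_text(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, rule_id) for every line that matches a pattern."""
    findings = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for rule_id, pattern in PATTERNS.items():
            if pattern.search(line):
                findings.append((line_number, rule_id))
    return findings


def hook_exit_code(staged_files: Dict[str, str]) -> int:
    """Return 1 (block the commit) if any staged file has a finding, else 0."""
    blocked = False
    for path, content in staged_files.items():
        for line_number, rule_id in scan_text(content):
            print(f"{path}:{line_number}: {rule_id}")
            blocked = True
    return 1 if blocked else 0
```

A hook runner treats a non-zero exit code as a failure and aborts the commit, which is why deploy.py is rejected while utils.py goes through.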

Exercise 2: In-Pipeline Scanning with gitleaks in GitHub Actions

Pre-commit hooks are a strong first layer, but they run locally and can be bypassed. In-pipeline scanning adds a server-side enforcement layer that cannot be skipped.

Create the GitHub Actions Workflow

Create the file .github/workflows/secret-scan.yml with the following content:

name: Secret Scanning

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]

jobs:
  gitleaks:
    name: Detect Secrets with gitleaks
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Run gitleaks
        uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          GITLEAKS_LICENSE: ${{ secrets.GITLEAKS_LICENSE }}

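The fetch-depth: 0 parameter is critical: it ensures the full Git history is available so gitleaks can scan all commits, not just the latest one. The GITLEAKS_LICENSE secret is needed only for repositories owned by an organization; personal accounts can omit it.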

How It Works in Practice

Scenario A: A PR with a leaked secret. A developer accidentally adds an API key to a configuration file and opens a pull request. The gitleaks action scans the diff, detects the secret, and marks the check as failed. The PR cannot be merged until the secret is removed.

Scenario B: A clean PR. A developer opens a PR with application logic and no secrets. The gitleaks action scans the diff, finds nothing, and marks the check as passed. The PR can proceed to code review and merge.

To enforce this, go to your repository Settings → Branches → Branch protection rules and add the gitleaks job (listed under its job name, Detect Secrets with gitleaks) as a required status check for the main branch. This prevents anyone from merging a PR that fails the secret scan.
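One way to picture "scanning the diff" in the scenarios above: only the lines a change adds need to be checked, since removed lines no longer introduce anything new. The sketch below is a simplification under that assumption, not the gitleaks action's implementation, and the function name is hypothetical. It also does not handle the edge case of an added line whose content itself begins with "++ ".

```python
from typing import Iterator, Optional, Tuple


def added_lines(unified_diff: str) -> Iterator[Tuple[str, str]]:
    """Yield (path, line) for each line that a unified diff adds."""
    current_path: Optional[str] = None
    for raw in unified_diff.splitlines():
        if raw.startswith("+++ "):
            # File header such as "+++ b/config.yml"; drop the "b/" prefix if present.
            target = raw[4:].strip()
            current_path = target[2:] if target.startswith("b/") else target
        elif raw.startswith("+") and current_path is not None:
            # A "+" line inside a hunk is newly added content.
            yield current_path, raw[1:]
```

Each yielded line can then be passed to whatever detection rules the scanner applies, and any match fails the check.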

Exercise 3: In-Pipeline Scanning with truffleHog

TruffleHog takes a different approach to secret detection. In addition to pattern matching, it can verify whether detected secrets are actually live and active by testing them against the corresponding service’s API.

Install truffleHog

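# Install script (the PyPI package named trufflehog is the legacy v2 tool and lacks the commands used below)
curl -sSfL https://raw.githubusercontent.com/trufflesecurity/trufflehog/main/scripts/install.sh | sh -s -- -b /usr/local/bin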

# Docker
docker pull trufflesecurity/trufflehog:latest

# Homebrew
brew install trufflehog

Run truffleHog Against the Test Repo

# Scan the local repo
trufflehog git file://. --only-verified

The --only-verified flag tells truffleHog to report only secrets that it has confirmed are active, which dramatically reduces false positives. Because the secrets planted in this lab are fake, this command should report no verified findings. To see all findings, including unverified ones, omit the flag:

# Show all findings (verified and unverified)
trufflehog git file://.

Verified vs. Unverified: A verified secret is one that truffleHog has tested against the provider’s API and confirmed is active. For example, it will attempt to authenticate with an AWS key to see if it works. An unverified secret matches a known pattern but has not been confirmed as active — it could be a revoked key, a placeholder, or a false positive.

Create a GitLab CI Pipeline Job

TruffleHog integrates well into GitLab CI. Add the following to your .gitlab-ci.yml:

stages:
  - security

secret-scan:
  stage: security
  image:
    name: trufflesecurity/trufflehog:latest
    entrypoint: [""]
  script:
    - trufflehog git file://. --fail --json > trufflehog-results.json
  artifacts:
    when: always
    paths:
      - trufflehog-results.json
    expire_in: 30 days
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'
    - if: '$CI_COMMIT_BRANCH == "main"'

The --fail flag causes truffleHog to exit with a non-zero status code if secrets are found, which fails the pipeline. The --json flag outputs structured results that can be parsed by other tools or dashboards.
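As an illustration of parsing that output, the sketch below groups findings by verification status. It assumes the results file contains one JSON object per line and that each object carries Verified and DetectorName fields; check both assumptions against the output of your truffleHog version. The function name is hypothetical.

```python
import json
from typing import Dict, Iterable, List


def split_by_verification(json_lines: Iterable[str]) -> Dict[str, List[dict]]:
    """Group findings into 'verified' and 'unverified' buckets."""
    buckets: Dict[str, List[dict]] = {"verified": [], "unverified": []}
    for line in json_lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not valid JSON
        if not isinstance(record, dict):
            continue
        # Assumed field name: "Verified" is true when the secret was confirmed live.
        key = "verified" if record.get("Verified") else "unverified"
        buckets[key].append(record)
    return buckets
```

To use it, open trufflehog-results.json and pass the file object to split_by_verification; verified findings are the ones to remediate first.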

gitleaks vs. truffleHog: When to Use Which

Feature	gitleaks	truffleHog
Detection method	Regex + entropy	Regex + entropy + verification
Secret verification	No	Yes (tests if secret is live)
Speed	Very fast	Slower (due to verification)
False positive rate	Moderate	Low (with --only-verified)
Custom rules	Yes (.gitleaks.toml)	Yes (custom detectors)
Pre-commit support	Native	Via wrapper script
Best for	Fast pre-commit and PR checks	Deep scans and verification

Recommendation: Use gitleaks for fast pre-commit hooks and PR checks. Use truffleHog for periodic deep scans and when you need verified results to prioritize remediation.

Exercise 4: GitHub Secret Scanning and Push Protection

GitHub offers built-in secret scanning that works at the platform level. Unlike gitleaks and truffleHog, which you install and configure yourself, GitHub secret scanning is integrated directly into the repository settings.

Enable Secret Scanning

Go to your repository on GitHub.
Navigate to Settings → Code security and analysis.
Enable Secret scanning.
Enable Push protection.

Push protection is the key feature here. When enabled, GitHub will block any push that contains a recognized secret pattern before it reaches the repository.

Test Push Protection

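Attempt to push a commit containing a known secret pattern, such as an AWS access key or a GitHub personal access token. This assumes origin points at the GitHub repository you created for the lab:

# Stage and commit a file with a test secret.
# --no-verify skips the gitleaks hook from Exercise 1 so the commit can reach the push step.
git add deploy.py
git commit --no-verify -m "Add deploy script"
git push origin main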

GitHub blocks the push with a message like:

remote: error: GH013: Repository rule violations found for refs/heads/main.
remote:
remote: - GITHUB PUSH PROTECTION
remote:   —————————————————————————————————————————
remote:     Resolve the following violations before pushing again
remote:
remote:     — Push cannot contain secrets —
remote:
remote:
remote:      (?) To push, remove secret from commit(s) or follow this URL to allow the secret.
remote:
remote:      — Amazon AWS Access Key ID —
remote:        locations:
remote:          - commit: abc1234def5678
remote:            path: deploy.py:4
remote:
! [remote rejected] main -> main (push rule violations)
error: failed to push some refs

Handling False Positives

If GitHub flags a value that is not a real secret (for example, a test fixture or a documentation example), you can bypass push protection with a reason. GitHub provides a URL in the rejection message where you can:

Select a bypass reason: “It’s used in tests”, “It’s a false positive”, or “I’ll fix it later”.
Submit the bypass, which allows the push but logs the event for audit.

Organization administrators can see all bypasses in the Security → Secret scanning dashboard.

The Partner Program

GitHub partners with over 200 service providers (AWS, Stripe, Twilio, SendGrid, and others) through its secret scanning partner program. When a partner's secret is detected in a public repository:

GitHub notifies the service provider automatically.
The provider validates the credential and decides how to respond, typically by revoking it or contacting its owner.
The repository owner is notified via email and the Security tab.

This means that even if a secret slips past all other defenses and reaches a public repository, the damage window can be reduced to minutes through automatic revocation.

Exercise 5: Custom Secret Patterns

Default detection rules cover common providers (AWS, Stripe, GitHub, Google Cloud, etc.), but most organizations also have internal secrets with custom formats that standard tools will not detect. Gitleaks supports custom rules through a .gitleaks.toml configuration file.

Create a Custom gitleaks Configuration

Create a .gitleaks.toml file in the repository root:

[extend]
# Extend the default gitleaks configuration so the built-in rules stay active
useDefault = true

[[rules]]
id = "mycompany-api-key"
description = "MyCompany Internal API Key"
regex = '''MYCOMPANY-KEY-[A-Za-z0-9]{32}'''
tags = ["internal", "api-key"]
keywords = ["mycompany-key"]

# Rule-level allowlist: it must sit directly under the rule it applies to
[rules.allowlist]
regexes = [
  '''MYCOMPANY-KEY-EXAMPLE[A-Za-z0-9]{24}''',
  '''MYCOMPANY-KEY-TEST[A-Za-z0-9]{28}'''
]

[[rules]]
id = "internal-database-url"
description = "Internal Database Connection String"
regex = '''postgresql://[^:]+:[^@]+@internal-db\.[a-z0-9-]+\.corp\.[a-z]+\.com'''
tags = ["internal", "database"]
keywords = ["internal-db"]

[[rules]]
id = "internal-jwt-signing-key"
description = "Internal JWT Signing Key"
regex = '''JWT_SIGNING_KEY=[A-Za-z0-9+/=]{64,}'''
tags = ["internal", "jwt"]
keywords = ["jwt_signing_key"]

[allowlist]
description = "Global allowlist"
paths = [
  '''(.*?)test(.*?)\.py''',
  '''(.*?)_test\.go''',
  '''(.*?)spec(.*?)\.js''',
  '''(.*?)fixtures(.*?)''',
  '''README\.md'''
]

Understanding the Configuration

Custom rules: The [[rules]] sections define patterns specific to your organization. The regex field uses Go-compatible regular expressions. The keywords field helps gitleaks filter quickly: only content containing a keyword is checked against the full regex, which improves performance.
Global allowlist: The [allowlist] section defines paths that should be excluded from all scanning. Test files, fixtures, and documentation are common exclusions.
Rule-level allowlist: The [rules.allowlist] table defines exceptions for the single rule it is nested under, so it must appear directly after that rule and takes no id of its own. Here, we exclude known example keys that appear in documentation or test helpers.
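The interplay of keyword, regex, and rule-level allowlist described above can be sketched in a few lines. This is an approximation of the matching order, not gitleaks' code: the function name is hypothetical, path allowlists are left out, and Python's re module stands in for Go's regex engine (the patterns used here behave the same in both).

```python
import re
from typing import Optional

# Values copied from the mycompany-api-key rule in the configuration above.
KEYWORD = "mycompany-key"
RULE = re.compile(r"MYCOMPANY-KEY-[A-Za-z0-9]{32}")
ALLOWLIST = [
    re.compile(r"MYCOMPANY-KEY-EXAMPLE[A-Za-z0-9]{24}"),
    re.compile(r"MYCOMPANY-KEY-TEST[A-Za-z0-9]{28}"),
]


def find_internal_key(line: str) -> Optional[str]:
    """Return the matched key, or None if it is absent or allowlisted."""
    # Cheap, case-insensitive keyword pre-filter before running the full regex.
    if KEYWORD not in line.lower():
        return None
    match = RULE.search(line)
    if match is None:
        return None
    secret = match.group(0)
    # A match is suppressed when any allowlist pattern is found inside it.
    if any(allowed.search(secret) for allowed in ALLOWLIST):
        return None
    return secret
```

A line holding a 32-character key is reported, while keys that start with EXAMPLE or TEST are suppressed by the allowlist.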

Test the Custom Configuration

Create a file with a custom secret to test:

cat > internal-config.py <<'EOF'
MYCOMPANY_API_KEY = "MYCOMPANY-KEY-a1b2c3d4e5f6g7h8i9j0k1l2m3n4o5p6"
EOF

Run gitleaks with the custom config. The new file is not committed, so --no-git is used again to scan the files on disk:

gitleaks detect --source . --no-git --config .gitleaks.toml -v

The custom rule detects the internal API key that would have been missed by default rules.

Building a Defense-in-Depth Strategy

No single layer of secret detection is sufficient. Developers can skip pre-commit hooks. Pipeline scans only catch secrets at PR time. GitHub push protection only covers patterns from known providers. A robust strategy layers all of these together.

The Five Layers of Secret Defense

┌─────────────────────────────────────────────────────┐
│  Layer 1: Pre-commit Hook (gitleaks)                │
│  → Catches secrets before they enter local history  │
├─────────────────────────────────────────────────────┤
│  Layer 2: PR / Merge Request Scan (gitleaks action) │
│  → Blocks PRs that contain secrets                  │
├─────────────────────────────────────────────────────┤
│  Layer 3: Push Protection (GitHub / GitLab)         │
│  → Platform-level block on known secret patterns    │
├─────────────────────────────────────────────────────┤
│  Layer 4: Post-merge Scan (truffleHog scheduled)    │
│  → Weekly deep scan with verification               │
├─────────────────────────────────────────────────────┤
│  Layer 5: Runtime Monitoring (vault audit logs)     │
│  → Detect secret usage anomalies in production      │
└─────────────────────────────────────────────────────┘

Why No Single Layer Is Enough

Pre-commit hooks can be bypassed with git commit --no-verify or by using a Git client that does not support hooks.
Pipeline scans only run on branches that trigger CI — direct pushes to unprotected branches or force pushes may skip them.
Push protection detects only the patterns GitHub supports by default, mostly from known providers; custom internal secrets are not covered unless you define custom patterns for them.
Post-merge scans are reactive — they find secrets after they are already in the repository.

Each layer compensates for the weaknesses of the others. Together, they create a system where a secret would need to bypass all five layers to go undetected.

Combined Workflow: PR Scanning and Weekly Full Scan

Here is a combined GitHub Actions workflow that runs gitleaks on every PR and performs a full repository scan weekly:

name: Secret Scanning (Multi-Layer)

on:
  push:
    branches: [main, develop]
  pull_request:
    branches: [main]
  schedule:
    # Full repository scan every Monday at 6:00 AM UTC
    - cron: '0 6 * * 1'

jobs:
  # Layer 2: PR and push scanning
  gitleaks-pr-scan:
    name: gitleaks PR Scan
    runs-on: ubuntu-latest
    if: github.event_name == 'push' || github.event_name == 'pull_request'
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Run gitleaks (diff scan)
        uses: gitleaks/gitleaks-action@v2
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

  # Layer 4: Weekly deep scan with truffleHog
  trufflehog-full-scan:
    name: truffleHog Full Repository Scan
    runs-on: ubuntu-latest
    if: github.event_name == 'schedule'
    steps:
      - name: Checkout code
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

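      - name: Install truffleHog
        # The PyPI package named trufflehog is the legacy v2 tool; install v3 with the official script
        run: curl -sSfL https://raw.githubusercontent.com/trufflesecurity/trufflehog/main/scripts/install.sh | sudo sh -s -- -b /usr/local/bin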

      - name: Run truffleHog (full scan with verification)
        run: |
          trufflehog git file://. --fail --json > trufflehog-results.json || true
          if [ -s trufflehog-results.json ]; then
            echo "::error::Secrets detected in repository. See trufflehog-results.json."
            # The results file holds one JSON object per line, so parse it as JSON Lines
            python3 -m json.tool --json-lines trufflehog-results.json
            exit 1
          fi

      - name: Upload scan results
        if: always()
        uses: actions/upload-artifact@v4
        with:
          name: trufflehog-results
          path: trufflehog-results.json
          retention-days: 90

This workflow ensures that every code change is scanned in real time, and the entire repository history is audited weekly for secrets that may have been missed.

Cleanup

After completing the lab, remove the test secrets and reset the repository:

# Remove test files with secrets
rm -f .env deploy.py config.yml Dockerfile internal-config.py

# Remove test configurations (optional — keep if you want to reuse them)
# rm -f .gitleaks.toml .pre-commit-config.yaml

# Commit the cleanup
git add -A
git commit -m "Remove test secrets from lab exercises"

# If you want to completely remove secrets from Git history,
# use git-filter-repo (more thorough than git filter-branch).
# --force is needed here because this repository is not a fresh clone.
pip install git-filter-repo
git filter-repo --force --invert-paths --path deploy.py --path .env --path config.yml --path Dockerfile --path internal-config.py

Important: Simply deleting files does not remove them from Git history. Anyone with access to the repository can still find the secrets in past commits. Use git-filter-repo to rewrite history and permanently remove sensitive files. After rewriting history, force-push to the remote and have all collaborators re-clone the repository. Treat any real secret that was ever committed as compromised: revoke and rotate it, because rewriting history does not reach existing clones, forks, or caches.
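To confirm that a file is really gone from history, you can ask Git for every commit on any ref that touched it. The helper below is a small sketch: the function names are hypothetical, and it assumes git is on the PATH and the repository has at least one commit.

```python
import subprocess
from typing import List


def history_command(path: str) -> List[str]:
    """Build a git command listing every commit, on any ref, that touched `path`."""
    return ["git", "log", "--all", "--format=%H", "--", path]


def commits_touching(path: str, repo_dir: str = ".") -> List[str]:
    """Return the commit hashes that still reference `path`; empty means none remain."""
    result = subprocess.run(
        history_command(path),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git itself fails, e.g. outside a repository
    )
    return result.stdout.split()
```

Running commits_touching("deploy.py") before and after the history rewrite shows whether any commits still contain the file.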

Key Takeaways

Leaked secrets are a leading attack vector for pipeline compromise. A single leaked credential can give an attacker full access to production infrastructure.
Pre-commit hooks with gitleaks provide the fastest feedback loop. Developers are alerted immediately, before the secret ever enters Git history.
In-pipeline scanning is a mandatory enforcement layer. When configured as a required status check, it cannot be skipped by individual developers and ensures that no PR with secrets is merged.
TruffleHog’s verification capability reduces false positives dramatically. Use it for scheduled deep scans where accuracy matters more than speed.
GitHub push protection and the partner program add platform-level defense that works without any configuration in your CI/CD pipelines.
Custom detection rules are essential for catching organization-specific secrets that standard tools will miss. Invest time in writing rules for your internal key formats.
Defense in depth is the only reliable strategy. No single tool or layer catches everything. Combine pre-commit, in-pipeline, push protection, scheduled scans, and runtime monitoring for comprehensive coverage.

Next Steps

Now that you can detect and prevent secret leaks, the next step is to eliminate hardcoded secrets entirely by adopting proper secrets management:

Secrets Management in CI/CD Pipelines — Learn how to use HashiCorp Vault, AWS Secrets Manager, and native CI/CD secret stores to inject secrets at runtime without ever storing them in code.
Short-Lived Credentials and Workload Identity Federation — Eliminate long-lived secrets entirely by using OIDC-based workload identity federation to authenticate pipelines to cloud providers without any stored credentials.