Introduction: Why Manual Security Reviews Don’t Scale
Every engineering team eventually hits the same wall: security reviews that depend on human eyeballs cannot keep pace with the velocity of modern CI/CD pipelines. When teams deploy dozens or hundreds of times per day, asking a security engineer to manually review every Terraform plan, Kubernetes manifest, or Dockerfile becomes a bottleneck that either slows delivery to a crawl or gets bypassed entirely.
The consequences are predictable. Misconfigurations slip through. Containers run as root. Base images drift to unpatched versions. Terraform provisions publicly accessible S3 buckets. These are not exotic zero-day vulnerabilities — they are known-bad patterns that could be caught automatically if we had a systematic way to express and enforce security rules.
This is where Policy as Code enters the picture. Instead of embedding security checks as fragile shell scripts scattered across pipeline definitions, Policy as Code treats security rules as first-class artifacts: declarative, version-controlled, testable, and enforceable at every stage of the CI/CD lifecycle.
In this guide, we will explore how to use the Open Policy Agent (OPA) and its policy language Rego to build automated, auditable security gates in your CI/CD pipelines — gates that scale with your deployment velocity instead of against it.
What Is Policy as Code?
Policy as Code is a methodology for defining, managing, and enforcing rules using code rather than manual processes or ad-hoc scripts. At its core, it involves writing declarative rules that are evaluated against structured data to produce decisions — allow, deny, or warn.
Core Concepts
- Declarative rules evaluated against structured data: Policies describe what must be true, not how to check it. A policy engine receives structured input (JSON, YAML) and evaluates rules against it to produce a decision.
- Separation of policy from pipeline logic: Policies live in their own repositories, maintained by security or platform teams. Pipeline definitions reference policies but do not contain them. This separation of concerns means policy changes do not require pipeline changes, and vice versa.
- Version-controlled, testable, reviewable: Because policies are code, they go through the same lifecycle as application code — pull requests, code reviews, automated testing, and versioned releases.
- Auditable by design: Every policy evaluation produces a decision with a clear trace of what was evaluated, which rules matched, and why. This is essential for compliance and incident response.
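For example, Conftest's JSON output records, for each file evaluated, which policy namespace ran and every violation message. The shape below is abridged and field names may differ slightly across versions:

```json
[
  {
    "filename": "k8s/deployment.yaml",
    "namespace": "k8s.images",
    "successes": 1,
    "failures": [
      {
        "msg": "Container 'app' uses the 'latest' tag — pin to a specific version"
      }
    ]
  }
]
```

Archiving these records alongside build logs gives auditors a per-deployment trail of exactly which rules were evaluated and what they found.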
How It Differs from Shell Script Checks
Many teams start with shell scripts in their pipelines — a grep for “latest” in a Dockerfile, a jq query against a Terraform plan. These work initially but fall apart quickly:
- Shell scripts are imperative and brittle — a minor format change breaks them.
- They lack composability — combining multiple checks requires orchestration logic.
- They produce inconsistent output — no standard format for violations or warnings.
- They are hard to test in isolation.
- They cannot be easily shared across teams or pipelines.
Policy as Code solves all of these problems by providing a structured, declarative framework with a dedicated evaluation engine.
OPA and Rego Fundamentals
The Open Policy Agent (OPA) is a general-purpose, open-source policy engine maintained by the Cloud Native Computing Foundation (CNCF). It decouples policy from the services that need to enforce it, providing a single framework for policy across the stack — from Kubernetes admission control to CI/CD gates to API authorization.
How OPA Works
OPA follows a simple model: input → policy → decision.
- Input: Structured data (JSON) representing the thing being evaluated — a Kubernetes manifest, a Terraform plan, a Dockerfile parse tree, or a pipeline configuration.
- Policy: One or more Rego files defining rules that evaluate the input.
- Decision: A JSON result indicating whether the input complies, and if not, why.
Rego Syntax Basics
Rego is OPA’s purpose-built policy language. It is declarative, meaning you describe conditions rather than writing step-by-step logic. Key building blocks include:
- Packages: Namespace policies logically (e.g., package cicd.docker).
- Rules: Named expressions that evaluate to true or produce values.
- Imports: Reference data from the input or external data sources.
A Simple Example: Deny the “latest” Tag
Let’s start with a common security rule: deny any Kubernetes deployment that uses the latest image tag, since it makes builds non-reproducible and hides the actual version running in production.
# policy/k8s/deny_latest_tag.rego
package k8s.images
deny[msg] {
    container := input.spec.template.spec.containers[_]
    endswith(container.image, ":latest")
    msg := sprintf("Container '%s' uses the 'latest' tag — pin to a specific version", [container.name])
}

deny[msg] {
    container := input.spec.template.spec.containers[_]
    not contains(container.image, ":")
    msg := sprintf("Container '%s' has no tag specified (defaults to 'latest') — pin to a specific version", [container.name])
}
This policy iterates over all containers in a Kubernetes deployment spec and generates a denial message if the image uses :latest or has no tag at all.
Running OPA Locally
You can evaluate this policy locally with the OPA CLI:
# Save a sample input
cat > input.json <<'EOF'
{
  "spec": {
    "template": {
      "spec": {
        "containers": [
          {"name": "app", "image": "myregistry/app:latest"},
          {"name": "sidecar", "image": "envoyproxy/envoy:v1.28.0"}
        ]
      }
    }
  }
}
EOF
# Evaluate the policy
opa eval --input input.json --data policy/ "data.k8s.images.deny"
The output will include the denial message for the app container using :latest, while the sidecar container with a pinned version passes cleanly.
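Abridged, the default JSON result looks roughly like this — the deny set is serialized as an array under the expression's value:

```json
{
  "result": [
    {
      "expressions": [
        {
          "value": [
            "Container 'app' uses the 'latest' tag — pin to a specific version"
          ],
          "text": "data.k8s.images.deny",
          "location": {"row": 1, "col": 1}
        }
      ]
    }
  ]
}
```

An empty value array means no deny rule matched, which is the signal a pipeline step can key off.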
CI/CD Use Cases for OPA
OPA is not limited to Kubernetes. Its input-agnostic design makes it useful wherever you need to validate structured data against rules. Here are the most impactful CI/CD use cases.
Validating Kubernetes Manifests Before Deployment
Catch misconfigurations before they reach the cluster: missing resource limits, privileged containers, host network access, missing security contexts, or non-compliant labels.
# policy/k8s/deny_privileged.rego
package k8s.security
deny[msg] {
    container := input.spec.template.spec.containers[_]
    container.securityContext.privileged == true
    msg := sprintf("Container '%s' must not run in privileged mode", [container.name])
}

deny[msg] {
    container := input.spec.template.spec.containers[_]
    not container.resources.limits
    msg := sprintf("Container '%s' must define resource limits", [container.name])
}
Enforcing Dockerfile Best Practices
Using tools like conftest with a Dockerfile parser (such as hadolint's JSON output or dockerfile-json), you can enforce rules like no running as root and pinned base images:
# policy/docker/best_practices.rego
package docker
deny[msg] {
    command := input.stages[_].commands[_]
    command.cmd == "user"
    command.value == "root"
    msg := "Dockerfile must not explicitly set USER to root"
}

deny[msg] {
    stage := input.stages[_]
    stage.base_image
    not contains(stage.base_image, "@sha256:")
    not regex.match(`:.+$`, stage.base_image)
    msg := sprintf("Base image '%s' must be pinned to a tag or digest", [stage.base_image])
}
Checking Terraform Plans for Security Violations
Convert a Terraform plan to JSON with terraform show -json tfplan, then validate it against security policies:
# policy/terraform/aws_security.rego
package terraform.aws
deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_s3_bucket"
    resource.change.after.acl == "public-read"
    msg := sprintf("S3 bucket '%s' must not be publicly readable", [resource.address])
}

deny[msg] {
    resource := input.resource_changes[_]
    resource.type == "aws_security_group_rule"
    resource.change.after.cidr_blocks[_] == "0.0.0.0/0"
    resource.change.after.type == "ingress"
    msg := sprintf("Security group rule '%s' must not allow ingress from 0.0.0.0/0", [resource.address])
}
Validating Pipeline Configurations
You can even enforce rules on the pipeline definitions themselves — ensuring that every pipeline includes required steps like SAST scanning, secret detection, or image signing:
# policy/pipeline/required_steps.rego
package pipeline
required_jobs := {"sast-scan", "secret-detection", "image-sign"}

missing_jobs[job] {
    job := required_jobs[_]
    not job_exists(job)
}

job_exists(name) {
    input.jobs[name]
}

deny[msg] {
    count(missing_jobs) > 0
    msg := sprintf("Pipeline is missing required security jobs: %v", [missing_jobs])
}
Enforcing Branch Protection and Approval Policies
Validate that changes to production branches come with the required approvals and pass mandatory checks before merge. OPA can evaluate GitHub or GitLab webhook payloads or API responses to enforce these policies programmatically.
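As a sketch, a rule over a hypothetical GitHub-style payload might count approved reviews before allowing a merge into main. The field names (input.pull_request, input.reviews) are assumptions here — adjust the paths to match the actual webhook or API response your provider sends:

```rego
package scm.approvals

required_approvals := 2

# Hypothetical input shape mirroring a GitHub-style PR payload;
# adjust field paths to your SCM provider's actual response.
deny[msg] {
    input.pull_request.base.ref == "main"
    approvals := count([r | r := input.reviews[_]; r.state == "APPROVED"])
    approvals < required_approvals
    msg := sprintf("PR into 'main' has %d of %d required approvals", [approvals, required_approvals])
}
```

A pipeline step or merge-check bot can fetch the review data, feed it to OPA, and block the merge when the deny set is non-empty.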
Integrating OPA into CI/CD Pipelines
The most ergonomic way to integrate OPA into CI/CD is through Conftest, a testing tool built on top of OPA specifically designed for validating structured configuration files. It understands YAML, JSON, HCL, Dockerfile, and many other formats out of the box.
GitHub Actions: OPA with Conftest
# .github/workflows/policy-check.yml
name: Policy Checks

on:
  pull_request:
    branches: [main]

jobs:
  validate-kubernetes:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Install Conftest
        run: |
          LATEST=$(wget -qO- "https://api.github.com/repos/open-policy-agent/conftest/releases/latest" | jq -r '.tag_name' | sed 's/v//')
          wget -q "https://github.com/open-policy-agent/conftest/releases/download/v${LATEST}/conftest_${LATEST}_Linux_x86_64.tar.gz"
          tar xzf conftest_${LATEST}_Linux_x86_64.tar.gz
          sudo mv conftest /usr/local/bin/

      - name: Validate Kubernetes manifests
        run: |
          conftest test k8s/*.yaml \
            --policy policy/k8s/ \
            --output json \
            --all-namespaces

  validate-terraform:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3

      - name: Install Conftest
        run: |
          LATEST=$(wget -qO- "https://api.github.com/repos/open-policy-agent/conftest/releases/latest" | jq -r '.tag_name' | sed 's/v//')
          wget -q "https://github.com/open-policy-agent/conftest/releases/download/v${LATEST}/conftest_${LATEST}_Linux_x86_64.tar.gz"
          tar xzf conftest_${LATEST}_Linux_x86_64.tar.gz
          sudo mv conftest /usr/local/bin/

      - name: Generate Terraform plan JSON
        run: |
          cd terraform/
          terraform init
          terraform plan -out=tfplan
          terraform show -json tfplan > tfplan.json

      - name: Validate Terraform plan
        run: |
          conftest test terraform/tfplan.json \
            --policy policy/terraform/ \
            --output json
GitLab CI: Conftest in a CI Job
# .gitlab-ci.yml
stages:
  - validate
  - build
  - deploy

policy-check-k8s:
  stage: validate
  image:
    name: openpolicyagent/conftest:latest
    entrypoint: [""]
  script:
    - conftest test k8s/*.yaml
      --policy policy/k8s/
      --output json
      --all-namespaces
  rules:
    - changes:
        - k8s/**/*
        - policy/k8s/**/*

policy-check-terraform:
  stage: validate
  image:
    name: hashicorp/terraform:latest
    entrypoint: [""]
  before_script:
    - apk add --no-cache wget
    - wget -q https://github.com/open-policy-agent/conftest/releases/download/v0.50.0/conftest_0.50.0_Linux_x86_64.tar.gz
    - tar xzf conftest_0.50.0_Linux_x86_64.tar.gz
    - mv conftest /usr/local/bin/
  script:
    - cd terraform/
    - terraform init
    - terraform plan -out=tfplan
    - terraform show -json tfplan > tfplan.json
    - conftest test tfplan.json --policy ../policy/terraform/
  rules:
    - changes:
        - terraform/**/*
        - policy/terraform/**/*
Conftest vs Raw OPA CLI: When to Use Which
- Use Conftest when: You are validating configuration files (YAML, JSON, HCL, Dockerfiles) in CI/CD. Conftest handles file parsing, provides structured output formats, supports multiple file types, and follows established conventions (deny, warn, and violation rules).
- Use the raw OPA CLI when: You need to evaluate policies against custom JSON input, integrate OPA as a sidecar or daemon for runtime decisions, work with OPA bundles, or need the full OPA API (partial evaluation, profiling, etc.).
For most CI/CD security gate use cases, Conftest is the right choice. It reduces boilerplate and integrates cleanly into pipeline steps.
Writing Effective Rego Policies
Writing Rego policies that are maintainable, debuggable, and useful in practice requires following established patterns and conventions.
Deny-by-Default vs Allow-by-Default
There are two fundamental approaches:
- Allow-by-default with explicit denials: Everything is allowed unless a deny rule matches. This is the standard Conftest convention and works well for CI/CD gates where you want to catch specific known-bad patterns.
- Deny-by-default: Nothing is allowed unless the input explicitly matches an allow rule.
For maximum security, some organizations use a strict deny-by-default model where the input must explicitly match an allow rule or it is rejected. This is more appropriate for admission control than CI/CD gates.
# Allow-by-default with explicit denials (common for CI/CD — catches specific violations)
package k8s.images

deny[msg] {
    # Explicitly deny known-bad patterns
    container := input.spec.template.spec.containers[_]
    endswith(container.image, ":latest")
    msg := sprintf("Container '%s' uses ':latest' tag", [container.name])
}
# Strict deny-by-default (everything not explicitly allowed is denied)
package k8s.registries

allowed_registries := {
    "gcr.io/my-project",
    "us-docker.pkg.dev/my-project",
}

deny[msg] {
    container := input.spec.template.spec.containers[_]
    image := container.image
    not image_from_allowed_registry(image)
    msg := sprintf("Container '%s' uses image '%s' from a non-approved registry", [container.name, image])
}

image_from_allowed_registry(image) {
    registry := allowed_registries[_]
    startswith(image, registry)
}
Generating Meaningful Violation Messages
A policy that says “violation detected” is nearly useless. Good violation messages should tell the engineer what is wrong, where in the configuration it occurs, and ideally how to fix it:
deny[msg] {
    container := input.spec.template.spec.containers[_]
    not container.securityContext.runAsNonRoot
    msg := sprintf(
        "Container '%s' must set securityContext.runAsNonRoot to true. See: https://wiki.internal/policies/container-security#non-root",
        [container.name]
    )
}
Policy Testing with opa test
Rego policies should be tested just like application code. OPA includes a built-in test framework:
# policy/k8s/deny_latest_tag_test.rego
package k8s.images
test_deny_latest_tag {
    result := deny with input as {
        "spec": {"template": {"spec": {"containers": [
            {"name": "app", "image": "nginx:latest"}
        ]}}}
    }
    count(result) == 1
    contains(result[_], "latest")
}

test_allow_pinned_tag {
    result := deny with input as {
        "spec": {"template": {"spec": {"containers": [
            {"name": "app", "image": "nginx:1.25.3"}
        ]}}}
    }
    count(result) == 0
}

test_deny_no_tag {
    result := deny with input as {
        "spec": {"template": {"spec": {"containers": [
            {"name": "app", "image": "nginx"}
        ]}}}
    }
    count(result) == 1
}
Run tests with:
opa test policy/ -v
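With the three tests above, opa test -v prints one line per test plus a summary, along these lines (timings will vary):

```
policy/k8s/deny_latest_tag_test.rego:
data.k8s.images.test_deny_latest_tag: PASS (1.1ms)
data.k8s.images.test_allow_pinned_tag: PASS (0.6ms)
data.k8s.images.test_deny_no_tag: PASS (0.5ms)
--------------------------------------------------------------------------------
PASS: 3/3
```

Run this in CI on the policy repository itself, so a broken rule never reaches the pipelines that depend on it.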
Organizing Policies by Domain
A clean policy repository structure makes policies discoverable and maintainable:
policy/
├── k8s/
│   ├── deny_latest_tag.rego
│   ├── deny_latest_tag_test.rego
│   ├── deny_privileged.rego
│   ├── deny_privileged_test.rego
│   ├── require_labels.rego
│   └── require_labels_test.rego
├── terraform/
│   ├── aws_security.rego
│   ├── aws_security_test.rego
│   ├── gcp_security.rego
│   └── gcp_security_test.rego
├── docker/
│   ├── best_practices.rego
│   └── best_practices_test.rego
└── pipeline/
    ├── required_steps.rego
    └── required_steps_test.rego
Managing Policy Bundles
For organizations with many teams, distributing policies as OPA bundles is the recommended approach. Bundles are versioned tarballs of Rego files and data that can be hosted on any HTTP server, OCI registry, or cloud storage:
# Build a bundle
opa build -b policy/ -o bundle.tar.gz
# Push to an OCI registry
conftest push myregistry.io/policies/security:v1.2.0
# Pull and use in a pipeline
conftest pull myregistry.io/policies/security:v1.2.0
conftest test k8s/*.yaml --policy policy/
This approach allows security teams to publish policies centrally while application teams consume specific versions, and enables controlled rollouts of new policy versions.
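A bundle can also carry a .manifest file at its root recording the revision and which data namespaces the bundle owns, which makes version pinning explicit. A minimal sketch (the revision string is free-form; the roots shown assume the package layout used in this guide):

```json
{
  "revision": "v1.2.0",
  "roots": ["k8s", "terraform", "docker", "pipeline"]
}
```

Because the revision travels with the bundle, a decision log entry can always be traced back to the exact policy version that produced it.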
Failing Pipelines Safely and Explicitly
Enforcing policy in CI/CD is as much an engineering challenge as a security one. Rolling out hard failures on day one will create chaos. A measured approach is essential.
Hard Gates vs Soft Gates
Conftest supports two rule types that map cleanly to this distinction:
- deny rules: Hard gates. The pipeline fails if any deny rule matches.
- warn rules: Soft gates. The pipeline logs the warning but continues. This is invaluable for rolling out new policies.
# Start with warn, promote to deny once teams have adapted
warn[msg] {
    container := input.spec.template.spec.containers[_]
    not container.resources.requests
    msg := sprintf("[WARN] Container '%s' should define resource requests", [container.name])
}
Policy Exceptions and Waivers
No policy can cover every legitimate edge case. You need a mechanism for exceptions that is auditable and does not bypass the system entirely:
# policy/k8s/exceptions.rego
package k8s.images

import data.exceptions

# Skip the deny rule if an approved exception exists
deny[msg] {
    container := input.spec.template.spec.containers[_]
    endswith(container.image, ":latest")
    not exception_exists(input.metadata.name, container.name)
    msg := sprintf("Container '%s' uses the 'latest' tag", [container.name])
}

exception_exists(deployment, container) {
    exception := exceptions.approved[_]
    exception.deployment == deployment
    exception.container == container
    exception.policy == "deny-latest-tag"
    time.now_ns() < exception.expires_ns
}
The exceptions data file is also version-controlled and requires approval:
# data/exceptions.json
{
  "approved": [
    {
      "deployment": "legacy-app",
      "container": "app",
      "policy": "deny-latest-tag",
      "reason": "Legacy build system cannot produce tagged images — migration tracked in JIRA-1234",
      "approved_by": "security-team",
      "expires_ns": 1735689600000000000
    }
  ]
}
Reporting Policy Results to Dashboards
Conftest and OPA both support JSON output, which makes it straightforward to ship results to observability platforms. In your pipeline, capture the output and send it to your SIEM, logging platform, or a custom dashboard:
# Capture results as JSON
conftest test k8s/*.yaml --policy policy/k8s/ --output json > policy-results.json
# Ship to your logging platform
curl -X POST https://logging.internal/api/v1/policy-results \
-H "Content-Type: application/json" \
-d @policy-results.json
This creates an audit trail independent of CI/CD logs — essential for compliance and trend analysis.
Gradual Rollout: Audit Mode Before Enforce Mode
The recommended rollout strategy for any new policy follows this progression:
- Audit mode: Run the policy as warn rules. Collect data on how many pipelines would fail. Share reports with teams.
- Soft enforcement: Keep warn rules but add notifications — Slack alerts, Jira tickets — so teams are aware and can remediate.
- Hard enforcement: Promote rules from warn to deny after a communicated deadline. Ensure the exception process is in place.
- Continuous tuning: Monitor false positives, adjust policies, add new rules based on incidents and threat intelligence.
This approach respects engineering workflows while steadily raising the security bar.
Limitations and Trade-offs
Policy as Code with OPA is powerful, but it is not without trade-offs. Being honest about limitations helps you make informed decisions.
The Rego Learning Curve
Rego is a purpose-built language with a unique evaluation model. It is not imperative — there are no loops or mutable variables in the traditional sense. Engineers accustomed to Python, Go, or Bash will need time to internalize Rego’s declarative, set-based approach. Invest in team training, pair programming on initial policies, and a library of well-commented example policies.
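For instance, where an imperative language would loop over containers and append matches to a list, Rego states the same collection as a single comprehension — a minimal sketch:

```rego
package example

# The set of container names lacking resource limits — no loop,
# no mutable accumulator; the comprehension declares the condition
# for membership and the engine finds every match.
containers_without_limits := {c.name |
    c := input.spec.template.spec.containers[_]
    not c.resources.limits
}
```

Once this mental shift clicks — describing what must hold rather than how to check it — most policies become shorter than their shell-script equivalents.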
Performance with Large Inputs
OPA evaluates policies in memory. For most CI/CD use cases — Kubernetes manifests, Terraform plans, Dockerfiles — input sizes are small and evaluation is near-instantaneous. However, very large Terraform plans (thousands of resources) or complex policies with deep recursion can cause noticeable latency. Profile policies with opa eval --profile if performance becomes a concern.
OPA vs Other Policy Tools
OPA is not the only option. Consider alternatives based on your stack:
- Kyverno: Kubernetes-native policy engine. If your policies are exclusively about Kubernetes resources and you want YAML-based policies instead of Rego, Kyverno is an excellent alternative.
- HashiCorp Sentinel: Tightly integrated with Terraform Cloud/Enterprise. If your organization is standardized on HashiCorp tooling and you need policies primarily for Terraform, Sentinel may be more natural.
- AWS Cedar: Designed for application-level authorization. Not a direct competitor for CI/CD policy use cases, but relevant if you are building fine-grained authorization for your platform.
OPA’s strength is its generality. It works across Kubernetes, Terraform, Docker, pipeline configs, and any other structured data. If you need policy across multiple domains, OPA avoids tool sprawl.
Policy Drift and Maintenance
Policies are living artifacts. They require ongoing maintenance:
- New resource types and API versions need new rules.
- False positives erode trust and must be addressed promptly.
- Exceptions accumulate and need periodic review.
- Team turnover means knowledge about policy intent can be lost.
Treat your policy repository with the same rigor as your application code: assign owners, schedule reviews, track coverage, and deprecate stale rules.
Conclusion: Policy as Code Is Infrastructure
Policy as Code is not a nice-to-have or a compliance checkbox. It is infrastructure — the same way your CI/CD pipeline, your container orchestrator, and your cloud provider APIs are infrastructure. It deserves the same engineering discipline.
The path forward is clear:
- Start small. Pick one high-impact policy — deny latest tags, require resource limits, block public S3 buckets — and implement it with Conftest in a single pipeline.
- Build the muscle. Write tests for your policies. Set up a policy repository with CI. Get the team comfortable with Rego.
- Expand systematically. Add policies per domain (Kubernetes, Terraform, Docker, pipeline config). Roll out in audit mode first.
- Operationalize. Build dashboards. Define the exception process. Integrate with your incident response workflow.
Treat your policies like code: test them, review them, version them, deploy them. The result is a security posture that scales with your delivery velocity — enforceable, auditable, and automated from the first commit to production.