Automatically rebuild and redeploy container images to patch new CVEs — without waiting for the next feature release.

Self-Healing Containers

When a new CVE affects a base image or a package in your container, the usual response is a manual chain: someone notices, someone opens a PR, someone reviews, someone redeploys. Self-healing containers automate that chain so your production images are always patched — typically within an hour of a CVE being published.

How It Works

Self-healing runs a four-step loop, driven by continuous scanning:

Detect — a new CVE affects a component in one of your container images.
Plan — Griffin generates a patch plan: upgrade, apply security patch, or swap to a Gold-hardened equivalent.
Rebuild — the runner rebuilds the image with the patch, running your existing test suite.
Promote — if the tests pass and the image passes guardrails, it is pushed to your registry under a new tag, where an admission-controlled rollout picks it up.

Each step is pluggable; you decide where humans are in the loop.
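
The loop above can be pictured as an ordered list of steps, with an optional human gate placed wherever you want a person in the loop. The following is a minimal sketch only: `HealContext`, the step functions, and the `-patched` tag suffix are hypothetical names used for illustration and are not part of the Safeguard API; the CVE identifier is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class HealContext:
    """State carried through one heal cycle (all field names are hypothetical)."""
    image: str
    cve_id: str
    plan: Optional[str] = None
    tests_passed: bool = False
    promoted_tag: Optional[str] = None
    log: List[str] = field(default_factory=list)


# A step mutates the context and returns True to continue or False to stop.
Step = Callable[[HealContext], bool]


def detect_step(ctx: HealContext) -> bool:
    ctx.log.append("detect")
    return True


def plan_step(ctx: HealContext) -> bool:
    ctx.plan = "upstream-bump"  # stand-in for a generated patch plan
    ctx.log.append("plan")
    return True


def rebuild_step(ctx: HealContext) -> bool:
    ctx.tests_passed = True  # stand-in for a real build and test run
    ctx.log.append("rebuild")
    return ctx.tests_passed


def human_gate(ctx: HealContext) -> bool:
    # Stops the loop until someone approves; approval handling is omitted here.
    ctx.log.append("awaiting-approval")
    return False


def promote_step(ctx: HealContext) -> bool:
    ctx.promoted_tag = f"{ctx.image}-patched"  # hypothetical tag scheme
    ctx.log.append("promote")
    return True


def run_heal(ctx: HealContext, steps: List[Step]) -> HealContext:
    """Run steps in order and stop at the first one that returns False."""
    for step in steps:
        if not step(ctx):
            break
    return ctx
```

Placing `human_gate` before `promote_step` stops the cycle at the approval point; leaving it out lets the cycle run end to end.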

Modes

Advisory

Safeguard detects, plans, and files a PR against your Dockerfile repo or a Kubernetes manifest repo. Humans review and merge. Best for teams just turning self-healing on.

Autonomous (promotion gated)

Safeguard detects, plans, rebuilds, and promotes to a staging tag. Human approval is required to promote to production tags. This is the default for most enterprise customers.

Autonomous (continuous)

Safeguard detects, plans, rebuilds, and promotes to production — end to end. Policies still apply: a change that fails the production policy bundle is held for review. Recommended once you have confidence in your test coverage.
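
As a rough illustration of how the three modes differ, the sketch below reduces each mode to one question: does this promotion need a human? The mode strings match the values shown in the configuration file under Turning It On; the `Mode` enum and the `requires_human` function are hypothetical and only mirror the descriptions above.

```python
from enum import Enum


class Mode(Enum):
    ADVISORY = "advisory"
    AUTONOMOUS_STAGING = "autonomous-staging"
    CONTINUOUS = "continuous"


def requires_human(mode: Mode, target: str, policy_passed: bool = True) -> bool:
    """Return True if a person must act before the change reaches `target`.

    `target` is "staging" or "production" (an assumed two-stage layout).
    """
    if mode is Mode.ADVISORY:
        # Advisory only files a PR; humans review and merge it.
        return True
    if mode is Mode.AUTONOMOUS_STAGING:
        # Staging promotion is automatic; production needs approval.
        return target == "production"
    # Continuous: automatic end to end unless the policy bundle fails.
    return not policy_passed
```

For example, `requires_human(Mode("autonomous-staging"), "production")` evaluates to `True`, while the same call with `"staging"` evaluates to `False`.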

Patch Strategies

Griffin picks a strategy per finding:

Strategy	When
Upstream bump	A patched upstream version exists.
Backport patch	No patched upstream release exists, but a fix has landed on upstream HEAD.
Gold substitution	A Gold artifact replaces the vulnerable component. See Gold Registry.
Dependency pin	A safe version of an indirect dependency exists but the resolver does not select it; Griffin pins it.
Defer	No safe fix exists; the finding is left open with a time-boxed exception.
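
The table can be read as a set of conditions checked per finding. The sketch below is one possible reading, not Griffin's actual logic: the `Finding` fields, the order in which candidates are tried, and every strategy identifier other than `upstream-bump` (which appears in the sample configuration) are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Finding:
    """Facts about one finding (hypothetical field names)."""
    package: str
    patched_upstream_release: bool = False
    fix_on_upstream_head: bool = False
    gold_artifact_available: bool = False
    safe_transitive_version: bool = False


def applicable_strategies(f: Finding) -> List[str]:
    """List every strategy whose 'When' condition holds, in an assumed order."""
    out = []
    if f.patched_upstream_release:
        out.append("upstream-bump")
    if f.fix_on_upstream_head and not f.patched_upstream_release:
        out.append("backport-patch")
    if f.gold_artifact_available:
        out.append("gold-substitution")
    if f.safe_transitive_version:
        out.append("dependency-pin")
    return out


def pick_strategy(f: Finding, overrides: Optional[Dict[str, str]] = None) -> str:
    """Honor a per-package preference if it applies, else take the first match."""
    candidates = applicable_strategies(f)
    preferred = (overrides or {}).get(f.package)
    if preferred in candidates:
        return preferred
    # With no applicable fix, the finding is deferred.
    return candidates[0] if candidates else "defer"
```

The `overrides` mapping plays the role of `strategy_overrides` in the configuration file: a preference is honored only when that strategy is actually applicable to the finding.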

Base-Image Self-Healing

The common case: your FROM node:20-alpine picks up a new libcrypto CVE. Self-healing rebuilds the image against the latest node:20-alpine digest (or a hardened Gold equivalent) and redeploys.

For FROM scratch and distroless images, Safeguard tracks which binaries are baked in and rebuilds the layer for the affected binary only.

Dependency-Level Self-Healing

Inside your image, npm ci, pip install, cargo fetch, etc. pull in hundreds of transitive deps. When a CVE lands on one of them:

The lockfile is updated to the minimum version that fixes the CVE.
The image is rebuilt with the updated lockfile.
The test suite runs.
A diff of the lockfile is written into the resulting image as a provenance attestation.
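
To make the "minimum version" rule concrete, here is a small sketch that picks the lowest available version at or above the first fixed release and then computes a lockfile diff. It assumes plain dotted numeric versions and a lockfile reduced to a name-to-version mapping; real ecosystems have richer version schemes, and the function and package names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def parse_version(v: str) -> Tuple[int, ...]:
    """Parse a plain dotted numeric version such as '1.2.5'."""
    return tuple(int(part) for part in v.split("."))


def minimum_fixed_version(available: List[str], first_fixed: str) -> Optional[str]:
    """Return the lowest available version that is >= the first fixed version."""
    fixed = parse_version(first_fixed)
    candidates = [v for v in available if parse_version(v) >= fixed]
    return min(candidates, key=parse_version) if candidates else None


def lockfile_diff(old: Dict[str, str], new: Dict[str, str]) -> Dict[str, Tuple[Optional[str], str]]:
    """Map each changed or added package to its (old, new) version pair."""
    return {
        name: (old.get(name), version)
        for name, version in new.items()
        if old.get(name) != version
    }


# Made-up package data for illustration.
old_lock = {"examplelib": "1.2.3", "otherlib": "2.0.0"}
target = minimum_fixed_version(["1.2.3", "1.2.5", "1.3.0", "2.0.0"], "1.2.4")
new_lock = {**old_lock, "examplelib": target}
diff = lockfile_diff(old_lock, new_lock)
```

Choosing the lowest fixing version keeps the change as small as possible, which limits the chance of the rebuilt image failing its test suite for unrelated reasons.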

Test Integration

Self-healing calls your existing test suite before promoting. Supported harnesses:

Any CI provider — safeguard heal --ci github / --ci gitlab / --ci azure / --ci buildkite / --ci jenkins / --ci circleci. The runner delegates the build/test to your pipeline and watches the result.
Local runner — if you have the Safeguard runner self-hosted, tests run there.
Synthetic tests — a lightweight smoke test (health endpoints, startup time, memory footprint) runs for every rebuild even if no full test suite is declared.

Registry and Kubernetes Integration

Registry

Self-healed images are pushed with clear tagging conventions:

<registry>/<image>:<original-tag>-sg-heal-<date> for staging promotions.
<registry>/<image>:<original-tag>, overwritten in place, for continuous mode (with immutable digest pinning).
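
The two conventions can be sketched as string builders. The format of `<date>` is not specified above, so the `YYYYMMDD` form used here is an assumption, as are the function names and the example registry host. The `name@digest` form is the standard way to reference an image by immutable digest.

```python
from datetime import date


def staging_tag(registry: str, image: str, original_tag: str, on: date) -> str:
    """Build a staging reference; the YYYYMMDD date format is an assumption."""
    return f"{registry}/{image}:{original_tag}-sg-heal-{on:%Y%m%d}"


def pinned_reference(registry: str, image: str, digest: str) -> str:
    """Reference an image by digest so an overwritten tag cannot change it."""
    return f"{registry}/{image}@{digest}"


ref = staging_tag("registry.example.com", "web", "20-alpine", date(2024, 1, 15))
```

Pinning by digest matters in continuous mode: because the original tag is overwritten, only the digest identifies exactly which build a deployment is running.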

Every image carries:

An SBOM attestation describing the patch.
A Griffin explanation attestation — plain-English description of what was fixed.
A signed provenance attestation referencing the source commit and build environment.

Kubernetes

If the Safeguard Helm operator is installed, rolling out a self-healed image is automatic:

The operator watches registry tags.
New patched digests trigger a controlled rollout (one pod at a time, respecting PodDisruptionBudgets).
The rollout is paused automatically if readiness probes fail.
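
The rollout behavior described above can be modeled in a few lines. This is a simplified simulation rather than the operator's implementation: pods are a name-to-digest mapping, the PodDisruptionBudget is reduced to a single `min_available` count, and `is_ready` is a caller-supplied stand-in for a readiness probe.

```python
from typing import Callable, Dict, List, Tuple


def rolling_update(
    pods: Dict[str, str],
    new_digest: str,
    is_ready: Callable[[str], bool],
    min_available: int,
) -> Tuple[str, List[str]]:
    """Update one pod at a time; return (status, names of pods updated)."""
    updated: List[str] = []
    # Replacing one pod leaves len(pods) - 1 available; refuse to start
    # if that would drop below the disruption budget.
    if len(pods) - 1 < min_available:
        return "blocked", updated
    for name in sorted(pods):
        pods[name] = new_digest
        updated.append(name)
        if not is_ready(name):
            # Pause and leave the remaining pods on the old digest.
            return "paused", updated
    return "complete", updated
```

A paused rollout leaves the untouched pods serving on the previous digest, which is what keeps a bad patch from taking down the whole deployment.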

Observability

The Self-Healing dashboard shows:

Images currently being watched.
Pending, in-progress, and completed heal cycles.
Time-to-heal histogram (CVE published → image deployed).
Rollback count (heals that failed verification).

Representative numbers from customer tenants:

Median time-to-heal: 20-45 minutes (CVE published → production rollout).
P95 time-to-heal: 2 hours.
Rollback rate: < 1%.
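
If you want to compute comparable figures from your own heal records, the sketch below shows one way to derive a median and a P95 from publish and deploy timestamps. It uses the nearest-rank percentile definition, which may differ from the method the dashboard uses; the sample durations are made up for illustration and are not customer data.

```python
from datetime import datetime
from statistics import median
from typing import List


def minutes_to_heal(published: datetime, deployed: datetime) -> float:
    """Elapsed minutes from CVE publication to production deployment."""
    return (deployed - published).total_seconds() / 60


def percentile_nearest_rank(values: List[float], pct: int) -> float:
    """Nearest-rank percentile: smallest value with at least pct% at or below it."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no samples")
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Made-up sample durations in minutes.
samples = [20.0, 25.0, 30.0, 45.0, 120.0]
typical = median(samples)
p95 = percentile_nearest_rank(samples, 95)
```

With a sample this small the P95 is simply the slowest heal, so a handful of slow cycles can dominate the figure until you have more data.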

Rollback

Every heal is reversible:

The previous image digest is preserved.
A one-click rollback from the UI or safeguard heal rollback --image <image> restores it.
Automatic rollback triggers on: readiness probe failures, error-rate spikes, or policy violations detected on the new image.
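
The three automatic triggers can be expressed as a single decision function. The sketch below is illustrative only: the `HealthSnapshot` fields, the reason strings, and the definition of a "spike" as twice the pre-heal baseline error rate are all assumptions, since the thresholds are not specified above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class HealthSnapshot:
    """Signals observed after a heal is deployed (hypothetical fields)."""
    readiness_failures: int
    error_rate: float
    baseline_error_rate: float
    policy_violations: int


def rollback_reason(s: HealthSnapshot, spike_factor: float = 2.0) -> Optional[str]:
    """Return why the heal should be rolled back, or None to keep it."""
    if s.readiness_failures > 0:
        return "readiness-probe-failure"
    # Assumed spike rule: error rate exceeds the baseline by spike_factor.
    if s.error_rate > s.baseline_error_rate * spike_factor:
        return "error-rate-spike"
    if s.policy_violations > 0:
        return "policy-violation"
    return None
```

Note that with a baseline of zero, any nonzero error rate counts as a spike under this rule; a production check would likely want a minimum absolute threshold as well.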

Air-Gapped Self-Healing

For air-gapped environments, self-healing runs entirely inside your infrastructure:

The Safeguard operator bundles rebuild tooling and the required base images.
Snapshot updates are delivered via signed tarballs.
No egress required at runtime.

Turning It On

# .safeguard/self-heal.yaml
self_heal:
  mode: autonomous-staging   # advisory | autonomous-staging | continuous
  test_command: "npm test"
  notify:
    slack: "#sec-ops"
  strategy_overrides:
    - package: log4j-core
      prefer: upstream-bump

Or via the UI: ESSCM → Asset → Self-Healing → Configure.

Related

Continuous Scanning — detects the CVE that triggers self-healing.
Gold Registry — Gold images are the preferred substitution target.
Auto-Fix — source-code auto-fix is the companion to container self-healing.
Workflows — orchestrates self-healing at scale.
Griffin AI — the model that plans and verifies each heal.
