Safeguard.sh Documentation Center

Malware Detection

Detect malicious packages, images, and model weights using Eagle — Safeguard's purpose-built classification model.

Supply chain attackers ship malicious code as packages, container layers, or model weights. Traditional antivirus and signature-based tooling miss most of them. Safeguard uses Eagle — our classification model trained on confirmed-malicious artifacts — to classify every artifact your organization pulls or publishes.

What Eagle Looks For

Eagle scores each artifact across seven indicator classes:

  • Install-script behavior — Post-install hooks that contact unusual hosts, install additional packages, or write outside the package's own directory.
  • Obfuscated code — Dense eval / Function / base64 / zlib-wrapped payloads; string-encoding patterns known from real incidents.
  • Egress patterns — Hard-coded URLs matching known C2 infrastructure, Discord / Telegram exfiltration, Pastebin, etc.
  • Credential harvesting — Reads from ~/.aws, ~/.ssh, ~/.npmrc, browser profiles, or the clipboard.
  • Typosquat similarity — Name within edit distance 2 of a top-1000 package, owned by a different publisher.
  • Metadata anomalies — Newly published package with aggressive versioning, unusual author history, or no repository URL.
  • Behavior divergence from prior versions — New network, file-system, or process activity that didn't exist in the previous release.
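The typosquat-similarity indicator can be sketched with a standard Levenshtein edit distance. This is a simplified illustration, not Eagle's actual implementation — the `is_typosquat` helper and the popular-package set are hypothetical, and the different-publisher check is omitted:

```python
def edit_distance(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def is_typosquat(name: str, popular_names: set[str]) -> bool:
    # Flag names within edit distance 2 of a popular package;
    # the "owned by a different publisher" condition is omitted here.
    return any(0 < edit_distance(name, p) <= 2 for p in popular_names)

print(is_typosquat("requets", {"requests", "numpy"}))   # True
print(is_typosquat("requests", {"requests", "numpy"}))  # False (exact match)
```

Excluding distance 0 keeps the legitimate package itself from being flagged.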

The output per artifact:

  • Score 0-100.
  • Classification — benign / suspicious / malicious.
  • Indicators fired — specific rules that contributed.
  • Griffin-authored explanation — plain-English summary of what was unusual.
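One way to picture that output is as a small record type. This is a hypothetical shape for illustration only — the field names below are not Safeguard's API:

```python
from dataclasses import dataclass, field

@dataclass
class EagleResult:
    score: int                    # 0-100
    classification: str           # "benign" | "suspicious" | "malicious"
    indicators: list[str] = field(default_factory=list)  # rules that fired
    explanation: str = ""         # Griffin-authored plain-English summary

result = EagleResult(
    score=82,
    classification="malicious",
    indicators=["install-script-behavior", "egress-patterns"],
    explanation="Post-install hook contacts a host not seen in prior releases.",
)
print(result.classification)  # malicious
```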

Coverage

Eagle classifies artifacts in:

  • npm — Every publish, plus a retroactive scan of the last 10 years
  • PyPI — Every publish, plus retroactive scan
  • Maven Central — Every publish, plus historical scan
  • RubyGems — Every publish
  • NuGet — Every publish
  • crates.io — Every publish
  • Go modules — Every publish
  • Composer — Every publish
  • Docker Hub / GHCR / ECR / ACR / GCR — Every tag push, plus configured periodic scans of running images
  • Hugging Face — Every model, every fine-tune, every config change

Enforcement Points

Eagle's classification feeds into policy at multiple points:

Inline at Install

Via the Safeguard runner or a pre-install hook:

# .safeguard/malware-policy.yaml
malware:
  block: [malicious]
  warn: [suspicious]
  confirm_ttl: 300  # how long a classification is trusted before re-check

If npm install pulls a package classified malicious, the install aborts with an explanation.
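A rough sketch of how a policy like the one above might be evaluated at install time. The keys match the YAML; everything else — the `decide` function, the in-memory cache, the stubbed classifier — is an assumption for illustration:

```python
import time

policy = {"block": ["malicious"], "warn": ["suspicious"], "confirm_ttl": 300}
_cache: dict[str, tuple[str, float]] = {}  # package -> (classification, scored_at)

def decide(package: str, classify) -> str:
    """Return 'block', 'warn', or 'allow', re-checking once confirm_ttl expires."""
    cached = _cache.get(package)
    if cached is None or time.time() - cached[1] > policy["confirm_ttl"]:
        cached = (classify(package), time.time())  # re-query the classifier (stubbed)
        _cache[package] = cached
    classification = cached[0]
    if classification in policy["block"]:
        return "block"
    if classification in policy["warn"]:
        return "warn"
    return "allow"

print(decide("some-package", lambda name: "suspicious"))  # warn
```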

At Registry

The Gold Registry admits no artifact that Eagle scores above 30.

For your own internal registry, enable inline classification on pushes so internal publishes also get scored.

At CI

- uses: safeguard/malware-scan@v2
  with:
    fail-on: malicious

At Admission

The Kubernetes admission controller refuses pods with an image layer containing malicious content.

Historical Scans

When Eagle is updated, it re-scores the entire corpus retroactively. If a package you depend on is reclassified as malicious because of a signal Eagle previously missed, you get a finding within seconds of the re-scoring pass.

The finding includes:

  • When the package was installed / deployed.
  • What specifically triggered the new classification.
  • Recommended remediation (rollback, Gold substitution, or quarantine).

Real Examples

Eagle has publicly surfaced malicious activity in:

  • npm packages with hidden post-install scripts exfiltrating ~/.aws/credentials.
  • PyPI typosquats of popular ML packages.
  • Compromised model repositories on Hugging Face distributing pickle-based payloads.
  • OCI images with layers that dropped cryptominers on container start.

See the Incident Analysis blog category for writeups.

False Positives

Eagle is tuned for precision — the classification bands are:

  • Score < 30: benign, released silently.
  • Score 30-70: suspicious, surfaces as a warning with the indicator list.
  • Score > 70: malicious, blocked by default policy.
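Those bands reduce to a simple threshold function. The function name is hypothetical; the thresholds are the documented ones:

```python
def classify(score: int) -> str:
    """Map an Eagle score (0-100) to its classification band."""
    if score < 30:
        return "benign"
    if score <= 70:
        return "suspicious"
    return "malicious"

print(classify(12), classify(55), classify(88))  # benign suspicious malicious
```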

Reported false-positive rate for score > 70 is approximately 1 in 10,000. Reporting a suspected false positive via research@safeguard.sh triggers human review within 24 hours.

Bypass Attempts

When attackers try to evade detection, Eagle flags the evasion itself:

  • String splitting / concatenation of known malicious identifiers.
  • Delayed-execution payloads that only run weeks after install.
  • Package trust "warm-up" — publishing a few benign versions before the malicious one.
  • Typosquat rotation (publish under many similar names to defeat allowlists).
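One way to catch the string-splitting trick is to collapse adjacent string literals before matching identifiers. The watchlist and regex below are a simplified illustration, not Eagle's actual logic:

```python
import re

WATCHLIST = {"eval", "child_process"}  # illustrative identifiers

def normalize(source: str) -> str:
    # Collapse "ev" + "al" style concatenation of string literals back together
    return re.sub(r'["\']\s*\+\s*["\']', "", source)

snippet = 'const f = "ev" + "al"; globalThis[f](payload);'
hits = {w for w in WATCHLIST if w in normalize(snippet)}
print(hits)  # {'eval'}
```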

AI Model Malware

For AI artifacts specifically:

  • Pickle payload detection — model weight files that embed executable Python via pickle.
  • Hidden backdoors — statistical tests for neurons or prompt-triggered behaviors that produce attacker-controlled output.
  • Config-level injection — transformers config files that reference external code at arbitrary URLs.
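The pickle-payload check can be approximated with the standard library's pickletools: walk the opcode stream and flag opcodes that can import or call arbitrary objects during unpickling. The opcode watchlist below is a common heuristic, not Eagle's exact rule set:

```python
import pickle
import pickletools

# Opcodes that can import or invoke arbitrary callables on pickle.load
DANGEROUS_OPS = {"GLOBAL", "STACK_GLOBAL", "INST", "OBJ", "NEWOBJ", "REDUCE"}

def scan_pickle(data: bytes) -> set[str]:
    """Return the dangerous opcodes present in a pickle stream."""
    return {op.name for op, arg, pos in pickletools.genops(data)
            if op.name in DANGEROUS_OPS}

class Payload:
    def __reduce__(self):              # runs attacker code on pickle.load!
        return (print, ("pwned",))

benign = pickle.dumps({"weights": [0.1, 0.2]})  # plain containers: no dangerous opcodes
evil = pickle.dumps(Payload())
print(scan_pickle(benign))                # set()
print("REDUCE" in scan_pickle(evil))      # True
```

This is why loading untrusted model weights via pickle is dangerous even when the file "looks like" tensors: the format itself can carry code.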

API

safeguard malware scan --npm react@19.0.0
safeguard malware scan --image ghcr.io/vendor/widget:1.2
safeguard malware history --package react --since 2020
