Model Capabilities
What Griffin, Eagle, and Lino can actually do — concrete tasks, input shapes, output contracts, and example prompts.
This page is the practical reference for what Safeguard's models can do: what you can ask them, what inputs they accept, what outputs they promise, and where their limits are.
For version history and deployment options, see AI Models.
Griffin — Reasoning and Remediation
Capability matrix
| Capability | Input | Output | Latency (p50) |
|---|---|---|---|
| Vulnerability explanation | CVE ID + affected component | Plain-English writeup with attack scenarios and exploit requirements | ~3s |
| Remediation plan | CVE ID + repo context | Ranked list of upgrade paths with breakage risk | ~6s |
| Auto-fix PR | Repo + target CVE(s) | Git branch with changes, test run log, PR description | 30s – 5min |
| Multi-repo PR | Target CVE across N repos | N coordinated PRs | 2 – 20 min |
| Reachability reasoning | Finding + call graph | "reachable" / "unreachable" / "conditional" + justification | ~4s |
| SBOM diff narration | Two SBOMs | Summary of what changed and why it matters | ~5s |
| Natural-language query | Free text question | Answer with linked data | ~3s |
| Agentic execution | Goal + scope | Plan → execute → verify loop | variable |
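The reachability verdicts in the matrix above follow a closed contract: one of three strings plus a justification. A minimal client-side model of that contract, as a sketch (the class and field names are illustrative, not Safeguard's actual SDK types):

```python
from dataclasses import dataclass

# The three verdicts come from the capability matrix above; everything else
# here (class name, field names) is an illustrative assumption.
VALID_VERDICTS = {"reachable", "unreachable", "conditional"}

@dataclass
class ReachabilityResult:
    verdict: str
    justification: str

    def __post_init__(self):
        # Reject anything outside the documented three-way contract.
        if self.verdict not in VALID_VERDICTS:
            raise ValueError(f"unexpected verdict: {self.verdict}")

r = ReachabilityResult("conditional",
                       "sink only reached when a feature flag is enabled")
```

Validating the verdict at the boundary means downstream automation can branch on exactly three cases instead of free text.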
Agentic mode
Griffin runs a tool-use loop with the same tool surface exposed via the MCP server. A typical session:
Goal: "Resolve all reachable KEV-listed CVEs in the payments service this sprint."
Iteration 1:
tool: safeguard_find_vulnerabilities { project: payments, kev: true, reachable: true }
result: 7 findings
Iteration 2:
tool: safeguard_get_remediation_plan { finding_ids: [...] }
result: 6 upgradeable, 1 requires fork
Iteration 3:
tool: safeguard_remediate_npm + safeguard_open_pull_request × 6
result: 6 PRs open
Iteration 4:
tool: safeguard_jira_create { severity: high, title: "Fork lodash for CVE-X" }
result: ticket SEC-1234 opened
Summary: 6 auto-fixes, 1 ticket for manual fork.

You can scope Griffin's tool surface — read-only triage versus full write access — per role or per workflow.
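The session above can be sketched as a plan → execute → verify loop. The tool names mirror the transcript; the implementations here are stand-in stubs, not the real tool backends:

```python
# Stubbed tools standing in for the real MCP tool surface.
def safeguard_find_vulnerabilities(project, kev, reachable):
    return [f"F-{i}" for i in range(1, 8)]            # 7 findings, as in the transcript

def safeguard_get_remediation_plan(finding_ids):
    # 6 upgradeable, 1 requiring a fork, as in the transcript.
    return {"upgradeable": finding_ids[:-1], "fork_required": finding_ids[-1:]}

def safeguard_open_pull_request(finding_id):
    return f"PR-for-{finding_id}"

def safeguard_jira_create(severity, title):
    return "SEC-1234"

def run_goal(project):
    # Plan: enumerate in-scope findings.
    findings = safeguard_find_vulnerabilities(project, kev=True, reachable=True)
    plan = safeguard_get_remediation_plan(findings)
    # Execute: open PRs for upgradeable findings, tickets for the rest.
    prs = [safeguard_open_pull_request(f) for f in plan["upgradeable"]]
    tickets = [safeguard_jira_create("high", f"Manual fork needed for {f}")
               for f in plan["fork_required"]]
    # Verify: every finding ended up with either a PR or a ticket.
    assert len(prs) + len(tickets) == len(findings)
    return {"prs": prs, "tickets": tickets}

result = run_goal("payments")
```

The verify step is what makes the loop agentic rather than fire-and-forget: the run only reports success once every finding is accounted for.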
Input limits
- Context window: 200k tokens for Griffin 3.0; effective "usable" ~180k after system prompts.
- Max code bundle per pass: ~150k tokens; larger repos are chunked.
- Request-level budget per tenant: configurable soft caps + hard ceiling.
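The ~150k-token-per-pass limit implies a chunking step for larger repos. A greedy packer under that budget might look like this; the 4-characters-per-token estimate is a common rough heuristic, not Safeguard's actual tokenizer:

```python
# Illustrative chunker for the ~150k-token-per-pass limit.
MAX_TOKENS_PER_PASS = 150_000

def estimate_tokens(text: str) -> int:
    # Crude approximation: ~4 characters per token.
    return max(1, len(text) // 4)

def chunk_files(files: dict[str, str], budget: int = MAX_TOKENS_PER_PASS):
    """Greedily pack files into passes that stay under the token budget."""
    chunks, current, used = [], [], 0
    for path, body in files.items():
        cost = estimate_tokens(body)
        if current and used + cost > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(path)
        used += cost
    if current:
        chunks.append(current)
    return chunks

# Two ~100k-token files cannot share a 150k pass; the small file rides along.
repo = {"a.py": "x" * 400_000, "b.py": "y" * 400_000, "c.py": "z" * 100}
passes = chunk_files(repo)
```

A file larger than the budget would still occupy its own oversized chunk in this sketch; a production chunker would also split within files.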
Output guarantees
- Structured outputs validated against JSON schemas before being returned to the client.
- Claims about files / functions / lines reference the actual repository state at call time; stale references auto-refresh.
- When Griffin is uncertain, the response includes an explicit confidence band and suggested next steps.
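The schema-validation guarantee can be illustrated with a minimal stand-in check. The field names (`answer`, `confidence_band`, `next_steps`) are assumptions for illustration, not the real response schema:

```python
# Minimal stand-in for the schema check applied before a response is returned.
REQUIRED = {"answer": str, "confidence_band": str, "next_steps": list}

def validate(response: dict) -> list[str]:
    """Return a list of schema violations; empty means the response passes."""
    errors = []
    for field, typ in REQUIRED.items():
        if field not in response:
            errors.append(f"missing field: {field}")
        elif not isinstance(response[field], typ):
            errors.append(f"wrong type for {field}")
    return errors

ok = validate({"answer": "upgrade to 2.17.1",
               "confidence_band": "high",
               "next_steps": ["open PR"]})
bad = validate({"answer": 42})
```

In the real pipeline a failing response is retried or surfaced as an error rather than returned to the client.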
Example prompts
"Give me a remediation plan for CVE-2024-3094 across every Go service in the 'platform' team."
"Explain why CVE-2023-44487 (HTTP/2 Rapid Reset) appears in services that don't expose HTTP/2."
"Show the dependency path from my-api@main to a reachable log4j-core 2.17.0."
"Open PRs to pin all transitive dependencies on KEV-listed CVEs in production services; flag anything that would require a major version bump."
"What would happen if I upgraded spring-boot-starter-web from 2.7 to 3.2 in billing-service?"

Eagle — Classification and Anomaly Detection
Capability matrix
| Capability | Input | Output |
|---|---|---|
| Package classification | Package + version | Benign / suspicious / malicious + score + indicators |
| Container layer classification | Image digest + layer | Same as package classification |
| Model weight classification | Weight file (safetensors / pickle) | Same as package classification + pickle-opcode analysis |
| Runtime anomaly scoring | Runtime telemetry stream | Per-workload anomaly score over time |
| Typosquat detection | Package name | Nearest neighbors among the top-1,000 packages + confidence that it's a squat |
| Behavioral divergence | Two versions of an artifact | Differences with risk assessment |
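Typosquat detection in the matrix above boils down to a nearest-neighbor search over popular package names. A sketch using plain edit distance, with a tiny stand-in for the real top-1,000 list:

```python
# Stand-in for the top-1,000 popular-package list used by the real detector.
TOP_PACKAGES = ["react", "lodash", "express", "requests"]

def edit_distance(a: str, b: str) -> int:
    """Classic Levenshtein distance via dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,            # deletion
                           cur[j - 1] + 1,         # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def nearest_neighbors(name: str, max_dist: int = 2):
    """Popular packages within small edit distance of the candidate name."""
    hits = [(p, edit_distance(name, p)) for p in TOP_PACKAGES]
    return sorted((h for h in hits if 0 < h[1] <= max_dist), key=lambda h: h[1])

suspects = nearest_neighbors("lodassh")   # one character away from "lodash"
```

The production model also weighs signals beyond string distance (download counts, publish age, author history) before calling something a squat.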
How Eagle answers
Every Eagle output includes:
- Score (0-100).
- Classification.
- Contributing indicators (which rules fired).
- Confidence (based on training-data coverage of similar artifacts).
- Recommended action (allow / review / block / quarantine).
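The score-to-action mapping can be pictured as simple thresholds. The cutoff values below are purely illustrative; actual cutoffs are policy-configurable per tenant:

```python
def recommended_action(score: int) -> str:
    """Map Eagle's 0-100 score to an action. Thresholds are illustrative."""
    if score >= 90:
        return "quarantine"
    if score >= 70:
        return "block"
    if score >= 40:
        return "review"
    return "allow"

# Under these hypothetical cutoffs, the score-94 MALICIOUS example below
# would map to "quarantine".
action = recommended_action(94)
```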
Example usage
Inline at npm install:
$ npm install some-package@1.2.3
# Eagle classification runs via Safeguard's install-hook.
Safeguard: some-package@1.2.3 classified MALICIOUS (score 94)
Indicators:
- post-install script contacts unusual host (tox-xxxx.run)
- base64-encoded payload in index.js
- package metadata created < 48h ago
- author has one prior malicious upload
Install aborted. Use `safeguard malware override` to continue at your own risk.

False positive handling
- Verified false positives are added to an allowlist layer that doesn't change Eagle's model.
- If you believe a classification is wrong, submit it via `safeguard malware dispute --id <finding>`. Reviews run within 24h.
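The allowlist layer can be sketched as a post-classification filter: the model's verdict is computed as usual, then suppressed for known false positives. The artifact keys and return strings here are illustrative:

```python
# Verified false positives live in a layer outside the model.
ALLOWLIST = {("some-package", "1.2.3")}

def final_classification(name: str, version: str, model_verdict: str) -> str:
    """Suppress a non-benign verdict for allowlisted artifacts; the model
    itself is untouched."""
    if model_verdict != "benign" and (name, version) in ALLOWLIST:
        return "benign (allowlisted)"
    return model_verdict

v1 = final_classification("some-package", "1.2.3", "malicious")
v2 = final_classification("other-package", "1.0.0", "malicious")
```

Keeping the allowlist outside the model means an override takes effect immediately and is trivially auditable, with no retraining involved.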
Lino — Compliance and Policy
Capability matrix
| Capability | Input | Output |
|---|---|---|
| Framework mapping | Framework (FedRAMP, CRA, SSDF, etc.) | Control-by-control evidence map from your environment |
| SBOM QA | CycloneDX or SPDX document | Quality score + list of missing minimum elements |
| Policy authoring | Plain-English intent | Policy YAML ready to apply |
| Policy explanation | Policy YAML | Plain-English summary + effects |
| Attestation reasoning | In-toto / SLSA attestation | Does it meet claimed level? why / why not |
| Gap analysis | Current state + target framework | What to do, in priority order |
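The SBOM QA capability above checks documents against minimum-element requirements. A simplified sketch over a CycloneDX-style dict: the field names follow CycloneDX conventions but are flattened here (e.g. `supplier` is really a nested object), and the scoring is illustrative:

```python
# Illustrative minimum elements per component; real QA checks more.
MINIMUM_FIELDS = ("name", "version", "supplier", "purl")

def sbom_quality(sbom: dict):
    """Return (score 0-100, list of (component, missing-fields))."""
    gaps = []
    components = sbom.get("components", [])
    for c in components:
        missing = [f for f in MINIMUM_FIELDS if not c.get(f)]
        if missing:
            gaps.append((c.get("name", "<unnamed>"), missing))
    complete = len(components) - len(gaps)
    score = round(100 * complete / len(components)) if components else 0
    return score, gaps

score, gaps = sbom_quality({
    "bomFormat": "CycloneDX",
    "components": [
        {"name": "lodash", "version": "4.17.21",
         "supplier": "npm", "purl": "pkg:npm/lodash@4.17.21"},
        {"name": "left-pad", "version": "1.3.0"},   # missing supplier and purl
    ],
})
```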
Example: generate a policy from intent
Input: "For production workloads handling PCI data, block any container
whose SBOM is older than 7 days OR contains any KEV-listed vulnerability."
Output:
---
apiVersion: safeguard.sh/v1
kind: Policy
metadata:
  name: pci-prod-strict
spec:
  targets:
    - kind: Image
      labels:
        env: production
        data_class: pci
  rules:
    - id: sbom-fresh
      condition: age(attestations.sbom) > "7d" OR attestations.sbom == null
      effect: BLOCK
    - id: no-kev
      condition: any(vulnerabilities, kev == true)
      effect: BLOCK

Example: framework gap analysis
Input: "Gap analysis for CRA for our 'widget-cloud' product."
Output:
Coverage summary: 61 / 84 requirements met
Critical gaps:
- Annex I (1) coordinated vulnerability disclosure policy not published
- Annex I (9) SBOM not delivered with product releases
- Annex II (2) security support period not documented
Medium gaps: ...
Low gaps: ...
Recommended next steps (in order):
1. Publish SECURITY.md with VDP; template linked
2. Attach CycloneDX SBOM to each release; one-click workflow below
3. Document support window in product docs

Shared Properties
Privacy
- Customer code, SBOMs, and telemetry are never used to train shared models.
- Per-tenant fine-tunes stay inside the tenant boundary.
- Inference request bodies are retained 24 hours by default; configurable 0–365 days for Enterprise.
Inference endpoints
- Shared cloud — hosted in Safeguard's FedRAMP HIGH environment. Default.
- Dedicated tenancy — single-tenant inference cluster.
- Self-hosted — quantized builds on customer GPUs for air-gapped / IL7.
Versioning
Pin a model version for reproducibility:
ai_models:
  griffin: "3.0" # or "3.1-preview"
  eagle: "3.0"
  lino: "2.0"

Model outputs are reproducible when:
- Version is pinned.
- Temperature = 0 (the default for tool-use workflows).
- The underlying knowledge graph is at the same snapshot (audit via the `X-SG-Knowledge-Snapshot` response header).
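The three reproducibility conditions can be checked mechanically before comparing two runs. The `X-SG-Knowledge-Snapshot` header name comes from the list above; the response shape here is an illustrative assumption:

```python
def reproducible_setup(a: dict, b: dict) -> bool:
    """True only if two runs share a pinned version, zero temperature,
    and the same knowledge-graph snapshot."""
    same_version = a["model_version"] == b["model_version"]
    zero_temp = a["temperature"] == 0 and b["temperature"] == 0
    same_snapshot = (a["headers"].get("X-SG-Knowledge-Snapshot")
                     == b["headers"].get("X-SG-Knowledge-Snapshot"))
    return same_version and zero_temp and same_snapshot

r1 = {"model_version": "3.0", "temperature": 0,
      "headers": {"X-SG-Knowledge-Snapshot": "snap-0107"}}
r2 = {"model_version": "3.0", "temperature": 0,
      "headers": {"X-SG-Knowledge-Snapshot": "snap-0107"}}
r3 = {"model_version": "3.0", "temperature": 0,
      "headers": {"X-SG-Knowledge-Snapshot": "snap-0106"}}
```

If any condition fails, differing outputs between two runs are expected rather than a regression signal.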
API
# Chat with Griffin
safeguard griffin chat "Summarize this SBOM's risk"
# One-shot agentic task
safeguard griffin run --goal "Fix critical KEV CVEs on main" --project my-api
# Eagle classification
safeguard eagle classify --npm react@19.0.0
# Lino policy generation
safeguard lino policy --intent "block unsigned prod images"

All three models are accessible via the MCP Server, the REST API, and the CLI.
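For the REST path, a request might be assembled as below. The endpoint URL, auth header, and payload fields are assumptions for illustration only; consult the REST API reference for the real contract. The request is built but deliberately not sent:

```python
import json
import urllib.request

def build_griffin_request(prompt: str, token: str) -> urllib.request.Request:
    """Construct (but do not send) a hypothetical Griffin chat request."""
    body = json.dumps({"model": "griffin", "prompt": prompt}).encode()
    return urllib.request.Request(
        "https://api.safeguard.example/v1/chat",   # placeholder host and path
        data=body,
        headers={"Authorization": f"Bearer {token}",
                 "Content-Type": "application/json"},
        method="POST",
    )

req = build_griffin_request("Summarize this SBOM's risk", "sg_test_token")
```

Sending it would be a single `urllib.request.urlopen(req)` call, or the equivalent in your HTTP client of choice.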
Evaluation Datasets
For Enterprise tenants:
- Custom benchmark creation — upload your own eval set (prompts + expected outputs) to validate model behavior on your codebase.
- Per-tenant regression gates — a new model version is rolled out only if it scores ≥ current on your benchmark.
- Model selection per workflow — different workflows can pin different model versions.
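The per-tenant regression gate reduces to a score comparison over your benchmark. A sketch with exact-match scoring (real eval sets can use richer graders; the models here are stand-in callables):

```python
def score(model, eval_set) -> float:
    """Fraction of (prompt, expected) pairs the model answers exactly."""
    return sum(model(p) == expected for p, expected in eval_set) / len(eval_set)

def should_promote(current_model, candidate_model, eval_set) -> bool:
    """Roll out the candidate only if it scores >= current on the benchmark."""
    return score(candidate_model, eval_set) >= score(current_model, eval_set)

# Toy benchmark and stand-in models.
evals = [("2+2", "4"), ("capital of France", "Paris")]
current = lambda p: {"2+2": "4"}.get(p, "?")                         # 1/2 correct
candidate = lambda p: {"2+2": "4", "capital of France": "Paris"}.get(p, "?")  # 2/2

promote = should_promote(current, candidate, evals)
```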
Enterprise customers who want Safeguard to fine-tune Griffin on their internal security incident corpus can open a conversation with the support team; this is a structured engagement, not a one-click feature.
Related
- AI Models — version history, deployment modes, privacy.
- Griffin AI — Griffin-specific feature guide.
- MCP Server — exposing model capabilities to external AI clients.
- Workflows — building model-driven automations.