🔒 Sovereign AI Safety

Your data stays
where you put it.

Enterprise governance layer on top of model-provided safety — with a sovereign option that never sends a byte to external APIs. Self-hosted. Auditable. Near-zero carbon.

0
External API Calls
~10ms
Classification Latency
~0W
Power Draw (No GPU Required)
See Pricing · Trust & Security

LLMs already filter content. Why add more?

Claude, GPT, and Gemini include built-in safety filters — RLHF training, real-time classifiers, output moderation. Your agents are not running unfiltered LLMs.

MeetLoyd's moderation layer adds enterprise governance on top: configurable thresholds per compliance pack (HIPAA stricter than default), a full audit trail with per-category scores, consistent policy enforcement across all LLM providers, and sovereign mode for organizations with data residency requirements.

OpenAI also offers a free Moderation API — a separate classification endpoint that checks content against its safety categories at no per-call charge. Content sent to it is not used for training (a policy in effect since March 2023), but it is retained in abuse-monitoring logs for up to 30 days by default.

For many organizations, this is fine. For EU enterprises under GDPR, regulated industries, or any organization with policies that prohibit sending content to external services — even for classification — MeetLoyd's sovereign moderation eliminates the trade-off entirely.

Note: Standard mode requires each tenant to provide their own OpenAI API key (BYOK pattern) — MeetLoyd does not share a platform key. Sovereign mode requires no external keys at all.

One toggle. Your choice of sovereignty.

Standard uses OpenAI's free Moderation API (requires tenant's own API key). Sovereign runs a BERT classifier on your CPU — nothing leaves, no key needed. Both satisfy EU AI Act Article 14.
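As a rough sketch, the toggle could look like this in a tenant config — field names here are illustrative, not MeetLoyd's actual schema:

```yaml
# Illustrative tenant config — keys are assumptions, not MeetLoyd's real schema.
moderation:
  mode: sovereign                        # or "standard" (routes to OpenAI's Moderation API)
  openai_api_key: ${TENANT_OPENAI_KEY}   # BYOK — only needed in standard mode
  llm_escalation: true                   # re-classify borderline content with Llama Guard 3
```

In standard mode the key is tenant-supplied (BYOK); in sovereign mode the key line can be omitted entirely.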

STANDARD

OpenAI Moderation

Free, fast, proven. Content is sent to OpenAI for classification. Available to all tiers.

  • Zero cost (OpenAI provides it free)
  • ~100ms latency (network round-trip)
  • Proven accuracy across all content types
  • Content leaves your infrastructure

SOVEREIGN

Self-Hosted Moderation

BERT-based classifier running on CPU. Nothing leaves. Category-aware thresholds tuned for business content.

  • Zero external API calls
  • ~10ms latency (local CPU inference)
  • 6 categories with business-tuned thresholds
  • Full audit trail for compliance review
  • Near-zero carbon footprint (CPU only)
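The category-aware thresholding above can be sketched in a few lines of Python. The six categories mirror Detoxify's output labels; the threshold values are illustrative, not MeetLoyd's shipped defaults:

```python
# Illustrative per-category thresholds — NOT MeetLoyd's shipped defaults.
# The category names match Detoxify's six output labels.
THRESHOLDS = {
    "toxicity": 0.80,
    "severe_toxicity": 0.50,   # stricter: severe content blocks at lower scores
    "obscene": 0.80,
    "threat": 0.60,
    "insult": 0.80,
    "identity_attack": 0.60,
}

def classify(scores: dict) -> str:
    """Block if any category exceeds its threshold, otherwise allow."""
    for category, score in scores.items():
        if score > THRESHOLDS.get(category, 0.80):
            return "BLOCK"
    return "ALLOW"

print(classify({"toxicity": 0.95, "threat": 0.10}))  # BLOCK
print(classify({"toxicity": 0.10, "threat": 0.10}))  # ALLOW
```

Tuning per category is what "business-tuned" means in practice: a threat scoring 0.65 blocks, while mild profanity at the same score in a casual-register workspace can pass.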

Sovereign with state-of-the-art accuracy.

Enable LLM escalation to re-classify borderline content with Llama Guard 3 — still fully self-hosted, with a tenant-configurable fallback chain.

Content in
  ↓
Detoxify BERT (CPU, ~10ms, always runs)
  ↓
Score > threshold? → BLOCK (immediate, no escalation)
Score < 0.5? → ALLOW
Borderline (0.5–threshold)? → Escalate
  ↓
1. Llama Guard 3 (MeetLoyd vLLM)
  Failed?
2. Your fallback vLLM (tenant-configured)
  Failed?
3. Your fallback LLM (any sovereign endpoint)
  Failed?
4. Detoxify allow + audit (business never blocks because LLM infra is down)
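The flow above can be sketched as a short Python function. Function names and the two cut-offs are assumptions for illustration; in practice the block threshold is tenant-configurable:

```python
# Sketch of the escalation chain above — names and cut-offs are illustrative.
BLOCK_THRESHOLD = 0.85   # above this: block immediately, no escalation
ALLOW_THRESHOLD = 0.50   # below this: allow without escalation

def decide(detoxify_score, escalators):
    """escalators: ordered callables (Llama Guard 3, tenant vLLM, tenant LLM).
    Each returns "BLOCK" or "ALLOW", or raises if its endpoint is down."""
    if detoxify_score > BLOCK_THRESHOLD:
        return "BLOCK"
    if detoxify_score < ALLOW_THRESHOLD:
        return "ALLOW"
    # Borderline: walk the sovereign fallback chain in order.
    for escalate in escalators:
        try:
            return escalate(detoxify_score)
        except Exception:
            continue                 # endpoint down — try the next one
    # Entire chain down: fail open on the Detoxify result, but audit it.
    return "ALLOW+AUDIT"

decide(0.95, [])   # "BLOCK"
decide(0.30, [])   # "ALLOW"
decide(0.70, [])   # "ALLOW+AUDIT" — borderline, no escalators reachable
```

The last branch is the continuity guarantee: a borderline message with every LLM endpoint unreachable is allowed and flagged for audit rather than blocking the business.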

Compliance. Carbon. Continuity.

🛡

EU AI Act Ready

Both modes satisfy Article 14 human oversight requirements. Sovereign + LLM adds explainable decisions — natural language reasoning for every classification.

🌿

Near-Zero Carbon

Sovereign Detoxify runs on CPU at ~0.1W per inference. No GPU required. The lowest-carbon content moderation available for enterprise AI.

🔁

Business Continuity

If LLM infrastructure goes down, sovereign mode gracefully falls back to Detoxify-only with audit logging. Your agents keep running.

AI safety that respects your data sovereignty.

Standard or sovereign. Your data, your choice.

See Pricing · Learn More · Trust & Security