jailbreak detection API

Jailbreak Detection API for AI Products

A jailbreak detection API helps product teams spot when users or untrusted context try to bypass safety, policy, or tool boundaries in an LLM app.

Run demo scan

Where it fits

Screening prompt traffic before it reaches a high-privilege model or agent tool.
Adding security telemetry to moderation, support, coding, or research assistants.
Testing whether a known jailbreak family still works after prompt or model changes.

Operational steps

Send the candidate prompt, recent turns, policy name, and app surface to the detection endpoint.
Use the response severity to block, review, log, or downscope the request.
Replay known jailbreak families in staging before release.
Feed confirmed misses back into a custom test pack for future regression scans.

Common risks

A single-turn classifier misses gradual roleplay or authority-shifting attacks.
The app detects toxic language but not attempts to override system instructions.
Detection logs contain sensitive user text without minimization controls.

How PromptGuard Scan fits the workflow

PromptGuard Scan combines a maintained jailbreak library with scan reports and API responses that fit product telemetry, CI checks, and security review workflows.

Ready to test a real AI surface?

Pricing

Team annual is selected by default.

Annual billing is 50% off. All plans use NOWPayments checkout and keep the product page open.

Dev

For solo builders validating one product before launch.

$25/mo

$294 billed yearly. Save 50%.

5 apps500 scans

Prompt injection scans
Jailbreak template checks
PII and key leak detection
HTML risk report
Email support

Recommended

Team

For engineering teams shipping AI apps through pull requests.

$75/mo

$894 billed yearly. Save 50%.

20 appsUnlimited scans

Everything in Dev
GitHub Actions and GitLab CI gates
Multi-turn bypass testing
PDF export and CVSS scoring
Shared workspaces and API access

Enterprise

For platform teams, private deployments, and audit-heavy AI systems.

$250/mo

$2,994 billed yearly. Save 50%.

Unlimited appsUnlimited scans

Everything in Team
Private deployment path
Custom test packs
Compliance evidence exports
Priority security review support

Security playbooks

Practical guides for LLM app security decisions.

prompt injection scanner Prompt Injection Scanner for LLM Apps Read workflow LLM security testing tool LLM Security Testing Tool for Release Gates Read workflow AI app security audit AI App Security Audit Checklist Read workflow jailbreak detection API Jailbreak Detection API for AI Products Read workflow LLM vulnerability scanner LLM Vulnerability Scanner for Agents and RAG Read workflow AI red team SaaS AI Red Team SaaS for Continuous Testing Read workflow ChatGPT security testing ChatGPT Security Testing for Apps and Custom Workflows Read workflow LLM penetration testing LLM Penetration Testing Workflow Read workflow