AI RELEASE ASSURANCE · EU AI ACT READY · EMPIRICAL

Models don't always stay aligned after interaction.

MTCP tells you if they will.

Multi-model, multi-language constraint persistence evaluation. 32 models across 12 languages. 183,924 probe interactions.

✓ Black-box — API access only, no weights or vendor cooperation needed
✓ Empirical — 183,924 real probe interactions, not simulated
✓ Audit-ready — SHA-256 signed Release Decision Pack
Aligned to: EU AI Act · NIST AI RMF · ISO/IEC 42001 · FCA · MAS FEAT · NDMO · NCA

MTCP identifies which models can be governed at runtime and which will bleed through no matter how much control-plane engineering you throw at them.

For AI Engineers

Test if your model maintains safety constraints across temperature settings and conversation turns. Get a deploy/don't-deploy answer in 5 minutes.

For Procurement Teams

Compare AI providers on constraint durability. See which models maintain alignment under real-world variation.

For Compliance Officers

An audit trail documenting whether AI models maintain safety constraints across operating conditions. EU AI Act Article 12 ready.

How MTCP Works
1. Submit API endpoint — no weights or vendor access needed
2. MTCP runs full behavioural durability evaluation
3. Receive Release Decision Pack — APPROVED / RESTRICTED / REJECTED
4. Download tamper-evident evidence trail (SHA-256)
5. Gate deployment or satisfy regulatory audit
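
Step 4 can be checked locally before anything is gated on the result. The following is a minimal sketch, assuming the evidence trail arrives as a single file alongside a published SHA-256 hex digest. The file layout and the way the digest is delivered are assumptions for illustration, not documented MTCP behaviour. A bare digest only detects tampering if the digest itself is obtained over a trusted channel.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 65536) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def evidence_is_intact(path: Path, expected_hex: str) -> bool:
    """Compare the computed digest with the published one in constant time."""
    actual = sha256_of_file(path)
    # Normalise whitespace and case so a copy-pasted digest still matches.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

If `evidence_is_intact` returns `False`, the downloaded file differs from the one the digest was computed over, and the pack should not be used as audit evidence.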

Models evaluated: 32
Total evaluations: 183,924
Research papers: 48
Governance layers: 15

MTCP Governance Stack — 15 Layers
BIS — Single-model constraint persistence
CSAS — Cross-system coordination admissibility
JRS — Jurisdiction resolution at boundaries
TDS — Temporal drift detection over time
CCS — Constraint conflict resolution
RES — Remediation effectiveness measurement
ACPS — Adversarial persistence resistance
BEC — Blockchain evidence chain integrity
COS — Constraint object specification
LRP — Legitimacy resolution protocol
GRC — Governance reference conditions
DRA — Deployment readiness attestation
Gate — Admissibility enforcement (PERMIT/DENY)
Quantum — Post-quantum cryptographic validity
PRP — Runtime behavioural monitoring

Measure

183,924 probe interactions across 32 frontier models in 12 languages at 4 temperature settings. The largest independent constraint persistence dataset published.

Boundary Integrity Score
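
To make the scale of the evaluation conditions concrete, they can be pictured as a grid. This is a rough sketch, assuming every model is run in every language at every temperature, which the page does not confirm. The model and language identifiers and the temperature values are placeholders; only the counts (32, 12, 4) come from the text.

```python
from itertools import product

# Counts taken from the page; the identifiers and temperature values
# below are placeholders, not the actual MTCP configuration.
models = [f"model-{i:02d}" for i in range(1, 33)]      # 32 models
languages = [f"lang-{i:02d}" for i in range(1, 13)]    # 12 languages
temperatures = [0.0, 0.3, 0.7, 1.0]                    # 4 settings (assumed values)

# One cell per (model, language, temperature) combination.
grid = list(product(models, languages, temperatures))
cell_count = len(grid)  # 32 * 12 * 4 = 1536
```

Under that full-grid assumption there are 1,536 distinct operating conditions, each of which would receive multiple probe interactions.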

Verify

Concealed control probes detect training data exposure. SHA-256 signed evidence packs. Machine-readable audit trail per run.

Control Probe Degradation

Gate

Release Decision Pack delivers APPROVED / APPROVED WITH RESTRICTIONS / REJECTED verdict with runtime guidance and regulatory alignment metadata.

Release Decision Pack

Beyond Single-Model Evaluation

MTCP evaluates constraint persistence at three levels. Each level produces empirical evidence, a grading scale, and audit-ready documentation.

Multi-Language Evaluation

The first multi-language, multi-script constraint persistence evaluation. 12 languages across 4 script families.

Non-Latin deployment requires language-specific evaluation. Standard English-only benchmarks cannot predict multilingual constraint reliability.

Who Uses MTCP

  • Procurement Teams
    Compare 32 evaluated models before vendor selection. Attach MTCP certificate to procurement documentation.
  • AI Risk Officers
    Empirical evidence for board-level risk sign-off. Quantified Boundary Integrity Score (BIS), Control Probe Degradation (CPD), and TSI scores per model.
  • Compliance Leads
    EU AI Act Article 12 ready. NIST AI RMF aligned. Audit-ready evidence packs downloadable immediately.
  • Deployment Gatekeepers
    Set minimum BIS threshold. Block release on REJECTED verdict. Retest after model changes.
  • Sovereign AI Programmes
    Independent evaluation of model stacks across Arabic and multilingual contexts. NDMO and NCA alignment evidence. Board-ready compliance documentation for Gulf sovereign AI infrastructure.
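
The Deployment Gatekeepers workflow above could be wired into a release pipeline roughly as follows. This is a minimal sketch: the field names (`verdict`, `bis`), the numeric score scale, and the threshold are hypothetical, because the page does not specify the Release Decision Pack schema. The verdict strings themselves are the ones used on this page.

```python
from dataclasses import dataclass

# Verdict labels as they appear on this page; the restricted verdict is
# written both as "RESTRICTED" and "APPROVED WITH RESTRICTIONS".
APPROVED = {"APPROVED"}
RESTRICTED = {"RESTRICTED", "APPROVED WITH RESTRICTIONS"}
REJECTED = {"REJECTED"}


@dataclass(frozen=True)
class GateResult:
    allow: bool
    reason: str


def gate_release(pack: dict, min_bis: float,
                 allow_restricted: bool = False) -> GateResult:
    """Decide whether a release may proceed (hypothetical pack schema)."""
    verdict = str(pack.get("verdict", "")).strip().upper()
    bis = pack.get("bis")

    if verdict in REJECTED:
        return GateResult(False, "verdict is REJECTED")
    if verdict not in APPROVED | RESTRICTED:
        # Fail closed on anything the gate does not recognise.
        return GateResult(False, f"unrecognised verdict: {verdict!r}")
    if not isinstance(bis, (int, float)) or isinstance(bis, bool):
        return GateResult(False, "missing or non-numeric BIS")
    if bis < min_bis:
        return GateResult(False, f"BIS {bis} below threshold {min_bis}")
    if verdict in RESTRICTED and not allow_restricted:
        return GateResult(False, "restricted verdict requires explicit opt-in")
    return GateResult(True, "all gate checks passed")
```

A pipeline step would call `gate_release` on the parsed pack and exit non-zero whenever `allow` is false. The gate fails closed: a missing score or an unknown verdict blocks the release rather than letting it through.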

Public Evidence

Full results with temperature breakdowns and metric definitions.

The MTCP evidence layer provides comparative release assurance data across 32 independently evaluated frontier models. 183,924 structured probe interactions at four temperature settings.

Ready to evaluate your model?

Submit your endpoint for a confidential MTCP evaluation. Receive a Release Decision Pack, full evidence audit trail, and deployment verdict. EU AI Act ready.