Why this matters: The fastest way to ship AI features is to stand on vendors, but the fastest way to create messy risk is to skip due diligence. This guide gives you a pragmatic, 26-question checklist organized into eight domains, plus a scoring rubric, negotiation levers, red-flag patterns, and a one-page template you can paste into your next RFP.
How to use this checklist
- Decide your risk appetite (low/medium/high) and weight domains accordingly.
- Ask vendors to answer in writing. Require yes/no + evidence/URL + owner for each item.
- Score each answer 0–3 (0 = no/unknown, 3 = strong with proof). Multiply by weights; set a pass threshold.
- Attach contract riders for must-haves (retention, data use, breach evidence).
Domain A: Data handling & retention
- Retention controls: Can you set prompt/output retention to 0 or custom durations (incl. backups/DR)?
- Training use: Can you disable training/analytics on your data? Is it enforced technically and contractually?
- Redaction support: Will the vendor accept customer-side redaction (placeholders only)? Any server-side minimization options? (A redaction sketch follows this domain.)
- Deletion proof: Can they provide deletion receipts or logs for subject-level or tenant-level purges?
Red flags: “We might use your data to improve services by default,” or “Backups can’t be purged.”
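A minimal sketch of the customer-side redaction idea, assuming regex-detectable identifiers; the patterns and placeholder format are illustrative, and a production gateway would use a dedicated PII/secret detector:

```python
import re

# Illustrative patterns only; a real gateway would use a proper PII detector.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def redact(text: str) -> tuple[str, dict[str, str]]:
    """Swap detected identifiers for placeholders before the prompt leaves your boundary."""
    mapping: dict[str, str] = {}
    for label, pattern in PATTERNS.items():
        # dict.fromkeys dedupes matches while preserving order
        for i, match in enumerate(dict.fromkeys(pattern.findall(text))):
            placeholder = f"<{label}_{i}>"
            mapping[placeholder] = match
            text = text.replace(match, placeholder)
    return text, mapping

prompt, mapping = redact("Email jane@example.com about case 123-45-6789.")
# prompt  -> "Email <EMAIL_0> about case <SSN_0>."
# mapping -> lets you rehydrate the vendor's response locally
```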
Domain B: Residency & subprocessors
- Region pinning: Can you lock processing and storage to specific regions?
- Subprocessor map: Do they disclose all subprocessors who might see prompts/logs? Are flow-down obligations in place?
- Local options: Do they offer VPC/private endpoints or on-prem modes for sensitive workloads?
Red flags: Unclear backup geography; no subprocessor list; surprise analytics vendors.
Domain C: Security & identity
- Auth: SSO (SAML/OIDC), MFA, service identities, SCIM for user lifecycle?
- Egress control: Can you restrict API access by IP range/VPC peering? Support for mTLS or customer-managed keys?
- Secrets handling: Do logs/telemetry reject high-entropy secrets? Are there server-side guards? (An entropy filter you can run on your own side is sketched below.)
Red flags: Shared API keys; broad engineer access to customer logs; plaintext logs.
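You can approximate the high-entropy check yourself before anything reaches vendor logs. A rough sketch; the threshold and minimum length are assumptions you would tune against your own traffic:

```python
import math

def shannon_entropy(s: str) -> float:
    """Bits per character; long random strings (API keys, tokens) score high."""
    if not s:
        return 0.0
    counts = {c: s.count(c) for c in set(s)}
    return -sum((n / len(s)) * math.log2(n / len(s)) for n in counts.values())

def scrub_for_logging(message: str, threshold: float = 4.0, min_len: int = 20) -> str:
    """Drop likely secrets from a log line. threshold/min_len are tunable guesses."""
    tokens = []
    for token in message.split():
        if len(token) >= min_len and shannon_entropy(token) > threshold:
            tokens.append("[HIGH_ENTROPY_REDACTED]")
        else:
            tokens.append(token)
    return " ".join(tokens)

print(scrub_for_logging("calling api with key sk_live_9aF3kQz81LmXw2Yb7Rt0Pq"))
# -> "calling api with key [HIGH_ENTROPY_REDACTED]"
```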
Domain D: Observability & auditability
- Logging schema: Do they log decisions/metrics without raw prompts? Can you get structured exports? (An example record follows this domain.)
- Tenant isolation: Can you get logs scoped to your tenant with correlation IDs?
- Audit rights: Contractual right to audit or receive independent assurance (SOC 2/ISO) that covers AI components?
Red flags: “We can’t provide logs,” or logs contain raw text; no recent assurance reports.
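The kind of record to ask for, as a sketch (field names are illustrative): decisions and metrics plus a prompt hash and a correlation ID, never the raw text:

```python
import hashlib
import json
import time
import uuid

def build_log_record(prompt: str, model: str, decision: str, latency_ms: int) -> dict:
    """Structured, exportable record; the hash stands in for the prompt text."""
    return {
        "correlation_id": str(uuid.uuid4()),   # join events across systems on this
        "timestamp": time.time(),
        "model": model,
        "prompt_sha256": hashlib.sha256(prompt.encode()).hexdigest(),
        "decision": decision,                  # e.g. "allowed", "redacted", "blocked"
        "latency_ms": latency_ms,
    }

print(json.dumps(build_log_record("confidential text...", "model-x-2024", "allowed", 412)))
```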
Domain E: Legal & discovery
- Subpoena process: Will they notify you and challenge overbroad requests? Documented process?
- Data classification alignment: Can your classifications (Restricted/Internal) be enforced in their product?
- Recordkeeping: Can they align with your records schedules and legal holds?
Red flags: Silent on discovery; blanket carve-outs; no legal points of contact.
Domain F: Model governance
- Model register: Can they disclose model versions, training data posture, safety features?
- Routing control: Can you pin versions, choose providers, or fail over? Is there a changelog with deprecations?
- Quality gates: Support for JSON schema validation, output filters, and tool allowlists? (A validation sketch follows this domain.)
Red flags: Surprise model switches; opaque updates; no validation hooks.
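One way to implement the JSON-schema gate on your side, using the jsonschema package; the schema itself is a placeholder for whatever contract your application expects:

```python
import json

from jsonschema import ValidationError, validate  # pip install jsonschema

# Placeholder schema; yours mirrors the structure your application expects.
RESPONSE_SCHEMA = {
    "type": "object",
    "properties": {
        "summary": {"type": "string"},
        "risk_level": {"type": "string", "enum": ["low", "medium", "high"]},
    },
    "required": ["summary", "risk_level"],
    "additionalProperties": False,
}

def gate_output(raw: str) -> dict:
    """Parse and validate model output; fail loudly rather than pass bad data downstream."""
    try:
        data = json.loads(raw)
        validate(instance=data, schema=RESPONSE_SCHEMA)
        return data
    except (json.JSONDecodeError, ValidationError) as exc:
        raise ValueError(f"Model output failed quality gate: {exc}") from exc
```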
Domain G: Incident response
- Breach evidence: Will they provide forensic logs, timeline, and scope analysis?
- Notification SLAs: Contractual timelines for security and privacy incidents? Named on-call escalation path?
- Drills: Do they run tabletop exercises and red/blue-team drills for prompt leaks and secret exposure?
Red flags: “We don’t share forensic detail,” or PR-driven comms with no artifacts.
Domain H: Commercials & indemnities
- IP indemnity: Coverage for claims arising from normal use of outputs? Caps and exclusions?
- Data processing addendum (DPA/BAA): For personal data/PHI, do they sign DPAs/BAAs that cover prompts/outputs/telemetry/backups?
- Exit plan: Data export (structured logs, configs) and deletion timeline upon termination?
- Uptime & credits: SLAs with meaningful credits; multi-vendor routing allowed by contract?
Red flags: No indemnity; no DPA/BAA; punitive lock-in; export friction.
Scoring rubric (example)
| Score | Description | Evidence |
|---|---|---|
| 0 | No control or refused | None |
| 1 | Policy only | Docs without proof |
| 2 | Implemented | Screenshots/links/contracts |
| 3 | Mature + audited | Recent SOC/ISO, live demos, exports |
Weight domains (e.g., Data Handling ×3, Incident ×2, Legal ×2, others ×1). Set the pass bar at ≥75% of possible points, with no critical red flags allowed.
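A minimal version of that rubric in code, assuming the example weights above and one 0–3 score per checklist item:

```python
# Example weights from above: Data Handling (A) x3, Legal (E) and Incident (G) x2, others x1.
WEIGHTS = {"A": 3, "B": 1, "C": 1, "D": 1, "E": 2, "F": 1, "G": 2, "H": 1}

def vendor_passes(scores: dict[str, list[int]], threshold: float = 0.75,
                  critical_red_flags: int = 0) -> bool:
    """Weighted total vs. maximum possible (3 points per item), plus a hard stop on red flags."""
    earned = sum(WEIGHTS[d] * sum(items) for d, items in scores.items())
    possible = sum(WEIGHTS[d] * 3 * len(items) for d, items in scores.items())
    return critical_red_flags == 0 and earned / possible >= threshold

example = {"A": [3, 2, 3, 1], "B": [2, 3, 2], "C": [3, 3, 2], "D": [2, 2, 3],
           "E": [1, 2, 2], "F": [3, 2, 2], "G": [2, 3, 1], "H": [3, 2, 2, 3]}
print(vendor_passes(example))  # False: 88/120 is about 73%, below the 75% bar
```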
Negotiation levers
- Training opt-out by default: Make it contractual and visible in the admin UI.
- Retention 0–30 days: Tie to data category; backups included; deletion receipts.
- Subprocessor notice: Advance notice and right to object for material changes.
- Audit extracts: Quarterly structured log exports + model routing snapshots.
- Indemnity uplift: For regulated workloads, raise caps and tighten exclusions.
Red-flag patterns and what to do
- Vendor keeps raw prompts for analytics by default: Require opt-out and verify with exports; otherwise route through your redaction gateway so they only see placeholders.
- Opaque model/version changes: Require a pinned-version mode or a 30-day notice period; add a kill-switch in your router (sketched after this list).
- No DSAR (data subject access request) or deletion mechanics: If they can’t delete, ensure you only ever send placeholders (so “deletion” is moot), or pick a different vendor for personal information.
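A sketch of that kill-switch inside a router you own; vendor and model names are hypothetical:

```python
PINNED = {"vendor_a": "model-x-2024-06"}   # versions you have validated (hypothetical names)
KILL_SWITCH = {"vendor_a": False}          # flip to True to drain traffic instantly

def route(vendor: str, reported_version: str) -> str:
    """Refuse to serve through a vendor that silently changed models or that you disabled."""
    if KILL_SWITCH.get(vendor, True):      # unknown vendors are disabled by default
        raise RuntimeError(f"{vendor} is disabled; route to a fallback provider")
    if reported_version != PINNED.get(vendor):
        KILL_SWITCH[vendor] = True         # surprise model switch: alert and fail over
        raise RuntimeError(
            f"{vendor} changed {PINNED[vendor]} -> {reported_version}; failing over"
        )
    return vendor  # safe to call
```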
Fast-track due diligence (2-week plan)
- Day 1: Send checklist; request admin demo; ask for SOC/ISO, DPA/BAA templates.
- Day 2–3: Security/legal review of docs; draft rider with must-haves.
- Day 4–5: Technical demo: verify logs, retention, routing, and data-use flags live.
- Day 6–7: Pilot via your gateway in observe-only mode; collect detection/latency metrics.
- Day 8–10: Close gaps or pick alternative; finalize contract with riders.
Template (paste into your RFP)
Provide responses for each item with links/screenshots and contract references.
- A. Retention & Data Use: [1–4]
- B. Residency & Subprocessors: [5–7]
- C. Security & Identity: [8–10]
- D. Observability & Auditability: [11–13]
- E. Legal & Discovery: [14–16]
- F. Model Governance: [17–19]
- G. Incident Response: [20–22]
- H. Commercials & Indemnities: [23–26]
- Scoring: 0–3 each; required threshold: __; critical must-haves: __.
Make your stack vendor-portable
Even with a great vendor, build your own control plane: a gateway that handles redaction, logging, and routing. With that in place, you can switch providers behind the scenes if pricing, features, or risk posture changes, without rewriting applications.
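A minimal shape for that control plane, sketched under assumptions (the provider class and logger are hypothetical): one interface your applications call, so swapping vendors is a configuration change, not a rewrite:

```python
from typing import Callable, Protocol

class LLMProvider(Protocol):
    """The only vendor surface your applications ever see."""
    def complete(self, prompt: str) -> str: ...

class Gateway:
    """Owns redaction, logging, and routing so vendors stay swappable."""
    def __init__(self, provider: LLMProvider,
                 redact: Callable[[str], tuple[str, dict[str, str]]],
                 log: Callable[..., None]):
        self.provider, self.redact, self.log = provider, redact, log

    def complete(self, prompt: str) -> str:
        safe_prompt, mapping = self.redact(prompt)   # only placeholders leave your boundary
        output = self.provider.complete(safe_prompt)
        self.log(redacted_prompt=safe_prompt, output_len=len(output))  # no raw text in logs
        for placeholder, original in mapping.items():                  # rehydrate locally
            output = output.replace(placeholder, original)
        return output

# Swapping vendors is construction-time configuration, not an application rewrite:
# gateway = Gateway(provider=VendorBClient(), redact=redact, log=logger.info)  # hypothetical client
```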
Bottom line
Vendor risk for AI is solvable with a clear checklist, proof-or-it-didn’t-happen evidence, and a control plane you own. Ask sharp questions, demand demonstrations, and keep sensitive data minimized at the boundary. You’ll move fast and sleep better.
Related reading: Audit-Ready LLMs • Secure AI API Integration • Data Residency & Sovereignty