AI Threat Modelling with MITRE ATLAS and OWASP

A practical workflow for modelling AI security threats using MITRE ATLAS, ATT&CK, OWASP Top 10, and OWASP AI Exchange.
AI threat modelling should answer one question: what can go wrong when models, data, prompts, users, APIs, infrastructure, and business decisions are connected?
Classic threat modelling still applies. AI adds model-specific risks, but the product still depends on identity, cloud permissions, CI/CD, web APIs, storage, logs, and human approval. The practical approach is to combine frameworks instead of forcing every risk into one list.
This article uses MITRE ATLAS, MITRE ATT&CK, and OWASP as the core references.
Hero image note: the hero image is an original AI-generated illustration created for this post. It does not use copied third-party images, logos, or branded assets.
Framework Roles
Use MITRE ATLAS for model-specific threats:
data poisoning
prompt injection
model extraction
model inversion
adversarial examples
evasion
unsafe model behavior
Use MITRE ATT&CK for the systems around the model:
phishing and credential theft
cloud permission abuse
CI/CD compromise
service-account misuse
lateral movement
log exfiltration
persistence and defense evasion
Use OWASP for the application and process layer:
assets and trust boundaries
data flow mapping
broken access control
injection
insecure design
vulnerable components
logging and monitoring gaps
AI lifecycle governance
The overlap is useful. If a risk appears in multiple frameworks, it likely deserves priority.
Practical Workflow
Start with one AI feature, not the entire AI program. A useful scope sounds like: "Support assistant answers questions from internal documentation and can create draft tickets." A vague scope like "AI assistant" is too broad.
For that feature, document:
user goal
model or provider
input sources
retrieval sources
output destination
tool permissions
data retention
logging behavior
human approval points
Then draw the data flow:
user prompt
authentication layer
application backend
prompt builder
retrieval system
vector database
model endpoint
tool APIs
logs and analytics
human review queue
Mark trust boundaries between user-controlled input, retrieved content, internal instructions, privileged tools, and stored logs.
Assets to Protect
AI assets are broader than the model itself:
model access or weights
system prompts
training and fine-tuning data
retrieval documents
embeddings and vector indexes
user conversations
tool credentials
business rules
logs and traces
evaluation datasets
If exposure or manipulation would hurt the business, include it in the threat model.
Controls That Matter
Prompts guide behavior, but code should enforce security. Strong controls include:
authorization before retrieval
tenant filtering outside the model
scoped tool credentials
allowlisted tool calls
schema validation for tool arguments
approval gates for sensitive actions
output filtering for secrets and personal data
retrieval chunk and context limits
prompt-injection scanning for documents
immutable audit logs
model behavior evaluations before release
The model can recommend an action. The application should decide whether the action is allowed.
Validation Tests
Threat modelling is incomplete until controls are tested. Include tests for:
direct prompt injection
indirect prompt injection through documents
unauthorized document retrieval
system prompt disclosure attempts
malicious tool-call arguments
oversized context input
sensitive data in output
poisoned knowledge-base content
cross-tenant access attempts
For high-risk features, these tests should become release gates, not one-time manual checks.
Example: Support Assistant
A support assistant that answers from internal documentation and creates draft tickets has these assets:
internal support articles
customer tickets
user identity
ticket API token
system prompt
conversation history
model logs
Main threats:
user asks for another customer's tickets
retrieved document contains malicious instructions
prompt injection creates harmful ticket drafts
assistant leaks hidden instructions
API token is abused outside the model
logs store private data without retention controls
Controls:
authorize documents before retrieval
keep tenant checks outside the model
make ticket creation draft-only by default
validate tool arguments
require confirmation before creating records
scan outputs for sensitive data
log document IDs and tool calls
rate-limit extraction-like behavior
This is concrete enough for engineering, security, and product teams to act on.
Common Mistakes
Avoid these mistakes:
treating the model as the security boundary
modelling only the prompt and ignoring identity, storage, APIs, logs, and deployments
forgetting classic web and cloud risks because the project is "AI"
doing the threat model once and never updating it after prompts, tools, models, or documents change
Final Checklist
Before shipping an AI feature, answer:
What user data enters the system?
What internal data can be retrieved?
Who authorizes retrieval?
What instructions are trusted?
What content is untrusted?
What tools can the model call?
What can those tools change?
What logs are created?
How are prompt injection and data leakage tested?
Which ATLAS, ATT&CK, and OWASP risks apply?
What controls exist outside the model?
Who owns the threat model after launch?
If the team cannot answer those questions, the AI feature is not ready for sensitive workflows.
Conclusion
MITRE ATLAS helps describe AI-specific attacks. MITRE ATT&CK covers the infrastructure attack path. OWASP keeps the process grounded in assets, data flows, trust boundaries, and testable controls.
The goal is not a huge diagram. The goal is a clear map of what can go wrong and what the system does to stop it.
References
Thanks for reading. See you in the next lab.

