Direct prompt injection
User input overrides system prompt, exfiltrates instructions, or coerces unsafe output.
Prompt injection, RAG leakage, tool-use safety, and the surrounding service — tested against the OWASP Top 10 for LLM Applications and ready for NIST AI RMF or EU AI Act framing.
User input overrides system prompt, exfiltrates instructions, or coerces unsafe output.
A retrieved document, web page, or email instructs the model into unintended actions.
Cross-tenant retrieval leakage, source-document poisoning, content isolation in retrieval.
Unsafe tool selection, parameter tampering, unbounded chains, privilege escalation through tools.
Conversation, training-data, and cross-user context leakage in chat and agent surfaces.
API auth, rate limiting, abuse resistance, and the boring web-app issues that wrap the model.
Both, depending on scope. Most engagements start at the model surface (prompt injection, RAG, tool use) and extend into the surrounding app and API — which is usually where the real impact lives.
Yes. We map findings to the OWASP Top 10 for LLM Applications and to standard OWASP Web and API Top 10 categories where the surrounding service is in scope.
Yes. We can frame the report in the language of the NIST AI Risk Management Framework or AI Act high-risk system requirements on request, in addition to standard SOC 2 / ISO control mappings.
We craft adversarial documents, web pages, or email content that the feature retrieves, and verify whether the instructions inside that content can override the model. We test both common injection patterns and ones tuned to the structure of your prompts.
Yes. Whether the model is hosted (OpenAI, Anthropic, Bedrock) or self-hosted (Llama, Mistral, custom), the surface that an attacker can reach is the prompt, the retrieval, the tools, and the surrounding service. That is what we test.
A 30-minute review with our lead pentester. No slides, no pitch — we look at what you have, tell you what we would test first, and give you a fair scope and timeline.