Prompt injection is not a prompt problem

Raptoric

Services▾Industries▾Research▾Customers Company

Sign inEngage Raptoric

The Raptoric Journal/AI Security

AI SecurityMay 28, 2026 · 7 min read

Prompt injection is not a prompt problem

Teams keep trying to patch injection with better system prompts. The fix lives in the architecture, not the wording.

Written by

Raptoric AI Security

LinkedInX / TwitterCopy link

Every few weeks a team shows us a new system prompt. It is longer than the last one, full of capital letters and the word NEVER. They are sure this version finally stops the model from leaking data or calling the wrong tool. It does not, and it never will, because they are solving the wrong problem.

Why wording cannot win

A language model treats every token it reads as input. It does not have a separate, privileged channel for instructions and a lower-class channel for data. When your application pastes a web page, a support ticket, or a PDF into the context, the model reads the attacker’s text with exactly the same trust it gives your own rules.

That means an instruction buried in retrieved content can override the one you wrote, no matter how forcefully you wrote it. You are not in an argument the model can referee. You handed both sides the same microphone.

If untrusted text and trusted instructions share one context, you have already lost. The only question is how much.

Where the real controls live

The durable fixes are structural, and they sit outside the prompt:

Treat every tool the model can call as an attack surface. Scope each one to the minimum it needs, and require confirmation for anything that moves money, data, or state.
Put a hard boundary between retrieved content and instructions. Tag untrusted text, and never let it expand the model’s permissions.
Validate outputs the way you validate any other untrusted input — before they reach a database, a shell, or another service.
Log the full chain: what was retrieved, what the model decided, what it called. You cannot investigate what you did not record.

How we test it

When we red-team an AI system, we do not grade the system prompt. We map the trust boundaries, then attack across them: indirect injection through retrieved documents, tool-call hijacking, and data exfiltration through the model’s own outputs. The findings we hand back are architectural, because that is where the fix has to happen.

Better wording buys you a day. Better structure buys you the year.

Want this tested on your own systems?

A senior engineer will scope it with you on a 30-minute call.

Book a scoping call

Keep reading

All insights →

01Offensive Security

A scan is not a pentest

Read →5 min read

02Threat Detection & Response

Most alerts are noise. The job is the signal.

Read →6 min read

03Security Program & Risk

SOC 2 is a floor, not a finish line

Read →5 min read

Stay current

Subscribe to the Raptoric briefing.

Monthly intelligence digest. Disclosure highlights, threat-actor activity, and engagement field notes from our practitioners.

name@company.com

Issued monthly · unsubscribe anytime · PGP available

Raptoric

A technical cybersecurity services firm. Engineering-grade rigor across five practice lines. Engaged by 140+ organizations in financial services, healthcare, technology, and government.

Services

Offensive SecurityApplication & CloudDetection & ResponseProgram & RiskAI SecurityView all services →

Industries

Financial ServicesHealthcareTechnology & SaaSGovernment & DefenseAI PlatformsCritical Infrastructure

Research

2026 Adversary ReportDisclosures & CVEsThreat IntelligenceEngineering Blog

Company

AboutCareersNewsroomContactResponsible AI

Engage

Book a scoping callPGP keyshello@raptoric.com

✓SOC 2 Type II

✓ISO 27001:2022

✓CREST

✓CHECK

✓PCI QSA

✓NIST 800-171

Audited annually · references on request

PrivacyTermsResponsible disclosureModern slavery statementTrust center