
When Prompt Injection Gets Real: Use GraphQL Federation to Contain It

Brendan Bondurant & Tanya Deputatova

TL;DR

The Problem – Controls Built for Humans, Not Models

From 2024 to 2025, high-profile AI security incidents involving Amazon Q, Vanna.AI, and EchoLeak showed that security controls built for human users fail when applied to large language models. In each case, the system treated model output as safe to execute.

The Gap – No Runtime Boundaries Around Model Behavior

When LLMs run code, call APIs, or trigger builds, unverified logic slips through. The issue isn’t the model but the lack of trust boundaries to decide what an LLM is allowed to execute or access.

The Fix – WunderGraph Cosmo as the Runtime Control Plane

WunderGraph Cosmo applies federation principles—persisted operations, scoped access, and signed configurations—to enforce runtime boundaries that block unverified execution, prevent data leaks, and secure builds from tampering.

Execution Badges for Prompt Injection

Section 1: The Comfortable Illusion of Safety in AI Systems

By 2025, large language models were writing SQL, pushing code, sending emails, and triggering builds. For many teams, this shift felt safe. Their apps already had authentication, logging, and API gateways in place.

But those controls were built for users, not models.

Incidents over the past year showed how that assumption fails. When an LLM receives instructions, those instructions can include executable logic hidden inside prompts, code snippets, or markdown links. Once the system executes that output, traditional defenses don’t trigger, because there are no runtime gates to decide what an AI can execute, what data it can reach, and which artifacts it can deploy.

Each failure shared the same pattern:

  • Instructions executed without verification
  • Access scopes broader than intent
  • Build pipelines trusted artifacts without proof

The outcome: compromised environments, leaked data, and shaken confidence.

You don’t have to predict every malicious prompt, but you need to govern what executes, what’s reachable, and what’s trusted. Federation gives you the runtime surface to enforce those boundaries consistently across every service, schema, and client.

What to Lock Down at Runtime

AI Trusted Execution Flow

First, only approved operations run. Then we limit who can run them. We hide anything the client doesn’t need. Finally, we only ship signed configs.

Federation supports a secure-by-design posture by making safe defaults (allowlists, scoped identities, signed configurations, and contracts as code) the standard path for execution.

Why GraphQL Federation Is the Runtime Control Plane AI Systems Need

Prompt injection isn’t a model flaw; it’s a system design gap that federation closes.

In a federated architecture, every query and credential passes through a central router that enforces policy across services.

In each major incident from 2024 to 2025, attackers didn’t break the LLM. They broke the system around it. Hidden instructions inside trusted data such as emails, repositories, or visualization requests were treated as valid actions.

Traditional controls like authentication and encryption assume threats come from outside the trust boundary. In prompt injection, the logic hides inside it.

  1. Persisted operations restrict behavior to pre-approved actions.
  2. Scoped access binds credentials to least privilege.
  3. Signed configurations verify and block tampered builds.

These controls don’t sanitize every prompt. They make sure the system won’t execute or expose anything outside its defined trust zone. It's a more durable defense than chasing injection patterns.

Case Study: Vanna.AI and Blind Execution of Model Output

Real incidents illustrate what happens when execution trust is misplaced.

In mid-2024, researchers discovered a critical flaw in Vanna.AI, a Python library that turned natural language into SQL and visualizations. The issue wasn’t in the model’s reasoning. It was in the system’s trust.

CVE-2024-5565 • CVSS 8.1 • Affected: vanna <= 0.5.5 • Patched: none as of publication date (per GHSA)

Vanna’s visualization path used an LLM to generate Plotly code, then executed that code directly with Python’s exec() function. Any attacker who could influence the prompt could inject arbitrary Python instructions. The system ran the model’s code without review.

Execution chain: ask(…) → generate_plotly_code(…) → get_plotly_figure(…) → exec()

The result: remote code execution with a CVSS score of 8.1.
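To make the failure concrete, here is a minimal sketch of the unsafe pattern, not Vanna’s actual source: the function names mirror the chain above, the payload is hypothetical, and the point is that nothing sits between the model’s output and exec().

```python
# Sketch of the unsafe pattern (hypothetical stand-ins, not Vanna's code).

def generate_plotly_code(question: str) -> str:
    # Stand-in for the LLM call. In the real flow, this returns whatever
    # the model produced -- including instructions injected via the prompt.
    return 'import os; os.system("echo attacker-controlled code runs here")'

def get_plotly_figure(question: str) -> None:
    code = generate_plotly_code(question)
    exec(code)  # no allowlist, no sandbox, no review: the model decides what runs

get_plotly_figure("Plot revenue by region")
```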

What Went Wrong

  • No runtime gating: Model output was trusted and executed as-is.
  • No scope separation: The LLM held implicit authority to run arbitrary logic.
  • Unsafe default path: LLM-generated Plotly code was executed via exec() whenever visualize=True (the default).

JFrog recommended sandboxing execution and adding output-integrity checks, but these were opt-in mitigations, not defaults, in Vanna.

What Would Have Changed the Outcome

With a federated access layer enforcing persisted operations, the system would never execute LLM-generated code directly.

Cosmo can be configured to accept only pre-registered queries (persisted operations), and anything that is not on the allowlist is rejected. Even if a prompt produced Python or SQL code, it would be blocked from execution because it doesn’t match an approved operation.
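On the wire, that looks like clients sending an operation identifier instead of free-form GraphQL. Here is a sketch of a persisted-operation request using the common persistedQuery extension shape; the hash is a placeholder, and the exact registration flow and client protocol are described in Cosmo’s persisted operations docs.

```bash
# Sketch: the client references a pre-registered operation by hash
# instead of sending raw GraphQL. Anything not on the allowlist is rejected.
curl https://router.example.com/graphql \
  -H 'Content-Type: application/json' \
  -d '{
        "extensions": {
          "persistedQuery": { "version": 1, "sha256Hash": "<registered-operation-hash>" }
        }
      }'
```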

By assigning the AI client a scoped role — for example, read-only analytics access — even approved operations would be limited in reach. The AI could query approved data, but couldn't invoke runtime commands or modify state.
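That scoping can be expressed directly in the schema. A sketch using the @requiresScopes directive referenced later in this post; the field and scope names are hypothetical.

```graphql
# Hypothetical fields: an AI client whose token carries only "analytics:read"
# can query revenue data but cannot reach the write path, even if a prompt asks for it.
type Query {
  revenueByRegion(region: String!): Float @requiresScopes(scopes: [["analytics:read"]])
}

type Mutation {
  updateForecast(region: String!, value: Float!): Boolean @requiresScopes(scopes: [["forecast:write"]])
}
```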

Finally, signed configurations ensure that no unverified code or schema changes are deployed into production, closing the loop between definition and execution.

After the disclosure, the maintainer released a hardening guide to help users apply safer defaults.

Case Study: EchoLeak and Hidden Instructions in Content

The same pattern appeared in content workflows, where trusted data became a delivery vector.

In early 2025, researchers disclosed EchoLeak, where a single crafted email triggered Copilot to encode sensitive data into an image URL, exfiltrating it through a trusted proxy.

CVE-2025-32711 • CVSS 9.3 (Critical, Microsoft) / 7.5 (High, NIST) • Affected: Microsoft 365 Copilot • Patched: Mitigated (vendor guidance issued)

No sandbox failed and no code exploit was involved.

The system followed the instructions it received as designed, yet the result was a data leak.

Vendor response: Microsoft deployed mitigations and issued guidance for isolating untrusted input.

What Went Wrong

  • No prompt isolation or input sanitization – malicious Markdown was accepted as trusted context.
  • No output validation – generated outbound requests weren’t checked before execution.

What Would Have Changed the Outcome

Federation would have contained it through scoped identities, schema contracts hiding sensitive fields, and persisted operations denying unapproved requests.

Case Study: Amazon Q and the Compromised Build

Even mature enterprise supply chains weren’t immune once trust boundaries blurred.

In July 2025, AWS disclosed a supply-chain breach in the Amazon Q Developer VS Code extension.

CVE-2025-8217 • Affected: Amazon Q Developer VS Code Extension • Patch: Upgrade to v1.85.0; remove v1.84.0 instances entirely

An attacker exploited an over-scoped build token to gain repository access, merging a pull request that injected malicious prompt instructions into version 1.84.0.

Vendor response: AWS patched the extension, rotated credentials, and published incident details.

What Went Wrong

  • Over-scoped token allowed unauthorized commits to reach production.
  • No signature verification; signed artifacts would have blocked the tampered release.
  • No runtime validation against an approved manifest before deployment.

The failure stemmed from missing trust boundaries.

What Would Have Changed the Outcome

  1. Signed Configurations and Schemas rejecting unsigned or altered files at startup.
  2. Scoped Access and RBAC limiting token permissions.
  3. Persisted Operations blocking injected or unapproved queries.
  4. Audit Logs capturing every schema, config, and key change.

Patterns in Prompt Injection Incidents and Their Implications

Across all three incidents, the pattern was the same. Each began when untrusted logic entered a trusted path, escalated because the system executed it without verification, and grew worse through over-scoped access or unsigned artifacts.

Each failure lines up with a missing boundary, and federation now enforces each of those boundaries for you.

Implication for Teams

Prompt injection is not an edge case but a predictable outcome of systems that execute model output without policy.

The fix isn't more filters, but rather, stronger boundaries:

  • Govern execution with allowlists.
  • Govern reach with scoped identities.
  • Govern trust with signed artifacts.

Teams that implement these controls move from reactive patching to proactive containment.

They no longer rely on the model’s good behavior — they rely on verifiable controls.

Federation Controls to Implement This Quarter

Recent incidents exposed gaps in execution control, identity scope, and artifact trust.

These are the controls teams can apply now to contain similar risks:

| Risk | Control | Cosmo Feature | Effect in Practice |
| --- | --- | --- | --- |
| Injected or unvetted instructions executing at runtime | Enforce a persisted operation allowlist | block_non_persisted_operations | Only pre-approved queries run; injected logic fails validation |
| AI clients retrieving or mutating sensitive data | Apply role-based access control and field-level scopes | @requiresScopes directives, API key RBAC | AI identities can reach only approved operations and fields |
| Prompt-injected queries discovering hidden fields | Publish a schema contract for AI consumers | Contracted schema via Cosmo Studio or CLI | The AI never sees or queries fields outside its contract |
| Compromised or tampered deployments | Require cryptographic signing of schemas and configs | graph.sign_key (HMAC) | Unsigned or altered builds are rejected automatically |
| Silent misuse or failed policy checks | Enable audit and access logging | access_logs, Studio audit trail | Configuration changes are recorded in Studio’s audit log; failed or rejected requests appear in router access logs when enabled |

Each control maps directly to a real failure pattern from the incidents above. Models still misbehave, but now the system won’t.

Implementation Roadmap

You can’t tune your way out of prompt injection, but you can contain it with runtime boundaries.

Here’s how to roll out controls in order of impact:

1. Enforce Trusted Execution

  • Action: Enable block_non_persisted_operations: true in the router config (see the sketch below).
  • Outcome: Only approved GraphQL operations run; injected or ad hoc queries are rejected.
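A minimal router config sketch for this step. The block_non_persisted_operations key comes from the table above; the exact nesting can differ between router versions, so confirm it against the router configuration reference.

```yaml
# config.yaml (Cosmo Router) -- sketch; verify the shape for your router version.
security:
  block_non_persisted_operations:
    enabled: true
```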

2. Scope Every Identity

Action: Create a schema contract that exposes only the fields your AI clients need.

  1. Tag sensitive fields in your schema (first sketch after this list).

  2. Create the contract via CLI (second sketch after this list).

Outcome: Your AI clients get a clean, focused schema without access to sensitive data.
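Two sketches for this step, with hypothetical type, field, and graph names. First, tag the fields that should stay out of the AI-facing contract:

```graphql
# Subgraph schema: mark internal fields so the contract can exclude them.
type Customer {
  id: ID!
  name: String!
  email: String! @tag(name: "internal")
  creditScore: Int! @tag(name: "internal")
}
```

Then create the contract graph that filters those tags out. Double-check the flags against wgc contract create --help for your CLI version:

```bash
# Creates a contract graph that excludes everything tagged "internal".
npx wgc contract create ai-consumer-graph \
  --source main-graph \
  --exclude internal \
  --routing-url https://ai-router.example.com/graphql
```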

3. Publish a Reduced Schema

  • Action: Schema contracts are automatically published when created via the CLI command above. The contract creates a filtered version of your schema that excludes tagged fields.
  • Outcome: Confidential or internal fields are excluded from the AI's graph through the contract schema.

4. Verify What Ships

  • Action: Sign router configurations and schemas with an HMAC key (see the sketch below).
  • Outcome: Tampered or unsigned artifacts are blocked at startup.
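A sketch of the router-side setting; graph.sign_key is the key referenced in the table above, and the environment variable name is just an example.

```yaml
# config.yaml (Cosmo Router) -- sketch. The router verifies the signed
# composition with this HMAC key and rejects tampered or unsigned artifacts.
graph:
  sign_key: "${GRAPH_CONFIG_SIGN_KEY}"
```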

5. Record Everything

  • Action: Enable access logs and include failed requests (see the sketch below).
  • Outcome: Configuration changes and rejected requests are traceable. (Audit logs are recorded automatically in Cosmo Studio.)
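A minimal sketch for turning on router access logging. Only access_logs.enabled is shown; output destinations and which fields to include (such as failed requests) vary by router version, so treat anything beyond this as an assumption and check the config reference.

```yaml
# config.yaml (Cosmo Router) -- sketch; see the config reference for
# output destinations and the request/response fields to include.
access_logs:
  enabled: true
```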

By following this sequence, teams move from reactive response to controlled execution.

Each layer builds on the last: first decide what runs, then who can run it, then what’s visible, and finally what’s trusted.

This approach doesn’t rely on predicting the next exploit. It governs every outcome.


Closing Takeaways

The Problem

The past year’s incidents didn’t come from clever attackers or flawed models.
They came from systems that trusted model outputs as if they were human decisions without verifying what, who, or how.

The Gap

You can’t eliminate prompt injection, but you can contain its impact.
Resilient teams do this by governing what executes, what’s reachable, and what’s trusted.
Even if the model goes off script, these controls hold the line.

The Fix

Cosmo was built for this shift: not just to connect services, but to enforce the runtime rules that protect and isolate AI-driven systems.
Federation isn’t just for stitching schemas; it’s the layer where those rules are enforced.

By combining trusted execution, scoped access, and verified configuration, teams can turn prompt injection from a breach into a blocked request.

The Goal

The goal isn’t perfection.
It’s predictability.

You decide what the AI can do, what it can see, and what it can ship.
Everything else is denied, logged, and contained.

That’s practical AI governance.


For a broader look at responsible AI system design, see Rethinking API Access in the Age of AI Agents by Cameron Sechrist, which introduces harm limiting as a framework for guiding model behavior beyond access control.


Brendan Bondurant

Technical Content Writer

Brendan Bondurant is the technical content writer at WunderGraph, responsible for documentation strategy and technical content on GraphQL Federation, API tooling, and developer experience.


Tanya Deputatova

Data Architect: GTM & MI

Tanya brings a cross-functional background spanning Data & MI, CMO, and BD director roles across SaaS/IaaS, data centers, and custom development in AMER, EMEA, and APAC. Her work blends market intelligence, CRO, pragmatic LLM tooling that teams actually adopt, and analytics that move revenue.