GraphQL, API Design, Security · by Michael Wybraniec

The Hidden Dangers of GraphQL: Security, Performance, and Complexity Risks

A comprehensive analysis of GraphQL's critical challenges in production environments, covering security vulnerabilities, performance bottlenecks, and architectural complexity that make it risky for enterprise applications.

GraphQL, once heralded as a revolutionary approach to API design, has revealed significant hidden dangers in real-world, production-grade systems. This comprehensive analysis explores the critical security vulnerabilities, performance bottlenecks, and architectural complexity that make GraphQL risky for modern, security-conscious, and performance-critical applications.

Key Findings:

  • Security Risks: Authorization gaps, rate limiting challenges, and introspection vulnerabilities
  • Performance Issues: N+1 query problems and exponential complexity growth
  • Architectural Problems: Business logic coupling and testing difficulties
  • Alternative Solutions: OpenAPI 3.0+ and TypeSpec workflows that provide better security and performance

Based on 6+ years of production experience and extensive community feedback, this analysis provides actionable insights for teams considering GraphQL adoption or migration strategies.

Primary Source

"Why, after 6 years, I'm over GraphQL" by Matt Bessey

1.1. Authorisation

GraphQL’s flexibility is a double-edged sword: Because it exposes a fully introspectable query API, every field (not just top-level objects) must be explicitly authorised per request context. Failing to do so leads to data leaks like exposing emails or blocked user info. This makes fine-grained access control much harder than in REST, where each endpoint is tightly scoped and authorised individually.

query {
  # Aliases are required here: the same field cannot be selected twice
  # with different arguments in one selection set.
  viewed: user(id: 321) {
    handle # ✅ I am allowed to view a User's public info
    email # 🛑 I shouldn't be able to see their PII just because I can view the User
  }
  other: user(id: 123) {
    blockedUsers {
      # 🛑 And sometimes I shouldn't even be able to see their public info,
      # because context matters!
      handle
    }
  }
}

Key risks:

1.1.1. Field-Level Authorization Gaps

⚠️ Leads to PII leaks and broken access control.

Without per-field checks, sensitive data like email or blockedUsers can leak even if object-level authorization exists.
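To make the per-field idea concrete, here is a minimal sketch in plain Ruby. It does not use any GraphQL library: the `FIELD_POLICY` table, the role names, and the `authorized_fields` helper are all hypothetical, chosen only to mirror the `handle` and `email` fields from the query above.

```ruby
# Hypothetical policy table: which viewer roles may read each User field.
# The field names come from the example query; the roles are assumptions.
FIELD_POLICY = {
  handle: %i[public owner admin],
  email:  %i[owner admin]
}.freeze

# Returns a hash of the requested fields, with nil for any field the
# viewer's role may not read. Fields missing from the policy table are
# denied by default instead of being passed through.
def authorized_fields(user, requested_fields, viewer_role)
  requested_fields.each_with_object({}) do |field, result|
    allowed_roles = FIELD_POLICY.fetch(field, [])
    result[field] = allowed_roles.include?(viewer_role) ? user[field] : nil
  end
end

user = { handle: "alice", email: "alice@example.com" }
p authorized_fields(user, %i[handle email], :public)
p authorized_fields(user, %i[handle email], :owner)
```

The deny-by-default lookup is the important design choice in this sketch: a newly added field stays hidden until someone writes a policy for it.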

1.1.2. Lack of Context-Aware Access Control

⚠️ Results in authorization bypasses or context-based data leaks.

Permissions often depend on the requester's context—such as being someone’s blocked user. GraphQL’s nested resolvers complicate applying these checks reliably.
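As a rough illustration of a context-dependent rule, the sketch below filters users by a block list. The `BLOCKS` table and both helper names are hypothetical, and a real application would read this relationship from its own data store.

```ruby
require "set"

# Hypothetical in-memory block list: user id => ids that user has blocked.
BLOCKS = {
  123 => Set[321]
}.freeze

# A viewer may see a profile only if the profile owner has not blocked them.
def can_view_profile?(viewer_id, owner_id)
  !BLOCKS.fetch(owner_id, Set.new).include?(viewer_id)
end

# Applies the contextual check to every user in a list, so the rule holds
# regardless of which nested path the user was reached through.
def visible_users(viewer_id, user_ids)
  user_ids.select { |owner_id| can_view_profile?(viewer_id, owner_id) }
end

p visible_users(321, [123, 456])
```

The point of the sketch is that the answer depends on the pair (viewer, record), not on the record type alone, which is why a single object-level check is not enough.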

1.1.3. Overexposed Schema via Introspection

⚠️ Expands the attack surface and facilitates automated schema exploration.

GraphQL’s introspection feature reveals the full schema by default, enabling attackers to easily enumerate and exploit API structure.
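One possible way to narrow this exposure is to refuse introspection to unauthenticated callers. The sketch below is deliberately naive: it scans the raw query text with a regular expression, whereas a production server would more likely inspect the parsed query. The helper names and the `authenticated:` flag are assumptions for illustration.

```ruby
# Matches the introspection root fields __schema and __type, but not the
# harmless __typename meta field.
INTROSPECTION_PATTERN = /\b__(?:schema|type)\b/

def introspection_query?(query_string)
  query_string.match?(INTROSPECTION_PATTERN)
end

# Introspection is allowed only for authenticated callers; every other
# query is allowed through to the normal authorisation checks.
def allow_query?(query_string, authenticated:)
  authenticated || !introspection_query?(query_string)
end

p allow_query?("{ __schema { types { name } } }", authenticated: false)
p allow_query?("{ user(id: 1) { __typename handle } }", authenticated: false)
```

Text matching like this can misfire on string arguments that happen to contain `__schema`, so treat it as a demonstration of the policy rather than a robust filter.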

---
    header: GraphQL Authorization
---
sequenceDiagram
    participant Client
    participant GraphQLServer
    participant UserContext
    participant AuthorizationLayer

    Client->>GraphQLServer: Complex Multi-Field Query
    
    GraphQLServer->>UserContext: Fetch Current User State
    UserContext-->>GraphQLServer: User Roles, Relationships

    loop For Each Requested Field
        GraphQLServer->>AuthorizationLayer: Validate Field Access
        AuthorizationLayer->>UserContext: Check User Context
        
        alt Public Field
            AuthorizationLayer-->>GraphQLServer: ✅ Green: Allow Access
        else Private Field
            AuthorizationLayer-->>GraphQLServer: 🔴 Red: Deny Access
        else Context-Sensitive
            AuthorizationLayer-->>GraphQLServer: 🟠 Orange: Conditional Access
        end
    end

    Note over Client,GraphQLServer: Authorization Complexity Analysis
    Note over AuthorizationLayer: Checks per field:
    Note over AuthorizationLayer: 1. User Roles
    Note over AuthorizationLayer: 2. Field Sensitivity
    Note over AuthorizationLayer: 3. Contextual Relationships
---
    header: REST Authorization
---
sequenceDiagram
    participant Client
    participant RESTServer
    participant UserContext
    participant AuthorizationLayer

    Client->>RESTServer: Request Specific Endpoint
    
    RESTServer->>UserContext: Fetch User Roles/Permissions
    UserContext-->>RESTServer: User State and Roles
    
    RESTServer->>AuthorizationLayer: Validate Endpoint Access
    AuthorizationLayer->>UserContext: Confirm User Permissions
    
    alt Endpoint Authorized
        AuthorizationLayer-->>RESTServer: ✅ Green: Allow Full Access
        RESTServer->>Client: Return Complete Resource
    else Endpoint Unauthorized
        AuthorizationLayer-->>RESTServer: 🔴 Red: Deny Access
        RESTServer->>Client: 403 Forbidden Error
    end

    Note over Client,RESTServer: Authorization Simplicity
    Note over AuthorizationLayer: Single Check:
    Note over AuthorizationLayer: 1. Endpoint Permission
    Note over AuthorizationLayer: 2. User Role
    Note over RESTServer: Entire Resource Allowed or Denied

Rate limiting

GraphQL lets clients build deep, nested queries, and attackers can abuse this to overload your server with exponentially expensive queries.

Example schema

type Article {
  title: String
  tags: [Tag]
}
type Tag {
  name: String
  relatedTags: [Tag]
}

Problem query

query {
  tag(name: "security") {
    relatedTags {
      relatedTags {
        relatedTags {
          relatedTags {
            relatedTags { name }
          }
        }
      }
    }
  }
}

  • Assumed cost (estimating 5 related tags per tag): 5^5 = 3,125 leaf nodes
  • Real cost (if each tag has 10 related tags): 10^5 = 100,000 leaf nodes
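The arithmetic behind those two figures can be checked with a few lines of Ruby. The helper names are mine, and the model assumes every list returns the same number of items at every level, which real data will not do.

```ruby
# Worst-case number of leaf nodes for a query that nests a list field
# `depth` times, assuming every list returns `fan_out` items.
def leaf_count(fan_out, depth)
  fan_out**depth
end

# Total nodes resolved across all levels:
# fan_out + fan_out^2 + ... + fan_out^depth.
def total_nodes(fan_out, depth)
  (1..depth).sum { |level| fan_out**level }
end

puts leaf_count(5, 5)    # 3125
puts leaf_count(10, 5)   # 100000
puts total_nodes(10, 5)  # 111110
```

Doubling the list length from 5 to 10 multiplies the leaf count by 32, which is why a static estimate that guesses the list length wrong can be off by orders of magnitude.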

GraphQL mitigations

  • Static estimate: Use field weights, but risky with unknown list lengths
  • Credit buckets: Track actual cost per user/session over time
  • Depth limits: Help, but are not enough on their own
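The credit bucket idea can be sketched as a small class. This is an illustration under my own assumptions, not a specific library's API: the class name, the capacity and refill numbers, and the injectable `now:` clock are all hypothetical.

```ruby
# A per-session credit bucket: each query spends credits equal to its
# measured cost, and credits refill at a fixed rate up to a capacity.
class CreditBucket
  def initialize(capacity:, refill_per_second:, now: Time.now.to_f)
    @capacity = capacity
    @refill_per_second = refill_per_second
    @credits = capacity.to_f
    @updated_at = now
  end

  # Charges `cost` credits if available; returns false when the session
  # is over budget and the query should be rejected.
  def spend(cost, now: Time.now.to_f)
    refill(now)
    return false if cost > @credits

    @credits -= cost
    true
  end

  private

  # Adds credits for the time elapsed since the last update, capped at capacity.
  def refill(now)
    elapsed = now - @updated_at
    @credits = [@capacity, @credits + elapsed * @refill_per_second].min
    @updated_at = now
  end
end

bucket = CreditBucket.new(capacity: 1000, refill_per_second: 10, now: 0.0)
p bucket.spend(900, now: 0.0)
p bucket.spend(200, now: 1.0)
```

In practice you would keep one bucket per user or session in shared storage, and charge the actual cost after execution instead of a static estimate.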

REST rate limit example

Rack::Attack.throttle('API v1', limit: 200, period: 60) do |req|
  if req.path =~ /\/api\/v1\//
    req.env['rack.session']['session_id']
  end
end

Bottom line: GraphQL needs layered defenses (estimate, track, limit, and monitor), or you risk denial of service via query abuse.

---
    header: Mitigation Strategy
---
flowchart LR
    A[GraphQL Query] --> B{Complexity Estimation}
    
    B --> |Field Depth| C[Nested Resolver Depth]
    B --> |List Expansion| D[Recursive Tag Relationships]
    B --> |Directive Count| E[Multiple Directives]
    
    C --> F{Complexity Calculation}
    D --> F
    E --> F
    
    F --> |Exponential Growth| G[Complexity Score]
    
    G --> H{Rate Limiting}
    
    H --> |Exceed Threshold| I[Query Rejected]
    H --> |Within Limits| J[Query Processed]
    
    subgraph Mitigation Strategies
        K[Maximum Depth Restriction]
        L[Complexity Credits]
        M[Interval-Based Tracking]
    end
    
    Note[Preventing Expensive Queries]

Query parsing

Before a query is executed, it is first parsed. We once received a pen-test report showing that it's possible to craft an invalid query string that OOMs the server.

The query in question was syntactically valid but invalid for our schema. A spec-compliant server will parse it and start building an errors response containing thousands of errors, which we found consumed 2,000x more memory than the query string itself. Because of this memory amplification, it's not enough to just limit the payload size, as you will have valid queries that are larger than the smallest dangerous malicious query.

If your server exposes a limit on the number of errors to accrue before abandoning parsing, this can be mitigated. If not, you'll have to roll your own solution. There is no REST equivalent of this attack with comparable severity.
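To show what an error cap looks like in principle, here is a toy validator in plain Ruby. It is not how any particular GraphQL server works, and the hostile input (thousands of unknown field names) is only my assumption about one shape of query that fits the description above. `TooManyErrors`, `validate_fields`, and `max_errors:` are hypothetical names.

```ruby
require "set"

class TooManyErrors < StandardError; end

# Collects "unknown field" errors but gives up once `max_errors` is
# reached, so a hostile query cannot force an unbounded error list.
def validate_fields(requested, known_fields, max_errors:)
  errors = []
  requested.each do |name|
    next if known_fields.include?(name)

    errors << "Field '#{name}' doesn't exist"
    raise TooManyErrors, "aborted after #{max_errors} errors" if errors.size >= max_errors
  end
  errors
end

known = Set["handle", "email"]
hostile = Array.new(10_000) { |i| "bogus#{i}" }

begin
  validate_fields(hostile, known, max_errors: 100)
rescue TooManyErrors => e
  puts e.message
end
```

With the cap in place, the memory spent on errors is bounded by `max_errors` no matter how large the query is.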

Performance

When it comes to performance in GraphQL, people often talk about its incompatibility with HTTP caching. For me personally, this has not been an issue. For SaaS applications, data is usually highly user specific and serving stale data is unacceptable, so I have not found myself missing response caches (or the cache invalidation bugs they cause…).

The major performance problems I did find myself dealing with were…

Data fetching and the N+1 problem

I think this issue is pretty widely understood nowadays. TLDR: if a field resolver hits an external data source such as a DB or HTTP API, and it is nested in a list containing N items, it will do those calls N times.

flowchart LR
    A[Initial Query: Fetch Users] --> B{Resolver}
    B --> |Nested Friends| C[Fetch Friend 1]
    B --> |Repeated for Each User| D[Fetch Friend 2]
    B --> |N Times| E[Fetch Friend N]
    
    C --> F[Additional DB Calls]
    D --> F
    E --> F
    
    subgraph Performance Impact
        F --> G[Exponential Query Complexity]
        G --> H[Increased Latency]
        G --> I[Resource Intensive]
    end
    
    Note[N+1 Query: Inefficient Data Fetching]

This problem is not unique to GraphQL, and the strict GraphQL resolution algorithm has actually allowed most libraries to share a common solution: the Dataloader pattern. What is unique to GraphQL is that, since it is a query language, this can become a problem with no backend changes when a client modifies a query. As a result, I found you end up having to defensively introduce the Dataloader abstraction everywhere, just in case a client ends up fetching a field in a list context in the future. This is a lot of boilerplate to write and maintain.
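For readers who have not met the pattern, the core idea can be sketched in plain Ruby. This is not the API of graphql-ruby's Dataloader or of any other library: `BatchLoader`, the `AUTHORS` table, and the `batch_calls` counter are hypothetical, and real implementations typically use fibers or promises instead of explicit lambdas.

```ruby
# A minimal batch loader: callers register the keys they need, and the
# first read triggers one batched fetch for every pending key.
class BatchLoader
  attr_reader :batch_calls

  def initialize(&batch_fetch)
    @batch_fetch = batch_fetch # receives an array of keys, returns a hash of key => value
    @pending = []
    @cache = {}
    @batch_calls = 0
  end

  # Queues a key and returns a lambda that resolves the value lazily.
  def load(key)
    @pending << key unless @cache.key?(key) || @pending.include?(key)
    -> { resolve(key) }
  end

  private

  # Fetches all pending keys in one call, then serves reads from the cache.
  def resolve(key)
    unless @pending.empty?
      @cache.merge!(@batch_fetch.call(@pending))
      @pending = []
      @batch_calls += 1
    end
    @cache[key]
  end
end

# Hypothetical data source standing in for a database table of authors.
AUTHORS = { 1 => "Ada", 2 => "Grace", 3 => "Linus" }.freeze

loader = BatchLoader.new { |ids| AUTHORS.slice(*ids) }
thunks = [1, 2, 3, 1].map { |id| loader.load(id) }
p thunks.map(&:call)
p loader.batch_calls
```

Four loads collapse into a single fetch here, which is the whole benefit; the cost is that every resolver has to go through the loader instead of calling the data source directly.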

Meanwhile, in REST, we can generally hoist nested N+1 queries up to the controller, which I think is a pattern much easier to wrap your head around:

class BlogsController < ApplicationController
  def index
    @latest_blogs = Blog.limit(25).includes(:author, :tags)
    render json: BlogSerializer.render(@latest_blogs)
  end

  def show
    # No prefetching necessary here since N=1
    @blog = Blog.find(params[:id])
    render json: BlogSerializer.render(@blog)
  end
end

Authorisation and the N+1 problem

But wait, there's more N+1s! If you followed the advice earlier of integrating with your library's authorisation framework, you've now got a whole new category of N+1 problems to deal with. Let's continue with our X API example from earlier:

class UserType < GraphQL::BaseObject
  field :handle, String
  field :birthday, authorize_with: :view_pii
end

class UserPolicy < ApplicationPolicy
  def view_pii?
    # Oh no, I hit the DB to fetch the user's friends
    user.friends_with?(record)
  end
end

query {
  me {
    friends { # returns N Users
      handle
      birthday # runs UserPolicy#view_pii? N times
    }
  }
}

This is actually trickier to deal with than our previous example, because authorisation code is not always run in a GraphQL context. It may for example be run in a background job or an HTML endpoint. That means we can't just reach for a Dataloader naively, because Dataloaders expect to be run from within GraphQL (in the Ruby implementation anyway).

In my experience, this is actually the biggest source of performance issues. We would regularly find that our queries were spending more time authorising data than anything else. Again, this problem simply does not exist in the REST world.

I have mitigated this using nasty things like request-level globals to memoise data across policy calls, but it's never felt great.
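A slightly less nasty variant of that memoisation is an explicit per-request object instead of a global. The sketch below is only an illustration: `RequestContext`, the in-memory `friendships` hash, and the `friend_lookups` counter are hypothetical stand-ins for a real request scope and database.

```ruby
require "set"

# Hypothetical per-request cache: friend ids are fetched once and reused
# by every policy check made while serving the same request.
class RequestContext
  attr_reader :friend_lookups

  def initialize(viewer_id, friendships)
    @viewer_id = viewer_id
    @friendships = friendships # stand-in for a friendships table
    @friend_lookups = 0
  end

  def friend_ids
    @friend_ids ||= begin
      @friend_lookups += 1 # counts simulated database round trips
      Set.new(@friendships.fetch(@viewer_id, []))
    end
  end
end

# Policy check in the spirit of UserPolicy#view_pii?, backed by the request cache.
def view_pii?(context, record_id)
  context.friend_ids.include?(record_id)
end

context = RequestContext.new(1, { 1 => [2, 3] })
p [2, 3, 4].map { |id| view_pii?(context, id) }
p context.friend_lookups
```

Because the cache lives on a plain object, the same policy code can run in a background job or an HTML endpoint as long as the caller passes a context in.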

Coupling

In my experience, in a mature GraphQL codebase, your business logic is forced into the transport layer. This happens through a number of mechanisms, some of which we've already talked about:

flowchart LR
    A[Business Logic] --> B{GraphQL Layer}
    
    B --> |Authorization| C[Scattered Permission Checks]
    B --> |Data Fetching| D[Dataloader Complexity]
    B --> |Resolver Logic| E[Transport Layer Entanglement]
    
    C --> F[Multiple Authorization Points]
    D --> G[Repeated Data Fetching Patterns]
    E --> H[Difficult Integration Testing]
    
    subgraph Challenges
        F
        G
        H
    end
    
    Note[Business Logic Leakage into GraphQL]

The net effect of all of this is that, to meaningfully test your application, you must test extensively at the integration layer, i.e. by running GraphQL queries. I have found this makes for a painful experience. Any errors encountered are captured by the framework, leading to the fun task of reading stack traces in JSON GraphQL error responses. Since so much around authorisation and Dataloaders happens inside the framework, debugging is often much harder, as the breakpoint you want is not in application code.

And of course, again, since it's a query language, you're going to be writing a lot more tests to confirm that all those argument and field-level behaviours we mentioned are working correctly.

Complexity

Taken in aggregate, the various mitigations to security and performance issues we've gone through add significant complexity to a codebase. It's not that REST does not have these problems (though it certainly has fewer), it's just that the REST solutions are generally much simpler for a backend developer to implement and understand.

And more…

So those are the major reasons I am, for the most part, over GraphQL. I have a few more peeves, but to keep this article from growing further I'll summarise them here.

  • GraphQL discourages breaking changes and provides no tools to deal with them. This adds needless complexity for those who control all their clients, who will have to find workarounds.
  • Reliance on HTTP response codes turns up everywhere in tooling, so dealing with the fact that a 200 can mean anything from "everything is OK" through to "everything is down" can be quite annoying.
  • Fetching all your data in one query in the HTTP/2+ age is often not beneficial to response time. In fact, it will worsen it if your server is not parallelised, compared with sending separate requests to separate servers to process in parallel.
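The status code point is easy to demonstrate: tooling has to look inside the body to learn whether a request worked. Below is a minimal sketch using only Ruby's standard `json` library. The `graphql_outcome` helper and its four symbols are my own naming; the `data` and `errors` keys are the ones GraphQL responses conventionally use.

```ruby
require "json"

# Classifies a GraphQL-over-HTTP response by inspecting the body, since
# a 200 status alone does not indicate success.
def graphql_outcome(status, body)
  return :transport_error unless status == 200

  payload = JSON.parse(body)
  errors = payload["errors"] || []
  return :ok if errors.empty?

  # Errors with no data means total failure; errors alongside data means
  # some fields resolved and others did not.
  payload["data"].nil? ? :failed : :partial
end

p graphql_outcome(200, '{"data":{"me":{"handle":"a"}}}')
p graphql_outcome(200, '{"data":null,"errors":[{"message":"boom"}]}')
p graphql_outcome(200, '{"data":{"me":null},"errors":[{"message":"denied"}]}')
```

Monitoring, retries, and alerting built around status codes all need an adapter like this before they can tell these three cases apart.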

Alternatives

Ok, end of the rant. What would I recommend instead? To be up front, I am definitely early in the hype cycle here, but right now my view is that if you:

  1. Control all your clients
  2. Have ≤3 clients
  3. Have a client written in a statically typed language
  4. Are using >1 language across the server and clients [2]

You are probably better off exposing an OpenAPI 3.0+ compliant JSON REST API. If, as in my experience, the main thing your frontend devs like about GraphQL is its self documenting type safe nature, I think this will work well for you. Tooling in this area has improved a lot since GraphQL came on the scene; there are many options for generating typed client code even down to framework specific data fetching libraries. My experience so far is pretty close to "the best parts of what I used GraphQL for, without the complexity Facebook needed".

As with GraphQL, there are a couple of implementation approaches:

Implementation first tooling generates OpenAPI specs from a typed / type hinted server. FastAPI in Python and tsoa in TypeScript are good examples of this approach [3]. This is the approach I have the most experience with, and I think it works well.

Specification first is equivalent to "schema first" in GraphQL. Spec first tooling generates code from a hand written spec. I can't say I've ever looked at an OpenAPI YAML file and thought "I would love to have written that myself", but the recent release of TypeSpec changes things entirely. With it could come a quite elegant schema first workflow:

  1. Write a succinct human readable TypeSpec schema
  2. Generate an OpenAPI YAML spec from it
  3. Generate statically typed API client for your frontend language of choice (e.g. TypeScript)
  4. Generate statically typed server handlers for your backend language & server framework (e.g. TypeScript + Express, Python + FastAPI, Go + Echo)
  5. Write an implementation for that handler that compiles, safe in the knowledge that it will be type safe

This approach is less mature but I think has a lot of promise.

To me, it seems like powerful and simpler options are here, and I'm excited to learn their drawbacks next 😄.

Thanks for reading! See Hacker News and Reddit for more discussion on this article.

  1. Persisted queries are also a mitigation for this and many attacks, but if you actually want to expose a customer facing GraphQL API, persisted queries are not an option.
  2. Otherwise a language specific solution like tRPC might be a better fit.
  3. In Ruby, I guess because type hints are not popular, there is no equivalent approach. Instead we have rswag which generates OpenAPI specs from request specs. It would be cool if we could build an OpenAPI spec from Sorbet / RBS typed endpoints!

Visualization of GraphQL Challenges

Authorization Complexity

sequenceDiagram
    participant Client
    participant GraphQLServer
    participant AuthorizationLayer
    participant Database

    Client->>GraphQLServer: Complex Multi-Field Query
    GraphQLServer->>AuthorizationLayer: Validate Field Access
    loop For Each Requested Field
        AuthorizationLayer->>Database: Check User Permissions
        Database-->>AuthorizationLayer: Permission Results
        
        alt Field Authorized
            AuthorizationLayer-->>GraphQLServer: Allow Access
        else Field Unauthorized
            AuthorizationLayer-->>GraphQLServer: Deny Access
        end
    end
    
    Note over Client,GraphQLServer: N+1 Authorization Overhead
    Note over AuthorizationLayer: Repeated Permission Checks
    Note over Database: Multiple Unnecessary Queries

N+1 Query Problem

flowchart LR
    A[Initial Query: Fetch Users] --> B{Resolver}
    B --> |Nested Friends| C[Fetch Friend 1]
    B --> |Repeated for Each User| D[Fetch Friend 2]
    B --> |N Times| E[Fetch Friend N]
    
    C --> F[Additional DB Calls]
    D --> F
    E --> F
    
    subgraph Performance Impact
        F --> G[Exponential Query Complexity]
        G --> H[Increased Latency]
        G --> I[Resource Intensive]
    end
    
    Note[N+1 Query: Inefficient Data Fetching]

Query Complexity and Rate Limiting

flowchart TB
    A[GraphQL Query] --> B{Complexity Estimation}
    
    B --> |Field Depth| C[Nested Resolver Depth]
    B --> |List Expansion| D[Recursive Tag Relationships]
    B --> |Directive Count| E[Multiple Directives]
    
    C --> F{Complexity Calculation}
    D --> F
    E --> F
    
    F --> |Exponential Growth| G[Complexity Score]
    
    G --> H{Rate Limiting}
    
    H --> |Exceed Threshold| I[Query Rejected]
    H --> |Within Limits| J[Query Processed]
    
    subgraph Mitigation Strategies
        K[Maximum Depth Restriction]
        L[Complexity Credits]
        M[Interval-Based Tracking]
    end
    
    Note[Preventing Expensive Queries]

Architectural Coupling

flowchart LR
    A[Business Logic] --> B{GraphQL Layer}
    
    B --> |Authorization| C[Scattered Permission Checks]
    B --> |Data Fetching| D[Dataloader Complexity]
    B --> |Resolver Logic| E[Transport Layer Entanglement]
    
    C --> F[Multiple Authorization Points]
    D --> G[Repeated Data Fetching Patterns]
    E --> H[Difficult Integration Testing]
    
    subgraph Challenges
        F
        G
        H
    end
    
    Note[Business Logic Leakage into GraphQL]

Introspection and Schema Exposure

flowchart TB
    A[GraphQL Introspection] --> B{Schema Exposure}
    
    B --> |Full Schema| C[Detailed Type Information]
    B --> |Field Details| D[Resolver Insights]
    B --> |Relationship Mapping| E[Internal Structure]
    
    C --> F[Potential Security Risks]
    D --> F
    E --> F
    
    F --> G{Mitigation Strategies}
    
    G --> |Disable Introspection| H[Limit Schema Visibility]
    G --> |Partial Exposure| I[Controlled Type Revelation]
    G --> |Authentication| J[Restrict Schema Access]
    
    subgraph Attack Surface
        K[Automated Exploration]
        L[Schema Enumeration]
        M[Vulnerability Discovery]
    end
    
    Note[Reducing Information Leakage]

Note: These diagrams summarise the GraphQL challenges discussed above. Full context is available in the article text.

Michael Wybraniec

Freelance, MCP Servers, Full-Stack Development, Architecture