
Every AI agent I have shipped in the past 18 months was built with one of three orchestration approaches: LangGraph, n8n, or a custom state machine written in TypeScript or Python. Each of these has been the right choice in at least one project and a painful mistake in at least one other. This is not a survey of the docs — it is what I learned after production failures.
What you are actually choosing between
The three approaches differ not in capability but in where complexity lives.
n8n puts complexity in a visual graph that non-engineers can read and modify. Execution is event-driven and managed by the n8n runtime. You do not write orchestration logic — you draw it. The trade-off is that anything unusual requires a Code node, and the Code node is where n8n workflows silently accumulate technical debt.
LangGraph puts complexity in Python code that engineers control precisely. The framework models your agent as a directed graph where nodes are functions and edges are conditions. State is typed, transitions are explicit, and the execution engine handles the loop. The trade-off is that every capability requires Python code, and the framework's own abstractions add a layer of indirection you have to debug through.
A custom state machine puts complexity wherever you put it, which is its strength and its problem. You own the state model, the transition logic, the retry behavior, and the persistence layer. There is nothing between your code and production. The trade-off is that you are also building the infrastructure that LangGraph and n8n provide for free.
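To make that concrete, here is a minimal sketch of what "owning the state model and transition logic" looks like. The phases, handlers, and retry policy are illustrative, not from any project described here — the point is that the entire "framework" is a dictionary of transitions and a while loop.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable

class Phase(Enum):
    COLLECTING = auto()
    PROCESSING = auto()
    DONE = auto()
    FAILED = auto()

@dataclass
class AgentState:
    phase: Phase = Phase.COLLECTING
    data: dict = field(default_factory=dict)
    retries: int = 0

def collect(state: AgentState) -> Phase:
    state.data["input"] = "user answers"  # stand-in for real I/O
    return Phase.PROCESSING

def process(state: AgentState) -> Phase:
    try:
        state.data["result"] = state.data["input"].upper()  # stand-in for an LLM call
        return Phase.PROCESSING if False else Phase.DONE
    except Exception:
        state.retries += 1
        # You own the retry policy: here, three attempts before giving up.
        return Phase.PROCESSING if state.retries < 3 else Phase.FAILED

# Transition table: current phase -> handler returning the next phase.
TRANSITIONS: dict[Phase, Callable[[AgentState], Phase]] = {
    Phase.COLLECTING: collect,
    Phase.PROCESSING: process,
}

def run(state: AgentState) -> AgentState:
    # The entire "framework": loop until a terminal phase is reached.
    while state.phase in TRANSITIONS:
        state.phase = TRANSITIONS[state.phase](state)
    return state

final = run(AgentState())
```

Everything a framework would hide — persistence, retries, observability — gets bolted onto this loop by you, which is exactly the trade described above.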
Where n8n wins and where it breaks
n8n is the right choice when the primary consumer of the workflow is not an engineer. If the client's ops team needs to edit trigger conditions, swap out email templates, or add new integration steps without a deployment cycle, n8n is hard to beat. The visual graph is genuinely readable to non-engineers in a way that Python code is not.

n8n also wins for integrations. The 400+ built-in nodes mean connecting to Notion, HubSpot, Airtable, Gmail, Slack, and Stripe requires no code. For workflows that are primarily integration-and-routing with AI as one step, n8n's integration library is a serious advantage.
Where n8n breaks: parallel agent execution with shared state. n8n's execution model is sequential by default. You can use SplitInBatches and merge nodes to simulate parallelism, but it is fragile. When I tried to build a multi-agent research workflow where four agents ran simultaneously and then reconciled results, the n8n implementation required three nested sub-workflows and a polling loop to check completion status. The same logic took 80 lines of Python in a custom state machine and was dramatically easier to debug. The n8n version failed 3–4% of the time due to race conditions in the merge node; the Python version never had a concurrency bug because I controlled the async execution explicitly.
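The "80 lines of Python" version reduces to something like the following sketch. The agent function here is a placeholder for a real LLM-backed agent, and the topic names are invented — the point is that asyncio.gather runs the four agents concurrently and returns results in input order, so the reconciliation step has no merge-node race to lose.

```python
import asyncio

async def research_agent(topic: str) -> dict:
    # Stand-in for a real LLM-backed research agent; the sleep
    # simulates network latency on the model call.
    await asyncio.sleep(0.01)
    return {"topic": topic, "finding": f"summary of {topic}"}

async def run_research(topics: list[str]) -> dict:
    # All agents run concurrently; gather preserves input order,
    # so reconciliation is a plain loop with no race conditions.
    results = await asyncio.gather(*(research_agent(t) for t in topics))
    return {r["topic"]: r["finding"] for r in results}

merged = asyncio.run(run_research(["pricing", "competitors", "reviews", "churn"]))
```

Because the merge happens after every coroutine has resolved, partial-completion states simply cannot occur — the failure mode the n8n polling loop existed to paper over.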
Where LangGraph wins and where it breaks
LangGraph is the right choice when you need typed state, clear execution traces, and the ability to add human-in-the-loop checkpoints without redesigning the whole system. The framework's checkpoint system lets you pause execution before any node, inspect the current state, and resume — which is essential for approval workflows and for debugging agents that make decisions you cannot easily reproduce.
LangGraph also handles cycles well. If your agent needs to loop — call a tool, evaluate the result, decide whether to call another tool or return to the user — LangGraph's graph model makes the loop explicit. In n8n, a loop requires a manual trigger and a polling mechanism. In a custom state machine, you implement the loop yourself. In LangGraph, a cycle is just an edge that points backward.
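For comparison, here is the hand-written equivalent of that backward edge — the loop you implement yourself in a custom state machine. The tool call and the "am I done?" check are stand-ins (a string match instead of an LLM judgment), but the shape is the real one: act, evaluate, and either loop or return, with a step cap so the cycle cannot run forever.

```python
def agent_loop(query: str, max_steps: int = 5) -> str:
    # The cycle LangGraph draws as a backward edge, written by hand:
    # call a tool, evaluate the result, then loop again or return.
    history = [query]
    for step in range(max_steps):
        observation = f"tool result {step} for {history[-1]}"  # stand-in tool call
        history.append(observation)
        if "result 2" in observation:  # stand-in for an LLM "am I done?" check
            return observation
    return history[-1]  # bail out after max_steps to avoid an infinite cycle

answer = agent_loop("find pricing data")
```

LangGraph's contribution is making this loop declarative and checkpointable rather than buried in a for statement, which matters once you need to pause, inspect, and resume mid-cycle.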
Where LangGraph breaks: overhead for simple workflows. A workflow that calls an LLM, parses the output, and posts a result somewhere does not need LangGraph's state management or checkpointing. Adding LangGraph to a simple linear workflow adds complexity without benefit and makes the codebase harder to explain to an engineer unfamiliar with the framework who has to maintain it. I have inherited two LangGraph codebases where the previous engineer used the framework for a workflow that would have been 50 lines of plain async Python. The LangGraph version was 200 lines and required familiarity with the framework's node/edge API to understand.
LangGraph also has versioning challenges. The framework moves fast. Breaking changes between 0.1 and 0.2 required non-trivial migration work on two active projects.
Where a custom state machine wins and where it hurts
Custom state machines are the right choice when you need behavior that no framework provides cleanly: non-standard retry logic, complex rollback behavior, multi-tenant state isolation, or integration with an existing application's data model.
The project where I reached for a custom state machine was a voice AI intake workflow where the agent needed to maintain state across multiple Vapi calls (a user calls in, the agent collects information, the user calls back 20 minutes later and the agent needs to continue from where it left off). Neither n8n nor LangGraph handled this naturally. A custom state machine with a Postgres-backed state store, a job queue, and a resume endpoint was the right architecture. It took longer to build but never had a reliability problem in 8 months of production.
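A stripped-down sketch of the resume mechanism looks like this. SQLite stands in for Postgres so the example is self-contained, and the table name, caller ID, and intake steps are all illustrative — the essential idea is that state is keyed by caller and every call, first or fifth, starts by loading whatever was saved last.

```python
import json
import sqlite3

# SQLite stands in for the Postgres-backed state store described above;
# intake_state, save_state, and load_state are illustrative names.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE intake_state (caller_id TEXT PRIMARY KEY, state TEXT)")

def save_state(caller_id: str, state: dict) -> None:
    db.execute(
        "INSERT INTO intake_state VALUES (?, ?) "
        "ON CONFLICT(caller_id) DO UPDATE SET state = excluded.state",
        (caller_id, json.dumps(state)),
    )

def load_state(caller_id: str) -> dict:
    row = db.execute(
        "SELECT state FROM intake_state WHERE caller_id = ?", (caller_id,)
    ).fetchone()
    # Unknown callers start at the beginning of the intake.
    return json.loads(row[0]) if row else {"step": "greeting", "answers": {}}

# First call: collect a partial answer, then the caller hangs up.
state = load_state("+15551234567")
state["answers"]["name"] = "Ada"
state["step"] = "insurance_info"
save_state("+15551234567", state)

# Second call 20 minutes later: resume exactly where the intake left off.
resumed = load_state("+15551234567")
```

The job queue and resume endpoint wrap around this store, but the durability guarantee — no in-memory session to lose between calls — is what neither framework gave for free.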
The cost of a custom state machine is maintenance. Every time a new engineer joins the project, you pay the onboarding tax of explaining your state model, your transition logic, and your error handling conventions. With LangGraph or n8n, there is documentation and a community. With a custom state machine, the documentation is whatever you wrote.
The decision framework
Start with n8n if: non-engineers need to modify the workflow, the primary work is integration rather than AI logic, and the workflow is mostly linear.
Start with LangGraph if: you need checkpointing, human-in-the-loop, typed state, or complex conditional branching in Python.
Start with a custom state machine if: you need behavior that frameworks do not support cleanly, you need deep integration with your application's existing data model, or reliability requirements are high enough that you cannot afford framework-level abstractions you do not control.
None of these choices is permanent. I have migrated n8n workflows to custom state machines when complexity outgrew the visual editor. I have replaced custom state machines with LangGraph when the project's requirements aligned with what the framework provides. The switching cost is real but manageable if you keep your core business logic out of the framework layer from the beginning.
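"Keeping business logic out of the framework layer" can be sketched concretely. The lead-qualification rule, thresholds, and adapter names below are invented for illustration, but the pattern is the one that keeps switching costs manageable: the rule is a plain function, and each orchestrator only gets a thin adapter around it.

```python
# Keep business logic as plain functions the orchestrator merely calls;
# switching frameworks then means rewriting only the thin adapter layer.
# qualify_lead, its thresholds, and both adapters are illustrative.

def qualify_lead(lead: dict) -> str:
    """Framework-agnostic business rule: the part worth protecting."""
    if lead.get("budget", 0) >= 10_000 and lead.get("timeline_days", 999) <= 90:
        return "qualified"
    return "nurture"

# LangGraph-style adapter: a node is just a function over graph state.
def qualify_node(state: dict) -> dict:
    return {**state, "status": qualify_lead(state["lead"])}

# n8n-style adapter: the same rule as the body of a Code node,
# mapped over n8n's items-with-json shape.
def n8n_code_node(items: list[dict]) -> list[dict]:
    return [{"json": {**i["json"], "status": qualify_lead(i["json"])}} for i in items]

result = qualify_node({"lead": {"budget": 25_000, "timeline_days": 30}})
```

When the framework changes, qualify_lead survives untouched; only the adapters — a few lines each — get rewritten.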
For cost modeling before you commit to an orchestration approach, the AI Agent Cost Estimator helps estimate token and runtime costs across different architectures so you can compare approaches before writing code.