Cost per run (effective)
$0.04
Model production AI-agent cost per run and per month with editable token pricing, embedding refresh, retrieval overhead, and fixed infrastructure.
Privacy: This tool runs entirely in your browser. Your input is not sent to our servers.
Cost per run (effective)
$0.04
Total monthly cost
$762.48
LLM monthly (in + out)
$61.70
Embedding monthly
$0.78
Input cost per run
$0.00
Output cost per run
$0.00
Lean assumes lower volume and tighter token discipline. Stress assumes higher traffic, larger contexts, and more retrieval pressure.
| Scenario | Monthly total | Effective cost/run |
|---|---|---|
| Lean | $669.26 | $0.05 |
| Base | $762.48 | $0.04 |
| Stress | $961.01 | $0.04 |
See the full breakdown in What RAG in production actually costs, then compare implementation patterns in Brief AI Agent.
Need help productionizing your model stack? Explore Custom AI Applications and browse Case Studies.
Advertisement
Yes. Retrieval overhead and fixed infrastructure fields are separate so you can model vector DB, cache, workers, logging, and monitoring explicitly.
No. You should paste current prices from your provider documentation. The tool is a transparent calculator, not a live pricing feed.
Yes. Set embedding and retrieval values to zero if your flow has no retrieval layer, then model pure prompt-completion workloads.