The Right Way to Generate a Financial Model with AI
Generating a financial model with AI is now easy. You open Claude or ChatGPT, describe the business, and a few seconds later you have revenue lines, a cost structure, an EBITDA bridge, maybe a cash flow. It looks finished.
Keeping that model is the hard part. The day you change one assumption, the links don't follow. Values quietly freeze into hardcodes. Next session, the AI has forgotten what the model was. The output that looked finished turns out to be disposable.
This is the gap that matters. Most advice on "AI financial modeling" optimizes the moment of generation. The practitioners who actually rely on these models have learned that the moment of generation is the cheap part. What's expensive is the life of the model: the edits, the scenarios, the review, the next session. It is also why so much AI work stalls: by one widely cited estimate, 95% of enterprise generative AI pilots deliver no measurable bottom-line impact, not because the models are weak, but because the output never makes it into a durable workflow.
So the real question is not "how do I generate a model with AI?" It's "how do I generate one that survives contact with reality?"
Why the Naive Way Breaks
The default workflow is a chat window and a spreadsheet. You prompt, the AI writes cells, you copy them into Excel. It works for the first draft and falls apart after it.
Three failures show up every time, and they are the same three the best AI-native finance tools now name openly:
It drifts. A spreadsheet stores results, not logic. The relationships between variables live in the arrangement of cells, implicit and undocumented. When the AI rewrites cells to reflect a new assumption, the dependency chain breaks silently. The model does roughly 80% of what you wanted and drifts on the 20% that actually matters: the linked variables, the propagation, the edge cases.
It forgets. Each new session starts from zero. The AI has no memory of your model's structure, your conventions, or the decisions you made last week. You re-explain the business, the sign conventions, the way you compute working capital, every single time.
It burns tokens. Because a spreadsheet is an opaque grid with no map, the AI re-reads the whole thing every time you change one thing. Practitioners report hitting their token limit in under ten prompts on a real file. The cost is not only money: it's the iteration loop, the constant juggling between chat and sheet.
None of these are prompt problems. You cannot prompt your way out of them, because the cause is structural: logic and data are fused together in cells, and the AI is working against a grid instead of a model.
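The drift failure is easy to reproduce in a few lines. The sketch below is illustrative only: the cell addresses, variable names, and figures are invented, and the "grid" is a plain dictionary standing in for a sheet where a formula result was pasted as a value.

```python
# A grid stores values; a model stores the rule that produces them.
# All names and figures here are illustrative assumptions.

# Grid style: the result of price * units was pasted in as a number.
grid = {"B1": 50.0, "B2": 1_000, "B3": 50_000.0}  # B3 was once =B1*B2

# Model style: revenue is defined by its rule, so it cannot go stale.
assumptions = {"price": 50.0, "units": 1_000}


def revenue(a: dict) -> float:
    """Revenue is always recomputed from the current assumptions."""
    return a["price"] * a["units"]


# Change the same assumption in both representations.
grid["B1"] = 55.0
assumptions["price"] = 55.0

grid_revenue = grid["B3"]             # the hardcode did not follow the change
model_revenue = revenue(assumptions)  # the rule was re-evaluated

print(grid_revenue, model_revenue)
```

Nothing in the grid signals that B3 is stale, which is why this kind of break goes unnoticed until someone reconciles the numbers.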
"Bad Data In, Bad Data Out" Applies to Structure Too
The founder of a leading finance scaleup made a point recently, in a public conversation, that finance people already know but AI enthusiasts often skip: AI does not do magic. Bad data in, bad data out. The output is only as good as the context you feed it.
His own answer was to describe the agent layer his product exposes as "the recipe given to the agents so they query the data correctly without getting tripped up on the special cases." Not raw access to data. Encoded conventions. Guardrails.
That is the whole insight, and it generalizes. The quality of an AI-generated financial model is set before generation, by the structure and the conventions you give the AI to work with. A model is not a pile of numbers. It is a set of relationships and a set of rules. If the AI never sees the relationships and never learns the rules, it will reconstruct them, badly, on every prompt.
He also admitted something telling: the one thing his web-based agent cannot do well is the scenario-driven forecast, the three-year plan you reshape by prompting ("make the optimistic case more optimistic") and have it re-edit cleanly. That, he said, only works through an on-premises cowork agent like Claude Code. Which is exactly the workflow this article is about, and exactly where raw Excel drifts and a structured layer does not.
The Right Way, in Four Principles
Generating a financial model with AI that holds up comes down to giving the AI something better than a grid to work with.
1. Give the AI structure, not cells
Instead of letting the AI write into an opaque sheet, give it a model: named variables, explicit types, a dependency graph. When the AI addresses "revenue" or "headcount cost" by name rather than by cell reference, it can reason about the model instead of scanning it. This is what kills both drift and token waste at the root.
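To make "named variables, explicit types, a dependency graph" concrete, here is a minimal sketch using only the Python standard library. The `Variable` class, the `evaluate` function, and every variable name and figure are assumptions made for illustration; this is not how any particular tool implements it.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class Variable:
    name: str
    unit: str                                       # explicit type, e.g. "EUR" or "count"
    inputs: Tuple[str, ...] = ()                    # names this variable depends on
    formula: Optional[Callable[..., float]] = None  # None means "assumption"


def evaluate(variables: Dict[str, Variable], assumptions: Dict[str, float]) -> Dict[str, float]:
    """Compute every variable in dependency order."""
    graph = {v.name: set(v.inputs) for v in variables.values()}
    values: Dict[str, float] = {}
    # static_order() yields each name only after all of its inputs.
    for name in TopologicalSorter(graph).static_order():
        var = variables[name]
        if var.formula is None:
            values[name] = assumptions[name]
        else:
            values[name] = var.formula(*(values[i] for i in var.inputs))
    return values


# Hypothetical model: names and figures are invented for this example.
MODEL = {v.name: v for v in [
    Variable("price", "EUR"),
    Variable("units", "count"),
    Variable("headcount", "count"),
    Variable("cost_per_head", "EUR"),
    Variable("revenue", "EUR", ("price", "units"), lambda p, u: p * u),
    Variable("headcount_cost", "EUR", ("headcount", "cost_per_head"), lambda h, c: h * c),
    Variable("ebitda", "EUR", ("revenue", "headcount_cost"), lambda r, c: r - c),
]}

base = {"price": 50.0, "units": 1_000, "headcount": 4, "cost_per_head": 8_000.0}
before = evaluate(MODEL, base)
after = evaluate(MODEL, {**base, "price": 55.0})  # one assumption changes

print(before["ebitda"], after["ebitda"])
```

An edit to "price" is a one-key change, and everything downstream is recomputed from the graph rather than rewritten by hand. The AI only needs the variable names and their inputs to know what an edit touches.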
2. Encode your conventions once
Your business rules (the way you compute working capital, your sign convention, your sector assumptions) should be written down once and reused in every session, not rediscovered through correction. One experienced ex-CFO who built a full business plan with AI put it plainly: she should have set all her business rules at the start, instead of adjusting "a little every week." A conventions file (Layerz calls it FINANCE.md) is where that lives. The AI reads it and operates inside your rules from the first prompt.
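As a rough sketch of what "written down once and reused" can mean mechanically: the conventions live in one text, get loaded at the start of every session, and can also be checked against the model. The format of FINANCE.md is not described here, so the keys, values, and helper functions below are invented for illustration only.

```python
# Hypothetical conventions text. The real FINANCE.md format is not described
# in this article; these keys and values are invented for illustration.
CONVENTIONS_TEXT = """\
# sign and currency
costs_sign: negative
currency: EUR
working_capital: receivables + inventory - payables
"""


def parse_conventions(text: str) -> dict:
    """Parse 'key: value' lines, ignoring blank lines and '#' comments."""
    rules = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        rules[key.strip()] = value.strip()
    return rules


def sign_violations(rules: dict, cost_lines: dict) -> list:
    """Return the cost lines whose sign contradicts the stated convention."""
    want_negative = rules["costs_sign"] == "negative"
    return [name for name, value in cost_lines.items()
            if value != 0 and (value < 0) != want_negative]


def session_preamble(rules: dict) -> str:
    """Build the text that would be prepended to every new AI session."""
    lines = [f"- {key}: {value}" for key, value in rules.items()]
    return "Follow these conventions:\n" + "\n".join(lines)


rules = parse_conventions(CONVENTIONS_TEXT)
violations = sign_violations(rules, {"salaries": -32_000.0, "rent": 6_000.0})

print(session_preamble(rules))
print(violations)
```

The point of the design is that the same file serves two purposes: it is context for the AI and a test the model can be held to.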
3. Keep the structure persistent and versioned
The model should not evaporate when the session ends. A persistent, versioned structure means the AI picks up where it left off, you can see what changed and roll back, and the model gets better across sessions instead of starting over. Persistence is what turns a one-shot generation into a model you actually own.
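A minimal sketch of "see what changed and roll back", assuming the model state can be serialized to JSON. The `ModelHistory` class is hypothetical and keeps its history in memory; a real setup would write the versions to disk or to a version control system.

```python
import hashlib
import json


class ModelHistory:
    """Append-only history of model snapshots (illustrative, in-memory only)."""

    def __init__(self) -> None:
        self._versions = []

    def commit(self, state: dict, message: str) -> str:
        """Store a snapshot and return a short content-based version id."""
        payload = json.dumps(state, sort_keys=True)
        version_id = hashlib.sha256(payload.encode()).hexdigest()[:8]
        self._versions.append(
            {"id": version_id, "message": message, "state": json.loads(payload)}
        )
        return version_id

    def _state(self, version_id: str) -> dict:
        for version in self._versions:
            if version["id"] == version_id:
                return version["state"]
        raise KeyError(version_id)

    def diff(self, old_id: str, new_id: str) -> dict:
        """Map each changed key to its (old, new) pair of values."""
        old, new = self._state(old_id), self._state(new_id)
        return {key: (old.get(key), new.get(key))
                for key in sorted(old.keys() | new.keys())
                if old.get(key) != new.get(key)}

    def rollback(self, version_id: str) -> dict:
        """Return a copy of the state as it was at the given version."""
        return dict(self._state(version_id))


history = ModelHistory()
v1 = history.commit({"price": 50.0, "units": 1000}, "base case")
v2 = history.commit({"price": 55.0, "units": 1000}, "raise price")

changes = history.diff(v1, v2)
restored = history.rollback(v1)
print(changes, restored)
```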
4. Keep Excel as the output, not the workspace
Excel is still the deliverable. Your counterpart, your board, your auditor all open a standard .xlsx. The mistake is living inside Excel while you build. The right pattern is to build in a structured layer and export to Excel on demand: clean, reproducible, auditable. The spreadsheet is a compatibility format on the way out, not the place the logic lives.
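One way to picture "export on demand" is a function that lays the named model out in a column and translates names into cell references. The sketch below is a simplification under stated assumptions: the variable names and figures are invented, formulas contain only names and operators (no numeric literals or functions), and writing the rows to an actual .xlsx file is left to a spreadsheet library such as openpyxl.

```python
import re

# Hypothetical named model: assumptions hold numbers, formulas reference names.
assumptions = {"price": 50.0, "units": 1000, "headcount_cost": 32000.0}
formulas = {"revenue": "price * units", "ebitda": "revenue - headcount_cost"}


def to_sheet_rows(assumptions: dict, formulas: dict) -> list:
    """Return (label, value_or_formula) rows for columns A and B."""
    names = list(assumptions) + list(formulas)
    # Each variable gets one row; its value lives in column B of that row.
    cell = {name: f"B{row}" for row, name in enumerate(names, start=1)}

    def as_excel(expr: str) -> str:
        # Replace every identifier with the cell it was assigned.
        # This handles names only; a real exporter needs a proper parser.
        return "=" + re.sub(r"[A-Za-z_]\w*", lambda m: cell[m.group(0)], expr)

    rows = [(name, value) for name, value in assumptions.items()]
    rows += [(name, as_excel(expr)) for name, expr in formulas.items()]
    return rows


rows = to_sheet_rows(assumptions, formulas)
for label, content in rows:
    print(label, content)
```

Because the sheet is generated from the model, the exported formulas stay live for whoever opens the file, and regenerating the export after an edit gives the same layout every time.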
What This Looks Like in Practice
The naive workflow: prompt, paste into Excel, prompt again, watch the model drift, hit the token wall, start a fresh session next week and re-explain everything.
The structured workflow: the AI builds against a named, typed model with your conventions loaded. You change one assumption and the change propagates through the dependency graph instead of breaking it. Tokens go to the edit you asked for, not to re-reading the whole grid. The structure persists, so next session resumes instead of restarting. And at any point, you export a clean Excel file that anyone can open.
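Scenario edits fit the same pattern. In the sketch below, a scenario is just a set of named overrides on the base assumptions, so a request like "make the optimistic case more optimistic" becomes a one-key change. The function names, scenario names, and figures are illustrative assumptions.

```python
# Hypothetical base assumptions and scenarios; all figures are invented.
BASE = {"price": 50.0, "units": 1000, "headcount": 4, "cost_per_head": 8000.0}

SCENARIOS = {
    "base": {},
    "optimistic": {"units": 1200},
    "pessimistic": {"units": 900},
}


def evaluate(a: dict) -> dict:
    """Compute the outputs from one complete set of assumptions."""
    revenue = a["price"] * a["units"]
    costs = a["headcount"] * a["cost_per_head"]
    return {"revenue": revenue, "costs": costs, "ebitda": revenue - costs}


def run(scenario: str) -> dict:
    """Apply a scenario's overrides on top of the base case, then evaluate."""
    return evaluate({**BASE, **SCENARIOS[scenario]})


base_case = run("base")
optimistic = run("optimistic")

# "Make the optimistic case more optimistic": edit one override, nothing else.
SCENARIOS["optimistic"]["units"] = 1300
more_optimistic = run("optimistic")

print(base_case["ebitda"], optimistic["ebitda"], more_optimistic["ebitda"])
```

The base case and the formulas are untouched by the edit, which is what keeps scenarios comparable with one another.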
Same AI. Completely different outcome. The difference is not the prompt or the model. It's whether the AI is generating into structure or into a grid.
Where to Go Deeper
The fixes are covered in detail in three companion pieces:
- The token problem and how to stop burning your budget on grids: How to Save Tokens When Building Financial Models with AI
- The drift problem and how to keep the logic intact through edits: How to Build a Financial Model with AI That Doesn't Drift
- The traceability problem and how to make the model defensible: How to Build an Auditable Financial Model with AI
The right way to generate a financial model with AI is not a better prompt. It's a better thing for the AI to work with: structure instead of cells, conventions instead of guesses, persistence instead of a blank session. Generation is the easy 80%. The structure is what saves the 20% that decides whether the model is worth defending.
Layerz is the finance workspace that gives Claude that structure. It holds the model so the AI doesn't have to, keeps your conventions in a FINANCE.md, versions every change, and exports clean Excel anytime.