Claude Code for Finance: Building Financial Models That Hold Up

Claude Code started as a tool for developers: an agent that lives in your terminal, reads your files, runs code, and iterates on a project with you. But a growing number of finance people have noticed something: the same agent is unusually good at financial work. It reasons through a model, holds context across a long task, and edits real files instead of pasting snippets into a chat.

If you already use Claude on spreadsheets, Claude Code feels like a step up. It is. But the same wall that breaks chat plus Excel is still there, just further along. This article is about where Claude Code genuinely helps with finance, where it breaks, and the setup that makes it reliable enough to defend.

Why finance people are reaching for Claude Code

Most finance teams have already adopted AI, and nearly all of them now use it in their work (AI in finance statistics). The first wave was chat: open Claude, describe the business, get a model back. The second wave is agentic, and Claude Code is the clearest example of it.

What makes Claude Code different from a chat window for a finance task:

It works on files, not messages. The model lives in a file the agent reads and rewrites, not in a transcript you copy out of.
It holds a long task. Claude Code can carry out a multi-step job (build the revenue lines, wire the cost structure, check the cash flow) without you re-feeding context at every turn.
It can run things. It executes code, so it can compute, validate, and check its own output instead of guessing at arithmetic.
It persists in a workspace. Point it at a folder and the work stays there between sessions, in principle.

For a practitioner who is comfortable in a terminal, this is a real upgrade over juggling a chat window and a spreadsheet. It is also why "Claude Code finance" and "Claude for finance" have become real searches: people are trying to make this workflow work.

Where it breaks: pointing Claude Code at a raw spreadsheet

The trouble starts when the thing Claude Code is editing is a raw Excel file or a loose grid of numbers. Then you hit the same three failures that break chat plus Excel, the ones the best AI-native finance tools now name openly:

It drifts. A spreadsheet stores results, not logic. The relationship "revenue equals price times quantity" exists only as the arrangement of cells, implicit and undocumented. When the agent rewrites cells to reflect a new assumption, the dependency chain breaks silently. Claude Code does about 80% of the work well, then drifts on the 20% that decides whether the model is right: the linked variables, the propagation, the edge cases. Independent benchmarks bear this out: the best model on FinSheet-Bench reaches just 82.4% accuracy on financial spreadsheets, and barely 20% on complex aggregation tasks (AI financial model accuracy statistics). (More on this failure mode in How to Build a Financial Model with AI That Doesn't Drift.)
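A minimal sketch of that silent break, assuming the agent writes values into cells rather than formulas. The cell names and figures are made up for illustration:

```python
# Hypothetical grid: cells hold computed results, not the rule that produced them.
grid = {
    "B1": 50.0,      # price
    "B2": 200.0,     # quantity
    "B3": 10_000.0,  # revenue, stored as a plain number
}

# The agent "updates the price assumption" by overwriting a single cell.
grid["B1"] = 55.0

# Nothing in the grid records that B3 = B1 * B2, so revenue is now stale.
stale_revenue = grid["B3"]
expected_revenue = grid["B1"] * grid["B2"]
drift = expected_revenue - stale_revenue

print(f"stored: {stale_revenue}, expected: {expected_revenue}, drift: {drift}")
```

No error is raised and the file still opens. The only symptom is a number that no longer matches its inputs, which is why this kind of drift tends to surface late.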

It forgets. "Persists in a workspace" is true for the files, not for the model's logic. The structure, the conventions, the reasons behind each choice: all of that is still implicit in the cells. A new session re-reads the grid and re-infers what the model means, slightly differently each time.

It burns tokens. A spreadsheet is an opaque grid with no map. To change one thing reliably, the agent re-reads large parts of the file. On a real model you hit context limits fast, and the iteration loop slows to a crawl.

None of these are prompt problems. You cannot prompt your way out of them, because the cause is structural: in a spreadsheet, logic and data are fused in the same cells, and the agent is working against a grid instead of a model. Claude Code is a sharper tool than a chat window, but a sharper tool on the wrong surface still slips.

The setup that makes Claude Code reliable for finance

The fix is not a better prompt or a bigger context window. It is giving Claude Code something better than a grid to work on. Two levers do most of the work.

Lever 1: give it your business context, once

Claude Code is only as good as the context it has. Your sign convention, how you compute working capital, your sector assumptions, the way your EBITDA is defined: these should be written down once and read at the start of every session, not rediscovered through correction.

This is what a conventions file does. Layerz uses an open standard called FINANCE.md: a structured file that encodes your financial conventions in a format any agent can read before it touches the numbers. EBITDA definition, FX policy, restatement rules: each is declared once, versioned, and queryable. The agent operates inside your rules from the first prompt instead of guessing at them.
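To make "declared once" concrete, here is a small sketch of conventions as data that every calculation reads. It does not reproduce the FINANCE.md format. The Conventions fields, the ebitda helper, and the rule that every line other than "revenue" is a cost are all assumptions made for illustration:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Conventions:
    # Illustrative fields only; not the FINANCE.md schema.
    costs_are_negative: bool
    ebitda_excludes: tuple[str, ...]

def ebitda(lines: dict[str, float], conv: Conventions) -> float:
    """Sum P&L lines, skipping those the conventions exclude from EBITDA."""
    total = 0.0
    for name, amount in lines.items():
        if name in conv.ebitda_excludes:
            continue
        # Simplifying assumption: every line except "revenue" is a cost.
        is_cost = name != "revenue"
        # Normalize to "costs are negative" before summing.
        if is_cost and not conv.costs_are_negative:
            amount = -amount
        total += amount
    return total

conv = Conventions(
    costs_are_negative=True,
    ebitda_excludes=("depreciation", "amortization", "interest", "tax"),
)
pnl = {"revenue": 1000.0, "cogs": -400.0, "opex": -300.0,
       "depreciation": -50.0, "interest": -20.0}
result = ebitda(pnl, conv)
print(result)
```

The point of the design is that the definition lives in one place. Change the exclusion list or the sign convention there, and every calculation that reads it follows.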

Lever 2: connect it to a model layer, not a grid

The deeper fix is to stop pointing the agent at cells and point it at a model. A model layer exposes named, typed variables, an explicit dependency graph, and timeline semantics, instead of an anonymous range of cells. (Why this matters in depth: Why Finance Agents Need a Model Layer, Not Just Spreadsheet Access.)

In practice, this is what MCP is for. Claude Code connects to a professional-grade MCP server that holds the model's structure, so the agent reasons on a real workspace:

It addresses "revenue" or "headcount cost" by name, not by cell reference, so it can reason about the model instead of scanning it.
When it changes an assumption, the change propagates through the dependency graph instead of breaking it.
The structure persists and is versioned, so the next session resumes instead of restarting.
Tokens go to the edit you asked for, not to re-reading the whole grid.
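The properties in the list above can be sketched in a few lines. This is not Layerz's API or the MCP protocol. It is a toy model layer, with a hypothetical Model class and made-up variable names, showing named variables, an explicit dependency graph, and propagation. It relies on graphlib from the Python standard library (3.9 and later):

```python
from graphlib import TopologicalSorter

class Model:
    """Toy model layer: named inputs plus formulas with declared dependencies."""

    def __init__(self):
        self.inputs = {}    # name -> value
        self.formulas = {}  # name -> (dependency names, function)

    def set_input(self, name, value):
        self.inputs[name] = value

    def define(self, name, deps, fn):
        self.formulas[name] = (tuple(deps), fn)

    def evaluate(self):
        # Map each formula to the names it reads, then compute in dependency order.
        graph = {name: set(deps) for name, (deps, _) in self.formulas.items()}
        values = dict(self.inputs)
        for name in TopologicalSorter(graph).static_order():
            if name in self.formulas:
                deps, fn = self.formulas[name]
                values[name] = fn(*(values[d] for d in deps))
        return values

model = Model()
model.set_input("price", 50.0)
model.set_input("quantity", 200.0)
model.set_input("unit_cost", 30.0)
model.define("revenue", ["price", "quantity"], lambda p, q: p * q)
model.define("cogs", ["unit_cost", "quantity"], lambda c, q: c * q)
model.define("gross_margin", ["revenue", "cogs"], lambda r, c: r - c)

before = model.evaluate()
model.set_input("price", 55.0)  # change one assumption by name
after = model.evaluate()        # downstream values are recomputed, not patched

print(before["gross_margin"], after["gross_margin"])
```

Because revenue is defined as a rule rather than stored as a result, changing price cannot leave gross margin stale. The agent edits one named input and the graph does the rest.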

Keep Excel as the output, not the workspace

Excel is still the deliverable. Your counterpart, your board, your auditor all open a standard .xlsx. The mistake is living inside Excel while you build. The right pattern is to build in a structured layer with Claude Code and export to Excel on demand: clean, reproducible, auditable. The spreadsheet is a compatibility format on the way out, not the place the logic lives.

What the workflow looks like in practice

The naive Claude Code workflow: open the agent on a folder with an Excel file, prompt it to build or change the model, watch it edit cells, hit the token wall on a real file, and next week start a fresh session where it re-infers the structure from the grid.

The structured workflow: Claude Code reads your FINANCE.md for conventions, connects through MCP to a model layer that holds named variables and an explicit dependency graph, and edits the model by operating on its structure. You change one assumption, the change propagates. You ask "what depends on this?" and get an answer from the graph, not a scan of cells. The structure persists across sessions. At any point, you export a clean Excel file that anyone can open and audit.
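The "what depends on this?" question becomes a graph traversal once dependencies are explicit. A sketch, assuming a hypothetical depends_on mapping in which each variable lists the variables it reads:

```python
from collections import deque

# Hypothetical dependency graph: each variable maps to the variables it reads.
depends_on = {
    "revenue": ["price", "quantity"],
    "cogs": ["unit_cost", "quantity"],
    "gross_margin": ["revenue", "cogs"],
    "ebitda": ["gross_margin", "opex"],
}

def dependents(target, depends_on):
    """Return every variable that directly or indirectly reads `target`."""
    # Invert the edges: variable -> variables that read it.
    read_by = {}
    for name, deps in depends_on.items():
        for dep in deps:
            read_by.setdefault(dep, []).append(name)

    # Breadth-first walk from the target through its readers.
    found, seen, queue = [], {target}, deque([target])
    while queue:
        current = queue.popleft()
        for reader in read_by.get(current, []):
            if reader not in seen:
                seen.add(reader)
                found.append(reader)
                queue.append(reader)
    return found

print(dependents("price", depends_on))
```

The answer comes from the declared structure, so it costs the same whether the model has ten variables or ten thousand cells behind it.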

Same agent. Completely different outcome. The difference is not Claude Code. It is whether Claude Code is working on a model or on a grid.

When plain Claude Code plus Excel is enough

This distinction only matters for iterative, multi-session financial work with real conventions and real stakes.

For a one-shot job such as "read these ten values from this file and write a summary", Claude Code on a raw spreadsheet is entirely sufficient. It reads, computes, writes, and is done.

For exploring a model you received (asking what a formula does, drafting a sensitivity table, summarizing structure), plain Claude or Claude Code is genuinely useful, and you should use it.

The structured setup becomes necessary when the agent needs to modify the model and respect its logic, when it works across sessions and cannot reconstruct context each time, when conventions must be honored and not just formulas, and when the output has to be defensible to a board or a data room. For finance professionals building agent workflows around their core models (budget, deal model, unit economics), those conditions are almost always present.

The bottom line

Claude Code is one of the best tools a finance person can use right now. It reasons well, holds a long task, and works on real files. But on a raw spreadsheet it inherits the same failure modes as every other AI plus grid setup: it drifts, it forgets the logic, it burns tokens. The fix is structural. Give it your conventions in a FINANCE.md, connect it to a model layer through MCP, and keep Excel as the export. Then Claude Code stops slipping on the 20% that matters and starts producing models you can actually defend.

For the full approach to AI-built models, start with The Right Way to Generate a Financial Model with AI.

Layerz is the finance workspace that gives Claude and Claude Code that structure. It holds the model so the agent doesn't have to, keeps your conventions in a FINANCE.md, versions every change, and exports clean Excel anytime, all via MCP.