How to Save Tokens When Building Financial Models with AI

If you use Claude or ChatGPT on a real financial model, you have probably hit this: the token limit arrives in under ten prompts. You make one small change, and the cost spikes. By mid-afternoon the session is exhausted and you have moved the model forward by very little.

This is not a sign you are using the AI wrong. It is a sign of how spreadsheets and AI interact. The good news is that the cause is specific, which means the fix is too.

Why Your Tokens Disappear

A spreadsheet is an opaque grid. There is no map of what matters. The AI cannot tell, from the file alone, which cells are assumptions, which are outputs, what the dependency chain is, or what the model is supposed to do.

So it does the only thing it can: it re-reads everything. Every time you change one thing, the AI re-ingests the whole model to understand the context before it can act. One experienced ex-CFO described the effect exactly: "every time you change one thing, it re-reads an entire model, hugely token-hungry."

That is the mechanism. The cost is not proportional to the size of your edit. It is proportional to the size of your whole model, on every single turn.
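
To make the arithmetic concrete, here is a rough sketch of how the two cost curves diverge. All three figures (model size, edit size, number of turns) are illustrative assumptions, not measurements from any particular tool.

```python
# Illustrative numbers only; none of these figures come from a real measurement.
MODEL_TOKENS = 40_000  # assumed size of the whole spreadsheet once serialized as text
EDIT_TOKENS = 500      # assumed size of one instruction plus the cells it touches
TURNS = 10


def grid_cost(turns: int) -> int:
    """Every turn pays for the whole model again, plus the edit itself."""
    return turns * (MODEL_TOKENS + EDIT_TOKENS)


def edit_only_cost(turns: int) -> int:
    """Hypothetical baseline where each turn pays only for the edit."""
    return turns * EDIT_TOKENS


print(grid_cost(TURNS))       # 405000
print(edit_only_cost(TURNS))  # 5000
```

Under these assumed numbers, almost all of the spend goes to re-reading rather than to the edits themselves.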

And the pain is not only the token meter. It is the iteration loop. The real friction practitioners report is the constant juggling between the chat and the spreadsheet, re-loading context each time, never quite resuming where they left off.

Tokens Are a Structure Problem, Not a Prompt Problem

You can shave a few tokens with prompt hygiene: shorter instructions, fewer pasted ranges. That helps at the margin but does not change the order of magnitude, because the AI is still re-reading the grid every turn.

The order-of-magnitude fix is structural. If the AI works against a defined model instead of an opaque grid, it no longer needs to re-read everything to know what it is looking at. Here is what that means in practice.

1. Give the AI a model, not a grid

When variables are named and typed, and the relationships between them are explicit, the AI can address "headcount cost" or "ARR" directly. It does not have to scan thousands of cells to reconstruct what they mean. The map is already there. This alone removes most of the re-reading.
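
As a minimal sketch of what "named and typed, with explicit relationships" could look like, the snippet below defines a tiny model in Python. The `Variable` class, the variable names, and the figures are all hypothetical and chosen for illustration; this is not how any specific product stores a model.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class Variable:
    name: str
    kind: str                      # "assumption" (an input) or "formula" (derived)
    unit: str
    value: float = 0.0             # used only by assumptions
    inputs: Tuple[str, ...] = ()   # names this variable reads from
    formula: Optional[Callable[..., float]] = None


# Hypothetical three-variable model; names and figures are illustrative.
MODEL: Dict[str, Variable] = {
    v.name: v for v in [
        Variable("headcount", "assumption", "people", value=12),
        Variable("cost_per_head", "assumption", "EUR/year", value=80_000),
        Variable("headcount_cost", "formula", "EUR/year",
                 inputs=("headcount", "cost_per_head"),
                 formula=lambda heads, cost: heads * cost),
    ]
}


def evaluate(name: str) -> float:
    """Resolve a variable by name, following its declared inputs."""
    var = MODEL[name]
    if var.kind == "assumption":
        return var.value
    return var.formula(*(evaluate(i) for i in var.inputs))


def describe() -> str:
    """A compact map of the model: one line per variable, no cell scanning."""
    lines = []
    for var in MODEL.values():
        source = ", ".join(var.inputs) if var.inputs else "input"
        lines.append(f"{var.name} [{var.kind}, {var.unit}] <- {source}")
    return "\n".join(lines)


print(evaluate("headcount_cost"))  # 960000
print(describe())
```

The output of `describe()` is the kind of short map an assistant can work from, instead of reconstructing meaning from raw cells.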

2. Edit surgically

In a structured model, changing one assumption is a targeted operation on one node, not a rewrite of the sheet. The AI touches what you asked it to touch. The tokens go to the edit, not to re-ingesting the rest of the model as context.
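
One way to picture a targeted edit is as a walk over the dependency graph: given the variable that changed, find only what it feeds. The sketch below assumes a hypothetical dependency map (`DEPENDS_ON`) with made-up variable names. It returns the affected set, not a guaranteed evaluation order.

```python
from collections import deque

# Hypothetical dependency map: each variable lists the variables it reads from.
DEPENDS_ON = {
    "headcount": [],
    "cost_per_head": [],
    "arr": [],
    "headcount_cost": ["headcount", "cost_per_head"],
    "operating_margin": ["arr", "headcount_cost"],
}


def downstream(changed: str) -> list:
    """Return the variables that are affected when `changed` is edited."""
    # Invert the map so we can walk from an input to whatever it feeds.
    feeds = {name: [] for name in DEPENDS_ON}
    for name, inputs in DEPENDS_ON.items():
        for inp in inputs:
            feeds[inp].append(name)

    # Breadth-first walk from the edited variable through its dependents.
    seen, queue, affected = {changed}, deque([changed]), []
    while queue:
        for child in feeds[queue.popleft()]:
            if child not in seen:
                seen.add(child)
                affected.append(child)
                queue.append(child)
    return affected


print(downstream("cost_per_head"))  # ['headcount_cost', 'operating_margin']
print(downstream("arr"))            # ['operating_margin']
```

Editing `arr` touches one downstream variable here, and the rest of the model never needs to be read.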

3. Persist context across sessions

Much of the remaining token waste is re-explaining. Every fresh session, you re-describe the business, the conventions, the structure. A persistent model means the AI resumes instead of restarting. The context is already loaded because it never went away.
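
At its simplest, persistence means writing the model state somewhere durable and reading it back at the start of the next session. The sketch below uses a JSON file for that. The file name, the state layout, and the helper names are assumptions made for illustration.

```python
import json
import tempfile
from pathlib import Path


def save_state(path: Path, state: dict) -> None:
    """Write the model state to disk at the end of a session."""
    path.write_text(json.dumps(state, indent=2), encoding="utf-8")


def load_state(path: Path) -> dict:
    """Read the model state back at the start of the next session."""
    return json.loads(path.read_text(encoding="utf-8"))


# Hypothetical state: assumptions plus notes you would otherwise re-explain.
state = {
    "assumptions": {"headcount": 12, "cost_per_head": 80000},
    "notes": ["Fiscal year ends 31 December"],
}

# A temporary directory stands in for wherever the model would really live.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "model_state.json"
    save_state(path, state)
    resumed = load_state(path)

print(resumed == state)  # True
```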

4. Encode your conventions once

Your business rules (the way you compute working capital, your sign convention, your sector assumptions) should live in one place that the AI reads once, rather than being re-derived through trial and correction every session. A conventions file (Layerz calls it FINANCE.md) is read at the start and keeps the AI inside your rules without you re-typing them. Fewer corrections means fewer round trips means fewer tokens.
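
The contents of a conventions file are not specified here, so the sketch below is only one possible reading of "written down once": rules held as data, with proposed values checked against them. The rule shown (costs stored as negative numbers), the categories, and the line items are all hypothetical.

```python
# Hypothetical conventions, written down once. The categories and rules are
# illustrative assumptions, not the contents of any real conventions file.
CONVENTIONS = {
    "sign": {"revenue": 1, "cost": -1},  # assumed rule: costs are stored as negatives
    "currency": "EUR",
}


def check_sign(category: str, value: float) -> bool:
    """True when `value` has the sign the conventions expect for `category`."""
    expected = CONVENTIONS["sign"][category]
    return value == 0 or (value > 0) == (expected > 0)


def violations(lines: dict) -> list:
    """Names of line items whose sign breaks the convention."""
    return [name for name, (category, value) in lines.items()
            if not check_sign(category, value)]


# Each proposed line item maps to (category, value).
proposed = {
    "arr": ("revenue", 1_200_000),
    "headcount_cost": ("cost", 960_000),  # wrong sign under the assumed rule
    "hosting": ("cost", -48_000),
}

print(violations(proposed))  # ['headcount_cost']
```

Catching a sign error this way costs one check, where a correction made in conversation costs another round trip.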

What Changes in Practice

The grid workflow: you edit one assumption, the AI re-reads the entire model to get its bearings, it costs a large slice of your budget, and you do that again for the next edit. Ten prompts later the budget is gone.

The structured workflow: you edit one assumption, the AI operates on that node and the variables it feeds, the cost matches the edit, and the session keeps going. The same budget that bought you ten prompts on a grid buys you a full afternoon of real work on a structured model.

This is the same logic the broader market is converging on. The value of an AI layer over finance data, as the founder of a leading finance scaleup recently put it, is not raw access to the data. It is the encoded structure that lets the agent query it correctly without re-deriving everything each time. Structure is what makes AI affordable to run.

The Bottom Line

If your tokens vanish in a handful of prompts, the problem is not your prompt and not your AI. It is that the AI is fighting an opaque grid and re-reading it on every turn. Give it a structured model (named variables, explicit dependencies, persistent context, conventions written down once) and the cost collapses toward the size of the edit you actually made.

For deeper context on why structure also fixes drift and reliability, see "How to Build a Financial Model with AI That Doesn't Drift". For the full picture, see "The Right Way to Generate a Financial Model with AI".

Layerz holds your model's structure so Claude doesn't re-read a grid on every edit. Named variables, surgical edits, persistent context, conventions in a FINANCE.md, and clean Excel export anytime. Fewer tokens, more model.