AI Financial Model Risk and Governance: Statistics

Last updated: June 2026

Putting generative AI (Claude, ChatGPT, Excel copilots) on top of financial modeling creates a new class of risk: confidently fabricated numbers, sensitive data leaving the building through unsanctioned tools, and a widening gap between AI usage and the governance that should accompany it. This page collects verified figures so they can be cited and traced: each statistic is given with its scope, its year and a primary or authoritative source.

Confident fabrication: where generative AI breaks on financial numbers

81% of financial questions were answered incorrectly or refused by GPT-4-Turbo paired with a retrieval system on the FinanceBench benchmark. With long-context prompts, the failure rate fell to 21% for GPT-4-Turbo and 24% for Claude-2. Sample of 150 cases from FinanceBench (10-K, 10-Q, 8-K and earnings filings from 40 US public companies); 16 model configurations tested, giving n=2,400 manually reviewed answers (150 cases × 16 configurations); November 2023. (Islam et al., Patronus AI, Contextual AI & Stanford, 2023) + (Patronus AI, 2023)

82.4% accuracy is the best score any model reaches on FinSheet-Bench (Gemini 3.1 Pro), roughly one error every six questions. The authors conclude that "no standalone model achieves error rates low enough for unsupervised use in professional finance applications." 24 evaluation files of varying complexity and layout; 10 model configurations from OpenAI, Google and Anthropic; synthetic portfolio data modeled on real private-equity fund structures; March 2026. (Ravnik et al., Qubera AG / University of Zurich, 2026)

19.6% average accuracy on complex aggregation tasks, versus 89.1% on simple lookups, across all 10 models tested on FinSheet-Bench. Accuracy degrades steadily as task complexity rises. Pooled across all model configurations; ~500 questions per model; 24 evaluation files; synthetic data modeled on private-equity fund structures; 2026. (Ravnik et al., arXiv, 2026)

10 to 20% error rates persist even for frontier models on multi-step numerical reasoning over financial tables, despite high overall accuracy. On the hardest category (multivariate calculation), Claude-Sonnet-4 reaches 80.0% accuracy (versus 95.6% overall); the authors caution that this subset holds only ~10 cases and should not be over-read. FAITH benchmark; 14 LLMs; 2,406 answerable spans drawn from 2024 10-K reports of 453 S&P 500 companies; context-aware masked-span prediction; ACM ICAIF'25. (Zhang et al., arXiv / ACM ICAIF'25, 2025) + (Cognaptus, 2025)

Hallucinations reach the ledger

86% of CFOs said their finance team had encountered at least one instance of inaccurate or "hallucinated" data while using AI. Survey of 100 CFOs at mid-market US companies, conducted late September 2025, published January 2026; vendor-sponsored (Maximor AI), small sample, fielded by Wakefield Research. (Wakefield Research for Maximor AI, via CFO Dive, 2026) + (Journal of Accountancy, 2026)

14% of CFOs say they completely trust AI to deliver accurate accounting data on its own. Same survey: 100 CFOs at mid-market US companies ($50M-$500M revenue), late September 2025, published 28 January 2026; vendor-sponsored. (Wakefield Research for Maximor AI, 2026) + (CFO Dive, 2026)

Nearly one-third (~33%) of all respondents say their organization has experienced negative consequences from generative AI inaccuracy, making inaccuracy the risk most often cited as having caused negative consequences; 51% of respondents at AI-using organizations report at least one negative consequence. McKinsey Global Survey on AI; 1,491 participants across regions, industries and sizes; data collected 16-31 July 2024; published March 2025. (McKinsey & Company (QuantumBlack), 2025)

93% of UK accountants and bookkeepers who encounter AI-related mistakes estimate they spend up to ten hours per month correcting errors caused by AI-generated advice (44% up to three hours, 39% four to ten hours). Base: the subset of a 500-person survey of UK accountants and bookkeepers who report encountering public-AI errors; fielded by Censuswide for Dext over the first two weeks of December 2025. (Dext (Censuswide), 2025) + (CFOtech UK, 2025)

Shadow AI in finance

49% of employees report using AI tools not sanctioned by their employer; 58% of those rely on free versions, and 23% admit sharing financial statements or sales data with unsanctioned tools. Sapio Research survey of 2,000 employees (1,000 UK, 1,000 US) at organizations with 500+ staff; fielded November 2025, published 27 January 2026. (BlackFog (Sapio Research), 2026) + (CIO.com, 2026)

16.6% of all detected sensitive-data exposures in enterprise AI prompts were financial: financial projections, investment analysis and sales pipeline data accounted for 95,852 instances, the single largest category. Anonymized telemetry from US and UK enterprises via Harmonic Protect; 22,458,240 prompts and uploads analyzed, 579,113 sensitive-data instances detected, across 665 GenAI tools; 1 Jan-31 Dec 2025. Vendor dataset, not a representative panel. (Harmonic Security, 2025) + (SecurityBrief UK, 2025)

2.6% of 22.4 million enterprise AI prompts and uploads contained company-sensitive data (579,113 detected exposures). Anonymized data from US and UK enterprises monitored via Harmonic Protect, across 665 generative and embedded AI tools; 1 Jan-31 Dec 2025. (Harmonic Security, 2025) + (SecurityBrief UK, 2025)
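The two Harmonic Security percentages above can be reproduced from the raw counts quoted in the entries. The sketch below is only a consistency check: the `share` helper and the constant names are hypothetical, and treating each detected exposure as one prompt or upload is an assumption made here for the 2.6% figure, not something the source states.

```python
def share(part: int, whole: int) -> float:
    """Return part / whole as a fraction (hypothetical helper for this sketch)."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return part / whole


# Raw counts as quoted in the two entries above.
PROMPTS_ANALYZED = 22_458_240      # prompts and uploads analyzed
SENSITIVE_INSTANCES = 579_113      # sensitive-data instances detected
FINANCIAL_INSTANCES = 95_852       # instances in the financial category

# Assumption: one detected instance corresponds to one prompt or upload.
sensitive_share = share(SENSITIVE_INSTANCES, PROMPTS_ANALYZED)
financial_share = share(FINANCIAL_INSTANCES, SENSITIVE_INSTANCES)

if __name__ == "__main__":
    print(f"{sensitive_share:.1%}")  # 2.6%
    print(f"{financial_share:.1%}")  # 16.6%
```

Note that the two percentages use different denominators: the 2.6% is a share of all prompts and uploads, while the 16.6% is a share of detected sensitive-data instances only.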

About one-third of senior finance professionals report that shadow AI is already a noticeable issue in their organization, and 40% say their organization either has no rules governing AI use or they are unsure what those rules are. Survey of 311 senior finance professionals across 22 sectors, global, fielded March-April 2026. (insightsoftware, 2026) + (GlobeNewswire / Yahoo Finance, 2026)

More than 40% of global organizations are predicted to suffer security and compliance incidents from the use of unauthorized AI tools by 2030; 69% of organizations already suspect, or have evidence, that employees are using public generative AI at work. Gartner prediction; the 69% figure is from a survey of 302 cybersecurity leaders worldwide, conducted March-May 2025. (Gartner, Inc., 2025) + (Infosecurity Magazine, 2025)

The audit and traceability gap

Article 12 of the EU AI Act requires high-risk AI systems to technically enable automatic recording of events (logs) over the system's lifetime, ensuring a level of traceability appropriate to the system's intended purpose. Regulation (EU) 2024/1689; applies to providers and deployers of high-risk AI systems in the EU; obligations apply from 2 August 2026. (European Parliament & Council, 2024) + (European Commission, AI Act Service Desk)
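To make "automatic recording of events" concrete, here is a minimal sketch of an append-only event log around an AI call. It is an illustration only: Article 12 as summarized above does not prescribe this format, the record schema and the function names (`make_log_record`, `append_record`) are hypothetical, and storing hashes rather than raw text is a design assumption made for this example, not a requirement stated in the source.

```python
import hashlib
import json
from datetime import datetime, timezone


def make_log_record(model_name: str, prompt: str, output: str) -> dict:
    """Build one log record for a single AI call (hypothetical schema)."""
    return {
        # UTC timestamp so records from different systems can be ordered.
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model_name,
        # Hashes let a reviewer match a record to stored inputs and outputs
        # without copying sensitive text into the log itself.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "output_sha256": hashlib.sha256(output.encode("utf-8")).hexdigest(),
    }


def append_record(path: str, record: dict) -> None:
    """Append the record as one JSON line; earlier lines are never rewritten."""
    with open(path, "a", encoding="utf-8") as log_file:
        log_file.write(json.dumps(record, sort_keys=True) + "\n")
```

A caller would build a record after each model response and pass it to `append_record`. Whether such a log gives "a level of traceability appropriate to the system's intended purpose" is a legal and organizational question that this sketch does not answer.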

Article 50 of the EU AI Act requires providers of AI systems generating synthetic audio, image, video or text to mark outputs in a machine-readable format, detectable as artificially generated or manipulated, as far as technically feasible. Regulation (EU) 2024/1689; transparency obligation applying from 2 August 2026. (European Parliament & Council, 2024) + (European Commission, AI Act Service Desk)

21% of organizations say they have a mature governance model in place for agentic AI. Among the listed missing capabilities are "audit trails that capture the full chain of agent actions to help ensure accountability." 3,235 IT and business leaders directly involved in AI programs, across 24 countries and 6 industries; data collected August-September 2025. (Deloitte, State of AI in the Enterprise, 2026)

13% of organizations reported breaches of AI models or applications, and of those compromised, 97% reported not having proper AI access controls in place. IBM Cost of a Data Breach Report 2025, research by Ponemon Institute; 600 organizations worldwide that suffered a breach, data collected March 2024-February 2025. (IBM / Ponemon Institute, 2025) + (Kiteworks, 2025)

40% of senior finance professionals are concerned about the lack of formal sign-off processes for AI-generated outputs. Survey of 311 senior finance professionals across 22 sectors, global, fielded March-April 2026. (insightsoftware, 2026) + (GlobeNewswire, 2026)

Regulation is arriving

€35,000,000 or 7% of total worldwide annual turnover, whichever is higher, is the maximum administrative fine under the EU AI Act for breaching the prohibited-practices rules of Article 5. Lower tiers apply for other obligations (€15M or 3%) and for supplying incorrect or misleading information (€7.5M or 1%). Regulation (EU) 2024/1689, Article 99; applies to providers and deployers under EU jurisdiction; for SMEs and startups the lower of the two amounts applies. (European Union (Parliament & Council), 2024) + (European Commission, AI Act Service Desk)
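The fine ceiling described above comes down to taking the higher of two amounts, or the lower one for SMEs and startups. The sketch below works through that arithmetic under stated assumptions: the tier labels and the `max_fine_eur` function are hypothetical names, and applying the same "whichever is higher" rule to the two lower tiers is this sketch's reading of Article 99. It computes a ceiling only, not the fine an authority would actually impose.

```python
# Tier caps as quoted above: (fixed cap in EUR, percent of total worldwide annual turnover).
# The tier names are hypothetical labels chosen for this sketch.
FINE_TIERS = {
    "prohibited_practices": (35_000_000, 7),
    "other_obligations": (15_000_000, 3),
    "incorrect_information": (7_500_000, 1),
}


def max_fine_eur(tier: str, annual_turnover_eur: int, is_sme: bool = False) -> int:
    """Return the fine ceiling in whole euros for a tier and a turnover."""
    fixed_cap, turnover_percent = FINE_TIERS[tier]
    # Integer arithmetic avoids floating-point rounding on large amounts.
    turnover_cap = annual_turnover_eur * turnover_percent // 100
    if is_sme:
        # For SMEs and startups the lower of the two amounts applies.
        return min(fixed_cap, turnover_cap)
    return max(fixed_cap, turnover_cap)


if __name__ == "__main__":
    # Large company, EUR 2 billion turnover: 7% exceeds the fixed cap.
    print(max_fine_eur("prohibited_practices", 2_000_000_000))  # 140000000
    # SME, EUR 10 million turnover: the lower amount (7%) applies.
    print(max_fine_eur("prohibited_practices", 10_000_000, is_sme=True))  # 700000
```

The percentage branch overtakes the fixed cap once turnover passes €500 million in each tier (for example, €35M / 7%), so for large groups the turnover-based figure is the one that binds.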

A 30% increase in legal disputes for technology companies is predicted by 2028 as a result of AI regulatory violations. Gartner prediction, tied to a survey of 360 IT leaders involved in deploying GenAI tools, conducted May-June 2025. (Gartner, Inc., 2025) + (Analytics India Magazine, 2025)

92% of UK accountants and bookkeepers believe public AI tools should be regulated and/or restricted when providing financial or tax advice, including 70% who call for formal regulation. Censuswide survey of 500 UK accountants and bookkeepers across firm sizes, regions and sectors; fielded the first two weeks of December 2025; commissioned by an accounting-software vendor. (Dext (Censuswide), 2025) + (FinTech Global, 2025)

The governance maturity gap

63% of breached organizations either have no AI governance policy or are still developing one; only 37% have policies to manage AI or detect shadow AI. IBM Cost of a Data Breach Report 2025, research by Ponemon Institute; 600 breached organizations worldwide, March 2024-February 2025. (IBM / Ponemon Institute, 2025) + (The Actuary (IFoA), 2025)

31% of European organizations have a formal, comprehensive AI policy in place, while 83% of IT and business professionals in Europe believe employees in their organization are using AI (a perception, not a measured usage rate). 561 IT and business professionals in Europe, the European cut of a global ISACA survey of 3,200+ respondents; fieldwork 28 March-14 April 2025. (ISACA, 2025) + (Infosecurity Magazine, 2025)

12% of AI-using financial-services firms have an AI risk management framework, and only 18% have a formal testing program for their AI tools. ACA/NSCP 2024 AI Benchmarking Survey; 200+ compliance and risk leaders at financial-services firms (mostly asset managers), online survey June-July 2024. (ACA Group & NSCP, 2024) + (FinTech Global, 2024)

78% of senior business leaders lack full confidence that their organization could pass an independent AI governance audit within 90 days, a gap Grant Thornton calls the "AI proof gap." AI Impact survey of nearly 1,000 senior leaders across multiple US industries; early 2026; self-reported perceptions, not an attestation. (Grant Thornton, 2026) + (Journal of Accountancy (AICPA & CIMA), 2026)

One-third (33%) of AI use cases deployed by UK financial-services firms are third-party implementations, up from 17% in 2022; the survey flags third-party dependencies as the risk expected to grow most over three years, and notes 46% of firms have only a "partial" understanding of the AI they use. Third joint Bank of England / FCA survey on AI in UK financial services; 118 respondents across 6 sectors; conducted 2024, published 21 November 2024. (Bank of England & FCA, 2024) + (Stephenson Harwood, 2024)

The cost of failure

$670,000 in additional breach costs, on average, was observed at organizations with high levels of shadow AI, compared with those with low or no shadow AI. IBM Cost of a Data Breach Report 2025, research by Ponemon Institute; 600 breached organizations worldwide, March 2024-February 2025. (IBM / Ponemon Institute, 2025) + (Cybersecurity Dive, 2025)

1 in 5 (20%) organizations reported a breach due to shadow AI, that is, unsanctioned AI tools adopted by employees without IT or security oversight. IBM Cost of a Data Breach Report 2025, research by Ponemon Institute; 600 breached organizations worldwide, March 2024-February 2025. (IBM / Ponemon Institute, 2025) + (Nudge Security, 2025)

37% of the time employees save using AI is lost to rework, including correcting errors, verifying outputs and rewriting low-quality content; only 14% of employees consistently get clear, positive net outcomes from AI. 3,200 full-time employees at organizations with $100M+ revenue, active AI users, split half leaders / half employees, across North America, APAC and EMEA; fielded November 2025. The official release rounds the rework figure to "nearly 40%." (Workday (fieldwork by Hanover Research), 2026) + (CFO.com, 2026)

A$97,000+ (about US$63,000) was refunded by Deloitte Australia on a roughly A$440,000 government contract after a report was found to contain references to nonexistent academic research and a fabricated quote from a federal court judgment; the revised version disclosed use of a generative AI system (Azure OpenAI / GPT-4o). Single documented case (October 2025); report commissioned by Australia's Department of Employment and Workplace Relations, errors identified by a University of Sydney researcher. (CFO Dive, 2025) + (Fortune, 2025)

50% of UK accountants and bookkeepers are aware of businesses that have suffered direct financial losses (overpayments, missed allowances, penalties, fines or compliance issues) after acting on incorrect or misleading AI-generated advice. Censuswide survey of 500 UK accountants and bookkeepers across firm sizes, regions and sectors; fielded the first two weeks of December 2025. The figure measures awareness of affected businesses, not losses suffered by respondents themselves. (Dext (Censuswide), 2025) + (FinTech Global, 2025)

Changelog

2026-06: Initial publication.
