Can large language models really act as competent financial advisors for ordinary individuals? A March 2026 working paper titled “AI Financial Advice: Supply, Demand, and Life Cycle Implications,” authored by Taha Choukhmane, Tim de Silva, Weidong Lin, and Matthew Akuzawa, tries to answer that question with unusual rigor.
The researchers focused primarily on GPT-5.2, with Gemini 3 Flash used as a robustness check, and put both models through a stress test designed to mirror how a real American household might use them in practice.
How the test was structured
The study built a full life cycle model covering U.S. household income, spending, saving, and investment behavior. Labor market shocks and asset returns were calibrated to actual U.S. data, so the simulated environment reflected real-world economics rather than a textbook abstraction.
The team then collected real prompts from a demographically representative sample of around 1,000 U.S. adults. Those prompts described each person’s actual financial situation and asked for spending and investment guidance. From there, the researchers ran simulations of each respondent’s financial life from age 22 to 90, applying the LLM’s two-pass advice at every stage.
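To make the simulation loop concrete, here is a minimal Python sketch of the general idea: step one household through each year from age 22 to 90, ask an advice function how much to spend and how much to hold in stocks, then apply random income and return shocks. It is not the paper's model. The function names, the starting wage, the shock distributions, the retirement age of 65, and the `rule_of_thumb` stand-in for the LLM's advice are all illustrative assumptions.

```python
import random


def rule_of_thumb(age, wealth, income):
    """Hypothetical stand-in for the advice step (not the paper's LLM output)."""
    spending = 0.85 * income + 0.05 * wealth   # assumed spending rule
    stock_share = (110 - age) / 100            # common "110 minus age" heuristic
    return spending, stock_share


def simulate_life(advice, start_age=22, end_age=90, retire_age=65, seed=0):
    """Simulate one household year by year; all parameters are illustrative."""
    rng = random.Random(seed)
    wealth, wage = 0.0, 40_000.0               # assumed starting point
    path = []
    for age in range(start_age, end_age + 1):
        if age < retire_age:
            wage *= 1.0 + rng.gauss(0.01, 0.05)    # assumed labor income shock
            income = wage
        else:
            income = 0.4 * wage                    # assumed retirement income
        spending, stock_share = advice(age, wealth, income)
        resources = wealth + income
        # Keep the advice feasible: no borrowing, no leverage, no shorting.
        spending = min(max(spending, 0.0), resources)
        stock_share = min(max(stock_share, 0.0), 1.0)
        stock_return = max(rng.gauss(0.06, 0.17), -0.9)   # assumed risky return
        portfolio_return = stock_share * stock_return + (1 - stock_share) * 0.01
        wealth = (resources - spending) * (1.0 + portfolio_return)
        path.append((age, spending, wealth))
    return path


path = simulate_life(rule_of_thumb)
print(len(path), "simulated years; final wealth:", round(path[-1][2]))
```

Swapping `rule_of_thumb` for a function that returns a model's recommendation, and repeating the run across many seeds and households, gives the kind of long-horizon comparison the study describes.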
Why this approach matters
Most “is AI a good advisor” tests use cherry-picked questions and check whether the answers sound plausible. This study instead measures the long-term outcome of following the advice across an entire working and retirement life. The distinction is important. A piece of advice can sound reasonable on a Thursday afternoon and still leave a household in trouble at age 70.
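A quick back-of-the-envelope calculation shows why small differences matter over a lifetime. The sketch below compares the balance built by the same yearly contribution under two net returns that differ by one percentage point. The contribution, the two rates, and the 40-year horizon are illustrative assumptions, not figures from the paper.

```python
def future_value(annual_contribution, rate, years):
    """Balance after `years` of end-of-year contributions at a fixed annual rate."""
    if rate == 0:
        return annual_contribution * years
    return annual_contribution * ((1 + rate) ** years - 1) / rate


# Illustrative assumptions: $5,000 saved per year for 40 years.
better = future_value(5_000, 0.06, 40)
worse = future_value(5_000, 0.05, 40)
print(f"6% net return: {better:,.0f}")
print(f"5% net return: {worse:,.0f}")
print(f"shortfall at retirement: {better - worse:,.0f}")
```

Either return would look fine in any single year. The gap only becomes visible when the whole horizon is evaluated, which is the point of measuring lifetime outcomes rather than individual answers.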
What it means for individuals
The honest takeaway from this kind of work is twofold. First, LLMs are getting close to providing genuinely useful baseline financial guidance for households that would not otherwise consult an advisor. Second, close is not the same as good enough, and the failure modes tend to compound over decades rather than show up immediately.
For individuals using these tools today, the practical advice is unchanged. Treat LLM responses as a useful starting point, sanity-check them with a fee-only fiduciary or another trusted source, and avoid acting on any single AI response for irreversible decisions like retirement account allocations or major insurance purchases.
This paper is part of a growing academic effort to pressure-test AI systems against real human financial questions. Expect more of this work in 2026 and 2027 as the models, and the audience for them, continue to grow.
Source: Choukhmane, T., de Silva, T., Lin, W., and Akuzawa, M. “AI Financial Advice: Supply, Demand, and Life Cycle Implications” (March 2026).
