Note: Prism is currently in Alpha. These results reflect early testing and will improve as the model learns from more simulations and user feedback.

When we launched Investra Prism, we made a promise: we would not ask investors to trust our AI market simulator on faith alone. We would prove it works by testing it against real historical outcomes — and we would publish the results, good and bad, for everyone to see.

This post is us delivering on that promise.

We ran Prism against 13 historical scenarios spanning 2012 through 2025, covering six metro markets plus the US nationwide market and every major market condition of that period. The results: 92% average accuracy across the ten core tests, correct direction in 12 of 13 tests, and some hard lessons about where the model still falls short.

We are publishing every result — including the three tests where accuracy dropped below 70%. If you are going to use Prism to inform investment decisions worth tens or hundreds of thousands of dollars, you deserve to know exactly what it can and cannot do.

Why We Backtested

There are two reasons we invested significant engineering time into backtesting rather than just shipping more features.

First, trust. The real estate AI space is full of tools that make bold claims about prediction accuracy without publishing any evidence. We have seen products claim to "predict market movements" without ever defining what that means or showing how their predictions compare to actual outcomes. That approach might work for marketing, but it does not work for investors making six-figure decisions.

We wanted a different standard. We wanted to show you the actual numbers — predicted versus actual — for every test, and let you judge for yourself whether Prism is accurate enough to be useful for your strategy.

Second, calibration. Backtesting is not just a marketing exercise. It is how we find and fix problems in the model. Every backtest that reveals a gap between prediction and reality teaches us something about the model's assumptions, biases, and blind spots. The backtesting process directly led to improvements that raised our average accuracy from 85% to 92%.

Our Methodology

Before we get to the results, you need to understand how Prism generates a forecast. It is not a single model or a simple formula. It is a multi-layered system that combines quantitative modeling, AI reasoning, and probabilistic simulation.

The Quantitative Engine

At its core, Prism runs multiple proprietary models, each designed to capture a different dimension of the housing market:

  • Trend analysis models that examine how economic indicators combine to drive price changes
  • Fair value models that identify when prices have deviated from equilibrium
  • Momentum models that capture price inertia and cyclical patterns
  • Affordability models that measure how much room prices have to grow given incomes and borrowing costs
  • Supply-demand models that track inventory dynamics, construction pipelines, and absorption rates
  • Comparable market models that find historically similar conditions and use their outcomes as references
  • Interest rate models that isolate the impact of monetary policy on local housing
  • Market velocity models that measure how fast homes are selling and what that signals
  • Probabilistic simulation that generates dozens of randomized scenarios to produce confidence bands

Each model produces its own forecast. These are blended into a weighted prediction where the influence of each model is calibrated based on historical accuracy for different market conditions.
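
To make the blending step concrete, here is a minimal Python sketch. The sub-model names, forecasts, weights, and Monte Carlo noise parameters are illustrative assumptions, not Prism's actual models or values:

```python
import random

# Illustrative sub-model forecasts for 12-month price change (%) and weights
# calibrated from historical accuracy. Names and values are assumptions.
forecasts = {"trend": 4.1, "fair_value": 1.8, "momentum": 5.0, "affordability": 2.2}
weights   = {"trend": 0.35, "fair_value": 0.15, "momentum": 0.30, "affordability": 0.20}

def blend(forecasts: dict, weights: dict) -> float:
    """Weighted average of sub-model forecasts; weights sum to 1."""
    return sum(forecasts[name] * weights[name] for name in forecasts)

def confidence_band(forecasts: dict, weights: dict,
                    n_scenarios: int = 50, noise: float = 1.5) -> tuple:
    """Monte Carlo sketch: perturb each sub-model forecast with Gaussian noise
    and collect the blended outcomes to form an empirical ~80% band."""
    outcomes = sorted(
        blend({m: f + random.gauss(0, noise) for m, f in forecasts.items()}, weights)
        for _ in range(n_scenarios)
    )
    return outcomes[n_scenarios // 10], outcomes[-n_scenarios // 10]

low, high = confidence_band(forecasts, weights)
print(f"Forecast: {blend(forecasts, weights):+.1f}% (band {low:+.1f}% to {high:+.1f}%)")
```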

The AI Agent Layer

On top of the quantitative engine, Prism deploys thousands of autonomous AI agents. Each agent represents a different type of market participant — buyers, sellers, investors, institutions, renters, builders, lenders, and government entities. Each agent independently:

  • Gathers years of real economic data from authoritative government and industry sources
  • Researches current local conditions through web search — zoning changes, employer moves, construction projects, migration patterns
  • Reasons about future conditions based on its own incentives and risk tolerance
  • Makes decisions about whether to buy, sell, hold, build, lend, or wait

The collective behavior of thousands of agents produces an emergent forecast that captures market dynamics no single model can. When buyer agents are pulling back because affordability is stretched but institutional agents are still deploying capital because yields are attractive, that tension shows up in the forecast as a range of probable outcomes rather than a single number.
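
For intuition, here is a toy sketch of how an agent layer can turn individual decisions into an emergent demand signal. The agent types, decision rules, and parameters are deliberately simplified assumptions, not Prism's actual agents:

```python
import random
from dataclasses import dataclass

@dataclass
class Agent:
    kind: str              # hypothetical types: "buyer", "institution", ...
    risk_tolerance: float  # 0 = very cautious, 1 = aggressive

    def decide(self, affordability: float, yield_spread: float) -> str:
        """Toy decision rule: stretched affordability sidelines cautious buyers,
        while attractive yield spreads keep institutions deploying capital."""
        if self.kind == "buyer" and affordability < 1 - self.risk_tolerance:
            return "wait"
        if self.kind == "institution" and yield_spread > 0.5:
            return "buy"
        return random.choice(["buy", "hold", "sell"])

# The emergent signal is the balance of decisions across many agents, which is
# why the output is naturally a range of outcomes rather than a single number.
agents = [Agent("buyer", random.random()) for _ in range(2000)]
agents += [Agent("institution", random.random()) for _ in range(500)]
votes = [a.decide(affordability=0.4, yield_spread=0.7) for a in agents]
net_demand = (votes.count("buy") - votes.count("sell")) / len(votes)
print(f"Net demand signal: {net_demand:+.2f}")
```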

The Machine Learning Pipeline

We are building a machine learning layer that uses backtest results and real outcomes as training data. Every time Prism makes a prediction that we can later compare to reality, that becomes a labeled training example. Over time, the system learns which models to weight more heavily in which conditions, how to adjust for biases, and when to trust agent reasoning over statistical models or vice versa.
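
In outline, that feedback loop might look like the sketch below. The record fields and the inverse-error weighting rule are assumptions for illustration, not Prism's actual schema or training code:

```python
from dataclasses import dataclass

@dataclass
class LabeledExample:
    """One resolved prediction. Field names are illustrative."""
    market: str
    regime: str             # e.g. "expansion", "rate_shock", "correction"
    model_forecasts: dict   # sub-model name -> predicted price change (%)
    actual: float           # realized price change (%)

def calibrate_weights(examples: list, model_names: list) -> dict:
    """Naive sketch: weight each sub-model by the inverse of its mean absolute
    error on resolved examples, so historically accurate models gain influence.
    Run per regime to learn condition-dependent weights."""
    inv_error = {}
    for name in model_names:
        errs = [abs(ex.model_forecasts[name] - ex.actual) for ex in examples]
        inv_error[name] = 1.0 / (sum(errs) / len(errs) + 1e-9)
    total = sum(inv_error.values())
    return {name: w / total for name, w in inv_error.items()}

history = [
    LabeledExample("US", "expansion", {"momentum": 12.0, "fair_value": 8.0}, 15.0),
    LabeledExample("US", "expansion", {"momentum": 10.5, "fair_value": 7.0}, 11.0),
]
print(calibrate_weights(history, ["momentum", "fair_value"]))  # momentum gains weight
```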

Backtesting Protocol

Every backtest follows a strict protocol:

  1. We select a historical period and market (e.g., "US Nationwide, 2022-2023")
  2. Prism runs using only data that was available at the start of the test period — no future information leaks
  3. The simulation produces a predicted price change percentage
  4. We compare that prediction to the actual price change that occurred
  5. Accuracy is calculated as 1 minus the normalized absolute error between the predicted and actual price change (a sketch of this scoring rule follows below)

This is the same methodology used by quantitative research firms and academic evaluations. It is intentionally conservative — any deviation from the actual outcome, even if the direction is correct, reduces the accuracy score.
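
Here is a sketch of a scoring rule in the spirit of step 5. The 50-percentage-point normalization scale is an assumed value for illustration; Prism's exact normalization constant is not published here:

```python
def backtest_accuracy(predicted: float, actual: float, scale: float = 50.0) -> float:
    """Score one backtest as 1 minus normalized absolute error. Inputs are price
    changes in percentage points (+3.6 means +3.6%). The 50pp normalization
    scale is an assumed value for illustration, not Prism's published constant."""
    error = abs(predicted - actual)        # error in percentage points
    return max(0.0, 1.0 - error / scale)   # clamp so the score never goes negative

# Example using the LA Rate Impact figures from the results table below:
print(f"{backtest_accuracy(predicted=3.6, actual=3.0):.0%}")  # -> 99%
```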

The 5 Calibration Tests

Before running the full backtest suite, we ran 5 calibration tests designed to probe specific aspects of the model:

Rate Shock (2022-2023): The fastest rate hiking cycle in 40 years. This tested whether the interest rate sensitivity models could capture the impact of mortgage rates jumping from 3% to 7% in under 18 months. The models needed to balance rate-driven demand destruction against constrained supply from the lock-in effect.

Soft Landing (2023-2025): A period where most analysts predicted a crash that never came. This tested the model's ability to see through bearish consensus and recognize structural support from low inventory and demographic demand.

Mid-Cycle Growth (2015-2018): A normal expansion period with steady appreciation. This was our baseline test — if the model cannot predict normal conditions, nothing else matters.

Pittsburgh Growth (2018-2023): A market-specific test in a secondary city with different dynamics than coastal metros. Pittsburgh appreciated significantly on the strength of healthcare, education, and tech sector growth — a pattern very different from Sun Belt migration booms.

Austin Correction (2022-2024): The hardest test in the suite. Austin experienced a genuine correction driven by massive supply increases after years of pandemic-fueled appreciation. This tested whether the model could predict downturns, not just growth.

Results: All 13 Backtests

Here are the complete results, organized by accuracy. We are showing everything — the wins and the misses.

Core Tests (92% Average Accuracy)

TEST                            MARKET           ACTUAL   PREDICTED   ACCURACY
LA Rate Impact 2022-2024        Los Angeles      +3%      +3.6%       99%
Rate Shock 2022-2023            US Nationwide    +3%      +2.2%       98%
Soft Landing 2023-2025          US Nationwide    +6.5%    +5.3%       98%
Pittsburgh Growth 2018-2023     Pittsburgh       +35%     +31.9%      94%
Denver Plateau 2022-2024        Denver           +2%      +4.9%       94%
Mid-Cycle 2015-2018             US Nationwide    +15%     +11.8%      94%
Late Cycle 2017-2019            US Nationwide    +11%     +7.1%       92%
Late Expansion 2016-2019        US Nationwide    +15%     +10.9%      92%
Steady State 2014-2017          US Nationwide    +17%     +9.6%       85%
Austin Correction 2022-2024     Austin           -8%      +4.7%       75%

Outlier Boom Tests (Shown for Transparency)

TEST                              MARKET           ACTUAL   PREDICTED   ACCURACY
Post-Crisis Recovery 2012-2015    US Nationwide    +27%     +14.4%      68%
Cleveland Value 2018-2022         Cleveland        +40%     +15.2%      50%
Tampa Migration 2020-2024         Tampa            +45%     +18.8%      48%

What We Learned

Discovery: The Bearish Bias

The single most important finding from our backtesting was a systematic bearish bias in the prediction system. In 10 of 13 tests, Prism underpredicted the actual price change. The model was consistently too conservative.

This makes sense when you understand how the models work. Fair value models assume prices will trend back toward historical averages. Affordability models flag when prices look stretched relative to incomes. Both of these forces pull predictions downward in a rising market. The result is a system that reliably predicts direction but compresses the magnitude of growth.

Once we identified this pattern, we recalibrated the model weights — reducing the influence of models that were too conservative during growth periods and increasing the weight of models that better captured market momentum. This recalibration raised our core-test average accuracy from 85% to 92%.
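
In spirit, the recalibration works like the sketch below: measure each sub-model's signed error on growth-period backtests, then shrink the weight of models with a persistent bearish bias. The update rule and numbers are illustrative assumptions, not Prism's production code:

```python
def recalibrate(weights: dict, signed_errors: dict, rate: float = 0.5) -> dict:
    """Illustrative reweighting. signed_errors maps each sub-model to its mean
    (predicted - actual) over growth-period backtests; a negative value means
    the model systematically underpredicted growth."""
    adjusted = {}
    for name, w in weights.items():
        bearish_bias = min(0.0, signed_errors[name])   # only penalize underprediction
        adjusted[name] = w * max(0.1, 1.0 + rate * bearish_bias / 10.0)
    total = sum(adjusted.values())
    return {name: w / total for name, w in adjusted.items()}

weights = {"fair_value": 0.25, "affordability": 0.25, "momentum": 0.25, "trend": 0.25}
bias = {"fair_value": -6.0, "affordability": -5.0, "momentum": -1.0, "trend": -2.0}
print(recalibrate(weights, bias))  # momentum and trend gain relative influence
```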

Where the Model Excels

Flat and low-growth markets: 98% accuracy. When markets are moving sideways or growing slowly — like the Rate Shock period (2022-2023) or the LA Rate Impact test — Prism is exceptionally accurate. This makes sense: low-volatility markets are dominated by fundamentals that the statistical models capture well. There are fewer surprise factors driving outcomes.

Normal growth cycles: 93% accuracy. During standard expansion periods (Mid-Cycle 2015-2018, Late Expansion 2016-2019, Pittsburgh Growth 2018-2023), the model reliably predicts both direction and approximate magnitude. These are the conditions most investors actually face when making decisions — and they are the conditions where Prism performs best.

Where the Model Struggles

Unprecedented booms: 48-68% accuracy. The three outlier boom tests — Tampa and Cleveland during the pandemic era, plus the post-crisis recovery — were our worst performers. Tampa (+45% actual vs. +18.8% predicted), Cleveland (+40% actual vs. +15.2% predicted), and the Post-Crisis Recovery (+27% actual vs. +14.4% predicted) all showed the same pattern: the model dramatically underestimated the magnitude of price increases.

We are transparent about this because it matters. The pandemic housing boom was driven by factors with no historical precedent — $5 trillion in fiscal stimulus, mortgage rates dropping below 3%, a once-in-a-century work-from-home migration, and constrained supply from a construction industry shut down by COVID. No statistical model trained on pre-pandemic data could have predicted this, and no one else predicted it either — not economists, not Wall Street analysts, and not other forecasting tools. The Post-Crisis Recovery test ran into an analogous problem: the 2012-2015 rebound from the deepest housing crash in modern US history had no useful precedent in the data either.

Market-specific corrections: 75% accuracy. The Austin Correction test was our most instructive failure outside the boom outliers. Austin experienced a genuine price decline (-8%) driven by a massive surge in new construction — the metro permitted more new housing relative to its size than almost any other major US market. Prism flagged the market as at risk of overheating but still forecast modest growth (+4.7%) rather than a decline. The model missed the direction entirely.

This tells us the supply-demand models need to weight local construction pipeline data more heavily, especially in markets with permissive zoning. It is a specific, actionable finding that we are addressing in the next model update.

What We Are Doing About It

Backtesting is not a one-time exercise. It is an ongoing process that feeds directly into model improvement. Here is what we are building based on these results:

Machine Learning Training Data Pipeline

Every backtest result becomes a labeled training example: given these inputs (economic conditions, market state, agent behavior), the model predicted X, and the actual outcome was Y. We are building a machine learning layer that learns from these examples to correct systematic biases and improve weight calibration.

As more users run simulations and those predictions resolve against real outcomes, the training dataset grows. This creates a self-improving system — every simulation that concludes adds to the dataset that makes the next simulation more accurate.

Enhanced Local Supply Modeling

The Austin result showed us that local construction pipeline data needs more influence in the prediction. We are integrating more granular permit data and construction activity at the metro level so the models can better anticipate supply-driven corrections.
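
One simple version of such a signal, shown purely as an illustration (the ratio and the numbers are assumptions, not Prism's actual feature):

```python
def supply_pressure(permits_12mo: int, absorbed_12mo: int) -> float:
    """Hypothetical feature: units permitted over the trailing year relative to
    units the market actually absorbed. Ratios well above 1.0 flag the
    supply-driven correction risk the Austin test exposed."""
    return permits_12mo / max(absorbed_12mo, 1)

# Illustrative numbers only, not actual Austin permit data:
print(f"supply pressure: {supply_pressure(25_000, 14_000):.2f}")  # ~1.79 -> elevated
```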

Regime Detection

One insight from the outlier boom tests is that the model needs to recognize when market conditions have shifted to a fundamentally different regime — and adjust its assumptions accordingly. We are building a detection system that identifies when current conditions do not match any historical pattern, then automatically widens confidence bands and adjusts the prediction.
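
Conceptually, regime detection can be as simple as measuring how far today's market features sit from their nearest historical analog and widening the confidence bands when that distance is large. The features, distance metric, and scaling rule below are illustrative assumptions:

```python
import math

def regime_novelty(current: list, history: list) -> float:
    """Distance from today's (z-scored) market features to the nearest historical
    period. A large value means no good historical analog exists."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    return min(dist(current, past) for past in history)

def adjusted_band(base_band: float, novelty: float, threshold: float = 1.0) -> float:
    """Widen the confidence band once novelty exceeds a threshold. The linear
    scaling rule is an illustrative assumption."""
    return base_band if novelty <= threshold else base_band * (novelty - threshold + 1.0)

# Features (z-scored): [rate_change, fiscal_stimulus, migration, inventory]
history = [[0.2, 0.1, 0.0, -0.3], [1.5, 0.0, 0.2, 0.4], [-0.5, 0.3, 0.1, 0.0]]
pandemic_like = [0.9, 3.5, 2.8, -2.0]  # far from every historical row
novelty = regime_novelty(pandemic_like, history)
print(f"novelty={novelty:.1f}, widened band={adjusted_band(2.0, novelty):.1f}pp")
```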

How This Compares

Nothing Like This Exists

We searched for other consumer-facing real estate tools that do what Prism does — deploy AI agents to simulate market dynamics, run quantitative models against live economic data, and publish backtest results with this level of transparency. We could not find a single one.

Most real estate platforms tell you what happened yesterday. They show you comparable sales, recent listings, and historical price charts. That is valuable, but it answers the wrong question. Investors don't need to know what the market did — they need to know what it's going to do.

Prism is the first tool built from the ground up to answer the forward-looking question. It doesn't just crunch historical data — it deploys thousands of autonomous AI agents that research, reason, and interact to simulate how a real market evolves over time. No other consumer real estate platform does this.

Institutional firms backtest internally — hedge funds, REITs, and research firms have quant teams running models against historical data. But those results are proprietary. Individual investors never see them. In the consumer real estate technology space, published backtest results with full methodology are essentially nonexistent.

We think that needs to change. If a tool is going to influence investment decisions worth hundreds of thousands of dollars, the people using it deserve to see how it performs against known outcomes. "Trust our AI" is not good enough. You should be able to see the numbers, evaluate the methodology, and decide for yourself.

We're leading this change. And we will continue to publish updated results as we run more backtests and as our models improve.

What This Means for You

If you are an investor evaluating Prism, here is how to interpret these results:

  • For normal market conditions (flat, low-growth, steady expansion), Prism achieves 92-98% accuracy. These are the conditions that cover most investment decisions most of the time.
  • For market-specific analysis, Prism correctly identified the direction and approximate magnitude for Pittsburgh, Denver, and LA. It missed the direction for Austin's correction — a limitation we are actively addressing.
  • For pandemic-like events, Prism will significantly underpredict price changes. This is a known limitation of statistical forecasting models generally, not just ours. Use wider confidence bands and scenario conditions to stress-test your thesis during periods of extreme uncertainty.
  • Direction is more reliable than magnitude. Prism correctly predicted whether markets went up or down in 12 of 13 tests (92%). If your decision hinges on "will this market grow or decline," the model has a strong track record.

Try It Yourself

Every Prism simulation includes a backtesting mode where you can test the model against 21 historical scenarios — more than the 13 we covered here. Run the tests yourself. See how the predictions compare to outcomes you remember living through. Form your own judgment about whether the accuracy level is sufficient for your investment strategy.

Prism is available on the Pro Plus plan. You can run simulations on 130+ U.S. markets, configure custom scenario conditions, interview individual agents about their reasoning, and access the full research report with clickable sources.

If you are already a Prism user, we would love your feedback. The accuracy numbers above will change — ideally upward — as we incorporate more data and refine the models. Your feedback on prediction quality goes directly into improving the system.

Learn more about how Prism works, or go straight to your dashboard and run a simulation.

Prism is a probabilistic modeling tool. All outputs represent simulated projections across a range of possible outcomes. They are not guaranteed results and do not constitute financial, investment, legal, or tax advice. Past backtest performance does not guarantee future accuracy. Always conduct your own due diligence before any investment decision.