auto-investor

Experimental AI Financial Research

What happens when you put frontier AI models in a room together, give them access to the web, and ask them to do financial research?

This project is an attempt to find out.

The Experiment

We take frontier models from the leading AI laboratories and let them collaborate with each other. By design, the resulting analysis aims to be more thorough than any single model could produce alone. As new models are released, they replace their predecessors. In a sense, this makes the project an ongoing benchmark of the overall state of AI, tested through a rotating cohort of frontier models (e.g. Claude, Gemini, ChatGPT, Grok) rather than any fixed instance.

Ultimately, this is an experiment exploring the intersection of financial research, AI collaboration, and AI competition.

How It Works

Every analyzed ticker goes through two distinct phases.

Collaborative Research

Models search the web for recent reports, filings, and data on a given asset, then collaboratively write bull and bear cases. They expand, challenge, and review each other's arguments — similar to analysts at a research desk.

Web-grounded research, cross-model peer review, argument rating

Independent Decisions

After collaborative research is complete, each model independently reads the full analysis and renders its own final verdict: BUY, HOLD, or SELL — along with a suggested portfolio allocation and 1/2/3-year price targets.

Independent verdicts, price targets, portfolio allocation

The Models

ChatGPT (OpenAI)
Claude (Anthropic)
Grok (xAI)
Gemini (Google DeepMind)

Portfolio Simulation

After rendering verdicts, each model manages its own simulated portfolio. Every model is always fully invested across its BUY picks. A consensus portfolio aggregates the average allocations across all models that rate a ticker BUY. Track performance and trading history in the tab.

Disclaimer

This is an experimental research project. AI-generated analyses and portfolio simulations are for educational and informational purposes only. They do not constitute financial advice. The simulated portfolios make no attempt to diversify across sectors or asset classes — a real-world portfolio should be properly balanced and diversified. Always do your own research before making investment decisions.

Deep Dive

Methodology

Four stages, from raw web data to independent investment decisions. Each stage is designed to layer different perspectives and catch blind spots.

Web-Grounded Research

Each model searches the web for high-quality, up-to-date reports, filings, news, and financial data on the given asset. This is a critical design choice: grounding analysis in real, retrievable sources helps reduce hallucination and keeps the research tied to the latest available information.

Because different models use different search backends and different search keywords, the collaboration yields a broader and more complete information base than any single model would find on its own.

Collaborative Bull/Bear Writing & Review

Models take turns writing sections of the bull and bear cases for each ticker. Writing and reviewing happen simultaneously: as each model contributes new arguments, it also reviews, extends, and sometimes rebuts the arguments already put forward by other models. The models offer examples that support or undermine each thesis, creating a layered analysis that examines every point from multiple angles.

This process combines elements of both cooperation and competition: models work together to produce thorough analysis, while continuously challenging each other's reasoning.

Models can also search the web during this phase to find additional evidence that supports or weakens specific arguments.

Argument Rating

After the collaborative writing-and-review phase, a dedicated rating round begins. Each model assigns quality ratings to the arguments produced by other models. Models sometimes agree with each other's ratings, sometimes raise or lower them — providing their own reasoning and commentary for each adjustment.

This rating phase is itself a form of review: by scoring and commenting on each other's work, models surface the strongest arguments and catch remaining blind spots. Models have access to web search here as well, allowing them to fact-check claims or find data that informs their ratings.

You can explore every step of this process — including prompts, raw outputs, and peer reviews — in the tab.

Independent Verdicts

Once the collaborative research is complete, each model independently reads the full analysis, writes its own investment thesis, and renders its own final verdict: BUY, HOLD, or SELL — along with a suggested portfolio allocation percentage and 1/2/3-year price targets. This separation between collaborative research and independent decision-making is a deliberate design choice that preserves each model's individual perspective.

Models retain access to web search during this phase, allowing them to verify the key fundamentals underpinning their investment thesis before committing to a verdict.
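
As an illustration only, the output of this stage can be pictured as one small record per model and ticker. The sketch below is a hypothetical data shape, not the project's actual schema: the names Rating, Verdict, allocation_pct, and price_targets are assumptions, and the ticker "XYZ" and all numbers are made up.

```python
from dataclasses import dataclass
from enum import Enum


class Rating(Enum):
    BUY = "BUY"
    HOLD = "HOLD"
    SELL = "SELL"


@dataclass(frozen=True)
class Verdict:
    """Hypothetical record of one model's independent decision on one ticker."""
    model: str                        # e.g. "Claude"
    ticker: str
    rating: Rating
    allocation_pct: float             # suggested portfolio allocation, in percent
    price_targets: dict[int, float]   # horizon in years -> target price

    def __post_init__(self) -> None:
        # Basic sanity checks; the real project may validate differently.
        if not 0.0 <= self.allocation_pct <= 100.0:
            raise ValueError("allocation_pct must be between 0 and 100")
        if set(self.price_targets) != {1, 2, 3}:
            raise ValueError("price_targets must cover the 1-, 2-, and 3-year horizons")


# Made-up example values for a fictional ticker.
example = Verdict("Claude", "XYZ", Rating.BUY, 12.5, {1: 110.0, 2: 125.0, 3: 140.0})
```

Keeping one immutable record per model reflects the separation described above: the research is shared, but each verdict belongs to exactly one model.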

Under the Hood

Portfolio Simulation

Each model manages its own simulated portfolio. Here's how the numbers work.

How It Works

After rendering verdicts, each AI model manages its own simulated portfolio. Every model is always 100% invested across its BUY picks, weighted by its own suggested allocations. When new analysis completes, portfolios are automatically rebalanced. A consensus portfolio aggregates the average allocations across all models that rate a ticker BUY.
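
The consensus aggregation can be sketched roughly as follows. This is a minimal sketch under two assumptions that the description does not spell out: a ticker's allocation is averaged only over the models that rate it BUY, and the averaged allocations are then renormalized so the consensus portfolio is also fully invested. The function name consensus_weights and the input layout are hypothetical.

```python
from collections import defaultdict


def consensus_weights(allocations: dict[str, dict[str, float]]) -> dict[str, float]:
    """Build consensus weights from per-model BUY allocations.

    `allocations` maps model name -> {ticker: suggested allocation in percent},
    containing BUY picks only. Returns weights that sum to 1.0.
    """
    by_ticker: dict[str, list[float]] = defaultdict(list)
    for picks in allocations.values():
        for ticker, pct in picks.items():
            by_ticker[ticker].append(pct)

    # Assumption: average over the models that rate the ticker BUY.
    averaged = {t: sum(v) / len(v) for t, v in by_ticker.items()}

    # Assumption: renormalize so the consensus portfolio is fully invested.
    total = sum(averaged.values())
    if total <= 0:
        return {}
    return {t: pct / total for t, pct in averaged.items()}


# Made-up example: two models, fictional tickers.
weights = consensus_weights({
    "ModelA": {"XYZ": 10.0, "ABC": 30.0},
    "ModelB": {"XYZ": 20.0},
})
```

In this made-up example, XYZ averages 15% and ABC averages 30%, so after renormalization XYZ carries one third of the consensus portfolio and ABC two thirds.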

Index Calculation

Each portfolio starts as a normalized index at 100. On every rebalance, the model's BUY picks are weighted by its suggested allocation percentages and renormalized to 100%. The index is then updated using the actual price changes between rebalances.
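
A minimal sketch of this calculation, assuming the weights set at a rebalance apply to each holding's price change until the next rebalance. The function names normalize and update_index are hypothetical, and the tickers and prices are made up.

```python
def normalize(allocations_pct: dict[str, float]) -> dict[str, float]:
    """Renormalize suggested BUY allocations so the weights sum to 1.0 (100%)."""
    total = sum(allocations_pct.values())
    if total <= 0:
        raise ValueError("at least one positive allocation is required")
    return {ticker: pct / total for ticker, pct in allocations_pct.items()}


def update_index(
    index_level: float,
    weights: dict[str, float],
    prev_prices: dict[str, float],
    curr_prices: dict[str, float],
) -> float:
    """Apply the weighted price change since the last rebalance to the index."""
    period_return = sum(
        w * (curr_prices[ticker] / prev_prices[ticker] - 1.0)
        for ticker, w in weights.items()
    )
    return index_level * (1.0 + period_return)


# Made-up example: suggested allocations of 10% and 30% become 25% and 75%.
w = normalize({"XYZ": 10.0, "ABC": 30.0})
level = update_index(
    100.0,
    w,
    prev_prices={"XYZ": 100.0, "ABC": 50.0},
    curr_prices={"XYZ": 110.0, "ABC": 48.0},
)
```

Here XYZ gains 10% at a 25% weight and ABC loses 4% at a 75% weight, for a period return of -0.5%, so the index moves from 100 to roughly 99.5.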

Key Simplifications

Always fully invested — there is no cash account. When a position is sold, its weight is immediately redistributed among the remaining holdings.
Percentage-only tracking — the simulation tracks allocation weights, not share counts. There are no fractional shares or lot-level accounting.
No trading frictions — transaction costs, slippage, bid/ask spreads, and taxes are not modeled.
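
The first simplification can be illustrated with a short sketch. It assumes the sold position's weight is redistributed pro rata, in proportion to the existing weights of the remaining holdings; the text above does not say how the redistribution is split, so treat this as one plausible reading. The function name drop_position is hypothetical.

```python
def drop_position(weights: dict[str, float], sold: str) -> dict[str, float]:
    """Remove `sold` and rescale the remaining weights so they again sum to 1.0."""
    remaining = {t: w for t, w in weights.items() if t != sold}
    total = sum(remaining.values())
    if total <= 0:
        # With no cash account, a portfolio cannot be left without holdings.
        raise ValueError("cannot sell the only holding in a fully invested portfolio")
    return {t: w / total for t, w in remaining.items()}


# Made-up example with fictional tickers.
after_sale = drop_position({"XYZ": 0.5, "ABC": 0.3, "DEF": 0.2}, sold="XYZ")
```

In this example the 50% held in XYZ is spread over ABC and DEF, which move from 30% and 20% to about 60% and 40%.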

Track live performance, compare model returns, and view full trading history in the tab.

Contact

Marcin Dukaczewski