Research note · trading-models desk

What actually matters in Jane Street’s prediction problems

Jane Street doesn’t publish its internal models — the closest public artifacts are its two Kaggle competitions built on real (anonymized) production trading data, the winning write-ups, and the firm’s own framing of the problem. We scraped those sources and ranked the concepts that authors explicitly state were the most important to performance.

Sources: 6 · Competitions: 2 (2020–21, 2024–25) · Teams covered: ~8,000 · Compiled: 2026-06-09

01 Concept ranking: stated importance across write-ups

Emphasis index: 3 pts when a source explicitly calls the concept “most important / biggest gain,” 1 pt when it is used and credited as material. Normalized to 10. Chips mark which sources cite it: JS = Jane Street’s own framing · ’21 = Market Prediction top write-ups · ’24 = Real-Time Forecasting write-ups · MLC = ML Contests 2025 meta-review.
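The scoring rule above can be sketched in a few lines. This is a minimal illustration, not the tally behind this note: the `MENTIONS` rows are hypothetical, and “normalized to 10” is assumed to mean that raw points are rescaled so the top-scoring concept lands at 10.

```python
from collections import defaultdict

# 3 pts for "most important / biggest gain", 1 pt for "used and credited".
POINTS = {"top": 3, "used": 1}

# Hypothetical (concept, source chip, level) rows, for illustration only.
MENTIONS = [
    ("online learning", "'24", "top"),
    ("online learning", "JS", "used"),
    ("ensembling", "'21", "used"),
    ("ensembling", "'24", "used"),
]

def emphasis_index(mentions, scale=10.0):
    """Sum points per concept, then rescale so the leader scores `scale`."""
    raw = defaultdict(int)
    for concept, _source, level in mentions:
        raw[concept] += POINTS[level]
    if not raw:
        return {}
    top = max(raw.values())  # assumed normalization: divide by the top raw score
    return {concept: round(scale * pts / top, 1) for concept, pts in raw.items()}

scores = emphasis_index(MENTIONS)
print(scores)
```

Under this assumed normalization the index is relative: a concept’s score changes whenever the leader’s raw total changes.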

02 Challenge → technique map

Jane Street states four structural challenges in its competition briefs. Every top write-up is, in effect, an answer to one or more of them.

03 In their own words

What each source explicitly singled out as the thing that mattered.

Jane Street · competition brief

“A strategy that works well with past data is unlikely to do so in the future… the relationship of the features with the response is constantly changing.”

→ non-stationarity is the named enemy

Yirun Zhang · 1st place, ’21 Market Prediction

Supervised autoencoder trained jointly with the MLP — bottleneck features reject “random relations”; Gaussian-noise augmentation and sample weights credited with the score boost.

→ denoising + true-objective weighting won it
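To make “trained jointly” concrete, here is a minimal forward-pass sketch of a joint objective: one reconstruction loss and one supervised loss computed from the same noisy input. It is not the winning architecture. The layer sizes, the tanh bottleneck, the equal loss weighting, and every name (`joint_loss`, `init_params`, `noise_std`) are assumptions for illustration, and no training loop is shown.

```python
import math
import random

def linear(x, W, b):
    """Dense layer: W is a list of rows, one row per output unit."""
    return [sum(w * v for w, v in zip(row, x)) + bi for row, bi in zip(W, b)]

def init_params(n_in, n_code, rng):
    """Small random weights for encoder, decoder, and classifier head."""
    def mat(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    return {
        "enc_W": mat(n_code, n_in), "enc_b": [0.0] * n_code,
        "dec_W": mat(n_in, n_code), "dec_b": [0.0] * n_in,
        "clf_W": mat(1, n_in + n_code), "clf_b": [0.0],
    }

def joint_loss(x, y, params, noise_std=0.1, rng=random):
    """One-sample loss: reconstruction MSE plus classification cross-entropy."""
    # Gaussian-noise augmentation: only the encoder input is perturbed.
    noisy = [v + rng.gauss(0.0, noise_std) for v in x]
    code = [math.tanh(h) for h in linear(noisy, params["enc_W"], params["enc_b"])]
    recon = linear(code, params["dec_W"], params["dec_b"])
    # Assumed wiring: the classifier sees raw features plus the bottleneck code.
    logit = linear(x + code, params["clf_W"], params["clf_b"])[0]
    p = 1.0 / (1.0 + math.exp(-logit))
    p = min(max(p, 1e-7), 1.0 - 1e-7)  # keep log() finite
    mse = sum((r - v) ** 2 for r, v in zip(recon, x)) / len(x)
    bce = -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return mse + bce  # both terms would share one backward pass in training

rng = random.Random(0)
x = [0.5, -1.0, 0.2, 0.0]
params = init_params(n_in=4, n_code=2, rng=rng)
loss = joint_loss(x, 1, params, noise_std=0.1, rng=rng)
```

Because both terms depend on the encoder, the bottleneck is pushed to keep information that reconstructs the input and predicts the label at once, which is the intuition behind rejecting “random relations.”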

scaomath · top 6%, ’21 Market Prediction

Most important: a utility-function regularizer (train on the metric you’re paid on), dynamic model selection for volatile vs. calm days, and median-of-middle-60% ensemble blending.

→ objective alignment + regime switching
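For reference, the metric being aligned to can be written out directly. The sketch below follows the utility score as recalled from the ’21 competition’s evaluation page: per-date profit, an annualized Sharpe-like ratio clipped to [0, 6], multiplied by total profit. It should be checked against that page before reuse. It is not scaomath’s regularizer, which would need a differentiable surrogate; the function and argument names are illustrative.

```python
import math
from collections import defaultdict

def utility_score(dates, weights, resps, actions):
    """Utility u = clip(t, 0, 6) * sum(p_i), with p_i the profit on date i."""
    daily = defaultdict(float)
    for d, w, r, a in zip(dates, weights, resps, actions):
        daily[d] += w * r * a  # a is 1 (take the trade) or 0 (pass)
    p = list(daily.values())
    total = sum(p)
    norm = math.sqrt(sum(v * v for v in p))
    if norm == 0.0:
        return 0.0  # no trades taken, or all daily profits are zero
    # Annualized Sharpe-like ratio over the number of dates seen.
    t = total / norm * math.sqrt(250.0 / len(p))
    return min(max(t, 0.0), 6.0) * total

score = utility_score(
    dates=[0, 0, 1, 1],
    weights=[1.0, 1.0, 1.0, 1.0],
    resps=[0.01, -0.02, 0.03, 0.01],
    actions=[1, 0, 1, 1],
)
print(score)
```

The clip is why plain classification loss is a poor proxy: a strategy with negative total profit scores zero however accurate its individual calls, and consistency across dates matters as much as the size of the gains.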

E. Volkova · ’24 Real-Time Forecasting

“This approach [online learning] significantly improved the model’s performance on CV (+0.008)… one-day updates for almost a year is enough.” Ensembling added only +0.0007.

→ adaptation beat ensembling roughly 11:1 (+0.008 vs. +0.0007)
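The predict-then-update loop behind one-day updates can be illustrated with a toy. This sketch swaps in a one-feature linear model trained by SGD on synthetic data whose true coefficient drifts each day. The write-up’s own models, data, and update schedule are not reproduced, and all names here are hypothetical.

```python
def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def daily_update(w, day_X, day_y, lr):
    """One SGD pass over a finished day's rows; returns the updated weights."""
    for x, y in zip(day_X, day_y):
        err = predict(w, x) - y
        w = [wi - lr * err * xi for wi, xi in zip(w, x)]
    return w

def run_online(days, w, lr):
    """For each day: forecast with current weights, then learn from its labels."""
    daily_mse = []
    for X, y in days:
        preds = [predict(w, x) for x in X]  # labels are not used for forecasting
        daily_mse.append(sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y))
        w = daily_update(w, X, y, lr)       # adapt once the day's labels arrive
    return w, daily_mse

# Synthetic drift: the true coefficient grows by 0.1 per day.
X = [[1.0], [2.0], [-1.0]]
days = [(X, [(1.0 + 0.1 * d) * x[0] for x in X]) for d in range(30)]

_, online_mse = run_online(days, [1.0], lr=0.1)
_, frozen_mse = run_online(days, [1.0], lr=0.0)  # lr=0 means never updated
print(online_mse[-1], frozen_mse[-1])
```

In this toy the frozen model’s error grows with the drift while the updated model stays within a fixed lag of the moving target, the same qualitative effect the write-up reports for adaptation.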

GRU + auxiliary targets · ’24 Real-Time Forecasting

Time-series GRU over each day’s sequence, four auxiliary responders (multi-task), market-average and rolling-window features, strict time-series CV with a 200-date holdout.

→ sequence context + multi-task learning
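A strict time-ordered holdout of the kind described above might look like the following sketch. The 200-date default comes from the write-up. The function name, the `date_ids` layout (one date id per row), and the choice to hold out the most recent dates are assumptions.

```python
def time_split(date_ids, holdout_dates=200):
    """Row indices for train/validation; validation is the last N distinct dates."""
    unique = sorted(set(date_ids))
    if len(unique) <= holdout_dates:
        raise ValueError("need more distinct dates than the holdout size")
    cutoff = unique[-holdout_dates]  # first date that belongs to the holdout
    train = [i for i, d in enumerate(date_ids) if d < cutoff]
    valid = [i for i, d in enumerate(date_ids) if d >= cutoff]
    return train, valid

# Toy layout: 300 dates, 3 rows per date.
date_ids = [d for d in range(300) for _ in range(3)]
train_idx, valid_idx = time_split(date_ids)
print(len(train_idx), len(valid_idx))
```

Splitting on distinct dates rather than on rows keeps every row of a given day on the same side of the cutoff, so no validation day leaks into training.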

ML Contests · 2025 meta-review

“GBDTs remain the go-to tabular winner’s tool, sometimes as part of an ensemble alongside neural nets” — yet both Jane Street competitions were won by neural networks.

→ JS data is the exception: NNs, not GBDTs, won both

04 Where the score actually came from: ’24 winner’s ledger

This is the only top write-up that publishes a component-by-component attribution (CV/LB deltas, same scale). Note the shape: adaptation dwarfs everything else.

05 Sources

Rank / role	Source and contents	Link
1st / 4,245	Yirun Zhang, “Training Supervised Autoencoder with MLP” — ’21 winner	kaggle discussion 224348
mirror	Numerai forum replication thread of the ’21 winning architecture	forum.numer.ai/4338
241st / 4,245	scaomath repo — utility regularizer, regime-switching, median blending	github.com/scaomath/kaggle-jane-street
’24 entry	E. Volkova solution.md — GRU, online learning, gain attribution	github.com/evgeniavolkova/kagglejanestreet
host	Jane Street competition announcement & problem framing	blog.janestreet.com
meta	ML Contests, “State of ML Competitions 2025” — field-wide patterns	mlcontests.com