Research note · trading-models desk

What actually matters in Jane Street’s prediction problems

Jane Street doesn’t publish its internal models — the closest public artifacts are its two Kaggle competitions built on real (anonymized) production trading data, the winning write-ups, and the firm’s own framing of the problem. We scraped those sources and ranked the concepts that authors explicitly state were the most important to performance.

Sources: 6 · Competitions: 2 (2020–21, 2024–25) · Teams covered: ~8,000 · Compiled: 2026-06-09

01 Concept ranking: stated importance across write-ups

Emphasis index: 3 pts when a source explicitly calls the concept “most important / biggest gain,” 1 pt when it is used and credited as material. Normalized to 10. Chips mark which sources cite it: JS = Jane Street’s own framing · ’21 = Market Prediction top write-ups · ’24 = Real-Time Forecasting write-ups · MLC = ML Contests 2025 meta-review.

Index (0–10) · Concept: approach · Cited by
10.0 · Adapt to non-stationarity: online learning, regime switching · JS, ’21, ’24
8.5 · Overfitting control: purged / embargoed time-series CV · JS, ’21, ’24
8.0 · Denoise low signal-to-noise: supervised autoencoder, noise aug · JS, ’21
6.5 · Train on the true objective: utility regularizer, sample weights · ’21
6.0 · Ensembling: seeds × architectures, median blends · ’21, ’24, MLC
4.5 · Multi-task auxiliary targets: predict several responders at once · ’21, ’24
4.0 · Context features: lags, market averages, rolling stats · ’24
3.5 · NN sequence models (+GBDT): GRU / MLP won; GBDT in blends · ’21, ’24, MLC
2.5 · Real-time inference budget: latency shapes model size · JS, ’21
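The scoring rule behind the emphasis index can be sketched in a few lines. This is a minimal illustration, not the tally we actually ran: the concept names and counts below are hypothetical placeholders, and "normalized to 10" is assumed to mean dividing by the highest raw score.

```python
def emphasis_index(tallies):
    """Map {concept: (n_explicit, n_material)} to a score on a 0-10 scale.

    n_explicit: sources calling the concept "most important / biggest gain" (3 pts each)
    n_material: sources that use it and credit it as material (1 pt each)
    Assumption: normalization divides by the top raw score.
    """
    raw = {c: 3 * explicit + material for c, (explicit, material) in tallies.items()}
    top = max(raw.values())
    if top == 0:
        return {c: 0.0 for c in raw}
    return {c: round(10 * r / top, 1) for c, r in raw.items()}


# Hypothetical tallies, for illustration only (not the note's real counts).
hypothetical = {"concept_a": (2, 1), "concept_b": (1, 2), "concept_c": (0, 2)}
scores = emphasis_index(hypothetical)
print(scores)
```

Under this rule a single "most important" mention outweighs two plain credits, which is why explicitly singled-out concepts rise to the top.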

02 Challenge → technique map

Jane Street states four structural challenges in its competition briefs. Every top write-up is, in effect, an answer to one or more of them.

Stated challenge (Jane Street):
Non-stationarity: “feature–response relationships are constantly changing”
Low signal-to-noise: weak signal, heavy noise, multicollinear features, fat tails
Easy to overfit history: “a strategy that worked in the past is unlikely to keep working”
Real-time decisions: predictions scored one-at-a-time under a latency budget

Winning response (write-ups):
Online learning: daily one-day refits; biggest single gain in ’24
Regime-aware models + recency weighting: switch models on volatility; weight recent days
Supervised autoencoder: denoising bottleneck features + Gaussian-noise aug; won ’21
Multi-task targets + ensembles: several responders; average seeds & architectures
Purged time-series CV + true-objective loss: PurgedGroupTimeSeriesSplit; utility-metric finetuning
Small, fast networks: 300–400k-param MLPs sized for the inference loop
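To make the purged time-series CV idea concrete, here is a minimal sketch of a forward-only split that drops a gap of whole dates between train and test. It is not the community PurgedGroupTimeSeriesSplit class named in the write-ups; the function name, the expanding-window layout, and the `gap` parameter are our own simplifying assumptions.

```python
def purged_time_series_splits(dates, n_splits=4, gap=5):
    """Yield (train_idx, test_idx) row-index lists for an expanding-window split.

    dates: one sortable date label per row (rows sharing a date stay together).
    gap:   number of whole dates dropped between the end of train and the
           start of test, so labels near the boundary cannot leak.
    Trailing dates that do not fill a whole fold are left unused.
    """
    unique_dates = sorted(set(dates))
    fold = len(unique_dates) // (n_splits + 1)
    if fold <= gap:
        raise ValueError("not enough dates for this n_splits / gap")
    for k in range(1, n_splits + 1):
        train_dates = set(unique_dates[: k * fold - gap])
        test_dates = set(unique_dates[k * fold : (k + 1) * fold])
        train_idx = [i for i, d in enumerate(dates) if d in train_dates]
        test_idx = [i for i, d in enumerate(dates) if d in test_dates]
        yield train_idx, test_idx


# Toy data: 50 dates with 3 rows each.
dates = [d for d in range(50) for _ in range(3)]
splits = list(purged_time_series_splits(dates, n_splits=4, gap=2))
for train_idx, test_idx in splits:
    print(len(train_idx), "train rows ->", len(test_idx), "test rows")
```

Every training date precedes every test date by more than `gap` dates, which is the property that keeps a validation score from being inflated by look-ahead.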

03 In their own words

What each source explicitly singled out as the thing that mattered.

Jane Street · competition brief
“A strategy that works well with past data is unlikely to do so in the future… the relationship of the features with the response is constantly changing.”
→ non-stationarity is the named enemy
Yirun Zhang · 1st place, ’21 Market Prediction
Supervised autoencoder trained jointly with the MLP — bottleneck features reject “random relations”; Gaussian-noise augmentation and sample weights credited with the score boost.
→ denoising + true-objective weighting won it
scaomath · top 6%, ’21 Market Prediction
Most important: a utility-function regularizer (train on the metric you’re paid on), dynamic model selection for volatile vs. calm days, and median-of-middle-60% ensemble blending.
→ objective alignment + regime switching
E. Volkova · ’24 Real-Time Forecasting
“This approach [online learning] significantly improved the model’s performance on CV (+0.008)… one-day updates for almost a year is enough.” Ensembling added only +0.0007.
→ adaptation beat ensembling roughly 11:1 (+0.008 vs. +0.0007)
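A toy walk-forward loop shows why daily updates help when the feature–response relationship drifts. This is a sketch on synthetic data with a one-feature linear model, not the write-up's GRU pipeline; the data generator, learning rate, and warm-up length are all assumptions chosen for illustration.

```python
import random


def make_drifting_days(n_days=200, rows_per_day=50, seed=0):
    """Synthetic days where the true coefficient drifts linearly from +1 to -1."""
    rng = random.Random(seed)
    days = []
    for d in range(n_days):
        beta = 1.0 - 2.0 * d / (n_days - 1)
        xs = [rng.gauss(0, 1) for _ in range(rows_per_day)]
        ys = [beta * x + rng.gauss(0, 0.5) for x in xs]
        days.append((xs, ys))
    return days


def sgd_pass(w, xs, ys, lr=0.05):
    """One pass of squared-error SGD over a single day's rows."""
    for x, y in zip(xs, ys):
        w -= lr * (w * x - y) * x
    return w


def walk_forward_mse(days, warmup=20, online=True):
    """Fit on the warm-up days, then score each later day before (optionally) learning from it."""
    w = 0.0
    for xs, ys in days[:warmup]:
        w = sgd_pass(w, xs, ys)
    sq_err, n = 0.0, 0
    for xs, ys in days[warmup:]:
        for x, y in zip(xs, ys):  # predict first: no look-ahead
            sq_err += (w * x - y) ** 2
            n += 1
        if online:  # then take a one-day update
            w = sgd_pass(w, xs, ys)
    return sq_err / n


days = make_drifting_days()
online_mse = walk_forward_mse(days, online=True)
static_mse = walk_forward_mse(days, online=False)
print(f"online MSE {online_mse:.3f} vs static MSE {static_mse:.3f}")
```

The frozen model keeps the coefficient it learned in the warm-up period and degrades as the true one drifts away, while the online model pays only a small lag penalty.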
GRU + auxiliary targets · ’24 Real-Time Forecasting
Time-series GRU over each day’s sequence, four auxiliary responders (multi-task), market-average and rolling-window features, strict time-series CV with a 200-date holdout.
→ sequence context + multi-task learning
ML Contests · 2025 meta-review
“GBDTs remain the go-to tabular winner’s tool, sometimes as part of an ensemble alongside neural nets” — yet both Jane Street competitions were won by neural networks.
→ Jane Street data is the exception to the GBDT rule
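The "metric you’re paid on" in the ’21 competition was a utility score computed from daily PnL. The sketch below evaluates that score in plain Python. The formula is reproduced from the competition's evaluation page rather than from any write-up quoted here, so treat it as an assumption to check against that page; a regularizer would additionally need a differentiable version, which this sketch does not attempt.

```python
import math
from collections import defaultdict


def utility_score(dates, weights, resps, actions):
    """Utility score as we understand the '21 competition definition.

    p_i = sum over trades on date i of weight * resp * action
    t   = (sum p_i / sqrt(sum p_i^2)) * sqrt(250 / number_of_dates)
    u   = min(max(t, 0), 6) * sum p_i
    """
    daily = defaultdict(float)
    for d, w, r, a in zip(dates, weights, resps, actions):
        daily[d] += w * r * a
    p = list(daily.values())
    total = sum(p)
    norm = math.sqrt(sum(x * x for x in p))
    if norm == 0.0:  # no trades taken, or all daily PnL exactly zero
        return 0.0
    t = total / norm * math.sqrt(250 / len(p))
    return min(max(t, 0.0), 6.0) * total


# Toy example: two dates, four trade opportunities, one skipped (action 0).
u = utility_score(
    dates=[0, 0, 1, 1],
    weights=[1.0, 1.0, 1.0, 1.0],
    resps=[0.02, -0.01, 0.01, 0.03],
    actions=[1, 0, 1, 1],
)
print(u)
```

Because the annualized Sharpe-like term `t` is clipped at zero, a strategy with negative total PnL scores nothing rather than a negative value, which is one reason training directly against this objective differs from minimizing a plain regression loss.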

04 Where the score actually came from: ’24 top write-up’s ledger

The only top write-up that publishes a component-by-component attribution (CV/LB deltas, same scale). Note the shape: adaptation dwarfs everything else.

+0.0080 · online learning (daily refits)
~+0.0030 · feature work (market averages, rolling stats, lags)
~+0.0020 · aux targets (multi-task responders)
+0.0007 · ensembling (6 models, mean)
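The shape of the ledger is easier to see as shares of the total. The arithmetic below uses the four deltas exactly as listed; treating them as additive is our assumption, and two of the four are approximate in the source, so the shares are indicative only.

```python
# Deltas as listed in the ledger (two of them are approximate in the source).
ledger = {
    "online learning": 0.0080,
    "feature work": 0.0030,
    "aux targets": 0.0020,
    "ensembling": 0.0007,
}

total = sum(ledger.values())
shares = {name: delta / total for name, delta in ledger.items()}
ratio = ledger["online learning"] / ledger["ensembling"]

for name, share in shares.items():
    print(f"{name:16s} {share:6.1%}")
print(f"online learning vs ensembling: {ratio:.1f}x")
```

On these numbers online learning accounts for a bit under three fifths of the listed gain and is roughly eleven times the ensembling delta.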

05 Sources

Source · What it is · Link
1st / 4,245 · Yirun Zhang, “Training Supervised Autoencoder with MLP” (’21 winner) · kaggle discussion 224348
mirror · Numerai forum replication thread of the ’21 winning architecture · forum.numer.ai/4338
241st / 4,245 · scaomath repo: utility regularizer, regime-switching, median blending · github.com/scaomath/kaggle-jane-street
’24 entry · E. Volkova solution.md: GRU, online learning, gain attribution · github.com/evgeniavolkova/kagglejanestreet
host · Jane Street competition announcement & problem framing · blog.janestreet.com
meta · ML Contests, “State of ML Competitions 2025”: field-wide patterns · mlcontests.com