Jane Street doesn’t publish its internal models — the closest public artifacts are its two Kaggle competitions built on real (anonymized) production trading data, the winning write-ups, and the firm’s own framing of the problem. We scraped those sources and ranked the concepts that authors explicitly state were the most important to performance.
Emphasis index: a concept earns 3 pts each time a source explicitly calls it “most important / biggest gain” and 1 pt each time it is used and credited as material. Scores are normalized to a 10-point scale. Chips mark which sources cite each concept: JS = Jane Street’s own framing · ’21 = Market Prediction top write-ups · ’24 = Real-Time Forecasting write-ups · MLC = ML Contests 2025 meta-review.
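The scoring rule can be sketched in a few lines. This is a minimal illustration, not the code used to build the ranking: the text does not say how scores are normalized, so dividing by the top raw score is an assumption, and the concept and source names (`concept_a`, `S1`, and so on) are placeholders rather than real tallies.

```python
from collections import defaultdict

# Points per mention, as stated in the scoring rule.
POINTS = {"most_important": 3, "material": 1}

def emphasis_index(mentions, scale=10.0):
    """Sum points per concept, then rescale so the top concept scores `scale`.

    `mentions` is a list of (concept, source, level) tuples. Rescaling by the
    maximum raw score is an ASSUMPTION; the text only says "normalized to 10".
    """
    raw = defaultdict(int)
    for concept, _source, level in mentions:
        raw[concept] += POINTS[level]
    top = max(raw.values())
    return {concept: round(scale * pts / top, 1) for concept, pts in raw.items()}

# Placeholder mentions, for illustration only (not real data).
mentions = [
    ("concept_a", "S1", "most_important"),
    ("concept_a", "S2", "material"),
    ("concept_b", "S1", "most_important"),
    ("concept_c", "S3", "material"),
]
scores = emphasis_index(mentions)
```

Under this reading, a concept singled out once and credited once (4 raw points) sets the top of the scale, and every other concept is expressed as a fraction of it.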
Jane Street states four structural challenges in its competition briefs. Every top write-up is, in effect, an answer to one or more of them.
What each source explicitly singled out as the thing that mattered.
“A strategy that works well with past data is unlikely to do so in the future… the relationship of the features with the response is constantly changing.”
’21 · 1st place (Yirun Zhang): Supervised autoencoder trained jointly with the MLP. The bottleneck features reject “random relations”; Gaussian-noise augmentation and sample weights are credited with the score boost.
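To make "trained jointly" concrete, here is a forward-pass sketch of the combined objective in NumPy. It is not the winning code: the layer sizes, noise level, equal loss weights, single-layer heads, and the choice to feed the main head the input concatenated with the bottleneck are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(n_in, n_out):
    """Randomly initialised weights and zero bias (illustrative only)."""
    return rng.normal(0, 0.1, (n_in, n_out)), np.zeros(n_out)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def weighted_bce(y, p, w):
    """Sample-weighted binary cross-entropy."""
    p = np.clip(p, 1e-7, 1 - 1e-7)
    losses = -(y * np.log(p) + (1 - y) * np.log(1 - p))
    return float(np.sum(w * losses) / np.sum(w))

n, n_feat, n_code = 64, 20, 8                # hypothetical sizes
X = rng.normal(size=(n, n_feat))             # synthetic features
y = (rng.random(n) > 0.5).astype(float)      # synthetic binary target
w = rng.random(n) + 0.5                      # synthetic per-sample weights

W_enc, b_enc = dense(n_feat, n_code)
W_dec, b_dec = dense(n_code, n_feat)
W_aux, b_aux = dense(n_code, 1)              # supervised head on the bottleneck
W_mlp, b_mlp = dense(n_feat + n_code, 1)     # main head (one layer for brevity)

def joint_loss(X, y, w, noise_std=0.1):
    """Reconstruction loss plus two supervised losses, summed into one objective."""
    X_noisy = X + rng.normal(0, noise_std, X.shape)    # Gaussian-noise augmentation
    code = np.tanh(X_noisy @ W_enc + b_enc)            # bottleneck features
    recon = code @ W_dec + b_dec
    recon_loss = float(np.mean((recon - X) ** 2))      # reconstruct the clean input
    aux_p = sigmoid(code @ W_aux + b_aux).ravel()      # label predicted from the code
    mlp_in = np.concatenate([X, code], axis=1)         # assumed wiring of the MLP input
    mlp_p = sigmoid(mlp_in @ W_mlp + b_mlp).ravel()
    return recon_loss + weighted_bce(y, aux_p, w) + weighted_bce(y, mlp_p, w)

loss = joint_loss(X, y, w)
```

Because a single optimizer would minimize the sum, the encoder is pushed to keep features that both reconstruct the input and predict the label, which is one way to read the claim that the bottleneck rejects "random relations".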
’21 · 241st place (scaomath): Most important were a utility-function regularizer (train on the metric you’re paid on), dynamic model selection for volatile vs. calm days, and median-of-middle-60% ensemble blending.
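Two of these pieces can be sketched briefly. `utility_score` follows the evaluation metric published for the ’21 competition (per-date P&L, scaled by a Sharpe-like term clipped to [0, 6]); how the repo folds it into a training regularizer is not described here, so only the metric is shown. `middle_blend` is one reading of "median-of-middle-60%": since the literal median of a symmetrically trimmed set equals the plain median, this sketch assumes the central 60% of model predictions are averaged. Both function names are hypothetical.

```python
import numpy as np

def utility_score(date, weight, resp, action):
    """Utility metric of the '21 competition for a set of trade decisions."""
    _, idx = np.unique(date, return_inverse=True)
    p = np.bincount(idx.ravel(), weights=weight * resp * action)  # P&L per date
    denom = np.sqrt(np.sum(p ** 2))
    if denom == 0:
        return 0.0                                   # no trades taken
    t = p.sum() / denom * np.sqrt(250 / len(p))      # annualised Sharpe-like term
    return float(min(max(t, 0.0), 6.0) * p.sum())

def middle_blend(preds, keep=0.6):
    """Average the central `keep` fraction of sorted model predictions.

    preds has shape (n_models, n_samples). Averaging (rather than taking a
    median) of the kept models is an ASSUMPTION about the blending rule.
    """
    preds = np.sort(np.asarray(preds, dtype=float), axis=0)
    n = preds.shape[0]
    drop = int(round(n * (1 - keep) / 2))            # models trimmed from each tail
    return preds[drop:n - drop].mean(axis=0)

# Toy inputs, for illustration only.
date = np.array([0, 0, 1, 1])
weight = np.array([1.0, 1.0, 1.0, 1.0])
resp = np.array([0.01, -0.02, 0.03, 0.01])
action = np.array([1, 0, 1, 1])
u = utility_score(date, weight, resp, action)

blend = middle_blend([[0.1], [0.9], [0.5], [0.4], [0.6]])
```

With five models and `keep=0.6`, the lowest and highest prediction for each sample are discarded and the remaining three are averaged, which damps the effect of any single outlier model.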
’24 (E. Volkova): “This approach [online learning] significantly improved the model’s performance on CV (+0.008)… one-day updates for almost a year is enough.” Ensembling added only +0.0007.
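The mechanism behind that gain can be shown on a toy problem. The sketch below is not the write-up's code: it uses synthetic data, a one-feature linear model, and a hypothetical learning rate, and it only illustrates why a model updated once per day can track a drifting relationship that a frozen model cannot.

```python
import numpy as np

rng = np.random.default_rng(0)
n_days, n_per_day = 120, 50          # hypothetical sizes

def make_day(day):
    """One day of synthetic data whose true coefficient drifts over time."""
    coef = 1.0 + 0.02 * day
    x = rng.normal(size=n_per_day)
    y = coef * x + rng.normal(scale=0.1, size=n_per_day)
    return x, y

def fit_coef(x, y):
    """Least-squares slope for a one-feature, no-intercept model."""
    return float(x @ y / (x @ x))

days = [make_day(d) for d in range(n_days)]

# Static model: fitted once on the first 60 days, then frozen.
x_train = np.concatenate([x for x, _ in days[:60]])
y_train = np.concatenate([y for _, y in days[:60]])
static_coef = fit_coef(x_train, y_train)

online_coef, lr = static_coef, 0.3
static_err, online_err = [], []
for x, y in days[60:]:
    # Predict first, using only what was known before the day started.
    static_err.append(np.mean((y - static_coef * x) ** 2))
    online_err.append(np.mean((y - online_coef * x) ** 2))
    # Once the day's responses are revealed, take one gradient step.
    grad = -2 * np.mean((y - online_coef * x) * x)
    online_coef -= lr * grad

static_mse = float(np.mean(static_err))
online_mse = float(np.mean(online_err))
```

The key ordering is predict, then update: each day's prediction uses only past information, so the comparison between `static_mse` and `online_mse` stays free of look-ahead.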
’24 (E. Volkova): Time-series GRU over each day’s sequence, four auxiliary responders (multi-task), market-average and rolling-window features, and strict time-series CV with a 200-date holdout.
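Two of these ingredients are simple enough to sketch in NumPy. The code below assumes the holdout is the most recent 200 distinct dates and that a market-average feature is the mean across all symbols at the same (date, time) step; the function names and column names (`date_id`, `time_id`) are hypothetical, and the GRU and rolling-window features are not shown.

```python
import numpy as np

def holdout_by_date(date_id, n_holdout=200):
    """Return boolean masks (train, valid); the last `n_holdout` dates are held out."""
    dates = np.unique(date_id)          # sorted ascending
    cutoff = dates[-n_holdout]          # first held-out date
    valid = date_id >= cutoff
    return ~valid, valid

def market_average(date_id, time_id, feature):
    """Mean of `feature` over all rows sharing the same (date, time) step."""
    keys = np.stack([date_id, time_id], axis=1)
    _, idx = np.unique(keys, axis=0, return_inverse=True)
    idx = idx.ravel()
    sums = np.bincount(idx, weights=feature)
    counts = np.bincount(idx)
    return (sums / counts)[idx]         # broadcast group means back to rows

# Synthetic panel: 300 dates, 2 time steps per date, 2 symbols per step.
date_id = np.repeat(np.arange(300), 4)
time_id = np.tile([0, 0, 1, 1], 300)
feature = np.arange(1200, dtype=float)

train, valid = holdout_by_date(date_id, n_holdout=200)
mkt = market_average(date_id, time_id, feature)
```

Splitting on whole dates rather than rows keeps every validation row strictly later than every training row, which is the property that makes the CV "strict" in the time-series sense.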
MLC: “GBDTs remain the go-to tabular winner’s tool, sometimes as part of an ensemble alongside neural nets.” Yet both Jane Street competitions were won by neural networks.
The only top write-up that publishes a component-by-component attribution (CV/LB deltas on the same scale). Note the shape: adaptation dwarfs everything else, with online learning’s +0.008 roughly 11 times the +0.0007 from ensembling.
| Rank / role | Source | Link |
|---|---|---|
| 1st / 4,245 | Yirun Zhang, “Training Supervised Autoencoder with MLP” — ’21 winner | kaggle discussion 224348 |
| mirror | Numerai forum replication thread of the ’21 winning architecture | forum.numer.ai/4338 |
| 241st / 4,245 | scaomath repo — utility regularizer, regime-switching, median blending | github.com/scaomath/kaggle-jane-street |
| ’24 entry | E. Volkova solution.md — GRU, online learning, gain attribution | github.com/evgeniavolkova/kagglejanestreet |
| host | Jane Street competition announcement & problem framing | blog.janestreet.com |
| meta | ML Contests, “State of ML Competitions 2025” — field-wide patterns | mlcontests.com |