trading-models · working paper VI · Options · Event Volatility

When Is Earnings Volatility Cheap?

A long straddle into scheduled earnings is a bet that the realized move beats the implied one. Owning that bet indiscriminately loses to the volatility risk premium; the only edge is the selection rule that decides when not to play.

Family: options
Window: event (T−3 → T+1)
Assets: equities
Data: yfinance daily + synthetic IV
Structure: ATM straddle
Edge: selection filter
Status: Phase-1 synthetic · not tradeable

Abstract

Scheduled earnings announcements are the cleanest volatility events a retail trader can position around: the date is known weeks ahead, and a single report compresses a quarter of information into one overnight gap. The textbook expression is a long at-the-money straddle — a call and a put at the same strike and expiry — bought before the report and sold after, profiting from a large move in either direction. We complete the options strategy that Poudel's (2025) small-cap dissertation specifies and then explicitly declines to implement, and in doing so make one point precisely: buying earnings volatility is, on average, a losing trade, because the implied move priced into the straddle already embeds a positive volatility risk premium and is then crushed the instant the announcement removes the uncertainty it was charging for.

The strategy's only source of edge is therefore not owning the straddle but selecting which earnings to own it for. We forecast each event's realized move from the name's own past earnings reactions and enter only when that forecast exceeds the straddle's implied move by a margin, \(\widehat{m}_{\text{exp}}>m_{\text{imp}}\cdot k\) with \(k>1\). The filter, not the option, is the alpha. We pair it with a non-parametric validation battery — a centered bootstrap test for the per-trade mean and a Benjamini–Hochberg false-discovery correction across the candidate universe — precisely because a handful of earnings trades on a hand-picked watchlist is exactly the setting in which spurious edges are manufactured.

The model runs on a deliberately synthetic volatility surface (an elevated pre-earnings implied vol that collapses to a lower post-earnings level), so the Phase-1 numbers test mechanism, not tradeability. Across 216 earnings events on nine single names (2020–2026) the naïve unfiltered straddle bleeds as the premium predicts — a per-trade expectancy of \(-\$125.65\), profit factor \(0.68\), \(-\$27{,}140\) total, bootstrap \(p=0.052\). The selection filter lifts the relative numbers (filtered expectancy \(+\$56.69\), profit factor \(1.15\)), but the lift is not real edge: the filtered mean is statistically indistinguishable from zero (\(p=0.778\), \(95\%\) CI \([-\$324,\,+\$463]\)), it has a negative median (a fat right tail of two names, META and NFLX), and — decisively — its sign is an artifact of the assumed implied vol. Because the synthetic \(\sigma_{\text{pre}}\) is a single global constant, the implied move is ticker-independent (\(\approx0.075\) for all nine names), so the gate \(\widehat m_{\text{exp}}>m_{\text{imp}}\cdot k\) collapses into a pure realized-vol screen rather than a mispricing test; and sweeping \(\sigma_{\text{pre}}\) from \(30\%\) to \(65\%\) swings the profit factor from \(4.60\) (\(p=0.001\)) to \(0.29\). After Benjamini–Hochberg control the only name distinguishable from zero is NVDA — and it is a significant loser (\(-\$391.82\)/trade, \(0/3\) wins). The honest reading: the filter is necessary to avoid the premium, it is not shown to be sufficient to overcome it, and the "selection alpha" cannot even be tested until a real chain supplies name-specific implied moves.

Keywords. earnings announcements, implied move, long straddle, volatility risk premium, IV crush, selection filter, bootstrap inference, Benjamini–Hochberg FDR, event-driven volatility.

Introduction

Every other options strategy in this catalogue trades volatility as a continuous quantity. An earnings straddle trades it as a scheduled discrete event, and that changes the economics entirely. The market knows the report is coming; it bids implied volatility up in the days before, building a premium that prices the expected overnight jump; and it lets that premium evaporate the moment the report lands and the uncertainty resolves. A trader who simply buys the straddle is buying the most expensive insurance in the equity market at the exact moment it is most expensive, and watching it lose value whether or not the stock moves. The interesting question is never "should I buy the earnings straddle" — almost always the answer is no — but "for which earnings, if any, is the implied move cheap relative to what the stock will actually do."

This paper takes the event-volatility strategy from Poudel (2025), Small-Cap Stock Trading Strategies for Retail Traders (SSRN 5921742), Strategy Family E. That work describes the long-strangle-around-earnings trade and then steps back from it: "For simplicity, this section focuses on equity-based event trading rather than options." Strategy E was also the dissertation's weakest result (an out-of-sample Sharpe of 0.54, the lowest of its six families) — and that is not a coincidence to paper over but the central fact to build around. Naïve long earnings vol underperforms because the premium it pays usually exceeds the move it collects. We implement the options version the dissertation omitted, but with the selection discipline that its own results imply is mandatory.

Our claim, as with the delta-hedged paper that precedes this one, is not that the strategy prints money. It is that the strategy's profitability is conditional and forecastable: it lives or dies on whether a simple forecast of the realized earnings move can beat the implied move embedded in the straddle, net of an unforgiving four-legged transaction cost and a modelled volatility crush. We build the mechanism, we state exactly where the theory is sound and where it is fragile, and we validate the small-sample result with inference tools chosen to resist the very overfitting this kind of strategy invites.

The Instrument: an At-the-Money Straddle Around the Report

Let \(S_t\) be the underlying close on bar \(t\) and \(E\) the scheduled earnings datetime. A long straddle is one long call and one long put, struck together at the money and sharing the nearest listed expiry strictly after the report. With the strike snapped to the nearest grid increment, \(K=\operatorname{snap}(S_t)\), and both legs priced by Black–Scholes–Merton at an implied volatility \(\sigma\), the per-share premium is the sum of the two,

\begin{equation} P_{\text{straddle}}(S,K,T,\sigma) = C_{\text{BS}}(S,K,T,\sigma) + \mathrm{Put}_{\text{BS}}(S,K,T,\sigma), \end{equation}

and the position is delta-neutral at inception (an ATM call's \(+\Delta\approx+0.5\) is offset by the ATM put's \(-\Delta\approx-0.5\)), long gamma, long vega, and short theta on both legs. Its payoff at expiry is the familiar V: the holder profits if the underlying finishes far enough from \(K\) in either direction to recover the combined premium.

[Figure: long-straddle P&L at expiry against the underlying at expiry. Marked: the strike K (≈ spot), the breakevens K − P and K + P, the maximum loss (−premium) at K, and profit on a large move in either direction.]
The long-straddle payoff at expiry. The position needs the underlying to finish beyond either breakeven, \(K\pm P\), to profit; inside that band it loses, with the maximum loss, the full premium, occurring when the underlying finishes exactly at the strike. The half-width of the breakeven band as a fraction of spot, \(P/S\), is the implied move: the realized move the trade must beat just to break even.
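A minimal sketch of the premium, the breakevens, and the implied move follows, pricing both legs with textbook Black–Scholes–Merton. The function names, the zero interest rate, the absence of dividends, and the inputs (spot 100, a 45% vol, an 11-trading-day tenor) are illustrative assumptions, not values read from the model's code.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bsm_price(spot, strike, t_years, vol, rate=0.0, kind="call"):
    """Black-Scholes-Merton price of a European option (no dividends)."""
    sqrt_t = math.sqrt(t_years)
    d1 = (math.log(spot / strike) + (rate + 0.5 * vol ** 2) * t_years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    disc = math.exp(-rate * t_years)
    if kind == "call":
        return spot * norm_cdf(d1) - strike * disc * norm_cdf(d2)
    return strike * disc * norm_cdf(-d2) - spot * norm_cdf(-d1)


def straddle_premium(spot, strike, t_years, vol, rate=0.0):
    """Per-share premium of one call plus one put at the same strike and expiry."""
    call = bsm_price(spot, strike, t_years, vol, rate, "call")
    put = bsm_price(spot, strike, t_years, vol, rate, "put")
    return call + put


def payoff_at_expiry(final_spot, strike, premium):
    """Per-share P&L of the long straddle held to expiry, before costs."""
    return abs(final_spot - strike) - premium


# Illustrative inputs (assumptions, not model output).
spot = strike = 100.0
premium = straddle_premium(spot, strike, t_years=11 / 252, vol=0.45)
lower, upper = strike - premium, strike + premium
implied_move = premium / spot

print(f"premium {premium:.2f}, breakevens {lower:.2f} / {upper:.2f}, "
      f"implied move {implied_move:.2%}")
```

At either breakeven the expiry payoff is zero, and at the strike it equals the full premium with a negative sign, which is the V described in the caption.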

The Implied Move, the Risk Premium, and the Crush

The single number that governs the trade is the implied move: the distance from the strike to either breakeven, expressed as a fraction of spot. For an ATM straddle this is the premium over spot, which is in turn well approximated by a multiple of \(\sigma\sqrt{T}\),

\begin{equation} m_{\text{imp}} \;=\; \frac{P_{\text{straddle}}}{S} \;\approx\; 0.8\,\sigma\sqrt{T}, \end{equation}

which is the option market's priced-in expected absolute move through expiry — explicitly the mean-absolute move \(\mathbb E|R|\), not the one-sigma move \(\sigma\sqrt{T}\) (the \(0.8\approx\sqrt{2/\pi}\) is the half-normal factor). Held through the announcement to a same-week exit, the straddle's profit-and-loss is, to first order,

\begin{equation} \Pi \;\approx\; M\cdot S\cdot\big(\,|R_{\text{event}}|\;-\;m_{\text{imp}}\,\big)\;-\;\text{(theta + crush + costs)}, \end{equation}

with \(M=100\) the contract multiplier, \(S\) the spot at entry (which converts the fractional moves into dollars per share), and \(R_{\text{event}}\) the realized announcement return. The trade wins when the realized move clears the implied move by enough to cover the carry. Two forces make that hard, and both are structural rather than incidental.

The volatility risk premium. Implied volatility is, on average across names and time, a biased-high forecast of subsequent realized volatility — sellers of options demand compensation for bearing variance risk. Bakshi & Kapadia (2003) document that delta-hedged option positions earn negative average returns precisely because of this premium; Carr & Wu (2009) measure variance risk premia as robustly negative for the buyer. Around earnings the premium is at its largest: implied vol is bid up specifically to price the event. The long straddle is the leveraged long side of exactly this losing average bet.

The crush. The instant the report is released, the uncertainty the pre-earnings implied vol was charging for disappears, and implied vol collapses from an elevated pre-event level to a lower post-event one. The straddle's remaining time value — its vega — is repriced down at the new, lower vol the moment you would exit. This is the "IV crush," and it is why a stock can move and the straddle still lose: the intrinsic gain from the move is offset by the vega loss from the vol collapse.

[Figure: stylised implied-vol path against time. Pre-earnings IV is elevated (45%) and crushes to 25% at the earnings bar (E); entry is marked at T−3 and exit at T+1.]
The stylised earnings IV path the Phase-1 surface models. Implied vol is elevated entering the report and crushes immediately after. A straddle entered at T−3 pays the inflated pre-event vol; exited at T+1 it is marked at the crushed post-event vol. The buyer must collect enough realized move at the vertical line to offset this vega give-back plus three-plus days of theta.
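The interaction of move and crush can be made concrete with a small sketch: enter an ATM straddle at the elevated vol, exit a few days later at the crushed vol after the stock has moved, and compare the two marks. Everything numerical here is an assumption chosen for illustration (spot 100, 14 calendar days to expiry at entry, four days held, zero rates, no costs); only the 45% and 25% vol levels come from the Phase-1 surface.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def straddle_value(spot, strike, t_years, vol):
    """BSM call + put with zero rates and no dividends (an assumption)."""
    sqrt_t = math.sqrt(t_years)
    d1 = (math.log(spot / strike) + 0.5 * vol ** 2 * t_years) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    call = spot * norm_cdf(d1) - strike * norm_cdf(d2)
    put = strike * norm_cdf(-d2) - spot * norm_cdf(-d1)
    return call + put


def crush_pnl(move, spot=100.0, pre_iv=0.45, post_iv=0.25,
              t_entry=14 / 365, days_held=4):
    """Per-share P&L of a straddle bought at pre_iv and sold at post_iv.

    `move` is the fractional change in the underlying between entry and exit.
    """
    strike = spot  # struck at the money on entry
    entry_mark = straddle_value(spot, strike, t_entry, pre_iv)
    t_exit = t_entry - days_held / 365
    exit_mark = straddle_value(spot * (1.0 + move), strike, t_exit, post_iv)
    return exit_mark - entry_mark


for move in (0.0, 0.03, 0.06, 0.12):
    print(f"{move:+.0%} move -> P&L per share {crush_pnl(move):+.2f}")
```

Under these assumed inputs a 3% move still leaves the position below its entry cost, while a 12% move clears it: the move has to pay for the vol give-back and the elapsed days before it pays the holder anything.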

The Selection Filter — the Only Edge

Because owning the straddle is a losing average bet, the strategy's profitability must come entirely from declining most of them. We forecast each event's realized move from the name's own history: the mean absolute close-to-next-bar return across its last \(\ell\) earnings dates, using prior events only,

\begin{equation} \widehat{m}_{\text{exp}} \;=\; \frac{1}{\ell}\sum_{j=1}^{\ell}\bigl|R_{E_j}\bigr|,\qquad E_j < E, \end{equation}

and we take the straddle only when this forecast exceeds the implied move by a margin governed by a single parameter \(k>1\):

Decision rule (the alpha)

\begin{equation} \text{enter the straddle} \iff \widehat{m}_{\text{exp}} \;>\; m_{\text{imp}}\cdot k, \qquad k>1. \end{equation}

The two sides are deliberately like-for-like: \(m_{\text{imp}}=P/S\) is the option market's mean-absolute move, and \(\widehat{m}_{\text{exp}}\) is the realized mean-absolute move. The margin \(k\) is the cushion the realized history must clear to pay for the premium and the crush. Setting \(k>1\) is what enforces "only when implied vol looks cheap," not merely "whenever the stock has moved before."
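The forecast and the gate can be sketched in a few lines. The function names and the sample return history are hypothetical; the default margin of 1.2 and the 7.5% implied move are the values the paper reports for Phase 1.

```python
from statistics import fmean


def expected_move(past_event_returns, lookback):
    """Mean absolute return over the last `lookback` prior earnings events.

    `past_event_returns` must hold only events strictly before the one being
    forecast, oldest first, so no future information leaks in.
    """
    if lookback < 1:
        raise ValueError("lookback must be at least 1")
    recent = past_event_returns[-lookback:]
    if not recent:
        raise ValueError("no prior earnings events to forecast from")
    return fmean([abs(r) for r in recent])


def should_enter(m_exp, m_imp, k=1.2):
    """The k-gate: trade only if the forecast beats the implied move by k."""
    if k <= 1.0:
        raise ValueError("k must exceed 1 to demand a cushion over the implied move")
    return m_exp > m_imp * k


# Hypothetical close-to-next-bar earnings returns for one name, oldest first.
history = [0.12, -0.08, 0.15, -0.07, 0.10]
forecast = expected_move(history, lookback=4)
print(forecast, should_enter(forecast, m_imp=0.075))
```

With a 7.5% implied move and \(k=1.2\) the threshold is 9.0%, so a 10.4% forecast passes the gate and an 8% forecast does not, even though 8% exceeds the implied move itself.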

On top of the k-gate sit hard no-trade filters that reject the event outright when the chain is untradeable: a non-positive or undefined implied vol, no listed expiry after the report, or a quoted half-spread wider than a cap. These are the operational safeguards of Poudel's Chapter 6, scoped to what Phase 1 can see.
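A sketch of those three hard filters, written as a function that returns the rejection reason or `None`. The function name, the argument layout, and the choice to express the half-spread as a fraction of mid are assumptions made for illustration; the source does not state the cap's value, so it is left as a required argument.

```python
import math
from datetime import date
from typing import Optional, Sequence


def no_trade_reason(implied_vol: Optional[float],
                    expiries: Sequence[date],
                    report_date: date,
                    half_spread: float,
                    max_half_spread: float) -> Optional[str]:
    """Return why the event is untradeable, or None if it passes.

    half_spread and max_half_spread are fractions of the option mid
    (an assumed convention, not taken from the source).
    """
    if implied_vol is None or math.isnan(implied_vol) or implied_vol <= 0.0:
        return "implied vol non-positive or undefined"
    if not any(expiry > report_date for expiry in expiries):
        return "no listed expiry after the report"
    if half_spread > max_half_spread:
        return "half-spread wider than cap"
    return None


report = date(2024, 7, 5)
print(no_trade_reason(0.45, [date(2024, 7, 12)], report, 0.01, 0.05))
print(no_trade_reason(0.45, [date(2024, 7, 5)], report, 0.01, 0.05))
```

The checks run before the k-gate is consulted, so a name with an unusable chain is rejected regardless of how attractive its forecast looks.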

[Figure: forecast versus priced move. NVDA: threshold m_imp · k = 9.0%, forecast m_exp = 10.4%, gate fires. Large-cap names: forecast below threshold, gate declines. A forecast bar wider than the threshold line means trade.]
The selection filter in action on the 2023–2024 watchlist. NVDA's forecast move (10.4%) cleared the \(k=1.2\) threshold over its implied move (\(7.5\%\times1.2=9.0\%\)) and was traded; the large-cap names, whose muted earnings histories sat below their implied moves, were declined. The filter admitted one name of five — the discipline working as intended, even if the admitted name still lost.

Trade Construction, Frictions, and Sizing

An admitted event is traded on a strict, bar-counted schedule, deliberately independent of the calendar so that holidays and gaps do not misplace the legs. Entry is the third actual price bar before the earnings bar (T−3); exit is the first bar after (T+1). The legs are opened ATM at entry with a tenor placed beyond the exit, marked to market each bar by the engine, and closed at exit.

T−3 ENTER T−2 T−1 E earnings T+1 EXIT long straddle held across the event bars counted on the actual price index, not business-day offsets
The event window. Entry and exit are counted in actual trading bars around the earnings bar, so a gapped or holiday-shortened calendar never lands a leg on a non-trading date.
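Bar counting of this kind can be sketched with a sorted list of bar dates and a binary search. The function name and the sample bar index (which skips 2024-07-04 to mimic a market closure) are hypothetical; the rule that the earnings bar is the first bar at or after the earnings date follows the paper's own description.

```python
from bisect import bisect_left
from datetime import date


def event_window(bar_dates, earnings_date, bars_before=3, bars_after=1):
    """Return (entry_date, exit_date) counted in actual bars.

    bar_dates must be sorted ascending. The earnings bar is the first bar
    at or after earnings_date.
    """
    e = bisect_left(bar_dates, earnings_date)
    if e == len(bar_dates):
        raise ValueError("no bar at or after the earnings date")
    entry, exit_ = e - bars_before, e + bars_after
    if entry < 0 or exit_ >= len(bar_dates):
        raise ValueError("not enough bars around the event")
    return bar_dates[entry], bar_dates[exit_]


# Hypothetical daily bar index with a gap on 2024-07-04.
bars = [date(2024, 6, 28), date(2024, 7, 1), date(2024, 7, 2),
        date(2024, 7, 3), date(2024, 7, 5), date(2024, 7, 8),
        date(2024, 7, 9)]

print(event_window(bars, date(2024, 7, 5)))
```

Because the count runs over the bars that exist, the entry lands on 2024-07-01, three real bars before the event. An offset that counted 2024-07-04 as a session would land one bar late.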

Frictions. A straddle is two legs, and the engine crosses a bid/ask spread on each — so the position pays the spread four times over its life (call and put at entry, call and put at exit). For a strategy whose entire edge is a few percentage points of move, this four-legged cost is not a footnote; it is one of the largest line items, and a realistic spread model is what keeps the backtest honest. Costs here are \(1.0\) bp fee plus \(0.5\) bp slippage per leg-crossing.
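The four crossings can be tallied as below. This is a sketch under two assumptions the text does not settle: that the basis-point charges apply to each leg's premium notional, and that the quoted half-spread is expressed as a fraction of the leg price. The function names and sample prices are hypothetical.

```python
FEE_BPS = 1.0        # per leg-crossing, from the text
SLIPPAGE_BPS = 0.5   # per leg-crossing, from the text


def crossing_cost(leg_price, contracts, multiplier=100, half_spread_frac=0.0):
    """Cost of crossing one leg once.

    Assumes bps are charged on premium notional and the half-spread is a
    fraction of the leg price (both are illustrative conventions).
    """
    notional = leg_price * contracts * multiplier
    bps_cost = notional * (FEE_BPS + SLIPPAGE_BPS) / 10_000
    return bps_cost + notional * half_spread_frac


def round_trip_cost(entry_call, entry_put, exit_call, exit_put,
                    contracts, half_spread_frac=0.0):
    """Total friction over the four crossings of one straddle's life."""
    legs = (entry_call, entry_put, exit_call, exit_put)
    return sum(crossing_cost(price, contracts, half_spread_frac=half_spread_frac)
               for price in legs)


# Hypothetical per-share leg prices for a single contract.
print(round_trip_cost(3.75, 3.75, 3.00, 0.50, contracts=1))
print(round_trip_cost(3.75, 3.75, 3.00, 0.50, contracts=1, half_spread_frac=0.02))
```

On these made-up prices the basis-point charges come to cents while a 2% half-spread costs dollars per contract, which is why the spread model, not the fee schedule, dominates the friction line.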

Sizing. A long option is defined-risk — the most you can lose is the premium — so position size is fixed-fractional against that premium: \(n=\big\lfloor (\text{capital}\cdot f)\,/\,(P\cdot M)\big\rfloor\) contracts for a risk fraction \(f\), with a cap on the number of concurrent open straddles. No stop-loss is needed below the premium floor; the option's own convexity is the risk control.
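The sizing rule is a single floor division. In this sketch the function name, the default cap of three concurrent straddles, and the sample account are assumptions; the formula itself is the one stated above.

```python
import math


def straddle_contracts(capital, risk_fraction, premium_per_share,
                       open_straddles=0, max_open=3, multiplier=100):
    """Contracts to buy so the premium at risk stays within capital * f.

    Returns 0 when the cap on concurrent open straddles is already reached.
    max_open=3 is an illustrative default, not the model's setting.
    """
    if premium_per_share <= 0:
        raise ValueError("premium must be positive")
    if open_straddles >= max_open:
        return 0
    budget = capital * risk_fraction
    return math.floor(budget / (premium_per_share * multiplier))


n = straddle_contracts(capital=100_000, risk_fraction=0.02, premium_per_share=7.5)
print(n, "contracts, max loss", n * 7.5 * 100)
```

Flooring rather than rounding keeps the worst case, the full premium on every contract, at or below the risk budget.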

The Phase-1 Synthetic Experiment

This model runs without any real options-chain data. In place of quoted marks it uses a synthetic EventVolSurface: a single elevated implied vol before the report (\(\sigma_{\text{pre}}=45\%\)) that crushes to a lower level after (\(\sigma_{\text{post}}=25\%\)), with the invariant \(\sigma_{\text{pre}}>\sigma_{\text{post}}>0\) enforced at construction. The realized move that the straddle actually collects comes from the true underlying bars (yfinance daily); only the volatility it is priced at is synthetic.

This buys clarity at the price of tradeability. The experiment can validate the full pipeline end to end — calendar → event selection → straddle construction → frictioned mark-to-market → validation report — and it can show the mechanism by which the crush erodes a long straddle. What it cannot do is measure a real edge: the entry premium and the crush magnitude are assumptions, not market observations, so a profit here would prove nothing and a loss here indicts the assumptions, not the market. The Phase-1 numbers below are therefore framed as a mechanism check, explicitly not tradeable. The real chain (Phase 2/3) is what turns the same code into a tradeable test.
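A two-level surface with that invariant can be sketched as a small frozen dataclass. This is not the project's `EventVolSurface`; the class name, method name, and the convention that the earnings bar itself is already marked at the crushed vol are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class EventVolSurfaceSketch:
    """Hypothetical stand-in for a two-level pre/post earnings vol surface."""
    pre_iv: float = 0.45
    post_iv: float = 0.25

    def __post_init__(self):
        # Enforce the invariant at construction: pre > post > 0.
        if not (self.pre_iv > self.post_iv > 0.0):
            raise ValueError("require pre_iv > post_iv > 0")

    def iv(self, bar_date: date, earnings_date: date) -> float:
        """Elevated vol before the earnings bar, crushed vol from it onward."""
        return self.pre_iv if bar_date < earnings_date else self.post_iv


surface = EventVolSurfaceSketch()
earnings = date(2024, 7, 5)
print(surface.iv(date(2024, 7, 1), earnings), surface.iv(date(2024, 7, 8), earnings))
```

Because the two levels are global constants, every name is priced at the same vol on the same side of its report, which is the property the results section identifies as the source of the ticker-independent implied move.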

Results — the Filter Is Necessary, Not Sufficient

The study runs over nine liquid optionable single names (AAPL, AMZN, MSFT, NVDA, TSLA, META, GOOGL, NFLX, AMD) across their full cached earnings histories, 2020–2026 — 216 events in all. (SPY, the nominal default ticker, is an ETF with no earnings and never trades.) The unfiltered branch trades every event; the filtered branch trades only the \(33\) that cleared the \(k=1.2\) gate. Those \(33\) are heavily concentrated — META (\(11\)) and NFLX (\(10\)) supply two-thirds of them, TSLA (\(6\)) and NVDA (\(3\)) most of the rest, while MSFT and GOOGL never fire. That concentration is the first warning sign: because the synthetic implied move is a ticker-independent constant (\(\approx7.5\%\) for every name, since \(P/S\) for a near-ATM straddle is a function of \(\sigma\sqrt T\) only and \(\sigma_{\text{pre}}=45\%\) is fixed), the gate is not comparing implied against forecast at all — it is screening for names whose past realized earnings moves exceed \(\approx9\%\). The "selection filter" is, in Phase 1, a high-realized-vol screen.
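The ticker-independence is easy to verify. With zero rates and an exactly ATM strike, the Black–Scholes–Merton straddle reduces to \(P/S = 2\,(2\Phi(\sigma\sqrt{T}/2)-1)\), in which spot does not appear. The sketch below evaluates that closed form and applies the gate to two made-up forecasts. The 11-trading-day tenor is an assumption chosen because it reproduces the roughly 7.5% implied move reported above; the names and forecasts are hypothetical.

```python
import math


def atm_implied_move(vol, t_years):
    """Exact BSM P/S for an ATM straddle with zero rates and no dividends.

    Uses 2*Phi(x) - 1 = erf(x / sqrt(2)) with x = vol * sqrt(T) / 2.
    Spot is not an argument: the ratio does not depend on it.
    """
    v = vol * math.sqrt(t_years)
    return 2.0 * math.erf(v / (2.0 * math.sqrt(2.0)))


tenor = 11 / 252                      # assumed tenor at entry
m_imp = atm_implied_move(0.45, tenor)
approx = math.sqrt(2.0 / math.pi) * 0.45 * math.sqrt(tenor)
threshold = 1.2 * m_imp

# Hypothetical forecasts: the gate sees only the name's own realized history.
forecasts = {"NAME_A": 0.104, "NAME_B": 0.045}
fired = {name: m_exp > threshold for name, m_exp in forecasts.items()}

print(f"implied move {m_imp:.4f}, shorthand {approx:.4f}, threshold {threshold:.4f}")
print(fired)
```

Since `m_imp` is the same number for every ticker, the comparison reduces to whether a name's historical earnings move exceeds a fixed level of about 9%, which is a realized-vol screen rather than a mispricing test.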

[Figure: bar chart of filtered versus unfiltered final equity for each of the nine single names on the synthetic Phase-1 surface, 2020–2026.]
Filtered vs. unfiltered final equity per name across the nine-name universe, synthetic Phase-1 surface, 2020–2026. Names whose gate never fired (e.g. MSFT, GOOGL) show no filtered bar; where the gate did fire the filtered branch took the same trades the unfiltered branch did. The pooled unfiltered book ends \(-\$27\rm{k}\) — the synthetic premium-and-crush drag the design predicts for long earnings vol.

Earnings-straddle Phase-1 synthetic results, 9 names, 216 events (33 gate-fired), 2020–2026 (results/validation.json, results/thorough_backtest.json).

| Metric | Value | Reading |
|---|---|---|
| Events / gate-fired | 216 / 33 | filter admits 15% of events |
| Unfiltered expectancy / trade | −$125.65 | PF 0.68, −$27,140 total, p=0.052; the premium bleed |
| Filtered expectancy / trade | +$56.69 | PF 1.15; positive but see below |
| Filtered median / trade | −$117.35 | most filtered trades lose; mean is a fat tail (META+NFLX) |
| Filtered win rate | 39.4% | fewer than half the moves beat the premium |
| Filtered bootstrap p-value | 0.778 | indistinguishable from zero (unfiltered, p=0.052, is more significant) |
| Filtered 95% CI | [−$324, +$463] | straddles zero; no detectable edge |
| FDR survivors (BH, α=0.05) | 1 of 4 (NVDA) | NVDA is a significant loser (−$391.82, 0/3 wins) |
| pre_iv sensitivity (PF) | 4.60 → 0.29 | σ_pre 30%→65%: the sign is a free-parameter artifact |
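For reference, the per-trade statistics in the table can be computed as below, assuming the conventional definitions: expectancy is the mean P&L per trade, and profit factor is gross profit divided by gross loss. The function name and the sample P&L list are hypothetical and are not drawn from the backtest.

```python
from statistics import fmean, median


def trade_summary(pnl):
    """Expectancy, median, win rate, profit factor and total for per-trade P&L."""
    wins = [x for x in pnl if x > 0]
    losses = [x for x in pnl if x < 0]
    gross_loss = -sum(losses)
    return {
        "expectancy": fmean(pnl),
        "median": median(pnl),
        "win_rate": len(wins) / len(pnl),
        "profit_factor": sum(wins) / gross_loss if gross_loss > 0 else float("inf"),
        "total": sum(pnl),
    }


# Made-up trades: one large winner carries a book of mostly small losers.
sample = [900.0, -150.0, -120.0, -200.0, 60.0, -90.0]
summary = trade_summary(sample)
print(summary)
```

The sample is built to show the same pattern the filtered branch exhibits: a positive mean and a profit factor above one sitting alongside a negative median and a win rate below one half.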

The honest reading is sharper than "necessary but not sufficient." The filter does lift the relative numbers — it turns the unfiltered \(-\$125.65\)/trade bleed into a nominal \(+\$56.69\) — but three facts dissolve that into noise. First, the lift is not statistically real: the filtered mean carries \(p=0.778\) with a \(95\%\) interval that comfortably contains zero, and the unfiltered branch is actually the more significant of the two (\(p=0.052\)). Second, the positive mean is a fat tail, not an edge: the filtered median trade loses \(\$117\), the win rate is under \(40\%\), and two names (META, NFLX) supply the entire positive contribution. Third, and decisively, the sign of the result is an artifact of an uncalibrated assumption — sweeping \(\sigma_{\text{pre}}\) from \(30\%\) to \(65\%\) moves the profit factor from \(4.60\) (\(p=0.001\)) to \(0.29\). Because that same fixed \(\sigma_{\text{pre}}\) makes the implied move ticker-independent, the gate the model calls "the alpha" is, in Phase 1, mechanically incapable of being a mispricing test at all. The validation harness is doing exactly its job: it reports an insignificant, artifact-driven result rather than a curve-fit positive.

Result. Long earnings volatility, even gated by the selection filter, shows no statistically significant edge on the synthetic Phase-1 surface, and the nominal positive is an artifact of the assumed implied vol. In Phase 1 the gate degenerates to a high-realized-vol screen, so the central hypothesis — that a forecast move can beat a genuinely mispriced implied move — is not merely unproven but untestable here. The edge, if one exists, must come from a real chain where implied moves are observed and sometimes genuinely cheap; the apparatus is ready for that test, but this is not it.

Validation Methodology

Thirty-three gate-fired trades on a hand-chosen watchlist of nine names is the textbook setting for a spurious edge, so the inference is deliberately conservative. Two tools, ported from Poudel's Chapter 4 into the project's validation harness, do the work.

Centered bootstrap. Earnings P&L is fat-tailed, bimodal (a few large wins, many small losses), and scarce — Student's \(t\) is the wrong instrument. We test \(H_0:\ \mathbb E[\Pi]=0\) by resampling the per-trade P&L with replacement, centering the bootstrap distribution at the null, and reporting the share of resamples whose mean is at least as extreme as observed, smoothed as \((c+1)/(B+1)\). The filtered \(p=0.778\) above says the observed mean is well within what pure noise would produce.
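One common reading of that procedure is sketched below using only the standard library. The function name, the two-sided definition of "at least as extreme", the resample count, and the seed are assumptions; the smoothing \((c+1)/(B+1)\) is the one stated above. The sample P&L lists are made up.

```python
import random
from statistics import fmean


def centered_bootstrap_pvalue(pnl, n_resamples=9999, seed=0):
    """Two-sided bootstrap p-value for H0: E[pnl] = 0.

    The data are shifted to have mean zero (imposing the null), resampled
    with replacement, and the p-value is the smoothed share of resample
    means at least as far from zero as the observed mean.
    """
    rng = random.Random(seed)
    n = len(pnl)
    observed = fmean(pnl)
    centered = [x - observed for x in pnl]
    extreme = 0
    for _ in range(n_resamples):
        resample = rng.choices(centered, k=n)
        if abs(fmean(resample)) >= abs(observed):
            extreme += 1
    return (extreme + 1) / (n_resamples + 1)


# Made-up examples: a consistently profitable book and a zero-mean one.
steady = [100.0, 110.0, 90.0, 105.0, 95.0, 102.0, 98.0, 107.0]
flat = [-2.0, -1.0, 1.0, 2.0]
print(centered_bootstrap_pvalue(steady, n_resamples=999))
print(centered_bootstrap_pvalue(flat, n_resamples=999))
```

The smoothing keeps the p-value strictly positive: even when no resample is as extreme as the observed mean, the smallest reportable value is \(1/(B+1)\), so the resolution of the test is set by the number of resamples.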

Benjamini–Hochberg FDR. When a watchlist is screened, each name is a hypothesis, and testing many inflates the chance that one looks significant by luck. The Benjamini–Hochberg step-up procedure controls the false-discovery rate: sort the \(m\) \(p\)-values ascending, find the largest rank \(i\) with \(p_{(i)}\le (i/m)\,\alpha\), and reject every hypothesis up to and including that rank. Here exactly one name clears it, NVDA, and it clears as a significant loss, not a win (\(-\$391.82\)/trade, \(0/3\) wins); no name shows a significant positive edge. The guard is working: it refuses to certify the lucky-looking names (META, NFLX) whose individual bootstrap intervals include zero. The harness also exposes a Deflated-Sharpe hook for the number of configurations tried (Bailey & López de Prado, 2014), but that hook is not yet wired to the parameter sweeps, so the in-sample grid here carries no multiple-testing penalty.
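The step-up rule is short enough to write out in full. The function name is hypothetical and the p-values in the examples are made up to exercise the rule; they are not the per-name results from the study.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns a list of booleans aligned with p_values, True where the
    hypothesis is rejected at false-discovery rate alpha.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, ascending p
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank  # keep the largest rank that passes
    rejected = [False] * m
    for idx in order[:cutoff_rank]:
        rejected[idx] = True
    return rejected


# Illustrative p-values only.
print(benjamini_hochberg([0.001, 0.30, 0.60, 0.90]))
print(benjamini_hochberg([0.02, 0.90, 0.03, 0.02]))
```

The second example shows why the procedure is called step-up: the smallest p-value (0.02) fails its own rank-1 threshold of 0.0125, yet it is still rejected because a higher rank passes and everything at or below that rank is rejected with it.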

Scenarios Where the Strategy Can Be Profitable

The Phase-1 loss is not a verdict that the trade can never pay — it is a verdict on the synthetic surface. There are concrete, theory-backed conditions under which a real-chain version of this exact machinery could earn:

In every case the load-bearing requirement is the same: a real implied move to compare the forecast against. That is the entire purpose of the deferred chain loader.

What Holds, and What Fails

It is worth stating plainly which parts of the theory the model validates and which parts are fragile.

Valid

Fragile or unproven

Limitations and Extensions

The Phase-1 model is a mechanism, deliberately stripped to what free daily data and a synthetic surface allow. The decisive next step is the real options-chain loader (Phase 2/3): forward-snapshotting end-of-day chains for a liquid-earnings universe so that the implied move and the crush are observed, not assumed, at which point the same signal, sizing, and validation code becomes a tradeable test. Beyond that: replace the historical-move forecast with an IV-term-structure or skew signal; widen the universe to dozens of names so the FDR control has something to control; add the short-premium mirror that the variance risk premium argues is the favored side; and snap expiries to listed weeklies while using the BMO/AMC session timing (currently parsed but unused) to place entry and exit around the exact session. The honest summary is that this paper builds the apparatus and proves it works end to end on a synthetic surface, and reports, without flinching, that buying earnings volatility loses to the premium unless a real chain reveals a move the market genuinely under-prices.

References

  1. Poudel, S. (2025). Small-Cap Stock Trading Strategies for Retail Traders. Bachelor's dissertation, Metropolitan State University. SSRN 5921742. (Strategy Family E, the event-volatility strategy this model completes for options.)
  2. Patell, J. M. & Wolfson, M. A. (1979, 1981). Anticipated Information Releases Reflected in Call Option Prices; The Ex Ante and Ex Post Price Effects of Quarterly Earnings Announcements. Journal of Accounting Research. (The earnings IV run-up and crush.)
  3. Bakshi, G. & Kapadia, N. (2003). Delta-Hedged Gains and the Negative Market Volatility Risk Premium. Review of Financial Studies, 16(2), 527–566.
  4. Carr, P. & Wu, L. (2009). Variance Risk Premiums. Review of Financial Studies, 22(3), 1311–1341.
  5. Dubinsky, A., Johannes, M., Kaeck, A. & Seeger, N. J. (2019). Option Pricing of Earnings Announcement Risks. Review of Financial Studies, 32(2), 646–687.
  6. Efron, B. & Tibshirani, R. J. (1993). An Introduction to the Bootstrap. Chapman & Hall. (The resampling test used for the per-trade mean.)
  7. Benjamini, Y. & Hochberg, Y. (1995). Controlling the False Discovery Rate. Journal of the Royal Statistical Society B, 57(1), 289–300.
  8. Bailey, D. H. & López de Prado, M. (2014). The Deflated Sharpe Ratio. Journal of Portfolio Management, 40(5), 94–107.
  9. Black, F. & Scholes, M. (1973); Merton, R. C. (1973). Option pricing foundations (the BSM pricer both legs are marked on).
Notes

  1. The implied-move approximation \(P/S\approx 0.8\,\sigma\sqrt T\) uses \(\sqrt{2/\pi}\approx0.7979\), the ratio of the mean absolute value to the standard deviation of a zero-mean normal; it is the mean-absolute move, deliberately matched to the mean-absolute forecast so the k-gate compares like with like.
  2. The synthetic surface marks both legs at \(\sigma_{\text{pre}}=0.45\) before the report and \(\sigma_{\text{post}}=0.25\) after, with \(\sigma_{\text{pre}}>\sigma_{\text{post}}>0\) enforced at construction. These are illustrative, not quoted; the Phase-1 result is conditional on them.
  3. Entry/exit are counted in actual price bars (T−3, T+1) around the first bar at or after the earnings datetime, not as business-day offsets, so gapped or holiday-shortened calendars never misplace a leg. The strategy requires the bar schedule up front because the entry bar precedes the earnings bar.
  4. The headline metrics.json keys off the default ticker (SPY) when it trades, else the first ticker whose gate fired (here NVDA); full per-ticker detail, the pooled bootstrap, and the FDR result live in validation.json.