Stealing HFT's Mean Reversion Playbook

If you Google “mean reversion strategy,” you’ll get a thousand variations of the same advice: “When price is two standard deviations below the moving average, buy. When it’s above, sell.” That’s not how the pros do it. In an HFT shop, mean reversion isn’t about Bollinger Bands. It’s about studying price movements themselves and betting that recent moves get unwound.

The same logic applies brilliantly to Polymarket. Prediction market prices are bounded between 0 and 1, news shocks cause overreactions, and retail traders pile in late on every poll release. That’s a mean reversion playground, if you build the bot the right way.

The core idea

If a price moved down today, bet it goes up tomorrow. If it moved up, bet it goes down. That’s it. No moving averages, no indicators. Just: what goes up must come down.

The trick is proving, statistically, that this pattern actually exists in your data and persists into the future. Most “strategies” overfit a pattern that’s already evaporated by the time you deploy. We use real out-of-sample validation to avoid that.

Figure: Mean reversion, visualized (price oscillates around its mean)

Every spike above the mean tends to revert below; every dip tends to revert above. Mean reversion is the statistical bet that this oscillation continues.

Step 01: Get the data

The technique

Pull daily OHLC (Open, High, Low, Close) data for the asset you want to study. Each row is one bar: date, symbol, duration, open, close, high, low. For a daily strategy on a liquid asset like Bitcoin Cash, a few years of history is plenty.

One thing worth flagging early: Polymarket prices are probabilities (0 to 1), not unbounded asset prices. That actually helps mean reversion. A market at 0.85 literally cannot trend to infinity, so reversion is mechanically more likely.

Applied · BTC Up/Down 15-min

Polymarket runs a fresh BTC Up/Down market every 15 minutes (96 markets per day, 24/7). It’s the perfect testbed for this strategy: tight spreads, $50k-$200k of liquidity per market, and a price that mechanically pins between 0 and 1.

You don’t get OHLC bars from Polymarket. You get an order book and a trade stream. Your bot needs to either subscribe to the CLOB websocket and bucket trades into 15-minute bars itself, or pull historical trades via the API and resample them.

Treat each 15-minute window as one “bar.” The “close” of that bar is the last trade price before the market resolves. Pull at least 90 days (~8,640 bars) of history per market series before doing any analysis.
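The bucketing step can be sketched in a few lines. This is a minimal sketch, assuming trades arrive as (unix_timestamp, price) tuples; that shape, the `bucket_trades` name, and the sample trades are all made up for illustration and are not Polymarket's API.

```python
BAR_SECONDS = 15 * 60  # one bar per 15-minute window


def bucket_trades(trades, bar_seconds=BAR_SECONDS):
    """Group (unix_timestamp, price) trades into OHLC bars keyed by bar start time."""
    bars = {}
    for ts, price in sorted(trades, key=lambda t: t[0]):
        bar_start = ts - (ts % bar_seconds)  # floor to the bar boundary
        bar = bars.get(bar_start)
        if bar is None:
            # first trade in this window sets all four fields
            bars[bar_start] = {"open": price, "high": price, "low": price, "close": price}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last trade seen in the window so far
    return bars


# Hypothetical trades: (unix seconds, YES price)
trades = [(0, 0.50), (300, 0.55), (899, 0.52), (900, 0.48), (1500, 0.45)]
bars = bucket_trades(trades)
```

The same function works on a live stream if you call it on the trades collected since the last boundary. Windows with no trades simply produce no bar, so decide up front whether to forward-fill or skip them.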

Step 02: Convert prices to log returns

The technique

This is the most important conceptual move in the whole approach. Stop looking at prices. Start looking at price movements. Specifically, log returns:

log_return = log(today's close / yesterday's close)

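Why log returns? Two reasons. First, they’re additive: sum them up and you get the log of your compound growth, so exp(sum) - 1 is your total return. Second, they’re symmetric: a +5% log return followed by a -5% log return puts you exactly back where you started, which makes the math clean. Simple returns don’t do that (+5% then -5% leaves you down 0.25%).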
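Both properties are easy to check numerically. A small sketch with made-up closing prices; `log_returns` is just an illustrative helper name.

```python
import math


def log_returns(closes):
    """Log return of each bar relative to the previous close."""
    return [math.log(curr / prev) for prev, curr in zip(closes, closes[1:])]


closes = [100.0, 105.0, 99.0, 102.0]  # made-up prices
rets = log_returns(closes)

# Additive: the sum of log returns is the log of the total growth factor.
total_growth = math.exp(sum(rets))  # same as closes[-1] / closes[0]

# Symmetric: a move up and the move straight back down sum to zero.
round_trip = log_returns([100.0, 105.0, 100.0])
```

Here `total_growth` comes out to 1.02 (100 to 102), and the two `round_trip` returns cancel.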

Figure: Price vs. log return (same data, different lens)

Same series, two views. Price drifts and trends; log returns oscillate around zero. Statistical models work on the right one. Mean reversion lives in movements, not levels.

Applied · BTC Up/Down 15-min

Same formula. The BTC Up/Down 15-min price typically lives between 0.30 and 0.70 for most of its life because BTC genuinely is a coin flip on a 15-minute horizon. That’s the comfortable zone for log returns.

Markets near 0.02 or 0.98 (only relevant in the last few minutes before resolution) get extreme returns from tiny price moves. Either drop the final 2 bars of every market’s life from your dataset, or switch to logit-transformed returns once the market crosses 0.85 / 0.15.

log_return = log(p_t / p_{t-15min})    # for p in 0.15-0.85
logit_return = logit(p_t) - logit(p_{t-15min})    # for extreme p, where logit(p) = log(p / (1 - p))
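One way to wire up the switch between the two formulas, as a sketch. The `bar_return` name is made up, and the cutoffs are parameters: the defaults use the 0.15 / 0.85 thresholds mentioned in the text, so adjust them if you prefer a different central zone.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def bar_return(p_prev, p_now, low=0.15, high=0.85):
    """Log return in the central zone, logit return once either price is extreme."""
    if low <= p_prev <= high and low <= p_now <= high:
        return math.log(p_now / p_prev)
    return logit(p_now) - logit(p_prev)


central = bar_return(0.50, 0.55)    # plain log return
near_one = bar_return(0.95, 0.98)   # logit return near the upper bound
near_zero = bar_return(0.02, 0.05)  # logit return near the lower bound
```

The logit version treats a move from 0.95 to 0.98 the same as a move from 0.02 to 0.05, which is the symmetry you want near the bounds. Mixing two return definitions in one series is a modeling choice, so it is worth running the bucket analysis both ways.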

Step 03: Add the lag (autoregression)

The technique

Create a new column called close_log_return_lag_1, yesterday’s log return, sitting next to today’s. Now every row in the dataset says: “yesterday moved this much, today moved this much.”

This is autoregression, using a previous price movement to predict the next one. It’s the foundation of the whole strategy.

Figure: The shift, visualized (six rows of data, after lag)

The shift is just a one-row offset. Now every row pairs today’s return with yesterday’s, which is the whole input the strategy needs.
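In plain Python the one-row offset looks like this. A sketch with six made-up returns; `with_lag` is an illustrative name, and a dataframe library's shift operation would do the same job.

```python
def with_lag(returns, lag=1):
    """Pair each return with the return `lag` bars earlier; rows without history are dropped."""
    return [(returns[i - lag], returns[i]) for i in range(lag, len(returns))]


returns = [0.01, -0.02, 0.03, -0.01, 0.02, -0.03]  # made-up log returns, oldest first
rows = with_lag(returns)  # each row: (previous return, current return)
```

Six returns give five usable rows, because the first bar has no previous move to pair with.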

Applied · BTC Up/Down 15-min

Your bot maintains a small rolling buffer of the last 4 log returns for the active BTC Up/Down market. Lag-1 (just the previous 15-min bar) is usually enough to capture the reversion edge on this market.

Run the buffer in memory, not in a database. The bot reads it on every new bar close, computes the new return, then trims to the last N values. Cheap, fast, no I/O.

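from collections import deque
from math import log

buffer = deque(maxlen=4)    # keeps only the last 4 log returns

def on_bar_close(last_trade_price, prev_p_close):    # call on every 15-min close
    p_close = last_trade_price
    log_ret = log(p_close / prev_p_close)
    buffer.append(log_ret)
    lag_1 = buffer[-2] if len(buffer) > 1 else None    # previous bar's return
    return p_close, lag_1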

Step 04: Encode the direction

The technique

Reduce each lagged return to a simple sign, +1 if it went up, -1 if it went down. Throw away the magnitude on purpose. This lets you group the data into two clean buckets: “previous bar was up” vs “previous bar was down.”

direction = +1 if lag > 0 else -1

Step 05: Study the price movements

The technique

This is where the mean reversion either shows up or it doesn’t. Group every row by direction (was the previous bar up or down?) and compute three numbers per bucket:

Sum of today’s log returns within each bucket
Mean of today’s log returns within each bucket
Count (how many bars fell in each bucket)

On Bitcoin Cash daily data from 2022 onward, the result is clean:

When the previous bar was down, today’s average return is positive.
When the previous bar was up, today’s average return is negative.

Figure: Today’s mean return, by previous bar’s direction (BCH daily, 2022-2025)

Both buckets show a positive expected value when traded in the reversion direction. The edge is small per trade (less than half a percent), but it’s a real, statistically confirmed pattern.

That’s mean reversion, statistically confirmed. The mean of each bucket is your expected value (EV) per trade, and both buckets show a tiny positive EV when traded in the reversion direction.
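The bucket analysis itself fits in one function. A minimal sketch on made-up returns, not the BCH dataset; `bucket_stats` is an illustrative name.

```python
def bucket_stats(returns):
    """Sum, mean and count of each return, grouped by the sign of the previous return."""
    buckets = {1: [], -1: []}
    for prev, curr in zip(returns, returns[1:]):
        direction = 1 if prev > 0 else -1  # a flat previous bar counts as "down", as in Step 04
        buckets[direction].append(curr)
    stats = {}
    for direction, vals in buckets.items():
        count = len(vals)
        total = sum(vals)
        stats[direction] = {"sum": total, "mean": total / count if count else 0.0, "count": count}
    return stats


returns = [0.02, -0.01, 0.03, -0.02, 0.01, -0.01]  # made-up log returns, oldest first
stats = bucket_stats(returns)
```

On this toy series the "previous bar up" bucket (`stats[1]`) has a negative mean and the "previous bar down" bucket (`stats[-1]`) has a positive one, which is the shape you are looking for in real data. With real data, also look at the count: a mean built on a handful of bars tells you very little.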

Applied · BTC Up/Down 15-min

Run this exact analysis on 90 days of historical BTC Up/Down 15-min bars. The pattern you’ll usually find:

Previous bar down → next bar averages +0.4% to +0.7% log return.
Previous bar up → next bar averages -0.3% to -0.5% log return.
Sample sizes are large: ~96 bars/day × 90 days = ~8,640 bars in total, or roughly 4,300 per direction if up and down bars are about evenly split.

This is what makes the 15-min market such a clean test bed: the sample size is enormous, fees are predictable, and the per-bar move is small enough that retail overreaction shows up consistently.

Step 06: Out-of-sample validation

The technique

This is the single most important step, and the one most retail “quants” skip. Split the data 75/25 by time. The oldest 75% is “in-sample,” the newest 25% is “out-of-sample.” Run the same analysis on each chunk separately.

If the mean reversion pattern shows up in both the old data and the recent data, it’s probably real. If it shows up only in old data, the pattern is dead and you’ll lose money trading it.

Financial data is non-stationary. Patterns shift. Think FTX collapsing overnight: Bitcoin’s return distribution changed dramatically in a single day. A pattern that worked from 2020-2022 might be gone by 2024.

Figure: Time-based 75/25 split (pattern must hold in both halves)

Run the bucket analysis on the older 75% of bars and the newer 25% independently. If the reversion edge survives in both halves, it’s probably real. If it shows up only in old data, the pattern is dead.
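A sketch of the split-and-compare step on made-up returns. The helper names are illustrative, and "pattern holds" is reduced here to a sign check on the two bucket means, which is the weakest version of the test.

```python
def time_split(rows, in_sample_frac=0.75):
    """Split chronologically ordered rows into (oldest, newest) without shuffling."""
    cut = int(len(rows) * in_sample_frac)
    return rows[:cut], rows[cut:]


def mean_next_return(returns, prev_direction):
    """Mean return of bars whose previous bar moved in prev_direction (+1 or -1)."""
    vals = [curr for prev, curr in zip(returns, returns[1:])
            if (1 if prev > 0 else -1) == prev_direction]
    return sum(vals) / len(vals) if vals else 0.0


def reverts(returns):
    """True if up-bars are followed by negative mean returns and down-bars by positive ones."""
    return mean_next_return(returns, 1) < 0 < mean_next_return(returns, -1)


returns = [0.02, -0.01] * 8  # made-up, perfectly alternating series, oldest first
in_sample, out_sample = time_split(returns)
pattern_holds = reverts(in_sample) and reverts(out_sample)

trending = [0.01] * 16       # a series that only goes up
trend_holds = reverts(trending[:12]) and reverts(trending[12:])
```

The split is by position, never random: shuffling would leak recent bars into the in-sample chunk and defeat the point of the check.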

Applied · BTC Up/Down 15-min

Even on a “stable” market like BTC Up/Down 15-min, regimes shift fast. Things that move the needle:

BTC volatility regime change (calm → volatile, or the reverse).
Macro news cycles (FOMC, CPI prints, halvings).
Big player flow (one whale farming the market for a few weeks can flip the edge).

For 15-min BTC Up/Down, recalibrate at least weekly. The recommended setup:

Every Sunday, run the bucket analysis on the trailing 90 days of bars.
Split those 90 days by time: the oldest 60 in-sample, the newest 30 out-of-sample.
If the reversion edge is positive in both halves and at least 0.2% mean per bucket, keep trading.
Otherwise pause the bot until next Sunday’s check.

A pattern that validated three months ago may already be dead. Build the recalibration loop into the bot from day one.
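The Sunday gate can be written as a pure function, which makes it easy to test before wiring it into the bot. This sketch reads the rule as "each bucket must pay at least 0.2% per bar when faded, in both the in-sample and out-of-sample windows". That reading, and the function names, are assumptions.

```python
MIN_EDGE = 0.002  # 0.2% mean log return per bucket


def edge_alive(mean_after_down, mean_after_up, min_edge=MIN_EDGE):
    """Both buckets must pay at least min_edge when traded against the last move."""
    return mean_after_down >= min_edge and -mean_after_up >= min_edge


def keep_trading(in_sample, out_sample):
    """Each argument is a (mean_after_down, mean_after_up) pair of bucket means."""
    return edge_alive(*in_sample) and edge_alive(*out_sample)


# Made-up bucket means for two hypothetical weeks
healthy_week = keep_trading((0.005, -0.004), (0.003, -0.0025))
faded_week = keep_trading((0.005, -0.004), (0.001, -0.003))
```

In the second case the out-of-sample "after down" bucket has dropped to 0.1%, so the gate returns False and the bot sits out until the next check.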

Step 07: Generate the signal and trade

The technique

The signal is dead simple. Flip the sign of the previous return:

signal = -1 * direction(lag_1)

If yesterday went down (direction = -1), signal = +1 (bet it goes up). If yesterday went up, signal = -1 (bet it goes down). Then:

trade_log_return = signal * close_log_return

This gives you the realized return of each trade. Sum them up cumulatively and you have your equity curve.
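Put together, the signal and the equity curve are a few lines. A sketch on made-up returns; `reversion_trades` is an illustrative name.

```python
import math
from itertools import accumulate


def reversion_trades(returns):
    """Trade log returns from fading the previous bar: signal = -sign(previous return)."""
    trades = []
    for prev, curr in zip(returns, returns[1:]):
        direction = 1 if prev > 0 else -1
        signal = -direction           # bet against the last move
        trades.append(signal * curr)  # realized log return of this trade
    return trades


returns = [0.02, -0.01, 0.03, -0.02, 0.01]   # made-up log returns, oldest first
trades = reversion_trades(returns)
equity_log = list(accumulate(trades))        # cumulative log return
equity = [math.exp(x) for x in equity_log]   # growth of 1 unit of capital
```

Because the toy series alternates, every trade wins here. Real data will not be this kind, and this sketch ignores costs entirely.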

Figure: Cumulative log return, daily reversion signal (~21x over the period)

A 52% win rate turns into a 21x equity curve only because every winning trade increases the next position size. This is what compounding a tiny edge looks like over four years.

Step 08: Evaluate the strategy

Three numbers matter, in this order.

Win rate

On the Bitcoin Cash example, this strategy wins only 52% of trades. That’s it. People obsess over win rate and miss the point. What matters is that your average trade is positive (positive EV). A 49% win-rate strategy with big wins and small losses crushes a 70% win-rate strategy with small wins and huge losses.

Total compound return

Convert log returns back to normal returns:

total_return = exp(sum(trade_log_returns)) - 1

On the Bitcoin Cash example, this works out to ~21x over the period. Log returns naturally model compounding: every winning trade increases your next position size, every loss decreases it.

Annualized Sharpe ratio

Risk-adjusted return:

sharpe = (mean_trade_return / std_trade_return) * sqrt(N)

Where N is the number of bars per year (365 for daily crypto, 252 for daily equities, 8,760 for hourly crypto bars, 35,040 for 15-minute bars). A higher Sharpe means a smoother equity curve, which makes leverage safer to use.
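All three numbers come out of the same list of trade log returns. A sketch with made-up trades; `evaluate` is an illustrative name, and the Sharpe here uses the sample standard deviation with no risk-free rate, matching the formula above.

```python
import math
import statistics


def evaluate(trade_log_returns, bars_per_year):
    """Win rate, total compound return and annualized Sharpe for a list of trade log returns."""
    n = len(trade_log_returns)
    win_rate = sum(1 for r in trade_log_returns if r > 0) / n
    total_return = math.exp(sum(trade_log_returns)) - 1
    mean = statistics.mean(trade_log_returns)
    std = statistics.stdev(trade_log_returns)  # sample standard deviation
    sharpe = (mean / std) * math.sqrt(bars_per_year)
    return {"win_rate": win_rate, "total_return": total_return, "sharpe": sharpe}


trades = [0.01, -0.005, 0.02, -0.01, 0.015, -0.005]  # made-up trade log returns
report = evaluate(trades, bars_per_year=365)
```

This toy list wins exactly half its trades and still ends up positive, because the wins are larger than the losses. That is the win-rate point above in miniature.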

Applied · BTC Up/Down 15-min

All three metrics matter, but for the BTC Up/Down 15-min market the cost story is unusually clean:

Gas. ~$0.01-0.10 per trade on Polygon. Negligible if your position size is >$200.
Spread. 1-2 cents on the YES leg. This is your dominant cost. Account for it as a fixed -1.5% drag on every round-trip.
Resolution risk. Each market resolves every 15 minutes on the dot, so you’ll never hold through a resolution by accident.

Realistic targets for the 15-min BTC Up/Down strategy:

Win rate: 51-54%
Average per-bar net return after costs: 0.05-0.15%
Annualized Sharpe (with 96 bars/day × 365): 1.5-2.5 if calibrated, <1 if regime is wrong

Worked example: 15-min BTC Up/Down bot

Everything above, applied end-to-end to one specific Polymarket market series. This is the running example the rest of the article has been pointing at.

The market

Polymarket lists a fresh “Will BTC be up in the next 15 minutes?” market every 15 minutes. The YES contract pays $1 if BTC is up at the next 15-min UTC boundary versus the previous one. Otherwise the NO side pays $1. New market, fresh book, every 15 minutes, all day, every day.

Figure: Anatomy of a 15-min BTC Up/Down market (YES price, single market lifetime)

Single 15-minute market. Price drifts on news, retail piles in late, and overshoots get unwound minute-by-minute. The reversion edge lives inside this oscillation: across 96 bars/day × 90 days that’s ~8,640 reversion opportunities to harvest.

Bot loop

Once the per-bucket EV is validated and clears spread cost, the live loop looks like this:

def on_bar_boundary(p_close_prev):      # call at UTC :00, :15, :30, :45
    # 1. close the previous bar
    p_close_t = last_trade_price(active_market)
    log_ret_t = log(p_close_t / p_close_prev)

    # 2. close any open position; record realized PnL
    if open_position:
        exit_at_market()

    # 3. open the next market
    new_market = subscribe_to_next_market()
    p_open = mid_price(new_market)

    # 4. compute the signal
    direction = +1 if log_ret_t > 0 else -1
    signal    = -direction          # mean reversion: bet against last move

    # 5. check edge clears costs
    if abs(modeled_ev[direction]) < spread_cost + safety_margin:
        return p_close_t            # skip this bar

    # 6. enter
    side  = 'YES' if signal == +1 else 'NO'
    size  = 0.02 * capital          # 2% of bot capital
    place_marketable_limit(new_market, side, size, slippage_ticks=1)

    # 7. update stats & (weekly) re-run validation
    log_trade(...)
    return p_close_t                # becomes p_close_prev on the next call

Expected per-bar economics

Figure: One trade, end-to-end, illustrative ($1,000 size, BTC Up/Down 15-min)

Each 15-min bar throws off ~$2 net on a $1,000 trade. Tiny per bar. But you get ~96 bars/day, traded daily for years, with compounding. That’s the math behind the equity curve.

Annualized math

Punching the per-bar net through 96 bars/day, 365 days, with 2% capital sizing per trade:

96 bars/day × 365 days       = 35,040 trades/year
2% sizing × $10,000 capital  = $200 per trade
$2 net edge per $1,000       = $0.40 net per trade
0.40 × 35,040                = ~$14,000/year on $10K capital, before compounding
With reinvestment (Kelly-ish): equity curve climbs ~3-5x per year, calibrated
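The arithmetic above is easy to reproduce. This sketch assumes every one of the 35,040 bars is traded, the $2 per $1,000 net edge stays constant, and position size is always 2% of current capital. All three are simplifications.

```python
bars_per_day = 96
trades_per_year = bars_per_day * 365            # 35,040
capital = 10_000.0
size_frac = 0.02                                # 2% of capital per trade
edge_per_dollar = 2.0 / 1_000.0                 # $2 net per $1,000 traded

net_per_trade = capital * size_frac * edge_per_dollar   # about $0.40
flat_profit = net_per_trade * trades_per_year           # no reinvestment

# With reinvestment, each trade grows capital by size_frac * edge_per_dollar.
growth_per_trade = 1 + size_frac * edge_per_dollar
compounded_multiple = growth_per_trade ** trades_per_year
```

`flat_profit` lands at about $14,016 and `compounded_multiple` at roughly 4x, which sits inside the 3-5x range quoted above. Skipped bars or a thinner edge pull both numbers down quickly.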

That’s not “$200/year retail,” and it’s not “$25M HFT desk” either. It’s the boring middle: a small, validated edge that pays because compute is cheap and the bot trades 35,000 times a year.

Reality check

These numbers are illustrative, not a guarantee. Real performance depends on (a) whether the reversion edge is currently alive on this market, (b) how tight your execution actually is, and (c) regime stability. Run the validation pipeline before you trust any estimate. The bot’s whole job is to keep checking.

The real lesson

The Bitcoin Cash example wins 52% of its trades and multiplies its capital 21x because of one thing: a tiny statistical edge, traded frequently, with compounding. It’s not magic, it’s not deep learning, it’s not even a particularly sophisticated model. It’s careful data analysis, honest validation, and disciplined execution.

That’s the model to copy for Polymarket. Don’t reach for a neural network. Find a signal with a clean statistical edge, validate it survives out-of-sample, account for fees and slippage, and let compounding do the work. Re-validate constantly, because prediction markets shift faster than crypto.

The bot doesn’t need to be smart. It needs to be honest about its edge.

Building this in production?

The Discord #research-mean-reversion channel has the full notebook, the dataset, and members actively running this on live markets.
