Why the textbook version of mean reversion isn’t how the pros run it, and how to translate the real version into a working Polymarket strategy - with the math, the code, an out-of-sample validation pipeline, and a worked example on the BTC Up/Down 15-min market.
If you Google “mean reversion strategy,” you’ll get a thousand variations of the same advice: “When price is two standard deviations below the moving average, buy. When it’s above, sell.” That’s not how the pros do it. In an HFT shop, mean reversion isn’t about Bollinger Bands - it’s about studying price movements themselves and betting that recent moves get unwound.
The same logic applies brilliantly to Polymarket. Prediction market prices are bounded between 0 and 1, news shocks cause overreactions, and retail traders pile in late on every poll release. That’s a mean reversion playground - if you build the bot the right way.
If a price moved down today, bet it goes up tomorrow. If it moved up, bet it goes down. That’s it. No moving averages, no indicators. Just: what goes up must come down.
The trick is proving, statistically, that this pattern actually exists in your data and persists into the future. Most “strategies” overfit a pattern that’s already evaporated by the time you deploy. We use real out-of-sample validation to avoid that.
Every spike above the mean tends to revert below; every dip tends to revert above. Mean reversion is the statistical bet that this oscillation continues.
Pull daily OHLC (Open, High, Low, Close) data for the asset you want to study. Each row is one bar: date, symbol, duration, open, close, high, low. For a daily strategy on a liquid asset like Bitcoin Cash, a few years of history is plenty.
One thing worth flagging early: Polymarket prices are probabilities (0 to 1), not unbounded asset prices. That actually helps mean reversion: a market at 0.85 has at most 0.15 of upside left, so a move can’t keep trending indefinitely the way an unbounded asset price can.
Polymarket runs a fresh BTC Up/Down market every 15 minutes (96 markets per day, 24/7). It’s the perfect testbed for this strategy: tight spreads, $50k-$200k of liquidity per market, and a price that mechanically pins between 0 and 1.
You don’t get OHLC bars from Polymarket. You get an order book and a trade stream. You need to either subscribe to the CLOB websocket and bucket trades into 15-minute bars yourself, or pull historical trades via the API and resample.
Treat each 15-minute window as one “bar.” The “close” of that bar is the last trade price before the market resolves. Pull at least 90 days (~8,640 bars) of history per market series before doing any analysis.
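As a minimal sketch of that resampling step, assuming trades arrive as `(unix_seconds, price)` pairs (a hypothetical shape for illustration, not the actual Polymarket payload), the bucketing can be done with simple arithmetic on the timestamp:

```python
BAR_SECONDS = 15 * 60  # one bar per 15-minute window


def trades_to_bars(trades):
    """Bucket (unix_seconds, price) trades into 15-minute OHLC bars.

    The input shape is an assumption; adapt it to whatever your trade
    feed actually returns.
    """
    bars = {}
    # Sort by timestamp only, so same-second trades keep their feed order.
    for ts, price in sorted(trades, key=lambda t: t[0]):
        start = ts - ts % BAR_SECONDS  # window start, aligned to :00/:15/:30/:45 UTC
        bar = bars.get(start)
        if bar is None:
            bars[start] = {"open": price, "high": price, "low": price, "close": price}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last trade seen in the window
    return bars


trades = [(0, 0.50), (300, 0.55), (899, 0.52), (900, 0.48), (1700, 0.45)]
bars = trades_to_bars(trades)
```

Windows with no trades simply don’t appear in the result, so decide up front whether to forward-fill the previous close or drop those bars.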
This is the most important conceptual move in the whole approach. Stop looking at prices. Start looking at price movements. Specifically, log returns:
log_return = log(today's close / yesterday's close)
Why log returns? Two reasons. First, they’re additive: sum them up and you get the log of your compounded growth, so exponentiating the sum gives the total return. Second, they’re symmetric. A +5% log return and a -5% log return cancel out exactly, which makes the math clean.
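Both properties are easy to check numerically. The closing prices below are made up for illustration:

```python
import math

closes = [100.0, 105.0, 99.75, 110.0]  # made-up closing prices

# Log return of each bar relative to the previous close.
log_returns = [math.log(b / a) for a, b in zip(closes, closes[1:])]

# Additivity: summing log returns and exponentiating recovers the
# simple return over the whole period.
total_from_logs = math.exp(sum(log_returns)) - 1
total_direct = closes[-1] / closes[0] - 1

# Symmetry: equal and opposite log returns cancel exactly.
round_trip = math.exp(0.05 + -0.05) - 1
```

Here `total_from_logs` and `total_direct` agree up to floating-point error, and `round_trip` is zero.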
Same series, two views. Price drifts and trends; log returns oscillate around zero. Statistical models work on the right one. Mean reversion lives in movements, not levels.
Same formula. The BTC Up/Down 15-min price typically lives between 0.30 and 0.70 for most of its life because BTC genuinely is a coin flip on a 15-minute horizon. That’s the comfortable zone for log returns.
Markets near 0.02 or 0.98 (only relevant in the last few minutes before resolution) get extreme returns from tiny price moves. Either drop the final 2 bars of every market’s life from your dataset, or switch to logit-transformed returns once the market crosses 0.85 / 0.15.
log_return = log(p_t / p_{t-15min}) # for p inside 0.15-0.85
logit_return = logit(p_t) - logit(p_{t-15min}) # for extreme p
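A minimal sketch of the switch between the two formulas. The function name `bar_return` and the default thresholds are assumptions for illustration; treat the cutoffs as parameters to tune, not as recommended values:

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def bar_return(p_prev, p_now, lo=0.15, hi=0.85):
    """Return (kind, value) for one bar.

    Uses a plain log return while both prices sit inside [lo, hi] and a
    logit difference otherwise. The default thresholds are illustrative.
    """
    if lo <= p_prev <= hi and lo <= p_now <= hi:
        return "log", math.log(p_now / p_prev)
    return "logit", logit(p_now) - logit(p_prev)


mid_range = bar_return(0.50, 0.55)  # both prices in the comfortable zone
extreme = bar_return(0.97, 0.98)    # near the upper bound
```

Because the logit is strictly increasing, the two formulas always agree on the sign of a move. A sign-only feature is therefore unaffected by the switch; only the magnitudes differ.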
Create a new column called close_log_return_lag_1, yesterday’s log return, sitting next to today’s. Now every row in the dataset says: “yesterday moved this much, today moved this much.”
This is autoregression, using a previous price movement to predict the next one. It’s the foundation of the whole strategy.
The shift is just a one-row offset. Now every row pairs today’s return with yesterday’s, which is the whole input the strategy needs.
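In plain Python the one-row offset looks like this (made-up return values; with pandas the equivalent is a one-period shift on the return column):

```python
log_returns = [0.010, -0.004, 0.007, -0.012, 0.003]  # made-up values

# Shift by one row: the lag for row i is the return from row i - 1.
# The first row has no predecessor, so its lag is None.
lag_1 = [None] + log_returns[:-1]

rows = [
    {"close_log_return": r, "close_log_return_lag_1": lag}
    for r, lag in zip(log_returns, lag_1)
    if lag is not None  # drop the row with no history
]
```

Each remaining row pairs one return with the return immediately before it.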
Your bot maintains a small rolling buffer of the last 4 log returns for the active BTC Up/Down market. Lag-1 (just the previous 15-min bar) is usually enough to capture the reversion edge on this market.
Run the buffer in memory, not in a database. The bot reads it on every new bar close, computes the new return, then trims to the last N values. Cheap, fast, no I/O.
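from collections import deque
from math import log
buffer = deque(maxlen=4)  # last 4 log returns, kept in memory only
prev_p_close = None       # close of the previous 15-min bar
def on_bar_close(last_trade_price):
    """Call on every 15-min close; returns lag_1, or None until two returns exist."""
    global prev_p_close
    if prev_p_close is not None:
        buffer.append(log(last_trade_price / prev_p_close))
    prev_p_close = last_trade_price  # remember this close for the next bar
    return buffer[-2] if len(buffer) > 1 else None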
Reduce each lagged return to a simple sign, +1 if it went up, -1 if it went down. Throw away the magnitude on purpose. This lets you group the data into two clean buckets: “previous bar was up” vs “previous bar was down.”
direction = +1 if lag > 0 else -1
Sign of the previous 15-min log return. That’s the entire feature.
prev_dir = +1 if buffer[-2] > 0 else -1
# +1 means "the YES price moved up in the last 15 min"
# -1 means "the YES price moved down in the last 15 min"
No moving averages, no z-scores, no indicators. Just the sign of the last bar.
This is where the mean reversion either shows up or it doesn’t. Group every row by direction (was the previous bar up or down?) and compute three numbers per bucket:
On Bitcoin Cash daily data from 2022 onward, the result is clean:
Both buckets show a positive expected value when traded in the reversion direction. The mean of each bucket is your expected value (EV) per trade. The edge is small per trade (less than half a percent), but that’s mean reversion, statistically confirmed.
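Here is a rough sketch of the bucket analysis. The three statistics it reports (sample count, mean next-bar return, and share of bars that moved against the previous bar) are an illustrative choice, and the sample returns are made up:

```python
from statistics import mean


def bucket_stats(log_returns):
    """Group each return by the sign of the previous return.

    Reports count, mean next-bar return, and the share of bars that
    moved against the previous bar. A zero previous return is treated
    as "down", matching direction = +1 if lag > 0 else -1.
    """
    buckets = {+1: [], -1: []}
    for prev, cur in zip(log_returns, log_returns[1:]):
        direction = 1 if prev > 0 else -1
        buckets[direction].append(cur)

    stats = {}
    for direction, rets in buckets.items():
        if not rets:
            continue
        reverted = sum(1 for r in rets if r * direction < 0)
        stats[direction] = {
            "count": len(rets),
            "mean": mean(rets),
            "reversion_rate": reverted / len(rets),
        }
    return stats


sample = [0.01, -0.02, 0.015, -0.005, -0.01, 0.02]
stats = bucket_stats(sample)
```

Reversion shows up as a negative mean in the `+1` bucket and a positive mean in the `-1` bucket. If both means have the same sign as their bucket, you are looking at momentum instead.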
Run this exact analysis on 90 days of historical BTC Up/Down 15-min bars. The pattern you’ll usually find:
This is what makes the 15-min market such a clean test bed: the sample size is enormous, fees are predictable, and the per-bar move is small enough that retail overreaction shows up consistently.
This is the single most important step, and the one most retail “quants” skip. Split the data 75/25 by time. The oldest 75% is “in-sample,” the newest 25% is “out-of-sample.” Run the same analysis on each chunk separately.
If the mean reversion pattern shows up in both the old data and the recent data, it’s probably real. If it shows up only in old data, the pattern is dead and you’ll lose money trading it.
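A minimal sketch of the split, assuming the returns are already in chronological order. The helper names and the perfectly alternating sample series are made up for illustration:

```python
def split_by_time(rows, in_sample_frac=0.75):
    """Split chronologically ordered rows into (in_sample, out_of_sample).

    No shuffling: the out-of-sample block must be strictly newer than
    everything in the in-sample block.
    """
    cut = int(len(rows) * in_sample_frac)
    return rows[:cut], rows[cut:]


def reversion_ev(log_returns):
    """Mean per-bar return from betting against the previous bar's sign."""
    trades = [
        (-1 if prev > 0 else 1) * cur
        for prev, cur in zip(log_returns, log_returns[1:])
    ]
    return sum(trades) / len(trades)


returns = [0.01, -0.01] * 6  # made-up, perfectly alternating series
in_sample, out_of_sample = split_by_time(returns)
ev_in = reversion_ev(in_sample)
ev_out = reversion_ev(out_of_sample)
```

The comparison that matters is whether `ev_in` and `ev_out` have the same sign and a similar size. The one pair that straddles the cut is lost, which is negligible once you have thousands of bars.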
Financial data is non-stationary. Patterns shift. Think FTX collapsing overnight: Bitcoin’s return distribution changed dramatically in a single day. A pattern that worked from 2020-2022 might be gone by 2024.
Run the bucket analysis on the older 75% of bars and the newer 25% independently. If the reversion edge survives in both segments, it’s probably real. If it shows up only in the old data, the pattern is dead.
Even on a “stable” market like BTC Up/Down 15-min, regimes shift fast. Things that move the needle:
For 15-min BTC Up/Down, recalibrate at least weekly. The recommended setup:
A pattern that validated three months ago may already be dead. Build the recalibration loop into the bot from day one.
The signal is dead simple. Flip the sign of the previous return:
signal = -1 * direction(lag_1)
If yesterday went down (direction = -1), signal = +1 (bet it goes up). If yesterday went up, signal = -1 (bet it goes down). Then:
trade_log_return = signal * close_log_return
This gives you the realized return of each trade. Sum them up cumulatively and you have your equity curve.
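Put together, the signal and the equity curve fit in a few lines. The bar returns are made up, and costs are ignored in this sketch:

```python
import math
from itertools import accumulate

log_returns = [0.02, -0.01, 0.015, 0.005, -0.02]  # made-up bar returns

trade_log_returns = []
for lag_1, current in zip(log_returns, log_returns[1:]):
    direction = 1 if lag_1 > 0 else -1
    signal = -direction  # bet against the previous move
    trade_log_returns.append(signal * current)

# Cumulative sum of trade log returns, converted at each step to a
# growth multiple of starting capital.
equity_curve = [math.exp(c) for c in accumulate(trade_log_returns)]
```

The last element of `equity_curve` is the final capital as a multiple of the starting capital.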
A 52% win rate turns into a 21x equity curve only because every winning trade increases the next position size. This is what compounding a tiny edge looks like over four years.
Translating “signal = +1” into actual orders:
For BTC Up/Down 15-min specifically:
Three numbers matter, in this order.
On the Bitcoin Cash example, this strategy wins only 52% of trades. That’s it. People obsess over win rate and miss the point. What matters is that your average trade is positive (positive EV). A 49% win-rate strategy with big wins and small losses crushes a 70% win-rate strategy with small wins and huge losses.
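To make the win-rate point concrete, here is the EV arithmetic for two hypothetical strategies. The win and loss sizes are invented purely for illustration:

```python
def expected_value(win_rate, avg_win, avg_loss):
    """Per-trade EV; avg_loss is given as a positive magnitude."""
    return win_rate * avg_win - (1.0 - win_rate) * avg_loss


# 49% win rate, wins of 2%, losses of 1%
ev_low_win_rate = expected_value(0.49, avg_win=0.02, avg_loss=0.01)

# 70% win rate, wins of 0.5%, losses of 2%
ev_high_win_rate = expected_value(0.70, avg_win=0.005, avg_loss=0.02)
```

With these inputs the 49% strategy earns about +0.47% per trade while the 70% strategy loses about 0.25% per trade.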
Convert log returns back to normal returns:
total_return = exp(sum(trade_log_returns)) - 1
On the Bitcoin Cash example, this works out to ~21x over the period. Log returns naturally model compounding: every winning trade increases your next position size, every loss decreases it.
Risk-adjusted return:
sharpe = (mean_trade_return / std_trade_return) * sqrt(N)
Where N is the number of bars per year (365 for daily crypto, 252 for daily equities, 8,760 for hourly crypto bars). Higher Sharpe = smoother equity curve = safer to use leverage.
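All three metrics can be computed from the list of trade log returns. This sketch uses the sample standard deviation and made-up trades; `summarize` is a hypothetical helper name:

```python
import math
from statistics import mean, stdev


def summarize(trade_log_returns, bars_per_year):
    """Win rate, compounded total return, and annualized Sharpe ratio."""
    wins = sum(1 for r in trade_log_returns if r > 0)
    per_bar_sharpe = mean(trade_log_returns) / stdev(trade_log_returns)
    return {
        "win_rate": wins / len(trade_log_returns),
        "total_return": math.exp(sum(trade_log_returns)) - 1,
        "sharpe": per_bar_sharpe * math.sqrt(bars_per_year),
    }


trades = [0.01, -0.005, 0.02, -0.01, 0.005]  # made-up trade log returns
metrics = summarize(trades, bars_per_year=365)
```

For 15-minute bars traded around the clock, `bars_per_year` would be 96 × 365 = 35,040. A Sharpe computed from five trades is meaningless; the sample here only shows the mechanics.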
All three metrics matter, but for the BTC Up/Down 15-min market the cost story is unusually clean:
Realistic targets for the 15-min BTC Up/Down strategy:
Everything above, applied end-to-end to one specific Polymarket market series. This is the running example the rest of the article has been pointing at.
Polymarket lists a fresh “Will BTC be up in the next 15 minutes?” market every 15 minutes. The YES contract pays $1 if BTC is up at the next 15-min UTC boundary versus the previous one. Otherwise the NO side pays $1. New market, fresh book, every 15 minutes, all day, every day.
Single 15-minute market. Price drifts on news, retail piles in late, and overshoots get unwound minute-by-minute. The reversion edge lives inside this oscillation: across 96 bars/day × 90 days that’s ~8,640 reversion opportunities to harvest.
Once the per-bucket EV is validated and clears spread cost, the live loop looks like this:
every 15 min at UTC :00, :15, :30, :45:
    # 1. close the previous bar
    p_close_t = last_trade_price(active_market)
    log_ret_t = log(p_close_t / p_close_prev)  # p_close_prev = close of the bar before
    p_close_prev = p_close_t                   # remember for the next cycle
    # 2. close any open position; record realized PnL
    if open_position:
        exit_at_market()
    # 3. open the next market
    new_market = subscribe_to_next_market()
    p_open = mid_price(new_market)
    # 4. compute the signal
    direction = +1 if log_ret_t > 0 else -1
    signal = -direction  # mean reversion: bet against last move
    # 5. check edge clears costs
    if abs(modeled_ev[direction]) < spread_cost + safety_buffer:
        continue  # skip this bar
    # 6. enter
    side = 'YES' if signal == +1 else 'NO'
    size = 0.02 * capital  # 2% of bot capital
    place_marketable_limit(new_market, side, size, slippage_ticks=1)
    # 7. update stats & (weekly) re-run validation
    log_trade(...)
Each 15-min bar throws off ~$2 net on a $1,000 trade. Tiny per bar. But you get ~96 bars/day, traded daily for years, with compounding. That’s the math behind the equity curve.
Punching the per-bar net through 96 bars/day, 365 days, with 2% capital sizing per trade:
96 bars/day × 365 days = 35,040 trades/year
2% sizing × $10,000 capital = $200 per trade
$2 net edge per $1,000 = $0.40 net per trade
$0.40 × 35,040 = ~$14,000/year on $10K capital, before recompounding
With reinvestment (Kelly-ish): equity curve climbs ~3-5x per year, calibrated
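The same arithmetic as a quick script. It assumes every trade earns exactly the average net edge, which ignores variance, skipped bars, and any decay of the edge:

```python
bars_per_day = 96
trades_per_year = bars_per_day * 365   # 35,040
capital = 10_000.0
size_fraction = 0.02                   # 2% of capital per trade
net_edge = 2.0 / 1_000.0               # $2 net per $1,000 traded

# Fixed sizing: every trade is 2% of the starting capital.
net_per_trade = capital * size_fraction * net_edge
flat_profit = net_per_trade * trades_per_year

# Reinvested sizing: every trade is 2% of current capital, so capital
# grows by a constant factor per trade.
growth_per_trade = 1.0 + size_fraction * net_edge
compounded_multiple = growth_per_trade ** trades_per_year
```

Under these idealized assumptions `flat_profit` comes to about $14,016 and `compounded_multiple` to roughly 4x, which sits inside the 3-5x range quoted above.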
That’s not “$200/year retail,” and it’s not “$25M HFT desk” either. It’s the boring middle: a small, validated edge that pays because compute is cheap and the bot trades 35,000 times a year.
These numbers are illustrative, not a guarantee. Real performance depends on (a) whether the reversion edge is currently alive on this market, (b) how tight your execution actually is, and (c) regime stability. Run the validation pipeline before you trust any estimate. The bot’s whole job is to keep checking.
The Bitcoin Cash example wins 52% of its trades and 21x’s the capital because of one thing: a tiny statistical edge, traded frequently, with compounding. It’s not magic, it’s not deep learning, it’s not even a particularly sophisticated model. It’s careful data analysis, honest validation, and disciplined execution.
That’s the model to copy for Polymarket. Don’t reach for a neural network. Find a signal with a clean statistical edge, validate it survives out-of-sample, account for fees and slippage, and let compounding do the work. Re-validate constantly, because prediction markets shift faster than crypto.
The bot doesn’t need to be smart. It needs to be honest about its edge.
The Discord #research-mean-reversion channel has the full notebook, the dataset, and members actively running this on live markets.