PredictionX — Pricing Engine & Agent Mathematics

§ 01Overview

PredictionX prices binary prediction market contracts on Kalshi. A Kalshi commodity contract is economically equivalent to a cash-or-nothing digital option: it pays $1 if a commodity price exceeds (or falls below) a strike level $K$ on a given settlement date, and $0 otherwise.

The engine computes a fair probability $\hat{p}$ for each contract through a four-stage pipeline:

Data Ingestion

Fetch Kalshi market price (YES and NO independently), build a futures price strip, derive annualised volatility.

Distribution Fit → p_base

Fit a log-normal distribution over the futures strip and compute P(price > K) using the d₂ formula.

Time-Value Decay → p_tv = fair_value

Model how the probability evolves as days-to-expiry (DTE) shrinks by applying the same formula at each intermediate DTE. The point estimate at today's DTE becomes the engine's final fair value.

Persist & Backtest

Save the pricing run to SQLite. Once the contract settles, record the outcome and compare predicted vs actual probability using proper scoring rules.

The edge is then $\text{edge} = \hat{p} - p_{\text{market}}$, where $p_{\text{market}}$ is the independently quoted Kalshi mid-price for the relevant side.

§ 02Kalshi Binary Contracts

A Kalshi commodity contract resolves to YES ($1) or NO ($0) at expiry based on a single condition:

Payoff — Above Contract $$\text{Payoff} = \begin{cases} \$1 & \text{if } S_T > K \\ \$0 & \text{otherwise} \end{cases}$$

Payoff — Below Contract $$\text{Payoff} = \begin{cases} \$1 & \text{if } S_T < K \\ \$0 & \text{otherwise} \end{cases}$$

where $S_T$ is the underlying commodity spot price at settlement date $T$, and $K$ is the strike price parsed from the ticker suffix (e.g. -T80 → $K = \$80$).

The risk-neutral fair value equals the risk-neutral probability:

Risk-Neutral Fair Value $$\hat{p}_{\text{YES}} = \mathbb{E}^{\mathbb{Q}}\!\left[\mathbf{1}_{S_T > K}\right] = \mathbb{Q}(S_T > K)$$ $$\hat{p}_{\text{NO}} = 1 - \hat{p}_{\text{YES}}$$

YES and NO prices are independently quoted

Kalshi operates a separate orderbook for each side. YES and NO bid/ask are quoted independently by market makers:

Mid-Price Calculation $$p_{\text{YES,mkt}} = \frac{\text{yes\_bid} + \text{yes\_ask}}{2} \qquad p_{\text{NO,mkt}} = \frac{\text{no\_bid} + \text{no\_ask}}{2}$$ $$p_{\text{YES,mkt}} + p_{\text{NO,mkt}} \neq 1 \quad \text{in general}$$

Example on a liquid contract:

	Bid	Ask	Mid
YES	42¢	44¢	43¢
NO	57¢	61¢	59¢
Sum of mids	102¢ ≠ 100¢

The mids sum to 102¢ because each book is quoted independently: the asks sum to 105¢ and the bids to 99¢, and the sum of the mids is the average of those two totals. Buying YES at the ask (44¢) and NO at the ask (61¢) costs 105¢ > 100¢, so there is no arbitrage.

Arbitrage-Free Bounds

$\text{yes\_ask} + \text{no\_ask} \geq \$1$ — you cannot buy both sides for less than the $1 payout.
$\text{yes\_bid} + \text{no\_bid} \leq \$1$ — you cannot sell both sides for more than $1.

These constraints are enforced by the matching engine and do not force mid-prices to sum to $1.

Edge against the correct side's price

Edge Definitions $$\text{edge}_{\text{YES}} = \hat{p}_{\text{YES}} - p_{\text{YES,mkt}}$$ $$\text{edge}_{\text{NO}} = (1 - \hat{p}_{\text{YES}}) - p_{\text{NO,mkt}}$$

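Because $p_{\text{YES,mkt}} + p_{\text{NO,mkt}} \neq 1$, YES and NO edges are not equal-and-opposite: they sum to $1 - (p_{\text{YES,mkt}} + p_{\text{NO,mkt}})$. When the two mids sum to less than 1, a contract can show positive edge on both sides simultaneously; when they sum to more than 1, both edges can be negative.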
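The mid-price and edge definitions above can be sketched as follows. The function names and the model probability of 0.45 are illustrative assumptions; the quotes are the ones from the example table.

```python
def mid(bid: float, ask: float) -> float:
    """Mid-price of one side's quote, in dollars."""
    return (bid + ask) / 2.0


def side_edges(p_yes_hat, yes_bid, yes_ask, no_bid, no_ask):
    """Return (edge_YES, edge_NO), each measured against its own side's mid."""
    p_yes_mkt = mid(yes_bid, yes_ask)
    p_no_mkt = mid(no_bid, no_ask)
    edge_yes = p_yes_hat - p_yes_mkt
    edge_no = (1.0 - p_yes_hat) - p_no_mkt
    return edge_yes, edge_no


# Quotes from the example table; p_yes_hat = 0.45 is a hypothetical model output.
edge_yes, edge_no = side_edges(0.45, 0.42, 0.44, 0.57, 0.61)
print(f"edge_YES={edge_yes:+.2f}  edge_NO={edge_no:+.2f}")
```

With these inputs the two edges sum to 1 minus the sum of the mids (1 - 1.02), so they are not mirror images of each other.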

Fallback When NO Prices Are Unavailable

If the API omits NO bid/ask, the engine falls back to arb-consistent complements: $\text{no\_bid} = 1 - \text{yes\_ask}$, $\text{no\_ask} = 1 - \text{yes\_bid}$, giving $p_{\text{NO,mkt}} = 1 - p_{\text{YES,mkt}}$. This is noted in the debug log and stored as no_prices_from_api = False.

§ 03Futures Price Strip

A futures strip is the set of quoted futures prices across all liquid delivery months for a commodity. For WTI crude (root CL):

Delivery Month	Ticker (yfinance)	Close Price
May 2026	`CLK26.NYM`	$62.40
Jun 2026	`CLM26.NYM`	$62.10
Dec 2026	`CLZ26.NYM`	$60.80

Ticker construction follows CME naming conventions: root + month code + 2-digit year + exchange suffix.

F	G	H	J	K	M	N	Q	U	V	X	Z
Jan	Feb	Mar	Apr	May	Jun	Jul	Aug	Sep	Oct	Nov	Dec

Exchange suffixes: .NYM (NYMEX energy/metals), .CMX (COMEX precious metals), .CBT (CBOT grains), .CME (CME livestock/indices/crypto).
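As a minimal sketch of the naming convention, the helper below builds a ticker from its parts. The function name `futures_ticker` is a hypothetical choice, not taken from the engine.

```python
MONTH_CODES = "FGHJKMNQUVXZ"  # Jan..Dec, as in the table above


def futures_ticker(root: str, year: int, month: int, suffix: str) -> str:
    """root + month code + 2-digit year + exchange suffix."""
    if not 1 <= month <= 12:
        raise ValueError("month must be in 1..12")
    return f"{root}{MONTH_CODES[month - 1]}{year % 100:02d}{suffix}"


print(futures_ticker("CL", 2026, 5, ".NYM"))   # CLK26.NYM
print(futures_ticker("CL", 2026, 12, ".NYM"))  # CLZ26.NYM
```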

§ 03bPyth Network Spot Price — Gold & Silver

Two series — KXGOLDD and KXSILVERD — settle on a Pyth Network spot price feed, not on COMEX futures. Kalshi's specifications for these series reference Pyth's 1-minute candle close for XAU/USD and XAG/USD respectively. Using the COMEX futures price as the anchor for these contracts would be systematically wrong.

Settlement Kinds

Series	`settlement_kind`	Point-Estimate Source	Vol Source
KXWTI, KXBRENTD, KXNATGASD, KXCOPPERD	`futures`	yfinance strip	yfinance continuous front-month
KXGOLDD	`pyth_spot`	Pyth XAU/USD (`765d2b…`)	yfinance GC=F historical vol
KXSILVERD	`pyth_spot`	Pyth XAG/USD (`f2fb02…`)	yfinance SI=F historical vol

How the Pyth Anchor Is Applied

For pyth_spot series, the pricing flow calls data/pyth.py to retrieve the current Pyth spot price and then calls apply_spot_anchor(strip, spot_price), which replaces the strip's front-month price with the live Pyth spot and returns the adjusted strip plus computed basis:

Spot Anchor Substitution $$F_{\text{front}}^{\text{adj}} = S_{\text{Pyth}} \qquad \text{(front-month price replaced)}$$ $$\text{basis} = F_{\text{front}} - S_{\text{Pyth}} \qquad \text{(stored in distribution\_params)}$$

Failure Is Fatal for pyth_spot Series

If data/pyth.py raises PythAPIError (HTTP failure, stale publish time, non-positive price), the pricing call raises PricingError — it does not silently fall back to the COMEX futures price. A stale or wrong underlying is worse than no price at all.

Pyth Staleness Guard

Each price returned by get_spot_price() is validated against its publish_time field. A price whose publish timestamp is more than 180 seconds behind wall-clock is rejected with PythAPIError — mirroring the yfinance frozen-data guard in data/futures.py.
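A staleness guard of this kind might look like the sketch below. Only the 180-second limit, the `publish_time` field, the non-positive price check, and the `PythAPIError` name come from the text; the function name and signature are assumptions, and the real `get_spot_price()` may be structured differently.

```python
import time


class PythAPIError(Exception):
    """Raised when a Pyth price cannot be trusted (name taken from the text)."""


def validate_pyth_price(price, publish_time, now=None, max_age_s=180):
    """Reject non-positive prices and prices published more than max_age_s ago.

    publish_time and now are Unix timestamps in seconds (an assumption).
    """
    now = time.time() if now is None else now
    if price <= 0:
        raise PythAPIError(f"non-positive price: {price}")
    age = now - publish_time
    if age > max_age_s:
        raise PythAPIError(f"stale price: published {age:.0f}s ago")
    return price
```

Passing `now` explicitly keeps the check deterministic in tests.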

§ 04Forward Price Interpolation

The key input to the pricing formula is the forward price $F$ — the expected futures price at the contract's settlement month. If the exact settlement month is in the strip, it is used directly. Otherwise we interpolate in log-price space:

Log-Price Linear Interpolation $$\ln F_T = \ln F_{t_1} + \frac{T - t_1}{t_2 - t_1} \left(\ln F_{t_2} - \ln F_{t_1}\right)$$ $$F_T = \exp\!\left(\ln F_T\right)$$

where $t_1 \le T \le t_2$ are the nearest delivery dates bracketing the settlement date, measured in calendar ordinals. Interpolating in log-price rather than price space preserves positivity and is consistent with the log-normal dynamics. When $T$ is beyond the last strip date, the last price is used (flat extrapolation).
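The interpolation can be sketched as below, assuming the strip is a date-sorted list of `(delivery_date, price)` pairs. The function name is hypothetical, and the behaviour before the first strip date (use the first price) is an assumption because the text only specifies extrapolation past the last date.

```python
import math
from datetime import date


def interpolate_forward(strip, settle):
    """Log-linear interpolation of the forward price at the settlement date."""
    t = settle.toordinal()
    first_date, first_price = strip[0]
    last_date, last_price = strip[-1]
    if t >= last_date.toordinal():
        return last_price   # flat extrapolation beyond the strip
    if t <= first_date.toordinal():
        return first_price  # assumption: not specified in the text
    for (d1, f1), (d2, f2) in zip(strip, strip[1:]):
        t1, t2 = d1.toordinal(), d2.toordinal()
        if t1 <= t <= t2:
            w = (t - t1) / (t2 - t1)
            return math.exp(math.log(f1) + w * (math.log(f2) - math.log(f1)))
    raise ValueError("strip must be sorted by delivery date")


# Halfway between 100 and 121 in log space is their geometric mean, 110.
demo = [(date(2026, 1, 1), 100.0), (date(2026, 1, 3), 121.0)]
print(interpolate_forward(demo, date(2026, 1, 2)))
```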

§ 05Volatility Estimation

Volatility $\sigma$ is the annualised standard deviation of log-returns. Two estimation regimes apply depending on time-to-expiry:

Long-Term Regime (DTE ≥ 7 Days)

252 trading days of daily closing prices are fetched for the continuous front-month contract (ticker {root}=F). Log-returns are computed and annualised:

Daily Realised Volatility $$r_i = \ln\!\left(\frac{S_i}{S_{i-1}}\right), \quad i = 1, \ldots, N$$ $$\sigma_{\text{daily}} = \sqrt{\frac{1}{N-1}\sum_{i=1}^{N}\!\left(r_i - \bar{r}\right)^2}$$ $$\sigma_{\text{annual}} = \sigma_{\text{daily}} \times \sqrt{252}$$

The $\sqrt{252}$ annualisation assumes 252 trading days per year. The engine clips the result to $[0.05,\ 2.00]$ to prevent extreme values from degenerate data.
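A minimal standard-library sketch of the long-term estimator follows. The function name is an assumption; `statistics.stdev` uses the same $N-1$ denominator as the formula above.

```python
import math
import statistics


def annualised_daily_vol(closes, lo=0.05, hi=2.00):
    """Annualised volatility from daily closes, clipped to [lo, hi]."""
    returns = [math.log(b / a) for a, b in zip(closes, closes[1:])]
    sigma_daily = statistics.stdev(returns)  # sample std (N-1 denominator)
    sigma_annual = sigma_daily * math.sqrt(252)
    return min(max(sigma_annual, lo), hi)


# A series with identical returns has (near) zero vol, so the floor applies.
print(annualised_daily_vol([100 * 1.01 ** i for i in range(30)]))
```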

Short-Term Regime (DTE < 7 Days)

For short-dated contracts, 5 days of hourly price data are used, with an asset-class-specific annualisation factor:

Hourly Realised Volatility $$\sigma_{\text{hourly}} = \text{std}\!\left(\ln\!\left(\frac{S_t}{S_{t-1}}\right)\right) \quad \text{(over 1h intervals)}$$ $$\sigma_{\text{annual}} = \sigma_{\text{hourly}} \times \sqrt{H_{\text{year}}}$$ $$H_{\text{year}} = \begin{cases} 252 \times 17 = 4{,}284 & \text{equity indices (RTH only)} \\ 365 \times 23 = 8{,}395 & \text{commodities, crypto, metals} \end{cases}$$

Energy and commodity markets trade nearly around the clock (one maintenance hour excluded). Using asset-class-specific $H_{\text{year}}$ prevents over- or under-stating intraday vol.

§ 06The Log-Normal Price Model

The engine models the terminal price $S_T$ as log-normally distributed under the risk-neutral measure $\mathbb{Q}$:

Risk-Neutral Log-Normal Dynamics $$S_T = F \cdot \exp\!\left(-\tfrac{1}{2}\sigma_T^2 + \sigma_T Z\right), \quad Z \sim \mathcal{N}(0,1)$$

where:

$F = F(T)$ is the futures price at the settlement month — the risk-neutral expectation of $S_T$
$\sigma_T = \sigma \sqrt{T}$ is the total volatility over the remaining time $T$ (in years)
The $-\tfrac{1}{2}\sigma_T^2$ term is the Itô correction keeping $\mathbb{E}^{\mathbb{Q}}[S_T] = F$

Log-Normal Terminal Distribution $$\ln S_T \sim \mathcal{N}\!\left(\ln F - \tfrac{1}{2}\sigma_T^2,\ \sigma_T^2\right)$$

This is the Black (1976) futures pricing model applied to binary options — the industry standard for commodity options and OTC derivatives on futures.

Why Use the Futures Price as the Mean?

Under $\mathbb{Q}$, the futures price is an unbiased expectation of the future spot price (no risk premium needed). Using $F$ rather than the current spot $S_0$ correctly accounts for the term structure — e.g., oil in backwardation where $F(T) < S_0$.

§ 07The d₂ Formula — Core Probability Calculation

Given the log-normal model, the probability that $S_T$ exceeds the strike $K$ has a closed-form solution — the digital option pricing formula, derived by integrating the log-normal density above $K$:

Binary Option Probability — Above $$\mathbb{Q}(S_T > K) = N(d_2)$$ $$d_2 = \frac{\ln\!\left(\dfrac{F}{K}\right) - \dfrac{1}{2}\sigma_T^2}{\sigma_T}$$

Binary Option Probability — Below $$\mathbb{Q}(S_T < K) = N(-d_2) = 1 - N(d_2)$$

where $N(\cdot)$ is the standard normal CDF $N(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} e^{-t^2/2} \, dt$.
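The formula translates directly into code. This sketch uses `math.erf` for the normal CDF; the function names are illustrative rather than the engine's own.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_above(forward: float, strike: float, sigma_t: float) -> float:
    """Q(S_T > K) = N(d2), with sigma_t the total volatility to expiry."""
    d2 = (math.log(forward / strike) - 0.5 * sigma_t ** 2) / sigma_t
    return norm_cdf(d2)


def prob_below(forward: float, strike: float, sigma_t: float) -> float:
    """Q(S_T < K) = 1 - N(d2)."""
    return 1.0 - prob_above(forward, strike, sigma_t)


# At the money with sigma_t = 0.2, d2 = -0.1 and the probability is about 0.46.
print(round(prob_above(80.0, 80.0, 0.2), 4))
```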

Derivation

We want $\mathbb{Q}(S_T > K)$. Since $\ln S_T \sim \mathcal{N}(\mu_T, \sigma_T^2)$ with $\mu_T = \ln F - \frac{1}{2}\sigma_T^2$:

$$\mathbb{Q}(S_T > K) = \mathbb{Q}(\ln S_T > \ln K) = \mathbb{Q}\!\left(\frac{\ln S_T - \mu_T}{\sigma_T} > \frac{\ln K - \mu_T}{\sigma_T}\right)$$

Let $Z = (\ln S_T - \mu_T)/\sigma_T \sim \mathcal{N}(0,1)$. Then:

$$\frac{\ln K - \mu_T}{\sigma_T} = \frac{\ln K - \ln F + \tfrac{1}{2}\sigma_T^2}{\sigma_T} = -d_2$$ $$\Rightarrow\ \mathbb{Q}(S_T > K) = \mathbb{Q}(Z > -d_2) = N(d_2) \qquad \blacksquare$$

Intuitive Interpretation of d₂

Scenario	$d_2$	$N(d_2)$	Interpretation
Deep ITM: $F \gg K$	$\to +\infty$	$\to 1.0$	Almost certain YES
At-the-money: $F = K$	$= -\tfrac{1}{2}\sigma_T$	$\approx 0.47$–$0.50$	Slight negative bias from log-normal skew
Deep OTM: $F \ll K$	$\to -\infty$	$\to 0.0$	Almost certain NO
At expiry: $T \to 0$	$\to \pm\infty$	$\to 0$ or $1$	Deterministic outcome

Relationship to Black-Scholes d₁ and d₂

In standard Black-Scholes for a vanilla call, $d_1 = d_2 + \sigma_T$. The call price is $C = F \cdot N(d_1) - K \cdot N(d_2)$, where $N(d_2)$ is the probability of exercise and $N(d_1)$ is the delta. For our cash-or-nothing digital, there is no delivery of the underlying — only $N(d_2)$ matters.

Key Insight

A Kalshi binary contract is priced using only $N(d_2)$ from Black-Scholes. There is no $d_1$ term because the payoff is a fixed $\$1$, not the underlying price. This makes the formula simpler and less sensitive to model assumptions than vanilla option pricing.

Total Volatility $\sigma_T$

Long-Term (DTE in Days) $$T = \frac{\tau}{365}, \quad \sigma_T = \sigma_{\text{annual}} \sqrt{T}$$

Short-Term (DTE in Hours, Commodities/Metals) $$T = \frac{h}{365 \times 23}, \quad \sigma_T = \sigma_{\text{annual}} \sqrt{T}$$

Two floors prevent degenerate $\sigma_T = 0$ at expiry: long-term mode floors DTE at 1 day ($\tau_{\min} = 1$); short-term mode floors $h$ at 0.25 hours (15 minutes). One exception: the curve endpoint at $\tau = 0$ is computed before any floor, using the deterministic boundary rule $p(0) = \mathbf{1}_{F > K}$, ensuring DTE=0 shows 100%/0% exactly.
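The two time scalings and their floors can be sketched as below. The function names are assumptions, and the $\tau = 0$ curve endpoint is handled separately by the boundary rule rather than by these helpers.

```python
import math

HOURS_PER_YEAR = 365 * 23  # commodities/metals convention from the text


def sigma_t_long(sigma_annual: float, dte_days: float) -> float:
    """Total vol in long-term mode; DTE is floored at 1 day."""
    tau = max(dte_days, 1.0)
    return sigma_annual * math.sqrt(tau / 365.0)


def sigma_t_short(sigma_annual: float, hours: float) -> float:
    """Total vol in short-term mode; time-to-expiry is floored at 0.25 hours."""
    h = max(hours, 0.25)
    return sigma_annual * math.sqrt(h / HOURS_PER_YEAR)


print(sigma_t_long(0.40, 30))   # 30 days out
print(sigma_t_short(0.40, 6))   # 6 hours out
```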

§ 08Time-Value Decay Curve

The distribution fit gives $p_{\text{base}}$ — the probability evaluated at today's DTE. As expiry approaches, the contract's probability drifts toward its final 0 or 1 value. The time-value decay curve models this path.

For each DTE $\tau$ from today down to expiry, the engine applies the same log-normal formula with the remaining time:

Decay Curve Formula $$p(\tau) = N\!\left(d_2(\tau)\right), \quad d_2(\tau) = \frac{\ln(F/K) - \tfrac{1}{2}\sigma^2 (\tau/365)}{\sigma \sqrt{\tau/365}}$$

At $\tau = 0$, the outcome is deterministic: $p(0) = \mathbf{1}_{F > K}$. The full curve is a list of $(\tau,\ p(\tau))$ pairs.
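A sketch of the curve for an "above" contract in long-term (daily) mode, under the constant-forward assumption. The function names and the sample inputs are illustrative.

```python
import math


def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def decay_curve(forward, strike, sigma_annual, dte_today):
    """Return [(tau, p(tau))] from today's DTE down to 0."""
    curve = []
    for tau in range(dte_today, -1, -1):
        if tau == 0:
            p = 1.0 if forward > strike else 0.0  # deterministic endpoint
        else:
            t = tau / 365.0
            d2 = (math.log(forward / strike) - 0.5 * sigma_annual ** 2 * t) / (
                sigma_annual * math.sqrt(t)
            )
            p = norm_cdf(d2)
        curve.append((tau, p))
    return curve


# Hypothetical in-the-money example: F = 65, K = 60, 30% annual vol, 10 days.
for tau, p in decay_curve(65.0, 60.0, 0.30, 10):
    print(tau, round(p, 4))
```

For this in-the-money example the probabilities rise toward 1 as $\tau$ falls, matching the first shape described in the next subsection.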

Shape of the Curve

Deep ITM ($F \gg K$): $p$ starts high and increases monotonically toward 1.0 — less time for price to fall below $K$.
Deep OTM ($F \ll K$): $p$ starts low and decreases toward 0.0 — less time for a rally to reach $K$.
ATM ($F \approx K$): $p \approx 0.5$ throughout with a gentle final convergence.
Near-strike, high vol: the curve is flatter, with a sharp final convergence as $\sigma_T \to 0$.

Simplifying Assumption

The forward price is held constant along the curve — the strip is not re-interpolated at each DTE. In practice, as expiry approaches, the relevant forward price shifts toward the prompt-month contract.

§ 09Scenario Analysis (±1σ Shocks)

Three scenario probabilities are computed by shocking the forward price by ±1 standard deviation of the log-return over the remaining life:

Vol Shock on Forward Price $$F_{\text{bull}} = F \cdot e^{+\sigma_T}, \quad F_{\text{bear}} = F \cdot e^{-\sigma_T}, \quad F_{\text{base}} = F$$

Scenario Probabilities $$p_{\text{bull}} = N\!\left(\frac{\ln(F_{\text{bull}}/K) - \tfrac{1}{2}\sigma_T^2}{\sigma_T}\right)$$ $$p_{\text{base}} = N\!\left(\frac{\ln(F/K) - \tfrac{1}{2}\sigma_T^2}{\sigma_T}\right)$$ $$p_{\text{bear}} = N\!\left(\frac{\ln(F_{\text{bear}}/K) - \tfrac{1}{2}\sigma_T^2}{\sigma_T}\right)$$

Note that $\sigma_T$ in the denominator of $d_2$ is unchanged — we shock the expected price path while keeping residual uncertainty fixed. This gives a spread of outcomes bracketing the base case.
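A short sketch of the three scenario probabilities. Because the shock is $\pm\sigma_T$ in log space and the denominator is unchanged, each shock moves $d_2$ by exactly $\pm 1$. The function names are assumptions.

```python
import math


def norm_cdf(x: float) -> float:
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def scenario_probs(forward: float, strike: float, sigma_t: float) -> dict:
    """P(S_T > K) under bull/base/bear forward shocks of +/- 1 sigma_T."""
    out = {}
    for name, shock in (("bull", sigma_t), ("base", 0.0), ("bear", -sigma_t)):
        shocked = forward * math.exp(shock)
        d2 = (math.log(shocked / strike) - 0.5 * sigma_t ** 2) / sigma_t
        out[name] = norm_cdf(d2)
    return out


# At the money with sigma_t = 0.1: d2 is 0.95, -0.05 and -1.05 respectively.
print(scenario_probs(80.0, 80.0, 0.1))
```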

Directionality for NO Side

For a "below" contract, or for the NO side of an "above" contract, bull/bear labels are inverted: a bullish price move lowers the probability of a payout. The terminal report applies this inversion automatically when rendering.

§ 10Aggregation — Final Fair Value

Aggregation Pipeline $$p_{\text{base}} \xrightarrow{\text{time-value at today's DTE}} p_{\text{tv}} = \hat{p}$$

$p_{\text{base}}$: log-normal probability from the distribution fit. Anchors the estimate.
$p_{\text{tv}}$: the decay-curve value at today's DTE. Mathematically equivalent to $p_{\text{base}}$ under the same time scaling, but the time-value module also produces the full $\{\tau,\ p(\tau)\}$ curve for visualisation and forward simulation.
$\hat{p}$: the final engine fair value, equal to $p_{\text{tv}}$. If the time-value module fails, the aggregator falls back to $\hat{p} = p_{\text{base}}$.

Scenarios (bull/base/bear) are computed separately using $\pm 1\sigma_T$ forward shocks — see §9. They are reported alongside $\hat{p}$ but do not participate in its definition.

§ 11Short-Term Mode (DTE < 7 Days)

Contracts expiring within 7 days receive special treatment because:

Daily closing-price vol is too coarse — a lot can happen in hours
Integer DTE (0 or 1) provides insufficient precision for time scaling $T$
Expiry times matter: a contract at 4:00 PM vs 11:59 PM EST is very different at DTE=0

Short-Term Time Scaling $$T = \frac{h}{365 \times 23} \quad \text{where } h = \text{time-to-expiry in hours (fractional)}$$ $$\sigma_T = \sigma_{\text{annual,hourly}} \times \sqrt{T}$$

$h$ is computed from the Kalshi API's close_time field (ISO 8601 UTC) minus current UTC time, giving sub-hour precision. The decay curve in short-term mode shows 8 evenly-spaced hourly points from now to expiry.

§ 12Volatility Annualisation — Regime Reference

Asset Class	DTE Regime	Data Period	Interval	Annualisation
All (long-term)	≥ 7 days	252 days	1d	$\times\sqrt{252}$
Energy, Metals, Crypto, Ag	< 7 days	5 days	1h	$\times\sqrt{365 \times 23}$
Equity Indices	< 7 days	5 days	1h	$\times\sqrt{252 \times 17}$

The 23h/day assumption accounts for the maintenance window in electronic futures markets (typically 1 hour around 5–6 PM ET). Equity index futures use 17h because extended-hours sessions have much lower liquidity.

§ 13Edge Score

Edge Definition $$\text{edge} = \hat{p} - p_{\text{mkt}}$$

The market price $p_{\text{mkt}}$ is always sourced from the Kalshi API mid-price (never entered manually). Interpretation:

Edge	Signal	Implication
$> +5\%$	Strong YES edge	Market underpricing YES; consider buying YES
$+1\%$ to $+5\%$	Mild YES edge	Modest mispricing on YES side
$\pm 1\%$	Fairly priced	No strong signal
$-1\%$ to $-5\%$	Mild NO edge	Market underpricing NO; consider buying NO
$< -5\%$	Strong NO edge	Market overpricing YES; strong signal for NO

Edge Is a Model Output, Not a Guarantee

The log-normal model is a simplification. Real commodity prices exhibit jumps, fat tails, and mean-reversion the model does not capture. Positive edge is a signal to investigate, not a mechanical trading rule.

§ 14Backtesting and Model Evaluation

Once a contract settles, the engine evaluates the prediction using proper scoring rules — mathematical measures that reward honest probability estimates and cannot be gamed by reporting extreme probabilities.

Brier Score

Brier Score $$\text{BS} = \frac{1}{N} \sum_{i=1}^{N} \left(\hat{p}_i - o_i\right)^2$$

where $\hat{p}_i \in [0,1]$ is the model's fair value and $o_i \in \{0, 1\}$ is the actual outcome. Reference values:

BS = 0.00: perfect predictions
BS = 0.25: equivalent to guessing 50% on every contract
BS = 1.00: perfectly wrong on every prediction

Brier Skill Score (BSS)

Brier Skill Score vs. Market $$\text{BSS} = 1 - \frac{\text{BS}_{\text{model}}}{\text{BS}_{\text{market}}}$$

BSS > 0 means the model beats the market price as a probability estimator. BSS is the primary long-run measure of whether the engine adds value beyond using the market mid as your forecast.
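Both scores are a few lines of code. This is a sketch with hypothetical function names and made-up forecasts, not the engine's backtest module.

```python
def brier(probs, outcomes):
    """Mean squared difference between forecast probability and outcome."""
    return sum((p - o) ** 2 for p, o in zip(probs, outcomes)) / len(probs)


def brier_skill(model_probs, market_probs, outcomes):
    """BSS = 1 - BS_model / BS_market; positive means the model beats the market."""
    return 1.0 - brier(model_probs, outcomes) / brier(market_probs, outcomes)


outcomes = [1, 0]
print(brier([0.5, 0.5], outcomes))                    # 0.25, the coin-flip reference
print(brier_skill([0.8, 0.2], [0.6, 0.4], outcomes))  # about 0.75
```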

Calibration

A model is well-calibrated if, across all contracts where it predicted probability $p$, the event actually occurred $p$ fraction of the time. The engine bins predictions into 10pp buckets:

$$\text{event rate in bucket } b = \frac{\sum_{i \in b} o_i}{\left|b\right|}$$

Calibration Pattern	Diagnosis
Event rate consistently above diagonal	Model underestimates probability (too bearish)
Event rate consistently below diagonal	Model overestimates probability (too bullish)
S-shaped: too low at extremes, too high in middle	Model is over-confident — probabilities too extreme
Inverse-S: too high at extremes, too low in middle	Model is under-confident — probabilities too conservative
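The bucketing step can be sketched as follows, assuming ten equal-width buckets with a forecast of exactly 1.0 placed in the top bucket. The function name and the sample data are illustrative.

```python
def calibration_table(probs, outcomes, n_buckets=10):
    """Map bucket index -> observed event rate, skipping empty buckets."""
    hits = [0] * n_buckets
    counts = [0] * n_buckets
    for p, o in zip(probs, outcomes):
        b = min(int(p * n_buckets), n_buckets - 1)  # p == 1.0 goes in the last bucket
        counts[b] += 1
        hits[b] += o
    return {b: hits[b] / counts[b] for b in range(n_buckets) if counts[b]}


# Bucket 1 covers forecasts in [0.10, 0.20), bucket 8 covers [0.80, 0.90).
table = calibration_table([0.12, 0.15, 0.85, 0.88, 1.0], [0, 1, 1, 1, 1])
print(table)
```

Comparing each bucket's event rate with the bucket's midpoint gives the points of the calibration diagram.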

Mean Absolute Error and Bias

MAE and Bias $$\text{MAE} = \frac{1}{N}\sum_{i=1}^{N} \left|\hat{p}_i - o_i\right|$$ $$\text{Bias} = \frac{1}{N}\sum_{i=1}^{N} \left(\hat{p}_i - o_i\right)$$

Positive bias means the model systematically overestimates the probability of the event occurring.

Edge Direction Accuracy

Directional Accuracy $$\text{edge\_correct}_i = \begin{cases} 1 & \text{if } \text{edge}_i > 0 \text{ and } o_i = 1 \\ 1 & \text{if } \text{edge}_i < 0 \text{ and } o_i = 0 \\ 0 & \text{otherwise} \end{cases}$$ $$\text{Edge Accuracy} = \frac{1}{N}\sum_{i=1}^N \text{edge\_correct}_i$$

PnL per Dollar Risked

Realised PnL per Dollar $$\text{PnL} = o - p_{\text{mkt}}$$

Averaging this over all runs in an edge bucket gives the realised expected value of acting on that edge signal. Positive average PnL in the high-edge bucket validates the model's alpha.

Why Proper Scoring Rules Matter

The Brier score is strictly proper: it is minimised in expectation only when the forecast equals the true probability. A model that reports extreme probabilities to artificially improve its score will actually increase its Brier score. BSS improvement is genuinely signal, not an artefact of reporting strategy.

§ 15Agent Pipeline Overview

The autonomous strategy agent runs a continuous scan-price-signal-size-record loop, evaluating all open Kalshi contracts and recording sizing decisions in shadow mode by default. Every decision is persisted to the strategy_decisions table with full provenance so the loop can be evaluated like a live trading system.

Scan

List all open Kalshi contracts across every configured series via the public API.

Filter

Apply six lightweight pre-pricing checks: priceable series, deduplication, volume, mid-price range, TTX/DTE window, and spread quality.

Price

Run the full pricing engine for each contract that passes filters → raw_edge = fair_value − market_price.

Confidence Regression

Look up OLS coefficients (α, β) from historical resolved runs for this segment → adjusted_edge = α + β × raw_edge.

Classify

Compare adjusted_edge to the segment threshold θ and the global threshold θ₀ (see §18): watch (below θ), auto-buy (from θ up to 2θ₀), or clear-buy (≥ 2θ₀). No per-cycle LLM call.

Size

Compute position size under two parallel strategies: fractional Kelly and Fixed dollar amount.

Record

Write the decision to strategy_decisions. Update the bankroll curve. After contract settlement, auto-resolve and compute PnL.

Shadow Mode

kelly_action and fixed_action are written to the database as if trades were placed, but no order is submitted to Kalshi. The agent simulates a live portfolio using mark-to-market PnL computed at settlement. Live mode is enabled by --live or AGENT_LIVE_MODE=true.

§ 16Pre-Signal Filters

Six filters are applied before any pricing or database lookup — ordered cheapest to most expensive. The first failing check short-circuits evaluation.

#	Filter	Default Threshold	Reason
1	Priceable series	`futures_root ≠ ""`	Series must map to a futures root; unknown series skipped entirely.
2	Deduplication	Not in `v_open_positions`	Avoid accumulating duplicate positions in the same contract.
3	Minimum volume	500 contracts	Illiquid contracts have wide spreads and unreliable mid-prices.
4	Mid-price range	5% – 95%	Deep OTM/ITM contracts are near-resolved; edge signal is noisy.
5	TTX / DTE window	2h – 26h (short) or 1d – 14d (medium)	Avoid contracts too close (no time to act) or too far (vol estimates unreliable).
6	Spread quality	yes_spread / yes_mid ≤ 8%	Wide bid/ask makes the mid-price an unreliable proxy for fair value.
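The short-circuit ordering can be sketched as a single function that returns the first failing reason. The `Candidate` fields, the reason strings, and the conversion of hours to days are all assumptions made for illustration; only the thresholds come from the table.

```python
from dataclasses import dataclass
from typing import Optional, Set


@dataclass
class Candidate:
    ticker: str
    futures_root: str
    volume: int
    yes_bid: float
    yes_ask: float
    ttx_hours: float


def first_failed_filter(c: Candidate, open_tickers: Set[str]) -> Optional[str]:
    """Return a (hypothetical) skip reason for the first failing filter, else None."""
    if c.futures_root == "":
        return "unpriceable_series"
    if c.ticker in open_tickers:
        return "duplicate_position"
    if c.volume < 500:
        return "low_volume"
    yes_mid = (c.yes_bid + c.yes_ask) / 2.0
    if not 0.05 <= yes_mid <= 0.95:
        return "mid_out_of_range"
    if c.ttx_hours <= 26:
        in_window = c.ttx_hours >= 2        # short-term: 2h to 26h
    else:
        in_window = c.ttx_hours / 24 <= 14  # medium-term: up to 14 days
    if not in_window:
        return "ttx_out_of_window"
    if (c.yes_ask - c.yes_bid) / yes_mid > 0.08:
        return "wide_spread"
    return None


ok = Candidate("KXWTI-EXAMPLE", "CL", 1000, 0.42, 0.44, 10.0)
print(first_failed_filter(ok, open_tickers=set()))  # None: all six checks pass
```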

TTX Regime Split

A contract is classified as short-term if dte_hours ≤ 26 (approximately one trading day). Short-term contracts use hourly volatility and TTX buckets. Medium-term contracts use daily DTE and daily vol. The agent applies separate TTX/DTE range filters for each regime.

§ 17Confidence Regression and Adjusted Edge

A raw-edge threshold alone ignores whether the pricing model's edge estimates are systematically over- or under-stated in specific market segments. The agent estimates a linear regression from historical raw_edge to realised PnL per dollar risked, fitting two coefficients that capture bias and calibration quality independently.

Training Data

YES Side $$x_i = \hat{p}_i - p_{\text{YES,mkt},i} \quad \text{(raw YES edge)}$$ $$y_i = o_i - p_{\text{YES,mkt},i} \quad \text{(realised PnL per dollar risked)}$$

NO Side $$x_i = (1 - \hat{p}_i) - p_{\text{NO,mkt},i} \quad \text{(raw NO edge)}$$ $$y_i = (1 - o_i) - p_{\text{NO,mkt},i} \quad \text{(realised PnL per dollar risked on NO)}$$

OLS Regression

Ordinary Least Squares $$y = \alpha + \beta \, x + \varepsilon$$ $$\hat{\alpha},\ \hat{\beta} = \underset{\alpha, \beta}{\arg\min} \sum_{i=1}^{N} \left(y_i - \alpha - \beta\, x_i\right)^2$$

Interpreting the coefficients:

$\alpha$ (intercept — bias): expected PnL when raw_edge = 0. If $\alpha < 0$, the model loses on zero-edge trades — systematic overconfidence or adverse selection.
$\beta$ (slope — calibration): additional realised PnL per unit of raw_edge. $\beta = 1$ means raw edge is perfectly predictive; $\beta < 1$ means edges are overstated; $\beta > 1$ means the model is conservative.

Adjusted Edge Formula

Adjusted Edge $$\text{adjusted\_edge} = \hat{\alpha} + \hat{\beta} \cdot \text{raw\_edge}$$

Fallback Prior

When fewer than min_samples (default 50) resolved runs exist for a segment, the agent falls back to a conservative neutral prior:

$$\hat{\alpha} = 0.0,\quad \hat{\beta} = 0.5 \quad \Rightarrow \quad \text{adjusted\_edge} = 0.5 \times \text{raw\_edge}$$
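A closed-form single-variable OLS fit with the fallback prior might look like this sketch. The function names are assumptions, and returning the prior when all `x` values are identical is an added safeguard that the text does not describe.

```python
def fit_confidence(xs, ys, min_samples=50):
    """Return (alpha, beta) from OLS of y on x, or the prior (0.0, 0.5)."""
    n = len(xs)
    if n < min_samples:
        return 0.0, 0.5  # fallback prior from the text
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0.0:
        return 0.0, 0.5  # assumption: degenerate data also falls back
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    return alpha, beta


def adjusted_edge(raw_edge, alpha, beta):
    return alpha + beta * raw_edge


# Synthetic, noise-free history generated with alpha = 0.01 and beta = 0.8.
xs = [i / 100 for i in range(60)]
ys = [0.01 + 0.8 * x for x in xs]
alpha, beta = fit_confidence(xs, ys)
print(round(alpha, 4), round(beta, 4))
```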

Confidence Hierarchy

Level	Segment	Example Label
1 — Most specific	Series + direction + TTX/DTE bucket	`KXGOLDD/above/4-8h`
2	Series + TTX/DTE bucket (any direction)	`KXGOLDD/4-8h`
3	Global + TTX/DTE bucket (all series)	`global/4-8h`
4 — Fallback	Global average (all segments)	`global_average`

TTX buckets for short-term: <2h, 2–4h, 4–8h, 8–16h, 16–26h. DTE buckets for medium-term: 1d, 2–3d, 4–7d, 8–14d.
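The hierarchy amounts to trying progressively broader keys. In this sketch the coefficient table is a plain dict keyed by the labels shown above; the function name, the dict layout, and the final `prior` label are assumptions.

```python
def lookup_coefficients(table, series, direction, bucket):
    """Return (label, (alpha, beta)) from the most specific segment available."""
    keys = (
        f"{series}/{direction}/{bucket}",  # level 1
        f"{series}/{bucket}",              # level 2
        f"global/{bucket}",                # level 3
        "global_average",                  # level 4
    )
    for key in keys:
        if key in table:
            return key, table[key]
    return "prior", (0.0, 0.5)  # nothing fitted yet: fallback prior


# Hypothetical fitted coefficients.
coeffs = {"KXGOLDD/4-8h": (0.00, 0.7), "global_average": (-0.01, 0.6)}
print(lookup_coefficients(coeffs, "KXGOLDD", "above", "4-8h"))
print(lookup_coefficients(coeffs, "KXWTI", "below", "2-4h"))
```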

§ 18Signal Classification

Two thresholds govern classification:

$\theta$ — the segment-overridden minimum adjusted edge. Starts at min_adjusted_edge (default 0.072) and may be raised/lowered by a min_edge_override. Used for the watch/buy boundary.
$\theta_0$ — the global, unoverridden config.min_adjusted_edge (default 0.072). Used exclusively for the auto-buy/clear-buy boundary at $2\theta_0$. Segment overrides never move this line.

Tier	Condition	Recorded Action
Watch	$\text{adjusted\_edge} < \theta$	`watch` — zero size, DB row written for analysis
Auto-Buy	$\theta \le \text{adjusted\_edge} < 2\theta_0$	`buy` — rule-based, proceeds immediately to sizing
Clear-Buy	$\text{adjusted\_edge} \ge 2\theta_0$	`buy` — same execution as auto-buy; label preserved in analytics for high-confidence tracking

Sanity Gate — Implausible Edge

Before any buy proceeds to sizing, raw_edge is checked against max_raw_edge_sanity (default 0.25). If raw_edge ≥ 0.25, the classification is forced to watch with skip reason raw_edge_implausible:<value>. Sane liquid markets do not leave 25 percentage points on the table; a reading this extreme almost certainly signals a stale forward price or input error.
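The tier rules and the sanity gate can be combined in one function, sketched below. The return shape (tier, skip reason) and the tier strings are assumptions; the thresholds and the `raw_edge_implausible:<value>` reason come from the text.

```python
def classify(adjusted_edge, raw_edge, theta, theta0=0.072, max_raw_edge_sanity=0.25):
    """Return (tier, skip_reason) for one priced contract."""
    if adjusted_edge < theta:
        return "watch", None
    if raw_edge >= max_raw_edge_sanity:
        # Sanity gate: an edge this large is treated as a data problem.
        return "watch", f"raw_edge_implausible:{raw_edge}"
    if adjusted_edge >= 2 * theta0:
        return "clear_buy", None
    return "auto_buy", None


print(classify(0.05, 0.06, theta=0.072))  # ('watch', None)
print(classify(0.10, 0.12, theta=0.072))  # ('auto_buy', None)
print(classify(0.15, 0.20, theta=0.072))  # ('clear_buy', None)
print(classify(0.15, 0.30, theta=0.072))  # forced to watch by the sanity gate
```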

Why Keep the Clear-Buy Label?

Separating clear-buy from auto-buy in the DB enables the periodic review to analyse whether the model's highest-confidence signals outperform relative to the auto-buy tier — without requiring a per-cycle LLM call to generate the distinction.

LLM Observability (Startup Preflight)

Status	Condition
`ok`	API key present — periodic review LLM will be available
`error`	Key missing — periodic review runs rule-based findings only, no Claude synthesis

§ 19Position Sizing

Kelly Criterion for Binary Contracts

A Kalshi YES contract bought at price $p_{\text{mkt}}$ has payoff structure: win (probability $\hat{p}$) profit = $1 - p_{\text{mkt}}$; lose (probability $1 - \hat{p}$) loss = $p_{\text{mkt}}$. The Kelly fraction maximises expected log-wealth growth. For net odds $b = (1 - p_{\text{mkt}}) / p_{\text{mkt}}$:

Kelly Fraction — Binary Contract $$f^* = \frac{\hat{p} \cdot b - (1 - \hat{p})}{b} = \frac{\hat{p} - p_{\text{mkt}}}{1 - p_{\text{mkt}}} = \frac{\text{edge}}{1 - p_{\text{mkt}}}$$

A half-Kelly multiplier $\lambda = 0.5$ and a hard cap $f_{\text{max}}$ (default 10% of bankroll) reduce variance and ruin risk:

Final Kelly Fraction and Size $$f_{\text{adj}} = \lambda \cdot f^* = \frac{\lambda \cdot \text{edge}}{1 - p_{\text{mkt}}}$$ $$f_{\text{final}} = \min\!\left(f_{\text{adj}},\ f_{\text{max}}\right)$$ $$\text{size}_{\text{USD}} = f_{\text{final}} \times W_{\text{avail}}$$
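A worked sketch of the sizing formulas. The function name is hypothetical, and flooring the fraction at zero for negative edge is an assumption, since in the pipeline only buy-classified candidates reach sizing.

```python
def kelly_size(p_hat, p_mkt, w_avail, lam=0.5, f_max=0.10):
    """Dollar size from fractional Kelly with a hard cap on the fraction."""
    f_star = (p_hat - p_mkt) / (1.0 - p_mkt)  # full Kelly fraction
    f_final = min(lam * f_star, f_max)        # half-Kelly, then cap
    return max(f_final, 0.0) * w_avail        # assumption: never size a negative edge


# Fair value 0.50 vs market 0.43 on a $1,000 bankroll: about $61.40.
print(round(kelly_size(0.50, 0.43, 1000.0), 2))
# A very large edge hits the 10% cap: $100.
print(round(kelly_size(0.90, 0.50, 1000.0), 2))
```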

Cash-on-Hand Sizing and EV-Ranked Allocation

Available Kelly Bankroll $$W_{\text{avail}} = \max\!\left(0,\ W_{\text{snapshot}} - \sum_{i \in \text{open}} \text{size}_{\text{USD},i}\right)$$

Within a single cycle, candidates are funded in decreasing order of $f^*$. After each buy, $W_{\text{avail}}$ is decremented before the next candidate is sized:

Within-Cycle Priority $$\text{priority}(c) = \frac{\text{adjusted\_edge}_c}{1 - p_{\text{mkt},c}}, \quad p_{\text{mkt},c} < 1$$ $$W_{\text{avail}}^{(k+1)} = \max\!\left(0,\ W_{\text{avail}}^{(k)} - \text{size}_{\text{USD},c_k}\right)$$
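The within-cycle loop can be sketched as: rank by priority, size against the remaining cash, then decrement. The tuple layout and function name are assumptions, and the sizing step reuses the half-Kelly and cap rule with `adjusted_edge` as the edge input.

```python
def allocate(candidates, w_avail, lam=0.5, f_max=0.10):
    """candidates: list of (ticker, adjusted_edge, p_mkt). Returns {ticker: size_usd}."""
    eligible = [c for c in candidates if c[2] < 1.0]  # priority needs p_mkt < 1
    ranked = sorted(eligible, key=lambda c: c[1] / (1.0 - c[2]), reverse=True)
    sizes = {}
    for ticker, edge, p_mkt in ranked:
        f_final = min(lam * edge / (1.0 - p_mkt), f_max)
        size = max(f_final, 0.0) * w_avail
        sizes[ticker] = size
        w_avail = max(0.0, w_avail - size)  # later candidates see less cash
    return sizes


# B has the higher priority, so it is sized first against the full $1,000.
sizes = allocate([("A", 0.08, 0.5), ("B", 0.10, 0.5)], 1000.0)
print(sizes)
```

Here B receives about $100 (capped at 10%) and A is then sized against the remaining $900, giving about $72.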

Why Half-Kelly?

Full Kelly maximises long-run growth but produces extreme drawdowns and is highly sensitive to model error. Half-Kelly gives approximately 75% of the growth rate with much lower variance, and is standard practice for strategies with uncertain probability estimates.

Fixed Strategy

Fixed Position Size $$\text{size}_{\text{USD}} = \begin{cases} A & \text{if } \text{adjusted\_edge} \ge \theta \\ 0 & \text{otherwise} \end{cases}$$

where $A$ is a constant dollar amount (default $50) and $\theta$ is the same segment-overridden threshold used for Kelly classification. Fixed is a benchmark: lower variance than Kelly but does not scale with edge conviction.

Fixed Never Trades Live

Fixed is a permanent shadow benchmark — it records decisions exactly as Kelly does, but no real Kalshi order is ever placed on behalf of the Fixed strategy.

Bankroll Tracking

Bankroll Update $$W_{t+1} = W_t + \sum_{\text{resolved at }t} \text{PnL}_i$$ $$\text{PnL}_i = \text{size}_{\text{USD},i} \times \frac{o_i - p_{\text{mkt},i}}{p_{\text{mkt},i}}$$

This is the dollar profit from buying size_USD worth of contracts at $p_{\text{mkt}}$: each dollar spent buys $1/p_{\text{mkt}}$ contracts, each paying $1 or $0 at settlement. Snapshots are append-only and never amended.
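A small worked sketch of the PnL formula and bankroll update. The function names and the dollar amounts are illustrative.

```python
def settle_pnl(size_usd, p_mkt, outcome):
    """Dollar PnL of size_usd spent at price p_mkt, settling at outcome (0 or 1)."""
    return size_usd * (outcome - p_mkt) / p_mkt


def update_bankroll(bankroll, pnls):
    """W_{t+1} = W_t + sum of PnL resolved at t."""
    return bankroll + sum(pnls)


win = settle_pnl(50.0, 0.40, 1)   # 125 contracts, each gaining $0.60
loss = settle_pnl(50.0, 0.40, 0)  # the full stake is lost
print(round(win, 2), round(loss, 2), round(update_bankroll(1000.0, [win, loss]), 2))
```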

§ 20Segment Overrides and Periodic Review

The agent runs a periodic review at most daily, or when review_sample_threshold (default 50) new resolutions have accumulated since the last review. The review identifies market segments where the confidence regression is systematically wrong and applies segment overrides that take effect immediately — no restart required.

Review Process

Load analytics views (v_by_ttx_bucket, v_by_dte_bucket, v_by_moneyness, v_strategy_comparison, etc.)
Flag segments with n ≥ review_min_segment_n (default 30) and edge accuracy below 0.45 or above 0.65.
Call claude-sonnet-4-6 with tool use — the LLM can query allowed analytics views, inspect bankroll curves, and retrieve prior overrides.
LLM returns override recommendations as structured JSON.
Overrides are written to segment_overrides and take effect on the next agent cycle.

Override Types

Type	Effect on Regression Coefficients
`confidence_multiplier`	$\hat{\beta}' = v \cdot \hat{\beta}$, $\hat{\alpha}$ unchanged. Use $v < 1$ to dampen over-trading; $v > 1$ to boost a reliably undervalued segment.
`exclude`	$\hat{\alpha}' = 0,\ \hat{\beta}' = 0\ \Rightarrow\ \text{adjusted\_edge} = 0$. Segment always produces watch/skip regardless of raw edge. Used for persistently negative realised PnL.
`min_edge_override`	Replaces `min_adjusted_edge` ($\theta$) for the specific segment. Raises the bar for low-quality segments or lowers it for highly reliable ones.

Override Preservation

The periodic review only deactivates prior permanent overrides when the LLM produces ≥1 valid replacement. A silent review never wipes the active override set — this fixed a live-trading incident where a review wiped a KXWTI exclude 47 seconds before 7 fresh KXWTI buys.

§ 21Live-Mode Safety Guards

The live execution path applies a layered set of safety checks introduced after Day-1 of live trading exposed several failure modes. All guards are evaluated in order; the first failing guard determines the skip reason.

Data-Quality Guards (Before Pricing)

Guard	Threshold	Failure Action
Stale daily data	Last bar older than 96h	Raise `FuturesDataError`
Stale hourly data	Last bar older than 6h	Raise `FuturesDataError`
Frozen futures price	Price identical (|Δ| < 1e-4) for ≥5 consecutive hourly readings over 6h	Raise `FuturesDataError`
Stale Pyth feed	`publish_time` more than 180s behind wall-clock	Raise `PythAPIError`

Per-Order Live Guards (Before Placing a Kalshi Order)

These guards evaluate against a per-cycle running state (_LiveCycleState) so two candidates within the same cycle cannot collectively breach a cap that each one would pass individually.

Guard	Condition Checked	Skip Reason
No quoted ask	`ask ≤ 0` for the evaluated side	`no_ask`
Insufficient live cash	Order cost (`count × ask`) exceeds running working cash	`insufficient_live_cash:<avail><<cost>`
Directional asymmetry	(series, direction) already has ≥ `max_same_direction_buys_per_cycle` (default 3) buys	`directional_asymmetry:…`
Per-series exposure cap	Post-trade series exposure / account value > 20%	`series_exposure_cap:…`
Portfolio exposure cap	Post-trade total open exposure / account value > 40%	`portfolio_exposure_cap:…`

Account Value and Exposure Definitions $$\text{total\_account\_value} = \text{Kalshi\_cash\_balance} + \text{open\_position\_cost}$$ $$\text{series\_frac} = \frac{\text{series\_exposure}[\text{series}] + \text{order\_cost}}{\text{total\_account\_value}}$$ $$\text{portfolio\_frac} = \frac{\text{running\_exposure} + \text{order\_cost}}{\text{total\_account\_value}}$$
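
The guard order and the running-state idea can be sketched as below. `CycleState` is a hypothetical stand-in for `_LiveCycleState`, the helper names are illustrative, and the skip reasons are returned without their detail suffixes.

```python
from dataclasses import dataclass, field


@dataclass
class CycleState:
    """Hypothetical per-cycle running state."""
    working_cash: float
    total_account_value: float  # cash + open position cost
    running_exposure: float = 0.0
    series_exposure: dict = field(default_factory=dict)
    direction_buys: dict = field(default_factory=dict)


def check_order(state, series, direction, count, ask,
                max_same_direction=3, series_cap=0.20, portfolio_cap=0.40):
    """Return the first failing guard's skip reason, or None if all pass."""
    if ask <= 0:
        return "no_ask"
    cost = count * ask
    if cost > state.working_cash:
        return "insufficient_live_cash"
    if state.direction_buys.get((series, direction), 0) >= max_same_direction:
        return "directional_asymmetry"
    series_frac = (state.series_exposure.get(series, 0.0) + cost) / state.total_account_value
    if series_frac > series_cap:
        return "series_exposure_cap"
    portfolio_frac = (state.running_exposure + cost) / state.total_account_value
    if portfolio_frac > portfolio_cap:
        return "portfolio_exposure_cap"
    return None


def record_buy(state, series, direction, cost):
    """Update the running state so later candidates see this order."""
    state.working_cash -= cost
    state.running_exposure += cost
    state.series_exposure[series] = state.series_exposure.get(series, 0.0) + cost
    key = (series, direction)
    state.direction_buys[key] = state.direction_buys.get(key, 0) + 1


state = CycleState(working_cash=1000.0, total_account_value=1000.0)
first = check_order(state, "KXWTI", "yes", count=300, ask=0.5)   # 15% of account
record_buy(state, "KXWTI", "yes", cost=150.0)
second = check_order(state, "KXWTI", "yes", count=200, ask=0.5)  # would reach 25%
```

Each order passes the 20% series cap on its own, but the second is rejected because the state already carries the first. `total_account_value` stays fixed within the cycle because a buy moves value from cash into position cost.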

Phantom-Exposure Guard

Open exposure is computed only for rows where order_status IN ('pending', 'filled') AND kalshi_order_id IS NOT NULL. Rows where placement crashed or the order was cancelled are excluded, so a failed placement cannot leave phantom exposure that would inflate measured exposure and trip the position caps spuriously.
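
The filter can be demonstrated against an in-memory SQLite table. The table name `trades` and the `series` and `cost` columns are assumptions for this sketch; only `order_status` and `kalshi_order_id` come from the text above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE trades ("
    "series TEXT, cost REAL, order_status TEXT, kalshi_order_id TEXT)"
)
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?, ?)",
    [
        ("KXWTI", 40.0, "filled", "ord-1"),
        ("KXWTI", 25.0, "pending", "ord-2"),
        ("KXWTI", 30.0, "pending", None),        # placement crashed: no order id
        ("KXGOLDD", 10.0, "cancelled", "ord-3"),  # cancelled: not open
    ],
)


def open_exposure(conn):
    """Sum cost over rows that correspond to a real, still-open Kalshi order."""
    row = conn.execute(
        "SELECT COALESCE(SUM(cost), 0.0) FROM trades "
        "WHERE order_status IN ('pending', 'filled') "
        "AND kalshi_order_id IS NOT NULL"
    ).fetchone()
    return row[0]


exposure = open_exposure(conn)
```

Only the first two rows count, so the crashed placement contributes nothing to the exposure caps.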

Order Type Restriction

All live orders are limit orders, with the price clamped to 1–99¢. Market orders are not supported by design: a stale or buggy signal produces an unfilled resting order rather than sweeping the book at an adverse price.
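
The clamp itself is a one-liner. In the sketch below the function name is hypothetical, and rounding to the nearest cent is an assumption about how the limit price is derived.

```python
def clamp_limit_price_cents(price_dollars):
    """Convert a dollar price to whole cents and clamp it to the 1-99 range."""
    cents = round(price_dollars * 100)
    return max(1, min(99, cents))


normal = clamp_limit_price_cents(0.37)    # ordinary quote
too_low = clamp_limit_price_cents(0.004)  # rounds to 0, clamped up
too_high = clamp_limit_price_cents(1.20)  # clamped down
```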

§ 22Model Limitations and Assumptions

Log-Normal Dynamics

The model assumes continuous log-normal price paths. Real commodity prices exhibit:

Jump risk: sudden gap moves from OPEC announcements, weather events, geopolitical shocks
Fat tails: extreme moves occur more frequently than a normal distribution of log-returns predicts
Mean reversion: energy prices tend to revert to the long-run equilibrium cost of production

Constant Forward Price

The forward price $F$ is interpolated from today's futures strip and held constant throughout the decay curve. In practice, $F$ shifts as new information arrives.

Constant Volatility

A single annualised $\sigma$ is used throughout the remaining contract life. Real markets exhibit a volatility term structure and a volatility smile. The engine uses a single number — adequate for ballpark fair-value estimation, not for precision near-the-money pricing.

Risk-Neutral vs. Real-World Measure

The engine operates under $\mathbb{Q}$, using the futures price as the expected terminal value. Risk premiums (e.g., the crude oil risk premium embedded in backwardation) are implicitly absorbed into the futures price but not separately modelled.

Short-Term Forward Price Accuracy

For very short-dated contracts, the front-month futures price is a reasonable proxy for the eventual settlement price, but intraday basis can be significant. A more precise implementation would use the specific expiry-month contract's last traded price.

§ 23Glossary

Term	Definition
$F$	Forward price — futures price interpolated to the settlement month
$K$	Strike price — target price level in the Kalshi contract (e.g. $80)
$\sigma$	Annualised volatility of log-returns
$\sigma_T$	Total volatility over remaining life: $\sigma\sqrt{T}$
$T$	Time to expiry in years: $\tau/365$ (long-term) or $h/(365 \times 23)$ (short-term)
$d_2$	Standardised distance from strike: $(\ln(F/K) - \tfrac{1}{2}\sigma_T^2)/\sigma_T$
$N(\cdot)$	Standard normal CDF
$p_{\text{base}}$	Log-normal probability from the distribution fit
$p_{\text{tv}}$	Time-value point estimate (same formula at today's DTE)
$\hat{p}$	Final YES fair value — the engine's $P(\text{target met})$, equal to $p_{\text{tv}}$
$p_{\text{YES,mkt}}$	YES mid-price from Kalshi: (yes_bid + yes_ask) / 2
$p_{\text{NO,mkt}}$	NO mid-price from Kalshi: (no_bid + no_ask) / 2 — independently quoted, not 1 − YES mid
edge	$\hat{p} - p_{\text{YES,mkt}}$ for YES; $(1-\hat{p}) - p_{\text{NO,mkt}}$ for NO
DTE	Days to expiry (whole calendar days)
TTX	Time to expiry in hours/minutes (short-term display)
ITM	In-the-money: $F > K$ for an "above" contract
OTM	Out-of-the-money: $F < K$ for an "above" contract
ATM	At-the-money: $F \approx K$
$\mathbb{Q}$	Risk-neutral probability measure; futures price = $\mathbb{E}^{\mathbb{Q}}[S_T]$
Futures strip	Set of futures prices across all liquid delivery months
Log-normal	Distribution where $\ln X \sim \mathcal{N}(\mu, \sigma^2)$; guarantees $X > 0$
Cash-or-nothing digital	Option paying a fixed cash amount if the underlying exceeds the strike; equivalent to the YES payout of a Kalshi "above" contract
Brier Score (BS)	Mean squared error of probability forecasts vs. binary outcomes; range [0, 1], lower is better
Brier Skill Score (BSS)	$1 - \text{BS}_\text{model}/\text{BS}_\text{market}$; positive means model beats market price as a probability estimate
Calibration	When the model says $p\%$, the event occurs $p\%$ of the time across many predictions
MAE	Mean absolute error: average $|\hat{p} - o|$ across resolved contracts
Bias	Mean signed error $(\hat{p} - o)$; positive = model over-estimates probability
Edge accuracy	Fraction of runs where edge direction matched the actual outcome
raw_edge	Unadjusted edge signal directly from the pricing engine
adjusted_edge	$\alpha + \beta \cdot \text{raw\_edge}$ — scaled by OLS regression coefficients
$\alpha$ (intercept)	Expected realised PnL when raw_edge = 0; captures systematic model bias
$\beta$ (slope)	Additional realised PnL per unit of raw_edge; $\beta = 1$ means raw_edge is perfectly calibrated
pyth_spot	Settlement kind for KXGOLDD/KXSILVERD: settles on Pyth Network XAU/USD or XAG/USD, not COMEX futures
spot_basis	Pyth spot − futures front-month price, stored in `distribution_params` for pyth_spot series
$\theta_0$ (global threshold)	Unoverridden `config.min_adjusted_edge` (default 0.072); the auto-buy/clear-buy split sits at $2\theta_0$
Shadow mode	Agent mode where decisions are logged as if trades were placed but no order is submitted to Kalshi
Kelly fraction	Optimal bet size as fraction of bankroll: $f^* = \text{edge}/(1-p_{\text{mkt}})$; half-Kelly ($\lambda=0.5$) used in practice
Segment override	Per-segment adjustment to α, β, or θ written by periodic review; takes effect next cycle
Clear-buy	Contract classified at adjusted_edge ≥ 2θ; same execution as auto-buy, label preserved in analytics
TTX bucket	Time-to-expiry range bucket for short-term contracts (<26h): e.g. "4-8h"
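
Several glossary formulas can be checked numerically with the standard library alone. The sketch below is illustrative only: the function names are hypothetical, the inputs are made-up numbers, and flooring the Kelly size at zero is an assumption.

```python
import math


def norm_cdf(x):
    """Standard normal CDF N(x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def prob_above(F, K, sigma, T):
    """P(S_T > K) = N(d2), with d2 = (ln(F/K) - 0.5 * sigma_T^2) / sigma_T."""
    sigma_T = sigma * math.sqrt(T)
    d2 = (math.log(F / K) - 0.5 * sigma_T ** 2) / sigma_T
    return norm_cdf(d2)


def half_kelly(p_hat, p_mkt, lam=0.5):
    """lam * edge / (1 - p_mkt), floored at zero (no bet on negative edge)."""
    edge = p_hat - p_mkt
    return max(0.0, lam * edge / (1.0 - p_mkt))


def brier(forecasts, outcomes):
    """Mean squared error of probability forecasts against 0/1 outcomes."""
    return sum((p - o) ** 2 for p, o in zip(forecasts, outcomes)) / len(outcomes)


def brier_skill(model, market, outcomes):
    """BSS = 1 - BS_model / BS_market."""
    return 1.0 - brier(model, outcomes) / brier(market, outcomes)


# Made-up inputs: F = 82, K = 80, sigma = 30%, 30 calendar days to expiry.
p_hat = prob_above(F=82.0, K=80.0, sigma=0.30, T=30 / 365)
stake = half_kelly(p_hat=0.60, p_mkt=0.50)
bss = brier_skill(model=[0.8, 0.2], market=[0.5, 0.5], outcomes=[1, 0])
```

With $F = K$, $d_2 = -\sigma_T/2$ is negative, so the at-the-money fair value sits slightly below 0.5.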