Kalshi Weather Bets

A plain-English look at every bet, what went wrong, and why

Data as of Feb 13, 2026 ~4:40 PM CT — DAY IS STILL IN PROGRESS. Temps still being recorded. Highs may change. Lows won't settle until overnight.

The Big Picture

Total Money Spent

$16.82

20 orders placed across 2 days

Orders That Filled

12 / 20

8 orders are still "resting" (never matched)

Money at Risk (filled)

$13.89

Only filled orders cost real money

Estimated P&L

-$7.89

1 confirmed win, 1 likely win, 10 dead or dying

How these bets work: You're betting on tomorrow's temperature in different cities. Each bet costs some cents (like 30 cents). If you're right, you get $1.00 back. If you're wrong, you lose what you paid. So a 30-cent bet can win you 70 cents profit, or lose you 30 cents. The trick is picking bets where you're right more often than the price suggests.
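The payoff arithmetic above can be sketched in a few lines. This is a minimal illustration that assumes a $1.00 payout per contract and ignores any exchange fees; the function names are made up for this example and are not from agent.py or auto_trade.py.

```python
def payoff_cents(price_cents: int) -> tuple:
    """Return (profit_if_right, loss_if_wrong) in cents for one $1.00 contract."""
    if not 0 < price_cents < 100:
        raise ValueError("price must be between 1 and 99 cents")
    return 100 - price_cents, price_cents


def breakeven_probability(price_cents: int) -> float:
    """Win rate needed to break even: the price as a fraction of the $1.00 payout."""
    return price_cents / 100


profit, loss = payoff_cents(30)
print(profit, loss)               # 70 30
print(breakeven_probability(30))  # 0.3
```

A 30-cent bet only makes money over time if it wins more than 30% of the time, which is what "right more often than the price suggests" means.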

Scorecard

Confirmed Wins

NYC low landed at 27°F (needed 27-28°F)

Likely Win

DEN high at 46°F so far (needed ≤54°F)

Dead (lows already below)

Low already recorded below target -- can't come back up

Very Unlikely (highs)

High bets still technically open but way off

Who Placed These Bets?

Important discovery: We built a math-based auto-trader (auto_trade.py) to replace Claude AI. But it was committed to git at 10:03 PM on Feb 12 and has never run live. Every single live trade was placed by Claude AI (agent.py) making its own decisions. The auto-trader only ran dry-run tests.

Claude Run 1 (1:34 - 4:08 PM)

7 orders

Trades #3-9. Claude looked at market data and picked bets using gut feeling. No probability model. No EV calculations. No forecast stored. Picked all "B" (between) bets -- narrow 2-degree windows.
3 filled, 4 resting.

Claude Run 2 (8:32 PM)

6 orders

Trades #17-22. Claude had the math model's probabilities as reference, but provided its own inflated estimates (P=0.60-0.70). The math model would have said P~0.20 for these "B" bets. Claude was overconfident.
4 filled, 2 resting.

Claude Run 3 (9:50 PM)

7 orders

Trades #30-36. Claude placed "T" (threshold) bets for the first time. Included a duplicate bet (#35 = same as #9) and two bets with buggy EV logging (showed negative EV but were actually positive -- the logging formula was wrong for NO-side bets).
5 filled, 2 resting.

The auto-trader (auto_trade.py) has never placed a real bet. It was built, tested in dry-run mode (which found correct probabilities like P=0.20 for "B" bets), committed to git, and configured for the nightly cron. Its first live run will be tonight at 8 PM CT. But the SDs (forecast error assumptions) still need to be fixed before it runs.

Why We're Losing: The Forecasts Were Wrong

Here's the #1 problem: We bet on temperatures based on what the weather forecast said. The current NWS forecast says Chicago's high will be 52°F today. So far it has only reached 45°F, and although the day isn't over, the forecast shows temps declining from here. We don't know exactly what the forecast said last night when the bets were placed -- the forecast wasn't stored in the database. Claude chose to bet on the 54-55°F window (B54.5), but that doesn't mean the forecast was 54°F. Either way, the forecast was significantly off from reality.

Forecast vs. Reality (High Temperatures)

City	Forecast	Observed So Far	Off By
Chicago (high)	52°F (current forecast)	45°F	7°F+ so far (day not over)
Austin (high)	81°F	72°F	9°F+ so far
Los Angeles (high)	68°F	61°F	7°F so far
Denver (high)	55°F	46°F	9°F so far
Miami (high)	78°F	75°F	3°F so far
New York (low)	~28°F	22°F	6°F so far

Why this matters so much: Our model assumed weather forecasts are usually only off by about 3-4 degrees. But the current NWS forecast says 52°F for Chicago and the actual high so far is only 45°F (day still in progress). The model was too trusting of the forecast. When the forecast is wrong by more than 3-4 degrees, any "between" bet (narrow 2-degree window) becomes almost impossible to win.

How the Model Works (No BS Version)

You asked "what the hell does P=0.20 mean?" -- here's the answer, step by step, using a real bet.

Step 1: Get the weather forecast

Our bot asks the National Weather Service (NWS) "what will the temperature be in Chicago tomorrow?" NWS sends back an hour-by-hour forecast. We look at all 24 hours and find the highest predicted temperature.

Problem: we didn't save the forecast. The database has NULL for the forecast columns on CHI trades. The current NWS forecast says 52°F, but forecasts update every few hours. Last night when the bets were placed, it may have said something different. We just don't know.

Claude chose to bet on the 54-55°F window (B54.5). That's the bet window, not necessarily what the forecast said. For this example, let's use 52°F (the current forecast) as the starting point.
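Step 1, plus the missing "save the forecast" piece, might look roughly like this. It is a sketch under assumptions: the hourly periods are made-up sample data shaped like the `startTime` / `temperature` fields of an NWS hourly forecast, and the `forecast_snapshots` table and both function names are hypothetical, not the bot's real schema.

```python
import sqlite3
from datetime import date, datetime, timezone
from typing import Optional


def forecast_high(periods: list, target_day: date) -> Optional[int]:
    """Highest hourly temperature forecast for target_day, or None if no hours match."""
    temps = [
        p["temperature"]
        for p in periods
        if datetime.fromisoformat(p["startTime"]).date() == target_day
    ]
    return max(temps) if temps else None


def save_snapshot(conn: sqlite3.Connection, city: str, target_day: date, high_f: int) -> None:
    """Store what the forecast said at bet time so it can be checked later."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS forecast_snapshots "
        "(city TEXT, target_day TEXT, high_f INTEGER, fetched_at TEXT)"
    )
    conn.execute(
        "INSERT INTO forecast_snapshots VALUES (?, ?, ?, ?)",
        (city, target_day.isoformat(), high_f, datetime.now(timezone.utc).isoformat()),
    )
    conn.commit()


# Made-up sample hours (not a real NWS response).
sample_periods = [
    {"startTime": "2026-02-13T09:00:00-06:00", "temperature": 41},
    {"startTime": "2026-02-13T14:00:00-06:00", "temperature": 52},
    {"startTime": "2026-02-13T19:00:00-06:00", "temperature": 46},
    {"startTime": "2026-02-14T02:00:00-06:00", "temperature": 33},
]

conn = sqlite3.connect(":memory:")
high = forecast_high(sample_periods, date(2026, 2, 13))
save_snapshot(conn, "CHI", date(2026, 2, 13), high)
print(high)  # 52
```

With a snapshot row per bet, the "what did the forecast say last night?" question would have an answer.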

Step 2: The key question -- how much can the forecast be wrong?

Weather forecasts aren't perfect. Sometimes NWS says 52°F and the real temp ends up being 48°F or 56°F.

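Our model says: "For Chicago in winter, the forecast is usually off by about 4 degrees." That 4-degree number is called "SD" (standard deviation). Think of it as the "typical wrongness" of the forecast: under the model's bell curve, the real temp lands within 4 degrees of the forecast about two-thirds of the time.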

This turned out to be way too small. The forecast says 52°F but the actual high so far is only 45°F (day still in progress). That's already 7+ degrees off, and the bet needs 54-55°F which is even further away.
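To put a number on "way too small": if forecast errors really followed a bell curve with SD = 4, a miss of 7 degrees or more should be rare. The sketch below assumes errors are normally distributed and centered on the forecast; the function name is made up for illustration.

```python
import math


def chance_of_miss_at_least(miss_f: float, sd_f: float) -> float:
    """P(|actual - forecast| >= miss_f) if errors follow Normal(0, sd_f)."""
    return math.erfc(miss_f / (sd_f * math.sqrt(2.0)))


print(round(chance_of_miss_at_least(7, 4.0), 2))  # about 0.08
print(round(chance_of_miss_at_least(7, 8.0), 2))  # about 0.38
```

With SD = 4 the model expects a 7-degree miss roughly 1 day in 12. With SD = 8 it expects one more than a third of the time, which is closer to what today's numbers look like.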

Step 3: Draw the "spread" of possible temperatures

Imagine the forecast says 52°F. The model draws a spread of possible actual temperatures centered on 52.

  Very                                                     Very
unlikely                 Most likely                   unlikely
   |                         |                             |
   |            . . . . . . . . . . . . . .                |
   |        .                                 .            |
   |     .                                       .         |
   |   .                                           .       |
   | .                                               .     |
   .                                                   .   |
  -+------+------+------+------+------+------+------+------+-
  36    40     44     48    [52]    56     60     64    68
                                ^
                           NWS forecast
                         (center of curve)

  |--4 deg--|         "SD" = how spread out it is

The taller the curve at a temperature, the more likely that temperature is.
Most of the weight is near 52°F. Temps like 36°F or 68°F are "almost impossible" according to this model.
Chicago's high so far is only 45°F (day still going) -- already well below the forecast center.

Step 4: What does "P=0.20" actually mean?

P = Probability = "out of 100 times, how many times would this happen?"

For the bet CHI B54.5 ("Chicago's high will be exactly 54° or 55°F"):
The model looks at its curve and asks: "what percentage of the curve falls between 54° and 55°?"

         . . . . . . . . . . . . . .
     .                |||              .
   .                  |||                .
  .                   |||                  .
 .                    |||                    .
                      |||
  ------+------+------+--+------+------
       44     48    [52]54  55    58     62

        This marked slice (|||) = ~15-20% of the total area
        (slightly off-center from the forecast)

P=0.20 means: "If we ran this experiment 100 times, the temperature would land between 54-55 about 20 times."

That's it. It's just saying "there's about a 20% chance the temperature lands in that exact 2-degree window."
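The "slice of the curve" can be computed directly. This sketch assumes a normal curve centered on the forecast and pads the window by half a degree on each side because settled temps are whole degrees; both are assumptions for illustration, and auto_trade.py may do it differently.

```python
import math


def normal_cdf(x: float, mean: float, sd: float) -> float:
    """Share of a Normal(mean, sd) curve that lies at or below x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))


def window_probability(low_f: int, high_f: int, forecast_f: float, sd_f: float) -> float:
    """Chance the rounded temp lands in [low_f, high_f], e.g. 54-55 for B54.5."""
    return normal_cdf(high_f + 0.5, forecast_f, sd_f) - normal_cdf(low_f - 0.5, forecast_f, sd_f)


print(round(window_probability(54, 55, 52.0, 4.0), 2))  # about 0.16 (forecast 52, window off-center)
print(round(window_probability(54, 55, 54.5, 4.0), 2))  # about 0.20 (forecast dead-center on the window)
```

Even when the forecast sits exactly in the middle of the window, a 2-degree "between" bet only lands about 1 time in 5 at SD = 4.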

The problem: Claude looked at this 20% number and said "nah, I think it's more like 70%." Claude was wrong. The math was closer to right (although both were wrong because the forecast itself was garbage).

Step 5: Is the bet worth it? (EV = Expected Value)

EV tells you: "if I made this exact bet 100 times, would I make money or lose money on average?"

Here's the formula in plain English for the CHI B54.5 bet:

The bet costs 30 cents. If you win, you get $1.00 back (70 cents profit).

Using the model's P=0.20 (20% chance):
• 20 times out of 100: you WIN 70 cents = $14.00 total winnings
• 80 times out of 100: you LOSE 30 cents = $24.00 total losses
• After 100 bets: you're down $10.00, or -10 cents per bet

EV = -10 cents. This is a BAD bet. You'd lose money over time.

Using Claude's P=0.70 (70% chance):
• 70 times out of 100: you WIN 70 cents = $49.00 total winnings
• 30 times out of 100: you LOSE 30 cents = $9.00 total losses
• After 100 bets: you're up $40.00, or +40 cents per bet

Claude's EV = +40 cents. This LOOKS great -- but only if 70% was right. It wasn't.

The high so far is 45°F (day not over). 54-55°F looks very unlikely. If it doesn't hit, we lose 30 cents per contract.
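The same EV arithmetic as a function, for one contract with a $1.00 payout. This is a minimal sketch that ignores fees; `ev_cents` is a name made up for this example.

```python
def ev_cents(p_win: float, price_cents: float) -> float:
    """Average profit per contract in cents: win (100 - price) or lose the price."""
    return p_win * (100 - price_cents) - (1 - p_win) * price_cents


print(round(ev_cents(0.20, 30)))  # -10  (the model's number: skip this bet)
print(round(ev_cents(0.70, 30)))  # 40   (Claude's number: looks great on paper)
```

The formula simplifies to 100 × P minus the price, so a bet is only worth taking when the probability (in percent) is higher than the price (in cents).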

Step 6: Why did Claude override the model?

This is the core problem. The system was set up to let Claude (an AI) use the model's probability as a suggestion, but Claude could replace it with its own number. Claude chose the 54-55°F window and said "70% chance it lands there." The math model would have said ~15-20%. Claude was being overconfident -- it didn't properly account for how wrong forecasts can be.

The auto-trader (auto_trade.py) doesn't have this problem. It only uses the math model's probability. If the model says P=0.20 and the price is 30 cents, the EV is negative, and the auto-trader skips the bet entirely. That's what should have happened here.

Bottom line: The model said "don't bet." Claude said "bet anyway." Claude was wrong.

Quick Reference

Term	What It Actually Means
P=0.20	20% chance of happening. Out of 100 tries, it would happen about 20 times.
EV = +5c	If you made this bet over and over, you'd average 5 cents profit each time. Worth it.
EV = -10c	You'd average 10 cents loss each time. Don't take this bet.
SD = 4.0	The forecast is usually off by about 4 degrees. Our "spread" is 4 degrees wide.
B54.5	"Between" bet. Temp must land in a 2-degree window (54-55°F). Hard to hit.
T55	"Threshold" bet. Temp must be above 55 or below 55 (depends on which side). Easier to hit.
YES @ 30c	You're betting "yes this will happen" and paying 30 cents. Win = $1.00 back (70c profit).
NO @ 40c	You're betting "no this won't happen" and paying 40 cents. Win = $1.00 back (60c profit).
NWS	National Weather Service. The government weather forecast. Free, but not always right.

Every Single Bet, Explained

Feb 12 — 1 bet placed

City	The Bet	In Plain English	Cost	Result	P&L
NYC	KXLOWTNYC-B27.5 YES @ 65c (Claude Run 1)	"NYC's coldest temp tonight will be between 27° and 28°F." Actual low: 27°F. It hit! Claude picked this with no math. Got lucky.	$0.65	WIN	+$0.35

This won, but barely worth it. You paid 65 cents to win 35 cents. That's like paying $6.50 for a chance to win $3.50. You need to be right almost 2 out of 3 times just to break even at that price. The bet was overpriced.

Feb 13 — 11 filled bets (real money), 8 resting (not matched)

City	The Bet	In Plain English	Cost	Status
DEN	KXHIGHDEN-T55 YES @ 18c × 5 (Claude Run 3)	"Denver's high will be 54° or colder." High so far: 46°F (day not over). Way under 54°. This is almost certainly a winner. Claude's best bet. Ironically, the forecast being wrong helped here.	$0.90	Likely Win +$4.10 profit if yes
DEN	KXHIGHDEN-B59.5 YES @ 18c × 3 (Claude Run 1)	"Denver's high will be exactly 59° or 60°F." High so far: 46°F (day not over). Off by 13+ degrees. Not even close. Claude picked this with zero math.	$0.54	Very Unlikely -$0.54
CHI	KXHIGHCHI-B54.5 YES @ 30c × 5 (Claude Run 2)	"Chicago's high will be exactly 54° or 55°F." High so far: 45°F (day not over). Needs to climb 9+ more degrees. Claude said 70% chance. The math model said ~15-20%. Looking very unlikely. Claude overrode the model. 3rd most expensive bet.	$1.50	Very Unlikely -$1.50
CHI	KXLOWTCHI-B28.5 YES @ 27c × 5 (Claude Run 2)	"Chicago's low will be exactly 28° or 29°F." Low already hit 27°F. Once a low is recorded, it can only stay or go lower -- never back up. Daily low will be 27°F or colder. Can't land in 28-29 range.	$1.35	Dead -$1.35
CHI	KXLOWTCHI-T29 YES @ 17c × 5 (Claude Run 3)	"Chicago's low will be 30° or warmer." Low already hit 27°F. Lows don't go back up. Daily low will be ≤27°F. Contradicts the bet above! Run 2 bet "28-29°F" and Run 3 bet "30°F+." Both are dead. Both lose. $2.20 gone on contradictory bets.	$0.85	Dead -$0.85
LAX	KXHIGHLAX-B68.5 YES @ 51c × 5 (Claude Run 2)	"LA's high will be exactly 68° or 69°F." High so far: 61°F (day not over). Off by 7+ degrees. Biggest single bet. $2.55 gone. Claude paid 51c/contract for a 2-degree window. Way too expensive.	$2.55	Very Unlikely -$2.55
LAX	KXLOWTLAX-T50 YES @ 29c × 5 (Claude Run 3)	"LA's low will be 51° or warmer." Low already hit 48°F. Lows don't go back up. Daily low will be ≤48°F. Needed 51°F+ but already 3 degrees below that. Gone.	$1.45	Dead -$1.45
NYC	KXLOWTNYC-B23.5 YES @ 38c × 5 (Claude Run 2)	"NYC's low will be exactly 23° or 24°F." Low already hit 22°F. Lows don't go back up. Daily low will be ≤22°F. Needed 23-24°F but already 1 degree below. Close miss but dead.	$1.90	Dead -$1.90
AUS	KXLOWTAUS-B59.5 YES @ 20c × 3 (Claude Run 1)	"Austin's low will be exactly 59° or 60°F." Low already hit 54°F. Lows don't go back up. Daily low will be ≤54°F. Duplicate: Run 3 bet on the same contract again below.	$0.60	Dead -$0.60
AUS	KXLOWTAUS-B59.5 YES @ 15c × 5 (DUPLICATE, Claude Run 3)	Same exact bet as above. Run 1 placed it at 4:08 PM, Run 3 placed it again at 9:50 PM. Claude didn't check its own previous bets. Both dead together.	$0.75	Dead -$0.75
AUS	KXLOWTAUS-T60 YES @ 17c × 5 (Claude Run 3)	"Austin's low will be 61° or warmer." Low already hit 54°F. Lows don't go back up. Daily low will be ≤54°F. 3rd Austin low bet! Run 1 + Run 3 combined for 3 bets on Austin's low = $2.20 gone.	$0.85	Dead -$0.85
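The "lows don't go back up" rule used throughout the table can be written as a small check. This is an illustrative sketch for YES bets on a daily low; the function name and the `None` convention for an open-ended side are assumptions made for this example.

```python
from typing import Optional


def low_bet_status(low_so_far: int, window_min: Optional[int], window_max: Optional[int]) -> str:
    """Status of a YES bet that the final daily low lands in [window_min, window_max].

    None means that side is open-ended. The final low can only be at or
    below low_so_far, so falling under window_min is permanent.
    """
    if window_min is not None and low_so_far < window_min:
        return "dead"
    if window_min is None and window_max is not None and low_so_far <= window_max:
        return "locked in"  # an "X or colder" bet that has already been met
    return "open"


print(low_bet_status(27, 28, 29))    # dead  (CHI B28.5)
print(low_bet_status(27, 30, None))  # dead  (CHI T29, "30 or warmer")
print(low_bet_status(61, 61, 62))    # open  (MIA B61.5, still in range)
```

High bets work the same way in mirror image: the high can only go up, so a "between" or "X or colder" bet dies once the recorded high passes the top of its window.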

Resting Orders (Never Filled — No Money Lost)

What's a resting order? When you place a bet, sometimes nobody wants to take the other side at your price. Your order just sits there. You don't pay anything until someone matches you. These 8 orders never matched, so they didn't cost you anything. Some of them were bad bets though -- if they HAD filled, you'd be losing even more.

City	The Bet	Why It's Interesting	Would-be Cost	Status
MIA	KXHIGHMIA-T80 NO @ 57c × 5 (Claude Run 3)	Logged as EV=-32c but that's a bug -- the EV formula is wrong for NO-side bets in agent.py. Actual EV was +46c. Still, Claude placed it without checking. Would have cost $2.15. Didn't fill.	$2.15	Not Filled
NYC	KXLOWTNYC-B23.5 NO @ 38c × 1 (Claude Run 3)	Logged as EV=-40c (same logging bug for NO-side). Actual EV was +28c. But this bets AGAINST the NYC B23.5 YES bet from Run 2 -- contradictory.	$0.62	Not Filled
AUS	KXHIGHAUS-B80.5 YES @ 35c × 4 (Claude Run 2)	"Austin high = exactly 80-81°F." High so far: 72°F. Very unlikely to hit.	$1.40	Not Filled
CHI	KXHIGHCHI-B54.5 YES @ 26c × 3 (Claude Run 1)	Same contract as Run 2's filled CHI B54.5 bet. Would have added more losses.	$0.78	Not Filled
LAX	KXHIGHLAX-B68.5 YES @ 35c × 3 (Claude Run 1)	Same contract as Run 2's filled LAX B68.5 bet. Would have added more losses.	$1.05	Not Filled
MIA	KXHIGHMIA-B79.5 YES @ 44c × 2 (Claude Run 1)	"Miami high = 79-80°F." High so far: 75°F. Day not over but unlikely to hit.	$0.88	Not Filled
MIA	KXLOWTMIA-B61.5 YES @ 36c × 5 (Claude Run 2)	"Miami low = 61-62°F." Low so far: 61°F. Could still win if low stays in range!	$1.80	Not Filled
NYC	KXLOWTNYC-B23.5 YES @ 34c × 3 (Claude Run 1)	Earlier attempt at same NYC low bet (Run 2 got it filled at 38c). Would have lost.	$1.02	Not Filled

Estimated P&L Summary

Category	Amount
NYC Feb 12 win	+$0.35
DEN T55 (likely win, 5 contracts)	+$4.10
3 very unlikely losses (high bets)	-$4.59
7 dead losses (low bets already below target)	-$7.75
Estimated Total	-$7.89
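The estimated total can be re-derived from the filled-bet rows. This sketch uses the contract counts and prices from the tables above and makes the same assumption as the estimate: DEN T55 wins, and every "dead" or "very unlikely" bet loses. Amounts are in whole cents to avoid rounding drift.

```python
# (label, contracts, price_cents, assumed_to_win)
FILLED_BETS = [
    ("NYC low B27.5 (Feb 12)", 1, 65, True),
    ("DEN high T55", 5, 18, True),   # assumed win, not settled yet
    ("DEN high B59.5", 3, 18, False),
    ("CHI high B54.5", 5, 30, False),
    ("CHI low B28.5", 5, 27, False),
    ("CHI low T29", 5, 17, False),
    ("LAX high B68.5", 5, 51, False),
    ("LAX low T50", 5, 29, False),
    ("NYC low B23.5", 5, 38, False),
    ("AUS low B59.5", 3, 20, False),
    ("AUS low B59.5 (duplicate)", 5, 15, False),
    ("AUS low T60", 5, 17, False),
]


def pnl_cents(rows) -> int:
    """Sum profit on assumed wins and cost on assumed losses, in cents."""
    total = 0
    for _label, contracts, price, wins in rows:
        total += contracts * (100 - price) if wins else -contracts * price
    return total


print(pnl_cents(FILLED_BETS) / 100)  # -7.89
```

If DEN T55 somehow lost too, the same tally would drop by $5.00 (the $4.10 profit disappears and the $0.90 cost becomes a loss).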

What Went Wrong (The 6 Problems)

1. The weather forecast was way off. The NWS (National Weather Service) forecasts ran 3 to 9 degrees too warm so far in every city we tracked. Our bot trusted those forecasts too much. When the forecast is wrong, every bet based on it is wrong too. Think of it like this: if someone told you "it's going to be 52 degrees" and you bet money on a narrow window of 54-55, but the high so far is only 45 degrees -- you'd be in trouble.
2. Claude ignored the math and was overconfident. The math model calculated ~15-20% chance for the Chicago B54.5 bet. Claude overrode that and said 70%. Claude was shown the model's numbers but decided it knew better. It didn't. Every single live trade was placed by Claude's judgment, not the auto-trader's math.
3. "Between" bets are like hitting a bullseye. Bets like "the high will be EXACTLY 54 or 55 degrees" require the temperature to land in a tiny 2-degree window. That's really hard. Even if the forecast is right, the temp could easily be 52 or 56 instead. These are basically lottery tickets, but Claude treated them like safe bets.
4. Too many bets on the same city. Austin had 3 bets all depending on the low temperature: two bets on "exactly 59-60°F" (duplicate!) and one on "61°F or warmer." When Austin's low came in at 54°F, all three lost. That's $2.20 wiped out because of one wrong number. Don't put all your eggs in one basket.
5. Some bets contradicted each other. In Chicago, we bet on "low is 28-29°F" AND "low is 30°F or warmer." These can't both be true! If the low is 28, the first wins but the second loses. If the low is 30, the second wins but the first loses. We're guaranteed to lose at least one.
6. EV logging was broken for NO-side bets. The MIA T80 NO and NYC B23.5 NO bets showed negative EV in the logs (-32c, -40c) but those numbers were wrong. There's a bug in agent.py's EV formula -- it doesn't flip the probability for NO-side bets. The actual EV was positive (+46c, +28c). The bets weren't as bad as they looked, but the bug means we can't trust the logged EV for any NO-side trade.
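Problem 6 is easiest to see side by side. The sketch below shows one plausible form of the bug, not the actual agent.py code (which isn't reproduced in this report). The YES probability of 0.11 is an assumption back-solved for illustration, not a value from the logs; the 43-cent NO cost comes from the $2.15 would-be cost for 5 contracts on MIA T80.

```python
def ev_cents_buggy(p_yes: float, side: str, cost_cents: float) -> float:
    """Hypothetical buggy version: always uses the YES probability."""
    return 100 * p_yes - cost_cents


def ev_cents_fixed(p_yes: float, side: str, cost_cents: float) -> float:
    """A NO bet wins when the event does NOT happen, so flip the probability."""
    p_win = p_yes if side == "YES" else 1.0 - p_yes
    return 100 * p_win - cost_cents


# Assumed inputs for the MIA T80 NO order (p_yes is back-solved, see above).
print(round(ev_cents_buggy(0.11, "NO", 43)))  # -32
print(round(ev_cents_fixed(0.11, "NO", 43)))  # 46
```

Under those assumptions the unflipped formula gives the -32c seen in the log and the flipped one gives +46c, which is at least consistent with the "doesn't flip the probability" diagnosis.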

What Needs to Change

Before placing any more bets, these fixes are needed:

1. Increase the error margins (SDs). Change the assumed forecast error from 3-4°F to 6-10°F. This makes the model less confident, so it only bets when the price is really cheap relative to the probability. Fewer bets, but smarter ones.
2. Add deduplication. Check if we already have a bet on the same contract before placing another one. No more double-betting on AUS B59.5.
3. Limit bets per city. Max 1-2 bets per city per day. Don't stack 3 bets on Austin's low temperature.
4. Block contradictory bets. Don't bet YES on "low is 28-29" and also YES on "low is 30+." Pick one direction.
5. Block negative-EV bets. The code already calculates EV. Just don't submit orders where EV < 0. Fix the NO-side EV formula first, though, or this check will throw out good NO bets.
6. Cross-check forecasts. Use multiple forecast sources (NWS + another API) and take the average. If two forecasts disagree by more than 5°F, skip that city entirely.
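Fixes 2 through 6 are all pre-trade checks, so they could live in one gate that runs before any order is submitted. The sketch below is one way to do it, not the auto_trade.py implementation: the `Bet` class, the constants, and the function names are hypothetical, every bet is assumed to be YES-side, and the EV values in the example are made up.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_BETS_PER_CITY = 2      # fix 3: "max 1-2 bets per city per day"
MAX_FORECAST_GAP_F = 5.0   # fix 6: skip the city if two sources disagree by more


@dataclass(frozen=True)
class Bet:
    city: str
    measure: str                # "high" or "low"
    ticker: str
    window_min: Optional[int]   # None = open-ended on that side
    window_max: Optional[int]
    ev_cents: float


def _overlaps(a: Bet, b: Bet) -> bool:
    """True if the two temperature windows share at least one degree."""
    a_lo = float("-inf") if a.window_min is None else a.window_min
    a_hi = float("inf") if a.window_max is None else a.window_max
    b_lo = float("-inf") if b.window_min is None else b.window_min
    b_hi = float("inf") if b.window_max is None else b.window_max
    return a_lo <= b_hi and b_lo <= a_hi


def rejection_reason(new: Bet, open_bets: List[Bet]) -> Optional[str]:
    """Return why the bet should be skipped, or None if it passes every check."""
    if new.ev_cents < 0:
        return "negative EV"
    if any(b.ticker == new.ticker for b in open_bets):
        return "duplicate contract"
    same_city = [b for b in open_bets if b.city == new.city]
    if len(same_city) >= MAX_BETS_PER_CITY:
        return "city limit reached"
    if any(b.measure == new.measure and not _overlaps(b, new) for b in same_city):
        return "contradicts an open bet"
    return None


def forecasts_agree(nws_f: float, other_f: float) -> bool:
    """Fix 6: only trade a city when two forecast sources roughly agree."""
    return abs(nws_f - other_f) <= MAX_FORECAST_GAP_F


open_bets = [
    Bet("CHI", "low", "KXLOWTCHI-B28.5", 28, 29, 5.0),
    Bet("AUS", "low", "KXLOWTAUS-B59.5", 59, 60, 5.0),
]
print(rejection_reason(Bet("CHI", "low", "KXLOWTCHI-T29", 30, None, 5.0), open_bets))
# contradicts an open bet
print(rejection_reason(Bet("AUS", "low", "KXLOWTAUS-B59.5", 59, 60, 5.0), open_bets))
# duplicate contract
```

Returning a reason string instead of a bare True/False makes it easy to log why each candidate was skipped, which would have flagged the Austin duplicate and the Chicago contradiction on the spot.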