The house always wins. Unless you arbitrage the house.

python blockchain ethereum polygon web3 websockets arbitrage prediction-markets docker postgresql

The 2026 World Cup is about to start, and like every four years, my brain splits in two. One half is already arguing about Brazil’s starting eleven. The other half, the engineer half, noticed something else: prediction markets were filling up with World Cup contracts faster than I’d ever seen. Every match, every group, every “will Brazil win it all” future, listed across half a dozen platforms at once.

And when the same bet is listed in six different places, the prices don’t agree.

That’s the whole idea. “Will Brazil beat Croatia” is one real-world question, but Polymarket might price it at 0.62 while another venue prices the other side cheap enough that, if you buy both, you’ve locked in a profit no matter who wins. The volume the World Cup was about to dump onto these markets meant a lot more of those gaps, and a lot more liquidity to actually trade them. So I built a bot to find them and trade both sides at once.

It’s live. It’s hedging real trades across five venues. And it’s netting small, consistent profits. Here’s how it works and, more honestly, what was hard about it.


The trade in one paragraph

A YES share and a NO share for the same outcome always pay out exactly $1 together at resolution. So if you can buy YES on one venue for a and NO on another for b, and a + b comes to less than $1 after fees, you keep the difference the moment both settle. No prediction required. You don’t care who wins, you care that the two prices add up to less than a dollar.
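As a minimal sketch of that arithmetic (not the bot’s actual code): the function name and the flat per-share fee model are assumptions for illustration, and the 0.35 NO price is made up to pair with the 0.62 quoted earlier.

```python
def locked_edge(yes_price: float, no_price: float,
                yes_fee: float = 0.0, no_fee: float = 0.0) -> float:
    """Profit per YES+NO pair, assuming both legs fill and settle on the same outcome.

    Fees are modelled as a hypothetical flat cost per share on each leg;
    real venues charge in different ways.
    """
    total_cost = yes_price + yes_fee + no_price + no_fee
    return 1.0 - total_cost  # the pair pays out exactly $1 at resolution


# YES at 0.62 on one venue, NO at 0.35 on another, half a cent of fees per leg.
edge = locked_edge(yes_price=0.62, no_price=0.35, yes_fee=0.005, no_fee=0.005)
print(f"edge per pair: {edge:.3f}")

# A pair that sums to more than $1 is simply not a trade.
print(locked_edge(0.70, 0.35) > 0)
```

A positive result is the locked-in profit per pair; anything at or below zero gets ignored.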

The catch, and it’s the entire job, is making sure both markets actually settle to the same outcome. If your two legs can disagree, you don’t have a hedge, you have two bets. That gap is the basis risk, and managing it is most of the engineering.


Five venues, four chains

This is where it stopped being a clean math problem and became a blockchain problem.

The bot trades across Polymarket (Polygon), Limitless (Base), Myriad (BNB Chain), SX Bet (its own SX Network), and Opinion (BNB Chain). Five venues, four chains, and every single one does things its own way. Each venue is a plugin, one file declaring its chain, its fee model, its signing scheme, and its execution client, so the matching and edge math don’t have to care which chain a price came from.
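To make the plugin idea concrete, here is a rough sketch of what such a declaration could look like. Every name in it (`VenuePlugin`, `ExecutionClient`, `FakeClient`, `pair_edge`), the proportional fee model, and the fee values are assumptions for illustration, not the repository’s real interfaces. It also pretends both venues share one market id, which real matching can’t assume.

```python
from dataclasses import dataclass
from typing import Protocol


class ExecutionClient(Protocol):
    """What the edge math needs from a venue, whatever chain sits underneath."""

    def best_ask(self, market_id: str, side: str) -> float: ...


@dataclass(frozen=True)
class VenuePlugin:
    name: str
    chain: str
    fee_rate: float       # hypothetical: fee as a fraction of the price paid
    signing_scheme: str   # a label only; the signing itself lives in the client
    client: ExecutionClient

    def cost(self, market_id: str, side: str) -> float:
        """All-in cost of one share on this venue, fee included."""
        return self.client.best_ask(market_id, side) * (1.0 + self.fee_rate)


class FakeClient:
    """Stand-in client with canned prices so the sketch runs offline."""

    def __init__(self, asks: dict):
        self._asks = asks

    def best_ask(self, market_id: str, side: str) -> float:
        return self._asks[(market_id, side)]


def pair_edge(yes_venue: VenuePlugin, no_venue: VenuePlugin, market_id: str) -> float:
    """Edge math that never looks at which chain a price came from."""
    return 1.0 - yes_venue.cost(market_id, "YES") - no_venue.cost(market_id, "NO")


# Placeholder fee rates and prices, chosen only to make the example run.
polymarket = VenuePlugin("polymarket", "polygon", 0.0, "eip712",
                         FakeClient({("bra-cro", "YES"): 0.62}))
limitless = VenuePlugin("limitless", "base", 0.02, "eip712",
                        FakeClient({("bra-cro", "NO"): 0.35}))

print(round(pair_edge(polymarket, limitless, "bra-cro"), 4))
```

The point of the shape is that `pair_edge` only sees costs, so a new venue means a new plugin file and no changes to the math.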

That abstraction sounds clean. Getting there was not.

Wallets and signing. Most venues sign orders with EIP-712, which is supposed to be a shared standard. In practice each one shapes it a little differently. Polymarket trades through a proxy wallet instead of your own address. Limitless uses a newer wallet type that some RPC endpoints can’t even read the balance of. And one venue’s order format had a field whose meaning the code and the docs disagreed on, which I only figured out by comparing a request that worked against one that didn’t.

Gas, and the absence of it. One redeem path looked fine in testing and kept failing for real. The gas limit I’d set was too low for that transaction, and the simulation I was checking against doesn’t enforce the limit, so it passed while the real send reverted. The fix was to let the chain estimate the gas instead of hardcoding it. Small bug, but the kind that only shows up once real money is moving.
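A sketch of that fix, under assumptions: the helper name, the transaction dict shape, and the 20% headroom are all hypothetical. The estimator is passed in as a plain callable so the example runs without a node; with web3.py, `w3.eth.estimate_gas` would be the natural thing to pass.

```python
from typing import Callable


def with_estimated_gas(tx: dict,
                       estimate_gas: Callable[[dict], int],
                       headroom_pct: int = 120) -> dict:
    """Return a copy of tx whose gas limit comes from an estimate, not a constant.

    Any hardcoded "gas" value is dropped before estimating so it cannot
    cap the estimate. headroom_pct=120 adds a 20% safety margin (assumed).
    """
    unsigned = {key: value for key, value in tx.items() if key != "gas"}
    estimate = estimate_gas(unsigned)
    return {**unsigned, "gas": estimate * headroom_pct // 100}


def fake_estimator(tx: dict) -> int:
    """Stand-in for a node's gas estimate; the number is made up."""
    assert "gas" not in tx
    return 310_000


redeem_tx = {"to": "0xRedeemContract", "data": "0x", "gas": 100_000}  # placeholder values
print(with_estimated_gas(redeem_tx, fake_estimator)["gas"])
```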

SX Network is the opposite kind of surprise: it’s gasless. A relayer pays the gas, and you only ever hold USDC. So an empty native balance there is normal, not a problem, and the funding checks had to learn to leave that venue alone.
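One way a funding check might leave a gasless venue alone is sketched below. The function, the `Balances` type, and the thresholds are hypothetical names and numbers, not taken from the bot.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Balances:
    native: float  # gas token on the venue's chain
    usdc: float


def funding_problems(venue: str, gasless: bool, balances: Balances,
                     min_native: float = 0.01, min_usdc: float = 10.0) -> list:
    """List what is underfunded; an empty list means the venue is good to trade."""
    problems = []
    if balances.usdc < min_usdc:
        problems.append(f"{venue}: USDC below {min_usdc}")
    # On a gasless venue a relayer pays gas, so an empty native balance is normal.
    if not gasless and balances.native < min_native:
        problems.append(f"{venue}: native gas balance below {min_native}")
    return problems


print(funding_problems("sx", gasless=True, balances=Balances(native=0.0, usdc=50.0)))
print(funding_problems("polymarket", gasless=False, balances=Balances(native=0.0, usdc=50.0)))
```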

None of this is hard on its own. It’s that “sign the order and send it” means something a little different on every chain.


The WebSocket problem: you can’t trade a price you fetched 7 minutes ago

Arbitrage is a latency game. An edge that exists when you spot it has to still exist when you fire, or you’ve bought one leg of a hedge that no longer hedges. So the speed at which the bot sees prices change is not a nice-to-have, it’s the whole product.

Two of the venues, Polymarket and Limitless, stream their order books over WebSocket. I keep those books in memory, updated in real time, so reading a price is a zero-round-trip lookup. Two others, Myriad and SX, have no realtime feed at all (I checked for one and there isn’t any), so their books have to be pulled over plain HTTP requests.
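For the streamed venues, the in-memory book can be as simple as the sketch below. The update message shape (`side`, `price`, `size`, with size 0 meaning “level removed”) is an assumption. Each venue’s real WebSocket payload differs and would be translated into something like this first.

```python
class LiveBook:
    """Order book held in memory and patched by each streamed update."""

    def __init__(self) -> None:
        self.bids = {}  # price -> size
        self.asks = {}  # price -> size

    def apply(self, update: dict) -> None:
        """Apply one normalised update; a size of 0 deletes the price level."""
        levels = self.bids if update["side"] == "bid" else self.asks
        if update["size"] == 0:
            levels.pop(update["price"], None)
        else:
            levels[update["price"]] = update["size"]

    def best_ask(self):
        """Cheapest price someone will sell at, or None if the side is empty."""
        return min(self.asks) if self.asks else None

    def best_bid(self):
        """Highest price someone will buy at, or None if the side is empty."""
        return max(self.bids) if self.bids else None


book = LiveBook()
for message in [
    {"side": "ask", "price": 0.63, "size": 120},
    {"side": "ask", "price": 0.62, "size": 40},
    {"side": "bid", "price": 0.60, "size": 75},
    {"side": "ask", "price": 0.62, "size": 0},   # the 0.62 level got taken
]:
    book.apply(message)

print(book.best_bid(), book.best_ask())
```

Reading `best_ask()` is a dictionary lookup, which is what makes the price check free of any round trip.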

Here’s where it got bad. Naively scoring every venue pair re-fetched those REST books on every pass, and a single full pass took about seven minutes. Seven minutes is an eternity. For those two venues it was an effective blind window, and they carry most of the soccer moneyline coverage, which is exactly what the World Cup needed.

Three changes, none of which touch execution correctness, cut that pass from ~7 minutes to ~18 seconds:

  • Dedup per pass. Each distinct REST book gets fetched once, not once per venue-pairing it happens to appear in.
  • Concurrent prefetch. Myriad and SX are pulled in parallel, each on its own rate-limited session, so a pass costs the slowest venue, not the sum.
  • A cheap pre-check. Before fetching the full order book, I look at the rough listed prices I already have. If even those can’t add up to a profit, there’s no point fetching the rest, because the real prices will only be worse. So I skip it. That throws away a lot of dead pairs without ever skipping a real opportunity.
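The three changes above can be sketched together with asyncio. This is an illustration under assumptions, not the bot’s code: the function names, the `(venue, market_id)` leg tuples, and the sample prices are invented, and a per-venue semaphore stands in for a real rate-limited session.

```python
import asyncio


async def prefetch_books(pairs, rough_price, fetch, per_venue_limit=2):
    """Fetch each needed REST book once, concurrently, skipping hopeless pairs."""
    # Pre-check + dedup: a set keeps one entry per leg, and only legs from
    # pairs whose rough prices could still sum below $1 get in.
    needed = set()
    for leg_a, leg_b in pairs:
        if rough_price[leg_a] + rough_price[leg_b] < 1.0:
            needed.update((leg_a, leg_b))

    # One concurrency cap per venue, so venues don't wait on each other.
    limits = {venue: asyncio.Semaphore(per_venue_limit) for venue, _ in needed}

    async def fetch_one(leg):
        venue, market_id = leg
        async with limits[venue]:
            return leg, await fetch(venue, market_id)

    return dict(await asyncio.gather(*(fetch_one(leg) for leg in needed)))


calls = []


async def fake_fetch(venue, market_id):
    """Stand-in for an HTTP book request; records each call."""
    calls.append((venue, market_id))
    await asyncio.sleep(0.01)
    return {"best_ask": 0.5}


pairs = [
    (("myriad", "bra-cro-yes"), ("sx", "bra-cro-no")),
    (("myriad", "bra-cro-yes"), ("sx", "bra-draw-no")),   # shares a leg with the pair above
    (("myriad", "arg-fra-yes"), ("sx", "arg-fra-no")),    # rough prices sum above $1
]
rough = {
    ("myriad", "bra-cro-yes"): 0.60, ("sx", "bra-cro-no"): 0.37,
    ("sx", "bra-draw-no"): 0.38,
    ("myriad", "arg-fra-yes"): 0.55, ("sx", "arg-fra-no"): 0.50,
}

books = asyncio.run(prefetch_books(pairs, rough, fake_fetch))
print(len(calls), "fetches for", len(pairs), "pairs")
```

Six legs across three pairs turn into three requests: the shared leg is fetched once and the dead pair is never fetched at all.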

The part I’m happiest with is that none of this can cause a bad trade, even if a cached price is stale. The executor always re-reads fresh books one last time before placing the first leg, and for the streamed venues that re-check reads straight from the live WebSocket feed, which is fresher than a REST snapshot anyway. Detection can be fast and approximate. The fire has to be exact. Keeping those two concerns separate is what makes the whole thing safe to run.
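That split between rough detection and an exact final check might look like the sketch below. The candidate dict layout, `read_fresh_ask`, and `min_edge` are hypothetical. The reader callable stands for whatever returns the freshest price: the live in-memory book for streamed venues, a new HTTP read for the rest.

```python
def confirm_before_fire(candidate: dict, read_fresh_ask, min_edge: float = 0.0):
    """Recompute the edge from fresh prices right before placing the first leg.

    Returns the confirmed edge, or None if the opportunity is gone.
    Whatever prices detection saw are deliberately ignored here.
    """
    yes_ask = read_fresh_ask(candidate["yes_leg"])
    no_ask = read_fresh_ask(candidate["no_leg"])
    if yes_ask is None or no_ask is None:
        return None  # one side of the book vanished
    edge = 1.0 - yes_ask - no_ask - candidate["fees"]
    return edge if edge > min_edge else None


candidate = {"yes_leg": "polymarket:bra-cro", "no_leg": "sx:bra-cro", "fees": 0.01}

still_good = {"polymarket:bra-cro": 0.62, "sx:bra-cro": 0.35}
moved_away = {"polymarket:bra-cro": 0.62, "sx:bra-cro": 0.40}

print(confirm_before_fire(candidate, still_good.get) is not None)
print(confirm_before_fire(candidate, moved_away.get) is not None)
```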


The cleverest rule in the codebase: same-oracle guard

This is the rule that separates a real hedge from a disguised bet, so it’s worth its own section.

Crypto up/down markets look like the easiest arbitrage in the world. “Will BTC be up at 3pm” is listed everywhere. Buy up on one venue, down on another, collect. The trap is in the settlement: each venue decides “up or not” by reading a price oracle, and they don’t read the same one. Polymarket settles off Binance, Limitless off Pyth, Opinion off Chainlink, Myriad off Binance.

On a flat, quiet window, two oracles can disagree about the direction by a hair. When they do, one venue can settle “down” while the other settles “up”, so both of your legs can lose at once. That’s not a hedge, that’s a coin flip you paid fees to enter.

So every crypto market in the bot is tagged with the exact oracle it settles against, and the matcher refuses any pair whose two legs would settle off provably different sources. In practice that means the only genuinely safe crypto pairs today are Binance-against-Binance. A sports result, by contrast, settles on one real-world fact, Brazil scored or it didn’t, so moneyline carries almost none of this risk. Which, conveniently, is exactly where the World Cup volume is.
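A minimal sketch of such a guard, using the oracle assignments listed above. The `Market` type, the `kind` labels, and the choice to also reject an untagged crypto market are assumptions of this example rather than the bot’s exact rule.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Market:
    venue: str
    kind: str               # "crypto_updown" or "moneyline" (hypothetical labels)
    oracle: Optional[str]   # settlement source tag; None for non-crypto markets


def passes_oracle_guard(a: Market, b: Market) -> bool:
    """Allow a pair only if its legs cannot settle off different price sources."""
    if a.kind != "crypto_updown" and b.kind != "crypto_updown":
        return True  # sports results settle on one real-world fact
    # Assumption: an untagged crypto market is treated as unsafe too.
    return a.oracle is not None and a.oracle == b.oracle


polymarket_btc = Market("polymarket", "crypto_updown", "binance")
myriad_btc = Market("myriad", "crypto_updown", "binance")
limitless_btc = Market("limitless", "crypto_updown", "pyth")
opinion_btc = Market("opinion", "crypto_updown", "chainlink")

print(passes_oracle_guard(polymarket_btc, myriad_btc))     # Binance against Binance
print(passes_oracle_guard(polymarket_btc, limitless_btc))  # Binance against Pyth
```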


Running real money with no dry-run mode

There is no paper-trading mode. Any service I bring up places real orders with real funds. That sounds reckless written down, so the safety lives in the design instead of in a flag:

  • Leg ordering for zero exposure. The venue least likely to fill goes first. If it fails, you abort with no position at all. Only once leg one is confirmed does leg two go in, and if leg two somehow fails after leg one filled, an auto-unwind kicks in immediately.
  • Size caps and a circuit breaker. Each trade is a fraction of free capital, with a hard ceiling on how much of the bankroll can be deployed at once.
  • Phantom-book filters. One-sided or dead order books can fake an edge that you can’t actually trade into. Those get dropped.
  • A kill switch and a daily loss limit, plus a reconcile guard so a crash mid-trade doesn’t leave the bot confused about what it owns.
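A few of those guards are small enough to sketch. Everything here is illustrative: the function names, the 25% per-trade fraction, the 50% deployment ceiling, and the minimum-size threshold are assumptions, and the legs are plain callables that return True on a confirmed fill.

```python
def trade_size(free_capital: float, deployed: float, bankroll: float,
               fraction: float = 0.25, max_deployed: float = 0.5) -> float:
    """A fraction of free capital, capped by how much bankroll may be deployed."""
    room = max(0.0, bankroll * max_deployed - deployed)
    return min(free_capital * fraction, room)


def is_phantom(bids: list, asks: list, min_size: float = 1.0) -> bool:
    """True for one-sided or dead books; levels are (price, size) tuples."""
    if not bids or not asks:
        return True
    return sum(size for _, size in asks) < min_size


def execute_hedge(first_leg, second_leg, unwind) -> str:
    """Place the leg least likely to fill first; unwind if the second one fails."""
    if not first_leg():
        return "aborted"   # nothing filled, so no position to clean up
    if second_leg():
        return "hedged"
    unwind()               # leg one filled alone: close it right away
    return "unwound"


print(trade_size(free_capital=1000, deployed=480, bankroll=1000))
print(is_phantom(bids=[], asks=[(0.62, 40)]))
print(execute_hedge(lambda: True, lambda: False, lambda: None))
```

Putting the least-likely fill first means the common failure ends with no position at all, and the unwind path only matters in the rarer case where leg two fails after leg one has filled.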

It all runs as a set of Docker containers, with state in Postgres. A small read-only dashboard (standard library only, no wallet keys ever loaded) lets me watch live PnL and open positions instead of tailing logs at midnight.


A few things that stuck

The math was the easy part. a + b < 1 is one line. The chains, the signing, the oracles, the latency, the error handling: that’s where the real work was.

Every chain breaks a different assumption. The plugin setup that hides those differences only works because I checked each venue’s actual behaviour rather than trusting the docs to match.

Fast and correct are two separate jobs. The detection side is allowed to be rough and quick. The moment before a trade goes out, it re-checks everything for real. Keeping those two apart is what lets me leave it running on its own.

It’s open source, real-money disclaimers and all. Read the code, or try to break it:

github.com/vitor-chagas/prediction-mkt-arbitrage-bot


I’m a Data and DevOps engineer working at a bank in the Netherlands. Outside of work I build things that deal with real latency and real failure modes. This one happened to be onchain trading. If you want to talk shop, find me on LinkedIn. And, obligatorily: RUMO AO HEXA, Brazil! 🇧🇷