Match Odds - An Evolving Strategy
Post date: 2016-09-26 | Updated: 2017-04-24
Online betting sites allow any cricket enthusiast to bet significant amounts of money on the result of live matches. If all available data were used to generate live odds, it might be possible to gain an edge over the run-of-the-mill gambler, and even over a somewhat sophisticated bettor.
Base Odds: Historical Match Scenarios
The obvious way to develop odds is to use similar historical match scenarios. Say a team chasing 275 in a 50-over match is 150/3 after 30 overs (125 runs required). If you find historical match scenarios within a certain range of those runs, wickets and overs, and then check what percentage of those unique scenarios ended in wins, you get a base win percentage for the chasing team.
As the table above shows, for the example scenario this method finds 339 similar unique historical match situations (when multiple instances of the same match are identified, the mean values are used). Of those 339 match scenarios, 192 resulted in wins and 135 in losses, leaving 12 that were neither. Using the ratio of wins, the base win percentage is 56.64% (192/339).
to find base odds of different match scenarios.
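The lookup described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the record fields (`match_id`, `runs_required`, `wickets`, `overs`, `chasing_won`), the tolerance values and the sample data are all hypothetical, and each match is simply counted once rather than averaged.

```python
from collections import defaultdict


def base_win_pct(snapshots, runs_required, wickets, overs,
                 runs_tol=10, wickets_tol=1, overs_tol=2):
    """Return (win percentage, unique match count) for similar chase scenarios.

    The tolerances are hypothetical; the post only says "a certain range".
    """
    by_match = defaultdict(list)
    for s in snapshots:
        if (abs(s["runs_required"] - runs_required) <= runs_tol
                and abs(s["wickets"] - wickets) <= wickets_tol
                and abs(s["overs"] - overs) <= overs_tol):
            by_match[s["match_id"]].append(s)
    if not by_match:
        return None, 0
    # A match that appears several times in the range is counted once.
    wins = sum(1 for rows in by_match.values() if rows[0]["chasing_won"])
    return 100.0 * wins / len(by_match), len(by_match)


# Made-up snapshots, purely to exercise the function.
snapshots = [
    {"match_id": 1, "runs_required": 125, "wickets": 3, "overs": 30, "chasing_won": True},
    {"match_id": 1, "runs_required": 120, "wickets": 3, "overs": 31, "chasing_won": True},
    {"match_id": 2, "runs_required": 130, "wickets": 4, "overs": 29, "chasing_won": False},
    {"match_id": 3, "runs_required": 200, "wickets": 7, "overs": 30, "chasing_won": False},
    {"match_id": 4, "runs_required": 118, "wickets": 2, "overs": 30, "chasing_won": True},
]
pct, n = base_win_pct(snapshots, runs_required=125, wickets=3, overs=30)

# The figure quoted in the post: 192 wins out of 339 unique scenarios.
quoted_pct = round(100 * 192 / 339, 2)
```

With real data the interesting choice is how wide to make the ranges: too narrow and there are too few matches, too wide and the scenarios stop being similar.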
Factor Model: Adjust with Additional Features
Although historical match scenarios give decent odds, they ignore some significant factors. For example:
- Team Rating Difference
- Location (Home/Away for international matches)
- Momentum (Increasing odds in recent overs)
- Expected Score (using mean ground run totals)
Adjusting the base odds with these additional factors leads to better odds when back-tested against historical matches.
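The post does not spell out how the adjustment is done, so the following is just one plausible way to do it: shift the base odds in log-odds space by a weighted sum of the factors. The factor names, the weights and the log-odds form are all assumptions made for illustration, not the method actually used here.

```python
import math


def logit(p):
    """Convert a probability in (0, 1) to log-odds."""
    return math.log(p / (1 - p))


def inv_logit(x):
    """Convert log-odds back to a probability."""
    return 1 / (1 + math.exp(-x))


# Hypothetical hand-picked weights, one per factor.
WEIGHTS = {
    "rating_diff": 0.004,          # chasing team rating minus opponent rating
    "home": 0.15,                  # +1 home, -1 away, 0 neutral
    "momentum": 0.5,               # recent change in odds
    "expected_score_diff": 0.003,  # versus mean ground run totals
}


def adjust_base_odds(base_pct, factors, weights=WEIGHTS):
    """Shift a base win percentage by weighted factors, staying within 0-100."""
    x = logit(base_pct / 100.0)
    for name, value in factors.items():
        x += weights[name] * value
    return 100.0 * inv_logit(x)


unchanged = adjust_base_odds(56.64, {})
adjusted = adjust_base_odds(56.64, {"rating_diff": 50, "home": 1})
```

Working in log-odds keeps the result a valid percentage however large the adjustment gets, which a plain additive tweak to the percentage would not.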
Machine Learning: Feature Engineering Focus
Even though the factor model results in better odds, a disadvantage of that method is that it is unclear how influential each factor should be. Does momentum really matter? Does how difficult or easy run scoring was on a particular ground 5 years ago affect the match outcome today? Even though these factors seem to add value, figuring out how much of an effect they have is cumbersome - it involves painstaking back-testing work which does not always give a clear answer.
Using machine learning techniques solves this issue by letting the algorithm decide how important a feature is. It allows one to focus on creative feature engineering - coming up with ideas for features and getting instant feedback on their impact.
For example, if India with their strong batting line-up were chasing in the above example scenario against a weak Zimbabwe bowling attack, the odds of them pulling off the win are probably higher. It is also possible to drill down to the player level by aggregating the current ratings of India's batsmen and Zimbabwe's bowlers. Since 60% of the 2nd innings has been completed (30/50 overs), Zimbabwe's bowling resources can be adjusted to be 40% of the starting value. Similarly, the current ratings of the 3 Indian batsmen that were dismissed would also need to be deducted from the aggregate batting rating.
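As a rough sketch of that player-level feature, the remaining batting and bowling resources could be computed like this. The player labels and rating numbers are made up, and taking the feature as a simple batting-minus-bowling difference is an assumption.

```python
def remaining_bowling_rating(bowler_ratings, overs_bowled, total_overs=50):
    """Scale the aggregate bowling rating by the share of overs still to be bowled."""
    fraction_left = (total_overs - overs_bowled) / total_overs
    return sum(bowler_ratings) * fraction_left


def remaining_batting_rating(batsman_ratings, dismissed):
    """Sum the ratings of batsmen who have not been dismissed."""
    return sum(rating for name, rating in batsman_ratings.items()
               if name not in dismissed)


# Hypothetical ratings for the 150/3 after 30 overs example.
india_batsmen = {"bat1": 80, "bat2": 75, "bat3": 70, "bat4": 65, "bat5": 60}
zimbabwe_bowlers = [70, 65, 60, 55, 50]

batting_left = remaining_batting_rating(india_batsmen, dismissed={"bat1", "bat2", "bat3"})
bowling_left = remaining_bowling_rating(zimbabwe_bowlers, overs_bowled=30)

# One candidate feature: batting resources left minus bowling resources left.
batting_bowling_diff = batting_left - bowling_left
```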
Random Forest and Gradient Boosting methods were among the methods used for this exercise. Through this, it was possible to identify specific factors that actually influenced the match outcome:
- Base Odds using similar historical match scenarios
- Run Rate (Required Run Rate for 2nd innings)
- Batting-Bowling Rating Difference
- Team Rating Difference
- Runs (Runs Required for 2nd innings)
For example, the Required Run Rate factor has a clear connection with the match odds as shown in the graph below:
Different machine learning algorithms with specific parameters were found to be effective in different match scenarios (early in an innings vs late and 1st innings vs 2nd). Overall, machine learning methods improved on match odds by 10-15% (using F1 scores and ROC AUC scores as measures) over the factor odds, which already performed better than the base odds.
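For readers unfamiliar with the two measures, here is a small pure-Python sketch of how each is computed from predicted win probabilities and actual results. The sample outcomes, the probabilities and the 0.5 cut-off are made up for illustration; in practice a library implementation would normally be used.

```python
def f1_score(y_true, y_pred):
    """F1 = 2*TP / (2*TP + FP + FN) for binary labels (1 = chasing team won)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, scores):
    """Probability that a random win is scored above a random loss (ties count half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one win and one loss")
    total = sum(1.0 if p > n else 0.5 if p == n else 0.0
                for p in pos for n in neg)
    return total / (len(pos) * len(neg))


# Made-up results and predicted win probabilities for six match scenarios.
y_true = [1, 0, 1, 1, 0, 0]
win_prob = [0.8, 0.3, 0.6, 0.4, 0.5, 0.2]
y_pred = [1 if p >= 0.5 else 0 for p in win_prob]  # assumed 0.5 cut-off

f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, win_prob)
```

F1 depends on the cut-off used to turn a probability into a win/loss call, while ROC AUC only looks at how well the probabilities rank wins above losses, which is why it helps to report both.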
Live Odds: Live Comparison with Betting Sites
By pulling in betting site prices live and calculating the implied odds (the inverse of the mean of the best back and lay prices), it is possible to display a live snapshot of an ongoing match. This allows one to compare the betting site odds with the calculated odds and potentially profit from significant odds mismatches. The chart below shows a snapshot of a match:
to track current live matches - updating every 2 minutes.
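The implied-odds calculation and the comparison with the calculated odds can be sketched as follows. It assumes the prices are quoted as decimal odds; the sample prices and the mismatch threshold are made up, and what counts as a "significant" mismatch is not defined in this post.

```python
def implied_probability(best_back, best_lay):
    """Inverse of the mean of the best back and lay prices (decimal odds assumed)."""
    mid_price = (best_back + best_lay) / 2
    return 1 / mid_price


def odds_mismatch(model_pct, best_back, best_lay, threshold_pct=3.0):
    """Compare calculated odds with market-implied odds.

    Returns (market %, gap in percentage points, flag). The threshold is a
    hypothetical value chosen for illustration.
    """
    market_pct = 100 * implied_probability(best_back, best_lay)
    gap = model_pct - market_pct
    return market_pct, gap, abs(gap) >= threshold_pct


# Made-up prices: best back 1.90 and best lay 1.94 on the chasing team.
market_pct, gap, flagged = odds_mismatch(56.64, best_back=1.90, best_lay=1.94)
```

A positive gap means the calculated odds rate the team higher than the market does; whether that is worth acting on also depends on commission and on how far the back and lay prices are apart.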