Wildcard spread picks; Season-long bankroll simulations
What a year. While I ended up winning my fantasy league, I’m always sad to have the regular season come to an end. I really enjoy calculating the cover index each week; it’s too bad that the regular season lasts only 18 short weeks. The machine only really “knows” regular season games so its projections in the post season are not necessarily to be trusted. But what the hell? Might as well make some predictions anyway. My Raider-Turned-Patriot fan father (see week 12 picks and week 12 analysis for full details) proposed a post-season game picking competition: his picks vs those of the machine. Against my better judgement – knowing that picking post-season was not baked into the machine’s DNA – I agreed. I can’t resist some friendly competition. The stakes? A dinner for 4 (Mom, Dad, my girlfriend and myself) at a restaurant of the winner’s choosing.
So, with a bold caveat in place (I have no idea what to expect of the algorithm in the post-season), here are the picks:
In case you are interested in following along with my paternal pick-off, the only game that my father and my machine disagree on is the KC and IND game where he is taking KC plus the points. Good luck to everyone this week.
Season-long performance
I want to take some time to reflect on the regular season, the machine’s performance, and what we might expect if we projected this year’s performance into next season. To keep the length/redundancy of this post to a minimum, we will explore only the three key thresholds we have tracked since week 11: all picks, 3.0-or-greater Cover.Index, and 5.0-or-greater Cover.Index.
All Picks (56 of 102, or 55%)
This season long data includes weeks 11 and 12 – weeks in which I used a very different (read ‘worse’) formulation of the algorithm – and week 13, the week of Thanksgiving, when I was tweaking the code and ended up using a sub-optimal version of the algorithm with disastrous results. The performance in weeks 14-17 was much improved, but since the idea is to track real-world performance, including all the ugly realities that come up when undertaking a project like this, I feel it’s most proper to include all the picks that I’ve ever published in all of my calculations and simulations.
With these potential drags on performance in mind, 56 of 102 for all picks isn’t terrible. It isn’t great either – but fortunately we don’t have to bet every game, every week. We can be selective about the games we choose to bet on, and that’s the function of the Cover.Index (SCI). The absolute value of the SCI is a measure of the machine’s confidence in a particular pick, and using it as a threshold lets us zero in on the games we have a better chance of picking correctly.
Here is the performance for games with an SCI greater than or equal to 3.0:
Ah, much better. 64% ATS over 44 games is a sizable hot streak and unlikely to be the product of chance alone. At the 3.0 SCI threshold, the algorithm did appear to have some predictive power.
Here is the performance for games with an SCI greater than or equal to 5.0:
At 5.0 or above, the cover index was 77% accurate against the spread in 26 games. For any given week this could be construed as pure luck – but with a sample size of 26 games I feel comfortable believing that there’s really something predictive happening here.
Simulating betting the machine’s picks for a full season
If the machine really is predictive, why don’t we bet on it? First off – of course I’ve bet on it. I think it was week 12 when my parents were making their way from California back to Boise, and during a stop in Reno we put a few bucks at risk on the top 5 most confident picks. We went 3 of 5 and walked away with a profit – can’t complain. I also participate in an awesome season-long pool where I have to pick every game against the spread. Before I started using the algorithm in week 11 I was middle of the pack, but I ended the season in a tie for 5th, which was at the very edge of the money. So sure, the machine made me $200 this year… Talk about making 200 bucks the hard way!
But my mind naturally desires a more, well… substantial test. What would happen if you bet these picks week in and week out, all season long, for real money?
First, let’s lay out some assumptions and describe the visualizations we are about to explore. Using the various thresholds’ track records from this year and the power of the symbolic programming language Mathematica, I created what’s known as an empirical distribution from the past performance of the algorithm. This allows me to use Mathematica’s RandomVariate function to simulate a hypothetical week’s performance that matches the distribution of our past performance. In other, less nerdy words – I’m generating hypothetical week-to-week performances that match the statistical profile of the performance we observed this season.
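The original simulation was built with Mathematica’s EmpiricalDistribution and RandomVariate; for readers without Mathematica, here is a rough Python analogue of the same idea. The weekly records below are made-up placeholders, not the machine’s actual 2013 numbers:

```python
import random

# Hypothetical weekly ATS records as (wins, picks) -- illustrative
# placeholders standing in for the machine's real 2013 track record.
observed_weeks = [(9, 16), (6, 14), (5, 13), (10, 16), (8, 14), (9, 15), (9, 14)]

# The "empirical distribution" of weekly win rates: sampling a
# hypothetical week is just drawing, with replacement, from the
# win rates we actually observed.
win_rates = [wins / picks for wins, picks in observed_weeks]

def simulate_week_wins(games_available, rng=random):
    """Draw a win rate from the empirical distribution, then treat each
    of this week's games as an independent flip at that rate."""
    rate = rng.choice(win_rates)
    return sum(1 for _ in range(games_available) if rng.random() < rate)
```

Each call produces one hypothetical week whose statistical profile matches the observed season, which is all the seasonal simulation below needs.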
As a basis for analysis I’m assuming that we start with $1000 and that all bets are $100. If we ever hit $0 we stop betting (c’mon – we aren’t total degenerates… right?!?), and we never increase our bet size even if our bankroll increases. This is different from an exponential-returns scenario, in which you invest a fixed percentage of your bankroll in each pick – so as your bankroll grows, so does your bet size. But given the programming difficulties, the already too-long post, and the fact that I’ve only got a few hours to work on this, I stuck with the simpler flat $100 bet structure. I’ve also assumed a 10% “juice” on the returns – so if you place a $100 bet and win, you get $190 back (not $200). The extra $10 is your required donation to Guido and his band of thug buddies… such is life.
In the scenario in which we are picking every game, things are pretty straightforward. But to be realistic in the 3.0 and 5.0 SCI cases, we have to account for the fact that only so many games clear those thresholds each week. To approximate this I used the weekly counts from this season, making sure to account for the weeks with fewer total match-ups due to byes. In week 1 there were 16 games, but we might only have had 10 games above 3.0 SCI and 7 above 5.0; in week 8 there were only 13 games due to byes, so we might only have had 8 games above 3.0 and 5 or 6 above 5.0.
I simulated 1000 seasons for each threshold, and the plots you see contain the 1st, 10th, 25th, 75th, 90th and 99th percentile season-long outcomes (where a lower percentile implies better performance, which is admittedly a bit confusing/backwards). They also contain the median performance – the 50th percentile – which we might consider an “average” year (I know this isn’t a mathematically precise description of median or average… so shut it!).
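Putting those assumptions together – flat $100 bets, 10% juice, stop when we can’t cover a bet, weekly game counts that shrink during byes – the simulation loop itself is short. Here it is sketched in Python rather than Mathematica, with a placeholder win rate and a made-up weekly schedule standing in for the real 2013 inputs:

```python
import random

def simulate_season(win_rate, games_per_week, bankroll=1000.0, bet=100.0):
    """Flat-bet one season: a win returns $190 on a $100 bet (10% juice),
    a loss costs the full $100, and we stop betting if we go broke."""
    for games in games_per_week:
        for _ in range(games):
            if bankroll < bet:           # can't cover the bet: done for the year
                return bankroll
            if random.random() < win_rate:
                bankroll += 0.9 * bet
            else:
                bankroll -= bet
    return bankroll

# Placeholder schedule: qualifying games per week, dipping during byes.
games_per_week = [10, 10, 10, 9, 8, 8, 8, 8, 9, 10, 10, 10, 10, 10, 10, 10, 10]

# 1000 seasons, sorted best-first to match the post's "lower percentile
# = better" convention: the 1st percentile is the 10th-best season.
finals = sorted((simulate_season(0.55, games_per_week) for _ in range(1000)),
                reverse=True)
percentiles = {p: finals[p * 10 - 1] for p in (1, 10, 25, 50, 75, 90, 99)}
```

Swapping in the real per-threshold win rates and game counts reproduces the three scenarios plotted below.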
Ok – enough of this background crap – how rich are we gonna get!?!
Here are the simulated seasons if, like the indiscriminate degenerates we are, we bet every game, every week – no matter the threshold, no matter whether we can still afford our steady diet of Top Ramen and Coors Original… We can call this the hopeless degenerate scenario.
Uhhhh… what? Hmmm… well, the 1st percentile performance is solid – we go from a $1000 starting bankroll to almost $5000 over the course of the season. But in our 1000 simulations we only saw 10 cases that played out this well or better. Much more likely are the 75th percentile (and worse) scenarios, in which sometime mid-season we run out of money altogether! Yikes! Even the median performance has us losing a few hundred dollars for all our trouble.
The lesson? If you’re paying 10% juice, a 55% winning percentage is likely not enough to make a living. At 55% against the spread you still have to be lucky over the longer term to be profitable, where “lucky” here means performing above your median expected outcome. Granted, you have to be less lucky than the guy who picks games with a coin – but only barely. In this flat-bet, stop-at-zero setup, both of you are more likely than not to lose money. Only a degenerate keeps betting when he/she knows there’s no realistic chance of a profit. That was not the idea when I created the machine… Fortunately – and as we’d expect – the higher SCI picks fare much better.
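To put a number on that: risking $100 to win $90 means the breakeven win rate is 100/190 ≈ 52.6%, so a 55% picker has an edge of only about $4.50 per $100 bet – thin enough that ordinary variance, plus the stop-at-$0 rule, frequently swamps it, which is why the simulated median still comes out underwater:

```python
# Payout structure from above: risk $100 to win $90 (10% juice).
win_amount, loss_amount = 90.0, 100.0

# Breakeven: p * 90 = (1 - p) * 100  =>  p = 100 / 190 ~= 0.526
breakeven = loss_amount / (win_amount + loss_amount)

# Expected profit per $100 bet at a 55% win rate: a razor-thin $4.50.
ev_per_bet = 0.55 * win_amount - 0.45 * loss_amount
```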
Here is how the 3.0 level played out in the simulator:
Ok – this is much better. On the top end we are doing quite well. Even the 99th percentile is expected to make a very modest profit, and the median expectation has us growing our bankroll to $6000 by season’s end. One other thing I calculated is the probability, in this simulation, of experiencing a losing week using our empirical probabilities from 2013. For the ‘all picks’ case we expect 47% of weeks to be down weeks. For the 3.0 threshold the number is much better – only 16% of weeks should have us losing money. Most impressive of all, for the 5.0 threshold we expect to experience a losing week only 3% of the time! That’s pretty impressive – here’s what the 5.0 threshold simulation looks like.
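As a rough sanity check on that 47% figure, a plain binomial model gets close. Assuming a hypothetical 14-game week at a 55% win rate (both round-number stand-ins for the real 2013 data), a week loses money whenever 90·w ≤ 100·(14 − w), i.e. with 7 or fewer wins:

```python
from math import comb

def losing_week_prob(n_games, win_rate, win_amt=90.0, loss_amt=100.0):
    """P(weekly profit <= 0) when each game is an independent flip at
    win_rate, paying +$90 / -$100 per $100 bet."""
    total = 0.0
    for w in range(n_games + 1):
        profit = w * win_amt - (n_games - w) * loss_amt
        if profit <= 0:
            total += comb(n_games, w) * win_rate**w * (1 - win_rate)**(n_games - w)
    return total

p_losing = losing_week_prob(14, 0.55)   # ~0.45, close to the simulated 47%
```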
Woohoo! We’re rich!!! Starting with $1000, the average outcome is a ten-fold return, ending the year with $10,000. I’d certainly take it. A 3% chance of a down week seems almost too good to be true, but remember that over the 6 weeks tracked this year, the highest SCI threshold never had a losing week. And that was with me trying my damnedest to screw it up in the first 3 weeks. Encouraging? Yup. Time to bet the farm? We shall see.
I think with all the evidence collected over this season it’d be insane to not at least try to make these simulations a reality going into next season. Perhaps next year’s blog will be a real-time analysis of how that experiment goes.
Thanks to everyone for following along – this is sort of the final shebang as far as blog posts go. I will analyze the results of the playoff games and report on my family wager, but we won’t have any more pretty pictures to look at, and the posts will likely be short and to the point. During the offseason I will be refining the NFL algorithm and potentially posting picks in other sports for which I’ve adapted the machine, most notably fantasy golf.