A Ghost In The Machine and Week 9 NFL Picks

by thesanction1

People liken math to science. Statistics, as a branch of mathematics, gets lumped into this relationship as well – people think of the process of doing statistics as unambiguous, direct, causal and clear. You have a set of data, you want to know some quantifiable information about that data, you apply whatever functions you need to get that information and you’re done. No wiggle room, no potential errors of interpretation, no room for art or creativity. People, as it turns out, are wrong.

Doing real-world statistics and predictive modeling is inherently messy. The statistician is required to make creative, educated guesses at every step in the process. How should missing data be treated? How should outliers be handled? Should any variables be transformed or combined? Which variables should be included or excluded, and what tools should be used for feature selection? What method should be used to produce the eventual model? How should backtesting be designed and validated? The list goes on. There are guidelines for each of these questions, but there aren’t definitive rules. Every step in the process holds the potential for error, and many of the errors end up being subtle and hard to identify until it’s too late. You often don’t discover your errors until a model is set free in the world and starts behaving in unexpected and potentially dangerous ways.

Such is the case with the Machine, and that is the backdrop to the absence of picks last week. The Machine’s picks had been doing well so far, but not quite as well as backtesting had led me to expect. In the LVH Supercontest – the world’s premier NFL handicapping contest, which pays out a first-place prize of over $750,000 – one of my entries was inside the top 50 (out of 1400) and my second was inside the top 200. My best entry used only the Machine’s picks and had been picking games at roughly 67% against the spread (ATS). So while I wasn’t quite living up to backtesting, I was doing well enough to win (last year’s winner was 67% ATS) and feeling otherwise fairly optimistic about things. Plus, I reasoned, many of the variables the Machine uses are moving averages, which get more reliable as more games are played – so as the season moved along I expected the performance to improve compared to a ‘pick from your gut’ method of handicapping.

Week 6 came along and the Machine had its first losing week. My LVH Supercontest cards dropped out of the top 100 and 300 respectively. It could have just been a 1% chance variance event, but the evidence had mounted to the point that I would have been crazy not to look in the code for potential issues. Where there’s smoke there’s fire, goes the saying, and it applied here as well. I found several key issues that caused the Machine’s performance to degrade over time, and another major issue that caused me to distrust the backtesting I had done on last year’s data. The Machine certainly was doing something right in the first 5 weeks – but exactly what that something was is difficult to say. It was a very sophisticated hybrid of solid predictive analytics and modeling and a few nasty bugs and, from that discovery on, its picks could not be trusted. Add this discovery to the fact that Victiv – the site that’s taking over the Daily Fantasy Sports industry, where I work as CIO – was exploding onto the DFS scene… needless to say I had a full plate these last few weeks. I started tearing apart the nefarious parts of the Machine to rebuild them from scratch without the bugs… I thought I could do all of this in a single week, but I was wrong. Come Friday of last week I had no picks and was forced to simply guess on both of my LVH Supercontest cards. The results were a disastrous 1-4 on both cards, knocking the once promising tickets all the way back to tied for 239th (23 of 40 – 57.5% ATS) and 532nd (52.5% ATS). The current leaders sit at 31 of 40 (77.5% ATS).
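For a rough sense of how much of a bad week can be written off as plain variance, a binomial tail is a reasonable first check. The sketch below is only an illustration and is not the Machine’s code: it assumes every pick is independent with the same fixed hit rate (real picks are neither), it assumes a five-pick card as the 1-4 results suggest, and the function name `prob_at_most` is made up for this example.

```python
from math import comb


def prob_at_most(wins: int, games: int, hit_rate: float) -> float:
    """P(X <= wins) for X ~ Binomial(games, hit_rate).

    Simplifying assumption: picks are independent and share one hit rate.
    """
    return sum(
        comb(games, k) * hit_rate**k * (1 - hit_rate) ** (games - k)
        for k in range(wins + 1)
    )


if __name__ == "__main__":
    # Hypothetical inputs: a five-pick card, evaluated for a coin-flip
    # picker (50%) and for a picker whose true rate is 67% ATS.
    for rate in (0.50, 0.67):
        chance = prob_at_most(wins=1, games=5, hit_rate=rate)
        print(f"hit rate {rate:.0%}: P(1-4 or worse) = {chance:.3f}")
```

A single bad card is not rare even for a good picker under these assumptions, which is why a trend across several weeks is more telling than any one week.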

That’s the bad news. The good news is that the Machine has been reborn. I’m not going to claim it’s better than ever – but it’s functional again and the bugs have been resolved. This week is sort of a ‘minimum viable’ version – the bugs are gone, but it has yet to be optimally tuned with new parameters. Based on backtesting, which I now trust, I expect these picks to be 58% ATS overall in the longer term, and higher-confidence picks to fare even better (3.0 factor and above – 64% ATS, 7.0 factor and above – 69% ATS). It’ll be a long road to get back on top of the leaderboard and I may be too far behind now to catch up – but comebacks always make for a better story anyway … Let’s do this.

Here are the week 9 picks from the Machine:



Good luck in all your week 9 contests.

