The Rotoquant

Describing and predicting the world of sports in the language of mathematics

Month: February, 2014

Honda Classic Picks

I feel like it’s been ages since my last post. I took vacation from my real job for the first time in a long time, and it provided some much-needed relaxation. By sheer coincidence the vacation coincided with a break in the DFS golf season due to the still very interesting Accenture Match Play event that took place last week. If you are a fan of match play, last weekend’s contest and exciting finish certainly didn’t disappoint.

But, if instead you’re a fan of numbers, of prediction, of skill-based prediction games like DFS, last week’s contest was simply a chance to step away from DFS entirely and ponder. I wrote a one-off post about variance last week that was well received in the rotogrinders.com forums, talked a little bit of golf with friends and family, but I mostly just tried to relax. I didn’t even open either my personal or work computer for days at a time. Whole days. Like 24 hours in a row.

I must admit to getting anxious towards the end of my vacation and diving in on all sorts of interesting computational projects I had been too busy to tackle. One of them was to totally uproot the old method of prediction for golf, and double-check the methodology from a data standpoint. I found a few bugaboos hiding in the dark corners of the code, but luckily it only took one fun-and-code-filled Friday night to chase them out. The biggest change is moving away from the old method of normalizing performance to %MFDSP’s (percentage of mean field draft street points) and instead using a rescaled metric as the basis of my prediction – where the outcomes now span from 0 (no fantasy points) to 1 (highest scoring fantasy golfer). This rescaling is a bit more comprehensible, but more importantly it opens the door to the family of computational methods built on logistic regression – a mathematical technique for predicting numbers that fall between 0 and 1.
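To make the rescaling concrete, here is a minimal sketch. It assumes the simplest possible version, dividing every golfer's fantasy points by the top score in the field; the function name, the clipping of negative totals to zero, and the sample numbers are all made up for illustration and are not the actual code behind the projections.

```python
def rescale_to_unit(points):
    """Map fantasy points onto [0, 1]: 0 = no points, 1 = top scorer in the field."""
    top = max(points)
    if top <= 0:
        # Degenerate field where nobody scored: everything maps to 0
        return [0.0 for _ in points]
    # Assumption: negative totals are clipped to 0 so the output stays in [0, 1]
    return [max(p, 0.0) / top for p in points]


# Hypothetical fantasy point totals for a four-golfer field
field_points = [120.0, 90.0, 30.0, 0.0]
scaled = rescale_to_unit(field_points)
print(scaled)
```

Targets bounded this way are what a logistic-style model expects, which is the whole point of the change.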

So this week you will not see any predictions that are greater than one. I did not have time to compare the tournament two weekends ago to Notorious with all these data-side changes – but to the extent I can remember such distant events (I’ve traveled from west coast to east coast and back in that same time span) I had a decent (but not great) week.

Let’s see if we can change all that with the Honda…

Picks for the Honda Classic

Here they are (note Justin Rose and a few others have withdrawn from the tournament … I don’t think Rose is here, but just make sure that whoever you’re starting is actually on the course):

Picks_NUKEHonda Classic_2014

The red line here represents the cut, based off of these projections, back at the normal 70th spot.

I don’t have a lot of time for commentary other than… yup! These are the least surprising looking picks at first glance that I can imagine! Let’s hope that means they are also the most accurate.

Good luck everyone in your contests.  Talk to you next week.

Understanding Variance in the Context of Daily Fantasy Sports

I’m a member of Rotogrinders.com and I frequent the forums there.  Recently there was a thread posted on the topic of variance, and being statistically minded my interest was naturally piqued. I started writing a response on the thread, but it turned into such a rambling dialog that I decided I’d just make it a blog post, and provide a link to it on the forum for anyone who was interested (God help them).  So here it is…

“…

I’m glad I found this thread. My background is computer programming, physics and finance (and the intersection of those things) so when I see a forum post with ‘variance’ in the title I can’t help but get interested.

But now that I’ve read through most of the comments here, I’m starting to think that ‘variance’ is almost never used correctly in the context of DFS. What is being discussed here is, almost universally, ‘diversification’ – in finance diversification is any attempt to remove unsystematic risk from a portfolio. It’s buying hedges against heavily owned positions, or simply owning multiple stocks in different industries that typically zig when their counterparts zag.

This is very different from variance. The most diversified bag of equities is a bag that includes every stock in the world. To approximate this bag, let’s just use the S&P 500. So, owning the S&P 500 is, in theory, taking the most diversified equity position possible (notice I’m intentionally leaving out bonds, cash, and alt. investments for simplicity here). That ‘most diversified bag’ will still experience the *statistical phenomenon of variance* in its measured performance; that is, systematic real-world effects which cannot be diversified away will impact its ROI.

Are there corollaries between finance and DFS? Well, they may be strained, but I think they exist. In DFS the most diversified portfolio would be having every possible team playing against every possible opponent. The expected return of a fully diversified DFS portfolio is probably *very* negative – as you are always paying the rake, and most possible teams are degenerate and would easily lose to anyone who thought about their team-construction seriously.

So I think the best way to understand diversification in DFS is based on one’s projected player performance. Since, theoretically speaking and ignoring bankroll and game-selection issues for a moment, the ROI for someone in DFS is based on their skill level at projecting player performance – ‘investing’ in DFS is really an investment in one’s ability to predict the fantasy performances of athletes. Diversification then refers to one’s financial exposure to athlete risk. Even if, based on your projections, there were only 15 golfers (pardon me, PGA is my sport) projected to be great picks for the week, there would still be 5,005 unique teams of 6 to construct using only these top 15 projections.
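The 5,005 figure is just "15 choose 6". A quick sketch to check it, using placeholder golfer names rather than real projections:

```python
from itertools import combinations
from math import comb

# Number of distinct 6-golfer teams drawn from a pool of 15
n_teams = comb(15, 6)

# Enumerate them explicitly; the golfer labels are placeholders
golfers = [f"golfer_{i}" for i in range(1, 16)]
teams = list(combinations(golfers, 6))

print(n_teams, len(teams))
```

Both numbers agree, and the count grows quickly as the pool widens: a pool of 20 golfers already gives `comb(20, 6)` = 38,760 possible teams.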

Undoubtedly some combinations wouldn’t fit within the salary cap, but perhaps 2,500 of the 5,005 teams are actually affordable. Complete diversification would be owning all of those teams – and playing them in equal measure against every opponent. Diversification in DFS is much more expensive than it is in finance exactly because you can’t simply buy an ‘index of reasonable teams’ in the same way that the S&P 500 is an ‘index of reasonable stocks’ – there is a combinatorial explosion of player combinations and there’s no chance to get equal exposure even to the subset of combinations you think are ‘good’ (i.e. projected to perform).

Inasmuch as I’ve thought about diversification in DFS I think it comes down to one basic tenet: play your best projected teams, no more, no less (if you can afford it).

Variance is something wholly different from diversification. I’ve noticed that there’s a grinders’ norm that says something like: if you’re practicing proper bankroll management you should never invest much more than around 10% of your bankroll on a given night. This is a strategy to handle variance. Most nights in DFS, assuming you are playing a diversified bag of your top teams, are still either going to be great and you win everything (your player projections were more right than the next guy), or not so great and you lose most contests (your player projections were less right than the next guy). It’s OK to talk about having an ROI of such-and-such, but this ROI should not be viewed like the ROI of the S&P 500. The S&P 500 never goes to 0 (it’s possible, but hasn’t happened yet).

If I had to describe the behavior of DFS players in the terms of finance I’d say we are all a bunch of options traders – we have most of our money in cash, and only put a little bit of it at risk at a time because the odds of losing all of it are relatively high. If you’ve never read books like The Black Swan, Fooled By Randomness, or Antifragile you probably should (they are all by an insane financier named Nassim Taleb, but they are fun reads and the ideas are directly applicable to both finance and DFS). But variance describes the risk in your DFS matchups that is un-diversifiable – the risk that your projections/thought process for the week are just wrong. Anyone who plays DFS knows this feeling. This week, for instance, my top projected golfer is Hunter Mahan, who is entering into Sunday tied for 58th! It’s not that my projection methodology is wrong; most golf-minded DFS players would have put Mahan at the top of the projected pack this week (though admittedly I was higher on him than most). It’s that sports really are unpredictable, there’s nothing you can do to change that, so even using fully diversified rosters of your most sound and highest quality picks you will lose often.

That’s variance – not who do I put on my team, or who do I go H-2-H against (diversification) – but rather given that I’m using the soundest strategy possible to me, some weeks I will be highly profitable, others I’ll fall flat on my face.  Some years the S&P is up, other years it’s down.  So variance isn’t something you can actively work to get rid of – in DFS your variance is largely determined simply by how good of a player you are.  If you’re really, really good your variance is going to be lower (because, while you will still lose, you will lose less often than you win).  Similarly, if you’re really, really bad your variance will be lower (because, while you will still sometimes win, you will win less often than you lose). The person who has the hardest time with variance is the average player – week to week they are not sure if they are more likely to be up than down.  It’s very difficult to try and tease out whether you are slightly above average or slightly below average, variance is a sneaky bastard in that way.  But it’s not diversification.  Even a diversified set of teams will experience variance – IMO, in the context of DFS we should be talking about variance as a statistical phenomenon based in the undiversifiable component of risk, and diversification as a method for smoothing out non-systematic player-risk.
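One simple way to see why both the very good and the very bad player have lower variance than the average one: treat each contest as a coin flip that you win with probability p. That coin-flip model is a simplification introduced here for illustration, not something from the forum thread, but under it the variance of a single win/lose outcome is p(1 - p), which peaks at p = 0.5 and shrinks toward either extreme.

```python
def outcome_variance(p):
    """Variance of a single win (1) / lose (0) outcome with win probability p."""
    return p * (1.0 - p)


# Hypothetical win rates, from very bad to very good
win_rates = [0.1, 0.3, 0.5, 0.7, 0.9]
variances = {p: outcome_variance(p) for p in win_rates}

for p, v in variances.items():
    print(f"win rate {p:.1f} -> variance {v:.2f}")
```

The 10% winner and the 90% winner get the same variance, and the 50% player gets the largest, which matches the intuition that the average player has the hardest time telling which side of break-even they are on.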

Anyway – I wanted to bring up this subtle distinction so that we can talk about diversification vs variance more clearly.  They are two important topics, and to have dialog on these subjects it would help to tease them apart and have clear definitions of what each one represents… Hopefully I’ve added a modicum of clarity here, rather than muddying the waters further.

Northern Trust Open Picks, plus a look back at the AT&T Pebble Beach Open

I’ve read the comment on several other blogs, and I must agree… last week’s golf action was somewhat excruciating. Weather delays were a welcome reprieve from endless montages of Peyton Manning signing hats, and Kid Rock doing whatever it is that Kid Rock does. I found the whole thing somewhat annoying to be honest… Compared to, for instance, the Waste Management tournament, the AT&T was outright boring.

And the annoyance wasn’t just that it wasn’t as fun to watch. It was annoying because I knew everything about the tournament was gimmicky and incomparable to other tournaments. Even mother nature didn’t want golf to be played that way – check out the wind on Friday doing magic tricks with one of Greg Chalmers’ putts.

Alright, enough bitching. Last week is so 3 days ago.

This week we’ve got the Northern Trust Open, which I’m really excited about. It’s the first time a lot of things get back to normal golf-wise. It’s the first time we get to see a few big names like Justin Rose and Luis Oosthueizen (holy crap I spelled that right first try… gotta slow down on the golf-media-intake) play on American soil. And it’s not last week. Hooray.

Before we can get to this week’s projections we have to do the deed of seeing how last week went. I can tell you ahead of time it was unspectacular – but that doesn’t mean it was uninformative. We knew that the events and circumstances surrounding last week’s tournament would increase the variance of actual golfer performance – so if we did better overall than we normally do, something would smell foul. I hate to use the word fortunately, so instead I’ll go with ‘as expected’: the prediction performance reflected the actual madness in the real world.

Look back at the AT&T Pebble Beach Fantasy Projection Performance

The same caveats as always apply to this analysis.  

As usual the comparison picks are from Notorious at Rotogrinders.com, a very well respected name in the daily fantasy sports industry. If you don’t know what these graphs are showing (and you care) then just go to the first fantasy golf post that’s hyperlinked above – it contains all the explanations of the relevant concepts used in this analysis.

Below is a grid showing Notorious’ draft street points projections (DSP), the converted mean field draft street points projections (%MFDSP), the machine’s %MFDSP and the actual %MFDSP’s:

Picks_AT&T Pebble Beach_2014_MFP_Grid
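For anyone who wants the gist of the %MFDSP conversion without going back to the first post, here is a rough sketch. It assumes the metric is simply each golfer's points expressed as a percentage of the field's mean points; that reading of the name, the function name, and the sample numbers are assumptions for illustration only.

```python
def to_pct_mfdsp(points):
    """Express each golfer's points as a percentage of the field's mean points."""
    mean_pts = sum(points) / len(points)
    if mean_pts == 0:
        raise ValueError("mean field points is zero; percentage is undefined")
    return [100.0 * p / mean_pts for p in points]


# Hypothetical DSP projections for a three-golfer field (mean = 100)
pct = to_pct_mfdsp([150.0, 100.0, 50.0])
print(pct)
```

On this scale 100 means "exactly field average", so projections from different sources can be compared on a common footing.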

Looking at the RMSE analysis – and almost the exact opposite of what we saw the week prior – last week was the first week in which the Machine’s projections were no better or worse than those by Notorious. Blending the two produced the lowest RMSE, but the reduction isn’t great – a sign that plenty of externalities that were not accounted for in either model were jacking things up:

Picks_AT&T Pebble Beach_2014_RMSE_Grid
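For reference, RMSE (root mean squared error) is the square root of the average squared miss between projection and outcome, so lower is better. A minimal sketch with made-up numbers, not the real grid values:

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists of numbers."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and the same length")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical projections versus actual results for three golfers
machine = [110.0, 95.0, 80.0]
actual = [100.0, 105.0, 80.0]
error = rmse(machine, actual)
print(round(error, 3))
```

Because the misses are squared before averaging, one badly blown projection hurts far more than several small ones.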

I failed to preserve the output from the blending procedure – which is a shame because it was actually cool looking this week. It’s a giant U shape; the optimal blend was .53 – that is 53% Notorious picks and 47% Machine.

I ended up (stupidly) using a 10% split based on historical data. I should have realized that variance in the conditions was less likely to affect a less statistically sensitive prediction, and weighting more heavily things like the Vegas odds – which are a large part of Notorious’ projections – probably would have been a wise move.

Here’s the cut percentage for both of us:

Picks_AT&T Pebble Beach_2014_Made_Cut_Grid

The gale force winds and the presence of my nemesi (dammit spellcheck! – I’ve had enough of your shit for one day – ‘nemesi’ is, without a doubt, the plural form of ‘nemesis’), Young Tom Brady and Darth Belecheck, didn’t only jack up the machine’s predictions…  The machine went from a stellar cut percentage of 70% for the WM Phoenix Open, to a paltry 54% at Pebble, getting beat by Notorious’ much simpler method which posted a respectable 59% compared to his prior 64%.

Lastly – below is the TopN grid I introduced last week. Keep in mind all the caveats around who gets included in this analysis…

Picks_AT&T Pebble Beach_2014_TopN_Grid

Once again Notorious socked it to the robot. Well done, good sir.  I also ought to mention that in Notorious’ handful of named picks last week he managed to pick the winner in Jimmy Walker. So again, props are certainly due even if (or maybe, especially because?) the outing was a bit more random than usual.

At any rate – I think everyone is happy to have moved on to a more sane week in the Northern Trust Open. Without further ado, let’s get to the picks!

Picks for the Northern Trust Open

Here they are (note Webb Simpson and Harris English are new additions here, they did not make the first edition of this post!):

Picks_Northern Trust Open_2014

The red line here represents the cut, based off of these projections, back at the normal 70th spot.

A few comments I have to make about these projections…

If I could pick two golfers to grab a beer with it might be Hunter Mahan and Graham DeLaet. These guys just seem like awesome dudes. So I’m absolutely pumped that, based on how far out Mahan is projected here, I will probably have him on every roster. It’s always nice when your strategy and fan-interests intersect. The downside? Well, the last discrepancy that was this large was when Tiger was playing at Torrey… and we all remember what happened there. And last week Furyk didn’t exactly blow the field away, though he was projected as a fairly clear number one.

Sidebar: Do I think there’s something wrong with the algorithm because of these two data points? Not at all! These are real-world sporting events, and golf no less. The results are highly variable. If anyone could predict winners week in and week out there’d be no need to play DFS – they could just go to Vegas and start printing money. The purpose of the algorithm is to perform over the longer term… so two weeks of projected winners performing poorly does not a trend make. Their underperformance was more than compensated for by the projections of several key fades (see Jason Day last week), and by a stellar performance on most of the top 25 week in and week out.

Other interesting things to note here: Fred Couples with a top 15 finish? Luis Oost (… screw it I’m not risking that spelling again), Keegan, Haas, Dufner, Walker, Jones, Na – all these big/hot names ending up outside the top 20, some even outside the top 30? Seems crazy – but the machine has been right about crazier things before….

Good luck everyone in your contests.  Talk to you next week.

AT&T Pebble Beach Pro-Am Picks, plus a look back at the WM Phoenix Open

I’m just going to come out and say it: I enjoyed watching the Waste Management Phoenix Open more than the Super Bowl. 

It’s not just that the Super Bowl was a sham of a game, that Seattle was so obviously better than Denver in almost every facet of the game, and that Denver couldn’t get out of its own way. But also the Phoenix Open was really exciting! Bubba’s downfall, and the big boy Stads pulling through with the victory, along with a monster weekend comeback from Graham DeLaet, and Phil deciding to play after all. Golf is really fun to watch, particularly when you’re playing daily fantasy golf and every stroke has implications on whether or not you’ll profit for the week.

I managed to pull down an 8th place finish out of 140 in the big DraftKings.com GPP, and made out handsomely in DraftStreet as well. So far I’ve been on a roll with golf, and so far all of my teams have been 100% determined by algorithms (although I have spiced them up with anywhere from 5-15% of the projections made by Notorious at rotogrinders.com).  

Let’s hope that the magic doesn’t stop. Feeling really good about this week – despite the fact that the format of the tournament is similar to the wacky Humana Challenge with three courses in play, a pro-am/celebrity feature, and the cut coming at the end of 54 holes. Unlike Humana, this tournament should not be as much of a shootout – which is always something to consider heavily when making a DFS roster. And, just because, here’s Bill Murray wearing camouflage in the 2011 tournament (he’s a staple at the Pebble Beach Pro-Am, but won’t be participating this year due to schedule conflicts with the filming of a new flick… Caddyshack 3 anyone?!?).

Look back at The Waste Management Phoenix Open Fantasy Projection Performance

The same caveats as always apply to this analysis.  

As usual the comparison picks are from Notorious at Rotogrinders.com, a very well respected name in the daily fantasy sports industry. If you don’t know what these graphs are showing (and you care) then just go to the first fantasy golf post that’s hyperlinked above – it contains all the explanations of the relevant concepts used in this analysis.

Below is a grid showing Notorious’ draft street points projections (DSP), the converted mean field draft street points projections (%MFDSP), the machine’s %MFDSP and the actual %MFDSP’s:

Picks_Phoenix Open_2014_MFP_Grid

Last week was one of the first weeks of what I might call a ‘traditional’ golf tournament (well, sort of). No pro-am, all rounds played on the same course, a traditional cut at the end of day 2, etc. This format obviously lent itself to algorithmic prediction. Looking at the RMSE analysis, this was the first week in which the Machine didn’t really benefit much at all from being blended with the projections by Notorious; the Machine’s projections cut the RMSE in half on their own:

Picks_Phoenix Open_2014_RMSE_Grid

Here is the output from the blending procedure to find the optimal blending alpha between the two:

Picks_Phoenix Open_2014_Blending_Plot

The farther left you are on this line, the more of the machine’s projections you are using; the farther right, the more of Notorious’ projections. So the optimal blending parameter for the WM Phoenix Open turned out to be about 3% Notorious and 97% Machine. I will probably reduce the split I use personally back down to 8% or so this week and see how things go. Although, I must say, even using a 15% split which was – post facto – suboptimal, I fared quite well anyway, a credit to the value of Notorious’ simple methodology.
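The blending search itself is easy to sketch. The version below assumes a plain grid search over alpha, where alpha is the weight on Notorious and (1 - alpha) the weight on the Machine, scored by RMSE against actual results. The function names and the toy numbers are hypothetical, and the real procedure may well differ.

```python
import math


def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists."""
    return math.sqrt(
        sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual)
    )


def best_blend(machine, notorious, actual, steps=100):
    """Grid-search alpha in [0, 1]; alpha is the weight given to Notorious."""
    best_alpha, best_err = 0.0, float("inf")
    for i in range(steps + 1):
        alpha = i / steps
        blended = [alpha * n + (1 - alpha) * m for m, n in zip(machine, notorious)]
        err = rmse(blended, actual)
        if err < best_err:
            best_alpha, best_err = alpha, err
    return best_alpha, best_err


# Toy data, constructed so that a 25% Notorious / 75% Machine blend fits best
machine = [100.0, 90.0, 80.0]
notorious = [120.0, 70.0, 100.0]
actual = [105.0, 85.0, 85.0]
alpha, err = best_blend(machine, notorious, actual)
print(alpha, err)
```

Plotting the error at each alpha gives the kind of curve shown in the blending plot. Keep in mind the optimal alpha is only known after the fact, which is why the split used going into a week has to come from historical data.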

Here’s the cut percentage for both of us:

Picks_Phoenix Open_2014_Made_Cut_Grid

This week was closer to what I’d seen to this point in the season.  A 70% made-cut percentage is basically as good as we can hope for. If the machine could do that every week, I think it’d be tough not to make a profit.

Lastly – and a new feature for the blog – I want to give those less mathematically inclined an idea of how many of the ‘top-n’ players were predicted correctly in each method.  Keep in mind all the caveats around who gets included in this analysis… If I get the comment ‘But, so-and-so didn’t finish in the top 10!’ I’m going to be very disappointed, dear reader. 

Picks_Phoenix Open_2014_TopN_Grid

I’d say we both had a solid predictive outing.  If you can predict 1 of 5 top 5’s and 3 of 10 top 10’s, you’re firing on all cylinders.  As for Notorious, while the machine ousted him in the top 10 picks – look at how awesome his top 25 picks were!  That 52% mark – or 14 of the top 25 – is pretty incredible IMO.
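For the curious, a top-N count like this can be computed as a simple overlap. The sketch assumes a "hit" means a golfer who was both projected in the top N and actually finished in the top N, and it ignores ties and the inclusion caveats mentioned above; the names and orderings are placeholders.

```python
def top_n_hits(projected_order, actual_order, n):
    """Count golfers who appear in both the projected and the actual top n."""
    projected_top = set(projected_order[:n])
    actual_top = set(actual_order[:n])
    return len(projected_top & actual_top)


# Placeholder golfers, listed best-to-worst by projection and by actual finish
projected = ["A", "B", "C", "D", "E", "F"]
actual = ["C", "F", "A", "E", "B", "D"]

hits = top_n_hits(projected, actual, 3)
print(f"{hits} of the top 3 predicted correctly")
```

Dividing the hit count by N gives the percentage form used in the grid.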

Ok, so for all my earlier comments about how relatively normal the format of the WM Phoenix Open tournament is, the same cannot be said about the atmosphere. There were over 500,000 visitors to the tournament over its four day span, and 190,000 on Saturday alone – making it the second largest sporting event next to the Indianapolis 500. Many golfers thrive on this environment – with loud chanting and roars that can be heard around the course emanating from the stadium surrounding the par-3 16th hole (see last week’s description). Some of them despise it, and it may well have been the cause of Bubba’s breakdown coming down the stretch that allowed Stadler to steal the victory. This crazy environment may well serve as an externality that we cannot account for mathematically – so perhaps there is still some predictive edge to be eked out as the tournaments progress, and we get into the more meat-and-potatoes, bog-standard tournament formats.

All is ultimately leading up to the Masters, where DraftStreet has already announced the first ever $100K guaranteed tournament for fantasy golf (they are running satellites into this tourney this week for the AT&T Pro-Am). It’s crazy how fast DFS golf is growing – it seems like just last year when a $100K tournament was a big deal for daily fantasy football, and we were lucky to get a $10K golf event. The tournaments run weekly for golf on DraftKings are larger than the largest tournaments in the industry were just last year. I absolutely can’t wait to see how DraftKings counters DraftStreet’s Masters tourney… and we’re just getting started.

Ok, enough anticipation of the great events to come – we’ve got ourselves a humdinger teeing off on Thursday! Let’s get into it…

Picks for the AT&T Pebble Beach National Pro-Am

Here they are:

Picks_AT&T Pebble Beach_2014

The red line here represents the cut, based off of these projections. Keep in mind the cut is on Saturday, and to add to the weirdness it’s the top 60 golfers and ties instead of the traditional top 70. 

A few comments I have to make about these projections. 

First – Jim Furyk being by far the top projected golfer looks, to my eyes, like a giant dollar sign. No one likes to pick Jim – he’s the least sexy name in golf. PGATOUR.com has totally ignored him in their top 15 power rankings for this tournament, even though he’s by far the most accomplished player in this field (besides Phil), he has a solid track record at this event, and his form at the end of last season was top-notch.  I expect him to be relatively under-owned, representing a chance for relative value.  

There are some other interesting projections in there, but I’ll leave them to the reader to discover. 

Good luck everyone in your contests.  Talk to you next week.