NBA 2018-2019 Season: Daily Fantasy
My results and expected long term success in playing daily fantasy during the 2018-2019 NBA Season
Overview
As context, I’ve been building models and algorithms to create daily NBA fantasy lineups for the past two seasons. In these daily fantasy competitions, you are required to select a lineup of players with constraints on the number of players at each position and the combined salary. According to a scoring rubric, each player in your lineup generates points based on their performance during that day. In the particular competition I’ve focused on called either 50/50s or double-ups, the goal is to pick a lineup that generates enough points to roughly place you in the top 50%.
My goal this season was to build upon my first and second attempts and determine whether a model- and data-based strategy could be profitable. Building on the data, models, and optimization techniques of those previous attempts, I decided to focus on three items:
- Model Improvements: Test models developed using xgboost with new features
- Better Evaluation: Assess my models on estimated win percentage instead of expected fantasy points
- New Contests: Enter contests at DraftKings in addition to the contests on FanDuel
Below, I’ll walk through each of these items and then share my conclusions and outlook for future seasons.
Model improvements
This season, I dramatically improved my models for predicting a given player's fantasy points, most likely due to the use of xgboost. Comparing the best model from this season to the best model from last season, there's a step-change improvement:
| Model | MSE |
| --- | --- |
| Old Model | 102 |
| New Model | 67 |
These improvements most likely came from two sources: 1) using xgboost over my old sklearn models and 2) incorporating new features. Though it's hard to separate the impact of each of these improvements, I did observe a large isolated improvement from xgboost. When using the same build sample and features as models built using sklearn's ridge regression and GBM modules, xgboost reduced the MSE from 85.1 to 64.6. This improvement may even be understated: my old sklearn models were tuned pretty extensively, while these xgboost models were not. If the xgboost models were tuned with similar rigor, the MSE could drop even more.
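To make the comparison concrete, here's a minimal sketch of fitting both model types on the same sample. The data is synthetic and the hyperparameters are illustrative stand-ins, not the ones from my actual builds:

```python
import numpy as np
import xgboost as xgb
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 20))                                  # placeholder player/team features
y = X[:, :5].sum(axis=1) * 4 + rng.normal(scale=8, size=5000)    # placeholder fantasy points

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Old approach: a tuned sklearn linear model
ridge = Ridge(alpha=1.0).fit(X_train, y_train)
mse_ridge = mean_squared_error(y_test, ridge.predict(X_test))

# New approach: xgboost with mostly default hyperparameters
booster = xgb.XGBRegressor(n_estimators=300, learning_rate=0.05, max_depth=4)
booster.fit(X_train, y_train)
mse_xgb = mean_squared_error(y_test, booster.predict(X_test))

print(f"ridge MSE: {mse_ridge:.1f}, xgboost MSE: {mse_xgb:.1f}")
```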
New and improved features also likely improved the models this season. Based on several iterative model builds, the biggest value-added feature appeared to be changing how I included team-level data. Previously, I used dummy indicators for each team to capture team-specific behavior. That had a few downsides: team performance wasn’t consistent season over season (and probably not even within a season) and data was very sparse for each team in the league. To solve these problems, I generalized team-level data by removing these dummy indicators and adding the average defensive and offensive ratings and the points scored by each team. As a result, these features were more grounded and stable than the previous team variables and allowed me to use past season data to build models.
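As a rough illustration of this change, the sketch below swaps team dummy indicators for opponent-level ratings via a join. The frames and column names are hypothetical stand-ins for my actual data:

```python
import pandas as pd

# One row per player-game (illustrative).
games = pd.DataFrame({
    "player": ["A", "B"],
    "opponent": ["BOS", "GSW"],
})

# Team-level aggregates replacing the old one-dummy-per-team encoding.
teams = pd.DataFrame({
    "team": ["BOS", "GSW"],
    "off_rating": [112.3, 115.8],     # average offensive rating
    "def_rating": [107.1, 110.4],     # average defensive rating
    "points_scored": [111.5, 117.2],  # average points scored
})

# Attach the opponent's ratings directly instead of learning a team dummy.
features = games.merge(
    teams.rename(columns={"team": "opponent"}), on="opponent", how="left"
)
print(features)
```

Because these ratings are defined the same way in any season, models built on them can be trained on prior-season data, unlike team dummies that must be re-learned each year.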
Beyond the above change to team-level data, none of my other new or improved features appeared to increase performance. Here's a list of those other variables along with a summary of my initial hypothesis for each:
- Per-minute metrics: perhaps separating out the rate of points, which is controlled by the player, from minutes played, which is controlled by the coach, can improve overall accuracy
- Building on a smaller subset based on minutes played per game: perhaps optimizing the models on data points with more minutes played will result in better accuracy for players who play more minutes and have larger impacts on a lineup
- Incorporating trends: perhaps including features that indicate performance trends can improve accuracy compared to the current features that just take a flat average of past performance (see the sketch after this list)
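For concreteness, here's a sketch of how the per-minute and trend features above might be computed from a player's game log. The window lengths, values, and column names are illustrative:

```python
import pandas as pd

log = pd.DataFrame({
    "fantasy_points": [31.2, 28.5, 40.1, 22.0, 35.7],
    "minutes":        [30,   29,   36,   24,   33],
})

# Per-minute rate: separate production rate from playing time.
log["fp_per_min"] = log["fantasy_points"] / log["minutes"]

# Flat average of past performance (what the old features used);
# shift(1) ensures each game only sees prior games.
log["fp_avg_3"] = log["fantasy_points"].shift(1).rolling(3).mean()

# Trend: short-window average minus long-window average; positive = heating up.
log["fp_trend"] = (
    log["fantasy_points"].shift(1).rolling(2).mean()
    - log["fantasy_points"].shift(1).rolling(4).mean()
)
print(log)
```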
Better Evaluation
I also developed a better way to evaluate the business value of new models by tying their performance directly to daily fantasy contest outcomes. Evaluating performance directly on contest outcomes is important because there's a lot of non-linear behavior between the model's prediction and the outcome of a contest. First, the model predictions are put into an algorithm to determine the optimal daily fantasy lineup. Then this lineup competes against the lineups of others. Hypothetically, there could exist a scenario where selecting a typically lower scoring but less popular player is more profitable due to the differentiation from the competition. Because of this non-linearity, it's best to assess model performance on that last step, since it's ultimately what we care about.
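In practice, this last-step evaluation reduces to replaying each historical contest. Below is a simplified sketch, assuming I have each slate's optimized-lineup score and the contest's top-50% cutoff (the values and names are illustrative):

```python
def win_rate(slates):
    """slates: list of (lineup_actual_points, contest_cutoff) pairs,
    where the cutoff is the score needed to finish in the top 50%."""
    wins = sum(points >= cutoff for points, cutoff in slates)
    return wins / len(slates)

# Example: three historical 50/50 contests.
history = [(305.4, 290.0), (276.1, 288.5), (301.0, 295.2)]
print(f"simulated winning percentage: {win_rate(history):.0%}")
```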
Using actual daily fantasy contest data for 50/50 contests (where the top 50% of lineups win) enabled me to simulate each model's winning percentage. Interestingly, I noticed an unintuitive pattern: the model with the highest fantasy points was not the model with the best winning percentage:
| Model | Mean FanDuel Fantasy Points | Winning Perc. |
| --- | --- | --- |
| Model A | 292 | 52% |
| Model C | 290 | 48% |
| Model D | 297 | 45% |
| FanDuel Model | 253 | 14% |
As a disclaimer, this pattern could be due to noise from the relatively low sample size of 42 contests. Changing the outcome of a couple of contests could substantially change the results. Still, given the nature of these contests, where only a limited number run per season, I think it's worthwhile to explore the implications of this data.
Going back to the table, there's an inverse relationship between mean fantasy points and winning percentage. Intuitively, you'd expect higher fantasy points to translate into higher winning percentages. My hypothesis for this relationship is that the additional data present in Model D over Model A caused Model D to produce predictions more similar to the crowd's, costing it some of its competitive edge. Compared to Model A, Model D adds new features related to the variance of fantasy points over the last n games. I thought it might make sense to penalize players who are inconsistent and show high variance in their performance. When FanDuel users set their lineups, the site provides a line graph of a player's past n games, showing the trend and stability of their fantasy output. Since this variance data is so readily available, I hypothesize that many users incorporate it into their choices, which causes Model D's lineups to overlap more with the competition's. When this variance data is not as accessible, as on DraftKings' website, I found that including the variance features did indeed increase the model's winning percentage (as shown in the section below).
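For reference, the variance feature itself is simple to compute. Here's a sketch using a rolling standard deviation over the prior n games (the values are illustrative, and n = 5 is one choice among many):

```python
import pandas as pd

# Past fantasy points for one player (illustrative values).
fp = pd.Series([31.2, 28.5, 40.1, 22.0, 35.7, 29.3, 33.8])

# Model D-style feature: standard deviation over the prior 5 games,
# shifted so each game only uses information available before tip-off.
fp_std_5 = fp.shift(1).rolling(5).std()
print(fp_std_5)
```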
I also found it interesting how poorly the default FanDuel model performed, winning only 14% of contests even when choosing the optimal lineup given its predictions. Compared to Model A, a 13% drop in FanDuel points resulted in a relative 73% drop in wins. That steep drop-off supports the claim that the average FanDuel contestant has access to more accurate information and strategies than the FanDuel model alone provides.
New Contests
This season, I also competed on another daily fantasy website, DraftKings. As in my first season on FanDuel, I entered the beginner 50/50 contests, where you win by finishing in the top 50% of entries and the entrant pool is limited to inexperienced DraftKings users. Assuming these users are truly beginners, these contests should be less competitive than the regular contests open to all users.
Beyond this short-term perk of beginner contests, I also believe DraftKings is strategically better suited for models and lineup optimization due to its more complex scoring and lineup logic. The scoring logic on DraftKings is much more complex than FanDuel's, making it harder for a typical user to estimate a player's fantasy point output: bonus points are awarded for double-doubles, additional bonus points are awarded for triple-doubles, and there are bonus points for three-pointers. Here's the full scoring logic:
- DraftKings Fantasy Points = Points + 0.5*Three-Point Makes + 1.25*Rebounds + 1.5*Assists + 2*Blocks + 2*Steals - 0.5*Turnovers + 1.5*Double-Double + 3*Triple-Double
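That rubric translates directly into code. The sketch below implements it, assuming the standard definition of a double-double or triple-double as reaching 10+ in two or three of the five counting categories (points, rebounds, assists, blocks, steals), with the bonuses stacking:

```python
def dk_fantasy_points(pts, threes, reb, ast, blk, stl, tov):
    """DraftKings NBA fantasy points per the rubric above."""
    categories_10plus = sum(stat >= 10 for stat in (pts, reb, ast, blk, stl))
    score = (
        pts + 0.5 * threes + 1.25 * reb + 1.5 * ast
        + 2 * blk + 2 * stl - 0.5 * tov
    )
    if categories_10plus >= 2:
        score += 1.5   # double-double bonus
    if categories_10plus >= 3:
        score += 3.0   # additional triple-double bonus
    return score

# Example: a 25-point, 11-rebound, 10-assist triple-double with two threes.
print(dk_fantasy_points(pts=25, threes=2, reb=11, ast=10, blk=1, stl=2, tov=3))
```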
The lineups are also more complex than FanDuel's. On DraftKings, there is a spot for each of the five standard positions, plus spots for more generalized positions (guard, forward, and utility). These additional degrees of freedom increase the number of possible lineups, which makes finding the optimal lineup without an algorithm much harder.
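To show what that optimization looks like, here's a hedged sketch of the lineup problem as an integer program using the open-source PuLP library (not necessarily the tooling I used). The slot structure follows DraftKings' classic NBA format; the player pool, salaries, projections, and the $50,000 cap are illustrative stand-ins:

```python
import pulp

# Toy player pool: name -> (salary, projected points, eligible slots).
# A real pool would have hundreds of players and model-based projections.
players = {
    "PG1": (9800, 52.1, {"PG", "G", "UTIL"}),
    "PG2": (4400, 24.0, {"PG", "G", "UTIL"}),
    "SG1": (8200, 44.0, {"SG", "G", "UTIL"}),
    "SG2": (3900, 20.5, {"SG", "G", "UTIL"}),
    "SF1": (7900, 41.2, {"SF", "F", "UTIL"}),
    "SF2": (3600, 18.8, {"SF", "F", "UTIL"}),
    "PF1": (8800, 46.3, {"PF", "F", "UTIL"}),
    "PF2": (3500, 18.1, {"PF", "F", "UTIL"}),
    "C1":  (9200, 48.7, {"C", "UTIL"}),
    "C2":  (3000, 15.4, {"C", "UTIL"}),
}
slots = ["PG", "SG", "SF", "PF", "C", "G", "F", "UTIL"]

prob = pulp.LpProblem("dk_lineup", pulp.LpMaximize)
# x[p, s] = 1 if player p fills slot s.
x = {(p, s): pulp.LpVariable(f"x_{p}_{s}", cat="Binary")
     for p in players for s in slots if s in players[p][2]}

prob += pulp.lpSum(players[p][1] * v for (p, _), v in x.items())           # maximize projected points
prob += pulp.lpSum(players[p][0] * v for (p, _), v in x.items()) <= 50000  # salary cap
for s in slots:    # fill each slot exactly once
    prob += pulp.lpSum(v for (_, s2), v in x.items() if s2 == s) == 1
for p in players:  # use each player at most once
    prob += pulp.lpSum(v for (p2, _), v in x.items() if p2 == p) <= 1

prob.solve(pulp.PULP_CBC_CMD(msg=False))
lineup = sorted((s, p) for (p, s), v in x.items() if v.value() == 1)
print(lineup)
```

The binary slot-assignment variables are what capture the extra degrees of freedom: a point guard can land in the PG, G, or UTIL slot, and the solver picks whichever assignment maximizes projected points under the cap.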
In my first season on DraftKings, I performed decently, but did not hit a 50% winning percentage:
| Model | Mean DraftKings Fantasy Points | Winning Perc. |
| --- | --- | --- |
| Model DK A | 246 | 19% |
| Model DK D | 268 | 48% |
For DraftKings, my dataset was limited to just 21 games, so the variance in these metrics is relatively high and it's hard to get a read on the overall profitability of the best performing model, Model DK D. There does, however, appear to be a substantial improvement from Model DK A to Model DK D. This is interesting because these DraftKings models parallel the FanDuel models: Model D adds features related to the variance of past fantasy performance, while Model A does not. In the DraftKings case, adding those features substantially improved performance over Model A, while in the FanDuel case there was either a degradation in performance or no difference at all.
Takeaways and Next Steps
Though the progress of this season didn't result in a massively profitable model, it did set up the infrastructure well for next season. On the modeling side, I generalized the team features, allowing me to train models on previous seasons, which should let me start entering contests sooner next season (previously I needed to collect enough data for each team at the start of each season). On the assessment side, I set up a process to more directly tie model performance to value, which should enable me to make better decisions on new models and features in the future. Finally, I built the ability to enter contests on DraftKings, giving me a second site on which to compete.
This work should make me more competitive next season, but there are some risks that may weaken the effectiveness of my models.
First, the general competition may embrace similar lineup optimization frameworks, chipping away at the effectiveness of my process. I've seen several sites that give away free optimal lineups, and I've seen FanDuel's website offer recommended lineups and picks that may use a similar framework. If the general competition does embrace these tools, my lineups will lose some of their competitive edge, making it harder to win contests.
Second, the legalization of sports betting may push the industry to develop better models, perhaps even making mine obsolete. With more and more states legalizing sports betting, there are stronger incentives for individuals and businesses to develop their own models to predict sporting events. As those models improve, some of that progress may spill over into the models that predict daily fantasy performance. If that happens, my models' effectiveness will decrease, hurting my chances of winning contests.