When Will LeBron Eclipse Kareem?
If you all are not interested in the preamble and want to jump straight to the results, please scroll down to the section titled Results.
To learn more about how you can use these techniques with your teams, check out ProKanban.Org’s Applying Metrics for Predictability course.
Daniel Vacanti and I recently received a question from an old colleague. Mike Suarez had been watching episodes of Drunk Agile. The episode on Wilt Chamberlain, which takes a closer look at some of Wilt’s stats, sparked a question for him: “I was wondering if MonteCarlo can predict when LeBron would pass Kareem?”
LeBron currently has 37,062 career points. Kareem Abdul-Jabbar, who retired in 1989, finished his career with a total of 38,387 points. Before we get into the analysis, let us take a quick moment to appreciate the fact that this record has stood for over 30 years. It is a testament to both the offensive capabilities and the longevity of the Lakers (and Bucks) great.
As I write this towards the end of August, we are a month and a half away from the start of the NBA season. Most analysts expect LeBron to score the 1,325 points needed to eclipse Kareem in the 2022–23 season. ESPN, in its article discussing this, projects that LeBron will pass Kareem during the Lakers’ 49th game of the season.
Their projection is based on LeBron’s career average of 27.1 points per game. The simple projection divides the points still needed (1,325) by the average (27.1 points per game) and rounds up. If you plug that into a calculator, you will get 49 games (rounded up from 48.89).
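For readers who want to check the arithmetic, here is a minimal Python sketch of that single-point forecast. The function and variable names are made up for illustration; the numbers are the ones quoted above.

```python
import math

def single_point_forecast(points_needed: int, points_per_game: float) -> int:
    """Games needed if every game matched the average exactly."""
    return math.ceil(points_needed / points_per_game)

career_points = 37_062   # LeBron's total, as quoted in the article
record = 38_387          # Kareem's career total
points_needed = record - career_points

games = single_point_forecast(points_needed, 27.1)
print(points_needed, games)
```

Note that the output is a single number with no range and no probability attached to it.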
This, as regular listeners of Drunk Agile would recognize, is strict ‘Flaw of Averages’ territory. As a prediction, it has a very basic problem: it is just a single-point forecast, rooted in deterministic instead of probabilistic thinking. All forecasts should have at least two parts: a range and a probability. Hence Mike’s question about applying Monte Carlo to get a more informative forecast.
I ran LeBron James’ performances from the past three seasons through Monte Carlo and the results are shown below. Why choose the last three seasons? They are the most recent of LeBron’s performances and hence most likely to represent the upcoming season. Also, these are with the same team that he is currently suiting up for. We could use his entire career, but I made the judgment call that the last three seasons would be a better model than the entire career.
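For the curious, the general idea can be sketched in a few lines of Python. This is a rough illustration, not the exact tool used for this article. It assumes each simulated game is drawn at random (with replacement) from the list of past game scores, and the `sample_log` values are made-up placeholders, not LeBron's actual game log.

```python
import random
from collections import Counter

def simulate_season(game_log, points_needed, season_games=82, rng=random):
    """Return the game number at which the target is reached, or None."""
    total = 0
    for game in range(1, season_games + 1):
        total += rng.choice(game_log)  # resample one past performance
        if total >= points_needed:
            return game
    return None  # target not reached this season

def run_simulations(game_log, points_needed, runs=1000, seed=None):
    """Count how often each game number is the record-breaking game."""
    rng = random.Random(seed)
    return Counter(
        simulate_season(game_log, points_needed, rng=rng) for _ in range(runs)
    )

# Hypothetical placeholder scores, NOT real data.
sample_log = [18, 22, 25, 26, 28, 29, 30, 31, 33, 36, 39, 43]
outcomes = run_simulations(sample_log, points_needed=1325, runs=1000, seed=42)
```

To reproduce something closer to the article, `sample_log` would be replaced with the per-game point totals from the last three seasons listed on Basketball Reference.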
Results
Using the past three seasons as a baseline, running a thousand Monte Carlo simulations gives us the following results.
In the table above, the first column is the game of the season we are referring to, the second is how many times in the simulations LeBron passed Kareem in that game, and the final column is the cumulative percentage of simulations in which LeBron has passed Kareem by that game.
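As an aside, those three columns are simple to derive from raw simulation outcomes. The sketch below assumes the outcomes are stored as a count per game number. The `toy` data is a made-up ten-run example for illustration only, not the article's results.

```python
from collections import Counter
from itertools import accumulate

def results_table(outcomes: Counter):
    """Rows of (game, count, cumulative percent), sorted by game number."""
    total = sum(outcomes.values())
    games = sorted(g for g in outcomes if g is not None)
    counts = [outcomes[g] for g in games]
    cumulative = [100 * c / total for c in accumulate(counts)]
    return list(zip(games, counts, cumulative))

# Hypothetical toy outcomes from 10 simulation runs, NOT the real results.
toy = Counter({48: 2, 49: 3, 50: 4, 51: 1})
for game, count, cum_pct in results_table(toy):
    print(f"game {game}: {count} runs, {cum_pct:.0f}% cumulative")
```

Runs that never reach the target (stored under `None`) still count towards the total, so the cumulative column can top out below 100%.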
According to the simulation results, the earliest LeBron passes Kareem is game 42 of the 2022–23 season. The probabilities increase as we get further into the season. Passing Kareem on or before game 48 has about a 25% chance, while we are over 90% confident that LeBron gets there on or before game 52. These simulation results also indicate that we have full confidence (which we probably never should) that LeBron gets to 38,387 points by game 55.
ESPN’s prediction of game 49 shows up 194 times out of the 1,000 simulations, which indicates that there is about a 19% chance of this prediction coming true. LeBron is about 44% likely to break the record on or before game 49. Here is a graphical representation of these results.
There is one more assumption we have made in these simulations: LeBron will play every Lakers game this season. In the past three seasons (225 games), LeBron has not played in 57 games. This means there is a decent chance that he misses some games this season as well. What if we take these missed games into account? How does that change the results? Below are the results assuming that LeBron misses games at the same rate as he has in the past three seasons.
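One way to model this, sketched below in Python, is to let each simulated game be played with probability 168/225 (the share of games LeBron appeared in) and to add zero points when it is not. This is an assumption about how to encode absences, not necessarily how the simulations for this article did it. Adding a zero-point entry to the game log for every missed game would behave the same way.

```python
import random

def simulate_with_absences(game_log, points_needed, play_rate,
                           season_games=82, rng=random):
    """Game number at which the target is reached, or None if never."""
    total = 0
    for game in range(1, season_games + 1):
        if rng.random() < play_rate:       # he suits up for this game
            total += rng.choice(game_log)  # resample one past performance
        if total >= points_needed:
            return game
    return None

team_games, missed = 225, 57               # figures quoted in the article
play_rate = (team_games - missed) / team_games
print(round(play_rate, 3))
```

Because a simulated season can now run out of games before the target is reached, `None` becomes a possible outcome.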
The results get a bit more interesting. The earliest our simulations have LeBron hitting the 38,387 mark is game 52. We are about 50% confident around game 66 and 90% confident by game 74.
Two very interesting results here. First, ESPN’s prediction of game 49 does not even show up. If LeBron misses games at the same rate as he has in the past, our simulations suggest that he will not break the record by game 49. Second, LeBron has a 0.5% chance of not breaking the record this season. In 5 out of the 1,000 simulations that took his absences into account, LeBron did not get to 38,387 career points. That is a possibility that would not be revealed if we did not take a probabilistic approach.
Continuous Forecasting
The simulations and the forecasts here are all pre-season. As Dan and I discussed in Drunk Agile Episode 4 (MonteCarlo - What Not To Do), the best way to get good forecasts is to continuously reforecast as we get new information.
We need to do the same here. After every game, subtract the points scored from the points remaining and add the new game to our model. Reforecast and see where we end up next. This will continuously adjust our predictions as new information about missed games or great performances comes in. As LeBron plays (or does not play) games during the season, we will be able to use this new information to update our predictions.
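A rough Python sketch of that update-and-reforecast loop is shown below. The names `update` and `forecast` are illustrative and the 90% confidence level is just a default chosen for this example. It also assumes a missed game is recorded as a zero-point entry in the game log.

```python
import math
import random

def games_needed(game_log, points_needed, games_left, rng):
    """Games from now until the target is reached, or None if never."""
    if points_needed <= 0:
        return 0
    total = 0
    for game in range(1, games_left + 1):
        total += rng.choice(game_log)
        if total >= points_needed:
            return game
    return None

def forecast(game_log, points_needed, games_left,
             runs=1000, confidence=0.9, seed=None):
    """Games from now by which `confidence` of runs hit the target."""
    rng = random.Random(seed)
    results = [games_needed(game_log, points_needed, games_left, rng)
               for _ in range(runs)]
    hits = sorted(r for r in results if r is not None)
    index = math.ceil(confidence * runs) - 1
    return hits[index] if index < len(hits) else None

def update(game_log, points_needed, games_left, points_scored):
    """Fold one new game (0 points if missed) into the model state."""
    return (game_log + [points_scored],
            max(points_needed - points_scored, 0),
            games_left - 1)
```

After each Lakers game, call `update` with the points scored and then `forecast` again on the new state. A `None` from `forecast` means fewer than the requested share of runs reached the target in the games that remain.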
If you are interested in learning more about how to use these techniques in predicting “When will it be done?”, check out ProKanban.Org. In particular, check out the Applying Metrics for Predictability course.
References
Basketball Reference: https://www.basketball-reference.com/players/j/jamesle01.html