On March 3rd, Warren Buffett and Quicken Loans rolled out the Quicken Loans Billion Dollar Bracket Challenge, which pays $1 billion, with a B, to anyone with a perfect March Madness bracket. The odds against filling out a perfect bracket, just through the first round, are 13 million to 1, and the odds against a perfect bracket through all 63 games are 9 quintillion to 1. This is exactly the scenario where Big Data can drastically improve the odds.
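For readers who want to check the 9 quintillion figure, here is a quick sketch. It assumes every one of the 63 games is an independent 50/50 coin flip, which is an assumption of this illustration rather than anything stated by the contest.

```python
# Assumption: each of the 63 tournament games is an independent coin flip,
# so every game doubles the number of possible brackets.
games = 63
possible_brackets = 2 ** games

print(f"{possible_brackets:,}")  # 9,223,372,036,854,775,808 (about 9.2 quintillion)
```

Any model that picks winners better than a coin flip shrinks those odds, which is the whole point of the class project described next.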
Tim Chartier is an Associate Professor of Mathematics at Davidson College in Davidson, NC, near Charlotte, and for the last 5 years he has taught his students how to build predictive models of the NCAA college basketball tournament. In other words, they use Big Data to try to fill out the perfect March Madness bracket. Have they been perfect? No, but they are better than most: last year they beat over 8 million bracket entries, or 96% of the brackets submitted to ESPN. The students learn how to build the linear model in Chartier's Finite Math class. They then feed in data on over 5,000 regular season games and let their model go to work. It's up to each student which variables to use and how to weight them.
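The article does not say which linear model the class uses, so the following is only a sketch of one well-known linear rating system, the Colley method, swapped in for illustration. The team names and results are hypothetical, and the function name `colley_ratings` is invented for this example.

```python
def colley_ratings(teams, games, sweeps=200):
    """Rate teams from (winner, loser) results with the Colley linear system C r = b.

    Illustrative stand-in only: the source does not confirm this is the class's model.
    """
    idx = {t: i for i, t in enumerate(teams)}
    n = len(teams)
    # C starts as 2 on the diagonal; b starts as 1 for every team.
    C = [[0.0] * n for _ in range(n)]
    b = [1.0] * n
    for i in range(n):
        C[i][i] = 2.0
    for winner, loser in games:
        w, l = idx[winner], idx[loser]
        C[w][w] += 1   # each team played one more game
        C[l][l] += 1
        C[w][l] -= 1   # the two teams met once more
        C[l][w] -= 1
        b[w] += 0.5    # b_i = 1 + (wins - losses) / 2
        b[l] -= 0.5
    # Solve C r = b with Gauss-Seidel sweeps (C is strictly diagonally dominant).
    r = [0.5] * n
    for _ in range(sweeps):
        for i in range(n):
            s = sum(C[i][j] * r[j] for j in range(n) if j != i)
            r[i] = (b[i] - s) / C[i][i]
    return {t: r[idx[t]] for t in teams}


# Hypothetical three-team season: A beat B, B beat C, A beat C.
ratings = colley_ratings(["A", "B", "C"], [("A", "B"), ("B", "C"), ("A", "C")])
print(ratings)
```

A bracket can then be filled out by advancing the higher-rated team in each matchup. The choices a student makes, such as which games count and how much, change the numbers that go into the system.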
We spoke with Tim today about the Analytics of Bracketology and the evolution of the March Madness Analytics Model.
How many students do you have this year submitting brackets?
"I don't entirely know. I have a class of 60 students but not everyone can submit. If you are a Division I athlete, you can't as it is considered gambling. I have about 10 research students testing new, more advanced ideas. I'll probably have 20 students from the 60 but there IS a billion dollars. So, I could bat with a very high percentage!"
How often do students fill their brackets out exactly the same or has it ever happened?
"I've never had that. There are enough parameters that it is very hard to have that for every game. BUT, the first year I had fewer parameters, so many students had the same team winning everything. That doesn't even happen as much now given the flexibility of the idea."
What are a few variables that are used that are out of the ordinary?
"In terms of past years, it helps if you look at scores in buckets. For instance, you decide close games are within 3 points and count those as ties. Medium wins are 4 to 10 points and count as 6 points, and anything bigger is an 11 point win. That's worked really well in some cases and reduces some of the noise of scores."
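The bucketing Chartier describes can be sketched in a few lines. The thresholds come from his answer; the function name `bucket_margin` and the convention of keeping the sign (positive for a win, negative for a loss) are assumptions made for this example.

```python
def bucket_margin(margin: int) -> int:
    """Map a raw point margin to a bucketed margin, keeping its sign.

    Within 3 points -> 0 (treated as a tie)
    4 to 10 points  -> 6
    11 or more      -> 11
    """
    size = abs(margin)
    if size <= 3:
        return 0
    bucketed = 6 if size <= 10 else 11
    return bucketed if margin > 0 else -bucketed


print([bucket_margin(m) for m in (2, 7, 25, -1, -9, -30)])
# [0, 6, 11, 0, -6, -11]
```

A 30 point blowout and a 12 point win land in the same bucket, which is how the noise from lopsided scores gets reduced.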
"Here is another that comes out of our most current research. This year's tournament will enable us to test it in brackets. We tried it on conference tournaments and it had good success. We use statistics (specifically Dean Oliver's 4 Factors) and look at that as a point, in this case in 4D space. Then we find another team whose point in that 4D space is closest to that team's point. This means they play similarly. Suddenly, we can begin to look at who similar teams win and lose against."
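Here is a minimal sketch of that similarity search. It assumes plain Euclidean distance and unscaled values, neither of which is confirmed in the interview, and the teams and numbers are made up. Dean Oliver's Four Factors are effective field goal percentage, turnover percentage, offensive rebound percentage and free throw rate.

```python
import math

# Hypothetical Four Factors values (eFG%, TOV%, ORB%, FT rate); not real team data.
teams = {
    "Team A": (0.52, 0.18, 0.33, 0.38),
    "Team B": (0.51, 0.19, 0.32, 0.37),
    "Team C": (0.45, 0.22, 0.28, 0.30),
}


def most_similar(name, teams):
    """Return the other team whose 4D point is closest to the named team's point."""
    target = teams[name]
    others = (t for t in teams if t != name)
    # Assumption: Euclidean distance; a real model might rescale each factor first.
    return min(others, key=lambda t: math.dist(teams[t], target))


print(most_similar("Team A", teams))  # Team B
```

Once a team's closest match is known, that match's wins and losses can serve as extra evidence about how the original team might fare against the same opponents.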
How are injuries, or if a key player is recovering from an injury, considered?
"This can be done but hasn't been done a lot. If you want to put that in, you down weight the game. Rather than saying it is a full game, you say it is 1/2 or 1/4 of a game. In this way, it doesn't have the same effect on the record."
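As a rough illustration of down-weighting, the sketch below computes a weighted winning percentage in which a game played with an injured star counts as half a game. The function name, the data and the use of winning percentage (rather than a full rating model) are all assumptions for this example.

```python
def weighted_win_pct(games):
    """games: list of (won, weight) pairs, where weight is the fraction of a game it counts as."""
    total = sum(weight for _, weight in games)
    wins = sum(weight for won, weight in games if won)
    return wins / total if total else 0.0


# Hypothetical record: two full-weight wins and one loss played without a key player.
full_credit = [(True, 1.0), (True, 1.0), (False, 1.0)]
down_weighted = [(True, 1.0), (True, 1.0), (False, 0.5)]

print(round(weighted_win_pct(full_credit), 3))    # 0.667
print(round(weighted_win_pct(down_weighted), 3))  # 0.8
```

The loss still counts, but it drags the record down less than a loss at full strength would.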
Is human judgment ever applied when the model has picked a game winner by a very narrow margin?
"Yes. We often think this is VERY important. When things are close, you also want to look at data beyond our model – experience of the coach and performance of that coach in the tournament."
Do your students create models for other sports?
"Yes. We have applied this to the NFL. That's where we got the idea to treat close games as ties. We were able to have a more predictive model. Colleagues of mine at Furman University looked at MLB and were able to rank batters and hitters. We've applied it to NASCAR and looked at predicting who would make the Chase. A student with the idea of using points in higher dimensional space moved that idea over to the UFC and found it to be predictive of who would win fights."
The Billion Dollar Bracket Challenge even pays if brackets aren't perfect. Quicken will award $100,000 to each of the 20 most accurate brackets, to be used for buying, refinancing or remodeling a home. The deadline to submit your bracket is March 18th.
Good luck! We're betting on Davidson College.
If you want more Analytics, Bracketology and Big Data, check out Tim’s book Math Bytes: Google Bombs, Chocolate-Covered Pi, and Other Cool Bits in Computing.
Article written by Todd Nevins for icrunchdata news Austin, TX