Updated: Sep 20, 2018
March Madness is here, and what better way to kick it off than with some data analytics? This year we’ve set out to help you win your pools with our NCAA March Madness Bracket Generator, built with some of our favorite and most powerful analytics tools: Alteryx, DataRobot, and Tableau.
Setting up the Infrastructure
The first step in building this Bracket Generator was to gather NCAA college basketball data for predicting each possible match-up in the tournament. Using Alteryx, I scraped data from sports-reference.com to compile a dataset with results and statistics for every game over the last 15 years for each college basketball team featured in that year’s tournament. From there, I calculated statistics based on regular-season performance to create a profile for each team. Each profile included 50-60 features in total, such as Tournament Rank, Average Point Differential, Wins, Losses, and Field Goal %. I then gathered the tournament results for the same 15 years and applied the team profiles to each game, giving me one flat dataset to feed into DataRobot.
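For readers who want to see the shape of that preparation step outside of Alteryx, here is a minimal Python sketch. It is not the workflow I built: the row layouts, team names, scores, and the `build_profiles` and `flatten` helpers are all hypothetical, and only three of the 50-60 features are shown.

```python
from collections import defaultdict

# Hypothetical regular-season rows: (season, team, points_for, points_against).
regular_season = [
    (2017, "Team A", 80, 70),
    (2017, "Team A", 65, 75),
    (2017, "Team B", 90, 60),
    (2017, "Team B", 72, 70),
]

# Hypothetical tournament results: (season, team_1, team_2, winner).
tournament_games = [(2017, "Team A", "Team B", "Team B")]


def build_profiles(games):
    """Aggregate per-game rows into one profile per (season, team)."""
    totals = defaultdict(lambda: {"games": 0, "wins": 0, "losses": 0, "diff": 0})
    for season, team, pts_for, pts_against in games:
        t = totals[(season, team)]
        t["games"] += 1
        t["wins"] += int(pts_for > pts_against)
        t["losses"] += int(pts_for < pts_against)
        t["diff"] += pts_for - pts_against
    return {
        key: {
            "wins": t["wins"],
            "losses": t["losses"],
            "avg_point_diff": t["diff"] / t["games"],
        }
        for key, t in totals.items()
    }


def flatten(tournament, profiles):
    """Attach both teams' profiles to each tournament game as one flat row."""
    rows = []
    for season, team_1, team_2, winner in tournament:
        row = {"season": season, "team_1": team_1, "team_2": team_2}
        for prefix, team in (("t1_", team_1), ("t2_", team_2)):
            for feature, value in profiles[(season, team)].items():
                row[prefix + feature] = value
        # Target column for the model: did team_1 win this game?
        row["team_1_won"] = int(winner == team_1)
        rows.append(row)
    return rows


profiles = build_profiles(regular_season)
flat_rows = flatten(tournament_games, profiles)
```

Each entry in `flat_rows` holds both teams' regular-season features side by side plus the game outcome, which is the one-row-per-game layout a modeling tool expects.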
Once the data source was ready to be analyzed, I ran it through DataRobot to train models and determine the best fit for predicting the winner of each tournament game. DataRobot enabled me to build and test many predictive models in a matter of minutes. If you’re familiar with DataRobot, you’ve probably loaded data through its user-friendly web app, but since I was already working in Alteryx, I used the DataRobot tool in Alteryx to upload the data and train the models. The integration allowed me to go from data preparation to robust predictive modeling in just a few minutes and clicks of the mouse. Once training was complete, I ran all the possible match-ups for this year through the model to determine the percent chance each team had of winning each possible match-up in the tournament. Again, all of this was done without ever leaving Alteryx, thanks to the DataRobot integration.
Tableau
The Pick Methodology
The above image displays how the picks are made each time a new bracket is generated. As shown, two inputs are required to make a pick: the prediction number determined by DataRobot and a random number between 0 and 1. The purpose of the random number is variety: by comparing it to the prediction number, we’re able to generate new but reasonable brackets with each click of the button. Please see the example provided at the bottom of the above image for further clarification.
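The pick logic can also be sketched in a few lines of Python. This is an illustration under stated assumptions, not the dashboard's actual calculation: I am assuming a team is picked when the random number falls below its predicted win probability (so a team with a 0.75 prediction is picked whenever the draw is under 0.75), and the `predictions` table, team names, and helper functions are hypothetical stand-ins for the DataRobot output.

```python
import random

# Hypothetical win probabilities for the first-listed team in each pair,
# standing in for the DataRobot predictions.
predictions = {
    ("A", "B"): 0.9, ("A", "C"): 0.7, ("A", "D"): 0.8,
    ("B", "C"): 0.4, ("B", "D"): 0.5, ("C", "D"): 0.6,
}


def win_prob(team_1, team_2):
    """Look up team_1's chance of beating team_2 in either key order."""
    if (team_1, team_2) in predictions:
        return predictions[(team_1, team_2)]
    return 1.0 - predictions[(team_2, team_1)]


def pick_winner(team_1, team_2, p_team_1, rng):
    """Assumed rule: pick team_1 when the draw is below its win probability."""
    return team_1 if rng.random() < p_team_1 else team_2


def generate_bracket(teams, rng):
    """Play round after round until one team remains.

    `teams` is listed in bracket order, so neighbors meet in the first round.
    Returns the list of winners for each round.
    """
    rounds = []
    while len(teams) > 1:
        teams = [
            pick_winner(a, b, win_prob(a, b), rng)
            for a, b in zip(teams[::2], teams[1::2])
        ]
        rounds.append(teams)
    return rounds


rng = random.Random()  # pass a seed, e.g. random.Random(42), for a repeatable bracket
bracket = generate_bracket(["A", "B", "C", "D"], rng)
champion = bracket[-1][0]
```

Because favorites win more often than not but not every time, each call to `generate_bracket` produces a different bracket that still leans toward the model's predictions.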
Populating the Dashboard
Generating a Bracket
To generate the bracket, simply click the “Generate” button and watch as the bracket fills in with selections. Don’t like the bracket that was generated? Click the button again and watch as a new one is generated. Due to the intense processing required, the bracket will take about 30 seconds to generate, but it’s worth the wait! Enjoy!