Project Background
Taught by Dr. Mathieu Dahan, this course built upon the foundational concepts learned in ISyE 3133 Engineering Optimization, introducing more advanced algorithms of linear and integer optimization, including Dantzig-Wolfe decomposition, the cutting-plane method, column generation algorithms, the Simplex method, and the branch-and-bound algorithm, as well as methods of sensitivity analysis. We also further refined our ability to implement integer and linear programs in Gurobi, an optimization solver integrated with Python.
A major component of this course was a semester project in which we worked in teams of three to apply class concepts to a real-world context.
My team chose to formulate and code an integer program to maximize the expected number of points earned by a March Madness bracket. I'll discuss our methodology and embed our final paper below, but first I'll provide some background on March Madness and its bracket scoring system. Those already familiar with March Madness are welcome to skip the below green section and go straight to the methodology.
March Madness Overview
NCAA March Madness is one of the most popular college sports tournaments in the United States.
Each year, 64 Division I college basketball teams compete for the title of national champion. There are 32 games played in the first round, 16 in the second, 8 in the third, 4 in the fourth, 2 in the fifth, and 1 championship game in the sixth round, for a total of 63 games.
While the teams compete for fame and glory, many basketball fans have their own concurrent competitions. Before the tournament begins, millions of Americans create March Madness “brackets”, trying to predict the outcomes of all 63 games and, ultimately, the tournament winner.
A "bracket" is a way a fan records their prediction for the outcome of each game. The standard scoring system for March Madness brackets is delineated below:
1 point is awarded for each correct prediction in the first round of competition.
2 points are awarded for each correct prediction in the second round.
4 points are awarded for each correct prediction in the third round.
8 points are awarded for each correct prediction in the fourth round
16 points are awarded for each correct prediction in the fifth round.
32 points are awarded for correctly predicting the overall winner of the tournament.
While bracket predictions are often made for fun among friends — based off nothing more than educated guesses and personal preferences — many people bet large amounts of money on their brackets. Indeed, a perfect March Madness bracket could potentially earn its creator a very significant amount of money; Warren Buffett, for example, offers $1 million to any Berkshire Hathaway employee who can produce a perfect bracket.
Due to its immense scope, mathematical complexity, and potential value, March Madness has attracted the attention of mathematicians, engineers, data scientists, sports analytics professionals, and other general academics seeking to solve this challenging problem.
Methodology
Dr. Dahan graciously provided us with a 64 x 64 matrix from the 2019 tournament, in which the entry in position i, j represented the probability that team i would beat team j in the event that they played against each other. Since the matrix was from 2019, we decided to generate a bracket for the 2019 tournament for our project, which would also allow us to compare our optimal bracket to the actual tournament results.
First, we determined the probability of each team winning round k. The formula we derived to do so and the code we wrote to implement it are in our final paper below, under section (a).
In part (b) of the paper is our written formulation of the integer program to maximize the expected points earned
for each prediction, as well as our implementation in Gurobi Python.
In part (c) we present the results of solving our program. The optimal bracket is below, whose expected point total was 97.3371.