Google Hashcode 2022
Description: Google Hashcode 2022 Practice Problem
Tags: amplpy, heuristics, engineering, scheduling, complexity
Notebook author: Marcos Dominguez Velad <marcos@ampl.com>
Model author: Marcos Dominguez Velad <marcos@ampl.com>
References: N/A
Google Hashcode is a team programming competition to solve a complex engineering problem.
In this notebook we are showing how Mathematical Optimization methods as Mixed Integer Programming (MIP) are useful to solve this kind of problems, as they are really easy to implement and give optimal solutions (not only trade-off ones), as opposed to greedy approaches or heuristics. We are solving the pizza warm-up exercise.
We are using AMPL as the modeling language to formulate the problem from two different approaches (not all the formulations are the same in terms of complexity), coming up with enhancements or alternative approaches is an important part of the solving process.
As an instructive example of how to face this kind of problems, we are using the AMPL API for Python (AMPLPY), so we can read the input of the problem, translate easily to data for AMPL, and retrieve the solution to get the score. Because of using MIP approach, the score will be the highest possible for the problem.
Problem statement
The statement of this year is related to a pizzeria, the goal is to maximize the number of customers coming, and we want to pick the ingredients for the only pizza that is going to be sold:
Each customer has a list of ingredients he loves, and a list of those he does not like.
A customer will come to the pizzeria if the pizza has all the ingredients he likes, and does not have any disgusting ingredient for him.
Task: choose the exact ingredients the pizza should have so it maximizes the number of customers given their lists of preferences. The score is the number of customers coming to eat the pizza.
(The statement can be found here)
Conclusion
First, let’s compare the size of the two models.
Also in the second approach, each constraint has only two non-zero coefficients along with variables, which is an advantage to have more sparse coefficient matrices.
The choice of one model or another will depend on the concrete instance of the problem, so the sparsity of the matrix and the real number of constraints can change (actually, the constraints of the two models are compatible). AMPL will take care of building the coefficient matrix efficiently, so there is no extra effort to compute the constraints or sums within them once the model is prepared and sent to the solver, and we can focus on thinking algorithmically. Also a lot of constraints and variables would be removed by presolve. To know more about the AMPL modeling language you can take a look to the manual.
Some of the advantages of this approach are:
It is really easy to implement solutions.
There is no need to debug algorithms, only the correctness of the model.
Models are very flexible, so new constraints could be added while the rest of the model remains the same.
Disadvantages:
It is hard to estimate how long it is going to take, even in simple models like the ones presented.
Sometimes it is hard to formulate the problem, as some of the constraints or the objective function could not adjust to the usual mathematical language. The problem could be non-linear so convergence would be more difficult and even optimal solutions would not be guaranteed.
For simple problems, more efficient algorithmic techniques could also give the best solution (Dynamic Programming, optimal greedy approaches…).
Enhancements:
Study the problem to come up with presolve heuristics in order to get smaller models.
Add termination criterias (solver options) so the solver can stop prematurely when finding a enough good solution (there is a little gap between the best found solution and the known bounds), or even a time limit. If you are lucky the solution could be the optimal one but the optimality was not proved yet.
If the solver could not find the optimal solution on time, but we used a termination criteria, we could retrieve a good solution and run some kind of algorithm over it so we can improve and get closer to the optimal (GRASP or Genetic Algorithms, for instance). Actually, when solving a real engineering problem is desirable to combine exact methods as MIP, heuristics (greedy approaches) or metaheuristics (GRASP, Simulated Annealing, …) among others, to reach better solutions.
–
Author: Marcos Dominguez Velad. Software engineer at AMPL.
marcos@ampl.com