AMPL - solve multiple models in parallel

multiproc.ipynb

Description: Solve multiple AMPL models in parallel in Python with amplpy and the multiprocessing module.

Tags: AMPL, Python, amplpy, multiprocess, Parallel Computing, Stochastic Programming

Notebook author: Nicolau Santos <nfbvs@ampl.com>

# Install dependencies
%pip install -q amplpy
# Google Colab & Kaggle integration
from amplpy import AMPL, ampl_notebook

ampl = ampl_notebook(
    modules=["highs"],  # modules to install
    license_uuid="default",  # license to use
)  # instantiate AMPL object and register magics

Motivation

A common task is to analyze the results of a model given different combinations of some input parameters. This can be done in parallel with amplpy and the multiprocessing module.
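As a minimal sketch of the pattern (the full worker used in this notebook is developed below), multiprocessing.Pool.map applies a worker function to every element of a list of inputs, each in a separate process; toy_worker here is just a placeholder for a function that would build an AMPL instance, solve, and return results.

# Minimal sketch of the multiprocessing pattern used in this notebook
import multiprocessing


def toy_worker(x):
    # placeholder for a function that builds an AMPL instance,
    # loads data derived from x, solves, and returns the results
    return x * x


if __name__ == "__main__":
    with multiprocessing.Pool(processes=4) as pool:
        print(pool.map(toy_worker, [1, 2, 3, 4]))  # [1, 4, 9, 16]

In the notebook itself, the worker is written to a separate worker.py file so that the child processes can import it.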

For our demonstration we will use a stochastic programming model available at https://colab.ampl.com/

%%writefile model.mod

param samples > 0;
param demand{1..samples} >= 0;
param cost > 0;
param recover < cost;
param retail >= 0;
param minrev;
param maxrev;
param alpha >= 0, <= 1;
param beta >= 0, <= 1;

var nu >= minrev, <= maxrev;
var excess{1..samples} >= 0, <= maxrev - minrev;
var order >= 0;
var sales{i in 1..samples} >= 0, <= demand[i];
var discount{1..samples} >= 0;
var profit{1..samples} >= minrev, <= maxrev;

var cvar = nu + 1 / ((1 - alpha) * samples) * sum{i in 1..samples} excess[i];
var average_profit = (1/samples) * sum{i in 1..samples} profit[i];

maximize prof:
    - beta * cvar + (1-beta) * average_profit;

s.t. c1 {i in 1..samples}: profit[i] == -cost * order + retail * sales[i] + recover * discount[i];
s.t. c2 {i in 1..samples}: sales[i] + discount[i] == order;
s.t. c3 {i in 1..samples}: -profit[i] - nu <= excess[i];
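
In this formulation, cvar is the standard sample-average (Rockafellar–Uryasev) linearization of the conditional value at risk of the loss $-\text{profit}$ at confidence level $\alpha$:

$$\text{cvar} = \nu + \frac{1}{(1-\alpha)\,S} \sum_{i=1}^{S} \text{excess}_i, \qquad \text{excess}_i \ge -\text{profit}_i - \nu, \quad \text{excess}_i \ge 0,$$

where $S$ denotes the number of demand samples. The objective maximizes the weighted combination $-\beta\,\text{cvar} + (1-\beta)\,\frac{1}{S}\sum_{i=1}^{S} \text{profit}_i$.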

The model has three parameters of major interest for our example, namely:

  • alpha - parameter that sets the confidence level of the conditional value at risk (CVaR)

  • beta - parameter that weighs the contributions of the CVaR and the average profit in the objective function

  • demand - uncertain parameter for which we will generate multiple scenarios.

Our main objective is to study the impact of different alpha and beta combinations under different demand scenarios.
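For illustration, the sweep over these parameters amounts to enumerating tuples such as the ones below. The values of alphas, betas and nruns match the ones used in the main script later in this notebook, and itertools.product is simply a compact alternative to its nested loops.

# Enumerate (alpha, beta, run, seed) combinations; equivalent to the
# nested loops used in the main script later in this notebook
from itertools import product

alphas = [0.7, 0.8]
betas = [0.6, 0.7]
nruns = 3

config = [
    (a, b, r, seed)
    for seed, (a, b, r) in enumerate(product(alphas, betas, range(nruns)))
]
print(config[:3])  # [(0.7, 0.6, 0, 0), (0.7, 0.6, 1, 1), (0.7, 0.6, 2, 2)]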

Implementation

First we create a worker function that:

  • gets as input a list with the values of alpha, beta, run and seed

  • generates data for the demand parameter using the given seed

  • instantiates an AMPL object and loads the data, including the provided alpha and beta values

  • solves the problem and returns a list with the initial alpha, beta, and run parameters, together with the objective value obtained and the wall time used.

%%writefile worker.py

import random
import time


from amplpy import AMPL


# Define constants
# Economic parameters
cost = 2
retail = 15
recover = -3
# Parameters to generate data in each process
samples = 10000
sigma = 100
mu = 400

def worker(data):
    """
    Example worker
    Input: a list with parameters
        alpha - parameter to be passed to AMPL
        beta - parameter to be passed to AMPL
        run - number of the run for the alpha and beta combination
        seed - seed to generate data for the given process
    Output: a list with parameters
        alpha - parameter used in the run
        beta - parameter used in the run
        run - number of the run for the alpha and beta combination
        obj - objective value of the solved model
        worker_time - wall time used by the worker
    This function has the following steps:
        generate data according to the received parameters
        instantiate AMPL and load an existing model file
        solve the model with the defined solver
        output results
    """
    start_time = time.time()

    # Get information from the input list
    alpha = data[0]
    beta = data[1]
    run = data[2]
    seed = data[3]

    # Initialize and seed the random number generator
    rng = random.Random()
    rng.seed(seed)

    # Generate data for this execution
    demand = [max(rng.normalvariate(mu, sigma), 0) for _ in range(samples)]
    maxrev = max(demand) * (retail - cost)
    minrev = max(demand) * (recover - cost) + min(demand) * retail

    # Create AMPL instance and load the model
    ampl = AMPL()
    ampl.read("model.mod")

    # Load the data
    ampl.param["samples"] = samples
    ampl.param["cost"] = cost
    ampl.param["recover"] = recover
    ampl.param["retail"] = retail
    ampl.param["minrev"] = minrev
    ampl.param["maxrev"] = maxrev
    ampl.param["demand"] = demand
    ampl.param["alpha"] = alpha
    ampl.param["beta"] = beta

    # Set solver and options
    ampl.option["solver"] = "highs"
    ampl.option["highs_options"] = "tech:threads=1"
    # Solve without generating any output
    # use this when your model is ready for deployment
    # otherwise use the regular solve (commented below) to track potential issues
    ampl.get_output("solve;")
    #ampl.solve()

    # Get results
    obj = ampl.obj["prof"].value()

    worker_time = time.time() - start_time
    return [alpha, beta, run, obj, worker_time]

In our case the model is the same for every run. However, it is possible to pass the name of the model file as a parameter to the worker function and solve different models in parallel (see the sketch below).

Note that the chosen solver may, by default, use more than one process or thread. Unless you are manually configuring the number of AMPL and solver processes, you should set the number of solver processes/threads to 1, as done above with tech:threads=1.
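The sketch below illustrates such a variant; the extra model_file entry in the input list is an assumption for illustration and is not part of the worker.py defined above.

# Hypothetical variant of the worker: the input list also carries the
# model file to load, so different processes can solve different models
from amplpy import AMPL


def worker(data):
    alpha, beta, run, seed, model_file = data

    ampl = AMPL()
    ampl.read(model_file)        # e.g. "model.mod" or another .mod file
    ampl.param["alpha"] = alpha  # set the remaining data exactly as above
    ampl.param["beta"] = beta
    # ... generate the demand samples, solve, and return results as before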

Afterwards, we create a list with the different parameter combinations to send as input to each process.

Parallelization is obtained with the multiprocessing module: we create a multiprocessing pool with a given number of processes and use the pool to map the worker function over the list of parameter combinations.

At the end we print the obtained results.

"""
Example of multiprocessing with Python and amplpy.
This module defines a worker function that solves an AMPL model.
Multiple combinations of parameters are generated and each one is
passed to a process.
Individual results of all processes are printed at the end.
"""

import multiprocessing
import os
import time

if __name__ == "__main__":
    from worker import worker

    # Generate different combinations of parameters to send to each process
    alphas = [0.7, 0.8]
    betas = [0.6, 0.7]
    nruns = 3

    config = []
    seed = 0
    for a in alphas:
        for b in betas:
            for r in range(nruns):
                config.append((a, b, r, seed))
                seed += 1

    print("Configurations:")
    print("\t".join(["alpha", "beta", "run", "seed"]))
    for c in config:
        print("\t".join([str(i) for i in c]))
    print()
    print("Number of workers:", len(config))
    print()

    # Get the number of processors available and initialize the pool
    # It may be better to use nproc = os.cpu_count() - 1 and keep one processor free for the OS
    nproc = os.cpu_count()
    print(nproc, "processors available")
    print()
    pool = multiprocessing.Pool(nproc)

    # Run the workers
    main_start_time = time.time()
    results = pool.map(worker, config)
    pool.close()
    main_end_time = time.time()

    # Print results
    print("Results:")
    print("\t".join(["alpha", "beta", "run", "objective", "workertime"]))
    for r in results:
        print("\t".join([str(i) for i in r]))
    print()
    print("Main time: %.3f" % (main_end_time - main_start_time))
    print("Average worker time: %.3f" % (sum(r[4] for r in results) / (nproc)))
    print("Total time: %.3f" % (sum(r[4] for r in results)))
Configurations:
alpha	beta	run	seed
0.7	0.6	0	0
0.7	0.6	1	1
0.7	0.6	2	2
0.7	0.7	0	3
0.7	0.7	1	4
0.7	0.7	2	5
0.8	0.6	0	6
0.8	0.6	1	7
0.8	0.6	2	8
0.8	0.7	0	9
0.8	0.7	1	10
0.8	0.7	2	11

Number of workers: 12

4 processors available

Results:
alpha	beta	run	objective	workertime
0.7	0.6	0	3525.832209279827	25.238504648208618
0.7	0.6	1	3524.356178682977	25.41237211227417
0.7	0.6	2	3541.325355197152	25.553394317626953
0.7	0.7	0	3565.9342984444384	27.238896131515503
0.7	0.7	1	3585.2380403500324	21.48252558708191
0.7	0.7	2	3402.66236960429	20.380152225494385
0.8	0.6	0	3265.1446466955886	20.773118257522583
0.8	0.6	1	3390.28915038965	20.527676343917847
0.8	0.6	2	3422.6281549721944	19.438149213790894
0.8	0.7	0	3290.5810949315205	20.264755964279175
0.8	0.7	1	3233.292842109562	17.839205741882324
0.8	0.7	2	3282.2282040925375	18.890254974365234

Main time: 66.715
Average worker time: 65.760
Total time: 263.039

In our example a nearly linear speedup was achieved: the workers used 263.039 seconds of wall time in total, while the parallel run finished in 66.715 seconds, a speedup of roughly 3.9 on 4 processors. Depending on the ratio between the number of workers and the number of processors, this value may vary. Statistical analysis of the results is beyond the scope of this notebook.