Stochastic Capacitated Facility Location Problem

stochastic_facility_location.ipynb Open In Colab Kaggle Gradient Open In SageMaker Studio Lab Hits

Description: This notebook illustrates modeling a stochastic facility location problem using a mixed-integer programming approach. Facility location decisions are pivotal, typically requiring substantial investments that have far-reaching social, economic, and environmental consequences. Strategically positioning critical facilities—like warehouses, factories, and service centers—is essential for optimizing an organization’s operational efficiency, expanding its market presence, and achieving sustainable practices. Such modeling enables decision-makers to navigate uncertainties and make informed, data-driven choices that support long-term strategic goals.

Tags: amplpy, ampl, mip, stochastic, facility location

Notebook author: Gyorgy Matyasfalvi <>


  • AMPL a Modeling Language for Mathematical Programming – Robert Fourer et al.

  • SCIP Optimization Suite example projects – Stephen J. Maher (ZIB)

# Install dependencies
%pip install -q amplpy
# Google Colab & Kaggle integration
from amplpy import AMPL, ampl_notebook

ampl = ampl_notebook(
    modules=["open"],  # modules to install
    license_uuid="default",  # license to use
)  # instantiate AMPL object and register magics

Problem description

Facility location decisions are crucial and often involve significant investment for both public and private sector entities, bearing profound social, economic, and environmental implications. The strategic positioning of facilities, such as warehouses, factories, and service centers, can determine an organization’s operational efficiency, market reach, and overall sustainability.

Given the high stakes of these decisions, engineers and analysts have developed sophisticated models to aid organizations in identifying optimal locations. These models take into account a variety of factors, including but not limited to, transportation costs, proximity to customers and suppliers, labor availability, customer demand, and environmental regulations.

The challenge is compounded when considering the uncertainty inherent in future conditions. Factors such as fluctuating market demands, changes in infrastructure, and unpredictable socio-economic developments require a robust approach to facility location. Hence, engineers often employ stochastic models and robust optimization techniques that account for such uncertainties, ensuring that the chosen locations remain viable under a range of possible future scenarios.

Mixed integer program

Below you can find the extensive form of the stochastic facility location problem as an explicit mixed integer program.


  • A set of facilities: $I$.

  • A set of customers: $J$.

  • Set of scenarios: $S$ (representing different customer demands).


  • Find the minimum cost facilities to open such that the customer demand can be satisfied in all scenarios.


  • $x_i \in {0, 1} \quad \forall i \in I$

    • $x_i = 1$ if facility $i$ is opened.

  • $y_{ij}^s \geq 0 \quad \forall i \in I, \forall j \in J, \forall s \in S$

    • $y_{ij}^s$ is the level of demand for customer $j$ satisfied by facility $i$ in scenario $s$.


  • $f_i$: the fixed cost for opening facility $i$,

  • $q_{ij}$: the cost of servicing customer $j$ from facility $i$,

  • $\lambda_j^s$: the demand of customer $j$ in scenario $s$,

  • $k_i:$ the capacity of facility $i$.

The extensive form

The extensive form of our stochastic program can be formulated as follows:

$ \begin{equation} \begin{array}{rll} \min \quad & \sum_{i \in I} f_i x_i + \frac{1}{|S|} \sum_{s \in S} \sum_{i \in I} \sum_{j \in J} q_{ij} y_{ij}^s & \ & & \ \textrm{subject to} \quad & \sum_{i \in I} y_{ij}^s \geq \lambda_j^s & \forall j \in J, \forall s \in S \ & \sum_{j \in J} y_{ij}^s \leq k_i x_i & \forall i \in I, \forall s \in S \ & \sum_{i \in I} k_i x_i \geq \max_{s \in S} \sum_{j \in J} \lambda_j^s & \ & & \ & x_i \in {0, 1} & \forall i \in I \ & y_{ij}^s \geq 0 & \forall i \in I, \forall j \in J, \forall s \in S \end{array} \tag{1} \end{equation} $

AMPL Implementation

Translating the mathematical formulation of our optimization problem into an AMPL model is a direct process. The AMPL code closely mirrors each inequality in the system (1), preserving the structure of the mathematical model.

AMPL’s expressive syntax allows for meaningful names for entities such as variables, parameters, and constraints, enhancing the model’s clarity. For instance, we’ll represent the set $I$ as FACILITIES, $J$ as CUSTOMERS, and $S$ as SCENARIOS. Variables previously denoted as $x$ and $y$ will be named facility_open and production, respectively.

Similarly, we will rename our parameters for greater clarity: $f_i$ becomes fixed_cost, $q_ij$ is now variable_cost, $\lambda_j^s$ is referred to as customer_demand, and $k_i$ is labeled as facility_capacity.

Using descriptive names not only enhances the readability of the model but also its maintainability and the ease with which it can be shared and understood by others.

%%writefile floc_ef.mod
# Sets
set FACILITIES; # set of facilities
set CUSTOMERS;  # set of customers
set SCENARIOS;  # set of scenarios

# Variables
var facility_open{FACILITIES} binary;                   # 1 if facility i is open, 0 otherwise
var production{FACILITIES, CUSTOMERS, SCENARIOS} >= 0;  # production from facility i to satisfy customer demand j in scenario s

# Parameters
param fixed_cost{FACILITIES} >= 0;                  # fixed cost of opening facility_open i
param variable_cost{FACILITIES, CUSTOMERS} >= 0;    # variable cost of satisfying customer_demand of customer j from facility_open i
param customer_demand{CUSTOMERS, SCENARIOS} >= 0;   # customer_demand of customer j in scenario s
param facility_capacity{FACILITIES} >= 0;           # facility_capacity of facility_open i

# Objective
minimize TotalCost: 
    sum{i in FACILITIES} fixed_cost[i] * facility_open[i] +                                                             # Fixed cost of opening facility i 
    1/card(SCENARIOS) * (sum{s in SCENARIOS, i in FACILITIES, j in CUSTOMERS} variable_cost[i,j] * production[i,j,s]);  # Variable cost of satisfying customer demand j from facility i in scenario s

# Constraints
s.t. satisfying_customer_demand{s in SCENARIOS, j in CUSTOMERS}:
    sum{i in FACILITIES} production[i,j,s] >= customer_demand[j,s];

s.t. facility_capacity_limits{s in SCENARIOS, i in FACILITIES}:
    sum{j in CUSTOMERS} production[i,j,s] <= facility_capacity[i] * facility_open[i];

s.t. sufficient_production_capacity:
    sum{i in FACILITIES} facility_capacity[i]*facility_open[i] >= max{s in SCENARIOS} sum{j in CUSTOMERS} customer_demand[j,s];  
Overwriting floc_ef.mod
%%writefile floc_ef_toy.dat
set FACILITIES  := Watson Bayshore Orange;
set CUSTOMERS   := Station_LA Store_TX Center_TX Store_AZ;
set SCENARIOS   := Low Medium High;

param fixed_cost := Watson 400000 Bayshore 500000 Orange 600000;

param facility_capacity := Watson 1550 Bayshore 650 Orange 1750;

param variable_cost:        Station_LA  Store_TX    Center_TX   Store_AZ :=
             Watson         6739.725    3204.8625   4914        32372.1125
             Bayshore       10355.05    5457.075    26409.6     29982.225       
             Orange         7650.40     3845.4      19622.4     21024.325;

param customer_demand:        Low    Medium   High :=
               Station_LA     166    166      177 
               Store_TX       91     105      106       
               Center_TX      679    716      873
               Store_AZ       1441   1500     1528;
Overwriting floc_ef_toy.dat
ampl = AMPL()                       # Instantiate an AMPL object"floc_ef.mod")            # Read the model from file
ampl.readData("floc_ef_toy.dat")    # Read the data from file

ampl.option["solver"] = "highs"    # Select the solver
ampl.solve()                        # Attempt to solve

print(ampl.option["solve_result_table"])                    # Print the solve result table, this will inform us of the various solution codes.
result_df = ampl.get_data("solve_result_num").to_pandas()   # Retrieve solution status

# Check if the problem was solved if not print warning
srn = result_df["solve_result_num"].iloc[0]
if srn != 0:
    print(f"Warning:\tProblem not solved to optimality!\n\t\tsolve_result_num = {srn}")
    print("Success:\tProblem solved to optimality!")
HiGHS 1.6.0:HiGHS 1.6.0: optimal solution; objective 37500349.64
12 simplex iterations
1 branching nodes

0	solved
100	solved?
200	infeasible
300	unbounded
400	limit
500	failure

Success:	Problem solved to optimality!