Loren on the Art of MATLAB

Turn ideas into MATLAB

Generating an Optimal Employee Work Schedule Using Integer Linear Programming 7

Posted by Loren Shure,

Today's guest blogger is Teja Muppirala, who is a member of our Consulting Services group. Teja has been with MathWorks for 6 years and is based in our Tokyo office.

Contents

Mixed-Integer Linear Programming and The Nurse Scheduling Problem

Since it's introduction in release R2014a, we've had several blog posts now showing some applications of intlinprog , the mixed-integer linear programming (MILP) solver found in the Optimization Toolbox. This had been one of our most requested features, as MILP has trememdous application in a variety of areas such as scheduling and resource optimization.

There are a number of classic optimization problems that can be framed using MILP, such as the Travelling Salesman Problem and Knapsack Problem.

The work in this blog post is based loosely on a discussion I recently had with a customer, who wanted to make an optimal shift schedule for his employees while satisfying certain availability and staffing constraints. This is a variation of what is known as the Nurse Scheduling Problem.

Problem Statement

Obviously the "Nurse Scheduling Problem" is not limited to "nurses" as an occupation, so I will just use the generic term "employee" here. The customer's problem statement was as follows.

Given:

  • A list of employees with their available work hours, and hourly salaries
  • A prescibed minimum number of staff needed to be at work at a given hour (fewer staff are needed at night, more staff are needed during peak hours)

Find a schedule which MINIMIZES:

  • The total daily wages the employer must pay out.

While meeting the hard CONSTRAINTS:

  • At any given hour, you must meet the minimum staffing requirement
  • An employee can only work one shift a day
  • An employee must work within his available hours
  • If an employee is called for duty, they must work at least a specified minimum number of hours, and no more than a specified maximum number of hours

This is probably a bit abstract, but let's load in some concrete data so you can get a clearer idea of what exactly we are trying to do here.

1. Read in the requirements and employee data from the Excel sheet

The staff information and scheduling requirements are in an Excel file, so we will first need to import the data. This is a fictional dataset and the Excel file is available along with the MATLAB code for this blog post so you can feel free to try it out on your own.

The first sheet in the Excel file (available here) contains the staff information, and it is in a tabular format suitable to be imported directly as a MATLAB table using the readtable function. Let's take a look at what's inside.

filename = 'scheduling.xlsx';
staffTable = readtable(filename)
staffTable = 
    EmployeeName    MinHours    MaxHours    HourlyWage    Availability
    ____________    ________    ________    __________    ____________
    'SMITH'         6           8           30            '6-20'      
    'JOHNSON'       6           8           50            ''          
    'WILLIAMS'      6           8           30            ''          
    'JONES'         6           8           30            ''          
    'BROWN'         6           8           40            ''          
    'DAVIS'         6           8           50            ''          
    'MILLER'        6           8           45            '6-18'      
    'WILSON'        6           8           30            ''          
    'MOORE'         6           8           35            ''          
    'TAYLOR'        6           8           40            ''          
    'ANDERSON'      2           3           60            '0-6, 18-24'
    'THOMAS'        2           4           40            ''          
    'JACKSON'       2           4           60            '8-16'      
    'WHITE'         2           6           55            ''          
    'HARRIS'        2           6           45            ''          
    'MARTIN'        2           3           40            ''          
    'THOMPSON'      2           5           50            '12-24'     
    'GARCIA'        2           4           50            ''          
    'MARTINEZ'      2           4           40            ''          
    'ROBINSON'      2           5           50            ''          

The variable staffTable has a list of each employee, along with the minimum hours they must work (if they are called in), maximum hours they may work, their hourly wage, and any limits on availability if there are any.

For example, the first employee SMITH, if called in, must work at least 6 hours, no more than 8 hours, earns $30/hour, and is only available between 6am and 8pm.

We can also see the required staffing requirements for each hour of the 24-hour day. I don't need this to be table, I just want to import it as a numeric array, so xlsread will work just fine.

requirements = xlsread(filename,2); % Second sheet has minimum staff requirements
disp('Required staffing (hourly from 0:00 - 24:00): ');
disp(mat2str(requirements(2,:)));
Required staffing (hourly from 0:00 - 24:00): 
[1 1 2 3 6 6 7 8 9 8 8 8 7 6 6 5 5 4 4 3 2 2 2 2]

Late at night, only 1 or 2 employees are needed, while during peak hours in the morning to afternoon, we may need as many as 9 employees on duty.

2. Generate the f, A, and b matrices based on the the constraints and objectives

Generating a MILP formulation of a particular problem involves expressing the minimization objective and constraints using linear equations, and these are typically written using matrix notation. The specifics of this are covered thoroughly in the documentation.

There is a bit of machinery involved in generating the appropriate matrices in this particular case, and it would be a bit dense to go through all the details in this blog post. My aim is more to show an application of MILP, rather than discussing the intricate details of problem formulation. If you are interested in that though, I definitely encourage you to download the MATLAB files for this example and check it out for yourself.

However, I do want to discuss briefly the structure of the constraint matrices and decision variables for this particular problem. Let's take a look at the constraint matrix A. Here I used the spy function to view the sparsity pattern.

% f,A,b are inputs for the MILP Problem
% staffNumberVector is needed later when plotting the results
[f,A,b,staffNumberVector] = makeMILPMatrices(staffTable,requirements);

figure,
spy(A,1); axis fill;
title({'Sparsity pattern of the constraint matrix A:';
       'The top portion is from the minimum hour requirement,';
       'The bottom portion is from the one-shift-a-day requirement'});

There are two parts to the inequality constraint matrix A, which are clearly visible. The top portion has 24 rows (because there are 24 hours in a day) and each row represents the constraint that at a particular hour, you need a minimum number of staff.

The bottom section of A, in this case, has 20 rows (because there are 20 available employees). The k-th row of this section is associated with the constraint that the k-th employee may only work one shift per day.

A will in general be much wider than tall, and the columns of A represent every possible shift for every possible employee. Every column of A is also associated with a decision variable that we constrain to be either 0 or 1 (a binary variable).

$\textbf{x} = [x_1,x_2,...x_k]$

${x_n} \in \{0,1\}$ for all n

A value of 1 for $x_n$ means that that n-th shift possibility is implemented, and a value of 0 indicates it is not. For this particular problem, A has 1295 columns, and so x is a decision vector of length k = 1295. It is this x that intlinprog will solve for.

As the number of employees and possible shifts increases, A may consist of many thousands of columns. Although I didn't use sparse matrices in this demo, intlinprog is capable of working with sparse matrices to make more efficient use of memory and allow for larger problem sizes.

3. Call intlinprog with every variable as an integer 0 or 1

Once the relevant matrices are created, all that's left to do is actually call intlinprog. I specify that all the decision variables are integers, and have a value between 0 and 1.

nVars = numel(f);  % Number of variables in the problem
lb = zeros(nVars,1);    % The upper and lower bounds are 0 and 1
ub = ones(nVars,1);
[x, cost] = intlinprog(f,1:nVars,A,b,[],[],lb,ub);
LP:                Optimal objective value is 4670.000000.                                          

Cut Generation:    Applied 1 zero-half cut.                                                         
                   Lower bound is 4670.000000.                                                      

Branch and Bound:

   nodes     total   num int        integer       relative                                          
explored  time (s)  solution           fval        gap (%)                                         
      49      0.07         1   4.670000e+03   0.000000e+00                                          

Optimal solution found.

Intlinprog stopped because the objective value is within a gap tolerance of the
optimal value, options.TolGapAbs = 0 (the default value). The intcon variables
are integer within tolerance, options.TolInteger = 1e-05 (the default value).

I am returning two arguments, the first is the decision vector itself, x, and the second argument, cost, is the minimized daily cost associated with this solution. You could also find cost as

cost = f*x.

intlinprog by default outputs certain diagnostic information to the command window (though there are options to hide this if preferred). For example, the miminized cost is shown in the command window to be 4670.0 (in real-world terms, $4760 dollars in daily wages). Also, one can see the total time that it took to solve this problem.

intlinprog is very fast!

4. Gather and display the results

intlinprog will give me my optimal x vector, but this is simply a long vector of of ones and zeros saying which shifts are implemented. I will go back and find out which index of x corresponds with which employee and shift, and then display the data in a more human-friendly form.

numStaff = size(staffTable,1); % Total number of staff available

% Convert from indices in x to employee and shift information
selected = find(x);
hoursMatrix = zeros(numStaff,24);

for n = 1:numel(selected);
    thisEntry =selected(n);
    thisStaff = staffNumberVector(thisEntry);
    hoursOnDuty = -A(1:24,thisEntry);
    hoursMatrix(thisStaff,:) = hoursOnDuty;
end

% Make a figure in a suitable location
hf = figure('visible','off','units','pixel','position',[100 100 560 700]);
movegui(hf,'center');
set(hf,'visible','on');

% Draw the employee work schedule matrix
subplot(211);
imagesc(0.5:23.5,1:20,hoursMatrix)
set(gca,'xtick',0:24);
set(gca,'ytick',1:numStaff,'yticklabel',staffTable.EmployeeName);
axis tight;
set(gca,'xlimmode','manual','ylimmode','manual','layer','bottom');
hold all;
ylabel('Employee Name','FontSize',12);

% Let's make the grid lines manually...
for n = 0.5:numStaff+0.5
    plot(xlim,[n n],'k');
end
for n = 0:24
    plot([n n], ylim,'k');
end

title(['Employee shift schedule' 10 ...
       'Total wages over 24 hours: $' num2str(cost)],'FontSize',16);
colormap([1 1 1; 0 0.5 1]);

% Draw the required vs. actual staff
subplot(212);
plot(0.5:23.5,requirements(2,:),'b.-','linewidth',3,'markersize',30);
actualHours = -A(1:24,:)*x;
hold on;
plot(0.5:23.5,actualHours,'r:.','linewidth',2,'markersize',16);
xlim([0 24]);
set(gca,'xtick',0:24,'ytick',0:10);
ylim([0 10]);
grid on;
title(['Hourly Staff Count'],'FontSize',16);
legend({'Employees Required', 'Employees Scheduled'});
xlabel('Time of day','fontsize',12);
ylabel('Employee count','fontsize',12);

We have two plots here. The first shows the actual work schedule for each employee (blue indicates when on shift). Note that there are a couple of employees (ANDERSON and JACKSON) that do not get called in. Also, while it seems that JOHNSON has two shifts, this is because we are wrapping around and the shift actually goes from 10pm to 6am the next morning.

The second plot shows the number of staff on duty as compared to our minimum staffing requirements. We see that we meet the requirements exactly, without overstaffing or understaffing at any point. This makes sense, because if the goal is to minimize paid wages, then you'd try to stay as close to the minimum staff as possible.

5. Conclusion

The problem outlined in the example is not a trivial one to solve. To be sure, it takes a bit of study and experience to be able to know how to easily convert a real world problem into its equivalent MILP formulation.

But for problems like this, especially at larger problem sizes, attempting to solve it using other methods such trial-and-error or genetic algorithms will usually take longer, and generally not yield as good results.

Have you tried using MATLAB's mixed-integer linear programming solver for any of your own work? If you want to share your thoughts, or if you have any questions regarding this example, please feel free to let us know in here.


Get the MATLAB code

Published with MATLAB® R2015b

Note

Comments are closed.

7 CommentsOldest to Newest

Rob replied on : 1 of 7

If I’m reading this right – the optimized one does not employee Anderson nor Jackson at all. Likely because these are the two most expensive by hour people. But I was expecting the code to ensure that everyone did their minimum hours – which is 2 hours each for Anderson and Jackson. Did I miss something?

Teja Muppirala replied on : 2 of 7

Hello Rob, thank you for your feedback. Indeed, Anderson and Jackson are not called in, and as you pointed out it is not surprising that they are the most highly paid employees.

“The variable staffTable has a list of each employee, along with the minimum hours they must work (if they are called in), ….”

As stated here, the minimum hours requirement refers to the minimum hours they must work only if they are called in at all. I made no requirement that an employee must be called in. If you wanted to guarantee that every employee must come in, then only a slight modification would be needed. Take the lower part of the constraint matrix, and turn it into an equality constraint. Instead of having A*x <= b, you would make it so Aeq*x = beq for that part.
Then you would call intlinprog with Aeq and beq included as well.

In detail:
After line 126 [f,A,b,staffNumberVector] = makeMILPMatrices(staffTable,requirements);
Add the following 4 commands:

Aeq = A(25:end,:);
beq = b(25:end);
A = A(1:24,:);
b = b(1:24)

Then change line 171, the call to intlinprog as follows:
[x, cost] = intlinprog(f,1:nVars,A,b,Aeq,beq,lb,ub);

If you do that, you'll find that the new daily wage is $4700, again with no over or understaffing, and Anderson and Jackson are both called in for 2 hours.

Guilherme replied on : 3 of 7

@Rob,
the last hard constraint says that “if an employee is called for duty, they must work at least a specified minimum number of hours”, so they (Anderson and Jackson) can’t be scheduled to work just 1 hour.

Jacob replied on : 4 of 7

Rob, note that the problem states: “If an employee is called for duty”. Also the number of employees can be far more than can be included in the solution matrix (where the given time window is short) – of course you could/should have some more (in)equalities to guarantee some hours for your best (most expensive?) employees to have them in your x vector for your next iteration. :)

Sean de Wolski replied on : 5 of 7

@Jacob, or constraints could be replicated to look at a longer time period at once (week/month/etc.) with the additional constraint of you can’t work more than e.g. five days a week etc.

Michael replied on : 6 of 7

Hi Loren,
According to “staffTable”, JOHNSON is not available in ‘0-6’ and only ANDERSON and THOMAS are available during that hours, then why in final table JOHNSON assigned for that shift? it seems there are more contradiction with availability column…

Teja Muppirala replied on : 7 of 7

Hello Michael,
Thank you for your comment. The “availability” column in staffTable is empty for people that can work at any time. So for example, JOHNSON has nothing listed in availability, so that means he can work at any time of day, and he ended up getting scheduled from 0-6.