# Stable Matching Problem and the Algorithm that Won a Nobel Prize

Many here in Massachusetts started social distancing about a month ago and we have no end in sight yet. If you live alone, maybe you are ready to match up with someone after you get through this hardship. Today's guest blogger, Toshi Takeuchi, has a perfect algorithm for you. I love that this was inspired by a problem that, at first glance, doesn't seem very technical or relevant. But it is!

### Contents

#### It Began with a Facebook Post

One day I came across a Facebook post in one of those groups that had sprung up to help people cope with social distancing. It began:

**"A coding challenge and a puzzle to solve! The Nobel prize for Economics (my subject) was recently awarded for work on matching markets, where you care not just about demand and supply in general but about precisely who gets what. Sound familiar? Well that’s a milonga, right?"** (a milonga is an Argentine Tango dance event)

That's how I learned about the Stable Matching Problem (SMP). The original poster wanted people to apply the SMP to solve the problem of matching people for a partner dance. That sounded trivial, but the actual problem had many consequential applications.

- National Resident Matching Program (NRMP) matches tens of thousands medical students to residency programs
- New York City's High School Choice program matches tens of thousands of students to public high schools
- Doctors needs to match individual donors to recipients for kidney transplants
- Tens of thousands of edge servers must be matched to billions of web requests per hour

And indeed the algorithm that is 100% guaranteed to find solutions to the SMP won the 2012 Nobel Prize in Economics.

#### Problem Definition

Given two sets of participants of equal size to match, such as medical students and residency programs, and with their fixed preferences, such as:

S1 = {R1 R2 R3} R1 = {S1 S3 S2} S2 = {R3 R1 R2} R2 = {S1 S2 S3} S3 = {R1 R3 R2} R3 = {S3 S1 S2}

Can we find stable matches? Stability in this case means:

- no one is left unmatched, and
- no one can find a better match who is willing to switch

A solution may look like this:

(S1, R1), (S2, R2), (S3, R3)

This is stable because:

`S1`is matched to`R1`, and they are the first choices for each other. Yay!`S2`prefers`R3`to`R2`, but`R3`is matched to`S3`, its first choice, and therefore`R3`is not willing to switch`S3`prefers`R1`to`R3`, but`R1`is matched to`S1`, its first choice, and therefore`R1`is not willing to switch

Now what is this algorithm that is guaranteed to generate solutions to the SMP?

#### Gale Shapley Algorithm

This remarkable algorithm is called the Gale-Shapley algorithm and it is surprisingly straight-forward.

- One set of participants always proposes their matches to the other set of participants
- Those who propose (proposers) list their most to least favorite
- Those who get proposed to (acceptors) always accept the most favorable proposal if not currently matched
- Matched acceptors can switch to a new suitor if the new suitor is preferable
- The algorithm terminates when everyone is matched

The pseudocode from the Wikipedia page is as follows:

algorithm stable_matching is Initialize all m ∈ M and w ∈ W to free while ∃ free man m who still has a woman w to propose to do w := first woman on m's list to whom m has not yet proposed if w is free then (m, w) become engaged else some pair (m', w) already exists if w prefers m to m' then m' becomes free (m, w) become engaged else (m', w) remain engaged end if end if repeat

Here is the sample data from the Wikipedia page. There are 4 proposers and 4 acceptors. Rows represent either proposers or acceptors depending on the table, and columns their ranked preferences, where numbers are indices to their preferred matches.

prefProposers = [2,1,3,4;4,1,2,3;1,3,2,4;2,3,1,4] prefAcceptors = [1,3,2,4;3,4,1,2;4,2,3,1;3,2,1,4] wikiSolution = [1,1;2,4;3,3;4,2];

prefProposers = 2 1 3 4 4 1 2 3 1 3 2 4 2 3 1 4 prefAcceptors = 1 3 2 4 3 4 1 2 4 2 3 1 3 2 1 4

Unfortunately, it is not very intuitive to use numbers to walk through this example. To avoid a cliché, I tried to come up with a non-gender-based example, but I couldn't come up with a compelling one. For example, Loren asked me to apply the SMP to matching people to their favorite food, but I can't imagine pizza or sushi rejecting me (not sure if I can handle that).

This is not very original but at least I can reverse the traditional gender roles in our little Jane Austen drama, and let girls ask out boys.

proposers = ["Charlotte","Jane","Lizzy","Lydia"]; acceptors = ["Bingley","Collins","Darcy","Wickham"];

The first row of `prefProposers` represents Charlottes' preferences, and her first choice is 2, or Collins. Her second is choice 1, or Bingley, etc.

```
prefCharlotte = prefProposers(proposers == "Charlotte",:)
```

prefCharlotte = 2 1 3 4

The pseudocode is implemented literally as the function `galeShapley1` (see the Functions section at the bottom), which sequentially iterates over proposers to find their matches. The routine inside the while loop is in the subfunction `findMatch`.

It uses the vector `matched` to keep track of engagements, which is initially all zeros. Rows 1 to 4 map to respective proposers. If all elements are zeros, you know all proposers are free.

matched = [0;0;0;0]

matched = 0 0 0 0

When Charlotte proposes to Collins and he accepts, the first element is replaced with 2, which maps to Collins. Now they are engaged. We continue this way until Lydia proposes to Collins, who is already engaged to Charlotte.

matched = [2;4;1;0]

matched = 2 4 1 0

Because Collins prefers Lydia to Charlotte, he switches, and Charlotte becomes free. We know this because the first element is zero.

matched = [0;4;1;2]

matched = 0 4 1 2

Now we tried to find a match for Charlotte. Since she was already rejected by her first choice, we go with her second, Bingley, who is engaged to Lizzy. Bingley likes Charlotte better. Now Lizzy becomes free.

matched = [1;4;0;2]

matched = 1 4 0 2

Lizzy's second choice is Darcy, and Darcy is free. Now they are engaged.

matched = [1;4;3;2]

matched = 1 4 3 2

Now everyone is engaged, and the algorithm terminates.

Let's run the MATLAB code.

stableMatch1 = galeShapley1(prefProposers,prefAcceptors);

And here is the outcome.

namedMatch = strings(size(stableMatch1,1),1); namedMatch(:,1) = proposers'; for ii = 1:length(acceptors) namedMatch(stableMatch1(:,2) == ii,2) = acceptors(ii); end namedMatch

namedMatch = 4×2 string array "Charlotte" "Bingley" "Jane" "Wickham" "Lizzy" "Darcy" "Lydia" "Collins"

And here is the result in numbers, which matches to the solution in the GIF animation from the Wikipedia page.

doesMatch = isequal(wikiSolution,stableMatch1)

doesMatch = logical 1

#### Improved Implementation

We know that the sequential approach is not very efficient. In the first round, no one is rejected or matched yet. The acceptors are supposed to accept any first proposals. Therefore, we don't have to iterate over each proposer in this round. We could just copy their first choices.

matched = [2;4;1;2]

matched = 2 4 1 2

Then we need to check if any acceptors have received multiple proposals, such as 2 (Collins) in this example, and deal with them.

Function `galeShapley2` implements this idea. From the second round on, we use the original routine. We can do this because in the Gale-Shapley algorithm, the order of execution doesn't affect the outcome, as you can see from the output below.

[stableMatch2,scores] = galeShapley2(prefProposers,prefAcceptors);

The output matches the previous result.

doesMatch = isequal(stableMatch1,stableMatch2)

doesMatch = logical 1

#### How Happy Are They?

With `galeShapley2`, the second return variable `scores` shows how satisfied the participants are with the matches. It contains the rankings of their final matches. The first column is for proposers and the second column acceptors.

scores

scores = 2 1 1 2 2 3 1 2

- Charlotte is matched to her #2 choice, Jane her #1, Lizzy her #2, and Lydia her #1.
- Bingley his #1, Collins his #2, Darcy his #3, and Wickham his #2.

We can sum the rankings to see who did better. The smaller the score the better.

totalScores = sum(scores)

totalScores = 6 8

It appears that the proposers got better matches than acceptors.

#### Reversing the Roles

Let's see what happens if we switch the proposers and acceptors.

[stableMatch3,scores] = galeShapley2(prefAcceptors,prefProposers);

And here is the outcome. You see that the outcome is not the same as before.

stableMatch3

stableMatch3 = 1 1 2 3 3 4 4 2

And here are the scores.

scores

scores = 1 2 1 1 1 3 2 2

- Bingley is matched to his #1 choice, Collins his #1, Darcy his #1, and Wickham his #2.
- Charlotte her #2, Jane her #1, Lizzy her #3, and Lydia her #2.

totalScores = sum(scores)

totalScores = 5 8

Again, those who proposed got better matches than those who accepted.

#### Simulations Over 100 Trials

Under the Gale-Shapley algorithm, proposers generally do better than acceptors.

- Proposers start out with their first choices and they can only go down from there
- Acceptors can only improve the match through rejection, and they don't have control over how many proposals they get

Let's generate a dataset randomly using the function `generateSMPdataset` and plot the scores of proposers and acceptors. You can see that the scores of proposers are generally lower than those of acceptors and therefore they are getting the better matches.

iter = 100; totalScores = zeros(iter,2); for ii = 1:iter [prefProposers,prefAcceptors] = generateSMPdataset(4,4); [~, scores] = galeShapley2(prefProposers,prefAcceptors); totalScores(ii,:) = sum(scores); end figure plot(1:iter,totalScores(:,1)) hold on plot(1:iter,totalScores(:,2)) hold off xlabel("Trials") ylabel("Scores - the Lower the Better") title(["Gale-Shapley Algorithm - Scores by Role";"The Lower the Score, the Better"]) legend("Proposers","Acceptors","Location","best")

#### It's Better to Be Proposers Than Acceptors

Naively, it looks better to be acceptors because they can say no and don't have to face humiliating rejections. But, in exchange for the pain of rejection, proposers get better choices and that leads to the better outcomes. It is also known that if acceptors reject more by lying about their preferences, then they can game the system in their favor.

**Let this be your life lesson: initiate and risk rejections because that gives you better choices.**

It is also very important to consider who will be the proposers when we design a system that affects who the system benefits the most.

- In the case of NRMP, the medical students are the proposers. The system favors the medical students over the residency programs
- NYC High School Choice program, the students are the proposers. The system favors the students over the high schools they are applying for
- The edge servers are the proposers. The system favors the edge servers over the web requests

The system doesn't have to be set up this way, but they are what they are because they are designed to benefit one type of participants over the other.

#### Back to the Original Challenge

Now that we are familiar with the basic formulation of the SMP and the Gale-Shapley algorithm, let's go back to the challenge the original poster gave us.

In a milonga (dance event), we have 4 followers and 4 leaders.

namesF = ["Anna","Brigitte","Caroline","Dominique"]; namesL = ["Oliver","Thomas","Henri","Frank"];

Their rankings are as follows:

prefF = [1,2,3,4;1,3,2,4;2,4,1,3;3,1,4,2] prefL = [4,3,2,1;3,4,1,2;2,3,4,1;3,1,4,2]

prefF = 1 2 3 4 1 3 2 4 2 4 1 3 3 1 4 2 prefL = 4 3 2 1 3 4 1 2 2 3 4 1 3 1 4 2

- Only leaders can invite followers and followers cannot reject a leader if they are free
- There is only time for 3 tandas during the milonga
- A tanda is a set of 3 or 4 songs and you dance with the same partner during a tanda
- In a new tanda, however, no one can dance with a previous partner

The puzzle

- Who does leader 1 (Oliver) dance with on the third tanda?
- Which dancers don't get to dance with their first choices during the milonga?

The Coding Challenge

- Write a program in your favorite language to find the GS outcome for a single tanda (Done:
`galeShapley2`) - Extend the program to find the solution for a milonga of
`t`tandas. Use your program to obtain the solution to the puzzle above.

Here is the function `milongaMatch` to meet the coding challenge. The only new rule we need to deal with is that people are not allowed to dance with the same person again. Please note that this variant no longer 100% guarantees that everyone gets a partner (dancers, take notice). It stops when free proposers run out of acceptors to propose to.

tandas = milongaMatch(prefL,prefF,3);

Let's assign names to the output.

solution = strings(size(tandas)); for ii = 1:length(namesF) solution(tandas == ii) = namesF(ii); end solution = array2table(solution,"RowNames",namesL, ... "VariableNames","Tanda" + (1:size(tandas,2)))

solution = 4×3 table Tanda1 Tanda2 Tanda3 ___________ ___________ ___________ Oliver "Dominique" "Brigitte" "Caroline" Thomas "Caroline" "Anna" "Brigitte" Henri "Brigitte" "Dominique" "Anna" Frank "Anna" "Caroline" "Dominique"

The answer to the puzzle

- Oliver (row 1) dances with Caroline (3) in the third tanda (column 3)
- Anna (1) doesn't get to dance with her first choice Oliver (row 1), but everyone else dances with their respective first choice at some point

One complaint I have with Gale-Shapley is that it requires the two types of participants be in equal numbers. When it comes to social dancing, that never happens - usually we get more followers than leaders. I came up with a variant `flexibleGS` that can handle when proposers and acceptors are not equal in numbers. Give it a try!

prefProposers = [2,1,3,4,5;4,1,2,3,5;1,3,2,4,5;2,3,1,4,5]; prefAcceptors = [1,3,2,4;3,4,1,2;4,2,3,1;3,2,1,4;1,3,2,4]; stableMatch4 = flexibleGS(prefProposers,prefAcceptors);

#### Summary

Do you like puzzles and coding challenges? Did you find the SMP interesting? Can you think of fun examples to apply the SMP to? I am still thinking of a food example. There are restaurants in Japan that reject first time customers, unless accompanied by a regular who can vouch for you that your taste buds are worthy of their cuisine. Is that a SMP though? Share your fun SMP ideas here.

#### Functions

**galeShapley1**

function matched = galeShapley1(prefP,prefA) % Initialize all proposers and acceptors to free matched = zeros(size(prefP,1),1); proposed = false(size(prefP,1)); % while there exist a free proposer |p| ... while any(matched == 0) % find a match for |p| [matched,proposed] = findMatch(prefP,prefA,matched,proposed); end matched = [(1:size(prefP,1))',matched]; end

**findMatch**

function [matched,proposed] = findMatch(prefP,prefA,matched,proposed) p = find(matched == 0,1); % with |p| who still has acceptors to propose to acceptors = prefP(p,~proposed(p,:)); for ii = 1:length(acceptors) % |a| is |p|'s highest ranked such acceptor to whom |p| has not yet % proposed a = acceptors(ii); proposed(p,prefP(p,:) == a) = true; % if |a| is free if ~any(matched == a) % (p, a) become engaged matched(p) = a; break else % else some pair (p', a) already exists p_ = find(matched == a); [ranking,~] = find(prefA(a,:) == [p;p_]); % if |a| prefers |p| to |p'| if ranking(1) < ranking(2) % |p'| becomes free matched(p_) = 0; % (p, a) become engaged matched(p) = a; break else % else (p', a) remain engaged end end end end

**galeShapley2**

function [matched,varargout] = galeShapley2(prefP,prefA) matched = zeros(size(prefP,1),1); proposed = false(size(prefP,1)); while any(matched == 0) % This handles round 1 if all(matched == 0) matched = prefP(:,1); proposed(:,1) = true; [acceptors,~,idx] = unique(matched); a = acceptors(accumarray(idx,1) > 1); for ii = 1:length(a) p = find(ismember(matched,a(ii))); matched(p) = 0; [ranking,~] = find(prefA(a(ii),:) == p); [~,idx] = min(ranking); matched(p(idx)) = a(ii); end else % round 2 and onwards [matched,proposed] = findMatch(prefP,prefA,matched,proposed); end end matched = [(1:size(prefP,1))',matched]; % scoring the matching result scores = zeros(size(matched)); for ii = 1:size(matched,1) p = matched(ii,1); a = matched(ii,2); scores(p,1) = find(prefP(p,:) == a); scores(a,2) = find(prefA(a,:) == p); end varargout{1} = scores; end

**generateSMPdataset**

function [prefP,prefA] = generateSMPdataset(N,M) prefP = zeros(N,M); prefA = zeros(M,N); for ii = 1:N prefP(ii,:) = randperm(M); end for ii = 1:M prefA(ii,:) = randperm(N); end end

**milongaMatch**

function tandas = milongaMatch(prefL,prefF,t) tandas = zeros(size(prefL,1),t); for ii = 1:t tanda = zeros(size(prefL,1),1); proposed = false(size(prefL,1)); % make sure no one dances with a previous partner if ii > 1 prev = tandas(:,1:ii-1); for jj = 1:size(prefL,1) proposed(jj,:) = ismember(prefL(jj,:),prev(jj,:)); end end while any(tanda == 0) [tanda,proposed] = findMatch(prefL,prefF,tanda,proposed); % stop if free proposers run out of acceptors to propose to if all(all(proposed(tanda == 0,:),2)) break end end tandas(:,ii) = tanda; end end

**flexibleGS**

function matched = flexibleGS(prefP,prefA) N = size(prefP,1); M = size(prefA,1); if N < M matched = zeros(M,1); else matched = zeros(N,1); end proposed = false(N,M); % while we still have free proposers ... while any(matched(1:N) == 0) % find them matches [matched,proposed] = findMatch(prefP,prefA,matched,proposed); % but stop if they run out of acceptors to propose to if all(all(proposed(matched(1:N) == 0,:),2)) break end end if N < M matched(matched == 0) = setdiff(1:M,matched); matched = [[(1:N)';0],matched]; else matched = [(1:size(prefP,1))',matched]; end end