File Exchange Pick of the Week

Our best user submissions

Semaphore POSIX and Windows

Posted by Sean de Wolski

Sean’s pick this week is Semaphore POSIX and Windows by Andrew Smart.

Background

I’ve been working with someone to help them parallelize multiple simulations of an external program. The external program can be driven by its COM API on Windows using actxserver. This is an ideal candidate for parallel computing because the simulation takes minutes to hours to run and needs to be run for different configurations where there are no dependencies between runs.

My first attempt at this was to use a parfor loop, looping over configurations in parallel. It worked the first time(!), but broke the second time and continued to be unreliable. The problem was that multiple parallel workers could not open the actxserver at the same time: doing so triggered a race condition.

So the challenge: how do I ensure that each parallel worker opens the actxserver at a different time than the others, ideally immediately after the previous one has finished?

The way to do this is with a semaphore, which MATLAB does not have built in. A semaphore is used to control access to a shared resource. I vaguely remembered having seen one on the File Exchange before, so I checked my Pick of the Week candidates list and, sure enough, there it was, waiting for a use case!

Using The Semaphore

NOTE: To use the semaphore, you first need to compile the MEX-file with the following command:

mex -O -v semaphore.c
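Before going further, it can help to confirm that the compiled function is actually visible on the path. The snippet below is a minimal sketch, assuming the compiled file keeps the name semaphore; the variable name isCompiled is only for illustration. It relies on exist returning 3 when a name resolves to a MEX-file.

```matlab
% exist(...,'file') returns 3 when the name resolves to a MEX-file on the path
isCompiled = (exist('semaphore', 'file') == 3);

if ~isCompiled
    % Assumption: semaphore.c sits in the folder where the submission was unpacked
    disp('semaphore MEX-file not found; run "mex -O -v semaphore.c" in the submission folder first.')
end
```

The parallel workers need to find the same MEX-file, so it is worth running the check from the folder (or path) the pool will use.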

I will first create the semaphore from the client MATLAB. Each parallel worker will grab it before opening the actxserver and release it afterward. The first worker to get there grabs the semaphore; the others wait in a first-in-first-out queue, and each one repeats the process in turn. After a few seconds of initialization time, all of the workers will be running the lengthy simulation in parallel.

Open a parallel computing pool. This pool has two workers because I have a dual-core laptop.

gcp
ans = 
 Pool with properties: 

            Connected: true
           NumWorkers: 2
              Cluster: local
        AttachedFiles: {}
          IdleTimeout: 90 minute(s) (85 minutes remaining)
          SpmdEnabled: true

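Create the semaphore with key 1 and initial value 1. The value here matters: with an initial value of 1, only one worker can hold the semaphore at a time, which is what we want.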

semkey = 1;
semaphore('create',semkey,1)

Run the parfor-loop

parfor ii = 1:4
    try
        semaphore('wait',semkey)
        disp(string('Iteration ') + ii + ' Started: ' + string(datetime('now')))
        pause(2) % Proxy for actxserver()
        semaphore('post',semkey)
        disp(string('Iteration ') + ii + ' Finished: ' + string(datetime('now')))
        % Run the simulation here

    catch
        % In case something goes wrong, release the semaphore
        semaphore('post',semkey)

    end
end
Iteration 3 Started: 08-Feb-2017 11:09:34
Iteration 2 Started: 08-Feb-2017 11:09:32
Iteration 2 Finished: 08-Feb-2017 11:09:34
Iteration 1 Started: 08-Feb-2017 11:09:37
Iteration 3 Finished: 08-Feb-2017 11:09:37
Iteration 1 Finished: 08-Feb-2017 11:09:39
Iteration 4 Started: 08-Feb-2017 11:09:39
Iteration 4 Finished: 08-Feb-2017 11:09:41

If you look at the times, you will see that iteration 2 started at 32 seconds and finished two seconds later (parfor iterations do not run in sorted order). Iteration 3 starts immediately after iteration 2 finishes, and when it finishes, iteration 1 begins. Iteration 4 starts after iteration 1 completes.
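The same reading of the output can be checked with a little datetime arithmetic. This is a small sketch that copies the timestamps printed above by hand, in the order the semaphore was held (iterations 2, 3, 1, 4); the variable names are only for illustration. The printed times have one-second resolution, and "Finished" is printed just after the release, so treat the durations as approximate.

```matlab
% Start/finish times copied from the output above, in the order the semaphore was held
fmt = 'dd-MMM-yyyy HH:mm:ss';
started  = datetime({'08-Feb-2017 11:09:32', '08-Feb-2017 11:09:34', ...
                     '08-Feb-2017 11:09:37', '08-Feb-2017 11:09:39'}, ...
                    'InputFormat', fmt, 'Locale', 'en_US');
finished = datetime({'08-Feb-2017 11:09:34', '08-Feb-2017 11:09:37', ...
                     '08-Feb-2017 11:09:39', '08-Feb-2017 11:09:41'}, ...
                    'InputFormat', fmt, 'Locale', 'en_US');

% Seconds each iteration spent inside the guarded section
heldSeconds = seconds(finished - started);

% True when every guarded section begins no earlier than the previous one ended
noOverlap = all(started(2:end) >= finished(1:end-1));
```

Here heldSeconds comes out as [2 3 2 2] and noOverlap is true, which matches the picture of the workers taking turns.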

Here’s an annotated image:

Clean up after.

semaphore('destroy',semkey)

I was able to get this working in just a few minutes but did learn a couple of lessons the hard way. A few things to note:

  • Use try/catch around the operations you’re doing. If the operation fails, you want to release the semaphore to unblock the other workers. If an error occurs and the semaphore is not released, the loop will hang: it can never complete because the other workers are still waiting. If you reach this state, force kill the parallel pool from the task manager.
  • Use Key=1 and Value=1, incrementing Key if you have multiple semaphores. My only advice to Andrew is to make this a little more clear in the help.
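One caveat with a catch-all handler: if the error happens after the semaphore has already been posted, the handler posts it a second time. A possible alternative is to tie the release to an onCleanup object inside a small helper function, so the release runs exactly once when the function exits, with or without an error. The sketch below is only an illustration under that assumption; runExclusive and its arguments are hypothetical names, not part of Andrew's submission, and progid in the usage comment is a placeholder for whatever COM ProgID your program uses.

```matlab
function out = runExclusive(acquireFcn, releaseFcn, workFcn)
% runExclusive  Hypothetical helper: run workFcn while holding a lock.
%
%   Usage sketch inside the parfor loop (semkey created as shown earlier):
%       h = runExclusive(@() semaphore('wait', semkey), ...
%                        @() semaphore('post', semkey), ...
%                        @() actxserver(progid));
%       % Run the simulation here, outside the guarded section

acquireFcn();                               % block until the lock is available

% onCleanup calls releaseFcn when this variable is destroyed, that is, when
% the function returns normally or exits because workFcn threw an error
releaser = onCleanup(releaseFcn);           %#ok<NASGU>

out = workFcn();                            % the guarded operation
end
```

Because the error from workFcn still propagates to the caller, a failure is not silently swallowed; the other workers are simply unblocked first.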

Comments

Give it a try and let us know what you think here or leave a comment for Andrew.


Published with MATLAB® R2016b
