Final Analysis: Gerrymander contest
By Lucio Cetto
I have analyzed the wining entry in the Gerrymander contest. Three important sections of the wining entry, ‘dm1’ submitted by Paulo Uribe, are considered in this review. Please forgive me if I do not include the names of the original authors of each algorithm. It is very hard to trace the contributions, so I have decided to omit the names. I would rather acknowledge the collective innovation of the dozens of contestants who all contributed to ‘dm1’.
There are many sections of code which I do not comment on, either because they are easier to understand or because they do not significantly affect the overall results. For example, “ozzy” (aka “prince of Darkness”) was left out of this analysis because, by the end of the contest, it only outperformed the other subfunctions in one of the small maps in the contest test suite.
I will be using two maps taken from the test suites, Florida and France. These population density maps are based on real data. We would like to see how different parts of the winning entry solved this NP-complete problem.
The main idea is to construct a snake, and then divide it into ‘n’ districts based on the population map. A ‘snake’ is a path through the elements of the matrix that travels the whole map without getting stuck. This approach is easy to implement because it is not necessary to test for contiguity of the elements assigned to each district (because the snake itself is contiguous) or to care about empty elements (because the snake covers the whole map).
Once the snakes are created, they’re stored into a library, or nest. Lots of new snakes were added during the contest, some of them involving complex patterns of up to 5×5 elements. All possible solutions in the nest are scored with all possible rotations of the density map, and the best one is picked.
A clever modification to the snake algorithm is implemented in the first loops of ‘solver1’. It searches for repeated patterns in the map by checking if the density map is divisible by a prime number in any direction, and testing if the sub-maps are equal. If they are, then the problem size can be significantly be reduced by calling ‘solver1’ only once, and then replicating the results. Although this modification can rarely be applied to realistic maps, it succeeded in breaking down some of the problems in the test suite.
Another slight modification to ‘solver1’ is implemented in ‘solver0’. ‘solver0’ trims empty columns and rows on the border of the map, then runs ‘solver1’ a second time. The reduction of the map allows ‘solver1’ to make a better approximation of the initial partition. The trimmed region is then added to either the upper left district or to the bottom right. As you can see, ‘solver0’ generates conspicuous L-shaped districts at the border of the map.
“crystallise6” grows one district at the time until it gets close to the targeted district population. The district starts with the most populated element in the map that has not been yet assigned to any other region. A halo, or buffer, around the current district reduces the search space for new candidate elements that need to be considered for inclusion in the current growing district. When all elements in the halo are exhausted, a new halo is constructed for the current growing district. Once the district is grown to the targeted population, a new district starts.
Intricate maps can stump this growing algorithm. Consider, for example, one of the last regions that has not yet satisfied the targeted population quota, but cannot grow any more because there are no more adjacent candidate elements to add, i.e. it is surrounded by other districts. It requires the inclusion of a safe ‘escape’ clause.
Paulo added a slight variation to the algorithm in his last entry. The halo can be updated only after all the elements in the current halo have been tested (mode 0), or every time a new element is added the halo (mode 1). Each approach leads to different shaped districts, the second growing more aggressively than the first. This modification might lead to less convex districts, making the algorithm search for more elaborate solutions. It increases, however, the chances of creating underpopulated regions that can’t continue to grow because they are already surrounded. Therefore both modes need to be run, increasing the execution time by almost a second. Fortunately the improvement in the results paid off and his entry took first place.
Once the district has been grown, there is still the problem of what to do with the unassigned space. Crystallise6 uses a second pass to grow districts and assign empty spaces using the same halo strategy. This time we start with the district with the smallest population and grow it as much as possible.
“trade” is a polishing algorithm, used after the map has already been divided. It improves the results by finding potential elements that can be moved from one district to another. The algorithm is implemented in a while-loop. For each iteration, we first find the district border elements (candidates for trading). Every possible trade is assessed and only the best one will be executed for each district in the current iteration. This condition allows the reuse the current index of border elements without having to recompute it for every trade. The loop is repeated until we reach a state where moving none of the available elements for trade (those in the borders) can improve the score. “trade” improves both the snaking and the growing algorithm.
Published with MATLAB® 7.0