# Sudoku Solver: Image Processing and Deep Learning

Posted by **Johanna Pingel**,

*image processing are both useful for object detection and image classification.*

**and****Image Processing**The two technologies I want to highlight today are:

__and__Deep LearningImage Processing |
Deep learning |

Transforming or modifying an image at the pixel level. For example, filtering, blurring, de-blurring, and edge detection (to name a few) | Automatically identifying features in an image through learning on sample images. Deep learning has has been revolutionizing the area of image processing in the past few years. |

“Deep learning has made ‘traditional’ image processing obsolete.”or

“Deep learning needs millions of examples and is only good for classifying pictures of cats anyway.”(I’m definitely guilty of classifying a few cats in my day). The reality is:

- Deep learning and image processing are effective tools to solve different problems.

**and**- These tasks are complex:
*use the right tool for the job*.

**For those of you not familiar, a Sudoku puzzle as below here is completed when each row, column and 3x3 square contain the numbers 1-9 exactly with no repeats. The puzzle begins with some of the numbers filled in. Which numbers and how many are filled in determines the complexity of the puzzle. Here, we want our algorithm to find the boxes, and fill in the missing numbers. But that’s almost too easy! We also want to solve the puzzle regardless of where it is in the image. Here’s an example image of where we expect the algorithm to be able to solve the puzzle. To solve this, we want to use the right tool for the job, and that means breaking this down into a few distinct parts:**

__Sudoku Puzzle__- Find the Puzzle - locate the box in the image
- Find the Boxes – identify each box of the 9x9 squares
- Read the numbers – these numbers can be digital or handwritten
- Solve the puzzle

**Step 1. Find the Puzzle**Overview: We have uncontrolled images, and background, and the object size and orientation could vary greatly between different images. Not to mention lighting, camera capturing conditions. There’s a lot of variability. Method? Deep Learning Let’s see if we can classify the pixels associated with the puzzle using semantic segmentation. To do this, we need to label training data. Here’s Justin's video of what that looks like in MATLAB using Image Labeler.

train = pixelLabelImageDatastore(imagesTrain, labelsTrain, ... 'OutputSize', inputSize(1:2)); test = pixelLabelImageDatastore(imagesTest, labelsTest, ... 'OutputSize', inputSize(1:2));We can then setup the network layers. It’s interesting to note that Justin made a function to balance the classes using class weighting. Justin pointed out to me that his function does exactly what this example does: https://www.mathworks.com/help/vision/examples/semantic-segmentation-using-deep-learning.html#d120e2892 Setup the network.

numClasses = 2; baseNetwork = 'vgg16'; layers = segnetLayers(inputSize, numClasses, baseNetwork); layers = sudoku.weightLossByFrequency(layers, train);Set up the training options.

opts = trainingOptions('sgdm', ... 'InitialLearnRate', 0.005, ... 'LearnRateDropFactor', 0.1, ... 'LearnRateDropPeriod', 20, ... 'LearnRateSchedule', 'piecewise', ... 'ValidationData', test, ... 'ValidationPatience', Inf, ... 'MaxEpochs', 40, ... 'MiniBatchSize', 2, ... 'Shuffle', 'every-epoch', ... 'Plots', 'training-progress', ... 'CheckpointPath', checkpointPath);Finally, train the network.

net = trainNetwork(train, layers, opts);This took roughly 20 minutes to run through 40 epochs, though differing hardware/GPUs will produce varying results. On a new test image, we get this result from the trained network: Not bad! It wasn’t even fooled too badly by the other box-shaped figure in the image. The smaller noise can be removed in the next section.

**Step 2. Find the Boxes**Here, we need to find the individual boxes in the grid. This is a well-defined problem: straight lines, always dark ink on light paper, and equally sized boxes. Also keep in mind, we already found the approximate location of the box in step 1. We can make everything besides that location black, making this a very cleanly defined problem. Method? Image Processing We talk a lot about image processing in our image processing blog. In fact, Steve's blog is the reason I became confident in my image processing skills. The key thing to remember here if you’re not an image processing expert is - you don’t have to be! MATLAB has apps to make this process easy. Check out the Image Segmenter app (here's a video that shows an overview) to explore detecting the boxes in the image. The code below was automatically generated by the app and will detect the individual squares in the image. First, we have to clean up the image, so any noise is gone.

BW_out = bwpropfilt(networkMask, 'Area', [100000 + eps(100000), Inf]);Then we dilate the mask to ensure it covers the entire box.

maskDilated = imdilate(BW_out, strel('disk', 120));We only care about the location where the box is, zero everything else out

grayIm = rgb2gray(im); grayIm(~maskDilated) = 0;And then find only the box in the image

%% Mask it BW = imbinarize(grayIm, 'adaptive', 'Sensitivity', 0.700000, 'ForegroundPolarity', 'bright'); % Invert mask BW = imcomplement(BW); % Clear borders BW = imclearborder(BW); % Fill holes BW = imfill(BW, 'holes'); imshowpair(im,BW)This video shows step 1 and 2 together. I find these results fascinating... and very robust!

**Step 3. Read the Numbers**There are many methods to reading handwritten and typed numbers. This is a challenging problem that must deal with different fonts and styles of numbers, but we have a variety of options:

- Optical character recognition (OCR) is a common method
- HOG with a machine learning classifier is another option. A MATLAB example is here

**synthetic data**. For the handwritten digits, this is easy – simply steal from the MNIST Dataset and add to our synthetic background shown in the figure below. For creating a variety of typed numbers, we want to vary the numbers to ensure they will be recognized regardless of the font used (Times New Roman, Verdana, etc)

resolution = size(im, 1); maxSize = resolution - 2*border; fontSize = round((maxSize - minSize)*rand(1) + minSize); textColour = maxColour*rand(1, 3); position = [resolution/2, resolution/2] + maxOffset.*(rand(1, 2) - 0.5);For both types of synthetically generated numbers, we want to vary the size and location of the numbers. This is what allows us to generate as much data as we want!

Synthetic Image - Handwritten | Synthetic Image - Typed |

*The idea of synthetic data warrants more time than I can give it in this post. Stay tuned for more posts on this in the future!*We now can train the network. Set the training options, create the layers, and train the network as we did before.

options = trainingOptions('sgdm', ... 'Plots', 'training-progress', ... 'L2Regularization', 1e-2, ... 'MaxEpochs', 8, ... 'Shuffle', 'every-epoch', ... 'InitialLearnRate', 0.01, ... 'LearnRateDropFactor', 0.1, ... 'LearnRateDropPeriod', 3, ... 'LearnRateSchedule', 'piecewise', ... 'ValidationData', test, ... 'ValidationPatience', Inf, ... 'MiniBatchSize', 64); % This is creating a network from scratch that closely resembles VGG16 layers = sudoku.training.vggLike(initialChannels, imageSize); % Train net = trainNetwork(train, layers, options);For this example, it’s looking like the network is getting roughly 97.8% accuracy. That accuracy is sufficient... for a Sudoku solution.

**Step 4. Solve the Puzzle**We have the boxes and the numbers. Now we need to fill in the other values. Method? Neither! This is an optimization problem We’ve written about Sudoku solvers previously:

- Cleve’s post on Sudoku is here: https://www.mathworks.com/company/newsletters/articles/solving-sudoku-with-matlab.html
- Or check out the optimization code in this documentation example: https://www.mathworks.com/help/optim/examples/solve-sudoku-puzzles-via-integer-programming.html

**key tips**when tackling a problem dealing with images and video:

- Deep learning is sometimes,
*but not always,*the right tool for the job - Both image processing and deep learning are great tools that can be combined to form the right solution You can download the complete solution here: https://www.mathworks.com/matlabcentral/fileexchange/68980-deep-sudoku-solver, and leave a comment for Justin below if you have any questions.

## Recent Comments