Convert a Sudoku Image into a 2D Array Grid Using an API

Vitrag Shah
9 min read · Apr 29, 2021

Identifying digits from a Sudoku grid?

H: As easy as pie.

Computer: Not for us. Want to know how easy it is? Check this out.

This article explains the following:

  • How the whole process works
  • How the API is called and used
  • How the Sudoku image is passed
  • How the ML model reads the Sudoku grid and forms an array
  • How the Sudoku grid is returned

API: The Digital Bridge

  • As the saying goes, “As the name, so the attributes.” I always think of an API as a digital bridge: just as a bridge connects two physical locations, an API connects two different applications. In more technical terms, it acts as an interface.
  • The code that detects the Sudoku and builds the 2D array is written in Python, but the many platforms and technologies in everyday use do not necessarily use Python.
  • So think of an API as an interface: a software intermediary that allows two applications to talk to each other. Each time you use an app like Facebook, send an instant message, or check the weather on your phone, you are using an API. The API I have created returns the extracted grid as a 2D array.
  • Luckily, Python has several frameworks that make building an API easy, namely Flask, Django, Bottle, CherryPy, FastAPI and many more; you can check them out here. I used Flask for this project because it is easy to use and lets you build an API in a few lines of code.

PROCESSES:

  1. Encoding Image
  2. Sending Request
  3. Getting Response
  • These are the steps on the user side, where the API request is made.

Encoding Image

Encoding image to base64
  • With Python you can encode an image to base64 in 3 lines of code. Let me show you how an encoded image looks in base64 form.
Sudoku Image
image in base64 format

Check out this link to convert an image to base64.

  • Base64 is mainly useful for small images, since the encoded text is about a third larger than the original bytes. You can send the data from an image input in the front end to the back end wrapped inside form data as a base64 image. We do not have to store the image somewhere in the cloud; we can pass it as a string variable and get things done.
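The original snippet was an image and is not reproduced here, so the following is only a minimal sketch of the encoding step using Python's standard base64 module. The function names encode_image_bytes and encode_image_file are illustrative, not taken from the original code.

```python
import base64


def encode_image_bytes(raw: bytes) -> str:
    """Return the base64 text for the raw bytes of an image."""
    return base64.b64encode(raw).decode("ascii")


def encode_image_file(path: str) -> str:
    """Read an image file from disk and return its base64 text."""
    with open(path, "rb") as f:
        return encode_image_bytes(f.read())
```

Calling encode_image_file("sudoku.jpg") (a hypothetical file name) gives a plain string that can travel in a request body.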

SENDING REQUEST

Sending Request
  • With the Python requests module we can send the request to test_url (the link to the API hosted in the cloud) and get a response in JSON format that contains the grid as a 2D array.
Response in JSON format of 2D-array
  • Those were the few lines of code on the user side to encode the image, make a request, and get the response. Now let us see what happens on the other side: how the request is handled, how the response is built, and so on.
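As a rough sketch of the user-side request, assuming the API accepts the base64 text as the raw request body: the helper names build_request and fetch_grid and the content-type header are assumptions, and test_url stands for whatever URL the API is hosted at.

```python
import base64


def build_request(image_bytes):
    """Return (headers, body) for posting a base64-encoded image."""
    headers = {"content-type": "text/plain"}  # assumed header, not from the article
    body = base64.b64encode(image_bytes)
    return headers, body


def fetch_grid(test_url, image_bytes, timeout=30):
    """POST the encoded image to test_url and return the parsed JSON response."""
    import requests  # third-party package, imported here so build_request works without it

    headers, body = build_request(image_bytes)
    response = requests.post(test_url, data=body, headers=headers, timeout=timeout)
    response.raise_for_status()  # fail loudly on HTTP errors
    return response.json()
```

The exact shape of the returned JSON depends on the server, so this sketch hands back the parsed response as is.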

WHAT IS HAPPENING ON THE OTHER SIDE

  1. Getting Request
  2. Decoding Image
  3. Detecting Grid
  4. Cropping Cells
  5. Identifying The Digit
  6. Return The Array

GETTING REQUEST

getting request
  • When the user makes a request, a particular function is executed in the cloud, which is test() in this case. Here r is the request, which carries the image as r.data (see the previous section on sending the request).
  • r.data holds the image in base64 format, which we cannot use directly, so we need to decode it.

DECODING IMAGE

  • When the user makes a request to test_url, the server hosted in the cloud first decodes the passed data into an image array for further processing.
Decoding
  • So now we have the image in array form, which is what we need for image processing.

DETECTING GRID

  • I would say this is the most important part of the whole process. You need to do well here; otherwise, no matter how good your ML model is at recognizing digits, it is of no use. So let’s see what this part involves.
  1. Blurring
  2. Inverse-Thresholding
  3. Dilation
  4. Finding Contours
  5. Cropping The Grid

Blurring

  • When we blur an image, we make the color transition from one side of an edge to the other smooth rather than sudden. The effect is to average out rapid changes in pixel intensity. Blurring, or smoothing, an image removes “outlier” pixels that may be noise. In a nutshell, it removes noise, which is essential in this case.
  • Before blurring, make sure the image is in grayscale mode. A grayscale image needs less information per pixel than a color image. In addition, grayscale images are entirely sufficient for many tasks, so there is no need to use more complicated and harder-to-process color images.
Blurred Image

Inverse-Thresholding

  • You can check this out to find a suitable thresholding filter; it also includes a Sudoku example showing the effect of different filters.
Different filters and their effect
  • I used cv2.ADAPTIVE_THRESH_GAUSSIAN_C, which suits this case especially well. We also invert the colors, so the dark grid lines become white on a black background and the grid edges are easier to find.
INVERSE THRESHOLDED IMAGE

DILATION

  • Dilation adds pixels to the boundaries of objects in an image. The number of pixels added depends on the size and shape of the structuring element used to process the image. The state of any given pixel in the output image is determined by applying a rule to the corresponding pixel and its neighbors in the input image. You can read more about it in the following link.
  • Rule of dilation: the value of the output pixel is the maximum value of all pixels in the neighborhood. In a binary image, a pixel is set to 1 if any of the neighboring pixels has the value 1. Morphological dilation makes objects more visible and fills in small holes in objects.
  • In a nutshell, we thicken the edges further so that we can detect contours with ease.
The edges are thicker than they were earlier.

Finding Contours

  • A contour can be explained simply as a curve joining all the continuous points along a boundary that have the same color or intensity. Contours are a useful tool for shape analysis and for object detection and recognition.
A rectangle is drawn on the contours in the 1st image and a circle in the 2nd.
  • Because a Sudoku image can contain many contours, cropping the grid out is not straightforward. We have to compute the area covered by each contour, and the contour with the largest area is our grid.
The contour we were looking for.
  • But this is not the last step; we still have to crop the grid out.

Cropping the Grid

  • Now we have to transform the image so that the grid can be cropped out, which we can do with another great OpenCV function.
  • You can refer to the following to learn the mathematics behind it.
  • That completes the major part of this long procedure. The cropped grid will be used in the next operations.

Cropping Cells

  • Compared with all the previous steps, you will find this one the easiest: we just need to cut the image into 9x9=81 cells.
[i][j].png is the cell in the ith row and jth column
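Cutting the cropped grid into cells needs only array slicing. The sketch below assumes the grid is a NumPy array that has already been warped to a square; split_into_cells is an illustrative name.

```python
import numpy as np


def split_into_cells(grid, n=9):
    """Return an n x n nested list where cells[i][j] is row i, column j."""
    h, w = grid.shape[:2]
    ys = np.linspace(0, h, n + 1).round().astype(int)  # row boundaries
    xs = np.linspace(0, w, n + 1).round().astype(int)  # column boundaries
    return [
        [grid[ys[i]:ys[i + 1], xs[j]:xs[j + 1]] for j in range(n)]
        for i in range(n)
    ]
```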

Identifying The Digit

  • Now that we have 81 cells in total, we map each cell image to a digit and store that digit in the 2D array, which will be sent back to the user in JSON format.
A cell after being cropped from the Sudoku

This is carried out in 3 steps:

  1. Removing Grid-lines
  2. Cropping Digit
  3. Re-centering It

Removing Grid-lines

  • We remove the grid lines so that only the digit remains in the image, which makes it easier for the ML model to detect.
After removing the grid lines

Cropping Digit

  • We need to crop the digit because in the next step we will re-center it around its midpoint.
Cropped digit

Re-centering it

  • We pad the image with extra zeros so that the digit sits in the center, which helps the model perform better when detecting which digit it is.
Re-centered digit
  • You can check the digit_cropper code in my Jupyter notebook; the link is above.
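The three steps can be sketched in plain NumPy as follows. This is a simplified stand-in, not the author's digit_cropper: it assumes a thresholded cell (bright digit on black) and removes grid lines by blanking a thin border, and the margin and pad values are arbitrary.

```python
import numpy as np


def clean_cell(cell, margin=0.1, pad=4):
    """Return a square, zero-padded image of the digit, or None for an empty cell."""
    h, w = cell.shape
    my, mx = int(h * margin), int(w * margin)
    work = cell.copy()
    # 1. Remove grid lines: blank a thin border where line fragments usually sit.
    work[:my, :] = 0
    work[h - my:, :] = 0
    work[:, :mx] = 0
    work[:, w - mx:] = 0
    # 2. Crop the digit: bounding box of the remaining bright pixels.
    ys, xs = np.nonzero(work)
    if ys.size == 0:
        return None  # nothing left, so the cell is empty
    digit = work[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    # 3. Re-center: paste the digit into the middle of a zero-filled square.
    dh, dw = digit.shape
    side = max(dh, dw) + 2 * pad
    out = np.zeros((side, side), dtype=cell.dtype)
    top, left = (side - dh) // 2, (side - dw) // 2
    out[top:top + dh, left:left + dw] = digit
    return out
```

The result would still need resizing to the model's input size (28x28 for MNIST) before prediction.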

It is prediction time

  • So you have the MNIST handwritten digit dataset; what else do you need? This classic dataset of handwritten images has served as the basis for benchmarking classification algorithms. As new machine learning techniques emerge, MNIST remains a reliable resource for researchers and learners alike.
  • The following code snippets explain:
      • How the machine learning model is made
      • Which libraries are imported
      • What the structure of the model is
      • How the model is trained
      • How the prediction is made
Loading the data
Model Structure
  • Once you have built your model, you can pass it a cell image and it will return the predicted digit, which is stored in the 2D array. A cell that does not contain any digit gets the value 0.
  • You can refer to this link to make an ML model for your project.
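The original snippets are not reproduced here, so the following is only a sketch of how such a model might be built, trained and used with Keras. The layer sizes, epoch count and function names are assumptions, not the author's architecture.

```python
import numpy as np
from tensorflow import keras


def build_model():
    """A small convolutional network for 28x28 grayscale digit images."""
    model = keras.Sequential([
        keras.layers.Input(shape=(28, 28, 1)),
        keras.layers.Conv2D(32, 3, activation="relu"),
        keras.layers.MaxPooling2D(),
        keras.layers.Conv2D(64, 3, activation="relu"),
        keras.layers.MaxPooling2D(),
        keras.layers.Flatten(),
        keras.layers.Dense(128, activation="relu"),
        keras.layers.Dense(10, activation="softmax"),  # one output per digit 0-9
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model


def train(model, epochs=3):
    """Fit the model on MNIST (downloads the dataset on first use)."""
    (x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
    x_train = x_train[..., None].astype("float32") / 255.0
    x_test = x_test[..., None].astype("float32") / 255.0
    model.fit(x_train, y_train, epochs=epochs, validation_data=(x_test, y_test))
    return model


def predict_digit(model, cell):
    """Return the predicted digit for a 28x28 cell, or 0 for an empty cell."""
    if cell is None or not cell.any():
        return 0  # empty cells are stored as 0 in the grid
    x = cell.astype("float32")[None, ..., None] / 255.0  # shape (1, 28, 28, 1)
    return int(np.argmax(model.predict(x, verbose=0), axis=1)[0])
```

Typical use would be model = train(build_model()) once, then predict_digit(model, cell) for each of the 81 cells.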

Returning Array

  • Once you have made the predictions and placed each predicted value at its corresponding location, your 2D array is ready to be sent back in JSON format.
Final Grid
  • This is how the grid is built and passed back to the machine that made the API request.
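As a final sketch, the 81 predictions can be arranged into a 9x9 array and serialized as below. The "grid" key is an assumption about the response format; converting with tolist() matters because NumPy arrays are not JSON serializable directly.

```python
import json

import numpy as np


def grid_to_json(predictions):
    """Arrange 81 row-major predictions into a 9x9 grid and serialize it."""
    grid = np.asarray(predictions, dtype=int).reshape(9, 9)
    return json.dumps({"grid": grid.tolist()})  # "grid" key is an assumed name
```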

So this is exactly what happens inside. A single click (uploading an image) acts like a butterfly effect that one never sees; the depth of these operations proves it.
