Convert a Sudoku Image into a 2D Array Grid Using an API

9 min read · Apr 29, 2021

Identifying digits from a Sudoku grid?
Human: As easy as pie.
Computer: Not for us. Want to know how easy it is? Check this out.

This article explains the following:

How does the whole thing get done?
How is the API called and used?
How is the Sudoku image passed?
How does the ML model understand the Sudoku grid and form an array?
How is the Sudoku grid returned?

API: The Digital Bridge

As the saying goes, “As the name, so are the attributes.” I always think of an API as a digital bridge: just as a bridge connects two physical locations, an API connects two different applications. In more technical terms, it acts as an interface.
The code that detects the Sudoku and builds the 2D array is written in Python, but the many platforms and technologies in day-to-day use do not necessarily use Python.

So think of an API as an interface: a software intermediary that allows two applications to talk to each other. Each time you use an app like Facebook, send an instant message, or check the weather on your phone, you are using an API. Here, the API I have created returns the extracted grid as a 2D array.
For Python, luckily, there are several frameworks with which you can build an API with ease, namely Flask, Django, Bottle, CherryPy, FastAPI, and many more; you can check them out here. I used Flask for this project because of its ease of use: you can build your API in a few lines of code.

PROCESSES:

Encoding Image
Sending Request
Getting Response

These are the processes involved on the user side, where the API request is made.

Encoding Image

With Python, you can encode an image to base64 in about three lines of code. Let me show you how an encoded image looks in base64 form.
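Here is a minimal sketch of the encoding step using only the standard library. The function name `encode_image` is my own choice for illustration, and this is not necessarily the exact code behind the API.

```python
import base64


def encode_image(path):
    """Read an image file and return its bytes as a base64 string."""
    with open(path, "rb") as image_file:
        raw_bytes = image_file.read()
    # b64encode returns bytes; decode to get a plain ASCII string
    return base64.b64encode(raw_bytes).decode("ascii")
```

Calling `encode_image("sudoku.jpg")` (a hypothetical file name) gives one long string of letters, digits, `+`, `/`, and `=` that can travel in a request body like any other text.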

Check out this link to convert an image to base64.

Base64 is mainly useful for small images: you can send the data from an image input on the front end to the back end as a base64 string wrapped inside form data. We do not have to store the image somewhere in the cloud; we can pass it as a string variable and get things done.

SENDING REQUEST

With Python's requests module, we can send the request to test_url (the link to the API hosted on the cloud) and get a response in JSON format that contains the grid as a 2D array.
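The request could look roughly like the sketch below. The `TEST_URL` value is a placeholder (the real hosted link is not reproduced here), and `request_grid` and its optional `session` parameter are names I made up for illustration.

```python
import requests

# Placeholder only: replace with the real hosted API link (test_url)
TEST_URL = "http://localhost:5000/test"


def request_grid(encoded_image, url=TEST_URL, session=None):
    """POST the base64 string and return the parsed JSON response."""
    session = session or requests.Session()
    response = session.post(url, data=encoded_image, timeout=30)
    response.raise_for_status()  # fail loudly on 4xx/5xx answers
    return response.json()
```

Passing the session in as a parameter is just a convenience: it lets you reuse one connection for many images, or swap in a fake session while testing.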

Those were the few lines of code on the user side to encode the image, make the request, and get the response. Now let us see what happens on the other side: how the request is handled, how the response is built, and so on…

WHAT IS HAPPENING ON THE OTHER SIDE

Getting Request
Decoding Image
Detecting Grid
Cropping Cells
Identifying The Digit
Return The Array

GETTING REQUEST

So when the user makes a request to the cloud, a particular function gets executed, which is test() in this case. Here r is the request, which carries the image as r.data (you can refer back to the previous section on sending the request).
r.data holds the image in base64 format, but we cannot use it directly, so we need to decode it.

DECODING IMAGE

When the user makes a request to test_url, which is hosted on the cloud, the server first decodes the passed data into a 2D array of the image for further processing.

So now we have the image in array form, which is what we need for image processing.

DETECTING GRID

I would say this is the most important part of the whole process. You need to do well here; otherwise, no matter how good your ML model is at recognizing digits, it is of no use. So let's see what is involved in this part.

Blurring
Inverse-Thresholding
Dilation
Finding Contours
Cropping The Grid

Blurring

When we blur an image, we make the color transition from one side of an edge in the image to the other smooth rather than sudden. The effect is to average out rapid changes in pixel intensity. The blur, or smoothing, of an image removes “outlier” pixels that may be noise in the image. In a nutshell, it removes noise, which is essential in this case.
Before blurring, make sure the image is in grayscale mode. The reason for differentiating such images from any other sort of color image is that less information needs to be provided for each pixel. In addition, grayscale images are entirely sufficient for many tasks, so there is no need to use more complicated and harder-to-process color images.

Inverse-Thresholding

Image Thresholding - OpenCV-Python Tutorials 1 documentation (opencv-python-tutroals.readthedocs.io)

Check it out to find the thresholding filter that suits you; it also includes a Sudoku example showing the effect of different filters.

cv2.ADAPTIVE_THRESH_GAUSSIAN_C is the adaptive method I used, as it is better suited to this case in particular. We also invert the colors so that the grid edges can be found.

DILATION

Dilation adds pixels to the boundaries of objects in an image. The number of pixels added to the objects depends on the size and shape of the structuring element used to process the image. The state of any given pixel in the output image is determined by applying a rule to the corresponding pixel and its neighbors in the input image. You can read more about it at the following link.

Types of Morphological Operations - MATLAB & Simulink (www.mathworks.com)

Rule of dilation: The value of the output pixel is the maximum value of all pixels in the neighborhood. In a binary image, a pixel is set to 1 if any of the neighboring pixels has the value 1. Morphological dilation makes objects more visible and fills in small holes in objects.

In a nutshell, we thicken the edges further so that contours can be detected with ease.

The edges are thicker than they were earlier.

Finding Contours

Contours can be explained simply as a curve joining all the continuous points (along the boundary), having same color or intensity. The contours are a useful tool for shape analysis and object detection and recognition.

In the first image a rectangle is drawn on the contours, and in the second a circle.

OpenCV: Contours : Getting Started (docs.opencv.org)

Because a Sudoku image can contain many contours, cropping the grid out is not straightforward: we have to compute the area enclosed by each contour, and the contour with the largest area is our grid.

But this is not the last step; we still have to crop the grid out.

Cropping the Grid

To crop the grid, we have to transform the image so that the grid fills the frame, and we can do that with another great OpenCV function.

You can refer to the following to learn the mathematics behind it.

Geometric Transformations of Images - OpenCV-Python Tutorials 1 documentation (opencv-python-tutroals.readthedocs.io)

So now the major part of this long procedure is done, and the cropped grid will be used for the next operations.

Cropping Cells

Compared with all the previous steps, you will find this one the easiest: we just need to cut the image into 9x9 = 81 cells.

[i][j].png is the cell in the ith row and jth column.
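Cutting the cropped grid into cells needs nothing more than array slicing. This sketch assumes the grid is already a roughly square image; `split_into_cells` is a hypothetical helper name.

```python
import numpy as np


def split_into_cells(grid_image, size=9):
    """Return a size x size nested list; cells[i][j] is row i, column j."""
    height, width = grid_image.shape[:2]
    # Evenly spaced cut positions, rounded to whole pixels
    row_edges = np.linspace(0, height, size + 1).round().astype(int)
    col_edges = np.linspace(0, width, size + 1).round().astype(int)
    return [[grid_image[row_edges[i]:row_edges[i + 1],
                        col_edges[j]:col_edges[j + 1]]
             for j in range(size)]
            for i in range(size)]
```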

Identifying The Digit

Now that we have all 81 cells, we map each cell image to a digit and store that digit in a 2D array, which will be sent back to the user in JSON format.

This is carried out in three steps:

Removing Grid-lines
Cropping Digit
Re-centering It

Removing Grid-lines

We remove the grid lines so that only the digit remains in the image, which makes it easier for the ML model to detect.

Cropping Digit

We need to crop the digit because, in the next step, we will re-center it around its midpoint.

Re-centering it

We pad with extra zeros so that the digit sits in the center, and the model can perform better when detecting which digit it is.
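The cropping and re-centering steps can be sketched with NumPy alone. This assumes the grid lines are already removed and the digit is white on a black background; the 4-pixel margin is an arbitrary choice, and this is not the notebook's digit_cropper code.

```python
import numpy as np


def center_digit(cell, margin=4):
    """Crop the digit's bounding box, then zero-pad it into a square."""
    rows, cols = np.nonzero(cell)
    if rows.size == 0:
        return None  # nothing drawn in this cell
    digit = cell[rows.min():rows.max() + 1, cols.min():cols.max() + 1]
    height, width = digit.shape
    side = max(height, width) + 2 * margin
    top = (side - height) // 2
    left = (side - width) // 2
    # Pad with zeros (black) so the digit ends up in the middle
    return np.pad(digit,
                  ((top, side - height - top), (left, side - width - left)),
                  mode="constant", constant_values=0)
```

The result still has to be resized to whatever input size your model expects, for example 28x28 for an MNIST-style model.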

Digit cropper (Kaggle notebook, www.kaggle.com)

You can check the digit_cropper code in my Jupyter notebook; the link is above.

It is prediction time

So you have the MNIST handwritten digit dataset; what else do you need? This classic dataset of handwritten images has served as the basis for benchmarking classification algorithms. As new machine learning techniques emerge, MNIST remains a reliable resource for researchers and learners alike.
The following code snippets explain:
How is the machine learning model made?
Which libraries are imported?
What is the structure of the model?
How is the model trained?
How is the prediction made?

Once you have built your model, you can pass it a cell image; it returns the predicted digit, which is stored in the 2D array. A cell that does not contain any digit gets the value 0.
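Filling the array might look like this sketch. Here `predict_digit` stands for any callable that wraps your trained model, and the `min_pixels` rule for spotting empty cells is an assumption of mine, not necessarily how the original code decides.

```python
import numpy as np


def build_grid(cells, predict_digit, min_pixels=20):
    """Map 9x9 cell images to digits; empty cells stay 0."""
    grid = [[0] * 9 for _ in range(9)]
    for i, row in enumerate(cells):
        for j, cell in enumerate(row):
            # Treat a cell with very few white pixels as empty
            if np.count_nonzero(cell) >= min_pixels:
                grid[i][j] = int(predict_digit(cell))
    return grid
```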

You can refer to this link to make an ML model for your project.

Returning Array

Once you have made the predictions and placed every predicted value at its corresponding location, your 2D array is ready to be sent back in JSON format.
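Serializing the array is a one-liner, with one catch worth knowing: NumPy integers are not JSON serializable, so it is safer to cast every value to a plain `int` first. The `"grid"` key is an assumed name for the response field.

```python
import json


def grid_to_json(grid):
    """Serialize the 9x9 grid as a JSON string."""
    # Cast to built-in int so NumPy integer types do not break json.dumps
    plain = [[int(value) for value in row] for row in grid]
    return json.dumps({"grid": plain})
```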

This is how the grid is built and passed back to the machine from which the API request was made.

So this is exactly what happens inside. A single click (uploading an image) acts like the Butterfly Effect, which one never gets to see. The depth of these operations proves it.