Using ML and image manipulation to compare model performances

Things used in this project
Hardware components
-Arduino Portenta H7
-Arduino Portenta Vision Shield
Story
Introduction
Image classification using Machine Learning could have many useful applications for daily tasks and has the potential to make our lives easier.
Machine Learning programs aim to learn from images to find the object patterns that make them recognizable. For example, we tend to recognize a fork because it has a handle, a neck and four tines. Therefore, an image classification model should identify those same patterns in a picture to classify it as a fork.
So, the question that might arise is: How can we teach the ML model the easiest way to recognize an object? Recent studies show that the way humans find patterns might not be the same way a computer does (https://youtu.be/zVEoGHnYmMY).
Therefore, I'm going to apply different image manipulations to the same dataset to find the model with the best image classification performance. By finding the best image filter, we can optimize dataset usage by highlighting image features without increasing project costs!

This project will:
-Develop Machine Learning programs to perform image classification on cutlery
-Compare the performance of the different programs (same original dataset, different image filters applied)
-Run the best ML program on an Arduino Portenta H7 + Vision Shield
Project Structure
The Machine Learning model will output one of four possible categories:
-Background
-Fork
-Knife
-Spoon
As I won't be using a big dataset, I'll take advantage of the Transfer Learning feature from Edge Impulse.
What You'll Need
You'll need the following:
-A MicroPython-powered device (in this case I'm using an Arduino Portenta)
-Camera (in this case I'm using the Portenta Vision Shield)
-An account on Edge Impulse (edgeimpulse.com)
-OpenMV IDE (https://openmv.io/pages/download)
The Arduino Portenta allows you to run the code in different ways:
-With the Arduino IDE: once you have uploaded the code, the board only needs a power supply to keep running
-With the OpenMV IDE: the code only runs while the board is plugged in and the script is executed from the IDE
-With MicroPython: you upload a main.py file to the Arduino board. Once the file has been uploaded, the board will start running your Python code whenever it has a power supply (similar to the Arduino IDE)
This tutorial shows how to run the code from the OpenMV IDE or with MicroPython by uploading the main.py file to the board.
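If you haven't used MicroPython on the Portenta before, you can sanity-check the main.py workflow with a trivial script first. The sketch below (a minimal example, not part of this project) just blinks the red channel of the onboard RGB LED once per second; if it keeps blinking on a bare power supply, your board is running files autonomously:
import pyb
import time

led = pyb.LED(1)  # LED 1 is the red channel of the Portenta's RGB LED

while True:
    led.toggle()   # Invert the current LED state
    time.sleep(1)  # Wait one second between toggles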
Sampling
The first step in every ML project is to get the data, which in this case means images. The bigger and less biased the dataset, the better the model will perform.
For this project I picked up my mobile phone and took 75 pictures of each of the 4 categories (300 images in total). This is not a big dataset for image classification, but with Transfer Learning it should be enough to perform decently.
Once you have finished taking the pictures, download them to your computer.
It's always good practice to tag the pictures and number them to make the following steps easier.
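If you'd rather not rename hundreds of files by hand, a minimal sketch like the one below can do the tagging for you. It assumes the photos of a single class sit alone in the current folder, and that "fork" is the label you want (change it per class):
import os

label = "fork"  # Change this to the class being tagged
count = 1
for f in sorted(os.listdir()):
    if f.lower().endswith(".jpg"):
        os.rename(f, "{} {}.jpg".format(label, count))  # e.g. "fork 1.jpg"
        count += 1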
Data manipulation
Once you have tagged your images, you can run the following code to apply different filters to them (Contour, Edge or Emboss).
-Note 1: I'm assuming the images are in JPG format
-Note 2: Save and run this JPG Image editor - Filters.py file in a folder that contains only the images to be filtered
import os
from PIL import Image
from PIL import ImageFilter

# Loop over every file in the current folder and filter the JPG images
for i in os.listdir():
    ext = ".jpg"
    if i.lower().endswith(ext):
        name = i[:-4]  # Filename without its extension
        orig = Image.open(i)
        # If image quality is high, the edge enhancement works better!
        moreEdgeEnhanced = orig.filter(ImageFilter.EDGE_ENHANCE_MORE)
        edges_strong = moreEdgeEnhanced.filter(ImageFilter.FIND_EDGES)
        edges_strong.save("edges " + name + ext)
        emboss = orig.filter(ImageFilter.EMBOSS)
        emboss.save("emboss " + name + ext)
        contour = orig.filter(ImageFilter.CONTOUR)
        contour.save("contour " + name + ext)
Once the code has run, you should have three different filtered versions of each image.
For tidying purposes, I grouped each type of image into its own folder.
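A small sketch along these lines can automate that grouping. Run it in the same folder as the filtered images; it assumes the "edges ", "emboss " and "contour " prefixes produced by the script above:
import os
import shutil

for prefix in ("edges", "emboss", "contour"):
    os.makedirs(prefix, exist_ok=True)  # Create the destination folder if needed
    for f in os.listdir():
        if f.startswith(prefix + " ") and f.lower().endswith(".jpg"):
            shutil.move(f, os.path.join(prefix, f))  # Move the image into its folder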

Sampling biases
It's important to be extremely careful while taking the samples, because a biased dataset could seriously affect model performance in its real-world application. In this case, there are a few biases that I considered acceptable:
-All the images have the same background
-I only used one sort of fork, knife and spoon
-Photographs were taken from the top
The model will perform poorly with a different background, a different spoon design, or images taken from the side, for example.
Develop Machine Learning model using Edge Impulse
In the last stage I'll be running the best model on an Arduino Portenta + Vision Shield, so I'll need to design the Edge Impulse model to work in grayscale and at a reduced image size.
You can find the finished project following this link, or you can follow this step-by-step guide:
First, log into Edge Impulse and select Create new project. Name it and select the Developer option before clicking the Create new project button.

Continue by selecting Images, then Classify a single object (image classification), and then the Let's get started! button.

Now you should be directed to your project's main page.
Once you are there, click on Data acquisition and then on Let's collect some data. Select Go to the uploader and start uploading your images. As we aim to compare the performance of different models, we need to make models that can later be compared. This means the different models should be trained on the same sub-dataset and tested on the same sub-dataset. If we don't do this, Edge Impulse will automatically split the dataset, which could lead to unfair comparisons of model performance, because the models would have used different images for training!
So we will upload the whole batch of images into the train data and then manually move specific ones to the test data.


To upload the images, select the files for the class that you are going to upload, select Training, write the class name in the Enter label window and press Begin upload. Note that Edge Impulse only handles specific formats (JPG or PNG), so the selected images must comply with this requirement.
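If you want to double-check the files before uploading, a quick Pillow sketch (run in the folder with the images; anything it flags would be rejected by the uploader) can verify the formats:
import os
from PIL import Image

for f in os.listdir():
    try:
        with Image.open(f) as img:
            if img.format not in ("JPEG", "PNG"):
                print(f, "has an unsupported format:", img.format)
    except OSError:
        print(f, "is not an image")  # Folders and non-image files end up here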


If the uploading was successful, you should see Files that failed to upload: 0 on the right side of the screen.

Repeat this step for each one of the classes.
It's recommended to split the dataset with 80% in your train data and 20% in your test data. Considering that I have only 75 images of each class, I set 61 images for training and 14 for testing.

I decided that the images whose numbers end in 6 or 8 (i.e. 6, 16, 26, 66, 8, 18, 68, etc.) were the ones to be moved to the test set. To move the images to the test data, click again on Data acquisition, select the filter and then the checkbox that says Select multiple items. Now type "8" in the by name box.
Edge Impulse will filter all the images that have an 8 in their name. Select all of them and then click on Move to test set. Repeat the same step for the images ending in "6".
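If you want to preview locally which files that rule selects (a sketch assuming the numbered filenames from the tagging step), note that for numbers 1 to 75 the endings 6 and 8 pick exactly 14 images per class, matching the 61/14 split:
import os

# Keep files whose number (last character before ".jpg") ends in 6 or 8
test_files = [f for f in sorted(os.listdir())
              if f.lower().endswith(".jpg") and f[:-4][-1:] in ("6", "8")]

print(len(test_files), "files would move to the test set:")
print(test_files)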

Once you've finished, your Data acquisition page should look like this (81% training data and 19% test data).

Now we are ready to create our Impulse, so select Impulse design and Create Impulse.
Transfer Learning only works with specific image sizes (96x96 or 160x160), so we have to take this into consideration while designing the model. In this case we'll select the 96x96 size.
Complete the design so it looks like the image below and then click on Save Impulse.

Continue by clicking on Image under Impulse design. Under Color depth select Grayscale (remember that our final goal is to run the classifier on an Arduino Portenta H7 + Vision Shield, which only captures grayscale images) and press Save parameters. Now move to Generate features and click on the Generate features button.
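To build intuition for what the model will actually "see", you can reproduce this preprocessing on your computer with Pillow. The sketch below (using a hypothetical sample filename, and ignoring Edge Impulse's exact fit/crop options) converts an image to grayscale and squashes it to 96x96:
from PIL import Image

img = Image.open("fork 1.jpg")             # Hypothetical sample image
small = img.convert("L").resize((96, 96))  # Grayscale, then 96x96 like the Impulse
small.save("preview 96x96.jpg")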

Edge Impulse will take a moment to generate the features of our model. Once it has finished, it will return an image like the one below.

As a reference, models that perform well are the ones where each class is clearly separated from the others. As we can see in this model, it seems it will be difficult to separate the fork (orange) and spoon (red) classes, because their dots tend to be close together, so the model sees those images as similar. On the other hand, the model should perform well identifying knife (green) and background (blue) images, because their dots tend to be far from the rest.
Finally, we'll move to the Transfer Learning step to build and train our model. I've selected the MobileNetV2 96x96 0.35 model, running 50 training cycles with an auto-balanced dataset and data augmentation. Click on Start training and Edge Impulse will train the model. After a few minutes it should return the model's performance output. In this case I got the following:
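As a rough desktop analogue of what this step does (a sketch only: Edge Impulse's actual training pipeline differs, and the pretrained ImageNet weights expect RGB input while our deployment is grayscale), transfer learning with MobileNetV2 0.35 at 96x96 looks like this in Keras:
import tensorflow as tf

# Pretrained MobileNetV2 backbone (alpha=0.35, 96x96 ImageNet weights)
base = tf.keras.applications.MobileNetV2(
    input_shape=(96, 96, 3), alpha=0.35,
    include_top=False, weights="imagenet")
base.trainable = False  # Freeze the backbone; only the new head is trained

model = tf.keras.Sequential([
    base,
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dropout(0.1),
    tf.keras.layers.Dense(4, activation="softmax"),  # background, fork, knife, spoon
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])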

Taking into consideration that I only have 75 images of each class, 93.9% accuracy is quite impressive!
Now that we know how the model performs on known data, we are going to test it on unseen data. To do so, select Model testing and then Classify all.

In this case it reaches 82.14% accuracy, which is still good.
If we analyse the confusion matrix, we see that the model is excellent at identifying the background (100%), very good at identifying spoons (92.9%), and finds it difficult to identify forks (71.4%).
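Those per-class figures are just each diagonal entry of the confusion matrix divided by its row total. A small numpy sketch with made-up counts (chosen only to reproduce the quoted percentages; the knife row is invented) shows the computation:
import numpy as np

# Rows = true class, columns = predicted class
# Order: background, fork, knife, spoon (hypothetical counts)
cm = np.array([[14,  0,  0,  0],
               [ 0, 10,  1,  3],
               [ 0,  1, 12,  1],
               [ 0,  1,  0, 13]])

per_class = cm.diagonal() / cm.sum(axis=1)
print(per_class)  # [1.0, 0.714..., 0.857..., 0.928...]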
Now that we've finished the first model, we repeat the same steps for the other models so we can compare their performance.
Compare the different models' performance
In order to make the models comparable, I kept all the parameters and settings identical, the only difference being the filter applied to each data source.
Edge filter model outputs



Emboss filter model outputs



Contour filter model outputs



Selecting the best model for deployment
If we use model accuracy as the reference for our decision, it is quite clear (surprisingly) that the best performer is the one whose images have not been through any filter (raw images).



If we also take into account RAM usage, Flash usage and latency, we still reach the same conclusion.




Now that we know which model we are choosing, we are going to download the file that we will upload to the Arduino Portenta H7.
Select Dashboard and scroll down to Download block output to download the TensorFlow Lite (int8 quantized) file.
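If you're curious, the downloaded .lite file is a standard TensorFlow Lite flatbuffer, so before copying it to the board you can inspect it on your computer (a sketch assuming the filename used in main.py below):
import tensorflow as tf

interpreter = tf.lite.Interpreter(
    model_path="ei-image-classifier_-cutlery-(grayscale)-transfer-learning-"
               "tensorflow-lite-int8-quantized-model.lite")
interpreter.allocate_tensors()

inp = interpreter.get_input_details()[0]
print("input shape:", inp["shape"])  # Expecting [1, 96, 96, 1]
print("input dtype:", inp["dtype"])  # Expecting int8 for a quantized model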
OpenMV Coding
I took as reference the code from this amazing project (https://www.hackster.io/mjrobot/mug-or-not-mug-that-is-the-question-d4062a#toc-connecting-portenta-with-edge-impulse-studio-11) and added a few changes so the Arduino Portenta H7 turns on a different LED for each class (red for fork, green for knife, blue for spoon and white for the background).
You'll have to save this program as the main.py file for the Arduino Portenta H7.
import sensor, image, time, tf
import pyb

ledForkRed = pyb.LED(1)    # Red LED, turned on when a fork is recognised
ledKnifeGreen = pyb.LED(2) # Green LED, turned on when a knife is recognised
ledSpoonBlue = pyb.LED(3)  # Blue LED, turned on when a spoon is recognised

model_file = "ei-image-classifier_-cutlery-(grayscale)-transfer-learning-tensorflow-lite-int8-quantized-model.lite"
labels = ["background", "fork", "knife", "spoon"]  # Edge Impulse keeps labels in alphabetical order

sensor.reset()
sensor.set_pixformat(sensor.GRAYSCALE)  # Set pixel format for the Portenta Vision Shield
sensor.set_framesize(sensor.QVGA)       # Set frame size to QVGA
sensor.set_windowing((96, 96))          # Crop to the Edge Impulse model resolution
sensor.skip_frames(time=2000)           # Let the camera adjust

clock = time.clock()  # Start clock (for measuring FPS)

while True:
    # Update timer
    clock.tick()
    # Get image from camera
    img = sensor.snapshot()
    img.set(h_mirror=True)
    # Do inference and get predictions
    objs = tf.classify(model_file, img)
    predictions = objs[0].output()
    # Find the label with the highest probability
    max_val = max(predictions)
    max_idx = predictions.index(max_val)
    if max_idx == 0:    # turn on all the LEDs to show a white blink for background
        ledForkRed.on()
        ledKnifeGreen.on()
        ledSpoonBlue.on()
    elif max_idx == 1:  # turn on the red LED for forks
        ledForkRed.on()
        ledKnifeGreen.off()
        ledSpoonBlue.off()
    elif max_idx == 2:  # turn on the green LED for knives
        ledForkRed.off()
        ledKnifeGreen.on()
        ledSpoonBlue.off()
    else:               # turn on the blue LED for spoons
        ledForkRed.off()
        ledKnifeGreen.off()
        ledSpoonBlue.on()
    # Draw the label with the highest probability on the image viewer
    img.draw_string(0, 0, labels[max_idx] + "\n{:.2f}".format(round(max_val, 2)), mono_space=False, scale=1)
    # Print all the probabilities
    print("-----")
    for i, label in enumerate(labels):
        print(str(label) + ": " + str(predictions[i]))
    print("FPS:", clock.fps())
Running the program on the Arduino Portenta + Vision Shield
I tested the program on OpenMV IDE and got pretty impressive results!




Conclusion
We've been able to develop different Machine Learning models, compare their performance and select the best one. It's interesting to highlight that data manipulation through image filters does play a part and does have an impact on the model output.
We uploaded the selected model to an Arduino Portenta H7 + Vision Shield and confirmed that it performed very well, taking into account the limited data used for training.
It's worth mentioning that this model has many biases, but as long as it runs under the specific conditions it was trained for, we can have a robust model with just a few images.
Schematics
TensorFlow Lite (int8 quantized)
The Machine Learning model file that supports main.py. It should also be uploaded to the Arduino Portenta H7.

Code
JPG Image editor - Filters.py
Python
Code to apply filters to JPG images. It assumes the JPG images are the only files in the same folder as the code.
#!/usr/bin/env python
# coding: utf-8
import os
from PIL import Image
from PIL import ImageFilter

# Loop over every file in the current folder and filter the JPG images
for i in os.listdir():
    ext = ".jpg"
    if i.lower().endswith(ext):
        name = i[:-4]  # Filename without its extension
        orig = Image.open(i)
        # If image quality is high, the edge enhancement works better!
        moreEdgeEnhanced = orig.filter(ImageFilter.EDGE_ENHANCE_MORE)
        edges_strong = moreEdgeEnhanced.filter(ImageFilter.FIND_EDGES)
        edges_strong.save("edges " + name + ext)
        emboss = orig.filter(ImageFilter.EMBOSS)
        emboss.save("emboss " + name + ext)
        contour = orig.filter(ImageFilter.CONTOUR)
        contour.save("contour " + name + ext)
main.py
Python
Upload this file directly to the Arduino Portenta H7
# https://www.hackster.io/mjrobot/mug-or-not-mug-that-is-the-question-d4062a#toc-connecting-portenta-with-edge-impulse-studio-11
import sensor, image, time, tf
import pyb

ledForkRed = pyb.LED(1)    # Red LED, turned on when a fork is recognised
ledKnifeGreen = pyb.LED(2) # Green LED, turned on when a knife is recognised
ledSpoonBlue = pyb.LED(3)  # Blue LED, turned on when a spoon is recognised

model_file = "ei-image-classifier_-cutlery-(grayscale)-transfer-learning-tensorflow-lite-int8-quantized-model.lite"
labels = ["background", "fork", "knife", "spoon"]  # Edge Impulse keeps labels in alphabetical order

sensor.reset()
sensor.set_pixformat(sensor.GRAYSCALE)  # Set pixel format for the Portenta Vision Shield
sensor.set_framesize(sensor.QVGA)       # Set frame size to QVGA
sensor.set_windowing((96, 96))          # Crop to the Edge Impulse model resolution
sensor.skip_frames(time=2000)           # Let the camera adjust

clock = time.clock()  # Start clock (for measuring FPS)

while True:
    # Update timer
    clock.tick()
    # Get image from camera
    img = sensor.snapshot()
    img.set(h_mirror=True)
    # Do inference and get predictions
    objs = tf.classify(model_file, img)
    predictions = objs[0].output()
    # Find the label with the highest probability
    max_val = max(predictions)
    max_idx = predictions.index(max_val)
    if max_idx == 0:    # turn on all the LEDs to show a white blink for background
        ledForkRed.on()
        ledKnifeGreen.on()
        ledSpoonBlue.on()
    elif max_idx == 1:  # turn on the red LED for forks
        ledForkRed.on()
        ledKnifeGreen.off()
        ledSpoonBlue.off()
    elif max_idx == 2:  # turn on the green LED for knives
        ledForkRed.off()
        ledKnifeGreen.on()
        ledSpoonBlue.off()
    else:               # turn on the blue LED for spoons
        ledForkRed.off()
        ledKnifeGreen.off()
        ledSpoonBlue.on()
    # Draw the label with the highest probability on the image viewer
    img.draw_string(0, 0, labels[max_idx] + "\n{:.2f}".format(round(max_val, 2)), mono_space=False, scale=1)
    # Print all the probabilities
    print("-----")
    for i, label in enumerate(labels):
        print(str(label) + ": " + str(predictions[i]))
    print("FPS:", clock.fps())
This article was first published on Hackster on April 16, 2022.
Source: https://www.hackster.io/richmondkevin92/cutlery-classifier-using-machine-learning-8d84bd
Author: Kevin Richmond
