Resize the image to give it as an input to neural network

puranjaymohan · September 5, 2020, 9:58am

I have an OpenMV Cam H7.
I have two questions regarding TFLite on OpenMV:

I built a model using Keras and quantized it using TFLite; it has an accuracy of 96% after quantization.
It takes grayscale images of size 28x28 pixels as input. How do I provide this input to the neural network in OpenMV? I want the image to scale properly to 28x28.
If I build a model that works on grayscale images with values between 0 and 1 rather than 0-255, how do I give this as input in OpenMV? OpenMV takes grayscale images, but I want the pixels to have values between 0 and 1. Is there a way to multiply each pixel by 1/255?

Thanks

kwagyeman · September 5, 2020, 4:42pm

Our TF module takes care of all these details for you. You just need to load the network and run it.

Our TF module does this: it will crop and scale the input image to whatever size the network takes while maintaining the aspect ratio.
Our TF module scales the inputs and outputs of the network to be what is needed. However, if a pixel value is between 0 and 1, it should be a float.
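
To make the two points above concrete, here is a minimal plain-Python sketch of what "crop and scale while maintaining aspect ratio" and "a pixel between 0-1 should be a float" could look like. It is an illustration only, not the firmware's code: `center_crop_roi` and `to_unit_float` are hypothetical helper names, and centring the crop is an assumption.

```python
def center_crop_roi(img_w, img_h, net_w, net_h):
    """Return the largest centred (x, y, w, h) window that has the
    network's aspect ratio. Centring the window is an assumption."""
    # Compare aspect ratios by cross-multiplying to stay in integers.
    if img_w * net_h > img_h * net_w:
        # Image is wider than the network input: trim left and right.
        crop_h = img_h
        crop_w = img_h * net_w // net_h
    else:
        # Image is taller (or equal): trim top and bottom.
        crop_w = img_w
        crop_h = img_w * net_h // net_w
    return ((img_w - crop_w) // 2, (img_h - crop_h) // 2, crop_w, crop_h)


def to_unit_float(pixel):
    """Map an 8-bit grayscale value (0-255) to a float in [0, 1]."""
    return pixel / 255.0


# A QVGA frame cropped for a 28x28 network input.
roi = center_crop_roi(320, 240, 28, 28)
print(roi)
```

The cropped window would then be scaled down to 28x28; for a float model each pixel would be passed through something like `to_unit_float` before inference.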

We recommend using Edge Impulse to generate and train your network online.

darrask · March 31, 2022, 1:36am

Hello, digging up this topic to get an understanding of how these TF module functions work. It is awesome to read that the TF module handles a lot for us (but, confusingly, this is not documented in the tf.classify help). Lately I have been using an EdgeImpulse-generated module to detect a black sponge and a green USB reader on a whiteboard. I classify images of variable resolutions and aspect ratios.

Under-scaled input images (relative to the model’s 160 px resolution) significantly reduced model performance, as expected, and over-sampled images had no detectable effect, as expected from the above. Surprisingly, though, non-rescaled images (543 * 502, close to square) gave terrible (null) performance, contrary to the assumption that the TF module handles everything, even though they are close in resolution to the images that I rescaled to 480*480, for instance.

Results above are based on samples of ~20 live image classifications of the target. The model usually has 100% accuracy for classifying the green USB reader. I used nearest neighbor interpolation. This is a sample image of what I was trying to classify (a green USB flash reader on a whiteboard):

I attach my script here, based on the one exported by Edge Impulse:

# Edge Impulse - OpenMV Image Classification Example

import sensor, image, time, os, tf, uos, gc

sensor.reset()                         # Reset and initialize the sensor.
sensor.set_pixformat(sensor.RGB565)    # Set pixel format to RGB565 (or GRAYSCALE)
sensor.set_framesize(sensor.WQXGA2)    # Set frame size to WQXGA2 (2592x1944)
#sensor.set_windowing((240, 240))       # Set 240x240 window.
sensor.skip_frames(time=2000)          # Let the camera adjust.

net = None
labels = None
side_res = "original"

try:
    # load the model, alloc the model file on the heap if we have at least 64K free after loading
    net = tf.load("trained.tflite", load_to_fb=uos.stat('trained.tflite')[6] > (gc.mem_free() - (64*1024)))
except Exception as e:
    print(e)
    raise Exception('Failed to load "trained.tflite", did you copy the .tflite and labels.txt file onto the mass-storage device? (' + str(e) + ')')

try:
    labels = [line.rstrip('\n') for line in open("labels.txt")]
except Exception as e:
    raise Exception('Failed to load "labels.txt", did you copy the .tflite and labels.txt file onto the mass-storage device? (' + str(e) + ')')

clock = time.clock()
while(True):
    clock.tick()

    img = sensor.snapshot()

    # default settings just do one detection... change them to search the image...
    if (str(side_res) != "original"):
        img.scale(x_size=side_res,y_size=side_res,roi=(1340,1215,543,502))

    # MANUALLY comment/uncomment the following line for using original vs. re-scaled images
    #for obj in net.classify(img,roi=(1340,1215,543,502), min_scale=1.0, scale_mul=0.8, x_overlap=0.5, y_overlap=0.5):
    for obj in net.classify(img, min_scale=1.0, scale_mul=0.8, x_overlap=0.5, y_overlap=0.5):
        print("**********\nPredictions at [x=%d,y=%d,w=%d,h=%d]" % obj.rect())
        img.draw_rectangle(obj.rect())
        # This combines the labels and confidence values into a list of tuples
        predictions_list = list(zip(labels, obj.output()))
        with open('confidences.csv', 'a') as confidencelog:
            confidencelog.write(str(predictions_list[1][1]) + ',' + str(predictions_list[1][0]) + ',' + str(side_res) + '\n')
        for i in range(len(predictions_list)):
            print("%s = %f" % (predictions_list[i][0], predictions_list[i][1]))

    print(clock.fps(), "fps")
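
The `load_to_fb` expression in the script above is dense, so here is the same decision written as a small standalone sketch. `should_load_to_fb` and its `reserve` parameter are hypothetical names used for illustration; on the camera the two inputs would come from `uos.stat('trained.tflite')[6]` (the file size in bytes) and `gc.mem_free()`.

```python
def should_load_to_fb(model_size, heap_free, reserve=64 * 1024):
    """Return True when the model should go to the frame buffer.

    Mirrors the script's test: the heap is used only if at least
    `reserve` bytes would remain free after loading the model.
    """
    return model_size > heap_free - reserve


# A 200 KB model with 230 KB of free heap would leave under 64 KB free,
# so this sketch sends it to the frame buffer.
print(should_load_to_fb(200 * 1024, 230 * 1024))
```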

darrask · April 20, 2022, 6:50am

After discussing with EdgeImpulse, it seems the tf.classify function just feeds the image to the model, so we need to rescale it ourselves. Would it be worth clarifying what the TF module actually does here?
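
For reference, rescaling by hand with nearest neighbor interpolation (the method used for the tests earlier in this thread) boils down to the following. This is a plain-Python sketch on a list of pixel rows with a hypothetical `resize_nearest` helper; it does not claim to match how `img.scale()` is implemented.

```python
def resize_nearest(pixels, out_w, out_h):
    """Resize a 2D list of pixel values with nearest neighbor sampling."""
    in_h = len(pixels)
    in_w = len(pixels[0])
    out = []
    for y in range(out_h):
        # Pick the source row that this output row maps onto.
        src_y = y * in_h // out_h
        # Pick one source pixel per output column, no blending.
        row = [pixels[src_y][x * in_w // out_w] for x in range(out_w)]
        out.append(row)
    return out


src = [
    [0, 1, 2, 3],
    [4, 5, 6, 7],
    [8, 9, 10, 11],
    [12, 13, 14, 15],
]
small = resize_nearest(src, 2, 2)
print(small)
```

Because each output pixel copies exactly one source pixel, downscaling this way discards the neighbours entirely, which is one reason results can differ from a bilinear resize.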

kwagyeman · April 21, 2022, 5:31pm

The tf function automatically bilinearly scales the input ROI (which is the whole image if not specified) to whatever size the CNN accepts.
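
As a rough sketch of what bilinear scaling of an ROI to the network's input size involves, here is a plain-Python version operating on a list of pixel rows. `resize_bilinear` is a hypothetical helper, and the pixel-centre sampling convention it uses is an assumption; the firmware may sample differently.

```python
def resize_bilinear(pixels, out_w, out_h):
    """Resize a 2D list of pixel values with bilinear interpolation."""
    in_h = len(pixels)
    in_w = len(pixels[0])
    out = []
    for y in range(out_h):
        # Map the output pixel centre back into source coordinates.
        fy = (y + 0.5) * in_h / out_h - 0.5
        fy = min(max(fy, 0.0), in_h - 1)
        y0 = int(fy)
        y1 = min(y0 + 1, in_h - 1)
        wy = fy - y0
        row = []
        for x in range(out_w):
            fx = (x + 0.5) * in_w / out_w - 0.5
            fx = min(max(fx, 0.0), in_w - 1)
            x0 = int(fx)
            x1 = min(x0 + 1, in_w - 1)
            wx = fx - x0
            # Blend horizontally on the two rows, then vertically.
            top = pixels[y0][x0] * (1 - wx) + pixels[y0][x1] * wx
            bottom = pixels[y1][x0] * (1 - wx) + pixels[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        out.append(row)
    return out


roi_pixels = [
    [0, 100],
    [100, 200],
]
print(resize_bilinear(roi_pixels, 1, 1))
```

Note that when the ROI and the network input have different aspect ratios, a resize like this stretches the image unless the ROI is cropped first.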

darrask · May 9, 2022, 5:10am

I see. How does the discrepancy in the results above arise, then?

kwagyeman · May 10, 2022, 6:13pm

Can you open an issue on the GitHub issue tracker?

darrask · July 16, 2022, 4:08am

github.com/openmv/openmv

Have tf.classify rescale input to match model resolution?

opened 04:07AM - 16 Jul 22 UTC

kdarras

enhancement investigate

This issue is related to the forum thread here: https://forums.openmv.io/t/resi…ze-the-image-to-give-it-as-an-input-to-neural-network/1918 The TF module is supposed to handle, among others, resizing of the input to the model's resolution (but this is not documented well in the tf.classify help). After discussing with EdgeImpulse, it seems the tf.classify function just feeds the image to the model and thus, we need to rescale ourselves. Clarification or automating rescaling implementation seems to be needed.
