Quantized model on OpenMV Cam H7 Plus

Hi,

I am hoping to run an object detection model on the OpenMV Cam H7 Plus. I am trying to use SSD MobileNet V2 FPNLite 320. After training, the model is 11.5 MB as a .tflite, but by quantizing it I can reduce the size to 3.7 MB, which is smaller than the 4.2 MB available on the heap for the H7 Plus. However, when I try to load the model onto the device I get the error "Failed to load model: Failed to allocate tensors". By rebuilding the firmware and using -release_with_logs.a, I was able to get the more specific error "tflm_backend: Failed to get registration from op code CUSTOM". After looking through some forums, I made sure to quantize using this code:

import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_saved_model('path_to_model/saved_model/')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
# Restrict the converter to full-integer (int8) ops, with uint8 input/output tensors.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.target_spec.supported_types = [tf.int8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model = converter.convert()
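
For reference, representative_data_gen above only needs to yield a small number of sample inputs with the same shape as the model input. A minimal sketch of what that generator could look like (the image directory, image count, and preprocessing here are assumptions, not the exact code used):

import glob
import numpy as np
from PIL import Image

def representative_data_gen():
    # Yield ~100 calibration images shaped like the model input (1, 320, 320, 3).
    for path in glob.glob('path_to_calibration_images/*.jpg')[:100]:
        img = Image.open(path).convert('RGB').resize((320, 320))
        arr = np.asarray(img, dtype=np.float32) / 255.0  # scaling is an assumption
        yield [arr[np.newaxis, ...]]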

The problem persisted when quantizing this way. I am able to successfully detect objects in test images with the quantized model on my machine. The code I am using to load the model onto the camera is very simple:

import sensor, image, time, os, ml, math, uos, gc
from ulab import numpy as np

sensor.reset()                         # Reset and initialize the sensor.
sensor.set_pixformat(sensor.RGB565)    # Set pixel format to RGB565 (or GRAYSCALE)
sensor.set_framesize(sensor.HVGA) # Modify as you like.
sensor.set_windowing((320, 320))
sensor.skip_frames(time=2000)          # Let the camera adjust.


print(f"Model size = {uos.stat('detect_quant.tflite')[6]}")
print(f"Available space = {gc.mem_free() - (64*1024)}")
try:
    # Load into the frame buffer if the model won't fit on the heap (keeping 64 KB of headroom).
    net = ml.Model("detect_quant.tflite", load_to_fb=uos.stat('detect_quant.tflite')[6] > (gc.mem_free() - (64*1024)))
    print('Successfully loaded model')
except Exception as e:
    print('Failed to load model: ' + str(e))

Do you have any idea what the unsupported custom OP might be? Is there anything I can change in the way that I train or quantize the model to make it able to be loaded onto OpenMV Cam H7 Plus?

Load your model here: Netron

And then cross check the ops used and make sure they are in: openmv/src/lib/tflm/tflm_backend.cc at master · openmv/openmv
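
If you'd rather not read the graph in Netron, you can also dump the op list on the desktop and cross-check it against the backend. A quick sketch using the TFLite analyzer (available in TensorFlow 2.9+; the file name is a placeholder):

import tensorflow as tf

# Prints the model's graph structure, including every builtin and custom op it uses.
tf.lite.experimental.Analyzer.analyze(model_path="detect_quant.tflite")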

All of the ops are supported except for the output layer, which uses TFLite_Detection_PostProcess. It looks like this op is commented out in tflm_backend.cc. Do you know of any workarounds for this?

Hi, since you are compiling the firmware, just enable it: openmv/src/lib/tflm/tflm_backend.cc at master · openmv/openmv

Otherwise you want to shave that operator off the network and do the post-processing in Python code like this: openmv/scripts/libraries/ml/ml/postprocessing.py at master · openmv/openmv
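
For reference, after the raw box decoding, what that op does is essentially score thresholding plus greedy non-max suppression, which is straightforward to write in Python. A minimal, illustrative sketch (this is not the OpenMV library code; anchor decoding is omitted and the names are made up):

def iou(a, b):
    # Boxes are (ymin, xmin, ymax, xmax) in normalized coordinates.
    y1, x1 = max(a[0], b[0]), max(a[1], b[1])
    y2, x2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, y2 - y1) * max(0.0, x2 - x1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / (union + 1e-9)

def nms(boxes, scores, score_thresh=0.5, iou_thresh=0.5):
    # Greedily keep the highest-scoring boxes that do not overlap an already-kept box.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if scores[i] < score_thresh:
            break
        if all(iou(boxes[i], boxes[j]) < iou_thresh for j in keep):
            keep.append(i)
    return keep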

Thanks. By recompiling the firmware and using load_to_fb=True, I was able to successfully load the model. However, I have one more question for you. When I try to feed an image into the network I get "ValueError: unexpected tensor shape". I am unsure what the issue is, because when I print the input tensor shape and the image dimensions they seem to line up. Is there a way to get more explicit feedback on which dimensions did not align? One possibility is that the model expects images in RGB888 format rather than RGB565. I will attach the Python code I am running on the device, as well as the Jupyter notebook I used for training, if you want more information.
detect_mobilenet.py (1.6 KB)
train_tflite.zip (6.4 KB)

Can you just print() the model itself and post that info here? This will list the tensors. Note that we support up to 4D tensors.
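
Something like this on the camera is enough (assuming the model loads the same way as before):

import ml

net = ml.Model("detect_quant.tflite", load_to_fb=True)
print(net)  # lists the input/output tensor shapes, scales, zero points, and dtypes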

Preprocessing for the image input is here: openmv/scripts/libraries/ml/ml/preprocessing.py at master · openmv/openmv

We handle the RGB565 to RGB888 conversion.

That error is from here: openmv/src/omv/modules/py_ml.c at master · openmv/openmv

Something is going wrong in either the input or output tensor shape. The error implies that either the input or output tensor has no dims.

Model dims are captured here: openmv/src/lib/tflm/tflm_backend.cc at master · openmv/openmv

We pretty much grab all the tensor info directly from the TensorFlow model. Anyway, seeing the model tensor shapes will help debug this.

Here is the model printed from the H7 Plus:

{ model_size: 3741072, model_addr: 0xc1c6ea6c, ram_size: 3566032, ram_addr: 0xc0017780, input_shape: ((1, 320, 320, 3),), input_scale: (0.00392157,), input_zero_point: (0,), input_dtype: ('B',), output_shape: ((), (), (), ()), output_scale: (1.0, 1.0, 1.0, 1.0), output_zero_point: (0, 0, 0, 0), output_dtype: ('f', 'f', 'f', 'f') }

It looks like the output shape has no dimensions. If I print the input and output details of the model when it is loaded in TensorFlow Interpreter on my machine, it looks like this:

input details = [
	{'name': 'serving_default_input:0', 
	'index': 0, 
	'shape': array([  1, 320, 320,   3], dtype=int32), 
	'shape_signature': array([  1, 320, 320,   3], dtype=int32), 
	'dtype': <class 'numpy.uint8'>, 
	'quantization': (0.003921568859368563, 0), 
	'quantization_parameters': {
	'scales': array([0.00392157], dtype=float32), 
	'zero_points': array([0], dtype=int32), 
	'quantized_dimension': 0}, 
	'sparsity_parameters': {}}]


output details = [

	{'name': 'StatefulPartitionedCall:1', 
	'index': 383, 
	'shape': array([ 1, 10], dtype=int32), 
	'shape_signature': array([ 1, 10], dtype=int32), 
	'dtype': <class 'numpy.float32'>, 
	'quantization': (0.0, 0), 
	'quantization_parameters': {
	'scales': array([], dtype=float32), 
	'zero_points': array([], dtype=int32), 
	'quantized_dimension': 0}, 
	'sparsity_parameters': {}}, 

	{'name': 'StatefulPartitionedCall:3', 
	'index': 381, 
	'shape': array([ 1, 10,  4], dtype=int32), 
	'shape_signature': array([ 1, 10,  4], dtype=int32), 
	'dtype': <class 'numpy.float32'>, 
	'quantization': (0.0, 0), 
	'quantization_parameters': {
	'scales': array([], dtype=float32), 
	'zero_points': array([], dtype=int32), 
	'quantized_dimension': 0}, 
	'sparsity_parameters': {}}, 

	{'name': 'StatefulPartitionedCall:0', 
	'index': 384, 
	'shape': array([1], dtype=int32), 
	'shape_signature': array([1], dtype=int32), 
	'dtype': <class 'numpy.float32'>, 
	'quantization': (0.0, 0), 
	'quantization_parameters': {
	'scales': array([], dtype=float32), 
	'zero_points': array([], dtype=int32), 
	'quantized_dimension': 0}, 
	'sparsity_parameters': {}}, 

	{'name': 'StatefulPartitionedCall:2', 
	'index': 382, 
	'shape': array([ 1, 10], dtype=int32), 
	'shape_signature': array([ 1, 10], dtype=int32), 
	'dtype': <class 'numpy.float32'>, 
	'quantization': (0.0, 0), 
	'quantization_parameters': {
	'scales': array([], dtype=float32), 
	'zero_points': array([], dtype=int32), 
	'quantized_dimension': 0}, 
	'sparsity_parameters': {}}]
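
For reference, I printed those details on my machine with the standard interpreter API, roughly like this:

import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="detect_quant.tflite")
interpreter.allocate_tensors()
print("input details =", interpreter.get_input_details())
print("output details =", interpreter.get_output_details())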

Hmmm, our library isn’t able to handle that output at all. However, it may be something I can easily parse. It appears to be a four-tensor output. The library should have handled that, though, so there’s something we can improve in our code.

Okay, I need this from you:

  1. An input image to test with. Best if it’s the exact size of the input to the model.
  2. The exact expected values of the output tensors.

I can then update our backend to handle this model and run correctly.
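
For the expected values, running the quantized model once on the desktop and dumping all four outputs is enough. A sketch along these lines (file names are placeholders):

import numpy as np
import tensorflow as tf
from PIL import Image

interpreter = tf.lite.Interpreter(model_path="detect_quant.tflite")
interpreter.allocate_tensors()

# Load a 320x320 RGB test image and add the batch dimension (uint8, matching the input dtype).
img = np.asarray(Image.open("test_image.png").convert("RGB").resize((320, 320)), dtype=np.uint8)
interpreter.set_tensor(interpreter.get_input_details()[0]['index'], img[np.newaxis, ...])
interpreter.invoke()

# Dump every output tensor (scores, boxes, detection count, classes).
for detail in interpreter.get_output_details():
    print(detail['name'], interpreter.get_tensor(detail['index']))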

Did you check this? Heads up custom Tflite models with Latest versions of Tensorflow

Here are the output tensors:

output_tensor_1 = [[0.99609375 0.0078125  0.00390625 0.00390625 0.00390625 0.00390625
  0.00390625 0.00390625 0.00390625 0.00390625]]

output_tensor_2 = [[[ 0.23265359  0.41437462  0.83452284  0.58187926]
  [ 0.3753295   0.26537663  0.5947138   0.64145315]
  [-0.01361288  0.3794297   0.04515417  0.5357584 ]
  [ 0.07706149  0.55290985  0.14303254  0.71663904]
  [ 0.09968229  0.5476206   0.16877642  0.7164772 ]
  [ 0.12220028  0.54266745  0.19625843  0.71815974]
  [ 0.14538985  0.5432126   0.22415906  0.7187049 ]
  [ 0.17793639  0.5033666   0.2459734   0.66836286]
  [ 0.1711407   0.5456453   0.25113377  0.7184525 ]
  [ 0.03213066  0.60696185  0.38053498  0.67497635]]]

output_tensor_3 = [10.]

output_tensor_4 = [[0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]

image_and_model.zip (2.5 MB)

Hi, I’ve made an issue tracker to work on this: Update tensorflow to handle post-processed model · Issue #2543 · openmv/openmv

However, I will not be able to get to it for a while. If you want to try to make the changes to the firmware yourself to make it work this would be appreciated and then I can review the PR.

The issue is the parsing in this function.

All headers for the library are here: libtflm/include at 73eb1aa07416cad0fd64a8405ca9eb5c4e878e87 · openmv/libtflm

Did you mean to provide a link to the function that does the parsing? You only included a link to the headers.

That’s buried in the TensorFlow lite code. tensorflow/tflite-micro: Infrastructure to enable deployment of ML models to low-power resource-constrained embedded targets (including microcontrollers and digital signal processors).

Given that the model loads, the issue is just the parsing of the headers.

Any idea when you can get around to fixing this issue?

I’ll see what I can do this week.