TensorFlow Lite model inference result is wrong

Hi, I’m developing a facial expression recognition deep learning model for OpenMV.

I trained a TensorFlow model for this task.
I converted the Keras model to a quantized TensorFlow Lite model.
The model is int8 quantized (both input and output).

My quantized model works well on a PC,
but it does not work on the OpenMV H7.
It loads, but the inference result is always wrong (it guesses the wrong expression).

The TFLite model is here.

The conversion script is here.

The OpenMV script is here.

On the OpenMV, I’m using the firmware from this thread.

The TFLite model input is int8, but pixel values go up to 255.
Is that a problem?

Do you have any idea how to fix the inference result?

Thank you.

Hi, if you trained for int8 and you use grayscale, we just give you 0 to 255. This probably breaks your model. Our code just converts the source image into the target format for the CNN. It does not convert unsigned data into signed data.

What was the range of input you fed your model? 0 to 127? -128 to 127? What’s the mapping from 0 to 255 to what your model takes?

I can probably add something to make this automatic. But, information on the mapping would help.
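For reference, the mapping being asked about can be written down with TFLite's affine quantization formula, q = round(x / scale) + zero_point. Below is a minimal sketch in plain Python. The function name, the [0, 1] normalization, and the values scale = 1/255 and zero_point = -128 are assumptions for illustration, not values read from the model in this thread; the real ones come from the converted model's input quantization parameters.

```python
def quantize_pixel(pixel, scale, zero_point):
    """Map a 0..255 grayscale pixel to an int8 value (hypothetical helper)."""
    real = pixel / 255.0                   # assumption: training inputs were scaled to [0, 1]
    q = round(real / scale) + zero_point   # affine quantization: q = round(x / scale) + zero_point
    return max(-128, min(127, q))          # clamp to the int8 range


# Assumed parameters; read the actual ones from your own model.
SCALE = 1.0 / 255.0
ZERO_POINT = -128

for p in (0, 128, 255):
    print(p, "->", quantize_pixel(p, SCALE, ZERO_POINT))
```

Under these assumed parameters the mapping reduces to pixel - 128, so 0..255 lands on -128..127. With a different scale or zero point the mapping would be different, which is why the actual range matters.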

Also, our output code expects uint8 too. Please look at the code here and let me know what needs more flexibility. I only just removed the check in our lib to allow int8 images in. I did not do anything to fix the signedness.

https://github.com/openmv/openmv/blob/master/src/omv/py/py_tf.c#L213 - Input

https://github.com/openmv/openmv/blob/master/src/omv/py/py_tf.c#L322 - Output

Thanks kwagyeman,

I tried int8 (0-127) input and uint8 output.
Then I got this error message when running it.

OSError: tensorflow/lite/micro/kernels/quantize.cc:47 input->type == kTfLiteFloat32 || input->type == kTfLiteInt16 was not true.

I have just one quantize node, after the softmax.
This model is fully 8-bit quantized, yet the error says the input should be float or int16…?
I don’t clearly understand what this means.

Anyway, I will try some experiments under different conditions.

Thank you.

That’s an error in the TensorFlow library. I cannot solve that.

Anyway, for your previous network, what was the range of the input data and the expected output data?

I can just subtract 128 from all pixels going in and subtract 128 from the output, and this might fix everything.
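To illustrate the idea, here is a minimal sketch in plain Python of what such a shift does to the values. The helper names are hypothetical and this is not the OpenMV firmware code. It also shows that, on raw 8-bit bytes, subtracting 128 with wrap-around is the same as flipping the top bit, so the same operation converts in both directions.

```python
def uint8_to_int8(pixel):
    """Shift a 0..255 pixel into the signed range -128..127 (hypothetical helper)."""
    return pixel - 128


def int8_to_uint8(value):
    """Shift a signed -128..127 value back into 0..255 (hypothetical helper)."""
    return value + 128


def sub128_raw(raw_byte):
    """Subtract 128 from a raw byte with 8-bit wrap-around."""
    return (raw_byte - 128) & 0xFF


for pixel in (0, 127, 128, 255):
    signed = uint8_to_int8(pixel)
    raw = signed & 0xFF  # two's-complement byte as it would sit in the tensor
    print(pixel, "->", signed, "raw byte:", raw)
```

Because of the wrap-around, subtracting 128 from the raw output byte gives the same result as adding 128 to the signed value, so one operation covers both the input and the output side.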

Please tell me the range of input data and output data on the desktop.
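On the output side, the range can be described the same way: a quantized score maps back to a real value as (q - zero_point) * scale. The sketch below is plain Python with a hypothetical helper name. The values scale = 1/256 and zero_point = -128 are an assumption, based on what the TFLite quantization spec lists for an int8 softmax output; the model's own output quantization parameters should be checked on the desktop.

```python
def dequantize(q, scale, zero_point):
    """Convert a quantized output value back to a real number (hypothetical helper)."""
    return (q - zero_point) * scale


# Assumed int8 softmax output parameters; verify against the actual model.
OUT_SCALE = 1.0 / 256.0
OUT_ZERO_POINT = -128

for q in (-128, 0, 127):
    print(q, "->", dequantize(q, OUT_SCALE, OUT_ZERO_POINT))
```

With these assumed parameters, an int8 output of -128 corresponds to probability 0.0 and 127 to just under 1.0. Code that expects uint8 would see the same scores shifted by 128.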