Regarding Model Classification Failed Error

Project_Sat · November 24, 2019, 1:22pm

Hi guys again,

First of all, thanks for releasing the Tensorflow support with the latest firmware. I am now stuck with the model classification failed issue. I will summarize the process I have gone through so far.

CNN Type
Binary classification. Output is simply good or bad. Input size is 100x100x3 (100 pixels by 100 pixels, 3 channels).

0) Module used
OpenMV H7 R1

1) Train the model
Used tensorflow with Keras. 2D convolution layer, flattening layer, global average pooling and output layer

2) Converted to tensorflow lite
Followed the guide on the website. Post-training quantization: 8-bit quantization of the flatbuffer using a representative dataset. Final size is 107 KB, small enough to run from the H7's RAM with tf.classify without an SD card. The width, height and channel count are not a problem.
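For anyone following along, the conversion step described above might look roughly like the sketch below. It assumes a recent TensorFlow 2.x release and is not the exact script used for this model. The function names `representative_dataset` and `convert_to_uint8_tflite` are made up for illustration, and the calibration images are assumed to be preprocessed the same way as the training data.

```python
def representative_dataset(images):
    """Yield calibration samples one at a time, each as a [1, H, W, C] float32 batch."""
    import numpy as np  # deferred so this file loads without NumPy installed
    for image in images:
        yield [np.expand_dims(np.asarray(image, dtype=np.float32), axis=0)]


def convert_to_uint8_tflite(keras_model, calibration_images):
    """Convert a trained Keras model to an integer-quantized flatbuffer (bytes)."""
    import tensorflow as tf  # deferred import; assumes TensorFlow 2.x

    converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    # The converter calls this with no arguments and iterates over the samples.
    converter.representative_dataset = lambda: representative_dataset(calibration_images)
    # Restrict to integer kernels and make the model's input/output uint8.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.uint8
    converter.inference_output_type = tf.uint8
    return converter.convert()

# Hypothetical usage, given a trained `model` and a list of 100x100x3 images:
#   with open("model_quantized.tflite", "wb") as f:
#       f.write(convert_to_uint8_tflite(model, images))
```

Which op versions end up in the flatbuffer depends on the TensorFlow release doing the conversion, so the same script can produce files that differ in what the runtime has to support.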

My model shows the following output when I use the tf.lite.Interpreter

== Input details ==
name: conv2d_input
shape: [  1 100 100   3]
type: <class 'numpy.uint8'>
index: 23
quantization: (1.0, 0)

== Output details ==
name: Identity
shape: [1 2]
type: <class 'numpy.uint8'>
index: 23
quantization: (1.0, 0)

I also checked the pretrained mobilenet that was provided with the new release. I see that the shape is ok.

== Input details ==
name: input
shape: [  1 128 128   3]
type: <class 'numpy.uint8'>
index: 88
quantization: (0.0078125, 128)

== Output details ==
name: MobilenetV1/Predictions/Reshape_1
shape: [   1 1001]
type: <class 'numpy.uint8'>
index: 88
quantization: (0.0078125, 128)

I do notice a difference in quantization where my model is (1.0,0) and the model you provided is (0.0078125, 128). Not sure if this is the problem.
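For reference, TensorFlow Lite's quantization parameters are a (scale, zero_point) pair, and a stored integer q stands for the real value scale * (q - zero_point). A minimal sketch of what the two pairs above imply for uint8 values follows; the helper name `dequantize` is made up for illustration.

```python
def dequantize(q, scale, zero_point):
    """Map a quantized integer back to the real value it represents."""
    return scale * (q - zero_point)

# (scale, zero_point) pairs as reported by the interpreter above.
models = (
    ("my model", (1.0, 0)),
    ("mobilenet", (0.0078125, 128)),
)

for name, (scale, zero_point) in models:
    lowest = dequantize(0, scale, zero_point)
    highest = dequantize(255, scale, zero_point)
    print("%s: uint8 0..255 represents %s..%s" % (name, lowest, highest))
```

So (1.0, 0) passes the raw 0 to 255 values through unscaled, while (0.0078125, 128) maps them to roughly -1 to 1. The two pairs describe different encodings of the real-valued range.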

I have attached my tensorflow lite model below:
https://drive.google.com/file/d/1nM-ZAJD6tbOqVWq3NAXJtUy1jP3zm7Nx/view?usp=sharing

The way I approached is simple: I used the example code for person detection and replaced that with my own model. Here’s an example:

import sensor, image, time, os, tf

sensor.reset()                         # Reset and initialize the sensor.
sensor.set_pixformat(sensor.RGB565)    # Set pixel format to RGB565 because it is 3 channel
sensor.set_framesize(sensor.QVGA)      # Set frame size to QVGA (320x240)
sensor.set_windowing((100, 100))       # Set 100x100 window, the input is 100x100
sensor.skip_frames(time=2000)          # Let the camera adjust.

net = "model_quantized_io2.tflite"
labels = ['bad','good']

clock = time.clock()
while(True):
    clock.tick()

    img = sensor.snapshot()
    
    for obj in tf.classify(net, img, min_scale=0.5, scale_mul=0.5, x_overlap=-1, y_overlap=-1):
        print("**********\nDetections at [x=%d,y=%d,w=%d,h=%d]" % obj.rect())
        for i in range(len(obj.output())):
            print("%s = %f" % (labels[i], obj.output()[i]))
        img.draw_rectangle(obj.rect())
        img.draw_string(obj.x()+3, obj.y()-1, labels[obj.output().index(max(obj.output()))], mono_space = False)
    print(clock.fps(), "fps")

Is there anything I am missing? I will go through the entire training process again to check if my tflite model is ok. Do let me know your thoughts.

Thanks again.

kwagyeman · November 24, 2019, 3:47pm

Hi, I will be able to answer this later in the day. However, can you copy and paste the output of your serial terminal in the IDE here? The actual error will be printed out in the grayed out section of the serial terminal as the error messages from TensorFlow look like standard debug text right now.

kwagyeman · November 24, 2019, 11:32pm

Hi, the error was in the serial terminal:

>>> Didn't find op for builtin opcode 'CONV_2D' version '3'

Failed to get registration from op code  d

AllocateTensors() failed!

You have an op that isn’t supported. https://github.com/openmv/tensorflow/blob/master/tensorflow/lite/experimental/micro/kernels/all_ops_resolver.cc

…

It looks like, unless specified, the min/max version of the ops is 1: https://github.com/openmv/tensorflow/blob/master/tensorflow/lite/experimental/micro/micro_mutable_op_resolver.h#L32

…

So, you need to use version 1 of CONV_2D, unless you see in the all ops resolver that a range of versions is allowed.
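To illustrate the idea, here is a toy model of a versioned op lookup. It is not the real TensorFlow Lite Micro code: the registry contents, the `find_op` helper and the use of `LookupError` are assumptions for illustration. Each op is registered with a (min_version, max_version) range, and a model asking for a version outside that range fails to load.

```python
# Toy model of a versioned op registry; NOT the real TensorFlow Lite Micro code.
# Each entry maps an op name to the (min_version, max_version) it was registered with.
REGISTERED_OPS = {
    "CONV_2D": (1, 1),  # hypothetical: registered with the default range of 1..1
    "SOFTMAX": (1, 1),  # hypothetical
}


def find_op(name, version, registry=REGISTERED_OPS):
    """Return the op name if this version is registered, else raise LookupError."""
    # Unknown ops get an empty range (0, -1), so no version can match.
    min_version, max_version = registry.get(name, (0, -1))
    if not (min_version <= version <= max_version):
        raise LookupError(
            "Didn't find op for builtin opcode '%s' version '%d'" % (name, version)
        )
    return name


find_op("CONV_2D", 1)  # accepted: version 1 is inside the registered range
try:
    find_op("CONV_2D", 3)  # the version the quantized model asks for
except LookupError as err:
    print(err)
```

Widening the registered range in a model like this only makes the lookup succeed. It does not guarantee that the kernel behind it implements the newer version's behavior.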

…

I will post a firmware update today that makes the error messages easier to read. Also, people seem to be running into the supported ops issue a lot so I will build a list of those into the firmware with an error case so it’s obvious.

kwagyeman · November 25, 2019, 12:21am

This has some info about this:

It looks like if you use anything more than basic features you get this issue.

kwagyeman · November 25, 2019, 12:39am

Looks like CONV_2D version 1 to 3 was supported and then removed or something: As a first step to synchronizing our op versions with TFLite, update … · openmv/tensorflow@1ee0f17 · GitHub

Project_Sat · November 25, 2019, 2:03am

Hey Kwagyeman,

Thanks for the reply.
Just got to see this, it’s morning around this part.

Will check and get back to you.

kwagyeman · November 25, 2019, 3:07am

Here’s the binary for the H7 which prints the error in the console:

Project_Sat · November 26, 2019, 2:37am

Hey again,

Thanks, updated the firmware.

I am going back to build a simple MNIST model first, and will then work on my custom model based on the lessons learned.

Will get back, time for that 1000th coffee

Project_Sat · November 26, 2019, 8:40am

Hey again,

I realized I have been running in circles again. Posting the updates. My question is how to change the version of, let's say, CONV2D or SOFTMAX.

Approach
Proceeded to build a simple NN for MNIST classification. This would then help to work on my custom model.

Problem
SOFTMAX version 2 not found. Similar to the CONV2D issue you mentioned with my model before.

Traceback (most recent call last):
  File "<stdin>", line 26, in <module>
OSError: Didn't find op for builtin opcode 'SOFTMAX' version '2'

Documentation
I have documented the entire process in Google Colab. Shift+F9 runs the whole thing. Generates mnist_model_quant.tlite

Methods Tried

Trying different TF versions
Changing to tf.compat.v1.math.softmax [My logic was that Softmax would change to version 1, but it had no effect]
Changing to different architectures. Other architectures also have the same “Version not found” issue
I am used to using the high-level API so I could be missing something.

I realized that other people also have had the same issue. They have also tried different iterations/codes/techniques to generate. What I read from it is that there’s no support in the library as of yet.

github.com/tensorflow/tensorflow

TF Micro requires CONV_2D version '2' when applying quantization

opened 02:53PM - 12 Sep 19 UTC

closed 02:32PM - 07 Nov 21 UTC

dustedduke

stat:awaiting response type:support stalled comp:lite TF 2.0

- Have I written custom code: Yes
- OS Platform and Distribution: Ubuntu 18.04 …
- TensorFlow version: 1.14, 2.0-rc0, nightly-preview 2.0.0.dev20190911
- Python version: 3.7.4

I am trying to run a simple Keras model with TensorFlow version for microcontrollers (modified micro_vision example with several convolutions, dense layers and batch normalization). While the model works correctly when using floats, the quantized version generated with `tf.lite.Optimize.DEFAULT` outputs wrong values and `'Didn't find op for builtin opcode 'CONV_2D' version '2' Invoke failed.'` error.

After changing `CONV_2D` version with `AddBuiltin(BuiltinOperator_CONV_2D, Register_CONV_2D(), 1, 2)` in `all_ops_resolver.cc` the error disappeared, but now the model outputs NaN instead of normal output and I'm not sure if it actually uses a different micro kernel.

After trying multiple conversion techniques (from .h5, model itself, SavedModel folder etc.) with multiple TF versions and BatchNormalization removal, the situation remains the same. As far as I understand, each way to quantize the model (post-training quantization, quantization aware training) requires this version of `CONV2D`, but it seems to work incorrectly or there is no such version in TF micro. Is there any way to deploy such a model with quantization?

Looking at my code, is there any way you think I could change the version of CONV2D or SOFTMAX? The quantized model generated through the code/guide provided by Google doesn’t seem to be compatible with the current library.

Let me know your thoughts. I will go get some rest for now. Barely keeping awake.

kwagyeman · November 26, 2019, 6:56pm

Hi, to be honest I’m not more familiar with this than you. We are kinda depending on Google here now.

I would do this: only use ops used by MobileNet v1 (I’ve run MobileNet v1 on the camera and it works). This way you are using the supported ops they have rolled out so far. I can easily update our code as they roll out more ops, but it’s going to take a while. So, I’d take a look at the ops used by MobileNet and restrict yourself to those for right now.
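One way to apply that advice is to diff your model's ops against a reference set. The sketch below is only an illustration: the `MOBILENET_V1_OPS` entries are placeholder values rather than a verified list, and `unsupported_ops` is a made-up helper. Both op lists would have to come from actually inspecting the two .tflite files.

```python
# Placeholder (op, version) pairs; NOT a verified list of what MobileNet v1 uses.
MOBILENET_V1_OPS = {
    ("CONV_2D", 1),
    ("DEPTHWISE_CONV_2D", 1),
    ("AVERAGE_POOL_2D", 1),
    ("SOFTMAX", 1),
    ("RESHAPE", 1),
}


def unsupported_ops(model_ops, reference_ops):
    """Return the (op, version) pairs in model_ops that the reference model does not use."""
    return sorted(set(model_ops) - set(reference_ops))


# Hypothetical op list for a custom model, using the versions reported in this thread.
my_model_ops = [("CONV_2D", 3), ("AVERAGE_POOL_2D", 1), ("SOFTMAX", 2)]

for name, version in unsupported_ops(my_model_ops, MOBILENET_V1_OPS):
    print("not in reference set: %s version %d" % (name, version))
```

Comparing (op, version) pairs rather than op names alone matters here, because the errors in this thread are about versions of ops that are otherwise supported.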

Project_Sat · November 27, 2019, 12:00pm

Hey again,

I see, in that case we will have to wait for Google then.

In the meantime though, as you said, looking at mobilenet could give some ideas. I did do some searching:

The original paper was published in 2017 and used TensorFlow (although it does not mention which version). I am assuming they had the basic one, which is TensorFlow 1.1.
They use SOFTMAX, AVG POOLING and DENSE along with CONV2D layers. Since it worked on OpenMV, they should have used version 1.

My question would then be this:

How did you create the mobilenet that you provided with the OpenMV examples? How did you train the network and change it to quantized tflite models? If not, how did you transfer learn and then change them to tflite?

It would be very interesting if you could have a basic Colab (similar to the one I posted before) or a guide on how OpenMV trained the models that you run on your modules, so that users can then train their own by following the same steps. I am sure this is additional work, but I think it would be a win-win for both the company and users, as it would mean less confusion and fewer questions.

Do let me know, I will try to see if I can get tensorflow 1.1 going without Keras. I haven’t done that so might be a while.

Thanks again.

kwagyeman · November 27, 2019, 5:57pm

For Mobilenet, the script that makes it is here: models/mobilenet_v1.py at master · tensorflow/models · GitHub

For running mobilenet I just used the quantized version here: models/mobilenet_v1.md at master · tensorflow/models · GitHub

Mobilenet only fits on our new version with SDRAM, but, I can run it on that without issue.

…

Regarding a guide, I haven’t actually trained any models. We are just using the ones Google supplied right now. There’s a lot of other stuff on my work queue right now before I can write a guide on this.

Since we’re using TensorFlow, OpenMV isn’t really the issue here. It’s that Google’s library doesn’t yet have the layers you want.

I would start here: models/research/slim at master · tensorflow/models · GitHub. I guess that you can’t use Keras right now.

Project_Sat · November 28, 2019, 3:38am

Hey again,

Thanks for the information. It’s clear now how the models were provided. Will look into them.

Also, I sent you an email from my official email address. If you need a Gmail account for the mailing list, you can use the one you contacted me with. Otherwise my institution prefers the official one.

Thanks again
