Regarding Model Classification Failed Error

Hi guys again,

First of all, thanks for releasing the Tensorflow support with the latest firmware. I am now stuck with the model classification failed issue. I will summarize the process I have gone through so far.

CNN Type
Binary classification. Output is simply good or bad. Input size is 100x100x3 (100 pixels by 100 pixels, 3 channels).

0) Module using
OpenMV H7 R1

1) Train the model
Used tensorflow with Keras. 2D convolution layer, flattening layer, global average pooling and output layer

2) Converted to tensorflow lite
Followed the guide on the website. Post-training quantization: 8-bit flatbuffer quantization through a representative dataset. Final size is 107KB, small enough to skip the SD card and run from the H7's RAM using tf.classify. Has no problem with width, height and channel.
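For readers following along, here is a minimal sketch of what this kind of post-training quantization step can look like with the TFLite converter. It is not necessarily the exact script used here: it assumes a recent TensorFlow 2.x, and `make_representative_dataset`, `convert_to_uint8_tflite`, `keras_model` and `samples` are hypothetical names introduced only for illustration.

```python
from itertools import islice


def make_representative_dataset(samples, limit=100):
    """Return a generator function that yields one calibration sample per step.

    Each sample is assumed to be a float32 array shaped like the model input
    with a batch dimension, e.g. (1, 100, 100, 3).
    """
    def gen():
        for sample in islice(samples, limit):
            # The converter expects a list with one entry per model input.
            yield [sample]
    return gen


def convert_to_uint8_tflite(keras_model, samples):
    """Sketch of full-integer post-training quantization with uint8 I/O."""
    # Imported lazily so the helper above can be used without TensorFlow.
    import tensorflow as tf

    converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = make_representative_dataset(samples)
    # Restrict the converter to integer-only builtin ops.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.uint8
    converter.inference_output_type = tf.uint8
    return converter.convert()  # bytes of the .tflite flatbuffer
```

Here `samples` would be a list of preprocessed training images, and the returned bytes can be written straight to a `.tflite` file.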

My model shows the following output when I use tf.lite.Interpreter:

== Input details ==
name: conv2d_input
shape: [  1 100 100   3]
type: <class 'numpy.uint8'>
index: 23
quantization: (1.0, 0)

== Output details ==
name: Identity
shape: [1 2]
type: <class 'numpy.uint8'>
index: 23
quantization: (1.0, 0)
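A dump like the one above can be produced roughly as follows. This is only a sketch: `format_details` and `describe_model` are hypothetical helper names, while `get_input_details()` and `get_output_details()` are the interpreter calls that return the `name`, `shape`, `dtype`, `index` and `quantization` fields.

```python
def format_details(title, details):
    """Render a list of tensor-detail dicts in the layout used in this thread."""
    lines = ["== %s ==" % title]
    for d in details:
        lines.append("name: %s" % d["name"])
        lines.append("shape: %s" % (d["shape"],))
        lines.append("type: %s" % d["dtype"])
        lines.append("index: %s" % d["index"])
        lines.append("quantization: %s" % (d["quantization"],))
    return "\n".join(lines)


def describe_model(model_path):
    """Load a .tflite file and return its input/output details as text."""
    # Imported lazily so format_details works without TensorFlow installed.
    import tensorflow as tf

    interpreter = tf.lite.Interpreter(model_path=model_path)
    interpreter.allocate_tensors()
    return "\n\n".join([
        format_details("Input details", interpreter.get_input_details()),
        format_details("Output details", interpreter.get_output_details()),
    ])
```

Calling `print(describe_model("model_quantized_io2.tflite"))` on the desktop would then print both sections for the model file used later in this post.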

I also checked the pretrained mobilenet that was provided with the new release. I see that the shape is ok.

== Input details ==
name: input
shape: [  1 128 128   3]
type: <class 'numpy.uint8'>
index: 88
quantization: (0.0078125, 128)

== Output details ==
name: MobilenetV1/Predictions/Reshape_1
shape: [   1 1001]
type: <class 'numpy.uint8'>
index: 88
quantization: (0.0078125, 128)

I do notice a difference in quantization where my model is (1.0,0) and the model you provided is (0.0078125, 128). Not sure if this is the problem.
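To make the difference concrete: in TFLite's affine quantization scheme the pair is (scale, zero_point), and a stored integer q stands for the real value scale * (q - zero_point). The sketch below just evaluates that formula for both pairs over the uint8 range; the function names are made up for this example.

```python
def dequantize(q, scale, zero_point):
    """Map a quantized integer back to the real value it represents."""
    return scale * (q - zero_point)


def representable_range(scale, zero_point, qmin=0, qmax=255):
    """Real-valued range covered by a uint8 tensor with these parameters."""
    return dequantize(qmin, scale, zero_point), dequantize(qmax, scale, zero_point)


print(representable_range(0.0078125, 128))  # MobileNet: (-1.0, 0.9921875)
print(representable_range(1.0, 0))          # my model:  (0.0, 255.0)
```

So with (0.0078125, 128) the uint8 values are read as roughly -1 to 1, while with (1.0, 0) they are taken at face value as 0 to 255. This only shows what the numbers mean; it does not by itself say which setting a given model needs.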

I have attached my tensorflow lite model below:
https://drive.google.com/file/d/1nM-ZAJD6tbOqVWq3NAXJtUy1jP3zm7Nx/view?usp=sharing

The way I approached is simple: I used the example code for person detection and replaced that with my own model. Here’s an example:

import sensor, image, time, os, tf

sensor.reset()                         # Reset and initialize the sensor.
sensor.set_pixformat(sensor.RGB565)    # Set pixel format to RGB565 because it is 3 channel
sensor.set_framesize(sensor.QVGA)      # Set frame size to QVGA (320x240)
sensor.set_windowing((100, 100))       # Set 100x100 window, the input is 100x100
sensor.skip_frames(time=2000)          # Let the camera adjust.

net = "model_quantized_io2.tflite"
labels = ['bad','good']

clock = time.clock()
while(True):
    clock.tick()

    img = sensor.snapshot()
    
    for obj in tf.classify(net, img, min_scale=0.5, scale_mul=0.5, x_overlap=-1, y_overlap=-1):
        print("**********\nDetections at [x=%d,y=%d,w=%d,h=%d]" % obj.rect())
        for i in range(len(obj.output())):
            print("%s = %f" % (labels[i], obj.output()[i]))
        img.draw_rectangle(obj.rect())
        img.draw_string(obj.x()+3, obj.y()-1, labels[obj.output().index(max(obj.output()))], mono_space = False)
    print(clock.fps(), "fps")

Is there anything I am missing? I will go through the entire training process again to check if my tflite model is ok. Do let me know your thoughts.

Thanks again.

Hi, I will be able to answer this later in the day. However, can you copy and paste the output of your serial terminal in the IDE here? The actual error will be printed out in the grayed out section of the serial terminal as the error messages from TensorFlow look like standard debug text right now.

Hi, the error was in the serial terminal:

>>> Didn't find op for builtin opcode 'CONV_2D' version '3'

Failed to get registration from op code  d

AllocateTensors() failed!

You have an op that isn’t supported. https://github.com/openmv/tensorflow/blob/master/tensorflow/lite/experimental/micro/kernels/all_ops_resolver.cc

It looks like, unless specified otherwise, the min/max version of each op is 1: https://github.com/openmv/tensorflow/blob/master/tensorflow/lite/experimental/micro/micro_mutable_op_resolver.h#L32

So you need to use version 1 of CONV_2D, unless you see in the all ops resolver that a range of versions is allowed.
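As a rough mental model of that lookup (a Python sketch, not the actual C++ resolver): each registered op has a min/max version, defaulting to 1, and a model op is only found if its version falls inside that range. The table entries below are illustrative assumptions, not the firmware's real list.

```python
# Hypothetical registration table: op name -> (min_version, max_version).
# Ops registered without an explicit range get (1, 1), mirroring the default
# described above. These entries are illustrative only.
DEFAULT_RANGE = (1, 1)
REGISTERED = {
    "CONV_2D": DEFAULT_RANGE,
    "SOFTMAX": DEFAULT_RANGE,
    "AVERAGE_POOL_2D": DEFAULT_RANGE,
}


def is_supported(op_name, version, registered=REGISTERED):
    """True if the op is registered and the version is within its range."""
    version_range = registered.get(op_name)
    if version_range is None:
        return False
    lo, hi = version_range
    return lo <= version <= hi


def unsupported_ops(model_ops, registered=REGISTERED):
    """Return the (name, version) pairs a model needs that would not be found."""
    return [(name, ver) for name, ver in model_ops if not is_supported(name, ver, registered)]


print(unsupported_ops([("CONV_2D", 3), ("SOFTMAX", 1)]))  # [('CONV_2D', 3)]
```

With a range such as `{"CONV_2D": (1, 3)}` the same model op would be accepted, which is why it is worth checking the resolver source for explicit ranges.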

I will post a firmware update today that makes the error messages easier to read. Also, people seem to be running into the supported ops issue a lot so I will build a list of those into the firmware with an error case so it’s obvious.

This has some info about this:

It looks like if you use anything more than basic features you get this issue.

Looks like CONV_2D version 1 to 3 was supported and then removed or something: As a first step to synchronizing our op versions with TFLite, update … · openmv/tensorflow@1ee0f17 · GitHub

Hey Kwageman,

Thanks for the reply.
Just got to see this, it’s morning around this part.

Will check and get back to you.

Here’s the binary for the H7 which prints the error in the console:

Hey again,

Thanks, updated the firmware.

I am retracing back to building a simple MNIST model first and then will be working on my custom model based on the lessons.

Will get back, time for that 1000th coffee

Hey again,

I realized I have been running in circles again, so I am posting an update. My question is how to change the version of, let's say, CONV2D or SOFTMAX.

Approach
Proceeded to build a simple NN for MNIST classification. This would then help to work on my custom model.

Problem
SOFTMAX version 2 not found. This is similar to the CONV2D issue you mentioned with my model before.

Traceback (most recent call last):
  File "<stdin>", line 26, in <module>
OSError: Didn't find op for builtin opcode 'SOFTMAX' version '2'

Documentation
I have documented the entire process in google colab. Shift+F9 runs the whole thing. Generates mnist_model_quant.tlite

Methods Tried

  1. Trying different TF versions
  2. Switching to tf.compat.v1.math.softmax [my logic was that SOFTMAX would change to version 1, but it had no effect]
  3. Trying different architectures. Other architectures also have the same “Version not found” issue
    I am used to using the high-level API, so I could be missing something.
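When cycling through variants like this, it may help to keep a tally of which op/version pairs get rejected. The sketch below is plain desktop Python with hypothetical helper names; it only assumes the message format shown in the traceback above.

```python
import re
from collections import Counter

# Matches the message format from the traceback above. The "." after "Didn"
# tolerates either a straight or a curly apostrophe in copied logs.
_MISSING_OP = re.compile(r"Didn.t find op for builtin opcode '(\w+)' version '(\d+)'")


def missing_ops(log_text):
    """Return (op_name, version) pairs found in a serial-terminal log."""
    return [(name, int(version)) for name, version in _MISSING_OP.findall(log_text)]


def tally(logs):
    """Count how often each missing op/version pair appears across logs."""
    counts = Counter()
    for log in logs:
        counts.update(missing_ops(log))
    return counts
```

Feeding it the saved terminal output from each attempt gives a quick list of the op versions to avoid.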

I realized that other people also have had the same issue. They have also tried different iterations/codes/techniques to generate. What I read from it is that there’s no support in the library as of yet.

Looking at my code, is there any way you think I could change the version of CONV2D or SOFTMAX? The quantized model generated through the code/guide provided by Google doesn’t seem to be compatible with the current library.

Let me know your thoughts. I will go get some rest for now. Barely keeping awake.

Hi, to be honest I’m not more familiar with this than you. We are kinda depending on Google here now.

I would do this: only use ops used by mobilenet-v1 (I’ve run Mobilenet v1 on the camera and it works). This way you are using the supported ops they have rolled out so far. I can easily update our code as they roll out more ops, but it’s going to take a while. So, I’d take a look at the ops used by mobilenet and restrict yourself to those for right now.

Hey again,

I see, in that case we will have to wait for google then.

In the meantime though, as you said, looking at mobilenet could give some ideas. I did do some searching:

  1. The original paper was published in 2017 and used TensorFlow (although it does not mention which version). I am assuming they used the basic one, which is TensorFlow 1.1
  2. They use SOFTMAX, AVG POOLING and DENSE along with CONV2D layers. Since it worked on OpenMV, they should have used version 1 of these ops

My question would then be this:

How did you create the mobilenet that you provided with the OpenMV examples? How did you train the network and convert it to a quantized tflite model? If you did not train it yourself, how did you transfer learn and then convert it to tflite?

It would be very interesting if you could provide a basic Colab (similar to the one I posted before) or a guide on how OpenMV trained the models that run on your modules, so that users could then train their own by following the same steps. I am sure this is additional work, but I think it would be a win-win for both the company and the users, as it would mean less confusion and fewer questions.

Do let me know, I will try to see if I can get tensorflow 1.1 going without Keras. I haven’t done that so might be a while.

Thanks again.

For Mobilenet, the script that makes it is here: models/mobilenet_v1.py at master · tensorflow/models · GitHub

For running mobilenet I just used the quantized version here: models/mobilenet_v1.md at master · tensorflow/models · GitHub

Mobilenet only fits on our new version with SDRAM, but, I can run it on that without issue.

Regarding a guide, I haven’t actually trained any models. We are just using the ones Google supplied right now. There’s a lot of other stuff on my work queue right now before I can write a guide on this.

Since we’re using TensorFlow, OpenMV isn’t really the issue here. It’s that Google’s library doesn’t yet have the layers you want.

I would start here: models/research/slim at master · tensorflow/models · GitHub. I guess that you can’t use Keras right now.

Hey again,

Thanks for the information. It’s clear now how the models were provided. Will look into them.

Also, I sent you an email from my official address. If you need a Gmail account for the mailing list, you can use the one you contacted me with. Otherwise, my institution prefers the official one.

Thanks again