Loading a tflite model into SDRAM instead of SRAM - H7 Plus

Hey,

I have a tflite object detection model which I trained with Edge Impulse with very nice accuracy. However, no matter how hard I have tried, I have not been able to load it into SDRAM instead of SRAM. I have no worries about how long inference will take, I just want to get it running on the H7 Plus. Is there a way to use SDRAM instead of SRAM for loading tflite models?

If the answer is no, then is there any way for the H7 Plus to load a tflite model that is larger than the SRAM can handle?

Thank you so much in advance.


Hi, on the H7 Plus the model loads into SDRAM automatically; this is the default, and there is no option to load it into SRAM instead. The firmware will speed things up by moving the model into an SRAM cache, but only if it’s small enough to fit.
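As an aside, the stock Edge Impulse example script decides between the heap and the frame buffer (SDRAM on the H7 Plus) with a simple headroom check. A minimal sketch of that logic, assuming the example's 64 KB headroom; the helper name is mine, not part of any API:

```python
# Sketch: decide whether a model should go to the frame buffer (SDRAM on
# the H7 Plus) instead of the MicroPython heap. Mirrors the stock Edge
# Impulse example, which keeps the model on the heap only if at least
# 64 KB of heap would remain free after loading. The helper is pure logic
# so it runs anywhere; on the camera you would feed it uos.stat(path)[6]
# (file size) and gc.mem_free().

HEADROOM = 64 * 1024  # heap bytes to keep free after loading (assumption)

def should_load_to_fb(model_size, free_heap, headroom=HEADROOM):
    """Return True if the model should be placed in the frame buffer."""
    return model_size > (free_heap - headroom)

# On the OpenMV Cam this would be used roughly like:
#   import tf, uos, gc
#   path = "my_model.tflite"  # hypothetical file name
#   net = tf.load(path, load_to_fb=should_load_to_fb(uos.stat(path)[6],
#                                                    gc.mem_free()))
```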

Please share your code.

Actually, I am using Edge Impulse’s image classification inference code. I only changed the model path to use my own model. After the sensor.skip_frames(time=2000) line, the frame buffer display freezes, without any LED flashing on the OpenMV.

With another tflite model, even larger than the one I used here, the board crashes with the green LED flashing and the connection drops.

# Edge Impulse - OpenMV Image Classification Example

import sensor, image, time, os, tf, uos, gc

sensor.reset()                         # Reset and initialize the sensor.
sensor.set_pixformat(sensor.RGB565)    # Set pixel format to RGB565 (or GRAYSCALE)
sensor.set_framesize(sensor.QVGA)      # Set frame size to QVGA (320x240)
sensor.set_windowing((240, 240))       # Set 240x240 window.
sensor.skip_frames(time=2000)          # Let the camera adjust.

net = None
labels = None

try:
    # Load the model into the frame buffer (SDRAM on the H7 Plus).
    net = tf.load("my_model.tflite", load_to_fb=True)
except Exception as e:
    raise Exception('Failed to load "my_model.tflite", did you copy the .tflite and labels.txt file onto the mass-storage device? (' + str(e) + ')')

try:
    labels = [line.rstrip('\n') for line in open("labels.txt")]
except Exception as e:
    raise Exception('Failed to load "labels.txt", did you copy the .tflite and labels.txt file onto the mass-storage device? (' + str(e) + ')')

clock = time.clock()
while(True):
    clock.tick()

    img = sensor.snapshot()

    # default settings just do one detection... change them to search the image...
    for obj in net.classify(img, min_scale=1.0, scale_mul=0.8, x_overlap=0.5, y_overlap=0.5):
        print("**********\nPredictions at [x=%d,y=%d,w=%d,h=%d]" % obj.rect())
        img.draw_rectangle(obj.rect())
        # This combines the labels and confidence values into a list of tuples
        predictions_list = list(zip(labels, obj.output()))

        for label, score in predictions_list:
            print("%s = %f" % (label, score))

    print(clock.fps(), "fps")
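To rank the printed labels by confidence instead of file order, the inner loop could sort the (label, score) pairs first. A small standalone sketch with placeholder data (not part of the stock example; labels and scores here are made up):

```python
# Hypothetical tweak to the classification loop: sort (label, score)
# pairs by confidence before printing, highest first. Pure Python, so it
# behaves the same under MicroPython on the camera.
labels = ["cat", "dog", "bird"]   # placeholder labels
scores = [0.10, 0.85, 0.05]       # placeholder for obj.output()

predictions_list = sorted(zip(labels, scores), key=lambda p: p[1], reverse=True)
for label, score in predictions_list:
    print("%s = %f" % (label, score))
# prints "dog" first (0.85), then "cat", then "bird"
```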

Oh, that’s unrelated to the model size. Unfortunately, TensorFlow Lite for Microcontrollers just crashes rather than printing an error when you use an unsupported op:

So, if you have an op in your model not in that list you get a crash.
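Since the failure mode is a silent crash, it helps to check the model's ops against the supported list before deploying. A minimal sketch of that check; the op names would come from a viewer like Netron, and the supported set below is illustrative, not the firmware's actual list:

```python
# Sketch: diff a model's op list against the firmware's supported ops
# before copying it to the camera, instead of crashing at load time.
# SUPPORTED_OPS below is illustrative only, NOT the real firmware list.

SUPPORTED_OPS = {"CONV_2D", "DEPTHWISE_CONV_2D", "FULLY_CONNECTED",
                 "MAX_POOL_2D", "SOFTMAX", "RESHAPE"}

def missing_ops(model_ops, supported=SUPPORTED_OPS):
    """Return, sorted, the ops the firmware would not be able to run."""
    return sorted(set(model_ops) - supported)

print(missing_ops(["CONV_2D", "TOPK_V2", "TILE"]))  # → ['TILE', 'TOPK_V2']
```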


Oh, I hadn’t even considered anything other than the size of the model. I’m quite surprised. Thank you.

Is there any way to solve this issue? I mean, is there a way for me to use only supported ops? Besides, I don’t know what an op really is or how to configure it. I have found this, but I did not understand what is going on in detail.

Thanks in advance, again.

Yeah, we just support the ops that Edge Impulse emits. Enabling all of them on the H7 takes all of the SRAM.

On the upcoming OpenMV RT we will enable them all.

As for your current issue… two options:

  1. Retrain your model with only those ops. It should be clear from your TensorFlow graph.
  2. If you give me the ops you are trying to use, I can enable the missing ones.

Alright, thanks for your help in advance. I am sharing the ops that the model has:

Sub
Mul
Conv2D
DepthwiseConv2D
Add
Relu6
ResizeNearestNeighbor
Relu
StridedSlice
Logistic
MaxPool2D
Abs
Cast
Less
Transpose
Reshape
TopkV2
FloorDiv
Pack
Gathernd
Unpack
Tile
GatherEqual
Minimum
NotEqual
Sqrt
ArgMin
Maximum

If you have the time to give me a quick tutorial on how I can enable these missing ops myself, that would be even better, because in the future I may use other models that the H7 Plus does not support. If it is too complicated to explain, then it would be great if you could enable the missing ops.

Thank you so much.

Hi, TensorFlow Lite for Microcontrollers supports everything but:

Cast
TopkV2
Gathernd
Tile
GatherEqual

So, you need to remove these ops from the network. Also, your network is extremely complex. The nets that Edge Impulse outputs only use 6 ops… you are using a plethora of them. So, I guess this net was not designed to run on embedded hardware, and its performance is likely to be very low.

I could update the library to upstream… but the supported ops would then be these: tflite-micro/all_ops_resolver.cc at main · edgeimpulse/tflite-micro · GitHub

Cast and Gathernd have been added.

So, you’d need to eliminate TopkV2, Tile, and GatherEqual.
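TopkV2 typically shows up only in a detection model's post-processing. One way to eliminate it is to cut the graph before that stage and do top-k over the raw scores in Python on the camera instead. A minimal pure-Python sketch (so it also runs under MicroPython); the function name is mine:

```python
# Sketch: top-k selection done outside the graph, on the model's raw
# score output. Replaces a TOPK_V2 op when the graph is cut before its
# post-processing stage.

def top_k(scores, k):
    """Return (index, score) pairs for the k highest scores, descending."""
    indexed = sorted(enumerate(scores), key=lambda p: p[1], reverse=True)
    return indexed[:k]

print(top_k([0.1, 0.7, 0.3, 0.9], 2))  # → [(3, 0.9), (1, 0.7)]
```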

Hello, Kwabena

It is really sad that I won’t be able to use this model with a microcontroller. However, I’d be really happy if you could update the library to upstream for Cast and Gathernd. Maybe I will find a way to eliminate the others and see whether the accuracy drops a lot. By the way, if you get the chance to explain how to update the board so it can run models with other ops, that would be amazing.

And thank you so much. I really appreciate it.

Hi, pulling in the latest library is quite a lot of work… It’s easy to enable the ops on our current version, but if I pull the latest upstream I’ll have to fight a lot of unrelated issues. Please keep in mind that we advise users to use the Edge Impulse flow, as it works, produces usable models, and does not require these ops.

Anyway, if you can remove these ops:

Cast
TopkV2
Gathernd
Tile
GatherEqual

Then I’ll give you a firmware build you can run which has the rest enabled. Also, if the added code isn’t too much, I can send a PR to our repo to permanently enable the other ops.

Whatever the case, THANK YOU for actually helping me identify which ops to enable. This will help future folks. It’s not really clear which ones to add, and they do require a lot of code space.

Alright, I will be waiting for the firmware that can handle these ops once I remove those 5 ops. As you said, I will also share the model’s new accuracy after the removal so future folks can get a better idea.

And do not thank me, because it was your help that kept me working on this stuff.

You may want to consider changing the title of this question before you share the new firmware, since it has drifted a little off topic :smiley:

Meanwhile, I’ll be waiting for you and will post the accuracy difference caused by the change in ops.

Thank you!

Hi, it appears that:

FloorDiv and Transpose are not available.

I do have, though:

Floor and TransposeConv
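For FloorDiv specifically, the op decomposes exactly into a plain divide followed by Floor, which the firmware reports having. A small numeric sketch to check the equivalence:

```python
import math

def floor_div(a, b):
    # FLOOR_DIV(a, b) decomposed into available ops:
    # an ordinary divide followed by FLOOR.
    return math.floor(a / b)

print(floor_div(7, 2))   # → 3
print(floor_div(-7, 2))  # → -4 (rounds toward negative infinity)
```

Note the rounding toward negative infinity for negative operands, which matches FloorDiv's semantics (unlike truncating division).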

Alright, I am going to take a look at them and check whether they are equivalent.

Hello, @kwagyeman

I found out that I won’t be able to run this model even if I remove those layers. That is why I came up with another model that can be useful for my project, with a much less complex network. Here are the layers:

Quantize
Conv2D
Relu6
DepthwiseConv2D
Add
Pad
Mean
FullyConnected
Logistic
Dequantize

Can you make this network runnable on the OpenMV H7 Plus? It will be enough for me, AFAIU.

Thanks a lot.

firmware.zip (1.0 MB)

Hi, here’s the attached firmware with all those ops enabled, along with the others you asked for that I could enable.

PR done: imlib/libtf: Enable more ops in tensorflow library. by kwagyeman · Pull Request #1831 · openmv/openmv · GitHub

Hi @kwagyeman,
Thanks for this PR and the new firmware. I just tried your new firmware with the
quantized_model.zip (434.9 KB) model.

This model has a “Dequantize” layer which is incompatible with the board. Unfortunately, the new firmware couldn’t resolve this problem.

Should I do anything else before running inference, besides running the bootloader?

Thanks in advance

Not sure what you mean by that last statement. I guess this means we need to update to the latest TensorFlow version. I will ask Edge Impulse where they are at and get that started.

Hey, @kwagyeman
I have run some further tests with different neural networks. The latest firmware cannot run Quantize and Dequantize. Waiting for your update.

Thanks

Hi, this requires an update to the library. The ETA for a fix will be significantly longer. If you can drop those layers, it would be helpful.

Also, why are those ops in your network? The OpenMV Cam is optimized for 8-bit signed multiplies. So… you should have already quantized the model offline.
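Quantize/Dequantize ops at the edges of a graph usually mean the converter left float inputs and outputs around an int8 core. A hedged sketch of how full-integer conversion is typically requested with desktop TensorFlow's TFLiteConverter (the saved-model directory, dataset function, and helper name below are hypothetical); asking for int8 inference input/output types is what removes the float Quantize/Dequantize wrapper ops at the graph's edges:

```python
# Sketch of full-integer quantization with desktop TensorFlow (this runs
# on a PC, not on the camera). Paths and the representative dataset are
# placeholders. Setting inference_input_type / inference_output_type to
# int8 asks the converter to drop the float Quantize/Dequantize ops that
# would otherwise wrap the model's inputs and outputs.

def export_int8_tflite(saved_model_dir, representative_dataset, out_path):
    import tensorflow as tf  # desktop TensorFlow, not the camera's tf module

    converter = tf.lite.TFLiteConverter.from_saved_model(saved_model_dir)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    # Restrict conversion to int8 builtin kernels only.
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8

    with open(out_path, "wb") as f:
        f.write(converter.convert())
```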