Success stories: batch ml predict?

Did anyone manage to train a tflite model that can successfully make predictions on batches of images on the OpenMV H7 Plus, 4.7 release? I want to see if I can get a performance increase on a small model of mine by predicting in batches rather than serially. I am trying and failing miserably when loading the model, which was created with TensorFlow using the same quantization parameters as my normal serial model, but with a different representative_dataset construct.

```python
def representative_data_gen():
    samples_per_class = rep_data_samples_per_class
    class_counts = {j: 0 for j in range(NUM_CLASSES)}
    batch_imgs = []

    for img, label in rep_data:
        true_class = np.argmax(label)
        # Skip classes that already have enough representative samples.
        if class_counts[true_class] >= samples_per_class:
            continue
        batch_imgs.append(img)
        class_counts[true_class] += 1
        if len(batch_imgs) == BATCH_SIZE:
            yield [np.stack(batch_imgs, axis=0).astype(np.float32)]
            batch_imgs = []

    # Pad last incomplete batch with duplicates
    if batch_imgs:
        while len(batch_imgs) < BATCH_SIZE:
            batch_imgs.append(batch_imgs[-1])
        yield [np.stack(batch_imgs, axis=0).astype(np.float32)]
```
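For context, the generator assumes a few names defined earlier in my script. A minimal sketch of what they could look like (the concrete values, calib_images, and calib_labels are illustrative assumptions, not the original code):

```python
import numpy as np
import tensorflow as tf

NUM_CLASSES = 4                  # example value
BATCH_SIZE = 8                   # fixed batch size baked into the model input
rep_data_samples_per_class = 25  # representative samples to take per class

# rep_data must yield (image, one_hot_label) pairs, e.g. a tf.data.Dataset
# built from held-out calibration data (shapes are illustrative):
rep_data = tf.data.Dataset.from_tensor_slices((
    calib_images.astype(np.float32),        # (N, H, W, C) array
    tf.one_hot(calib_labels, NUM_CLASSES),  # (N, NUM_CLASSES) labels
))
```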

```python
…

# Convert to quantized TFLite
converter = tf.lite.TFLiteConverter.from_keras_model(fixed_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8
converter.representative_dataset = representative_data_gen
tflite_model = converter.convert()
```
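The fixed_model passed to the converter presumably already has a static batch dimension; a minimal sketch of one way to pin it, assuming trained_model is the original Keras model and H, W, C are its input dimensions (all names here are illustrative):

```python
# Wrap the trained model so the converted .tflite gets a static
# [BATCH_SIZE, H, W, C] input rather than a dynamic batch dimension.
inputs = tf.keras.Input(batch_shape=(BATCH_SIZE, H, W, C))
outputs = trained_model(inputs)  # reuses the trained weights
fixed_model = tf.keras.Model(inputs, outputs)
```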

Could it be that ml doesn't support batch inference?
I tried loading from both ROMFS and flash, with load_to_fb set to both True and False:

```python
tf_mod = ml.Model(F)
```

```
Model size: 39096 bytes  Allocated RAM: 6016  Free RAM: 4331264
Load failed: Failed to allocate tensors
Allocated RAM: 4104384  Free RAM: 232896
```
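For reference, the two load variants were along these lines (the file path is a placeholder; ml.Model and load_to_fb are the OpenMV ml API):

```python
import ml

# From the filesystem, tensor arena on the MicroPython heap:
tf_mod = ml.Model("batch_model.tflite", load_to_fb=False)

# Same model, tensor arena placed in the frame buffer instead:
tf_mod = ml.Model("batch_model.tflite", load_to_fb=True)
```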

Note: the batch tflites are generated from the same h5 files from which I also generate the serial tflite models, and those run successfully, so there are no operations in the model that the OpenMV cannot handle.

Managed to load now, after I was reminded by an old post of mine at the bottom of the thread of an unsupported operation in the old tf library. This flag was not required after the tf-to-ml migration, so I had removed it from my training procedure:

```python
converter._experimental_disable_per_channel_quantization_for_dense_layers = True
```

I have added it back to the TensorFlow TFLite conversion procedure and now the model loads.

After timing serial vs. batch inference, it is evident that on this platform there is no performance gain from batching over serial prediction.

Unless there are some batch-specific optimization steps to be done in TensorFlow.
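For reference, the comparison I ran was along these lines (a rough sketch: the model file names, input shape, and the assumption that predict() accepts one batch-shaped ndarray are illustrative, not my exact script):

```python
import ml
import time
from ulab import numpy as np

BATCH = 8
serial_mod = ml.Model("serial_model.tflite")  # input shape [1, H, W, C]
batch_mod = ml.Model("batch_model.tflite")    # input shape [BATCH, H, W, C]

# Dummy inputs; H/W/C must match the models (96x96x1 is an example).
one = np.zeros((1, 96, 96, 1), dtype=np.float)
batch = np.zeros((BATCH, 96, 96, 1), dtype=np.float)

# Serial: BATCH separate predict() calls.
t0 = time.ticks_ms()
for _ in range(BATCH):
    serial_mod.predict([one])
t_serial = time.ticks_diff(time.ticks_ms(), t0)

# Batch: one predict() call over the whole stack.
t0 = time.ticks_ms()
batch_mod.predict([batch])
t_batch = time.ticks_diff(time.ticks_ms(), t0)

print("serial:", t_serial, "ms, batch:", t_batch, "ms")
```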

If anyone has an application such as mine, I would be glad to exchange thoughts.

Yeah, there would be no performance gain, as the batch operations are for when you have really wide SIMD paths, which the H7/RT1062 don't have. The N6 and AE3 have an NPU onboard, which can be helpful for this.

Thanks, will consider it in a future upgrade.