Heads up: custom TFLite models with latest versions of TensorFlow

I was doing an update of all my ML tools and upgraded to OpenMV 4.5.6.
In doing so I observed that TFLite models created and quantized with TensorFlow 2.8.2 were working, but with the latest TensorFlow 2.17 I kept getting failures while loading them onto OpenMV. With this flag in my script before the quantization process I was able to successfully load the model:

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter._experimental_disable_per_channel_quantization_for_dense_layers = True

See the TensorFlow release notes.


Thanks for posting about this.

I’m guessing the new TFLite feature could be useful and the OpenMV firmware could be made compatible, but I have no idea where to start looking. On load I immediately got an "unable to allocate tensor" message and that’s it.

FWIW, this is how I quantize models, and it works with both the old and new tflite-micro we’re using:

    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_data_gen
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.target_spec.supported_types = [tf.int8]
    converter.inference_input_type = tf.uint8
    converter.inference_output_type = tf.uint8
    tflite_model = converter.convert()

    with open('model.tflite', 'wb') as f:
        f.write(tflite_model)

I’m using TensorFlow 2.8.4 though; I haven’t tried this with a newer one.
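
For reference, the `representative_data_gen` callable used above isn’t shown in the snippet. A minimal sketch of what such a generator usually looks like follows; the `calibration_images` array, its shape, and the sample count are placeholders, not something from the original post:

    import numpy as np

    # Placeholder: a small batch of preprocessed images, shaped and
    # scaled exactly like the model's input during training.
    calibration_images = np.random.rand(100, 96, 96, 1).astype(np.float32)

    def representative_data_gen():
        # Yield one sample at a time so the converter can observe
        # realistic activation ranges for int8 calibration.
        for i in range(100):
            yield [calibration_images[i:i + 1]]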


For me it’s identical up to the newest TF version; I just added

    converter._experimental_disable_per_channel_quantization_for_dense_layers = True

    import os
    import tensorflow as tf

    for i in range(NUMBER_OF_MODELS_SAVE):
        model_path = os.path.join(save_dir, f"model_{i}.h5")
        model = tf.keras.models.load_model(model_path)
        converter = tf.lite.TFLiteConverter.from_keras_model(model)
        converter._experimental_disable_per_channel_quantization_for_dense_layers = True
        converter.optimizations = [tf.lite.Optimize.DEFAULT]
        converter.representative_dataset = representative_dataset
        converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
        converter.inference_input_type = tf.int8
        converter.inference_output_type = tf.int8
        tflite_quant_model = converter.convert()
        quant_model_path = os.path.join(PROCESSED_DIR, f"model_{i}_quantized.tflite")
        with open(quant_model_path, 'wb') as f:
            f.write(tflite_quant_model)
        print(f"TFLite model {i} saved at {quant_model_path}")

On a side note, whilst trying to find out what was going on I came across this: Quantization aware training | TensorFlow Model Optimization

I think this only works if you train the model from scratch. I couldn’t figure out how to use it for transfer learning, if that’s even possible.
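
For reference, the basic flow in that guide wraps an existing Keras model with fake-quantization nodes and then fine-tunes it. A minimal sketch is below, assuming the `tensorflow_model_optimization` package is installed; the model architecture, training data, and epoch count are placeholders:

    import tensorflow as tf
    import tensorflow_model_optimization as tfmot

    # Placeholder model, just to show the QAT flow.
    model = tf.keras.Sequential([
        tf.keras.Input(shape=(8,)),
        tf.keras.layers.Dense(16, activation='relu'),
        tf.keras.layers.Dense(4, activation='softmax'),
    ])

    # Wrap the model so training simulates int8 quantization,
    # then fine-tune it as usual.
    q_aware_model = tfmot.quantization.keras.quantize_model(model)
    q_aware_model.compile(optimizer='adam',
                          loss='sparse_categorical_crossentropy',
                          metrics=['accuracy'])
    # q_aware_model.fit(train_x, train_y, epochs=5)  # placeholder training data

    # The quantization-aware model then converts like any other Keras model.
    converter = tf.lite.TFLiteConverter.from_keras_model(q_aware_model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_qat_model = converter.convert()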

I only train custom models from scratch, so I’ll soon find out whether I can get it to work for me and evaluate the benefits, if any.