I was updating all of my ML tools and upgraded to OpenMV 4.5.6.
In doing so I noticed that TFLite models created and quantized with TensorFlow 2.8.2 still worked, but with the latest TensorFlow 2.17 I kept getting failures when loading them onto OpenMV. With this flag in my script before the quantization step I was able to load the model successfully:
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter._experimental_disable_per_channel_quantization_for_dense_layers = True
See the TensorFlow release notes.
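If you want to check what the converter actually did, here is a small sketch (standard tf.lite.Interpreter API; the model path is just a placeholder) that lists which tensors ended up with per-channel quantization, i.e. more than one scale:

import tensorflow as tf

# Inspect the converted model's tensors (the path is a placeholder).
interpreter = tf.lite.Interpreter(model_path="model.tflite")
for detail in interpreter.get_tensor_details():
    scales = detail["quantization_parameters"]["scales"]
    if len(scales) > 1:
        # More than one scale means this tensor is quantized per-channel.
        print(f"per-channel ({len(scales)} scales): {detail['name']}")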
Thanks for posting about this.
I’m guessing the new TFLite feature could be useful and the OpenMV firmware could be made compatible with it, but I have no idea where to start looking. On load I immediately got an "unable to allocate tensor" message and that was it.
FWIW this is how I quantize models and this works with the old and new tflite-micro we’re using:
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.target_spec.supported_types = [tf.int8]
converter.inference_input_type = tf.uint8
converter.inference_output_type = tf.uint8
tflite_model = converter.convert()
with open('model.tflite', 'wb') as f:
    f.write(tflite_model)
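As a quick sanity check after converting (a small sketch using only the standard tf.lite.Interpreter API, nothing OpenMV-specific), you can confirm that the input and output tensors really came out as uint8 before copying the file over:

import tensorflow as tf

# Load the freshly converted model from memory and inspect its I/O tensors.
interpreter = tf.lite.Interpreter(model_content=tflite_model)
interpreter.allocate_tensors()
print("input dtype: ", interpreter.get_input_details()[0]["dtype"])   # expect uint8
print("output dtype:", interpreter.get_output_details()[0]["dtype"])  # expect uint8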
I’m using TensorFlow 2.8.4 though; I haven’t tried this with a newer version.
For me it’s identical even on the newest TF version; I just added
converter._experimental_disable_per_channel_quantization_for_dense_layers = True
for i in range(NUMBER_OF_MODELS_SAVE):
    model_path = os.path.join(save_dir, f"model_{i}.h5")
    model = tf.keras.models.load_model(model_path)
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    converter._experimental_disable_per_channel_quantization_for_dense_layers = True
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    converter.representative_dataset = representative_dataset
    converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
    converter.inference_input_type = tf.int8
    converter.inference_output_type = tf.int8
    tflite_quant_model = converter.convert()
    quant_model_path = os.path.join(PROCESSED_DIR, f"model_{i}_quantized.tflite")
    with open(quant_model_path, 'wb') as f:
        f.write(tflite_quant_model)
    print(f"TFLite model {i} saved at {quant_model_path}")
On a side note, whilst trying to find out what was going on I came across this: Quantization aware training | TensorFlow Model Optimization
I think this only works if you train the model from scratch. I couldn’t figure out how to use that for transfer learning, if it’s possible.
I only train custom models from scratch, so I will soon find out whether I can get it to work for me and evaluate the benefits, if any.
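For reference, a minimal quantization aware training sketch along the lines of that guide, assuming the tensorflow-model-optimization package, a plain Keras model built from scratch (the model, train_images and train_labels names are placeholders), and layers that tfmot supports:

import tensorflow as tf
import tensorflow_model_optimization as tfmot

# model = ...  # your float Keras model, built from scratch (placeholder)

# Wrap the whole model with fake-quantization nodes for QAT.
q_aware_model = tfmot.quantization.keras.quantize_model(model)
q_aware_model.compile(optimizer='adam',
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])

# Fine-tune briefly so the quantization parameters settle (data names are placeholders).
q_aware_model.fit(train_images, train_labels, epochs=1, validation_split=0.1)

# Convert as usual; the converter picks up the learned quantization parameters.
converter = tf.lite.TFLiteConverter.from_keras_model(q_aware_model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_qat_model = converter.convert()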