Load .bmp file from flash memory

Hello!
I've saved a numpy array (400, 400, 3) to a .bmp file, and the format is RGB565. But when I tried to load the .bmp file from flash memory and create an Image object, I found the byte length of the Image object is quite weird (320,000 instead of the 480,000 I expected for the image).

    img = image.Image('sample_4.bmp', buffer=bytes, copy_to_fb=True)
    print('img size: ', img.size())
    # Preprocess
    h = img.height()
    w = img.width()
    print('H x W: ', h, w)
    x_scale = self.crop_size / h
    y_scale = self.crop_size / w
    img.scale(x_scale=x_scale,
              y_scale=y_scale,
              roi=None,
              rgb_channel=-1,
              alpha=256,
              color_palette=None,
              alpha_palette=None,
              hint=0,  # nearest-neighbor interpolation
              copy=False,
              copy_to_fb=False)

    # Image2array
    img_array = np.frombuffer(img, dtype=np.int8)
    print(img_array.shape)

And the output of print shows:

    img size: 320000
    H x W: 400 400
    (20000,)

Why is img.size() 320,000 instead of 400 × 400 × 3 = 480,000? And why, after scaling, does the byte length become 20,000 instead of 30,000 (which looks as if one of the RGB channels is missing)?
PS: The following Python script is how I generate my .bmp files:

    def save_bmp(filename, img):
        # numpy image arrays are (height, width, channels)
        height, width, channels = img.shape
        # BMP rows are padded up to a 4-byte boundary
        row_size = (width * channels + 3) // 4 * 4
        # NOTE: tobytes() emits unpadded rows; see the padding sketch below
        pixel_data = img.tobytes()
        bmp_header = struct.pack('<2sIHHIIIIHHIIIIII',
            b'BM',                    # BMP signature
            54 + row_size * height,   # total file size
            0, 0,                     # reserved
            54,                       # offset to pixel data
            40,                       # BITMAPINFOHEADER size
            width, height,            # image dimensions in pixels
            1,                        # color planes
            channels * 8,             # bits per pixel (24 for RGB888)
            0,                        # compression (BI_RGB, uncompressed)
            row_size * height,        # pixel data size
            0,                        # horizontal resolution
            0,                        # vertical resolution
            0,                        # palette colors
            0)                        # important colors
        with open(filename, 'wb') as f:
            f.write(bmp_header)
            f.write(pixel_data)
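
One caveat in save_bmp above: row_size rounds each row up to a 4-byte boundary as the BMP format requires, but img.tobytes() writes unpadded rows, so the two only agree when width * channels is already a multiple of 4 (true for 400×400 and 100×100 RGB images). A minimal padding-aware sketch, assuming a numpy (H, W, C) uint8 array:

    def write_padded_rows(f, img, row_size):
        # Pad each pixel row out to the 4-byte boundary BMP expects.
        height, width, channels = img.shape
        pad = b'\x00' * (row_size - width * channels)
        for row in img:  # each row has shape (width, channels)
            f.write(row.tobytes())
            f.write(pad)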

    for i in range(num_samples):
        img = torch.tensor(extracted_images[i], dtype=torch.float32)
        img = fivecrop_scale(img, crop_size=450)
        img = img.permute(1, 2, 0)  # CHW -> HWC
        img_array = img.numpy()
        print('Permuted Image Shape:', img.shape)
        img_bmp = cv2.flip(img_array, 0)  # BMP stores rows bottom-up
        img_bmp = (img_bmp * 255).astype(np.uint8)
        print("Image shape:", img_bmp.shape)  # (height, width, 3)
        save_bmp(f'/data/sample_{i+1}.bmp', img_bmp)

Hi, the way you are trying to use the Image object isn’t supported.

    img = image.Image('sample_4.bmp', buffer=bytes, copy_to_fb=True)

The buffer argument is not applicable here, so the file will just be loaded from disk into the frame buffer because you passed copy_to_fb.

(400, 400, 3) is an RGB888 image: 480,000 bytes at 3 bytes per pixel. RGB565 is 2 bytes per pixel, so (400, 400) at 2 bytes per pixel == 320,000 bytes.
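
For reference, here is the arithmetic, plus how an RGB888 triplet packs into a single RGB565 pixel (a standalone sketch, not an OpenMV API call):

    print(400 * 400 * 3)  # 480000 -> RGB888, 3 bytes per pixel
    print(400 * 400 * 2)  # 320000 -> RGB565, 2 bytes per pixel

    # RGB565 keeps 5 bits of red, 6 of green, 5 of blue in one 16-bit word
    def rgb888_to_rgb565(r, g, b):
        return ((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3)

    print(hex(rgb888_to_rgb565(255, 255, 255)))  # 0xffff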

Thank you for your feedback and suggestions! I appreciate your insights regarding the following points:

  1. Image Format:

    print("Image shape:", img_bmp.shape)  # (height, width, 3)
    save_bmp(f'/data/sample_{i+1}.bmp', img_bmp)

Before saving my data to the .bmp file, I checked that the dimensions of the image are (H, W, 3), so I wonder why the following loading method still leads to a 2-channel Image object (RGB565):

    img = image.Image('sample_4.bmp', copy_to_fb=True)
    print('img size: ', img.size())  # prints 20000 for my saved (100,100,3) .bmp image
  2. Load a 3-channel RGB image with a specific height and width:

Is there a better way to load a 3-channel RGB image with a specified height and width for input into a neural network, aside from using the .bmp format? Loading .png/.jpg results in a reshape error because the compressed byte length is too short (for a (100, 100, 3) image, the byte length is usually smaller than 30,000).

    # Load image from .png file
    # and scale the image to a specific height and width
    img = image.Image('sample_3.png', copy_to_fb=True)
    print('img size: ', img.size())
    # Preprocess
    h = img.height()
    w = img.width()
    print('H x W: ', h, w)
    x_scale = self.crop_size / h
    y_scale = self.crop_size / w
    img.scale(x_scale=x_scale,
              y_scale=y_scale,
              roi=None,
              rgb_channel=-1,
              alpha=256,
              color_palette=None,
              alpha_palette=None,
              hint=0,  # nearest-neighbor interpolation
              copy=False,
              copy_to_fb=False)
    # Image2array
    img_array = np.frombuffer(img, dtype=np.int8)
    print(img_array.shape)
    del img
    img_array = img_array.reshape((1, 3, 100, 100))

An error occurs because the size of img_array is smaller than 30,000.

Hi, RGB565 is not 2 channels. It’s 2 bytes per pixel. RGB888 is 3 bytes per pixel.

We do not support RGB888 in our firmware except to load it and convert it to RGB565; when you cast the RGB565 image to a ulab array, it becomes a 3-channel ndarray. However, our code otherwise does everything using RGB565, as it's more efficient data-processing-wise: each pixel is 16 bits, so we can work on multiple pixels at a time in a 32-bit register or a 128-bit vector register.
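
As a quick illustration of that cast (a sketch assuming recent OpenMV firmware; to_ndarray is the same method used in a later snippet in this thread, and the (H, W, 3) result shape is inferred from the transpose applied there):

    import image

    img = image.Image('sample_4.bmp', copy_to_fb=True)  # stored internally as RGB565
    print(img.size())                # W * H * 2 bytes while it lives as RGB565

    arr = img.to_ndarray(dtype="b")  # unpacks each 16-bit pixel into 3 channels
    print(arr.shape)                 # (H, W, 3)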

Regarding neural network conversion: we have a preprocessing module that does this for you: ml.preprocessing — ML Preprocessing — MicroPython 1.24 documentation

What are you trying to do exactly? If it’s to pass the image object to the ml module you just have to do:

    img = image.Image('sample_4.bmp', copy_to_fb=True)
    list_of_ndarray_tensors = ml.predict([img])

The normalization object is automatically created for you inside the predict call.
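
A slightly fuller sketch of that flow (the model file name here is hypothetical, and this assumes the ml module's Model class with its predict method):

    import ml
    import image

    model = ml.Model('trained.tflite')  # hypothetical model file
    img = image.Image('sample_4.bmp', copy_to_fb=True)

    # predict takes a list of inputs; the image is normalized for you
    # inside the call, as noted above.
    outputs = model.predict([img])
    print(outputs)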


Thank you so much! I've tried this before, but the input shape is not supported for my model, since the model requires an input of shape (1, 3, H, W).
I tried another method for loading images from .bmp files into my model, and it works, though the prediction accuracy seems poor.

    def __getitem__(self):
        if self.index >= len(self.image_files):
            self.index = 0  # wrap around to the first sample
        img_file = self.image_files[self.index]
        print('loading')
        img = image.Image(img_file, copy_to_fb=False)  # buffer arg dropped; not applicable
        # Preprocess
        h = img.height()
        w = img.width()
        x_scale = self.crop_size / h
        y_scale = self.crop_size / w
        img.scale(x_scale=x_scale,
                  y_scale=y_scale,
                  roi=None,
                  rgb_channel=-1,
                  alpha=256,
                  color_palette=None,
                  alpha_palette=None,
                  hint=image.BILINEAR,
                  copy=False,
                  copy_to_fb=False)
        # Image2array: transpose reverses the axes, (H, W, C) -> (C, W, H),
        # giving the channels-first layout the model expects (square crops here)
        img_nd = img.to_ndarray(dtype="b", buffer=None).transpose()
        # Add a leading batch dimension
        img_array = np.zeros((1, img_nd.shape[0], img_nd.shape[1], img_nd.shape[2]),
                             dtype=np.int8)
        img_array[0, :, :, :] = img_nd.copy()
        del img, img_nd
        self.index += 1

        valid_samples = []
        valid_samples.append(img_array)
        del img_array
        return valid_samples

Here’s how the normalization code works: openmv/scripts/libraries/ml/ml/preprocessing.py at master · openmv/openmv · GitHub

You can make your own wrapper class that’s customized and pass it to predict.
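
A rough, hypothetical sketch of such a wrapper, just to show the shape-handling idea; the exact interface predict expects should be taken from preprocessing.py at the link above:

    # Hypothetical wrapper; mirror the real interface from preprocessing.py.
    class ChannelsFirst:
        def __init__(self, crop_size):
            self.crop_size = crop_size

        def __call__(self, img):
            # Scale the image, then return the (1, 3, H, W) ndarray the model wants.
            img.scale(x_scale=self.crop_size / img.width(),
                      y_scale=self.crop_size / img.height())
            nd = img.to_ndarray(dtype="b").transpose()  # (C, W, H), channels first
            return nd.reshape((1, 3, self.crop_size, self.crop_size))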

Please see the documentation on predict for more information about what's going on.
