erode/dilate time consuming

ronOM · February 5, 2021, 3:21pm

Hello,

I was amazed by how quickly and easily a little OpenMV Cam finds blobs, so I decided to see whether I could use these cameras in my project.

I am trying to measure the dimensions of this kind of object:

After finding the blobs, I want to erode(4) and dilate(4) to remove the white wire, but that quickly slows the process down. Here are the computing times I measured for the code below:

Convert to binary: 1.73ms
Find blob (in grayscale): 1.04ms
Crop (from 34808px to 18696px): 0.82ms
img.erode(4): 9.59ms
img.dilate(4): 9.53ms
Find blob (in binary): 0.94ms
Measure: 0.76ms

A snapshot takes the camera 13.8ms, so the total cycle is about 38.2ms, and half of it is spent on the erode/dilate step.
In other terms:

without erode/dilate=52 FPS
with erode/dilate=26 FPS
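
The figures above can be cross-checked with a few lines of plain Python. This is only an arithmetic sketch using the timings quoted in this post; the variable names are illustrative and nothing here runs on the camera.

```python
# Per-step timings in milliseconds, copied from the list above.
steps_ms = {
    "binary": 1.73,
    "find_blobs_grayscale": 1.04,
    "crop": 0.82,
    "erode": 9.59,
    "dilate": 9.53,
    "find_blobs_binary": 0.94,
    "measure": 0.76,
}
snapshot_ms = 13.8

# Full cycle = snapshot + all processing steps.
cycle_ms = snapshot_ms + sum(steps_ms.values())
morph_ms = steps_ms["erode"] + steps_ms["dilate"]

fps_with = 1000 / cycle_ms
fps_without = 1000 / (cycle_ms - morph_ms)

print("cycle: %.1f ms, erode+dilate share: %.0f%%" % (cycle_ms, 100 * morph_ms / cycle_ms))
print("with: %.0f FPS, without: %.0f FPS" % (fps_with, fps_without))
```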

Is there something I can do to filter my image faster, or is erode/dilate slow by nature?

import sensor, image, time, math

streamname = "stream-c1.bin"
stream = image.ImageIO(streamname, "r")
px_threshold = 1641
px_Stride = 23
thresholdGS = [(40, 255)]
thresholdBIN = [(1,1)]
opening_px = 4
total_compute_time = 0
n = 1

while(True):
    img = stream.read(copy_to_fb=True, loop=True, pause=True)
    aftersnapshottime = time.ticks_us()
    # Binary image, find the blob, crop and image opening
    img.binary(thresholdGS,to_bitmap=True)
    for blob in img.find_blobs(thresholdBIN, pixels_threshold=px_threshold, x_stride=px_Stride, y_stride=px_Stride, merge=True):
        blobx, bloby, blobw, blobh = blob.rect()
        blobroi = (blobx,bloby,blobw,blobh)
        break
    img = img.crop(roi=blobroi)
    img.erode(opening_px)
    img.dilate(opening_px)
    # Measure blob axis
    for blob in img.find_blobs(thresholdBIN, pixels_threshold=px_threshold, x_stride=px_Stride, y_stride=px_Stride, merge=True):
        maja = blob.major_axis_line()
        lmaja = round(math.sqrt((maja[2]-maja[0])**2 + (maja[3]-maja[1])**2), 1)
        mina = blob.minor_axis_line()
        lmina = round(math.sqrt((mina[2]-mina[0])**2 + (mina[3]-mina[1])**2), 1)
        print("BlobGS |" + " P:" + str(blob.pixels()) + " L:" + str(lmaja) + " l:" + str(lmina))
        break
    total_compute_time = total_compute_time + time.ticks_diff(time.ticks_us(), aftersnapshottime)
    average_ellapsed_compute_time = total_compute_time / n
    print("CoMS: " + str(round(average_ellapsed_compute_time/1000,2)))
    n = n + 1

stream-c1.zip (746 KB)

kwagyeman · February 5, 2021, 6:19pm

Erode and dilate of 4 is a 9x9 convolution on the image. It’s really expensive.

The triple buffering fix coming later this month (or next) should double your FPS, if you can wait for it.

Please reduce the erode and dilate size and try out the threshold argument. This is a feature OpenCV doesn’t have but it’s quite nice for getting the result you want with a smaller kernel size.

ronOM · February 5, 2021, 8:14pm

OK.
I will wait for the triple buffering, then, to see if it fits my needs.
Nevertheless, erode/dilate(4) does not seem to be an efficient way to do the job, but for the moment it is the only way I have found to remove the wire from the image.

kwagyeman · February 5, 2021, 8:39pm

Use the threshold argument for these methods. It will do what you want.

Also, don’t convert the image to a bitmap. This is harder for the processor to work on. Even though the image is smaller, it has to pack and unpack bits. Leave the image as a grayscale image. This is the fastest.

E.g. remove “to_bitmap=True”

ronOM · February 5, 2021, 11:46pm

Indeed. It was counter-intuitive to me as I don’t know what is under the hood. Here are the differences:

Convert to binary: -0.18ms
Find blob (in grayscale): +0.08ms
Crop (from 34808px to 18696px): -0.49ms
img.erode(4): -1.83ms
img.dilate(4): -1.95ms
Find blob (in binary): +0.05ms
Measure: +0.01ms

I tried the threshold argument for erode; nothing happened until I set it to at least 50. And as I understand it, the threshold argument is meant for targeting small isolated blobs, not for removing hairy spikes from a blob.

import sensor, image, time, math

streamname = "stream-c1.bin"
stream = image.ImageIO(streamname, "r")
px_threshold = 1641
px_Stride = 23
thresholdGS = [(40, 255)]
thresholdBIN = [(128,255)]
opening_px = 4
total_compute_time = 0
n = 1

while(True):
    img = stream.read(copy_to_fb=True, loop=True, pause=True)
    aftersnapshottime = time.ticks_us()
    # Binary image, find the blob, crop and image opening
    img.binary(thresholdGS)
    for blob in img.find_blobs(thresholdBIN, pixels_threshold=px_threshold, x_stride=px_Stride, y_stride=px_Stride, merge=True):
        blobx, bloby, blobw, blobh = blob.rect()
        blobroi = (blobx,bloby,blobw,blobh)
        break
    img = img.crop(roi=blobroi)
    img.erode(opening_px)
    img.dilate(opening_px)
    # Measure blob axis
    for blob in img.find_blobs(thresholdBIN, pixels_threshold=px_threshold, x_stride=px_Stride, y_stride=px_Stride, merge=True):
        maja = blob.major_axis_line()
        lmaja = round(math.sqrt((maja[2]-maja[0])**2 + (maja[3]-maja[1])**2), 1)
        mina = blob.minor_axis_line()
        lmina = round(math.sqrt((mina[2]-mina[0])**2 + (mina[3]-mina[1])**2), 1)
        print("BlobGS |" + " P:" + str(blob.pixels()) + " L:" + str(lmaja) + " l:" + str(lmina))
        break
    total_compute_time = total_compute_time + time.ticks_diff(time.ticks_us(), aftersnapshottime)
    average_ellapsed_compute_time = total_compute_time / n
    print("CoMS: " + str(round(average_ellapsed_compute_time/1000,2)))
    n = n + 1

kwagyeman · February 7, 2021, 3:58am

It will remove hairy spikes too.

ronOM · February 9, 2021, 12:38am

I’m confused about what threshold is really meant to do. And there is no improvement in the computing time.

Here is my test code:

import image, time
thresholdBIN = [(128,255)]
img = image.Image("/Disk-Spikes-r00a-320x240-dots.pgm",copy_to_fb=True)
img.binary(thresholdBIN)
aftersnapshottime = time.ticks_us()
img.erode(1,threshold=7)
print("Time: " + str(round(time.ticks_diff(time.ticks_us(), aftersnapshottime)/1000,2)))
img.flush()
time.sleep_ms(500)
img.save("/Disk-Spikes-r00a-320x240-dots-erode_tests.pgm")

Original file:

img.erode(1) - OK understand:

img.erode(1,threshold=0) - the smaller dot is not removed but the second smallest is smaller?:

img.erode(1,threshold=1) - OK understand:

img.erode(1,threshold=2):

img.erode(1,threshold=3):

img.erode(1,threshold=4):

img.erode(1,threshold=5):

img.erode(1,threshold=6):

img.erode(1,threshold=7):

img.erode(1,threshold=8):

img.erode(1,threshold=9) - All set to black?:

img.erode(2) - OK understand:

img.erode(2,threshold=0) like img.erode(1,threshold=0) ?:

img.erode(2,threshold=1) and further threshold ?:

kwagyeman · February 9, 2021, 12:55am

So, erode and dilate both do this…

They look at a region of NxN pixels where N is ((arg*2)+1).

They sum the number of pixels that are not black in that region.

Then erode sets the pixel to black unless the sum is above threshold.

And dilate sets the pixel to white if the sum is above threshold.

The threshold argument allows you to tune how many neighbors per window need to be set/clear.

This is powerful because this allows you to make erode only function if any neighbor is not set.
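
As a rough illustration of the thresholded windowing described above, here is a minimal pure-Python model. It is a sketch, not the firmware code, and it rests on several assumptions: the window is (2*k+1) pixels wide (so a size of 4 gives the 9x9 window mentioned earlier), the centre pixel is not counted, pixels outside the image count as black, erode clears a white pixel when fewer than `threshold` neighbours are white, and dilate sets a black pixel when more than `threshold` neighbours are white. The function names and default values are chosen for this example only.

```python
def window_count(img, x, y, k):
    """Count white neighbours of (x, y) in the (2k+1)x(2k+1) window, centre excluded."""
    h, w = len(img), len(img[0])
    total = 0
    for yy in range(y - k, y + k + 1):
        for xx in range(x - k, x + k + 1):
            if (xx, yy) != (x, y) and 0 <= yy < h and 0 <= xx < w:
                total += img[yy][xx]
    return total


def erode(img, k, threshold=None):
    """Keep a white pixel only if at least `threshold` neighbours are white."""
    if threshold is None:
        threshold = (2 * k + 1) ** 2 - 1  # assumed default: every neighbour must be white
    return [[1 if px and window_count(img, x, y, k) >= threshold else 0
             for x, px in enumerate(row)] for y, row in enumerate(img)]


def dilate(img, k, threshold=0):
    """Set a black pixel if more than `threshold` neighbours are white."""
    return [[1 if px or window_count(img, x, y, k) > threshold else 0
             for x, px in enumerate(row)] for y, row in enumerate(img)]


# A 5x5 white block with a one-pixel-wide "wire" sticking out of its right side.
SIZE = 9
block = [[1 if 2 <= x <= 6 and 2 <= y <= 6 else 0 for x in range(SIZE)]
         for y in range(SIZE)]
with_wire = [row[:] for row in block]
with_wire[4][7] = with_wire[4][8] = 1

eroded = erode(with_wire, 1)   # shrinks the block to 3x3 and removes the wire
opened = dilate(eroded, 1)     # grows the block back to 5x5
print(opened == block)         # prints True
```

In this model, lowering the erode threshold makes the erosion more tolerant (fewer white neighbours are needed to survive), and raising the dilate threshold makes the dilation more selective.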

ronOM · February 9, 2021, 6:20pm

Thanks for that clarification.

But I think threshold is not useful for my case.

The best result for removing spikes comes from leaving the threshold at its default value, (2*arg+1)^2-1.
And the threshold doesn’t seem to have any impact on the computation time. During the convolution, imlib_erode_dilate() walks through the whole image anyway in order to compare against the threshold, and IMAGE_PUT_GRAYSCALE_PIXEL_FAST(buf_row_ptr, x, COLOR_GRAYSCALE_BINARY_MIN) costs nothing compared to that.

What I do find strange is that erode(4) is not that much more expensive than erode(1). On a 320x240 image, erode(4) costs 22.8ms and erode(1) costs 9.7ms, yet the kernel is 81 pixels in the first case and 9 in the second.

kwagyeman · February 9, 2021, 6:34pm

Yeah, we only load/unload the columns that change, which saves a lot of work. E.g. since only the first and last columns change as the kernel slides across the image, we only update the sum by 18 pixels for erode(4) versus 6 for erode(1).
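
The column trick can be sketched in plain Python as a running window sum. This is an illustrative model only, assuming pixels outside the image count as zero; the helper names are made up for the example and are not taken from the firmware.

```python
def column_sum(img, x, y, k):
    """Sum of column x between rows y-k and y+k (0 if the column is outside the image)."""
    h, w = len(img), len(img[0])
    if not 0 <= x < w:
        return 0
    return sum(img[yy][x] for yy in range(max(0, y - k), min(h, y + k + 1)))


def sliding_row_sums(img, y, k):
    """Window sums along row y, updated one column in and one column out per step."""
    w = len(img[0])
    acc = sum(column_sum(img, x, y, k) for x in range(-k, k + 1))  # full window once
    sums = [acc]
    for x in range(1, w):
        acc += column_sum(img, x + k, y, k)       # column entering on the right
        acc -= column_sum(img, x - k - 1, y, k)   # column leaving on the left
        sums.append(acc)
    return sums


def brute_row_sums(img, y, k):
    """Reference version: re-sum the whole (2k+1)x(2k+1) window at every pixel."""
    h, w = len(img), len(img[0])
    return [sum(img[yy][xx]
                for yy in range(max(0, y - k), min(h, y + k + 1))
                for xx in range(max(0, x - k), min(w, x + k + 1)))
            for x in range(w)]


def pixels_touched_per_step(k):
    """One entering column plus one leaving column, each 2k+1 pixels tall."""
    return 2 * (2 * k + 1)


# Small deterministic test pattern.
img = [[1 if (x * x + 3 * y) % 3 == 0 else 0 for x in range(16)] for y in range(12)]
for k in (1, 4):
    assert all(sliding_row_sums(img, y, k) == brute_row_sums(img, y, k)
               for y in range(len(img)))
    print(k, pixels_touched_per_step(k), (2 * k + 1) ** 2)
```

Under this model the per-step work grows from 6 to 18 pixels (3x) between erode(1) and erode(4), rather than from 9 to 81 (9x), which is much closer to the measured 9.7ms versus 22.8ms.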

ronOM · February 18, 2021, 4:55pm

As I’m only working in triggered mode, I won’t see any difference with the triple buffering. Is that right?

kwagyeman · February 18, 2021, 6:36pm

Yeah, it won’t because you’d want to control the frame capture.
