Image processing performance improvements

I am performing a handful of operations to simplify the frame and then find blobs that appear to be a ball. I iterate through the blobs later in the loop, but most of the frame time appears to be spent in the image-processing portion. Here are the operations I am performing, currently at QVGA resolution. Ideally I’d like to increase to VGA, but that drops my frame rate significantly. I believe the close operation is the most taxing - are there any thoughts on how I can increase performance?

QVGA gets me about 25 FPS on the RT1062.

        img2.replace(img) # Save copy of frame for processing, the original is displayed on lcd
        img2.difference(extra_fb) # Get frame difference (from a copy made earlier)
        img2.binary([(binaryThresholdLow,binaryThresholdHigh)]) # Convert to binary, (25,255)
        img2.close(closeImageValue,threshold=closeImageThreshold) # Value is 2, threshold is 6

Yeah, I have some PRs open on the main repo right now that will 4x the performance of difference. Close/open though have not been improved in performance. Same for binary. There’s not much that can speed those up.

The biggest culprit is the lack of DMA offload by the camera sensor driver. I will be able to start work on that once another PR for sensor driver features is merged. This will massively improve performance at VGA resolution.

At the moment, though, it won’t be any faster for a while. The H7 Plus has an optimized camera driver, so you should see a great deal more speed on that platform in the meantime.

I’m able to get upwards of 35 FPS at HVGA with everything except the close function. Do you have any suggestions for a more performant alternative to closing an image? Maybe taking a step back to what I am trying to do: I have a greyscale image of a ball in flight; I detect that it is in flight by using frame differencing and converting to binary for easy blob detection. The ball has some texture, so it sometimes breaks into multiple blobs. I’ve opted not to merge blobs because merging can combine the ball with a moving person. The close operation has been reliable, but now that I have stepped up from QVGA to HVGA (and ideally eventually to VGA), the frame rate is limiting.

The proper way to do this would be through something called the CamShift operation. However, we don’t support that OpenCV method just yet.

So, what are you tracking exactly? Blobs via find_blobs()? What I’m getting at is that it costs nothing to merge detections in software using their blob attributes. E.g., can you work with a noisier list of detections than the one you’d get after using close()?
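To illustrate the idea, here’s a hypothetical sketch of merging noisy detections in software rather than relying on close(). It reduces each blob to a plain (x, y, w, h) tuple (the same shape OpenMV’s blob.rect() returns) so the merge logic is testable anywhere; the function names are made up for this example.

```python
# Hypothetical sketch: merge nearby detection rects in software instead of
# calling close() on the image. Rects are (x, y, w, h) tuples.

def rects_touch(a, b, margin=2):
    # True if the two rects overlap or sit within `margin` pixels of each other.
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return (ax - margin < bx + bw and bx - margin < ax + aw and
            ay - margin < by + bh and by - margin < ay + ah)

def union(a, b):
    # Smallest rect covering both a and b.
    x = min(a[0], b[0])
    y = min(a[1], b[1])
    x2 = max(a[0] + a[2], b[0] + b[2])
    y2 = max(a[1] + a[3], b[1] + b[3])
    return (x, y, x2 - x, y2 - y)

def merge_rects(rects, margin=2):
    # Repeatedly merge any pair of touching rects until none remain.
    rects = list(rects)
    changed = True
    while changed:
        changed = False
        out = []
        while rects:
            r = rects.pop()
            i = 0
            while i < len(rects):
                if rects_touch(r, rects[i], margin):
                    r = union(r, rects.pop(i))
                    changed = True
                else:
                    i += 1
            out.append(r)
        rects = out
    return rects
```

You could then drop any merged rect whose area exceeds a person-sized cutoff before treating the rest as ball candidates, which gives you the selectivity that a blanket close() can’t.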

We will eventually get all of this code optimized over the coming year, but right now it’s not as fast as it can be. That said, let me give you a binary with the new line ops enabled. difference(), as I mentioned, is 4x faster. This might help, since close() uses difference internally, so it’s effectively being called twice.

Here’s an example where merged blobs are an issue. Is there a good way to be selective about which blobs to merge? I know there is a minimum size, but I’d want to selectively filter out large blobs, like a person.

Hi, here’s a firmware with the new line-ops PR (modules/py_image: Optimize and cleanup all math and binary line ops. by kwagyeman · Pull Request #2061 · openmv/openmv), compiled for the RT1062.

You should see a 4x speedup with difference(). The PR switches the implementation over to Cortex SIMD instructions.

As for dealing with the blobs in that image.

This is simple: ignore blobs that are too big. You can still use the merge=True argument with find_blobs(), by the way. Just point the callback at a Python method that looks at each blob and filters it out if it’s too large. See the threshold_cb argument that find_blobs() supports.
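A minimal sketch of that filtering idea, in pure Python so it runs off-device (the MAX_BALL_AREA cutoff is a made-up value; on the camera the callback would receive a blob object and call blob.area()):

```python
MAX_BALL_AREA = 400  # hypothetical cutoff; tune so a person-sized blob exceeds it

def small_enough(area):
    # On-device this becomes: def cb(blob): return blob.area() <= MAX_BALL_AREA,
    # passed as threshold_cb= to find_blobs() so oversized blobs are rejected
    # before merge=True combines anything.
    return area <= MAX_BALL_AREA

candidate_areas = [120, 80, 5000, 200]  # 5000 ~ a moving person
ball_candidates = [a for a in candidate_areas if small_enough(a)]
```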

Is there a way to compile a firmware that includes the changes in Image processing performance improvements - #6 by kwagyeman?

I added a callback that appears to be working, but perhaps too well for my use case (see below). Without any close operation, my head (and even my beard) ends up separated into its own smaller blob, and since it is then under the threshold it is never joined with the rest of me to be filtered out. I suspect I can dial in my binary thresholds, but I worry that this might be error-prone as different people/clothing become the subject. Any thoughts on this?

def area_filter(blob):
    return minBallArea <= blob.area() <= maxBallArea

for blob in img2.find_blobs(
        [thresholds], pixels_threshold=minBallPixels, area_threshold=minBallArea,
        merge=True, margin=2, x_stride=stride, y_stride=stride, threshold_cb=area_filter):
    # ... per-blob tracking logic ...

Is this a valid statement?

return minBallArea <= blob.area() <= maxBallArea

Chained comparisons like that are actually valid Python, so the statement works as written. It’s equivalent to the more explicit:

return (minBallArea <= blob.area()) and (blob.area() <= maxBallArea)
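Either form behaves the same. A quick off-device check, using plain numbers as stand-ins for blob.area() and the min/max area variables:

```python
lo, hi = 50, 400  # stand-ins for minBallArea / maxBallArea

def chained(a):
    # Python evaluates lo <= a <= hi as (lo <= a) and (a <= hi).
    return lo <= a <= hi

def explicit(a):
    return (lo <= a) and (a <= hi)

# Compare the two forms for values below, on, inside, and above the range.
results = [(chained(a), explicit(a)) for a in (10, 50, 200, 400, 1000)]
```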

Yeah, you just follow the build guide on GitHub for how to compile the firmware, then check out the branch and build it.