Hello. I’m trying to use an OpenMV Cam (H7 board, OV7725 sensor, firmware 4.1.1) for a motion recognition project. My first version of the code took 5 seconds to process a single frame, which is not acceptable. I started investigating and found that a single multiplication in a nested loop (4800 iterations) adds a 1.5 ms delay. At a 480 MHz core clock, that is about 150 cycles per multiplication. Here is the test code:
# Python speed test
import pyb, sensor, image, time, cpufreq
sensor.reset() # Reset and initialize the sensor.
sensor.set_pixformat(sensor.RGB565) # Set pixel format to RGB565 (or GRAYSCALE)
sensor.set_framesize(sensor.QQQVGA) # 80x60
sensor.skip_frames(time = 2000) # Wait for settings take effect.
clock = time.clock() # Create a clock object to track the FPS.
bzrp = pyb.Pin("P7",pyb.Pin.OUT_PP)
bzrn = pyb.Pin("P8",pyb.Pin.OUT_PP)
image_width = sensor.width()
image_height = sensor.height()
working_size = int(image_width*image_height)
r_buf = list((range(working_size)))
g_buf = list((range(working_size)))
b_buf = list((range(working_size)))
for i in range(working_size):
    r_buf[i] = 0
    g_buf[i] = 0
    b_buf[i] = 0
def bzr_toggle(): #piezo buzzer between P7 and P8
    if bzrp.value() == 0:
        bzrp.high()
        bzrn.low()
    else:
        bzrp.low()
        bzrn.high()
freqs = cpufreq.get_current_frequencies()
print("CPU",freqs[0],"MHz",", working size",working_size,"/ allocated",sensor.get_framebuffers(),"buffers")
cnt = 0
fps_rst = 10
# ------------------------------------------------------------------------
while(True):
    bzr_toggle() #to track FPS without IDE
    clock.tick() # Update the FPS clock.
    #img = sensor.snapshot() # Take a picture and return the image.
    imgarr = sensor.snapshot().bytearray() #this works faster than unpacking RGB tuple
    test = 0
    #test image processing, takes 91 ms (11 fps) in good lighting conditions
    for j in range(image_height): #60
        for i in range(image_width): #80
            test += i #adds 6ms
            test += j*3 #adds 8ms
            pixel_index = j*image_width+i
            byte_index = pixel_index<<1
            bt0 = imgarr[byte_index]
            bt1 = imgarr[byte_index+1]
            bt2 = imgarr[byte_index+1] #adds 7 ms, for example
            r_buf[pixel_index] += bt1&0xF8
            g_buf[pixel_index] += ((bt1&7)<<5)|((bt0&0xE0)>>3)
            b_buf[pixel_index] += (bt0&0x1F)<<3
    #track FPS and frame processing time
    cnt += 1
    fps = clock.fps()
    tms = 1000/fps
    time_msg = "{:.2f} fps, {:.2f} ms"
    print(cnt,">",time_msg.format(fps,tms))
    fps_rst -= 1
    if fps_rst == 0:
        clock.reset()
        fps_rst = int(fps)
        if fps_rst < 5: fps_rst = 5
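The per-pixel bit manipulation from the inner loop can be checked on its own, without the camera. The sketch below is plain Python and only isolates the masks used in the test code. The helper name `unpack_rgb565` and the 2-pixel sample buffer are made up for illustration, and the byte order (low byte at the even index) is an assumption inferred from those masks, not something confirmed against the sensor.

```python
def unpack_rgb565(bt0, bt1):
    """Split one RGB565 pixel, given as two bytes, into R, G, B values.

    bt0 is the byte at the even index and bt1 the byte after it,
    exactly as they are read in the inner loop of the test code.
    Each channel is left-aligned into 8 bits (low bits stay zero).
    """
    r = bt1 & 0xF8                                   # top 5 bits of the high byte
    g = ((bt1 & 0x07) << 5) | ((bt0 & 0xE0) >> 3)    # 3 bits from bt1 + 3 bits from bt0
    b = (bt0 & 0x1F) << 3                            # low 5 bits of the low byte
    return r, g, b


# Hypothetical 2-pixel buffer: pure red (0xF800), then pure blue (0x001F),
# stored low byte first (assumed byte order, see above).
imgarr = bytearray([0x00, 0xF8, 0x1F, 0x00])
pixels = [unpack_rgb565(imgarr[i], imgarr[i + 1]) for i in range(0, len(imgarr), 2)]
print(pixels)  # [(248, 0, 0), (0, 0, 248)]
```

If the colours come out swapped on real frames, the byte order assumption is the first thing to check.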
Processing a single 80x60 frame takes 91 ms. Maybe I’m making some obvious mistake, since I’m new to Python, but is this normal? Is there a way to improve performance by at least a factor of 10?
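To put the numbers in one place, here is the cycle arithmetic as a small plain-Python sketch. It uses only the figures quoted in this post (480 MHz clock, 4800 iterations, 1.5 ms and 91 ms delays); the function name `cycles_per_iteration` is made up for illustration.

```python
CPU_HZ = 480 * 10**6   # H7 core clock reported by cpufreq
PIXELS = 80 * 60       # QQQVGA frame, i.e. 4800 inner-loop iterations


def cycles_per_iteration(delay_s, iterations=PIXELS, cpu_hz=CPU_HZ):
    """Convert a measured loop delay in seconds into CPU cycles per iteration."""
    return delay_s * cpu_hz / iterations


mul_cycles = cycles_per_iteration(1.5e-3)    # delay added by one multiplication
pixel_cycles = cycles_per_iteration(91e-3)   # whole per-pixel loop body

print("cycles per multiplication:", round(mul_cycles))   # 150
print("cycles per pixel:", round(pixel_cycles))          # 9100
```

So the full loop body costs roughly 9100 cycles per pixel, and a 10x speedup would mean getting a frame down to about 9 ms.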