GenX320: High-Rate Event logging on H7 Plus/RT1062 (SD blackouts and throughput limits)

Hi everyone,

Over the past few weeks I have been working with the H7 Plus and the RT1062 together with the GenX320 sensor. I can record data in both histogram mode and event mode. For my research I need the raw events with timestamps (x, y, t, polarity), so histogram mode is not enough. The problem is that the number of events per second is so high that my current SD logging setup produces blackouts.

Some context

The use case is a flapping-wing drone. The flapping motion, plus the forward motion, creates a lot of events whenever the wings move. To avoid losing data, I now need to tune the sensor biases in a conservative way, which reduces how much real detail I can record.

With these settings:

DIFF_OFF: 44 DIFF_ON: 60 HPF: 55 FO: 19 REFR: 0

…I already get around 500,000 events/s.

For research I don’t want to filter much more, because I need to stay within Prophesee’s recommended ranges:
https://docs.prophesee.ai/stable/hw/manuals/biases.html#bias-ranges

Boards I tested:

  • OpenMV RT1062

  • OpenMV H7 Plus

Sensor: GenX320
Mode: Event mode

What I need

I want to log as many events as possible, for as long as possible, on these boards. I want to understand the real bottleneck and what the realistic upper limit is for logging raw events to an SD card.

Below is what I have tried so far.

What I tried

  • (A) increasing the deep event buffer

  • (B) writing to the SD card as fast as possible

Current script for reference

import os, time
import csi
from ulab import numpy as np

# run configuration
RUNID = 'A'
REC_DURATION_S = 10          # total record time (s)
SENSOR_BUFFER = 65536 * 1    # sensor buffer (events); with modded firmware up to 65536 * 4 is possible
START_TIME = 1               # delay before recording (s)

# GENX320 biases
BIASES = {"DIFF_OFF": 44, "DIFF_ON": 60, "HPF": 55, "FO": 19, "REFR": 0} #conservative settings for flapper

# # default values
# BIASES = {"DIFF_OFF": 28, "DIFF_ON": 25, "HPF": 40, "FO": 34, "REFR": 10}


# sd card check
if 'sdcard' not in os.listdir('/'):
    raise Exception("Error: No SD card found")

# create file
count = sum(1 for f in os.listdir('/sdcard/') if RUNID in f) + 1
filename = f"/sdcard/{SENSOR_BUFFER}_{RUNID}_{count}_{REC_DURATION_S}s.bin"
file = open(filename, "wb")

# init sensor
csi0 = csi.CSI(cid=csi.GENX320)
csi0.reset()

# allocating the biggest sensor buffer possible
def alloc_events(n):
    while True:
        try:
            return np.zeros((n, 6), dtype=np.uint16)
        except MemoryError:
            if n <= 1024:
                raise
            n //= 2
            print("MemoryError: reducing SENSOR_BUFFER to", n)

events = alloc_events(int(SENSOR_BUFFER))

# config event mode and biases
csi0.ioctl(csi.IOCTL_GENX320_SET_MODE, csi.GENX320_MODE_EVENT, events.shape[0])
csi0.ioctl(csi.IOCTL_GENX320_SET_BIAS, csi.GENX320_BIAS_DIFF_OFF, BIASES["DIFF_OFF"])
csi0.ioctl(csi.IOCTL_GENX320_SET_BIAS, csi.GENX320_BIAS_DIFF_ON,  BIASES["DIFF_ON"])
csi0.ioctl(csi.IOCTL_GENX320_SET_BIAS, csi.GENX320_BIAS_HPF,      BIASES["HPF"])
csi0.ioctl(csi.IOCTL_GENX320_SET_BIAS, csi.GENX320_BIAS_FO,       BIASES["FO"])
csi0.ioctl(csi.IOCTL_GENX320_SET_BIAS, csi.GENX320_BIAS_REFR,     BIASES["REFR"])

time.sleep(START_TIME)

print("Starting event recording...")
print("Sensor buffer (events):", events.shape[0])

event_count = 0
start_ms = time.ticks_ms()
deadline = time.ticks_add(start_ms, REC_DURATION_S * 1000)  # wrap-safe; pairs with ticks_diff below

try:
    while time.ticks_diff(time.ticks_ms(), deadline) < 0:
        n = csi0.ioctl(csi.IOCTL_GENX320_READ_EVENTS, events)
        if n > 0:
            file.write(memoryview(events)[:n * 6])  # 6 uint16 per event
            event_count += n
finally:
    file.flush()
    os.sync()
    file.close()

elapsed_ms = time.ticks_diff(time.ticks_ms(), start_ms)
eps = (event_count * 1000) // elapsed_ms if elapsed_ms > 0 else 0
print("Done.")
print(f"Events saved to {filename}")
print(f"Duration: {elapsed_ms/1000:.3f}s | Total events: {event_count} | ~{eps} ev/s")
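For offline analysis, the file written by the script above can be read back in fixed-size records. The following is a minimal sketch in desktop Python, assuming each event is stored as 6 little-endian uint16 values (12 bytes), as the write call in the script suggests. The name `read_events` is hypothetical, and the meaning of the six columns is not stated in this post, so it should be checked against the GenX320 driver documentation.

```python
import struct

# Assumption: one event = 6 little-endian uint16 values = 12 bytes.
EVENT_FORMAT = "<6H"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)

def read_events(data):
    """Split a bytes object into 6-tuples, ignoring a trailing partial record."""
    usable = len(data) - (len(data) % EVENT_SIZE)
    return [
        struct.unpack_from(EVENT_FORMAT, data, offset)
        for offset in range(0, usable, EVENT_SIZE)
    ]

# Typical use on the PC (the path is a placeholder):
#   with open("recording.bin", "rb") as f:
#       events = read_events(f.read())
```

Ignoring a trailing partial record keeps the reader usable on files from a recording that was cut off mid-write.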

(A) Increasing the deep event buffer

The first thing I tried was using the maximum supported event buffer of 65536. I don’t care about delay between sensor readout and file write. I only care about capturing all events with correct timestamps.

Result:
Still blackouts.
A bigger buffer reduces how often they happen, but the blackouts become longer.

My base logging loop looked like this:

try:
    while time.ticks_diff(time.ticks_ms(), deadline) < 0:
        n = csi0.ioctl(csi.IOCTL_GENX320_READ_EVENTS, events)
        if n > 0:
            file.write(memoryview(events)[:n * 6])  # 6 uint16 per event
            event_count += n
finally:
    file.flush()
    os.sync()
    file.close()

Then I modified the firmware (genx320.c) to allow larger buffers:

if (ndarray_size < 1024 || ndarray_size > 65536)

I changed this limit to allow 65536×2 and 65536×4.
This only worked on the H7 Plus, because it has much more usable RAM for MicroPython and for CSI buffers. The RT1062 does not have enough RAM for these larger buffers.

Result:
Fewer blackouts overall, but the stalls become longer.
At 65536×4 the SD card becomes the bottleneck.
The sensor fills the buffer faster than I can flush it to SD.
Every 1–2 seconds there is a blackout.

Question A: Is there anything I am misunderstanding about this logic, or anything else I could try to change here that would reduce blackouts?


(B) Writing to the SD card as fast as possible

Because the input event rate is high and constant (~500k events/s), the write speed and especially the stall time of the SD card become the real bottleneck.
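To put numbers on this, here is a quick back-of-the-envelope calculation. It assumes 12 bytes per event (6 uint16 values, as written by the script above) and the ~500k events/s rate quoted in this post; the variable names are only for illustration.

```python
EVENT_RATE = 500_000        # events per second, from the bias settings above
BYTES_PER_EVENT = 6 * 2     # 6 uint16 values per event
BUFFER_EVENTS = 65536       # stock maximum event buffer

required_mb_s = EVENT_RATE * BYTES_PER_EVENT / 1e6   # sustained write rate needed
buffer_bytes = BUFFER_EVENTS * BYTES_PER_EVENT       # size of one full buffer
buffer_fill_s = BUFFER_EVENTS / EVENT_RATE           # time to fill one buffer

print("required throughput: %.1f MB/s" % required_mb_s)
print("one buffer: %d bytes, filled in %.3f s" % (buffer_bytes, buffer_fill_s))
```

Under these assumptions the card has to sustain about 6 MB/s, and one full 65536-event buffer represents roughly 0.13 s of data, so any pause in writing has to be absorbed by buffering.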

I tested A1 and A2 SD cards.
A2 cards are clearly faster on the H7+.

SD write speed tests

Chunk Size | A2 Speed (MB/s) | A2 Max Stall (µs) | A1 Speed (MB/s) | A1 Max Stall (µs)
4 KB       | 3.51            | 5251              | 1.28            | 95541
16 KB      | 8.47            | 4625              | 4.32            | 55863
64 KB      | 8.46            | 11403             | 5.14            | 65719
256 KB     | 8.23            | 40211             | 5.69            | 94675
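For anyone who wants to reproduce this kind of measurement, below is a minimal sketch of one way to time chunked writes. It is not necessarily the exact script used for the numbers above. The function name `bench_write` is hypothetical, and the timer falls back to `time.perf_counter_ns` so the same code also runs on a desktop Python.

```python
import io
import time

try:
    _now_us = time.ticks_us          # MicroPython
    _diff = time.ticks_diff
except AttributeError:               # desktop Python fallback
    _now_us = lambda: time.perf_counter_ns() // 1000
    _diff = lambda a, b: a - b

def bench_write(f, chunk_size, total_bytes):
    """Write total_bytes to f in chunk_size pieces.

    Returns (throughput in MB/s, longest single write in microseconds).
    """
    chunk = bytes(chunk_size)
    written = 0
    max_stall_us = 0
    start = _now_us()
    while written < total_bytes:
        t0 = _now_us()
        f.write(chunk)
        max_stall_us = max(max_stall_us, _diff(_now_us(), t0))
        written += chunk_size
    total_us = max(_diff(_now_us(), start), 1)   # avoid division by zero
    return written / total_us, max_stall_us      # bytes per µs equals MB/s

# Example against an in-memory file; on the board, pass a file opened on the SD card.
speed, stall = bench_write(io.BytesIO(), 16 * 1024, 1024 * 1024)
print("%.2f MB/s, max stall %d us" % (speed, stall))
```

Tracking the longest single write separately from the average matters here, because the worst-case pause is what decides how much buffering is needed.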

The A2 card is better in two ways:

  • Higher sustained throughput

  • Much shorter worst-case stalls

The stall time matters a lot.
A long stall means the sensor FIFO overflows, resulting in a blackout.
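As a rough illustration of why the stall time matters, the sketch below converts a stall duration into the number of events that arrive while the card is busy. It assumes the ~500k events/s rate and 12 bytes per event from earlier in this post; `events_during_stall` is a hypothetical helper name.

```python
EVENT_RATE = 500_000      # events per second
BYTES_PER_EVENT = 12      # 6 uint16 values per event

def events_during_stall(stall_us, rate=EVENT_RATE):
    """Number of events produced while a write call is blocked for stall_us."""
    return rate * stall_us // 1_000_000

for label, stall_us in (
    ("A2, 16 KB chunks", 4625),
    ("A2, 256 KB chunks", 40211),
    ("A1, 4 KB chunks", 95541),
):
    n = events_during_stall(stall_us)
    print(label, "->", n, "events,", n * BYTES_PER_EVENT, "bytes to hold")
```

These figures only cover the stalls seen in the benchmark; stalls during a real recording may be longer.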

Question B: Is there anything I can do to write faster to SD cards on either board? Or other card types worth trying?

Trying an extra buffer

I also tried adding an intermediate write buffer between the event buffer and the SD card. However, this reduces the amount of memory MicroPython has left for the event buffer and also reduces frame buffer space. It didn’t solve the stalls.


Where I am now

  • Increasing the sensor buffer helps, but only shifts the problem

  • A2 cards help, but stall times still cause blackouts

  • Firmware changes help a bit

  • Extra buffering in MicroPython doesn’t help because RAM gets tight


What I am considering

Please let me know if you have ideas on how to tackle this problem.
Below is what I am thinking of trying next.

1) Logging events over another medium

  • Over USB.
    According to this discussion:
    https://forums.openmv.io/t/genx320-streaming-raw-event-data-over-usb/11414
    USB HS is fast enough, but MicroPython cannot stream GenX320 events over USB at the full rate.
    To make USB work, the event stream must be handled in C inside the firmware.
    A custom USB endpoint is needed to push the events to the PC.
    So USB is possible, but only with firmware changes.

  • Over Ethernet.
    The RT1062 MCU has an Ethernet MAC, but the OpenMV RT1062 board does not expose the necessary pins and has no PHY.
    Ethernet would require custom hardware and a custom driver.
    The H7 Plus cannot do Ethernet at all.

Question C: What do you think about these options?


2) Changing the firmware to do less processing

I tried capturing the raw EVT20 words from the GenX320 using:

int rc = omv_csi_snapshot(csi, &image, OMV_CSI_FLAG_NO_POST);

This returns the raw 32-bit EVT20 words without any post-processing.
This is faster and more compact than decoding to (x, y, t, polarity).
But when I decoded the EVT20 words offline, the data contained a lot of noise.
This is expected, because OMV_CSI_FLAG_NO_POST disables AFK, STC, ERC, and other Prophesee filters.
So the output includes flicker and background contrast noise.
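For reference, offline decoding of such words can be sketched as follows. This assumes the words follow the bit layout given in Prophesee's public EVT 2.0 documentation (4-bit type in the top bits, CD events carrying a 6-bit timestamp LSB, 11-bit x and 11-bit y, and EVT_TIME_HIGH carrying the upper 28 timestamp bits). It is not the code I used, `decode_evt2` is a hypothetical name, and the layout should be verified against the Prophesee docs before relying on it.

```python
# Assumed EVT 2.0 type codes (bits 31..28 of each 32-bit word).
CD_OFF = 0x0
CD_ON = 0x1
EVT_TIME_HIGH = 0x8

def decode_evt2(words):
    """Decode 32-bit words into (x, y, t, polarity) tuples.

    CD events seen before the first EVT_TIME_HIGH word are skipped,
    because their full timestamp cannot be reconstructed.
    Other word types (for example triggers) are ignored.
    """
    events = []
    time_high = None
    for w in words:
        kind = (w >> 28) & 0xF
        if kind == EVT_TIME_HIGH:
            time_high = w & 0x0FFFFFFF
        elif kind in (CD_OFF, CD_ON) and time_high is not None:
            t = (time_high << 6) | ((w >> 22) & 0x3F)   # timestamp in µs
            x = (w >> 11) & 0x7FF
            y = w & 0x7FF
            events.append((x, y, t, kind))              # polarity: 1 = ON, 0 = OFF
    return events
```

Because no filtering happens in this path, any noise suppression would have to be applied to the decoded tuples afterwards.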

I’m not sure whether writing EVT20 directly to SD gives more useful events, even though it gives more total events.

Question D:
Do you think modding the firmware in this direction could give more real events per second?
If so, how would you approach it?
I can share the EVT20 code I used, but it includes a lot of noise because all post-processing is disabled.

Hi, thank you for the very long forum post on this with all the detail!

We actually have a feature for you for this that should solve the issues. What is happening is that SD cards will block writing to them while they erase. When this happens, it doesn’t really matter how efficient you make things as you just have to buffer data until they finish the erase.

Please see the FIFO mode of the frame buffer. E.g. csi0.framebuffers(N) where N is greater than 3. This changes the default algorithm of the frame buffer from triple buffering, which is effectively a FIFO depth of 2, to a true FIFO with depth up to N. Note that the default logic will flush the whole buffer if it overflows to avoid giving you old data. Try a depth of 10 or so.

Just making the event buffer really huge doesn’t fix the issue, as there are still only 2 buffers. As such, when the CPU gets blocked during a write operation while the SD card is erasing, events are dropped.

When you enable FIFO mode, the interrupt process which receives event buffers will continue to receive data in the background while the main thread writes data to the SD card and blocks during SD card writes/erases.

Note that you still want the event buffer size to be as large as possible in order to increase the maximum bandwidth as maximum SD card performance is achieved when writing huge blocks of data. You should be able to get to about 10 MB/s.
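To illustrate why FIFO depth matters more than the size of each buffer, here is a toy simulation in plain Python. It is only a sketch with made-up units: the sensor completes one buffer per tick, the writer can drain two buffers per tick except during periodic stalls, and a buffer that arrives while the queue is full is simply dropped (a simplification of the flush-on-overflow behaviour described above). The function name `simulate` and all parameters are hypothetical.

```python
def simulate(depth, ticks=100, stall_period=20, stall_len=5, drain_per_tick=2):
    """Return (buffers written, buffers dropped) for a FIFO of the given depth."""
    queued = 0
    written = 0
    dropped = 0
    for t in range(ticks):
        stalled = (t % stall_period) < stall_len   # SD card busy erasing
        # Sensor side: one full buffer arrives every tick.
        if queued < depth:
            queued += 1
        else:
            dropped += 1
        # Writer side: drains the queue only when the card is not stalled.
        if not stalled:
            n = min(queued, drain_per_tick)
            queued -= n
            written += n
    return written, dropped

for depth in (2, 10):
    written, dropped = simulate(depth)
    print("depth", depth, "-> written", written, "dropped", dropped)
```

In this model a depth of 2 loses buffers at every stall, while a depth of 10 rides through the same stalls without loss, as long as the writer is faster than the sensor on average.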

Regarding USB:

We are upgrading the USB protocol in the future to support custom transports. Once this is done, you’ll be able to much more flexibly move data from the camera to the PC with custom scripts. This will let you use the USB bandwidth to get data to the PC at nearly 30 MB/s. Our current USB protocol doesn’t make this easy as the only high bandwidth way to move data is treating everything as an image.

Regarding Ethernet:

There’s an Ethernet PHY on the board, support for it already works and the driver logic is already complete. You just have to buy the PoE Ethernet Shield and Ethernet will work. Mentioning this to correct your post.

As for changing the firmware to do less processing, it doesn’t matter as the bottleneck is the SD card blocking the main thread during writing while it erases things in the background.

I will try the FIFO mode as soon as possible and let you know how much performance improves (or, hopefully, whether the blackouts disappear).
I am also looking forward to the updated USB protocol. It is great to hear this is something you are working on.

Thank you for your fast response, really amazing thanks!

@kwagyeman, I wanted to let you know that the FIFO does improve the situation (and my custom firmware tweaks were indeed redundant). However, the SD card, with a maximum of ~8 MB/s, seems to be the main bottleneck. Nevertheless, thank you for your help!