Hi everyone,
Over the past few weeks I have been working with the H7 Plus and the RT1062 together with the GenX320 sensor. I can record data in both histogram mode and event mode. For my research I need the raw events with timestamps (x, y, t, polarity), so histogram mode is not enough. The problem I run into is that the number of events per second is so high that my current SD logging setup produces blackouts (periods in which no events are recorded).
Some context
The use case is a flapping-wing drone. The flapping motion, plus the forward motion, creates a lot of events whenever the wings move. To avoid losing data, I now need to tune the sensor biases in a conservative way, which reduces how much real detail I can record.
With these settings:
DIFF_OFF: 44 DIFF_ON: 60 HPF: 55 FO: 19 REFR: 0
…I already get around 500,000 events/s.
For research I don’t want to filter much more, because I need to stay within Prophesee’s recommended ranges:
https://docs.prophesee.ai/stable/hw/manuals/biases.html#bias-ranges
Boards I tested:
- OpenMV RT1062
- OpenMV H7 Plus
Sensor: GenX320
Mode: Event mode
What I need
I want to log as many events as possible, for as long as possible, on these boards. I want to understand the real bottleneck and what the realistic upper limit is for logging raw events to an SD card.
Below is what I have tried so far.
What I tried
- (A) increasing the deep event buffer
- (B) writing to the SD card as fast as possible
Current script for reference
```python
import os, time
import csi
from ulab import numpy as np

# run configuration
RUNID = 'A'
REC_DURATION_S = 10      # total record time (s)
SENSOR_BUFFER = 65536*1  # sensor buffer (events); with modded firmware, buffer x 4 is possible
START_TIME = 1           # delay before recording (s)

# GENX320 biases
BIASES = {"DIFF_OFF": 44, "DIFF_ON": 60, "HPF": 55, "FO": 19, "REFR": 0}  # conservative settings for flapper
# # default values
# BIASES = {"DIFF_OFF": 28, "DIFF_ON": 25, "HPF": 40, "FO": 34, "REFR": 10}

# sd card check
if 'sdcard' not in os.listdir('/'):
    raise Exception("Error: No SD card found")

# create file
count = sum(1 for f in os.listdir('/sdcard/') if RUNID in f) + 1
filename = f"/sdcard/{SENSOR_BUFFER}_{RUNID}_{count}_{REC_DURATION_S}s.bin"
file = open(filename, "wb")

# init sensor
csi0 = csi.CSI(cid=csi.GENX320)
csi0.reset()

# allocating the biggest sensor buffer possible
def alloc_events(n):
    while True:
        try:
            return np.zeros((n, 6), dtype=np.uint16)
        except MemoryError:
            if n <= 1024:
                raise
            n //= 2
            print("MemoryError: reducing SENSOR_BUFFER to", n)

events = alloc_events(int(SENSOR_BUFFER))

# config event mode and biases
csi0.ioctl(csi.IOCTL_GENX320_SET_MODE, csi.GENX320_MODE_EVENT, events.shape[0])
csi0.ioctl(csi.IOCTL_GENX320_SET_BIAS, csi.GENX320_BIAS_DIFF_OFF, BIASES["DIFF_OFF"])
csi0.ioctl(csi.IOCTL_GENX320_SET_BIAS, csi.GENX320_BIAS_DIFF_ON, BIASES["DIFF_ON"])
csi0.ioctl(csi.IOCTL_GENX320_SET_BIAS, csi.GENX320_BIAS_HPF, BIASES["HPF"])
csi0.ioctl(csi.IOCTL_GENX320_SET_BIAS, csi.GENX320_BIAS_FO, BIASES["FO"])
csi0.ioctl(csi.IOCTL_GENX320_SET_BIAS, csi.GENX320_BIAS_REFR, BIASES["REFR"])

time.sleep(START_TIME)
print("Starting event recording...")
print("Sensor buffer (events):", events.shape[0])

event_count = 0
start_ms = time.ticks_ms()
# ticks values wrap around, so build the deadline with ticks_add instead of "+"
deadline = time.ticks_add(start_ms, REC_DURATION_S * 1000)

try:
    while time.ticks_diff(time.ticks_ms(), deadline) < 0:
        n = csi0.ioctl(csi.IOCTL_GENX320_READ_EVENTS, events)
        if n > 0:
            file.write(memoryview(events)[:n * 6])  # 6 uint16 per event
            event_count += n
finally:
    file.flush()
    os.sync()
    file.close()

elapsed_ms = time.ticks_diff(time.ticks_ms(), start_ms)
eps = (event_count * 1000) // elapsed_ms if elapsed_ms > 0 else 0
print("Done.")
print(f"Events saved to {filename}")
print(f"Duration: {elapsed_ms/1000:.3f}s | Total events: {event_count} | ~{eps} ev/s")
```
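For anyone who wants to look at such a recording on a PC, here is a minimal sketch of an offline reader that also lists timestamp gaps (the blackouts). It only relies on what the script shows: every event is a row of 6 uint16 values, so 12 bytes. The column order `(type, sec, ms, us, x, y)`, the little-endian byte order, and the helper names `iter_events`, `timestamp_us` and `find_gaps` are assumptions for illustration and should be checked against the firmware before trusting the output.

```python
import struct

# ASSUMPTION: column order of the 6 uint16 values per event. Not confirmed here.
EVENT_FIELDS = ("type", "sec", "ms", "us", "x", "y")
# ASSUMPTION: little-endian uint16, 12 bytes per event.
EVENT_STRUCT = struct.Struct("<6H")


def iter_events(raw):
    """Yield one dict per complete 12-byte event record in `raw` (bytes)."""
    usable = len(raw) - (len(raw) % EVENT_STRUCT.size)  # ignore a truncated tail
    for row in EVENT_STRUCT.iter_unpack(raw[:usable]):
        yield dict(zip(EVENT_FIELDS, row))


def timestamp_us(event):
    """Combine the assumed sec/ms/us columns into one microsecond timestamp."""
    return event["sec"] * 1_000_000 + event["ms"] * 1_000 + event["us"]


def find_gaps(events, threshold_us):
    """Return (index, gap_us) for consecutive events more than threshold_us apart."""
    gaps = []
    prev = None
    for i, ev in enumerate(events):
        t = timestamp_us(ev)
        if prev is not None and t - prev > threshold_us:
            gaps.append((i, t - prev))
        prev = t
    return gaps
```

Typical use would be to read the whole `.bin` file with `open(path, "rb").read()`, pass the bytes to `list(iter_events(...))`, and call `find_gaps` with a threshold of a few milliseconds to see where and how long the blackouts are.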
(A) Increasing the deep event buffer
The first thing I tried was using the maximum supported event buffer of 65536. I don’t care about delay between sensor readout and file write. I only care about capturing all events with correct timestamps.
Result:
Still blackouts.
A bigger buffer reduces how often they happen, but the blackouts become longer.
My base logging loop looked like this:
```python
try:
    while time.ticks_diff(time.ticks_ms(), deadline) < 0:
        n = csi0.ioctl(csi.IOCTL_GENX320_READ_EVENTS, events)
        if n > 0:
            file.write(memoryview(events)[:n * 6])  # 6 uint16 per event
            event_count += n
finally:
    file.flush()
    os.sync()
    file.close()
```
Then I modified the firmware (genx320.c) to allow larger buffers:
if (ndarray_size < 1024 || ndarray_size > 65536)
I changed this limit to allow 65536×2 and 65536×4.
This only worked on the H7 Plus, because it has much more usable RAM for MicroPython and for CSI buffers. The RT1062 does not have enough RAM for these larger buffers.
Result:
Fewer blackouts overall, but the stalls become longer.
At 65536×4 the SD card becomes the bottleneck.
The sensor fills the buffer faster than I can flush it to SD.
Every 1–2 seconds there is a blackout.
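To put numbers on the buffer sizes, here is a small back-of-the-envelope sketch of how long each buffer takes to fill. It assumes a constant rate of 500,000 events/s (the approximate figure above) and that the buffer size is counted in events, as in the script; `fill_time_ms` is just a helper name for this example.

```python
EVENT_RATE = 500_000   # events/s, approximate rate reported above
BASE_BUFFER = 65_536   # events, stock firmware limit
BYTES_PER_EVENT = 12   # 6 uint16 values per event


def fill_time_ms(buffer_events, rate=EVENT_RATE):
    """Time in ms until an empty buffer of `buffer_events` is full at `rate`."""
    return buffer_events * 1000 / rate


for mult in (1, 2, 4):
    n = BASE_BUFFER * mult
    kib = n * BYTES_PER_EVENT / 1024
    print(f"x{mult}: {n} events, {kib:.0f} KiB, full after {fill_time_ms(n):.1f} ms")
```

Under these assumptions the three buffers fill in roughly 131 ms, 262 ms and 524 ms. That is the longest time the read/write loop can be away from the sensor before events are lost, and it only holds if the buffer was empty when the stall began.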
Question A: Is there anything I am misunderstanding about this logic, or anything else I could try to change here that would reduce blackouts?
(B) Writing to the SD card as fast as possible
Because the input event rate is high and constant (~500k events/s), the write speed and especially the stall time of the SD card become the real bottleneck.
I tested A1 and A2 SD cards.
A2 cards are clearly faster on the H7+.
SD write speed tests
| Chunk Size | A2 Speed (MB/s) | A2 Max Stall (µs) | A1 Speed (MB/s) | A1 Max Stall (µs) |
|---|---|---|---|---|
| 4 KB | 3.51 | 5251 | 1.28 | 95541 |
| 16 KB | 8.47 | 4625 | 4.32 | 55863 |
| 64 KB | 8.46 | 11403 | 5.14 | 65719 |
| 256 KB | 8.23 | 40211 | 5.69 | 94675 |
The A2 card is better in two ways:
- Higher sustained throughput
- Much shorter worst-case stalls
The stall time matters a lot.
A long stall means the sensor FIFO overflows, resulting in a blackout.
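A rough sketch of what the table means for a 500k events/s stream: it computes the write rate the stream needs and how many events arrive during each worst-case stall. It assumes 12 bytes per event, a constant event rate, and that "MB" in the table means 10^6 bytes (the benchmark units are not stated, so that part is a guess).

```python
EVENT_RATE = 500_000   # events/s
BYTES_PER_EVENT = 12   # 6 uint16 values per event

# Required sustained write rate in MB/s (ASSUMPTION: 1 MB = 10**6 bytes).
required_mb_s = EVENT_RATE * BYTES_PER_EVENT / 1e6

# (chunk, A2 MB/s, A2 max stall us, A1 MB/s, A1 max stall us), copied from the table.
MEASURED = [
    ("4 KB", 3.51, 5251, 1.28, 95541),
    ("16 KB", 8.47, 4625, 4.32, 55863),
    ("64 KB", 8.46, 11403, 5.14, 65719),
    ("256 KB", 8.23, 40211, 5.69, 94675),
]


def events_during_stall(stall_us, rate=EVENT_RATE):
    """Number of events that arrive while the card is stalled for stall_us."""
    return rate * stall_us // 1_000_000


for chunk, a2_speed, a2_stall, a1_speed, a1_stall in MEASURED:
    print(
        f"{chunk:>6}: A2 margin {a2_speed - required_mb_s:+.2f} MB/s, "
        f"{events_during_stall(a2_stall)} events per worst stall | "
        f"A1 margin {a1_speed - required_mb_s:+.2f} MB/s, "
        f"{events_during_stall(a1_stall)} events per worst stall"
    )
```

Read this way, the stream needs about 6.0 MB/s. The A1 card stays below that at every chunk size, so its backlog can only grow. The A2 card has about 2.5 MB/s of margin from 16 KB chunks upward. The largest stall in the table (about 95.5 ms) corresponds to roughly 47,770 events, which fits in a 65536-event buffer only if that buffer was close to empty when the stall started.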
Question B: Is there anything I can do to write faster to SD cards on either board? Or other card types worth trying?
Trying an extra buffer
I also tried adding an intermediate write buffer between the event buffer and the SD card. However, this reduces the amount of memory MicroPython has left for the event buffer. It also reduces frame buffer space. It didn’t solve the stalls.
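For reference, the idea of an intermediate write buffer can be sketched like this. This is not the code I ran on the board; it is a desktop-Python illustration with a hypothetical `ChunkedWriter` class that collects variable-sized event batches into one preallocated buffer and hands them to the file in fixed 16 KB chunks (the chunk size with the best A2 numbers in the table). It expects byte-oriented input (`bytes` or `bytearray`), and as noted above it costs RAM and does not remove the card's stalls.

```python
import io


class ChunkedWriter:
    """Collect variable-sized writes and pass fixed-size chunks on to `sink`."""

    def __init__(self, sink, chunk_size=16 * 1024):
        self._sink = sink
        self._buf = bytearray(chunk_size)   # allocated once, reused for every chunk
        self._view = memoryview(self._buf)
        self._fill = 0                      # number of valid bytes in the buffer

    def write(self, data):
        data = memoryview(data)             # avoid copying when slicing the input
        offset = 0
        while offset < len(data):
            n = min(len(data) - offset, len(self._buf) - self._fill)
            self._view[self._fill:self._fill + n] = data[offset:offset + n]
            self._fill += n
            offset += n
            if self._fill == len(self._buf):
                self._sink.write(self._view)  # one full, fixed-size chunk
                self._fill = 0

    def flush(self):
        """Write out whatever partial chunk is left."""
        if self._fill:
            self._sink.write(self._view[:self._fill])
            self._fill = 0


# Usage with an in-memory sink standing in for the SD card file.
sink = io.BytesIO()
writer = ChunkedWriter(sink, chunk_size=16)
for _ in range(5):
    writer.write(b"0123456789")
writer.flush()
```

The only thing this changes is the size of each `write()` call seen by the card. Whether that helps depends on how the card and the filesystem driver react to it.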
Where I am now
- Increasing the sensor buffer helps, but only shifts the problem
- A2 cards help, but stall times still cause blackouts
- Firmware changes help a bit
- Extra buffering in MicroPython doesn’t help because RAM gets tight
What I am considering
Please let me know if you have ideas on how to tackle this problem.
Below is what I am thinking of trying next.
1) Logging events over another medium
- Over USB.
  According to this discussion:
  https://forums.openmv.io/t/genx320-streaming-raw-event-data-over-usb/11414
  USB HS is fast enough, but MicroPython cannot stream GenX320 events over USB at the full rate.
  To make USB streaming work, the event stream must be handled in C inside the firmware.
  A custom USB endpoint is needed to push the events to the PC.
  So USB is possible, but only with firmware changes.
- Over Ethernet.
  The RT1062 MCU has an Ethernet MAC, but the OpenMV RT1062 board does not expose the necessary pins and has no PHY.
  Ethernet would require custom hardware and a custom driver.
  The H7 Plus cannot do Ethernet at all.
Question C: What do you think about these options?
2) Changing the firmware to do less processing
I tried capturing the raw EVT20 words from the GenX320 using:
int rc = omv_csi_snapshot(csi, &image, OMV_CSI_FLAG_NO_POST);
This returns the raw 32-bit EVT20 words without any post-processing.
This is faster and more compact than decoding to (x, y, t, polarity).
But when I decoded the EVT20 words offline, the data contained a lot of noise.
This is expected, because OMV_CSI_FLAG_NO_POST disables AFK, STC, ERC, and other Prophesee filters.
So the output includes flicker and background contrast noise.
I’m not sure whether writing EVT20 directly to SD gives more useful events, even though it gives more total events.
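For context, offline decoding of EVT 2.0 words can be sketched roughly as below. This is not the exact code I used. It assumes the word layout from Prophesee's EVT 2.0 documentation as I understand it (4-bit type in the top bits, CD events carrying a 6-bit timestamp, 11-bit x and 11-bit y, and `EVT_TIME_HIGH` words carrying the upper 28 timestamp bits) and that the file is a flat stream of little-endian 32-bit words. `decode_evt20` is a name made up for this example.

```python
import struct

# Event type codes (ASSUMPTION: taken from the EVT 2.0 format description).
CD_OFF = 0x0
CD_ON = 0x1
EVT_TIME_HIGH = 0x8


def decode_evt20(raw):
    """Decode little-endian 32-bit EVT 2.0 words into (x, y, t_us, polarity)."""
    events = []
    time_high = None                      # unknown until the first TIME_HIGH word
    usable = len(raw) - (len(raw) % 4)    # ignore a truncated final word
    for (word,) in struct.iter_unpack("<I", raw[:usable]):
        kind = word >> 28
        if kind == EVT_TIME_HIGH:
            time_high = word & 0x0FFFFFFF
        elif kind in (CD_OFF, CD_ON) and time_high is not None:
            t_us = (time_high << 6) | ((word >> 22) & 0x3F)
            x = (word >> 11) & 0x7FF
            y = word & 0x7FF
            events.append((x, y, t_us, kind))  # polarity: 1 = ON, 0 = OFF
        # all other word types (triggers, vendor words) are skipped here
    return events
```

A raw word is 4 bytes, against 12 bytes per decoded event in the script above, so the same SD bandwidth carries about three times as many raw words. That gain only counts if the extra noise events do not eat it up.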
Question D:
Do you think modding the firmware in this direction could give more real events per second?
If so, how would you approach it?
I can share the EVT20 code I used, but it includes a lot of noise because all post-processing is disabled.