M7 crashes under Vin but works fine with USB

Hi,

I’m using an M7 embedded in an electric vehicle. It is powered by an on-board 5.3 V (3 A) supply through the VIN input.

While debugging/developing (M7 connected via USB) everything works fine. However, the program seems to crash when the board is powered through the VIN input. It happens while the program is running the blob detection. When it does, the red LED blinks 3 times (this does not come from my program) and then nothing more happens. I have to remove power and apply it again to reboot.

What does this red LED blinking mean?

At first I suspected a wire length problem (the CAM is powered through a 1 m harness (5 V + CAN)), but I scoped the VIN voltage and saw nothing abnormal. Moreover, the USB cable I’m using is about 10 m long and USB works fine.

Here is what I’ve done:

  • I scoped the VIN and the 3v3 while a crash happened and saw nothing abnormal at all. Communication signals (CAN) are fine as well.


  • I tried with another cam, same problem.


  • I still believe it has something to do with the VIN but I can’t figure it out so far.

Any suggestion would be highly appreciated.

Thank you.

Red blinking means there’s an error in the script (something throws an exception). Try adding try/except blocks and writing the exception to a text file.


  • Note 3v3 is an output, you only need to supply 5v VIN.

I managed to find out that a “Maximum Recursion Depth Exceeded” exception is raised when calling the find_blobs function:

def merge_callback(blob1, blob2):
    if (blob1.cx() <= blob2.cx()) : blobs = [blob1, blob2]
    else : blobs = [blob2, blob1]

    if((blobs[0].code() == 1) and (blobs[1].code() == 2)):
        return True
    return False
    
try:
    blobs = img.find_blobs(color_thresholds, roi=roi_blob, pixels_threshold=100, area_threshold=100, merge=True, margin=5, merge_cb=merge_callback)
except Exception as e :
    f = open("log.txt", 'a+')
    f.write("find_blob:"+ str(e) + "\r\n")
    f.close()

I’m using a custom merge_callback to find bicolor blobs that only are [color1 (left) - color2 (right)].

Could you please explain to me why this function raises this exception and why it does not when powering the CAM with USB ?
Is there an easy way to make my code properly work ?

Thank you in advance.

Try changing the threshold ?

Changing the thresholds cannot be the solution: the threshold values are dictated by my application.

I found 2 workarounds :

  • I removed the merge_callback and the exception is no longer raised. That suggests the merge_callback is invoked recursively, which could lead to the “Maximum Recursion Depth Exceeded” exception.
    Can you explain why? This solution is not acceptable to me because I have to find bicolor blobs.


  • Another patch, surprisingly, worked. I wrapped the call to a function that calls find_blobs in a try/except block, and the exception is no longer raised:
try:
    check_line_alignment()
except Exception as e :
    pass

def check_line_alignment():
    """some stuffs"""
    blobs = blobs_analysing()
    """some stuffs"""

def blobs_analysing():
    try:
        blobs = img.find_blobs(color_thresholds, roi=roi_blob, pixels_threshold=100, area_threshold=100, merge=True, margin=5, merge_cb=merge_callback)
    except Exception as e :
        f = open("log.txt", 'a+')
        f.write("find_blob:"+ str(e) + "\r\n")
        f.close()
        blobs = []
    blobs_bicolor = [x for x in blobs if x.code() == 3]

    return blobs_bicolor

It seems really strange to me that adding a try/except block makes the problem disappear without it ever catching an exception…

Moreover, none of these tests explains why the problem does not happen in USB mode.

Any other idea ?
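Since removing the merge callback made the exception go away, one possible way to keep the bicolor filtering is to do the pairing in plain Python after find_blobs() has returned. The sketch below is only an illustration under assumptions: it assumes find_blobs() was called with merge=False so that each blob reports code() 1 or 2, and it relies on the blob methods cx(), code() and rect(). FakeBlob, rects_near and pair_bicolor are hypothetical names introduced here, and the result is a list of (left, right) pairs rather than merged blobs, so it is not equivalent to the built-in merge.

```python
class FakeBlob:
    """Hypothetical stand-in for an OpenMV blob so the sketch runs off-board."""
    def __init__(self, code, rect):
        self._code = code
        self._rect = rect  # (x, y, w, h)

    def code(self):
        return self._code

    def rect(self):
        return self._rect

    def cx(self):
        x, _, w, _ = self._rect
        return x + w // 2


def rects_near(a, b, margin):
    """True if rectangles a and b overlap once each side is grown by margin."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return (ax - margin <= bx + bw and bx - margin <= ax + aw and
            ay - margin <= by + bh and by - margin <= ay + ah)


def pair_bicolor(blobs, margin=5):
    """Return (left, right) pairs: a code-1 blob with a nearby code-2 blob to its right."""
    lefts = [b for b in blobs if b.code() == 1]
    rights = [b for b in blobs if b.code() == 2]
    pairs = []
    for left in lefts:
        for right in rights:
            if left.cx() <= right.cx() and rects_near(left.rect(), right.rect(), margin):
                pairs.append((left, right))
    return pairs


# Example with made-up rectangles: only the first code-2 blob qualifies.
color1 = FakeBlob(1, (40, 10, 20, 20))
color2_right = FakeBlob(2, (62, 10, 20, 20))   # close and to the right
color2_left = FakeBlob(2, (18, 10, 20, 20))    # close but on the wrong side
color2_far = FakeBlob(2, (200, 10, 20, 20))    # right side but too far away
pairs = pair_bicolor([color1, color2_right, color2_left, color2_far])
print(len(pairs))
```

On the camera the list passed to pair_bicolor() would be the return value of img.find_blobs(...) without the merge_cb argument.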

This seems like a legit issue with callbacks.

In our code we let MicroPython do all the work for running the callback. So, I don’t push/pop stack pointers. So, I’m wondering if this is a bug in MicroPython.

Okay, can you generate example code for the failure and success cases that are simple enough for me to run at home? Then I can debug.

You will find attached a file that reproduces the problem.

I used the same code structure as in my project. That means the following function call sequence:
run() in the main while(1) loop calls
=> blobs_analysing() that calls
=> find_bicolor_blobs() that calls
=> find_blobs().
In this case, adding a try/except block around find_bicolor_blobs() makes the find_blobs() function stop raising the “Maximum Recursion Depth Exceeded” exception.

But I also discovered that removing one of the functions from the call sequence makes find_blobs() stop raising the “Maximum Recursion Depth Exceeded” exception no matter what. So it seems the problem could come from the depth of the function call sequence?

Last but not least, the issue caused by this simplified source code also happens in USB-powered mode…
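The idea that the depth of the calling chain matters can be illustrated with a small experiment. This is only a sketch and does not touch find_blobs(): probe(), via_one_wrapper() and via_three_wrappers() are hypothetical helpers that measure how many extra nested calls fit before the interpreter gives up. It catches RuntimeError, which is what MicroPython raises for "maximum recursion depth exceeded" (CPython's RecursionError is a subclass of it).

```python
def probe(depth=0):
    """Recurse until the interpreter refuses, then report how deep we got."""
    try:
        return probe(depth + 1)
    except RuntimeError:
        # The call one level further down was refused: depth is our budget.
        return depth


def via_one_wrapper():
    # One function between the caller and probe().
    return probe()


def via_three_wrappers():
    # Three functions between the caller and probe().
    def level2():
        def level3():
            return probe()
        return level3()
    return level2()


shallow = via_one_wrapper()
deep = via_three_wrappers()
print("budget from shallow chain:", shallow)
print("budget from deep chain:", deep)
```

The deeper chain leaves less room for whatever is called at the end of it, which would be consistent with the exception disappearing when one function is removed from the sequence.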

Tell me if you find anything !
Thanks.
main.py (3.02 KB)

Great, I will debug tonight.

Any update ? =)

Sorry, I was focused on doing other work for the business. OpenMV is at a point where it’s beyond my ability to do everything as quickly as I used to.

Okay, debugged this. The issue is literally that you are running out of stack on the M7. The stack on the M7 was only about 3 KB, and because we added a bunch of new features to find_blobs() to bring it more or less to parity with OpenCV, it uses way more stack now.

When I run the code on the H7 I see that it uses:

stack: 3532 out of 6216
GC: total: 240064, used: 112896, free: 127168
No. of 1-blocks: 4464, 2-blocks: 604, max blk sz: 182, max free sz: 7948

And on the M7 before I call find_blobs I see that the info is:

stack: 1772 out of 3292
GC: total: 54016, used: 22672, free: 31344
No. of 1-blocks: 779, 2-blocks: 118, max blk sz: 184, max free sz: 1955

Since both boards are running the same code, you can see that by the time the H7 finishes with everything its stack usage (3532 bytes) is higher than the M7’s entire stack (3292 bytes).
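To put numbers on the headroom, here is the arithmetic on the two reports, written as a short sketch. The four figures are copied from the mem_info() output quoted in this post; the variable names are made up for illustration, and the comparison assumes the same code needs roughly the same amount of stack on both boards.

```python
# Stack figures in bytes, copied from the two reports in this post.
H7_USED, H7_TOTAL = 3532, 6216   # H7 report
M7_USED, M7_TOTAL = 1772, 3292   # M7 report, taken before find_blobs()

# Stack still free on the M7 just before find_blobs() is called.
m7_headroom = M7_TOTAL - M7_USED

# How far the usage seen on the H7 exceeds the M7's whole stack.
h7_over_m7_total = H7_USED - M7_TOTAL

print("M7 headroom before find_blobs():", m7_headroom)
print("H7 usage beyond the M7 stack size:", h7_over_m7_total)
```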

A quick fix for this is for Ibrahim to bump the stack up at the cost of taking some RAM away from something else. We can then just cut a firmware for you. Alternatively, get an H7.

Adding support for finding the min area rectangle of each blob is the likely offender here.

To see the stack usage, call:

import micropython
micropython.mem_info()
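For checking the stack at several points in a call chain, a small helper can be handy. The following is a sketch under assumptions: it uses micropython.stack_use(), which returns the stack currently in use as an integer on MicroPython builds that expose it, and it falls back to None elsewhere so the snippet also runs on desktop Python. stack_in_use() and report() are hypothetical names, not part of the firmware.

```python
try:
    import micropython  # only present on MicroPython boards
except ImportError:
    micropython = None  # lets the sketch run (and report None) on desktop Python


def stack_in_use():
    """Return bytes of stack in use, or None where this is not exposed."""
    fn = getattr(micropython, "stack_use", None)
    return fn() if fn else None


def report(label):
    """Print and return the current stack usage next to a label."""
    used = stack_in_use()
    print(label, "stack in use:", used)
    return used


result = report("top level")
```

Calling report() at the start of each function in the chain, for example just before find_blobs(), shows how much each level adds.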

Ibrahim can you increase the stack on the M4 and M7?

I’d have to lower the heap, isn’t there any other way around this ?

I can move the structs to the heap space. However, it’s more work than just making the stack larger.

Hi !

No problem, you’re still doing a great job ! :wink:

Okay, thank you. I have found a workaround using the try/except block, so that will be enough for now.

Please don’t bother doing that, given the risk that it could reveal other problems. The problem is now clearly identified and comes from my application (it could just as well have been a bug on your side, in which case it would have been great to help find it!). I will definitely get an H7 as soon as the CAN features available on the M7 have been developed for it. (Is that already the case?) We are developing a solution for a client that will be tested in September but deployed only at the beginning of 2020, so I hope the H7 will be mature enough to switch to by then.

Thank you for your time.

CAN still doesn’t work on the H7. I guess I’ll probably have to get it working…

It should be working in a couple of weeks before the next release.

Hi, CAN is now supported on the H7 and the next release will be out in a day or two.