I have an RT1062 I’m using for some experiments. The experiment involves taking about 200 pictures (stored to an SD card) and running some analysis on them. I can capture the photos just fine and then run face detection on them, saving information about bounding boxes and keypoints in JSON.
The analysis portion runs for a while and then bricks the board, at the same point every time. The point at which it bricks seems arbitrary, in that it’s in some middle loop iteration, not the start of a new phase or anything. I don’t believe I’m running out of memory, because the data it’s computing is just not that large (a small histogram). Even if it were a memory hog, isn’t the space for Python heap on the order of 8MB? There’s no way I could be close to that.
There is one way to get the program to run longer: reduce the number of print statements in it. I found that as I commented out print statements, the program would run longer. The more print statements I eliminated, the further it would get. This is very mysterious to me. Does anyone have ideas about what might be going on or how to debug it? Ironically, debugging via print statements is going to make the problem worse.
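One way to trace progress without going through the console would be to append to a log file instead of printing. The following is only a minimal sketch, not something I have confirmed fixes anything: the `log` helper and the `debug.log` path are hypothetical names, and it assumes the file is closed (and therefore handed to the filesystem) before the next step runs. It also records `gc.mem_free()` where MicroPython provides it, which would show whether the heap is really nowhere near its limit. Whether the last line survives a hard crash is not guaranteed.

```python
import gc

LOG_PATH = "debug.log"  # hypothetical path; on the board this could live on the SD card


def log(msg, path=LOG_PATH):
    """Append one line to a log file instead of printing to the console."""
    # gc.mem_free() exists on MicroPython but not on desktop Python,
    # so fall back to -1 when it is unavailable.
    mem_free = getattr(gc, "mem_free", None)
    free = mem_free() if mem_free else -1
    # Open, write, and close on every call so nothing stays buffered
    # in the script if the board stops responding in a later iteration.
    with open(path, "a") as f:
        f.write("free=%d %s\n" % (free, msg))
```

Calling `log("iteration %d" % i)` at the top of each loop iteration would leave the last completed step in the file, which can be read back after resetting the board.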
This seems to correspond to another issue I’ve seen. Sometimes if there is a Python error the program will abort but the IDE won’t report an error. The terminal console does not show the exception backtrace. However, if I add enough print statements, suddenly the exception backtrace appears!
It seems like there is something wrong with the (tty?) I/O system. Does that provide any clues?
I am using version 4.7.0 of the IDE. I tried upgrading to a newer version (the one released several days ago) but it immediately bricks the board when trying to connect, every time. Perhaps that is related as well.
Please install the latest IDE, v4.8.1. We shipped a new protocol for v5.0.0, which is the current top of dev. Once the IDE is installed, you can use the latest debug protocol, which improves system stability.
Just install the latest IDE and then you can use the latest dev firmware.
Also, regarding print statements… it’s odd to see any issues with this. However, there have been a lot of bug fixes since v4.7.0.
4.8.1 didn’t brick the board immediately like it did a few days ago. Not sure why that changed.
In any case, maybe some progress, but regression in other areas.
One new bug is that when I load an image (saved as JPEG) and then try to use it in an Image routine, I get a “JPEG decode error.” Calling img.to_rgb565() before other Image methods fixes that issue. I did not need to do that before, or at least I didn’t see that error before. I wonder if that is what was corrupting something.
In any case, the processing seems to get further but now I see this error:
KeyError: Z{uU*ӻ
_;y
OpenMV v4.8.1; MicroPython v1.26.0-77; OpenMV IMXRT1060 with MIMXRT1062DVJ6A
Type "help()" for more information.
That looks like memory corruption. It’s indexing a dictionary using a key read from a metadata file on the SD card. I checked and the file itself has the correct key. Moreover, the program used this key hundreds of times before it suddenly failed.
Do you have any recommendations to help debug this?
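To narrow down where the key gets damaged, one option is to wrap the dictionary lookup so that a miss reports the exact bytes of the key that was used. This is a rough sketch under my own assumptions: `checked_lookup` is a hypothetical helper, and it assumes the key is a `str` (or bytes-like) read from the metadata file. If the hex dump differs from what is in the file, the corruption happened after the read rather than on the SD card.

```python
def checked_lookup(table, key):
    """Return table[key]; on a miss, raise KeyError with the key's raw bytes."""
    try:
        return table[key]
    except KeyError:
        # Show the key as hex so that unprintable or corrupted
        # characters are visible instead of garbling the console.
        raw = key.encode("utf-8") if isinstance(key, str) else bytes(key)
        hex_bytes = " ".join("%02x" % b for b in raw)
        raise KeyError("key=%r len=%d bytes=[%s]" % (key, len(key), hex_bytes))
```

Replacing `table[key]` with `checked_lookup(table, key)` in the failing loop would keep the normal behavior on a hit and only add output on the failure path.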
More testing revealed that adding print statements again bricks the board predictably, without the KeyError. This definitely smells like a memory corruption issue.
Another idea I have to make progress is to run this post-processing step on the host computer. The only reason it needs to run on the board and load images is to call find_lbp on them. Is there some way to serialize the LBP result to disk? If I could do that immediately after taking the picture, it would save a lot of time and I could hopefully work around this problem. When I try to print the find_lbp result I get {}, which is expected because that’s what the print function does in the library source.
Is there a description of this returned structure somewhere and how to pick it apart? Even if I have to manually iterate over a complex data structure, it would still be faster doing that and then offloading this final processing step to a beefier machine.
As for the issue, if you can make a minimal program that causes the error, I can debug it. Also, you may wish to update to v5.0.0 after you have the new IDE installed via Tools->Install Latest dev release.
We’ve been talking in the other thread about to_grayscale as well. It seems these problems are related. As I mentioned there, the board is now working as expected. Making a small reproducer is probably going to be tricky, but if I see the problem again soon, I’ll see what I can do.