Detecting Letters

Where should the Model Explorer be? I can't find it under Tools → Machine Vision.

It's in the CNN library; copy the model to your camera's SD card.

Okay, releasing the IDE now. The Raspberry Pi build works.

This is the official OpenMV Cam H7 release!

Hello! My camera finally arrived, so I got to test a few things. I came to the conclusion that using templates might be an easy way to detect letters. Do you think using templates is a reliable way to detect letters? Or how would you do it?

When I tested it with the printed letters, it worked, but only at a certain range. If I was too far away or too close, it wouldn't work.


This is the code. It is basically one of the examples, just a bit modified.

import time, sensor, image
from pyb import UART
from image import SEARCH_EX, SEARCH_DS
ser = UART(3,115200,timeout_char=1000)
# Reset sensor
sensor.reset()

# Set sensor settings
sensor.set_contrast(1)
sensor.set_gainceiling(16)
# Max resolution for template matching with SEARCH_EX is QQVGA
sensor.set_framesize(sensor.QQCIF)
# You can set windowing to reduce the search image.
#sensor.set_windowing(((640-80)//2, (480-60)//2, 80, 60))
sensor.set_pixformat(sensor.GRAYSCALE)

# Load template.
# Template should be a small (eg. 32x32 pixels) grayscale image.
template1 = image.Image("/H_Letter.pgm")
template2 = image.Image("/U_Letter.pgm")
template3 = image.Image("/S_Letter.pgm")
clock = time.clock()

# Run template matching
while (True):
    clock.tick()
    img = sensor.snapshot()

    # find_template(template, threshold, [roi, step, search])
    # ROI: The region of interest tuple (x, y, w, h).
    # Step: The loop step used (y+=step, x+=step) use a bigger step to make it faster.
    # Search is either image.SEARCH_EX for exhaustive search or image.SEARCH_DS for diamond search
    #
    # Note1: ROI has to be smaller than the image and bigger than the template.
    # Note2: In diamond search, step and ROI are both ignored.
    harmed = img.find_template(template1, 0.70, step=4, search=SEARCH_EX) #, roi=(10, 0, 60, 60))
    unharmed = img.find_template(template2, 0.70, step=4, search=SEARCH_EX)
    stable = img.find_template(template3, 0.70, step=4, search=SEARCH_EX)
    if harmed:
        img.draw_rectangle(harmed,5)
        ser.write(0x01)            #Error: TypeError: Object with buffer protocol required.
        print("Detected H")
    if unharmed:
        img.draw_rectangle(unharmed,5)
        ser.write(0x02)            # same Error
        print("Detected U")
    if stable:
        img.draw_rectangle(stable,5)
        ser.write(0x03)            # same Error
        print("Detected S")
    #if l:
    #  img.draw_rectangle(l,5)
    #  ser.write(0x02)
    print(clock.fps())

As you see, I get an error at ser.write(…). Is that error because I have no Arduino connected? Or what does this error mean? If I comment out those lines, everything works fine.


Thank you for your help,
Finn 🙂

Hi, write() doesn't take a number; it takes a string or bytes object. If you want to send the byte value 0x03, do write("\x03") or write(bytes([0x03])), etc.
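For example, a minimal fix for the three write calls above; either form sends one raw byte:

ser.write("\x01")           # H -> raw byte 0x01
ser.write("\x02")           # U -> raw byte 0x02
ser.write(bytes([0x03]))    # S -> raw byte 0x03, same thing as a bytes object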

Template matching is not scale or rotation invariant, so by definition it doesn't work when the scale (zoom) changes or the image is rotated. It is translation invariant, however.

Template matching is fine as long as you just need to find the x/y translation; it's no good for scale/zoom issues. That said, the CNN will struggle with the same issues, as it's not trained on rotated and scaled data.
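If you want to stay with template matching, one rough workaround is to save each letter template at a few different sizes and search for all of them. A minimal sketch, using your imports from above (the file names here are hypothetical, and every extra size costs frame rate):

# Hypothetical file names: the same letter saved at several template sizes.
h_templates = [image.Image(p) for p in ("/H_24.pgm", "/H_32.pgm", "/H_48.pgm")]

def find_any(img, templates, threshold=0.70):
    # Return the first match across all template sizes, or None.
    for t in templates:
        r = img.find_template(t, threshold, step=4, search=SEARCH_EX)
        if r:
            return r
    return None

Then harmed = find_any(img, h_templates) drops straight into your existing if harmed: block.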

Is it possible to use the Chars74K CNN only when there is actually a letter? For example, if there is only a white wall, it should just give nothing. Also, I tried the Chars74K network library on printed letters, but it gave me nothing useful, and the camera image was just black and white. It gave me B or P even though the letters were completely different, S or U for example; they don't look anything like P or B, not even close. If I change the contrast so that the image is not just completely black or white, would that change the letter detection? Or does it always detect the wrong letters, no matter what I do?

Thank you, Finn 🙂

I think I solved the problem that it always detects the wrong letter. But I still don't know how I can make it detect letters only when there is actually something there. What if I tell it to only start the detection procedure when there is something black?

And just one more question: Where do I find the ‘/img-chars74k.network’? Or does the ‘/fnt-chars74k.network’ work better for printed letters?

Um, so, there are three different networks for Chars74K. Only the fnt one works; the one trained on image letters just overfits and doesn't work in real life.

To make it work better, I'm pretty sure it would require edge detection first (using Canny) rather than the raw image data. This is because the general image dataset is full of characters of different resolutions, sizes, rotations, etc., so the net learns how to cheat the dataset instead of what is in the image, because there are too many artifacts.

Ok, and how would that work? What would the code look like?

Um, try out the Color Tracking → Single Color Greyscale tracking script. Adjust the thresholds to find the white character.

Once you do that, you can print out blob.rect() to see where the blob is. Then you pass blob.rect() (if found) to nn.forward(roi=blob.rect()).

That said, the rect may be too small, so you might want to temporarily create a new rect that's slightly bigger than the blob rect, as in the sketch below. For an example of doing this, see AprilTag Examples → higher resolution AprilTag example.
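Roughly, the whole pipeline would be something like this (an untested sketch; the threshold, padding, and blob filters are values you'd have to tune for your setup):

import sensor, image, time, nn

sensor.reset()
sensor.set_pixformat(sensor.GRAYSCALE)
sensor.set_framesize(sensor.QVGA)
sensor.skip_frames(time=500)

net = nn.load('/fnt-chars74k.network')
thresholds = (0, 40)                    # grayscale range of the dark letter; tune this

while True:
    img = sensor.snapshot()
    blobs = img.find_blobs([thresholds], pixels_threshold=100, area_threshold=100, merge=True)
    if blobs:
        blob = max(blobs, key=lambda b: b.pixels())   # classify the largest blob only
        # Grow the blob rect a little so the whole character fits inside the ROI,
        # and clamp it so it stays inside the image.
        x, y, w, h = blob.rect()
        rx = max(x - w // 10, 0)
        ry = max(y - h // 10, 0)
        rw = min(int(w * 1.2), img.width() - rx)
        rh = min(int(h * 1.2), img.height() - ry)
        out = net.forward(img, softmax=True, roi=(rx, ry, rw, rh))
        print(max(out))                 # confidence of the strongest class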

Alright…

This is the code I have so far:

import sensor, image, time, math, nn   # imports needed by the code below

# Color Tracking Thresholds (Grayscale Min, Grayscale Max)
# The below grayscale threshold is set to only find dark areas (the printed letters).
thresholds = (0, 40)
blobby = False

sensor.reset()                         # Reset and initialize the sensor.

# Set sensor settings
sensor.set_contrast(1)
sensor.set_gainceiling(16)

sensor.set_pixformat(sensor.GRAYSCALE) # Set pixel format to GRAYSCALE
sensor.set_framesize(sensor.QVGA)      # Set frame size to QVGA (320x240)
sensor.set_windowing((96, 96))       # Set 96x96 window.
sensor.skip_frames(time=500)
sensor.set_auto_gain(False)
sensor.set_auto_exposure(False)

# Load chars74 network
net = nn.load('/fnt-chars74k.network') # works on printed fonts
# net = nn.load('/hnd-chars74k.network') # works on handwritten chars (hnd path assumed)
# net = nn.load('/img-chars74k.network') # works on images of chars
labels = ['n/a', '0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
for i in range(ord('A'), ord('Z') + 1): labels.append(chr(i))
for i in range(ord('a'), ord('z') + 1): labels.append(chr(i))

clock = time.clock()                # Create a clock object to track the FPS.
while(True):
    clock.tick()                    # Update the FPS clock.
    img = sensor.snapshot()         # Take a picture and return the image.
    #imgBlob = sensor.snapshot()     # Picture of blob, to see the letter
    # Adjust the binary thresholds below if things aren't working - make sure characters are good.
    #img.find_edges(image.EDGE_CANNY, threshold=(100, 100))

    for blob in img.find_blobs([thresholds], pixels_threshold=100, area_threshold=100, merge=True):
        # These values depend on the blob not being circular - otherwise they will be shaky.
        if blob.elongation() > 0.5:
            img.draw_edges(blob.min_corners(), color=0)
            img.draw_line(blob.major_axis_line(), color=0)
            img.draw_line(blob.minor_axis_line(), color=0)
        # These values are stable all the time.
        img.draw_rectangle(blob.rect(), color=127)
        img.draw_cross(blob.cx(), blob.cy(), color=127)
        blobby = True
        # Note - the blob rotation is unique to 0-180 only.
        img.draw_keypoints([(blob.cx(), blob.cy(), int(math.degrees(blob.rotation())))], size=40, color=127)

    if blobby:
        out = net.forward(img.binary([(100, 255)]), softmax=True)
        max_idx = out.index(max(out))
        score = int(out[max_idx] * 100)
        if score < 50:
            score_str = "??:??%"
        else:
            score_str = "%s:%d%% " % (labels[max_idx], score)
        img.draw_string(0, 0, score_str, color=(0, 255, 0))
        print(score_str)
        blobby = False  # reset the flag for the next frame

    print(clock.fps())             # Note: OpenMV Cam runs about half as fast when connected
                                   # to the IDE. The FPS should increase once disconnected.

If I execute this code, it searches for blobs. As soon as it finds one, the image changes to binary (black & white) and it looks for the letter. But it looks for the letter in the whole image, not just where the blob is. Also, it stays in this binary mode (how can I exit this binary mode and go back to normal grayscale?). Another problem is that it detects blobs anywhere, but I think that is solvable with the threshold.

So, my questions are: how do I exit the binary image mode? And how do I change the search area to the size of the blob, so the camera only searches for letters within the blob rectangle?

Thank you for your help ^^

Blob detection works in binary mode too, so I'd just do everything in that mode. I.e., binarize the image first using color thresholds; then find_blobs() just looks for "white" in the binary image.

Do:

net.forward(img.binary([(100, 255)]), softmax=True, roi=blob.rect())

Or:

# Next we look for the character in an ROI that's bigger than the blob.
w = min(max(int(blob.w() * 1.2), 10), 160) # Not too small, not too big.
h = min(max(int(blob.h() * 1.2), 10), 160) # Not too small, not too big.
x = min(max(int(blob.x() + (blob.w()/4) - (w * 0.1)), 0), img.width()-1)
y = min(max(int(blob.y() + (blob.h()/4) - (h * 0.1)), 0), img.height()-1)
w = min(w, img.width() - x)  # Clamp so the ROI stays inside the image.
h = min(h, img.height() - y)
net.forward(img.binary([(100, 255)]), softmax=True, roi=(x, y, w, h))
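One more note on the binary question above: binary() modifies the frame buffer in place, which is why the preview looks black and white afterwards; the next sensor.snapshot() overwrites it with a fresh grayscale frame anyway. If your firmware version supports the copy=True argument, img.binary([(100, 255)], copy=True) would classify on a copy and leave the displayed image untouched.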

Can the board work without the laptop, too? If I connect it to my Arduino board but disconnect it from my laptop, how do I ensure that the code is on my camera and that the camera is turned on and does whatever the code tells it to do?

Tools → Save Script to OpenMV Cam
Tools → Reset OpenMV Cam

Ok. So now the code is saved on my camera. But how do I start the code from outside the IDE? Is there a way through Arduino coding? Or some kind of button or connection that has to be turned on?

Or does it always automatically execute the main.py file when connected correctly?

When the system is powered on, it starts running main.py.
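If you want a visible sign that main.py is actually running with no laptop attached, blink one of the onboard LEDs in your loop, e.g.:

import time
from pyb import LED

led = LED(3)                # 1 = red, 2 = green, 3 = blue on the OpenMV Cam

while True:
    led.toggle()            # heartbeat you can see without the IDE
    time.sleep_ms(500)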

I hope this message finds you well. I would like to politely and respectfully request your assistance with a code for letter detection. I am aware of your skills and expertise in this area, and I believe your contribution would be invaluable to my current project.

Would it be possible for you to share some code that can help me detect letters in a specific context? I need to perform this particular task to fulfill the requirements of my ongoing project.

I understand that your time is valuable, and I completely understand if you are unable to assist. However, if you are available and can provide any guidance, advice, or a code example, I would be extremely grateful.

Thank you in advance for your time and consideration. If there is anything I can do to assist you in the future, please do not hesitate to contact me.

Best regards

Hi, you need to train a CNN using Edge Impulse. These things are model-based now; there's not really much to it in terms of code. You just have to collect the dataset, upload it to Edge Impulse, get a CNN, and run it.
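On current firmware the old nn module is gone, and Edge Impulse exports a TFLite model instead. Assuming the default export file names (trained.tflite and labels.txt; check your download), the OpenMV side looks roughly like this:

import sensor, time, tf

sensor.reset()
sensor.set_pixformat(sensor.GRAYSCALE)
sensor.set_framesize(sensor.QVGA)
sensor.set_windowing((96, 96))          # match the input size you trained with
sensor.skip_frames(time=500)

net = "trained.tflite"                  # model file exported by Edge Impulse
labels = [l.rstrip('\n') for l in open("labels.txt")]

clock = time.clock()
while True:
    clock.tick()
    img = sensor.snapshot()
    # classify() runs the network over the image and returns a list of results.
    for obj in tf.classify(net, img, min_scale=1.0, scale_mul=0.8, x_overlap=0.5, y_overlap=0.5):
        predictions = sorted(zip(labels, obj.output()), key=lambda p: -p[1])
        print(predictions[0])           # best (label, confidence) pair
    print(clock.fps())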