Presence detection

Hi,
I am interested in knowing whether there is some kind of algorithm to detect human presence that can differentiate between kids, adults, and elderly people, as well as provide a head count within a certain area…
Any input would be great. Thank you.

Hi, you can do motion detection with the camera, and the thermal shield can be used for presence detection; however, it cannot classify people :slight_smile:

OK great. Thank you.

By the way, how mature is the module along with the software currently? Are there any restrictions in using it with a commercial application?

It’s a relatively new camera, so it will need more time to prove itself; users will be the judge of that. I can say that the HW is good and the software could use some more work, which will be my next task.
For commercial apps you need to make sure the licenses allow it (HW is CC, SW is MIT); a lawyer will probably be more useful to you there.

How about just detecting the on/off presence of an object, then? I.e., detecting when it is not present? There would be no need to classify it. Let’s say I have a table with multiple different objects on it and I want to find out whether a particular one is present or not. This is based on the location of the object, i.e. we can assume the locations and orientations of the objects are always pretty much the same (except that one is sometimes missing, namely the one whose presence I want to detect).

It would be even nicer if the system could detect any missing object in such a setup, or handle any orientations of objects, but is that too much to expect from the system?

This was an old post. :slight_smile:

Yes. But I thought it’d be better to continue this thread than to start a new one with a near-identical topic. Should I open a new thread instead, then?

No, that’s fine. I’m just saying the thread owner will not respond. Anyway, please expand on your last paragraph. What do you mean?

Ok thanks for the reminder. I’ll expand on what I am looking for:

It would be even nicer if the system could detect any missing object in such a setup, or handle any orientations of objects, but is that too much to expect from the system?

The idea is that the positions of the items are always more or less fixed. Sorry, I think I forgot to mention that earlier.

So, continuing with the ‘objects/items on a table’ example, I was wondering whether it would be possible to distinguish which item (or subset of items, out of a predefined, known total) is not present on the table. Of course, in the real world there are always slight inaccuracies in the positions and orientations of the items we try to detect.

Another example would be finding out how many cars are present on a small, say 5-car parking lot, and which of those parking slots are currently empty.

Should one treat the problem as a set of comparisons (one per missing item / set of missing items) of the full image against the same base image (all items present), or as a set of present/not-present comparisons of a partial image (one part per item to watch for)? Or something else entirely? Perhaps a Haar cascade trained to recognize the item type(s) that can be present, and then getting their positions?

I took a look at the docs and it seems like one could use the image.find_template(template, threshold) function. So, assuming we have an image, a template, and a suitably tuned threshold parameter, maybe the following code could work for detecting whether the “testitem” is present:

checked_item_locations = {"testitem": (100, 100, 50, 50)} # 50x50 box whose top left is at (100, 100)

def point_inside(point, container):
    "return True if an (x, y) point is inside a (topleftx, toplefty, width, height) container"
    x, y = point
    cx, cy, w, h = container
    # image coordinates: y grows downward, so the container spans cy..cy+h
    return (cx <= x <= cx + w) and (cy <= y <= cy + h)
 
def box_inside(box, container):
    "return True if a (topleftx, toplefty, width, height) box is fully inside a container of the same form"
    x, y, w, h = box
    corners = ((x, y), (x + w, y), (x, y + h), (x + w, y + h))
    for corner in corners:
        if not point_inside(corner, container):
            return False
    return True

def find_present(img, template, threshold, items=checked_item_locations):
    "return labels of all items matching the template inside their known bounding boxes"
    matches = []
    pad = 10  # tolerance for small position/orientation errors
    for label, (x, y, w, h) in items.items():
        # search only a padded region around the item's known location;
        # find_template returns the best match bbox in the roi, or None
        # (roi is clamped at the top-left here; clamp at the image edges too if needed)
        roi = (max(0, x - pad), max(0, y - pad), w + 2 * pad, h + 2 * pad)
        bbox = img.find_template(template, threshold, roi=roi)
        if bbox and box_inside(bbox, roi):
            matches.append(label)
    return matches
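
Something like this is how I imagine it would be wired up on the camera (untested; the template file name is made up, and find_template apparently works on grayscale images only):

import sensor, image

sensor.reset()
sensor.set_pixformat(sensor.GRAYSCALE)  # NCC template matching is grayscale-only
sensor.set_framesize(sensor.QVGA)
sensor.skip_frames(time=2000)           # let the sensor settle

template = image.Image("/testitem_template.pgm")  # hypothetical template file

while True:
    img = sensor.snapshot()
    print(find_present(img, template, 0.70))  # the 0.70 threshold is a guess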

Thoughts? After writing the above, I realized we could also work on partial images, given that we know the expected positions of the items. On the other hand, assuming all the bounding box contents look pretty much the same when empty (table surface or parking-lot asphalt), perhaps a more generic solution could be found? Something that merely checks the dominant average color of each bbox: if the color is mostly NOT the color of the table surface (or of the asphalt in the second example), the item is present / the slot is occupied.
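
For instance, a quick per-slot check along these lines might already do (I’m assuming OpenMV’s img.get_statistics(roi=...) here, and the calibration numbers are pure guesses):

# guessed calibration values; measure these for your own table/asphalt
EMPTY_MEAN = 120  # mean brightness of the bare surface (grayscale)
TOLERANCE = 25    # allowed drift before we call the slot "occupied"

def slot_occupied(img, bbox):
    "True if the patch inside bbox no longer looks like the empty surface"
    stats = img.get_statistics(roi=bbox)  # statistics over just this bbox
    return abs(stats.mean() - EMPTY_MEAN) > TOLERANCE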

Here’s an example algorithm for finding the most prominent color of an image.
(source: http://pieroxy.net/blog/pages/color-finder/index.html):

I have not yet checked what kind of support the OpenMV has for this kind of algorithm. But here it is.

Get all the pixels in the image and store their color values as keys in a hash map, where each value is the number of pixels of that color encountered so far. However, this alone gives poor results, so here are some improvements:

  • Process at most a certain number of pixels; if the image is bigger, start undersampling (perhaps not needed on the OpenMV, since the image is small?)
  • To avoid noise due to flat areas (text, borders, etc.), start by right-shifting all the RGB values by 6. All the values are then between 0 and 3, so each bucket encompasses a large range of almost-the-same colors.
  • Then perform another pass of the same algorithm (with a shift of 4) on all the pixels falling in the previous winning color group, and so on until a shift of zero is reached.

When trying to find the group that has the most pixels, use a callback to weight the score. This allows customization: you can tell the algorithm that you want, for example, a rather dark color, or that you want to exclude black, or only highly saturated colors excluding the greys, or whatever.
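
A minimal pure-Python sketch of those passes (my reading of the description; the 6/4/2/0 shift schedule and the plain max() scoring are assumptions, and the weighting callback would replace max()):

def dominant_color(pixels):
    "pixels: list of (r, g, b) tuples with 0-255 channels"
    candidates = list(pixels)
    winner = None
    for shift in (6, 4, 2, 0):  # coarse-to-fine passes, down to zero
        counts = {}
        for rgb in candidates:
            key = tuple(c >> shift for c in rgb)  # bucket similar colors
            counts[key] = counts.get(key, 0) + 1
        # a weighting callback could replace plain max() here, e.g. to
        # exclude black or favor saturated groups
        winner = max(counts, key=counts.get)
        # keep only the pixels in the winning group for the next, finer pass
        candidates = [rgb for rgb in candidates
                      if tuple(c >> shift for c in rgb) == winner]
    return winner  # at shift 0 this is an exact (r, g, b) color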

If the objects are in fixed positions you can then just use frame differencing with image masking. So, basically, you’d difference the current image against a background image and then apply a mask to zero out all the areas of the image you’re not interested in. You can have a background/mask for each object location on the SD card. You’d then just iterate through the different backgrounds/masks for each captured image to check whether one thing in particular changed (there’s a rough sketch at the end of this reply).

If the objects aren’t in fixed locations you could still do frame differencing. However, you wouldn’t know which object was which after you got the difference. But you could then run either blob detection or template matching to try to figure out what each object was. I know frame differencing and color tracking work well, so I’d start with that. See all the and/xor/or/nand stuff for masking operations. You can build image masks as bmp files. For example, use the AND operation with a mask that has white and black pixel areas to zero all pixels that aren’t in any white areas. Note, however, that the mask needs to be in the same format as the image you are applying it to. Use GIMP to make the images, because it allows you to control the image format.
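
Here’s a rough sketch of the fixed-position version (the file names are made up and the change threshold needs tuning; difference() and b_and() accept paths to uncompressed bmp/pgm files on the SD card):

import sensor

sensor.reset()
sensor.set_pixformat(sensor.GRAYSCALE)
sensor.set_framesize(sensor.QVGA)
sensor.skip_frames(time=2000)

# hypothetical files: one shared background plus one white/black mask per slot
SLOTS = {"slot1": "/mask_slot1.pgm",
         "slot2": "/mask_slot2.pgm"}
BACKGROUND = "/background.pgm"

while True:
    for label, mask_path in SLOTS.items():
        img = sensor.snapshot()
        img.difference(BACKGROUND)  # absolute difference against the stored background
        img.b_and(mask_path)        # zero every pixel outside this slot's white area
        # if nothing changed inside the masked area, the mean stays near zero
        if img.get_statistics().mean() > 5:  # threshold is a guess; tune it
            print(label, "changed")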

How can we count heads?

Hi Marcus, please start a new thread with your particular question, including plenty of detail about what you want to do.