Hi, if he gets it done over the weekend (unlikely, but maybe) then you can try to use the code I originally posted from ChatGPT.
If homography support is ported too then it becomes super easy.
That sounds great. However, the ChatGPT code you provided still references OpenCV for the homography, so I can't use it, and I don't really understand the math behind the homography code in the source, so I'm not sure how to apply it here. In fact, I don't even know how the AprilTag pose estimation determines the rotation and translation (or perhaps only the translation, since rotation is the same regardless of distance) if I never gave it the real-life size of my AprilTags - what units would the translation vector be in then? cm/mm/meters?
Another issue (or slight problem) I came across is that the store page lists the field of view as "HFOV = 70.8°, VFOV = 55.6°". However, when I measured it myself at QVGA by placing the camera over a protractor and moving an object to each edge of the field of view until it was just barely out of view, I got about 53 degrees horizontal and 45 degrees vertical. I assume this means setting the resolution actually does a digital crop and only uses a smaller subset of pixels near the center of the sensor, hence the smaller FOV? How would I go about getting the real-life size of a pixel anyway?
The sensor datasheet lists the size of the pixels in micrometers.
" HFOV = 70.8°, VFOV = 55.6°" the spec given the lens focal length. Itās not measured, though.
…
You need to give the AprilTag code the real-life tag size for the units to mean anything. Without the tag size it can only return scaleless units.
…
The first piece of code I posted from ChatGPT doesn't use any OpenCV code. However, it's just an idea. I'd iterate on what I posted by talking with the LLM itself.
Ahh alright, thanks! I'll have a look at the datasheet.
Hi, I just wanted to confirm something about the camera which affects what pixel size and focal length I use, as I'm a bit confused about the naming convention for the OpenMV camera boards. The OpenMV IDE says the board is an OpenMV Cam H7 (STM32H743) and the sensor is an MT9M114, but the bottom of the box it came in says it's an OpenMV Cam H7 R2, so I pulled up the store page for the H7 R2: OpenMV Cam H7 R2
In the store page image, the pins, chip, and board layout look the same as mine, but the lens there seems to be held in place by a screw, whereas mine has an M12 screw mount. The sensor does match what the IDE reports, though.
The H7 Plus, on the other hand (OpenMV Cam H7 Plus), looks like it has the right lens mount with the M12 screw, but the chip/processor looks different, and it has a different sensor and focal length as well.
Sorry for my confusion, but can I just confirm which board I'm actually using, and which sensor and focal length are correct for it?
The H7 Plus is the 5MP board.
Don't mind the lens holder; it's just a plastic piece on top of the sensor.
The M12 nut is way better than the screw on the side…
The H7 Plus has the OV5640 sensor.
Yes, we switched to the M12 nut as the lens screw is terrible.
So does that mean the camera I'm using is an H7 Plus or an H7 R2?
You can tell this from the board itself. The H7 R2 says it's an H7 unit; it just has a different camera sensor board. From what you said, you are using the R2.
Thanks for clarifying. I had a feeling I was using the R2, but I wanted an extra layer of confirmation.
Hi, can I check whether SVD has been implemented? I believe I've made progress calculating the homography matrix with numpy on a host computer using np.linalg.svd(); now I just need it to work on the OpenMV board.
Hi, yes, they finished porting it and are testing it now. Also, we found out how the homography code works: you only need the SVD function, and then I can supply you with the exact numpy code to compute it.
That sounds great! Roughly when would I be able to get my hands on that code?
From my research, I may possibly need QR decomposition as well, but I'm still experimenting. Right now I have successfully obtained the homography matrix; I just need to figure out how to obtain the camera pose from it, and that seems to involve QR decomposition. Will definitely need to do more research, though.
Hi, here's the discussion:
Here's how one computes the homography using SVD: apriltag/common/homography.c at master · AprilRobotics/apriltag (github.com)
(Note, the compute-inverse method doesn't really work. Use the SVD one.)
You can probably ask ChatGPT to turn that code into numpy code in python.
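For reference, a minimal numpy sketch of the DLT-style approach (stack the point correspondences and take the SVD) looks something like this. It skips the point conditioning/normalization that the C code does, so treat it as a starting point, not a drop-in port:

import numpy as np

def homography_from_points(src, dst):
    # src, dst: Nx2 arrays of corresponding points (N >= 4),
    # e.g. tag-frame corners -> detected pixel corners.
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        A.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    A = np.array(A)

    # The homography is the right singular vector associated with the
    # smallest singular value, reshaped to 3x3.
    _, _, Vt = np.linalg.svd(A)
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]

The AprilTag code builds the homography from the tag's own corner coordinates mapped to the detected image corners; check the linked file for the exact tag-frame convention it uses.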
Then use this to turn the homography into your pose: apriltag/common/homography.c at master · AprilRobotics/apriltag (github.com)
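Roughly, that function divides the camera intrinsics back out of the homography and rebuilds a rotation and translation from the remaining columns. Here's a hedged numpy sketch of the same idea; it omits the SVD-based rotation cleanup the C code applies at the end, so verify it against the linked source:

import numpy as np

def homography_to_pose(H, fx, fy, cx, cy):
    # Undo the intrinsics: columns 0 and 1 of H become the first two
    # rotation columns (up to scale), column 2 becomes the translation
    # (at the same unknown scale).
    R20, R21, TZ = H[2, 0], H[2, 1], H[2, 2]
    r0 = np.array([(H[0, 0] - cx * R20) / fx, (H[1, 0] - cy * R20) / fy, R20])
    r1 = np.array([(H[0, 1] - cx * R21) / fx, (H[1, 1] - cy * R21) / fy, R21])
    t  = np.array([(H[0, 2] - cx * TZ) / fx, (H[1, 2] - cy * TZ) / fy, TZ])

    # Scale so the two rotation columns have (geometric-mean) unit length.
    s = 1.0 / np.sqrt(np.linalg.norm(r0) * np.linalg.norm(r1))
    if t[2] > 0:            # keep the tag in front of the camera
        s = -s
    r0, r1, t = s * r0, s * r1, s * t

    # Third rotation column is the cross product of the first two.
    r2 = np.cross(r0, r1)

    pose = np.eye(4)
    pose[:3, 0], pose[:3, 1], pose[:3, 2], pose[:3, 3] = r0, r1, r2, t
    return pose

Note that the snippet below passes -fx, which I assume is just a coordinate-convention flip used by the firmware.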
Finally, use the code I posted:
matd_t *pose = homography_to_pose(det->H, -fx, fy, cx, cy);
lnk_data.x_translation = MATD_EL(pose, 0, 3);
lnk_data.y_translation = MATD_EL(pose, 1, 3);
lnk_data.z_translation = MATD_EL(pose, 2, 3);
lnk_data.x_rotation = fast_atan2f(MATD_EL(pose, 2, 1), MATD_EL(pose, 2, 2));
lnk_data.y_rotation = fast_atan2f(-MATD_EL(pose, 2, 0), fast_sqrtf(sq(MATD_EL(pose, 2, 1)) + sq(MATD_EL(pose, 2, 2))));
lnk_data.z_rotation = fast_atan2f(MATD_EL(pose, 1, 0), MATD_EL(pose, 0, 0));
That gets you the x/y/z translation and rotation from the pose matrix.
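In numpy that last step is just indexing the matrix plus a couple of atan2 calls, e.g.:

import math

def pose_to_translation_rotation(pose):
    # pose: the 4x4 matrix from the homography_to_pose sketch above.
    tx, ty, tz = pose[0, 3], pose[1, 3], pose[2, 3]
    x_rot = math.atan2(pose[2, 1], pose[2, 2])
    y_rot = math.atan2(-pose[2, 0], math.hypot(pose[2, 1], pose[2, 2]))
    z_rot = math.atan2(pose[1, 0], pose[0, 0])
    return (tx, ty, tz), (x_rot, y_rot, z_rot)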
Note that the final numbers are in "whatever units", so you need to use a scale factor and offset to put them into the correct units. Luckily, the math is linear, so one scale factor and one offset are all that's needed.
That's actually a problem I've been stuck on for a while: how do I know what scale factor converts my translation vector accurately to real-world units? That's quite critical for my application.
You measure and derive the scale factor. It's related to the constants in the imaging system, though. So, maybe you can work out what each value is and decompose a measured value into its real-world components.
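Concretely (with placeholder numbers), that's a handful of measurements at known distances and a linear fit:

import numpy as np

# Ground-truth distances measured with a ruler/tape (placeholder values),
# and the corresponding z_translation values the tag code reported.
measured_m = np.array([0.25, 0.50, 0.75, 1.00])
reported   = np.array([3.1, 6.3, 9.2, 12.5])   # "whatever units"

# Fit: measured = scale * reported + offset
scale, offset = np.polyfit(reported, measured_m, 1)

def to_meters(z_reported):
    return scale * z_reported + offset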
You mean I take a sample picture at a known distance, compare it against the translation vector reported, and see how much scaling it requires?
Could you elaborate a bit on what you mean by constants in the imaging system? In my head, I feel like if I know the real-life size and the camera's properties, then geometrically, shouldn't I be able to obtain an accurate pose in meters?
Hi, yes, those are the constants. You have to plug them into the camera matrix equation.
However, it's unlikely that the output will be correct even once you plug everything in. There will still be some small scale factor. That factor could be decomposed into constants that should be known, but when I've done this before, the result doesn't just come out perfect. Typically, the final scale factor has to do with the size of the target.
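As a sanity check on what those constants get you, the plain pinhole relation already gives a rough depth from the apparent tag size, independent of the homography path (all numbers below are placeholders):

# Pinhole model: apparent width in pixels = fx * real width / distance,
# so distance = fx * real width / apparent width.
fx = 930.0            # focal length in pixels (placeholder)
tag_width_m = 0.05    # printed tag size (placeholder)
tag_width_px = 62.0   # tag width measured in the image (placeholder)

z_m = fx * tag_width_m / tag_width_px
print(z_m)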
So would I even be able to accurately determine the position (including depth) of the camera relative to the screen using just 1 camera?
Yes, it should work pretty well.
Remember, though, that 1/(w*h) falls off in a non-linear way. So, expect good performance up close and worse performance far away.