AprilTag 3D pose estimation scaling

I’m having trouble understanding the scaling needed to get world coordinates from the pose estimate. In the z_to_mm function of the example “MAVLink AprilTags Landing Target Script”, how do you get the number 165, which divides the actual tag size in mm?

def z_to_mm(z_translation, tag_size): # z_translation is in decimeters...
    return (((z_translation * 100) * tag_size) / 165) - lens_to_camera_mm

In my case, using a known distance to the target, I find that it should be 215 instead of 165.



It was figured out empirically: by measuring the target distance, comparing it to what the function was returning, and then working out the value needed to get the right result.
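As a rough sketch of that calibration step (not taken from the example script): rearranging z_to_mm for its divisor gives the constant from a single measurement at a known distance. The name calibrate_divisor is hypothetical, and lens_to_camera_mm is passed in as a parameter here instead of being read as a global.

```python
def calibrate_divisor(z_translation, tag_size_mm, measured_mm, lens_to_camera_mm=0.0):
    """Solve z_to_mm for its divisor, given one reading at a known distance.

    z_to_mm computes:
        measured = (z_translation * 100 * tag_size) / divisor - lens_to_camera_mm
    so:
        divisor = (z_translation * 100 * tag_size) / (measured + lens_to_camera_mm)
    """
    return (z_translation * 100 * tag_size_mm) / (measured_mm + lens_to_camera_mm)


def z_to_mm(z_translation, tag_size_mm, divisor, lens_to_camera_mm=0.0):
    """Same formula as the example script, with the divisor made explicit."""
    return ((z_translation * 100 * tag_size_mm) / divisor) - lens_to_camera_mm
```

To use it, place the tag at a measured distance, read z_translation once, compute the divisor, and then pass that divisor to z_to_mm for later readings. The result only holds for the lens setting and resolution it was measured at.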

Okay. Do you know of a theoretical approach for arriving at that factor to get accurate scaling, or could you point me toward something in that direction?

I wasn’t able to figure out how to get that factor by looking through the source code in apriltag.c or other AprilTag repos online that use a similar homography-to-pose approach.

As a note, I’m using a varifocal lens, and my target distance and intrinsic matrix change from case to case.

Thanks for the help,


Hi, it’s based on the tag homography and the camera matrix. Our code just applies the camera matrix to the tag homography, and that gives the translation and pose (3D rotation) of the tag relative to the camera. You should look at the AprilTag 3D pose example, which explains more of what’s going on.
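For intuition about where the scale comes from, here is a minimal sketch assuming a plain pinhole camera model. It is not the library's homography decomposition, only the similar-triangles relation underneath it. The function names and the sample numbers in the comments are illustrative assumptions.

```python
def focal_length_px(focal_mm, sensor_mm, image_px):
    """Focal length in pixels along one axis (pinhole model).

    focal_mm:  lens focal length in mm (changes with zoom on a varifocal lens)
    sensor_mm: physical sensor size along that axis in mm
    image_px:  image size along that axis in pixels
    """
    return (focal_mm / sensor_mm) * image_px


def pinhole_distance_mm(f_px, tag_size_mm, tag_size_px):
    """Distance to a fronto-parallel tag from its apparent size (similar triangles)."""
    return f_px * tag_size_mm / tag_size_px


# Example values only: a 2.8 mm lens on a 3.984 mm wide sensor at 160 px width.
f_x = focal_length_px(2.8, 3.984, 160)
# A 165 mm tag that appears 20 px wide in that image.
distance = pinhole_distance_mm(f_x, 165.0, 20.0)
```

Under this model the distance scales linearly with the focal length in pixels, so a constant tuned for one lens setting or resolution will not carry over to another. With a varifocal lens, the focal lengths and image center in the camera matrix would need to match the current setting.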