@Fressbrett
Disclaimer: this is just how I understand it, and it is meant to add to the conversation, nothing more.
Some additional info, based on my assumptions about what issue you're reporting.
1st, the question of why we only care about height.
FOV is an angle representing the vertical field of view, so we calculate the vertical extent.
The scale is the same on all axes, so
var scale = Vector3.one * (camHeight / Screen.width) * _scaleFactor;
would give you a Vector3 scale.
Thus, if you wanted a world canvas to fit a 16:9 screen perfectly, you would set its pixel resolution to some multiple of 16:9 (1920 by 1080, for example), and then, when you set its scale (not its resolution), it will scale to keep that fit.
So if you need to fit the screen, you need to change the scale based on the distance from the camera, and set the resolution to match the screen resolution.
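A minimal sketch of the height-based calculation, assuming the canvas's pixel height is meant to fill the full vertical extent of the view at a given distance. The class and method names here are hypothetical and are not taken from the code discussed above. It uses plain `System.Math` so it runs outside Unity; inside Unity you would typically feed it `Camera.fieldOfView` (the vertical FOV in degrees) and could use `Mathf.Tan` with `Mathf.Deg2Rad` instead.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class WorldCanvasScale
{
    // Height of the view frustum, in world units, at `distance` in front of
    // a perspective camera with the given vertical field of view.
    public static double FrustumHeight(double verticalFovDegrees, double distance)
    {
        double halfFovRadians = verticalFovDegrees * Math.PI / 180.0 * 0.5;
        return 2.0 * distance * Math.Tan(halfFovRadians);
    }

    // Uniform scale (same on every axis) at which a canvas that is
    // `canvasPixelHeight` units tall spans the whole frustum height.
    // Assumption: the canvas is centred on and facing the camera.
    public static double UniformScale(double verticalFovDegrees, double distance, double canvasPixelHeight)
    {
        return FrustumHeight(verticalFovDegrees, distance) / canvasPixelHeight;
    }
}
```

Because the frustum height grows linearly with distance, doubling the distance doubles the scale, which is why the scale has to follow the distance from the camera while the resolution stays fixed.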
2nd, you said it gets a little bigger as it gets further away. This is due to either projection or rounding.
What I think you might be seeing in terms of width is due to projection.
Having a wider screen, like my 32:9 vs. a 21:9 vs. a 16:9 etc., will cause distortion as you approach the left and right extents of the screen. This is due to the projection.
In short, a perspective camera applies a projection matrix that causes distortion toward the left and right edges of the screen. This effect gives you a sense of depth and angle.
If you want to see this at its extreme, just turn the FOV up really high, put a cube in the middle of the screen, and rotate the camera on its Y axis so you're looking left and right. You will notice that as the cube moves toward the edge of the screen it appears to stretch, even though its distance from the camera stays constant.
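The stretch can be put into rough numbers. This is a sketch under the assumption of an ideal rectilinear (pinhole) perspective projection and a small object kept at a constant distance from the camera; the names are hypothetical. Such a projection maps a view angle θ to a screen position proportional to tan θ, so the on-screen size along the direction away from the screen centre grows by 1 / cos²θ compared with the centre.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class EdgeStretch
{
    // Half of the horizontal field of view, in degrees, derived from the
    // vertical field of view and the aspect ratio (width / height).
    public static double HorizontalHalfFovDegrees(double verticalFovDegrees, double aspect)
    {
        double halfVertical = verticalFovDegrees * Math.PI / 180.0 * 0.5;
        return Math.Atan(Math.Tan(halfVertical) * aspect) * 180.0 / Math.PI;
    }

    // Stretch factor, relative to the screen centre, of a small object seen
    // `offAxisDegrees` away from the view direction at constant distance.
    public static double RadialStretch(double offAxisDegrees)
    {
        double c = Math.Cos(offAxisDegrees * Math.PI / 180.0);
        return 1.0 / (c * c);
    }
}
```

With a 60 degree vertical FOV, the edge of a 16:9 view sits roughly 46 degrees off axis (stretch of about 2), while the edge of a 32:9 view sits roughly 64 degrees off axis (stretch of about 5), so the effect is far more visible on the wider screen.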
As to the rounding point: you're starting at 0.1. A float only holds about seven significant digits, so it gets sloppy with very small and very large numbers. If your scale starts to get down to 0.000001, you'll notice the display switches to scientific notation and it starts to lose some precision as you keep going. It's the same issue that makes cameras shake far from the origin (0,0,0) and makes shadows flicker far from the origin (0,0,0).
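To get a feel for the size of the rounding steps, here is a small sketch that estimates the gap between neighbouring 32-bit float values near a given magnitude. It assumes IEEE 754 single precision with a 24-bit significand and only covers the normal range; the helper name is hypothetical.

```csharp
using System;

// Hypothetical helper for illustration only.
public static class FloatSpacing
{
    // Approximate distance between adjacent 32-bit floats near `magnitude`.
    // Finds the power-of-two exponent of the value, then steps down 23 bits
    // (the stored fraction width of a single-precision float).
    public static double GapNear(double magnitude)
    {
        double m = Math.Abs(magnitude);
        if (m == 0.0 || double.IsInfinity(m) || double.IsNaN(m))
            throw new ArgumentOutOfRangeException(nameof(magnitude));

        int exponent = 0;
        while (m >= 2.0) { m /= 2.0; exponent++; }
        while (m < 1.0) { m *= 2.0; exponent--; }
        return Math.Pow(2.0, exponent - 23);
    }
}
```

Near 1.0 the gap is about 0.00000012, while near 100000 it is about 0.0078, so a coordinate that far from the origin can only move in steps of that size. The gap shrinks along with the value, so a very small scale on its own still keeps its significant digits; trouble is more likely once tiny and huge values are combined in the same calculation.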
If I assumed wrong about what you were saying, sorry.
The math provided is correct in that it finds the relative scalar, that is, the apparent difference in size of an object based on its distance from the screen and the FOV of the camera. Things that will make it less than perfect are, as I noted, being off the center of the screen and having a higher FOV, and I suppose at extreme values (very small or very large scale) floating point could be an issue.