Good day! For those who don’t know me, my name is Alexey. I am a physicist by education and have done my part in both experiment and theory. Some time ago I “pivoted” into applied sciences and industrial R&D. In particular, I did some research in computer vision, focusing mainly on algorithms based on physical principles (as opposed to heuristics that rely on trained models, such as neural networks).
During my transition from fundamental sciences, I have learned many exciting new concepts and ideas, but have also encountered some surprises “of the wrong kind”. These are mostly entrenched conventions and decisions that may have made sense at some point in the past, but today are just a form of technical debt.
One such archaism is the typical way we (people in computer vision) treat digital cameras. I am not saying we don’t respect cameras enough or handle them without sufficient care. No, my criticism rather concerns our widespread failure to recognize cameras as bona fide measurement devices. See, once you calibrate an “undisputed” sensor, such as a thermometer, you get a model that helps you convert the readings (e.g., in the form of positions of the indicator arrow) into the measured value, such as temperature. In addition, though, you must also establish the measurement uncertainty – the range of possible deviation of the true values from the measurement outcomes. This uncertainty, originating from the underlying physics, the design of the sensor, and/or the calibration procedure itself, becomes an important characteristic of the device. Any metrologist can tell you that precision converts to money on a very non-linear scale, so an accurate calibration may turn out to be more expensive than the sensor itself.
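To make this concrete, here is a minimal sketch (my own illustration, not taken from any of the referenced papers) of what such a calibration looks like for a simple sensor: fit a model that converts raw readings into temperatures, and report the measurement uncertainty alongside every converted value. The numbers are made up for the example.

```python
import numpy as np

# Hypothetical calibration data: raw sensor readings vs. reference temperatures
readings = np.array([102.0, 215.0, 333.0, 448.0, 561.0])   # e.g., ADC counts
reference_temp = np.array([10.0, 20.1, 30.0, 39.8, 50.2])  # degrees Celsius

# Calibration model: a straight line, temperature = a * reading + b
a, b = np.polyfit(readings, reference_temp, deg=1)

# Measurement uncertainty: the residual spread of the fit (1-sigma),
# which from now on accompanies every measurement made with this sensor
residuals = reference_temp - (a * readings + b)
sigma = residuals.std(ddof=2)  # two fitted parameters

new_reading = 390.0
print(f"T = {a * new_reading + b:.2f} +/- {sigma:.2f} °C")
```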
Compare that picture to the typical scenario in a computer vision lab, which only by a very large stretch can be called “geometric camera calibration”. Some poor students perform the so-called “checkerboard dance” until they are bored or are called to lunch, then execute some code copied from a random website and derive a few numbers they do not fully understand. The senior colleagues do not pay much attention: they work on the exciting main application that uses the camera – for instance, an algorithm for the autonomous navigation of a flying robot. However, when the robot crashes into a wall in the first test, it is really difficult to relate that unfortunate outcome to a poor-quality calibration, since the latter produced no interpretable quality metrics.
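For readers who have not suffered through this themselves, the “code copied from a random website” usually boils down to OpenCV’s standard checkerboard calibration, condensed below (my sketch; the checkerboard geometry and image folder are assumed for illustration). Note that the only quality figure it reports is a single RMS reprojection error in pixels – nothing that tells you how uncertain the recovered camera parameters actually are.

```python
import glob
import cv2
import numpy as np

pattern = (9, 6)  # inner corners of the checkerboard (assumed geometry)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for path in glob.glob("checkerboard_images/*.png"):  # hypothetical image folder
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# "A few numbers they do not fully understand": intrinsics and distortion
rms, camera_matrix, dist_coeffs, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
print("RMS reprojection error (px):", rms)
print("Camera matrix:\n", camera_matrix)
print("Distortion coefficients:", dist_coeffs.ravel())
```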
In this series of posts I will try to convince you that it is in fact possible to redefine the concept of camera calibration in computer vision with more rigour and finally turn cameras into “first-class” measurement devices. Moreover, this seemingly philosophical change of mindset may bring quite tangible benefits: for example, it may become possible to accomplish some challenging tasks using less expensive hardware.
Before we start, I need to make an important disclosure: my colleagues and I have implemented some of the discussed methods as a web service, running at https://radiant-metrics.com. There you can try calibrating your camera the “new” way without having to worry about the mathematical details, and decide whether you agree with my sentiments and conclusions.
As another necessary side note, most of the ideas and concepts that I am going to discuss have been published in recent years in scientific journals and conference proceedings [1], [2], [3]. Feel free to look there if you need more details or if you prefer a more condensed writing style.
References:
[1] A. Pak, S. Reichel, J. Burke. Machine-learning-inspired workflow for camera calibration. Sensors 22 (18), 6804 (2022).
[2] A. Pak, A. Mir. Cloud-based metrological-quality camera calibration. Proc. SPIE 12893, Photonic Instrumentation Engineering XI, 128930Z (2024).
[3] A. Pak. The concept of smooth generic camera calibration for optical metrology. TM-Technisches Messen 83 (1), 25–35 (2016).