Specialised global methods for binocular and trinocular stereo matching
Horna Carranza, Luis Alberto
The problem of estimating depth from two or more images is a fundamental problem in computer vision, which is commonly referred as to stereo matching. The applications of stereo matching range from 3D reconstruction to autonomous robot navigation. Stereo matching is particularly attractive for applications in real life because of its simplicity and low cost, especially compared to costly laser range finders/scanners, such as for the case of 3D reconstruction. However, stereo matching has its very unique problems like convergence issues in the optimisation methods, and challenges to find matches accurately due to changes in lighting conditions, occluded areas, noisy images, etc. It is precisely because of these challenges that stereo matching continues to be a very active field of research. In this thesis we develop a binocular stereo matching algorithm that works with rectified images (i.e. scan lines in two images are aligned) to find a real valued displacement (i.e. disparity) that best matches two pixels. To accomplish this our research has developed techniques to efficiently explore a 3D space, compare potential matches, and an inference algorithm to assign the optimal disparity to each pixel in the image. The proposed approach is also extended to the trinocular case. In particular, the trinocular extension deals with a binocular set of images captured at the same time and a third image displaced in time. This approach is referred as to t +1 trinocular stereo matching, and poses the challenge of recovering camera motion, which is addressed by a novel technique we call baseline recovery. We have extensively validated our binocular and trinocular algorithms using the well known KITTI and Middlebury data sets. The performance of our algorithms is consistent across different data sets, and its performance is among the top performers in the KITTI and Middlebury datasets. The time-stamped results of our algorithms as reported in this thesis can be found at: • LCU on Middlebury V2 (https://web.archive.org/web/20150106200339/http://vision.middlebury. edu/stereo/eval/). • LCU on Middlebury V3 (https://web.archive.org/web/20150510133811/http://vision.middlebury. edu/stereo/eval3/). • LPU on Middlebury V3 (https://web.archive.org/web/20161210064827/http://vision.middlebury. edu/stereo/eval3/). • LPU on KITTI 2012 (https://web.archive.org/web/20161106202908/http://cvlibs.net/datasets/ kitti/eval_stereo_flow.php?benchmark=stereo). • LPU on KITTI 2015 (https://web.archive.org/web/20161010184245/http://cvlibs.net/datasets/ kitti/eval_scene_flow.php?benchmark=stereo). • TBR on KITTI 2012 (https://web.archive.org/web/20161230052942/http://cvlibs.net/datasets/ kitti/eval_stereo_flow.php?benchmark=stereo).