
Landmark Localization in Depth Images


Correct and reliable localization of facial landmarks enables applications in many fields, ranging from Human-Computer Interaction to video surveillance.
For instance, it can provide valuable input for monitoring the driver's physical state and attention level in an automotive context. In this paper, we tackle the problem of facial landmark localization with a deep learning approach. The developed system is fast and, thanks to the use of depth images as input, more reliable than state-of-the-art competitors, especially in the presence of light changes and poor illumination. We also collected and shared a new realistic dataset acquired inside a car, called MotorMark, to train and test the system. In addition, we used the public Eurecom Kinect Face Dataset for the evaluation phase, achieving promising results in terms of both accuracy and computational speed.


The autonomous driving of on-road vehicles is one of the most challenging and topical problems for both the research and industrial communities. In recent years, it has attracted the attention of numerous researchers from different disciplines, with a strong involvement of the ICT community. Among these disciplines, Computer Vision plays a leading role in two main aspects.
First, Computer Vision and Pattern Recognition techniques are applied to assist or even replace traditional sensors in the perception of the surrounding context, i.e., the outside world.
Second, the ability to monitor the behavior of passengers and drivers is fundamental, for example as a safety aid to enable fully or semi-autonomous driving: the automatic system can request the driver's intervention or, at least, his/her attention in exceptional cases.
In this case, vision systems must operate on images provided by internal cameras, installed and configured to monitor the passengers and the driver.

Reliable localization of facial landmarks, i.e., the ability to infer the position of prominent face elements relative to the view of the acquisition device, is one of the basic components of driver physical state analysis. It supports direct monitoring of the eyes or mouth, facial expression recognition, and head pose estimation, all of which are also fundamental for driver attention analysis, as reported in the literature.
Facial landmark localization is also an important task in Computer Vision, and a key element for many other fields, such as age estimation, sign language recognition \cite{ari2008} and various applications in biometrics.
Many solutions to facial landmark localization have been proposed in the last decades. However, the automotive context is characterized by additional issues such as strong occlusions, dramatic light changes, and high head pose variability. Moreover, it is preferable that the acquisition device be non-intrusive (no physiological signals such as EEG, ECG, or EMG) and that initialization or per-user training be avoided.

Download MotorMark Dataset



1 Frigieri, Elia; Borghi, Guido; Vezzani, Roberto; Cucchiara, Rita. "Fast and Accurate Facial Landmark Localization in Depth Images for In-car Applications." Proceedings of the 19th International Conference on Image Analysis and Processing, Catania, 11-15 September 2017.
