Abstract
Accurate blink detection is crucial for applications such as eye tracking, driver drowsiness detection, brain-computer interfaces, the diagnosis of neurological disorders, and the study of visual behavior. Despite extensive research in the area, some current mobile eye-tracking devices still struggle to detect blinks reliably, which degrades their tracking accuracy. This research develops a deep learning method to reliably detect and classify blinks for use in mobile eye-tracking devices. The approach comprises a variational autoencoder (VAE) stage followed by a feature classifier. The VAE learns a compressed representation of the input image: an encoder network transforms input images into a low-dimensional latent space, and a decoder network reconstructs them from that space. After training, the encoder is used to map images into the latent space, and these latent representations (i.e., features) are fed into a classifier for blink detection. To characterize the proposed method, four VAE models (A, B, C, and D) of different complexity, three latent dimensions (2, 4, and 6), and four classifiers were trained. The findings suggest that higher-dimensional latent representations allow the VAE models to capture more of the features essential for blink detection, improving the discriminative power of classifiers operating on the latent representation. Among the classifiers, the K-Nearest Neighbours (KNN) algorithm achieved the highest accuracy. The VAE with five convolutional layers in the encoder (model B) and a latent dimension of 6 performed best, reaching an accuracy of 99.40%. The proposed method outperformed other methods tested on the same dataset, indicating the effectiveness of the VAE-based approach to blink detection.
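The classification stage described above can be illustrated with a minimal sketch: a K-Nearest Neighbours vote over low-dimensional latent vectors, here 6-dimensional as in the best-performing configuration. The data, cluster locations, and function names below are synthetic placeholders for illustration only, not the paper's dataset or implementation.

```python
import numpy as np

def knn_predict(train_z, train_y, query_z, k=5):
    """Majority vote among the k nearest latent vectors (Euclidean distance)."""
    dists = np.linalg.norm(train_z - query_z, axis=1)  # distance to every training vector
    nearest = np.argsort(dists)[:k]                    # indices of the k closest
    return np.bincount(train_y[nearest]).argmax()      # most frequent label wins

rng = np.random.default_rng(0)
# Synthetic 6-D latent clusters standing in for VAE-encoded eye images:
open_eyes = rng.normal(loc=0.0, scale=0.5, size=(50, 6))  # label 0: eye open
blinks = rng.normal(loc=2.0, scale=0.5, size=(50, 6))     # label 1: blink
train_z = np.vstack([open_eyes, blinks])
train_y = np.array([0] * 50 + [1] * 50)

query = rng.normal(loc=2.0, scale=0.5, size=6)  # a new sample near the "blink" cluster
print(knn_predict(train_z, train_y, query, k=5))
```

In the paper's pipeline, the latent vectors would come from the trained VAE encoder rather than a random generator; KNN then needs no further training beyond storing those encoded features.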