Abstract
Optical Coherence Tomography (OCT) is a crucial ophthalmic imaging technique, but it is subject to speckle noise, which can obscure pathological features and hinder accurate segmentation. A standard approach to reducing OCT image noise is to capture and average multiple (typically 30) B-scans at the same tissue location. However, this is not always possible, particularly for children and patients with pathology, who have more difficulty fixating during the scanning process. In the worst case, the OCT scanner stops before acquiring the target number of scans because the exposure time becomes too long. It is therefore important to seek ways of synthesising better-quality images that rely on fewer scans. This paper frames the OCT image reconstruction problem as a multiple-input extension of the image-to-image translation problem. The goal is to match the perceptual quality of the "gold standard" averaged image by synthesising a new image from between 1 and 8 B-scan slices. Building on previous work, we propose a multi-input conditional GAN-based architecture that combines a lightweight U-Net generator with a PatchGAN discriminator. Additionally, we examine the impact of a gated attention (GA) mechanism within the U-Net architecture, and the impact of increasing the filter depth. Experiments were conducted with varying numbers of input slices, comparing the results to averaged images derived from the same number of slices as well as to the gold standard averaged over 30 B-scans. There is no single reliable metric for image quality. This paper therefore compares images over a range of indicative metrics, including peak signal-to-noise ratio (PSNR), structural similarity (SSIM), texture preservation (TP) and edge preservation (EP), as well as three no-reference image sharpness metrics: just noticeable blur (JNB), perceptual sharpness index (PSI) and spectral and spatial sharpness (S3).
We found that GANs with GA generally outperformed both GANs without GA and traditional averaging on PSNR, TP and EP. On PSI and S3, the performance of the GANs approached that of the gold standard, while on JNB the results were less definitive. Overall, the multi-image-to-image translation GAN outperforms traditional averaging in OCT image reconstruction when only a limited number of input slices is available, offering enhanced contrast and sharpness and making it a promising alternative for clinical applications.
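To make the averaging baseline and the PSNR comparison concrete, the following is a minimal sketch, not the pipeline used in this paper. It assumes the B-scans are already registered and stored as a NumPy array, and it uses synthetic data with a gamma-distributed multiplicative term as a crude stand-in for speckle. The function names `average_bscans` and `psnr`, the image size and the noise parameters are all illustrative assumptions.

```python
import numpy as np


def average_bscans(scans):
    """Pixel-wise mean of registered B-scans with shape (n, H, W)."""
    return np.mean(np.asarray(scans, dtype=np.float64), axis=0)


def psnr(reference, test, data_range=1.0):
    """Peak signal-to-noise ratio in dB of `test` against `reference`."""
    mse = np.mean((np.asarray(reference) - np.asarray(test)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(data_range ** 2 / mse)


# Synthetic stand-in data (assumption): a random "clean" image corrupted by
# multiplicative, unit-mean gamma noise to mimic speckle very roughly.
rng = np.random.default_rng(0)
clean = rng.random((64, 64))
scans = clean[None, :, :] * rng.gamma(shape=4.0, scale=0.25, size=(30, 64, 64))

# "Gold standard": average over all 30 B-scans.
gold = average_bscans(scans)

# Baseline: average only the first n B-scans and score against the gold standard.
results = {n: psnr(gold, average_bscans(scans[:n])) for n in (1, 2, 4, 8)}
for n, value in results.items():
    print(f"{n} scan(s): PSNR = {value:.2f} dB")
```

On this synthetic data the PSNR of the n-scan average rises as n grows, which is the baseline behaviour a learned reconstruction from 1 to 8 slices would need to improve on.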