Abstract
White matter hyperintensities (WMH) of presumed vascular origin are frequently found in MRI scans of healthy older adults and are associated with aging and cognitive decline. Here, we compared and validated three algorithms for WMH extraction: FreeSurfer (T1w), UBO Detector (T1w + FLAIR), and FSL's Brain Intensity AbNormality Classification Algorithm (BIANCA; T1w + FLAIR), using a longitudinal dataset comprising MRI data of cognitively healthy older adults (baseline N = 231, age range 64-87 years). As a reference, we manually segmented WMH in T1w, three-dimensional (3D) FLAIR, and two-dimensional (2D) FLAIR images; these manual segmentations were used to assess the segmentation accuracy of the automated algorithms. Further, we assessed the relationships of the WMH volumes provided by the algorithms with Fazekas scores and age. FreeSurfer underestimated the WMH volumes and scored worst on the Dice Similarity Coefficient (DSC = 0.434), but its WMH volumes correlated strongly with the Fazekas scores (r$_{s}$ = 0.73). BIANCA achieved the highest DSC (0.602) in 3D FLAIR images. However, its associations with the Fazekas scores were only moderate, especially in the 2D FLAIR images (r$_{s}$ = 0.41), and many outlier WMH volumes were detected when exploring within-person trajectories (2D FLAIR: ~30%). UBO Detector performed similarly to BIANCA in DSC with both modalities and reached the best DSC in 2D FLAIR (0.531) without requiring a tailored training dataset. In addition, it achieved very high associations with the Fazekas scores (2D FLAIR: r$_{s}$ = 0.80). In summary, our results emphasize the importance of carefully considering the choice of WMH segmentation algorithm and MR modality.
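For readers unfamiliar with the DSC values reported above, the following is a minimal sketch of how a Dice Similarity Coefficient between a manual and an automated segmentation can be computed. It is illustrative only and is not the evaluation code used in this study: the function name `dice_coefficient`, the representation of a mask as a set of voxel coordinates, and the toy masks are all assumptions made for the example.

```python
from typing import Iterable, Tuple

Voxel = Tuple[int, int, int]  # hypothetical (x, y, z) index of a voxel labelled as WMH


def dice_coefficient(manual: Iterable[Voxel], automated: Iterable[Voxel]) -> float:
    """Return DSC = 2 * |A ∩ B| / (|A| + |B|) for two binary masks.

    Each mask is given as the collection of voxel coordinates labelled as WMH.
    """
    a, b = set(manual), set(automated)
    if not a and not b:
        # Assumed convention: two empty masks are treated as perfect agreement.
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


# Toy example: three voxels per mask, two of which overlap.
manual_mask = {(0, 0, 0), (0, 0, 1), (0, 1, 1)}
automated_mask = {(0, 0, 1), (0, 1, 1), (1, 1, 1)}
dsc = dice_coefficient(manual_mask, automated_mask)
```

A DSC of 1 indicates complete voxel-wise overlap and 0 indicates none, so an algorithm that underestimates WMH volume tends to score lower even when the voxels it does label are correct.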