degeneration (AMD). It is characterized by progressive atrophy of the retinal pigment epithelium, overlying photoreceptors, and underlying choriocapillaris.1 Areas of GA often initially appear extrafoveal, where they may cause difficulties in reading or dim-light vision.2 Over time the atrophic area may grow, and when it reaches the fovea, visual acuity is severely diminished. Prevalence of GA increases exponentially with age,3 and is highest in people of European ancestry.4 The number of people affected by GA is expected to increase further in the near future because of the ageing population.5
GA.6,7 However, several potential therapies are in clinical trials.8 For evaluation of these trials, reliable anatomic endpoints are required, as visual acuity alone provides insufficient insight into the severity of the disease.9 Growth rate of the atrophic area has been suggested as an important indicator of disease progression.9-11 However, the speed at which GA progresses varies greatly between subjects.12-14 Therefore, understanding the patterns associated with progression and the variability between subjects is important for the design and interpretation of clinical trials.
as manual delineation can be challenging and time-consuming,15,16 automatic segmentation could provide a scalable and reproducible alternative. Deep learning has emerged as a powerful technique for the automatic analysis of medical images.17 Deep learning models require labeled examples (training data) to tune their internal parameters. The model then learns to extract features that are important for the segmentation task without further need of explicit domain knowledge from experts. It has been applied successfully to color fundus images (CFIs) for classification of severity stages in AMD18,19 or diabetic retinopathy,20 and recently also for the detection of GA.21 Although manually labeled examples are still required for training and validation, the model can thereafter be applied to large data sets without further intervention of expert ophthalmologists.
extract structural characteristics of GA as seen in imaging that have been demonstrated to correlate with growth rate. For example, multifocal lesions grow faster than unifocal lesions22 and extrafoveal lesions grow faster than foveal lesions.13 Circular lesions have been demonstrated to grow at a slower rate than more irregularly shaped lesions.23 Baseline lesion area has been consistently associated with future growth, with larger lesions growing faster than smaller lesions.11,13,24,25 However, applying a square root transformation to the lesion size may remove this dependency.16,26 It is therefore hypothesized that lesions with approximately circular shape grow at a constant radial speed, thus leading to a quadratic growth of the area.16,27
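The link between constant radial growth and quadratic growth of the area can be made explicit with a short derivation. It assumes an idealized circular lesion; the symbols r_0 (initial radius) and v (radial speed) are introduced here for illustration only and do not appear in the cited studies.

```latex
\begin{align}
r(t) &= r_0 + v t, \\
A(t) &= \pi\, r(t)^2 = \pi\,(r_0 + v t)^2, \\
\sqrt{A(t)} &= \sqrt{\pi}\,(r_0 + v t), \qquad \frac{d\sqrt{A}}{dt} = \sqrt{\pi}\, v, \\
\frac{dA}{dt} &= 2\pi v\, r(t) = 2\sqrt{\pi}\, v\, \sqrt{A(t)}.
\end{align}
```

Under this idealization the square root of the area grows linearly in time at a rate that does not depend on lesion size, whereas the untransformed growth rate increases in proportion to the square root of the current area.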
most widely used, particularly in large epidemiologic studies.12 More recently fundus autofluorescence (FAF) and optical coherence tomography (OCT) have also become popular for the study of GA and GA progression.13,16,25 Several lesion characteristics visible on those modalities can be linked to progression of GA. For example, banded or diffuse perilesional patterns on FAF and structural abnormalities at the junctional zone on OCT have been associated with faster GA progression.13,25,28 Although GA may be detected earlier on FAF than CFI,29 good agreement on quantification of GA area in CFI between two independent reading centers has been demonstrated,11 and progression rates assessed from both FAF and CFI are highly correlated.13,29 CFI has the advantage that it is widely available, often over longer time periods, making it suitable for the study of long-term progression of GA. Previous work on automatic methods for segmentation of GA focuses mainly on OCT30-32 or FAF.33 Feeny et al.34 proposed a method based on a random forest classifier in CFI. In contrast, in this study we present a model that is based on deep learning. To the best of our knowledge, this is the first deep learning model for segmentation of GA in CFI.
model for segmentation of GA in CFIs, and 2) to demonstrate its utility in a longitudinal setting for the study of GA progression. The performance of the developed model is compared against four graders on a challenging dataset to evaluate its robustness. Next, the automatically segmented GA areas provide measures of structural characteristics related to lesion size, location and morphology. We investigate the associations between those structural characteristics at baseline and subsequent growth rate of GA. Finally, we combine GA growth rates across patients to obtain an estimate of average progression of GA area over time.
Data
segmentation were collected from the Blue Mountains Eye Study (BMES)35 and the Rotterdam Study (RS) cohorts I, II and III.36 The developed model was applied to CFIs from the Age-Related Eye Disease Study (AREDS)11 for the assessment of GA growth rate.
started between 1992 and 1994, and included 3,654 participants aged 49 or older. CFIs were obtained with a Zeiss fundus camera (Carl Zeiss, Oberkochen, Germany) for the first 3 visits and a CanonCF-60 DSi with DS Mark II body (Canon, Tokyo, Japan) for the 4th visit. The BMES was approved by the University of Sydney and the Sydney West Area Health Service Human Research Ethics Committees.
cohort I started in 1990 and included 7,983 participants aged 55 years and older. Cohort II started in 2000 and included 3,011 participants aged 55 years and older. Cohort III started in 2006 and included 3,932 participants aged 45 years and older. CFIs for the first examinations were obtained with a Topcon TRV-50VT (Topcon Optical Company, Tokyo, Japan), the last two examinations with a Topcon TRC 50EX and a Sony DXC-950P digital camera. The RS was approved by the Medical Ethics Committee of the Erasmus MC and by the Netherlands Ministry of Health, Welfare and Sport.
of AMD and cataract. Starting between 1992 and 1998, 11 clinics in the United States enrolled 4,757 participants aged between 55 and 80 years. Stereoscopic CFIs were acquired with a Zeiss FF-series camera (Carl Zeiss AG, Oberkochen, Germany). The AREDS was approved by an independent institutional review board at each clinical center.
up at six-month intervals, although the typical interval between available CFIs was one year. The BMES, RS and AREDS all adhere to the tenets of the Declaration of Helsinki.
from the BMES and RS sets. 26 images with mixed signs of AMD (neovascularization, bleedings, scars) were excluded in order to disambiguate overlapping areas. Furthermore, no GA was delineated in 43 images because it was either not present or ungradable, and 26 images were excluded due to poor image quality. The remaining 409 images were included for development of the model and evaluation of its performance. This set contains 87 images from BMES (26 participants, 43 eyes) and 322 images from RS (149 participants, 195 eyes). The 409 images represent 315 unique visits (some visits had two CFIs available).
GA and at least two years of follow up, following the grading available from the database of genotype and phenotype (dbGaP) 2014 table. Most of these images were stereoscopic, so this accounted for 2,750 unique acquisitions (eye-visit). Pixel to millimeter conversion was fixed for all images, based on the average distance between fovea and center of the optic disc measured in a subset of the images. This distance was assumed to be 4.5mm.37
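The fixed pixel-to-millimeter conversion described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names and the example pixel distance are hypothetical, and only the 4.5 mm fovea-to-disc distance comes from the text.

```python
# Assumed anatomical distance between fovea and optic disc center (from the text).
FOVEA_DISC_DISTANCE_MM = 4.5


def mm_per_pixel(fovea_disc_distance_px: float) -> float:
    """Return the image scale, given the fovea-to-disc distance in pixels."""
    if fovea_disc_distance_px <= 0:
        raise ValueError("distance in pixels must be positive")
    return FOVEA_DISC_DISTANCE_MM / fovea_disc_distance_px


def area_mm2(n_pixels: int, fovea_disc_distance_px: float) -> float:
    """Convert a lesion area in pixels to square millimeters."""
    scale = mm_per_pixel(fovea_disc_distance_px)
    return n_pixels * scale ** 2


# Hypothetical example: an average fovea-to-disc distance of 450 pixels.
example_area = area_mm2(10_000, 450.0)
```

Because the scale enters the area quadratically, an error in the assumed distance affects reported areas more strongly than reported square root growth rates.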
of experience), using an in-house created software platform for manual annotations (https://www.a-eyeresearch.nl/software/ophthalmology_workstation/).38 For RS, additional multimodal imaging (infrared, FAF and/or OCT) was available for some of the visits, and the platform allowed images of the same eye (both multimodal and longitudinal) to be aligned manually by identifying corresponding landmarks. The graders could simultaneously view images of the same eye using a synchronized cursor on multiple screens. GA was identified as absence of the retinal pigment epithelium and increased visibility of the choriocapillaris on CFI. Additional evidence from other modalities was used whenever available. Areas of macular and peripapillary atrophy were delineated as separate classes, but for this study only macular GA was used.
a way that each grader annotated approximately half of the entire set and every image was graded by at least two graders. Finally, a consensus grading was made for all images in both sets. During the consensus grading all graders decided together which of the individual gradings was most accurate, and updated this grading if necessary, until consensus was reached. If two CFIs of the same visit were present, both were included for model development and the delineated GA area was propagated from one image to the other by using the affine transformation calculated from the manual landmarks. For evaluation, only the CFI that was used to make the consensus grading was used.
Model
ensemble of several models, each trained with partly overlapping training sets. The network architecture (the topology of connections between internal parameters of the deep-learning model) for each model consisted of a deep encoder-decoder structure with residual blocks and shortcut connections, similar to De Fauw et al,39 but adapted to work with CFIs. This architecture, and its variations, can be characterized by a contracting path, in which the high-resolution input image is converted to a low-resolution abstract representation, followed by an expanding path in which the original resolution is reconstructed. The contracting and expanding path are connected by shortcut connections. This approach has been shown to be very effective for semantic segmentation in medical imaging for which large contextual information is required.
version of the same image, both resampled to 512x512 pixels. The contrast-enhanced image was obtained by subtracting a blurred image from the original image.40 The input was transformed through the many layers of artificial neurons in the contracting and expanding path, and ultimately yielded a new image in which the value of every pixel represented a likelihood of being part of an area of GA. A threshold was applied to this likelihood image to obtain the final GA area. More details about the model and the training procedure used for this study can be found in the supplementary material.
GA segmentation
validation scheme. Data from BMES and RS were merged into one dataset and randomly split at patient level into five approximately equal folds. In a rotating scheme, four folds were used for model training and validation (development set), while the remaining fold was used for performance evaluation (test set). Furthermore, four separate models were created within each development set. Each model used three folds for tuning of the internal parameters (training) and one for validation. An ensemble of these four models was then evaluated on the respective test set. Ultimately, an ensemble of the 20 obtained models (four models developed for each of the five rounds) constituted the final model.
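The rotating scheme can be enumerated with a short sketch. The helper names, the random seed and the round-robin assignment of patients to folds are assumptions made for illustration; the text only states that the split was random, at patient level, and into five approximately equal folds.

```python
import random


def assign_folds(patient_ids, n_folds=5, seed=0):
    """Map each unique patient ID to a fold index (patient-level split)."""
    unique_ids = sorted(set(patient_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    # Round-robin assignment keeps fold sizes within one patient of each other.
    return {pid: k % n_folds for k, pid in enumerate(unique_ids)}


def nested_splits(n_folds=5):
    """List (test_fold, validation_fold, training_folds) for every model."""
    splits = []
    for test_fold in range(n_folds):
        development = [f for f in range(n_folds) if f != test_fold]
        for validation_fold in development:
            training = [f for f in development if f != validation_fold]
            splits.append((test_fold, validation_fold, training))
    return splits


print(len(nested_splits()))  # 20 models: four per test fold, five test folds
```

Each model sees three of the five folds during training, which corresponds to the 60% of the data mentioned in the discussion.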
assessed using the Dice coefficient, which is defined as two times the intersection of two areas divided by the sum of the individual areas. Hence, a value of zero represents disjoint areas (no overlap), while a value of one represents perfect agreement. Dice coefficients were calculated between graders to assess the inter-observer agreement, whereas the areas delineated in the consensus grading were used as reference for the model. Note that the consensus grading was not independent of the individual gradings, and therefore could not be used as a reference to estimate graders' performance. Furthermore, intra-class correlation coefficient (ICC) of the GA area and of the square root of the GA area was used to measure agreement between graders and the model.
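The Dice coefficient as defined above can be written in a few lines. The sketch below uses plain Python on binary masks given as nested lists; returning 1.0 when both masks are empty is a convention assumed here, since the text does not state how that case was handled.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice coefficient between two binary masks of identical shape.

    Each mask is a nested list (rows of 0/1 values).
    """
    a = {(r, c) for r, row in enumerate(mask_a) for c, v in enumerate(row) if v}
    b = {(r, c) for r, row in enumerate(mask_b) for c, v in enumerate(row) if v}
    total = len(a) + len(b)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2 * len(a & b) / total


grader = [[1, 1], [0, 0]]
model = [[1, 0], [0, 0]]
score = dice_coefficient(grader, model)  # 2 * 1 / (2 + 1)
```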
GA growth rate
from AREDS for the analysis of GA progression. It is well-documented that GA area increases faster for larger lesions. To remove the dependency of baseline lesion size on growth rate, many researchers apply a square root transformation to the GA area.26 Similarly, we calculated the square root annual growth in millimeter per year for each eye to assess progression in the AREDS set.37 This value was obtained from the slope of a linear regression through the square root of the GA area for a selected set of timepoints. The selected set consisted of all available CFIs within a window of 2 years, for which the number of available CFIs was highest for the respective eye. The window was limited at 2 years because growth rate and lesion characteristics may change over time.23 We calculated the correlation of square root annual growth rate between fellow eyes, and compared growth rate between groups using an unpaired t-test for unilateral versus bilateral, unifocal versus multifocal and foveal versus extrafoveal cases.
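The window selection and slope estimate described above might look as follows. This is a sketch under stated assumptions rather than the authors' implementation: the function names are hypothetical, the 2-year window is treated as inclusive, and ties between equally populated windows are resolved in favor of the earliest one.

```python
import math


def best_window(times_years, width=2.0):
    """Indices of the visits in the `width`-year window holding the most visits."""
    order = sorted(range(len(times_years)), key=lambda i: times_years[i])
    best = []
    for pos, i in enumerate(order):
        window = [j for j in order[pos:] if times_years[j] - times_years[i] <= width]
        if len(window) > len(best):  # strict '>' keeps the earliest window on ties
            best = window
    return best


def least_squares_slope(x, y):
    """Slope of the ordinary least-squares line through the points (x, y)."""
    n = len(x)
    if n < 2:
        raise ValueError("at least two points are required")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all time points are identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def sqrt_annual_growth(times_years, areas_mm2, width=2.0):
    """Square root annual growth (mm/year) within the selected window."""
    idx = best_window(times_years, width)
    t = [times_years[i] for i in idx]
    s = [math.sqrt(areas_mm2[i]) for i in idx]
    return least_squares_slope(t, s)
```

For an eye with areas of 1, 4 and 9 mm² at years 0, 1 and 2, the square roots are 1, 2 and 3 mm, so the estimated growth is 1 mm/year.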
growth rate, we built a linear model based on features that were extracted from the segmented GA area at baseline (the first image within the selected window). Candidate features were area, perimeter, convex area, filled area, solidity (area / convex area ratio and area / filled area ratio), number of lesions, eccentricity, circularity, roundness and foveal involvement. Details on how these features were calculated can be found in the supplementary material. Associations between individual features and square root annual growth rate were calculated using univariate linear regression. Because the features were not independent, a multivariate linear model was created to further investigate which features best explain variation in square root annual growth rate. The multivariate model was built using forward selection, by iteratively adding the feature that yielded the highest increase in adjusted R² value, until it no further increased. When stereoscopic images were available, lesion characteristics were represented by the mean of the two calculated values. In order to obtain a more homogeneous set for the prediction model, we discarded images where the relative difference in GA area between the left and right stereoscopic image was more than 50%, and only included eyes with at least 2 years of follow up images.
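For reference, the selection criterion can be written out. The text does not give the formula, so the conventional definition of the adjusted coefficient of determination is assumed here, with n the number of eyes and p the number of features currently in the model (both symbols are introduced for illustration).

```latex
\bar{R}^2 = 1 - \left(1 - R^2\right)\frac{n - 1}{n - p - 1}
```

Because the correction term grows with p, adding a feature raises the adjusted value only if the gain in R² outweighs the penalty for the extra parameter, which gives forward selection a natural stopping point.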
mm² per year (not square root transformed) was estimated as a function of GA area, again using a linear regression for each eye through the GA area in a window of 2 years. This resulted in an estimate of GA growth (the slope of the regression), bounded by a minimum and maximum GA area. The estimated general GA growth for a given GA area was then represented by the mean of all growth estimates for which this GA area fell within the respective area bounds. Confidence intervals were estimated using bootstrapping.
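The pooling step can be sketched as below. The data layout (one tuple of minimum area, maximum area and growth per eye), the function names and the percentile form of the bootstrap are assumptions for illustration; the text only states that bootstrapping was used.

```python
import random


def mean_growth_at_area(area, estimates):
    """Mean growth over all eyes whose observed area range contains `area`.

    `estimates` is a list of (min_area, max_area, growth) tuples, one per eye.
    Returns None if no eye covers the requested area.
    """
    matching = [g for lo, hi, g in estimates if lo <= area <= hi]
    if not matching:
        return None
    return sum(matching) / len(matching)


def bootstrap_ci(area, estimates, n_resamples=1000, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling eyes with replacement."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_resamples):
        sample = [rng.choice(estimates) for _ in estimates]
        m = mean_growth_at_area(area, sample)
        if m is not None:  # skip resamples in which no eye covers `area`
            means.append(m)
    if not means:
        return None
    means.sort()
    lower = means[int((alpha / 2) * (len(means) - 1))]
    upper = means[int((1 - alpha / 2) * (len(means) - 1))]
    return lower, upper
```

Evaluating `mean_growth_at_area` on a grid of areas yields a growth curve of the kind shown in Figure 4, with `bootstrap_ci` supplying the shaded band.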
GA segmentation
measured in cross-validation in the BMES and RS data sets. Dice coefficients between two independent graders ranged from 0.72 ± 0.26 to 0.82 ± 0.21 (0.78 ± 0.24 on average). See Table 1 for more details. The intraclass correlation coefficient between the model and the consensus was 0.83 for GA area, and 0.84 for the square root of the GA area. Consistency in those values is further visualized in Figure 1 using Bland-Altman plots. The mean value of the differences between consensus and model did not differ significantly from 0 on the basis of a 1-sample t-test for either GA area (p=0.82) or square root GA area (p=0.22). Examples of manually and automatically segmented GA areas can be found in Figure 2. More examples of automatic segmentation results on the AREDS set can be found in Supplementary Figure 1.
GA growth rate
images in automatically segmented area was more than 50%, 584 of the 625 eyes in AREDS with at least 2 years of follow up remained. Square root annual growth of GA for those eyes was 0.21 ± 0.46 mm/year. This value was significantly higher for eyes with small (<5 mm²) baseline GA area (0.31 ± 0.39, N=308), compared to eyes with large (≥ 5mm²) baseline GA area (0.10 ± 0.50, N=276), p<0.001. Table 2 shows differences in growth rate between groups. We observed that multifocal and extrafoveal lesions grow faster than unifocal or foveal lesions. Subjects with bilateral GA showed faster progression than unilateral cases, although not significant in our analysis (p=0.12). Growth rates between fellow eyes were correlated (r=0.45, p<0.001). Figure 3 highlights progression of GA for selected individual eyes.
growth are summarized in Table 3. Eight out of eleven features were significantly correlated with GA growth rate (after Bonferroni correction). Features included in the multivariate model were area, circularity, filled solidity, convex area, number of lesions, eccentricity and roundness. The coefficient of determination of this model was 0.18.
AREDS set can be found in Figure 4. The red dashed line in these graphs represents a quadratic model that best fitted the data for GA area < 12 mm².
evaluated. We demonstrated how the automatically obtained segmentations of the model can be used to study growth rate of GA on an independent set. The performance of the deep learning model in terms of Dice coefficient on the BMES and RS set approached that of human experts. The model was able to identify GA even when image quality or contrast were relatively poor, as demonstrated in Figure 2. Nevertheless, some failure cases were still present, which was the main reason for the lower average Dice coefficient. We suspect that more training data may solve this issue, since each of the models only used 60% of the data (~245 images) for training, which may not be enough given the inherent difficulty of the problem and the variability in the data. For application to the AREDS set this problem was partly circumvented by using an ensemble model, which indirectly made use of all training data.
0.46 mm/year) was slightly lower than previously reported values. For example, Domalpally et al. observed 0.30 mm/year,29 and Keenan et al. observed 0.28 mm/year.41 A reason for this may be the dependence of growth rate on baseline area. When we split the dataset on baseline lesion size, we observed that small lesions have larger square root growth rates (see Table 2). This phenomenon was analyzed in more detail in Figure 4. A quadratic curve seemed to fit the observed GA progression very well up to an area of around 12 mm². For larger areas, the growth rate seemed to stabilize or even decrease. Similar observations were made by Keenan et al.41, whose reported values are included in Figure 4 for comparison.
in the regression analysis, where area, filled area and convex area were most strongly correlated with square root annual growth rate. However, when we included only lesions with baseline area < 12mm² in the regression analysis, no features related to lesion size were significantly associated with square root annual growth rate. On an individual level, we also observed a quadratic growth of the area of GA in many cases in the AREDS set, some of them highlighted in Figure 3, where we fitted a quadratic curve through the GA area over time. Again, the decrease in growth rate for larger lesions was visible (bottom two cases in Figure 3).
significantly associated with square root annual growth rate. Convex solidity is low for irregularly shaped lesions, but also for multifocal lesions. This feature hence captures multiple previously reported associations. Circularity was previously associated with GA growth rate,23 but compared to other features, the association was not very strong in our analysis. An explanation is that the model may have produced a segmentation with a very jagged border for some lesions with indistinct borders of the atrophic area. This could have led to a relatively large perimeter, and hence a lower value for circularity. Roundness will be a better representation of how well the lesion approaches a circular shape in those cases, as it represents the ratio of the area of the lesion to the area of an enclosing circle, and is hence less sensitive to irregular borders.42
have been inaccurate. This conversion was based on the average distance between fovea and optic disk in a subset of images. Although it is unlikely that this inaccuracy was a source for bias in reported associations with growth rate, reported values for area and growth rate may be slightly larger or smaller in reality.
OCT. This may give more accurate measurements of the atrophic area, and hence more reliable assessment of growth rate. In this study, only morphological features of the atrophic area were considered. A next step would be to include associations between growth rate and other lesion patterns, especially those visible on FAF or OCT. Finally, we are investigating the capabilities of deep learning models to directly predict areas where GA may develop. This will provide predictions of both the extent and the location of future GA area.
based on deep learning for GA in CFIs. The model was capable of reproducing known associations between current GA status and future growth. Moreover, we indicated novel structural biomarkers that are predictive for future growth rate, such as solidity, eccentricity or roundness of the lesion. We demonstrated how deep learning can help in the automation of grading, allowing for analysis of larger datasets and helping to understand progression of GA.
EyeNED Reading Center, specifically by Johanna Colijn, Caroline Klaver, Corina Brussee and Ada Hooghart.
1. Lim LS, Mitchell P, Seddon JM, et al. Age-related macular degeneration. Lancet. 2012;379:1728-1738.
2. Sunness JS, Rubin GS, Applegate CA, et al. Visual function abnormalities and prognosis in eyes with age-related geographic atrophy of the macula and good visual acuity. Ophthalmology. 1997;104:1677-1691.
3. Owen CG, Jarrar Z, Wormald R, et al. The estimated prevalence and incidence of late stage age related macular degeneration in the UK. Br J Ophthalmol. 2012;96:752-756.
4. Wong WL, Su X, Li X, et al. Global prevalence of age-related macular degeneration and disease burden projection for 2020 and 2040: a systematic review and meta-analysis. Lancet Glob Health. 2014;2:e106-116.
5. Colijn JM, Buitendijk GH, Prokofyeva E, et al. Prevalence of age-related macular degeneration in Europe: the past and the future. Ophthalmology. 2017;124:1753-1763.
6. Gehrs KM, Anderson DH, Johnson LV, et al. Age-related macular degeneration— emerging pathogenetic and therapeutic concepts. Ann Med. 2006;38:450-471.
7. Boyer DS, Schmidt-Erfurth U, van Lookeren Campagne M, et al. The pathophysiology of geographic atrophy secondary to age-related macular degeneration and the complement pathway as a therapeutic target. Retina. 2017;37:819.
8. Hanus J, Zhao F, Wang S. Current therapeutic developments in atrophic age-related macular degeneration. Br J Ophthalmol. 2016;100:122-127.
9. Holz FG, Strauss EC, Schmitz-Valckenberg S, et al. Geographic atrophy: clinical features and potential therapeutic approaches. Ophthalmology. 2014;121:1079-1091.
10. Sunness JS, Applegate CA, Bressler NM, et al. Designing clinical trials for age-related geographic atrophy of the macula: enrollment data from the geographic atrophy natural history study. Retina. 2007;27:204-210.
11. Lindblad AS, Lloyd PC, Clemons TE, et al. Change in area of geographic atrophy in the Age-Related Eye Disease Study: AREDS report number 26. Arch Ophthalmol. 2009;127:1168-1174.
12. Fleckenstein M, Mitchell P, Freund KB, et al. The progression of geographic atrophy secondary to age-related macular degeneration. Ophthalmology. 2018;125:369-390.
13. Schmitz-Valckenberg S, Sahel J, Danis R, et al. Natural history of geographic atrophy progression secondary to age-related macular degeneration (Geographic Atrophy Progression Study). Ophthalmology. 2016;123:361-368.
14. Danis RP, Lavine JA, Domalpally A. Geographic atrophy in patients with advanced dry age-related macular degeneration: current challenges and future prospects. Clin Ophthalmol. 2015;9:2159.
15. Sunness JS, Bressler NM, Tian Y, et al. Measuring geographic atrophy in advanced age-related macular degeneration. Invest Ophthalmol Vis Sci. 1999;40:1761-1769.
16. Yehoshua Z, Rosenfeld PJ, Gregori G, et al. Progression of geographic atrophy in age-related macular degeneration imaged with spectral domain optical coherence tomography. Ophthalmology. 2011;118:679-686.
17. Litjens G, Kooi T, Bejnordi BE, et al. A survey on deep learning in medical image analysis. Med Image Anal. 2017;42:60-88.
18. Burlina PM, Joshi N, Pekala M, et al. Automated grading of age-related macular degeneration from color fundus images using deep convolutional neural networks. JAMA Ophthalmol. 2017;135:1170-1176.
19. Peng Y, Dharssi S, Chen Q, et al. DeepSeeNet: a deep learning model for automated classification of patient-based age-related macular degeneration severity from color fundus photographs. Ophthalmology. 2019;126:565-575.
20. Gulshan V, Peng L, Coram M, et al. Development and validation of a deep learning algorithm for detection of diabetic retinopathy in retinal fundus photographs. JAMA. 2016;316:2402-2410.
21. Keenan T, Dharssi S, Peng Y, et al. A deep learning approach for automated detection of geographic atrophy from color fundus photographs. Ophthalmology. 2019;In press.
22. Klein R, Meuer SM, Knudtson MD, et al. The epidemiology of progression of pure geographic atrophy: the Beaver Dam Eye Study. Am J Ophthalmol. 2008;146:692-699.
23. Domalpally A, Danis RP, White J, et al. Circularity index as a risk factor for progression of geographic atrophy. Ophthalmology. 2013;120:2666-2671.
24. Sunness JS, Margalit E, Srikumaran D, et al. The long-term natural history of geographic atrophy from age-related macular degeneration: enlargement of atrophy and implications for interventional clinical trials. Ophthalmology. 2007;114:271-277.
25. Holz FG, Bindewald-Wittich A, Fleckenstein M, et al. Progression of geographic atrophy and impact of fundus autofluorescence patterns in age-related macular degeneration. Am J Ophthalmol. 2007;143:463-472.
26. Feuer WJ, Yehoshua Z, Gregori G, et al. Square root transformation of geographic atrophy area measurements to eliminate dependence of growth rates on baseline lesion measurements: a reanalysis of age-related eye disease study report no. 26. JAMA Ophthalmol. 2013;131:110-111.
27. Shen L, Liu F, Nardini HG, et al. Natural history of geographic atrophy in untreated eyes with nonexudative age-related macular degeneration: a systematic review and meta-analysis. Ophthalmol Retina. 2018;2:914-921.
28. Fleckenstein M, Schmitz-Valckenberg S, Martens C, et al. Fundus autofluorescence and spectral-domain optical coherence tomography characteristics in a rapidly progressing form of geographic atrophy. Invest Ophthalmol Vis Sci. 2011;52:3761-3766.
29. Domalpally A, Danis R, Agrón E, et al. Evaluation of geographic atrophy from color photographs and fundus autofluorescence images: Age-Related Eye Disease Study 2 Report Number 11. Ophthalmology. 2016;123:2401-2407.
30. Chiu SJ, Izatt JA, O'Connell RV, et al. Validated automatic segmentation of AMD pathology including drusen and geographic atrophy in SD-OCT images. Invest Ophthalmol Vis Sci. 2012;53:53-61.
31. Hu Z, Medioni GG, Hernandez M, et al. Segmentation of the geographic atrophy in spectral-domain optical coherence tomography and fundus autofluorescence images. Invest Ophthalmol Vis Sci. 2013;54:8375-8383.
32. Niu S, de Sisternes L, Chen Q, et al. Automated geographic atrophy segmentation for SD-OCT images using region-based CV model via local similarity factor. Biomed Opt Express. 2016;7:581-600.
33. Hu Z, Medioni GG, Hernandez M, et al. Automated segmentation of geographic atrophy in fundus autofluorescence images using supervised pixel classification. J Med Imaging. 2015;2:014501.
34. Feeny AK, Tadarati M, Freund DE, et al. Automated segmentation of geographic atrophy of the retinal epithelium via random forests in AREDS color fundus images. Comput Biol Med. 2015;65:124-136.
35. Mitchell P, Smith W, Attebo K, et al. Prevalence of age-related maculopathy in Australia. Ophthalmology. 1995;102:1450-1460.
36. Ikram MA, Brusselle GG, Murad SD, et al. The Rotterdam Study: 2018 update on objectives, design and main results. Eur J Epidemiol. 2017;32:807-850.
37. Grunwald JE, Pistilli M, Ying G, et al. Growth of geographic atrophy in the comparison of age-related macular degeneration treatments trials. Ophthalmology. 2015;122:809-816.
38. van Zeeland H, Meakin J, Liefers B, et al. "EyeNED workstation: development of a multi-modal vendor-independent application for annotation, spatial alignment and analysis of retinal images", in: Association for Research in Vision and Ophthalmology, 2019
39. De Fauw J, Ledsam JR, Romera-Paredes B, et al. Clinically applicable deep learning for diagnosis and referral in retinal disease. Nat Med. 2018;24:1342.
40. Graham B. Kaggle diabetic retinopathy detection competition report. University of Warwick. 2015
41. Keenan TD, Agron E, Domalpally A, et al. Progression of geographic atrophy in age-related macular degeneration: AREDS2 report number 16. Ophthalmology. 2018;125:1913-1928.
42. Zdilla MJ, Hatfield SA, McLean KA, et al. Circularity, solidity, axes of a best fit ellipse, aspect ratio, and roundness of the foramen ovale: a morphometric analysis with neurosurgical considerations. J Craniofac Surg. 2016;27:222.
Table 1: Dice coefficients between model and consensus grading, and between
individual graders.
Table 2: Square root annual growth of the GA area. Values represent mean ±
standard deviation. P-values are calculated using an unpaired t-test.
Table 3: Correlations between baseline lesion characteristics (features) and
square root annual growth rate (in mm/year). Features are sorted in decreasing order of strength of association. A P-value smaller than 0.0045 (0.05, Bonferroni corrected) is considered significant.
Figure 1: Bland-Altman plot of GA area (left) and square root GA area (right).
Differences are calculated as the area/square root area of the consensus grading minus the automatic segmentation.
Figure 2: Examples of automatic GA segmentation. The green area corresponds
to either the consensus (left) or the model output (right). The top three rows show accurate segmentation results, for various configurations of GA differing in area, shape and number of lesions, and variable image quality and contrast. The bottom row shows examples of inaccurate model output.
Figure 3: Progression of GA over time for 4 selected eyes. The graphs represent
area measurements over time (two points per timepoint for the LS and RS stereoscopic images). The blue line is a quadratic fit through the points. For the top 2 cases, an increment in growth rate can be observed. 53834 LE has a more irregular shape than 51551 RE and progresses faster. In the bottom two cases we observe that the growth decreases as the GA area gets larger.
Figure 4: GA growth over time. Left: GA growth rate (in mm2/year) as a function
of GA area. The blue line represents growth rates estimated from the segmentations of the deep learning model. The shaded area represents the 95% confidence interval (estimated using bootstrapping). The dashed red line represents the growth rate of a quadratic model, as visualized in the right graph. Right: the blue line represents the evolution of GA area over time, obtained by numerically integrating the estimated growth rates from the left graph using a GA area of 0.5 mm2 at t=0. The red dashed line represents the best quadratic fit to the plot for GA area < 12 mm2. Above this area the observed GA area diverges from the quadratic fit.
Deep learning model details
eight levels of resolution, connected by shortcut connections at every level. At every level of resolution, a residual block with two 3x3 convolutions was used. The number of filters per convolution was, for each of the respective levels 32, 32, 64, 64, 128, 128, 256, 256. The eight levels of down-sampling reduce the input from 512x512 to a feature map of 2x2 pixels. At the lowest level, two residual blocks with 1x1 convolutions and 2048 filters each were applied. The down-sampling operations were performed by strided convolutions.
loss was weighted to balance the classes. The model was trained on batches of 2 images, using the Adam optimizer with a learning rate of 32 × 10⁻⁵. The learning rate was divided by two every 50 epochs. Input images were augmented by horizontal and vertical flipping, scaling of up to 1.3, rotations of maximum 40 degrees, and translations of up to 150 pixels.
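The step-wise learning rate schedule can be expressed compactly. The function below is an illustrative sketch; it assumes zero-based epoch numbering, which the text does not specify.

```python
def learning_rate(epoch, initial=32e-5, halve_every=50):
    """Learning rate at a given (zero-based) epoch, halved every 50 epochs."""
    if epoch < 0:
        raise ValueError("epoch must be non-negative")
    return initial / (2 ** (epoch // halve_every))


# Epochs 0-49 use the initial rate, epochs 50-99 half of it, and so on.
schedule = [learning_rate(e) for e in (0, 49, 50, 100)]
```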
selected based on best performance on the respective validation set. Performance was assessed as best average Dice-coefficient after selecting the optimal threshold. The ensemble model was constructed by combining the output of the models after correcting for differences in optimal threshold between models:
where p represents the final prediction for a specific pixel, p_i represents the prediction for that pixel for model i, and t_i represents the optimal threshold for model i.
tensorflow backend.2
1. Chollet F, et al. Keras. https://keras.io, 2015.
2. Abadi M, et al. TensorFlow: Large-scale machine learning on heterogeneous
Description of lesion characteristics
have holes, the perimeter along inner borders is also included. Four-connectivity for border pixel determination is used.
joined rather than calculating the convex hull for each focus separately.
0.175 mm. Lesions are considered separate if their pixels do not touch horizontally, vertically or diagonally.
the GA region. The eccentricity is the ratio of the focal distance (distance between focal points) over the major axis length.
calculated as 4 × area / (π d²), where d is the length of the major axis of the ellipse that has the same second central moments as the region.
circular area with diameter of 0.3 mm in the center of the image.
All lesion characteristics were calculated using Python 3.7, with the numpy library, version 1.15.3 (https://numpy.org/) and the scikit-image library, version 0.14.1 (https://scikit-image.org/).
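Some of the shape descriptors above reduce to simple formulas once the region measurements are known. The sketch below uses plain Python and hypothetical function names; it takes area and axis lengths as inputs rather than reproducing the scikit-image calls used in the study.

```python
import math


def roundness(area, major_axis_length):
    """4 * area / (pi * d^2), with d the major axis length of the fitted ellipse."""
    return 4.0 * area / (math.pi * major_axis_length ** 2)


def eccentricity(major_axis_length, minor_axis_length):
    """Focal distance divided by major axis length of the fitted ellipse."""
    a, b = major_axis_length, minor_axis_length
    if a <= 0 or b < 0 or b > a:
        raise ValueError("expected 0 <= minor axis <= major axis, major axis > 0")
    # Distance between the two focal points of an ellipse with axis lengths a and b.
    focal_distance = math.sqrt(a * a - b * b)
    return focal_distance / a


def solidity(area, reference_area):
    """Ratio of lesion area to a reference area (convex area or filled area)."""
    return area / reference_area


# A perfect circle of diameter 2 has roundness 1 and eccentricity 0.
circle_roundness = roundness(math.pi, 2.0)
circle_eccentricity = eccentricity(2.0, 2.0)
```

Neither roundness nor eccentricity depends on the perimeter, which is why they are less affected by jagged segmentation borders than circularity.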
Figure S1: Sample showing the output of the model for the first 20 patients in the
AREDS set (showing the first image with label GA for every patient).