Statistical Exploration of Relationships Between Routine and Agnostic Features Towards Interpretable Risk Characterization

2020·Arxiv

Abstract

Abstract

As is typical in other fields of application of high throughput systems, radiology is faced with the challenge of interpreting increasingly sophisticated predictive models such as those derived from radiomics analyses. Interpretation may be guided by the learning output from machine learning models, which may however vary greatly with each technique. Whatever this output model, it will raise some essential questions. How do we interpret the prognostic model for clinical implementation? How can we identify potential information structures within sets of radiomic features, in order to create clinically interpretable models? And how can we recombine or exploit potential relationships between features towards improved interpretability? A number of statistical techniques are explored to assess (possibly nonlinear) relationships between radiological features from different angles.

1 Introduction

Building and interpretation of radiomics-based predictive models is discussed in many reports [1, 2, 3, 4, 5, 6, 7], which all highlight the difficulty of converting the modelbased risk assessment into practical decision-making pathways for routine implementation–a necessary condition to the clinical implementation of machine learning and arti-ficial intelligence solutions in radiology. Several types of methodologies are considered in the literature to better understand potential for feature recombination towards this goal.

Conventional models such as the lasso, random forests or neural networks [8, 9, 10, 11] are usually used to build predictive models. Combined with preliminary feature elimination, these techniques provide feature set reduction methodologies geared towards a particular endpoint of interest, whether for prognosis (e.g. overall or two-year patient survival) or tumor characterization (e.g. tumor grading, subtyping, etc.) [5, 12, 13, 14, 15, 16, 17, 18]. Interpretation of the model is directly derived from its structure (which describes the interaction between its covariates), and is determined in the context of the endpoint of interest. For example, a linear model for 2-year survival may indicate that an increase of 1 standard unit in both SUVmax and GLCM entropy may contribute to a 3-fold increase in the risk of death within the next two years. In this illustration, the linear interaction between SUVmax and GLCM entropy is directly associated with worse prognosis at the 2-year horizon, but this finding provides only limited insight in terms of the role of GLCM entropy in a prognostic context or in terms of tumor characteristics.

Microarray data analysis encompasses another family of techniques for the discovery of features with high predictive potential. In this framework, multiple statistical testing of association with endpoint is performed to select features of interest [1, 5, 19]. Interpretation of the selected group of predictors is thus also directly linked to the endpoint of interest and may include assessment of joint association between a number of features in this context.

Nonparametric (i.e. model-free) multidimensional methods such as Principal Components Analysis (PCA) and clustering are also employed to identify relevant sets of prognostic features. These techniques are common to other high-throughput fields including genomics and proteomics [1, 19, 20], and consist in detecting relationships between potential predictors before their grouped association with a particular endpoint is established (this is done as a second step). They therefore provide ways to identify associations between features that are not endpoint-dependent. Their scope is however limited by their construction; for example PCA may be used to identify linear associations but not nonlinear ones [21].

Ultimately, an output set of predictive features is considered for use in (future) clinical settings [22]. For further interpretation, association of a small number of radiological features with phenotype or other clinical assessment, through e.g. logistic regression [2], can be performed. Texture and other radiomic features are sometimes clustered on the basis of correlation heat maps [5]. The clinical relevance of cluster consensus maps can be assessed [19], and used to measure predictive ability of radiomic features for specific clinical, biological and functional pathways [23]. Patients may also be clustered on the basis of texture features heat maps, and availability of gene-analysis data allows for association of radiomic signature features and gene expression using gene-set enrichment analysis, by scoring radiomic signatures [1]. In many studies, composite radiomic variables are defined for each patient via a linear combination of selected features and used as additional variables alongside routine clinical or other variables [22]; these can also be interpreted based e.g. on their mathematical construction.

A large number of statistical techniques are thus at hand to build predictive models and find relevant associations within feature sets. Biological interpretation of the output multivariate associations of radiologic features however remains challenging, due to the complex and diverse nature of most of these variables, and of cancer itself. Opacity of machine learning frameworks, often used as black boxes, also adds to this difficulty. They can however be used to gather insight and simplify radiomic summaries in view to facilitate further interpretation. Finding direct, statistically strong associations among features, as illustrated hereafter, can provide a mechanism to simplify such models and facilitate explainability.

2 Methods

2.1 The dataset

We consider radiological summaries derived for a set of FDG-PET sarcoma studies in a previous analysis, as reported in [17]. This dataset of primary sarcoma tumors was acquired at the University of Washington in Seattle, United States, between August 1993 and January 2003, after patients were diagnosed by biopsy. The final cohort comprised of 197 studies, including 88 deaths before loss to follow up. The tumors consisted of 130 soft tissue, 51 bone, and 16 cartilage sarcomas, in patients aged between 17 and 86 years of age (median 45), of which 86 females and 111 males, with 99 high-grade, 66 intermediate, and 32 low-grade tumors. In this report we present the results of analyses carried out on the cohort of 130 soft tissue sarcomas (STS) from this dataset; all other subtypes have been excluded from analysis.

Quantitations were obtained for a fixed-threshold segmentation, with a threshold value set for each study based on the subsample of the lower 15% of uptake values (so as to include background and healthy tissue activity only). For a given study, the segmentation threshold was thus defined as the mean subsample value plus three standard deviations of this subsample. Given the near-homogeneous voxel dimensions of the output images (voxel size of 4.30 mm 4.30 mm in the transverse plane and slice thicknesses of 4.25 mm), no interpolation was performed prior to VoI resegmentation for texture analysis. Uptake values were requantized into 32 grey levels by fixed bin number transformation.

A total of 43 variables were considered and may be iden-tified in three frames: (i) routine clinical variables (tumor grade, clinical volume, patient age, patient sex, maximum standardized uptake value (SUVmax), mean uptake value (SUVmean) and total lesion glycolysis (TLG) were collected for this cohort); (ii) structural features including heterogeneity and as defined in [17], and associated spatial uptake gradients; and (iii) a set of image summaries including morphologic and texture features, all computed as per definitions provided by the Image Biomarker Standardization Initiative (Version 1.5) [24, 25]. More specifically, this third set of features included volume asphericity, morphological descriptors for ellipsoidal characteristics, intensity- and histogram-based first-order statistics, GLCM features, as well as two GLSZM features evaluating the numbers and sizes of contiguous homogeneous regions of equal (discretized) grey level.

2.2 Overall scope

The objective here is not to propose predictive models of patient risk or tumour characterization, but rather to look for and identify patterns among features typically used in radiomic analyses. To this end a number of multivariate data exploration and modelling techniques are used as follows:

1. Correlation and partial correlation analysis, to inspect correlation structures present in the image analysis data;

2. Multivariate decomposition and clustering, to identify natural groupings of features;

3. Regularized multilinear modeling of features, to identify small (2 or 3) subsets of features that can “explain” (i.e. predict) a given feature of interest.

We can use any of the above exploratory analyses to recombine features into composite predictive variables (based e.g. on partitional clustering techniques as in other works cited earlier), which allows for increased statistical power and data-based evaluation of model interpretability. An illustration of this step is also provided in the next section, demonstrating statistical prognostic potential of composite variables derived from these analyses on the sarcoma cohort and discussing their interpretation.

2.3 Correlation and partial correlation analyses

Correlation may be induced by one of several causes. It may result from the mathematical construction of the features; for example it would be reasonable to think that

and

are closely related since uptake histograms tend to be rightskewed, with decreasing as i increases. The same principle applies to second-order quantitations; for example the GLCM matrix yielding joint probabilities for voxel grey levels i, j = 1, . . . , Q typically exhibits a bell-shaped structure with monotonic variations in across the probability surface. Figure 1 illustrates such structural variations in the first- and second-order distributions. Another example, considering the rough approximation 1 for small values of , exposes the numerical proximity between entropyand energyas follows:

entropy

energy

Figure 2 illustrates the relationships observed in the sarcoma dataset for these two examples.

In other instances, correlation may be caused e.g. by tomography-related aspects (for example relating to noise or dose levels, or the reconstruction filtering process), in which case correlation characteristics would change with scanner, or by underlying biological characteristics. Correlation analysis would not provide information on cause, but it constitutes a valuable tool in identifying associations and, therefore, potential pathways for feature set simplifi-cation.

The correlation matrix (e.g. using Pearson correlation, as done here) is often considered for exploratory purposes, but also for preliminary feature elimination. In the latter case, groups of highly correlated features are identified and a single feature is kept as unique representative for each group. How the other variables are eliminated may be guided by clinical or practical considerations but may also be, and often is, arbitrary.

The overlap in information with other variables within the dataset, due e.g. to confounding or mathematical construction, contributes to the value of the correlation coeffi-cient calculated between two variables. Gaussian graphical models (GGMs) [21, 26, 27] provide a way of analysing partial correlation between variables instead. GGMs can be used to highlight direct relationships (the edges in the graph, as illustrated in Figure 3) between features that are conditionally dependent given all other variables. Dependence is evaluated here as a non-negligible partial correlation between the two features.

2.4 Multivariate exploration

Here we considered Principal Component Analysis (PCA) for multidimensional exploration of the feature set. It also consists in analyzing the correlation structure of the feature set, but provides a more elaborate tool for assessment of multivariate associations. Note that PCA applies to quantitative features (we therefore excluded categorical features from the analysis), but variations on this approach may be used for mixed quantitative and descriptive feature sets, allowing for inclusion of categorical agnostic variables.

Partition-based clustering was applied as in [17] to the PCA projection matrix (i.e. the matrix of eigenvectors obtained from spectral decomposition of the feature set correlation matrix) in order to identify groupings of variables

Figure 1: Illustration of grey level distributions from a requantized FDG-PET sarcoma image. Inset, top: midvolume transverse slice of the requantized FDG-PET data. Top: the histogram exhibits a clear relationship between voxel grey level i and likelihood. Bottom: level curve representation of the GLCM matrix depicts the ellipsoidal footprint that is typical of joint uptake distribution structures.

in the feature set. The features projected via PCA can be used directly as composite predictor variables, as detailed for example in [17]. These linear recombinations can however be difficult to interpret. Subsequent clustering analysis of feature groupings in the information space can be exploited to recreate alternative composite variables with meaningful clinical interpretation.

2.5 Multilinear analysis of features

The above methodologies are mechanisms used to isolate groupings of variables based on the correlation (or partial correlation) structure of the feature set. Direct association of features can also be identified and exploited via multivariate modelling, using one feature as the dependent variable explained by subset of other features. Here we

Figure 2: Examples of (nonlinear) associations between texture features in the sarcoma feature set, which result from their mathematical construction.

considered multilinear modelling and used lasso models to identify a group of 2 or 3 features in order to describe (i.e. predict) each feature in the dataset.

original STS dataset for use as an independent test set. Repeated 5-fold cross-validation (CV), using two repetitions, was applied to the remaining which was performed using each one of the continuous variables in the STS feature successively as the dependent variable, and all other variables as predictors in a lasso model. (No preliminary feature elimination was performed.) In total P=41 lasso models of the form (for some i.i.d. zeromean Gaussian noise )

were therefore fitted to each feature of interest via CV using the remaining P-1 features as predictors, from the CV sample of observations, i.e. . In other words each of the 41 continuous features available (that is, all features except for tumor grade and patient sex) was thus modelled by fitting a lasso model to the remainder of the feature set.

Feature selections and model fits from the CV training sets were then analysed. The feature selection scheme provided by lasso was used to eliminate weaker contributions. The remaining predictors were inspected and those with an estimated effect ˆof a magnitude of at least 20% of the overall sum of estimated effects ˆfrom (1) were re- tained as final predictors of that dependent variable. The choice of a 20% cutoff was arbitrary but aimed to reduce the model to a few (possibly strong) predictors. In any case (i.e. even when all estimated coefficients were under this cutoff point of 20% of the cumulative effect in magnitude) the two covariates with highest estimated effect ˆwere retained as a final model.

A final lasso model was fitted to the whole CV sample (using all ically two or three) most popular features as determined by CV. This final prediction model was then applied independently to the test set (using the observations) for final prediction performance assessment for each of the features available.

3 Results

3.1 Correlation and partial correlation analyses

Figure 3 illustrates the output of a GGM, which provides guidance for the understanding of direct relationships between some of the features based on their partial correlations. It highlights several characteristics of this feature space. For instance, it indicates no direct correlation between age and any of the features and exhibits the expected relationships between SUVmax and SUVmean, and between SUVmean, TLG and volume. It also exhibits a cluster of uptake gradients (lower left) which was also expected due to their construction–these quantities are successive quantiles of the sample of normalized uptake gradients [17]. Moreover this graph provides information on direct associations between some of the conventional radiomic features. Skewness, kurtosis and other summaries of the histogram of requantized intensities, unsurprisingly, tend to cluster together (top right), with significant partial correlation found among this group of features. The graph also highlights direct correlation among a group of GLCM features comprising of entropy, dissimilarity, contrast, homogeneity and a few other metrics.

These direct associations may be at least partially explained by their mathematical construction; however they may also be partially driven by other factors that may be pertaining to biological or physiological aspects of the disease. Many of these relationships can be directly exploited in further multivariate modelling of the features to assess prediction potential for this set of features, as considered

further below.

Figure 3: Gaussian graphical model obtained for the sar- coma feature set, showing direct associations between features as defined by partial correlation.

3.2 Principle Component Analysis

The principle components (PC’s) derived from PCA of the STS feature set consist of linear recombinations of the input features. The first 12 PC’s (on the arbitrary basis that the first 12 PCs captured over 95% of the variance in the feature set) were used as composite risk predictors in a multivariate Cox proportional hazard model. This analysis demonstrated the prognostic potential of this multilinear recombination of the original STS feature set, with 3 of the 12 PC’s found statistically significant prognostic variables at the 5% significance level (Figure 4). Figure 5 further illustrates this potential in terms of Kaplan-Meier risk stratification for overall survival, comparing this model to a baseline clinical model comprising of tumor grade, patient age and SUVmax. Note that here the low- and high-risk survival curves are separated so as to optimize the log-rank test statistic.

This PCA output was subsequently analysed by k-means clustering using 12 clusters (thus arbitrarily matching our use of the first 12 PCs for risk prediction). This determined the clusters of features in the PCA projection space illustrated by Figure 6. Makeup of the clusters is detailed in Figure 7. The right-most column in this table provides tentative fields of interpretability for each cluster. This is included here solely as an illustration of the potential for explainable PCA-derived models that is facilitated by clustering analysis of the PCA output. Thorough, rigorous investigations would be required in order to establish suitable clinical interpretation guidelines from such feature clusters.

Figure 4: P-values for the multivariate Cox proportional hazard model for overall survival copmrising of the frst 12 principal components, with associated concordance index C=0.75.

Figure 5: Kaplan-Meier analysis showing risk stratifica- tions obtained from baseline clinical assessment based on grade, age and SUVmax (red, dashed lines) and by using the first 12 principal components together for multivarate risk prediction (blue, solid lines).

Figure 6: Biplot of the features in the PCA projection space, along the first two eigenvectors (and ). Colourcode and grey dots respectively indicate clusters and cluster centroids obtained from k-means clustering of the projected features (i.e. analysing the eigenvector coordinates).

Figure 7: Clusters of features obtained from combined PCA and k-means analysis of the STS feature set. A tentative indication for interpretability is provided here only for the purpose of illustrating the potential for explainable PCAderived models facilitated by the clustering analysis.

3.3 Multilinear modelling

Lasso modelling of the features determined that in many cases, a radiomic feature could be predicted with high accuracy. Figures 8 and 9 show eight examples of such predictions on the 30 test datapoints. GLCM autocorrelation, for example, was predicted extremely well using only meanand variance, which suggests this second-order feature may be replaced with, or interpreted by more

easily explainable first-order features.

In some cases clinical variables were found useful in predicting agnostic features; for example Figure 8 depicts reasonable prediction of GLCM max.probability using GLCM uniformity and SUVmax.

We also note that age and were selected to predict GLCM correlation however with poor performance (Figure 9). This aligns with previous findings (from the above correlation analysis and also from [17]) of reasonable separation between model-derived features (and related uptake gradients) and conventional radiomic features. Overall, however, many of the 41 features considered were predicted with high accuracy from a small number of other features.

For many of these features, it was observed that the predictors selected via cross-validated lasso modelling were connected to the dependent feature in the GGM representation; in other words lasso often selected predictors with strong partial correlation with the dependent variable (which could be expected from this multilinear modelling technique).

Figure 8: Prediction performance based on lasso mod- elling for homogeneity, contrast, autocorrelation and maximum probability (all GLCM features) successively (clockwise from top-left), with associated MSE between predicted and observed values, for the STS test set (two or three features used as predictors in each case are indicated in inset.

4 Conclusion

Identifying and understanding associations between radiomic features and routine variables, such as grade or

Figure 9: Prediction performance as in Figure 8, but here for entropy, uniformity correlation and dissimilarity GLCM features respectively.

other clinically interpretable forms of assessment, is a key step towards integration of models obtained from machine learning analyses. Some works do establish links between specific texture features and tumor biologic heterogeneity for example; but the generalization of this understanding to all such agnostic features will enable integration of full AI systems within the field of radiology. Here we promote discussion on some possible statistical analysis pathways towards this goal.

Most reports of radiomics studies focus on the clustering of agnostic features and assess their potential alignment with routine variables a posteriori, to analyze the clustering output. Potential interactions between agnostic and routine variable can also be explored more directly by allowing any such feature to describe any other, as is done in this work via lasso modelling.

This pilot analysis highlighted numerous substantial feature associations. Many of 41 clinical and radiomic features considered here were predicted with high accuracy using a small number of other features, via lasso modelling, focusing on linear interactions. The majority of these associations were noticed to coincide with groupings of features obtained on the basis of a high partial correlation, as exposed in a Gaussian graphical model representation.

This didactic presentation aimed at demonstrating methodologies of interest for exploration of strong direct associations within a radiological feature set. More extensive analyses of such associations are currently underway and will be presented in follow-on reports for sarcoma, NSCLC and other cancer types. This work will explore opportunities for simplification of prognostic models on the basis of relationships found within the combined clinical and agnostic feature set.

References

[1] H. Aerts, E. Velazquez, R. Leijenaar, C. Parmar, P. Grossmann, S. Carvalho, J. Bussink, R. Monshouwer, B. Haibe-Kains, D. Rietveld, F. Hoebers, M. Rietbergen, C. Leemans, A. Dekker, J. Quackenbush, R. Gillies, and O. Lambin, “Decoding tumour phenotype by noninvasive imaging using a quantitative radiomics approach,” Nat Commun, p. 4006, 2014.

[2] M. Soussan, F. Orlhac, M. Boubaya, L. Zelek, M. Ziol, V. Eder, and I. Buvat, “Relationship between tumor heterogeneity measured on FDG-PET/CT and pathological prognostic factors in invasive breast cancer,” PLoS ONE, vol. 9, no. 4, pp. 881–901, 2014.

[3] I. Buvat, F. Orlhac, and M. Soussan, “Tumor texture analysis in PET: Where do we stand?” J Nucl Med, vol. 56, no. 11, pp. 1642–1644, 2015.

[4] M.-C. Desseroit, D. Visvikis, F. Tixier, M. Majdoub, R. Perdrisot, R. Guillevin, C. C. L. Rest, and M. Hatt, “Development of a nomogram combining clinical staging with (18)f-fdg pet/ct image features in non-small-cell lung cancer stage i-iii,” Eur J Nucl Med Mol Imaging, vol. 43, no. 8, pp. 1477–1485, 2016.

[5] R. J. Gillies, P. E. Kinahan, and H. H. Hricak, “Ra- diomics: Images are more than pictures, they are data,” Radiology, vol. 278, no. 2, pp. 563–577, 2016.

[6] M. Hatt, F. Tixier, L. Pierce, P. E. Kinahan, C. C. L. Rest, and D. Visvikis, “Characterization of PET/CT images using texture analysis: the past, the present... any future?” Eur J Nucl Med Mol Imaging, vol. 44, pp. 151–165, 2017.

[7] M. Hatt, F. Tixier, D. Visvikis, and C. C. L. Rest, “Radiomics in PET/CT: More than meets the eye?” J Nucl Med, vol. 58, no. 3, pp. 365–366, 2017.

[8] R. Tibshirani, “The LASSO method for variable selec- tion in the Cox model,” Statistics in Medicine, vol. 16, pp. 385–395, 1997.

[9] M. G. I. Dimopoulos and S. Lek, “Review and compar- ison of methods to study the contribution of variables in artificial neural network models,” Ecological Modelling, vol. 160, pp. 249–264, 2003.

[10] T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning. Springer, 2009.

[11] M. Hatt, C. Parmar, J. Qi, and I. E. Naqa, “Ma- chine (deep) learning methods for image processing and radiomics,” IEEE Trans Rad Plasma Med Sciences, vol. 3, no. 3, pp. 104–108, 2019.

[12] M. Asselin, J. O’Connor, R. Boellaard, N. Thacker, and A. Jackson, “Quantifying heterogeneity in human tumours using MRI and PET,” Eur J Cancer, vol. 48, pp. 447–455, 2012.

[13] M. Rahim, S. E. Kim, H. So, H. Kim, G. Cheon, E. Lee, K. Kang, and D. Lee, “Recent trends in PET image interpretations using volumetric and texturebased quantification methods in nuclear oncology,” Nucl Med Mol Imaging, vol. 48, pp. 1–15, 2014.

[14] L. Alic, W. J. Niessen, and J. F. Veeland, “Quantifica- tion of heterogeneity as a biomarker in tumor imaging: A systematic review,” PLoS ONE, vol. 9, no. 10, p. 15 pp., 2014.

[15] P.-P. Ypsilantis, M. Siddique, H.-M. Sohn, A. Davies, G. Cook, V. Goh, and G. Montana, “Predicting response to neoadjuvant chemotherapy with PET imaging using convolutional neural networks,” PLoS ONE, vol. 10, no. 9, p. 18pp, 2015. [Online]. Available: e0137036.https://doi.org/10.1371/journal. pone.0137036

[16] C. Lian, S. Ruan, T. Denoeux, F. Jardin, and P. Vera, “Selecting radiomic features from FDG-PET images for cancer treatment outcome prediction,” Medical Image Analysis, vol. 32, pp. 257–268, 2016.

[17] E. Wolsztynski, F. O’Sullivan, E. Keyes, J. O’Sullivan, and J. Eary, “Positron emission tomography-based assessment of metabolic gradient and other prognostic features in sarcoma,” J. Med. Imag., vol. 5, no. 2, p. 024502 (16 pages), 2018.

[18] E. Wolsztynski, J. OSullivan, N. M. Hughes, T. Mou, P. Murphy, F. OSullivan, and K. ORegan, “Combining structural and textural assessments of volumetric fdg-pet uptake in nsclc,” IEEE Transactions on Radiation and Plasma Medical Sciences, vol. 3, no. 4, pp. 421– 433, 2019.

[19] C. Parmar, R. Leijenaar, P. Grossmann, E. R. Ve- lazquez, J. Bussink, D. Rietveld, M. Rietbergen, B. Haibe-Kains, P. Lambin, and H. J. Aerts, “Radiomic feature clusters and prognostic signatures spe-cific for lung and head & neck cancer,” Sci. Rep., vol. 5, p. 11044, 2015.

[20] E. Math and S. Davies, Statistical Genomics, Methods and Protocols. Springer New-York, 2016.

[21] B. Falissard, “Focused principal component analysis: Looking at a correlation matrix with a particular interest in a given variable,” Journal of Computational and Graphical Statistics, vol. 8, no. 4, pp. 906–912, 1999.

[22] W. Zhao, Y. Xu, Z. Yang, Y. Sun, C. Li, L. Jin, P. Gao, W. He, P. Wang, H. Shi, Y. Hua, and M. Li, “Development and validation of a radiomics nomogram for

identifying invasiveness of pulmonary adenocarcinomas appearing as subcentimeter ground-glass opacity nodules,” Eur J Radiol, vol. 112, pp. 161–8, 2019.

[23] P. Grossmann, O. Stringfield, N. El-Hachem, M. M. Bui, E. R. Velazquez, C. Parmar, R. T. Leijenaar, B. Haibe-Kains, P. Lambin, R. J. Gillies et al., “Defin-ing the biological basis of radiomic phenotypes in lung cancer,” Elife, vol. 6, p. e23421, 2017.

[24] A. Zwanenburg, S. Leger, M. Vallires, and S. Lck, “Im- age biomarker standardisation initiative,” 2016.

[25] A. Zwanenburg, S. Leger, M. Valli`eres, and S. L¨ock, “Image biomarker standardisation initiative - feature definitions,” CoRR, vol. abs/1612.07003, 2016. [Online]. Available: http://arxiv.org/abs/1612.07003

[26] M. Yuan and Y. Lin, “Model selection and estimation in the Gaussian graphical model,” Biometrika, vol. 94, no. 1, pp. 19–35, 03 2007. [Online]. Available: https://doi.org/10.1093/biomet/asm018

[27] Giraud, Christophe, Huet, Sylvie, Verzelem, and Nicolas, “Graph selection with ggmselect,” arXiv:0907.0619, 2009. [Online]. Available: http://fr.arxiv.org/abs/0907.0619