DFDL: Discriminative Feature-oriented Dictionary Learning for Histopathological Image Classification

2015·Arxiv

ABSTRACT

In histopathological image analysis, feature extraction for classification is a challenging task due to the diversity of histology features suitable for each problem as well as the presence of rich geometrical structure. In this paper, we propose an automatic feature discovery framework for extracting discriminative class-specific features and present a low-complexity method for classification and disease grading in histopathology. Essentially, our Discriminative Feature-oriented Dictionary Learning (DFDL) method learns class-specific features that represent samples from the same class well while representing samples from other classes poorly. Experiments on three challenging real-world image databases: 1) histopathological images of intraductal breast lesions, 2) mammalian lung images provided by the Animal Diagnostics Lab (ADL) at Pennsylvania State University, and 3) brain tumor images from The Cancer Genome Atlas (TCGA) database, show the significance of the DFDL model in a variety of problems over state-of-the-art methods.

Index Terms— Histopathological image classification, Sparse coding, Dictionary learning, Feature extraction

1. INTRODUCTION

Automated histopathological image analysis has recently become a significant research problem in medical imaging, and there is an increasing need for quantitative image analysis methods that complement the effort of pathologists in the diagnosis process. Consequently, an emerging class of problems in medical imaging focuses on the development of computerized frameworks to classify histopathological images [1–5]. These advanced image analysis methods have been developed with the purpose of relieving the workload on pathologists by sieving out obviously diseased and obviously healthy cases, which allows specialists to spend more time on more sophisticated cases.

In the diagnosis process, pathologists often look for problem-specific visual cues in histopathological images in order to assign a tissue image to one of the possible categories. Consequently, different customized feature extraction techniques for a variety of problems have been developed based on these visual cues [6–10]. However, a challenging question in medical image analysis is how to extract these features. The challenge stems from the richness of geometric structures in tissue imagery and the meaningful pathological information at diverse scales. Although several methods have been proposed for this crucial task, they are mostly designed exclusively for particular data sets and are highly dependent on preprocessing steps (e.g., color normalization and nuclear segmentation), limiting their performance on general histopathology problems. In order to reduce the preprocessing workload and to develop a more general solution, we propose a dictionary learning method relying on a sparse representation-based framework that can automatically discover relevant features from raw medical images and can be applied to several histopathological data sets.

Sparse representation-based methods are powerful tools for image classification [11–13]. The underlying idea is that, given a class of images and a sufficient collection of bases, a test image can be expressed approximately as a sparse linear combination of those bases. Representing signals using a set of learned bases instead of predefined bases, e.g., DCT and wavelet bases, has led to state-of-the-art results in various applications such as denoising, inpainting and classification [14–16]. To obtain a comprehensive set of bases, sparsity and task-driven constraints are combined in several ways into optimization problems; the resulting approaches are called dictionary learning methods. For classification problems, the class-specific design of such dictionaries enables class assignment via a simple reconstruction error-based metric [17, 18]. In particular, GDDL [19] and LC-KSVD [20] enforce the label consistency needed between dictionary bases and training data for classification. Meanwhile, FDDL [21] encourages coding coefficients to have small intra-class scatter but large inter-class scatter.

Sparsity-based classification schemes have also recently been proposed for medical applications [22, 23]. Specifically, Srinivas et al. [2, 3] represented a multi-channel histopathological image as a sparse linear combination of training examples under channel-wise constraints and proposed a residual-based classification technique. In addition, Nayak et al. [4] combined a dictionary learning framework with a Restricted Boltzmann Machine to learn sparse features for classification.

Being mindful of the challenges of feature extraction from histopathological images, we aim to build discriminative bases for each class by imposing sparsity constraints that minimize intra-class differences while simultaneously emphasizing inter-class differences. Small intra-class differences encourage the comprehensiveness of the set of learned bases, which is then able to represent in-class samples with only a few bases (intra-class sparsity). Simultaneously, large inter-class differences prevent the bases of a class from sparsely representing samples from other classes (complementary samples). This crucial property of the learned bases promotes the discrimination ability of the sparse code (coefficient vector) for classification. Concretely, given a dictionary D from a particular class containing k bases and a certain number L, we define an L-subspace of D as the span of a subset of L bases from D. Our proposed Discriminative Feature-oriented Dictionary Learning (DFDL) aims to build dictionaries with this key property: any sample from a class is reasonably close to an L-subspace of the associated dictionary, while a complementary sample is far from any L-subspace of that dictionary. An illustration of the proposed idea is shown in Fig. 1.
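To make the L-subspace property concrete, the following minimal sketch measures how far a sample lies from the nearest L-subspace of a dictionary by enumerating every subset of L bases. The function name `distance_to_nearest_L_subspace` and the toy data are hypothetical, and exhaustive enumeration is only practical for very small dictionaries; it is meant as an illustration of the definition, not as part of the proposed method.

```python
from itertools import combinations

import numpy as np


def distance_to_nearest_L_subspace(y, D, L):
    """Smallest Euclidean distance from y to the span of any L columns of D."""
    best = np.inf
    for idx in combinations(range(D.shape[1]), L):
        A = D[:, list(idx)]                            # one candidate L-subspace
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares projection
        best = min(best, float(np.linalg.norm(y - A @ coef)))
    return best


# Toy dictionary with k = 3 bases (the coordinate axes of R^3), L = 2.
D = np.eye(3)
in_class = np.array([1.0, 2.0, 0.0])       # lies in the span of bases 0 and 1
complementary = np.array([1.0, 1.0, 1.0])  # needs all three bases

d_in = distance_to_nearest_L_subspace(in_class, D, L=2)
d_comp = distance_to_nearest_L_subspace(complementary, D, L=2)
print(d_in, d_comp)
```

In this toy case the in-class sample sits (numerically) on a 2-subspace while the complementary sample stays at distance 1 from every 2-subspace, which is the separation a DFDL dictionary is meant to produce.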

Contributions: The main contributions of this paper are as follows: (1) A dictionary learning method for automatic feature discovery in histopathological images, which mitigates the generally difficult problem of feature extraction in medical images. (2) A discriminative dictionary learning framework that emphasizes inter-class differences while keeping intra-class differences small, resulting in enhanced classification performance. (3) An application of the proposed method to three diverse histopathological data sets, showing its capability in handling a variety of diagnosis and grading problems. Extensive experimental results show that our method provides outstanding performance even with a small number of training images.

2. DISCRIMINATIVE DICTIONARY LEARNING

2.1. Notation

Suppose that we have $c$ classes. The vectorization of a small block (or patch) of an image, which will be referred to as a sample, is denoted as a column vector $y \in \mathbb{R}^{d}$. For $i = 1, 2, \dots, c$, let $Y_i \in \mathbb{R}^{d \times N_i}$ and $\bar{Y}_i \in \mathbb{R}^{d \times \bar{N}_i}$ be matrices containing all data samples from class $i$ and its complementary samples, respectively. We denote by $D_i \in \mathbb{R}^{d \times k_i}$ the dictionary of class $i$.

For a code $s$, we denote by $\|s\|_0$ the number of its non-zeros. The sparsity constraint of $s$ can be formulated as $\|s\|_0 \le L$ with sparsity level $L$. For a matrix $S$, $\|S\|_0 \le L$ means that each column of $S$ has no more than $L$ non-zeros.
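As a small illustration of this notation, the sketch below counts the non-zeros of a code and checks the column-wise sparsity constraint on a code matrix. The helper names `l0_norm` and `satisfies_sparsity` are hypothetical and chosen only for this example.

```python
import numpy as np


def l0_norm(s):
    """Number of non-zero entries of a code vector s."""
    return int(np.count_nonzero(s))


def satisfies_sparsity(S, L):
    """True if every column of the code matrix S has at most L non-zeros."""
    return bool(np.all(np.count_nonzero(S, axis=0) <= L))


s = np.array([0.0, 1.5, 0.0, -2.0])
S = np.array([[1.0, 0.0],
              [0.0, 0.0],
              [2.0, 3.0]])
print(l0_norm(s), satisfies_sparsity(S, L=2), satisfies_sparsity(S, L=1))
```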

Fig. 1: Sparse representation space of a) predefined dictionary, e.g. DCT or Wavelet (may cover all data), b) learned dictionary using in-class samples only, e.g. KSVD [15] or ODL [16] (may also cover some complementary samples) and c) desired DFDL (cover in-class samples only).

2.2. Problem Formulation

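We aim to build class-specific dictionaries $D_i$ such that each $D_i$ reasonably represents samples from class $i$ but is poorly capable of representing its complementary samples. Concretely, for the learned dictionaries we need:

$$\min_{\|S_i\|_0 \le L_i} \|Y_i - D_i S_i\|_F^2 \ \text{ to be small, and } \ \min_{\|\bar{S}_i\|_0 \le L_i} \|\bar{Y}_i - D_i \bar{S}_i\|_F^2 \ \text{ to be large}, \quad i = 1, 2, \dots, c,$$

where $L_i$ controls the sparsity level and $\|\cdot\|_F$ denotes the Frobenius norm. For simplicity, from now on, we consider only one class and drop the class index in each notation, i.e., using $Y$, $\bar{Y}$, $D$, $S$, $\bar{S}$ and $L$ instead of $Y_i$, $\bar{Y}_i$, $D_i$, $S_i$, $\bar{S}_i$ and $L_i$. Based on the argument above, we formulate the optimization problem for each dictionary:

$$D^{*} = \arg\min_{D} \left( \frac{1}{N} \min_{\|S\|_0 \le L} \|Y - DS\|_F^2 - \frac{\rho}{\bar{N}} \min_{\|\bar{S}\|_0 \le L} \|\bar{Y} - D\bar{S}\|_F^2 \right), \tag{1}$$

where $\rho$ is a positive regularization parameter, and $N$ and $\bar{N}$ are the numbers of in-class and complementary samples. The first term in the above optimization problem minimizes intra-class differences and the second term emphasizes inter-class differences. Solving the above problem yields the desired dictionaries.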
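The trade-off between the two terms can be sketched numerically. The snippet below evaluates the objective for given sparse codes, assuming it has the form of the average in-class residual minus a weighted average complementary residual; the normalization by sample counts, the weight name `rho`, and the function name `dfdl_objective` are assumptions made for this illustration.

```python
import numpy as np


def dfdl_objective(D, Y, S, Y_bar, S_bar, rho):
    """Average in-class residual minus rho times average complementary residual."""
    intra = np.linalg.norm(Y - D @ S, "fro") ** 2 / Y.shape[1]
    inter = np.linalg.norm(Y_bar - D @ S_bar, "fro") ** 2 / Y_bar.shape[1]
    return float(intra - rho * inter)


# Toy case: the in-class sample is reconstructed exactly, while the
# complementary sample is not represented at all (all-zero code).
D = np.eye(2)
Y = np.array([[1.0], [0.0]])
S = Y.copy()
Y_bar = np.array([[0.0], [2.0]])
S_bar = np.zeros((2, 1))

value = dfdl_objective(D, Y, S, Y_bar, S_bar, rho=0.5)
print(value)
```

A lower value is better: it drops when in-class samples are reconstructed well and when complementary samples are reconstructed badly.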

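In the same manner as SRC [11], a new patch $y$ is classified as follows. First, the sparse code $\hat{s}$ is calculated via $\ell_1$-norm minimization:

$$\hat{s} = \arg\min_{s} \left\{ \|y - D_{\text{total}}\, s\|_2^2 + \gamma \|s\|_1 \right\},$$

where $D_{\text{total}} = [D_1, D_2, \dots, D_c]$ is the collection of all dictionaries and $\gamma$ is a scalar constant. Second, the identity of $y$ is determined as $\arg\min_{i \in \{1, \dots, c\}} r_i(y)$, where

$$r_i(y) = \|y - D_i\, \delta_i(\hat{s})\|_2$$

and $\delta_i(\hat{s})$ is the part of $\hat{s}$ associated with class $i$.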
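The two classification steps can be sketched as follows. The text does not say which solver is used for the l1-regularized problem, so this example swaps in plain ISTA (iterative soft-thresholding) as an assumption; the names `ista`, `classify_patch` and the default `gamma` are hypothetical.

```python
import numpy as np


def ista(y, D, gamma, n_iter=500):
    """Minimize ||y - D s||_2^2 + gamma * ||s||_1 by iterative soft-thresholding."""
    step = 1.0 / (2.0 * np.linalg.norm(D, 2) ** 2)  # 1 / Lipschitz constant
    s = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = s - step * (2.0 * (D.T @ (D @ s - y)))  # gradient step
        s = np.sign(z) * np.maximum(np.abs(z) - step * gamma, 0.0)
    return s


def classify_patch(y, dictionaries, gamma=0.01):
    """Code y over the stacked dictionaries, then pick the class whose own
    bases and coefficients give the smallest reconstruction residual."""
    D_total = np.hstack(dictionaries)
    s_hat = ista(y, D_total, gamma)
    residuals, start = [], 0
    for D_i in dictionaries:
        k_i = D_i.shape[1]
        part = s_hat[start:start + k_i]  # coefficients belonging to class i
        residuals.append(float(np.linalg.norm(y - D_i @ part)))
        start += k_i
    return int(np.argmin(residuals)), residuals


# Toy example: class 0 owns two axes of R^3, class 1 owns the third.
D_0 = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])
D_1 = np.array([[0.0], [0.0], [1.0]])
y = np.array([1.0, 0.5, 0.05])
label, residuals = classify_patch(y, [D_0, D_1])
print(label, residuals)
```

The patch is assigned to class 0 here because almost all of its energy is explained by the bases of that class.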

Data: $Y$, $\bar{Y}$: collection of all in-class samples and complementary samples. $k$: number of learned bases. $\rho$: regularization parameter. $L$: sparsity level.
Result: $D$: dictionary.
1. Initialize $D$ by randomly picking $k$ columns of $Y$.
while not converged do
    2. Fix $D$ and update $S$ and $\bar{S}$ by solving Problem (2);
    3. Fix $S$ and $\bar{S}$, calculate:

Algorithm 1: DFDL for sparse representation-based classification

2.3. Proposed solution

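We use an iterative method to solve problem (1). Specifically, the process alternates between fixing $D$ while optimizing $S$ and $\bar{S}$, and vice versa.

At the sparse coding step, $S$ and $\bar{S}$ can be found by solving:

$$S^{*} = \arg\min_{\|S\|_0 \le L} \|Y - DS\|_F^2, \qquad \bar{S}^{*} = \arg\min_{\|\bar{S}\|_0 \le L} \|\bar{Y} - D\bar{S}\|_F^2.$$

With the same dictionary, these two sparse coding problems can be combined into the following one:

$$\hat{S}^{*} = \arg\min_{\|\hat{S}\|_0 \le L} \|\hat{Y} - D\hat{S}\|_F^2, \tag{2}$$

with $\hat{Y} = [Y, \bar{Y}]$ being the matrix of all training samples and $\hat{S} = [S, \bar{S}]$. This sparse coding problem can be solved by the OMP method [24] using the SPAMS toolbox [25].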
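The sparse coding step can be illustrated with a plain NumPy version of orthogonal matching pursuit. This is a didactic re-implementation under simple assumptions (greedy atom selection by absolute correlation, least-squares refit on the support), not the SPAMS routine used in the paper; the names `omp` and `sparse_code` are hypothetical.

```python
import numpy as np


def omp(y, D, L):
    """Greedy OMP: approximate y with at most L columns of D."""
    residual = np.asarray(y, dtype=float).copy()
    support, coef = [], np.zeros(0)
    for _ in range(L):
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = 0.0       # never select a basis twice
        j = int(np.argmax(correlations))
        if correlations[j] <= 1e-12:          # nothing left to explain
            break
        support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef   # refit on the whole support
    s = np.zeros(D.shape[1])
    if support:
        s[support] = coef
    return s


def sparse_code(Y_hat, D, L):
    """Code every column of Y_hat (in-class and complementary samples alike)."""
    return np.column_stack([omp(Y_hat[:, n], D, L) for n in range(Y_hat.shape[1])])


D = np.eye(4)
Y = np.array([[2.0], [0.0], [-3.0], [0.0]])      # in-class sample
Y_bar = np.array([[1.0], [1.0], [1.0], [1.0]])   # complementary sample
S_hat = sparse_code(np.hstack([Y, Y_bar]), D, L=2)
S, S_bar = S_hat[:, :1], S_hat[:, 1:]
print(S.ravel(), S_bar.ravel())
```

Coding the stacked matrix once and then splitting the columns mirrors the observation that both sparse coding problems share the same dictionary.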

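For the bases update stage, $D^{*}$ is found by solving:

$$D^{*} = \arg\min_{D} \left\{ \frac{1}{N} \|Y - DS\|_F^2 - \frac{\rho}{\bar{N}} \|\bar{Y} - D\bar{S}\|_F^2 \right\} \tag{3}$$

$$\phantom{D^{*}} = \arg\min_{D} \left\{ -2\,\mathrm{trace}(E D^{T}) + \mathrm{trace}(D F D^{T}) \right\} \tag{4}$$

using the method of block coordinate descent with a warm start to update bases one by one [16]. We have used the equation $\|M\|_F^2 = \mathrm{trace}(M M^{T})$ to derive (4) from (3) and denoted

$$E = \frac{1}{N} Y S^{T} - \frac{\rho}{\bar{N}} \bar{Y} \bar{S}^{T}, \qquad F = \frac{1}{N} S S^{T} - \frac{\rho}{\bar{N}} \bar{S} \bar{S}^{T}.$$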

Fig. 2: Samples from three data sets. Column 1: IBL data set, column 2: ADL data set, column 3: TCGA data set

In order to make the optimization problem tractable and solve it efficiently, we need one more requirement for $\rho$ to ensure that $-2\,\mathrm{trace}(E D^{T}) + \mathrm{trace}(D F D^{T})$ is a convex function with respect to $D$, or in other words, the symmetric matrix $F$ needs to be positive semidefinite (PSD). If we let $\lambda_{\min}(M)$ and $\lambda_{\max}(M)$ be the smallest and largest (real) eigenvalues of a symmetric matrix $M$, the PSD constraint of $F$ is equivalent to the non-negativity constraint of $\lambda_{\min}(F)$. Using Weyl’s inequalities, we can get a lower bound for $\lambda_{\min}(F)$:

$$\lambda_{\min}(F) \ge \frac{1}{N} \lambda_{\min}(S S^{T}) - \frac{\rho}{\bar{N}} \lambda_{\max}(\bar{S} \bar{S}^{T}).$$

As a result, if $\rho$ is guaranteed to be small enough such that $\rho \le \frac{\bar{N}}{N} \frac{\lambda_{\min}(S S^{T})}{\lambda_{\max}(\bar{S} \bar{S}^{T})}$, then $F$ is PSD. In fact, this problem is unstable and difficult to track since $S$ and $\bar{S}$ depend on $D$. We propose a practical solution for this difficulty as follows. First, we start with a small value of $\rho$ and check if $F$ is PSD at each iteration. If so, $\rho$ remains unchanged; otherwise, $\rho$ would be assigned to a smaller value, say, . In our experiments, gives good results. Our DFDL method is summarized in Algorithm 1.
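The bases update stage and the PSD safeguard can be sketched together. Several details here are assumptions, not facts from the text: the forms of `E` and `F` (taken as differences of in-class and complementary terms normalized by sample counts), the shrink `factor=0.5` (the value used by the authors is not legible in this copy), and the projection of each updated basis onto the unit ball, which follows the style of the online dictionary learning update in [16]. All function names are hypothetical.

```python
import numpy as np


def compute_E_F(Y, S, Y_bar, S_bar, rho):
    """Matrices of the quadratic form -2*trace(E D^T) + trace(D F D^T)."""
    N, N_bar = Y.shape[1], Y_bar.shape[1]
    E = Y @ S.T / N - rho * (Y_bar @ S_bar.T) / N_bar
    F = S @ S.T / N - rho * (S_bar @ S_bar.T) / N_bar
    return E, F


def is_psd(F, tol=1e-10):
    """True if the symmetric matrix F has no eigenvalue below -tol."""
    return bool(np.linalg.eigvalsh(F).min() >= -tol)


def shrink_rho_until_psd(Y, S, Y_bar, S_bar, rho, factor=0.5, max_tries=60):
    """Multiply rho by `factor` (an assumed value) until F becomes PSD."""
    for _ in range(max_tries):
        E, F = compute_E_F(Y, S, Y_bar, S_bar, rho)
        if is_psd(F):
            return rho, E, F
        rho *= factor
    raise RuntimeError("F did not become PSD; S @ S.T may be singular")


def update_dictionary(D, E, F, n_sweeps=10):
    """Block coordinate descent over the columns of D, warm-started at D."""
    D = D.copy()
    for _ in range(n_sweeps):
        for j in range(D.shape[1]):
            if F[j, j] <= 1e-12:
                continue  # skip a basis whose diagonal entry is not positive
            u = D[:, j] + (E[:, j] - D @ F[:, j]) / F[j, j]
            D[:, j] = u / max(np.linalg.norm(u), 1.0)  # assumed unit-ball projection
    return D


def quadratic_objective(D, E, F):
    return float(-2.0 * np.trace(E @ D.T) + np.trace(D @ F @ D.T))


rng = np.random.default_rng(0)
Y, Y_bar = rng.standard_normal((5, 40)), rng.standard_normal((5, 30))
S, S_bar = rng.standard_normal((4, 40)), rng.standard_normal((4, 30))
D0 = rng.standard_normal((5, 4))
D0 /= np.linalg.norm(D0, axis=0)          # start from unit-norm bases

rho, E, F = shrink_rho_until_psd(Y, S, Y_bar, S_bar, rho=4.0)
D1 = update_dictionary(D0, E, F)
print(rho, quadratic_objective(D0, E, F), quadratic_objective(D1, E, F))
```

Each column update is the exact minimizer of the quadratic form over the unit ball with the other columns fixed, so the printed objective does not increase from `D0` to `D1`.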

3. EXPERIMENTAL RESULTS

In this section, we present the experimental results of applying DFDL to three diverse histopathological image databases (sample images are shown in Fig. 2) and compare our results with those obtained using an SVM in conjunction with a collection of state-of-the-art histopathology features from WND-CHARM [7] (referred to as the WND-CHARM method), SRC [11], LA-SHIRC [3] and other DL methods (LC-KSVD [20] and FDDL [21]). In each experiment, 10000 20-by-20 patches are randomly extracted from training images for each class. Each dictionary learning method learns the same number of bases, say 500, per class.
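The patch sampling described above can be sketched as follows. The function name `extract_random_patches`, the uniform sampling of image and position, and the fixed seed are assumptions for illustration; the snippet works for grayscale arrays and for arrays with a trailing color axis, since each patch is simply flattened into one column.

```python
import numpy as np


def extract_random_patches(images, n_patches=10000, patch_size=20, seed=0):
    """Return a matrix whose columns are vectorized, randomly located patches."""
    rng = np.random.default_rng(seed)
    columns = []
    for _ in range(n_patches):
        img = images[int(rng.integers(len(images)))]          # pick an image
        r = int(rng.integers(img.shape[0] - patch_size + 1))  # top-left row
        c = int(rng.integers(img.shape[1] - patch_size + 1))  # top-left column
        patch = img[r:r + patch_size, c:c + patch_size]
        columns.append(patch.reshape(-1))
    return np.stack(columns, axis=1)


# Toy usage with two synthetic grayscale "training images" of one class.
rng = np.random.default_rng(1)
training_images = [rng.random((50, 60)), rng.random((40, 45))]
Y = extract_random_patches(training_images, n_patches=100)
print(Y.shape)
```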

Fig. 3: Examples of learned bases from (a) UDH and (b) DCIS dictionaries.

IBL data set: This data set contains images that belong to one of two well-defined classes: usual ductal hyperplasia (UDH) and ductal carcinoma in situ (DCIS). Ground truth class labels for the images are assigned manually by pathologists. A total of 40 patient cases (20 well-defined DCIS and 20 UDH) are identified for experiments in the manner described in [26]. Each case contains a number of regions of interest (RoIs), and we have chosen a total of 120 images (RoIs), consisting of a randomly selected set of 20 images for training and the remaining 100 RoIs for testing. Images are downsampled to reduce computation such that the size of a cell is around 20-by-20 pixels. In the classification step, an image is decomposed into non-overlapping patches and is classified as Healthy if the proportion of classified-as-healthy patches is higher than a threshold. To reduce the dependence on a particular choice of training set, we perform 10 different trials of each experiment, each with an arbitrary choice of training images. All results reported next are the average of the classification accuracies over the 10 trials.
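The image-level decision described above can be sketched as a vote over non-overlapping patches. The threshold value is not given in the text, so `threshold=0.5` is an assumption, as are the function names and the label encoding (0 for healthy); the patch labels would come from the patch classifier.

```python
import numpy as np


def non_overlapping_patches(image, patch_size=20):
    """Split an image into vectorized non-overlapping patches (borders dropped)."""
    rows = image.shape[0] // patch_size
    cols = image.shape[1] // patch_size
    patches = []
    for r in range(rows):
        for c in range(cols):
            block = image[r * patch_size:(r + 1) * patch_size,
                          c * patch_size:(c + 1) * patch_size]
            patches.append(block.reshape(-1))
    return patches


def image_is_healthy(patch_labels, healthy_label=0, threshold=0.5):
    """True if the share of classified-as-healthy patches exceeds the threshold."""
    labels = np.asarray(patch_labels)
    return bool(np.mean(labels == healthy_label) > threshold)


patches = non_overlapping_patches(np.zeros((45, 65)))
print(len(patches), patches[0].shape)
print(image_is_healthy([0, 0, 1, 0]), image_is_healthy([0, 1, 1, 1]))
```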

Learned bases corresponding to the two classes are visualized in Fig. 3. The average classification accuracy for each method is shown in Table 1. It is evident from the table that DFDL outperforms the others, offering a classification accuracy of nearly 100 percent in recognizing DCIS and just under 97 percent for UDH. This means that with the DFDL method, the probability of a miss is extremely low while that of a false alarm is kept at a low level. To illustrate the efficiency of DFDL, we keep the number of training patches and learned bases fixed but reduce the number of training images by half. Notably, DFDL still shows outstanding results compared to the other methods.

ADL-Lung data set: This database contains bovine histopathology images of lung acquired by pathologists at the Animal Diagnostics Lab, Pennsylvania State University. These images are scanned using a whole slide digital scanner at 40x optical magnification and are of size pixels. For the purpose of computational speed-up, all images are downsampled to pixels in an aliasing-free manner. This database consists of images from two classes: healthy and inflammatory. Each class has 150 images, of which 40 are chosen for training and the remaining ones are used for testing. The average results over 10 trials for the different methods are presented in Table 1. SHIRC and LC-KSVD are moderately suitable for detecting inflammation, while WND-CHARM provides fairly high performance only in recognizing healthy organs. In contrast, DFDL offers the best accuracy in detecting both healthy and inflamed organs, with more than 92 percent for the former and over 94 percent for the latter.

Table 1: Classification accuracies: IBL and ADL Lung

Table 2: Confusion matrix: MVP and Not MVP

TCGA data set: In this section, we present experimental results on the brain cancer histopathological images obtained from the TCGA database [27] provided by the National Institutes of Health. One important indicator of a high grade glioma is the presence of MicroVascular Proliferation (MVP). Essentially, MVP is the proliferation of hypertrophic endothelial cells in the tissue. An example of a tissue containing MVP regions is illustrated in Fig. 2f. In this paper, we apply our method to find MVP regions with a slight modification to the decision procedure. This modification is crucial to obtain the desired performance, since MVP detection is an inherently more difficult problem because of the complexity of the cell structure and the morphological features of the cells in MVP regions.

Unlike classification on the IBL and ADL Lung data sets, where images are distinguishable by examining small regions, MVP detection requires more effort because an MVP region might be surrounded by tumor cells that are actually low grade. We define a patch as MVP if it lies entirely within an MVP region and as Not MVP otherwise. A new image is divided into non-overlapping patches and is classified as MVP only if it contains a sufficiently large number of neighboring classified-as-MVP patches.
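One way to read "a sufficiently large number of neighboring classified-as-MVP patches" is as a minimum size for a connected group of MVP patches on the patch grid. The sketch below follows that reading; the 4-neighbor connectivity, the `min_group_size=4` threshold and the function names are all assumptions, since the text does not specify them.

```python
from collections import deque

import numpy as np


def largest_connected_group(mask):
    """Size of the largest 4-connected group of True cells in a 2-D mask."""
    mask = np.asarray(mask, dtype=bool)
    seen = np.zeros_like(mask)
    best = 0
    for r0, c0 in zip(*np.nonzero(mask)):
        if seen[r0, c0]:
            continue
        seen[r0, c0] = True
        queue, size = deque([(r0, c0)]), 0
        while queue:                       # breadth-first flood fill
            r, c = queue.popleft()
            size += 1
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                inside = 0 <= rr < mask.shape[0] and 0 <= cc < mask.shape[1]
                if inside and mask[rr, cc] and not seen[rr, cc]:
                    seen[rr, cc] = True
                    queue.append((rr, cc))
        best = max(best, size)
    return best


def image_is_mvp(patch_mask, min_group_size=4):
    """Declare MVP only if enough neighboring patches were classified as MVP."""
    return largest_connected_group(patch_mask) >= min_group_size


scattered = np.array([[1, 0, 1], [0, 0, 0], [1, 0, 1]])  # isolated detections
clustered = np.array([[1, 1, 0], [1, 1, 0], [0, 0, 0]])  # one 2x2 cluster
print(image_is_mvp(scattered), image_is_mvp(clustered))
```

Both toy grids contain four MVP patches, but only the clustered one passes, which is the difference between this rule and a plain proportion threshold.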

We use a total of 178 images (resolution ) from the TCGA, 52 for MVP and 126 for Not MVP; 20 images are randomly selected from each class for training. We manually extract MVP and Not MVP regions from the training images and randomly extract training patches from these regions for learning. Experimental results for DFDL and WND-CHARM are presented in Table 2. The table also shows the promising performance of DFDL in detecting MVP, with an accuracy more than three percent higher than that of the state-of-the-art WND-CHARM features.

4. REFERENCES

[1] M.N. Gurcan, L.E. Boucheron, A. Can, A. Madabhushi, N.M. Rajpoot, and B. Yener, “Histopathological image analysis: a review,” IEEE Rev. Biomed. Eng., vol. 2, 2009.

[2] U. Srinivas, H. S. Mousavi, C. Jeon, V. Monga, A. Hattel, and B. Jayarao, “SHIRC: A simultaneous sparsity model for histopathological image representation and classification,” Proc. IEEE Int. Symp. Biomed. Imag., pp. 1118–1121, Apr. 2013.

[3] U. Srinivas, H. S. Mousavi, V. Monga, A. Hattel, and B. Jayarao, “Simultaneous sparsity model for histopathological image representation and classification,” IEEE Trans. on Medical Imaging, vol. 33, no. 5, pp. 1163–1179, May 2014.

[4] N. Nayak, H. Chang, A. Borowsky, P. Spellman, and B. Parvin, “Classification of tumor histopathology via sparse feature learning,” in Proc. IEEE Int. Symp. Biomed. Imag., 2013, pp. 1348–1351.

[5] H. S. Mousavi, V. Monga, A. UK Rao, and G. Rao, “Automated discrimination of lower and higher grade gliomas based on histopathological image analysis,” Journal of Pathology Informatics, 2015.

[6] N. Orlov, L. Shamir, T. Macura, J. Johnston, D.M. Eckley, and I.G. Goldberg, “WND-CHARM: Multipurpose image classification using compound image transforms,” Pattern Recogn. Lett., vol. 29, no. 11, pp. 1684–1693, 2008.

[7] L. Shamir, N. Orlov, D.M. Eckley, T. Macura, J. Johnston, and I.G. Goldberg, “Wndchrm–an open source utility for biological image analysis,” Source Code Biol. Med., vol. 3, no. 13, 2008.

[8] T. Gultekin, C. Koyuncu, C. Sokmensuer, and C. Gunduz-Demir, “Two-tier tissue decomposition for histopathological image representation and classification,” IEEE Trans. on Medical Imaging, vol. 34, pp. 275–283.

[9] J. Shi, Y. Li, J. Zhu, H. Sun, and Y. Cai, “Joint sparse coding based spatial pyramid matching for classification of color medical image,” Computerized Medical Imaging and Graphics, 2014.

[10] SH. Minaee, Y. Wang, and Y. W. Lui, “Prediction of longterm outcome of neuropsychological tests of mtbi patients using imaging features,” in Signal Proc. in Med. and Bio. Symp. IEEE, 2013.

[11] J. Wright, A.Y. Yang, A. Ganesh, S.S. Sastry, and Y. Ma, “Robust face recognition via sparse representation,” IEEE Trans. on Pattern Analysis and Machine Int., vol. 31, no. 2, pp. 210–227, Feb. 2009.

[12] S. Bahrampour, A. Ray, N.M. Nasrabadi, and K.W. Jenkins, “Quality-based multimodal classification using tree-structured sparsity,” in Proc. IEEE Conf. Computer Vision Pattern Recognition, 2014, pp. 4114–4121.

[13] H. S. Mousavi, U. Srinivas, V. Monga, Y. Suo, M. Dao, and T.D. Tran, “Multi-task image classification via collaborative, hierarchical spike-and-slab priors,” in Proc. IEEE Conf. on Image Processing, 2014, pp. 4236–4240.

[14] M. Elad and M. Aharon, “Image denoising via sparse and redundant representations over learned dictionaries,” Image Processing, IEEE Transactions on, vol. 15, no. 12, pp. 3736–3745, 2006.

[15] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Trans. on Signal Processing, vol. 54, no. 11, pp. 4311–4322, 2006.

[16] Julien Mairal, Francis Bach, Jean Ponce, and Guillermo Sapiro, “Online dictionary learning for sparse coding,” in Proc. International Conference on Machine Learning. ACM, 2009, pp. 689–696.

[17] F. Anaraki Pourkamali and Sh. M. Hughes, “Kernel compressive sensing,” in Proc. IEEE Conf. on Image Processing, 2013, pp. 494–498.

[18] M. Sadeghi, M. Babaie-Zadeh, and C. Jutten, “Learning overcomplete dictionaries based on atom-by-atom updating,” IEEE Trans. on Signal Processing, vol. 62, no. 4, pp. 883–891, 2014.

[19] Y. Suo, M. Dao, T. Tran, H. Mousavi, U. Srinivas, and V. Monga, “Group structured dirty dictionary learning for classification,” in Proc. IEEE Conf. on Image Processing, 2014, pp. 150–154.

[20] Z. Jiang, Z. Lin, and L.S. Davis, “Label consistent K-SVD: Learning a discriminative dictionary for recognition,” IEEE Trans. on Pattern Analysis and Machine Int., vol. 35, no. 11, pp. 2651–2664, 2013.

[21] M. Yang, L. Zhang, X. Feng, and D. Zhang, “Fisher discrimination dictionary learning for sparse representation,” in Proc. IEEE Conf. on Computer Vision, Nov. 2011, pp. 543–550.

[22] M. Liu, L. Lu, X. Ye, S. Yu, and M. Salganicoff, “Sparse classification for computer aided diagnosis using learned dictionaries,” Proc. IEEE Int. Symp. Biomed. Imag., vol. 6893, pp. 41–48, 2011.

[23] S. Zhang, J. Huang, D. Metaxas, W. Wang, and X. Huang, “Discriminative sparse representations for cervigram image segmentation,” Proc. IEEE Int. Symp. Biomed. Imag., pp. 133–136, 2010.

[24] J.A. Tropp and A.C. Gilbert, “Signal recovery from random measurements via orthogonal matching pursuit,” IEEE Trans. on Info. Theory, vol. 53, no. 12, pp. 4655–4666, 2007.

[25] “SPArse Modeling Software,” http://spams-devel.gforge.inria.fr/, Accessed: 2014-11-05.

[26] M. M. Dundar, S. Badve, G. Bilgin, V. Raykar, R. Jain, O. Sertel, and M. N. Gurcan, “Computerized classification of intraductal breast lesions using histopathological images,” IEEE Trans. on Biomedical Engineering, vol. 58, no. 7, pp. 1977–1984, 2011.

[27] National Institutes of Health, “The Cancer Genome Atlas (TCGA) database,” http://cancergenome.nih.gov, Accessed: 2014-11-09.
