DFDL: Discriminative Feature-oriented Dictionary Learning for Histopathological Image Classification
2015·arXiv
ABSTRACT

In histopathological image analysis, feature extraction for classification is a challenging task due to the diversity of histology features suitable for each problem as well as the presence of rich geometrical structure. In this paper, we propose an automatic feature discovery framework for extracting discriminative class-specific features and present a low-complexity method for classification and disease grading in histopathology. Essentially, our Discriminative Feature-oriented Dictionary Learning (DFDL) method learns class-specific features which are suitable for representing samples from the same class while being poorly capable of representing samples from other classes. Experiments on three challenging real-world image databases: 1) histopathological images of intraductal breast lesions, 2) mammalian lung images provided by the Animal Diagnostics Lab (ADL) at Pennsylvania State University, and 3) brain tumor images from The Cancer Genome Atlas (TCGA) database, show the advantage of the DFDL model over state-of-the-art methods on a variety of problems.

Index Terms: Histopathological image classification, Sparse coding, Dictionary learning, Feature extraction

Automated histopathological image analysis has recently become a significant research problem in medical imaging, and there is an increasing need for quantitative image analysis methods that complement the effort of pathologists in the diagnosis process. Consequently, an emerging class of problems in medical imaging focuses on the development of computerized frameworks to classify histopathological images [1–5]. These advanced image analysis methods have been developed with the purpose of relieving the workload on pathologists by sieving out obviously diseased and obviously healthy cases, which allows specialists to spend more time on more sophisticated cases.

In the diagnosis process, pathologists often look for problem-specific visual cues in histopathological images in order to assign a tissue image to one of the possible categories. Consequently, different customized feature extraction techniques for a variety of problems have been developed based on these visual cues [6–10]. However, a challenging question in medical image analysis is how to extract these features. The challenge stems from the richness of geometric structures in tissue imagery and the meaningful pathological information at diverse scales. Although several methods have been proposed for this crucial task, they are mostly designed exclusively for particular data sets and are highly dependent on preprocessing steps (e.g., color normalization and nuclear segmentation), limiting their performance on general histopathology problems. In order to reduce the workload in the preprocessing step and to develop a more general solution, we propose a dictionary learning method relying on a sparse representation-based framework that can automatically discover relevant features from raw medical images and can be applied to several histopathological data sets.

Sparse representation-based methods are powerful tools for image classification [11–13]. The underlying idea is that, given a class of images and a sufficient collection of bases, a test image can be expressed approximately as a sparse linear combination of those bases. Representing signals using a set of learned bases instead of predefined bases, e.g. DCT and wavelet bases, has led to state-of-the-art results in various applications such as denoising, inpainting and classification [14–16]. To obtain a comprehensive set of bases, sparsity and task-driven constraints are combined in several ways into optimization problems, which are called dictionary learning methods. For classification problems, the class-specific design of such dictionaries enables class assignment via a simple reconstruction error-based metric [17, 18]. In particular, GDDL [19] and LC-KSVD [20] enforce the label consistency needed between dictionary bases and training data for classification. Meanwhile, FDDL [21] encourages coding coefficients to have small intra-class scatter but large inter-class scatter.

Sparsity-based classification schemes have also recently been proposed for medical applications [22,23]. Specifically, Srinivas et al. [2,3] represented a multi-channel histopathological image as a sparse linear combination of training examples under channel-wise constraints and proposed a residual-based classification technique. In addition, Parvin et al. [4] combined a dictionary learning framework with a Restricted Boltzmann Machine to learn sparse features for classification.

Being mindful of the challenges of feature extraction from histopathological images, we aim to build discriminative bases for each class by imposing sparsity constraints that minimize intra-class differences while simultaneously emphasizing inter-class differences. Small intra-class differences encourage the comprehensiveness of the set of learned bases, which is then able to represent in-class samples with only a few bases (intra-class sparsity). Simultaneously, large inter-class differences prevent the bases of a class from sparsely representing samples from other classes (complementary samples). This crucial property of the learned bases promotes the discrimination ability of the sparse code (coefficient vector) for classification. Concretely, given a dictionary D from a particular class containing k bases and a certain number L ≪ k, we define an L-subspace of D as the span of a subset of L bases from D. Our proposed Discriminative Feature-oriented Dictionary Learning (DFDL) aims to build dictionaries with this key property: any sample from a class is reasonably close to an L-subspace of the associated dictionary, while a complementary sample is far from every L-subspace of that dictionary. The idea is illustrated in Fig. 1.

Contributions: The main contributions of this paper are as follows: (1) A dictionary learning method for automatic feature discovery in histopathological images, which mitigates the generally difficult problem of feature extraction in medical images. (2) A discriminative formulation that emphasizes inter-class differences while keeping intra-class differences small, resulting in enhanced classification performance. (3) An application of the proposed method to three diverse histopathological data sets, showing its capability in handling a variety of diagnosis and grading problems. Extensive experimental results show that our method provides outstanding performance even with a small number of training images.

2.1. Notation

Suppose that we have c classes. The vectorization of a small block (or patch) of an image, which will be referred to as a sample, is denoted as a column vector $y \in \mathbb{R}^d$. For $i = 1, 2, \dots, c$, let $Y_i \in \mathbb{R}^{d \times N_i}$ and $\bar{Y}_i \in \mathbb{R}^{d \times \bar{N}_i}$ be the matrices containing all data samples from class i and all of its complementary samples, respectively. We denote by $D_i \in \mathbb{R}^{d \times k_i}$ the dictionary of class i.

For a code $s \in \mathbb{R}^k$, we denote by $\|s\|_0$ the number of its non-zeros. The sparsity constraint on s can be formulated as $\|s\|_0 \le L$ with $L \ll k$. For a matrix S, $\|S\|_0 \le L$ means that each column of S has no more than L non-zeros.

Fig. 1: Sparse representation space of a) a predefined dictionary, e.g. DCT or wavelet ($S_{L_0,\varepsilon_0}(D_{\text{pre}})$ may cover all data), b) a dictionary learned using in-class samples only, e.g. KSVD [15] or ODL [16] ($S_{L_1,\varepsilon_1}(D_{\text{in-class}})$ may also cover some complementary samples), and c) the desired DFDL ($S_{L_2,\varepsilon_2}(D_{\text{DFDL}})$ covers in-class samples only).

2.2. Problem Formulation

We aim to build class-specific dictionaries $D_i$ such that each $D_i$ represents samples from class i reasonably well but is poorly capable of representing its complementary samples. Concretely, for the learned dictionaries we need

$$\frac{1}{N_i} \min_{\|S_i\|_0 \le L_i} \|Y_i - D_i S_i\|_F^2 \ \text{ to be small and } \ \frac{1}{\bar{N}_i} \min_{\|\bar{S}_i\|_0 \le L_i} \|\bar{Y}_i - D_i \bar{S}_i\|_F^2 \ \text{ to be large}, \quad i = 1, \dots, c,$$

where $L_i$ controls the sparsity level and $\|\cdot\|_F$ denotes the Frobenius norm. For simplicity, from now on we consider only one class and drop the class index in each notation, i.e., we use $Y, D, S, \bar{S}, N, \bar{N}, L$ instead of $Y_i, D_i, S_i, \bar{S}_i, N_i, \bar{N}_i$ and $L_i$. Based on the argument above, we formulate the optimization problem for each dictionary:

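$$D^* = \arg\min_{D} \left( \frac{1}{N} \min_{\|S\|_0 \le L} \|Y - DS\|_F^2 - \frac{\rho}{\bar{N}} \min_{\|\bar{S}\|_0 \le L} \|\bar{Y} - D\bar{S}\|_F^2 \right) \tag{1}$$

where $\rho$ is a positive regularization parameter. The first term in the above optimization problem minimizes intra-class differences and the second term emphasizes inter-class differences. Solving this problem for each class yields the desired dictionaries.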
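To make the two-term cost concrete, the following minimal sketch evaluates it for fixed codes. It assumes the intra-class term is weighted by 1/N and the complementary term by ρ/N̄, which matches the weights in the matrices E and F used later in the paper; the function name `dfdl_cost` and the toy data are illustrative and not from the original. The codes are dense here purely to keep the example short.

```python
import numpy as np

def dfdl_cost(D, Y, S, Y_bar, S_bar, rho):
    """Intra-class fit minus rho times the complementary fit, for fixed codes."""
    N, N_bar = Y.shape[1], Y_bar.shape[1]
    intra = np.linalg.norm(Y - D @ S, "fro") ** 2 / N            # should be small
    inter = np.linalg.norm(Y_bar - D @ S_bar, "fro") ** 2 / N_bar  # should be large
    return intra - rho * inter

rng = np.random.default_rng(0)
d, k, N, N_bar = 16, 8, 30, 40
D = rng.standard_normal((d, k))
S = rng.standard_normal((k, N))
Y = D @ S                          # in-class samples that D represents exactly
Y_bar = rng.standard_normal((d, N_bar))
S_bar = np.zeros((k, N_bar))       # complementary samples left unrepresented
cost = dfdl_cost(D, Y, S, Y_bar, S_bar, rho=0.001)
```

In this toy setting the first term vanishes and the cost is negative, which is the situation the formulation rewards: in-class samples are fitted well and complementary samples are not.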

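In the same manner as SRC [11], a new patch y is classified as follows. First, the sparse code $\hat{s}$ is calculated via $\ell_1$-norm minimization:

$$\hat{s} = \arg\min_{s} \left\{ \|y - D_{\text{total}}\, s\|_2^2 + \lambda \|s\|_1 \right\},$$

where $D_{\text{total}} = [D_1, D_2, \dots, D_c]$ is the collection of all dictionaries and $\lambda$ is a scalar constant. Second, the identity of y is determined as $\arg\min_{1 \le i \le c} \{\delta_i(y)\}$, where

$$\delta_i(y) = \|y - D_i\, \delta_i(\hat{s})\|_2$$

and $\delta_i(\hat{s})$ is the part of $\hat{s}$ associated with class i.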
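The two-step decision rule can be sketched as follows. The text does not say which ℓ1 solver is used, so this example substitutes plain ISTA (iterative soft thresholding) as a stand-in; `ista_lasso`, `classify_patch`, the value of `lam`, and the toy dictionaries are all assumptions made for illustration.

```python
import numpy as np

def ista_lasso(D, y, lam, n_iter=500):
    """Approximately minimise ||y - D s||_2^2 + lam * ||s||_1 with ISTA."""
    step = 1.0 / (2.0 * np.linalg.norm(D, 2) ** 2)   # 1 / Lipschitz constant of the gradient
    s = np.zeros(D.shape[1])
    for _ in range(n_iter):
        z = s - step * 2.0 * D.T @ (D @ s - y)        # gradient step on the quadratic term
        s = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft threshold
    return s

def classify_patch(dictionaries, y, lam=0.01):
    """Return the class whose bases leave the smallest residual, plus all residuals."""
    D_total = np.hstack(dictionaries)
    s_hat = ista_lasso(D_total, y, lam)
    residuals, start = [], 0
    for D_i in dictionaries:
        k_i = D_i.shape[1]
        s_i = s_hat[start:start + k_i]                # part of s_hat belonging to class i
        residuals.append(np.linalg.norm(y - D_i @ s_i))
        start += k_i
    return int(np.argmin(residuals)), residuals

# Toy example: two classes whose bases span disjoint coordinate subspaces.
I = np.eye(6)
dictionaries = [I[:, :3], I[:, 3:]]
label, residuals = classify_patch(dictionaries, 0.8 * I[:, 0] + 0.6 * I[:, 1])
```

Here the test patch lies in the span of the first dictionary, so the residual for class 0 is close to zero while the residual for class 1 is close to the norm of the patch.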

Data: $Y$, $\bar{Y}$: collections of all in-class samples and complementary samples. $k$: number of learned bases. $\rho$: regularization parameter. $L$: sparsity level.
Result: $D$: dictionary.
1. Initialize $D$ by randomly picking $k$ columns of $Y$.
while not converged do
    2. Fix $D$ and update $S$, $\bar{S}$ by solving Problem (2).
    3. Fix $S$, $\bar{S}$ and calculate $E = \frac{1}{N} Y S^T - \frac{\rho}{\bar{N}} \bar{Y} \bar{S}^T$ and $F = \frac{1}{N} S S^T - \frac{\rho}{\bar{N}} \bar{S} \bar{S}^T$.
    4. Update $D$ by solving Problem (4).
end

Algorithm 1: DFDL for sparse representation-based classification

2.3. Proposed solution

We use an iterative method to solve problem (1). Specifically, the process alternates between fixing D while optimizing $S, \bar{S}$ and vice versa.

At the sparse coding step, $S^*, \bar{S}^*$ can be found by solving:

$$S^* = \arg\min_{\|S\|_0 \le L} \|Y - DS\|_F^2, \qquad \bar{S}^* = \arg\min_{\|\bar{S}\|_0 \le L} \|\bar{Y} - D\bar{S}\|_F^2.$$

Since both problems share the same dictionary, they can be combined into a single one:

$$\hat{S}^* = \arg\min_{\|\hat{S}\|_0 \le L} \|\hat{Y} - D\hat{S}\|_F^2 \tag{2}$$

with $\hat{Y} = [Y, \bar{Y}]$ being the matrix of all training samples and $\hat{S} = [S, \bar{S}]$. This sparse coding problem can be solved by the OMP method [24] using the SPAMS toolbox [25].
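The paper solves this step with the OMP routine of the SPAMS toolbox. As a self-contained illustration of what that step computes, here is a minimal NumPy sketch of orthogonal matching pursuit applied column by column to the stacked sample matrix. It is not the SPAMS implementation; the names `omp` and `sparse_code_all` are hypothetical, and the dictionary columns are assumed to have unit norm.

```python
import numpy as np

def omp(D, y, L):
    """Greedy sparse code of y over D with at most L non-zeros."""
    residual, support = y.copy(), []
    s = np.zeros(D.shape[1])
    coef = np.zeros(0)
    for _ in range(L):
        correlations = np.abs(D.T @ residual)
        if support:
            correlations[support] = 0.0        # never pick the same basis twice
        j = int(np.argmax(correlations))
        if correlations[j] < 1e-12:            # residual is already explained
            break
        support.append(j)
        # Re-fit all selected coefficients by least squares (the "orthogonal" step).
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    if support:
        s[support] = coef
    return s

def sparse_code_all(D, Y, Y_bar, L):
    """Code in-class and complementary samples together, then split the codes."""
    Y_hat = np.hstack([Y, Y_bar])
    S_hat = np.column_stack([omp(D, Y_hat[:, n], L) for n in range(Y_hat.shape[1])])
    return S_hat[:, :Y.shape[1]], S_hat[:, Y.shape[1]:]

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 10))
D /= np.linalg.norm(D, axis=0)                 # unit-norm bases
Y = D[:, [1, 4]] @ np.array([[2.0], [-1.0]])   # one sample built from two bases
Y_bar = rng.standard_normal((20, 3))
S, S_bar = sparse_code_all(D, Y, Y_bar, L=2)
```

Coding the stacked matrix once and splitting the result mirrors the observation in the text that both coding problems share the same dictionary.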

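For the bases update stage, with $S$ and $\bar{S}$ fixed, $D^*$ is found by solving:

$$D^* = \arg\min_{D} \left\{ \frac{1}{N} \|Y - DS\|_F^2 - \frac{\rho}{\bar{N}} \|\bar{Y} - D\bar{S}\|_F^2 \right\} \tag{3}$$

$$\phantom{D^*} = \arg\min_{D} \left\{ -2\,\mathrm{trace}(E D^T) + \mathrm{trace}(D^T D F) \right\} \tag{4}$$

using the method of block coordinate descent with a warm start to update the bases one by one [16]. We have used the identity $\|M\|_F^2 = \mathrm{trace}(M M^T)$ to derive (4) from (3) and denoted $E = \frac{1}{N} Y S^T - \frac{\rho}{\bar{N}} \bar{Y} \bar{S}^T$ and $F = \frac{1}{N} S S^T - \frac{\rho}{\bar{N}} \bar{S} \bar{S}^T$.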
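The column-by-column update can be sketched as below, following the general pattern of the online dictionary learning update in [16] with E and F in the roles of the accumulated statistics. The unit-ball constraint on each basis is an assumption borrowed from that method, since the text does not state the constraint set; `surrogate_cost` and `update_dictionary` are illustrative names.

```python
import numpy as np

def surrogate_cost(D, E, F):
    """-2 trace(E D^T) + trace(D^T D F): the part of the cost that depends on D."""
    return -2.0 * np.trace(E @ D.T) + np.trace(D.T @ D @ F)

def update_dictionary(D, E, F, n_sweeps=10):
    """Block coordinate descent over the columns of D, warm-started from D."""
    D = D.copy()
    for _ in range(n_sweeps):
        for j in range(D.shape[1]):
            if F[j, j] <= 1e-12:               # basis unused by the codes: leave it
                continue
            # Exact minimiser over column j with all other columns held fixed.
            u = D[:, j] + (E[:, j] - D @ F[:, j]) / F[j, j]
            # Assumed constraint ||d_j||_2 <= 1 (not stated in the text).
            D[:, j] = u / max(np.linalg.norm(u), 1.0)
    return D

rng = np.random.default_rng(0)
d, k, N, N_bar, rho = 12, 5, 50, 60, 0.001
Y, Y_bar = rng.standard_normal((d, N)), rng.standard_normal((d, N_bar))
S, S_bar = rng.standard_normal((k, N)), rng.standard_normal((k, N_bar))
E = Y @ S.T / N - rho * Y_bar @ S_bar.T / N_bar
F = S @ S.T / N - rho * S_bar @ S_bar.T / N_bar
D0 = rng.standard_normal((d, k))
D0 /= np.linalg.norm(D0, axis=0)               # feasible warm start
D1 = update_dictionary(D0, E, F)

# Numerical check that the trace form differs from the Frobenius form by a constant.
full = (np.linalg.norm(Y - D1 @ S) ** 2 / N
        - rho * np.linalg.norm(Y_bar - D1 @ S_bar) ** 2 / N_bar)
const = np.linalg.norm(Y) ** 2 / N - rho * np.linalg.norm(Y_bar) ** 2 / N_bar
```

The last lines check numerically that the Frobenius-norm cost equals the trace expression plus a term that does not depend on D, which is why minimising either gives the same dictionary.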

In order to make the optimization problem tractable and solve it efficiently, we need one more requirement on $\rho$ to ensure that $-2\,\mathrm{trace}(E D^T) + \mathrm{trace}(D^T D F)$ is a convex function with respect to D; in other words, the symmetric matrix F needs to be positive semidefinite (PSD). If we let $\lambda_1(M) \le \lambda_2(M) \le \dots \le \lambda_{\max}(M)$ be the (real) eigenvalues of a symmetric matrix M, the PSD constraint on F is equivalent to the non-negativity of $\lambda_1(F)$.

Fig. 2: Samples from three data sets. Column 1: IBL data set, column 2: ADL data set, column 3: TCGA data set.

Using Weyl's inequalities, we obtain a lower bound for $\lambda_1(F)$:

$$\lambda_0 = \frac{1}{N} \lambda_1(S S^T) - \frac{\rho}{\bar{N}} \lambda_{\max}(\bar{S} \bar{S}^T) \le \lambda_1(F).$$

As a result, if $\rho$ is small enough that $\lambda_0 \ge 0$, then F is PSD. In practice, this condition is unstable and difficult to track since $S, \bar{S}$ depend on D. We propose a practical solution to this difficulty as follows. First, we start with a small value of $\rho$ and check whether F is PSD at each iteration. If so, $\rho$ remains unchanged; otherwise, $\rho$ is assigned a smaller value, say, $0.9\rho$. In our experiments, $\rho = 0.001$ gives good results. Our DFDL method is summarized in Algorithm 1.
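The per-iteration check described above might look like the following sketch, which also evaluates the Weyl lower bound so the two can be compared. The function names and the tolerance `tol` are assumptions. The demo holds the codes fixed and starts from a deliberately large ρ, whereas in training the codes change from one iteration to the next.

```python
import numpy as np

def smallest_eig_F(S, S_bar, rho):
    """Smallest eigenvalue of F = S S^T / N - rho * S_bar S_bar^T / N_bar."""
    F = S @ S.T / S.shape[1] - rho * S_bar @ S_bar.T / S_bar.shape[1]
    return np.linalg.eigvalsh(F)[0]            # eigvalsh returns ascending eigenvalues

def weyl_lower_bound(S, S_bar, rho):
    """lambda_0 = lambda_1(S S^T) / N - rho * lambda_max(S_bar S_bar^T) / N_bar."""
    lam_min = np.linalg.eigvalsh(S @ S.T)[0]
    lam_max = np.linalg.eigvalsh(S_bar @ S_bar.T)[-1]
    return lam_min / S.shape[1] - rho * lam_max / S_bar.shape[1]

def adjust_rho(S, S_bar, rho, shrink=0.9, tol=1e-10):
    """Keep rho if F is PSD (up to tol); otherwise return the smaller value shrink * rho."""
    return rho if smallest_eig_F(S, S_bar, rho) >= -tol else shrink * rho

rng = np.random.default_rng(0)
S, S_bar = rng.standard_normal((4, 40)), rng.standard_normal((4, 40))
rho = 5.0                                      # deliberately too large
for _ in range(200):
    new_rho = adjust_rho(S, S_bar, rho)
    if new_rho == rho:                         # F is PSD, rho stays unchanged
        break
    rho = new_rho
```

Checking the smallest eigenvalue of F directly is less conservative than requiring the Weyl bound to be non-negative, which is consistent with the text using the bound only as motivation and the direct check in practice.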

In this section, we present the experimental results of applying DFDL to three diverse histopathological image databases (sample images are shown in Fig. 2) and compare them with those obtained using an SVM in conjunction with a collection of state-of-the-art histopathology features from WND-CHARM [7] (referred to below as the WND-CHARM method), SRC [11], LA-SHIRC [3], and other dictionary learning methods (LC-KSVD [20] and FDDL [21]). In each experiment, 10000 patches of size 20-by-20 are randomly extracted from the training images of each class. Each dictionary learning method learns the same number of bases, say, 500, per class.
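The patch sampling step can be sketched as follows. The example assumes single-channel images stored as 2-D arrays, which the text does not specify, and uses 100 patches instead of 10000 to keep the demo small; `sample_patches` is an illustrative name.

```python
import numpy as np

def sample_patches(images, n_patches, patch=20, seed=0):
    """Return a (patch * patch, n_patches) matrix of vectorised random patches."""
    rng = np.random.default_rng(seed)
    Y = np.empty((patch * patch, n_patches))
    for n in range(n_patches):
        img = images[rng.integers(len(images))]        # pick a training image at random
        r = rng.integers(img.shape[0] - patch + 1)     # top-left corner, kept inside the image
        c = rng.integers(img.shape[1] - patch + 1)
        Y[:, n] = img[r:r + patch, c:c + patch].reshape(-1)
    return Y

rng = np.random.default_rng(1)
images = [rng.random((60, 80)) for _ in range(3)]      # stand-ins for training images
Y = sample_patches(images, n_patches=100)
```

Each column of the result is one sample in the sense of Section 2.1, so with 20-by-20 grayscale patches the sample dimension is d = 400.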

The IBL data set contains images that belong to one of two well-defined classes: usual ductal hyperplasia (UDH) and ductal carcinoma in situ (DCIS). Ground truth class labels for the images are assigned manually by pathologists. A total of 40 patient cases (20 well-defined DCIS and 20 UDH) are identified for experiments in the manner described in [26]. Each case contains a number of regions of interest (RoIs), and we have chosen a total of 120 images (RoIs), consisting of a randomly selected set of 20 images for training and the remaining 100 RoIs for testing. Images are downsampled to reduce computation such that the size of a cell is around 20-by-20 pixels.

Fig. 3: Examples of learned bases from (a) UDH and (b) DCIS dictionaries.

In the classification step, an image is decomposed into non-overlapping patches and is classified as Healthy if the proportion of classified-as-healthy patches is higher than a threshold. To reduce the dependence on a well-chosen training set, we perform 10 different trials of each experiment, each with an arbitrary choice of training images. All results reported next are classification accuracies averaged over the 10 trials.
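The image-level decision from patch-level labels can be sketched in a few lines. The threshold value and the label strings are hypothetical, since the text gives neither.

```python
def classify_image(patch_labels, threshold=0.5, healthy="healthy", other="diseased"):
    """Label an image from its patch labels by thresholding the healthy fraction."""
    healthy_fraction = sum(1 for lab in patch_labels if lab == healthy) / len(patch_labels)
    return healthy if healthy_fraction > threshold else other

# Seven of ten patches were classified as healthy.
patch_labels = ["healthy"] * 7 + ["diseased"] * 3
decision = classify_image(patch_labels)
strict_decision = classify_image(patch_labels, threshold=0.8)
```

Raising the threshold makes the rule more conservative about declaring an image healthy, which trades false alarms against misses.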

Learned bases corresponding to the two classes are visualized in Fig. 3. The average classification accuracy for each method is shown in Table 1. It is evident from the table that DFDL outperforms the other methods, offering a classification accuracy of nearly 100 percent in recognizing DCIS and just under 97 percent for UDH. This means that with DFDL the probability of a miss is extremely low while that of a false alarm is kept low. To illustrate the efficiency of DFDL, we keep the number of training patches and learned bases fixed but reduce the number of training images by half. Notably, DFDL still shows outstanding results compared to the other methods.

ADL-Lung data set: This database contains bovine histopathology images of lung acquired by pathologists at the Animal Diagnostics Lab, Pennsylvania State University. These images are scanned using a whole slide digital scanner at 40x optical magnification and are of size 4000 × 3000 pixels. For computational speed-up, all images are downsampled to 400 × 300 pixels in an aliasing-free manner. The database consists of images from two classes: healthy and inflammatory. Each class has 150 images, of which 40 are chosen for training and the remaining ones are used for testing. The experimental results for the different methods, averaged over 10 trials, are presented in Table 1. SHIRC and LC-KSVD are moderately suitable for detecting inflammation, while WND-CHARM provides fairly high performance only in recognizing healthy organs. In contrast, DFDL offers the best accuracy in detecting both healthy and inflamed organs, with more than 92 percent for the former and over 94 percent for the latter.

Table 1: Classification accuracies: IBL and ADL Lung


Table 2: Confusion matrix: MVP and Not MVP


TCGA data set: In this section, we present experimental results on the brain cancer histopathological images obtained from the TCGA database [27] provided by the National Institutes of Health. One important indicator of a high grade glioma is the presence of MicroVascular Proliferation (MVP). Essentially, MVP is the proliferation of hypertrophic endothelial cells in the tissue. An example of a tissue containing MVP regions is illustrated in Fig. 2f. In this paper, we applied our method to find MVP regions with a slight modification to the decision procedure. This modification is crucial for obtaining desirable performance with our algorithm, since MVP detection is an inherently more difficult problem because of the complexity of the cell structure and the morphological features of the cells in MVP regions.

Unlike images in the IBL and ADL Lung data sets, which can be distinguished by examining small regions, MVP detection requires more effort because an MVP region might be surrounded by tumor cells that are actually low grade. We define a patch as MVP if it lies entirely within an MVP region and as Not MVP otherwise. A new image is divided into non-overlapping patches and is classified as MVP only if it contains a sufficiently large number of neighboring classified-as-MVP patches.
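One way to read the neighboring-patch rule is as a minimum size on a connected group of MVP patches in the patch grid. The sketch below takes that reading, with 4-connectivity and a minimum cluster size of 4 chosen purely for illustration; the text specifies neither, and the function names are hypothetical.

```python
from collections import deque

def largest_cluster(grid):
    """Size of the largest 4-connected group of True cells in a 2-D list of booleans."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    best = 0
    for r in range(rows):
        for c in range(cols):
            if not grid[r][c] or seen[r][c]:
                continue
            size, queue = 0, deque([(r, c)])   # breadth-first search from this patch
            seen[r][c] = True
            while queue:
                i, j = queue.popleft()
                size += 1
                for ni, nj in ((i + 1, j), (i - 1, j), (i, j + 1), (i, j - 1)):
                    if 0 <= ni < rows and 0 <= nj < cols and grid[ni][nj] and not seen[ni][nj]:
                        seen[ni][nj] = True
                        queue.append((ni, nj))
            best = max(best, size)
    return best

def is_mvp_image(grid, min_cluster=4):
    """Classify an image as MVP if enough neighboring patches were labeled MVP."""
    return largest_cluster(grid) >= min_cluster

# True marks a patch classified as MVP.
scattered = [[True, False, True], [False, True, False], [True, False, True]]
block = [[True, True, False], [True, True, False], [False, False, False]]
```

With these settings the scattered grid is rejected even though it has more MVP patches than the block grid, which reflects the intent of requiring the patches to be neighbors rather than merely numerous.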

We use a total of 178 images (resolution 1800 × 1800) from the TCGA, 52 for MVP and 126 for Not MVP, and 20 images are randomly selected from each class for training. We manually extract MVP and Not MVP regions from the training images and randomly extract training patches from these regions for learning. Experimental results for DFDL and WND-CHARM are presented in Table 2. The table shows the promising performance of DFDL in detecting MVP, with an accuracy more than three percent higher than that of the state-of-the-art WND-CHARM features.

[1] M.N. Gurcan, L.E. Boucheron, A. Can, A. Madabhushi, N.M. Rajpoot, and B. Yener, “Histopathological image analysis: a review,” IEEE Rev. Biomed. Eng., vol. 2, 2009.

[2] U. Srinivas, H. S. Mousavi, C. Jeon, V. Monga, A. Hattel, and B. Jayarao, “SHIRC: A simultaneous sparsity model for histopathological image representation and classification,” Proc. IEEE Int. Symp. Biomed. Imag., pp. 1118–1121, Apr. 2013.

[3] U. Srinivas, H. S. Mousavi, V. Monga, A. Hattel, and B. Jayarao, “Simultaneous sparsity model for histopathological image representation and classification,” IEEE Trans. on Medical Imaging, vol. 33, no. 5, pp. 1163–1179, May 2014.

[4] N. Nayak, H. Chang, A. Borowsky, P. Spellman, and B. Parvin, “Classification of tumor histopathology via sparse feature learning,” in Proc. IEEE Int. Symp. Biomed. Imag., 2013, pp. 1348–1351.

[5] H. S. Mousavi, V. Monga, A. UK Rao, and G. Rao, “Automated discrimination of lower and higher grade gliomas based on histopathological image analysis,” Journal of Pathology Informatics, 2015.

[6] N. Orlov, L. Shamir, T. Macura, J. Johnston, D.M. Eckley, and I.G. Goldberg, “WND-CHARM: Multipurpose image classification using compound image transforms,” Pattern Recogn. Lett., vol. 29, no. 11, pp. 1684–1693, 2008.

[7] L. Shamir, N. Orlov, D.M. Eckley, T. Macura, J. Johnston, and I.G. Goldberg, “Wndchrm–an open source utility for biological image analysis,” Source Code Biol. Med., vol. 3, no. 13, 2008.

[8] T. Gultekin, C. Koyuncu, C. Sokmensuer, and C. Gunduz-Demir, “Two-tier tissue decomposition for histopathological image representation and classification,” IEEE Trans. on Medical Imaging, vol. 34, pp. 275–283.

[9] J. Shi, Y. Li, J. Zhu, H. Sun, and Y. Cai, “Joint sparse coding based spatial pyramid matching for classification of color medical image,” Computerized Medical Imaging and Graphics, 2014.

[10] SH. Minaee, Y. Wang, and Y. W. Lui, “Prediction of longterm outcome of neuropsychological tests of mtbi patients using imaging features,” in Signal Proc. in Med. and Bio. Symp. IEEE, 2013.

[11] J. Wright, A.Y. Yang, A. Ganesh, S.S. Sastry, and Y. Ma, “Robust face recognition via sparse representation,” IEEE Trans. on Pattern Analysis and Machine Int., vol. 31, no. 2, pp. 210–227, Feb. 2009.

[12] S. Bahrampour, A. Ray, N.M. Nasrabadi, and K.W. Jenkins, “Quality-based multimodal classification using tree-structured sparsity,” in Proc. IEEE Conf. Computer Vision Pattern Recognition, 2014, pp. 4114–4121.

[13] H. S. Mousavi, U. Srinivas, V. Monga, Y. Suo, M. Dao, and T.D. Tran, “Multi-task image classification via collaborative, hierarchical spike-and-slab priors,” in Proc. IEEE Conf. on Image Processing, 2014, pp. 4236–4240.

[14] M. Elad and M. Aharon, “Image denoising via sparse and redundant representations over learned dictionaries,” Image Processing, IEEE Transactions on, vol. 15, no. 12, pp. 3736–3745, 2006.

[15] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation,” IEEE Trans. on Signal Processing, vol. 54, no. 11, pp. 4311–4322, 2006.

[16] Julien Mairal, Francis Bach, Jean Ponce, and Guillermo Sapiro, “Online dictionary learning for sparse coding,” in Proc. International Conference on Machine Learning. ACM, 2009, pp. 689–696.

[17] F. Anaraki Pourkamali and Sh. M. Hughes, “Kernel compressive sensing,” in Proc. IEEE Conf. on Image Processing, 2013, pp. 494–498.

[18] M. Sadeghi, M. Babaie-Zadeh, and C. Jutten, “Learning overcomplete dictionaries based on atom-by-atom updating,” IEEE Trans. on Signal Processing, vol. 62, no. 4, pp. 883–891, 2014.

[19] Y. Suo, M. Dao, T. Tran, H. Mousavi, U. Srinivas, and V. Monga, “Group structured dirty dictionary learning for classification,” in Proc. IEEE Conf. on Image Processing, 2014, pp. 150–154.

[20] Z. Jiang, Z. Lin, and L.S. Davis, “Label consistent K-SVD: Learning a discriminative dictionary for recognition,” IEEE Trans. on Pattern Analysis and Machine Int., vol. 35, no. 11, pp. 2651–2664, 2013.

[21] M. Yang, L. Zhang, X. Feng, and D. Zhang, “Fisher discrimination dictionary learning for sparse representation,” in Proc. IEEE Conf. on Computer Vision, Nov. 2011, pp. 543–550.

[22] M. Liu, L. Lu, X. Ye, S. Yu, and M. Salganicoff, “Sparse classification for computer aided diagnosis using learned dictionaries,” Proc. IEEE Int. Symp. Biomed. Imag., vol. 6893, pp. 41–48, 2011.

[23] S. Zhang, J. Huang, D. Metaxas, W. Wang, and X. Huang, “Discriminative sparse representations for cervigram image segmentation,” Proc. IEEE Int. Symp. Biomed. Imag., pp. 133–136, 2010.

[24] J.A. Tropp and A.C. Gilbert, “Signal recovery from random measurements via orthogonal matching pursuit,” IEEE Trans. on Info. Theory, vol. 53, no. 12, pp. 4655–4666, 2007.

[25] “SPArse Modeling Software,” http://spams-devel.gforge.inria.fr/, Accessed: 2014-11-05.

[26] M. M. Dundar, S. Badve, G. Bilgin, V. Raykar, R. Jain, O. Sertel, and M. N. Gurcan, “Computerized classification of intraductal breast lesions using histopathological images,” IEEE Trans. on Signal Processing, vol. 58, no. 7, pp. 1977–1984, 2011.

[27] National Institutes of Health, “The Cancer Genome Atlas (TCGA) database,” http://cancergenome.nih.gov, Accessed: 2014-11-09.

