Face Verification via learning the kernel matrix

2020·Arxiv

Abstract

The kernel function was introduced to solve nonlinear pattern recognition problems. The advantage of a kernel method often depends critically on a proper choice of the kernel function. A promising approach is to learn the kernel from data automatically. Methods proposed over the past few years to learn the kernel have limitations, for example being restricted to learning the parameters of a prespecified kernel function. In this paper, a nonlinear face verification method based on learning the kernel matrix is proposed. The new algorithm uses a new criterion that avoids inverting the possibly singular within-class scatter matrix, which is a computational problem. Experimental results obtained on the XM2VTS face database using the Lausanne protocol show that the verification performance of the new method is superior to that of the baseline method, Client Specific Kernel Discriminant Analysis (CSKDA). CSKDA requires many experiments to choose a proper kernel function, whereas the new method learns the kernel from data automatically, which saves considerable time and gives robust performance.

Keywords kernel function; learning the kernel matrix; CSKDA; face verification

1 Introduction

Face verification is the problem of deciding whether the face in an image belongs to the claimed identity [1].

Feature extraction is a key step in face verification [2-7]. Turk et al. proposed the classical eigenface method [8], which extracts features using principal component analysis (PCA) and achieves good results. However, this method considers only second-order statistical information and discards the higher-order information and the nonlinear relations in the data [9]. Research has shown that higher-order statistical information includes the nonlinear relations among pixels, such as those along image edges or curves [10]. Kernel methods were introduced to address this problem: Scholkopf [10] proposed kernel PCA, a nonlinear extension of PCA, and Mika [11] proposed a nonlinear extension of LDA.

In the application of face verification, Kittler introduced the client-specific approach to LDA-based face verification in order to improve performance [12-15]. Subsequently, Yuan, Wu and Kittler used kernels to extend the client-specific LDA method and proposed several client-specific methods [16-17] that achieved better results.

However, the advantage of a kernel method often depends critically on a proper choice of the kernel function. Early work on kernel learning was limited to learning the parameters of a prespecified kernel function [18]. More recent work has gone beyond kernel parameter learning by learning the kernel itself in a more nonparametric manner. In practice, since we work with data sets of finite size, we can learn the kernel matrix corresponding to a given data set instead of learning the kernel function [19]-[24].

The kernel matrix learning method proposed by Yeung et al. [24] is based on optimizing the Fisher criterion and achieves good classification results. Building on that method, this paper proposes face verification via learning the kernel matrix and applies it to CSKDA [16]. The experimental results obtained on the face database show that the verification performance of the new method is superior to that of the baseline method, Client Specific Kernel Discriminant Analysis (CSKDA). CSKDA requires many experiments to choose a proper kernel function, whereas the new method learns the kernel from data automatically, which saves considerable time and gives robust performance. The rest of the paper is organized as follows: the second part introduces CSKDA; the third part presents the proposed method; the fourth part gives the experimental results and analysis; and the final part concludes the paper.

denotes a M-dimensional real space. It is assumed that each image belongs to one of the C

from the input space to a high-dimensional feature space F, where different classes of faces are likely to be linearly separable. Let us now consider the problem of discriminating class

from all the other classes. In the context of the face verification problem this corresponds to discriminating between the i-th client and impostors modeled by all the other clients in the training data set. Given the mean vector of the i-th class as

available in the face database. Some of the samples belong to the C classes and some do not belong to the

base kernel matrices of rank-one. We define a parameterized family of kernel matrices as

3.2 Optimization[19]

is the mean vector of i-th class, m is the mean vector of all training images

However, we can see from equation (5) that the inverse of the within-class scatter matrix

Similarly, we can rewrite that

where

spectral variant solution degenerates to having only one base kernel. Apparently this is not what

nonlinear fractional programming problem [25], we define the following criterion function:

Where 0is a parameter that can be determined. The optimal value of (17) is given by

The learned kernel matrix can then be used for face verification with CSKDA.
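As an illustration of a parameterized family built from rank-one base kernel matrices, the following minimal sketch combines such matrices with non-negative weights. It assumes, in the spirit of the spectral variant mentioned above, that the base matrices are outer products of eigenvectors of an initial kernel matrix. The RBF initial kernel, the function names, and the weights `mu` are hypothetical and are not taken from the paper; here the weights are supplied by hand, whereas in the method they would come from the optimization.

```python
import numpy as np

def rbf_kernel_matrix(X, sigma):
    """Initial kernel matrix (assumption: an RBF kernel is used as the starting point)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    return np.exp(-np.maximum(d2, 0.0) / (2.0 * sigma ** 2))

def rank_one_bases(K0, n_bases):
    """Rank-one base kernel matrices v v^T from the leading eigenvectors of K0."""
    eigvals, eigvecs = np.linalg.eigh(K0)
    order = np.argsort(eigvals)[::-1][:n_bases]  # largest eigenvalues first
    return [np.outer(eigvecs[:, j], eigvecs[:, j]) for j in order]

def combine(bases, mu):
    """Kernel matrix K = sum_j mu_j * K_j with mu_j >= 0, so K stays positive semidefinite."""
    mu = np.asarray(mu, dtype=float)
    if np.any(mu < 0):
        raise ValueError("weights must be non-negative")
    return sum(m * B for m, B in zip(mu, bases))

# Toy usage with hand-picked weights (in the method these would be learned).
rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
K0 = rbf_kernel_matrix(X, sigma=1.0)
bases = rank_one_bases(K0, n_bases=3)
K = combine(bases, mu=[0.5, 1.0, 2.0])
```

Restricting the weights to be non-negative is what guarantees that every member of the family is a valid (symmetric, positive semidefinite) kernel matrix.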

In small sample cases where the dimensionality of the data exceeds the cardinality of the training set, LDA has to be preceded by a dimensionality reduction in order to avoid the problem of rank deficiency of the population scatter matrix. The between-class scatter matrix is defined as

rewritten as follows

decreasing order of eigenvalues. We only use its first _m b eigenvectors:

Then the eigenvectors of is

3.3.2 Calculating projection vectors

Let us now consider the problem of discriminating the i-th class from all the other classes. In the context of the face verification problem this corresponds to discriminating between the i-th client and impostors modeled by all the other clients in the training data set. In this two-class scenario, LDA involves finding a one-dimensional feature space. Given the mean vector of the i-th class as:

impostors of i-th class is

In the rest of the section, we propose to utilize an equivalent Fisher criterion function [11]

the population scatter matrix.
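The reason an equivalent criterion based on the population scatter matrix can replace the within-class one is easy to check numerically in the two-class case. The sketch below assumes that the population scatter matrix is the total scatter of all training samples; the function names and the toy data are hypothetical. It computes the discriminant direction once with the within-class scatter and once with the total scatter and compares them.

```python
import numpy as np

def scatter(X, mean):
    """Scatter matrix of the rows of X around the given mean vector."""
    D = X - mean
    return D.T @ D

def fisher_directions(X_client, X_impostor):
    """Two-class Fisher direction from within-class and from total scatter."""
    m_client = X_client.mean(axis=0)
    m_impostor = X_impostor.mean(axis=0)
    diff = m_client - m_impostor

    S_w = scatter(X_client, m_client) + scatter(X_impostor, m_impostor)
    X_all = np.vstack([X_client, X_impostor])
    S_t = scatter(X_all, X_all.mean(axis=0))  # population (total) scatter

    w_within = np.linalg.solve(S_w, diff)
    w_total = np.linalg.solve(S_t, diff)
    return w_within, w_total

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy data: one "client" class and one "impostor" class in 5 dimensions.
rng = np.random.default_rng(1)
X_client = rng.normal(loc=1.0, size=(20, 5))
X_impostor = rng.normal(loc=-1.0, size=(20, 5))
w_within, w_total = fisher_directions(X_client, X_impostor)
similarity = cosine(w_within, w_total)  # the two directions are parallel
```

For two classes the total scatter equals the within-class scatter plus a rank-one term along the mean difference, so both solves give the same direction up to a positive scale. The practical benefit is that the within-class scatter never needs to be inverted.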

The solution to the problem can be found easily as

Thus the overall client i specific discriminant transformation, which defines the client specific fisherface of the claimed identity, is given as

3.3.3 Classification

where

The classification based on client model: the distance between the testing sample and the i-th

claim is rejected, otherwise the claimed identity is accepted.

The classification based on imposter model: the distance between the testing sample and the

threshold the claim is accepted, otherwise the claimed identity is rejected.
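The two decision rules can be sketched as follows. This is a minimal illustration that assumes the test sample has already been projected onto the one-dimensional client-specific feature space and that "distance" means the absolute difference to the projected client or impostor mean. The function names, the argument names, and the handling of a distance exactly equal to the threshold are assumptions made for the example.

```python
import numpy as np

def verify_client_model(z, client_mean, threshold):
    """Client model: accept the claim when the sample is close to the client mean."""
    distance = np.linalg.norm(np.atleast_1d(z - client_mean))
    return bool(distance <= threshold)  # exceeds threshold -> rejected

def verify_impostor_model(z, impostor_mean, threshold):
    """Impostor model: accept the claim when the sample is far from the impostor mean."""
    distance = np.linalg.norm(np.atleast_1d(z - impostor_mean))
    return bool(distance > threshold)  # exceeds threshold -> accepted

# Toy usage with made-up projected values.
accepted_by_client_model = verify_client_model(0.9, client_mean=1.0, threshold=0.2)
accepted_by_impostor_model = verify_impostor_model(1.0, impostor_mean=-0.5, threshold=1.0)
```

The two rules use the threshold in opposite directions, which is why each model needs its own threshold.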

In order to test the performance of the proposed algorithm, face verification experiments have been conducted on the XM2VTS database, which is a multi-modal database consisting of video sequences of talking faces recorded for 295 subjects at one month intervals. The data has been recorded in 4 sessions with 2 shots taken per session [14]. From each session two facial images have been extracted to create an experimental face database of size 55 × 51. Figure 1 shows examples of images in XM2VTS.

Fig 1 Part of the extracted images in XM2VTS

The experimental protocol (known as the Lausanne evaluation protocol) divides the data set into 200 clients and 95 impostors [14]. Within the protocol, the verification performance is measured using false acceptance and false rejection rates. The operating point where these two error rates equal each other is typically referred to as the equal error rate (EER) point. All the results were obtained using histogram equalization (HEQ) in conjunction with a global threshold determined by the EER point.

This paper uses the minimum distance classifier. In the evaluation process, we adjust the threshold until FAR and FRR are equal in order to obtain the final threshold. This final threshold is then used to test the method.

The comparison between the new method and CSKDA uses both the client model and the imposter model classification, abbreviated as OnC and OnI respectively. In the tables, TER = FAR + FRR denotes the total error rate.
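The threshold selection and the error measures described above can be sketched as follows. This is a minimal illustration under the client-model rule (a claim is accepted when its distance does not exceed the threshold); the function names and the toy distances are hypothetical and are not results from the paper. The threshold is chosen among the observed distances as the one where FAR and FRR are closest.

```python
import numpy as np

def far_frr(client_dist, impostor_dist, threshold):
    """FAR: fraction of impostor claims accepted. FRR: fraction of client claims rejected."""
    far = float(np.mean(np.asarray(impostor_dist) <= threshold))
    frr = float(np.mean(np.asarray(client_dist) > threshold))
    return far, frr

def eer_threshold(client_dist, impostor_dist):
    """Pick the candidate threshold where |FAR - FRR| is smallest (evaluation set)."""
    candidates = np.sort(np.concatenate([client_dist, impostor_dist]))
    gaps = [abs(np.subtract(*far_frr(client_dist, impostor_dist, t))) for t in candidates]
    return float(candidates[int(np.argmin(gaps))])

def total_error_rate(client_dist, impostor_dist, threshold):
    """TER = FAR + FRR at a fixed threshold (test set)."""
    far, frr = far_frr(client_dist, impostor_dist, threshold)
    return far + frr

# Toy distances, for illustration only.
eval_client = np.array([0.1, 0.2, 0.3, 0.6])
eval_impostor = np.array([0.4, 0.7, 0.8, 0.9])
t = eer_threshold(eval_client, eval_impostor)
far, frr = far_frr(eval_client, eval_impostor, t)
ter = total_error_rate(eval_client, eval_impostor, t)
```

In an actual experiment the threshold would be fixed on the evaluation set and then applied unchanged to the test set, so the test-set FAR and FRR are generally not equal.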

Because of the kernel operation, we should consider choosing the kernel function and its

follows with the different parameters:

Tab.2 use the RBF kernel function

different parameters:

The results indicate that the new method achieves better performance by learning the kernel matrix across the different parameter settings. The experimental results obtained on the face database show that the verification performance of the new method is superior to that of the baseline method, Client Specific Kernel Discriminant Analysis (CSKDA). Classification based on the imposter model performs better than classification based on the client model. Moreover, FAR and FRR trade off against each other: when one increases, the other decreases. CSKDA requires many experiments to choose a proper kernel function, whereas the new method learns the kernel from data automatically, which saves considerable time and gives robust performance.

The kernel function was introduced to solve nonlinear pattern recognition problems. The advantage of a kernel method often depends critically on a proper choice of the kernel function. A promising approach is to learn the kernel from data automatically. Methods proposed over the past few years to learn the kernel have limitations, for example being restricted to learning the parameters of a prespecified kernel function. In this paper, a nonlinear face verification method based on learning the kernel matrix is proposed. CSKDA requires many experiments to choose a proper kernel function, whereas the new method learns the kernel from data automatically, which saves considerable time and gives robust performance.

[1]YOU Yuanyuan,WU Xiaojun. Face Verification Based on Combination of Individual Eigenface Subspace and SVD[J].Journal of East China Shipbuilding Institute (Natural Science Edition),2005,19(1):44-48

[2]BIAN Zhao-qi, Zhang Xue-gong. Pattern Recognition (second edition) [M]. Beijing: Tsinghua University Press,1999,176-197

[3]Xiao-Jun Wu, Josef Kittler, Jing-Yu Yang, et al. A new direct LDA (D-LDA) algorithm for feature extraction in face recognition[C]//Proceedings of the 17th International Conference on Pattern Recognition, 2004. ICPR 2004. IEEE, 2004, 4: 545-548.

[4]Yu-Jie Zheng, Jing-Yu Yang, Jian Yang, et al. Nearest neighbour line nonparametric discriminant analysis for feature extraction[J]. Electronics Letters, 2006, 42(12): 679-680.

[5]Yu-Jie Zheng, Jing-Yu Yang, Jian Yang, et al. A reformative kernel Fisher discriminant algorithm and its application to face recognition[J]. Neurocomputing, 2006, 69(13-15): 1806-1810.

[6]Zhen-Hua Feng, Josef Kittler, M. Awais, et al. Face detection, bounding box aggregation and pose estimation for robust facial landmark localisation in the wild[C]//Proceedings of the IEEE conference on computer vision and pattern recognition workshops. 2017: 160-169.

[7]Paul Koppen, Zhen-Hua Feng, Josef Kittler, et al. Gaussian mixture 3D morphable face model[J]. Pattern Recognition, 2018, 74: 617-628.

[8]M.Turk, A.Pentland. Eigenfaces for recognition [J]. Journal of Cognitive Neuroscience,1991,3(1):71-86

[9]HUANG Guohong, SHAO Huihe. Kernel Principal Component Analysis and Application in Face Recognition[J]. Computer Engineering, 2004, 30(13):13-14

[10]B.Scholkopf, A. Smola, K.R.Muller. Nonlinear component analysis as a kernel eigenvalue problem [J]. Neural Computation,1998,10: 1299-1319

[11]S.Mika, G.Ratsch, J.Weston, B.Scholkopf, K.Muller. Fisher discriminant analysis with kernels [J]. IEEE Neural Networks for Signal Processing Workshop, 1999:41-48.

[12]J.Kittler. Face authentication using client specific fisherfaces[J] (patented in the UK), CVSSP, University of Surrey, 2001

[13]Wu Xiaojun. On dimensionality reduction for client specific discriminant analysis with application to face verification[J]. LNCS,2004, 3338,308-316

[14]K Messer, J Kittler, J Luettin and G Maitre. XM2VTSDB: The Extended M2VTS Database. Proc. of AVBPA99, 1999,72-77

[15]Y.P. Li. LDA and its application to face identification[J], PhD thesis, CVSSP, University of Surrey, 2000

[16]Wu Xiaojun. Client Specific Kernel Discriminant Analysis (CSKDA) Algorithm for Face Verification[J]. Neural Networks and Brain, ICNN&B’05. International Conference. 2005,3,1511-1515.

[17]Yuan Ning, WuXiao etc, Face Verification Based on Combination of Modular 2DPCA and CSLDA[J]. Journal of Computer Research and Development.2008,45(6),1029-1035

[18]O. Chapelle, V. Vapnik, O. Bousquet, S. Mukherjee, Choosing multiple parameters for support vector machines[J], Mach. Learn. 2002,46 (1–3) ,131–159.

[19]G. Lanckriet, N. Cristianini, P. Bartlett, L. El Ghaoui, M.I. Jordan, Learning the kernel matrix with semi-definite programming[J] Proceedings of the 19th International Conference on Machine Learning, Sydney, Australia, 8(12)2002,323–330.

[20]F.R. Bach, G.R.G. Lanckriet, M.I. Jordan, Multiple kernel learning, conic duality, and the SMO algorithm[J], in: Proceedings of the 21st International Conference on Machine Learning, Banff, Alberta, Canada, 4–8 July 2004, 41–48.

[21]X. Zhu, J. Kandola, Z. Ghahramani, J. Lafferty, Nonparametric transforms of graph kernels for semi-supervised learning[J] L.K. Saul, Y. Weiss, L. Bottou (Eds.), Advances in Neural Information Processing Systems, vol. 17, MIT Press, Cambridge, MA, USA, 2005, 1641–1648.

[22]Z. Zhang, D.Y. Yeung, J.T. Kwok, Bayesian inference for transductive learning of kernel matrix using the Tanner-Wong data augmentation algorithm[J], in: Proceedings of the 21st International Conference on Machine Learning, Banff, Alberta, Canada, 4(8) 2004, 935–942.

[23]H. Xiong, M.N.S. Swamy, M.O. Ahmad, Optimizing the kernel in the empirical feature space[J]IEEE Transactions on Neural Networks 16(2) 2005 460–474

[24]Dit-Yan Yeung, Hong Chang, Guang Dai. Learning the kernel matrix by maximizing a KFD-based class separability criterion[J]. Pattern Recognition, 2007, 40: 2021-2028

[25]I.M. Stancu-Minasian, Fractional Programming: Theory, Methods and Applications[J], Kluwer Academic Publishers, Dordrecht, The Netherlands, 1997