36:[["$","audio",null,{"id":"tts"}],["$","$L3b",null,{"paperID":"1307.4145","publisher":"arxiv","paperJSON":{"title":"A Safe Screening Rule for Sparse Logistic Regression","paperID":"1307.4145","avgLineHeight":11.95,"imgScale":4,"sections":[{"heading":"Abstract","paragraphs":[[{"text":"The ","element":"span"},{"style":{"height":6.4},"width":30.22,"height":16,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/0-0.png","element":"img","alt":" ℓ1","inline":true},{"text":"-regularized logistic regression (or sparse logistic regression) is a widely used method for simultaneous classification and feature selection. Although many recent efforts have been devoted to its efficient implementation, its application to high dimensional data still poses significant challenges. In this paper, we present a fast and effective ","element":"span"},{"style":{"fontWeight":"bold"},"text":"s","element":"span"},{"text":"parse ","element":"span"},{"style":{"fontWeight":"bold"},"text":"lo","element":"span"},{"text":"gistic ","element":"span"},{"style":{"fontWeight":"bold"},"text":"re","element":"span"},{"text":"gression ","element":"span"},{"style":{"fontWeight":"bold"},"text":"s","element":"span"},{"text":"creening rule (Slores) to identify the “0” components in the solution vector, which may lead to a substantial reduction in the number of features to be entered to the optimization. An appealing feature of Slores is that the data set needs to be scanned only once to run the screening and its computational cost is negligible compared to that of solving the sparse logistic regression problem. Moreover, Slores is independent of solvers for sparse logistic regression, thus Slores can be integrated with any existing solver to improve the efficiency. We have evaluated Slores using high-dimensional data sets from different applications. 
Extensive experimental results demonstrate that Slores outperforms the existing state-of-the-art screening rules and generally improves the efficiency of solving sparse logistic regression by an order of magnitude.","element":"span"}]]},{"heading":"1 Introduction","paragraphs":[[{"text":"Logistic regression (LR) is a popular and well-established classification method that has been widely used in many domains such as machine learning ","element":"span"},{"href":"#id-0","referenceIndex":5,"text":"[5, ","element":"a"},{"href":"#id-1","referenceIndex":8,"text":"8]","element":"a"},{"text":", text mining ","element":"span"},{"href":"#id-2","referenceIndex":4,"text":"[4, ","element":"a"},{"href":"#id-3","referenceIndex":9,"text":"9]","element":"a"},{"text":", image processing ","element":"span"},{"href":"#id-4","referenceIndex":10,"text":"[10, ","element":"a"},{"href":"#id-5","referenceIndex":17,"text":"17]","element":"a"},{"text":", bioinformatics ","element":"span"},{"href":"#id-6","referenceIndex":1,"text":"[1, ","element":"a"},{"href":"#id-7","referenceIndex":15,"text":"15, ","element":"a"},{"href":"#id-8","referenceIndex":22,"text":"22, ","element":"a"},{"href":"#id-9","referenceIndex":30,"text":"30, ","element":"a"},{"href":"#id-10","referenceIndex":31,"text":"31]","element":"a"},{"text":", medical and social sciences ","element":"span"},{"href":"#id-11","referenceIndex":2,"text":"[2, ","element":"a"},{"href":"#id-12","referenceIndex":19,"text":"19] ","element":"a"},{"text":"etc. When the number of feature variables is large compared to the number of training samples, logistic regression is prone to over-fitting. To reduce over-fitting, regularization has been shown to be a promising approach. 
","element":"span"},{"text":"Typical examples include ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/0-1.png","element":"img","alt":" ℓ2","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":7.6},"width":32.61,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/0-2.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularization. Although ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/0-3.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized LR is more challenging to solve compared to ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/0-4.png","element":"img","alt":" ℓ2","inline":true,"padRight":true},{"text":"regularized LR, it has received much attention in the last few years and the interest in it is growing ","element":"span"},{"href":"#id-13","referenceIndex":23,"text":"[23, ","element":"a"},{"href":"#id-14","referenceIndex":27,"text":"27, ","element":"a"},{"href":"#id-10","referenceIndex":31,"text":"31] ","element":"a"},{"text":"due to the increasing prevalence of high-dimensional data. 
The most appealing property of ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/0-5.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized LR is the sparsity of the resulting models, which is equivalent to feature selection.","element":"span"}],[{"text":"In the past few years, many algorithms have been proposed to efficiently solve the ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/0-6.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized LR ","element":"span"},{"href":"#id-15","referenceIndex":6,"text":"[6, ","element":"a"},{"href":"#id-16","referenceIndex":14,"text":"14, ","element":"a"},{"href":"#id-17","referenceIndex":13,"text":"13, ","element":"a"},{"href":"#id-18","referenceIndex":20,"text":"20]","element":"a"},{"text":". ","element":"span"},{"text":"However, for large-scale problems, solving the ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/0-7.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized LR with higher accuracy remains challenging. ","element":"span"},{"text":"One promising solution is by “screening”, that is, we first identify the “","element":"span"},{"style":{"fontStyle":"italic"},"text":"inactive","element":"span"},{"text":"” features, which have 0 coefficients in the solution and then discard them from the optimization. ","element":"span"},{"text":"This would result in a reduced feature matrix and substantial savings in computational cost and memory size. In ","element":"span"},{"href":"#id-19","referenceIndex":7,"text":"[7]","element":"a"},{"text":", El Ghaoui ","element":"span"},{"style":{"fontStyle":"italic"},"text":"et al. 
","element":"span"},{"text":"proposed novel screening rules, called “SAFE”, to accelerate the optimization for a class of ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/0-8.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized problems, including LASSO ","element":"span"},{"href":"#id-20","referenceIndex":25,"text":"[25]","element":"a"},{"text":", ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/0-9.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized LR and ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/0-10.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized support vector machines. Inspired by SAFE, Tibshirani ","element":"span"},{"style":{"fontStyle":"italic"},"text":"et al. ","element":"span"},{"href":"#id-21","referenceIndex":24,"text":"[24] ","element":"a"},{"text":"proposed “strong rules” for a large class of ","element":"span"},{"style":{"height":7.6},"width":32.61,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/0-11.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized problems, including LASSO, elastic net, ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/0-12.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized LR and more general convex problems. In ","element":"span"},{"href":"#id-22","referenceIndex":29,"text":"[29, ","element":"a"},{"href":"#id-23","referenceIndex":28,"text":"28]","element":"a"},{"text":", Xiang et al. 
proposed “DOME” rules to further improve SAFE rules for LASSO based on the observation that SAFE rules can be understood as a special case of the general “sphere test”. Although both strong rules and the sphere tests are more effective in discarding features than SAFE for solving LASSO, it is worth noting that strong rules may mistakenly discard features that have non-zero coefficients in the solution, and that the sphere tests are not easily generalized to handle the ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-0.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized LR. To the best of our knowledge, the SAFE rule is the only screening test for the ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-1.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized LR that is “safe”, that is, it only discards features that are guaranteed to be absent from the resulting models.","element":"span"}],[{"style":{"width":"39%"},"width":743,"height":530,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-2.png","element":"img"}],[{"id":"id-26","text":"Figure 1: Comparison of Slores, strong rule ","element":"figcaption","subtype":"caption"},{"text":"and SAFE on the prostate cancer data set.","element":"figcaption","subtype":"caption"}],[{"text":"In this paper, we develop novel screening rules, called “Slores”, for the ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-3.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized LR. 
The proposed screening tests detect inactive features by estimating an upper bound of the inner product between each feature vector and the “dual optimal solution” of the ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-4.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized LR, which is unknown. The more accurate the estimation is, the more inactive features can be detected. An accurate estimation of such an upper bound turns out to be quite challenging. Indeed, most of the key ideas behind existing “safe” screening rules for LASSO rely heavily on the least-squares loss and are not applicable to the ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-5.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized LR case due to the presence of the logistic loss. To this end, we propose a novel framework to accurately estimate an upper bound. Our key technical contribution is to formulate the estimation of an upper bound of the inner product as a constrained convex optimization problem and show that it admits a closed form solution. Therefore, the estimation of the inner product can be computed efficiently. Our extensive experiments have shown that Slores discards far more features than SAFE yet requires much less computational effort. In contrast with strong rules, Slores is “safe”, i.e., it never discards features which have non-zero coefficients in the solution. 
To illustrate the effectiveness of Slores, we compare Slores, the strong rule, and SAFE on a prostate cancer data set along a sequence of 86 parameter values equally spaced on the ","element":"span"},{"style":{"height":16},"width":130,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-6.png","element":"img","alt":" λ/λmax","inline":true,"padRight":true},{"text":"scale from 0.1 to 0.95, where ","element":"span"},{"style":{"height":10.8},"width":23,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-7.png","element":"img","alt":" λ","inline":true,"padRight":true},{"text":"is the parameter for the ","element":"span"},{"style":{"height":7.6},"width":32.61,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-8.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"penalty and ","element":"span"},{"style":{"height":13.19},"width":86.83,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-9.png","element":"img","alt":" λmax","inline":true,"padRight":true},{"text":"is the smallest tuning parameter ","element":"span"},{"href":"#id-24","referenceIndex":12,"text":"[12] ","element":"a"},{"text":"such that the solution of the ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-10.png","element":"img","alt":"ℓ1","inline":true,"padRight":true},{"text":"regularized LR is 0 [please refer to Eq. ","element":"span"},{"href":"#id-25","text":"(1)","element":"a"},{"text":"]. The data matrix contains 132 patients with 15154 features. 
To measure the performance of different screening rules, we compute the rejection ratio which is the ratio between the number of features discarded by screening rules and the number of features with 0 coefficients in the solution. Therefore, the larger the rejection ratio is, the more effective the screening rule is. The results are shown in Fig. ","element":"span"},{"href":"#id-26","text":"1. ","element":"a"},{"text":"Clearly, Slores discards far more features than SAFE especially when ","element":"span"},{"style":{"height":16},"width":130,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-11.png","element":"img","alt":" λ/λmax","inline":true,"padRight":true},{"text":"is large while the strong rule is not applicable when ","element":"span"},{"style":{"height":16},"width":216.1,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-12.png","element":"img","alt":" λ/λmax ≤ 0.","inline":true},{"text":"5. We present more experimental results and discussions to demonstrate the effectiveness of Slores in Section ","element":"span"},{"text":"6.","element":"span"}]]},{"heading":"2 Basics and Motivations","paragraphs":[[{"text":"In this section, we briefly review the basics of the ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-13.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized LR and then motivate the general screening rules via the KKT conditions. 
Suppose we are given a set of training samples ","element":"span"},{"style":{"height":16.15},"width":129.04,"height":40.37,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-14.png","element":"img","alt":" {xi}mi=1","inline":true,"padRight":true},{"text":"and the associated ","element":"span"},{"text":"labels ","element":"span"},{"style":{"height":11.6},"width":132.28,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-15.png","element":"img","alt":" b ∈ ℜm","inline":true},{"text":", where ","element":"span"},{"style":{"height":12.97},"width":133.27,"height":32.44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-16.png","element":"img","alt":" xi ∈ ℜp","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":16},"width":208.89,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-17.png","element":"img","alt":" bi ∈ {1, −1}","inline":true,"padRight":true},{"text":"for all ","element":"span"},{"style":{"height":16},"width":247.16,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-18.png","element":"img","alt":" i ∈ {1, . . . , m}","inline":true},{"text":". 
The ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-19.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized logistic regression is:","element":"span"}],[{"id":"id-25","style":{"width":"72%"},"width":1352,"height":100,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-20.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":14.4},"width":129.21,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-21.png","element":"img","alt":" β ∈ ℜp","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":9.6},"width":96.04,"height":24,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-22.png","element":"img","alt":" c ∈ ℜ","inline":true,"padRight":true},{"text":"are the model parameters to be estimated, ¯","element":"span"},{"style":{"height":13.19},"width":166.23,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-23.png","element":"img","alt":"xi = bixi","inline":true},{"text":", and ","element":"span"},{"style":{"height":11.6},"width":70.34,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-24.png","element":"img","alt":" λ >","inline":true,"padRight":true},{"text":"0 is the tuning parameter. 
Let the data matrix be ","element":"span"},{"style":{"height":12.59},"width":182.32,"height":31.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-25.png","element":"img","alt":" X ∈ ℜm×p","inline":true,"padRight":true},{"text":"with the ","element":"span"},{"style":{"height":13.39},"width":44.78,"height":33.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-26.png","element":"img","alt":" ith","inline":true,"padRight":true},{"text":"row being ¯","element":"span"},{"style":{"height":9.59},"width":35.18,"height":23.97,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-27.png","element":"img","alt":"xi","inline":true,"padRight":true},{"text":"and the ","element":"span"},{"style":{"height":16.59},"width":49.74,"height":41.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-28.png","element":"img","alt":" jth","inline":true,"padRight":true},{"text":"column being ¯","element":"span"},{"style":{"height":12.99},"width":37.19,"height":32.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-29.png","element":"img","alt":"xj","inline":true},{"text":".","element":"span"}],[{"text":"Let ","element":"span"},{"style":{"height":16},"width":228.81,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-30.png","element":"img","alt":" C = {θ ∈ ℜm","inline":true,"padRight":true},{"text":": ","element":"span"},{"style":{"height":13.19},"width":71.83,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-31.png","element":"img","alt":" θi ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1)","element":"span"},{"style":{"fontStyle":"italic"},"text":", i ","element":"span"},{"text":"= 1","element":"span"},{"style":{"fontStyle":"italic"},"text":", . . . 
, m","element":"span"},{"style":{"fontStyle":"italic"},"text":"} ","element":"span"},{"text":"and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"f","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"y","element":"span"},{"text":") = ","element":"span"},{"style":{"fontStyle":"italic"},"text":"y ","element":"span"},{"text":"log(","element":"span"},{"style":{"fontStyle":"italic"},"text":"y","element":"span"},{"text":") + (1 ","element":"span"},{"style":{"height":10},"width":60.56,"height":25,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-32.png","element":"img","alt":" − y","inline":true},{"text":") log(1 ","element":"span"},{"style":{"height":10},"width":60.56,"height":25,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-33.png","element":"img","alt":" − y","inline":true},{"text":") for ","element":"span"},{"style":{"height":12},"width":60.82,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-34.png","element":"img","alt":" y ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1). 
The dual problem of ","element":"span"},{"href":"#id-25","text":"(LRP","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-35.png","element":"img","alt":"λ","inline":true},{"text":") (please refer to the supplement) is given by","element":"span"}],[{"style":{"width":"79%"},"width":1482,"height":121,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/1-36.png","element":"img"}],[{"text":"To simplify notations, we denote the feasible set of problem ","element":"span"},{"href":"#id-27","text":"(LRD","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-0.png","element":"img","alt":"λ","inline":true},{"text":") as ","element":"span"},{"style":{"height":13.59},"width":47.65,"height":33.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-1.png","element":"img","alt":" Fλ","inline":true},{"text":", and let (","element":"span"},{"style":{"height":15.71},"width":97.4,"height":39.28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-2.png","element":"img","alt":"β∗λ, c∗λ","inline":true},{"text":") and ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-3.png","element":"img","alt":" θ∗λ","inline":true,"padRight":true},{"text":"be the ","element":"span"},{"text":"optimal solutions of problems ","element":"span"},{"href":"#id-25","text":"(LRP","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-4.png","element":"img","alt":"λ","inline":true},{"text":") and ","element":"span"},{"href":"#id-27","text":"(LRD","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-5.png","element":"img","alt":"λ","inline":true},{"text":") 
respectively. In ","element":"span"},{"href":"#id-24","referenceIndex":12,"text":"[12]","element":"a"},{"text":", the authors have shown that for some special choice of the tuning parameter ","element":"span"},{"style":{"height":10.8},"width":23,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-6.png","element":"img","alt":" λ","inline":true},{"text":", both of ","element":"span"},{"href":"#id-25","text":"(LRP","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-7.png","element":"img","alt":"λ","inline":true},{"text":") and ","element":"span"},{"href":"#id-27","text":"(LRD","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-8.png","element":"img","alt":"λ","inline":true},{"text":") have closed form solutions. In fact, let ","element":"span"},{"style":{"height":16},"width":180.28,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-9.png","element":"img","alt":" P = {i : bi","inline":true,"padRight":true},{"text":"= 1","element":"span"},{"style":{"height":16.4},"width":359.27,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-10.png","element":"img","alt":"}, N = {i : bi = −1}","inline":true},{"text":", and ","element":"span"},{"style":{"height":12.98},"width":59.99,"height":32.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-11.png","element":"img","alt":" m+","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":8.98},"width":59.99,"height":22.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-12.png","element":"img","alt":" m−","inline":true,"padRight":true},{"text":"be the cardinalities of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"P ","element":"span"},{"text":"and 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"N ","element":"span"},{"text":"respectively. We define","element":"span"}],[{"style":{"width":"60%"},"width":1141,"height":51,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-13.png","element":"img"}],[{"text":"where","element":"span"}],[{"id":"id-30","style":{"width":"69%"},"width":1293,"height":121,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-14.png","element":"img"}],[{"text":"([","element":"span"},{"style":{"height":16},"width":33.14,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-15.png","element":"img","alt":"·]i","inline":true,"padRight":true},{"text":"denotes the ","element":"span"},{"style":{"height":13.38},"width":44.78,"height":33.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-16.png","element":"img","alt":" ith","inline":true,"padRight":true},{"text":"component of a vector.) Then, it is known ","element":"span"},{"href":"#id-24","referenceIndex":12,"text":"[12] ","element":"a"},{"text":"that ","element":"span"},{"style":{"height":15.72},"width":41.54,"height":39.29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-17.png","element":"img","alt":" β∗λ","inline":true,"padRight":true},{"text":"= 0 and ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-18.png","element":"img","alt":" θ∗λ","inline":true,"padRight":true},{"text":"= ","element":"span"},{"style":{"height":17.09},"width":93.5,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-19.png","element":"img","alt":" θ∗λmax","inline":true,"padRight":true},{"text":"whenever ","element":"span"},{"style":{"height":13.2},"width":163.21,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-20.png","element":"img","alt":"λ ≥ λmax","inline":true},{"text":". 
When ","element":"span"},{"style":{"height":11.6},"width":61.32,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-21.png","element":"img","alt":" λ ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"height":14},"width":104.54,"height":35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-22.png","element":"img","alt":", λmax","inline":true},{"text":"], it is known that ","element":"span"},{"href":"#id-27","text":"(LRD","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-23.png","element":"img","alt":"λ","inline":true},{"text":") has a unique optimal solution. (For completeness, we include the proof in the supplement.) We can now write the KKT conditions of problems ","element":"span"},{"href":"#id-25","text":"(LRP","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-24.png","element":"img","alt":"λ","inline":true},{"text":") and ","element":"span"},{"href":"#id-27","text":"(LRD","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-25.png","element":"img","alt":"λ","inline":true},{"text":") as","element":"span"}],[{"id":"id-28","style":{"width":"74%"},"width":1394,"height":168,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-26.png","element":"img"}],[{"text":"In view of Eq. 
","element":"span"},{"href":"#id-28","text":"(3)","element":"a"},{"text":", we can see that","element":"span"}],[{"id":"id-29","style":{"width":"62%"},"width":1174,"height":48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-27.png","element":"img"}],[{"text":"In other words, if ","element":"span"},{"style":{"height":17.5},"width":254.84,"height":43.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-28.png","element":"img","alt":" |⟨θ∗λ, ¯xj⟩| < mλ","inline":true},{"text":", then the KKT conditions imply that the coefficient of ¯","element":"span"},{"style":{"height":12.98},"width":37.19,"height":32.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-29.png","element":"img","alt":"xj","inline":true,"padRight":true},{"text":"in the solution ","element":"span"},{"style":{"height":15.71},"width":41.54,"height":39.28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-30.png","element":"img","alt":"β∗λ","inline":true,"padRight":true},{"text":"is 0 and thus the ","element":"span"},{"style":{"height":16.59},"width":49.74,"height":41.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-31.png","element":"img","alt":" jth","inline":true,"padRight":true},{"text":"feature can be safely removed from the optimization of ","element":"span"},{"href":"#id-25","text":"(LRP","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-32.png","element":"img","alt":"λ","inline":true},{"text":"). 
However, for the ","element":"span"},{"text":"general case in which ","element":"span"},{"style":{"height":13.19},"width":163.21,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-33.png","element":"img","alt":" λ < λmax","inline":true},{"text":", ","element":"span"},{"href":"#id-29","text":"(R1) ","element":"a"},{"text":"is not applicable since it assumes the knowledge of ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-34.png","element":"img","alt":" θ∗λ","inline":true},{"text":". Although it is ","element":"span"},{"text":"unknown, we can still estimate a region ","element":"span"},{"style":{"height":13.99},"width":50.82,"height":34.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-35.png","element":"img","alt":" Aλ","inline":true,"padRight":true},{"text":"which contains ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.74,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-36.png","element":"img","alt":" θ∗λ","inline":true},{"text":". As a result, if max","element":"span"},{"style":{"height":16.98},"width":314.52,"height":42.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-37.png","element":"img","alt":"θ∈A |⟨θ, ¯xj⟩| < mλ","inline":true},{"text":", we can ","element":"span"},{"text":"also conclude that [","element":"span"},{"style":{"height":16.79},"width":67.5,"height":41.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-38.png","element":"img","alt":"β∗λ]j","inline":true,"padRight":true},{"text":"= 0 by ","element":"span"},{"href":"#id-29","text":"(R1)","element":"a"},{"text":". 
In other words, ","element":"span"},{"href":"#id-29","text":"(R1) ","element":"a"},{"text":"can be relaxed as","element":"span"}],[{"style":{"width":"70%"},"width":1323,"height":66,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-39.png","element":"img"}],[{"text":"In this paper, ","element":"span"},{"href":"#id-29","text":"(R1","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-40.png","element":"img","alt":"′","inline":true},{"text":") serves as the foundation for constructing our screening rules, Slores. From ","element":"span"},{"href":"#id-29","text":"(R1","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-41.png","element":"img","alt":"′","inline":true},{"text":"), it is easy to see that screening rules with smaller ","element":"span"},{"style":{"height":17.5},"width":138.82,"height":43.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-42.png","element":"img","alt":" T(θ∗λ, ¯xj","inline":true},{"text":") are more aggressive in discarding features. To give a ","element":"span"},{"text":"tight estimation of ","element":"span"},{"style":{"height":17.5},"width":138.82,"height":43.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-43.png","element":"img","alt":" T(θ∗λ, ¯xj","inline":true},{"text":"), we need to restrict the region ","element":"span"},{"style":{"height":13.99},"width":50.82,"height":34.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-44.png","element":"img","alt":" Aλ","inline":true,"padRight":true},{"text":"which includes ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-45.png","element":"img","alt":" θ∗λ","inline":true,"padRight":true},{"text":"as small as possible. 
In ","element":"span"},{"text":"Section ","element":"span"},{"text":"3, ","element":"span"},{"text":"we show that the upper bound ","element":"span"},{"style":{"height":17.5},"width":138.82,"height":43.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-46.png","element":"img","alt":" T(θ∗λ, ¯xj","inline":true},{"text":") can be estimated by solving a convex ","element":"span"},{"text":"optimization problem. We show in Section ","element":"span"},{"text":"4 ","element":"span"},{"text":"that the convex optimization problem admits a closed-form solution and derive Slores in Section ","element":"span"},{"text":"5 ","element":"span"},{"text":"based on ","element":"span"},{"href":"#id-29","text":"(R1","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-47.png","element":"img","alt":"′","inline":true},{"text":").","element":"span"}]]},{"heading":"3 Estimating the Upper Bound via Solving a Convex Optimization Problem","paragraphs":[[{"text":"In this section, we present a novel framework to estimate an upper bound ","element":"span"},{"style":{"height":17.5},"width":138.82,"height":43.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-49.png","element":"img","alt":" T(θ∗λ, ¯xj","inline":true},{"text":") of ","element":"span"},{"style":{"height":17.5},"width":151.36,"height":43.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-50.png","element":"img","alt":" |⟨θ∗λ, ¯xj⟩|","inline":true},{"text":". 
In the ","element":"span"},{"text":"subsequent development, we assume a parameter ","element":"span"},{"style":{"height":13.19},"width":39.25,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-51.png","element":"img","alt":" λ0","inline":true,"padRight":true},{"text":"and the corresponding dual optimal ","element":"span"},{"style":{"height":17.09},"width":51.61,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-52.png","element":"img","alt":" θ∗λ0","inline":true,"padRight":true},{"text":"are given. In ","element":"span"},{"text":"our Slores rule to be presented in Section ","element":"span"},{"text":"5, ","element":"span"},{"text":"we set ","element":"span"},{"style":{"height":13.19},"width":39.25,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-53.png","element":"img","alt":" λ0","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":17.09},"width":51.61,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-54.png","element":"img","alt":" θ∗λ0","inline":true,"padRight":true},{"text":"to be ","element":"span"},{"style":{"height":13.19},"width":86.83,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-55.png","element":"img","alt":" λmax","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":17.09},"width":93.5,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-56.png","element":"img","alt":" θ∗λmax","inline":true,"padRight":true},{"text":"given in Eqs. ","element":"span"},{"href":"#id-25","text":"(1) ","element":"a"},{"text":"and ","element":"span"},{"href":"#id-30","text":"(2)","element":"a"},{"text":". 
We formulate the estimation of ","element":"span"},{"style":{"height":17.5},"width":138.82,"height":43.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-57.png","element":"img","alt":" T(θ∗λ, ¯xj","inline":true},{"text":") as a constrained convex optimization problem in this section, ","element":"span"},{"text":"which will be shown to admit a closed form solution in Section ","element":"span"},{"text":"4.","element":"span"}],[{"text":"For the dual function ","element":"span"},{"style":{"height":16},"width":54.93,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-58.png","element":"img","alt":" g(θ","inline":true},{"text":"), it follows that [","element":"span"},{"style":{"height":19.37},"width":221.52,"height":48.43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-59.png","element":"img","alt":"∇g(θ)]i = 1m","inline":true,"padRight":true},{"text":"log( ","element":"span"},{"style":{"height":21.23},"width":66.97,"height":53.07,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-60.png","element":"img","alt":"θi1−θi","inline":true,"padRight":true},{"text":")","element":"span"},{"style":{"height":22.17},"width":546.9,"height":55.44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-61.png","element":"img","alt":", [∇2g(θ)]i,i = 1m 1θi(1−θi) ≥ 4m.","inline":true,"padRight":true},{"text":"Since ","element":"span"},{"style":{"height":17.38},"width":106.02,"height":43.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-62.png","element":"img","alt":"∇2g(θ","inline":true},{"text":") is a diagonal matrix, it follows that ","element":"span"},{"style":{"height":19.37},"width":244.01,"height":48.43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-63.png","element":"img","alt":" ∇2g(θ) ⪰ 4mI","inline":true},{"text":", where ","element":"span"},{"style":{"fontStyle":"italic"},"text":"I ","element":"span"},{"text":"is the 
identity matrix. Thus, ","element":"span"},{"style":{"height":16},"width":54.94,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-64.png","element":"img","alt":" g(θ","inline":true},{"text":") is ","element":"span"},{"text":"strongly convex with modulus ","element":"span"},{"style":{"height":10},"width":24,"height":25,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-65.png","element":"img","alt":" µ","inline":true,"padRight":true},{"text":"= ","element":"span"},{"style":{"height":19.37},"width":28,"height":48.43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/2-66.png","element":"img","alt":"4m","inline":true,"padRight":true},{"href":"#id-31","referenceIndex":18,"text":"[18]","element":"a"},{"text":". Rigorously, we have the following lemma.","element":"span"}],[{"id":"id-32","style":{"fontWeight":"bold"},"text":"Lemma 1. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Let ","element":"span"},{"style":{"height":11.6},"width":65.31,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-0.png","element":"img","alt":" λ >","inline":true,"padRight":true},{"text":"0 ","element":"span"},{"style":{"fontStyle":"italic"},"text":"and ","element":"span"},{"style":{"height":14.4},"width":187.23,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-1.png","element":"img","alt":" θ1, θ2 ∈ Fλ","inline":true},{"style":{"fontStyle":"italic"},"text":", then","element":"span"}],[{"style":{"width":"88%"},"width":1661,"height":131,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-2.png","element":"img"}],[{"text":"Given ","element":"span"},{"style":{"height":11.6},"width":61.66,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-3.png","element":"img","alt":" λ 
∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"height":14},"width":56.96,"height":35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-4.png","element":"img","alt":", λ0","inline":true},{"text":"], it is easy to see that both of ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-5.png","element":"img","alt":" θ∗λ","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":17.09},"width":51.61,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-6.png","element":"img","alt":" θ∗λ0","inline":true,"padRight":true},{"text":"belong to ","element":"span"},{"style":{"height":15.19},"width":61.55,"height":37.96,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-7.png","element":"img","alt":" Fλ0","inline":true},{"text":". Therefore, Lemma ","element":"span"},{"href":"#id-32","text":"1 ","element":"a"},{"text":"can be a ","element":"span"},{"text":"useful tool to bound ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-8.png","element":"img","alt":" θ∗λ","inline":true,"padRight":true},{"text":"with the knowledge of ","element":"span"},{"style":{"height":17.09},"width":51.61,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-9.png","element":"img","alt":" θ∗λ0","inline":true},{"text":". In fact, we have the following theorem.","element":"span"}],[{"id":"id-36","style":{"fontWeight":"bold"},"text":"Theorem 2. 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"Let ","element":"span"},{"style":{"height":13.2},"width":301.6,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-10.png","element":"img","alt":" λmax ≥ λ0 > λ >","inline":true,"padRight":true},{"text":"0","element":"span"},{"style":{"fontStyle":"italic"},"text":", then the following holds:","element":"span"}],[{"style":{"width":"92%"},"width":1740,"height":160,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-11.png","element":"img"}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"a). It is easy to see that ","element":"span"},{"style":{"height":15.71},"width":328.39,"height":39.28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-12.png","element":"img","alt":" Fλ ⊆ Fλ0, θ∗λ ∈ Fλ","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":17.31},"width":165.54,"height":43.27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-13.png","element":"img","alt":" θ∗λ0 ∈ Fλ0","inline":true},{"text":". Therefore, both of ","element":"span"},{"style":{"height":17.09},"width":51.61,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-14.png","element":"img","alt":" θ∗λ0","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-15.png","element":"img","alt":" θ∗λ","inline":true,"padRight":true},{"text":"belong to ","element":"span"},{"text":"the set ","element":"span"},{"style":{"height":15.18},"width":61.54,"height":37.96,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-16.png","element":"img","alt":" Fλ0","inline":true},{"text":". 
By Lemma ","element":"span"},{"href":"#id-32","text":"1, ","element":"a"},{"text":"we have","element":"span"}],[{"style":{"width":"99%"},"width":1867,"height":240,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-17.png","element":"img"}],[{"text":"Therefore, we can see that ","element":"span"},{"style":{"height":13.59},"width":135.95,"height":33.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-18.png","element":"img","alt":" θλ ∈ Fλ","inline":true,"padRight":true},{"text":"and thus","element":"span"}],[{"id":"id-33","style":{"width":"36%"},"width":687,"height":77,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-19.png","element":"img"}],[{"text":"Then the inequality in ","element":"span"},{"href":"#id-33","text":"(6) ","element":"a"},{"text":"becomes","element":"span"}],[{"style":{"width":"77%"},"width":1460,"height":73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-20.png","element":"img"}],[{"text":"On the other hand, because ","element":"span"},{"href":"#id-27","text":"(LRD","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-21.png","element":"img","alt":"λ","inline":true},{"text":") is feasible, Slater’s condition holds and thus the KKT conditions ","element":"span"},{"href":"#id-34","referenceIndex":21,"text":"[21] ","element":"a"},{"text":"lead to:","element":"span"}],[{"id":"id-35","style":{"width":"74%"},"width":1402,"height":118,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-22.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":18.55},"width":335.45,"height":46.38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-23.png","element":"img","alt":" η+, η− ∈ ℜp+, γ ∈ ℜ","inline":true,"padRight":true},{"text":"and 
","element":"span"},{"style":{"height":16.51},"width":105.91,"height":41.28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-24.png","element":"img","alt":" NC(θ∗λ","inline":true},{"text":") is the normal cone of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"C ","element":"span"},{"text":"at ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-25.png","element":"img","alt":" θ∗λ","inline":true,"padRight":true},{"href":"#id-34","referenceIndex":21,"text":"[21]","element":"a"},{"text":". Because ","element":"span"},{"style":{"height":15.5},"width":111.21,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-26.png","element":"img","alt":" θ∗λ ∈ C","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"C ","element":"span"},{"text":"is an open ","element":"span"},{"text":"set, ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.74,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-27.png","element":"img","alt":" θ∗λ","inline":true,"padRight":true},{"text":"is an interior point of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"C ","element":"span"},{"text":"and thus ","element":"span"},{"style":{"height":16.51},"width":105.91,"height":41.28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-28.png","element":"img","alt":" NC(θ∗λ","inline":true},{"text":") = ","element":"span"},{"style":{"height":13.6},"width":20,"height":34,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-29.png","element":"img","alt":" ∅","inline":true,"padRight":true},{"href":"#id-34","referenceIndex":21,"text":"[21]","element":"a"},{"text":". Therefore, Eq. 
","element":"span"},{"href":"#id-35","text":"(8) ","element":"a"},{"text":"becomes:","element":"span"}],[{"style":{"width":"70%"},"width":1327,"height":119,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-30.png","element":"img"}],[{"text":"Let ","element":"span"},{"style":{"height":20.42},"width":54.6,"height":51.06,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-31.png","element":"img","alt":" I+λ0","inline":true,"padRight":true},{"text":"= ","element":"span"},{"style":{"height":19.09},"width":413.59,"height":47.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-32.png","element":"img","alt":" {j : ⟨θ∗λ0, ¯xj⟩ = mλ0, j","inline":true,"padRight":true},{"text":"= 1","element":"span"},{"style":{"height":18.4},"width":212.42,"height":46.01,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-33.png","element":"img","alt":", . . . , p}, I−λ0","inline":true,"padRight":true},{"text":"= ","element":"span"},{"style":{"height":19.09},"width":466.54,"height":47.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-34.png","element":"img","alt":" {j′ : ⟨θ∗λ0, ¯xj′⟩ = −mλ0, j","inline":true,"padRight":true},{"text":"= 1","element":"span"},{"style":{"fontStyle":"italic"},"text":", . . . , p","element":"span"},{"style":{"fontStyle":"italic"},"text":"} ","element":"span"},{"text":"and ","element":"span"},{"style":{"height":14.79},"width":54.6,"height":36.96,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-35.png","element":"img","alt":" Iλ0","inline":true,"padRight":true},{"text":"= ","element":"span"},{"style":{"height":20.42},"width":158.73,"height":51.05,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-36.png","element":"img","alt":"I+λ0 ∪ I−λ0","inline":true},{"text":". 
We can see that ","element":"span"},{"style":{"height":20.42},"width":158.73,"height":51.05,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-37.png","element":"img","alt":" I+λ0 ∩ I−λ0","inline":true,"padRight":true},{"text":"= ","element":"span"},{"style":{"height":13.6},"width":20,"height":34,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-38.png","element":"img","alt":" ∅","inline":true},{"text":". By the complementary slackness condition, if ","element":"span"},{"style":{"height":16},"width":129.58,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-39.png","element":"img","alt":" k /∈ Iλ0","inline":true},{"text":", we have ","element":"span"},{"style":{"height":18.83},"width":46.22,"height":47.07,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-40.png","element":"img","alt":"η+k","inline":true,"padRight":true},{"text":"= ","element":"span"},{"style":{"height":14.83},"width":46.22,"height":37.07,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-41.png","element":"img","alt":" η−k","inline":true,"padRight":true},{"text":"= 0. 
Therefore,","element":"span"}],[{"style":{"width":"68%"},"width":1282,"height":243,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/3-42.png","element":"img"}],[{"text":"Similarly, we have","element":"span"}],[{"id":"id-45","style":{"width":"99%"},"width":1869,"height":932,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-0.png","element":"img"}],[{"text":"Recall that to make our screening rules more aggressive in discarding features, we need to get a tight upper bound ","element":"span"},{"style":{"height":17.5},"width":138.82,"height":43.74,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-1.png","element":"img","alt":" T(θ∗λ, ¯xj","inline":true},{"text":") of ","element":"span"},{"style":{"height":17.5},"width":151.36,"height":43.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-2.png","element":"img","alt":" |⟨θ∗λ, ¯xj⟩|","inline":true,"padRight":true},{"text":"[please see ","element":"span"},{"href":"#id-29","text":"(R1","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-3.png","element":"img","alt":"′","inline":true},{"text":")]. Thus, it is desirable to further restrict the possible region ","element":"span"},{"style":{"height":13.99},"width":50.82,"height":34.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-4.png","element":"img","alt":"Aλ","inline":true,"padRight":true},{"text":"of ","element":"span"},{"style":{"height":15.5},"width":37.7,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-5.png","element":"img","alt":" θ∗λ","inline":true},{"text":". 
Clearly, we can see that","element":"span"}],[{"id":"id-38","style":{"width":"54%"},"width":1022,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-6.png","element":"img"}],[{"text":"since ","element":"span"},{"style":{"height":15.5},"width":37.7,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-7.png","element":"img","alt":" θ∗λ","inline":true,"padRight":true},{"text":"is feasible for problem ","element":"span"},{"href":"#id-27","text":"(LRD","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-8.png","element":"img","alt":"λ","inline":true},{"text":"). On the other hand, we call the set ","element":"span"},{"style":{"height":14.78},"width":54.6,"height":36.96,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-9.png","element":"img","alt":" Iλ0","inline":true,"padRight":true},{"text":"defined in the proof of ","element":"span"},{"text":"Theorem ","element":"span"},{"href":"#id-36","text":"2 ","element":"a"},{"text":"the “","element":"span"},{"style":{"fontStyle":"italic"},"text":"active set","element":"span"},{"text":"” of ","element":"span"},{"style":{"height":17.09},"width":51.61,"height":42.74,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-10.png","element":"img","alt":" θ∗λ0","inline":true},{"text":". In fact, we have the following lemma for the active set.","element":"span"}],[{"id":"id-37","style":{"fontWeight":"bold"},"text":"Lemma 3. 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"Given the optimal solution ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-11.png","element":"img","alt":" θ∗λ","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"of problem ","element":"span"},{"href":"#id-27","style":{"fontStyle":"italic"},"text":"(LRD","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-12.png","element":"img","alt":"λ","inline":true},{"style":{"fontStyle":"italic"},"text":"), the active set ","element":"span"},{"style":{"height":17.5},"width":473.44,"height":43.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-13.png","element":"img","alt":" Iλ = {j : |⟨θ∗λ, ¯xj⟩| = mλ, j","inline":true,"padRight":true},{"text":"= ","element":"span"},{"text":"1","element":"span"},{"style":{"fontStyle":"italic"},"text":", . . . 
, p","element":"span"},{"style":{"fontStyle":"italic"},"text":"} ","element":"span"},{"style":{"fontStyle":"italic"},"text":"is not empty if ","element":"span"},{"style":{"height":11.6},"width":61.32,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-14.png","element":"img","alt":" λ ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"height":14},"width":104.54,"height":35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-15.png","element":"img","alt":", λmax","inline":true},{"text":"]","element":"span"},{"style":{"fontStyle":"italic"},"text":".","element":"span"}],[{"text":"Since ","element":"span"},{"style":{"height":13.19},"width":79.19,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-16.png","element":"img","alt":" λ0 ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"height":14},"width":104.54,"height":35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-17.png","element":"img","alt":", λmax","inline":true},{"text":"], we can see that ","element":"span"},{"style":{"height":14.79},"width":54.6,"height":36.96,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-18.png","element":"img","alt":" Iλ0","inline":true,"padRight":true},{"text":"is not empty by Lemma ","element":"span"},{"href":"#id-37","text":"3. ","element":"a"},{"text":"We pick ","element":"span"},{"style":{"height":14.79},"width":137.6,"height":36.96,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-19.png","element":"img","alt":" j0 ∈ Iλ0","inline":true,"padRight":true},{"text":"and set","element":"span"}],[{"id":"id-39","style":{"width":"99%"},"width":1867,"height":210,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-20.png","element":"img"}],[{"text":"As a result, Theorem ","element":"span"},{"href":"#id-36","text":"2, ","element":"a"},{"text":"Eq. 
","element":"span"},{"href":"#id-38","text":"(11) ","element":"a"},{"text":"and ","element":"span"},{"href":"#id-39","text":"(13) ","element":"a"},{"text":"imply that ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-21.png","element":"img","alt":" θ∗λ","inline":true,"padRight":true},{"text":"is contained in the following set:","element":"span"}],[{"style":{"width":"99%"},"width":1867,"height":258,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-22.png","element":"img"}],[{"text":"is smaller than ","element":"span"},{"style":{"height":10.8},"width":57.99,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-23.png","element":"img","alt":" mλ","inline":true},{"text":", we can conclude that [","element":"span"},{"style":{"height":16.79},"width":67.5,"height":41.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-24.png","element":"img","alt":"β∗λ]j","inline":true,"padRight":true},{"text":"= 0 and ¯","element":"span"},{"style":{"height":12.99},"width":37.19,"height":32.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-25.png","element":"img","alt":"xj","inline":true,"padRight":true},{"text":"can be discarded from the optimization of ","element":"span"},{"text":"(LRP","element":"span"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-26.png","element":"img","alt":"λ","inline":true},{"text":"). 
Note that we replace the notations ","element":"span"},{"style":{"height":13.99},"width":50.82,"height":34.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-27.png","element":"img","alt":" Aλ","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":17.5},"width":138.82,"height":43.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-28.png","element":"img","alt":" T(θ∗λ, ¯xj","inline":true},{"text":") with ","element":"span"},{"style":{"height":19.09},"width":211.93,"height":47.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-29.png","element":"img","alt":" T(θ∗λ, ¯xj; θ∗λ0","inline":true},{"text":") and ","element":"span"},{"style":{"height":19.49},"width":64.72,"height":48.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-30.png","element":"img","alt":" Aλλ0","inline":true,"padRight":true},{"text":"to emphasize ","element":"span"},{"text":"their dependence on ","element":"span"},{"style":{"height":17.09},"width":51.61,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-31.png","element":"img","alt":" θ∗λ0","inline":true},{"text":". 
Clearly, as long as we can solve for ","element":"span"},{"style":{"height":19.09},"width":211.93,"height":47.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-32.png","element":"img","alt":" T(θ∗λ, ¯xj; θ∗λ0","inline":true},{"text":"), ","element":"span"},{"href":"#id-29","text":"(R1","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-33.png","element":"img","alt":"′","inline":true},{"text":") would be an applicable ","element":"span"},{"text":"screening rule to discard features which have 0 coefficients in ","element":"span"},{"style":{"height":15.71},"width":41.54,"height":39.28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/4-34.png","element":"img","alt":" β∗λ","inline":true},{"text":". We give a closed-form solution of problem ","element":"span"},{"href":"#id-40","text":"(42) ","element":"a"},{"text":"in the next section.","element":"span"}]]},{"heading":"4 Solving the Convex Optimization Problem (UBP)","paragraphs":[[{"text":"In this section, we show how to solve the convex optimization problem ","element":"span"},{"href":"#id-40","text":"(42) ","element":"a"},{"text":"based on the standard Lagrangian multiplier method. We first transform problem ","element":"span"},{"href":"#id-40","text":"(42) ","element":"a"},{"text":"into a pair of convex minimization problems ","element":"span"},{"href":"#id-41","text":"(UBP","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-0.png","element":"img","alt":"′","inline":true},{"text":") via Eq. 
","element":"span"},{"href":"#id-41","text":"(15) ","element":"a"},{"text":"and then show that the strong duality holds for ","element":"span"},{"href":"#id-41","text":"(UBP","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-1.png","element":"img","alt":"′","inline":true},{"text":") in Lemma ","element":"span"},{"href":"#id-42","text":"6. ","element":"a"},{"text":"The strong duality guarantees the applicability of the Lagrangian multiplier method. We then give the closed form solution of ","element":"span"},{"href":"#id-41","text":"(UBP","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-2.png","element":"img","alt":"′","inline":true},{"text":") in Theorem ","element":"span"},{"href":"#id-43","text":"8. ","element":"a"},{"text":"After we solve problem ","element":"span"},{"href":"#id-41","text":"(UBP","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-3.png","element":"img","alt":"′","inline":true},{"text":"), it is straightforward to compute the solution of problem ","element":"span"},{"href":"#id-40","text":"(42) ","element":"a"},{"text":"via Eq. ","element":"span"},{"href":"#id-41","text":"(15)","element":"a"},{"text":".","element":"span"}],[{"style":{"width":"99%"},"width":1866,"height":107,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-4.png","element":"img"}],[{"text":"of the space spanned by ","element":"span"},{"style":{"fontWeight":"bold"},"text":"b","element":"span"},{"text":". In fact, we have the following theorem.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Theorem 4. 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"Let ","element":"span"},{"style":{"height":13.2},"width":329.14,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-5.png","element":"img","alt":" λmax ≥ λ0 > λ >","inline":true,"padRight":true},{"text":"0","element":"span"},{"style":{"fontStyle":"italic"},"text":", and assume ","element":"span"},{"style":{"height":17.09},"width":51.61,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-6.png","element":"img","alt":" θ∗λ0","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"is known. For ","element":"span"},{"style":{"height":16},"width":246.89,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-7.png","element":"img","alt":" j ∈ {1, . . . , p}","inline":true},{"style":{"fontStyle":"italic"},"text":", if ","element":"span"},{"style":{"height":12.98},"width":68.52,"height":32.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-8.png","element":"img","alt":" P¯xj","inline":true,"padRight":true},{"text":"= 0","element":"span"},{"style":{"fontStyle":"italic"},"text":", then ","element":"span"},{"style":{"height":19.09},"width":211.93,"height":47.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-9.png","element":"img","alt":"T(θ∗λ, ¯xj; θ∗λ0","inline":true},{"text":") = 0","element":"span"},{"style":{"fontStyle":"italic"},"text":".","element":"span"}],[{"text":"Because of ","element":"span"},{"href":"#id-29","text":"(R1","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-10.png","element":"img","alt":"′","inline":true},{"text":"), we immediately have the following corollary.","element":"span"}],[{"id":"id-47","style":{"fontWeight":"bold"},"text":"Corollary 5. 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"Let ","element":"span"},{"style":{"height":11.6},"width":61.32,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-11.png","element":"img","alt":" λ ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"height":14},"width":104.54,"height":35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-12.png","element":"img","alt":", λmax","inline":true},{"text":") ","element":"span"},{"style":{"fontStyle":"italic"},"text":"and ","element":"span"},{"style":{"height":16},"width":235.85,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-13.png","element":"img","alt":" j ∈ {1, . . . , p}","inline":true},{"style":{"fontStyle":"italic"},"text":". If ","element":"span"},{"style":{"height":12.99},"width":68.52,"height":32.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-14.png","element":"img","alt":" P¯xj","inline":true,"padRight":true},{"text":"= 0","element":"span"},{"style":{"fontStyle":"italic"},"text":", then ","element":"span"},{"text":"[","element":"span"},{"style":{"height":16.79},"width":67.5,"height":41.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-15.png","element":"img","alt":"β∗λ]j","inline":true,"padRight":true},{"text":"= 0","element":"span"},{"style":{"fontStyle":"italic"},"text":".","element":"span"}],[{"text":"For the general case in which ","element":"span"},{"style":{"height":16.58},"width":83.37,"height":41.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-16.png","element":"img","alt":" P¯xj ̸","inline":true},{"text":"= 0, let","element":"span"}],[{"style":{"width":"81%"},"width":1521,"height":82,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-17.png","element":"img"}],[{"text":"Clearly, we 
have","element":"span"}],[{"id":"id-44","style":{"width":"99%"},"width":1868,"height":305,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-18.png","element":"img"}],[{"text":"To make use of the standard Lagrangian multiplier method, we transform problem ","element":"span"},{"href":"#id-44","text":"(UBP","element":"a"},{"style":{"height":4.8},"width":15,"height":12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-19.png","element":"img","alt":"s","inline":true},{"text":") to the following minimization problem:","element":"span"}],[{"id":"id-41","style":{"width":"65%"},"width":1229,"height":82,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-20.png","element":"img"}],[{"text":"by noting that max","element":"span"},{"style":{"height":23.12},"width":325.58,"height":57.8,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-21.png","element":"img","alt":"θ∈Aλλ0 ⟨θ, ξ¯xj⟩ = −","inline":true,"padRight":true},{"text":"min","element":"span"},{"style":{"height":23.12},"width":272.94,"height":57.8,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-22.png","element":"img","alt":"θ∈Aλλ0 ⟨θ, −ξ¯xj⟩","inline":true},{"text":".","element":"span"}],[{"id":"id-42","style":{"fontWeight":"bold"},"text":"Lemma 6. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Let ","element":"span"},{"style":{"height":13.2},"width":301.6,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-23.png","element":"img","alt":" λmax ≥ λ0 > λ >","inline":true,"padRight":true},{"text":"0 ","element":"span"},{"style":{"fontStyle":"italic"},"text":"and assume ","element":"span"},{"style":{"height":17.09},"width":51.61,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-24.png","element":"img","alt":" θ∗λ0","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"is known. 
The strong duality holds for problem ","element":"span"},{"href":"#id-41","style":{"fontStyle":"italic"},"text":"(UBP","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-25.png","element":"img","alt":"′","inline":true},{"style":{"fontStyle":"italic"},"text":"). ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Moreover, problem ","element":"span"},{"href":"#id-41","style":{"fontStyle":"italic"},"text":"(UBP","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-26.png","element":"img","alt":"′","inline":true},{"style":{"fontStyle":"italic"},"text":") admits an optimal solution in ","element":"span"},{"style":{"height":19.49},"width":64.72,"height":48.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-27.png","element":"img","alt":" Aλλ0","inline":true},{"style":{"fontStyle":"italic"},"text":".","element":"span"}],[{"text":"Because the strong duality holds for problem ","element":"span"},{"href":"#id-41","text":"(UBP","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-28.png","element":"img","alt":"′","inline":true},{"text":") by Lemma ","element":"span"},{"href":"#id-42","text":"6, ","element":"a"},{"text":"the Lagrangian multiplier method is applicable for ","element":"span"},{"href":"#id-41","text":"(UBP","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-29.png","element":"img","alt":"′","inline":true},{"text":"). In general, we need to first solve the dual problem and then recover the optimal solution of the primal problem via KKT conditions. 
Recall that ","element":"span"},{"style":{"fontStyle":"italic"},"text":"r ","element":"span"},{"text":"and ¯","element":"span"},{"style":{"height":10.98},"width":40.19,"height":27.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-30.png","element":"img","alt":"x∗","inline":true,"padRight":true},{"text":"are defined by Eq. ","element":"span"},{"href":"#id-45","text":"(10) ","element":"a"},{"text":"and ","element":"span"},{"href":"#id-38","text":"(12) ","element":"a"},{"text":"respectively. Lemma ","element":"span"},{"href":"#id-46","text":"7 ","element":"a"},{"text":"derives the dual problems of ","element":"span"},{"href":"#id-41","text":"(UBP","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-31.png","element":"img","alt":"′","inline":true},{"text":") for different cases.","element":"span"}],[{"id":"id-46","style":{"fontWeight":"bold"},"text":"Lemma 7. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Let ","element":"span"},{"style":{"height":13.2},"width":343.29,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-32.png","element":"img","alt":" λmax ≥ λ0 > λ >","inline":true,"padRight":true},{"text":"0 ","element":"span"},{"style":{"fontStyle":"italic"},"text":"and assume ","element":"span"},{"style":{"height":17.09},"width":51.61,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-33.png","element":"img","alt":" θ∗λ0","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"is known. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"For ","element":"span"},{"style":{"height":16},"width":252.54,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-34.png","element":"img","alt":" j ∈ {1, . . . 
, p}","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"and ","element":"span"},{"style":{"height":16.59},"width":91.72,"height":41.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-35.png","element":"img","alt":" P¯xj ̸","inline":true},{"text":"= 0","element":"span"},{"style":{"fontStyle":"italic"},"text":", let ","element":"span"},{"text":"¯","element":"span"},{"style":{"height":16.19},"width":164.77,"height":40.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-36.png","element":"img","alt":"x = −ξ¯xj","inline":true},{"style":{"fontStyle":"italic"},"text":". Denote","element":"span"}],[{"style":{"width":"73%"},"width":1374,"height":73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/5-37.png","element":"img"}],[{"style":{"width":"99%"},"width":1867,"height":487,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-0.png","element":"img"}],[{"text":"We can now solve problem ","element":"span"},{"href":"#id-41","text":"(UBP","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-1.png","element":"img","alt":"′","inline":true},{"text":") in the following theorem.","element":"span"}],[{"id":"id-43","style":{"fontWeight":"bold"},"text":"Theorem 8. 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"Let ","element":"span"},{"style":{"height":13.2},"width":331.8,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-2.png","element":"img","alt":" λmax ≥ λ0 > λ >","inline":true,"padRight":true},{"text":"0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"style":{"fontStyle":"italic"},"text":"d ","element":"span"},{"text":"= ","element":"span"},{"style":{"height":24.43},"width":131,"height":61.07,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-3.png","element":"img","alt":"m(λ0−λ)r∥P¯x∗∥2","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"and assume ","element":"span"},{"style":{"height":17.09},"width":51.61,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-4.png","element":"img","alt":" θ∗λ0","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"is known. For ","element":"span"},{"style":{"height":16},"width":247.94,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-5.png","element":"img","alt":" j ∈ {1, . . . 
, p}","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"and ","element":"span"},{"style":{"height":16.58},"width":83.38,"height":41.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-6.png","element":"img","alt":"P¯xj ̸","inline":true},{"text":"= 0","element":"span"},{"style":{"fontStyle":"italic"},"text":", let ","element":"span"},{"text":"¯","element":"span"},{"style":{"height":16.18},"width":164.77,"height":40.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-7.png","element":"img","alt":"x = −ξ¯xj","inline":true},{"style":{"fontStyle":"italic"},"text":".","element":"span"}],[{"style":{"width":"97%"},"width":1834,"height":778,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-8.png","element":"img"}],[{"text":"Notice that, although the dual problems of ","element":"span"},{"href":"#id-41","text":"(UBP","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-9.png","element":"img","alt":"′","inline":true},{"text":") in Lemma ","element":"span"},{"href":"#id-46","text":"7 ","element":"a"},{"text":"are different, the resulting upper bound ","element":"span"},{"style":{"height":19.09},"width":223.84,"height":47.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-10.png","element":"img","alt":"Tξ(θ∗λ, ¯xj; θ∗λ0","inline":true},{"text":") can be given by Theorem ","element":"span"},{"href":"#id-43","text":"8 ","element":"a"},{"text":"in a uniform way. The tricky part is how to deal with the extremal ","element":"span"},{"text":"cases in which ","element":"span"},{"style":{"height":24.84},"width":334.54,"height":62.1,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-11.png","element":"img","alt":"⟨P¯x,P¯x∗⟩∥P¯x∥2∥P¯x∗∥2 ∈ {−1,","inline":true,"padRight":true},{"text":"+1","element":"span"},{"style":{"fontStyle":"italic"},"text":"}","element":"span"},{"text":". 
To avoid the lengthy discussion of Theorem ","element":"span"},{"href":"#id-43","text":"8, ","element":"a"},{"text":"we omit the proof in ","element":"span"},{"text":"the main text and include the details in the supplement.","element":"span"}]]},{"heading":"5 The proposed Slores Rule for ℓ1 Regularized Logistic Regression","paragraphs":[[{"text":"Using ","element":"span"},{"href":"#id-29","text":"(R1","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-12.png","element":"img","alt":"′","inline":true},{"text":"), we are now ready to construct the screening rules for the ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-13.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"Regularized Logistic Regression. By Corollary ","element":"span"},{"href":"#id-47","text":"5, ","element":"a"},{"text":"we can see that the orthogonality between the ","element":"span"},{"style":{"height":16.58},"width":49.74,"height":41.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-14.png","element":"img","alt":" jth","inline":true,"padRight":true},{"text":"feature and the response vector ","element":"span"},{"style":{"fontWeight":"bold"},"text":"b ","element":"span"},{"text":"implies the absence of ¯","element":"span"},{"style":{"height":12.98},"width":37.19,"height":32.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-15.png","element":"img","alt":"xj","inline":true,"padRight":true},{"text":"from the resulting model. 
For the general case in which ","element":"span"},{"style":{"height":16.58},"width":87.24,"height":41.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-16.png","element":"img","alt":" P¯xj ̸","inline":true},{"text":"= 0, ","element":"span"},{"href":"#id-29","text":"(R1","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-17.png","element":"img","alt":"′","inline":true},{"text":") implies that if ","element":"span"},{"style":{"height":19.09},"width":211.93,"height":47.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-18.png","element":"img","alt":"T(θ∗λ, ¯xj; θ∗λ0","inline":true},{"text":") = max","element":"span"},{"style":{"height":19.09},"width":684.17,"height":47.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-19.png","element":"img","alt":"{T+(θ∗λ, ¯xj; θ∗λ0), T−(θ∗λ, ¯xj; θ∗λ0)} < mλ,","inline":true,"padRight":true},{"text":"then the ","element":"span"},{"style":{"height":16.58},"width":49.74,"height":41.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-20.png","element":"img","alt":" jth","inline":true,"padRight":true},{"text":"feature can be discarded from the ","element":"span"},{"text":"optimization of ","element":"span"},{"href":"#id-25","text":"(LRP","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-21.png","element":"img","alt":"λ","inline":true},{"text":"). 
Notice that, letting ","element":"span"},{"style":{"height":19.09},"width":389.52,"height":47.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-22.png","element":"img","alt":" ξ = ±1, T+(θ∗λ, ¯xj; θ∗λ0","inline":true},{"text":") and ","element":"span"},{"style":{"height":19.09},"width":233.29,"height":47.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-23.png","element":"img","alt":" T−(θ∗λ, ¯xj; θ∗λ0","inline":true},{"text":") have been solved ","element":"span"},{"text":"by Theorem ","element":"span"},{"href":"#id-43","text":"8. ","element":"a"},{"text":"Rigorously, we have the following theorem.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Theorem 9 ","element":"span"},{"text":"(Slores)","element":"span"},{"style":{"fontWeight":"bold"},"text":". ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Let ","element":"span"},{"style":{"height":13.19},"width":159.59,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-24.png","element":"img","alt":" λ0 > λ >","inline":true,"padRight":true},{"text":"0 ","element":"span"},{"style":{"fontStyle":"italic"},"text":"and assume ","element":"span"},{"style":{"height":17.09},"width":51.61,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-25.png","element":"img","alt":" θ∗λ0","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"is known.","element":"span"}],[{"style":{"fontStyle":"italic"},"text":"1. 
If ","element":"span"},{"style":{"height":13.2},"width":163.21,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-26.png","element":"img","alt":" λ ≥ λmax","inline":true},{"style":{"fontStyle":"italic"},"text":", then ","element":"span"},{"style":{"height":15.72},"width":41.54,"height":39.29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-27.png","element":"img","alt":" β∗λ","inline":true,"padRight":true},{"text":"= 0","element":"span"},{"style":{"fontStyle":"italic"},"text":";","element":"span"}],[{"style":{"fontStyle":"italic"},"text":"2. If ","element":"span"},{"style":{"height":13.2},"width":301.6,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/6-28.png","element":"img","alt":" λmax ≥ λ0 > λ >","inline":true,"padRight":true},{"text":"0 ","element":"span"},{"style":{"fontStyle":"italic"},"text":"and either of the following holds:","element":"span"}],[{"id":"id-48","style":{"width":"52%"},"width":976,"height":751,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/7-0.png","element":"img"}],[{"text":"Note that the output ","element":"span"},{"style":{"fontStyle":"italic"},"text":"R ","element":"span"},{"text":"of Slores is the set of indices of the features that need to be entered into the optimization. ","element":"span"},{"text":"As a result, if the output of Algorithm ","element":"span"},{"href":"#id-48","text":"1 ","element":"a"},{"text":"is ","element":"span"},{"style":{"height":16},"width":291.4,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/7-1.png","element":"img","alt":" R = {j1, . . . 
, jk}","inline":true},{"text":", we can substitute the full matrix ","element":"span"},{"style":{"fontWeight":"bold"},"text":"X ","element":"span"},{"text":"in problem ","element":"span"},{"href":"#id-25","text":"(LRP","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/7-2.png","element":"img","alt":"λ","inline":true},{"text":") with the submatrix ","element":"span"},{"style":{"height":13.19},"width":61.65,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/7-3.png","element":"img","alt":" XR","inline":true,"padRight":true},{"text":"= (¯","element":"span"},{"style":{"height":16.19},"width":195.85,"height":40.47,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/7-4.png","element":"img","alt":"xj1, . . . , ¯xjk","inline":true},{"text":") and just solve for [","element":"span"},{"style":{"height":16.51},"width":81.5,"height":41.28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/7-5.png","element":"img","alt":"β∗λ]R","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":15.5},"width":36.24,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/7-6.png","element":"img","alt":" c∗λ","inline":true},{"text":".","element":"span"}],[{"text":"On the other hand, Algorithm ","element":"span"},{"href":"#id-48","text":"1 ","element":"a"},{"text":"implies that Slores needs five inputs. 
Since ","element":"span"},{"style":{"fontWeight":"bold"},"text":"X ","element":"span"},{"text":"and ","element":"span"},{"style":{"fontWeight":"bold"},"text":"b ","element":"span"},{"text":"come with the data and ","element":"span"},{"style":{"height":10.8},"width":23,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/7-7.png","element":"img","alt":" λ","inline":true,"padRight":true},{"text":"is chosen by the user, we only need to specify ","element":"span"},{"style":{"height":17.09},"width":51.61,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/7-8.png","element":"img","alt":" θ∗λ0","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":13.19},"width":39.24,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/7-9.png","element":"img","alt":" λ0","inline":true},{"text":". In other words, we need to ","element":"span"},{"text":"provide Slores with a dual optimal solution of problem ","element":"span"},{"href":"#id-27","text":"(LRD","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/7-10.png","element":"img","alt":"λ","inline":true},{"text":") for an arbitrary parameter. 
A natural choice is to set ","element":"span"},{"style":{"height":13.19},"width":200.03,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/7-11.png","element":"img","alt":" λ0 = λmax","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":17.09},"width":51.61,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/7-12.png","element":"img","alt":" θ∗λ0","inline":true,"padRight":true},{"text":"= ","element":"span"},{"style":{"height":17.09},"width":93.5,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/7-13.png","element":"img","alt":" θ∗λmax","inline":true,"padRight":true},{"text":"given in Eq. ","element":"span"},{"href":"#id-25","text":"(1) ","element":"a"},{"text":"and Eq. ","element":"span"},{"href":"#id-30","text":"(2)","element":"a"},{"text":".","element":"span"}]]},{"heading":"6 Experiments","paragraphs":[[{"text":"We evaluate our screening rules using the newsgroup data set ","element":"span"},{"href":"#id-24","referenceIndex":12,"text":"[12] ","element":"a"},{"text":"and Yahoo web pages data sets ","element":"span"},{"href":"#id-49","referenceIndex":26,"text":"[26]","element":"a"},{"text":". The newsgroup data set is culled from the data by Koh et al. ","element":"span"},{"href":"#id-24","referenceIndex":12,"text":"[12]","element":"a"},{"text":". The Yahoo data sets include 11 top-level categories, each of which is further divided into a set of subcategories. In our experiment we construct five balanced binary classification data sets from the topics of Computers, Education, Health, Recreation, and Science. For each topic, we choose samples from one subcategory as the positive class and randomly sample an equal number of samples from the remaining subcategories as the negative class. 
The statistics of the data sets are given in Table ","element":"span"},{"href":"#id-50","text":"1.","element":"a"}],[{"id":"id-50","style":{"width":"47%"},"width":892,"height":360,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/7-14.png","element":"img"}],[{"id":"id-51","text":"Table 2: Running time (in seconds) of Slores, strong","element":"figcaption","subtype":"caption"}],[{"text":"We compare the performance of Slores and the strong rule, which achieves state-of-the-art performance for ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/7-15.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized LR. We do not include SAFE because it is less effective in discarding features than strong rules and requires much higher computational time ","element":"span"},{"href":"#id-21","referenceIndex":24,"text":"[24]","element":"a"},{"text":". Fig. ","element":"span"},{"href":"#id-26","text":"1 ","element":"a"},{"text":"shows the performance of Slores, the strong rule and SAFE. We compare the efficiency of the three screening rules on the same prostate cancer data set in Table ","element":"span"},{"href":"#id-51","text":"2. ","element":"a"},{"text":"All of the screening rules are tested along a sequence of 86 parameter values equally spaced on the ","element":"span"},{"style":{"height":16},"width":130,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/7-16.png","element":"img","alt":" λ/λmax","inline":true,"padRight":true},{"text":"scale from 0","element":"span"},{"style":{"fontStyle":"italic"},"text":".","element":"span"},{"text":"1 to 0","element":"span"},{"style":{"fontStyle":"italic"},"text":".","element":"span"},{"text":"95. We repeat the procedure 100 times and during each time we undersample 80% of the data. 
We report the total running time of the three screening rules over the 86 values of ","element":"span"},{"style":{"height":16},"width":130,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/7-17.png","element":"img","alt":" λ/λmax","inline":true,"padRight":true},{"text":"in Table ","element":"span"},{"href":"#id-51","text":"2. ","element":"a"},{"text":"For reference, we also report the total","element":"span"}],[{"text":"running time of the solver","element":"span"},{"style":{"height":7.6},"width":16,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/7-18.png","element":"img","alt":"1","inline":true},{"text":". We observe that the running times of Slores and the strong rule are negligible compared to that of the solver. However, SAFE takes even longer than the solver.","element":"span"}],[{"text":"In Section ","element":"span"},{"href":"#id-52","text":"6.1, ","element":"a"},{"text":"we evaluate the performance of Slores and the strong rule. Recall that we use the rejection ratio, i.e., the ratio between the number of features discarded by the screening rules and the number of features with 0 coefficients in the solution, to measure the performance of screening rules. ","element":"span"},{"text":"Note that because no features with non-zero coefficients in the solution would be mistakenly discarded by Slores, its rejection ratio is no larger than one. We then compare the efficiency of Slores and the strong rule in Section ","element":"span"},{"href":"#id-53","text":"6.2.","element":"a"}],[{"text":"The experiment settings are as follows. 
For each data set, we undersample 80% of the data and run Slores and strong rules along a sequence of 86 parameter values equally spaced on the ","element":"span"},{"style":{"height":16},"width":130,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/8-0.png","element":"img","alt":" λ/λmax","inline":true,"padRight":true},{"text":"scale from 0","element":"span"},{"style":{"fontStyle":"italic"},"text":".","element":"span"},{"text":"1 to 0","element":"span"},{"style":{"fontStyle":"italic"},"text":".","element":"span"},{"text":"95. We repeat the procedure 100 times and report the average performance and running time at each of the 86 values of ","element":"span"},{"style":{"height":16},"width":130,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/8-1.png","element":"img","alt":" λ/λmax","inline":true},{"text":". Slores, strong rules and SAFE are all implemented in Matlab. All of the experiments are carried out on an Intel(R) (i7-2600) 3.4 GHz processor.","element":"span"}],[{"id":"id-52","style":{"fontWeight":"bold"},"text":"6.1 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Comparison of Performance","element":"span"}],[{"text":"In this experiment, we evaluate the performance of Slores and the strong rule via the rejection ratio. Fig. ","element":"span"},{"href":"#id-54","text":"2 ","element":"a"},{"text":"shows the rejection ratio of Slores and the strong rule on six real data sets. When ","element":"span"},{"style":{"height":16},"width":216.46,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/8-2.png","element":"img","alt":" λ/λmax > 0.","inline":true},{"text":"5, we can see that both Slores and the strong rule are able to identify almost 100% of the inactive features, i.e., features with 0 coefficients in the solution vector. 
However, when ","element":"span"},{"style":{"height":16},"width":230.24,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/8-3.png","element":"img","alt":" λ/λmax ≤ 0.","inline":true},{"text":"5, the strong rule cannot detect the inactive features. In contrast, we observe that Slores exhibits much stronger capability in discarding inactive features for small ","element":"span"},{"style":{"height":10.8},"width":23,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/8-4.png","element":"img","alt":" λ","inline":true},{"text":", even when ","element":"span"},{"style":{"height":16},"width":130,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/8-5.png","element":"img","alt":" λ/λmax","inline":true,"padRight":true},{"text":"is close to 0","element":"span"},{"style":{"fontStyle":"italic"},"text":".","element":"span"},{"text":"1. ","element":"span"},{"text":"Taking the data point at which ","element":"span"},{"style":{"height":16},"width":130,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/8-6.png","element":"img","alt":" λ/λmax","inline":true,"padRight":true},{"text":"= 0","element":"span"},{"style":{"fontStyle":"italic"},"text":".","element":"span"},{"text":"1 for example, Slores discards about 99% of the inactive features for the newsgroup data set. For the other data sets, more than 80% of the inactive features are identified by Slores. Therefore, in terms of rejection ratio, Slores significantly outperforms the strong rule. It is also worth mentioning that the features discarded by Slores are guaranteed to have 0 coefficients in the solution. 
The strong rule, however, may mistakenly discard features that have non-zero coefficients in the solution.","element":"span"}],[{"style":{"width":"91%"},"width":1711,"height":946,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/8-7.png","element":"img"}],[{"id":"id-54","text":"Figure 2: Comparison of the performance of Slores and strong rules on six real data sets.","element":"figcaption","subtype":"caption"}],[{"style":{"width":"91%"},"width":1711,"height":945,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/9-0.png","element":"img"}],[{"id":"id-55","text":"Figure 3: Comparison of the efficiency of Slores and strong rule on six real data sets.","element":"figcaption","subtype":"caption"}],[{"id":"id-53","style":{"fontWeight":"bold"},"text":"6.2 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Comparison of Efficiency","element":"span"}],[{"text":"We compare the efficiency of Slores and the strong rule in this experiment. The data sets for evaluating the rules are the same as in Section ","element":"span"},{"href":"#id-52","text":"6.1. ","element":"a"},{"text":"The running time of the screening rules reported in Fig. ","element":"span"},{"href":"#id-55","text":"3 ","element":"a"},{"text":"includes the computational cost of the rules themselves and that of the solver after screening. We plot the running time of the screening rules against that of the solver without screening. As indicated by Fig. ","element":"span"},{"href":"#id-54","text":"2, ","element":"a"},{"text":"when ","element":"span"},{"style":{"height":16},"width":216.1,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/9-1.png","element":"img","alt":" λ/λmax > 0.","inline":true},{"text":"5, Slores and the strong rule discard almost 100% of the inactive features. 
As a result, the size of the feature matrix involved in the optimization of problem ","element":"span"},{"href":"#id-25","text":"(LRP","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/9-2.png","element":"img","alt":"λ","inline":true},{"text":") is greatly reduced. From Fig. ","element":"span"},{"href":"#id-55","text":"3, ","element":"a"},{"text":"we can observe that the efficiency is improved by about one order of magnitude on average compared to that of the solver without screening. However, when ","element":"span"},{"style":{"height":16},"width":218.2,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/9-3.png","element":"img","alt":" λ/λmax < 0.","inline":true},{"text":"5, the strong rule cannot identify any inactive features and thus the running time is almost the same as that of the solver without screening. In contrast, Slores is still able to identify more than 80% of the inactive features for the data sets culled from the Yahoo web pages data sets and thus the efficiency is improved by a factor of roughly 5. For the newsgroup data set, about 99% of the inactive features are identified by Slores, which leads to about a tenfold saving in running time. These results demonstrate the power of the proposed Slores rule in improving the efficiency of solving the ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/9-4.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized LR.","element":"span"}]]},{"heading":"7 Conclusions","paragraphs":[[{"text":"In this paper, we propose novel screening rules to effectively discard features for ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/9-5.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized LR. 
Extensive numerical experiments on real data demonstrate that Slores outperforms the existing state-of-the-art screening rules. We plan to extend the framework of Slores to more general sparse formulations, including convex ones, like group Lasso, fused Lasso, ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/9-6.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized SVM, and non-convex ones, like ","element":"span"},{"style":{"height":7.2},"width":33.6,"height":18,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/9-7.png","element":"img","alt":" ℓp","inline":true,"padRight":true},{"text":"regularized problems where 0 ","element":"span"},{"style":{"fontStyle":"italic"},"text":"< p < ","element":"span"},{"text":"1.","element":"span"}]]},{"heading":"References","paragraphs":[[{"id":"id-6","text":"[1] M. Asgary, S. Jahandideh, P. Abdolmaleki, and A. Kazemnejad. Analysis and identification of ","element":"span"},{"style":{"height":14.4},"width":23,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/10-0.png","element":"img","alt":" β","inline":true},{"text":"-turn types using multinomial logistic regression and artificial neural network. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Bioinformatics","element":"span"},{"text":", 23(23):3125– 3130, 2007. ","element":"span"},{"text":"URL ","element":"span"},{"href":"http://dblp.uni-trier.de/db/journals/bioinformatics/bioinformatics23.html#AsgaryJAK07","style":{"fontFamily":"monospace"},"text":"http://dblp.uni-trier.de/db/journals/bioinformatics/bioinformatics23. ","element":"a"},{"href":"http://dblp.uni-trier.de/db/journals/bioinformatics/bioinformatics23.html#AsgaryJAK07","style":{"fontFamily":"monospace"},"text":"html#AsgaryJAK07","element":"a"},{"text":".","element":"span"}],[{"id":"id-11","text":"[2] C. Boyd, M. Tolson, and W. Copes. 
Evaluating trauma care: The TRISS method, trauma score and ","element":"span"},{"text":"the injury severity score. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Journal of Trauma","element":"span"},{"text":", 27:370–378, 1987.","element":"span"}],[{"id":"id-56","text":"[3] S. Boyd and L. Vandenberghe. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Convex Optimization","element":"span"},{"text":". Cambridge University Press, 2004.","element":"span"}],[{"id":"id-2","text":"[4] Jack R. Brzezinski and George J. Knafl. Logistic regression modeling for context-based classification. In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"DEXA Workshop","element":"span"},{"text":", pages 755–759, 1999. URL ","element":"span"},{"href":"http://dblp.uni-trier.de/db/conf/dexaw/dexaw99.html#BrzezinskiK99","style":{"fontFamily":"monospace"},"text":"http://dblp.uni-trier.de/db/conf/dexaw/dexaw99. ","element":"a"},{"href":"http://dblp.uni-trier.de/db/conf/dexaw/dexaw99.html#BrzezinskiK99","style":{"fontFamily":"monospace"},"text":"html#BrzezinskiK99","element":"a"},{"text":".","element":"span"}],[{"id":"id-0","text":"[5] K. Chaudhuri and C. Monteleoni. Privacy-preserving logistic regression. In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"NIPS","element":"span"},{"text":", 2008.","element":"span"}],[{"id":"id-15","text":"[6] B. Efron, T. Hastie, I. Johnstone, and R. Tibshirani. Least angle regression. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Ann. Statist.","element":"span"},{"text":", 32:407–499, 2004.","element":"span"}],[{"id":"id-19","text":"[7] L. El Ghaoui, V. Viallon, and T. Rabbani. Safe feature elimination for the lasso and sparse supervised ","element":"span"},{"text":"learning problems. arXiv:1009.4219v2.","element":"span"}],[{"id":"id-1","text":"[8] J. Friedman, T. Hastie, and R. Tibshirani. Additive logistic regression: a statistical view of boosting. 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"The Annals of Statistics","element":"span"},{"text":", 38(2), 2000.","element":"span"}],[{"id":"id-3","text":"[9] A. Genkin, D. Lewis, and D. Madigan. ","element":"span"},{"text":"Large-scale bayesian logistic regression for text categorization. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Technometrics","element":"span"},{"text":", 49:291–304(14), 2007. ","element":"span"},{"text":"doi: doi:10.1198/004017007000000245. ","element":"span"},{"text":"URL ","element":"span"},{"href":"http://www.ingentaconnect.com/content/asa/tech/2007/00000049/00000003/art00007","style":{"fontFamily":"monospace"},"text":"http: ","element":"a"},{"href":"http://www.ingentaconnect.com/content/asa/tech/2007/00000049/00000003/art00007","style":{"fontFamily":"monospace"},"text":"//www.ingentaconnect.com/content/asa/tech/2007/00000049/00000003/art00007","element":"a"},{"text":".","element":"span"}],[{"id":"id-4","text":"[10] S. Gould, J. Rodgers, D. Cohen, G. Elidan, and D. Koller. ","element":"span"},{"text":"Multi-class segmentation with relative location prior. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"International Journal of Computer Vision","element":"span"},{"text":", 80(3):300–316, 2008. URL ","element":"span"},{"href":"http://dblp.uni-trier.de/db/journals/ijcv/ijcv80.html#GouldRCEK08","style":{"fontFamily":"monospace"},"text":"http://dblp. ","element":"a"},{"href":"http://dblp.uni-trier.de/db/journals/ijcv/ijcv80.html#GouldRCEK08","style":{"fontFamily":"monospace"},"text":"uni-trier.de/db/journals/ijcv/ijcv80.html#GouldRCEK08","element":"a"},{"text":".","element":"span"}],[{"id":"id-65","text":"[11] O. G¨uler. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Foundations of Optimization","element":"span"},{"text":". Springer, 2010.","element":"span"}],[{"id":"id-24","text":"[12] K. Koh, S. J. Kim, and S. Boyd. 
An interior-point method for large-scale l1-regularized logistic regression. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"J. Mach. Learn. Res.","element":"span"},{"text":", 8:1519–1555, 2007.","element":"span"}],[{"id":"id-17","text":"[13] B. Krishnapuram, L. Carin, M. Figueiredo, and A. Hartemink. Sparse multinomial logistic regression: ","element":"span"},{"text":"Fast algorithms and generalization bounds. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"IEEE Trans. Pattern Anal. Mach. Intell.","element":"span"},{"text":", 27:957–968, 2005.","element":"span"}],[{"id":"id-16","text":"[14] S. Lee, H. Lee, P. Abbeel, and A. Ng. Efficient l1 regularized logistic regression. In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"AAAI-06","element":"span"},{"text":", 2006.","element":"span"}],[{"id":"id-7","text":"[15] J. Liao and K. Chin. Logistic regression for disease classification using microarray data: model selection ","element":"span"},{"text":"in a large p and small n case. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Bioinformatics","element":"span"},{"text":", 23(15):1945–1951, 2007. URL ","element":"span"},{"href":"http://dblp.uni-trier.de/db/journals/bioinformatics/bioinformatics23.html#LiaoC07","style":{"fontFamily":"monospace"},"text":"http://dblp.uni-trier. ","element":"a"},{"href":"http://dblp.uni-trier.de/db/journals/bioinformatics/bioinformatics23.html#LiaoC07","style":{"fontFamily":"monospace"},"text":"de/db/journals/bioinformatics/bioinformatics23.html#LiaoC07","element":"a"},{"text":".","element":"span"}],[{"text":"[16] J. Liu, S. Ji, and J. Ye. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"SLEP: Sparse Learning with Efficient Projections","element":"span"},{"text":". Arizona State University, 2009. 
URL ","element":"span"},{"href":"http://www.public.asu.edu/~jye02/Software/SLEP","style":{"fontFamily":"monospace"},"text":"http://www.public.asu.edu/~jye02/Software/SLEP","element":"a"},{"text":".","element":"span"}],[{"id":"id-5","text":"[17] S. Martins, L. Sousa, and J. Martins. Additive logistic regression applied to retina modelling. In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"ICIP (3)","element":"span"},{"text":", pages 309–312. IEEE, 2007. URL ","element":"span"},{"href":"http://dblp.uni-trier.de/db/conf/icip/icip2007-3.html#MartinsSM07","style":{"fontFamily":"monospace"},"text":"http://dblp.uni-trier.de/db/conf/icip/icip2007-3.html# ","element":"a"},{"href":"http://dblp.uni-trier.de/db/conf/icip/icip2007-3.html#MartinsSM07","style":{"fontFamily":"monospace"},"text":"MartinsSM07","element":"a"},{"text":".","element":"span"}],[{"id":"id-31","text":"[18] Y. Nesterov. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Introductory Lectures on Convex Optimization: A Basic Course","element":"span"},{"text":". Springer, 2004.","element":"span"}],[{"id":"id-12","text":"[19] S. Palei and S. Das. Logistic regression model for prediction of roof fall risks in bord and pillar workings ","element":"span"},{"text":"in coal mines: An approach. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Safety Science","element":"span"},{"text":", 47:88–96, 2009.","element":"span"}],[{"id":"id-18","text":"[20] M. Park and T. Hastie. ","element":"span"},{"style":{"height":7.6},"width":32.6,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/11-0.png","element":"img","alt":" ℓ1","inline":true,"padRight":true},{"text":"regularized path algorithm for generalized linear models. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"J. R. Statist. Soc. B","element":"span"},{"text":", 69:659–677, 2007.","element":"span"}],[{"id":"id-34","text":"[21] A. Ruszczyński. 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"Nonlinear Optimization","element":"span"},{"text":". Princeton university Press, 2006.","element":"span"}],[{"id":"id-8","text":"[22] M. Sartor, G. Leikauf, and M. Medvedovic. ","element":"span"},{"text":"LRpath: a logistic regression approach for identifying enriched biological groups in gene expression data. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Bioinformatics","element":"span"},{"text":", 25(2):211–217, 2009. URL ","element":"span"},{"href":"http://dblp.uni-trier.de/db/journals/bioinformatics/bioinformatics25.html#SartorLM09","style":{"fontFamily":"monospace"},"text":"http: ","element":"a"},{"href":"http://dblp.uni-trier.de/db/journals/bioinformatics/bioinformatics25.html#SartorLM09","style":{"fontFamily":"monospace"},"text":"//dblp.uni-trier.de/db/journals/bioinformatics/bioinformatics25.html#SartorLM09","element":"a"},{"text":".","element":"span"}],[{"id":"id-13","text":"[23] D. Sun, T. Erp, P. Thompson, C. Bearden, M. Daley, L. Kushan, M. Hardt, K. Nuechterlein, A. Toga, ","element":"span"},{"text":"and T. Cannon. Elucidating a magnetic resonance imaging-based neuroanatomic biomarker for psychosis: classification analysis using probabilistic brain atlas and machine learning algorithms. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Biological Psychiatry","element":"span"},{"text":", 66:1055–1–60, 2009.","element":"span"}],[{"id":"id-21","text":"[24] R. Tibshirani, J. Bien, J. Friedman, T. Hastie, N. Simon, J. Taylor, and R. Tibshirani. Strong rules for ","element":"span"},{"text":"discarding predictors in lasso-type problems. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"J. R. Statist. Soc. B","element":"span"},{"text":", 74:245–266, 2012.","element":"span"}],[{"id":"id-20","text":"[25] Robert Tibshirani. Regression shringkage and selection via the lasso. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"J. R. Statist. Soc. 
B","element":"span"},{"text":", 58:267–288, 1996.","element":"span"}],[{"id":"id-49","text":"[26] Naonori Ueda and Kazumi Saito. Parametric mixture models for multi-labeled text. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Advances in neural information processing systems","element":"span"},{"text":", 15:721–728, 2002.","element":"span"}],[{"id":"id-14","text":"[27] T. T. Wu, Y. F. Chen, T. Hastie, E. Sobel, and K. Lange. Genome-wide association analysis by lasso ","element":"span"},{"text":"penalized logistic regression. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Bioinformatics","element":"span"},{"text":", 25:714–721, 2009.","element":"span"}],[{"id":"id-23","text":"[28] Z. J. Xiang and P. J. Ramadge. Fast lasso screening tests based on correlations. In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"IEEE ICASSP","element":"span"},{"text":", 2012.","element":"span"}],[{"id":"id-22","text":"[29] Z. J. Xiang, H. Xu, and P. J. Ramadge. Learning sparse representation of high dimensional data on ","element":"span"},{"text":"large scale dictionaries. In ","element":"span"},{"style":{"fontStyle":"italic"},"text":"NIPS","element":"span"},{"text":", 2011.","element":"span"}],[{"id":"id-9","text":"[30] J. Zhu and T. Hastie. Kernel logistic regression and the import vector machine. In Thomas G. Dietterich, ","element":"span"},{"text":"Suzanna Becker, and Zoubin Ghahramani, editors, ","element":"span"},{"style":{"fontStyle":"italic"},"text":"NIPS","element":"span"},{"text":", pages 1081–1088. MIT Press, 2001. URL ","element":"span"},{"href":"http://dblp.uni-trier.de/db/conf/nips/nips2001.html#ZhuH01","style":{"fontFamily":"monospace"},"text":"http://dblp.uni-trier.de/db/conf/nips/nips2001.html#ZhuH01","element":"a"},{"text":".","element":"span"}],[{"id":"id-10","text":"[31] J. Zhu and T. Hastie. Classification of gene microarrays by penalized logistic regression. 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"Biostatistics","element":"span"},{"text":", 5:427–443, 2004.","element":"span"}]]},{"heading":"Appendix","paragraphs":[[{"text":"In this appendix, we will provide detailed proofs of the theorems, lemmas and corollaries in the main text.","element":"span"}]]},{"heading":"A Deviation of the Dual Problem of the Sparse Logistic Regression","paragraphs":[[{"text":"Suppose we are given a set of training samples ","element":"span"},{"style":{"height":16.15},"width":129.04,"height":40.38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-0.png","element":"img","alt":" {xi}mi=1","inline":true,"padRight":true},{"text":"and the associate labels ","element":"span"},{"style":{"height":11.6},"width":130.95,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-1.png","element":"img","alt":" b ∈ ℜm","inline":true},{"text":", where ","element":"span"},{"style":{"height":12.98},"width":131.94,"height":32.44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-2.png","element":"img","alt":" xi ∈ ℜp","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":16},"width":207.56,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-3.png","element":"img","alt":"bi ∈ {1, −1}","inline":true,"padRight":true},{"text":"for all ","element":"span"},{"style":{"height":16},"width":245.83,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-4.png","element":"img","alt":" i ∈ {1, . . . , m}","inline":true},{"text":". 
The logistic regression problem takes the following form:","element":"span"}],[{"style":{"width":"45%"},"width":845,"height":110,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-5.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":14.4},"width":119.58,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-6.png","element":"img","alt":" β ∈ ℜp","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":9.6},"width":86.4,"height":24,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-7.png","element":"img","alt":" c ∈ ℜ","inline":true,"padRight":true},{"text":"are the model parameters to be estimated, ¯","element":"span"},{"style":{"height":13.19},"width":156.59,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-8.png","element":"img","alt":"xi = bixi","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":11.6},"width":65.51,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-9.png","element":"img","alt":" λ >","inline":true,"padRight":true},{"text":"0. Let ","element":"span"},{"text":"¯","element":"span"},{"style":{"fontWeight":"bold"},"text":"X ","element":"span"},{"text":"denote the data matrix whose rows consist of ¯","element":"span"},{"style":{"height":9.59},"width":35.18,"height":23.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-10.png","element":"img","alt":"xi","inline":true},{"text":". Denote the columns of ","element":"span"},{"text":"¯","element":"span"},{"style":{"fontWeight":"bold"},"text":"X ","element":"span"},{"text":"as ¯","element":"span"},{"style":{"height":16.98},"width":301.18,"height":42.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-11.png","element":"img","alt":"xj, j ∈ {1, . . . 
, p}","inline":true},{"text":".","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"A.1 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Dual Formulation","element":"span"}],[{"text":"By introducing the slack variables ","element":"span"},{"style":{"height":19.37},"width":388.86,"height":48.43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-12.png","element":"img","alt":" qi = 1m(−⟨β, ¯xi⟩ − bic","inline":true},{"text":") for all ","element":"span"},{"style":{"height":16},"width":255.73,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-13.png","element":"img","alt":" i ∈ {1, . . . , m}","inline":true},{"text":", problem ","element":"span"},{"href":"#id-25","text":"(LRP","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-14.png","element":"img","alt":"λ","inline":true},{"text":") can be ","element":"span"},{"text":"formulated as:","element":"span"}],[{"style":{"width":"70%"},"width":1311,"height":210,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-15.png","element":"img"}],[{"text":"The Lagrangian is","element":"span"}],[{"style":{"width":"88%"},"width":1665,"height":243,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-16.png","element":"img"}],[{"text":"In order to find the dual function, we need to solve the following subproblems:","element":"span"}],[{"style":{"width":"71%"},"width":1331,"height":335,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-17.png","element":"img"}],[{"text":"Consider ","element":"span"},{"style":{"height":16},"width":77.88,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-18.png","element":"img","alt":" f1(q","inline":true},{"text":"). 
It is easy to see","element":"span"}],[{"style":{"width":"28%"},"width":535,"height":95,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-19.png","element":"img"}],[{"text":"By setting [","element":"span"},{"style":{"height":16},"width":147.85,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-20.png","element":"img","alt":"∇f1(q)]i","inline":true,"padRight":true},{"text":"= 0, we get","element":"span"}],[{"style":{"width":"36%"},"width":688,"height":90,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/12-21.png","element":"img"}],[{"text":"Clearly, we can see that ","element":"span"},{"style":{"height":13.19},"width":70.04,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-0.png","element":"img","alt":" θi ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1) for all ","element":"span"},{"style":{"height":16},"width":245.83,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-1.png","element":"img","alt":" i ∈ {1, . . . , m}","inline":true},{"text":". 
Therefore,","element":"span"}],[{"style":{"width":"58%"},"width":1086,"height":110,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-2.png","element":"img"}],[{"text":"Consider ","element":"span"},{"style":{"height":16},"width":75.88,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-3.png","element":"img","alt":" f2(β","inline":true},{"text":") and let ","element":"span"},{"style":{"height":14.4},"width":38.64,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-4.png","element":"img","alt":" β′","inline":true,"padRight":true},{"text":"= argmin","element":"span"},{"style":{"height":18.3},"width":104.02,"height":45.74,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-5.png","element":"img","alt":"β f2(β","inline":true},{"text":"). The optimality condition is","element":"span"}],[{"style":{"width":"41%"},"width":776,"height":82,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-6.png","element":"img"}],[{"text":"It is easy to see that","element":"span"}],[{"style":{"width":"22%"},"width":421,"height":110,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-7.png","element":"img"}],[{"text":"and thus","element":"span"}],[{"style":{"width":"31%"},"width":598,"height":170,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-8.png","element":"img"}],[{"text":"Moreover, it follows that","element":"span"}],[{"style":{"width":"21%"},"width":404,"height":65,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-9.png","element":"img"}],[{"text":"For ","element":"span"},{"style":{"height":16},"width":70.88,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-10.png","element":"img","alt":" f3(c","inline":true},{"text":"), we can see 
that","element":"span"}],[{"style":{"width":"16%"},"width":315,"height":82,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-11.png","element":"img"}],[{"text":"Therefore, we have","element":"span"}],[{"style":{"width":"9%"},"width":172,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-12.png","element":"img"}],[{"text":"since otherwise inf","element":"span"},{"style":{"height":16},"width":93.76,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-13.png","element":"img","alt":"c f3(c","inline":true},{"text":") = ","element":"span"},{"style":{"height":7.2},"width":71,"height":18,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-14.png","element":"img","alt":" −∞","inline":true,"padRight":true},{"text":"and the dual problem is infeasible. Clearly, min","element":"span"},{"style":{"height":16},"width":93.76,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-15.png","element":"img","alt":"c f3(c","inline":true},{"text":") = 0. 
All together, the dual problem is","element":"span"}],[{"id":"id-27","style":{"width":"73%"},"width":1375,"height":293,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-16.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":16},"width":221.67,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-17.png","element":"img","alt":" C = {θ ∈ ℜm","inline":true,"padRight":true},{"text":": ","element":"span"},{"style":{"height":13.19},"width":70.05,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-18.png","element":"img","alt":" θi ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1)","element":"span"},{"style":{"fontStyle":"italic"},"text":", i ","element":"span"},{"text":"= 1","element":"span"},{"style":{"fontStyle":"italic"},"text":", . . . , m","element":"span"},{"style":{"fontStyle":"italic"},"text":"}","element":"span"},{"text":".","element":"span"}]]},{"heading":"B Proof of the Existence of the Optimal Solution of (LRDλ)","paragraphs":[[{"text":"In this section, we prove that problem ","element":"span"},{"href":"#id-27","text":"(LRD","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-19.png","element":"img","alt":"λ","inline":true},{"text":") has a unique optimal solution for all ","element":"span"},{"style":{"height":11.6},"width":65.31,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-20.png","element":"img","alt":" λ >","inline":true,"padRight":true},{"text":"0. 
To this end, in Lemma A, we first show that problem ","element":"span"},{"href":"#id-27","text":"(LRD","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-21.png","element":"img","alt":"λ","inline":true},{"text":") is feasible for all ","element":"span"},{"style":{"height":11.6},"width":68.98,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-22.png","element":"img","alt":" λ >","inline":true,"padRight":true},{"text":"0. Then Lemma B confirms the existence of the dual optimal solution ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-23.png","element":"img","alt":" θ∗λ","inline":true},{"text":".","element":"span"}],[{"style":{"width":"76%"},"width":1435,"height":90,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/13-24.png","element":"img"}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"1. When ","element":"span"},{"style":{"height":13.2},"width":163.21,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-0.png","element":"img","alt":" λ ≥ λmax","inline":true},{"text":", the feasibility of (LRD","element":"span"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-1.png","element":"img","alt":"λ","inline":true},{"text":") is trivial because of the existence of ","element":"span"},{"style":{"height":17.09},"width":93.5,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-2.png","element":"img","alt":" θ∗λmax","inline":true},{"text":". 
We focus ","element":"span"},{"text":"on the case in which ","element":"span"},{"style":{"height":11.6},"width":61.31,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-3.png","element":"img","alt":" λ ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"height":14},"width":104.54,"height":35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-4.png","element":"img","alt":", λmax","inline":true},{"text":"] below.","element":"span"}],[{"text":"Recall that ","element":"span"},{"href":"#id-24","referenceIndex":12,"text":"[12] ","element":"a"},{"style":{"height":13.19},"width":86.82,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-5.png","element":"img","alt":" λmax","inline":true,"padRight":true},{"text":"is the smallest tuning parameter such that ","element":"span"},{"style":{"height":15.72},"width":41.54,"height":39.29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-6.png","element":"img","alt":" β∗λ","inline":true,"padRight":true},{"text":"= 0 and ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-7.png","element":"img","alt":" θ∗λ","inline":true,"padRight":true},{"text":"= ","element":"span"},{"style":{"height":17.09},"width":93.5,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-8.png","element":"img","alt":" θ∗λmax","inline":true,"padRight":true},{"text":"whenever ","element":"span"},{"style":{"height":13.2},"width":163.2,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-9.png","element":"img","alt":"λ ≥ λmax","inline":true},{"text":". 
For convenience, we rewrite the definition of ","element":"span"},{"style":{"height":13.19},"width":86.82,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-10.png","element":"img","alt":" λmax","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":17.09},"width":93.5,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-11.png","element":"img","alt":" θ∗λmax","inline":true,"padRight":true},{"text":"as follows.","element":"span"}],[{"style":{"width":"38%"},"width":727,"height":256,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-12.png","element":"img"}],[{"text":"Clearly, we have ","element":"span"},{"style":{"height":17.31},"width":249.4,"height":43.27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-13.png","element":"img","alt":" θ∗λmax ∈ Fλmax","inline":true},{"text":", i.e., ","element":"span"},{"style":{"height":19.78},"width":599.22,"height":49.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-14.png","element":"img","alt":" θ∗λmax ∈ C, ∥ ¯XT θ∗λmax∥∞ ≤ mλmax","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":18.11},"width":171.94,"height":45.27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-15.png","element":"img","alt":" ⟨θ∗λmax, b⟩","inline":true,"padRight":true},{"text":"= 0. ","element":"span"},{"text":"Let us define:","element":"span"}],[{"style":{"width":"70%"},"width":1321,"height":533,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-16.png","element":"img"}],[{"text":"2. The constraints of ","element":"span"},{"href":"#id-27","text":"(LRD","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-17.png","element":"img","alt":"λ","inline":true},{"text":") are all affine. 
Therefore, Slater’s condition reduces to the feasibility of ","element":"span"},{"href":"#id-27","text":"(LRD","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-18.png","element":"img","alt":"λ","inline":true},{"text":") ","element":"span"},{"href":"#id-56","referenceIndex":3,"text":"[3]","element":"a"},{"text":". When ","element":"span"},{"style":{"height":17.09},"width":289.46,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-19.png","element":"img","alt":" λ ≥ λmax, θ∗λmax","inline":true,"padRight":true},{"text":"is clearly a feasible solution of ","element":"span"},{"href":"#id-27","text":"(LRD","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-20.png","element":"img","alt":"λ","inline":true},{"text":"). When ","element":"span"},{"style":{"height":11.6},"width":63.63,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-21.png","element":"img","alt":" λ ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"height":14},"width":104.54,"height":35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-22.png","element":"img","alt":", λmax","inline":true},{"text":"], ","element":"span"},{"text":"we have shown the feasibility of ","element":"span"},{"href":"#id-27","text":"(LRD","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-23.png","element":"img","alt":"λ","inline":true},{"text":") in part 1. 
Therefore, Slater’s condition always holds for ","element":"span"},{"href":"#id-27","text":"(LRD","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-24.png","element":"img","alt":"λ","inline":true},{"text":") for all ","element":"span"},{"style":{"height":11.6},"width":65.31,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-25.png","element":"img","alt":" λ >","inline":true,"padRight":true},{"text":"0.","element":"span"}],[{"style":{"width":"1%"},"width":28,"height":28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-26.png","element":"img"}],[{"style":{"fontWeight":"bold"},"text":"Lemma 11. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"B Given ","element":"span"},{"style":{"height":11.6},"width":67.78,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-27.png","element":"img","alt":" λ ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"height":14},"width":104.54,"height":35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-28.png","element":"img","alt":", λmax","inline":true},{"text":"]","element":"span"},{"style":{"fontStyle":"italic"},"text":", problem ","element":"span"},{"href":"#id-27","style":{"fontStyle":"italic"},"text":"(LRD","element":"a"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-29.png","element":"img","alt":"λ","inline":true},{"style":{"fontStyle":"italic"},"text":") has a unique optimal solution, i.e., there exists a unique ","element":"span"},{"style":{"height":15.71},"width":135.95,"height":39.28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-30.png","element":"img","alt":" θ∗λ ∈ Fλ","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"such that 
","element":"span"},{"style":{"height":16},"width":54.94,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-31.png","element":"img","alt":" g(θ","inline":true},{"text":") ","element":"span"},{"style":{"fontStyle":"italic"},"text":"achieves its minimum over ","element":"span"},{"style":{"height":13.59},"width":47.64,"height":33.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-32.png","element":"img","alt":" Fλ","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"at ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.74,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-33.png","element":"img","alt":" θ∗λ","inline":true},{"style":{"fontStyle":"italic"},"text":".","element":"span"}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":10.8},"width":26.12,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-34.png","element":"img","alt":"�C","inline":true,"padRight":true},{"text":":= ","element":"span"},{"style":{"height":16},"width":145.23,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-35.png","element":"img","alt":" {θ ∈ ℜm","inline":true,"padRight":true},{"text":": ","element":"span"},{"style":{"height":13.19},"width":70.04,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-36.png","element":"img","alt":" θi ∈","inline":true,"padRight":true},{"text":"[0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1]","element":"span"},{"style":{"fontStyle":"italic"},"text":", i ","element":"span"},{"text":"= 1","element":"span"},{"style":{"fontStyle":"italic"},"text":", . . . 
, m","element":"span"},{"style":{"fontStyle":"italic"},"text":"} ","element":"span"},{"text":"and","element":"span"}],[{"style":{"width":"74%"},"width":1404,"height":269,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-37.png","element":"img"}],[{"text":"By L","element":"span"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-38.png","element":"img","alt":"′","inline":true},{"text":"Hôpital","element":"span"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-39.png","element":"img","alt":"′","inline":true},{"text":"s rule, it is easy to see that lim","element":"span"},{"style":{"height":16.79},"width":117.31,"height":41.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-40.png","element":"img","alt":"y↓0 f(y","inline":true},{"text":") = lim","element":"span"},{"style":{"height":16.79},"width":117.3,"height":41.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-41.png","element":"img","alt":"y↑1 f(y","inline":true},{"text":") = 0. 
Therefore, let","element":"span"}],[{"style":{"width":"24%"},"width":463,"height":120,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-42.png","element":"img"}],[{"text":"and","element":"span"}],[{"style":{"width":"17%"},"width":335,"height":110,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/14-43.png","element":"img"}],[{"id":"id-57","style":{"width":"99%"},"width":1870,"height":342,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-0.png","element":"img"}],[{"text":"By Lemma A, we know that ","element":"span"},{"style":{"height":16.4},"width":122.67,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-1.png","element":"img","alt":" Fλ ̸= ∅","inline":true,"padRight":true},{"text":"and thus ","element":"span"},{"style":{"height":16.4},"width":122.67,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-2.png","element":"img","alt":"�Fλ ̸= ∅","inline":true,"padRight":true},{"text":"as well. Therefore, problem ","element":"span"},{"href":"#id-57","text":"(LRD","element":"a"},{"style":{"height":10.3},"width":19,"height":25.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-3.png","element":"img","alt":"′λ","inline":true},{"text":") is feasible. ","element":"span"},{"text":"Since the constraints of problem ","element":"span"},{"href":"#id-57","text":"(LRD","element":"a"},{"style":{"height":10.3},"width":19,"height":25.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-4.png","element":"img","alt":"′λ","inline":true},{"text":") are all linear, Slater’s condition is satisfied. 
Hence, ","element":"span"},{"text":"there exists a set of Lagrangian multipliers ","element":"span"},{"style":{"height":18.55},"width":456.89,"height":46.38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-5.png","element":"img","alt":" η+, η− ∈ ℜp+, ξ+, ξ− ∈ ℜm+","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":12.4},"width":91.57,"height":31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-6.png","element":"img","alt":" γ ∈ ℜ","inline":true,"padRight":true},{"text":"such that","element":"span"}],[{"id":"id-58","style":{"width":"75%"},"width":1408,"height":110,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-7.png","element":"img"}],[{"text":"We can see that if there is an ","element":"span"},{"style":{"height":12.79},"width":29.73,"height":31.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-8.png","element":"img","alt":" i0","inline":true,"padRight":true},{"text":"such that [","element":"span"},{"style":{"height":16.52},"width":242.82,"height":41.29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-9.png","element":"img","alt":"θ∗λ]i0 ∈ {0, 1}","inline":true},{"text":", i.e., ","element":"span"},{"style":{"height":16.52},"width":153.08,"height":41.29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-10.png","element":"img","alt":" θ∗λ /∈ Fλ","inline":true},{"text":", Eq. ","element":"span"},{"href":"#id-58","text":"(24) ","element":"a"},{"text":"does not hold since ","element":"span"},{"style":{"height":16.52},"width":312.55,"height":41.29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-11.png","element":"img","alt":"|[∇�g(θ∗λ)]i0| = ∞","inline":true},{"text":". 
","element":"span"},{"text":"Therefore, we can conclude that ","element":"span"},{"style":{"height":15.72},"width":158.02,"height":39.29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-12.png","element":"img","alt":" θ∗λ ∈ Fλ","inline":true},{"text":". ","element":"span"},{"text":"Moreover, it is easy to see that ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-13.png","element":"img","alt":" θ∗λ","inline":true,"padRight":true},{"text":"= ","element":"span"},{"text":"argmin","element":"span"},{"style":{"height":19.48},"width":142.64,"height":48.7,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-14.png","element":"img","alt":"θ∈ �Fλ �g(θ","inline":true},{"text":") = argmin","element":"span"},{"style":{"height":17.99},"width":142.64,"height":44.97,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-15.png","element":"img","alt":"θ∈Fλ g(θ","inline":true},{"text":"), i.e. ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-16.png","element":"img","alt":" θ∗λ","inline":true,"padRight":true},{"text":"is a minimum of ","element":"span"},{"style":{"height":16},"width":54.94,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-17.png","element":"img","alt":" g(θ","inline":true},{"text":") over ","element":"span"},{"style":{"height":13.59},"width":47.64,"height":33.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-18.png","element":"img","alt":" Fλ","inline":true},{"text":". 
The uniqueness of ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-19.png","element":"img","alt":" θ∗λ","inline":true,"padRight":true},{"text":"is due to ","element":"span"},{"text":"the strict convexity of ","element":"span"},{"style":{"height":16},"width":54.94,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-20.png","element":"img","alt":" g(θ","inline":true},{"text":") (strong convexity implies strict convexity), which completes the proof.","element":"span"}]]},{"heading":"C Proof of Lemma 1","paragraphs":[[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"Recall that the domain of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"g ","element":"span"},{"text":"is ","element":"span"},{"style":{"height":16},"width":221.66,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-21.png","element":"img","alt":" C = {θ ∈ ℜm","inline":true,"padRight":true},{"text":": [","element":"span"},{"style":{"height":16},"width":82.22,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-22.png","element":"img","alt":"θ]i ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1)","element":"span"},{"style":{"fontStyle":"italic"},"text":", i ","element":"span"},{"text":"= 1","element":"span"},{"style":{"fontStyle":"italic"},"text":", . . . , m","element":"span"},{"style":{"fontStyle":"italic"},"text":"}","element":"span"},{"text":".","element":"span"}],[{"text":"a. 
It is easy to see that","element":"span"}],[{"style":{"width":"94%"},"width":1767,"height":289,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-23.png","element":"img"}],[{"text":"Therefore, ","element":"span"},{"style":{"fontStyle":"italic"},"text":"g ","element":"span"},{"text":"is a strongly convex function with convexity parameter ","element":"span"},{"style":{"height":10},"width":24,"height":25,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-24.png","element":"img","alt":" µ","inline":true,"padRight":true},{"text":"= ","element":"span"},{"style":{"height":19.37},"width":28,"height":48.44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-25.png","element":"img","alt":"4m","inline":true,"padRight":true},{"href":"#id-31","referenceIndex":18,"text":"[18]","element":"a"},{"text":". The claim then follows ","element":"span"},{"text":"directly from the definition of strongly convex functions.","element":"span"}],[{"text":"b. If ","element":"span"},{"style":{"height":15.2},"width":130.84,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-26.png","element":"img","alt":" θ1 ̸= θ2","inline":true},{"text":", then there exists at least one ","element":"span"},{"style":{"height":16},"width":263.42,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-27.png","element":"img","alt":" i′ ∈ {1, . . . , m}","inline":true,"padRight":true},{"text":"such that [","element":"span"},{"style":{"height":16},"width":85.98,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-28.png","element":"img","alt":"θ1]i′ ̸","inline":true},{"text":"= [","element":"span"},{"style":{"height":16},"width":68.93,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-29.png","element":"img","alt":"θ2]i′","inline":true},{"text":". 
Moreover, at most one of [","element":"span"},{"style":{"height":16},"width":68.93,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-30.png","element":"img","alt":"θ1]i′","inline":true,"padRight":true},{"text":"and [","element":"span"},{"style":{"height":16},"width":68.93,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-31.png","element":"img","alt":"θ2]i′","inline":true,"padRight":true},{"text":"can be ","element":"span"},{"style":{"height":19.37},"width":16.01,"height":48.43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-32.png","element":"img","alt":"12","inline":true},{"text":". Without loss of generality, assume [","element":"span"},{"style":{"height":19.37},"width":145.62,"height":48.43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-33.png","element":"img","alt":"θ1]i′ ̸= 12","inline":true},{"text":".","element":"span"}],[{"text":"For ","element":"span"},{"style":{"height":10.8},"width":52.66,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-34.png","element":"img","alt":" t ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1), let ","element":"span"},{"style":{"height":16},"width":49.31,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-35.png","element":"img","alt":" θ(t","inline":true},{"text":") = ","element":"span"},{"style":{"height":13.19},"width":49.1,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-36.png","element":"img","alt":" tθ2","inline":true,"padRight":true},{"text":"+ (1 ","element":"span"},{"style":{"height":16},"width":104.53,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-37.png","element":"img","alt":" − t)θ1","inline":true},{"text":". 
Since [","element":"span"},{"style":{"height":16},"width":82.98,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-38.png","element":"img","alt":"θ1]i′ ̸","inline":true},{"text":"= ","element":"span"},{"style":{"height":19.37},"width":16,"height":48.44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-39.png","element":"img","alt":"12","inline":true},{"text":", we can find ","element":"span"},{"style":{"height":10.8},"width":63.84,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-40.png","element":"img","alt":" t′ ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1) such that [","element":"span"},{"style":{"height":19.37},"width":185.82,"height":48.44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-41.png","element":"img","alt":"θ(t′)]i′ ̸= 12","inline":true},{"text":". ","element":"span"},{"text":"Therefore, we can see that","element":"span"}],[{"style":{"width":"94%"},"width":1768,"height":590,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/15-42.png","element":"img"}],[{"style":{"width":"94%"},"width":1770,"height":2468,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/16-0.png","element":"img"}],[{"style":{"width":"94%"},"width":1770,"height":882,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-0.png","element":"img"}]]},{"heading":"D Proof of Lemma 3","paragraphs":[[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"According to the definition of ","element":"span"},{"style":{"height":13.19},"width":86.83,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-1.png","element":"img","alt":" λmax","inline":true,"padRight":true},{"text":"in Eq. 
","element":"span"},{"href":"#id-25","text":"(1)","element":"a"},{"text":", there must be ","element":"span"},{"style":{"height":16},"width":267.04,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-2.png","element":"img","alt":" j0 ∈ {1, . . . , p}","inline":true,"padRight":true},{"text":"such that ","element":"span"},{"style":{"height":13.19},"width":86.83,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-3.png","element":"img","alt":" λmax","inline":true,"padRight":true},{"text":"=","element":"span"}],[{"style":{"width":"60%"},"width":1131,"height":23,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-4.png","element":"img"}],[{"text":"For ","element":"span"},{"style":{"height":11.6},"width":61.32,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-5.png","element":"img","alt":" λ ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"height":14},"width":104.54,"height":35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-6.png","element":"img","alt":", λmax","inline":true},{"text":"), we prove the statement by contradiction. 
Suppose ","element":"span"},{"style":{"height":13.19},"width":40.7,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-7.png","element":"img","alt":" Iλ","inline":true,"padRight":true},{"text":"is empty, then the KKT condition for (LRD","element":"span"},{"style":{"height":7.6},"width":19,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-8.png","element":"img","alt":"λ","inline":true},{"text":") at ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-9.png","element":"img","alt":" θ∗λ","inline":true,"padRight":true},{"text":"can be written as:","element":"span"}],[{"style":{"width":"50%"},"width":939,"height":118,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-10.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":18.55},"width":335.92,"height":46.38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-11.png","element":"img","alt":" η+, η− ∈ ℜp+, γ ∈ ℜ","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":16.51},"width":105.91,"height":41.28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-12.png","element":"img","alt":" NC(θ∗λ","inline":true},{"text":") is the normal cone of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"C ","element":"span"},{"text":"at ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-13.png","element":"img","alt":" θ∗λ","inline":true},{"text":". 
Because ","element":"span"},{"style":{"height":15.5},"width":111.39,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-14.png","element":"img","alt":" θ∗λ ∈ C","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"C ","element":"span"},{"text":"is an open set, ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-15.png","element":"img","alt":"θ∗λ","inline":true,"padRight":true},{"text":"is an interior point of ","element":"span"},{"style":{"fontStyle":"italic"},"text":"C ","element":"span"},{"text":"and thus ","element":"span"},{"style":{"height":16.51},"width":105.91,"height":41.28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-16.png","element":"img","alt":" NC(θ∗λ","inline":true},{"text":") = ","element":"span"},{"style":{"height":13.6},"width":20,"height":34,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-17.png","element":"img","alt":" ∅","inline":true},{"text":". 
Therefore, the above equation becomes:","element":"span"}],[{"id":"id-59","style":{"width":"70%"},"width":1326,"height":118,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-18.png","element":"img"}],[{"text":"Moreover, by the complementary slackness condition ","element":"span"},{"href":"#id-56","referenceIndex":3,"text":"[3]","element":"a"},{"text":", we have ","element":"span"},{"style":{"height":20.86},"width":46.22,"height":52.16,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-19.png","element":"img","alt":" η+j","inline":true,"padRight":true},{"text":"= ","element":"span"},{"style":{"height":16.86},"width":46.22,"height":42.16,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-20.png","element":"img","alt":" η−j","inline":true,"padRight":true},{"text":"= 0 for ","element":"span"},{"style":{"fontStyle":"italic"},"text":"j ","element":"span"},{"text":"= 1","element":"span"},{"style":{"fontStyle":"italic"},"text":", . . . , p ","element":"span"},{"text":"since ","element":"span"},{"style":{"height":13.19},"width":40.7,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-21.png","element":"img","alt":" Iλ","inline":true,"padRight":true},{"text":"is ","element":"span"},{"text":"empty. Then, Eq. 
","element":"span"},{"href":"#id-59","text":"(39) ","element":"a"},{"text":"becomes:","element":"span"}],[{"style":{"width":"57%"},"width":1084,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-22.png","element":"img"}],[{"text":"By a similar argument, the KKT condition for (LRD","element":"span"},{"style":{"height":9.2},"width":74.8,"height":22.99,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-23.png","element":"img","alt":"λmax","inline":true},{"text":") at ","element":"span"},{"style":{"height":17.09},"width":93.5,"height":42.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-24.png","element":"img","alt":" θ∗λmax","inline":true,"padRight":true},{"text":"is:","element":"span"}],[{"id":"id-60","style":{"width":"72%"},"width":1360,"height":119,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-25.png","element":"img"}],[{"text":"where ¯","element":"span"},{"style":{"height":18.55},"width":215.98,"height":46.38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-26.png","element":"img","alt":"η+, ¯η− ∈ ℜp+","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":12.4},"width":102.75,"height":31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-27.png","element":"img","alt":" γ′ ∈ ℜ","inline":true},{"text":".","element":"span"}],[{"text":"Since ","element":"span"},{"style":{"height":15.19},"width":115.72,"height":37.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-28.png","element":"img","alt":" Iλ = ∅","inline":true},{"text":", we can see that ","element":"span"},{"style":{"height":17.5},"width":437.74,"height":43.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-29.png","element":"img","alt":" |⟨θ∗λ, ¯xj⟩| < mλ < mλmax","inline":true,"padRight":true},{"text":"for all 
","element":"span"},{"style":{"height":16},"width":250.79,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-30.png","element":"img","alt":" j ∈ {1, . . . , m}","inline":true},{"text":". Therefore, ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-31.png","element":"img","alt":" θ∗λ","inline":true,"padRight":true},{"text":"also satisfies ","element":"span"},{"text":"Eq. ","element":"span"},{"href":"#id-60","text":"(41) ","element":"a"},{"text":"by setting ","element":"span"},{"style":{"height":16.58},"width":46.22,"height":41.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-32.png","element":"img","alt":" η+","inline":true,"padRight":true},{"text":"= ¯","element":"span"},{"style":{"height":16.58},"width":116.66,"height":41.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-33.png","element":"img","alt":"η+, η−","inline":true,"padRight":true},{"text":"= ¯","element":"span"},{"style":{"height":12.58},"width":46.22,"height":31.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-34.png","element":"img","alt":"η−","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":10.4},"width":112.83,"height":26,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-35.png","element":"img","alt":" γ = γ′","inline":true,"padRight":true},{"text":"without violating the complementary slackness conditions. 
As a result, ","element":"span"},{"style":{"height":15.5},"width":37.71,"height":38.75,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-36.png","element":"img","alt":" θ∗λ","inline":true,"padRight":true},{"text":"is an optimal solution of problem (LRD","element":"span"},{"style":{"height":9.19},"width":74.79,"height":22.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-37.png","element":"img","alt":"λmax","inline":true},{"text":") as well.","element":"span"}],[{"text":"Moreover, it is easy to see that ","element":"span"},{"style":{"height":17.71},"width":190.4,"height":44.27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-38.png","element":"img","alt":" θ∗λ ̸= θ∗λmax","inline":true,"padRight":true},{"text":"because ","element":"span"},{"style":{"height":19.09},"width":742.34,"height":47.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-39.png","element":"img","alt":" |⟨θ∗λ, ¯xj0⟩| < mλ < mλmax = |⟨θ∗λmax, ¯xj0⟩|","inline":true},{"text":". ","element":"span"},{"text":"Consequently, (LRD","element":"span"},{"style":{"height":9.19},"width":74.79,"height":22.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-40.png","element":"img","alt":"λmax","inline":true},{"text":") has at least two distinct optimal solutions, which contradicts Lemma B. Therefore, ","element":"span"},{"style":{"height":13.19},"width":40.7,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-41.png","element":"img","alt":"Iλ","inline":true,"padRight":true},{"text":"must be a nonempty set. 
Because ","element":"span"},{"style":{"height":10.8},"width":23,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-42.png","element":"img","alt":" λ","inline":true,"padRight":true},{"text":"is arbitrary in (0","element":"span"},{"style":{"height":14},"width":104.54,"height":35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-43.png","element":"img","alt":", λmax","inline":true},{"text":"), the proof is complete.","element":"span"}],[{"style":{"width":"1%"},"width":28,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/17-44.png","element":"img"}]]},{"heading":"E Proof of Theorem 4","paragraphs":[[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"Recall that we need to solve the following optimization problem:","element":"span"}],[{"id":"id-40","style":{"width":"99%"},"width":1867,"height":194,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-0.png","element":"img"}],[{"text":"i.e., ","element":"span"},{"style":{"height":10.8},"width":19,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-1.png","element":"img","alt":" θ","inline":true,"padRight":true},{"text":"belongs to the orthogonal complement of the space spanned by ","element":"span"},{"style":{"fontWeight":"bold"},"text":"b","element":"span"},{"text":". 
As a result, ","element":"span"},{"style":{"height":10.8},"width":123.28,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-2.png","element":"img","alt":" θ = Pθ","inline":true,"padRight":true},{"text":"and","element":"span"}],[{"style":{"width":"99%"},"width":1869,"height":172,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-3.png","element":"img"}]]},{"heading":"F Proof of Corollary 5","paragraphs":[[{"style":{"width":"99%"},"width":1871,"height":141,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-4.png","element":"img"}]]},{"heading":"G Proof of Lemma 6","paragraphs":[[{"style":{"width":"74%"},"width":1399,"height":181,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-5.png","element":"img"}],[{"text":"To show that Slater’s condition holds, we need to find a point ","element":"span"},{"style":{"height":10.8},"width":33.82,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-6.png","element":"img","alt":" θ′","inline":true,"padRight":true},{"text":"such that","element":"span"}],[{"style":{"width":"99%"},"width":1868,"height":181,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-7.png","element":"img"}],[{"text":"empty. 
Taking ","element":"span"},{"style":{"height":14.78},"width":137.6,"height":36.96,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-8.png","element":"img","alt":" j0 ∈ Iλ0","inline":true},{"text":", we can see that","element":"span"}],[{"style":{"width":"16%"},"width":314,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-9.png","element":"img"}],[{"text":"However, because ","element":"span"},{"style":{"height":15.71},"width":135.95,"height":39.28,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-10.png","element":"img","alt":" θ∗λ ∈ Fλ","inline":true},{"text":", we have ","element":"span"},{"style":{"fontStyle":"italic"},"text":"| ","element":"span"},{"text":"¯","element":"span"},{"style":{"height":18.18},"width":266.54,"height":45.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-11.png","element":"img","alt":"XT θ∗λ|∞ ≤ mλ,","inline":true}],[{"style":{"width":"71%"},"width":1334,"height":256,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-12.png","element":"img"}],[{"text":"As a result, Slater’s condition holds for (UBP","element":"span"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-13.png","element":"img","alt":"′","inline":true},{"text":"), which implies strong duality for (UBP","element":"span"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-14.png","element":"img","alt":"′","inline":true},{"text":"). 
Moreover, it is easy to see that problem (UBP","element":"span"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-15.png","element":"img","alt":"′","inline":true},{"text":") admits an optimal solution in ","element":"span"},{"style":{"height":19.49},"width":64.72,"height":48.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-16.png","element":"img","alt":" Aλλ0","inline":true,"padRight":true},{"text":"because the objective ","element":"span"},{"text":"function of (UBP","element":"span"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-17.png","element":"img","alt":"′","inline":true},{"text":") is continuous and ","element":"span"},{"style":{"height":19.49},"width":64.72,"height":48.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-18.png","element":"img","alt":" Aλλ0","inline":true,"padRight":true},{"text":"is compact.","element":"span"}],[{"style":{"width":"1%"},"width":28,"height":15,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/18-19.png","element":"img"}]]},{"heading":"H Proof of Lemma 7","paragraphs":[[{"style":{"fontStyle":"italic"},"text":"Proof. 
","element":"span"},{"text":"The Lagrangian of (UBP","element":"span"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-0.png","element":"img","alt":"′","inline":true},{"text":") is","element":"span"}],[{"style":{"fontStyle":"italic"},"text":"L","element":"span"},{"text":"(","element":"span"},{"style":{"fontStyle":"italic"},"text":"θ","element":"span"},{"text":"; ","element":"span"},{"style":{"fontStyle":"italic"},"text":"u","element":"span"},{"style":{"height":7.6},"width":16,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-1.png","element":"img","alt":"1","inline":true},{"style":{"fontStyle":"italic"},"text":", u","element":"span"},{"style":{"height":7.6},"width":16,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-2.png","element":"img","alt":"2","inline":true},{"style":{"fontStyle":"italic"},"text":", v","element":"span"},{"text":") = ","element":"span"},{"style":{"height":16},"width":16,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-3.png","element":"img","alt":" ⟨","inline":true},{"style":{"fontStyle":"italic"},"text":"θ, ","element":"span"},{"text":"¯","element":"span"},{"style":{"height":16},"width":40.18,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-4.png","element":"img","alt":"x⟩","inline":true,"padRight":true},{"text":"+ ","element":"span"},{"style":{"fontStyle":"italic"},"text":"u","element":"span"},{"style":{"height":26.93},"width":338.43,"height":67.32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-5.png","element":"img","alt":"12�∥θ − θ∗λ0∥22 − r2�","inline":true},{"text":"+ 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"u","element":"span"},{"style":{"height":16},"width":49.38,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-6.png","element":"img","alt":"2(⟨","inline":true},{"style":{"fontStyle":"italic"},"text":"θ, ","element":"span"},{"text":"¯","element":"span"},{"style":{"height":16},"width":97.85,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-7.png","element":"img","alt":"x∗⟩ −","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"mλ","element":"span"},{"style":{"height":7.6},"width":16,"height":19,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-8.png","element":"img","alt":"2","inline":true},{"text":") + ","element":"span"},{"style":{"fontStyle":"italic"},"text":"v","element":"span"},{"style":{"height":16},"width":94.48,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-9.png","element":"img","alt":"⟨θ, b⟩","inline":true,"padRight":true},{"text":"(43) = ","element":"span"},{"style":{"height":16},"width":16,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-10.png","element":"img","alt":" ⟨","inline":true},{"style":{"fontStyle":"italic"},"text":"θ, ","element":"span"},{"text":"¯","element":"span"},{"style":{"fontWeight":"bold"},"text":"x ","element":"span"},{"text":"+ ","element":"span"},{"style":{"fontStyle":"italic"},"text":"u","element":"span"},{"style":{"height":14.17},"width":58.06,"height":35.43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-11.png","element":"img","alt":"2¯x∗","inline":true,"padRight":true},{"text":"+ ","element":"span"},{"style":{"fontStyle":"italic"},"text":"v","element":"span"},{"style":{"height":16},"width":41.46,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-12.png","element":"img","alt":"b⟩","inline":true,"padRight":true},{"text":"+ 
","element":"span"},{"style":{"fontStyle":"italic"},"text":"u","element":"span"},{"style":{"height":26.92},"width":502.32,"height":67.31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-13.png","element":"img","alt":"12�∥θ − θ∗λ0∥22 − r2�− u2mλ2","inline":true},{"style":{"fontStyle":"italic"},"text":".","element":"span"}],[{"text":"where ","element":"span"},{"style":{"height":13.6},"width":148.59,"height":34,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-14.png","element":"img","alt":" u1, u2 ≥","inline":true,"padRight":true},{"text":"0 and ","element":"span"},{"style":{"height":9.6},"width":104.31,"height":24,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-15.png","element":"img","alt":" v ∈ ℜ","inline":true,"padRight":true},{"text":"are the Lagrangian multipliers. To derive the dual function ˆ","element":"span"},{"style":{"height":16},"width":171.76,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-16.png","element":"img","alt":"g(u1, u2, v","inline":true},{"text":") = min","element":"span"},{"style":{"height":16},"width":240.64,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-17.png","element":"img","alt":"θ L(θ; u1, u2, v","inline":true},{"text":"), we can simply set ","element":"span"},{"style":{"height":16},"width":267.2,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-18.png","element":"img","alt":" ∇θL(θ; u1, u2, v","inline":true},{"text":") = 0, i.e.,","element":"span"}],[{"style":{"width":"48%"},"width":913,"height":45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-19.png","element":"img"}],[{"text":"When ","element":"span"},{"style":{"height":15.2},"width":51.76,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-20.png","element":"img","alt":" u1 ̸","inline":true},{"text":"= 0, we 
set","element":"span"}],[{"id":"id-61","style":{"width":"64%"},"width":1202,"height":88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-21.png","element":"img"}],[{"text":"to minimize ","element":"span"},{"style":{"height":16},"width":142.34,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-22.png","element":"img","alt":" L(θ; u, v","inline":true},{"text":") with (","element":"span"},{"style":{"fontWeight":"bold"},"text":"u","element":"span"},{"style":{"fontStyle":"italic"},"text":", v","element":"span"},{"text":") fixed. Plugging Eq. ","element":"span"},{"href":"#id-61","text":"(44) ","element":"a"},{"text":"into Eq. ","element":"span"},{"href":"#id-61","text":"(43)","element":"a"},{"text":", we see that the dual function ˆ","element":"span"},{"style":{"height":16},"width":171.76,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-23.png","element":"img","alt":"g(u1, u2, v","inline":true},{"text":") is:","element":"span"}],[{"id":"id-62","style":{"width":"99%"},"width":1870,"height":1027,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-24.png","element":"img"}],[{"text":"Clearly, Eq. 
","element":"span"},{"href":"#id-62","text":"(46) ","element":"a"},{"text":"implies that ","element":"span"},{"style":{"height":10.8},"width":55.33,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-25.png","element":"img","alt":" P¯x","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":10.98},"width":71.52,"height":27.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-26.png","element":"img","alt":" P¯x∗","inline":true,"padRight":true},{"text":"are collinear, i.e., ","element":"span"},{"style":{"height":24.84},"width":391.38,"height":62.1,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-27.png","element":"img","alt":"⟨P¯x,P¯x∗⟩∥P¯x∥2∥P¯x∗∥2 ∈ {−1, 1}","inline":true},{"text":". Moreover, by ","element":"span"},{"text":"Eq. ","element":"span"},{"href":"#id-62","text":"(47)","element":"a"},{"text":", we know that ","element":"span"},{"style":{"height":16},"width":225.75,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-28.png","element":"img","alt":" ⟨P¯x, P¯x∗⟩ ≤","inline":true,"padRight":true},{"text":"0. Therefore, it is easy to see that ","element":"span"},{"style":{"height":24.84},"width":197.66,"height":62.1,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-29.png","element":"img","alt":"⟨P¯x,P¯x∗⟩∥P¯x∥2∥P¯x∗∥2","inline":true,"padRight":true},{"text":"= ","element":"span"},{"style":{"height":4.4},"width":31,"height":11,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-30.png","element":"img","alt":" −","inline":true},{"text":"1, which ","element":"span"},{"text":"contradicts the assumption. 
Hence, ¯","element":"span"},{"style":{"height":13.38},"width":153.77,"height":33.44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-31.png","element":"img","alt":"x + u2¯x∗","inline":true,"padRight":true},{"text":"+ ","element":"span"},{"style":{"height":15.2},"width":57.28,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-32.png","element":"img","alt":" vb ̸","inline":true},{"text":"= 0 for all ","element":"span"},{"style":{"height":12.8},"width":82.76,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-33.png","element":"img","alt":" u2 ≥","inline":true,"padRight":true},{"text":"0 and ","element":"span"},{"style":{"fontStyle":"italic"},"text":"v","element":"span"},{"text":".","element":"span"}],[{"style":{"width":"94%"},"width":1768,"height":284,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/19-34.png","element":"img"}],[{"id":"id-63","style":{"width":"94%"},"width":1769,"height":910,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/20-0.png","element":"img"}],[{"text":"Moreover, it is easy to see that problem ","element":"span"},{"href":"#id-63","text":"(52) ","element":"a"},{"text":"is an unconstrained optimization problem with respect to ","element":"span"},{"style":{"fontStyle":"italic"},"text":"v","element":"span"},{"text":". 
Therefore, we set","element":"span"}],[{"id":"id-64","style":{"width":"94%"},"width":1768,"height":261,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/20-1.png","element":"img"}],[{"text":"By plugging ","element":"span"},{"href":"#id-64","text":"(53) ","element":"a"},{"text":"into ˆ","element":"span"},{"style":{"height":16},"width":171.76,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/20-2.png","element":"img","alt":"g(u1, u2, v","inline":true},{"text":") and noting that ","element":"span"},{"style":{"height":16},"width":236.52,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/20-3.png","element":"img","alt":" U1 = {(u1, u2","inline":true},{"text":") : ","element":"span"},{"style":{"height":16},"width":281.08,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/20-4.png","element":"img","alt":" u1 > 0, u2 ≥ 0}","inline":true},{"text":", problem ","element":"span"},{"href":"#id-63","text":"(52) ","element":"a"},{"text":"is equivalent to","element":"span"}],[{"id":"id-66","style":{"width":"87%"},"width":1641,"height":89,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/20-5.png","element":"img"}],[{"text":"By Lemma 6, Slater’s condition holds for (UBP","element":"span"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/20-6.png","element":"img","alt":"′","inline":true},{"text":"). Therefore, strong duality holds for (UBP","element":"span"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/20-7.png","element":"img","alt":"′","inline":true},{"text":") and (UBD","element":"span"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/20-8.png","element":"img","alt":"′","inline":true},{"text":"). 
By the strong duality theorem ","element":"span"},{"href":"#id-65","referenceIndex":11,"text":"[11]","element":"a"},{"text":", there exist ","element":"span"},{"style":{"height":14.94},"width":87.16,"height":37.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/20-9.png","element":"img","alt":" u∗1 ≥","inline":true,"padRight":true},{"text":"0, ","element":"span"},{"style":{"height":14.94},"width":87.16,"height":37.35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/20-10.png","element":"img","alt":" u∗2 ≥","inline":true,"padRight":true},{"text":"0 and ","element":"span"},{"style":{"height":10.99},"width":36.74,"height":27.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/20-11.png","element":"img","alt":"v∗","inline":true,"padRight":true},{"text":"such that ˆ","element":"span"},{"style":{"height":16},"width":190.37,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/20-12.png","element":"img","alt":"g(u∗1, u∗2, v∗","inline":true},{"text":") = max","element":"span"},{"style":{"height":10},"width":185.3,"height":25,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/20-13.png","element":"img","alt":"u1≥0,u2≥0,v","inline":true,"padRight":true},{"text":"ˆ","element":"span"},{"style":{"height":16},"width":171.76,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/20-14.png","element":"img","alt":"g(u1, u2, v","inline":true},{"text":"). By Eq. ","element":"span"},{"href":"#id-63","text":"(51)","element":"a"},{"text":", it is easy to see that ","element":"span"},{"style":{"height":14.94},"width":88.76,"height":37.35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/20-15.png","element":"img","alt":" u∗1 >","inline":true,"padRight":true},{"text":"0. 
","element":"span"},{"text":"Therefore, ¯","element":"span"},{"style":{"height":16},"width":133.15,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/20-16.png","element":"img","alt":"g(u1, u2","inline":true},{"text":") attains its maximum in ","element":"span"},{"style":{"height":13.19},"width":40.94,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/20-17.png","element":"img","alt":" U1","inline":true},{"text":".","element":"span"}],[{"style":{"width":"81%"},"width":1530,"height":615,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/20-18.png","element":"img"}],[{"id":"id-73","style":{"width":"94%"},"width":1770,"height":669,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/21-0.png","element":"img"}]]},{"heading":"I Proof of Theorem 8","paragraphs":[[{"style":{"fontWeight":"bold"},"text":"I.1 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Proof for the Non-Collinear Case","element":"span"}],[{"text":"In this section, we show that the result in Theorem 8 holds for the case in which ","element":"span"},{"style":{"height":24.84},"width":337.62,"height":62.1,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/21-1.png","element":"img","alt":"⟨P¯x,P¯x∗⟩∥P¯x∥2∥P¯x∗∥2 ∈ (−1,","inline":true,"padRight":true},{"text":"1), ","element":"span"},{"text":"i.e., ","element":"span"},{"style":{"height":10.8},"width":55.33,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/21-2.png","element":"img","alt":" P¯x","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":10.99},"width":70.1,"height":27.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/21-3.png","element":"img","alt":" P¯x∗","inline":true,"padRight":true},{"text":"are not collinear.","element":"span"}],[{"style":{"fontStyle":"italic"},"text":"Proof. 
","element":"span"},{"text":"As shown by part a) of Lemma 7, the dual problem of (UBP","element":"span"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/21-4.png","element":"img","alt":"′","inline":true},{"text":") is equivalent to ","element":"span"},{"href":"#id-66","text":"(UBD","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/21-5.png","element":"img","alt":"′","inline":true},{"text":"). Since the strong duality holds by Lemma 6, we have","element":"span"}],[{"style":{"width":"67%"},"width":1263,"height":67,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/21-6.png","element":"img"}],[{"text":"Clearly, problem ","element":"span"},{"href":"#id-66","text":"(UBD","element":"a"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/21-7.png","element":"img","alt":"′","inline":true},{"text":") can be solved via the following minimization problem","element":"span"}],[{"id":"id-67","style":{"width":"67%"},"width":1259,"height":67,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/21-8.png","element":"img"}],[{"text":"Let ","element":"span"},{"style":{"height":14.94},"width":38.81,"height":37.35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/21-9.png","element":"img","alt":" u∗1","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":14.94},"width":38.82,"height":37.35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/21-10.png","element":"img","alt":" u∗2","inline":true,"padRight":true},{"text":"be the optimal solution of problem ","element":"span"},{"href":"#id-67","text":"(56)","element":"a"},{"text":". 
By part a) of Lemma 7, the existence of ","element":"span"},{"style":{"height":14.94},"width":83.2,"height":37.35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/21-11.png","element":"img","alt":" u∗1 >","inline":true,"padRight":true},{"text":"0 and ","element":"span"},{"style":{"height":14.94},"width":83.2,"height":37.35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/21-12.png","element":"img","alt":"u∗2 ≥","inline":true,"padRight":true},{"text":"0 is guaranteed. By introducing the slack variables ","element":"span"},{"style":{"height":12.8},"width":78.63,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/21-13.png","element":"img","alt":" s1 ≥","inline":true,"padRight":true},{"text":"0 and ","element":"span"},{"style":{"height":12.8},"width":78.63,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/21-14.png","element":"img","alt":" s2 ≥","inline":true,"padRight":true},{"text":"0, the KKT conditions of problem ","element":"span"},{"href":"#id-67","text":"(56) ","element":"a"},{"text":"can be written as follows:","element":"span"}],[{"style":{"width":"99%"},"width":1868,"height":596,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/21-15.png","element":"img"}],[{"text":"It is easy to observe that","element":"span"}],[{"style":{"width":"100%"},"width":1883,"height":106,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/21-16.png","element":"img"}],[{"style":{"height":16},"width":170.39,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-0.png","element":"img","alt":"m2). 
d ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1] due to the fact that ","element":"span"},{"style":{"height":13.19},"width":117.27,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-1.png","element":"img","alt":" λ0 > λ","inline":true,"padRight":true},{"text":"and Eq. ","element":"span"},{"href":"#id-68","text":"(68) ","element":"a"},{"text":"(note ","element":"span"},{"style":{"height":12.8},"width":78.63,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-2.png","element":"img","alt":" s2 ≥","inline":true,"padRight":true},{"text":"0).","element":"span"}],[{"style":{"width":"75%"},"width":1409,"height":297,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-3.png","element":"img"}],[{"text":"Therefore, we can see that ","element":"span"},{"style":{"height":11.19},"width":78.79,"height":27.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-4.png","element":"img","alt":" s2 >","inline":true,"padRight":true},{"text":"0 and thus ","element":"span"},{"style":{"height":14.94},"width":38.81,"height":37.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-5.png","element":"img","alt":" u∗2","inline":true,"padRight":true},{"text":"= 0 due to the complementary slackness condition. By ","element":"span"},{"text":"plugging ","element":"span"},{"style":{"height":14.94},"width":38.81,"height":37.35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-6.png","element":"img","alt":" u∗2","inline":true,"padRight":true},{"text":"= 0 into Eq. 
","element":"span"},{"href":"#id-69","text":"(67)","element":"a"},{"text":", we can get ","element":"span"},{"style":{"height":14.94},"width":38.81,"height":37.35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-7.png","element":"img","alt":" u∗1","inline":true,"padRight":true},{"text":"= ","element":"span"},{"style":{"height":21.63},"width":90.21,"height":54.07,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-8.png","element":"img","alt":" ∥P¯x∥2r","inline":true,"padRight":true},{"text":". The result in part a) of Theorem 8 follows by noting that ","element":"span"},{"style":{"height":19.09},"width":223.84,"height":47.73,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-9.png","element":"img","alt":" Tξ(θ∗λ, ¯xj; θ∗λ0","inline":true},{"text":") = ","element":"span"},{"style":{"height":16},"width":164.59,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-10.png","element":"img","alt":" −¯g(u∗1, u∗2","inline":true},{"text":").","element":"span"}],[{"text":"a2) Suppose ","element":"span"},{"style":{"height":24.84},"width":197.65,"height":62.1,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-11.png","element":"img","alt":"⟨P¯x,P¯x∗⟩∥P¯x∥2∥P¯x∗∥2","inline":true,"padRight":true},{"text":"= ","element":"span"},{"style":{"fontStyle":"italic"},"text":"d","element":"span"},{"text":". If ","element":"span"},{"style":{"height":14.94},"width":84.58,"height":37.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-12.png","element":"img","alt":" u∗2 >","inline":true,"padRight":true},{"text":"0, then ","element":"span"},{"style":{"height":9.19},"width":34.68,"height":22.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-13.png","element":"img","alt":" s2","inline":true,"padRight":true},{"text":"= 0 by the complementary slackness condition. In view ","element":"span"},{"text":"of Eq. 
","element":"span"},{"href":"#id-68","text":"(68) ","element":"a"},{"text":"and ","element":"span"},{"style":{"fontWeight":"bold"},"text":"m1)","element":"span"},{"text":", we can see that","element":"span"}],[{"style":{"width":"36%"},"width":691,"height":95,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-14.png","element":"img"}],[{"text":"which leads to a contradiction. Therefore ","element":"span"},{"style":{"height":14.94},"width":38.81,"height":37.35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-15.png","element":"img","alt":" u∗2","inline":true,"padRight":true},{"text":"= 0 and the result in part a) of Theorem 8 follows by a ","element":"span"},{"text":"similar argument as in the proof of a1).","element":"span"}],[{"style":{"width":"69%"},"width":1309,"height":200,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-16.png","element":"img"}],[{"text":"which implies that ","element":"span"},{"style":{"height":11.19},"width":78.63,"height":27.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-17.png","element":"img","alt":" s2 <","inline":true,"padRight":true},{"text":"0, a contradiction. Thus, we have ","element":"span"},{"style":{"height":14.94},"width":83.2,"height":37.35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-18.png","element":"img","alt":" u∗2 >","inline":true,"padRight":true},{"text":"0 and ","element":"span"},{"style":{"height":9.19},"width":34.68,"height":22.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-19.png","element":"img","alt":" s2","inline":true,"padRight":true},{"text":"= 0 by the complementary slackness ","element":"span"},{"text":"condition. Eq. 
","element":"span"},{"href":"#id-68","text":"(68) ","element":"a"},{"text":"becomes:","element":"span"}],[{"id":"id-70","style":{"width":"69%"},"width":1301,"height":96,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-20.png","element":"img"}],[{"text":"Expanding the terms in Eq. ","element":"span"},{"href":"#id-70","text":"(62) ","element":"a"},{"text":"yields the following quadratic equation:","element":"span"}],[{"style":{"width":"99%"},"width":1871,"height":191,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-21.png","element":"img"}],[{"style":{"height":24.84},"width":288.66,"height":62.1,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-22.png","element":"img","alt":"⟨P¯x,P¯x∗⟩∥P¯x∥2∥P¯x∗∥2 ̸= −","inline":true},{"text":"1, we can see that ","element":"span"},{"style":{"height":10.98},"width":71.51,"height":27.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-23.png","element":"img","alt":" P¯x∗","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":10.8},"width":55.33,"height":27,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-24.png","element":"img","alt":" P¯x","inline":true,"padRight":true},{"text":"are not collinear. Therefore, ","element":"span"},{"style":{"fontWeight":"bold"},"text":"m1) ","element":"span"},{"text":"and Eq. ","element":"span"},{"href":"#id-70","text":"(62) ","element":"a"},{"text":"imply that ","element":"span"},{"style":{"fontStyle":"italic"},"text":"d < ","element":"span"},{"text":"1. Moreover, the assumption ","element":"span"},{"style":{"height":13.19},"width":117.28,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-25.png","element":"img","alt":" λ0 > λ","inline":true,"padRight":true},{"text":"leads to ","element":"span"},{"style":{"fontStyle":"italic"},"text":"d > ","element":"span"},{"text":"0. 
As a result, we have (1 ","element":"span"},{"style":{"height":17.38},"width":136.04,"height":43.46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-26.png","element":"img","alt":" − d2) >","inline":true,"padRight":true},{"text":"0 and thus","element":"span"}],[{"style":{"width":"75%"},"width":1422,"height":46,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-27.png","element":"img"}],[{"text":"Consequently, ","element":"span"},{"href":"#id-70","text":"(63) ","element":"a"},{"text":"has only one positive solution which can be computed by the formula in Eq. ","element":"span"},{"href":"#id-71","text":"(86)","element":"a"},{"text":". The result in Eq. ","element":"span"},{"href":"#id-72","text":"(85) ","element":"a"},{"text":"follows by a similar argument as in the proof of a1).","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"I.2 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Proof for the Collinear and Positive Correlated Case","element":"span"}],[{"text":"We prove for the case in which ","element":"span"},{"style":{"height":24.84},"width":197.66,"height":62.1,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/22-28.png","element":"img","alt":"⟨P¯x,P¯x∗⟩∥P¯x∥2∥P¯x∗∥2","inline":true,"padRight":true},{"text":"= 1.","element":"span"}],[{"style":{"fontStyle":"italic"},"text":"Proof. 
","element":"span"},{"text":"Because ","element":"span"},{"style":{"height":24.84},"width":197.65,"height":62.09,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-0.png","element":"img","alt":"⟨P¯x,P¯x∗⟩∥P¯x∥2∥P¯x∗∥2","inline":true,"padRight":true},{"text":"= 1, by part a) of Lemma 7, the dual problem of (UBP","element":"span"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-1.png","element":"img","alt":"′","inline":true},{"text":") is given by (UBD","element":"span"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-2.png","element":"img","alt":"′","inline":true},{"text":"). ","element":"span"},{"text":"Therefore, the following KKT conditions hold as well:","element":"span"}],[{"id":"id-69","style":{"width":"71%"},"width":1330,"height":213,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-3.png","element":"img"}],[{"text":"where ","element":"span"},{"style":{"height":14.94},"width":85.14,"height":37.35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-4.png","element":"img","alt":" u∗1 >","inline":true,"padRight":true},{"text":"0 and ","element":"span"},{"style":{"height":14.94},"width":85.14,"height":37.35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-5.png","element":"img","alt":" u∗2 ≥","inline":true,"padRight":true},{"text":"0 are the optimal solution of (UBD","element":"span"},{"style":{"height":0},"width":14,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-6.png","element":"img","alt":"′","inline":true},{"text":"), and ","element":"span"},{"style":{"height":13.6},"width":134.83,"height":34,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-7.png","element":"img","alt":" s1, s2 ≥","inline":true,"padRight":true},{"text":"0 are the slack variables. 
Since ","element":"span"},{"style":{"height":14.94},"width":83.2,"height":37.35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-8.png","element":"img","alt":"u∗1 >","inline":true,"padRight":true},{"text":"0, we have ","element":"span"},{"style":{"height":9.19},"width":34.68,"height":22.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-9.png","element":"img","alt":" s1","inline":true,"padRight":true},{"text":"= 0, and Eq. ","element":"span"},{"href":"#id-69","text":"(64) ","element":"a"},{"text":"results in","element":"span"}],[{"id":"id-68","style":{"width":"99%"},"width":1869,"height":514,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-10.png","element":"img"}],[{"text":"By plugging Eq. ","element":"span"},{"href":"#id-68","text":"(70) ","element":"a"},{"text":"into Eq. ","element":"span"},{"href":"#id-68","text":"(68)","element":"a"},{"text":", we have","element":"span"}],[{"style":{"width":"78%"},"width":1914,"height":459,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-11.png","element":"img"}],[{"text":"Therefore, we can see that ","element":"span"},{"style":{"height":11.19},"width":78.79,"height":27.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-12.png","element":"img","alt":" s2 >","inline":true,"padRight":true},{"text":"0 and thus ","element":"span"},{"style":{"height":14.94},"width":38.81,"height":37.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-13.png","element":"img","alt":" u∗2","inline":true,"padRight":true},{"text":"= 0 due to the complementary slackness condition. By ","element":"span"},{"text":"plugging ","element":"span"},{"style":{"height":14.94},"width":38.81,"height":37.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-14.png","element":"img","alt":" u∗2","inline":true,"padRight":true},{"text":"= 0 into Eq. 
","element":"span"},{"href":"#id-69","text":"(67)","element":"a"},{"text":", we get ","element":"span"},{"style":{"height":14.94},"width":38.81,"height":37.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-15.png","element":"img","alt":" u∗1","inline":true,"padRight":true},{"text":"= ","element":"span"},{"style":{"height":21.63},"width":90.21,"height":54.06,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-16.png","element":"img","alt":" ∥P¯x∥2r","inline":true,"padRight":true},{"text":". The result in part a) of Theorem 8 follows by noting that ","element":"span"},{"style":{"height":19.09},"width":223.84,"height":47.74,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-17.png","element":"img","alt":" Tξ(θ∗λ, ¯xj; θ∗λ0","inline":true},{"text":") = ","element":"span"},{"style":{"height":16},"width":164.59,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-18.png","element":"img","alt":" −¯g(u∗1, u∗2","inline":true},{"text":").","element":"span"}],[{"style":{"width":"78%"},"width":1917,"height":724,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/23-19.png","element":"img"}],[{"style":{"fontWeight":"bold"},"text":"I.3 ","element":"span"},{"style":{"fontWeight":"bold"},"text":"Proof for the Collinear and Negative Correlated Case","element":"span"}],[{"text":"Before we proceed to prove Theorem 8 for the case in which ","element":"span"},{"style":{"height":24.84},"width":197.66,"height":62.1,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-0.png","element":"img","alt":"⟨P¯x,P¯x∗⟩∥P¯x∥2∥P¯x∗∥2","inline":true,"padRight":true},{"text":"= ","element":"span"},{"style":{"height":4.4},"width":31,"height":11,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-1.png","element":"img","alt":" −","inline":true},{"text":"1, it is worth noting ","element":"span"},{"text":"the following 
lemma.","element":"span"}],[{"style":{"fontWeight":"bold"},"text":"Lemma 12. ","element":"span"},{"style":{"fontStyle":"italic"},"text":"Let ","element":"span"},{"style":{"height":13.2},"width":301.6,"height":33,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-2.png","element":"img","alt":" λmax ≥ λ0 > λ >","inline":true,"padRight":true},{"text":"0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"style":{"fontStyle":"italic"},"text":"d ","element":"span"},{"text":"= ","element":"span"},{"style":{"height":24.43},"width":131,"height":61.06,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-3.png","element":"img","alt":"m(λ0−λ)r∥P¯x∗∥2","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"and assume ","element":"span"},{"style":{"height":17.09},"width":51.61,"height":42.74,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-4.png","element":"img","alt":" θ∗λ0","inline":true,"padRight":true},{"style":{"fontStyle":"italic"},"text":"is known. Then ","element":"span"},{"style":{"height":11.6},"width":58.81,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-5.png","element":"img","alt":" d ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1]","element":"span"},{"style":{"fontStyle":"italic"},"text":".","element":"span"}],[{"style":{"fontStyle":"italic"},"text":"Proof. ","element":"span"},{"text":"Let ¯","element":"span"},{"style":{"fontWeight":"bold"},"text":"x ","element":"span"},{"text":":= ","element":"span"},{"style":{"height":16},"width":136.94,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-6.png","element":"img","alt":" −ξ(−¯x∗","inline":true},{"text":"). 
Clearly, we can see that","element":"span"}],[{"style":{"width":"99%"},"width":1869,"height":352,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-7.png","element":"img"}],[{"text":"Moreover, since ","element":"span"},{"style":{"height":13.19},"width":117.28,"height":32.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-8.png","element":"img","alt":" λ0 > λ","inline":true},{"text":", it is easy to see that ","element":"span"},{"style":{"fontStyle":"italic"},"text":"d > ","element":"span"},{"text":"0, which completes the proof.","element":"span"}],[{"text":"We prove Theorem 8 for the case in which ","element":"span"},{"style":{"height":24.84},"width":197.66,"height":62.1,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-9.png","element":"img","alt":"⟨P¯x,P¯x∗⟩∥P¯x∥2∥P¯x∗∥2","inline":true,"padRight":true},{"text":"= ","element":"span"},{"style":{"height":4.4},"width":31,"height":11,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-10.png","element":"img","alt":" −","inline":true},{"text":"1.","element":"span"}],[{"id":"id-75","style":{"width":"99%"},"width":1870,"height":174,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-11.png","element":"img"}],[{"text":"As shown by part b) of Lemma 7, the dual problem is given by ","element":"span"},{"href":"#id-73","text":"(UBD","element":"a"},{"style":{"height":0},"width":23.18,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-12.png","element":"img","alt":"′′","inline":true},{"text":"). Therefore, to find","element":"span"}],[{"id":"id-74","style":{"width":"99%"},"width":1869,"height":434,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-13.png","element":"img"}],[{"text":"Let us consider the problem in ","element":"span"},{"href":"#id-74","text":"(76)","element":"a"},{"text":". 
From problem ","element":"span"},{"href":"#id-73","text":"(UBD","element":"a"},{"style":{"height":0},"width":23.18,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-14.png","element":"img","alt":"′′","inline":true},{"text":"), we observe that","element":"span"}],[{"style":{"width":"65%"},"width":1230,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-15.png","element":"img"}],[{"text":"By noting Eq. ","element":"span"},{"href":"#id-75","text":"(74)","element":"a"},{"text":", we have","element":"span"}],[{"style":{"width":"86%"},"width":1622,"height":87,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-16.png","element":"img"}],[{"text":"If ","element":"span"},{"style":{"height":9.19},"width":38.81,"height":22.98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-17.png","element":"img","alt":" u1","inline":true,"padRight":true},{"text":"is fixed, it is easy to see that","element":"span"}],[{"style":{"width":"76%"},"width":1434,"height":96,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-18.png","element":"img"}],[{"text":"and","element":"span"}],[{"style":{"width":"73%"},"width":1378,"height":159,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/24-19.png","element":"img"}],[{"style":{"width":"68%"},"width":1284,"height":205,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/25-0.png","element":"img"}],[{"text":"Otherwise, if ","element":"span"},{"style":{"height":11.6},"width":58.81,"height":29,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/25-1.png","element":"img","alt":" d ∈","inline":true,"padRight":true},{"text":"(0","element":"span"},{"style":{"fontStyle":"italic"},"text":", ","element":"span"},{"text":"1), it is easy to see 
that","element":"span"}],[{"id":"id-76","style":{"width":"99%"},"width":1867,"height":201,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/25-2.png","element":"img"}],[{"text":"Moreover, we observe that","element":"span"}],[{"style":{"width":"55%"},"width":1031,"height":45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/25-3.png","element":"img"}],[{"text":"Therefore, Eq. ","element":"span"},{"href":"#id-76","text":"(82) ","element":"a"},{"text":"can be simplified as:","element":"span"}],[{"style":{"width":"88%"},"width":1651,"height":95,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/25-4.png","element":"img"}],[{"text":"By Eq. ","element":"span"},{"href":"#id-74","text":"(77)","element":"a"},{"text":", we can see that","element":"span"}],[{"id":"id-72","style":{"width":"99%"},"width":1871,"height":364,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/25-5.png","element":"img"}],[{"text":"where","element":"span"}],[{"id":"id-71","style":{"width":"83%"},"width":1556,"height":292,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/25-6.png","element":"img"}],[{"text":"In fact, if we plug Eq. ","element":"span"},{"href":"#id-75","text":"(74) ","element":"a"},{"text":"into Eq. ","element":"span"},{"href":"#id-71","text":"(86)","element":"a"},{"text":", we have","element":"span"}],[{"id":"id-77","style":{"width":"99%"},"width":1868,"height":376,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1307.4145/images/25-7.png","element":"img"}],[{"text":"Clearly, Eq. ","element":"span"},{"href":"#id-72","text":"(84) ","element":"a"},{"text":"and Eq. ","element":"span"},{"href":"#id-77","text":"(87) ","element":"a"},{"text":"give the same result, which completes the proof.","element":"span"}]]}],"_version":"3.3.2"},"paperNode":"$28:props:children:props:children:0:props:product"}]]