36:[["$","audio",null,{"id":"tts"}],["$","$L3b",null,{"paperID":"1404.6000","publisher":"arxiv","paperJSON":{"title":"Robust and computationally feasible community detection in the presence of arbitrary outlier nodes","paperID":"1404.6000","avgLineHeight":12.84,"imgScale":4,"sections":[{"heading":"Abstract","paragraphs":[[{"text":"ROBUST AND COMPUTATIONALLY FEASIBLE COMMUNITY DETECTION IN THE PRESENCE OF ARBITRARY OUTLIER NODES","element":"span"},{"style":{"height":8.4},"width":20,"height":21,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/0-0.png","element":"img","alt":"1","inline":true}],[{"id":"id-0","style":{"width":"88%"},"width":1267,"height":1413,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/0-1.png","element":"img"}],[{"text":"Received April 2014; revised November 2014. ","element":"span"},{"style":{"height":6.4},"width":15,"height":16,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/0-2.png","element":"img","alt":"1","inline":true},{"text":"Supported in part by NSF Grants DMS-12-08982 and DMS-14-03708, NIH Grant R01 CA127334-05, and the Wharton Dean’s Fund for Post-Doctoral Research.","element":"span"}],[{"href":"http://www.ams.org/msc/","text":"AMS 2000 subject classifications. ","element":"a"},{"text":"62H30, 91C20. ","element":"span"},{"text":"Key words and phrases. ","element":"span"},{"text":"Robust community detection, SDP relaxation, dual certificate, ","element":"span"},{"text":"k","element":"span"},{"text":"-means clustering.","element":"span"}],[{"style":{"width":"102%"},"width":1465,"height":214,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/0-3.png","element":"img"}]]},{"heading":"$3c","paragraphs":[]},{"heading":"min1≤l≤r |φ−1(l)|, where |S| denotes the cardinality of the set S. 
Then the diﬃculty of the community detection problem is determined by the tuple","paragraphs":[[{"style":{"width":"23%"},"width":341,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/2-0.png","element":"img"}]]},{"heading":"$3d","paragraphs":[]},{"heading":"$3e","paragraphs":[[{"text":"Does there exist a computationally fast community detection method that is robust to a portion of arbitrary outlier nodes with theoretical guarantees?","element":"span"}]]},{"heading":"Our answer is aﬃrmative, and we will introduce our model, methodology, numerical results and theoretical guarantees with rigorous proofs in this paper. We begin by formalizing the GSBM which allows for a small portion of arbitrary nodes.","paragraphs":[[{"style":{"width":"96%"},"width":1384,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/3-0.png","element":"img"}]]},{"heading":"for community detection which covers a range of settings in practice where the usual SBM is not suitable. More speciﬁcally, we assume the undirected graph G = (V,E) has N := n + m nodes, among which there are n “inliers” obeying the SBM described above, while the other m nodes are “outliers” which are connected with the other nodes in an arbitrary way. We refer to","paragraphs":[[{"text":"this model as ","element":"span"},{"text":"generalized stochastic block model ","element":"span"},{"text":"(","element":"span"},{"text":"GSBM ","element":"span"},{"text":"). Denote ","element":"span"},{"text":"V ","element":"span"},{"text":"= [","element":"span"},{"text":"N","element":"span"},{"text":"] =","element":"span"}]]},{"heading":"I ∪O, where I is the set of indices of the inliers, while O is the set of indices of outliers. Each inlier node i ∈ I is assigned a label φ(i) ∈ {1,...,r}, while all outliers are simply labeled φ(i) = r + 1. For any two nodes i,j ∈ I, P((i,j) ∈ E) = Bφ(i)φ(j), and moreover we assume the event {(i,j) ∈ E}, i < j ∈ I are independent. 
The r×r symmetric connectivity matrix B represents only the connection probabilities among the inlier nodes. The connectivity between the outliers and the inliers and the connectivity among the outliers themselves are arbitrary. The only restriction on the connectivity of the outliers is that there are no self-loops.","paragraphs":[]},{"heading":"The GSBM can be equivalently expressed in terms of its adjacency matrix A. To be specific, define","paragraphs":[[{"style":{"width":"29%"},"width":427,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/4-0.png","element":"img"}]]},{"heading":"... ... ... ...","paragraphs":[]},{"heading":"$3f","paragraphs":[]},{"heading":"$40","paragraphs":[]},{"heading":"$41","paragraphs":[[{"style":{"width":"89%"},"width":1280,"height":944,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/7-0.png","element":"img"}],[{"text":"Fig. 1. ","element":"figcaption","subtype":"caption"},{"text":"The upper left panel illustrates the adjacency matrix of ","element":"figcaption","subtype":"caption"},{"text":"1000 ","element":"figcaption","subtype":"caption"},{"text":"nodes satisfying the ordinary SBM. The upper right panel is obtained by permuting the adjacency matrix such that nodes ","element":"figcaption","subtype":"caption"},{"text":"1 ","element":"figcaption","subtype":"caption"},{"text":"to ","element":"figcaption","subtype":"caption"},{"text":"500 ","element":"figcaption","subtype":"caption"},{"text":"belong to the same cluster while the remaining ones constitute another cluster. The lower left panel plots the eigenvectors of the graph Laplacian corresponding to the top ","element":"figcaption","subtype":"caption"},{"text":"2 ","element":"figcaption","subtype":"caption"},{"text":"eigenvalues in absolute value (red for the first and black for the second), while those for the adjacency matrix are plotted in the lower right panel. 
In both cases, these two eigenvectors are capable of discriminating between the two communities.","element":"figcaption","subtype":"caption"}]]},{"heading":"$42","paragraphs":[[{"style":{"width":"89%"},"width":1280,"height":1041,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/8-0.png","element":"img"}],[{"text":"Fig. 2. ","element":"figcaption","subtype":"caption"},{"text":"The upper left panel illustrates the adjacency matrix of ","element":"figcaption","subtype":"caption"},{"text":"1030 ","element":"figcaption","subtype":"caption"},{"text":"nodes satisfying the GSBM with two major clusters and ","element":"figcaption","subtype":"caption"},{"text":"30 ","element":"figcaption","subtype":"caption"},{"text":"outliers. The upper right panel is obtained by permuting the nodes such that nodes belonging to the same group are consecutive. The lower left panel plots the eigenvectors of the graph Laplacian corresponding to the top ","element":"figcaption","subtype":"caption"},{"text":"3 ","element":"figcaption","subtype":"caption"},{"text":"eigenvalues in absolute value (red for the first, black for the second and green for the third), while those for the adjacency matrix are plotted in the lower right panel. Ordinary spectral clustering with ","element":"figcaption","subtype":"caption"},{"text":"r ","element":"figcaption","subtype":"caption"},{"text":"= 2 ","element":"figcaption","subtype":"caption"},{"text":"or ","element":"figcaption","subtype":"caption"},{"text":"r ","element":"figcaption","subtype":"caption"},{"text":"= 3 ","element":"figcaption","subtype":"caption"},{"text":"is ineffective or even powerless on this data set since the top three eigenvectors cannot clearly discriminate between the two main communities.","element":"figcaption","subtype":"caption"}]]},{"heading":"ters, but it is not clear whether penalized spectral clustering applied to the graph Laplacian can diminish the influence of other types of outliers. 
Another method to improve standard spectral clustering methods is to detect outlier nodes based on the first several eigenvectors. However, it is not clear whether there exists an approach which can uniformly detect all kinds of outliers with a theoretical guarantee. In order to find in one shot the major clusters among the inlier nodes, we introduce in Section 2.1 a convex optimization method as well as a detailed algorithm which is implementable. It will be shown in Section 3 that the proposed procedure is robust against a small portion of arbitrary outliers with theoretical guarantees. 2.1. Convex optimization. In this section, we will choose the method of semidefinite programming (SDP) to fit the GSBM, followed by a k-means","paragraphs":[]},{"heading":"$43","paragraphs":[]},{"heading":"[Aij(Xij log p + (1 − Xij)log q) + (1 − Aij)(Xij log(1 − p) + (1 − Xij)log(1 − q))]. For any fixed p and q, given A, we would like to choose an appropriate X to maximize ℓ(A|X). If we let λ = [log(1 − q) − log(1 − p)] / [log p − log q + log(1 − q) − log(1 − p)] (2.1), then, since the diagonal entries of A are all equal to 0, the maximization is equivalent to","paragraphs":[[{"style":{"width":"49%"},"width":709,"height":62,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/9-0.png","element":"img"}]]},{"heading":"where JN is the N × N matrix with all entries equal to 1. Now let us figure out the constraint on X. By the SBM, it is easy to check that X must have the following form:","paragraphs":[[{"style":{"width":"20%"},"width":291,"height":50,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/10-0.png","element":"img"}]]},{"heading":"...","paragraphs":[]},{"heading":"P⊺, (2.2) where P is some unknown permutation matrix, while Js is the s × s matrix with all entries equal to 1. Solving optimization (2.2) under such a constraint is computationally infeasible, so we seek a relaxed form. Here we notice there are three major features of X. 
First, it is positive semidefinite; second, all its entries are either 0 or 1; third, it has rank r, which is relatively low. If we convexify the second (integer) constraint and drop the third requirement, the relaxed maximum likelihood method becomes","paragraphs":[[{"style":{"width":"54%"},"width":784,"height":83,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/10-1.png","element":"img"}]]},{"heading":"subject to �X ⪰ 0, 0 ≤ �Xij ≤ 1 for 1 ≤ i,j ≤ N. The above optimization method is different from that in Chen, Sanghavi and Xu (2012), where the relaxation is based on the observation that X is of low rank, and hence a nuclear norm penalty is added to the original objective function. In contrast, our convex relaxation is derived from the observation that X is both low-rank and positive semidefinite, and consequently we impose the positive semidefinite cone constraint. Now let us come back to robust community detection under the GSBM. To control the possible outliers as formalized in the GSBM, and for the convenience of theoretical analysis, we add a term to the objective function that penalizes the trace: min ⟨�X, αIN − (1 − λ)A + λ(JN − IN − A)⟩ subject to �X ⪰ 0, 0 ≤ �Xij ≤ 1 for 1 ≤ i,j ≤ N, which is equivalent to min ⟨�X, E⟩ subject to �X ⪰ 0, 0 ≤ �Xij ≤ 1 for 1 ≤ i,j ≤ N, (2.3) where E := αIN − (1 − λ)A + λ(JN − IN − A). (2.4)","paragraphs":[]},{"heading":"$44","paragraphs":[]},{"heading":"review paper on the alternating direction method of multipliers (ADMM) by Boyd et al. (2010). 
Notice that (2.3) can be rewritten as min","paragraphs":[[{"style":{"width":"3%"},"width":54,"height":17,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/12-0.png","element":"img"}]]},{"heading":"subject to Y = Z, where the indicator function ι(a ∈ A) is defined as","paragraphs":[[{"style":{"width":"37%"},"width":531,"height":107,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/12-1.png","element":"img"}]]},{"heading":"By this definition, ι(a ∈ A) is a convex function if and only if A is a convex set. Define the augmented Lagrangian of this optimization problem as Lρ(Y,Z;Λ) := ι(Y ⪰ 0) + ι(0 ≤ Z ≤ JN) + ⟨Y,E⟩ + (ρ/2)∥Y − Z + Λ∥_F^2. If both Λ and Z are fixed and we aim to minimize Lρ(Y,Z;Λ) with respect to Y, this is equivalent to minimizing ι(Y ⪰ 0) + (ρ/2)∥Y − (Z − Λ − E/ρ)∥_F^2.","paragraphs":[]},{"heading":"For any symmetric matrix X whose eigenvalue decomposition is VΣV⊺, define X+ := VΣ+V⊺, where Σ+ is obtained from Σ by replacing its negative diagonal entries with 0. Then the solution to the above minimization has an explicit form argmin","paragraphs":[]},{"heading":"Lρ(Y,Z;Λ) =","paragraphs":[[{"style":{"width":"45%"},"width":657,"height":25,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/12-2.png","element":"img"}]]},{"heading":"Remark 2.2. This step dominates the computational complexity of each ADMM iteration. In fact, an exact implementation of this subproblem requires a full SVD of Z − Λ − E/ρ, whose computational complexity is O(N^3). When N is as large as hundreds of thousands, the full SVD does not scale. An open question is how to facilitate the implementation, or whether there exists a surrogate that is computationally inexpensive. A possible remedy is a low-rank iterative method, in which the full SVD in each iteration of ADMM is replaced by a partial SVD where only the leading eigenvalues and eigenvectors are computed. 
Although this type of method may get stuck in local minimizers, given that the SDP step can be viewed as preprocessing before k-means clustering, such a low-rank iterative method might be helpful. We leave this large-scale computing problem as a future research project.","paragraphs":[]},{"heading":"On the other hand, if both Λ and Y are fixed, minimizing Lρ(Y,Z;Λ) with respect to Z is equivalent to minimizing","paragraphs":[[{"style":{"width":"44%"},"width":636,"height":79,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/13-0.png","element":"img"}]]},{"heading":"Again, we have a closed-form solution argmin","paragraphs":[]},{"heading":"$45","paragraphs":[[{"style":{"width":"100%"},"width":1436,"height":747,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/13-1.png","element":"img"}]]},{"heading":"The following theorem provides an explicit condition on the parameters n, m, p−, q+ and nmin, as well as the tuning parameters (α,λ), under which the solution to (2.3) is capable of unveiling the underlying group structures among the inliers in the presence of a portion of outlier confounders.","paragraphs":[[{"id":"id-1","text":"Theorem 3.1. ","element":"span"},{"text":"Let ","element":"span"},{"text":"A ","element":"span"},{"text":"be the adjacency matrix of the semi-random graph under the GSBM, as defined in ","element":"span"},{"text":"(","element":"span"},{"text":"1.3","element":"span"},{"text":")","element":"span"},{"text":". 
Let ","element":"span"},{"style":{"height":12},"width":38,"height":30,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/14-0.png","element":"img","alt":"�X","inline":true,"padRight":true},{"text":"be a solution to the semidefinite program ","element":"span"},{"text":"(","element":"span"},{"text":"2.3","element":"span"},{"text":") ","element":"span"},{"text":"and the density gap ","element":"span"},{"style":{"height":12.8},"width":20,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/14-1.png","element":"img","alt":" δ","inline":true,"padRight":true},{"text":"be defined as in ","element":"span"},{"text":"(","element":"span"},{"text":"1.2","element":"span"},{"text":")","element":"span"},{"text":", and the minimum within-group density ","element":"span"},{"style":{"height":13.54},"width":48.08,"height":33.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/14-2.png","element":"img","alt":" p− ","inline":true,"padRight":true},{"text":"and the maximum cross-group density ","element":"span"},{"style":{"height":18.34},"width":248.2,"height":45.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/14-3.png","element":"img","alt":" q+ be defined","inline":true,"padRight":true},{"text":"as in ","element":"span"},{"text":"(","element":"span"},{"text":"1.1","element":"span"},{"text":")","element":"span"},{"text":". As defined in Section ","element":"span"},{"href":"#id-0","text":"1","element":"a"},{"text":", the integer ","element":"span"},{"text":"n ","element":"span"},{"text":"denotes the number of inlier nodes, ","element":"span"},{"text":"m ","element":"span"},{"text":"denotes the number of outlier nodes and ","element":"span"},{"style":{"height":10.69},"width":83.32,"height":26.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/14-4.png","element":"img","alt":" nmin","inline":true,"padRight":true},{"text":"denotes the minimum community size among the inliers. 
Suppose that ","element":"span"},{"style":{"height":24.93},"width":317.2,"height":62.32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/14-5.png","element":"img","alt":" p− ≥ C log n / nmin, α ≥","inline":true,"padRight":true},{"text":"3","element":"span"},{"text":"m ","element":"span"},{"text":"and","element":"span"}]]},{"heading":"p− log n","paragraphs":[]},{"heading":"+","paragraphs":[]},{"heading":"+","paragraphs":[]},{"heading":"+","paragraphs":[]},{"heading":"(α − 2m)nmin","paragraphs":[]},{"heading":"(3.1)","paragraphs":[[{"style":{"width":"72%"},"width":1034,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/14-6.png","element":"img"}],[{"text":"for some sufficiently large numerical constant ","element":"span"},{"text":"C","element":"span"},{"text":", and the tuning parameter","element":"span"}],[{"style":{"width":"100%"},"width":1435,"height":752,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/14-7.png","element":"img"}]]},{"heading":"Theorem 3.1 guarantees that any solution �X to (2.3) satisfies �Xjk = 1 for φ(j) = φ(k) ≤ r, and �Xjk = 0 for φ(j) ≠ φ(k) and φ(j) ≤ r, φ(k) ≤ r. In other words, for each pair of inlier nodes j and k, whether they belong to the same group is determined solely by whether �Xjk equals 1 or 0. It is noteworthy that condition (3.2) is similar to the tuning parameter condition imposed in Chen, Sanghavi and Xu (2012). To interpret condition (3.1), it is helpful to consider two examples. First, let us consider the very sparse case where p− ≃ q+ ≃ δ ≃ O(log n / n), nmin ≃ O(n) and hence r ≃ O(1). This condition implies that our procedure works even for a graph whose average degree of inlier nodes is on the order of","paragraphs":[]},{"heading":"O(log n). 
This is consistent with the best-known result in the literature of community detection without outliers by spectral clustering based on the adjacency matrices or graph Laplacians [see Lei and Rinaldo (2015)], although the log n barrier could be resolved by more sophisticated nonbacktracking matrix methods; see Krzakala et al. (2013). In this case, condition (3.1) becomes","paragraphs":[]},{"heading":"�log n","paragraphs":[[{"style":{"width":"51%"},"width":741,"height":53,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/15-0.png","element":"img"}]]},{"heading":"Then, by letting α = log N, m = log n outliers are allowed. In the second example, we assume δ ≃ p− ≃ q+ ≃ O(1), and the number of clusters r grows with n. As a specific example, we let r ≃ n^(1/4). Moreover, we assume nmin ≃ n^(3/4). Then condition (3.1) becomes","paragraphs":[[{"style":{"width":"88%"},"width":1263,"height":80,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/15-1.png","element":"img"}]]},{"heading":"Then, by letting α = N^(3/4), m = O(n^(1/2−ε)) outliers are allowed for any ε > 0. A prominent feature of Theorem 3.1 is its consistency with the state-of-the-art community detection under the ordinary SBM in the literature. Assume there is no outlier node, that is, m = 0, and we simply let α = O(1) or just α = 0. Then condition (3.1) becomes","paragraphs":[]},{"heading":"p− log n","paragraphs":[]},{"heading":"+","paragraphs":[[{"style":{"width":"35%"},"width":510,"height":31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/15-2.png","element":"img"}]]},{"heading":"If the number of clusters is fixed, that is, r = O(1), we also assume the size of the smallest community nmin = O(n). 
As mentioned above, this condition is guaranteed by letting the minimum within-group density p− be","paragraphs":[[{"text":"as low as ","element":"span"},{"style":{"height":22.62},"width":126.12,"height":56.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/15-3.png","element":"img","alt":" O(log n / n ","inline":true,"padRight":true},{"text":") and the density gap ","element":"span"},{"style":{"height":22.62},"width":205.32,"height":56.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/15-4.png","element":"img","alt":" δ = O(log n / n ","inline":true,"padRight":true},{"text":"). In another example","element":"span"}]]},{"heading":"where p− = O(1), q+ = O(1) and δ = O(1), condition (3.1) is equivalent to nmin ≥ O(√nlogn). By modifying Lemma 6.7 as discussed in Section 6, this condition can be relaxed to nmin ≥ O(√n). This is consistent with the state-of-the-art results in the community detection literature on spectral clustering [see, e.g., Chaudhuri, Chung and Tsiatas (2012)] and planted partition [see, e.g., Giesen and Mitsche (2005), Shamir and Tsur (2007), Oymak and Hassibi (2011), Ames (2014), Chen, Sanghavi and Xu (2012)], where the within-group densities are usually assumed to be equal, as are the cross-group densities. The O(√n) barrier on the smallest cluster size is well known in the literature on planted clique problems; see Deshpande and Montanari (2015) and the references therein. Remark 3.1. The proof of Theorem 3.1 is involved, and the details are given in Section 6. It is helpful to understand the intuition behind the proof.","paragraphs":[]},{"heading":"$46","paragraphs":[]},{"heading":"... ...","paragraphs":[[{"style":{"width":"54%"},"width":784,"height":129,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/16-0.png","element":"img"}]]},{"heading":"Moreover, define yi = xi/∥xi∥2. Then all yi’s belong to the set of N-dimensional vectors with two-norm 1 and nonnegative coordinates. 
Notice that if xi = 0, we then deﬁne yi as an arbitrary nonnegative vector with norm 1. Then, for any inlier indices i,j ∈ I and φ(i) ̸= φ(j), we have","paragraphs":[[{"style":{"width":"45%"},"width":658,"height":96,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/16-1.png","element":"img"}]]},{"heading":"and for any i,j ∈ I and φ(i) = φ(j) = k, we have","paragraphs":[[{"style":{"width":"69%"},"width":1002,"height":98,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/16-2.png","element":"img"}]]},{"heading":"Moreover, for any yi and yj, since both of them are nonnegative, we have","paragraphs":[[{"style":{"width":"35%"},"width":513,"height":53,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/16-3.png","element":"img"}]]},{"heading":"By deﬁnition, the solution to the k-means applied to {y1,...,yN} is argmin","paragraphs":[]},{"heading":"∥yj − µk∥2,(3.4)","paragraphs":[[{"style":{"width":"21%"},"width":304,"height":25,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/16-4.png","element":"img"}]]},{"heading":"where S = {S1,...,Sr} is all r nonoverlapping partitions of [N]. It is obvious","paragraphs":[[{"style":{"width":"99%"},"width":1434,"height":43,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/17-0.png","element":"img"}]]},{"heading":"choose ˜µk as any vector yi belonging to the kth community, that is, φ(i) = k. Then there holds min","paragraphs":[[{"style":{"width":"86%"},"width":1244,"height":95,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/17-1.png","element":"img"}]]},{"heading":"∥yj − ˜µr∥2(3.5)","paragraphs":[[{"style":{"width":"40%"},"width":575,"height":62,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/17-2.png","element":"img"}]]},{"heading":"2m","paragraphs":[]},{"heading":"+ 2m = 2mr + 2m. Suppose the solution to the k-means clustering is �S1,..., �Sr and ˆµk =1 | �Sk| �yj∈ �Sk yj. For each j ∈ �Sk, deﬁne ˆφ(j) := k. 
Now we show that if m <","paragraphs":[[{"style":{"height":19.74},"width":491.36,"height":49.36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/17-3.png","element":"img","alt":"2r+4, each Di, i = 1,...,r","inline":true,"padRight":true},{"text":"must account for more than 50 percent in some","element":"span"}]]},{"heading":"cluster �Sk. Assume this is not true. Then there is a Di being minority in each �Sk, and hence for each yaj ∈ Di, there exists a ybj /∈ Di, but ˆφ(yaj) = ˆφ(ybj). Moreover, all these 2li indices are distinct. This implies","paragraphs":[[{"style":{"width":"0%"},"width":6,"height":8,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/17-4.png","element":"img"}]]},{"heading":"1","paragraphs":[[{"style":{"width":"99%"},"width":1434,"height":164,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/17-5.png","element":"img"}]]},{"heading":"$47","paragraphs":[]},{"heading":"Back to our robust community detection problem, if we assume the mis-classiﬁcation rate among the inliers is pn, we have","paragraphs":[[{"style":{"width":"75%"},"width":1087,"height":145,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/18-0.png","element":"img"}]]},{"heading":"Therefore, we have","paragraphs":[]},{"heading":"2mr + 2m(1 − (m/nmin))n ≤ (2r + 3)mn","paragraphs":[[{"text":"provided ","element":"span"},{"style":{"height":20.24},"width":178.28,"height":50.6,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/18-1.png","element":"img","alt":" m < nmin2r+4","inline":true},{"text":". In summary, we have proven the following theorem,","element":"span"}]]},{"heading":"which guarantees that the misclassiﬁcation rate among the inliers can be well controlled:","paragraphs":[[{"text":"Theorem 3.2. 
","element":"span"},{"text":"Suppose the assumptions in Theorem ","element":"span"},{"href":"#id-1","text":"3.1 ","element":"a"},{"text":"hold as well as ","element":"span"},{"style":{"height":20.24},"width":172.52,"height":50.6,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/18-2.png","element":"img","alt":"m < nmin2r+4","inline":true},{"text":". Then, with high probability, the misclassification rate among the ","element":"span"},{"text":"inlier nodes ","element":"span"},{"style":{"height":12.8},"width":84.4,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/18-3.png","element":"img","alt":" i ∈ I","inline":true,"padRight":true},{"text":"is less than or equal to ","element":"span"},{"style":{"height":24.19},"width":150.76,"height":60.48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/18-4.png","element":"img","alt":"(2r+3)mn .","inline":true}]]},{"heading":"Rigorously speaking, k-means minimization is computationally NP-hard, although in practice it is often easy and fast to implement with a number of repetitions. However, as shown in Kumar, Sabharwal and Sen [(2011), Theorem 4.9], there is a (1 + ε) approximate k-means clustering for (3.4) with computational time O(2(r/ε)O(1)N 2), which is polynomial time when r is a constant. Suppose { ˇS1,..., ˇSr} is a polynomial time approximate kmeans solution, such that","paragraphs":[]},{"heading":"∥yj − ˇµk∥2 ≤ (1 + ε) minS,µ1,...,µr","paragraphs":[]},{"heading":"≤ (1 + ε)(2mr + 2m). Then if within the inliers there are p misclassiﬁed nodes by { ˇS1,..., ˇSr}, similarly to the previous argument, we get pn ≤ (1+ε)(2r+3)mn . When r grows with N, one can also cluster the rows of �X in (3.3) based on the ℓ1 distance. If two inlier nodes belong to the same community, their corresponding rows in �X have ℓ1 distance less than m; on the other hand, if two inlier nodes belong to diﬀerent communities, their corresponding rows have ℓ1 distance greater than 2nmin. 
If the number of outliers is far less than the minimum size of the major clusters, for example, nmin > O(m^2), a pairwise comparison between the rows of �X can detect the inlier communities accurately even without knowledge of r. However, this method does not work as effectively as k-means clustering in numerical simulations. An","paragraphs":[]},{"heading":"$48","paragraphs":[[{"style":{"width":"73%"},"width":1058,"height":473,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/20-0.png","element":"img"}],[{"text":"Fig. 3. ","element":"figcaption","subtype":"caption"},{"text":"On the right is the plot of the solution to convex optimization (","element":"figcaption","subtype":"caption"},{"text":"2.3","element":"span","subtype":"caption"},{"text":"). Based on it, the community detection result obtained by the subsequent ","element":"figcaption","subtype":"caption"},{"text":"k","element":"figcaption","subtype":"caption"},{"text":"-means algorithm is shown on the left.","element":"figcaption","subtype":"caption"}]]},{"heading":"$49","paragraphs":[[{"style":{"width":"88%"},"width":1266,"height":749,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/21-0.png","element":"img"}],[{"text":"Fig. 4. ","element":"figcaption","subtype":"caption"},{"text":"The performance of convex community detection with different values of ","element":"figcaption","subtype":"caption"},{"style":{"height":10},"width":33.08,"height":25,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/21-1.png","element":"img","alt":" λ.","inline":true}]]},{"heading":"division and interaction between the liberal and conservative blogs prior to the 2004 presidential election. After ignoring the directions of the hyperlinks and selecting the largest connected component, we obtain 1222 nodes and 16,714 edges, which implies that the average degree is about 27. 
As indicated in Zhao, Levina and Zhu (2012), the distribution of the degrees is highly skewed to the right and has high variability. Also, the political memberships of all blogs are clearly studied and labeled manually in Adamic and","paragraphs":[[{"style":{"width":"89%"},"width":1288,"height":731,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/21-2.png","element":"img"}],[{"text":"Fig. 5. ","element":"figcaption","subtype":"caption"},{"text":"The solutions of (","element":"figcaption","subtype":"caption"},{"text":"2.3","element":"span","subtype":"caption"},{"text":") with different values of ","element":"figcaption","subtype":"caption"},{"text":"p","element":"figcaption","subtype":"caption"},{"text":".","element":"figcaption","subtype":"caption"}],[{"style":{"width":"89%"},"width":1283,"height":1181,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/22-0.png","element":"img"}],[{"text":"Fig. 6. ","element":"figcaption","subtype":"caption"},{"text":"Political blogs data of two clusters of conservatives and liberals, along with the ","element":"figcaption","subtype":"caption"},{"text":"performance of convex optimization.","element":"figcaption","subtype":"caption"}]]},{"heading":"$4a","paragraphs":[]},{"heading":"optimization is (2.3) with","paragraphs":[[{"style":{"width":"89%"},"width":1286,"height":53,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/23-0.png","element":"img"}]]},{"heading":"$4b","paragraphs":[[{"text":"is ","element":"span"},{"style":{"height":22.63},"width":315.44,"height":56.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/23-1.png","element":"img","alt":" O(log nn ), O(log n","inline":true},{"text":") outliers are allowed; when the edge density within the","element":"span"}]]},{"heading":"inliers is on the order of O(1), and the number of clusters grows with n, for example, O(n1/4), our method is robust against O(n1/2−ε) adversarial outliers. 
In the special case where there are no outlier nodes, our theoretical result is also consistent with the state-of-the-art results in the literature of computationally feasible community detection under the SBM.","paragraphs":[]},{"heading":"There are a number of possible extensions to the current results. The proposed community detection procedure as well as the theoretical guarantees depend on the assumption δ = p− − q+ > 0. Although this assumption is common in the literature of community detection, it is actually a strong assumption which sometimes does not hold in real-world network data applications. For example, suppose there are r = 3 clusters, and the connectivity matrix is","paragraphs":[[{"style":{"width":"27%"},"width":391,"height":163,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/24-0.png","element":"img"}]]},{"heading":"For each node, its associated within-group density is larger than its associated cross-group densities; however,","paragraphs":[[{"style":{"width":"29%"},"width":428,"height":66,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/24-1.png","element":"img"}]]},{"heading":"$4c","paragraphs":[[{"style":{"width":"13%"},"width":188,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/24-2.png","element":"img"}]]},{"heading":"6.1. Notation. Throughout the proofs we will use the following notation: the ℓ × ℓ identity matrix is denoted by Iℓ. An ℓ1 × ℓ2 matrix whose entries are all equal to 1 is denoted by J(ℓ1,ℓ2). For square matrices, we write Jℓ := J(ℓ,ℓ). An ℓ-dimensional vector whose coordinates are all equal to 1 is denoted by 1ℓ. If all coordinates of a vector v are nonnegative, we write v ≥ 0. When all coordinates of v are positive, we write v > 0. 
We use u ≥ v to denote","paragraphs":[]},{"heading":"$4d","paragraphs":[[{"text":"and ","element":"span"},{"style":{"height":12.8},"width":276.56,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/25-0.png","element":"img","alt":" P be two n×n","inline":true,"padRight":true},{"text":"Hermitian matrices. Suppose that ","element":"span"},{"text":"H","element":"span"},{"text":"+","element":"span"},{"text":"P","element":"span"},{"text":", ","element":"span"},{"text":"H ","element":"span"},{"text":"and ","element":"span"},{"text":"P ","element":"span"},{"text":"have real eigenvalues ","element":"span"},{"style":{"height":18.19},"width":836.36,"height":45.48,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/25-1.png","element":"img","alt":" {λi(H+P)}ni=1, {λi(H)}ni=1 and {λi(P)}ni=1","inline":true},{"text":", each arranged ","element":"span"},{"text":"in algebraically nonincreasing order. Then for ","element":"span"},{"text":"i ","element":"span"},{"text":"= 1","element":"span"},{"text":",...,n ","element":"span"},{"text":"we have","element":"span"}]]},{"heading":"λi(H) + λn(P) ≤ λi(H + P) ≤ λi(H) + λ1(P). Lemma 6.2 (Cauchy’s interlacing theorem [Horn and Johnson (2013),","paragraphs":[[{"text":"Theorem 4.3.28]). ","element":"span"},{"text":"Let ","element":"span"},{"style":{"height":12.8},"width":285.2,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/25-2.png","element":"img","alt":" H be an n × n","inline":true,"padRight":true},{"text":"Hermitian matrix and ","element":"span"},{"style":{"height":12.8},"width":223.16,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/25-3.png","element":"img","alt":" G its k × k","inline":true,"padRight":true},{"text":"principal submatrix. 
Suppose that ","element":"span"},{"text":"H ","element":"span"},{"text":"and ","element":"span"},{"text":"G ","element":"span"},{"text":"have real eigenvalues ","element":"span"},{"style":{"height":18},"width":210.44,"height":45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/25-4.png","element":"img","alt":" {λi(H)}ni=1","inline":true,"padRight":true},{"text":"and ","element":"span"},{"style":{"height":19.94},"width":210.44,"height":49.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/25-5.png","element":"img","alt":" {λi(G)}ki=1","inline":true},{"text":", each arranged in algebraically nonincreasing order. Then ","element":"span"},{"text":"for ","element":"span"},{"text":"j ","element":"span"},{"text":"= 1","element":"span"},{"text":",...,k ","element":"span"},{"text":"we have","element":"span"}],[{"style":{"width":"38%"},"width":551,"height":47,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/25-6.png","element":"img"}]]},{"heading":"Lemma 6.3 (Chernoﬀ’s inequality [Chernoﬀ (1981)]). Let X1,...,Xn be","paragraphs":[[{"style":{"width":"46%"},"width":660,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/25-7.png","element":"img"}]]},{"heading":"P(Xi = 1) = pi, P(Xi = 0) = 1 − pi.","paragraphs":[[{"style":{"width":"98%"},"width":1406,"height":222,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/25-8.png","element":"img"}]]},{"heading":"Finally, we consider the following problem: suppose that A = (aij)1≤i,j≤n is a random symmetric matrix, whose diagonal entries are all zeros, while aij,1 ≤ i < j ≤ n are independent zero-mean Bernoulli random variables obeying |aij| ≤ 1 and Var(aij) ≤ σ2. Can we prove that with high probability, ∥A∥ ≤ C(σ√nlogn + log n) for some numerical constant C? 
In the sequel, this upper bound is derived by applying the following matrix Bernstein inequality, which is an improvement of Ahlswede and Winter (2002): Lemma 6.4 [Tropp (2012), Theorem 6.1]. Consider a ﬁnite sequence","paragraphs":[[{"style":{"height":17.6},"width":102.16,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/26-0.png","element":"img","alt":"{Xk}","inline":true,"padRight":true},{"text":"of independent, random, self-adjoint matrices with dimension ","element":"span"},{"text":"d","element":"span"},{"text":". Assume that","element":"span"}],[{"style":{"width":"77%"},"width":1113,"height":505,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/26-1.png","element":"img"}],[{"style":{"height":18.29},"width":764.52,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/26-2.png","element":"img","alt":"Corollary 6.5. Let A = (aij)1≤i,j≤n","inline":true,"padRight":true},{"text":"be a symmetric random matrix whose diagonal entries are all zeros. Moreover, suppose ","element":"span"},{"style":{"height":17.09},"width":390.12,"height":42.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/26-3.png","element":"img","alt":" aij, 1 ≤ i < j ≤ n are","inline":true,"padRight":true},{"text":"independent zero-mean random variables satisfying ","element":"span"},{"style":{"height":18.29},"width":449.2,"height":45.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/26-4.png","element":"img","alt":" |aij| ≤ 1 and Var(aij) ≤","inline":true},{"style":{"height":15.13},"width":43.4,"height":37.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/26-5.png","element":"img","alt":"σ2","inline":true},{"text":". 
Then, with probability at least ","element":"span"},{"text":"1 ","element":"span"},{"style":{"height":18.85},"width":268.68,"height":47.12,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/26-6.png","element":"img","alt":" − c/n^4, we have","inline":true}]]},{"heading":"∥A∥ ≤ C0(σ√(n log n) + log n)","paragraphs":[[{"style":{"width":"52%"},"width":759,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/26-7.png","element":"img"}]]},{"heading":"Proof. For each pair (i,j) with 1 ≤ i < j ≤ n, let Xij be the matrix whose (i,j) and (j,i) entries are both aij, whereas all other entries are zero. Then we have A = ∑","paragraphs":[[{"style":{"width":"9%"},"width":140,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/26-8.png","element":"img"}]]},{"heading":"Moreover, one can easily verify that EXij = 0, ∥Xij∥ ≤ 1 and","paragraphs":[[{"style":{"width":"42%"},"width":603,"height":101,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/26-9.png","element":"img"}]]},{"heading":"Then, by applying Lemma 6.4, the proof is complete. □","paragraphs":[]},{"heading":"6.3. Supporting lemmas. Notice that optimization (2.3) is determined by the adjacency matrix A. Here we derive some properties of A and leave the detailed proofs to the supplemental article Cai and Li (2015). 
More precisely, we give some properties of the random matrix K, which is a principal submatrix of A; see (1.3).","paragraphs":[[{"style":{"width":"100%"},"width":1436,"height":97,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/27-0.png","element":"img"}]]},{"heading":"q+ log n","paragraphs":[]},{"heading":",(6.1)","paragraphs":[[{"style":{"width":"26%"},"width":383,"height":31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/27-1.png","element":"img"}],[{"text":"for some sufficiently large numerical constant ","element":"span"},{"text":"C","element":"span"},{"text":", then with probability at least ","element":"span"},{"text":"1 ","element":"span"},{"style":{"height":21.84},"width":1057.32,"height":54.6,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/27-2.png","element":"img","alt":" − 2/n − 2r/n^2, for all i = 1,...,r and 1 ≤ j < k ≤ r, we have","inline":true}]]},{"heading":"K_ii 1_{l_i} ≥ ((l_i − 1)B_ii − 2√((l_i − 1)B_ii log n)) 1_{l_i}, (6.2)","paragraphs":[]},{"heading":"K_jk 1_{l_k} ≤ (B_jk + δ/16) l_k 1_{l_j}, (6.3) K_jk^⊺ 1_{l_j} ≤ (B_jk + δ/16) l_j 1_{l_k}, (6.4)","paragraphs":[[{"style":{"width":"55%"},"width":794,"height":106,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/27-3.png","element":"img"}],[{"style":{"height":24.93},"width":701,"height":62.32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/27-4.png","element":"img","alt":"Lemma 6.7. Suppose p− ≥ C(log n / n_min)","inline":true},{"text":". 
With probability at least ","element":"span"},{"text":"1 ","element":"span"},{"style":{"height":24.23},"width":155.08,"height":60.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/27-5.png","element":"img","alt":" − cr/n_min^4,","inline":true,"padRight":true},{"text":"we have","element":"span"}],[{"style":{"width":"84%"},"width":1208,"height":123,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/27-6.png","element":"img"}]]},{"heading":"∥U∥ ≤ C0(√(n q+ log n) + log n), (6.7)","paragraphs":[[{"style":{"width":"81%"},"width":1170,"height":142,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/27-7.png","element":"img"}]]},{"heading":"U :=","paragraphs":[[{"style":{"width":"56%"},"width":808,"height":77,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/27-8.png","element":"img"}],[{"text":"whose diagonal blocks are all ","element":"span"},{"style":{"height":15.6},"width":447.16,"height":39,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/27-9.png","element":"img","alt":" 0’s. Here C, C0 and c","inline":true,"padRight":true},{"text":"are some numerical constants.","element":"span"}]]},{"heading":"It is worth noting that by applying a very recent result of Vu [(2014), Lemma 8], which improves on Füredi and Komlós (1981) and Vu (2007),","paragraphs":[]},{"heading":"we can prove ∥U∥ ≤ C0(√(n q+) + √(log n)). Condition (3.1) in Theorem 3.1 can then be relaxed to","paragraphs":[]},{"heading":"p− log n","paragraphs":[]},{"heading":"+","paragraphs":[]},{"heading":"+","paragraphs":[]},{"heading":"+","paragraphs":[]},{"heading":"(α − 2m)nmin","paragraphs":[]},{"heading":"The benefit is that when m = O(1), p− = O(1), q+ = O(1) and δ = O(1), nmin can be as small as O(√N) by letting α = √N. In particular, if there is no outlier node, that is, under the ordinary SBM, this is consistent with the state-of-the-art result in the literature of computationally feasible community detection. 6.4. 
Proof of Theorem 3.1. In this section, we will rigorously prove Theorem 3.1. First, to simplify the calculations, we can assume the permutation matrix P to be the identity matrix IN. This suggestion is formalized by the following lemma:","paragraphs":[[{"text":"Lemma 6.8. ","element":"span"},{"text":"If Theorem ","element":"span"},{"href":"#id-1","text":"3.1 ","element":"a"},{"text":"is true for ","element":"span"},{"style":{"height":14.69},"width":137.04,"height":36.72,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/28-0.png","element":"img","alt":" P = IN","inline":true},{"text":", it is also true for any permutation matrix ","element":"span"},{"text":"P","element":"span"},{"text":".","element":"span"}]]},{"heading":"$4e","paragraphs":[[{"style":{"width":"37%"},"width":534,"height":45,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/28-1.png","element":"img"}],[{"style":{"width":"80%"},"width":1162,"height":116,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/29-0.png","element":"img"}]]},{"heading":":=","paragraphs":[]},{"heading":"... ... ... 
...","paragraphs":[[{"style":{"width":"87%"},"width":1258,"height":136,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/29-1.png","element":"img"}]]},{"heading":"which is equivalent to defining Z̃i = λJ(li,m) − Zi, i = 1,...,r, (6.8)","paragraphs":[]},{"heading":"W̃ = (α − λ)Im + λJm − W. (6.9) The following lemma, the proof of which is given in the supplemental article Cai and Li (2015), guarantees the existence of r vectors x1,...,xr ∈ Rm, which will be employed to construct a candidate solution:","paragraphs":[[{"id":"id-2","style":{"width":"72%"},"width":1047,"height":111,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/29-2.png","element":"img"}]]},{"heading":"min","paragraphs":[[{"style":{"width":"83%"},"width":1200,"height":263,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/29-3.png","element":"img"}],[{"text":"exists uniquely. Moreover, denote the solutions by ","element":"span"},{"style":{"height":16.4},"width":483.8,"height":41,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/29-4.png","element":"img","alt":" x1,...,xr ∈ Rm, which by","inline":true,"padRight":true},{"text":"definition satisfy ","element":"span"},{"style":{"height":17.6},"width":193.36,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/29-5.png","element":"img","alt":" ∥xi∥∞ ≤ 1","inline":true},{"text":". 
Then there are nonnegative vectors ","element":"span"},{"style":{"height":16.03},"width":226.76,"height":40.08,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/29-6.png","element":"img","alt":" β1,...,βr ∈","inline":true},{"style":{"height":12.8},"width":355.32,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/29-7.png","element":"img","alt":"Rm and an m × m","inline":true,"padRight":true},{"text":"nonnegative diagonal matrix","element":"span"}]]},{"heading":"Ξ = diag(ξ1,...,ξm),","paragraphs":[[{"style":{"width":"11%"},"width":172,"height":32,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/29-8.png","element":"img"}]]},{"heading":"Wxi + �Z⊺i 1li = βi − Ξxi,(6.11)","paragraphs":[[{"style":{"width":"27%"},"width":389,"height":35,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/29-9.png","element":"img"}]]},{"heading":"= 0, j = 1,...,m(6.12)","paragraphs":[[{"style":{"width":"50%"},"width":723,"height":108,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/29-10.png","element":"img"}]]},{"heading":"⟨xi,βi⟩ = 0, i = 1,...,r.(6.13)","paragraphs":[[{"style":{"width":"40%"},"width":584,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/29-11.png","element":"img"}]]},{"heading":"x⊺j(�W + Ξ)xk ≤ m�ljlk.(6.14)","paragraphs":[[{"style":{"width":"76%"},"width":1101,"height":124,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/30-0.png","element":"img"}]]},{"heading":"xik.(6.15)","paragraphs":[[{"style":{"width":"75%"},"width":1089,"height":120,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/30-1.png","element":"img"}]]},{"heading":"0 ≤ βi ≤ (m + li − 1)1m.(6.16) Throughout the paper, we deﬁne V := [v1,...,vr] :=","paragraphs":[]},{"heading":"... ... ... 
...","paragraphs":[[{"style":{"width":"26%"},"width":386,"height":117,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/30-2.png","element":"img"}]]},{"heading":"and","paragraphs":[[{"style":{"width":"52%"},"width":755,"height":52,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/30-3.png","element":"img"}]]},{"heading":"... ... ... ...","paragraphs":[[{"style":{"width":"72%"},"width":1034,"height":128,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/30-4.png","element":"img"}]]},{"heading":"Since xi’s are feasible to optimization (6.10), we can easily see that X is feasible to optimization (2.3). We aim to prove that under mild technical conditions, X is actually a solution to optimization (2.3).","paragraphs":[[{"style":{"width":"96%"},"width":1385,"height":49,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/30-5.png","element":"img"}]]},{"heading":"propose a condition which guarantees that any solution �X to (2.3) must be in the form of (3.3) with P = IN. This suﬃcient condition is equivalent to constructing a matrix Λ satisfying a series of equalities and inequalities as indicated in the following lemma. We call it a dual certiﬁcate. In Section 6.4.3, we will show that with high probability, this dual certiﬁcate can be constructed in an explicit way.","paragraphs":[[{"style":{"height":16.43},"width":800.76,"height":41.08,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/30-6.png","element":"img","alt":"Lemma 6.10. Suppose Ξ and β1,...,βr","inline":true,"padRight":true},{"text":"are defined as in Lemma ","element":"span"},{"href":"#id-2","text":"6.9","element":"a"},{"text":". 
If there exist symmetric matrices ","element":"span"},{"style":{"height":20.22},"width":782.92,"height":50.56,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/30-7.png","element":"img","alt":" Λ ∈ RN×N, Ψjj ∈ Rlj×lj (1 ≤ j ≤ r) and","inline":true}],[{"style":{"width":"104%"},"width":1499,"height":199,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/31-0.png","element":"img"}]]},{"heading":"Λ =","paragraphs":[[{"style":{"width":"91%"},"width":1312,"height":124,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/31-1.png","element":"img"}]]},{"heading":"(6.17)","paragraphs":[[{"text":"satisfies ","element":"span"},{"style":{"height":17.68},"width":741.64,"height":44.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/31-2.png","element":"img","alt":" Ψii > 0, Φjk > 0, ΛV = 0 and Λ ⪰ 0","inline":true},{"text":", then any minimizer ","element":"span"},{"style":{"height":12.4},"width":94.52,"height":31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/31-3.png","element":"img","alt":"�X to","inline":true,"padRight":true},{"text":"(","element":"span"},{"text":"2.3","element":"span"},{"text":") must be of the form","element":"span"}],[{"style":{"width":"82%"},"width":1181,"height":102,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/31-4.png","element":"img"}]]},{"heading":"�X =","paragraphs":[[{"style":{"width":"90%"},"width":1296,"height":154,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/31-5.png","element":"img"}]]},{"heading":"An intuition behind the theorem and the rigorous proof are given in the supplemental article Cai and Li (2015). It is noteworthy that the condition on Λ is weaker if the number of clusters r gets smaller. The reason is that the equality condition is ΛV = 0. Obviously when r gets smaller, V has fewer columns, and hence the equality constraint becomes milder. 
We emphasize that the choices of Ψii and Φij are intended to fit the equality constraint on Λ, that is, ΛV = 0. To make sure Λ ⪰ 0, we need to first project Λ onto the orthogonal complement of V, and then show the projection is positive definite. This is based on the spectral norm bound as indicated in Lemma 6.7, which provides a concentration inequality for a random matrix. 6.4.3. Construction of dual certificate. It suffices to construct a matrix Λ in the form of (6.17) in Lemma 6.10, which satisfies ΛV = 0, Ψii > 0, Φjk > 0 and Λ ⪰ 0. The following lemma guarantees the existence of such Λ, and its proof is given in the supplemental article Cai and Li (2015).","paragraphs":[[{"style":{"height":24.74},"width":1386.28,"height":61.84,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/31-6.png","element":"img","alt":"Lemma 6.11. Suppose p− ≥ C(log n / n_min), q+ + δ/4 < λ < p− − δ/4 and α ≥ 3m.","inline":true,"padRight":true},{"text":"Moreover, assume","element":"span"}]]},{"heading":"p− log n","paragraphs":[]},{"heading":"+","paragraphs":[]},{"heading":"+","paragraphs":[]},{"heading":"+","paragraphs":[]},{"heading":"(α − 2m)nmin","paragraphs":[]},{"heading":"(6.18)","paragraphs":[[{"style":{"width":"78%"},"width":1130,"height":31,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/31-7.png","element":"img"}],[{"text":"for some sufficiently large numerical constant ","element":"span"},{"text":"C","element":"span"},{"text":". 
Then, with probability at least ","element":"span"},{"text":"1","element":"span"},{"style":{"height":26.83},"width":272.32,"height":67.08,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/31-8.png","element":"img","alt":"− 1/n − 2r/n^2 − cr/n_min^4 ","inline":true,"padRight":true},{"text":", there exist matrices ","element":"span"},{"style":{"height":17.68},"width":252.24,"height":44.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/31-9.png","element":"img","alt":" Ψii’s and Φjk","inline":true},{"text":"’s satisfying ","element":"span"},{"style":{"height":15.2},"width":154.6,"height":38,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/31-10.png","element":"img","alt":" Ψii > 0,","inline":true},{"style":{"height":17.28},"width":155.56,"height":43.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/31-11.png","element":"img","alt":"Φjk > 0","inline":true,"padRight":true},{"text":"and the matrix ","element":"span"},{"style":{"height":17.68},"width":945.64,"height":44.2,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/31-12.png","element":"img","alt":" Λ defined by Ψii’s and Φjk’s obeys ΛV = 0 and","inline":true},{"style":{"height":14.8},"width":125.8,"height":37,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/31-13.png","element":"img","alt":"Λ ⪰ 0.","inline":true}]]},{"heading":"SUPPLEMENTARY MATERIAL","paragraphs":[[{"text":"Supplemental materials to “Robust and computationally feasible community detection in the presence of arbitrary outlier nodes”","element":"span"}]]},{"heading":"(DOI: 10.1214/14-AOS1290SUPP; .pdf). We give in the supplement proofs","paragraphs":[[{"style":{"width":"57%"},"width":820,"height":40,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/32-0.png","element":"img"}]]},{"heading":"REFERENCES","paragraphs":[[{"text":"Adamic, A. ","element":"span"},{"text":"and ","element":"span"},{"text":"Glance, N. 
","element":"span"},{"text":"(2005). The political blogosphere and the 2004 US election: Divided they blog. In ","element":"span"},{"text":"Proceedings of the 3rd International Workshop on Link Discovery ","element":"span"},{"text":"36–43. ACM, New York.","element":"span"}],[{"text":"Ahlswede, R. ","element":"span"},{"text":"and ","element":"span"},{"text":"Winter, A. ","element":"span"},{"text":"(2002). Strong converse for identification via quantum channels. ","element":"span"},{"text":"IEEE Trans. Inform. Theory ","element":"span"},{"text":"48 ","element":"span"},{"text":"569–579. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=1889969","text":"MR1889969","element":"a"}],[{"text":"Airoldi, E.","element":"span"},{"text":", ","element":"span"},{"text":"Blei, M.","element":"span"},{"text":", ","element":"span"},{"text":"Fienberg, S. ","element":"span"},{"text":"and ","element":"span"},{"text":"Xing, E. ","element":"span"},{"text":"(2008). Mixed membership stochastic blockmodels. ","element":"span"},{"text":"J. Mach. Learn. Res. ","element":"span"},{"text":"9 ","element":"span"},{"text":"1981–2014.","element":"span"}],[{"text":"Ames, B. P. W. ","element":"span"},{"text":"(2014). Guaranteed clustering and biclustering via semidefinite programming. ","element":"span"},{"text":"Math. Program. ","element":"span"},{"text":"147 ","element":"span"},{"text":"429–465. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=3258531","text":"MR3258531","element":"a"}],[{"text":"Ames, B. P. W. ","element":"span"},{"text":"and ","element":"span"},{"text":"Vavasis, S. A. ","element":"span"},{"text":"(2014). Convex optimization for the planted ","element":"span"},{"text":"k","element":"span"},{"text":"-disjoint-clique problem. ","element":"span"},{"text":"Math. Program. ","element":"span"},{"text":"143 ","element":"span"},{"text":"299–337. 
","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=3152071","text":"MR3152071","element":"a"}],[{"text":"Amini, A. A.","element":"span"},{"text":", ","element":"span"},{"text":"Chen, A.","element":"span"},{"text":", ","element":"span"},{"text":"Bickel, P. J. ","element":"span"},{"text":"and ","element":"span"},{"text":"Levina, E. ","element":"span"},{"text":"(2013). Pseudo-likelihood methods for community detection in large sparse networks. ","element":"span"},{"text":"Ann. Statist. ","element":"span"},{"text":"41 ","element":"span"},{"text":"2097–2122. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=3127859","text":"MR3127859","element":"a"}],[{"text":"Balakrishnan, S.","element":"span"},{"text":", ","element":"span"},{"text":"Xu, M.","element":"span"},{"text":", ","element":"span"},{"text":"Krishnamurthy, A. ","element":"span"},{"text":"and ","element":"span"},{"text":"Singh, A. ","element":"span"},{"text":"(2011). Noise thresholds for spectral clustering (NIPS 2011). ","element":"span"},{"text":"Adv. Neural Inf. Process. Syst. ","element":"span"},{"text":"25 ","element":"span"},{"text":"954–962.","element":"span"}],[{"text":"Bhattacharyya, S. ","element":"span"},{"text":"and ","element":"span"},{"text":"Bickel, P. J. ","element":"span"},{"text":"(2014). Community detection in networks using graph distance. Available at ","element":"span"},{"href":"http://arxiv.org/abs/arXiv:1401.3915","text":"arXiv:1401.3915","element":"a"},{"text":".","element":"span"}],[{"text":"Bickel, P. J. ","element":"span"},{"text":"and ","element":"span"},{"text":"Chen, A. ","element":"span"},{"text":"(2009). A nonparametric view of network models and Newman–Girvan and other modularities. ","element":"span"},{"text":"Proc. Natl. Acad. Sci. 
USA ","element":"span"},{"text":"106 ","element":"span"},{"text":"21068–21073.","element":"span"}],[{"text":"Bickel, P.","element":"span"},{"text":", ","element":"span"},{"text":"Choi, D.","element":"span"},{"text":", ","element":"span"},{"text":"Chang, X. ","element":"span"},{"text":"and ","element":"span"},{"text":"Zhang, H. ","element":"span"},{"text":"(2013). Asymptotic normality of maximum likelihood and its variational approximation for stochastic blockmodels. ","element":"span"},{"text":"Ann. Statist. ","element":"span"},{"text":"41 ","element":"span"},{"text":"1922–1943. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=3127853","text":"MR3127853","element":"a"}],[{"text":"Boyd, S.","element":"span"},{"text":", ","element":"span"},{"text":"Parikh, N.","element":"span"},{"text":", ","element":"span"},{"text":"Chu, E.","element":"span"},{"text":", ","element":"span"},{"text":"Peleato, B. ","element":"span"},{"text":"and ","element":"span"},{"text":"Eckstein, J. ","element":"span"},{"text":"(2010). Distributed optimization and statistical learning via the alternating direction method of multipliers. ","element":"span"},{"text":"Found. Trends Mach. Learn. ","element":"span"},{"text":"3 ","element":"span"},{"text":"1–122.","element":"span"}],[{"style":{"height":14.4},"width":1136.04,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/32-1.png","element":"img","alt":"Cai, T. and Li, X. (2015). 
Supplement to “Robust and","inline":true,"padRight":true},{"text":"computationally feasible ","element":"span"},{"text":"community ","element":"span"},{"text":"detection ","element":"span"},{"text":"in ","element":"span"},{"text":"the ","element":"span"},{"text":"presence ","element":"span"},{"text":"of ","element":"span"},{"text":"arbitrary ","element":"span"},{"text":"outlier ","element":"span"},{"text":"nodes.” DOI:","element":"span"},{"href":"http://dx.doi.org/10.1214/14-AOS1290SUPP","text":"10.1214/14-AOS1290SUPP","element":"a"},{"text":".","element":"span"}],[{"text":"Candès, E. J.","element":"span"},{"text":", ","element":"span"},{"text":"Strohmer, T. ","element":"span"},{"text":"and ","element":"span"},{"text":"Voroninski, V. ","element":"span"},{"text":"(2013). PhaseLift: Exact and stable signal recovery from magnitude measurements via convex programming. ","element":"span"},{"text":"Comm. Pure Appl. Math. ","element":"span"},{"text":"66 ","element":"span"},{"text":"1241–1274. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=3069958","text":"MR3069958","element":"a"}],[{"text":"Candès, E. J.","element":"span"},{"text":", ","element":"span"},{"text":"Li, X.","element":"span"},{"text":", ","element":"span"},{"text":"Ma, Y. ","element":"span"},{"text":"and ","element":"span"},{"text":"Wright, J. ","element":"span"},{"text":"(2011). Robust principal component analysis? ","element":"span"},{"text":"J. ACM ","element":"span"},{"text":"58 ","element":"span"},{"text":"Art. 11, 37. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=2811000","text":"MR2811000","element":"a"}],[{"text":"Celisse, A.","element":"span"},{"text":", ","element":"span"},{"text":"Daudin, J.-J. ","element":"span"},{"text":"and ","element":"span"},{"text":"Pierre, L. ","element":"span"},{"text":"(2012). Consistency of maximum-likelihood and variational estimators in the stochastic block model. ","element":"span"},{"text":"Electron. J. Stat. 
","element":"span"},{"text":"6 ","element":"span"},{"text":"1847–1899. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=2988467","text":"MR2988467","element":"a"}],[{"text":"Chaudhuri, K.","element":"span"},{"text":", ","element":"span"},{"text":"Chung, F. ","element":"span"},{"text":"and ","element":"span"},{"text":"Tsiatas, A. ","element":"span"},{"text":"(2012). Spectral clustering of graphs with general degrees in the extended planted partition model. ","element":"span"},{"text":"J. Mach. Learn. Res. ","element":"span"},{"text":"23 ","element":"span"},{"text":"35.1– 35.23.","element":"span"}],[{"text":"Chen, Y.","element":"span"},{"text":", ","element":"span"},{"text":"Sanghavi, S. ","element":"span"},{"text":"and ","element":"span"},{"text":"Xu, H. ","element":"span"},{"text":"(2012). Clustering sparse graphs. ","element":"span"},{"text":"Adv. Neural Inf. Process. Syst. ","element":"span"},{"text":"25 ","element":"span"},{"text":"2213–2221.","element":"span"}],[{"text":"Chernoff, H. ","element":"span"},{"text":"(1981). A note on an inequality involving the normal distribution. ","element":"span"},{"text":"Ann. Probab. ","element":"span"},{"text":"9 ","element":"span"},{"text":"533–535. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=0614640","text":"MR0614640","element":"a"}],[{"text":"Clauset, A.","element":"span"},{"text":", ","element":"span"},{"text":"Newman, M. ","element":"span"},{"text":"and ","element":"span"},{"text":"Moore, C. ","element":"span"},{"text":"(2004). Finding community structure in very large networks. ","element":"span"},{"text":"Phys. Rev. E ","element":"span"},{"text":"70 ","element":"span"},{"text":"066111.","element":"span"}],[{"text":"Coja-Oghlan, A. ","element":"span"},{"text":"and ","element":"span"},{"text":"Lanka, A. ","element":"span"},{"text":"(2009/10). Finding planted partitions in random graphs with general degree distributions. ","element":"span"},{"text":"SIAM J. Discrete Math. 
","element":"span"},{"text":"23 ","element":"span"},{"text":"1682–1714. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=2570199","text":"MR2570199","element":"a"}],[{"style":{"height":13.95},"width":1006.52,"height":34.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/33-0.png","element":"img","alt":"Decelle, A., Krzakala, F., Moore, C. and Zdeborov´a, L.","inline":true,"padRight":true},{"text":"(2011). Asymptotic analysis of the stochastic block model for modular networks and its algorithmic applications. ","element":"span"},{"text":"Phys. Rev. E ","element":"span"},{"text":"84 ","element":"span"},{"text":"066106.","element":"span"}],[{"text":"Deshpande, Y. ","element":"span"},{"text":"and ","element":"span"},{"text":"Montanari, A. ","element":"span"},{"text":"(2015). Finding hidden cliques of size","element":"span"},{"style":{"height":17.6},"width":151.08,"height":44,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/33-1.png","element":"img","alt":"�N/e in","inline":true,"padRight":true},{"text":"nearly linear time. ","element":"span"},{"text":"Found. Comput. Math. ","element":"span"},{"text":"DOI:","element":"span"},{"href":"http://dx.doi.org/10.1007/s10208-014-9125-y","text":"10.1007/s10208-014-9125-y","element":"a"},{"text":". To appear.","element":"span"}],[{"text":"Fienberg, S. E. ","element":"span"},{"text":"(2010). Introduction to papers on the modeling and analysis of network data. ","element":"span"},{"text":"Ann. Appl. Stat. ","element":"span"},{"text":"4 ","element":"span"},{"text":"1–4. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=2758081","text":"MR2758081","element":"a"}],[{"text":"Fienberg, S. E. ","element":"span"},{"text":"(2012). A brief history of statistical models for network analysis and open challenges. ","element":"span"},{"text":"J. Comput. Graph. Statist. ","element":"span"},{"text":"21 ","element":"span"},{"text":"825–839. 
","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=3005799","text":"MR3005799","element":"a"}],[{"text":"Fishkind, D. E.","element":"span"},{"text":", ","element":"span"},{"text":"Sussman, D. L.","element":"span"},{"text":", ","element":"span"},{"text":"Tang, M.","element":"span"},{"text":", ","element":"span"},{"text":"Vogelstein, J. T. ","element":"span"},{"text":"and ","element":"span"},{"text":"Priebe, C. E. ","element":"span"},{"text":"(2013). Consistent adjacency-spectral partitioning for the stochastic block model when the model parameters are unknown. ","element":"span"},{"text":"SIAM J. Matrix Anal. Appl. ","element":"span"},{"text":"34 ","element":"span"},{"text":"23–39. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=3032990","text":"MR3032990","element":"a"}],[{"style":{"height":13.95},"width":448.76,"height":34.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/33-2.png","element":"img","alt":"Füredi, Z. and Komlós, J.","inline":true,"padRight":true},{"text":"(1981). The eigenvalues of random symmetric matrices. ","element":"span"},{"text":"Combinatorica ","element":"span"},{"text":"1 ","element":"span"},{"text":"233–241. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=0637828","text":"MR0637828","element":"a"}],[{"text":"Giesen, J. ","element":"span"},{"text":"and ","element":"span"},{"text":"Mitsche, D. ","element":"span"},{"text":"(2005). Reconstructing many partitions using spectral techniques. In ","element":"span"},{"text":"Fundamentals of Computation Theory","element":"span"},{"text":". ","element":"span"},{"text":"Lecture Notes in Computer Science ","element":"span"},{"text":"3623 ","element":"span"},{"text":"433–444. Springer, Berlin. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=2194866","text":"MR2194866","element":"a"}],[{"text":"Goldenberg, A.","element":"span"},{"text":", ","element":"span"},{"text":"Zheng, A. 
X.","element":"span"},{"text":", ","element":"span"},{"text":"Fienberg, S. E. ","element":"span"},{"text":"and ","element":"span"},{"text":"Airoldi, E. M. ","element":"span"},{"text":"(2010). A survey of statistical network models. ","element":"span"},{"text":"Foundations and Trends in Machine Learning ","element":"span"},{"text":"2 ","element":"span"},{"text":"129–233.","element":"span"}],[{"text":"Handcock, M. S.","element":"span"},{"text":", ","element":"span"},{"text":"Raftery, A. E. ","element":"span"},{"text":"and ","element":"span"},{"text":"Tantrum, J. M. ","element":"span"},{"text":"(2007). Model-based clustering for social networks. ","element":"span"},{"text":"J. Roy. Statist. Soc. Ser. A ","element":"span"},{"text":"170 ","element":"span"},{"text":"301–354. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=2364300","text":"MR2364300","element":"a"}],[{"text":"Holland, P. W.","element":"span"},{"text":", ","element":"span"},{"text":"Laskey, K. B. ","element":"span"},{"text":"and ","element":"span"},{"text":"Leinhardt, S. ","element":"span"},{"text":"(1983). Stochastic blockmodels: First steps. ","element":"span"},{"text":"Soc. Netw. ","element":"span"},{"text":"5 ","element":"span"},{"text":"109–137. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=0718088","text":"MR0718088","element":"a"}],[{"text":"Horn, R. A. ","element":"span"},{"text":"and ","element":"span"},{"text":"Johnson, C. R. ","element":"span"},{"text":"(2013). ","element":"span"},{"text":"Matrix Analysis","element":"span"},{"text":", 2nd ed. Cambridge Univ. Press, Cambridge. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=2978290","text":"MR2978290","element":"a"}],[{"text":"Jalali, A.","element":"span"},{"text":", ","element":"span"},{"text":"Chen, Y.","element":"span"},{"text":", ","element":"span"},{"text":"Sanghavi, S. ","element":"span"},{"text":"and ","element":"span"},{"text":"Xu, H. ","element":"span"},{"text":"(2014). 
Clustering partially observed graphs via convex optimization. ","element":"span"},{"text":"J. Mach. Learn. Res. ","element":"span"},{"text":"15 ","element":"span"},{"text":"2213–2238.","element":"span"}],[{"text":"Jin, J. ","element":"span"},{"text":"(2015). Fast network community detection by SCORE. ","element":"span"},{"text":"Ann. Statist. ","element":"span"},{"text":"43 ","element":"span"},{"text":"57–89.","element":"span"}],[{"text":"Joseph, A. ","element":"span"},{"text":"and ","element":"span"},{"text":"Yu, B. ","element":"span"},{"text":"(2013). Impact of regularization on spectral clustering. Available at ","element":"span"},{"href":"http://arxiv.org/abs/arXiv:1312.1733","text":"arXiv:1312.1733","element":"a"},{"text":".","element":"span"}],[{"text":"Karrer, B. ","element":"span"},{"text":"and ","element":"span"},{"text":"Newman, M. E. J. ","element":"span"},{"text":"(2011). Stochastic blockmodels and community structure in networks. ","element":"span"},{"text":"Phys. Rev. E (3) ","element":"span"},{"text":"83 ","element":"span"},{"text":"016107, 10. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=2788206","text":"MR2788206","element":"a"}],[{"style":{"height":13.95},"width":1435.08,"height":34.88,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/33-3.png","element":"img","alt":"Krzakala, F., Moore, C., Mossel, E., Neeman, J., Sly, A., Zdeborová, L. and","inline":true,"padRight":true},{"text":"Zhang, P. ","element":"span"},{"text":"(2013). Spectral redemption in clustering sparse networks. ","element":"span"},{"text":"Proc. Natl. Acad. Sci. USA ","element":"span"},{"text":"110 ","element":"span"},{"text":"20935–20940. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=3174850","text":"MR3174850","element":"a"}],[{"text":"Kumar, A.","element":"span"},{"text":", ","element":"span"},{"text":"Sabharwal, Y. ","element":"span"},{"text":"and ","element":"span"},{"text":"Sen, S. 
","element":"span"},{"text":"(2011). A simple linear time (1 + ","element":"span"},{"style":{"height":14.4},"width":41.76,"height":36,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/33-4.png","element":"img","alt":"ε)-","inline":true,"padRight":true},{"text":"approximation algorithm for ","element":"span"},{"text":"k","element":"span"},{"text":"-means clustering in any dimensions. ","element":"span"},{"text":"J. ACM ","element":"span"},{"text":"58 ","element":"span"},{"text":"11.","element":"span"}],[{"text":"Lei, J. ","element":"span"},{"text":"and ","element":"span"},{"text":"Rinaldo, A. ","element":"span"},{"text":"(2015). Consistency of spectral clustering in stochastic block models. ","element":"span"},{"text":"Ann. Statist. ","element":"span"},{"text":"43 ","element":"span"},{"text":"215–237. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=3285605","text":"MR3285605","element":"a"}],[{"text":"Li, X. ","element":"span"},{"text":"and ","element":"span"},{"text":"Voroninski, V. ","element":"span"},{"text":"(2013). Sparse signal recovery from quadratic measurements via convex programming. ","element":"span"},{"text":"SIAM J. Math. Anal. ","element":"span"},{"text":"45 ","element":"span"},{"text":"3019–3033. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=3106479","text":"MR3106479","element":"a"}],[{"text":"Lin, Z.","element":"span"},{"text":", ","element":"span"},{"text":"Liu, R. ","element":"span"},{"text":"and ","element":"span"},{"text":"Su, Z. ","element":"span"},{"text":"(2011). Linearized alternating direction method with adaptive penalty for low rank representation. In ","element":"span"},{"text":"Advances in Neural Information Processing Systems (NIPS) ","element":"span"},{"text":"612–620.","element":"span"}],[{"text":"Mathieu, C. ","element":"span"},{"text":"and ","element":"span"},{"text":"Schudy, W. ","element":"span"},{"text":"(2010). Correlation clustering with noisy input. 
In ","element":"span"},{"text":"Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms ","element":"span"},{"text":"712–728. SIAM, Philadelphia, PA. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=2768627","text":"MR2768627","element":"a"}],[{"text":"McSherry, F. ","element":"span"},{"text":"(2001). Spectral partitioning of random graphs. In ","element":"span"},{"text":"42nd IEEE Symposium on Foundations of Computer Science (Las Vegas, NV, 2001) ","element":"span"},{"text":"529–537. IEEE Computer Soc., Los Alamitos, CA. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=1948742","text":"MR1948742","element":"a"}],[{"text":"Newman, M. ","element":"span"},{"text":"and ","element":"span"},{"text":"Girvan, M. ","element":"span"},{"text":"(2004). Finding and evaluating community structure in networks. ","element":"span"},{"text":"Phys. Rev. E ","element":"span"},{"text":"69 ","element":"span"},{"text":"026113.","element":"span"}],[{"text":"Newman, M. ","element":"span"},{"text":"and ","element":"span"},{"text":"Leicht, E. ","element":"span"},{"text":"(2007). Mixture models and exploratory analysis in networks. ","element":"span"},{"text":"Proc. Natl. Acad. Sci. USA ","element":"span"},{"text":"104 ","element":"span"},{"text":"9564–9569.","element":"span"}],[{"text":"Nowicki, K. ","element":"span"},{"text":"and ","element":"span"},{"text":"Snijders, T. A. B. ","element":"span"},{"text":"(2001). Estimation and prediction for stochastic blockstructures. ","element":"span"},{"text":"J. Amer. Statist. Assoc. ","element":"span"},{"text":"96 ","element":"span"},{"text":"1077–1087. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=1947255","text":"MR1947255","element":"a"}],[{"text":"Oymak, S. ","element":"span"},{"text":"and ","element":"span"},{"text":"Hassibi, B. ","element":"span"},{"text":"(2011). Finding dense clusters via low rank + sparse decomposition. 
Available at ","element":"span"},{"href":"http://arxiv.org/abs/arXiv:1104.5186","text":"arXiv:1104.5186","element":"a"},{"text":".","element":"span"}],[{"text":"Rohe, K.","element":"span"},{"text":", ","element":"span"},{"text":"Chatterjee, S. ","element":"span"},{"text":"and ","element":"span"},{"text":"Yu, B. ","element":"span"},{"text":"(2011). Spectral clustering and the high-dimensional stochastic blockmodel. ","element":"span"},{"text":"Ann. Statist. ","element":"span"},{"text":"39 ","element":"span"},{"text":"1878–1915. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=2893856","text":"MR2893856","element":"a"}],[{"text":"Sarkar, P. ","element":"span"},{"text":"and ","element":"span"},{"text":"Bickel, P. J. ","element":"span"},{"text":"(2013). Role of normalization in spectral clustering for stochastic blockmodels. Available at ","element":"span"},{"href":"http://arxiv.org/abs/arXiv:1310.1495","text":"arXiv:1310.1495","element":"a"},{"text":".","element":"span"}],[{"text":"Shamir, R. ","element":"span"},{"text":"and ","element":"span"},{"text":"Tsur, D. ","element":"span"},{"text":"(2007). Improved algorithms for the random cluster graph model. ","element":"span"},{"text":"Random Structures Algorithms ","element":"span"},{"text":"31 ","element":"span"},{"text":"418–449. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=2362638","text":"MR2362638","element":"a"}],[{"text":"Snijders, T. A. B. ","element":"span"},{"text":"and ","element":"span"},{"text":"Nowicki, K. ","element":"span"},{"text":"(1997). Estimation and prediction for stochastic blockmodels for graphs with latent block structure. ","element":"span"},{"text":"J. Classification ","element":"span"},{"text":"14 ","element":"span"},{"text":"75–100. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=1449742","text":"MR1449742","element":"a"}],[{"text":"Sussman, D. 
L.","element":"span"},{"text":", ","element":"span"},{"text":"Tang, M.","element":"span"},{"text":", ","element":"span"},{"text":"Fishkind, D. E. ","element":"span"},{"text":"and ","element":"span"},{"text":"Priebe, C. E. ","element":"span"},{"text":"(2012). A consistent adjacency spectral embedding for stochastic blockmodel graphs. ","element":"span"},{"text":"J. Amer. Statist. Assoc. ","element":"span"},{"text":"107 ","element":"span"},{"text":"1119–1128. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=3010899","text":"MR3010899","element":"a"}],[{"text":"Tropp, J. A. ","element":"span"},{"text":"(2012). User-friendly tail bounds for sums of random matrices. ","element":"span"},{"text":"Found. Comput. Math. ","element":"span"},{"text":"12 ","element":"span"},{"text":"389–434. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=2946459","text":"MR2946459","element":"a"}],[{"text":"Vu, V. H. ","element":"span"},{"text":"(2007). Spectral norm of random matrices. ","element":"span"},{"text":"Combinatorica ","element":"span"},{"text":"27 ","element":"span"},{"text":"721–736. ","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=2384414","text":"MR2384414","element":"a"}],[{"text":"Vu, V. ","element":"span"},{"text":"(2014). A simple SVD algorithm for finding hidden partitions. Available at ","element":"span"},{"href":"http://arxiv.org/abs/arXiv:1404.3918","text":"arXiv:1404.3918","element":"a"},{"text":".","element":"span"}],[{"text":"Zhao, Y.","element":"span"},{"text":", ","element":"span"},{"text":"Levina, E. ","element":"span"},{"text":"and ","element":"span"},{"text":"Zhu, J. ","element":"span"},{"text":"(2012). Consistency of community detection in networks under degree-corrected stochastic block models. ","element":"span"},{"text":"Ann. Statist. ","element":"span"},{"text":"40 ","element":"span"},{"text":"2266–2292. 
","element":"span"},{"href":"http://www.ams.org/mathscinet-getitem?mr=3059083","text":"MR3059083","element":"a"}],[{"style":{"width":"43%"},"width":627,"height":349,"src":"https://cdn.bytez.com/mobilePapers/v2/arxiv/1404.6000/images/34-0.png","element":"img"}]]}],"_version":"3.3.4"},"paperNode":"$28:props:children:props:children:0:props:product"}]]