DETERMINING the authenticity of digital multimedia isan important social problem. Many important institutions rely upon truthful multimedia content, including news services, law firms, and intelligence agencies. As a result, multimedia forensics researchers have developed a variety of approaches to inspect images for evidence of forgery [1]. Many of these approaches operate by directly identifying traces of manipulation, such as resampling [2], double JPEG compression [3]– [5], or contrast enhancement [6]. Several approaches work by identifying physical and lighting inconsistencies [7], [8]. Other approaches have found significant success by identifying inconsistencies in imaging traces such as lens aberrations [9], [10], sensor noise [11], [12] and general residual-based features [13], [14]. More recently, deep learning based approaches have been found to be very effective for detecting and localizing image forgeries [15]. Several techniques have been proposed, including methods that target double JPEG compression [16]–[18], tampering boundaries [19], [20], inpainting [21], face-warping [22], general manipulations [19], [23]–[25], and source inconsistencies [24], [26]–[30].
O. Mayer and M.C. Stamm are with the Department of Electrical and Computer Engineering, Drexel Univeristy, Philadelphia, PA, 19104 USA email: om82@drexel.edu
This material is based upon work supported by the National Science Foundation under Grant No. 1553610. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
A recent deep-learning-based forensics approach called “Forensic Similarity” has been shown to be a promising technique for forgery detection and localization [24], [30]. Earlier work in [30] demonstrated utility in forgery localization, and later work [24] proposed a technique using Forensic Similarity that outperformed prior art in forgery detection. Forensic Similarity is a technique that maps two small image patches to a score indicating whether the two patches contain the same or different forensic traces. These forensic traces are related to the source camera model and/or processing history of the image. Importantly, Forensic Similarity is effective even on “unknown” forensic traces that were not used to train the system. By identifying differences in forensic traces within an image, falsified images are exposed [24], [30].
While Forensic Similarity has shown encouraging promise for forgery analysis, existing methods have several drawbacks. First, previously proposed forgery localization techniques based on Forensic Similarity require the selection of a reference patch in a region known to be unaltered. This type of approach is sensitive to the selection of reference patch, and is an unrealistic scenario for a forensic investigator, who may not know a priori which regions are unaltered. Second, the forgery detection approach using Forensic Similarity in [24] uses a heuristic detection measure. While the heuristic approach achieved high forgery detection rates, performance can be sig-nificantly improved by developing more powerful approaches.
To address these problems, in this work we build off of prior Forensic Similarity research to more accurately detect image forgeries and localize the falsified image regions. To do this, we propose a new multimedia forensics concept called the Forensic Similarity Graph, which is an abstract graph-based representation of an image that captures important forensics relationships among all image regions. Briefly, to do this we first sample small patches from an image and represent each patch as vertex of a graph. Then, we assign edge weights between vertices according to the forensic similarity value between each corresponding pair of patches. Localized image tampering (e.g. splicing or localized airbrushing) introduces particular structures, called communities, into this Forensic Graph Representation. We propose two techniques to identify and partition these communities, adapting community detection techniques from literature [31]. Using our proposed technique we expose localized image tampering with high accuracy, and do so without requiring a reference patches to be selected.
The remaining sections of this paper are organized as follows. In Sec. II, we discuss related works on forgery detection and localization, as well as related community detection literature. In Sec. III we first introduce the concept of the Forensic Similarity Graph, and describe how to compute it in an image. In this section we also show how localized tampering introduces identifiable structures, called “communities,” into the Forensic Similarity Graph. In Sec. IV, we propose two community detection techniques which are used to identify and partition these community structures. We show how these techniques are used to detect forged images and, in tampered images, localize the tampered regions. Furthermore, we describe how the community partitions can be converted to pixel-level forgery localization decisions. Finally, in Sec. V and Sec. VI, we conduct a series of experiments testing the efficacy of our proposed approach. Experiments show that our proposed approach outperforms naive approaches that do not consider community structure, improving upon heuristics proposed in prior art. We also show that our approach outperforms prior-art localization approaches on publicly available benchmark databases.
A. Image Forgery Detection and Localization
Researchers have developed a number of heurisitc and hand derived features for image forgery analysis [2]–[6], [9]– [13]. More recently, research has shown that deep learning based approaches are able to achieve improved performance in forgery detection and localization [15]. Work by Barni et al. [16], Amerini et al. [17], and Wang et al. [18], trained deep learning systems to localize image areas with double JPEG compression artifacts indicative of tampering. Research by Salloum et al. [19] developed a multi-task fully convolutional network, which improved forgery localization by training a branch to learn boundaries of spliced regions. Bappy et al. [20] proposed a system to learn splice-pristine region boundary transition using an LSTM architecture. Work by Wu et al. [23] also proposed a fully convolutional network, but trained on a targeted set of 385 manipulations.
The above mentioned deep-learning based techniques are trained to identify a targeted set of forgery features. Recently, researchers have discovered that techniques which identify inconsistencies in forensic features [13], [24], and in particular features related to image source [26], [27], [29], [30], can be even more successful. Since it is not feasible to train a system on all possible and real-world manipulations, these systems instead detect inconsistencies and anomalies in forensic features indicative of splicing.
In early work by Bondi et al. a convolutional neural network (CNN) was used to generate deep-feature representations of an image’s source camera-model, and then detected and localized image forgeries via an iterative k-means clustering approach [26]. Subsequently, our work in [30] showed that the similarity of camera-model related forensic traces can be directly measured using a CNN-based siamese network, which can be used to localize image forgeries. This idea was further refined by our research in [24], where we proposed a more formal definition of Forensic Similarity, which is the quantifiable measure of similarity of forensic traces related to the source and/or processing between two images patches, and refined the siamese network technique.
Deep learning forensics research by Cozzolino et al. [28], [29] proposed a CNN that transforms an image to highlight artifacts associated with camera-specific traces. Inconsistencies in the resulting fingerprint map were then used to identify forged regions. Huh et al. developed a deep-learning technique to create a “consistency map” for forged images [27]. In this consistency map approach, regions of the image are highlighted which contain predictions of EXIF-based metadata that are inconsistent with the majority of the image. Huh et al. showed that taking spatial average of the consistency map and comparing to a threshold can be used to perform forgery detection.
Multimedia forensics techniques explicitly draw the distinction between forgery detection and forgery localization [26], [27]. Forgery detection methods are those that determine if an image has been tampered or is alternatively unaltered. Forgery localization methods are those that, given a forged image, determine which regions of an image have been tampered. While these are distinctly different problems, the underlying mechanisms for analysis are similar and, as a result, are often studied in conjunction. In this paper, we propose techniques for both forgery detection and forgery localization.
The work in this paper directly builds from our prior research on Forensic Similarity in [24]. In [24], a deep-learning system was designed to input two image patches and output a score indicating forensic similarity. The system was composed of a two CNNs in a hard-sharing configuration, with the CNN architecture based upon the CNN architecture designed in [32], which was shown to be generalization to many forensic tasks in [33]. These CNNs act as feature extractors, which feed a pair of features, one for each image patch, into a shallow three-layer neural “similarity network” which has a final layer composed of a single neuron, which outputs a score indicating similarity. This system is then trained end-to-end with pairs of image patches and labels associated with whether they contain the same or different forensic traces.
In [24], we showed that tampered regions of an image can be highlighted by selecting reference patch, and highlighting the patches in the image that contain different forensic traces. Provided a patch in the unaltered region of the image was selected, the tampered regions are highlighted. Furthermore, we showed state-of-the-art forgery detection accuracy by computing the forensic similarity values between all patches, computing the mean of these values, and comparing that mean to a threshold. In unaltered images, this value would be high indicating all patches were forensically similar. Alternatively in tampered images, this value would be low indicating some patches were forensically dissimilar.
There are, however, drawbacks to existing forgery detection and localization approaches using Forensic Similarity. First, the forgery localization approach require a selection of a reference patch in the image. This is problematic since the investigator may not have knowledge of which regions of the image are unaltered, and the poor selection of a patch may lead to erroneous results. Second, the forgery detection approach, which utilizes the mean forensic similarity value of the image, does not adequately consider the complex forensic relationships that occur in a tampered image. For example, a large tampered region will be self-similar within the region, but dissimilar to the unaltered part of the image. This type of relationship is lost by averaging all similarity values. Furthermore, by utilizing the mean similarity value, the detection statistic becomes inherently tied to the size of the forged region. However, it is important to detect small forgeries with high accuracy. This problem also occurs in the forgery detection approach by Huh et al. [27], which utilizes the spatial average of their EXIF-based consistency map to detect forgeries. In this paper, we propose techniques that address these drawbacks.
B. Community Detection in Graphs
In this paper, we propose a graph-based representation of the image which is then analyzed for forgery. A graph G = (V, E) is defined by a set of elements called “vertices” V , with “edges” that connect pairs of vertices [31], [34]. In this work, we propose a graph-based representation of images, with small image regions represented by vertices and edges assigned according to the similarity of their forensic traces. Communities are subsets of vertices with dense edges within their own community [31]. In Sec. III, we show that localized tampering introduces communities into this graph representation. As a result, localized tampering is exposed via detection of multiple communities in the graph, and the tampered regions are localized by partitioning the vertices according to their respective communities.
Researchers have developed a variety of techniques to detect and partition communities in general graphs. These techniques are referred to as “community detection” algorithms [31]. Early techniques included divisive techniques, which iteratively remove edges from the graph until communities form as disconnected subgraphs [35]. Other techniques optimize community partitions based on community quality measures such as modularity [36]–[38], label propagation [39], the minimum energy state of the spin-glass model [40], and probability derived from random walks [41]. Some clustering algorithms can also be framed as community detection techniques, including spectral clustering [42] and extensions of k-means clustering [43]. In the multimedia forensics field, spectral clustering has been applied to the image phylogeny problem [44].
While there are many approaches to community detection in graphs, we focus on just two for this paper. However, we take care to provide a framework that is readily extended to any community detection technique, allowing for future research to take advantage of improvements in community detection algorithms. Namely the two techniques we focus on are: 1) Modularity Optimization [37] and 2) Spectral Clustering [42]. These techniques are chosen since they have been shown to be powerful and popular techniques in literature [31], [42].
1) Spectral Clustering: Spectral clustering is a technique for partitioning graphs into clusters [31], [42], [45], which leverages properties of the graph Laplacian matrix to determine community structures.
The graph Laplacian matrix, L, is defined as
where W is the edge weight matrix with elements assigned according to the similarity of vertices i and j, and D is the degree matrix with values
on the diagonal, and zeroes off-diagonal. Additionally, we define the normalized graph Laplacian
which regularizes the matrix [45].
The graph Laplacian has a number of special properties [45]. In particular, it has spectrum of non-negative real valued eigenvalues . These eigenvalues a have property such that the multiplicity of
is equal to the number of disconnected communities in the graph. Furthermore, the eigenspace of 0 is spanned by the indicator vectors
of those k communities, where
is the vertex membership of the
community. For the normalized graph Laplacian, it this eigenspace spanned by
[42], [45]. We leverage these properties for detecting forgeries and localizing the forged regions, described in Sec. IV.
2) Modularity Optimization: Modularity, Q, i an index for the quality of how well a graph has been partitioned into communities [31], defined as:
where m is the sum of edges in the graph, W is the edge weight matrix, is the weighted degree of vertex i, and the indicator function
is 1 when both vertex i and j are assigned to be members of the the same community. The term
connection between vertices i and j. Intuitively, if two vertices in the same community have a low connection expectation, but have a high weight between them, then modularity is increased.
Higher values of modularity indicate that community structure exists, and that a good partitioning has been found [31], [35], [46]. Modularity in the case of a single community is zero.
Modularity Optimization is a family of techniques used to determine the underlying community memberships in a graph. These techniques operate by assigning vertices to communities such that modularity is maximized,
where is the maximum found modularity value in the graph. There have been several proposed methods for optimizing for modularity, including divisive and agglomerative algorithms. Pos and Latapy proposed an agglomerative approach based on random walks [41]. Blondel et al. proposed a optimization approach that iteratively applies local changes to community memberships until a maxima is found [36]. Reichart and Bornholdt posed modularity optmization as a simulated annealing problem, with good results [40]. In this paper we utilize the ”fast greedy” method proposed by Clauset et. al [38], which uses a hierarchical agglomeration approach, and achieves similar results of other methods on a reasonable time scale.
In this paper, we propose a new graph-based concept called the “Forensic Similarity Graph” of an image. The Forensic Similarity Graph captures important relationships among small regions of an image. These relationships bear evidence of localized tampering (e.g. splicing). Through analysis of this representation, we detect and localize falsified images. In later sections, we experimentally show that by using this proposed technique, we detect forgeries with higher accuracy than with techniques that do not capture such relationships.
In the Forensic Similarity Graph, each image patch is represented by a vertex. Edges are assigned between each pair of vertices with weight specified by the forensic similarity between those two corresponding image patches. In forged images, tampered regions form communities, where edges within the tampered region are densely connected with high weight, and are disconnected (or at least have low weight) to the regions outside of the tampered region. The term “community” aligns with literature on graph-based community detection [31], [35], [46], which are used to analyze community structures in other domains, such as social networks [49] and biological networks [50].
The procedure we propose is as follows:
1) Sample n patches from the image
2) Calculate forensic similarity between all pairs of sampled patches, according to [24]
3) Convert the image into its graph representation, called the Forensic Similarity Graph, with patches as vertices and edges assigned according to the forensic similarity between patches
4) Perform forgery detection and/or localization by applying community detection to the Forensic Similarity Graph
Here, we more formally describe the construction of the Forensic Similarity Graph. To construct the Forensic Graph of an image, we first sample n patches from the image , where
is the
sampled patch and X is the space of image patches. In this paper we use square patches that are regularly spaced and overlapped, typically we use a patch sizes of
or
and 50% to 75% overlap in both x and y directions. 1
We define the graph G = (V, E) with vertex set . and edges E which connect each unique pairing of vertices. Edges have weight
equal to the forensic similarity between
and
, given that the similarity is
Fig. 1: An example of community structure in the forensic similarity graph representation of a forged image. Original image (a) and edited image (b), where water stains were removed using a brush tool, with vertex indices overlaid. Editing credit to Reddit.com user “/u/rombouts.” The forensic similarity matrix in (c) shows the forensic similarity between each sampled image patch. The graph representation in (d) highlights the community structure, showing the patches associated with the tampered region appears as its own inconnected cluster. The graph in (d) is drawn in igraph [47], showing only edges > 0.90 and using the Kamada-Kawai force-directed layout method [48].
larger than a threshold, and otherwise have weight 0. That is, the graph has edge weights
where is the edge weight between vertex i and vertex j, and
is the forensic similarity between image patch
and image patch
. Unless otherwise speci-fied, we use a threshold t = 0 meaning that the graph is fully connected, with
non-zero edges. In some cases, we use a threshold 1 > t > 0 to improve the visual interpretation of the graph, or to improve algorithm performance. Here, we use forensic similarity
as defined in [24], which quantifies the similarity of the forensic traces across two image patches related to the source and/or processing history. A value of 1 indicates high similarity, and a value of 0 indicates dissimilarity. The edge weights create the edge weight matrix, which we refer to as the Forensic Similarity Matrix. A visual example of this Forensic Similarity Matrix can be seen in Fig. 1c. In some literature, an edge weight matrix is sometimes referred to as the weighted adjacency matrix [31], [42].
Community structures in this Forensic Similarity Graph are then used to identify and localize image forgeries. For example, in an image that has been locally tampered (e.g. a spliced image), an edge between a patch in the tampered region and a patch in the unaltered image region will have no-or-low weight, since they have dissimilar forensic traces. However, edges among patches within a tampered region are densely connected with high weight, provided that the entirety of the tampered region contain the same forensic traces, i.e. undergone the same tampering process (e.g. brush tool, resizing, etc). Additionally, edges among patches in the unaltered part of the image are similarly densely connected with high weight, since all unaltered patches have the same processing history. As a result, this forms subsets of vertices have have high edge-connectivity within the subset, called a community, and low edge-connectivity to other subsets.
In unaltered images, where every patch was captured by the same camera model and has the same processing history, all edges in the graph are expected to have high weight forming only a single community. In addition, images that have been uniformly/globally tampered will also have all edges of high weight, since all patches will have the same processing history. Examples of uniform tampering include global image resizing, JPEG re-compression, and global nonadaptive contrast enhancement.
Detecting community structures in general graphs has been studied in other domains [31], [35], [46], [50]. In Sec. III, we adapt two community detection techniques from literature, and apply them to the Forensic Similarity Graph to perform forgery detection and localization. Later, in Sec. V and VI, we experimentally show that our proposed community detection based approaches achieve higher forgery detection accuracy than approaches that do not consider community structure.
1) Example: We show an example of the graph representation in Fig. 1. The tampered image is sampled by gridding it into patches, with 50% overlap. An index is assigned to each vertex and are shown in Fig. 1(b) at their corresponding patch locations. The edge weight matrix is shown in Fig. 1(c), where the vertices corresponding to the edited shirt patches, indices = {14, 15, 16, 21, 22, 23, 28, 29}, have low similarity to the rest of the image patches, and high similarity to other edited shirt patches. Similarly, the non-edited patches have low similarity to the edited patches, and high similarity to other non-edited patches. This suggests that in this tampered image example, there are two forensic “communities.”
In Fig. 1(d), the vertex-edge representation of this graph is shown. This visual representation clearly highlights that two communities exist, one associated with the tampered region, and one associated with the non-tampered region. Vertex colors are assigned according to the partitioning determined by the a “modularity optimization” method [46], with edges between the identified communities grayed out.
In the previous section, we introduced the Forensic Sim- ilarity Graph of an image and provided intuition about how structures in this representation, called “communities,” indicate that the image has been locally tampered. In this section, we present techniques to perform the detection of this community structure, and as a result detect and localize the image tampering.
Forgery detection and forgery localization are two related, but very different problems in multimedia forensics, and are differentiated in literature [26], [27]. Forgery detection techniques are used to indicate whether tampering has occurred in the image, and output a binary yes/no decision about the image. In this work, we propose community detection techniques that input a forensic graph and output a forgery detection decision,
We also propose community detection techniques that input a forensic graph, and output a tampering classification for each patch of the image, which is then used to segment the tampered regions of the image,
where is the community membership for the
vertex in the graph, i.e. sampled image patch, and n is the number of sampled image patches. We note that the nature of this type of localization analysis partitions the tampered region from the unaltered, but does not inherently identify which of the partitioned regions is the tampered one or unaltered one. This is why each vertex/patch is mapped into a community identifier {1, . . . , k} as opposed to {Unaltered, Forged}. In this paper we focus on the case of k = 2, since most evaluation databases only differentiate between the tampered and unaltered regions, and don’t differentiate between different tampered regions, if they exist. However, the techniques we propose are general to k > 2. Recent work has highlighted the need for techniques that identify more than one tampered region [51].
Fig. 2: Forgery detection scores on example original and manipulated images. Low values (< 100) and high
values (> 0.025) indicate localized tampering.
In the remainder of this section, we present two different community detection techniques, namely the Spectral Clustering and Modularity Optimization techniques introduced in Sec. II-B. These techniques are applied to the Forensic Similarity Graph to detect and partition the community structures associated with image tampering. While we present multiple community detection methods, only one is used at a time, and we compare their efficacy in the experimental results. Additionally, we show how to convert the patch-based localization partitions into a pixel-level segmentation of forged and unaltered regions. This is done so that the proposed localization methods can be effectively compared to other forgery localization techniques that operate at the pixel-level, such as those in [13], [26], [27], [29].
A. Spectral Clustering
In this method, we apply Spectral Clustering to the forensic similarity graph to detect forgeries and localize tampered regions. We start by constructing the Forensic Similarity Graph an image, described in Sec. III, by sampling n patches and computing the edge weight matrix W. Then, we construct the graph Laplacian L according to (1). Forgery detection and forgery localization are then performed as follows.
1) Forgery Detection: As described in Sec. II, the graph Laplacian matrix has a spectrum of eigenvalues . To detect forgeries, we use the fact that the multiplicity of
is equal to the number of disconnected communities in the graph to decide whether an image has been forged. Typically, this multiplicity is used to determine the number of communities in graph by determining the number of consecutive eigenvalues less than a threshold, and is sometimes called the “Eigengap” heuristic [42]. However, the problem of forgery detection is different in that we need to know only whether the image is forged or unaltered. In other words, we are concerned only if there exists a single community (unaltered) or, alternatively, more than one community (localized tampered).
To do this, in our approach we calculate the second smallest eigenvalue . If this value is low, it is indication that at least two communities exist, and potentially more. If it is high, then it is an indication only one community exists, and the image has not been altered. As a result, our forgery detection decision rule becomes
where G is the forensic similarity graph of the image, is second smallest eigenvalue of the graph Laplacian from Eq. (1), and
is the decision threshold determined empirically depending on the desired operating point. This “spectral gap” value is also sometimes referred to as the Fiedler eigenvalue, which measures the algebraic connectivity of the graph [31], [52].
Examples of the spectral gap are shown in Fig. 2 for unaltered and forged images. In the two unaltered images(a) and (c) the spectral gap values of 318.42 and 235.56 are high, indicating that only one forensic community exists and no tampering has occurred. In the two spliced images (b) and (d) the spectral gap values of 76.75 and 4.28 are low, indicating that more than one forensic community exists and the tampering has occurred.
2) Forgery Localization: To perform localization, we use the fact that the eigenspace of 0 of the graph Laplacian is spanned by the indicator vectors to partition the graph vertices into forged and unaltered regions. In this paper we consider the case of k = 2, in order to partition the tampered versus unaltered regions. To do this, we first calculate the eigenvector
associated with the eigenvector
where is the second smallest eigenvalue of the graph Laplacian matrix.
Next, we use the sign of each component of to assign each vertex into a community membership,
where is the predicted community membership of the
graph vertex / image patch. This method approximates the (NP-Hard) Ratio cut algorithm for k = 2. When using the normalized graph Laplacian, it approximates the Normalized Cut algorithm [42].
An example of this algorithm is shown in Fig. 3, where we partition the image patches from the spliced image in Fig. 2(d) into two communities. In this figure, we show the histogram of values of . The components associated with patches in the tampered region are shown in blue, and are typically above
Fig. 3: Component values of for the edited image in Fig. 2(d). In this example, components of
corresponding to the forged patches are greater than 0, and unaltered patches are less than 0.
0. The components associated with patches in the unaltered regions are shown in orange, and are typically below 0. Ground truth was determined based on the central location of each image patch.
We note that this algorithm can be easily extended to consider k > 2. This is done by considering the eigenvector values for each vertex as a point in space and performing k-means clustering [42].
B. Modularity Optimization
Here, we describe a second technique for forgery detection and localization based on Modularity Optimization. We start by constructing the Forensic Similarity Graph an image, described in Sec. III, by sampling n patches and computing the edge weight matrix W. We then perform the Modularity optimization (5) resulting in .
To calculate, we use a popular modularity optimization technique called the “fast-greedy” method, which agglomeratively forms communities by iteratively combining communities which yield the greatest modularity increase [37], [38], and reports the largest modularity value found. While there are a number of variations of modularity optimization techniques, the fast-greedy method is a popular technique [31] and operates on a reasonable timescale on our forensic graphs. Furthermore, when constructing W, we typically use a thresholded edge weight matrix (6), often with a threshold 0.5 < t < 0.9, for modularity optimization techniques. Using a thresholded edge weight matrix significantly reduces the complexity of the optimization, and we have anecdotally found that doing so produces more reliable modularity scores.
1) Forgery Detection: To detect whether the image has been locally tampered, we compare the optimized modularity value, , found for the optimal partition to a threshold. If
exceeds the threshold, it indicates that there is strong community structure and, as a result, indicates that one or more regions of the image has been tampered with. Our decision measure becomes
where G is the forensic similarity graph of the image, is the optimized modularity value found on G, and
is the decision threshold, chosen empirically.
2) Forgery Localization: Here, we localize tampered regions by using the optimized community memberships, , to assign each patch into a community,
where is the optimized community membership of the
vertex/image patch determined by the modularity optimization algorithm. As with the Spectral Clustering method, we use k = 2 to segment the unaltered region from the forged region(s). Though we note that this method is trivially extendable to k > 2.
As an example, the partitioning into the green and red coloring of Fig. 1 was determined by the Modularity Optimization method, using an edge weighting threshold of t = 0.9 and k = 2.
C. Pixel-level localization
The above localization techniques partition image patches into forged and unaltered regions. However, localization is often performed at the pixel level [13], [26], [27]. Here, we describe how to convert patch-level community partitions to a pixel-level forgery localization.
We start by building a pixel map of localization predictions P with dimensions in x and y that are the same as the image,
where is the predicted community membership for patch l, and
is the set of coordinates covered by patch l. That is, for each pixel we sum the number of patches that contain that pixel and are also predicted to be in community
. The investigator can vary
to inspect the various communities. For k = 2, we choose
to be the smaller of the two communities, since the largest community is likely to be the unaltered region.
Pixels near the edges and corners of an image are covered by less patches, and are more difficult to accurately predict. To address this, we create a normalized pixel map
where T is the map of total patches that cover each pixel defined by
Finally, we smooth the normalized pixel map, using a Gaussian blur kernel, and compare to a threshold, chosen empirically.
In Fig. 4, we show an example. Fig. 4a shows an edited image from the Carvalho database [53] and Fig. 4b shows the ground truth mask, with the edited region highlighted in black.
Fig. 4: An example showing the conversion from patch-level community partitions to pixel-level forgery localization, described in Sec. IV-C.
Fig. 4c shows the patch-level prediction map, determined using the Spectral Clustering method. Since patches were sampled from image with 50% overlap, pixels are covered by 1 to 4 patches, as shown in Fig. 4d. Fig. 4e shows the normalized prediction map, in which we can see more confident forgery localization along the corner and edges of the image. Finally, Fig. 4f shows the smoothed and thresholded version of the normalized prediction map, in which the tampered region is localized and matches the ground truth.
We conducted a series of experiments to test the efficacy of our proposed approach for forgery detection and localization. In this section, we describe the experimental procedures and results for forgery detection. In Sec. VI, we conduct experiments to test forgery localization performance. In general, we found that the proposed Spectral Gap technique achieved highest forgery detection performance, and outperformed naive methods that do not include community structure.
We performed two sets of experiments for forgery detection. In the first experiment, we evaluated performance on three publicly available benchmark datasets, which contain unaltered and tampered images. In the second experiment, we evaluated performance on “synthetic” forgeries, which were created by copying and pasting a block from one image into another. While not realistic in appearance, the synthetic
Fig. 5: Example forgeries from the benchmark datasets.
forgery experiment shows statistical characterizations of the system, controlling for the size of the forged area.
In these experiments we evaluated the two proposed community detection techniques that are described in Sec. IV: the Spectral Gap method (10) and Modularity Optimization method (13). We then compared to two naive approaches that utilize the same forensic similarity procedure of our proposed approach, but do not consider community structure. These are 1) “Mean Similarity” which is the average forensic similarity edge weight as a detection measure and used in prior work [24], and 2) “Minimum Similarity” using the minimum forensic similarity edge weight as a detection measure. A significant contribution of the Forensic Similarity Graph is that it captures complex relationships among regions of the image, in the form of communities, which enables more powerful analysis. The following experiments show that the proposed community detection based approaches outperform these naive approaches. These results highlight that identifying community structure is critical to accurately detecting image forgeries.
In addition, we compared against several state-of-the-art methods. We compared against the work by Huh et al. [27], which computes a deep-learning based self-consistency map and detects image forgeries by spatially averaging this map, and comparing to a threshold. They compute three types of self-consistency maps based on camera-model consistency, image consistency, or EXIF consistency. Our approach is most comparable to the “Camera” approach in [27], since forensic similarity network we used was trained on patches labeled according to their source camera model. We also compared to the work by Bondi et al. [26], in which an 18-elements feature vector is extracted with a CNN trained for camera model identification on 64 x 64 pixels patches. A simple clustering procedure is iteratively applied to separate the background from the foreground.
In these experiments, we calculated the forensic similarity of image patches using the deep-learning approach described in [24], trained on four million image patches from 80 camera models. Two networks were tested, one with forensic patch size 128128, and a second with size 256
256. Patches were sampled spanning the image with 50% overlap. The forensic similarity graph was constructed for each testing image as described in Sec. III. Finally a forgery decision was rendered based on the proposed detection methods in Sec. IV. Performance was evaluated by the receiver operating characteristic (ROC) performance and the mean-Average-Precision (mAP) score, which were determined by sweeping the decision threshold
for each method.
Fig. 6: Forgery detection ROC curves on benchmark datasets
A. Performance on Benchmark Datasets
In this experiment, we evaluated the performance of our approach on publicly-available benchmarking datasets. These databases are 1) the “Columbia Uncompressed Splicing Database” [54] containing 180 spliced and 183 authentic tiff images ranging from sizes to
, 2) the “Carvalho DSO-1 Database” [53] containing 100 spliced and 100 authentic png images of size
, and 3) the “Korus Realistic Tampering Dataset” [55] containing 220 tampered and 220 corresponding original tiff images of size
. Examples of spliced images from each database are shown in Fig.5.
Fig. 6 shows ROC curves for our proposed approaches, as well as the methods we compare against. The ROC curve shows the trade off of the probability of correct detection () of a forged image versus the probability of false alarm (
), i.e. misclassification, of an unaltered image. In the majority of cases, our proposed approaches achieved highest performance, with the Spectral Gap method achieving highest detection rates. We note that the Huh et al. EXIF method achieves higher detection rates at high false alarm rates in the Columbia dataset, but was outperformed by our approaches at low false alarm rates. This is important because in many realistic conditions, forensic investigators must operate at low false alarm rates. This is because scenarios such as criminal or intelligence investigations, there can be a high cost to misidentifying an unaltered image as falsified.
Table I shows mAP scores for our proposed approaches as well as the methods we compared against. The mAP measure was chosen to compare to the reported scores in [27]. All of our proposed approaches achieved higher mAP scores than the exisitng Bondi et al., and Huh “Camera” approaches. The Huh “EXIF” method achieved a higher mAP score of 0.98 in the Columbia dataset. However, our proposed Spectral Gap method outperformed it on the more challenging Carvalho and Korus datasets, with mAP scores of 0.97 using a 256256
TABLE I: Mean Average Precision on Benchmark Databases
TABLE II: at
on Benchmark Databases
forensic patch size on Carvalho, and 0.65 using a 128128 forensic patch size on the Korus dataset.
Table II shows the detection rate () at the low false alarm rate of 0.01 for the tested approaches. Operating at low false alarm rates are important for investigators, due to the high cost of falsely identifying authentic content as tampered, and due to the high volume of unaltered content that they encounter. For the Columbia database, the Modularity Optimization approach with a patch size
achieved highest detection rate of 0.64. The Spectral Gap approach achieved highest rates of 0.80 on the Carvalho dataset with a size of
, and 0.16 on the Korus dataset with a size of
.
The results of this experiment show that our proposed technique exceeds prior-art forgery detection performance on the three tested benchmark datasets, and significantly outperforms these approaches in challenging datasets. Importantly, the experiment also shows that significant benefit is achieved when using community-structure aware approaches over the community-naive approaches (Mean and Minimum), especially at low false alarm rates.
Interestingly, we note that while the Modularity Optimization approach performed very well on the Columbia database, performance degraded in the other databases. The forged regions in Columbia dataset are much larger relative to the size of the image than in the Carvalho and Korus datasets. One possible reason behind this is performance decrease is that modularity optimization techniques are known to have difficulties detecting communities that are small relative to the size of the graph [56], referred to as the “resolution problem.”
Effects of global post processing: Here we investigated the impact of additional global post processing on forgery detection performance. To do this, we repeated the above
Fig. 7: Forgery detection mAP on benchmark images with global JPEG compression applied.
Fig. 8: Synthetic forgery detection results. The plot (a) shows ROC curves for proposed and naive methods in synthetic forgeries with forgery block size of 256256. Plot (b) shows the forgery detection probability for each method for different forgery block sizes. Plot (c) shows the ROC area-under-the-curve at challenging forgery block sizes using the spectral gap method with two different forensic patch sizes. In (c) forgery block size is displayed as relative area to the forensic patch size.
forgery experiment, with additional JPEG compression applied to each image, controlling for JPEG compression quality factor (QF). For each image, with a specified QF, we then computed its respective forensic similarity matrix using a forensic similarity model trained on images with JPEG re-compression applied of that QF. Finally, we computed the community-aware Spectral Gap and community-naive Mean Similarity forgery detection measures. We investigated performance at QF , where lower quality factors correspond to heavier compression effects.
Forgery detection performance under global JPEG compression is shown in Fig. 7. As expected forgery detection performance decreases with heaver compression (lower QF). Interestingly, the community-aware Spectral Gap approach is less affected by JPEG compression than the community-naive Mean Similarity measure in the Columbia dataset. Another interesting observation is that performance decrease is steeper in the Carvalho dataset than in the Columbia dataset. Two possible explanations are that the salient features in the Carvalho dataset are more affected by JPEG compression, or that the relatively smaller size of the forged regions are harder to detect under significant compression.
B. Characterizations on Synthetic Forgeries
In the above experiment, we observed that some forgeries were more challenging to detect than others. In this experiment, we studied the impacts of the tampered region size and patch size on forgery detection performance. To do this, we started with a database of images from the VISION database [57], using the “natural” images from the 35 cameras in the database. This database was chosen since the camera models in the database were different from those used to train our proposed system, representing a more practical scenario. We then created synthetic forgeries by copying a small-block from one image into another at random locations. We refer to the size of this block as the forgery block size. We tested detection rates at different forgery block sizes, and for a given forgery block size, we created 1000 such forgeries. To control for the size of the image, we ensured that the host image was of size 32642448, the most common size in the dataset. For unaltered images, we used 1000 images randomly chosen of size 3264
2448. To ensure an appropriate similarity signal was present, the donor and host images were captured by different camera models. We then calculated the proposed detection measures on each forged and authentic image to evaluate performance.
Fig. 8 shows forgery detection performance for the two community approaches and two naive approaches, using an input patch size of 256256. Fig. 8(a) shows ROC curves for these approaches at the
forgery block size. In this case, the Spectral Gap and Minimum Similarity methods demonstrated improved performance over the Modularity Optimization and Mean Similarity approaches. Notably, the Spectral Gap showed significant improvement over the community-naive Minimum Similarity method at low false alarm rates between 0 and 0.10.
Fig. 8(b) shows the positive detection rates of the four approaches at a false alarm rate of 0.05. In general, all approaches performed poorly at the smallest forgery size of , and improved with forgery block size. All approaches achieved
with the forgery block size of
. This experiment demonstrates that forgery size has a significant effect on detection performance.
In particular, forgery size impacted the Modularity Optimization method. Modularity optimization methods are well known to have difficulty in detecting communities much smaller in scale to the size of the graph [56]. This is also logical for the Mean Similarity method, as the mean value is inherently tied to the percentage of tampered patches relative to the whole image, likely achieving highest detection performance when 50% of the image is tampered. The largest forgery block size of is approximately 13% of the image size, whereas
is less than 1% of the image size. This result suggests that the Modularity Optimization and Mean Similarity results are sensitive to the size of the forged region relative to the size of the image, whereas the Spectral Gap method is only sensitive to forgery size up to the forensic patch size, as we discuss in more detail below. Further experimentation on the impact of forgery size relative to the image size may lend further insight on this finding.
We further experimented using the Spectral Gap method at more challenging forgery block sizes. Fig. 8(c) shows the ROC area-under-the-curve (AUC) of forgery block sizes from to
, for forensic patch sizes of
and
. We normalized the x-axis of the result relative to the forgery size, equal to the area of the forgery size divided by the area of the analysis patch size. We see that Spectral Gap performance steeply improves for forgery sizes larger than the analysis patch size. For both forensic patch sizes, forgeries smaller than the size of the forensic similarity patch were less likely to be detected. As a result, we would expect smaller patch sizes to improve performance in small forgery scenarios. This result is consistent with results on the benchmark datasets; methods using a patch size of
performed better on the Columbia dataset, which generally have large forged regions, and methods with patch sizes of
generally performed better on Korus dataset, which has smaller forged regions.
The results of this experiment highlight strong dependence of the proposed approaches on the size of the tampered region. The Spectral Gap method performed well when forgery sizes were equal to or greater than the analysis patch size. At challenging forgery sizes, the Spectral Gap method showed performance improvements over the naive Minimum Similarity approach, demonstrating the importance of identifying community structures captured by the Forensic Similarity Graph. The Modularity Optimization method worked well only when the forgery size was large, demonstrating a dependence on size of the tampered region relative to the size of whole image for this approach.
In this set of experiments, we tested the efficacy of our proposed forgery localization approaches. To do this, we performed forgery localization on tampered images from the three publicly available datasets used in the prior experiments. We evaluated forgery localizations using a variety of scoring measures, and compared to existing-art methods. The methods we compare to include localization methods by Bondi et al. [26] and Huh et al. [27], which were compared against in the previous forgery detection experiments. In addition, we compared to two localization-only techniques, the rich-model based SpliceBuster [13] and deep-learning based NoisePrint [29] methods by Cozzolino et al. The results of our experiments shows that our proposed approaches outperform these existing-art methods in the majority of scenarios.
To construct the forensic similarity graph for each forged image, we first sampled the image using patches of size , with 75% overlap in each dimension. The smaller patch size and higher overlap was chosen to increase spatial resolution. Then, we calculated the forensic similarity between image patches using the deep-learning approach described in [24], trained on four million image patches from 80 camera models. The forensic similarity graph was constructed for each forged image as described in Sec. III. Then, forgery localization was performed according to the proposed methods in Sec. IV. For the Spectral Clustering methods, we tested both cases with the non-normalized Laplacian, and the normalized Laplacian. For the Modularity Optimization approach, we used a edge threshold t = 0.7. Finally, the pixel-level forgery prediction map was created according to the method in Sec. IV-C using a Gaussian smoothing function with window size of
.
A. Localization Performance
Fig. 9 shows un-thresholded localization examples on tampered images from the benchmark databases. From left to right, we show the edited image, the ground truth mask, and the localization results from the SpliceBuster algorithm [13], the Huh et al. algorithm [27], the Noiseprint algorithm [29], and our proposed Spectral Clustering algorithm. Row (a) corresponds to an example tampered image from the Columbia dataset, rows (b)-(e) correspond to tampered images from the Carvalho dataset, and rows (f)-(i) correspond to tampered images from the Korus dataset. These examples show the power of our proposed approach. In many cases, our proposed approach accurately localized the tampered region, even in scenarios where the other algorithms had difficulty such as in the Korus dataset.
In addition, in Fig. 10 we show localization results from images downloaded from a photo-manipulating competition forum on Reddit.com, a social media website. These images contain highly varying and complex tampering techniques. Still, our proposed Spectral Clustering localization algorithm accurately localized the tampered regions of the images. These examples show that our proposed technique is effective on realistically tampered images, including those downloaded from social media websites.
(a)
(b)
(c)
(d)
(e)
(f)
(g)
(h)
(i)
Fig. 9: Localization examples from the benchmarking datasets. Localization is performed using comparison methods, and our proposed Spectral Clustering method. Example images are from the (a) Columbia, (b-e) Carvalho, and (f-i) Korus datasets.
(a)
(b)
(c)
(d)
Fig. 10: Spectral clustering localization examples on images downloaded from the social media website Reddit.com. Image editing credit to Reddit users “artunitinc” (a), “ene due rabe” (b-c), and “Hordon Gayward” (d).
To quantify localization performance, we used several scoring measures. Namely, we used Matthews Correlation Coeffi-cient2 (MCC), the Fscore3, and the Area Under the Curve (AUC) of the reciever operatoring characteristic curve (ROC). The MCC and F
scores require a choice of threshold, we evaluated both cases where the threshold was chosen for each image, according to the approach in [27], and where the threshold was chosen for the entire database. Setting a single threshold for the entire database is a more realistic scenario, since the forensic investigator does not have ground truth data available for each image under investigation and must choose a threshold using outside information. The MCC and F
scores were calculated for each image and then averaged for each benchmark database, the AUC score was calculated using the ROC determined by all pixels in the database.
Table III shows MCC scores for our proposed and comparison approaches on the three benchmark datasets, using per-image thresholds, meaning that a different threshold was chosen for each image. For the Columbia dataset, our proposed methods achieved high scores of 0.84 and above, with the Spectral Clustering method performing the highest of 0.86 among the proposed methods, and just under the score achieved by Huh et al. For the more challenging Carvalho dataset, all three of our proposed methods outperformed
TABLE III: MCC, per image threshold
TABLE IV: MCC, per database threshold
the comparison methods. The Spectral Clustering algorithm achieved an MCC score of 0.80, which is significantly higher than the next highest comparison method. For the most challenging Korus dataset, our proposed spectral clustering achieved the highest MCC score of 0.38.
Table IV shows MCC scores using per-database thresholds, meaning that a single threshold was used for each database and is a more practical scenario. In all three benchmark datasets, our proposed Spectral Clustering algorithm achieved the highest scores. In the Columbia dataset, all three of our proposed methods outperform the Huh et al. method, suggesting that our proposed technique is more consistent and does not require threshold tuning for each image.
Similar trends were found for the Fscores, which are shown with per-image thresholds in Table V and with per-database thresholds in Table VI. Table VII shows the area under the curve (AUC) scores for each method. In general, the three proposed techniques outperform prior art, especially in the more challenging datasets. We note exceptional performance by the Normed Spectral Clustering approach in the
TABLE V: F1, per image threshold
TABLE VI: F1, per database threshold
TABLE VII: Area Under the Curve (AUC)
Carvalho dataset with a score of 0.97 (with next highest prior-art score of 0.76), and the Modularity Optimization in the Korus dataset with a score of 0.73 (with next highest prior-art score of 0.60).
The results of this experiment shows that our proposed community detection techniques, and in particular the Spectral Clustering method, consistently outperforms existing art demonstrating the power of our proposed Forensic Similarity Graph based approach. This was shown on several benchmarking forgery databases with a variety of scoring measures. These and the forgery detection experiments highlight the importance of considering community structures when analyzing image forgeries.
B. Localization with Multiple Tampering Communities
For forgery localization so far we have only been concerned with partitioning the tampered content from the unaltered content. For this reason, we have set the number of localized communities k = 2, where one community is the unaltered content and the other is the tampered content. This approach has been sufficient since the benchmark datasets typically only contain one tampering. However, forged images may contain multiple tampered regions that have undergone different types of manipulations. In which case, a forensic investigator may also wish to partition the different tampered regions, and can do so by considering k > 2, as mentioned in Sec. IV. In this experiment, we considered the case of k > 2 for Spectral Clustering based localization. To do this, we computed the first k eigenvectors of the graph laplacian matrix, and performed k-means clustering using these eigenevectors as feature vectors to determine the community partitions.
One example of an image with multiple tamperings is shown in Fig. 11. In this example, the forged image shows that 1) the person in the background was removed, and 2) the person in the foreground has their arm and leg muscles enlarged via warping. By setting k = 3, we were able to differentiate between the communities associated with the tamperings of the arms and legs (c = 2), and the community associated with the tampering of removing the person in the background (c = 3). By setting k = 2, we still differentiated between the tampered regions and the unaltered parts of the image, but not between the two different tamperings. It is important to note that this partitioned the different tampered regions because they were the result of different manipulation processes, and not because they were in different spatial locations.
Interestingly, we note that by setting k = 2, we still effectively localized the tampered regions. However, some of the tampered areas of removed-persons legs were assigned
Fig. 11: Forgery localization with multiple tampered regions, each with different processing operations. Comparison shows the effect of the number of localized communities k of 3 and 2. Image editing credit to Reddit user “Daexsin.”
to the unaltered community c = 0. This is likely because those areas were cloned from an unaltered region with minimal additional processing, and as a result were more similar to the unaltered community than the other patches in the tampered community. This result suggests that using k = 2 is useful for localizing in general, but results may be improved by correctly setting the value for k.
The result of this experiment shows that by setting k > 2, when appropriate, can localize between different types of tampering. Still, setting k = 2 in the case of multiple tamperings separates the tampered regions from the unaltered regions, and setting the value of k that matches the number of unique tamperings may improve forgery localization.
The execution time of forensic algorithms is an important consideration for the forensic investigator who conducts large scale analyses, such as filtering content at a social media company. To address this, we conducted an experiment to evaluate the computational complexity of the proposed approach. Since the overall approach is composed of several components, we investigated the computation time of each individually. The components are as follows: 1) the CNNbased feature vector extraction step, which is a sub-process
TABLE VIII: Computation Time
of the forensic similarity computation that is particularly computationally demanding [24], performed once per sampled image patch for the n sampled patches, 2) construction of the Forensic Similarity Graph, by computing the forensic similarity matrix, requiring forensic similarity calculations, and finally 3) the community detection step, which for the Spectral Clustering approach primarily consists of computing the eigendecomposition of the forensic similarity matrix, which runs in
time.
To conduct this experiment, we started with a standard 10801080 pixel image. Then, we computed the elapsed processing time of each algorithm component individually. Since the computational complexity of each step is tied to the number of sampled patches, we investigated different patch sizes with different sampling overlap. Computation of features and forensic similarity for graph construction were performed using Tensorflow v1.15.0, community detection was performed using the spectral clustering approach, with the eigendecomposition computed using NumPy v1.17.4. Experiments were conducted on a single processor computer with Intel i7-8700K CPU clocked at 3.70GHz, and Nvidia GeForce GTX 1080 Ti GPU.
Elapsed computation time for each step, and total elapsed computation time, is shown in Table VIII. For a small number of large patches, the feature extraction step dominates computation time. For a large number of small patches, graph construction dominates computation time. For larger images, smaller patch sizes, or more aggressive sample, it may be possible that the community detection step dominates (with the complexity relationship), but we haven’t encountered such a scenario in our investigations.
In addition to patch size and number of sampled patches, the hardware has a significant impact on computation time, as well as the architecture of the CNN-based feature extractor. The feature extractor used for similarity calculations in our approach is relatively light-weight, and increasing complexity of the architecture may increase computation time. Still, the results of this experiment show that forgery detection and localization can be achieved on the order of to
seconds per image.
We proposed an abstract, graph-based representation of an image, which we call the Forensic Similarity Graph. This representation can be used to analyze images for evidence localized tampering with high accuracy.Tampered and unaltered regions form unique structures that align with a concept called “communities” in graph-theory literature. As a result, forgery detection is performed by identifying multiple communities, and forgery localization is performed by partitioning these communities. We experimentally showed that this approach signficantly outperforms naive implementations that do not consider this community structure, including prior art.
This material is based upon work supported by the National Science Foundation under Grant No. 1553610. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the authors and do not necessarily reflect the views of the National Science Foundation.
The authors would also like to thank Luca Bondi for his assistance with this work.
[1] M. C. Stamm, M. Wu, and K. R. Liu, “Information forensics: An overview of the first decade,” IEEE access, vol. 1, pp. 167–200, 2013.
[2] A. C. Popescu and H. Farid, “Exposing digital forgeries by detecting traces of resampling,” Signal Processing, IEEE Transactions on, vol. 53, no. 2, pp. 758–767, 2005.
[3] T. Bianchi and A. Piva, “Image forgery localization via block-grained analysis of JPEG artifacts,” Information Forensics and Security, IEEE Transactions on, vol. 7, no. 3, pp. 1003–1017, 2012.
[4] H. Farid, “Exposing digital forgeries from JPEG ghosts,” Information Forensics and Security, IEEE Transactions on, vol. 4, no. 1, pp. 154– 160, 2009.
[5] S. Ye, Q. Sun, and E.-C. Chang, “Detecting digital image forgeries by measuring inconsistencies of blocking artifact,” in Multimedia and Expo, 2007 IEEE International Conference on. IEEE, 2007, pp. 12–15.
[6] M. C. Stamm and K. J. R. Liu, “Forensic detection of image manipu- lation using statistical intrinsic fingerprints,” Information Forensics and Security, IEEE Transactions on, vol. 5, no. 3, pp. 492–506, 2010.
[7] M. K. Johnson and H. Farid, “Exposing digital forgeries by detecting inconsistencies in lighting,” in Proceedings of the 7th workshop on Multimedia and security. ACM, 2005, pp. 1–10.
[8] F. Matern, C. Riess, and M. Stamminger, “Gradient-based illumination description for image forgery detection,” IEEE Transactions on Information Forensics and Security, vol. 15, pp. 1303–1317, 2019.
[9] M. K. Johnson and H. Farid, “Exposing digital forgeries through chro- matic aberration,” in Proceedings of the 8th Workshop on Multimedia and security. ACM, 2006, pp. 48–55.
[10] O. Mayer and M. C. Stamm, “Accurate and efficient image forgery detection using lateral chromatic aberration,” IEEE Transactions on Information Forensics and Security, vol. 13, no. 7, pp. 1762–1777, 2018.
[11] J. Luk´aˇs, J. Fridrich, and M. Goljan, “Detecting digital image forgeries using sensor pattern noise,” in Electronic Imaging 2006. International Society for Optics and Photonics, 2006, pp. 60 720Y–60 720Y.
[12] M. Chen, J. Fridrich, J. Luk´aˇs, and M. Goljan, “Imaging sensor noise as digital x-ray for revealing forgeries,” in International Conference on Information Hiding. Springer-Verlag, 2007, pp. 342–358.
[13] D. Cozzolino, G. Poggi, and L. Verdoliva, “Splicebuster: A new blind image splicing detector,” in International Workshop on Information Forensics and Security. IEEE, 2015, pp. 1–6.
[14] D. Cozzolino and L. Verdoliva, “Single-image splicing localization through autoencoder-based anomaly detection,” in International Workshop on Information Forensics and Security. IEEE, 2016, pp. 1–6.
[15] L. Verdoliva, “Media forensics and deepfakes: an overview,” arXiv preprint arXiv:2001.06564, 2020.
[16] M. Barni, L. Bondi, N. Bonettini, P. Bestagini, A. Costanzo, M. Maggini, B. Tondi, and S. Tubaro, “Aligned and non-aligned double jpeg detection using convolutional neural networks,” Journal of Visual Communication and Image Representation, vol. 49, pp. 153–163, 2017.
[17] I. Amerini, T. Uricchio, L. Ballan, and R. Caldelli, “Localization of jpeg double compression through multi-domain convolutional neural networks,” in 2017 IEEE Conference on computer vision and pattern recognition workshops (CVPRW). IEEE, 2017, pp. 1865–1871.
[18] Q. Wang and R. Zhang, “Double jpeg compression forensics based on a convolutional neural network,” EURASIP Journal on Information Security, vol. 2016, no. 1, p. 23, 2016.
[19] R. Salloum, Y. Ren, and C.-C. J. Kuo, “Image splicing localization using a multi-task fully convolutional network (mfcn),” Journal of Visual Communication and Image Representation, vol. 51, pp. 201–209, 2018.
[20] J. H. Bappy, C. Simons, L. Nataraj, B. Manjunath, and A. K. Roy- Chowdhury, “Hybrid lstm and encoder–decoder architecture for detection of image forgeries,” IEEE Transactions on Image Processing, vol. 28, no. 7, pp. 3286–3300, 2019.
[21] H. Li and J. Huang, “Localization of deep inpainting using high-pass fully convolutional network,” in Proceedings of the IEEE International Conference on Computer Vision, 2019, pp. 8301–8310.
[22] S.-Y. Wang, O. Wang, A. Owens, R. Zhang, and A. A. Efros, “Detecting photoshopped faces by scripting photoshop,” in Proceedings of the IEEE International Conference on Computer Vision, 2019, pp. 10 072–10 081.
[23] Y. Wu, W. AbdAlmageed, and P. Natarajan, “Mantra-net: Manipulation tracing network for detection and localization of image forgeries with anomalous features,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019, pp. 9543–9552.
[24] O. Mayer and M. C. Stamm, “Forensic similarity for digital images,” IEEE Transactions on Information Forensics and Security, 2019.
[25] P. Zhou, X. Han, V. I. Morariu, and L. S. Davis, “Learning rich features for image manipulation detection,” in IEEE Conference on Computer Vision and Pattern Recognition, 2018, pp. 1053–1061.
[26] L. Bondi, S. Lameri, D. Gera, P. Bestagini, E. J. Delp, and S. Tubaro, “Tampering detection and localization through clustering of camera-based CNN features,” in 2017 IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2017, pp. 1855–1864.
[27] M. Huh, A. Liu, A. Owens, and A. A. Efros, “Fighting fake news: Image splice detection via learned self-consistency,” in ECCV, 2018.
[28] D. Cozzolino and L. Verdoliva, “Camera-based image forgery localiza- tion using convolutional neural networks,” in 2018 26th European Signal Processing Conference (EUSIPCO). IEEE, 2018, pp. 1372–1376.
[29] ——, “Noiseprint: A CNN-based camera model fingerprint,” IEEE Transactions on Information Forensics and Security, vol. 15, pp. 144– 159, 2020.
[30] O. Mayer and M. C. Stamm, “Learned forensic source similarity for unknown camera models,” in International Conference on Acoustics, Speech and Signal Processing. IEEE, 2018, pp. 2012–2016.
[31] S. Fortunato, “Community detection in graphs,” Physics reports, vol. 486, no. 3-5, pp. 75–174, 2010.
[32] B. Bayar and M. C. Stamm, “Constrained convolutional neural networks: A new approach towards general purpose image manipulation detection,” IEEE Transactions on Information Forensics and Security, vol. 13, no. 11, pp. 2691–2706, 2018.
[33] O. Mayer, B. Bayar, and M. C. Stamm, “Learning unified deep-features for multiple forensic tasks,” in Proceedings of the 6th ACM workshop on information hiding and multimedia security, 2018, pp. 79–84.
[34] R. Diestel, “Graph theory,” Graduate Texts in Mathetmatics, vol. 101, 2005.
[35] M. Girvan and M. E. Newman, “Community structure in social and biological networks,” Proceedings of the national academy of sciences, vol. 99, no. 12, pp. 7821–7826, 2002.
[36] V. D. Blondel, J.-L. Guillaume, R. Lambiotte, and E. Lefebvre, “Fast unfolding of communities in large networks,” Journal of statistical mechanics: theory and experiment, vol. 2008, no. 10, p. P10008, 2008.
[37] M. E. Newman, “Fast algorithm for detecting community structure in networks,” Physical review E, vol. 69, no. 6, p. 066133, 2004.
[38] A. Clauset, M. E. Newman, and C. Moore, “Finding community structure in very large networks,” Physical review E, vol. 70, no. 6, p. 066111, 2004.
[39] U. N. Raghavan, R. Albert, and S. Kumara, “Near linear time algorithm to detect community structures in large-scale networks,” Physical review E, vol. 76, no. 3, p. 036106, 2007.
[40] J. Reichardt and S. Bornholdt, “Statistical mechanics of community detection,” Physical review E, vol. 74, no. 1, p. 016110, 2006.
[41] P. Pons and M. Latapy, “Computing communities in large networks using random walks,” in International symposium on computer and information sciences. Springer, 2005, pp. 284–293.
[42] U. Von Luxburg, “A tutorial on spectral clustering,” Statistics and computing, vol. 17, no. 4, pp. 395–416, 2007.
[43] M. J. Rattigan, M. Maier, and D. Jensen, “Graph clustering with network structure indices,” in Proceedings of the 24th international conference on Machine learning, 2007, pp. 783–790.
[44] M. A. Oikawa, Z. Dias, A. de Rezende Rocha, and S. Goldenstein, “Manifold learning and spectral clustering for image phylogeny forests,” IEEE Transactions on Information Forensics and Security, vol. 11, no. 1, pp. 5–18, 2015.
[45] F. R. Chung, “Spectral graph theory (cbms regional conference series in mathematics, no. 92),” 1996.
[46] M. E. Newman and M. Girvan, “Finding and evaluating community structure in networks,” Physical review E, vol. 69, no. 2, 2004.
[47] G. Csardi and T. Nepusz, “The igraph software package for complex network research,” InterJournal, vol. Complex Systems, p. 1695, 2006. [Online]. Available: http://igraph.org
[48] T. Kamada, S. Kawai et al., “An algorithm for drawing general undirected graphs,” Information processing letters, vol. 31, no. 1, pp. 7–15, 1989.
[49] J. Yang, J. McAuley, and J. Leskovec, “Community detection in net- works with node attributes,” in 2013 IEEE 13th International Conference on Data Mining. IEEE, 2013, pp. 1151–1156.
[50] P. Sah, L. O. Singh, A. Clauset, and S. Bansal, “Exploring community structure in biological networks with random graphs,” BMC bioinformatics, vol. 15, no. 1, p. 220, 2014.
[51] M. D. M. Hosseini and M. Kirchner, “Unsupervised image manipulation localization with non-binary label attribution,” IEEE Signal Processing Letters, vol. 26, no. 7, pp. 976–980, 2019.
[52] M. Fiedler, “Algebraic connectivity of graphs,” Czechoslovak mathematical journal, vol. 23, no. 2, pp. 298–305, 1973.
[53] T. J. De Carvalho, C. Riess, E. Angelopoulou, H. Pedrini, and A. de Rezende Rocha, “Exposing digital image forgeries by illumination color classification,” IEEE Transactions on Information Forensics and Security, vol. 8, no. 7, pp. 1182–1194, 2013.
[54] Y.-F. Hsu and S.-F. Chang, “Detecting image splicing using geometry invariants and camera characteristics consistency,” in International Conference on Multimedia and Expo, 2006.
[55] P. Korus and J. Huang, “Multi-scale analysis strategies in prnu-based tampering localization,” IEEE Trans. on Information Forensics & Security, 2017.
[56] S. Fortunato and M. Barthelemy, “Resolution limit in community detection,” Proceedings of the National Academy of Sciences, vol. 104, no. 1, pp. 36–41, 2007.
[57] D. Shullani, M. Fontani, M. Iuliani, O. Al Shaya, and A. Piva, “Vision: a video and image dataset for source identification,” EURASIP Journal on Information Security, vol. 2017, no. 1, p. 15, 2017.