Interference Classification Using Deep Neural Networks

2020·arXiv

Abstract

I. INTRODUCTION

With today's increasing spectrum usage, interference mitigation, especially of narrowband interference, is becoming essential [1]. In the literature, several techniques and algorithms have been proposed to cancel or suppress certain types of interference. For instance, if the interference is a chirp signal, an adaptive wavelet filter can suppress it. Furthermore, time-domain methods such as recursive least squares and least mean squares suppress interference by iteratively training the weights of the channel equalizer [2]. Moreover, interference can also be mitigated in the frequency domain using methods such as the constant modulus algorithm [3].

The plethora of interference sources and mitigation algorithms, along with the exponential growth of wireless systems, necessitates an adaptive system that accounts for multiple interference sources without compromising the bandwidth of the signal-of-interest or increasing the computational complexity of the system. Working under such constraints, the system should first classify the interference type and then select, among various algorithms, one that has been proven to work well under that interference type.

Although few works have investigated interference classification, especially at the physical layer, some research articles have classified interference in the upper layers. For instance, Schmidt et al. [4] classified interference in the MAC layer based on identical features of different 802.11 packets, and [5] applied the classic k-nearest neighbors (kNN) and support-vector machine (SVM) methods to classify indoor WiFi signals based on channel state information features, while a classifier based on the received signal strength indicator can help identify the signal type and traffic pattern in both WiFi [6] and 802.15.4e IoT-type traffic [7]. A closely related problem to interference classification is modulation classification, which is well studied in the research literature. For instance, Fehske et al. [8] classified various modulation techniques by extracting the cyclic features of the signal and using them as input to a neural network. Further improvements of the method in [8] were presented by the authors of [9] and [10]. Moreover, the authors of [11] and [12] applied classical machine learning algorithms, namely SVM [11] and principal component analysis (PCA) [12], to successfully classify the modulation type. However, in both articles the authors assumed an interference-free signal, which suggests that these algorithms might not perform as well when a narrowband interference (NBI) signal is present. Motivated by the recent advances in deep-learning-based computer vision, O'Shea et al. [13] modeled the in-phase and quadrature components as a two-dimensional image. This image-like signal is used as input to a convolutional neural network (CNN) to classify the modulation type of the signal.

Fig. 1. Interference classifier in a cognitive radio.

In this paper, we design an interference classifier, which will be an essential component of future cognitive radio systems. We start by discussing the role of the interference classifier in cognitive systems. Then, we discuss the relationship between modulation and interference classifiers. In addition, we explain how the cyclic spectrum and spectrum information can be used as extracted input features for the deep neural network (DNN) and the impact on classification accuracy when using either input.

II. SYSTEM MODEL

As depicted in Fig. 1, the interference classifier (IC) is the first block in the digital receiver chain, right after the analog-to-digital converter. The classifier receives the signal¹ and converts it into a feature vector. Then, it predicts the type of interference. The output of the classifier is fed to a hub, which triggers an interference cancellation algorithm based on the interference type predicted by the IC. In a more generic system, the output of the cancellation algorithm can be fed to a modulation classifier to further improve the detection of the signal-of-interest. However, this work focuses on the IC part of the system in Fig. 1.

The signal model is shown in the flowchart of Fig. 2. The mathematical model of the continuous-time received signal r(t), is

where x(t) is the transmitted signal, i(t) is the interfering signal, n(t) is the additive white Gaussian noise (AWGN), SIR is the signal-to-interference ratio, and SNR is the signal-to-noise ratio. The continuous-time received signal is passed to an analog-to-digital converter, whose output is then downsampled to reduce the data size. To extract the features of the signal, the downsampled signal is passed to either an FFT block or a cyclic spectrum block. The output of either block is normalized and fed to the IC. Throughout the process in Fig. 2, we assume the following:

1) Information bits are modulated using m-ary phase shift keying (mPSK).

2) The channel is a frequency-flat channel; that is, a fading-free channel.

3) No time, frequency, or phase offset between the transmitter and receiver.
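The signal model under these assumptions can be sketched in a few lines of Python. This is only an illustrative sketch: the scaling r = x + i/sqrt(SIR) + n/sqrt(SNR) with unit-power terms is an assumed form of (1), and the function name, rectangular pulse shape, samples-per-symbol value, and tone interferer are hypothetical choices that do not come from the original dataset generator.

```python
import numpy as np

def received_signal(n_samples=2000, m=4, sps=8, sir_db=0.0, snr_db=10.0,
                    tone_freq=0.1, seed=0):
    """Assumed form: r = x + i/sqrt(SIR) + n/sqrt(SNR), all terms unit power."""
    rng = np.random.default_rng(seed)
    # mPSK symbols on the unit circle, each held for `sps` samples
    # (rectangular pulse, flat channel, no time/frequency/phase offset).
    n_sym = -(-n_samples // sps)  # ceiling division
    symbols = np.exp(2j * np.pi * rng.integers(0, m, n_sym) / m)
    x = np.repeat(symbols, sps)[:n_samples]
    # Example interferer: unit-power complex tone at a normalized frequency.
    i = np.exp(2j * np.pi * tone_freq * np.arange(n_samples))
    # Unit-power complex AWGN.
    n = (rng.standard_normal(n_samples)
         + 1j * rng.standard_normal(n_samples)) / np.sqrt(2)
    sir = 10 ** (sir_db / 10)  # dB to linear scale
    snr = 10 ** (snr_db / 10)
    return x + i / np.sqrt(sir) + n / np.sqrt(snr)
```

With the default values (SIR = 0 dB, SNR = 10 dB), the average power of the returned vector is close to 1 + 1 + 0.1.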

¹ From this point forward, the term "received signal" stands for the sampled received signal; that is, the signal is digital.

Fig. 2. A flowchart of the system model.

Fig. 3. Example spectrum for the signal of interest and several interference types

A. Types of Interferences Considered

In this paper, we consider five typical types of interference i(t) in (1), listed below (the n(t) term is present in all of them):

1) None: only the AWGN, i.e., the n(t) term in (1).

2) Single tone: i(t) is a complex sinusoid with a constant frequency in each example.

3) Chirp: i(t) is a complex sinusoid whose frequency changes either linearly or exponentially with time in each example. The examples are generated by varying the rate of change of the frequency, known as the chirpiness. Each distinct chirpiness value therefore corresponds to a distinct spectrum. This distinction is depicted in Fig. 3, where the fourth and fifth plots show chirp spectra with different chirp rates.

4) Filtered noise: i(t) is similar to n(t) but is passed through a low-pass filter whose transfer function is given by

where a is a parameter that controls the filter opening: the larger the value of a, the wider the bandwidth. The spectrum of i(t) is shown in Fig. 3 for two choices of a, namely a = 0.1 and a = 0.5.

5) Unknown modulated signal: i(t) is a random, modulated, information-carrying signal whose modulation parameters are unknown.
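The first three non-trivial interference types can be illustrated with the short sketch below. The function names are hypothetical, only the linear chirp is shown, and the one-pole recursion in `filtered_noise` is an assumed stand-in for the low-pass filter (its transfer function is not reproduced above); it merely shares the property that a larger a gives a wider bandwidth.

```python
import numpy as np

def tone(n, freq):
    """Complex sinusoid with a constant normalized frequency (cycles/sample)."""
    return np.exp(2j * np.pi * freq * np.arange(n))

def linear_chirp(n, f0, rate):
    """Complex sinusoid with instantaneous frequency f0 + rate * t."""
    t = np.arange(n)
    # The phase is the integral of the instantaneous frequency.
    return np.exp(2j * np.pi * (f0 * t + 0.5 * rate * t ** 2))

def filtered_noise(n, a, seed=0):
    """Complex white noise through y[k] = a*x[k] + (1-a)*y[k-1] (assumed filter)."""
    rng = np.random.default_rng(seed)
    x = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    y = np.empty(n, dtype=complex)
    prev = 0.0
    for k in range(n):
        prev = a * x[k] + (1 - a) * prev
        y[k] = prev
    return y
```

In this sketch, `filtered_noise(n, 0.1)` concentrates its power near zero frequency much more strongly than `filtered_noise(n, 0.5)`, mirroring the two filtered-noise spectra described for Fig. 3.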

B. Cyclic Spectrum and Power Spectrum Density

Since the time-domain information of the received signal can easily be buried in white noise, direct extraction of time-domain features might not be feasible. Nonetheless, since signals used in communication systems exhibit periodicity in their second-order statistics, a useful way to extract time-domain-based features is to exploit the cyclostationary properties of these signals [14], also known as spectral correlation features. The time-smoothed cyclic periodogram is defined as

and

where g(n) is a unit-area weighting function and a(r) is a data-tapering window. The following relation between the cyclic spectrum and the time-smoothed cyclic periodogram holds

Spectral separation parameter is calculated by . While cyclic spectrum is defined as

The cyclic spectrum is much smaller in size than the power spectrum and provides unique, modulation-specific characteristics. The cyclic spectra of the interference types are shown in Fig. 4: the tone and chirp produce identical plots, whereas the filtered noise and the unknown modulated signal do not. Our code is partially based on open-source code and the corresponding work [14]. The PSD, on the other hand, is simple to calculate and requires much less computation: since the cycle frequency is usually not known in advance, an exhaustive search must be applied, so computing the cyclic spectrum takes longer.
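As an illustration of the cheaper of the two feature extractors, the sketch below computes a normalized periodogram with a single FFT. The function name and the max-normalization are assumptions made for this example; the normalization used in the experiments may differ.

```python
import numpy as np

def psd_feature(r):
    """Normalized periodogram of a complex sample vector (assumed pipeline)."""
    # Centre the zero-frequency bin so the feature vector is ordered
    # from negative to positive frequency.
    spectrum = np.fft.fftshift(np.fft.fft(r))
    psd = np.abs(spectrum) ** 2 / len(r)
    # Scale to [0, 1]; assumes the input is not identically zero.
    return psd / psd.max()
```

A single tone yields a feature vector with one dominant bin, which is consistent with the later observation that the tone's spectral information is carried in only a few input elements.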

C. k-Nearest Neighborhood and Support-vector Machine

In this paper, we use kNN and SVM as the baseline classifiers in our design. kNN estimates the type by finding the existing examples with the most similar features; here, the feature can be the spectrum and the similarity can be the Euclidean distance between spectrum vectors. The predicted class is then usually the mode of the neighbors' classes, in other words, the majority vote. Apart from the number of neighbors and the distance metric, kNN requires little parameter tuning or computation, yet it may break down when the class distribution is skewed [15].
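The majority-vote rule described above can be written compactly as follows. This is a minimal sketch using only the Python standard library; the function name and the toy feature vectors in the usage note are illustrative and are not taken from the dataset.

```python
import math
from collections import Counter

def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training vectors closest to `query`."""
    # Pair every training vector's Euclidean distance with its label
    # and sort by distance.
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    # The predicted class is the mode of the k nearest labels.
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]
```

For example, with three training vectors near the origin labeled "tone" and three near (5, 5) labeled "chirp", a query at (0.2, 0.2) is assigned "tone".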

Fig. 4. Cyclic spectrum of interference

On the other hand, SVM classifies data by finding the best hyperplane that separates all data points (feature-label pairs) of one class from those of the other class [15]. The best hyperplane for an SVM is the one with the largest margin between the classes. Mathematically, finding this hyperplane is equivalent to solving the following optimization problem:

In this problem, each training point is associated with a Lagrange multiplier, and C is the box constraint. We use the LIBSVM library [16] to classify the interference. However, a major limitation of SVM that makes it inferior to neural network algorithms is the need to choose an appropriate kernel function, which can be tricky and depends on the dataset.
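For reference, the standard soft-margin SVM dual problem is written out below. It is given here only as the textbook form that matches the description (Lagrange multipliers bounded by a box constraint C); the symbols \alpha_i, y_i, x_i, and the kernel K are assumed notation and may differ from the notation of the original equation.

```latex
\max_{\alpha}\; \sum_{i} \alpha_i
  - \frac{1}{2} \sum_{i} \sum_{j} \alpha_i \alpha_j \, y_i y_j \, K(x_i, x_j)
\quad \text{subject to} \quad
\sum_{i} \alpha_i y_i = 0, \qquad 0 \le \alpha_i \le C .
```

Here x_i is a feature vector, y_i \in \{-1, +1\} is its label, and training points with \alpha_i > 0 are the support vectors.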

D. Random Forest

Another branch of supervised learning is the decision tree, which predicts using a tree-like structure in which a leaf denotes a group of features and a branch represents a weight or probability. It applies recursive binary splitting to further split grouped features into smaller combinations. Moreover, the weights are updated and some leaves are cut off according to a split cost function. One decision-tree method is the random forest, in which each tree gives a classification result and the forest chooses the class with the most votes. It has become popular because it does not need tree pruning, is not sensitive to outliers in the training data, and can infer variable importance [17].
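A minimal usage sketch of a random forest with scikit-learn is given below. The toy data, the number of trees, and the variable names are assumptions for illustration only and do not reflect the settings in Table II.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
# Toy stand-in for spectrum features: 16 bins, and the class depends
# only on bin 3 (hypothetical data, not the paper's dataset).
X = rng.random((200, 16))
y = (X[:, 3] > 0.5).astype(int)

forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
# predict() combines the trees' votes (scikit-learn weights them by each
# tree's class-probability estimate) and returns the winning class.
pred = forest.predict(X)
# feature_importances_ indicates which input bins drive the decision.
top_bin = int(np.argmax(forest.feature_importances_))
```

On this toy problem the most important feature is bin 3, which illustrates how a forest can infer variable importance.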

E. Deep Neural Network

From the viewpoint of a neural network, the problem is to predict the class from the features x using the approximate function given by the neural network, in which each neuron performs a non-linear weighted-sum operation. In the training stage, the training data are fed into the neural network, which outputs a temporary estimate; the corresponding ground-truth label is provided to calculate the loss, and the weights of the neural network are updated according to the loss function. The update is repeated many times until the loss reaches a minimum. Then, in the test stage, the neural network must predict labels for data it has never seen, and the weights are not updated. Compared to the classic methods above, neural networks have the advantages of being robust to outliers and general with respect to input features.

III. PERFORMANCE EVALUATION

The specifications of the signals are listed in Table I. We used MATLAB to generate the datasets, and Python code to build the neural networks and the random forest method from the sklearn library. All of the datasets and source code are publicly available in a GitHub repository with detailed descriptions,² and the results are reproducible.

TABLE I SIGNAL MODEL

TABLE II METHODS PARAMETERS SETTINGS

A. Neural Network Settings

After heuristic tuning, three fully connected layers of size 256 are used, with dropout layers between them, followed by a softmax layer and a classification layer that outputs the decision. Fig. 5 shows how these layers are connected. The detailed specifications of the network setup are listed in Table II, along with the settings of the other learning methods.
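The layer stack can be sketched as a plain NumPy forward pass. Several details are assumptions made for this example and are not confirmed by the text: the ReLU activation, the final linear projection to the number of classes before the softmax, the weight initialization, and applying dropout after every hidden layer. The function names are hypothetical.

```python
import numpy as np

def init_layers(n_in, n_classes, hidden=256, seed=0):
    """Weights for three hidden layers of size `hidden` plus an output layer."""
    rng = np.random.default_rng(seed)
    sizes = [n_in, hidden, hidden, hidden, n_classes]
    return [(rng.standard_normal((a, b)) * np.sqrt(2.0 / a), np.zeros(b))
            for a, b in zip(sizes[:-1], sizes[1:])]

def forward(x, layers, drop_p=0.0, rng=None):
    """Fully connected layers (ReLU assumed), optional dropout, then softmax."""
    h = x
    for w, b in layers[:-1]:
        h = np.maximum(h @ w + b, 0.0)
        if drop_p > 0.0 and rng is not None:
            # Inverted dropout: applied during training only.
            mask = rng.random(h.shape) >= drop_p
            h = h * mask / (1.0 - drop_p)
    w, b = layers[-1]
    logits = h @ w + b
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=1, keepdims=True)  # class probabilities per row
```

The predicted class for each example is the index of the largest probability in its row; at test time the function is called with `drop_p=0.0`.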

B. Input Data

Choosing the input feature strongly impacts the classification accuracy. As depicted in Fig. 6, using the PSD of the received signal results in better accuracy than either using the cyclic spectrum or directly feeding the sampled input signal to the DNN. In fact, as Fig. 6 shows, using the cyclic spectrum of the signal as the input feature performs poorly in the case of filtered noise. In the confusion matrices, each element is a probability, and the color indicates its relative value: the higher the probability, the darker the color.

Fig. 5. Neural Network Layers
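One common way to obtain such probability-valued confusion matrices is to divide each count by its row total, so that every row sums to one. The sketch below assumes this row normalization (rows are true classes, columns are predicted classes) and integer class labels; the function name is hypothetical.

```python
def normalized_confusion(true_labels, pred_labels, n_classes):
    """Row-normalized confusion matrix: entry [t][p] estimates P(pred=p | true=t)."""
    counts = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(true_labels, pred_labels):
        counts[t][p] += 1
    rows = []
    for row in counts:
        total = sum(row)
        # Guard against classes that never appear in the test set.
        rows.append([c / total if total else 0.0 for c in row])
    return rows
```

For example, if class 0 appears twice and is predicted once as class 0 and once as class 1, the first row is [0.5, 0.5, 0.0] in a three-class problem.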

C. Neural Network versus Others

The comparisons between SVM, kNN, random forest, and DNN are also illustrated in Fig. 6, using the same training and testing data. In the case of SVM, the kernel function is a Gaussian radial basis function. Compared to the DNN, kNN and SVM suffer from overfitting, and neither of them recognizes the tone well, since its spectral information is carried in only a few elements of the input. Similarly, the random forest has lower accuracy for the single tone, because it does not fully utilize all features of the input signal during training. Hence, we conclude that the classic methods can miss some important details during training.
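For completeness, a Gaussian radial basis function kernel can be evaluated as in the sketch below. The parameterization exp(-gamma * ||u - v||^2) follows the common LIBSVM convention and is assumed here; the kernel width used in the experiments is not stated in the text, so `gamma=1.0` is only a placeholder.

```python
import math

def rbf_kernel(u, v, gamma=1.0):
    """Gaussian RBF kernel exp(-gamma * ||u - v||^2) between two vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-gamma * sq_dist)
```

The kernel equals 1 for identical vectors and decays toward 0 as the spectrum vectors move apart, so a larger `gamma` makes the decision boundary more local.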

D. Downsampled data

To further ease the computation of the Fourier transform and the neural network training, a practical approach is to downsample the received data in each example, keeping one sample out of every d samples, so as to reduce the input size. In theory, downsampling in the time domain corresponds to scaling down the spectrum amplitude in the frequency domain while still keeping most spectral features. Note that one iteration means one batch has been processed, one epoch means all data have been processed once, and the batch size defines the number of samples to work through before updating the internal model parameters. Here, the same raw training data and the same testing data are used: the default number of samples is 2000, the number of samples after downsampling is 500, and another baseline uses a sample size of 500 without downsampling. Fig. 8 shows that training on downsampled data converges much faster than on the original data while maintaining similar performance, whereas the smaller sample size may lack sufficient information and cannot reach near-optimal prediction.
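The downsampling step and the iteration/epoch bookkeeping can be illustrated as follows. This is a sketch of plain decimation without an anti-aliasing filter; the helper names and the batch size in the usage note are hypothetical.

```python
import numpy as np

def downsample(r, d):
    """Keep one sample out of every d samples (plain decimation)."""
    return r[::d]

def iterations_per_epoch(n_examples, batch_size):
    """Number of batches needed to process every example once."""
    return -(-n_examples // batch_size)  # ceiling division
```

With 2000 samples per example and d = 4, `downsample` returns 500 samples. For a tone with an integer number of cycles in the window, the unnormalized FFT peak shrinks by the same factor of 4 while remaining a single sharp peak. As a bookkeeping example, 1000 training examples with a batch size of 64 give 16 iterations per epoch.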

E. Performance Over Various SNR and SIR

Fig. 9 shows that the classifier works well under either high SNR or low SIR. The lower the SNR, the more the noise influences the prediction, while a lower SIR makes the interference more apparent in the power spectral density, which makes classification easier. To sum up, the accuracy degrades as we move away from these two favorable regions. Nonetheless, if the minimum acceptable accuracy is 90%, the classifier has a dynamic range of at least 15 dB.

F. Multiple Interference

The classifier also extends to combinations of the five interference types, in which more than one interference type is added to the signal-of-interest. In this paper, we generated four different combinations of interference types and added them to the signal-of-interest. These four combinations are tone-chirp, tone-filtered, chirp-filtered, and tone-chirp-filtered. The confusion matrix in Fig. 10 shows that the classifier performs fairly well under these combinations, with tone-chirp-filtered being hard to classify and tone-chirp being relatively easy to recognize.

Fig. 6. Confusion matrices for all approaches

Fig. 7. Prediction accuracy for all approaches

IV. CONCLUSIONS AND FURTHER WORK

In this paper, we showed that a deep learning algorithm can be used to classify various types of interference that are added to a signal-of-interest. In addition, we showed that using the PSD of the input signal yields higher accuracy than using either the cyclic spectrum of the signal or the raw time-domain samples. Finally, we showed that prediction accuracy improves when using the DNN framework, which leverages the full information of the input signal, compared to using the random forest, kNN, or SVM algorithm.

Fig. 8. Training accuracy over iterations

Fig. 9. Prediction accuracy versus SNR/SIR

The classifier could be extended to classify other types of interference that are not considered in this work. Also, a combination of modulation and interference classifiers can be used to improve prediction accuracy. Furthermore, unsupervised learning techniques would be useful for classifying unlabeled interference types, a common machine learning paradigm known as clustering.

Fig. 10. Prediction accuracy

ACKNOWLEDGMENT

The authors would like to thank Virginia Tech Research Computing Center (ARC) for providing ample computational resources that allowed fast verification of the proposed system.

REFERENCES

[1] K. Shi, Y. Zhou, B. Kelleci, T. W. Fischer, E. Serpedin, and A. I. Karsilayan, “Impacts of narrowband interference on OFDM-UWB receivers: Analysis and mitigation,” IEEE Transactions on Signal Processing, vol. 55, no. 3, pp. 1118–1128, March 2007.

[2] M. H. Hayes, Statistical digital signal processing and modeling. John Wiley & Sons, 2009.

[3] A. van der Veen and A. Paulraj, “An analytical constant modulus algorithm,” IEEE Transactions on Signal Processing, vol. 44, no. 5, pp. 1136–1155, May 1996.

[4] M. Schmidt, D. Block, and U. Meier, “Wireless interference identification with convolutional neural networks,” in 2017 IEEE 15th International Conference on Industrial Informatics (INDIN), July 2017, pp. 180–185.

[5] Z. Yang, Y. Wang, L. Zhang, and Y. Shen, “Indoor interference classification based on WiFi channel state information,” in International Conference on Security, Privacy and Anonymity in Computation, Communication and Storage. Springer, 2018, pp. 136–145.

[6] K. R. Chowdhury and I. F. Akyildiz, “Interferer classification, channel selection and transmission adaptation for wireless sensor networks,” in 2009 IEEE International Conference on Communications. IEEE, 2009, pp. 1–5.

[7] S. Zacharias, T. Newe, S. O'Keeffe, and E. Lewis, “Identifying sources of interference in RSSI traces of a single IEEE 802.15.4 channel,” in The 8th International Conference on Wireless and Mobile Communications, 2012, pp. 408–414.

[8] A. Fehske, J. Gaeddert, and J. H. Reed, “A new approach to signal classification using spectral correlation and neural networks,” in First IEEE International Symposium on New Frontiers in Dynamic Spectrum Access Networks, 2005. DySPAN 2005., Nov 2005, pp. 144–150.

[9] W. C. Headley, J. D. Reed, and C. R. C. M. da Silva, “Distributed cyclic spectrum feature-based modulation classification,” in 2008 IEEE Wireless Communications and Networking Conference, March 2008, pp. 1200–1204.

[10] B. Le, T. Rondeau, D. Maldonado, and C. W. Bostian, “Modulation identification using neural networks for cognitive radios,” in Software Defined Radio Forum Technical Conference, 2005.

[11] H. Hu, Y. Wang, and J. Song, “Signal classification based on spectral correlation analysis and SVM in cognitive radio,” in 22nd International Conference on Advanced Information Networking and Applications (AINA 2008), March 2008, pp. 883–887.

[12] A. Elrharras, R. Saadane, M. Wahbi, and A. Hamdoun, “Signal detection and automatic modulation classification based spectrum sensing using PCA-ANN with real word signals,” Applied Mathematical Sciences, vol. 8, no. 160, pp. 7959–7977, 2014.

[13] T. J. O'Shea, J. Corgan, and T. C. Clancy, “Convolutional radio modulation recognition networks,” in International conference on engineering applications of neural networks. Springer, 2016, pp. 213–226.

[14] J. Antoni, “Cyclic spectral analysis in practice,” Mechanical Systems and Signal Processing, vol. 21, no. 2, pp. 597–630, 2007.

[15] J. A. Suykens and J. Vandewalle, “Least squares support vector machine classifiers,” Neural processing letters, vol. 9, no. 3, pp. 293–300, 1999.

[16] C.-C. Chang and C.-J. Lin, “LIBSVM: A library for support vector machines,” ACM transactions on intelligent systems and technology (TIST), vol. 2, no. 3, p. 27, 2011.

[17] E. Jedari, Z. Wu, R. Rashidzadeh, and M. Saif, “Wi-fi based indoor location positioning employing random forest classifier,” in 2015 international conference on indoor positioning and indoor navigation (IPIN). IEEE, 2015, pp. 1–5.
