The electrocardiogram (ECG) is a recording of the electrical activity of the heart, obtained with electrodes placed on the human body. It is one of the most important tools for diagnosing heart disease. An ECG is usually interpreted by a physician, but automatic ECG analysis has recently attracted great interest.
The ECG analysis includes detection of QRS complexes, P and T waves, followed by an analysis of their shapes, amplitudes, relative positions, etc. (see Fig. 1). The detection of onsets and offsets of QRS complexes and P and T waves is also called segmentation or delineation of the ECG signal.
Accurate automatic ECG segmentation is a difficult problem for several reasons. The P wave has a small amplitude and can be hard to identify because of interference from electrode movement, muscle noise, etc. P and T waves can be biphasic, which makes it difficult to determine their onsets and offsets accurately. Some cardiac cycles may lack some of the standard segments; for example, the P wave may be missing.
Among the methods of automatic ECG segmentation, methods using wavelet transforms have proven to be the best [3,4,7,6,8,9]. In [11], a neural network approach for ECG segmentation is proposed. The segmentation quality turned
Fig. 1. An example of medical segmentation. Yellow corresponds to P waves, red to QRS complexes, and green to T waves. Separate marker symbols denote the onset, the peak, and the offset of each wave.
– in [11], an ensemble of 12 convolutional neural networks is used; here we use a single fully convolutional neural network with skip connections;
– in contrast to the present work, [11] does not use postprocessing;
– in [11], preprocessing is used to remove isoline drift; we process signals as is; in Section 3.3, we will see that the quality of ECG segmentation is high even in the presence of isoline drift.
2.1 Preprocessing
duration, we propose the following preprocessing. Let the sampling frequency of an input signal $x = (x_1, \dots, x_n)$ be $f$, and let the network be trained on signals with the sampling frequency $f'$. Then $T = n/f$ is the signal duration. Convert the input signal as follows.
1. Form an array of time samples $t = (t_1, \dots, t_n)$, where $t_i = (2i - 1)T/(2n)$ are the midpoints of the time intervals formed by dividing the segment $[0, T]$ into $n$ equal parts ($i = 1, 2, \dots, n$).
2. On the set of points $(t_i, x_i)$, construct the cubic spline [2].
3. Form the array of new time samples $t' = (t'_1, \dots, t'_{n'})$, where $t'_j = (2j - 1)T/(2n')$ and $n' = T f'$ ($j = 1, 2, \dots, n'$).
4. Using the cubic spline, find the signal values at $t'$. The resulting array will be the input to the neural network.
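The four steps above can be sketched in plain Python. This is an illustrative sketch, not the authors' implementation: the function names (`midpoints`, `resample`, `f_in`, `f_train`) are hypothetical, the spline uses natural boundary conditions (the text does not say which variant of the cubic spline from [2] is used), and points outside the first and last knots are evaluated with the end polynomial pieces.

```python
from bisect import bisect_right


def midpoints(n, T):
    """Midpoints of n equal sub-intervals of [0, T]: (2i - 1) T / (2n)."""
    return [(2 * i - 1) * T / (2 * n) for i in range(1, n + 1)]


def second_derivatives(t, y):
    """Second derivatives of the natural cubic spline (assumed boundary
    conditions: zero second derivative at both ends), Thomas algorithm."""
    n = len(t)
    M = [0.0] * n
    h = [t[i + 1] - t[i] for i in range(n - 1)]
    c_prime = [0.0] * n
    d_prime = [0.0] * n
    for i in range(1, n - 1):
        sub, diag, sup = h[i - 1], 2 * (h[i - 1] + h[i]), h[i]
        rhs = 6 * ((y[i + 1] - y[i]) / h[i] - (y[i] - y[i - 1]) / h[i - 1])
        denom = diag - sub * c_prime[i - 1]
        c_prime[i] = sup / denom
        d_prime[i] = (rhs - sub * d_prime[i - 1]) / denom
    for i in range(n - 2, 0, -1):
        M[i] = d_prime[i] - c_prime[i] * M[i + 1]
    return M


def spline_eval(t, y, M, x):
    """Evaluate the spline at x; the end pieces are reused for extrapolation."""
    i = min(max(bisect_right(t, x) - 1, 0), len(t) - 2)
    h = t[i + 1] - t[i]
    a, b = t[i + 1] - x, x - t[i]
    return ((M[i] * a ** 3 + M[i + 1] * b ** 3) / (6 * h)
            + (y[i] - M[i] * h * h / 6) * a / h
            + (y[i + 1] - M[i + 1] * h * h / 6) * b / h)


def resample(x, f_in, f_train):
    """Resample x (sampled at f_in Hz, at least 2 samples) to f_train Hz."""
    n = len(x)
    T = n / f_in                      # signal duration
    n_new = round(T * f_train)        # number of samples at the new rate
    t = midpoints(n, T)               # step 1
    M = second_derivatives(t, x)      # step 2
    return [spline_eval(t, x, M, s)   # steps 3 and 4
            for s in midpoints(n_new, T)]
```

For example, `resample(signal, 50, 500)` turns a 50 Hz recording into an array ten times longer that covers the same duration.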
2.2 The neural network architecture
The architecture of the neural network (see Fig. 2) is similar to the UNet architecture [10]. The input of the neural network is a vector of length l, where l is the length of the ECG signal received from one lead. Each lead is fed to the input of the neural network separately.
Fig. 2. Neural network architecture
The output size is (4, l). Each column of the output matrix contains 4 scores that characterize the network's confidence that the current signal sample belongs to a P, QRS, or T segment or to none of them. The proposed neural network includes the following layers:
The main differences between the proposed network and UNet are as follows:
– we use 1d convolutions instead of 2d convolutions;
– we use a different number of channels and different parameters in the convolutions;
– we use copy + zero pad layers instead of copy + crop layers; as a result, in the proposed method the dimension of the output is the same as that of the input; in contrast, the original UNet outputs a segmentation of only a part of the image.
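To illustrate why zero padding preserves the signal length, the sketch below traces a single-channel signal through one pooling and one upsampling step. It is a toy illustration under assumptions not stated in the text: the pooling factor of 2, padding on the right side only, and the helper names are all hypothetical.

```python
def max_pool(signal, k=2):
    """Non-overlapping max pooling; a trailing incomplete window is dropped."""
    return [max(signal[i:i + k]) for i in range(0, len(signal) - k + 1, k)]


def upsample(signal, k=2):
    """Nearest-neighbour upsampling: repeat every value k times."""
    return [v for v in signal for _ in range(k)]


def zero_pad_to(signal, length):
    """Right-pad with zeros until the target length is reached."""
    return signal + [0.0] * (length - len(signal))


skip = [1.0, 3.0, 2.0, 5.0, 4.0]           # encoder feature map, length 5
decoded = upsample(max_pool(skip))          # length 4 after pool + upsample
decoded = zero_pad_to(decoded, len(skip))   # back to length 5
merged = list(zip(skip, decoded))           # "copy": stack the two channels
```

With copy + crop, the skip branch would instead be trimmed to length 4, so the output would be shorter than the input.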
2.3 Postprocessing
3.1 LUDB dataset
with the duration of 10 seconds recorded with the sampling rate of 500 Hz. To compare the algorithms, the dataset was divided into a training set and a test set, where the test set consists of 200 ECG signals borrowed from the original LUDB dataset. Since the proposed neural network processes the leads independently, 255 × 12 = 3060 signals of length 500 × 10 = 5000 were used for training. To prevent overfitting, data augmentation was performed: at each batch iteration, a random continuous ECG fragment of 4 seconds was fed to the input of the neural network.
The LUDB dataset has the following feature. The first and the last one (sometimes two) cardiac cycles are not annotated. At the same time, the first and the last annotated segments are always QRS complexes (see an example in Fig. 1). To make the comparison with the reference segmentation correct, the following modifications were made in the algorithm:
– during augmentation, the first and the last 2 seconds were not used, i.e. subsequences of 4 seconds were chosen so that they start between the 2nd and the 4th second (and thus end between the 6th and the 8th second);
– in order to avoid a large number of false positives, the first and the last cardiac cycles were removed during the validation of the algorithm.
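The augmentation rule above amounts to a random crop whose start is drawn from the interval between the 2nd and the 4th second. A minimal sketch follows; the function name `random_crop` and the per-sample `labels` array are assumptions made for illustration.

```python
import random

FS = 500                 # sampling rate in Hz (from the text)
CROP_SEC = 4             # fragment duration in seconds (from the text)
FIRST_START_SEC = 2      # earliest allowed start of the fragment
LAST_START_SEC = 4       # latest allowed start of the fragment


def random_crop(signal, labels, rng=random):
    """Cut the same random 4-second window from a 10-second signal
    and from its per-sample annotation."""
    start = rng.randint(FIRST_START_SEC * FS, LAST_START_SEC * FS)
    stop = start + CROP_SEC * FS
    return signal[start:stop], labels[start:stop]
```

Drawing a new window at every batch iteration means the network rarely sees exactly the same fragment twice.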
3.2 Comparison of the algorithms
Table 1 contains the results of the experiment and compares them with one of the best wavelet-based segmentation algorithms [4] and with the neural network segmentation algorithm [11]. The last line shows the characteristics of our algorithm, which analyzes the leads independently, for a test set consisting of 200 × 12 = 2400 ECG signals. The quality of the algorithms is determined using the following procedure. According to the recommendations of the Association for the Advancement of Medical Instrumentation [1], an onset or an offset is considered correctly detected if its deviation from the doctor's annotation does not exceed, in absolute value, the tolerance of 150 ms. If an algorithm correctly detects a significant point (an onset or an offset of one of the P, QRS, T segments), then a true positive (TP) is counted and the time deviation (error) of the automatically determined point from the manually marked point is measured. If there is no corresponding significant point in the test sample within the 150 ms tolerance neighborhood of the detected significant point, then a type I error is counted (false positive, FP). If the algorithm does not detect a significant point, then a type II error is counted (false negative, FN). Following [3,6,8,9], we measure the following quality metrics:
– the mean error m;
– the standard deviation of the mean error;
– the sensitivity, or recall, Se = TP/(TP + FN);
– the positive predictive value, or precision, PPV = TP/(TP + FP).
Here TP, FP, FN denote the total number of correct solutions, type I errors, and type II errors, respectively. We also give the value of
Table 1. The comparison of ECG segmentation algorithms
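The evaluation procedure described above can be sketched for one type of significant point as follows. The matching rule (each detected point is paired with the nearest still-unmatched reference point) and the use of the population standard deviation of the errors are assumptions; the text does not specify these details, and the function name `evaluate` is hypothetical.

```python
from math import sqrt


def evaluate(detected_ms, reference_ms, tol_ms=150.0):
    """Count TP/FP/FN and compute m, its spread, Se and PPV.
    All positions are given in milliseconds."""
    unmatched = sorted(reference_ms)
    errors = []
    fp = 0
    for d in sorted(detected_ms):
        # Nearest reference point that has not been matched yet.
        best = min(unmatched, key=lambda r: abs(d - r), default=None)
        if best is not None and abs(d - best) <= tol_ms:
            errors.append(d - best)      # signed time deviation of a TP
            unmatched.remove(best)
        else:
            fp += 1                      # type I error
    tp, fn = len(errors), len(unmatched)  # leftovers are type II errors
    nan = float("nan")
    mean = sum(errors) / tp if tp else nan
    std = sqrt(sum((e - mean) ** 2 for e in errors) / tp) if tp else nan
    return {
        "TP": tp, "FP": fp, "FN": fn,
        "m": mean, "std": std,
        "Se": tp / (tp + fn) if tp + fn else nan,
        "PPV": tp / (tp + fp) if tp + fp else nan,
    }
```

For instance, with reference points at 100, 900, and 1700 ms and detections at 110, 890, and 1300 ms, two detections fall within the tolerance, one is a false positive, and one reference point is missed.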
Analyzing the results, we can draw the following conclusions:
– the indicators Se and PPV for the proposed algorithm are the highest or nearly the highest for all types of ECG segments;
– averaging the answer over all 12 leads helps to detect the complexes better: it improves both Se and PPV; however, the detection of the onsets and the offsets worsens, which is indicated by the growth of the standard deviation in all indicators;
– to detect the QRS complexes, it is enough to use only lead II, since it gives the highest quality of their detection; this approach reduces the running time of the algorithm by a factor of 12, because the other leads are not passed through the neural network;
– the best values are given by the algorithm [4];
– the results of the proposed approach surpass those of the other neural network approach [11] for all indicators.
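One way to read "averaging the answer over all 12 leads" is to average the per-sample class scores across leads and then take the most confident class for each sample. The sketch below follows that reading; it is an assumption, since the text does not say at which stage the averaging is done, and the names `average_leads` and `decode` are hypothetical.

```python
CLASSES = ("P", "QRS", "T", "none")


def average_leads(scores_per_lead):
    """scores_per_lead[lead][class][sample] -> averaged [class][sample]."""
    n_leads = len(scores_per_lead)
    return [[sum(col) / n_leads for col in zip(*rows)]
            for rows in zip(*scores_per_lead)]


def decode(scores):
    """Turn a (4, l) score matrix into (class, first, last) segments
    by taking the highest-scoring class at every sample."""
    n = len(scores[0])
    labels = [max(range(len(CLASSES)), key=lambda c: scores[c][j])
              for j in range(n)]
    segments, start = [], 0
    for j in range(1, n + 1):
        if j == n or labels[j] != labels[start]:
            if CLASSES[labels[start]] != "none":
                segments.append((CLASSES[labels[start]], start, j - 1))
            start = j
    return segments
```

Passing the scores of a single lead to `decode` gives the per-lead segmentation, so the same helper covers both variants compared in the list above.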
3.3 Examples of the resulting segmentations
An example of the segmentation of an ECG with a pathology (ventricular extrasystole) is shown in Fig. 5. An example of the segmentation of an ECG obtained from another type of ECG monitor is shown in Fig. 6. It is characterized by high T waves and a strong degree of smoothness. Figure 7 presents an example of the segmentation of an ECG with the frequency of 50 Hz, resampled using a cubic spline to the frequency of 500 Hz.
Fig. 3. An example of low frequency noise ECG segmentation (breathing)
Fig. 4. An example of high frequency noise ECG segmentation
Fig. 5. An example of ECG segmentation with pathology (ventricular extrasystole)
Fig. 6. An example of segmentation of an ECG obtained from another type of ECG monitor. It is characterized by high T waves and a strong degree of smoothness.
Fig. 7. An example of segmentation of an ECG with the frequency of 50 Hz, resampled using a cubic spline to the frequency of 500 Hz
The paper describes an algorithm based on a UNet-like neural network that is capable of constructing the ECG segmentation quickly and efficiently. Our method uses a small number of parameters and generalizes well. In particular, it adapts to different sampling rates and generalizes to various types of ECG monitors. The proposed approach is superior to other state-of-the-art segmentation methods in terms of quality. The F1-measures for the detection of onsets and offsets of P waves, T waves, and QRS complexes are at least 97.8%, 99.5%, and 99.9%, respectively.
In the future, the proposed segmentation can be used for diagnostic purposes. Using the segmentation, one can compute useful signal characteristics or feed the neural network output directly into another network for automated diagnostics, with the hope of improving the quality of classification.
In addition, one can try to improve the algorithm itself. In particular, the loss function used in the proposed neural network probably does not quite reflect the quality of segmentation. For example, it does not take into account some features of the ECG (e. g. two adjacent QRS complexes cannot be too close to each other or too far from each other).
Acknowledgement. The authors are grateful to the referee for valuable suggestions and comments. The work is supported by the Ministry of Education and Science of Russian Federation (project 14.Y26.31.0022).
1. Association for the Advancement of Medical Instrumentation. ANSI/AAMI EC57:1998/(R)2008 (Revision of AAMI ECAR:1987), 1999.
2. De Boor, C.: A practical guide to splines. Springer-Verlag (1978)
3. Bote, J.M., Recas, J., Rincon, F., Atienza, D., Hermida, R.: A modular low-complexity ECG delineation algorithm for real-time embedded systems. IEEE Journal of Biomedical and Health Informatics. 22, 429–441 (2017)
4. Kalyakulina, A.I., Yusipov, I.I., Moskalenko, V.A., Nikolskiy, A.V., Kozlov, A.A., Zolotykh, N.Y., Ivanchenko, M.V.: Finding Morphology Points of Electrocardiographic-Signal Waves Using Wavelet Analysis. Radiophysics and Quantum Electronics, 61(8-9), 689–703 (2019)
5. Kalyakulina, A.I., Yusipov, I.I., Moskalenko, V.A., Nikolskiy, A.V., Kozlov, A.A., Kosonogov, K.A., Zolotykh, N.Yu., Ivanchenko, M.V.: LU electrocardiography database: a new open-access validation tool for delineation algorithms. arXiv:1809.03393 (2018)
6. Di Marco, L.Y., Chiari, L.: A wavelet-based ECG delineation algorithm for 32-bit integer online processing. Biomedical Engineering Online. 10(23) (2011)
7. Li, C., Zheng, C., Tai, C.: Detection of ECG characteristic points using wavelet transforms. IEEE Transactions on Biomedical Engineering, 42(1), 21–28 (1995)
8. Martinez, A., Alcaraz, R., Rieta, J.J.: Automatic electrocardiogram delineator based on the phasor transform of single lead recordings. Computing in Cardiology. IEEE. 987–990 (2010)
9. Rincon, F., Recas, J., Khaled, N., Atienza, D.: Development and evaluation of multilead wavelet-based ECG delineation algorithms for embedded wireless sensor nodes. IEEE Transactions on Information Technology in Biomedicine. 15, 854–863 (2011)