Machine learning and deep learning on health diagnosis of rotating fan

Machine learning and deep learning for the health diagnosis of a rotating machine are studied for smart monitoring. The vibration and sound-pressure signals of a rotating fan driven by a DC motor, detected by an accelerometer and a microphone, are processed by machine and deep learning to diagnose the health of the blades. For machine learning, two methods, support vector machine (SVM) and random forest (RF), are used to classify normal and abnormal status based on three features extracted from the signals in the time and frequency domains. For deep learning, a convolutional neural network (CNN) is used to process the two signals in the time domain; layers of convolution and pooling for feature extraction are followed by two layers of an artificial neural network. After training, a confusion matrix on the test data is given to evaluate the performance. In particular, the importance scores of the input features are analyzed by RF, which helps to screen out non-significant features and thereby reduce overfitting. We demonstrate the three methods (SVM, RF, CNN) on the diagnosis of a rotating fan with a damaged blade to illustrate their characteristics.


INTRODUCTION
In recent decades, offshore wind power has become an important energy source. Accordingly, the health monitoring and diagnosis of wind turbines have become more crucial, particularly with regard to the integrity of the blades. For example, the blades of an offshore wind turbine are sensitive to the wind field, e.g. typhoon frequency and wind speed. Under wind loading, fatigue failure and fracture are the main causes of blade damage. Hence, smart monitoring and diagnosis for the early detection of blade damage in wind turbines is a crucial issue for maintenance. Recently, machine learning methods such as the support vector machine (SVM), artificial neural network (ANN), maximum likelihood classifier, and random forest (RF) have been extensively developed for classification based on given features extracted from detected signals (Kateris et al., 2014). If significant features are available from domain knowledge (e.g. failure modes of the system, physical properties of the signals), machine learning usually yields a good classifier. A confusion matrix can be used to evaluate the performance. In particular, non-significant features can be screened out according to the importance scores of the input features to avoid overfitting, and the learning can then be repeated to improve the reliability of the model. Each method has its own characteristics. For example, RF is built from a certain number of different decision trees. If features cannot be extracted from the detected signals of a very complicated system, it is sometimes difficult to adopt machine learning for classification. On the other hand, deep learning, which does not require features to be provided in advance, has recently become more popular. For example, a convolutional neural network (CNN) with several layers of convolution and pooling can automatically perform internal feature extraction and provide features for the subsequent ANN to carry out the classification (Janssens et al., 2016).
However, each algorithm has its own advantages and disadvantages, and each relies heavily on sufficient and accurate signal data. Although deep learning does not require feature extraction in advance, the number of training samples must be sufficient; otherwise, the classification performance could be unacceptable. Therefore, it is difficult to assess and compare the superiority of the different methods.
In this research, we use a simple rotating machine, a fan driven by a DC motor at constant speed, as a stand-in for a complicated system such as a wind turbine. Two fans are tested: a normal fan with seven blades and a damaged fan that has lost a blade. The accelerometer and microphone signals, which measure the vibration and sound of the rotating fan, are acquired by a data acquisition (DAQ) system from the normal and damaged fans (Heng and Nor, 1998). The time-domain data are then processed by fast Fourier transform (FFT) to obtain the information in the frequency domain (spectrum). Both representations are useful for feature extraction and are available for machine learning (SVM, RF) and deep learning (1D CNN). We compare the performance of each method in classifying the normal and abnormal status of a rotating fan to demonstrate their characteristics in application. Our research may pave the way for the application of machine and deep learning to the health monitoring and diagnosis of rotating systems (Souza et al., 2021).

METHOD
The accelerometer and microphone signals contain information about the rotation of the fan and the vibration of the blades. The former is a low-frequency signal, while the latter is a high-frequency one. Both are mixed together as a modulation in the detected signals, which corresponds to multiplication in the time domain and convolution in the frequency domain. The two types of signals have distinct features because they arise from different physical mechanisms. The accelerometer signal is the vibration response of the structure and blades induced by the rotating rotor, and the microphone signal is the sound pressure produced by the aerodynamic interaction of air with the rotating blades. In particular, there are two distinct features in the vibration signal: the rotational speed of the rotor and the natural frequencies of a blade. After an FFT of the time-series data, these two features can be clearly observed in the frequency domain (spectrum). On the other hand, the features of the sound depend on the number of blades as well as the rotating speed. If a rotating fan is damaged, e.g. by fracture of a blade, the amplitudes of these signals (vibration and acoustic fingerprint) are amplified by the dynamic unbalance of the rotation. Therefore, using suitable features of the vibration and sound signals, e.g. the root mean square (RMS) in the time domain and the amplitudes of the harmonic peaks in the frequency domain, we can identify the health status of a rotating machine. The schematic diagram of signal processing and feature extraction for machine learning (SVM, RF) or deep learning (1D CNN) to build a model for classifying the status of a rotating fan is shown in Fig. 1.
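The features named above (time-domain RMS, the amplitude of the spectral peak at the rotating speed, and the mean amplitude of the first ten rotation harmonics) could be computed roughly as follows. This is a minimal NumPy sketch, not the LabVIEW program used in this work: the function names, the ±2 Hz search tolerance, the amplitude scaling, and the synthetic test signal are all assumptions made for illustration. Only the sampling rate (20 kHz), the record length (1 s), and the rotating speed (26 Hz) are taken from the text.

```python
import numpy as np

FS = 20_000    # sampling rate in Hz (from the text)
F_ROT = 26.0   # rotating speed in Hz (from the text)

def rms(x):
    """Root mean square of a discrete-time signal."""
    x = np.asarray(x, dtype=float)
    return float(np.sqrt(np.mean(x ** 2)))

def amplitude_spectrum(x, fs=FS):
    """One-sided amplitude spectrum; a sine of amplitude A shows up as about A."""
    x = np.asarray(x, dtype=float)
    n = x.size
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    amps = np.abs(np.fft.rfft(x)) * 2.0 / n
    amps[0] /= 2.0  # the DC bin has no mirrored counterpart, so undo the doubling
    return freqs, amps

def peak_amplitude(freqs, amps, f_target, tol=2.0):
    """Largest amplitude within +/- tol Hz of f_target (tol is an assumed value)."""
    mask = np.abs(freqs - f_target) <= tol
    return float(amps[mask].max())

def harmonic_mean_amplitude(freqs, amps, f0=F_ROT, n_harmonics=10):
    """Mean peak amplitude over the harmonics m * f0, m = 1..n_harmonics."""
    peaks = [peak_amplitude(freqs, amps, m * f0) for m in range(1, n_harmonics + 1)]
    return float(np.mean(peaks))

# Synthetic one-second record (illustrative only, not measured data):
# a 26 Hz component of amplitude 0.5 plus its second harmonic of amplitude 0.2.
t = np.arange(FS) / FS
x = 0.5 * np.sin(2 * np.pi * F_ROT * t) + 0.2 * np.sin(2 * np.pi * 2 * F_ROT * t)

freqs, amps = amplitude_spectrum(x)
features = {
    "rms": rms(x),
    "first_peak": peak_amplitude(freqs, amps, F_ROT),
    "harmonic_mean": harmonic_mean_amplitude(freqs, amps),
}
```

With a one-second record the frequency resolution is 1 Hz, so the 26 Hz line falls exactly on a bin; for components between bins, spectral leakage would lower the measured peak and a window function might be worth considering.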
Two fans were prepared for the experiment: a normal fan (control case) with seven complete blades and a fan with a damaged blade (abnormal case), as shown in Fig. 2. Both are driven by a DC motor at 12 V to maintain a constant rotating speed (26 Hz). An accelerometer installed on the fan case measures the vibration, and a microphone measures the acoustic signal, as shown in Fig. 2. Both signals were acquired via a 16-bit DAQ at a sampling rate of 20 kHz. For each sensor, signals of one second in duration are acquired. The digitized data are then transferred to a computer and processed by a LabVIEW program for FFT and feature extraction. Openly available Python libraries (Scikit-learn, Keras) were applied for machine and deep learning. In the following, we used SVM and RF for machine learning, where only two groups are classified (normal and abnormal). For RF, ten decision trees are sufficient for modelling. The schematics of machine learning (SVM, RF) and deep learning (CNN) are

RESULTS AND DISCUSSION
The accelerometer signals in the time domain and frequency domain for both fans are shown in Fig. 4. The microphone results for sound pressure are shown in Fig. 5. Comparing Fig. 4(a) and (b), the root mean square (RMS) of the time-domain vibration signal of the abnormal fan is clearly larger than that of the normal fan. The RMS of a signal in the time domain is given by $x_{\mathrm{RMS}} = \sqrt{\frac{1}{N}\sum_{n=1}^{N} x_n^2}$, where $x_n$ is the discrete-time signal and $N$ is the number of data points. In contrast, the difference in the RMS of sound pressure between the normal and abnormal fans is not significant. The spectra resemble a fishbone with discrete peaks whose spacing equals the rotating speed of the fan (26 Hz). Comparing the spectra in Fig. 4(c) and (d), we found that the amplitude of the first peak, corresponding to the rotating speed (26 Hz), is increased for the damaged fan because of the dynamic unbalance of the centrifugal force of the rotating fan. Fig. 4(c) and (d) also show that the globally maximum peaks in the spectrum are at 627.5 Hz and 1255 Hz, which are the natural frequencies of a blade for the fundamental and second modes. The other peaks in the spectrum could be attributed to the natural vibration modes of the structure. On the other hand, Fig. 5(c) shows that the first peak in the sound-pressure spectrum, at 182 Hz, is the fundamental mode for the normal fan with seven blades, which is seven times the rotational speed (7 × 26 Hz). In addition, the second peak at 364 Hz is the second harmonic of the sound pressure for the normal fan. In contrast, for the damaged fan there are many other discrete peaks in the sound spectrum at integer multiples of the rotating frequency of the fan (26m Hz, m = 1, 2, …), as shown in Fig. 5(d). Again, this is because the dynamic unbalance of the centrifugal force of a rotating fan with a damaged blade induces these harmonic frequencies.
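The peak locations quoted above follow from simple arithmetic on the rotating speed and the blade count. The short sketch below reproduces them; the function names are hypothetical, and the term "blade-passing frequency" (blade count times rotating speed) is standard terminology rather than wording from this paper.

```python
def blade_passing_harmonics(f_rot_hz, n_blades, n_harmonics=2):
    """Frequencies k * n_blades * f_rot for k = 1..n_harmonics (sound of an intact fan)."""
    return [k * n_blades * f_rot_hz for k in range(1, n_harmonics + 1)]

def rotation_harmonics(f_rot_hz, n_harmonics=10):
    """Frequencies m * f_rot for m = 1..n_harmonics (lines excited by unbalance)."""
    return [m * f_rot_hz for m in range(1, n_harmonics + 1)]

normal_fan_peaks = blade_passing_harmonics(26.0, 7)  # seven blades at 26 Hz
damaged_fan_lines = rotation_harmonics(26.0, 10)

print(normal_fan_peaks)  # [182.0, 364.0]
```

The second list gives the ten frequencies (26 Hz to 260 Hz) at which extra lines would be expected in the sound spectrum of the damaged fan.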
Based on the above findings, we select three major features for machine learning: the RMS of vibration in the time domain, the amplitude of the first peak (rotating speed) of vibration in the frequency domain, and the averaged amplitude of the first ten peaks (26m Hz, m = 1, 2, …, 10) of sound pressure in the frequency domain. First, we used RF with ten decision trees for training and then tested the performance. These trees were randomly generated by the software. The confusion matrix of RF is shown in Fig. 6(a), which indicates that the RF classifier predicts very well. For this simple system of a rotating fan at constant speed, not only RF but also conventional SVM classifies well (not shown here). In addition, we repeated the RF learning using 100 decision trees for training and then calculated the importance scores of the three features. The scores are 0.13, 0.53 and 0.34 for the RMS of vibration, the amplitude of the first peak of vibration, and the averaged amplitude of the first ten harmonics of sound, respectively, as shown in Fig. 6(b).
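A workflow of this kind can be sketched with Scikit-learn as follows. This is an illustration under stated assumptions, not the script used in this work: the feature values are synthetic Gaussian samples invented for the example (no measured data are reproduced), the class means, sample counts and the 70/30 train/test split are arbitrary, and the resulting importance scores will therefore differ from those reported in Fig. 6(b).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n_per_class = 100  # assumed sample count per class

def make_class(means, n):
    """Synthetic feature rows: [vibration RMS, first-peak amplitude, mean sound harmonic]."""
    return rng.normal(loc=means, scale=0.1, size=(n, 3))

# Hypothetical class means; label 0 = normal fan, label 1 = abnormal fan.
X = np.vstack([
    make_class([1.0, 0.2, 0.5], n_per_class),
    make_class([1.2, 0.8, 0.8], n_per_class),
])
y = np.array([0] * n_per_class + [1] * n_per_class)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# 100 trees, matching the number used in the text for the importance analysis.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Rows are true labels, columns are predicted labels.
cm = confusion_matrix(y_test, clf.predict(X_test))

# Impurity-based importance scores; they sum to 1 across the three features.
importances = clf.feature_importances_
print(cm)
print(dict(zip(["rms_vib", "first_peak_vib", "mean_harm_sound"], importances.round(2))))
```

A feature with a score close to zero would be a candidate for removal before retraining, which is the screening step described in the text.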
From the scores, we can tell that the amplitude corresponding to the rotation speed (26 Hz) is the major feature and the RMS is a minor one. This is because the dynamic unbalance during rotation caused by a broken blade amplifies the vibration amplitude per cycle, which can be detected by the accelerometer. In contrast, it is difficult to detect the unbalance from the sound pressure. Using this method, we can screen out the non-significant features to improve the learning and avoid overfitting. For CNN learning, we used the vibration data in the time domain for training. Fig. 7 shows the accuracy and loss versus training epoch and the confusion matrix of the CNN. The model takes only two epochs to converge, as shown in Fig. 7(a). The confusion matrix illustrates that the CNN can accurately classify the health status. In addition, we used the vibration data in the frequency domain for CNN training, and the corresponding confusion matrix also shows good classification performance.
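To make the role of the convolution and pooling stage concrete, the following NumPy sketch runs a single forward pass of one 1D convolution, a ReLU activation, and max pooling on a synthetic vibration-like signal. It is not the Keras model trained in this work: the three-tap kernel is a fixed, hypothetical example rather than a learned weight, the pool size is arbitrary, the ReLU activation is an assumption, and no training or dense layers are included.

```python
import numpy as np

def conv1d_valid(x, kernel):
    """Sliding dot product without padding (cross-correlation, as in CNN layers)."""
    return np.correlate(np.asarray(x, dtype=float), np.asarray(kernel, dtype=float), mode="valid")

def relu(x):
    """Rectified linear activation."""
    return np.maximum(x, 0.0)

def max_pool1d(x, pool_size):
    """Non-overlapping max pooling; trailing samples that do not fill a window are dropped."""
    n = (x.size // pool_size) * pool_size
    return x[:n].reshape(-1, pool_size).max(axis=1)

# Synthetic one-second record at 20 kHz containing only the 26 Hz rotation component.
fs = 20_000
t = np.arange(fs) / fs
signal = np.sin(2 * np.pi * 26 * t)

kernel = np.array([-1.0, 0.0, 1.0])  # hypothetical difference-like filter
feature_map = max_pool1d(relu(conv1d_valid(signal, kernel)), pool_size=4)

print(feature_map.shape)  # (4999,)
```

In a trained CNN, many such kernels are learned from the data, and the pooled feature maps are flattened and passed to the dense (ANN) layers for classification.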

CONCLUSION
In this paper, machine learning (SVM, RF) and deep learning (CNN) for the health diagnosis of a rotating fan were demonstrated. The vibration and sound-pressure signals of a rotating fan driven by a DC motor were detected by an accelerometer and a microphone and were fed into the machine and deep learning models for classification. Three features are input into the machine learning models: the RMS of vibration in the time domain, the amplitude of the first peak (rotating speed) of vibration in the frequency domain, and the averaged amplitude of the first ten peaks of sound in the frequency domain. For deep learning, the accelerometer data in the time domain or frequency domain were used in a 1D CNN with one layer of convolution and pooling and two layers of ANN for classification. After training, a confusion matrix on the test data was given for each method to evaluate the performance. In particular, from the importance scores of the input features analyzed by RF, we can screen out the non-significant features to improve the learning and avoid overfitting. Since the three selected features are significant, SVM and RF perform very well. On the other hand, the convolution and pooling layers of the CNN automatically perform the feature extraction, so the CNN also performs very well. Our research may pave the way for applications of machine and deep learning to the health monitoring and diagnosis of rotating systems with signals from multiple sensors.