Analyzing the recorded operation data of a nuclear power plant (NPP) can support fault detection and operating experience feedback. However, the recorded operation data contain missing values, which lower data quality and affect the accuracy of the analysis results. To improve data quality, this work is carried out in two parts. First, to locate missing data accurately, a detection algorithm for missing data in NPP operation parameters is studied based on wavelet analysis, and different judgment criteria are proposed for discrete and continuous missing data. Second, a filling method based on the hot deck algorithm is studied. Because the dynamic properties of the parameters are closely related to the operating state of the NPP, the similarity between operation parameter vectors is used to express the similarity between operating states, which meets the requirements of the hot deck algorithm. To improve the accuracy of the measurement results, the differences between the characteristics of analog parameters and switch parameters are taken into account: a similarity measure using the Mahalanobis distance is studied for analog parameter vectors, and a matching measure is studied for switch parameter vectors. Finally, operation data are used to build an experimental data set for verifying the algorithm. The results show that the designed algorithm performs much better than the mean interpolation method and LSTM.
With the development and application of digital instrumentation and control (I&C) systems in nuclear power plants (NPPs), the capacity for data storage and analysis has improved. Many big data analysis algorithms, such as machine learning and deep learning, can be used to analyze the operation data. Most operation data collection, transmission, and storage in an NPP is carried out by automatic instruments. Because of environmental interference, the inherent characteristics of the instruments, and other factors, NPP operation data may go missing at random, which can directly affect the data analysis. Research on methods to improve data quality, known as data cleaning, has therefore received widespread attention. Data cleaning aims to identify and correct noise in the data and to minimize its impact on the analysis results. Therefore, before big data analysis algorithms are applied in NPPs, it is necessary to study algorithms for detecting and filling missing data in operation parameters. Noise in data mainly includes missing data, redundant data, conflicting data, and erroneous data, which are collectively called dirty data. The widely studied data cleaning methods currently include those based on removal, direct manipulation, models, and imputation.
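The hot deck filling idea described above, with the Mahalanobis distance for analog parameter vectors and a matching measure for switch parameter vectors, could be sketched roughly as follows. This is only an illustration under stated assumptions, not the algorithm evaluated in the study: the function names, the conversion of distance to similarity via 1 / (1 + d), the weighted combination of the two measures, and the sample values are all hypothetical. The sketch uses only the Python standard library.

```python
import math


def covariance(rows):
    """Sample covariance matrix of a list of equal-length vectors."""
    n, dim = len(rows), len(rows[0])
    mean = [sum(r[j] for r in rows) / n for j in range(dim)]
    return [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
             for j in range(dim)] for i in range(dim)]


def solve(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def mahalanobis(u, v, cov):
    """Mahalanobis distance sqrt((u - v)^T cov^-1 (u - v))."""
    diff = [x - y for x, y in zip(u, v)]
    quad = sum(d * s for d, s in zip(diff, solve(cov, diff)))
    return math.sqrt(max(0.0, quad))


def simple_matching(u, v):
    """Fraction of switch positions on which the two vectors agree."""
    return sum(x == y for x, y in zip(u, v)) / len(u)


def hot_deck_fill(target_analog, target_switch, donors, weight=0.5):
    """Return the value held by the most similar complete record (donor).

    donors: list of (analog_vector, switch_vector, value) tuples.
    weight: assumed trade-off between analog and switch similarity.
    """
    cov = covariance([d[0] for d in donors])

    def similarity(donor):
        analog_sim = 1.0 / (1.0 + mahalanobis(target_analog, donor[0], cov))
        switch_sim = simple_matching(target_switch, donor[1])
        return weight * analog_sim + (1.0 - weight) * switch_sim

    return max(donors, key=similarity)[2]


# Illustrative records only: two analog readings, two switch states,
# and the parameter value a donor would contribute.
donors = [
    ([290.0, 15.2], [1, 0], 101.0),
    ([291.0, 15.6], [1, 0], 102.5),
    ([300.0, 15.3], [0, 1], 120.0),
    ([301.0, 15.9], [0, 1], 121.0),
]
filled = hot_deck_fill([290.5, 15.5], [1, 0], donors)
```

In this sketch the missing value is copied from whichever complete record looks most like the incomplete one, so the filled value always comes from an operating state that was actually observed. How the two similarity measures are combined is a design choice that the text above does not specify.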