Introducing new outlier detection method using robust statistical distance in water quality data |
---|
Journal: Desalination and Water Treatment
Authors: 김성수, 채선하, 윤석민, 박노석
Publication date: 2019-05-01
|
Various water quality parameters are now measured in real time to monitor source water as well as the drinking water and wastewater processed by treatment plants. However, such datasets are likely to contain potential outliers caused by the replacement of consumables and equipment calibration, as well as missing data caused by mechanical malfunctions. The commonly used outlier detection method based on multivariate analysis detects outliers using the Mahalanobis distance, derived from the multivariate Gaussian distribution, together with the chi-squared distribution. However, the Mahalanobis distance is itself sensitive to potential outliers and extreme values lying far from the cluster mean, because they distort the mean and covariance estimates on which it relies. Accordingly, we adopted a robust distance based on minimum covariance determinant estimators to minimize the effects of potential outliers and extreme values. In addition, a modified cutoff point of the chi-squared distribution and a cutoff point calculation methodology were applied to reduce the effect of data size when detecting outliers with the robust distance and the chi-squared distribution. |
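
The contrast between the classical and the robust distance described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn's `MinCovDet` as the minimum covariance determinant estimator, uses synthetic two-variable data in place of real water quality measurements, and applies a plain 0.975 chi-squared quantile as the cutoff. The paper's modified cutoff point is not reproduced here, and the helper names `squared_distances` and `flag_outliers` are hypothetical.

```python
import numpy as np
from scipy.stats import chi2
from sklearn.covariance import MinCovDet


def squared_distances(X, random_state=0):
    """Return (classical, robust) squared Mahalanobis distances per row of X."""
    X = np.asarray(X, dtype=float)
    # Classical distance: sample mean and sample covariance of all rows.
    centered = X - X.mean(axis=0)
    inv_cov = np.linalg.inv(np.cov(X, rowvar=False))
    classical = np.einsum("ij,jk,ik->i", centered, inv_cov, centered)
    # Robust distance: location and scatter from the MCD estimator.
    mcd = MinCovDet(random_state=random_state).fit(X)
    robust = mcd.mahalanobis(X)  # scikit-learn returns squared distances
    return classical, robust


def flag_outliers(sq_dist, n_features, quantile=0.975):
    """Flag rows whose squared distance exceeds a chi-squared quantile.

    The fixed 0.975 quantile is an assumption for illustration only; it is
    not the modified cutoff point proposed in the paper.
    """
    cutoff = chi2.ppf(quantile, df=n_features)
    return sq_dist > cutoff


# Synthetic stand-in for two correlated water quality variables,
# followed by 20 contaminated readings (e.g. around a calibration event).
rng = np.random.default_rng(0)
clean = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.6], [0.6, 1.0]], size=200)
spikes = rng.normal(loc=[8.0, -8.0], scale=0.3, size=(20, 2))
X = np.vstack([clean, spikes])

classical, robust = squared_distances(X)
classical_flags = flag_outliers(classical, X.shape[1])
robust_flags = flag_outliers(robust, X.shape[1])

print("flagged (classical):", int(classical_flags.sum()))
print("flagged (robust):   ", int(robust_flags.sum()))
```

Because the contaminated readings inflate the sample covariance, their classical distances shrink toward the cutoff, while the MCD-based distances keep them clearly separated from the main cluster.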