Air quality monitoring data are the most important source of public information
about air quality and are widely used in many research fields, such as
air quality forecasting and the analysis of haze episodes. However,
these monitoring data contain outliers caused by instrument malfunctions,
harsh environmental conditions, and the limitations of measurement methods.
Figure: Four types of outliers in ambient air quality monitoring data.
In practice, manual inspection is often applied to identify these outliers.
However, as the amount of data grows rapidly, this method becomes increasingly
impractical. To deal with the problem, Dr. Huangjian Wu and Associate Professor Xiao Tang from
the Institute of Atmospheric Physics, Chinese Academy of Sciences, propose a fully
automatic outlier detection method based on the probability of residuals. The
method applies several regression models, and the regression residuals are
used to discriminate outliers. Based on the standard
deviations of the residuals, probabilities of the residuals can be calculated,
and the observations with small probabilities are tagged as outliers and
removed by a computer program. Their findings are published in Advances in Atmospheric Sciences.
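The residual-probability idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the median-of-neighbours prediction stands in for their regression models, and the normal-distribution assumption, the robust (MAD-based) standard deviation, the `1e-4` threshold, and the function names `residual_probabilities` and `flag_outliers` are all assumptions made for illustration.

```python
import math
import statistics


def residual_probabilities(values, half_window=2):
    """Return (residuals, probabilities) for a list of hourly observations.

    Assumption: each observation is predicted by the median of its
    neighbours, a simple stand-in for a temporal regression model.
    """
    residuals = []
    for i, value in enumerate(values):
        lo = max(0, i - half_window)
        hi = min(len(values), i + half_window + 1)
        neighbours = values[lo:i] + values[i + 1:hi]
        residuals.append(value - statistics.median(neighbours))

    # Robust estimate of the residual standard deviation (scaled median
    # absolute deviation), so outliers do not inflate the spread.
    centre = statistics.median(residuals)
    mad = statistics.median(abs(r - centre) for r in residuals)
    sigma = max(1.4826 * mad, 1e-9)  # guard against a zero spread

    # Two-sided tail probability, assuming zero-mean normal residuals.
    probabilities = [
        math.erfc(abs(r) / (sigma * math.sqrt(2))) for r in residuals
    ]
    return residuals, probabilities


def flag_outliers(values, threshold=1e-4, half_window=2):
    """Flag observations whose residual probability is below the threshold."""
    _, probabilities = residual_probabilities(values, half_window)
    return [p < threshold for p in probabilities]


# Hypothetical hourly concentrations with one spike.
hourly = [35.0, 36.0, 38.0, 37.0, 250.0, 39.0, 40.0, 38.0, 37.0, 36.0]
flags = flag_outliers(hourly)
```

In this made-up series only the spike at index 4 receives a probability below the threshold; the remaining observations stay unflagged.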
“By introducing the probabilities of residuals, multiple rules can be used for
identifying outliers on the same framework,” says Dr. Wu. “For example, by
assuming that the residuals of spatial regression and temporal regression obey
a bivariate normal distribution, spatial and temporal consistencies can be
simultaneously evaluated for better identification of outliers.”
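As a rough illustration of the bivariate case mentioned in the quote, the sketch below computes how unlikely a pair of spatial and temporal residuals is under a zero-mean bivariate normal distribution. It relies on a standard result: the squared Mahalanobis distance of a bivariate normal vector follows a chi-square distribution with two degrees of freedom, whose tail probability is exp(-d²/2). The function name, the standard deviations, and the correlation `rho` are hypothetical inputs, not values from the paper.

```python
import math


def bivariate_tail_probability(r_spatial, r_temporal,
                               sd_spatial, sd_temporal, rho):
    """Probability of a residual pair at least as extreme as the given one.

    Assumes the two residuals follow a zero-mean bivariate normal
    distribution with the given standard deviations and correlation rho
    (with -1 < rho < 1).
    """
    zs = r_spatial / sd_spatial
    zt = r_temporal / sd_temporal
    # Squared Mahalanobis distance for a 2x2 covariance matrix.
    d2 = (zs * zs - 2.0 * rho * zs * zt + zt * zt) / (1.0 - rho * rho)
    # Chi-square survival function with 2 degrees of freedom.
    return math.exp(-0.5 * d2)


# With positively correlated residuals, two large residuals of the same
# sign are less surprising than two large residuals of opposite sign.
p_same = bivariate_tail_probability(10.0, 10.0, 2.0, 2.0, 0.5)
p_opposite = bivariate_tail_probability(10.0, -10.0, 2.0, 2.0, 0.5)
```

An observation would then be tagged when this joint probability falls below a chosen threshold, so that spatial and temporal consistency are judged together rather than by two separate rules.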
The method can flag potentially erroneous data in the hourly observations from 1436
stations of the China National Environmental Monitoring Center (CNEMC) within a
minute. Indeed, it has been used in CNEMC’s air quality forecasting system, and
is going to be integrated into the data management system. The hope is that
outliers in the system’s real-time air quality data will be removed in the near
future.
Wu, H. J., X. Tang, Z. F. Wang, L. Wu, M. M. Lu, L. F. Wei, and J. Zhu, 2018:
Probabilistic automatic outlier detection for surface air quality measurements
from the China National Environmental Monitoring Network. Adv. Atmos. Sci., 35(12),