自然灾害调查数据的多尺度异常检测方法研究及应用
Study and Application of the Method of Multi-scale Outliers Detection of Natural Disaster Investigation Data
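Abstract
Large-scale natural disaster surveys span regions with very different environments, use diverse data acquisition methods and involve many participants, so the aggregated results at each level contain some anomalous survey units whose plausibility has to be judged manually. Relying on manual work alone to identify anomalies in massive data, however, is unrealistic. This paper designs a multi-scale anomaly detection method for natural disaster survey data that combines outlier detection with spatial data mining algorithms to detect anomalous values and anomalous spatial distribution patterns, respectively. The method quickly extracts anomalous values and anomalous survey units at every scale from massive survey data and thereby supports manual review. It was applied to the audit of the aggregated data from the national flash flood disaster investigation and evaluation. Taking the audit of national historical flash flood disaster points and of township areas in prevention zones as examples, it quickly extracted the county-level and township-level units with anomalous flash flood disaster point density and the township units with anomalous area values. Analysis of the detection results showed that the anomalies were caused by inconsistent reporting criteria, unit errors, duplicate records and similar problems. Finally, the paper analyzes the conditions under which the method applies to large-scale natural disaster surveys and how it should be applied.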
English abstract
A natural disaster is a loss of life and property caused by the interaction between human society and the natural environment; it is the joint product of the disaster environment, disaster-causing factors and disaster-bearing bodies. To study the processes, mechanisms and impacts of natural disasters and to reduce the losses they cause, basic data and disaster events must be surveyed on a large scale, and the authenticity and consistency of these data are essential to the reliability and validity of research results. However, the large number of organizations and investigators taking part in survey and evaluation, the large regional differences and the large spatial extent make it difficult to control data quality and to validate the consistency of data from different survey units. Ensuring correctness and consistency requires manual inspection, but for massive survey data it is unrealistic to rely entirely on manual work to identify anomalies effectively. We therefore design a multi-scale anomaly detection method for natural disaster survey data. It uses a single-variable outlier detection method based on the normal distribution and the Anselin Local Moran's I spatial clustering method to detect abnormal values and abnormal spatial distribution patterns in massive survey data. The method effectively extracts abnormal values and abnormal survey units at every scale, helps identify the causes of abnormal data, and supports manual checking of survey data. Taking the flash flood disaster investigation and evaluation project in mainland China as an example, the method is used to audit historical flash flood disaster events and the areas of towns located in prevention zones. It quickly extracts the units with anomalous flash flood disaster point density and the township units with anomalous area values. Further analysis found that these anomalies were caused by inconsistent reporting methods, unit errors, duplicate records and similar problems. The method resolved inconsistencies in a massive volume of flash flood survey data and offers an effective approach to checking the quality of other large-scale disaster datasets. Although the validation approach is effective, some problems remain: the outlier check only compares survey units defined by administrative divisions, and regions are not divided according to their economic development and natural conditions. Finally, we analyze the conditions under which the method is applicable to large-scale natural disaster investigations.
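The two detection steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 3-sigma threshold, the binary row-standardized neighbor weights and the toy data are all assumptions made for illustration, and the significance testing (for example by permutation) that normally accompanies Local Moran's I is omitted.

```python
from statistics import mean, pstdev


def zscore_outliers(values, threshold=3.0):
    """Return indices of values lying more than `threshold` standard
    deviations from the mean (normal-distribution rule of thumb).
    The threshold of 3.0 is an assumed default, not taken from the paper."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing to flag
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


def local_morans_i(values, neighbors):
    """Compute a common form of Local Moran's I for each unit.

    neighbors[i] lists the indices of the units adjacent to unit i
    (binary contiguity, row-standardized; an assumed weighting scheme).
    A strongly negative value marks a unit that differs from its
    neighbors, i.e. a candidate spatial outlier.
    """
    n = len(values)
    mu = mean(values)
    dev = [v - mu for v in values]
    m2 = sum(d * d for d in dev) / n  # second moment of the deviations
    result = []
    for i in range(n):
        nbrs = neighbors[i]
        if not nbrs or m2 == 0:
            result.append(0.0)  # isolated unit or constant field
            continue
        lag = sum(dev[j] for j in nbrs) / len(nbrs)  # spatial lag of unit i
        result.append(dev[i] * lag / m2)
    return result


if __name__ == "__main__":
    # Hypothetical township areas; the last one mimics a unit error.
    areas = [95, 102, 98, 110, 105, 99, 101, 97, 103, 100,
             96, 104, 108, 92, 107, 94, 106, 93, 109, 98, 10000]
    print("value outliers:", zscore_outliers(areas))

    # Hypothetical disaster-point densities for five units in a row.
    density = [1, 1, 10, 1, 1]
    chain = [[1], [0, 2], [1, 3], [2, 4], [3]]
    print("local Moran's I:", local_morans_i(density, chain))
```

In this sketch the value check flags individual records, while the spatial check flags units whose values disagree with those of adjacent units; running both at county and township level would correspond to the multi-scale idea described above.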
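Source: 地球信息科学学报 (Journal of Geo-information Science), 2017, 19(12): 1653-1660 [CSCD Core Collection]
DOI: 10.3724/SP.J.1047.2017.01653
Keywords: disaster investigation; flash flood disaster; data quality; anomaly detection; spatial clustering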
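Addresses:
1. State Key Laboratory of Hydraulic Engineering Simulation and Safety, Tianjin University, Tianjin 300072
2. Research Center on Flood and Drought Disaster Reduction of the Ministry of Water Resources, China Institute of Water Resources and Hydropower Research, Beijing 100038
Language: Chinese
Document type: Research article
ISSN: 1560-8999
Subject: Disasters and their prevention
Funding: Special project of the China Institute of Water Resources and Hydropower Research; National Natural Science Foundation of China
Accession number: CSCD:6140923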