
Algorithm of Spatial Outlier Mining Based on MST Clustering
Original title: 基于MST聚类的空间数据离群挖掘算法


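Abstract (translated from the Chinese): A spatial outlier is a spatial object whose attribute values differ markedly from those of the other objects in its spatial neighborhood. Spatial outlier mining can yield a great deal of interesting information, but spatial data carry complex spatial characteristics such as topological, directional, and metric relations, so traditional outlier mining algorithms designed for transactional databases do not suit spatial databases. This paper proposes a spatial outlier mining algorithm (SOM) based on MST (Minimum Spanning Tree) clustering. It combines minimum spanning tree theory with a density-based approach, reflecting both the local nature of spatial outliers and their degree of isolation. The algorithm uses the MST to maintain the basic spatial structure of the data and forms MST clusters by cutting the most inconsistent edges of the MST. Like density-based clustering, it can find non-spherical clusters and handle unevenly distributed datasets, and its clustering result does not depend on user-chosen parameters, so the outlier mining results are more reasonable. Finally, the effectiveness of the algorithm is verified on real data; it is applicable to outlier mining in large-scale spatial datasets.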
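The definition at the start of the abstract (an object whose attribute value differs markedly from those of its spatial neighbors) can be illustrated with a minimal sketch. This is not the SOM algorithm of the paper: it swaps in a plain k-nearest-neighbor comparison, and the function names, the neighbor count `k`, and the cut-off `z_cut` are all assumptions made for illustration.

```python
import math
from statistics import mean, pstdev


def neighborhood_deviation(coords, values, k=4):
    """Attribute value of each object minus the mean value of its
    k nearest spatial neighbours (k is an illustrative assumption)."""
    diffs = []
    for i, p in enumerate(coords):
        # Indices of the k objects closest to object i in space.
        nearest = sorted(
            (j for j in range(len(coords)) if j != i),
            key=lambda j: math.dist(p, coords[j]),
        )[:k]
        diffs.append(values[i] - mean(values[j] for j in nearest))
    return diffs


def flag_spatial_outliers(coords, values, k=4, z_cut=2.0):
    """Indices whose neighbourhood deviation, standardised over all
    objects, exceeds z_cut in absolute value (z_cut is an assumption)."""
    diffs = neighborhood_deviation(coords, values, k)
    mu, sigma = mean(diffs), pstdev(diffs)
    if sigma == 0:
        return []  # every object matches its neighbourhood equally well
    return [i for i, d in enumerate(diffs) if abs((d - mu) / sigma) > z_cut]
```

A typical call passes a list of `(x, y)` tuples and a parallel list of attribute values. The comparison is local: an object is flagged only relative to its own neighborhood, not to the global distribution of the attribute.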
Abstract (authors' English version): A spatial outlier is a spatial object whose non-spatial attribute values deviate significantly from those of the other objects in the dataset. How to detect spatial outliers in a spatial dataset, and how to explain what causes the anomaly in practical applications, have attracted growing interest from researchers. Spatial outlier mining can yield a lot of interesting information, but because spatial data have complicated characteristics such as topological, orientation, and measurement relations, traditional outlier mining algorithms for business databases are deficient on spatial datasets. The main problem is that most existing algorithms have difficulty maintaining spatial structure characteristics during outlier mining. Because clustering and outlier mining are similar, clustering-based outlier mining is an important way to detect anomalies in a dataset. However, the diversity of clustering algorithms makes it difficult to choose a proper one for outlier mining, and since the main purpose of clustering is to find the principal features of the dataset, outliers are only by-products of clustering. Based on minimum spanning tree clustering, a new algorithm for spatial outlier mining called SOM is proposed. The algorithm keeps the basic spatial structure characteristics of spatial objects through two geometric structures, the Delaunay triangulated irregular network and the minimum spanning tree (MST), and it obtains MST clusters by cutting off several of the most inconsistent edges of the MST. As a result, it can find clusters in non-spherical and unbalanced datasets, as density-based clustering algorithms do, and it does not depend on user-preset parameters, so the clustering result is usually more reasonable.
Finally, the SOM algorithm is validated on a real geochemical soil-element dataset surveyed in the coastal areas of Fujian Province; the analysis shows that the algorithm is also applicable to spatial outlier mining in massive spatial datasets.
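The clustering step described in the abstract (build an MST, cut its most inconsistent edges, read the clusters off the remaining pieces) can be sketched as follows. The abstract does not state how the paper measures edge inconsistency, so this sketch assumes a simple stand-in: an edge is cut when its length exceeds the mean MST edge length plus `k` standard deviations. It also builds the MST from the complete Euclidean graph with Prim's algorithm rather than from a Delaunay triangulation, and it treats very small clusters as outlier candidates. All names and the parameters `k` and `min_size` are hypothetical.

```python
import math
from statistics import mean, pstdev


def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph.
    Returns a list of (i, j, length) edges."""
    n = len(points)
    in_tree = [False] * n
    best_dist = [math.inf] * n  # shortest known link from the tree
    best_from = [-1] * n        # tree vertex that link comes from
    if n:
        best_dist[0] = 0.0
    edges = []
    for _ in range(n):
        # Add the vertex closest to the current tree.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_dist[i])
        in_tree[u] = True
        if best_from[u] >= 0:
            edges.append((best_from[u], u, best_dist[u]))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_dist[v]:
                    best_dist[v], best_from[v] = d, u
    return edges


def mst_clusters(points, k=1.0):
    """Cut MST edges longer than mean + k * std (assumed criterion) and
    return the connected components, largest first."""
    edges = minimum_spanning_tree(points)
    lengths = [w for _, _, w in edges]
    threshold = mean(lengths) + k * pstdev(lengths) if lengths else 0.0
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i, j, w in edges:
        if w <= threshold:  # keep consistent edges, drop the rest
            parent[find(i)] = find(j)
    groups = {}
    for i in range(len(points)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values(), key=len, reverse=True)


def outlier_candidates(clusters, min_size=2):
    """Indices of objects in clusters smaller than min_size (assumption)."""
    return sorted(i for c in clusters if len(c) < min_size for i in c)
```

Building the MST from the complete graph costs quadratic time, which is fine for a sketch but not for the massive datasets the abstract mentions; restricting candidate edges to a Delaunay triangulation, as the abstract describes, is the usual way to reduce that cost.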
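Source: 地球信息科学 (Geo-Information Science), 2008, 10(5): 586-592 [CSCD extended database]
Keywords: SOM algorithm; clustering-based outlier; spatial outlier; MST clustering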
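Affiliation: Spatial Information Research Center of Fujian Province, Fuzhou University; Key Laboratory of Spatial Data Mining and Information Sharing, Ministry of Education; Fuzhou, Fujian 350002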

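Language: Chinese
Document type: Research article
ISSN: 1560-8999
Subject: Surveying and mapping; automation technology, computer technology
Funding: National Natural Science Foundation of China; Natural Science Foundation of Fujian Province; Key Project of the Fujian Provincial Science and Technology Plan (2005H086); Program for New Century Excellent Talents in Universities of Fujian Province
Accession number: CSCD:3402656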

