Continuous Outlier Monitoring on Uncertain Data Streams

Abstract: Outlier detection on data streams is an important task in data mining, and it becomes more challenging when the data are uncertain. This paper studies the problem of outlier detection on uncertain data streams. We propose Continuous Uncertain Outlier Detection (CUOD), which uses pruning to quickly determine the nature of uncertain elements, improving efficiency. Furthermore, we propose a pruning approach, Probability Pruning for Continuous Uncertain Outlier Detection (PCUOD), to reduce the detection cost. PCUOD estimates outlier probabilities, which effectively reduces the amount of computation, and the cost of its incremental algorithm meets the demands of uncertain data streams. Finally, we propose a new method for answering parameter variable queries with CUOD, enabling the concurrent execution of different queries. To the best of our knowledge, this is the first work on outlier detection over uncertain data streams that can handle parameter variable queries simultaneously. Our methods are verified on both real and synthetic data. The results show that they reduce the required storage and running time.
Source: Journal of Computer Science and Technology, 2014, 29(3): 436-448 [Core Collection]
DOI: 10.1007/s11390-014-1441-x
Keywords: outlier detection; uncertain data stream; data mining; parameter variable query
Affiliations:

1. College of Information Science and Engineering, Northeastern University, Shenyang, 110819  

2. College of Information Science and Engineering, Northeastern University, Key Laboratory of Medical Image Computing, Shenyang, 110819  

3. College of Computer, Shenyang Aerospace University, Shenyang, 110819  

4. Department of Command Information System Engineering, Logistic Engineering University of People's Liberation Army, Chongqing, 400311

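Language: English
Document type: Research article
ISSN: 1000-9000
Subject: Automation technology, computer technology
Funding: National Natural Science Foundation of China; National 863 Program; National 973 Program
Accession number: CSCD:5130627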

References (26 in total, listed over 2 pages)

1. Niennattrakul V. Data editing techniques to allow the application of distance-based outlier detection to streams. Proc. the 10th International Conference on Data Mining, 2010: 947-952. Cited by 1
2. Jin C Q. Continuous ranking on uncertain streams. Frontiers of Computer Science, 2012, 6(6): 686-699. Cited by 1
3. Zhang C. Tracking high quality clusters over uncertain data streams. Proc. the 25th Int. Conf. Data Engineering, 2009: 1641-1648. Cited by 1
4. Aggarwal C C. On density based transforms for uncertain data mining. Proc. the 23rd International Conference on Data Engineering, 2007: 866-875. Cited by 1
5. Barbará D. The management of probabilistic data. IEEE Transactions on Knowledge and Data Engineering, 1992, 4(5): 487-502. Cited by 13
6. Burdick D. OLAP over uncertain and imprecise data. Proc. the 31st Int. Conf. Very Large Data Bases, 2005: 970-981. Cited by 1
7. Cheng R. Evaluating probabilistic queries over imprecise data. Proc. International Conference on Management of Data, 2003: 551-562. Cited by 1
8. Sarma A D. Working models for uncertain data. Proc. the 22nd International Conference on Data Engineering, 2006: 7. Cited by 1
9. Singh S. Indexing uncertain categorical data. Proc. the 23rd Int. Conf. Data Engineering, 2007: 616-625. Cited by 1
10. Tao Y. Indexing multi-dimensional uncertain data with arbitrary probability density functions. Proc. the 31st Int. Conf. Very Large Data Bases, 2005: 922-933. Cited by 1
11. Chen M. An Efficient Method for Cleaning Dirty-Events over Uncertain Data in WSNs. J. Computer Science and Technology, 2011, 26(6): 942-953. Cited by 1
12. Yang D. Neighbor-based pattern detection for windows over streaming data. Proc. the 12th International Conference on Extending Database Technology, 2009: 529-540. Cited by 1
13. Aggarwal C C. A framework for clustering evolving data streams. Proc. the 29th Int. Conf. Very Large Data Bases, 2003: 81-92. Cited by 1
14. Babcock B. Models and issues in data stream systems. Proc. the 21st ACM SIGMOD-SIGART-SIGACT Symposium on Principles of Database Systems, 2002: 1-16. Cited by 1
15. Knorr E M. Algorithms for mining distance-based outliers in large datasets. Proc. the 24th International Conference on Very Large Data Bases, 1998: 392-403. Cited by 3
16. Angiulli F. Detecting distance-based outliers in streams of data. Proc. the 16th International Conference on Information and Knowledge Management, 2007: 811-820. Cited by 1
17. Kontaki M. Continuous monitoring of distance-based outliers over data streams. Proc. the 27th International Conference on Data Engineering, 2011: 135-146. Cited by 1
18. Assent I. AnyOut: Anytime outlier detection on streaming data. Proc. the 17th International Conference on Database Systems for Advanced Applications, Vol.1, 2012: 228-242. Cited by 1
19. Aggarwal C C. Outlier detection with uncertain data. Proc. SIAM Int. Conf. Data Mining, 2008: 483-493. Cited by 1
20. Wang B. Distance-based outlier detection on uncertain data. Proc. the 9th Int. Conf. Comp. and Information Technology, 2009: 293-298. Cited by 1
Citing articles (5 in total)

1. Han Donghong. Classifying Uncertain and Evolving Data Streams with Distributed Extreme Learning Machine. Journal of Computer Science and Technology, 2015, 30(4): 874-887. Cited by 1

2. Wang Xite. An Efficient Algorithm for Distributed Outlier Detection in Large Multi-Dimensional Datasets. Journal of Computer Science and Technology, 2015, 30(6): 1233-1248. Cited by 0
