Continuous Outlier Monitoring on Uncertain Data Streams
Abstract
Outlier detection on data streams is an important task in data mining, and it becomes more challenging when the data are uncertain. This paper studies the problem of outlier detection on uncertain data streams. We propose Continuous Uncertain Outlier Detection (CUOD), which uses pruning to quickly determine whether an uncertain element is an outlier and thereby improves efficiency. Furthermore, we propose a pruning approach, Probability Pruning for Continuous Uncertain Outlier Detection (PCUOD), to reduce the detection cost. PCUOD estimates outlier probabilities, which effectively reduces the amount of computation, and the cost of its incremental algorithm meets the demands of uncertain data streams. Finally, we propose a new method for handling parameter variable queries in CUOD, which enables different queries to be executed concurrently. To the best of our knowledge, this is the first work on outlier detection over uncertain data streams that can handle parameter variable queries simultaneously. Our methods are verified on both real and synthetic data. The results show that they reduce the required storage and running time.
Source: Journal of Computer Science and Technology, 2014, 29(3): 436-448 [CSCD Core Collection]
DOI: 10.1007/s11390-014-1441-x
Keywords: outlier detection; uncertain data stream; data mining; parameter variable query
Addresses:
1. College of Information Science and Engineering, Northeastern University, Shenyang, 110819
2. College of Information Science and Engineering, Northeastern University, Key Laboratory of Medical Image Computing, Shenyang, 110819
3. College of Computer, Shenyang Aerospace University, Shenyang, 110819
4. Department of Command Information System Engineering, Logistic Engineering University of People's Liberation Army, Chongqing, 400311
Language: English
Document type: Research article
ISSN: 1000-9000
Subject: Automation technology, computer technology
Funding: National Natural Science Foundation of China; National 863 Program; National 973 Program
Accession number: CSCD:5130627