UM E-Theses Collection (澳門大學電子學位論文庫)
- Title
-
Improving the outlier detection algorithm from multivariate data stream
- English Abstract
-
IMPROVING THE OUTLIER DETECTION ALGORITHM FROM MULTIVARIATE DATA STREAM, by HAN DONG. Thesis Supervisor: Dr. Simon Fong, Department of Computer and Information Science. Master of Science in E-Commerce Technology.
Outlier detection is a preprocessing technique that is effective in reducing irrelevant instances in machine learning. To date, many outlier detection algorithms have been proposed for the purpose of forming datasets with fewer outliers. In this study, we propose an outlier detection method named lightweight analysis. Most of the time, the outlier indicator is computed from the full dataset. This motivated us to consider using a fixed number of instances as the reference for calculating the outlier indicator, as in cumulative analysis or lightweight analysis with a sliding window, rather than global analysis only. We then combine these three mechanisms with existing outlier detection algorithms, namely Mahalanobis distance, local outlier factor, and interquartile range. The experiments yield encouraging results: classification accuracy on the reduced dataset is equal or better when the proposed classifier-based outlier detection (COD) method is used.
Key words: Outlier detection, COD, data mining
- Issue date
-
2015.
- Author
-
Han, Dong
- Faculty
- Faculty of Science and Technology
- Department
- Department of Computer and Information Science
- Degree
-
M.Sc.
- Subject
-
Outliers (Statistics)
Data mining
- Supervisor
-
Fong, Chi Chiu
- Files In This Item
- Location
- 1/F Zone C
- Library URL
- 991000732809706306