UM E-Theses Collection (澳門大學電子學位論文庫)


Time correlated optimization and validation method in data mining

English Abstract

TIME CORRELATED OPTIMIZATION AND VALIDATION METHOD IN DATA MINING
by FANG QIAN
Thesis Supervisor: Dr. Simon Fong

Big Data is being touted as the next big thing, raising technical challenges that confront both academic research communities and commercial IT deployment. The root sources of Big Data are infinite data streams and the curse of dimensionality. Data sourced from data streams accumulate continuously, which makes traditional batch-based model induction algorithms infeasible for real-time data mining. Many methods have been proposed for incremental data mining by modifying classical machine learning algorithms, such as artificial neural networks. In this thesis we propose an incremental learning process for supervised learning over data streams, using a neural network with parameter optimization. The process is coupled with a parameter optimization module that searches for the best combination of input parameter values based on a given segment of the data stream. The drawback of the optimization is its heavy consumption of time. To relieve this limitation, a loss function is proposed to look ahead for the occurrence of concept drift, which is one of the main causes of performance deterioration in data mining models. Optimization is skipped intermittently along the way to save computation costs. Computer simulation is conducted to confirm the merits of this incremental optimization process for neural networks. Inspired by the implementation of this incremental learning process, the author also proposes a time-correlated k-fold cross-validation method that preserves the order of instances. This method is compared with the traditional k-fold cross-validation method using three classification algorithms and five time-correlated datasets.
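The abstract does not spell out how the time-correlated k-fold method builds its folds. As a minimal sketch of one reading of "keeping the order of instances", the Python below splits the sample indices into contiguous, unshuffled blocks and uses each block in turn as the test fold. The function name `ordered_k_folds` and its parameters are hypothetical and chosen for illustration; this is not the thesis's implementation.

```python
from typing import Iterator, List, Tuple


def ordered_k_folds(n_samples: int, k: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs over contiguous, unshuffled folds.

    Assumption: "order-preserving" is taken to mean that instances are never
    shuffled, so every fold is a contiguous slice of the original sequence.
    """
    if k < 2 or k > n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    # Spread the remainder over the first `extra` folds so sizes differ by at most 1.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        stop = start + size
        test = list(range(start, stop))
        # Training indices stay in their original temporal order.
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


if __name__ == "__main__":
    for train_idx, test_idx in ordered_k_folds(10, 3):
        print("test:", test_idx, "train:", train_idx)
```

A conventional k-fold split typically shuffles instances before partitioning, which mixes earlier and later observations within a fold; the sketch differs only in skipping that shuffle.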

Issue date



Fang, Qian


Faculty of Science and Technology


Department of Computer and Information Science




Data mining


Fong, Chi Chiu
