A deep fast learning framework towards exploring Imbalanced data and Multi-class Drift in Evolving Data Streams
Abstract
Data stream classification poses major challenges for the text-based data mining community when handling evolving data streams. Identifying feature evolution and class imbalance is an important research area for data stream classification with traditional machine learning classifiers. Class evolution is the phenomenon of class emergence and disappearance; because of it, the performance of a learning model degrades drastically over time. The class evolution problem has been handled through analysis of feature drift and multi-class drift. Multi-class drift, which occurs according to probability and time, is categorized as sudden, gradual, or recurring drift. This paper proposes a new framework, named the “Deep Fast Learning Framework”, to capture multi-class drift. First, features are extracted using an ensemble of techniques: Incremental Kernel Principal Component Analysis, Incremental Linear Discriminant Analysis, and Incremental Linear Principal Component Analysis. These techniques are treated as an online feature extraction process. The extracted features are processed by the deep fast learning classifier framework, which is composed of hybrid ensemble classifiers that run chunk-based and online ensemble classifiers in parallel, based on gradual class evolution over blocks of data in the stream represented as features. The base learner is a deep neural network that generates the fast learning model through deep analysis of the extracted features and their relationship with existing classes, and it is continuously updated by replacing the older model with a newly trained model. In addition, the base learner removes emerging classes that are least utilized and detects recurring classes on the basis of the extracted features.
This model is effective in determining the novel and recurring classes for features that are subject to multi-class drift. Finally, the class imbalance problem is handled by applying an undersampling method to the base learning model. Experimental results on a benchmark dataset show that the proposed framework outperforms state-of-the-art approaches on precision, recall, and F-measure.
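The abstract does not say which undersampling variant is used. As a minimal sketch only, assuming plain random undersampling applied to each chunk before the base learner is trained, the idea could look like the following. The function name `undersample_chunk`, the `(features, label)` pair layout, and the `seed` parameter are hypothetical and not taken from the paper.

```python
import random
from collections import defaultdict


def undersample_chunk(chunk, seed=0):
    """Randomly undersample one chunk of (features, label) pairs.

    Assumption: every class present in the chunk is reduced to the size
    of the rarest class. The paper may use a different scheme.
    """
    rng = random.Random(seed)

    # Group the examples of the chunk by class label.
    by_label = defaultdict(list)
    for features, label in chunk:
        by_label[label].append((features, label))
    if not by_label:
        return []

    # Keep as many examples per class as the rarest class has.
    target = min(len(items) for items in by_label.values())
    balanced = []
    for items in by_label.values():
        balanced.extend(rng.sample(items, target))

    # Shuffle so the learner does not see the classes in blocks.
    rng.shuffle(balanced)
    return balanced


# Hypothetical chunk: eight examples of class "a", two of class "b".
chunk = [([0.1 * i], "a") for i in range(8)] + [([1.0], "b"), ([1.1], "b")]
balanced = undersample_chunk(chunk)
```

Balancing per chunk rather than over the whole stream keeps the step compatible with chunk-based processing, at the cost of discarding majority-class examples in every chunk.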
You are free to:
- Share: copy and redistribute the material in any medium or format for any purpose, even commercially.
- Adapt: remix, transform, and build upon the material for any purpose, even commercially.
- The licensor cannot revoke these freedoms as long as you follow the license terms.
Under the following terms:
- Attribution: You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- No additional restrictions: You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.
Notices:
You do not have to comply with the license for elements of the material in the public domain or where your use is permitted by an applicable exception or limitation.
No warranties are given. The license may not give you all of the permissions necessary for your intended use. For example, other rights such as publicity, privacy, or moral rights may limit how you use the material.