1. Incremental Learning for Classification of Unstructured Data Using Extreme Learning Machine
- Author
-
Sathya Madhusudhanan, Jayashree L S, and Suresh Jaganathan
- Subjects
Computer science; machine learning; artificial neural network; extreme learning machine; incremental learning; classification; unstructured data; streaming data; metadata; MNIST database
- Abstract
Unstructured data are irregular information with no predefined data model. Streaming data, which arrives continuously over time, is unstructured, and classifying it is a tedious task because it lacks class labels and accumulates over time. As the data keep growing, training a model from scratch each time becomes impractical. Incremental learning, a self-adaptive approach, reuses the previously learned model, learns and accommodates new information from newly arrived data, and produces an updated model, thereby avoiding retraining. The incrementally learned knowledge then helps classify the unstructured data. In this paper, we propose CUIL (Classification of Unstructured data using Incremental Learning), a framework that clusters the metadata, assigns a label to each cluster, and then incrementally builds a model for each arriving batch of data using the Extreme Learning Machine (ELM), a feed-forward neural network. The proposed framework trains the batches separately, significantly reducing memory usage and training time, and is evaluated on metadata created for standard image datasets such as MNIST, STL-10, CIFAR-10, Caltech101, and Caltech256. The tabulated results show that the proposed work achieves greater accuracy and efficiency.
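For context, the Extreme Learning Machine named in the abstract is a single-hidden-layer feed-forward network whose input weights are random and fixed, so only the output weights need to be solved, in closed form via a pseudoinverse. The following is a minimal illustrative sketch of a basic ELM classifier, not the authors' CUIL implementation; the toy data, hidden-layer size, and function names are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def elm_train(X, Y, n_hidden=64):
    """Train an ELM: random fixed hidden layer, closed-form output weights."""
    W = rng.normal(size=(X.shape[1], n_hidden))  # random input weights (never updated)
    b = rng.normal(size=n_hidden)                # random hidden biases
    H = np.tanh(X @ W + b)                       # hidden-layer activation matrix
    beta = np.linalg.pinv(H) @ Y                 # output weights via Moore-Penrose pseudoinverse
    return W, b, beta

def elm_predict(X, W, b, beta):
    """Forward pass: hidden activations times learned output weights."""
    return np.tanh(X @ W + b) @ beta

# Toy two-class problem with one-hot targets (hypothetical data)
X = rng.normal(size=(200, 5))
labels = (X[:, 0] + X[:, 1] > 0).astype(int)
Y = np.eye(2)[labels]

W, b, beta = elm_train(X, Y)
pred = elm_predict(X, W, b, beta).argmax(axis=1)
accuracy = (pred == labels).mean()
```

Because training reduces to one linear solve per batch, new batches can be learned without revisiting earlier data, which is the property the incremental setting in the paper relies on.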
- Published
- 2018