1. Subclass-based semi-random data partitioning for improving sample representativeness.
- Author
- Liu, Han; Chen, Shyi-Ming; Cocea, Mihaela
- Subjects
- *RANDOM data (Statistics); *PARALLEL algorithms; *MACHINE learning; *STATISTICAL sampling; *PREDICTION models
- Abstract
In machine learning tasks, a data set must be partitioned into a training set and a test set in a specific ratio. The training set is used to learn a model for making predictions on new instances, whereas the test set is used to evaluate the model's prediction accuracy on new instances. In the context of human learning, a training set can be viewed as learning material that covers the required knowledge, whereas a test set can be viewed as an exam paper that provides questions for students to answer. In practice, data partitioning has typically been done by randomly selecting 70% of the instances for training and the rest for testing. In this paper, we argue that random data partitioning is likely to cause a sample representativeness issue, i.e., the training and test instances show very dissimilar characteristics, a case similar to testing students on material that was not taught. To address this issue, we propose a subclass-based semi-random data partitioning approach. The experimental results show that the proposed approach leads to significant advances in learning performance due to the improved sample representativeness.
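To illustrate the idea behind semi-random partitioning, the sketch below shows a simple class-stratified variant: instances are grouped by class, shuffled within each group, and split per group in the target ratio, so the class distributions of the training and test sets stay close to that of the full data set. This is a minimal illustration of the general principle, not the authors' exact subclass-based algorithm; the function name `semi_random_split` and its parameters are assumptions for this sketch.

```python
import random
from collections import defaultdict

def semi_random_split(instances, labels, train_ratio=0.7, seed=42):
    """Class-stratified semi-random split (illustrative sketch).

    Within each class, instances are shuffled randomly, then the first
    train_ratio share goes to training and the rest to testing. This
    keeps per-class proportions similar across the two sets, improving
    sample representativeness over a fully random 70/30 split.
    """
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    train_idx, test_idx = [], []
    for idxs in by_class.values():
        rng.shuffle(idxs)                     # random element: order within class
        cut = round(len(idxs) * train_ratio)  # deterministic element: split point
        train_idx.extend(idxs[:cut])
        test_idx.extend(idxs[cut:])

    return ([instances[i] for i in train_idx], [labels[i] for i in train_idx],
            [instances[i] for i in test_idx], [labels[i] for i in test_idx])
```

With a balanced two-class data set of 20 instances and a 0.7 ratio, each class contributes 7 instances to training and 3 to testing, whereas a fully random split could easily draw, say, 9 instances of one class and 5 of the other into training.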
- Published
- 2019