3,772 results for "PEDRYCZ, WITOLD"
Search Results
2. Machine learning in human creativity: status and perspectives
- Author
-
Farina, Mirko, Lavazza, Andrea, Sartori, Giuseppe, and Pedrycz, Witold
- Published
- 2024
- Full Text
- View/download PDF
3. Knowledge-Driven Possibilistic Clustering with Automatic Cluster Elimination.
- Author
-
Hu, Xianghui, Tang, Yiming, Pedrycz, Witold, Jiang, Jiuchuan, and Jiang, Yichuan
- Subjects
DATA structures, FUZZY logic, MACHINE learning, PROBLEM solving, ALGORITHMS - Abstract
Traditional Fuzzy C-Means (FCM) and Possibilistic C-Means (PCM) clustering algorithms are data-driven, and their objective function minimization process is based on the available numeric data. Recently, knowledge hints have been introduced to form knowledge-driven clustering algorithms, which reveal a data structure that considers not only the relationships between data but also the compatibility with knowledge hints. However, these algorithms cannot determine the optimal number of clusters on their own; they require the assistance of evaluation indices. Moreover, knowledge hints are usually used as part of the data structure (directly replacing some clustering centers), which severely limits the flexibility of the algorithm and can lead to knowledge misguidance. To solve this problem, this study designs a new knowledge-driven clustering algorithm called PCM clustering with High-density Points (HP-PCM), in which domain knowledge is represented in the form of so-called high-density points. First, a new data density calculation function is proposed, and the Density Knowledge Points Extraction (DKPE) method is established to filter out high-density points from the dataset as knowledge hints. These hints are then incorporated into the PCM objective function so that the clustering algorithm is guided by high-density points to discover the natural data structure. Finally, the initial number of clusters is set greater than the true one based on the number of knowledge hints, and the HP-PCM algorithm automatically determines the final number of clusters during the clustering process through a cluster elimination mechanism. Experimental studies, including comparative analyses, highlight the effectiveness of the proposed algorithm: an increased clustering success rate, the ability to determine the optimal cluster number, and faster convergence. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
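The HP-PCM abstract above builds on the standard data-driven FCM/PCM iteration and on extracting "high-density points" as knowledge hints. The paper's exact objective and DKPE density function are not given here, so the sketch below shows only the generic ingredients: a plain FCM update loop and a hypothetical neighbor-count density score for picking hint candidates.

```python
import numpy as np

def fcm(X, c, m=2.0, n_iter=100, seed=0):
    """Standard data-driven Fuzzy C-Means: alternate membership/center updates."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, len(X)))
    U /= U.sum(axis=0)                       # memberships sum to 1 per sample
    p = 2.0 / (m - 1.0)
    for _ in range(n_iter):
        um = U ** m
        V = um @ X / um.sum(axis=1, keepdims=True)            # cluster centers
        d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2) + 1e-12
        U = (d ** -p) / (d ** -p).sum(axis=0)                 # membership update
    return V, U

def high_density_points(X, radius, k):
    """Hypothetical density hint extraction: score each sample by the number
    of neighbors within a radius and keep the top-k as knowledge hints."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    density = (d < radius).sum(axis=1)
    return X[np.argsort(density)[-k:]]
```

In the paper, such hints are folded into the PCM objective and the cluster count shrinks via an elimination mechanism; this sketch stops at the data-driven baseline.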
4. Cascade AdaBoost Neural Network Classifier: Analysis and Design.
- Author
-
Gao, Mingjie, Huang, Wei, Wan, Shaohua, Oh, Sung-Kwun, and Pedrycz, Witold
- Subjects
PARTICLE swarm optimization, ATRIAL fibrillation, MACHINE learning - Abstract
In this paper, we propose a cascade AdaBoost neural network (CANN) based on the concepts and constructs of AdaBoost neurons and a cascade structure. Compared with AdaBoost, CANN can represent complex relationships between features. In CANN, representation learning is performed through AdaBoost, and random feature selection is used to encourage the diversity of AdaBoost neurons. Through the cascade structure, CANN gains the context structure needed for complex feature representation. At the same time, to avoid the problem of feature disappearance, shortcut connections are used to add earlier information to later nodes. Furthermore, the particle swarm optimization (PSO) algorithm is used to optimize the structure of CANN by determining the number of iterations that yields better performance. Two types of CANN are proposed: binary-classification CANN (BCANN) and multi-classification CANN (MCANN). The performance of CANN is evaluated with two kinds of data sets: machine learning data sets and an atrial fibrillation data set. A comparative analysis illustrates that the proposed CANN leads to better performance than the models reported in the literature. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
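The cascade-of-boosters idea in the CANN abstract can be sketched generically: layers of AdaBoost "neurons" trained on random feature subsets, whose class-probability outputs are concatenated with the raw features (a shortcut connection) and passed to the next layer. This is a minimal sketch, not the paper's architecture; layer/booster counts and the final-classifier choice are illustrative assumptions.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier

class CascadeAdaBoost:
    """Sketch of a cascade of AdaBoost 'neurons' with shortcut connections."""
    def __init__(self, n_layers=2, n_neurons=3, n_sub=None, seed=0):
        self.n_layers, self.n_neurons, self.n_sub = n_layers, n_neurons, n_sub
        self.rng = np.random.default_rng(seed)
        self.layers = []

    def fit(self, X, y):
        Z = X
        for _ in range(self.n_layers):
            layer = []
            for _ in range(self.n_neurons):
                # random feature subset encourages diversity among neurons
                k = self.n_sub or max(1, Z.shape[1] // 2)
                idx = self.rng.choice(Z.shape[1], size=k, replace=False)
                clf = AdaBoostClassifier(n_estimators=20).fit(Z[:, idx], y)
                layer.append((idx, clf))
            self.layers.append(layer)
            Z = self._augment(X, Z, layer)
        self.final = AdaBoostClassifier(n_estimators=20).fit(Z, y)
        return self

    def _augment(self, X, Z, layer):
        # shortcut connection: raw features + this layer's probability outputs
        probs = [clf.predict_proba(Z[:, idx]) for idx, clf in layer]
        return np.hstack([X] + probs)

    def predict(self, X):
        Z = X
        for layer in self.layers:
            Z = self._augment(X, Z, layer)
        return self.final.predict(Z)
```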
5. MBSSA-Bi-AESN: Classification prediction of bi-directional adaptive echo state network based on modified binary salp swarm algorithm and feature selection.
- Author
-
Wu, Xunjin, Zhan, Jianming, Li, Tianrui, Ding, Weiping, and Pedrycz, Witold
- Subjects
FEATURE selection, SUBSET selection, DEMAND forecasting, ALGORITHMS, MACHINE learning, TIME series analysis, CLASSIFICATION - Abstract
In the era of big data, the demand for multivariate time series prediction has surged, drawing increased attention to feature selection and neural networks in machine learning. However, certain feature selection methods neglect the alignment between actual data sample differences and clustering results, while neural networks lack automatic parameter adjustment in response to changing target features. This paper presents the MBSSA-Bi-AESN model, a Bi-directional Adaptive Echo State Network that utilizes the modified binary salp swarm algorithm (MBSSA) and feature selection to address the limitations of manually set parameters. Initial feature subset selection involves assigning weights based on the consistency of clustering results with sample differences. Subsequently, the four critical parameters in the Bi-AESN model are optimized using MBSSA. The optimized Bi-AESN model and selected feature subset are then integrated for simultaneous model learning and optimal feature subset selection. Experimental analysis on eight datasets demonstrates the superior prediction accuracy of the MBSSA-Bi-AESN model compared to benchmark models, underscoring its feasibility, validity, and universality. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
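A binary salp swarm optimizer for feature selection, as named in the entry above, can be sketched in a few lines: real-valued salp positions are squashed through a sigmoid transfer function into 0/1 feature masks, with the leader tracking the best mask ("food") found so far. The specific modifications of the paper's MBSSA are not reproduced; the coefficient schedule and vectorized follower update below are generic textbook SSA choices.

```python
import numpy as np

def binary_ssa(fitness, n_feat, n_salps=20, n_iter=50, seed=0):
    """Minimal binary salp swarm sketch for feature-subset search.
    `fitness` maps a 0/1 mask to a score (higher is better)."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-1, 1, (n_salps, n_feat))
    masks = (1 / (1 + np.exp(-pos)) > 0.5).astype(int)
    best_mask, best_fit = None, -np.inf
    for t in range(n_iter):
        fits = np.array([fitness(m) for m in masks])
        if fits.max() > best_fit:                     # update "food"
            best_fit = fits.max()
            best_mask = masks[fits.argmax()].copy()
            food = pos[fits.argmax()].copy()
        c1 = 2 * np.exp(-(4 * t / n_iter) ** 2)       # exploration -> exploitation
        half = n_salps // 2
        r = rng.random((half, n_feat))
        sign = np.where(rng.random((half, n_feat)) < 0.5, 1, -1)
        pos[:half] = food + sign * c1 * r             # leaders move around food
        pos[half:] = (pos[half:] + pos[half - 1:-1]) / 2   # followers chain behind
        masks = (1 / (1 + np.exp(-pos)) > rng.random(pos.shape)).astype(int)
    return best_mask, best_fit
```

In a wrapper setting, `fitness` would be cross-validated model accuracy on the selected columns; here any mask-scoring function works.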
6. Oversampling method based on GAN for tabular binary classification problems.
- Author
-
Yang, Jie, Jiang, Zhenhao, Pan, Tingting, Chen, Yueqi, and Pedrycz, Witold
- Subjects
MACHINE learning, GENERATIVE adversarial networks, DATA distribution - Abstract
Data-imbalanced problems are present in many applications. A big gap in the number of samples in different classes induces classifiers to skew to the majority class, which diminishes learning performance and the quality of the obtained results. Most data-level imbalanced learning approaches generate new samples using only the information associated with the minority samples, through linear generation or data-distribution fitting. Different from these algorithms, we propose a novel oversampling method based on generative adversarial networks (GANs), named OS-GAN. In this method, the GAN is assigned to learn the distribution characteristics of the minority class from some selected majority samples rather than from random noise. As a result, samples released by the trained generator carry information of both majority and minority classes. Furthermore, a central regularization term keeps the distribution of the synthetic samples from being restricted to the domain of the minority class, which can improve the generalization of learning models and algorithms. Experimental results reported on 14 datasets and one high-dimensional dataset show that OS-GAN outperforms 14 commonly used resampling techniques in terms of G-mean, accuracy and F1-score. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
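The OS-GAN abstract contrasts itself with methods that "linearly generate" new minority samples. To make that contrast concrete, here is a sketch of the SMOTE-style linear-interpolation baseline (a full GAN is out of scope for a listing like this): each synthetic sample lies on the segment between a minority point and one of its nearest minority neighbors, which is exactly the purely minority-driven scheme OS-GAN moves away from.

```python
import numpy as np

def smote_like(X_min, n_new, k=5, seed=0):
    """SMOTE-style oversampling: interpolate between a minority sample and
    one of its k nearest minority neighbors."""
    rng = np.random.default_rng(seed)
    d = np.linalg.norm(X_min[:, None] - X_min[None, :], axis=2)
    np.fill_diagonal(d, np.inf)
    nn = np.argsort(d, axis=1)[:, :k]            # k nearest neighbors per point
    out = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        j = nn[i, rng.integers(k)]
        lam = rng.random()                        # position along the segment
        out.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(out)
```

Because every synthetic point is a convex combination of two minority points, the generated data never leave the minority region, which is the limitation the abstract's central regularization is meant to relax.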
7. Network traffic classification for data fusion: A survey
- Author
-
Zhao, Jingjing, Jing, Xuyang, Yan, Zheng, Pedrycz, Witold, Xidian University, Network Security and Trust, Department of Communications and Networking, Aalto-yliopisto, and Aalto University
- Subjects
Traffic classification, Machine learning, Data fusion, Security management - Abstract
Traffic classification groups similar or related traffic data, and is one mainstream data fusion technique in the field of network management and security. With the rapid growth of network users and the emergence of new networking services, network traffic classification has attracted increasing attention. Many new traffic classification techniques have been developed and widely applied. However, the existing literature lacks a thorough survey that summarizes, compares and analyzes the recent advances of network traffic classification in order to deliver a holistic perspective. This paper carefully reviews existing network traffic classification methods from a new and comprehensive perspective by classifying them into five categories based on representative classification features, i.e., statistics-based classification, correlation-based classification, behavior-based classification, payload-based classification, and port-based classification. A series of criteria are proposed for evaluating the performance of existing traffic classification methods. For each category, we analyze and discuss the details, advantages and disadvantages of its existing methods, and also present the traffic features commonly used. Summaries of the investigation are offered to provide a holistic and specialized view of the state of the art. For convenience, the review also covers the most commonly used datasets and the traffic features adopted for traffic classification. Finally, we identify a list of open issues and future directions in this research field.
- Published
- 2021
8. Granular transfer learning.
- Author
-
Al-Hmouz, Rami, Pedrycz, Witold, Awadallah, Medhat, and Ammari, Ahmed
- Subjects
MACHINE learning, GRANULAR computing, INTERVAL analysis, FUZZY sets, KNOWLEDGE transfer - Abstract
Transfer learning is aimed at supporting the design of machine learning models in the target domain D_t, given that the knowledge (model) has already been constructed in the source domain D_s. The domains D_t and D_s (as well as the corresponding tasks T_s and T_t) are similar, yet not identical. As a result, the model transferred from D_s to D_t in this new environment exhibits its relevance (credibility) only to some limited extent. In this study, we develop an original approach, where we advocate that the knowledge transfer (model transfer) gives rise to a granular model in which the level of information granularity associated with the produced results quantifies the relevance (quality or credibility) of the transferred model. In other words, we stress that the quality of knowledge transferred to D_t becomes captured through a granular generalization of the original numeric model. The overall systematic design process is elaborated on by focusing on the development of granular neural networks carried out on the basis of the numeric neural networks coming from D_s. The key aspect of the design is to elevate the existing numeric neural network to its granular counterpart by admitting that the connections of the developed model come in the form of information granules, in particular intervals and fuzzy sets. The optimization process is guided by adjusting (optimizing) the level of information granularity, regarded as an essential design asset. The optimized performance index builds upon the descriptors of information granules commonly encountered in Granular Computing. In particular, coverage and specificity measures are treated as sound performance indicators of the quality of knowledge transfer (viz. the performance of the granular neural network expressed in the target domain). Several illustrative examples are provided to visualize the performance of the established design environment. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
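The coverage and specificity descriptors named in the granular transfer learning abstract have simple interval forms, which can be sketched directly: coverage is the fraction of targets falling inside the interval outputs, and specificity rewards narrow intervals. The normalization by the target range and the +/- eps granulation below are common illustrative choices, not necessarily the paper's exact definitions.

```python
import numpy as np

def coverage(y, lo, hi):
    """Fraction of targets covered by the interval outputs [lo, hi]."""
    return np.mean((lo <= y) & (y <= hi))

def specificity(lo, hi, y_range):
    """1 minus the average normalized interval width: narrower = more specific."""
    return np.mean(1.0 - (hi - lo) / y_range)

# Granulating a numeric prediction: widening it by +/- eps raises coverage
# but lowers specificity -- the tradeoff steered by the granularity level.
```

Sweeping eps over a numeric model's outputs traces out the coverage/specificity tradeoff that the granular design optimizes.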
9. Semisupervised Learning via Axiomatic Fuzzy Set Theory and SVM.
- Author
-
Jia, Wenjuan, Liu, Xiaodong, Wang, Yuangang, Pedrycz, Witold, and Zhou, Juxiang
- Abstract
In this article, we present a semantic semisupervised learning (Semantic SSL) approach targeted at unifying two machine-learning paradigms in a mutually beneficial way, where the classical support vector machine (SVM) learns to reveal primitive logic facts from data, while axiomatic fuzzy set (AFS) theory is utilized to exploit semantic knowledge and correct the wrongly perceived facts for improving the machine-learning model. This novel semisupervised method can easily produce interpretable semantic descriptions to outline different categories by forming a fuzzy set with semantic explanations realized on the basis of the AFS theory. Besides, it is known that disagreement-based semisupervised learning (SSL) can be viewed as an excellent schema so that a co-training approach with SVM and the AFS theory can be utilized to improve the resulting learning performance. Furthermore, an evaluation index is used to prune descriptions to deliver promising performance. Compared with other semisupervised approaches, the proposed approach can build a structure to reflect data-distributed information with unlabeled data and labeled data, so that the hidden information embedded in both labeled and unlabeled data can be sufficiently utilized and can potentially be applied to achieve good descriptions of each category. Experimental results demonstrate that this approach can offer a concise, comprehensible, and precise SSL frame, which strikes a balance between the interpretability and the accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
10. A Survey on Trust Evaluation Based on Machine Learning
- Author
-
Wang, Jingwen, Jing, Xuyang, Yan, Zheng, Fu, Yulong, Pedrycz, Witold, Yang, Laurence T., Xidian University, Network Security and Trust, University of Alberta, Saint Francis Xavier University, Department of Communications and Networking, Aalto-yliopisto, and Aalto University
- Subjects
evaluation requirements, machine learning, performance metrics, Trust evaluation - Abstract
Trust evaluation is the process of quantifying trust with attributes that influence trust. It faces a number of severe issues such as lack of essential evaluation data, the demand of big data processing, the request for simple trust relationship expression, and the expectation of automation. In order to overcome these problems and evaluate trust intelligently and automatically, machine learning has been applied to trust evaluation. Researchers have proposed many methods to use machine learning for trust evaluation. However, the literature still lacks a comprehensive review on this topic. In this article, we perform a thorough survey on trust evaluation based on machine learning. First, we cover essential prerequisites of trust evaluation and machine learning. Then, we justify a number of requirements that a sound trust evaluation method should satisfy, and propose them as evaluation criteria to assess the performance of trust evaluation methods. Furthermore, we systematically organize existing methods according to application scenarios and provide a comprehensive literature review on trust evaluation from the perspective of machine learning's function in trust evaluation and evaluation granularity. Finally, according to the completed review and evaluation, we explore some open research problems and suggest directions worth future research effort.
- Published
- 2020
11. Development of two-phase logic-oriented fuzzy AND/OR network.
- Author
-
Alateeq, Majed and Pedrycz, Witold
- Subjects
FUZZY neural networks, FUZZY sets, MACHINE learning, LEARNING ability - Abstract
The architecture of AND/OR fuzzy neural networks exhibits outstanding learning abilities and significant interpretation capabilities. However, AND/OR networks suffer from structure-related problems, namely low efficiency and slow learning convergence, caused by factors such as high dimensionality and gradient-based learning algorithms that lead to a visible computing overhead. In this paper, we present a two-phase fuzzy logic-oriented network design composed of AND/OR neurons. This design takes advantage of the Randomized Neural Network (RNN) to achieve faster convergence while exhibiting good nonlinear approximation capabilities. A gradient-based learning algorithm is implemented in the second phase of the design to further reduce the values of the performance index. The quality of the proposed design and the resulting architecture is quantified through the use of numeric data along with fuzzy sets (information granules). Experimental results meet the research objectives, and the proposed design methodology opens up new directions for further improvements. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
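The AND/OR neurons referenced in the entry above have compact logic-oriented definitions that are worth seeing in code. In the usual formulation, an AND neuron combines each input with its weight by an s-norm and aggregates across inputs with a t-norm; an OR neuron does the dual. The sketch below fixes the product t-norm and probabilistic-sum s-norm as one standard choice.

```python
import numpy as np

def s_norm(a, b):
    """Probabilistic sum: a s b = a + b - a*b."""
    return a + b - a * b

def and_neuron(x, w):
    """AND neuron: y = T_i (x_i s w_i). A weight of 1 excludes that input,
    since x s 1 = 1 is neutral for the product t-norm."""
    return np.prod(s_norm(x, w))

def or_neuron(x, w):
    """OR neuron: y = S_i (x_i t w_i). A weight of 0 excludes that input,
    since x * 0 = 0 is neutral for the s-norm."""
    out = 0.0
    for xi, wi in zip(x, w):
        out = s_norm(out, xi * wi)
    return out
```

A two-phase design as in the abstract would randomize one layer's weights (the RNN phase) and tune the rest by gradient descent on a performance index.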
12. Disjunctive Fuzzy Neural Networks: A New Splitting-Based Approach to Designing a T–S Fuzzy Model.
- Author
-
Wang, Ning, Pedrycz, Witold, Yao, Wen, Chen, Xiaoqian, and Zhao, Yong
- Subjects
FUZZY neural networks, GREEDY algorithms, MACHINE learning, FUZZY numbers, DECISION trees - Abstract
This article proposes a new network approach to the implementation of Takagi–Sugeno (T–S) fuzzy models, referred to as disjunctive fuzzy neural networks (DJFNNs). The proposed DJFNN involves a novel network architecture and a greedy learning algorithm. Differing from the existing grid-based and clustering-based network architectures, the proposed architecture adds an OR neural layer positioned between the fuzzification layer and the rule layer. In this way, the implied constraint between the number of rules and the number of fuzzy labels is removed, so that the curse of dimensionality can be overcome and more interpretable models are formed. Furthermore, inspired by the core algorithm for building a decision tree, a top-down, nonbacktracking, greedy algorithm is proposed to learn the unknown parameters of the networks. The input space is split into smaller and smaller subspaces along the predefined fuzzy grids in a supervised manner, while the associated conditions of the T–S fuzzy model are identified. The greedy algorithm is applicable to high-dimensional problems since there is no exponential growth in time or space as the dimensionality increases. The new network architecture and greedy learning algorithm make the proposed DJFNN a regression model of high interpretability and good prediction capability, particularly suitable for solving high-dimensional problems. The DJFNN was evaluated on a synthetic dataset and 28 real-world datasets and compared with classical and state-of-the-art methods through nonparametric statistical tests. The results confirmed the effectiveness of the DJFNN in terms of accuracy, interpretability, and computational cost. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
13. Residual-Sparse Fuzzy C-Means Clustering Incorporating Morphological Reconstruction and Wavelet Frame.
- Author
-
Wang, Cong, Pedrycz, Witold, Li, ZhiWu, Zhou, MengChu, and Zhao, Jun
- Subjects
IMAGE segmentation, ALGORITHMS, WAVELET transforms, IMAGE reconstruction, GAUSSIAN mixture models - Abstract
In this article, we develop a residual-sparse Fuzzy C-Means (FCM) algorithm for image segmentation, which furthers FCM's robustness by favorably estimating the residual (e.g., unknown noise) between an observed image and its ideal (noise-free) version. To achieve a sound tradeoff between detail preservation and noise suppression, morphological reconstruction is used to filter the observed image. By combining the observed and filtered images, a weighted sum image is generated. Tight wavelet frame decomposition is used to transform the weighted sum image into its corresponding feature set. Taking this feature set as data for clustering, we impose an ℓ0 regularization term on the residual in FCM's objective function, resulting in residual-sparse FCM, where spatial information is introduced to improve robustness and make residual estimation more reliable. To further enhance segmentation accuracy, we employ morphological reconstruction to smooth the labels generated by clustering. Finally, based on the prototypes and smoothed labels, a segmented image is reconstructed by using tight wavelet frame reconstruction. Experimental results on synthetic, medical, and real-world images show that the proposed algorithm is effective and efficient, and outperforms its peers. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
14. Design of Reinforced Hybrid Fuzzy Rule-Based Neural Networks Driven to Inhomogeneous Neurons and Tournament Selection.
- Author
-
Zhang, Congcong, Oh, Sung-Kwun, Fu, Zunwei, and Pedrycz, Witold
- Subjects
FUZZY neural networks, NEURONS, RETINAL blood vessels, EVOLUTIONARY computation, ALGORITHMS, MACHINE learning - Abstract
In this article, we introduce novel reinforced hybrid fuzzy rule-based neural networks (RHFNNs). This article is concerned with the development of design methodologies for a hybrid fuzzy rule-based model, constructing the network structure and enhancing its predictive abilities through the combination of inhomogeneous neurons [i.e., clustering-based polynomial neurons (CPNs) and polynomial neurons (PNs)] and tournament selection. The key points of the proposed RHFNN are as follows. The first layer of the proposed network consists of CPNs. A CPN can effectively reflect the complex nonlinear structure encountered in the data space and refine (granulate) it with the help of a clustering algorithm. Two types of CPNs are designed: the hard C-means (HCM) clustering-based polynomial neuron (HCPN) and the fuzzy C-means (FCM) clustering-based polynomial neuron (FCPN). According to the type of CPN used in the first layer, RHFNN can be categorized into two types, namely RHFNN based on HCPN (HRHFNN) and RHFNN based on FCPN (FRHFNN). We use PNs to construct the second and consecutive layers. A PN can identify and approximate the nonlinear relationship between a system's inputs and outputs. A tournament-based performance selection (TPS) algorithm stemming from evolutionary computation is used for neuron selection. TPS not only ensures that the candidate nodes have sufficient fitting ability but also enhances the individual diversity in the node set and provides the ability to generate better prediction nodes. In addition, L2-norm regularization is considered to reduce the deviation between coefficients, ameliorate overfitting, and boost generalization ability. The performance of RHFNN is discussed through a variety of publicly available machine learning datasets. From the experimental results, we conclude that RHFNN achieves the best prediction accuracy on 13 of 15 datasets; the statistical analysis also confirms the superiority of RHFNN. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
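The tournament-based selection used for neuron selection in the RHFNN entry is a standard evolutionary-computation operator and is easy to sketch: repeatedly draw a few random candidates and keep the fittest of each draw. It favors good nodes while preserving diversity, because moderately fit candidates can still win tournaments that happen to exclude the best ones. Tournament size and winner count below are illustrative.

```python
import numpy as np

def tournament_select(fitness, n_winners, t_size=3, seed=0):
    """Tournament selection over a pool scored by `fitness` (higher is better).
    Returns the indices of the selected individuals."""
    rng = np.random.default_rng(seed)
    winners = []
    for _ in range(n_winners):
        contenders = rng.choice(len(fitness), size=t_size, replace=False)
        winners.append(contenders[np.argmax(fitness[contenders])])
    return winners
```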
15. Guest Editorial Evolutionary Computation Meets Deep Learning.
- Author
-
Ding, Weiping, Pedrycz, Witold, Yen, Gary G., and Xue, Bing
- Subjects
DEEP learning, EVOLUTIONARY computation, COMPUTER vision, MACHINE learning, SPEECH perception, CONSTRAINED optimization - Abstract
Deep learning is a timely research direction in machine learning, where breakthrough progress has been made in both academia and industry, bringing promising results in speech recognition, computer vision, industrial control and automation, etc. The motivation of deep learning is primarily to establish a model that simulates the neural connection structure of the human brain. When dealing with complex tasks, deep learning adopts a number of transformation stages to deliver an in-depth description and interpretation of the data. Deep learning achieves exceptional power and flexibility by learning to represent the task through a nested hierarchy of layers, with more abstract representations formed successively in terms of less abstract ones. One of the key issues of existing deep learning approaches is that meaningful representations can be learned only when the hyperparameter settings are properly specified beforehand, while general parameters are learned during the training process. Until now, not much research has been dedicated to automatically setting the hyperparameters and accurately finding the globally optimal general parameters. However, this problem can be formulated as optimization problems, including discrete optimization, constrained optimization, large-scale global optimization, and multiobjective optimization, by engaging mechanisms of evolutionary computation. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
16. A Survey on Trust Evaluation Based on Machine Learning.
- Author
-
Wang, Jingwen, Jing, Xuyang, Yan, Zheng, Fu, Yulong, Pedrycz, Witold, and Yang, Laurence T.
- Subjects
MACHINE learning, TRUST, ELECTRONIC data processing, BIG data - Abstract
Trust evaluation is the process of quantifying trust with attributes that influence trust. It faces a number of severe issues such as lack of essential evaluation data, the demand of big data processing, the request for simple trust relationship expression, and the expectation of automation. In order to overcome these problems and evaluate trust intelligently and automatically, machine learning has been applied to trust evaluation. Researchers have proposed many methods to use machine learning for trust evaluation. However, the literature still lacks a comprehensive review on this topic. In this article, we perform a thorough survey on trust evaluation based on machine learning. First, we cover essential prerequisites of trust evaluation and machine learning. Then, we justify a number of requirements that a sound trust evaluation method should satisfy, and propose them as evaluation criteria to assess the performance of trust evaluation methods. Furthermore, we systematically organize existing methods according to application scenarios and provide a comprehensive literature review on trust evaluation from the perspective of machine learning's function in trust evaluation and evaluation granularity. Finally, according to the completed review and evaluation, we explore some open research problems and suggest directions worth future research effort. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
17. Reinforcement learning-assisted evolutionary algorithm: A survey and research opportunities.
- Author
-
Song, Yanjie, Wu, Yutong, Guo, Yangyang, Yan, Ran, Suganthan, Ponnuthurai Nagaratnam, Zhang, Yue, Pedrycz, Witold, Das, Swagatam, Mallipeddi, Rammohan, Ajani, Oladayo Solomon, and Feng, Qiang
- Subjects
MACHINE learning, REINFORCEMENT learning, EVOLUTIONARY algorithms, RESEARCH personnel - Abstract
Evolutionary algorithms (EAs), a class of stochastic search methods based on the principles of natural evolution, have received widespread acclaim for their exceptional performance in various real-world optimization problems. While researchers worldwide have proposed a wide variety of EAs, certain limitations remain, such as slow convergence speed and poor generalization capabilities. Consequently, numerous scholars actively explore improvements to algorithmic structures, operators, search patterns, etc., to enhance their optimization performance. Reinforcement learning (RL) integrated as a component in the EA framework has demonstrated superior performance in recent years. This paper presents a comprehensive survey on integrating reinforcement learning into the evolutionary algorithm, referred to as the reinforcement learning-assisted evolutionary algorithm (RL-EA). We begin with conceptual outlines of reinforcement learning and the evolutionary algorithm. We then provide a taxonomy of RL-EA. Subsequently, we discuss the RL-EA integration method, the RL-assisted strategies adopted by RL-EA, and their applications according to the existing literature. The RL-assisted procedure is divided according to the implemented functions, including solution generation, learnable objective function, algorithm/operator/sub-population selection, parameter adaptation, and other strategies. Additionally, different attribute settings of RL in RL-EA are discussed. In the applications section, we also demonstrate the excellent performance of RL-EA on several benchmarks and a range of public datasets to facilitate a quick comparative study. Finally, we analyze potential directions for future research. This survey serves as a rich resource for researchers interested in RL-EA, as it overviews the current state of the art and highlights the associated challenges. By leveraging this survey, readers can swiftly gain insights into RL-EA to develop efficient algorithms, thereby fostering further advancements in this emerging field. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
18. Prediction Intervals for Granular Data Streams Based on Evolving Type-2 Fuzzy Granular Neural Network Dynamic Ensemble.
- Author
-
Liu, Yang, Zhao, Jun, Wang, Wei, and Pedrycz, Witold
- Subjects
FUZZY neural networks, DATABASES, SMART structures, MACHINE learning, ARTIFICIAL neural networks - Abstract
Granular data streams (GDSs) are a class of high-level, abstract, multi-time-scale descriptions of data streams. Prediction intervals (PIs) for GDSs that provide estimated values along with their corresponding reliability play an important role in assisting on-site workers in perceiving a nonstationary environment in real time. However, constructing reliable PIs for GDSs constitutes a significant challenge. To provide a solution to this problem, an interval type-2 (IT2) fuzzy granular neural network (FGNN) dynamic ensemble approach (IT2FGNNDEnsemble) is proposed in this article. To fully reflect the uncertainty of GDSs, an interval value learning algorithm-based IT2FGNN is developed, which can automatically generate, prune, merge, and realize recall in a single-pass learning mode. In addition, an evolving dynamic ensemble method is presented by providing an adaptive structure that considers a tradeoff between coverage and width of PIs, and can dynamically generate and prune ensemble elements according to the current data tendency. A number of synthetic and industrial data streams experimentally validate the performance of the proposed IT2FGNNDEnsemble against state-of-the-art comparative methods. It is demonstrated that the proposed approach exhibits good performance on PIs for practical applications. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
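The coverage/width tradeoff for prediction intervals mentioned in the entry above is commonly scored with PICP (empirical coverage) and PINAW (normalized average width), often combined into a single coverage-width criterion that penalizes intervals missing a target coverage. The combination rule and the hyperparameters mu and eta below are illustrative; the paper's ensemble may weight the tradeoff differently.

```python
import numpy as np

def pi_quality(y, lo, hi, mu=0.95, eta=10.0):
    """Score prediction intervals [lo, hi] against targets y.
    Returns (PICP, PINAW, CWC-style score; lower score is better)."""
    picp = np.mean((lo <= y) & (y <= hi))              # empirical coverage
    pinaw = np.mean(hi - lo) / (y.max() - y.min())     # normalized avg width
    # exponential penalty when coverage falls below the target mu
    penalty = np.exp(-eta * (picp - mu)) if picp < mu else 1.0
    return picp, pinaw, pinaw * penalty
```

A dynamic ensemble could add or prune members depending on whether such a score improves on the recent window of the stream.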
19. Design of Fuzzy Ensemble Architecture Realized With the Aid of FCM-Based Fuzzy Partition and NN With Weighted LSE Estimation.
- Author
-
Roh, Seok-Beom, Oh, Sung-Kwun, Pedrycz, Witold, and Fu, Zunwei
- Subjects
LEAST squares, MACHINE learning, ARTIFICIAL neural networks - Abstract
Neural networks (NNs) with least square error (LSE) estimation form a certain type of single-hidden-layer feed-forward NN. In this class of networks, the input connections (weights) and the biases of hidden neurons are generated randomly and fixed after being generated. The output connections are estimated by the LSE method rather than by back-propagation. The random generation of the input connection weights and the hidden biases requires a larger number of hidden neurons to assure the quality of classification performance. To reduce the number of neurons in the hidden layer while maintaining classification performance, we apply a "divide and conquer" strategy in this article. In other words, we divide the overall input space into several subspaces by using an information granulation technique (the Fuzzy C-Means clustering algorithm) and determine the local decision boundaries among related subspaces. A decision boundary defined in the input space can be considered as being composed of several decision boundaries defined in the subspaces that form the entire input space. For the decision boundaries defined in the subspaces, the nonlinearity becomes lower in comparison with the one encountered when considering the entire input space. Through weighted LSE estimation instead of the plain LSE estimation method, the connections of several NNs can be estimated without interfering with one another. After estimating the weights, the decision boundaries defined in the related subspaces are merged into a single decision boundary by using a fuzzy ensemble technique. Several machine learning datasets and one real-world application dataset are used to evaluate and validate the proposed fuzzy ensemble classifier. Based on the experimental results, the proposed classifier shows better classification performance when compared with some selected classifiers. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
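The abstract above describes networks whose hidden layer is random and fixed, with only the output weights estimated by least squares. A minimal toy sketch of that scheme follows; it is illustrative only (the paper's weighted LSE and FCM-based subspace decomposition are not reproduced), and a small Gaussian-elimination routine stands in for a linear-algebra library.

```python
import math
import random

random.seed(0)

def solve(A, b):
    """Solve the linear system A x = b by Gaussian elimination (small systems only)."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def random_hidden_layer(X, n_hidden):
    """Randomly generated (and then fixed) input weights and hidden biases."""
    d = len(X[0])
    W = [[random.uniform(-1, 1) for _ in range(d)] for _ in range(n_hidden)]
    b = [random.uniform(-1, 1) for _ in range(n_hidden)]
    return [[math.tanh(sum(w[j] * x[j] for j in range(d)) + bi)
             for w, bi in zip(W, b)] for x in X]

# Toy data: learn y = x0 + x1 on a few points (hypothetical example).
X = [[0.1, 0.2], [0.4, 0.1], [0.3, 0.5], [0.8, 0.2], [0.6, 0.6]]
y = [sum(x) for x in X]

H = random_hidden_layer(X, n_hidden=4)
# LSE estimation of output weights: solve the normal equations (H^T H) beta = H^T y.
HtH = [[sum(H[k][i] * H[k][j] for k in range(len(H))) for j in range(4)] for i in range(4)]
Hty = [sum(H[k][i] * y[k] for k in range(len(H))) for i in range(4)]
beta = solve(HtH, Hty)

pred = [sum(h[i] * beta[i] for i in range(4)) for h in H]
train_error = max(abs(p - t) for p, t in zip(pred, y))
```

Only the output weights `beta` are ever estimated; the hidden layer stays frozen, which is what forces the larger hidden-neuron count the abstract mentions.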
20. A design of information granule-based under-sampling method in imbalanced data classification.
- Author
-
Liu, Tianyu, Zhu, Xiubin, Pedrycz, Witold, and Li, Zhiwu
- Subjects
GRANULATION ,SUPPORT vector machines ,INFORMATION design ,GRANULAR computing ,DATA distribution ,MACHINE learning - Abstract
In numerous real-world problems, we are faced with difficulties in learning from imbalanced data. The classification performance of a "standard" classifier (learning algorithm) is evidently hindered by the imbalanced distribution of data. Over-sampling and under-sampling methods have been researched extensively with the aim of increasing the prediction accuracy over the minority class. However, traditional under-sampling methods tend to ignore important characteristics pertinent to the majority class. In this paper, a novel under-sampling method based on information granules is proposed. The method exploits the concepts and algorithms of granular computing. First, information granules are built around selected patterns coming from the majority class to capture the essence of the data belonging to this class. Subsequently, the resultant information granules are evaluated in terms of their quality, and those with the highest specificity values are selected. Next, the selected numeric data are augmented by weights implied by the size of the information granules. Finally, a support vector machine and a K-nearest-neighbor classifier, both regarded here as representative classifiers, are built on the weighted data. Experimental studies are carried out using synthetic data as well as a suite of imbalanced data sets coming from public machine learning repositories. The experimental results quantify the performance of the support vector machine and K-nearest-neighbor classifiers with the under-sampling method based on information granules, and demonstrate its superiority over the performance obtained for these classifiers endowed with a conventional under-sampling method. In general, the improvement in performance expressed in terms of G-means is over 10% when applying information granule under-sampling compared with random under-sampling. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
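A 1-D toy rendering of the granule-based under-sampling idea in entry 20. The interval granules, the coverage-specificity quality product, and the size-derived weights are simplifying assumptions made here for illustration, not the paper's exact formulation.

```python
import random

random.seed(1)

def build_granule(center, data, radius):
    """An interval information granule [center - radius, center + radius]
    built around a selected majority-class pattern."""
    lo, hi = center - radius, center + radius
    covered = [x for x in data if lo <= x <= hi]
    coverage = len(covered) / len(data)        # how much data the granule embraces
    specificity = max(0.0, 1.0 - 2 * radius)   # narrower granule -> more specific
    return {"center": center, "radius": radius, "covered": covered,
            "coverage": coverage, "specificity": specificity,
            "quality": coverage * specificity}  # product criterion, a common choice

# Synthetic 1-D majority-class data in [0, 1].
majority = [random.random() for _ in range(200)]
seeds = [0.2, 0.5, 0.8]   # selected patterns (e.g., cluster centers)

# For each seed, keep the granule radius that maximizes the quality criterion.
granules = [max((build_granule(c, majority, r) for r in (0.05, 0.1, 0.2)),
                key=lambda g: g["quality"]) for c in seeds]

# Under-sampled majority class: granule representatives weighted by granule size,
# so granules representing more data carry more weight in the classifier.
undersampled = [(g["center"], len(g["covered"])) for g in granules]
```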
21. Deep learning : algorithms and applications.
- Author
-
Chen, Shyi-Ming and Pedrycz, Witold
- Subjects
Computer algorithms ,Machine learning - Abstract
Summary: This book presents a wealth of deep-learning algorithms and demonstrates their design process. It also highlights the need for a prudent alignment with the essential characteristics of the nature of learning encountered in the practical problems being tackled. Intended for readers interested in acquiring practical knowledge of analysis, design, and deployment of deep learning solutions to real-world problems, it covers a wide range of the paradigm's algorithms and their applications in diverse areas including imaging, seismic tomography, smart grids, surveillance and security, and health care, among others. Featuring systematic and comprehensive discussions on the development processes, their evaluation, and relevance, the book offers insights into fundamental design strategies for algorithms of deep learning.
- Published
- 2020
22. Development and Analysis of Deep Learning Architectures.
- Author
-
Pedrycz, Witold and Chen, Shyi-Ming
- Subjects
Big data ,Deep learning ,Machine learning - Abstract
Summary: This book offers a timely reflection on the remarkable range of algorithms and applications that have made the area of deep learning so attractive and heavily researched today. Introducing the diversity of learning mechanisms in the environment of big data, and presenting authoritative studies in fields such as sensor design, health care, autonomous driving, industrial control and wireless communication, it enables readers to gain a practical understanding of design. The book also discusses systematic design procedures, optimization techniques, and validation processes.
- Published
- 2019
23. A disease diagnosis system for smart healthcare based on fuzzy clustering and battle royale optimization.
- Author
-
Yan, Fei, Huang, Hesheng, Pedrycz, Witold, and Hirota, Kaoru
- Subjects
MACHINE learning ,OPTIMIZATION algorithms ,DIAGNOSIS ,FUZZY neural networks ,ELECTRONIC health records ,FEATURE selection - Abstract
The ongoing growth of the Internet of Things and machine learning technology has provided increased motivation for the development of smart healthcare. In this study, a disease diagnosis system is proposed for remote identification and early prediction in smart healthcare environments. The originality of this study resides in the innovative implementation of the ensuing modules to improve the diagnostic accuracy of the system. First, fuzzy clustering based on the forest optimization algorithm is employed to detect outliers, and a self-organizing fuzzy logic classifier is applied to supplement missing data in electronic medical records (EMRs). A feature selection technique using the battle royale optimization algorithm is then developed to remove redundant information and identify optimal EMR features. The refined and fused data are further classified using an eigenvalue-based machine learning algorithm to determine whether a patient exhibits a certain disease. Simulation experiments are conducted with widely used heart disease and diabetes datasets to evaluate the performance of the proposed system, using accuracy, precision, recall, and F-measure as evaluation metrics.
• A diagnostic system is proposed for early disease prediction in smart healthcare.
• Fuzzy clustering is applied to remove outliers from electronic medical records.
• A self-organizing fuzzy logic classifier is developed to supplement missing data.
• A feature selection scheme is included to identify optimal features from the data.
• Eigenvalue classification is used to ascertain whether a patient exhibits a disease. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
24. Granular neural networks: A study of optimizing allocation of information granularity in input space.
- Author
-
Song, Mingli, Jing, Yukai, and Pedrycz, Witold
- Subjects
ARTIFICIAL neural networks ,MULTILAYER perceptrons ,GENETIC algorithms ,MACHINE learning - Abstract
In this paper, we develop a granular input space for neural networks, especially for multilayer perceptrons (MLPs). Unlike conventional neural networks, a neural network with granular input is an augmented construct built on the basis of a well-trained numeric neural network. We explore an efficient way of forming granular input variables so that the corresponding granular outputs of the neural network achieve the highest values of the criteria of specificity (and support). When we augment neural networks by distributing information granularity across the input variables, the output of a network exhibits different levels of sensitivity to different input variables. Capturing the relationship between the input variables and the output is of great help when mining knowledge from the data, and in this way important features of the data can be easily found. As an essential design asset, information granules are considered in this construct. The quantification of information granules is viewed in terms of levels of granularity given by the expert. The detailed optimization procedure for the allocation of information granularity is realized by an improved partheno-genetic algorithm (IPGA). The proposed algorithm is shown to be effective through numeric studies completed for synthetic data and data coming from the machine learning and StatLib repositories. Moreover, the experimental studies offer a deep insight into the specificity of the input features. Highlights:
• An algorithm for developing a granular neural network with granular input on the basis of a designed network is proposed.
• The influence of different levels of information granularity on the performance of the granular network is studied.
• An improved partheno-genetic algorithm is used to optimize the allocation of information granularity.
• Synthetic data and real-world data sets are used to verify the effectiveness of the algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
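The core idea in entry 24, distributing a fixed budget of information granularity across input variables and judging allocations by the specificity of the resulting granular output, can be sketched on a toy model. The linear "network" and both allocations below are hypothetical; the paper optimizes the allocation with an improved partheno-genetic algorithm, which is not reproduced here.

```python
import itertools

def model(x):
    """Stand-in for a well-trained numeric network: y = 2*x0 - x1 (hypothetical)."""
    return 2 * x[0] - x[1]

def granular_output(x, eps):
    """Propagate granular (interval) inputs [x_i - eps_i, x_i + eps_i] through the
    model by enumerating interval endpoints (exact here, since the toy model is
    monotone in each input)."""
    corners = itertools.product(*[(xi - e, xi + e) for xi, e in zip(x, eps)])
    ys = [model(c) for c in corners]
    return min(ys), max(ys)

x = [1.0, 2.0]
# Two allocations of the same overall granularity budget (0.3) across the inputs.
allocations = {"even": [0.15, 0.15], "skewed": [0.05, 0.25]}

widths = {}
for name, eps in allocations.items():
    lo, hi = granular_output(x, eps)
    widths[name] = hi - lo   # narrower output interval = higher specificity
```

With the same total budget, the skewed allocation yields a narrower (more specific) output interval here because the model is more sensitive to `x0`; that sensitivity-driven allocation is exactly what the optimization in the paper searches for.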
25. Fuzzy Rule-Based Domain Adaptation in Homogeneous and Heterogeneous Spaces.
- Author
-
Zuo, Hua, Lu, Jie, Zhang, Guangquan, and Pedrycz, Witold
- Subjects
HOMOGENEOUS spaces ,DATA modeling ,MACHINE learning - Abstract
Domain adaptation aims to leverage knowledge acquired from a related domain (called a source domain) to improve the efficiency of completing a prediction task (classification or regression) in the current domain (called the target domain), which has a different probability distribution from the source domain. Although domain adaptation has been widely studied, most existing research has focused on homogeneous domain adaptation, where both domains have identical feature spaces. Recently, a new challenge proposed in this area is heterogeneous domain adaptation where both the probability distributions and the feature spaces are different. Moreover, in both homogeneous and heterogeneous domain adaptation, the greatest efforts and major achievements have been made with classification tasks, while successful solutions for tackling regression problems are limited. This paper proposes two innovative fuzzy rule-based methods to deal with regression problems. The first method, called fuzzy homogeneous domain adaptation, handles homogeneous spaces while the second method, called fuzzy heterogeneous domain adaptation, handles heterogeneous spaces. Fuzzy rules are first generated from the source domain through a learning process; these rules, also known as knowledge, are then transferred to the target domain by establishing a latent feature space to minimize the gap between the feature spaces of the two domains. Through experiments on synthetic datasets, we demonstrate the effectiveness of both methods and discuss the impact of some of the significant parameters that affect performance. Experiments on real-world datasets also show that the proposed methods improve the performance of the target model over an existing source model or a model built using a small amount of target data. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
26. Combining heterogeneous classifiers via granular prototypes.
- Author
-
Nguyen, Tien Thanh, Nguyen, Mai Phuong, Pham, Xuan Cuong, Liew, Alan Wee-Chung, and Pedrycz, Witold
- Subjects
GRANULAR materials ,SUPPORT vector machines ,CLASSIFICATION algorithms ,MACHINE learning ,ARTIFICIAL intelligence - Abstract
In this study, a novel framework to combine multiple classifiers in an ensemble system is introduced. Here we exploit the concept of an information granule to construct granular prototypes for each class on the outputs of an ensemble of base classifiers. In the proposed method, the uncertainty in the outputs of the base classifiers on training observations is captured by an interval-based representation. To predict the class label for a new observation, we first determine the distances between the outputs of the base classifiers for this observation and the class prototypes; the predicted class label is then obtained by choosing the label associated with the shortest distance. In the experimental study, we combine several learning algorithms to build the ensemble system and conduct experiments on the UCI, colon cancer, and selected CLEF2009 datasets. The experimental results demonstrate that the proposed framework outperforms several benchmarked algorithms, including two trainable combining methods (Decision Template and Two Stages Ensemble System), AdaBoost, Random Forest, L2-loss Linear Support Vector Machine, and Decision Tree. Highlights:
• We modelled the base classifiers' outputs by using granular prototypes.
• We quantified the distance between the base classifiers' output and a granular prototype.
• We proposed a novel framework to combine multiple classifiers in an ensemble system.
• The proposed method is highly competitive with some state-of-the-art ensemble methods. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
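The interval-based combining scheme in entry 26 can be sketched in a few lines: per-class prototypes are built as min-max intervals over the base classifiers' outputs, and a new observation takes the label of the nearest prototype. The toy data and the sum-of-interval-distances measure are assumptions for illustration, not the paper's exact distance.

```python
def interval_prototypes(outputs, labels):
    """For each class, build an interval [min, max] per base-classifier output,
    capturing the ensemble's uncertainty on the training observations."""
    protos = {}
    for cls in set(labels):
        rows = [o for o, l in zip(outputs, labels) if l == cls]
        protos[cls] = [(min(col), max(col)) for col in zip(*rows)]
    return protos

def interval_distance(value, interval):
    """Zero inside the interval, distance to the nearest endpoint outside it."""
    lo, hi = interval
    return 0.0 if lo <= value <= hi else min(abs(value - lo), abs(value - hi))

def predict(protos, output):
    """Choose the class whose granular prototype is closest to the new
    base-classifier output vector."""
    def dist(cls):
        return sum(interval_distance(v, iv) for v, iv in zip(output, protos[cls]))
    return min(protos, key=dist)

# Toy: two base classifiers, their probability-like outputs on training data.
train_outputs = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.1, 0.3]]
train_labels = ["pos", "pos", "neg", "neg"]

protos = interval_prototypes(train_outputs, train_labels)
label = predict(protos, [0.85, 0.7])   # a new observation's ensemble outputs
```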
27. Hidden Markov Models Based Approaches to Long-Term Prediction for Granular Time Series.
- Author
-
Guo, Hongyue, Pedrycz, Witold, and Liu, Xiaodong
- Subjects
HIDDEN Markov models ,PREDICTION models ,TIME series analysis ,GRANULAR computing ,MACHINE learning ,SUPPORT vector machines - Abstract
In time series forecasting, a challenging and important task is to realize long-term forecasting that is both accurate and transparent. In this study, we propose a long-term prediction approach by transforming the original numerical data into some meaningful and interpretable entities following the principle of justifiable granularity. The obtained sequences exhibiting sound semantics may have different lengths, which bring some difficulties when carrying out predictions. To equalize these temporal sequences, we propose to adjust their lengths by involving the dynamic time warping (DTW) distance. Two theorems are included to ensure the correctness of the proposed equalization approach. Finally, we exploit hidden Markov models (HMM) to derive the relations existing in the granular time series. A series of experiments using publicly available data are conducted to assess the performance of the proposed prediction method. The comparative analysis demonstrates the performance of the prediction delivered by the proposed model. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
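Entry 27 equalizes granular sequences of different lengths by means of the dynamic time warping (DTW) distance. The classic DTW recurrence it relies on can be sketched as follows (the HMM stage and the paper's equalization theorems are not reproduced; the sequences are toy data):

```python
def dtw(a, b):
    """Dynamic time warping distance between two numeric sequences, the basis
    for comparing/equalizing temporal sequences of different lengths."""
    n, m = len(a), len(b)
    INF = float("inf")
    # D[i][j] = cost of the best warping path aligning a[:i] with b[:j].
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(a[i - 1] - b[j - 1])
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]

s1 = [1.0, 2.0, 3.0]
s2 = [1.0, 1.0, 2.0, 3.0, 3.0]   # same shape, different length
d_same = dtw(s1, s2)             # 0: the shapes align perfectly under warping
d_diff = dtw(s1, [5.0, 6.0, 7.0])
```

Because warping absorbs the length difference, `s1` and `s2` are at distance zero even though they have different lengths, which is precisely the property that makes DTW suitable for equalizing the granular sequences.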
28. Evolving Ensemble Fuzzy Classifier.
- Author
-
Pratama, Mahardhika, Pedrycz, Witold, and Lughofer, Edwin
- Subjects
FUZZY systems ,STATISTICAL ensembles ,MACHINE learning ,DATA mining ,PARSIMONIOUS models ,GAUSSIAN function - Abstract
The concept of ensemble learning offers a promising avenue in learning from data streams under complex environments because it better addresses the bias and variance dilemma than its single-model counterpart and features a reconfigurable structure, which is well suited to the given context. While various extensions of ensemble learning for mining nonstationary data streams can be found in the literature, most of them are crafted with a static base classifier and revisit preceding samples in the sliding window for a retraining step. This feature causes computationally prohibitive complexity and is not flexible enough to cope with rapidly changing environments. Their complexities are often demanding because they involve a large collection of offline classifiers, owing to the absence of structural complexity reduction mechanisms and the lack of an online feature selection mechanism. A novel evolving ensemble classifier, namely Parsimonious Ensemble (pENsemble), is proposed in this paper. pENsemble differs from existing architectures in that it is built upon an evolving classifier learned from data streams, termed the Parsimonious Classifier. pENsemble is equipped with an ensemble pruning mechanism, which estimates a localized generalization error of a base classifier. A dynamic online feature selection scenario is integrated into pENsemble. This method allows for dynamic selection and deselection of input features on the fly. pENsemble adopts a dynamic ensemble structure to output a final classification decision, and features a novel drift detection scenario to grow the ensemble's structure. The efficacy of pENsemble has been demonstrated through rigorous numerical studies with dynamic and evolving data streams, where it delivers the most encouraging performance in attaining a tradeoff between accuracy and complexity. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
29. Automatic Selection of Process Corner Simulations for Faster Design Verification.
- Author
-
Shoniker, Michael, Oleynikov, Oleg, Cockburn, Bruce F., Han, Jie, Rana, Manish, and Pedrycz, Witold
- Subjects
INTEGRATED circuit design ,SEMICONDUCTORS ,BENCHMARK testing (Engineering) ,MACHINE learning ,GAUSSIAN processes - Abstract
Integrated circuit designs are verified in simulation over a set of process corners, which are combinations of expected transistor properties, power supply voltages, and die temperatures. The simulation time per corner can be long and semiconductor processes can have more than 1000 corners. Simulation is thus a serious bottleneck in design verification. We propose an algorithm that selects the smallest number of process corner simulations that are required to estimate minimum and/or maximum values of the output functions that model circuit behavior. Using our best corner selection algorithm, the required number of process corner simulations is reduced by an average of 79% (a speed-up of 4.71) with respect to a set of 46 output functions from nine industrial benchmark circuits. [ABSTRACT FROM PUBLISHER]
- Published
- 2018
- Full Text
- View/download PDF
30. Mining constrained inter-sequence patterns: a novel approach to cope with item constraints.
- Author
-
Le, Tuong, Nguyen, Anh, Huynh, Bao, Vo, Bay, and Pedrycz, Witold
- Subjects
DATA mining ,ARTIFICIAL intelligence ,ASSOCIATION rule mining ,BIG data ,MACHINE learning - Abstract
Data mining has become increasingly important in the Internet era. The problem of mining inter-sequence patterns is a sub-task of data mining that has been addressed by several algorithms in recent years. However, these algorithms focus only on the traditional problem of mining frequent inter-sequence patterns, and most frequent inter-sequence patterns are either redundant or insignificant. As such, they can confuse end users during decision-making and can require too many system resources. This led to the problem of mining inter-sequence patterns with item constraints, which addresses the case where end users are concerned only with patterns containing a number of specific items. In this paper, we propose two novel algorithms for this problem. The first is ISP-IC (Inter-Sequence Pattern with Item Constraint mining), an algorithm based on a theorem that quickly determines whether an inter-sequence pattern satisfies the constraints. Then, we propose a way to improve the strategy of ISP-IC, which is applied in the iISP-IC algorithm to enhance the performance of the process. Finally, piISP-IC, a parallel version of iISP-IC, is presented. Experimental results show that the piISP-IC algorithm outperforms the post-processing of the state-of-the-art method for mining inter-sequence patterns (EISP-Miner), as well as the ISP-IC and iISP-IC algorithms, in most cases. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
31. Granular Fuzzy Regression Domain Adaptation in Takagi–Sugeno Fuzzy Models.
- Author
-
Zuo, Hua, Zhang, Guangquan, Pedrycz, Witold, Behbood, Vahid, and Lu, Jie
- Subjects
FUZZY sets ,MATHEMATICAL models ,GRANULAR computing - Abstract
In classical data-driven machine learning methods, massive amounts of labeled data are required to build a high-performance prediction model. However, the amount of labeled data in many real-world applications is insufficient, so establishing a prediction model is impossible. Transfer learning has recently emerged as a solution to this problem. It exploits the knowledge accumulated in auxiliary domains to help construct prediction models in a target domain with inadequate training data. Most existing transfer learning methods solve classification tasks; only a few are devoted to regression problems. In addition, the current methods ignore the inherent phenomenon of information granularity in transfer learning. In this study, granular computing techniques are applied to transfer learning. Three granular fuzzy regression domain adaptation methods to determine the estimated values for a regression target are proposed to address three challenging cases in domain adaptation. The proposed granular fuzzy regression domain adaptation methods change the input and/or output space of the source domain's model using space transformation, so that the fuzzy rules are more compatible with the target data. Experiments on synthetic and real-world datasets validate the effectiveness of the proposed methods. [ABSTRACT FROM PUBLISHER]
- Published
- 2018
- Full Text
- View/download PDF
32. Fuzzy clustering with nonlinearly transformed data.
- Author
-
Zhu, Xiubin, Pedrycz, Witold, and Li, Zhiwu
- Subjects
FUZZY clustering technique ,NONLINEAR theories ,MATHEMATICAL functions ,MATHEMATICAL optimization ,MACHINE learning - Abstract
The Fuzzy C-Means (FCM) algorithm is a widely used objective function-based clustering method exploited in numerous applications. In order to improve the quality of clustering, this study develops a novel approach in which a transformed-data-based FCM is constructed. Two data transformation methods are proposed, by which the original data are projected in a nonlinear fashion onto a new space of the same dimensionality as the original one. Next, clustering is carried out on the transformed data. Two optimization criteria, namely a classification error and a reconstruction error, are introduced and utilized to guide the optimization of the performance of the new clustering algorithm and the transformation of the original data space. Unlike other data transformation methods that require some prior knowledge, in this study Particle Swarm Optimization (PSO) is used to determine the optimal transformation on the basis of a certain performance index. Experimental studies completed for a synthetic data set and a number of data sets coming from the Machine Learning Repository demonstrate the performance of the FCM with transformed data. The experiments show that the proposed fuzzy clustering method achieves better performance (in terms of clustering accuracy and reconstruction error) in comparison with the outcomes produced by the generic version of the FCM algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
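The "transform, then cluster" pipeline of entry 32 can be sketched with the standard FCM update equations on 1-D data. The logistic transformation and the deterministic min/max initialization below are illustrative assumptions; the paper learns its nonlinear transformation via PSO, which is not reproduced here.

```python
import math

def transform(x):
    """A hypothetical nonlinear (logistic) transformation of the data space."""
    return 1.0 / (1.0 + math.exp(-x))

def fcm_1d(data, c=2, m=2.0, iters=40):
    """Core Fuzzy C-Means loop (1-D): alternate the standard membership and
    center updates. Initialization at the data extremes assumes c = 2."""
    centers = [min(data), max(data)]
    U = []
    for _ in range(iters):
        U = []
        for x in data:
            dists = [abs(x - v) + 1e-12 for v in centers]  # guard against /0
            U.append([1.0 / sum((di / dj) ** (2.0 / (m - 1.0)) for dj in dists)
                      for di in dists])
        centers = [sum((U[k][i] ** m) * data[k] for k in range(len(data))) /
                   sum(U[k][i] ** m for k in range(len(data)))
                   for i in range(c)]
    return centers, U

raw = [0.1, 0.2, 0.15, 3.0, 3.2, 2.9]      # two well-separated groups
centers, U = fcm_1d([transform(x) for x in raw])
```

Clustering runs entirely in the transformed space; in the paper, the quality of the learned transformation is what the classification and reconstruction error criteria optimize.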
33. Fuzzy Regression Transfer Learning in Takagi–Sugeno Fuzzy Models.
- Author
-
Zuo, Hua, Zhang, Guangquan, Pedrycz, Witold, Behbood, Vahid, and Lu, Jie
- Subjects
FUZZY systems ,DATA science ,KNOWLEDGE transfer - Abstract
Data science is a research field concerned with processes and systems that extract knowledge from massive amounts of data. In some situations, however, data shortage renders existing data-driven methods difficult or even impossible to apply. Transfer learning has recently emerged as a way of exploiting previously acquired knowledge to solve new yet similar problems much more quickly and effectively. In contrast to classical data-driven machine learning methods, transfer learning methods exploit the knowledge accumulated from data in auxiliary domains to facilitate predictive modeling in the current domain. A significant number of transfer learning methods that address classification tasks have been proposed, but studies on transfer learning in the case of regression problems are still scarce. This study focuses on using transfer learning techniques to handle regression problems in a domain that has insufficient training data. We propose an original fuzzy regression transfer learning method, based on fuzzy rules, to address the problem of estimating the value of the target for regression. A Takagi–Sugeno fuzzy regression model is developed to transfer knowledge from a source domain to a target domain. Experimental results using synthetic data and real-world datasets demonstrate that the proposed fuzzy regression transfer learning method significantly improves the performance of existing models when tackling regression problems in the target domain. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
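Entry 33 transfers Takagi-Sugeno fuzzy rules from a source to a target domain. The TS inference step itself, Gaussian firing strengths weighting linear rule consequents, can be sketched as follows; the rule base is hypothetical, and the transfer step (adapting rules to the target data) is only indicated in a comment.

```python
import math

def ts_predict(x, rules):
    """Takagi-Sugeno inference: Gaussian membership (firing) strengths weight
    the outputs of the linear rule consequents."""
    weights = [math.exp(-((x - c) ** 2) / (2 * s ** 2)) for c, s, _, _ in rules]
    outputs = [a * x + b for _, _, a, b in rules]
    return sum(w * o for w, o in zip(weights, outputs)) / sum(weights)

# Source-domain rules as (center, spread, slope, intercept); a transfer method
# would then shift the centers and/or consequents toward the target data (not shown).
rules = [
    (0.0, 1.0,  1.0,  0.0),   # IF x is around 0 THEN y = x
    (5.0, 1.0, -1.0, 10.0),   # IF x is around 5 THEN y = -x + 10
]

y0 = ts_predict(0.0, rules)
y5 = ts_predict(5.0, rules)
```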
34. A supervised gradient-based learning algorithm for optimized entity resolution.
- Author
-
Reyes-Galaviz, Orion F., Pedrycz, Witold, He, Ziyue, and Pizzi, Nick J.
- Subjects
- *
MACHINE learning , *PROBABILITY theory , *TASK performance , *FALSE positive error , *MATHEMATICAL functions , *MATHEMATICAL optimization - Abstract
The task of probabilistic record linkage is to find and link records that refer to the same entity across several disparate data sources. The accurate linking of records (entity resolution) is an important task for the healthcare industry, government, law enforcement, and the private sector, for obvious reasons. However, finding exact matches of an entity can be challenging due to records with typographical, phonetic, or other types of errors (noise) found across real-world data sources. Over the years, many comparison functions have been developed to relate pairs of records and produce a similarity score. With a pair of predefined thresholds, one may decide whether record pairs match, do not match, or require further clerical review. Nevertheless, finding appropriate comparison functions, identity descriptors (fields), threshold values, and efficient classifiers remains a challenging task. In this study, we propose a supervised gradient-based learning model that can adjust its structure and parameters based on matching scores coming from many comparison functions (applied to many fields) to efficiently classify the records. The design of this structure is transparent and can potentially allow us to locate which comparison functions and fields are most significant for correctly linking the records. To train this structure, we propose a novel performance index that helps learn how to separate matched from non-matched records. Results obtained with synthetic datasets affected by different levels of noise, as well as real-world datasets, show the effectiveness of the algorithm, which can significantly reduce the number of false positives, false negatives, and the number of records selected for review. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
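The two-threshold decision rule that entry 34 builds on (match / clerical review / non-match from combined comparison scores) can be sketched directly. The fields, weights, and threshold values are hypothetical; the paper learns the weighting structure by gradient-based training rather than fixing it by hand.

```python
def classify_pair(scores, weights, t_match=0.8, t_review=0.5):
    """Combine per-field comparison scores into a weighted similarity and apply
    the classic two-threshold decision: match / clerical review / non-match."""
    total = sum(w * s for w, s in zip(weights, scores)) / sum(weights)
    if total >= t_match:
        return "match"
    if total >= t_review:
        return "review"
    return "non-match"

# Hypothetical comparison scores for (name, birth date, address) of record pairs.
weights = [2.0, 3.0, 1.0]   # e.g., birth date taken as the most discriminative field

d1 = classify_pair([0.95, 1.0, 0.8], weights)
d2 = classify_pair([0.7, 0.6, 0.5], weights)
d3 = classify_pair([0.2, 0.1, 0.3], weights)
```

Pairs falling between the two thresholds land in the review band, which is exactly the set the paper's learned classifier tries to shrink.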
35. FUZZY TRANSFER LEARNING IN DATA-SHORTAGE AND RAPIDLY CHANGING ENVIRONMENTS.
- Author
-
Zuo, Hua, Zhang, Guangquan, Behbood, Vahid, Lu, Jie, Pedrycz, Witold, and Zhang, Tong
- Subjects
FUZZY logic ,BIG data ,MACHINE learning ,KNOWLEDGE transfer ,ARTIFICIAL neural networks ,INFORMATION retrieval - Published
- 2016
36. Some new qualitative insights into quality of fuzzy rule-based models.
- Author
-
Kerr-Wilson, Jeremy and Pedrycz, Witold
- Subjects
- *
FUZZY systems , *MATHEMATICAL models , *KNOWLEDGE management , *GENERALIZABILITY theory , *MACHINE learning - Abstract
Rules in fuzzy rule-based models convey essential knowledge about the system under discussion. As such, they capture the essence of relationships occurring among input and output variables. While the quality of such fuzzy models is predominantly related to the accuracy and eventual interpretability of the rules (although to a limited extent), the quality of rules regarded as generic pieces of knowledge has not been studied so far. In this study, we formulate and investigate this problem by looking at the quality of rules in terms of (a) stability, (b) generalizability, and (c) conflict. We identify the concepts of rule multiplicity and conflict, and study the emergence of rule generalization. A number of pertinent performance indices are developed, and their usage is presented through a series of experimental studies. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
37. Discriminative sparse subspace learning and its application to unsupervised feature selection.
- Author
-
Zhou, Nan, Cheng, Hong, Pedrycz, Witold, Zhang, Yong, and Liu, Huaping
- Subjects
LEARNING ,FEATURE selection ,PROBLEM solving ,MACHINE learning ,INFORMATION theory ,COMPUTER algorithms - Abstract
In order to efficiently use the intrinsic data information, in this study a Discriminative Sparse Subspace Learning (DSSL) model is investigated for unsupervised feature selection. First, the feature selection problem is formulated as a subspace learning problem. In order to efficiently learn the discriminative subspace, we exploit the discriminative information in the subspace learning process. Second, a two-step TDSSL algorithm and a joint modeling JDSSL algorithm are developed to incorporate the clusters' assignment as the discriminative information. A convergence analysis of these two algorithms is then provided. A kernelized discriminative sparse subspace learning (KDSSL) method is proposed to handle the nonlinear subspace learning problem. Finally, extensive experiments are conducted on real-world datasets to show the superiority of the proposed approaches over several state-of-the-art approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
38. Multistep Fuzzy Bridged Refinement Domain Adaptation Algorithm and Its Application to Bank Failure Prediction.
- Author
-
Behbood, Vahid, Lu, Jie, Zhang, Guangquan, and Pedrycz, Witold
- Subjects
MACHINE learning ,CLASSIFICATION algorithms ,PREDICTION models ,FUZZY algorithms ,DATA modeling - Abstract
Machine learning plays an important role in data classification and data-based prediction. In some real-world applications, however, the training data (coming from the source domain) and test data (from the target domain) come from different domains or time periods, and this may result in the different distributions of some features. Moreover, the values of the features and/or labels of the datasets might be nonnumeric and involve vague values. Traditional learning-based prediction and classification methods cannot handle these two issues. In this study, we propose a multistep fuzzy bridged refinement domain adaptation algorithm, which offers an effective way to deal with both issues. It utilizes a concept of similarity to modify the labels of the target instances that were initially predicted by a shift-unaware model. It then refines the labels using instances that are most similar to a given target instance. These instances are extracted from mixture domains composed of source and target domains. The proposed algorithm is built on a basis of some data and refines the labels, thus performing completely independently of the shift-unaware prediction model. The algorithm uses a fuzzy set-based approach to deal with the vague values of the features and labels. Four different datasets are used in the experiments to validate the proposed algorithm. The results, which are compared with those generated by the existing domain adaptation methods, demonstrate a significant improvement in prediction accuracy in both the above-mentioned datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
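The label-refinement step of entry 38, correcting a shift-unaware model's predictions using the most similar instances drawn from a mixed source/target pool, can be sketched with a simple nearest-neighbor vote. The 1-D data and the plain majority vote are illustrative assumptions; the paper uses a fuzzy similarity-based, multistep refinement rather than crisp k-NN.

```python
def refine_labels(targets, initial_labels, pool, pool_labels, k=3):
    """Refine shift-unaware predictions: relabel each target instance by a
    majority vote among its k most similar instances from the mixed pool."""
    refined = []
    for x, _ in zip(targets, initial_labels):
        nearest = sorted(zip(pool, pool_labels), key=lambda p: abs(p[0] - x))[:k]
        votes = [lab for _, lab in nearest]
        refined.append(max(set(votes), key=votes.count))
    return refined

# 1-D toy: the shift-unaware model mislabels target instances near 1.0.
pool = [0.9, 1.0, 1.1, 4.0, 4.1, 4.2]          # mixed source/target instances
pool_labels = ["A", "A", "A", "B", "B", "B"]

targets = [1.05, 4.05]
initial = ["B", "B"]                            # shift-unaware predictions
refined = refine_labels(targets, initial, pool, pool_labels)
```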
39. Mining system logs to learn error predictors: a case study of a telemetry system.
- Author
-
Russo, Barbara, Succi, Giancarlo, and Pedrycz, Witold
- Subjects
COMPUTER systems ,RELIABILITY in engineering ,MACHINE learning ,SYSTEM failures ,MATHEMATICAL sequences - Abstract
Predicting system failures can be of great benefit to managers, who gain better command over system performance. The data that systems generate in the form of logs is a valuable source of information for predicting system reliability. As such, there is an increasing demand for tools that mine logs and provide accurate predictions. However, interpreting the information in logs poses some challenges. This study discusses how to effectively mine sequences of logs and provide correct predictions. The approach integrates different machine learning techniques to control for data brittleness, provide accuracy of model selection and validation, and increase the robustness of classification results. We apply the proposed approach to log sequences of 25 different applications of a software system for telemetry and performance of cars. On this system, we discuss the ability of support vector machines with three well-known kernels - multilayer perceptron, radial basis function, and linear - to fit and predict defective log sequences. Our results show that a good analysis strategy provides stable, accurate predictions. Such a strategy must at least require high fitting ability of the models used for prediction. We demonstrate that such models give excellent predictions both on individual applications - e.g., 1% false positive rate, 94% true positive rate, and 95% precision - and across system applications - on average, 9% false positive rate, 78% true positive rate, and 95% precision. We also show that these results are similarly achieved for different degrees of sequence defectiveness. To show how good our results are, we compare them with recent studies in system log analysis. We finally provide some recommendations drawn from reflecting on our study. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
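The per-application figures quoted in the abstract above (false positive rate, true positive rate, precision) follow directly from a binary confusion matrix. A minimal sketch with illustrative counts chosen to mirror the reported rates (not the paper's actual data):

```python
def classification_metrics(tp, fp, tn, fn):
    """Return (false positive rate, true positive rate, precision)
    for a binary classifier of defective vs. clean log sequences."""
    fpr = fp / (fp + tn)        # clean sequences wrongly flagged
    tpr = tp / (tp + fn)        # defective sequences correctly caught
    precision = tp / (tp + fp)  # flagged sequences that are truly defective
    return fpr, tpr, precision

# Hypothetical counts, chosen only to illustrate rates like those reported.
fpr, tpr, precision = classification_metrics(tp=94, fp=5, tn=495, fn=6)
print(f"FPR={fpr:.2%}, TPR={tpr:.2%}, precision={precision:.2%}")
# → FPR=1.00%, TPR=94.00%, precision=94.95%
```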
40. Unsupervised feature selection via maximum projection and minimum redundancy.
- Author
-
Wang, Shiping, Pedrycz, Witold, Zhu, Qingxin, and Zhu, William
- Subjects
FEATURE selection, REDUNDANCY in engineering, INFORMATION retrieval, FACTORIZATION, GREEDY algorithms, KERNEL (Mathematics) - Abstract
Dimensionality reduction is an important and challenging task in machine learning and data mining. It can facilitate data clustering, classification, and information retrieval. As an efficient technique for dimensionality reduction, feature selection is about finding a small feature subset that preserves the most relevant information. In this paper, we propose a new criterion, called maximum projection and minimum redundancy feature selection, to address unsupervised learning scenarios. First, feature selection is formalized with the use of projection matrices and then characterized equivalently as a matrix factorization problem. Second, an iterative update algorithm and a greedy algorithm are proposed to tackle this problem. Third, kernel techniques are considered and the corresponding algorithm is also put forward. Finally, the proposed algorithms are compared with four state-of-the-art feature selection methods. Experimental results reported for six publicly available datasets demonstrate the superiority of the proposed algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
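The abstract above mentions a greedy algorithm that balances relevance against redundancy. The authors' criterion is projection-based and is not reproduced here; the sketch below is a simplified stand-in (variance as relevance, mean absolute correlation with already-chosen features as redundancy) meant only to illustrate the greedy relevance-minus-redundancy pattern. The function `greedy_select` and its scoring rule are assumptions, not the paper's method.

```python
import numpy as np

def greedy_select(X, k):
    """Greedily pick k column indices of X, trading off feature variance
    (relevance) against absolute correlation with the already-selected
    features (redundancy)."""
    n_features = X.shape[1]
    variances = X.var(axis=0)
    corr = np.abs(np.corrcoef(X, rowvar=False))
    selected = [int(np.argmax(variances))]  # seed with the most relevant feature
    while len(selected) < k:
        remaining = [j for j in range(n_features) if j not in selected]
        # score = relevance minus mean redundancy with the chosen set
        scores = [variances[j] - corr[j, selected].mean() for j in remaining]
        selected.append(remaining[int(np.argmax(scores))])
    return selected

# Demo on synthetic data: feature 3 is scaled up, so it has the highest
# variance and is picked first.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6))
X[:, 3] *= 10.0
print(greedy_select(X, 3))
```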
41. Subspace learning for unsupervised feature selection via matrix factorization.
- Author
-
Wang, Shiping, Pedrycz, Witold, Zhu, Qingxin, and Zhu, William
- Subjects
FACTORIZATION, SUBSPACES (Mathematics), SUPERVISED learning, MACHINE learning, DATA mining, MATHEMATICAL programming, KERNEL operating systems - Abstract
Dimensionality reduction is an important and challenging task in machine learning and data mining. Feature selection and feature extraction are two commonly used techniques for decreasing the dimensionality of the data and increasing the efficiency of learning algorithms. Specifically, feature selection realized in the absence of class labels, namely unsupervised feature selection, is challenging and interesting. In this paper, we propose a new unsupervised feature selection criterion developed from the viewpoint of subspace learning, which is treated as a matrix factorization problem. The advantages of this work are four-fold. First, dwelling on the technique of matrix factorization, a unified framework is established for feature selection, feature extraction, and clustering. Second, an iterative update algorithm is provided via matrix factorization, which is an efficient technique for dealing with high-dimensional data. Third, an effective method for feature selection with numeric data is put forward, instead of drawing support from a discretization process. Fourth, this new criterion provides a sound foundation for embedding kernel tricks into feature selection. In this regard, an algorithm based on kernel methods is also proposed. The algorithms are compared with four state-of-the-art feature selection methods using six publicly available datasets. Experimental results demonstrate that, in terms of clustering results, the two proposed algorithms outperform the others on almost all datasets considered here. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
42. Designing of higher order information granules through clustering heterogeneous granular data.
- Author
-
Wang, Dan, Nie, Peng, Zhu, Xiubin, Pedrycz, Witold, and Li, Zhiwu
- Subjects
GRANULAR computing, ALGORITHMS, KNOWLEDGE representation (Information theory), MACHINE learning - Abstract
This study is devoted to the generalization of information granules by forming higher order, namely order-2, information granules. Information granules are semantically meaningful entities, which play a central role in knowledge representation and system modeling in the framework of Granular Computing. The encountered information granules can exhibit significant heterogeneity because of the diversity of their formal frameworks. To facilitate an effective generalization of heterogeneous granular data when using clustering algorithms, an efficient scheme is proposed to form a unified representation of various types of granular data by using possibility-necessity measures. Once the clustering process has been completed in the possibility-necessity feature space, the higher order information granules come as results of decoding, involving the possibility-necessity metrics and fuzzy relational calculus. The extent to which the higher order information granules are supported by the granular data present at the lower level of the hierarchy is quantified in terms of the membership degrees obtained in the clustering process. Experimental studies concerning a series of publicly available datasets coming from the UCI and KEEL machine learning repositories are carried out. • A novel encoding/decoding method for representing heterogeneous granular data. • Clustering algorithms can be applied to cluster encoded heterogeneous granular data. • Supported by fuzzy relational calculus theory and possibility-necessity measures. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
43. Near real-time spatial prediction of earthquake-triggered landslides based on global inventories from 2008 to 2022.
- Author
-
Zhang, Aomei, Wang, Xianmin, Pedrycz, Witold, Yang, Qiyuan, Wang, Xuewen, and Guo, Haixiang
- Subjects
LANDSLIDE prediction, DEEP learning, EARTHQUAKES, MACHINE learning, TOPOGRAPHY, EARTHQUAKE intensity, LANDSLIDES - Abstract
Near real-time prediction of earthquake-triggered landslides (ETLs) can rapidly forecast the spatial distribution of coseismic landslides just after a great earthquake and provide effective support for emergency response. However, ETL prediction has always been a great challenge because of low accuracy and high false-alarm rates. This work proposes a novel fuzzy deep learning (FuDL) model for near real-time ETL spatial prediction. Fuzzy learning theory is employed in ETL prediction for the first time. The FuDL model has high generalization capability and robustness, effectively improving the accuracy of ETL prediction. Eighteen ETL inventories worldwide from 2008 to 2022 are employed to conduct ETL prediction. Following chronological order, the 15 inventories from 2008 to 2018 are adopted to train the FuDL model, and the 3 inventories from 2019 to 2022 are utilized for near real-time ETL prediction. Furthermore, this work reveals that ground movement, relatively steep and high topography, and strong seismic intensity are critical factors affecting the spatial distribution of ETLs. In addition, this work conducts a detailed analysis of the distribution patterns of ETLs on a global scale. • A novel FuDL model is suggested and achieves high accuracy and good generalization. • The FuDL model outperforms state-of-the-art machine learning and deep learning models. • Topography, ground shaking, and seismic intensity dominate the distribution of ETLs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
44. Maximizing minority accuracy for imbalanced pattern classification problems using cost-sensitive Localized Generalization Error Model.
- Author
-
Ng, Wing W.Y., Liu, Zhengxi, Zhang, Jianjun, and Pedrycz, Witold
- Subjects
GENERALIZATION, NETWORK performance, MACHINE learning, MINORITIES - Abstract
Traditional machine learning methods may not yield satisfactory generalization capability when samples in different classes are imbalanced. These methods tend to sacrifice the accuracy of the minority class to improve the overall accuracy, disregarding the fact that misclassifications of minority samples usually cost more in many real-world applications. Therefore, we propose a neural network training method via the minimization of a cost-sensitive localized generalization error-based objective function (c-LGEM) to achieve a better balance between the errors yielded by the minority and the majority classes. The c-LGEM emphasizes the minimization of the generalization error of the minority class in a cost-sensitive manner. Experimental results obtained on 16 UCI datasets show that neural networks trained with the c-LGEM yield better performance in comparison with some existing methods. • We propose a new model to enhance network performance in local regions of samples. • A new neural network training method is proposed for imbalanced data. • The proposed model yields a better decision boundary for imbalanced data. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
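The c-LGEM objective itself is not spelled out in the abstract above; the sketch below illustrates only the generic cost-sensitive idea it builds on, namely weighting each sample's error by the misclassification cost of its true class so that minority-class errors count more. The function name and the cost values are illustrative assumptions, not the paper's formulation.

```python
def cost_sensitive_loss(y_true, y_pred, class_cost):
    """Mean squared error in which each sample's error is weighted by the
    cost assigned to its true class (higher cost for the minority class)."""
    total = sum(class_cost[t] * (t - p) ** 2 for t, p in zip(y_true, y_pred))
    return total / len(y_true)

# Minority class 1 is given 10x the cost of majority class 0, so the single
# poorly predicted minority sample dominates the loss.
loss = cost_sensitive_loss(
    y_true=[0, 0, 0, 1],
    y_pred=[0.1, 0.2, 0.0, 0.4],
    class_cost={0: 1.0, 1: 10.0},
)
print(f"{loss:.4f}")  # → 0.9125
```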
45. Fuzzy associative memories with autoencoding mechanisms.
- Author
-
Li, Lina, Pedrycz, Witold, Qu, Ting, and Li, Zhiwu
- Subjects
HIGH-dimensional model representation, DIMENSION reduction (Statistics), PARTICLE swarm optimization, ASSOCIATIVE storage, DIFFERENTIAL evolution, MACHINE learning, MEMORY - Abstract
Associative memories constructed and operating in the presence of big data offer an effective way to realize association mechanisms aimed at storing and recalling items. In this study, we develop a logic-driven model of two-level fuzzy associative memories augmented by autoencoding. It is composed of two functional modules. The first module of this architecture implements an efficient dimensionality reduction of the original high-dimensional data with the use of an autoencoder. This supports the storage and recall realized by a logic-oriented associative memory, which constitutes the second module of the architecture. The optimization of the association matrices studied in the paper involves both gradient-based learning mechanisms and population-based optimization algorithms, i.e., particle swarm optimization (PSO) and differential evolution (DE). A suite of experimental studies is presented to quantify the performance of the proposed approach. Comparative studies are also conducted to show and quantify the advantages of the mechanisms of associative recall and storage augmented by the autoencoding process. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
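A classic single-level fuzzy associative memory recalls an item through max-min composition of the input fuzzy set with an association matrix; the paper's two-level, autoencoder-augmented architecture builds on this kind of logic-oriented mechanism. A minimal sketch of plain max-min recall (a textbook illustration, not the paper's model):

```python
def max_min_recall(x, M):
    """Recall an output fuzzy set y from input fuzzy set x through
    association matrix M via max-min composition:
    y[j] = max_i min(x[i], M[i][j])."""
    n_out = len(M[0])
    return [max(min(x[i], M[i][j]) for i in range(len(x)))
            for j in range(n_out)]

# Toy association matrix: two input dimensions, two output dimensions.
M = [[0.9, 0.2],
     [0.3, 0.8]]
y = max_min_recall([1.0, 0.4], M)
print(y)  # → [0.9, 0.4]
```

The strongly activated first input component recalls the first output component almost fully, while the weak second component caps the second output at its own membership degree.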
46. Mild Cognitive Impairment Conversion Prediction
- Author
-
Kumar, Nishant, Thakur, Aman, Jha, Nikita, Ankit, Sujata, Bhasin, Harsh, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Swaroop, Abhishek, editor, Kansal, Vineet, editor, Fortino, Giancarlo, editor, and Hassanien, Aboul Ella, editor
- Published
- 2024
- Full Text
- View/download PDF
47. NeuroCogNet: Advanced Computational Intelligence for Neurological Diagnosis
- Author
-
Asutosh, B., Mohanty, Himanshu, Biswal, Nihar Ranjan, Mishra, Sushruta, Jabbar, Kadhim Abbas, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Swaroop, Abhishek, editor, Kansal, Vineet, editor, Fortino, Giancarlo, editor, and Hassanien, Aboul Ella, editor
- Published
- 2024
- Full Text
- View/download PDF
48. Misleading and Ambiguous Factual Information Detection Using an Ensemble Classifier with Voting Average Approach
- Author
-
Panda, Sheetal, Banerjee, Shrimoyee, Mishra, Sushruta, Anand, Kunal, Nsrulaah Faris, Najlaa, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Swaroop, Abhishek, editor, Kansal, Vineet, editor, Fortino, Giancarlo, editor, and Hassanien, Aboul Ella, editor
- Published
- 2024
- Full Text
- View/download PDF
49. Harnessing the Power of Big Data Applications Across Diverse Fields like Science, Internet, and Finance
- Author
-
Choubey, Siddhartha, Jaiswal, Dipti, Jaiswal, Manuraj, Choubey, Abha, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Swaroop, Abhishek, editor, Kansal, Vineet, editor, Fortino, Giancarlo, editor, and Hassanien, Aboul Ella, editor
- Published
- 2024
- Full Text
- View/download PDF
50. Robust and Accurate Weather Forecasting Using an Integrated Complex Cognitive Gradient Boosting Model
- Author
-
Raj, Shreya, Agarwal, Chirag, Tripathy, Hrudaya Kumar, Shnawa, Ammar H., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Swaroop, Abhishek, editor, Kansal, Vineet, editor, Fortino, Giancarlo, editor, and Hassanien, Aboul Ella, editor
- Published
- 2024
- Full Text
- View/download PDF