91 results for "Rudy Setiono"
Search Results
2. Selected Peer-Reviewed Articles from The Internet Data Telecommunication and Satellite 2016, Bali, Indonesia, 17–18 December, 2016
- Author
-
Ford Lumban Gaol, Fithra Faisal Hastiadi, Rudy Setiono, Jan Walters Kruger, and Frank Guan Yunqing
- Subjects
Health (social science), General Computer Science, General Mathematics, General Engineering, Education, General Energy, The Internet, Satellites, Business, Telecommunications, General Environmental Science
- Published
- 2017
3. Neural network training and rule extraction with augmented discretized input
- Author
-
Arnulfo P. Azcarraga, Yoichi Hayashi, and Rudy Setiono
- Subjects
Discretization, Artificial neural network, Computer science, Cognitive Neuroscience, Pattern recognition, Interval, Data set, Artificial Intelligence, Computer Science Applications, Data mining
- Abstract
The classification and prediction accuracy of neural networks can be improved when they are trained with discretized continuous attributes as additional inputs. Such input augmentation makes it easier for the network weights to form more accurate decision boundaries when the data samples of different classes in the data set are contained in distinct hyper-rectangular subregions in the original input space. In this paper, we first present how a neural network can be trained with augmented discretized inputs. The additional inputs are obtained by dividing the original interval of each continuous attribute into subintervals of equal length. The network is then pruned to remove most of the discretized inputs as well as the original continuous attributes as long as the network still achieves a minimum preset accuracy requirement. We then discuss how comprehensible classification rules can be extracted from the pruned network by analyzing the activations of the network hidden units and the weights of the network connections that remain in the pruned network. Our experiments on artificial data sets show that the rules extracted from the neural networks can perfectly replicate the class membership rules used to create the data. On real-life benchmark data sets, neural networks trained with augmented discretized inputs are shown to achieve better accuracy than neural networks trained with the original data.
- Published
- 2016
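The input-augmentation step described in the abstract above is simple to sketch: each continuous attribute's original interval is split into equal-length subintervals, and one-hot membership indicators are appended as extra inputs. This is an illustrative reconstruction, not the authors' code; the function name and default bin count are our own assumptions.

```python
# Sketch of input augmentation with discretized continuous attributes.
# Each attribute's range is split into n_bins equal-length subintervals;
# a one-hot indicator of the subinterval a value falls into is appended.

def augment_with_discretized(samples, n_bins=4):
    """Append one-hot subinterval indicators to each continuous sample."""
    n_attrs = len(samples[0])
    # Per-attribute min/max define the original interval to split.
    lows = [min(s[j] for s in samples) for j in range(n_attrs)]
    highs = [max(s[j] for s in samples) for j in range(n_attrs)]
    augmented = []
    for s in samples:
        extra = []
        for j, v in enumerate(s):
            width = (highs[j] - lows[j]) / n_bins or 1.0  # guard constant attrs
            idx = min(int((v - lows[j]) / width), n_bins - 1)
            extra.extend(1.0 if b == idx else 0.0 for b in range(n_bins))
        augmented.append(list(s) + extra)
    return augmented
```

The augmented samples would then be fed to the network, whose pruning later removes most of the indicators again.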
4. Validating the stable clustering of songs in a structured 3D SOM
- Author
-
Arnulfo P. Azcarraga, Arturo Caronongan, Sean Manalili, and Rudy Setiono
- Subjects
Computer science, Feature extraction, Stability (learning theory), Pattern recognition, Distortion, Computer vision, Artificial intelligence, Cluster analysis
- Abstract
A structured 3D SOM is an extension of a Self-Organizing Map from 2D to 3D in such a way that a pre-defined structure is built into the design of the 3D map. The structured 3D SOM is a 3×3×3 structure that has a distinct core cube in the center and exterior cubes around the core. The current application of the structured SOM, as a digital music archive, only uses the 8 corner cubes among the 26 exterior cubes. Given that the SOM has a built-in structure, the SOM learning algorithm is modified to include a four-phase learning and labeling procedure. The first phase positions the music files in their general locations within the core cube. The second phase positions the music files in their respective corner cubes according to their music genre; it is therefore a semi-supervised version of the SOM algorithm, which leads to the stability of the trained SOM in terms of the general distribution of the music files in the core cube. The third phase performs a fine adjustment of the weight vectors in the core cube and finalizes the training of the 3D SOM. The fourth and final phase is the labeling of the core cube and the association (uploading) of music files to specific nodes in the core cube. Based on the pre-defined structure of the 3D SOM, a precise measure of the quality of the resulting trained SOM (in this case, the music archive) is developed, as well as of the quality of the different categories/genres of music albums, based on a novel measure of the distortion values of music files with respect to their respective music genres.
- Published
- 2016
5. RULE EXTRACTION FROM MINIMAL NEURAL NETWORKS FOR CREDIT CARD SCREENING
- Author
-
Rudy Setiono, Bart Baesens, and Christophe Mues
- Subjects
Network architecture, Statistical models, Artificial neural network, Computer Networks and Communications, Computer science, Rule sets, Decision making, General Medicine, Machine learning, Credit card, Artificial Intelligence, Feedforward neural network, Data mining, Pruning, Algorithms
- Abstract
While feedforward neural networks have been widely accepted as effective tools for solving classification problems, the issue of finding the best network architecture remains unresolved, particularly so in real-world problem settings. We address this issue in the context of credit card screening, where it is important to not only find a neural network with good predictive performance but also one that facilitates a clear explanation of how it produces its predictions. We show that minimal neural networks with as few as one hidden unit provide good predictive accuracy, while having the added advantage of making it easier to generate concise and comprehensible classification rules for the user. To further reduce model size, a novel approach is suggested in which network connections from the input units to this hidden unit are removed by a very straightforward pruning procedure. In terms of predictive accuracy, both the minimized neural networks and the rule sets generated from them are shown to compare favorably with other neural network based classifiers. The rules generated from the minimized neural networks are concise and thus easier to validate in a real-life setting.
- Published
- 2011
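The pruning idea in the abstract above can be sketched as a loop that removes the smallest-magnitude input-to-hidden connection as long as a preset accuracy floor still holds. The one-hidden-unit model is reduced here to a thresholded linear score for illustration; the paper's actual network and pruning criterion may differ, and all names below are ours.

```python
# Hedged sketch: prune input weights to a single hidden unit, smallest
# magnitude first, reverting and stopping when accuracy drops below a floor.

def prune_weights(weights, samples, labels, min_accuracy=0.9):
    w = list(weights)

    def accuracy(w):
        correct = 0
        for x, y in zip(samples, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0
            correct += (pred == y)
        return correct / len(labels)

    while True:
        # candidate: smallest-magnitude surviving weight
        alive = [i for i, wi in enumerate(w) if wi != 0.0]
        if not alive:
            break
        i = min(alive, key=lambda k: abs(w[k]))
        saved, w[i] = w[i], 0.0
        if accuracy(w) < min_accuracy:
            w[i] = saved  # removing it hurts too much; stop
            break
    return w
```

With a dominant first attribute, the two weak connections are pruned and the informative one survives.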
6. Understanding consumer heterogeneity: A business intelligence application of neural networks
- Author
-
Rudy Setiono, Yoichi Hayashi, and Ming-Huei Hsieh
- Subjects
Information Systems and Management, Artificial neural network, Computer science, Decision tree, Machine learning, Management Information Systems, Data set, Artificial Intelligence, Business intelligence, Software
- Abstract
This paper describes a business intelligence application of neural networks in analyzing consumer heterogeneity in the context of eating-out behavior in Taiwan. We apply a neural network rule extraction algorithm which automatically groups the consumers into identifiable segments according to their socio-demographic information. Within each of these segments, consumers who eat out frequently are distinguished from those who do not based on their psychological traits and eating-out considerations. The data set for this study was collected through a survey of 800 Taiwanese consumers. Demographic information such as gender, age and income was recorded. In addition, information about their psychological traits and eating-out considerations that might influence the frequency of eating out was obtained. The results of our data analysis show that the neural network rule extraction algorithm is able to find distinct consumer segments and predict the consumers within each segment with good accuracy.
- Published
- 2010
7. Predicting consumer preference for fast-food franchises: a data mining approach
- Author
-
Ming-Huei Hsieh, Yoichi Hayashi, and Rudy Setiono
- Subjects
Marketing, Training set, Computer science, Strategy and Management, Decision tree learning, Decision tree, Decision rule, Management Science and Operations Research, Preference, Management Information Systems, Data modeling, Tree structure, Data mining, Decision tree model
- Abstract
The objectives of the study reported in this paper are: (1) to evaluate the adequacy of two data mining techniques, decision tree and neural network, in analysing consumer preference for a fast-food franchise and (2) to examine the sufficiency of the criteria selected in understanding this preference. We build decision tree and neural network models to fit data samples collected from 800 respondents in Taiwan to understand the factors that determine their brand preference. Classification rules are generated from these models to differentiate between consumers who prefer the brand and those who do not. The generated rules show that while both decision tree and neural network models can achieve predictive accuracy of more than 80% on the training data samples and more than 70% on the cross-validation data samples, the neural network models compare very favourably to a decision tree model in rule complexity and the number of relevant input attributes.
- Published
- 2009
8. A note on knowledge discovery using neural networks and its application to credit card screening
- Author
-
Rudy Setiono, Bart Baesens, and Christophe Mues
- Subjects
Information Systems and Management, General Computer Science, Artificial neural network, Computer science, Management Science and Operations Research, Industrial and Manufacturing Engineering, Credit card, Information extraction, Knowledge extraction, Modeling and Simulation
- Abstract
We address an important issue in knowledge discovery using neural networks that has been left out in a recent article “Knowledge discovery using a neural network simultaneous optimization algorithm on a real world classification problem” by Sexton et al. [R.S. Sexton, S. McMurtrey, D.J. Cleavenger, Knowledge discovery using a neural network simultaneous optimization algorithm on a real world classification problem, European Journal of Operational Research 168 (2006) 1009–1018]. This important issue is the generation of comprehensible rule sets from trained neural networks. In this note, we present our neural network rule extraction algorithm that is very effective in discovering knowledge embedded in a neural network. This algorithm is particularly appropriate in applications where comprehensibility as well as accuracy are required. For the same data sets used by Sexton et al., our algorithm produces accurate rule sets that are concise and comprehensible, and hence helps validate the claim that neural networks could be viable alternatives to other data mining tools for knowledge discovery.
- Published
- 2009
9. Greedy rule generation from discrete data and its use in neural network rule extraction
- Author
-
Rudy Setiono, Koichi Odajima, Yoichi Hayashi, and Gong Tianxia
- Subjects
Artificial neural network, Discretization, Iterative method, Cognitive Neuroscience, Reproducibility of results, Data set, Knowledge extraction, Artificial Intelligence, Cluster analysis, Greedy algorithm, Algorithms, Software, Mathematics
- Abstract
This paper proposes a GRG (Greedy Rule Generation) algorithm, a new method for generating classification rules from a data set with discrete attributes. The algorithm is "greedy" in the sense that at every iteration, it searches for the best rule to generate. The criteria for the best rule include the number of samples and the size of subspaces that it covers, as well as the number of attributes in the rule. This method is employed for extracting rules from neural networks that have been trained and pruned for solving classification problems. The classification rules are extracted from the neural networks using the standard decompositional approach. Neural networks with one hidden layer are trained and the proposed GRG algorithm is applied to their discretized hidden unit activation values. Our experimental results show that neural network rule extraction with the GRG method produces rule sets that are accurate and concise. Application of GRG directly on three medical data sets with discrete attributes also demonstrates its effectiveness for rule generation.
- Published
- 2008
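The "greedy" search described in the abstract above can be illustrated with a minimal sequential-covering loop over discrete attributes. The real GRG criterion scores candidate rules by covered samples, subspace size, and rule length; this sketch keeps only the coverage count and single-condition rules, which is our simplification, and the function name is ours.

```python
# Minimal greedy covering in the spirit of GRG: repeatedly pick the
# single condition (attribute, value) that covers the most still-uncovered
# samples of the target class while covering no sample of another class.

def greedy_rules(samples, labels, target):
    uncovered = {i for i, y in enumerate(labels) if y == target}
    n_attrs = len(samples[0])
    rules = []
    while uncovered:
        best, best_cover = None, set()
        for j in range(n_attrs):
            for v in {s[j] for s in samples}:
                covered = {i for i in uncovered if samples[i][j] == v}
                # "pure": every sample matching the condition has the target class
                pure = all(labels[i] == target
                           for i, s in enumerate(samples) if s[j] == v)
                if pure and len(covered) > len(best_cover):
                    best, best_cover = (j, v), covered
        if best is None:  # no pure condition left
            break
        rules.append(best)
        uncovered -= best_cover
    return rules
```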
10. Knowledge acquisition and revision using neural networks: an application to a cross-national study of brand image perception
- Author
-
Arnulfo P. Azcarraga, Shan L. Pan, Rudy Setiono, and Ming-Huei Hsieh
- Subjects
Marketing, Knowledge management, Operations research, Artificial neural network, Computer science, Strategy and Management, Knowledge engineering, Information technology, Management Science and Operations Research, Knowledge acquisition, Management Information Systems, Knowledge-based systems, Information systems, Knowledge transfer, Core knowledge
- Abstract
A three-tier knowledge management approach is proposed in the context of a cross-national study of car brand and corporate image perceptions. The approach consists of knowledge acquisition, transfer and revision using neural networks. We investigate how knowledge acquired by a neural network from one car market can be exploited and applied in another market. This transferred knowledge is subsequently revised for application in the new market. Knowledge revision is achieved by re-training the neural network. Core knowledge common to both markets is retained while some localized knowledge components are introduced during network re-training. Since the knowledge acquired by a neural network can be expressed as an accurate set of simple rules, we are able to compare the knowledge extracted from one network with the knowledge extracted from another. Comparison of the originally acquired knowledge with the revised knowledge provides us with insights into the commonalities and differences in car brand and corporate perceptions across national markets.
- Published
- 2006
11. Extracting Salient Dimensions for Automatic SOM Labeling
- Author
-
Shan L. Pan, Arnulfo P. Azcarraga, Ming-Huei Hsieh, and Rudy Setiono
- Subjects
Computer science, Machine learning, Computer Science Applications, Human-Computer Interaction, Control and Systems Engineering, Salient dimensions, Unsupervised learning, Segmentation, Artificial intelligence, Electrical and Electronic Engineering, Software, Information Systems
- Abstract
Learning in self-organizing maps (SOM) is considered unsupervised because training patterns do not need accompanying desired output information. Prior to its use in some real-world applications, however, a trained SOM often has to be labeled. This labeling phase is usually supervised in that it requires prelabeled patterns with accompanying output information. Because such labeled patterns are not always available or may not even be possible to construct, the supervised nature of the labeling phase excludes SOM from a wide range of potential application domains. This work proposes a methodical and automatic SOM labeling procedure that does not require a set of prelabeled patterns. Instead, nodes in the trained map are clustered and subsets of training patterns associated with each of the clustered nodes are identified. Salient dimensions per node cluster that constitute the bases for labeling each node in the map are then identified. The effectiveness of the method is demonstrated on a SOM-based international market segmentation study.
- Published
- 2005
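The salient-dimension idea in the abstract above can be sketched as follows: for each cluster of map nodes, rank input dimensions by how far the cluster mean deviates from the global mean (in global standard-deviation units) and label the cluster with the top dimensions. The scoring rule here is an illustrative assumption; the paper defines its own saliency measure.

```python
# Sketch: salient dimensions of a cluster = dimensions whose cluster mean
# deviates most from the global mean, measured in global-std units.
import statistics

def salient_dimensions(patterns, cluster_indices, top_k=2):
    dims = range(len(patterns[0]))
    g_mean = [statistics.mean(p[d] for p in patterns) for d in dims]
    # guard constant dimensions (std 0) with 1.0 so they score 0, not NaN
    g_std = [statistics.pstdev([p[d] for p in patterns]) or 1.0 for d in dims]
    cluster = [patterns[i] for i in cluster_indices]
    c_mean = [statistics.mean(p[d] for p in cluster) for d in dims]
    scores = [abs(c_mean[d] - g_mean[d]) / g_std[d] for d in dims]
    return sorted(dims, key=lambda d: -scores[d])[:top_k]
```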
12. Separating Core and Noncore Knowledge: An Application of Neural Network Rule Extraction to a Cross-National Study of Brand Image Perception
- Author
-
Arnulfo P. Azcarraga, Rudy Setiono, Ming-Huei Hsieh, and Shan L. Pan
- Subjects
Artificial neural network, Social perception, Computer science, Decision tree, Machine learning, Knowledge acquisition, Computer Science Applications, Human-Computer Interaction, Corporate branding, Knowledge extraction, Control and Systems Engineering, Perception, Artificial intelligence, Electrical and Electronic Engineering, Software, Information Systems, Core knowledge
- Abstract
Recent advances in algorithms that extract rules from artificial neural networks make it feasible to use neural networks as a tool for acquiring knowledge hidden in the data. Findings are reported from the use of such algorithms to separate core and noncore knowledge in a cross-national study of automobile brand image perception. Respondents from five Western European countries were asked to provide individual and corporate brand associations for a number of well-known automobile brands. Knowledge, expressed as concise and accurate rules that distinguish between the respondents' perceptions of German and Japanese brands, is extracted from trained neural networks. This paper explains how both core knowledge, which captures the perceptions shared by the respondents in all countries, and country-specific noncore knowledge can be acquired and differentiated by a proposed two-step approach to train and extract rules from a multi-neural network system. The experimental results show that, in addition to providing a better understanding of the differences and similarities in the brand image perceptions of consumers in various countries, the proposed approach also yields better predictive accuracy than a decision tree method.
- Published
- 2005
13. A Hybrid SOM-SVM Approach for the Zebrafish Gene Expression Analysis
- Author
-
Min Xu, Xin Liu, Wei Wu, Jinrong Peng, and Rudy Setiono
- Subjects
Self-organizing map, Computer science, Computational biology, Biochemistry, Genetics, Support vector machine, Cluster analysis, Zebrafish, Gene, Molecular Biology, Gene expression profiling, Computational Biology, Data set, Computational Mathematics, Gene expression regulation, Classification, Gene chip analysis, Data mining
- Abstract
Microarray technology can be employed to quantitatively measure the expression of thousands of genes in a single experiment. It has become one of the main tools for global gene expression analysis in molecular biology research in recent years. The large amount of expression data generated by this technology makes the study of certain complex biological problems possible, and machine learning methods are expected to play a crucial role in the analysis process. In this paper, we present our results from integrating the self-organizing map (SOM) and the support vector machine (SVM) for the analysis of the various functions of zebrafish genes based on their expression. The most distinctive characteristic of our zebrafish gene expression is that the number of samples of different classes is imbalanced. We discuss how SOM can be used as a data-filtering tool to improve the classification performance of the SVM on this data set.
- Published
- 2005
14. Automatic knowledge extraction from survey data: learning M-of-N constructs using a hybrid approach
- Author
-
Ming-Huei Hsieh, Shan L. Pan, Arnulfo P. Azcarraga, and Rudy Setiono
- Subjects
Marketing, Data collection, Artificial neural network, Computer science, Strategy and Management, Decision tree, Management Science and Operations Research, Management Information Systems, Knowledge extraction, Market segmentation, Survey data collection, Data mining
- Abstract
Data collected from a survey typically consist of attributes that are mostly if not completely binary-valued or binary-encoded. We present a method for handling such data where the underlying data analysis can be cast as a classification problem. We propose a hybrid method that combines neural network and decision tree methods. The network is trained to remove irrelevant data attributes and the decision tree is applied to extract comprehensible classification rules from the trained network. The conditions of the rules are in the form of a conjunction of M-of-N constructs. An M-of-N construct is a rule condition that is satisfied if (at least, exactly, at most) M of the N binary attributes in the construct are present. The effectiveness of the method is illustrated on data collected for a study of global car market segmentation. The results show that besides achieving high predictive accuracy, the method also allows meaningful interpretation of the relationships among the data variables.
- Published
- 2005
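The M-of-N construct described in the abstract above is easy to state in code: a condition over N binary attributes that fires when (at least, exactly, at most) M of them are present. The helper and attribute names below are illustrative, not from the paper.

```python
# An M-of-N rule condition over binary attributes.

def m_of_n(sample, attrs, m, mode="at_least"):
    """sample: dict of binary attributes; attrs: the N attribute names."""
    count = sum(1 for a in attrs if sample.get(a, 0) == 1)
    if mode == "at_least":
        return count >= m
    if mode == "exactly":
        return count == m
    if mode == "at_most":
        return count <= m
    raise ValueError(mode)
```

A full rule condition is then a conjunction of such constructs, one per attribute group.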
15. Product-, Corporate-, and Country-Image Dimensions and Purchase Behavior: A Multicountry Analysis
- Author
-
Shan-Ling Pan, Ming-Huei Hsieh, and Rudy Setiono
- Subjects
Marketing, Economics and Econometrics, Multilevel model, Automotive industry, Advertising, Brand image, Perception, Business and International Management
- Abstract
This research focuses on consumer perceptions that are developed on the basis of a firm’s advertising appeals as well as other factors. In conceptualizing brand-image perceptions, the authors extend the frequent use of product-related images to include corporate and country images attached to brands. The authors report findings based on secondary economic and cultural data at the macro level and the results of a global brand-image survey conducted in the top 20 international automobile markets at the individual level. The findings suggest that while consumers’ attitudes toward corporate image and country image exert main effects on their brand purchase behavior, the effects of certain product-image appeals are moderated by sociodemographics and national cultural characteristics. The empirical results are broadly supportive of the proposed hypotheses and provide a consumer-based extension of Roth’s work on global brand image.
- Published
- 2004
16. Computational intelligence methods for rule-based data understanding
- Author
-
Rudy Setiono, Włodzisław Duch, and Jacek M. Zurada
- Subjects
Artificial neural network, Computer science, Fuzzy set, Decision tree, Stability (learning theory), Computational intelligence, Rule-based system, Fuzzy control system, Machine learning, Expert system, Artificial intelligence, Data mining, Electrical and Electronic Engineering
- Abstract
In many applications, black-box prediction is not satisfactory, and understanding the data is of critical importance. Typically, approaches useful for understanding data involve logical rules, evaluate similarity to prototypes, or are based on visualization or graphical methods. This paper is focused on the extraction and use of logical rules for data understanding. All aspects of rule generation, optimization, and application are described, including the problem of finding good symbolic descriptors for continuous data, tradeoffs between accuracy and simplicity at the rule-extraction stage, and tradeoffs between rejection and error level at the rule optimization stage. Stability of rule-based description, calculation of probabilities from rules, and other related issues are also discussed. Major approaches to extraction of logical rules based on neural networks, decision trees, machine learning, and statistical methods are introduced. Optimization and application issues for sets of logical rules are described. Applications of such methods to benchmark and real-life problems are reported and illustrated with simple logical rules for many datasets. Challenges and new directions for research are outlined.
- Published
- 2004
17. An approach to generate rules from neural networks for regression problems
- Author
-
Rudy Setiono and James Y.L. Thong
- Subjects
Information Systems and Management, General Computer Science, Artificial neural network, Computer science, Decision tree, Management Science and Operations Research, Machine learning, Industrial and Manufacturing Engineering, Regression, Knowledge-based systems, Modeling and Simulation, Linear regression, Data mining, Nonlinear regression
- Abstract
Artificial neural networks have been successfully applied to a variety of business application problems involving classification and regression. They are especially useful for regression problems as they do not require prior knowledge about the data distribution. In many applications, it is desirable to extract knowledge from trained neural networks so that the users can gain a better understanding of the solution. Existing research has focused primarily on extracting symbolic rules for classification problems, with few methods devised for regression problems. In order to fill this gap, we propose an approach to extract rules from neural networks that have been trained to solve regression problems. The extracted rules divide the data samples into groups. For all samples within a group, a linear function of the relevant input attributes of the data approximates the network output. The approach is illustrated with two examples on different application problems. Experimental results show that the proposed approach generates rules that are more accurate than the existing methods based on decision trees and linear regression.
- Published
- 2004
18. Using Neural Network Rule Extraction and Decision Tables for Credit-Risk Evaluation
- Author
-
Jan Vanthienen, Rudy Setiono, Christophe Mues, and Bart Baesens
- Subjects
Artificial neural network, Computer science, Strategy and Management, Feature selection, Management Science and Operations Research, Machine learning, Credit-risk evaluation, Decision tables, Classification, Artificial intelligence, Data mining, Credit risk
- Abstract
Credit-risk evaluation is a very challenging and important management science problem in the domain of financial analysis. Many classification methods have been suggested in the literature to tackle this problem. Neural networks, especially, have received a lot of attention because of their universal approximation property. However, a major drawback associated with the use of neural networks for decision making is their lack of explanation capability. While they can achieve a high predictive accuracy rate, the reasoning behind how they reach their decisions is not readily available. In this paper, we present the results from analysing three real-life credit-risk data sets using neural network rule extraction techniques. Clarifying the neural network decisions by explanatory rules that capture the learned knowledge embedded in the networks can help the credit-risk manager in explaining why a particular applicant is classified as either bad or good. Furthermore, we also discuss how these rules can be visualized as a decision table in a compact and intuitive graphical format that facilitates easy consultation. It is concluded that neural network rule extraction and decision tables are powerful management tools that allow us to build advanced and user-friendly decision-support systems for credit-risk evaluation.
- Published
- 2003
19. Combining neural network predictions for medical diagnosis
- Author
-
Rudy Setiono and Yoichi Hayashi
- Subjects
Liver cirrhosis, Hepatocellular carcinoma, Artificial neural network, Liver neoplasms, Hepatobiliary disease, Health Informatics, Machine learning, Computer Science Applications, Differential diagnosis, Bias, Liver function tests, Cholelithiasis, Feedforward neural network, Computer-assisted diagnosis, Medical diagnosis, Alcoholic liver diseases
- Abstract
We present our results from combining the predictions of an ensemble of neural networks for the diagnosis of hepatobiliary disorders. To improve the accuracy of the diagnosis, we train the second level networks using the outputs of the first level networks as input data. The second level networks achieve an accuracy that is higher than that of the individual networks in the first level. Compared to the simple method which averages the outputs of the first level networks, the second level networks are also more accurate. We discuss how the overall predictive accuracy can be improved by introducing bias during the training of the level one networks.
- Published
- 2002
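The stacking scheme in the abstract above (second-level models trained on the outputs of first-level models) boils down to a simple data flow. The level-one models here are arbitrary callables, and a weighted-vote combiner fitted from per-model accuracy stands in for the second-level network; that substitution, and all names, are our simplifications.

```python
# Sketch of two-level prediction combining: re-encode each sample as the
# vector of level-one outputs, then fit a simple combiner on those outputs.

def stack_features(level_one_models, samples):
    """Each sample is re-encoded as the vector of level-one outputs."""
    return [[m(x) for m in level_one_models] for x in samples]

def fit_combiner(level_one_models, samples, labels):
    feats = stack_features(level_one_models, samples)
    # weight each level-one model by its accuracy on the given samples
    weights = []
    for j, _m in enumerate(level_one_models):
        acc = sum(f[j] == y for f, y in zip(feats, labels)) / len(labels)
        weights.append(acc)

    def combined(x):
        votes = [m(x) * w for m, w in zip(level_one_models, weights)]
        return 1 if sum(votes) >= sum(weights) / 2 else 0

    return combined
```

An always-wrong level-one model receives weight 0 and stops influencing the combined prediction.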
20. GENERATING CONCISE SETS OF LINEAR REGRESSION RULES FROM ARTIFICIAL NEURAL NETWORKS
- Author
-
Arnulfo P. Azcarraga and Rudy Setiono
- Subjects
Polynomial regression, Multivariate adaptive regression splines, Artificial neural network, Computer science, Pattern recognition, Regression analysis, Artificial Intelligence, Linear predictor function, Feedforward neural network
- Abstract
Neural networks with a single hidden layer are known to be universal function approximators. However, due to the complexity of the network topology and the nonlinear transfer function used in computing the hidden unit activations, the predictions of a trained network are difficult to comprehend. On the other hand, predictions from a multiple linear regression equation are easy to understand but are not accurate when the underlying relationship between the input variables and the output variable is nonlinear. We have thus developed a method for multivariate function approximation which combines neural network learning, clustering and multiple regression. This method generates a set of multiple linear regression equations using neural networks, where the number of regression equations is determined by clustering the weighted input variables. The predictions for samples of the same cluster are computed by the same regression equation. Experimental results on a number of real-world data sets demonstrate that this new method generates relatively few regression equations from the training data samples. Yet, drawing from the universal function approximation capacity of neural networks, the predictive accuracy is high. The prediction errors are comparable to or lower than those achieved by existing function approximation methods.
- Published
- 2002
21. Extraction of rules from artificial neural networks for nonlinear regression
- Author
-
Wee Kheng Leow, Jacek M. Zurada, and Rudy Setiono
- Subjects
Artificial neural network ,Computer Networks and Communications ,business.industry ,Computer science ,General Medicine ,Function (mathematics) ,Machine learning ,computer.software_genre ,Linear function ,Computer Science Applications ,Set (abstract data type) ,Nonlinear system ,Function approximation ,Artificial Intelligence ,Artificial intelligence ,Data mining ,business ,Nonlinear regression ,computer ,Software - Abstract
Neural networks (NNs) have been successfully applied to solve a variety of application problems including classification and function approximation. They are especially useful as function approximators because they do not require prior knowledge of the input data distribution and they have been shown to be universal approximators. In many applications, it is desirable to extract knowledge that can explain how the problems are solved by the networks. Most existing approaches have focused on extracting symbolic rules for classification. Few methods have been devised to extract rules from trained NNs for regression. This article presents an approach for extracting rules from trained NNs for regression. Each rule in the extracted rule set corresponds to a subregion of the input space and a linear function involving the relevant input attributes of the data approximates the network output for all data samples in this subregion. Extensive experimental results on 32 benchmark data sets demonstrate the effectiveness of the proposed approach in generating accurate regression rules.
- Published
- 2002
22. [Untitled]
- Author
-
Hongjun Lu and Rudy Setiono
- Subjects
Artificial neural network ,business.industry ,Computer science ,Cumulative distribution function ,Online aggregation ,Query optimization ,Machine learning ,computer.software_genre ,Query expansion ,Artificial Intelligence ,Sargable ,Data mining ,Artificial intelligence ,Layer (object-oriented design) ,business ,computer ,Boolean conjunctive query - Abstract
This paper describes a novel approach to estimate the size of database query results using neural networks. Using the proposed approach, three layer neural networks are constructed and trained to learn the cumulative distribution functions of attribute values in relations. With a trained network, the estimation of the query result size can be obtained instantly by simply computing the network output from the given query predicates. The basic computational model using a cumulative distribution function to compute the query result size is described. Network construction and training are discussed. Comprehensive experiments were conducted to study the effectiveness of the proposed approach. The results indicate that the approach produces estimates with accuracies that are comparable with or higher than those reported in the literature.
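The computational model above is easy to sketch: the selectivity of a range predicate lo <= x <= hi is F(hi) - F(lo), and the estimated result size is that selectivity times the relation's cardinality. In the sketch below an empirical CDF stands in for the trained three-layer network's output; the column data and the predicate bounds are made-up illustrations:

```python
import bisect
import random

random.seed(1)
# Values of one attribute of a relation R (a stand-in for a real column).
column = sorted(random.gauss(50, 15) for _ in range(10_000))

# The paper trains a three-layer network to approximate the cumulative
# distribution function F of the attribute values; here an empirical
# CDF stands in for the trained network's output.
def cdf(v):
    return bisect.bisect_right(column, v) / len(column)

def estimate_size(lo, hi):
    # Estimated result size of "SELECT * FROM R WHERE lo <= x AND x <= hi":
    # selectivity F(hi) - F(lo) times the relation cardinality.
    return (cdf(hi) - cdf(lo)) * len(column)

actual = sum(40.0 <= v <= 60.0 for v in column)
print(estimate_size(40.0, 60.0), actual)
```

With a trained network in place of `cdf`, the estimate comes from a single forward pass rather than a scan of the column.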
- Published
- 2002
23. MofN rule extraction from neural networks trained with augmented discretized input
- Author
-
Arnulfo P. Azcarraga, Yoichi Hayashi, and Rudy Setiono
- Subjects
Discretization ,Artificial neural network ,business.industry ,Time delay neural network ,Computer science ,Pattern recognition ,Sample (statistics) ,Interval (mathematics) ,computer.software_genre ,Data set ,Simple (abstract algebra) ,Encoding (memory) ,Artificial intelligence ,Data mining ,business ,computer - Abstract
The accuracy of neural networks can be improved when they are trained with discretized continuous attributes as additional inputs. Such input augmentation makes it easier for the network weights to form more accurate decision boundaries when the data samples of different classes in the data set are contained in distinct hyper-rectangular subregions in the original input space. In this paper, we first present how a neural network can be trained with augmented discretized inputs. The additional inputs are obtained by simply dividing the original interval of each continuous attribute into subintervals of equal length. A thermometer encoding scheme is used to represent these discretized inputs. The network is then pruned to remove most of the discretized inputs as well as the original continuous attributes as long as the network still achieves a minimum preset accuracy requirement. We then discuss how MofN rules can be extracted from the pruned network by analyzing the activations of the network's hidden units and the weights of the network connections that remain in the pruned network. For data sets that have sample classes defined by relatively complex boundaries, surprisingly simple MofN rules with very good accuracy rates are obtained.
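The thermometer encoding mentioned above can be sketched directly: the attribute's interval is cut into equal-width subintervals, and every unit up to (and including) the value's subinterval is switched on. The function name and the four-interval example are illustrative choices:

```python
def thermometer_encode(x, lo, hi, n_intervals):
    """Thermometer-encode a continuous value over [lo, hi]: one binary
    input per equal-width subinterval; all units up to the value's
    subinterval are on (a sketch of the scheme described)."""
    width = (hi - lo) / n_intervals
    # Index of the subinterval containing x (clamped so x == hi is valid).
    k = min(int((x - lo) / width), n_intervals - 1)
    return [1 if i <= k else 0 for i in range(n_intervals)]

print(thermometer_encode(0.35, 0.0, 1.0, 4))  # → [1, 1, 0, 0]
```

Unlike one-hot encoding, adjacent values share prefix bits, so the ordering of the subintervals is preserved in the representation.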
- Published
- 2014
24. Tagging documents using neural networks based on local word features
- Author
-
Arnulfo P. Azcarraga, Rudy Setiono, and Paolo Tensuan
- Subjects
Information retrieval ,Artificial neural network ,Computer science ,business.industry ,Semantic analysis (machine learning) ,Decision tree ,computer.software_genre ,Digital library ,Word lists by frequency ,ComputingMethodologies_DOCUMENTANDTEXTPROCESSING ,Artificial intelligence ,business ,tf–idf ,computer ,Word (computer architecture) ,Natural language processing - Abstract
Keywords and key-phrases that concisely represent text documents are integral to many knowledge management and text information retrieval systems, as well as digital libraries in general. Not all text documents, however, are annotated with good keywords; and the quality of these keywords is often dependent on a tedious, sometimes manual, extraction and tagging process. To automatically extract high quality keywords without the need for a semantic analysis of the document, it is shown that artificial neural networks (ANN) can be trained to consider only in-document word features such as word frequency, word distribution in the document, use of the word in special parts of the document, and use of word formatting features (e.g. bold-faced, italicized, large font size). Results show that purely local features are adequate in determining whether a word in a document is a keyword or not. Classification performance yields a G-mean of at least 0.83, and a weighted F-measure of 0.96 for both keywords and non-keywords. Precision for keywords alone, however, is not as high. To understand the basis for classifying keywords, C4.5 is used to extract rules from the ANN. The extracted rules from C4.5, in the form of a decision tree, show the relative importance of the different document features that were extracted.
- Published
- 2014
25. A comparison between two neural network rule extraction techniques for the diagnosis of hepatobiliary disorders
- Author
-
Yoichi Hayashi, Rudy Setiono, and Katsumi Yoshida
- Subjects
Artificial neural network ,Discretization ,Computer science ,Biliary Tract Diseases ,Liver Diseases ,Process (computing) ,Medicine (miscellaneous) ,Linear discriminant analysis ,computer.software_genre ,Domain (software engineering) ,Nonlinear system ,ComputingMethodologies_PATTERNRECOGNITION ,Fuzzy Logic ,Artificial Intelligence ,Humans ,Neural Networks, Computer ,Data mining ,Medical diagnosis ,Decision process ,computer ,Algorithms - Abstract
Neural networks have been widely used as tools for prediction in medicine. We expect to see even more applications of neural networks for medical diagnosis as recently developed neural network rule extraction algorithms make it possible for the decision process of a trained network to be expressed as classification rules. These rules are more comprehensible to a human user than the classification process of the networks, which involves complex nonlinear mapping of the input data. This paper reports the results from two neural network rule extraction techniques, NeuroLinear and NeuroRule, applied to the diagnosis of hepatobiliary disorders. The dataset consists of nine measurements collected from patients in a Japanese hospital, and these measurements have continuous values. NeuroLinear generates piece-wise linear discriminant functions for this dataset. The continuous measurements have previously been discretized by domain experts. NeuroRule is applied to the discretized dataset to generate symbolic classification rules. We compare the rules generated by the two techniques and find the rules generated by NeuroLinear from the original continuous-valued dataset to be slightly more accurate and more concise than the rules generated by NeuroRule from the discretized dataset.
- Published
- 2000
26. Learning M-of-N Concepts for Medical Diagnosis Using Neural Networks
- Author
-
Katsumi Yoshida, Yoichi Hayashi, and Rudy Setiono
- Subjects
Human-Computer Interaction ,Artificial neural network ,Artificial Intelligence ,business.industry ,Computer science ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Medical diagnosis ,business ,Machine learning ,computer.software_genre ,computer - Abstract
Records in a medical dataset may be best characterized by M-of-N concepts. For example, a patient showing at least 2 of the 4 symptoms is likely to be diagnosed as having a certain illness. In this paper, we describe how feedforward neural networks can be used to learn such concepts. We train a network where each input in the data can only have one of two possible values, -1 or 1, and apply the hyperbolic tangent function to each connection from the input layer to the hidden layer of the network before the hidden unit activations are computed. By applying this squashing function, the activation values at the hidden units are effectively computed as the hyperbolic tangent (or the sigmoid) of the weighted inputs, where the weights have magnitudes that are near one. By restricting the inputs and the weights to binary values, either -1 or 1, the extraction of the M-of-N concepts from the networks becomes trivial. We show how this approach can be used to learn concise and accurate M-of-N concepts for the diagnosis of hepatobiliary disorders.
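With bipolar (-1/+1) inputs and weights near +/-1, an M-of-N concept reduces to a threshold on the input sum: if k of the n conditions hold, the sum is k - (n - k) = 2k - n, so "at least m of n" is just sum >= 2m - n. A minimal sketch of that check (the symptom example is taken from the abstract):

```python
def m_of_n(inputs, m):
    """An M-of-N concept over bipolar (-1/+1) inputs: true when at
    least m of the len(inputs) conditions hold. With unit-magnitude
    weights this is a threshold test: sum(inputs) >= 2*m - n."""
    return sum(inputs) >= 2 * m - len(inputs)

# "At least 2 of the 4 symptoms present" (symptom present = +1).
print(m_of_n([1, 1, -1, -1], 2))   # → True
print(m_of_n([1, -1, -1, -1], 2))  # → False
```

This is what makes extraction from such networks trivial: reading off a hidden unit's threshold and weight signs directly yields m and the set of n conditions.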
- Published
- 2000
27. [Untitled]
- Author
-
Wee Kheng Leow and Rudy Setiono
- Subjects
Artificial neural network ,Computer science ,business.industry ,Decision tree ,Feed forward ,Pattern recognition ,computer.software_genre ,Set (abstract data type) ,Tree (data structure) ,Artificial Intelligence ,Penalty method ,Pruning (decision trees) ,Data mining ,Artificial intelligence ,business ,computer ,Algorithm - Abstract
Before symbolic rules are extracted from a trained neural network, the network is usually pruned so as to obtain more concise rules. Typical pruning algorithms require retraining the network, which incurs additional cost. This paper presents FERNN, a fast method for extracting rules from trained neural networks without network retraining. Given a fully connected trained feedforward network with a single hidden layer, FERNN first identifies the relevant hidden units by computing their information gains. For each relevant hidden unit, its activation value range is divided into two subintervals such that the information gain is maximized. FERNN finds the set of relevant network connections from the input units to this hidden unit by checking the magnitudes of their weights. The connections with large weights are identified as relevant. Finally, FERNN generates rules that distinguish the two subintervals of the hidden activation values in terms of the network inputs. Experimental results show that the size and the predictive accuracy of the tree generated are comparable to those extracted by another method that prunes and retrains the network.
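The split step described above, dividing a hidden unit's activation range into two subintervals so that information gain is maximized, can be sketched as an exhaustive threshold search. The activation values and class labels below are made up for illustration:

```python
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum(c / n * log2(c / n) for c in Counter(labels).values())

def best_split(activations, labels):
    """Divide a hidden unit's activation range into two subintervals so
    that the information gain of the split is maximized (the criterion
    FERNN uses to turn a hidden unit into a rule condition)."""
    pairs = sorted(zip(activations, labels))
    all_labels = [y for _, y in pairs]
    best_gain, best_t = -1.0, None
    for i in range(1, len(pairs)):
        left = all_labels[:i]
        right = all_labels[i:]
        gain = entropy(all_labels) - (
            len(left) / len(pairs) * entropy(left)
            + len(right) / len(pairs) * entropy(right))
        if gain > best_gain:
            best_gain = gain
            best_t = (pairs[i - 1][0] + pairs[i][0]) / 2  # midpoint threshold
    return best_t, best_gain

t, g = best_split([-0.9, -0.7, -0.6, 0.5, 0.8, 0.9],
                  ["A", "A", "A", "B", "B", "B"])
print(t, g)  # threshold near -0.05, gain 1.0 (a perfect split)
```

A gain of 1.0 here means the two subintervals separate the classes perfectly, so the hidden unit's test "activation <= t" becomes a clean rule condition.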
- Published
- 2000
28. A connectionist approach to generating oblique decision trees
- Author
-
Rudy Setiono and Huan Liu
- Subjects
Incremental decision tree ,Computer science ,business.industry ,Decision tree learning ,ID3 algorithm ,Decision tree ,Weight-balanced tree ,Pattern recognition ,General Medicine ,Machine learning ,computer.software_genre ,Computer Science Applications ,Human-Computer Interaction ,Control and Systems Engineering ,Alternating decision tree ,Influence diagram ,Decision stump ,Artificial intelligence ,Electrical and Electronic Engineering ,business ,computer ,Software ,Information Systems - Abstract
Neural networks and decision tree methods are two common approaches to pattern classification. While neural networks can achieve high predictive accuracy rates, the decision boundaries they form are highly nonlinear and generally difficult to comprehend. Decision trees, on the other hand, can be readily translated into a set of rules. In this paper, we present a novel algorithm for generating oblique decision trees that capitalizes on the strengths of both approaches. Oblique decision trees classify the patterns by testing on linear combinations of the input attributes. As a result, an oblique decision tree is usually much smaller than the univariate tree generated for the same domain. Our algorithm consists of two components: connectionist and symbolic. A three-layer feedforward neural network is constructed and pruned; a decision tree is then built from the hidden unit activation values of the pruned network. An oblique decision tree is obtained by expressing the activation values using the original input attributes. We test our algorithm on a wide range of problems. The oblique decision trees generated by the algorithm preserve the high accuracy of the neural networks, while keeping the explicitness of decision trees. Moreover, they outperform univariate decision trees generated by the symbolic approach and oblique decision trees built by other approaches in accuracy and tree size.
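The defining property of an oblique tree, testing a linear combination of attributes at each internal node, fits in a few lines. The tree below is a made-up illustration, not one built by the paper's connectionist-plus-symbolic algorithm:

```python
def classify(x, node):
    """Classify with an oblique decision tree: each internal node tests
    a linear combination of the input attributes, not a single one."""
    if "leaf" in node:
        return node["leaf"]
    s = sum(w * xi for w, xi in zip(node["weights"], x))
    return classify(x, node["left"] if s <= node["threshold"] else node["right"])

# One oblique test, x1 + x2 <= 1, separates these two classes; an
# axis-parallel (univariate) tree would need a staircase of splits.
tree = {"weights": [1.0, 1.0], "threshold": 1.0,
        "left": {"leaf": "A"}, "right": {"leaf": "B"}}
print(classify([0.2, 0.3], tree))  # → A
print(classify([0.9, 0.8], tree))  # → B
```

This is why oblique trees for the same domain tend to be much smaller than univariate ones: a single hyperplane test can replace many axis-parallel ones.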
- Published
- 1999
29. On mapping decision trees and neural networks
- Author
-
Wee Kheng Leow and Rudy Setiono
- Subjects
Information Systems and Management ,Artificial neural network ,business.industry ,Time delay neural network ,Computer science ,Decision tree learning ,Deep learning ,Decision tree ,Machine learning ,computer.software_genre ,Management Information Systems ,Artificial Intelligence ,Alternating decision tree ,Pruning (decision trees) ,Artificial intelligence ,business ,computer ,Software - Abstract
There exist several methods for transforming decision trees to neural networks. These methods typically construct the networks by directly mapping decision nodes or rules to the neural units. As a result, the networks constructed are often larger than necessary. This article describes a pruning-based method for mapping decision trees to neural networks, which can compress the network by removing unimportant and redundant units and connections. In addition, equivalent decision trees extracted from the pruned networks are simpler than those induced by well-known algorithms such as ID3 and C4.5.
- Published
- 1999
30. Some issues on scalable feature selection (an extended version of a paper presented at the Fourth World Congress of Expert Systems: Application of Advanced Information Technologies, Mexico City, March 1998)
- Author
-
Rudy Setiono and Huan Liu
- Subjects
business.industry ,Computer science ,Feature extraction ,General Engineering ,Feature selection ,Machine learning ,computer.software_genre ,Computer Science Applications ,Randomized algorithm ,k-nearest neighbors algorithm ,Artificial Intelligence ,Feature (computer vision) ,Scalability ,Feature (machine learning) ,Artificial intelligence ,Data mining ,business ,computer ,Feature learning - Abstract
Feature selection determines relevant features in the data. It is often applied in pattern classification, data mining, as well as machine learning. A special concern for feature selection nowadays is that the size of a database is normally very large, both vertically and horizontally. In addition, feature sets may grow as the data collection process continues. Effective solutions are needed to accommodate the practical demands. This paper concentrates on three issues: large number of features, large data size, and expanding feature set. For the first issue, we suggest a probabilistic algorithm to select features. For the second issue, we present a scalable probabilistic algorithm that expedites feature selection further and can scale up without sacrificing the quality of selected features. For the third issue, we propose an incremental algorithm that adapts to the newly extended feature set and captures "concept drifts" by removing features from previously selected and newly added ones. We expect that research on scalable feature selection will be extended to distributed and parallel computing and have impact on applications of data mining and machine learning.
- Published
- 1998
31. Symbolic rule extraction from neural networks
- Author
-
Chee-Sing Yap, Rudy Setiono, and James Y.L. Thong
- Subjects
Engineering ,Decision support system ,Service (systems architecture) ,Information Systems and Management ,Artificial neural network ,business.industry ,Information technology ,Decision rule ,computer.software_genre ,Linear discriminant analysis ,Backpropagation ,Management Information Systems ,Data mining ,business ,computer ,Information Systems ,Drawback - Abstract
Interest in the application of neural networks as tools for decision support has been growing in recent years. A major drawback often associated with neural networks is the difficulty in understanding the knowledge represented by a trained network. This paper describes an approach that can extract symbolic rules from neural networks. We illustrate how the approach successfully extracted rules from a data set collected from a survey of the service sectors in the United Kingdom. The extracted rules were then used to distinguish organizations that use computers from those that do not. The classification scheme based on these rules was used to identify specific segments of a market for promoting adoption of information technology. The extracted rules are not only concise but also outperform discriminant analysis in terms of predictive accuracy.
- Published
- 1998
32. Analysis of Hidden Representations by Greedy Clustering
- Author
-
Huan Liu and Rudy Setiono
- Subjects
Artificial neural network ,business.industry ,Computer science ,Contiguity ,Decision tree ,Machine learning ,computer.software_genre ,Backpropagation ,Human-Computer Interaction ,Set (abstract data type) ,Artificial Intelligence ,Pruning (decision trees) ,Artificial intelligence ,Greedy algorithm ,business ,Cluster analysis ,computer ,Software - Abstract
The hidden layer of backpropagation neural networks (NNs) holds the key to the networks' success in solving pattern classification problems. The units in the hidden layer encapsulate the network's internal representations of the outside world described by the input data. In this paper, the hidden representations of trained networks are investigated by means of a simple greedy clustering algorithm. This clustering algorithm is applied to networks that have been trained to solve well-known problems: the monks problems, the 5-bit parity problem and the contiguity problem. The results from applying the algorithm to problems with known concepts provide us with a better understanding of NN learning. These results also explain why NNs achieve higher predictive accuracy than decision-tree methods. The results of this study can be readily applied to rule extraction from networks. Production rules are extracted for the parity and the monks problems, as well as a benchmark data set: Pima Indian diabetes diagnosis. The extracted rules from the Indi...
- Published
- 1998
33. [Untitled]
- Author
-
Rudy Setiono and Huan Liu
- Subjects
business.industry ,Computer science ,Heuristic ,Dimensionality reduction ,Feature extraction ,Feature selection ,computer.software_genre ,Machine learning ,Randomized algorithm ,Artificial Intelligence ,Feature (computer vision) ,Pattern recognition (psychology) ,Minimum redundancy feature selection ,Artificial intelligence ,Data mining ,business ,computer - Abstract
Feature selection is a problem of finding relevant features. When the number of features of a dataset is large and its number of patterns is huge, an effective method of feature selection can help in dimensionality reduction. An incremental probabilistic algorithm is designed and implemented as an alternative to the exhaustive and heuristic approaches. Theoretical analysis is given to support the idea of the probabilistic algorithm in finding an optimal or near-optimal subset of features. Experimental results suggest that (1) the probabilistic algorithm is effective in obtaining optimal/suboptimal feature subsets; and (2) its incremental version expedites feature selection further when the number of patterns is large and can scale up without sacrificing the quality of selected features.
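A probabilistic selector of this kind can be sketched as a Las-Vegas-style search: repeatedly sample a random subset and keep the smallest one under which the data remain consistent (no two patterns agree on the subset but disagree on the class). The XOR dataset, the consistency criterion, and the parameter choices below are illustrative assumptions:

```python
import random

random.seed(42)

def inconsistency(data, labels, subset):
    """Inconsistency rate of the data projected onto a feature subset:
    fraction of patterns that agree on the subset's features but carry
    a minority class label within their group."""
    groups = {}
    for row, y in zip(data, labels):
        groups.setdefault(tuple(row[i] for i in subset), []).append(y)
    clashes = sum(len(ys) - max(ys.count(c) for c in set(ys))
                  for ys in groups.values())
    return clashes / len(data)

def probabilistic_select(data, labels, n_features, max_tries=500):
    """Sketch of a probabilistic (Las Vegas style) feature selector:
    sample random subsets, keep the smallest consistent one."""
    best = list(range(n_features))
    for _ in range(max_tries):
        cand = random.sample(range(n_features), random.randint(1, len(best)))
        if len(cand) < len(best) and inconsistency(data, labels, cand) == 0:
            best = sorted(cand)
    return best

# Class = XOR of features 0 and 1; features 2 and 3 are random noise.
data = [[a, b, random.randint(0, 1), random.randint(0, 1)]
        for a in (0, 1) for b in (0, 1) for _ in range(10)]
labels = [row[0] ^ row[1] for row in data]
selected = probabilistic_select(data, labels, 4)
print(selected)
```

Because XOR is only determined jointly by features 0 and 1, any consistent subset the search keeps must contain both, while the irrelevant noise features are dropped with high probability.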
- Published
- 1998
34. NeuroLinear: From neural networks to oblique decision rules
- Author
-
Rudy Setiono and Huan Liu
- Subjects
Artificial neural network ,business.industry ,Cognitive Neuroscience ,Oblique case ,Pattern recognition ,Decision rule ,computer.software_genre ,Partition (database) ,Computer Science Applications ,Set (abstract data type) ,Reduction (complexity) ,Hyperplane ,Artificial Intelligence ,Data mining ,Artificial intelligence ,Pruning (decision trees) ,business ,computer ,Mathematics - Abstract
We present NeuroLinear, a system for extracting oblique decision rules from neural networks that have been trained for classification of patterns. Each condition of an oblique decision rule corresponds to a partition of the attribute space by a hyperplane that is not necessarily axis-parallel. Allowing a set of such hyperplanes to form the boundaries of the decision regions leads to a significant reduction in the number of rules generated while maintaining the accuracy rates of the networks. We describe the components of NeuroLinear in detail by way of two examples using artificial datasets. Our experimental results on real-world datasets show that the system is effective in extracting compact and comprehensible rules with high predictive accuracy from neural networks.
- Published
- 1997
35. On the solution of the parity problem by a single hidden layer feedforward neural network
- Author
-
Rudy Setiono
- Subjects
Cognitive Neuroscience ,Connection (vector bundle) ,Sigmoid function ,System of linear equations ,Topology ,Transfer function ,Computer Science Applications ,Artificial Intelligence ,Control theory ,Parity problem ,Feedforward neural network ,Layer (object-oriented design) ,Unit (ring theory) ,Mathematics - Abstract
It is known that the N-bit parity problem is solvable by a standard feedforward neural network having a single hidden layer consisting of N/2 + 1 hidden units if N is even and (N + 1)/2 hidden units if N is odd. The network does not allow a direct connection between the input layer and the output layer, and the transfer function used in all hidden units and the output unit is the usual sigmoidal function σ(x) = 1/(1 + exp(−x)). We show that such a solution can be easily obtained by solving a system of linear equations.
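A sketch in the spirit of this construction, for the smallest even case N = 2 (XOR, so N/2 + 1 = 2 hidden units): give every hidden unit all-ones input weights, so its activation depends only on the input sum s, then solve a linear system for the output weights and bias so the output pre-activation hits chosen targets for s = 0, 1, 2. The hidden biases and the target values +/-2 are arbitrary illustrative choices, not the paper's:

```python
from math import exp

def sigma(x):
    return 1.0 / (1.0 + exp(-x))

def solve(A, b):
    """Tiny Gaussian elimination with partial pivoting for an n x n system."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

# Each hidden unit sees the input sum s = x1 + x2 (all input weights 1)
# shifted by its own bias, so its activation depends on s only.
thetas = [0.5, 1.5]
def hidden(s):
    return [sigma(s - t) for t in thetas]

# Linear system: pick output weights and bias so that the output
# pre-activation is +2 for odd s and -2 for even s.
H = [hidden(s) + [1.0] for s in (0, 1, 2)]   # rows: [h1(s), h2(s), 1]
v1, v2, b = solve(H, [-2.0, 2.0, -2.0])

def net(x1, x2):
    h = hidden(x1 + x2)
    return 1 if sigma(v1 * h[0] + v2 * h[1] + b) > 0.5 else 0

print([net(a, c) for a in (0, 1) for c in (0, 1)])  # → [0, 1, 1, 0]
```

No gradient training is needed: because the hidden activations for the three possible input sums form an invertible system, the output layer is obtained exactly by linear algebra.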
- Published
- 1997
36. Neural-network feature selector
- Author
-
Huan Liu and Rudy Setiono
- Subjects
Artificial neural network ,Computer Networks and Communications ,Computer science ,Entropy (statistical thermodynamics) ,business.industry ,Feature extraction ,Feature selection ,Pattern recognition ,General Medicine ,Backpropagation ,Computer Science Applications ,Entropy (classical thermodynamics) ,Error function ,Cross entropy ,Artificial Intelligence ,Entropy (information theory) ,Feedforward neural network ,Artificial intelligence ,Entropy (energy dispersal) ,business ,Software - Abstract
Feature selection is an integral part of most learning algorithms. Due to the existence of irrelevant and redundant attributes, by selecting only the relevant attributes of the data, higher predictive accuracy can be expected from a machine learning method. In this paper, we propose the use of a three-layer feedforward neural network to select those input attributes that are most useful for discriminating classes in a given set of input patterns. A network pruning algorithm is the foundation of the proposed algorithm. By adding a penalty term to the error function of the network, redundant network connections can be distinguished from those relevant ones by their small weights when the network training process has been completed. A simple criterion to remove an attribute based on the accuracy rate of the network is developed. The network is retrained after removal of an attribute, and the selection process is repeated until no attribute meets the criterion for removal. Our experimental results suggest that the proposed method works very well on a wide variety of classification problems.
- Published
- 1997
37. A Penalty-Function Approach for Pruning Feedforward Neural Networks
- Author
-
Rudy Setiono
- Subjects
Neurons ,Models, Statistical ,Probability learning ,Artificial neural network ,Computer science ,Weight elimination ,Cognitive Neuroscience ,Feed forward ,Reproducibility of Results ,Robotics ,Backpropagation ,Arts and Humanities (miscellaneous) ,Cluster Analysis ,Feedforward neural network ,Penalty method ,Neural Networks, Computer ,Probability Learning ,Algorithm ,Algorithms - Abstract
This article proposes the use of a penalty function for pruning feedforward neural networks by weight elimination. The penalty function proposed consists of two terms. The first term is to discourage the use of unnecessary connections, and the second term is to prevent the weights of the connections from taking excessively large values. Simple criteria for eliminating weights from the network are also given. The effectiveness of this penalty function is tested on three well-known problems: the contiguity problem, the parity problems, and the monks problems. The resulting pruned networks obtained for many of these problems have fewer connections than previously reported in the literature.
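A sketch of a two-term penalty of the kind described: a saturating term that pushes small (unnecessary) weights toward zero without growing further for large weights, plus a quadratic term that discourages excessively large weights. The functional form and all coefficient values below are illustrative assumptions, not necessarily the paper's exact formula:

```python
def penalty(weights, eps1=0.1, eps2=1e-4, beta=10.0):
    """Two-term weight-elimination penalty (illustrative coefficients):
    term1 saturates for large |w|, so it mainly penalizes keeping small
    unnecessary connections alive; term2 grows quadratically and keeps
    the surviving weights from becoming excessively large."""
    term1 = sum(beta * w * w / (1 + beta * w * w) for w in weights)
    term2 = sum(w * w for w in weights)
    return eps1 * term1 + eps2 * term2

# The saturating term's marginal cost flattens out for large weights,
# while the quadratic term keeps growing:
print(penalty([0.1]), penalty([1.0]), penalty([10.0]))
```

After training with such a penalty added to the error function, connections whose weights have been driven close to zero are the candidates for elimination.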
- Published
- 1997
38. Extracting Rules from Neural Networks by Pruning and Hidden-Unit Splitting
- Author
-
Rudy Setiono
- Subjects
Neurons ,Signal processing ,Base Sequence ,Artificial neural network ,Computer science ,Cognitive Neuroscience ,Process (computing) ,Feed forward ,Reproducibility of Results ,Exons ,Introns ,Sound ,Arts and Humanities (miscellaneous) ,Pattern recognition (psychology) ,Learning ,Feedforward neural network ,Neural Networks, Computer ,Subnetwork ,Algorithm ,Algorithms ,Pruning (morphology) - Abstract
An algorithm for extracting rules from a standard three-layer feedforward neural network is proposed. The trained network is first pruned not only to remove redundant connections in the network but, more important, to detect the relevant inputs. The algorithm generates rules from the pruned network by considering only a small number of activation values at the hidden units. If the number of inputs connected to a hidden unit is sufficiently small, then rules that describe how each of its activation values is obtained can be readily generated. Otherwise the hidden unit will be split and treated as output units, with each output unit corresponding to an activation value. A hidden layer is inserted and a new subnetwork is formed, trained, and pruned. This process is repeated until every hidden unit in the network has a relatively small number of input units connected to it. Examples on how the proposed algorithm works are shown using real-world data arising from molecular biology and signal processing. Our results show that for these complex problems, the algorithm can extract reasonably compact rule sets that have high predictive accuracy rates.
- Published
- 1997
39. Feature selection via discretization
- Author
-
Huan Liu and Rudy Setiono
- Subjects
Discretization ,Group method of data handling ,business.industry ,Computer science ,Feature extraction ,ComputerApplications_COMPUTERSINOTHERSYSTEMS ,Pattern recognition ,Feature selection ,Computer Science Applications ,Computational Theory and Mathematics ,Simple (abstract algebra) ,Artificial intelligence ,business ,Statistic ,Information Systems ,Discretization of continuous features - Abstract
Discretization can turn numeric attributes into discrete ones. Feature selection can eliminate some irrelevant and/or redundant attributes. Chi2 is a simple and general algorithm that uses the χ² statistic to discretize numeric attributes repeatedly until some inconsistencies are found in the data. It achieves feature selection via discretization. It can handle mixed attributes, work with multiclass data, and remove irrelevant and redundant attributes.
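The merge criterion at the heart of a Chi2-style discretizer is the χ² statistic computed over the class counts of two adjacent intervals: a low value means the intervals have similar class distributions and can be merged. A minimal sketch (the example counts and the 3.84 critical value, the 0.05 level for 1 degree of freedom, are illustrative):

```python
def chi2(interval_a, interval_b):
    """Chi-square statistic for two adjacent intervals, each given as a
    list of per-class counts. Low values indicate similar class
    distributions, i.e. candidates for merging."""
    classes = range(len(interval_a))
    col_tot = [interval_a[c] + interval_b[c] for c in classes]
    total = sum(col_tot)
    stat = 0.0
    for row in (interval_a, interval_b):
        row_tot = sum(row)
        for c in classes:
            expected = row_tot * col_tot[c] / total
            if expected:
                stat += (row[c] - expected) ** 2 / expected
    return stat

# Similar class distributions -> small statistic -> merge;
# very different distributions -> large statistic -> keep the cut point.
print(chi2([10, 2], [9, 3]))
print(chi2([10, 2], [2, 10]))
```

Repeating this merge test along each attribute, with a progressively tightened significance threshold, is what lets the algorithm collapse an irrelevant attribute into a single interval and thereby remove it.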
- Published
- 1997
40. Improving backpropagation learning with feature selection
- Author
-
Huan Liu and Rudy Setiono
- Subjects
Structure (mathematical logic) ,Training set ,Speedup ,Computer science ,business.industry ,Feature selection ,Information theory ,Machine learning ,computer.software_genre ,Base (topology) ,Backpropagation ,Artificial Intelligence ,Feedforward neural network ,Artificial intelligence ,Data mining ,business ,computer - Abstract
There exist redundant, irrelevant and noisy data. Using proper data to train a network can speed up training, simplify the learned structure, and improve its performance. A two-phase training algorithm is proposed. In the first phase, the number of input units of the network is determined by using an information-based method. Only those attributes that meet certain criteria for inclusion will be considered as the input to the network. In the second phase, the number of hidden units of the network is selected automatically based on the performance of the network on the training data. One hidden unit is added at a time, only if it is necessary. The experimental results show that this new algorithm can achieve a faster learning time, a simpler network, and improved performance.
- Published
- 1996
41. Symbolic representation of neural networks
- Author
-
Huan Liu and Rudy Setiono
- Subjects
General Computer Science ,Discretization ,Artificial neural network ,Computer science ,business.industry ,Time delay neural network ,Deep learning ,Backpropagation ,Probabilistic neural network ,Recurrent neural network ,Feedforward neural network ,Artificial intelligence ,Cluster analysis ,Stochastic neural network ,business ,Nervous system network models - Abstract
Neural networks often surpass decision trees in predicting pattern classifications, but their predictions cannot be explained. This algorithm's symbolic representations make each prediction explicit and understandable. Our approach to understanding a neural network uses symbolic rules to represent the network decision process. The algorithm, NeuroRule, extracts these rules from a neural network. The network can be interpreted by the rules which, in general, preserve network accuracy and explain the prediction process. We based NeuroRule on a standard three-layer feedforward network. NeuroRule consists of four phases. First, it builds a weight decay backpropagation network so that weights reflect the importance of the network's connections. Second, it prunes the network to remove irrelevant connections and units while maintaining the network's predictive accuracy. Third, it discretizes the hidden unit activation values by clustering. Finally, it extracts rules from the network with discretized hidden unit activation values.
- Published
- 1996
42. Dimensionality reduction via discretization
- Author
-
Huan Liu and Rudy Setiono
- Subjects
Imagination ,Information Systems and Management ,Discretization ,business.industry ,Computer science ,media_common.quotation_subject ,Dimensionality reduction ,Pattern recognition ,Feature selection ,Management Information Systems ,Search engine ,Knowledge extraction ,Artificial Intelligence ,Artificial intelligence ,business ,Raw data ,Classifier (UML) ,Software ,media_common - Abstract
The existence of numeric data and large numbers of records in a database present a challenging task in terms of explicit concepts extraction from the raw data. The paper introduces a method that reduces data vertically and horizontally, keeps the discriminating power of the original data, and paves the way for extracting concepts. The method is based on discretization (vertical reduction) and feature selection (horizontal reduction). The experimental results show that (a) the data can be effectively reduced by the proposed method; (b) the predictive accuracy of a classifier (C4.5) can be improved after data and dimensionality reduction; and (c) the classification rules learned are simpler.
- Published
- 1996
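The two reductions can be illustrated together: discretize each numeric column (vertical reduction), then drop columns that no longer discriminate (horizontal reduction). Equal-width binning and the constant-column criterion below are simplified stand-ins, not the chi-square-based procedure of the paper.

```python
def equal_width_bins(values, k):
    """Discretize one numeric column into k equal-width intervals."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / k or 1.0        # guard against a constant column
    return [min(int((v - lo) / width), k - 1) for v in values]

def reduce_table(columns, k=3):
    """columns: dict of name -> list of numeric values.

    Returns the discretized columns, keeping only features that still
    vary after binning; columns that collapse to a single interval
    carry no discriminating power and are dropped.
    """
    kept = {}
    for name, col in columns.items():
        binned = equal_width_bins(col, k)
        if len(set(binned)) > 1:
            kept[name] = binned
    return kept
```

A column whose values all land in one bin contributes nothing to class separation, so removing it shrinks the feature space without losing discriminating power.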
43. Effective data mining using neural networks
- Author
-
Huan Liu, Rudy Setiono, and Hongjun Lu
- Subjects
Interpretation (logic) ,Artificial neural network ,Time delay neural network ,Computer science ,business.industry ,Decision tree ,Machine learning ,computer.software_genre ,Knowledge acquisition ,Computer Science Applications ,Set (abstract data type) ,Intelligent Network ,Computational Theory and Mathematics ,Data mining ,Artificial intelligence ,business ,computer ,Information Systems - Abstract
Classification is one of the data mining problems receiving great attention recently in the database community. The paper presents an approach to discovering symbolic classification rules using neural networks. Neural networks have long been considered ill-suited for data mining because their classifications are not expressed as symbolic rules that humans can verify or interpret. With the proposed approach, concise symbolic rules with high accuracy can be extracted from a neural network. The network is first trained to achieve the required accuracy rate. Redundant connections of the network are then removed by a network pruning algorithm. The activation values of the hidden units in the network are analyzed, and classification rules are generated using the result of this analysis. The effectiveness of the proposed approach is clearly demonstrated by the experimental results on a set of standard data mining test problems.
- Published
- 1996
44. Discrete Variable Generation for Improved Neural Network Classification
- Author
-
Rudy Setiono and Alex Seret
- Subjects
Artificial neural network ,Discretization ,business.industry ,Time delay neural network ,Computer science ,Decision tree ,Univariate ,Pattern recognition ,Function (mathematics) ,computer.software_genre ,Feedforward neural network ,Artificial intelligence ,Pruning (decision trees) ,Data mining ,business ,computer ,Variable (mathematics) - Abstract
Neural networks are widely used for classification as they achieve good predictive accuracy. When the class labels are determined by complex interactions of the input variables, neural networks can be expected to provide better predictions than methods that test the values of one variable at a time, such as univariate decision tree classifiers. On the other hand, when no interaction, or only a relatively simple one, determines class membership, the neural network may overfit the data, representing the input-to-output relationship by a function that is more complex than necessary. In this paper, we propose adding discretized values of the continuous variables in the data as inputs when training the neural networks. Pruning then determines whether the discretized values or the original continuous values of the variables are useful. With only the relevant inputs left in the pruned networks, we are able to extract classification rules from these networks that are accurate, concise and interpretable.
- Published
- 2012
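The input augmentation described above can be sketched as appending interval-indicator inputs to each continuous value. The one-hot encoding and parameter names here are a sketch, not the paper's exact scheme; pruning would later decide which of these inputs survive.

```python
def augment_with_discretized(x, lo, hi, k):
    """Return the original continuous value x followed by k binary
    indicators marking which of the k equal-length subintervals of
    [lo, hi] the value falls into.
    """
    width = (hi - lo) / k
    idx = min(int((x - lo) / width), k - 1)   # clamp x == hi into the last bin
    return [x] + [1.0 if i == idx else 0.0 for i in range(k)]
```

For example, with four subintervals of [0, 1], the value 0.5 activates the third indicator, giving the network a ready-made "x is in [0.5, 0.75)" feature it would otherwise have to learn.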
45. A Neural Network Construction Algorithm which Maximizes the Likelihood Function
- Author
-
Rudy Setiono
- Subjects
Quasi-maximum likelihood ,Mathematical optimization ,Quantitative Biology::Neurons and Cognition ,Artificial neural network ,Mean squared error ,Computer science ,Computer Science::Neural and Evolutionary Computation ,Feed forward ,Human-Computer Interaction ,ComputingMethodologies_PATTERNRECOGNITION ,Artificial Intelligence ,Feedforward neural network ,Quasi-Newton method ,Hidden layer ,Likelihood function ,Algorithm ,Software - Abstract
A new method for constructing a feedforward neural network is proposed. The method starts with a single hidden unit and more units are added to the hidden layer one at a time until a network that c...
- Published
- 1995
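The constructive loop described in the abstract can be sketched as follows. Since the abstract is truncated, the stopping criterion below (stop when the fit score no longer improves) is an assumption; `train_fn` and `score_fn` are hypothetical stand-ins for training a network with a given number of hidden units and evaluating the likelihood it achieves.

```python
def grow_network(train_fn, score_fn, max_units=10, tol=1e-4):
    """Start with one hidden unit and add units one at a time,
    retraining after each addition, until the fit criterion
    (e.g. the log-likelihood) stops improving.
    """
    best_net, best_score = None, float("-inf")
    for h in range(1, max_units + 1):
        net = train_fn(h)        # train a network with h hidden units
        score = score_fn(net)    # fit criterion to maximize
        if score - best_score <= tol:
            break                # no useful improvement: stop growing
        best_net, best_score = net, score
    return best_net, best_score
```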
46. EFFICIENT NEURAL NETWORK TRAINING ON A CRAY Y-MP
- Author
-
Rudy Setiono and Siu-Leung Chung
- Subjects
Artificial neural network ,Computer science ,Computation ,Process (computing) ,Training (meteorology) ,ComputerApplications_COMPUTERSINOTHERSYSTEMS ,Parallel computing ,Supercomputer ,Theoretical Computer Science ,Computational science ,Error function ,Computational Theory and Mathematics ,Vectorization (mathematics) ,Overall performance - Abstract
An efficient implementation of a quasi-Newton algorithm for training feedforward neural networks on a Cray Y-MP is presented. The most time-consuming step of neural network training with the quasi-Newton algorithm is the computation of the error function and its gradient. The parallelism embedded in these computations can be exploited through vectorization on a Cray Y-MP supercomputer. We show how these computations can be organized so that the overall performance of the neural network training process is substantially enhanced.
- Published
- 1995
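The bulk arithmetic that vectorizes well on the Cray Y-MP is the batched error and gradient evaluation. The sketch below expresses it with NumPy matrix operations rather than Cray vector code, for a single-layer sigmoid network with sum-of-squares error; the network shape is illustrative.

```python
import numpy as np

def error_and_grad(W, X, y):
    """Sum-of-squares error and its gradient for a single-layer
    sigmoid network, computed over the whole batch at once.

    W -- weight vector, shape (d,)
    X -- input matrix, shape (n, d)
    y -- targets, shape (n,)
    """
    z = X @ W                        # pre-activations for all samples at once
    p = 1.0 / (1.0 + np.exp(-z))     # sigmoid outputs
    r = p - y                        # residuals
    E = 0.5 * float(r @ r)           # sum-of-squares error
    grad = X.T @ (r * p * (1 - p))   # chain rule, still one matrix product
    return E, grad
```

Because both the error and the gradient reduce to dense matrix-vector products, the same loop-free structure maps directly onto vector hardware.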
47. Use of a quasi-Newton method in a feedforward neural network construction algorithm
- Author
-
L.C.K. Hui and Rudy Setiono
- Subjects
Brooks–Iyengar algorithm ,Training set ,Artificial neural network ,Computer Networks and Communications ,Computer science ,Time delay neural network ,business.industry ,Feed forward ,General Medicine ,Backpropagation ,Computer Science Applications ,Probabilistic neural network ,Artificial Intelligence ,Robustness (computer science) ,Feedforward neural network ,Artificial intelligence ,Forward algorithm ,business ,Algorithm ,Software ,FSA-Red Algorithm - Abstract
This paper describes an algorithm for constructing a single hidden layer feedforward neural network. A distinguishing feature of this algorithm is that it uses the quasi-Newton method to minimize the sequence of error functions associated with the growing network. Experimental results indicate that the algorithm is very efficient and robust. The algorithm was tested on two problems: the n-bit parity problem and the breast cancer diagnosis problem from the University of Wisconsin Hospitals. For the n-bit parity problem, the algorithm was able to construct neural networks with fewer than n hidden units that solved the problem for n = 4, ..., 7. For the cancer diagnosis problem, the neural networks constructed by the algorithm had a small number of hidden units and high accuracy rates on both the training data and the testing data.
- Published
- 1995
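The quasi-Newton minimization at the core of the construction algorithm can be illustrated with a minimal BFGS iteration. This sketch uses a unit step with no line search, which is a simplification of any practical implementation; it shows the defining idea, namely building an inverse-Hessian approximation from gradient differences alone, so no second derivatives of the error function are needed.

```python
import numpy as np

def bfgs(grad, x0, tol=1e-8, max_iter=100):
    """Minimal BFGS quasi-Newton iteration (unit step, no line search).

    grad -- callable returning the gradient at a point
    x0   -- starting point
    """
    x = np.asarray(x0, dtype=float)
    n = len(x)
    H = np.eye(n)                      # inverse-Hessian approximation
    g = grad(x)
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        p = -H @ g                     # quasi-Newton search direction
        x_new = x + p
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        sy = s @ y
        if sy > 1e-12:                 # curvature condition for a valid update
            rho = 1.0 / sy
            I = np.eye(n)
            H = (I - rho * np.outer(s, y)) @ H @ (I - rho * np.outer(y, s)) \
                + rho * np.outer(s, s)
        x, g = x_new, g_new
    return x
```

On the quadratic error surface near a minimum the update recovers the true inverse Hessian quickly, which is why the method converges in far fewer iterations than plain gradient descent.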
48. Keyword extraction using backpropagation neural networks and rule extraction
- Author
-
Arnulfo P. Azcarraga, Michael David Liu, and Rudy Setiono
- Subjects
Artificial neural network ,Computer science ,Feature extraction ,Keyword extraction ,Decision tree ,Data mining ,Cluster analysis ,Document processing ,computer.software_genre ,computer ,Word (computer architecture) ,Backpropagation ,Sentence - Abstract
Keyword extraction is vital for knowledge management systems, information retrieval systems, and digital libraries, as well as for general browsing of the web. Keywords are often the basis of document processing methods such as clustering and retrieval, since processing all the words in a document can be slow. Automated keyword extraction is commonly modeled with statistics-based methods such as Bayesian classifiers, K-Nearest Neighbor, and Expectation-Maximization. These models are limited in the word-related features they can use, since adding more features makes them more complex and harder to comprehend. In this research, a neural network, specifically a backpropagation network, is used to generalize the relationship between the titles and the content of articles in an archive, using word features beyond TF-IDF, such as the position of a word in the sentence, paragraph, or entire document, and formats such as headings and other attributes defined beforehand. To explain how the backpropagation network works, a rule extraction method is used to extract symbolic rules from the resulting network. The extracted rules can then be transformed into decision trees that perform almost as accurately as the network, with the added benefit of an easily comprehensible format.
- Published
- 2012
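The kind of per-word feature vector the abstract describes can be sketched as follows. The particular features (term frequency, relative position of first occurrence, an in-title flag) are illustrative choices, not the paper's exact attribute set.

```python
def word_features(word, title, body_words):
    """Build a feature vector for one candidate keyword.

    word       -- the candidate word (lowercase)
    title      -- article title string
    body_words -- list of lowercase words in the article body
    """
    n = len(body_words)
    tf = body_words.count(word) / n                        # term frequency
    first = body_words.index(word) / n if word in body_words else 1.0
    in_title = 1.0 if word in title.lower().split() else 0.0
    return [tf, first, in_title]
```

Vectors like these, one per candidate word, would form the input layer of the backpropagation network, with the target label indicating whether the word is a keyword.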
49. Rule Extraction from Neural Networks and Support Vector Machines for Credit Scoring
- Author
-
Bart Baesens, Rudy Setiono, and David Martens
- Subjects
Artificial neural network ,business.industry ,Computer science ,Decision tree ,computer.software_genre ,Domain (software engineering) ,Support vector machine ,ComputingMethodologies_PATTERNRECOGNITION ,Credit history ,Application domain ,Business intelligence ,Extraction methods ,Data mining ,business ,computer - Abstract
In this chapter we describe how comprehensible rules can be extracted from artificial neural networks (ANN) and support vector machines (SVM). ANN and SVM are two very popular techniques for pattern classification. In the business intelligence application domain of credit scoring, they have been shown to be effective tools for distinguishing between good credit risks and bad credit risks. The accuracy obtained by these two techniques is often higher than that of decision tree methods. Unlike decision tree methods, however, the classifications made by ANN and SVM are difficult for end-users to understand, as the outputs of ANN and SVM are computed as nonlinear mappings of the input data attributes. We describe two rule extraction methods that we have developed to overcome this difficulty. These rule extraction methods enable users to obtain comprehensible propositional rules from ANN and SVM. Such rules can be easily verified by domain experts and lead to a better understanding of the data at hand.
- Published
- 2012
50. Using Sample Selection to Improve Accuracy and Simplicity of Rules Extracted from Neural Networks for Credit Scoring Applications
- Author
-
Arnulfo P. Azcarraga, Rudy Setiono, and Yoichi Hayashi
- Subjects
Structure (mathematical logic) ,Sample selection ,Training set ,Artificial neural network ,business.industry ,Computer science ,media_common.quotation_subject ,Extraction algorithm ,Pattern recognition ,computer.software_genre ,Computer Science Applications ,Theoretical Computer Science ,ComputingMethodologies_PATTERNRECOGNITION ,Outlier ,Benchmark (computing) ,Simplicity ,Artificial intelligence ,Data mining ,business ,computer ,Software ,media_common - Abstract
In this paper, we present an approach for sample selection using an ensemble of neural networks for credit scoring. The ensemble identifies samples that can be considered outliers by checking the classification accuracy of the neural networks on the original training data samples. Samples that are consistently misclassified by the neural networks in the ensemble are removed from the training dataset. The remaining data samples are then used to train and prune another neural network for rule extraction. Our experimental results on publicly available benchmark credit scoring datasets show that by eliminating the outliers, we obtain neural networks with higher predictive accuracy and simpler structure than networks trained with the original dataset. A rule extraction algorithm is applied to generate comprehensible rules from the neural networks. The extracted rules are more concise than the rules generated from networks that have been trained using the original datasets.
- Published
- 2015
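The sample-selection step can be sketched as filtering out training samples that the whole ensemble misclassifies. Treating "consistently misclassified" as unanimous misclassification is one simple reading; the callable-classifier interface is an assumption for illustration.

```python
def select_samples(samples, labels, classifiers):
    """Drop training samples that every classifier in the ensemble
    misclassifies (treated as outliers); keep the rest.

    classifiers -- list of callables mapping a sample to a predicted label
    """
    kept = []
    for x, y in zip(samples, labels):
        wrong = sum(1 for clf in classifiers if clf(x) != y)
        if wrong < len(classifiers):   # at least one classifier got it right
            kept.append((x, y))
    return kept
```

The surviving pairs form the cleaned training set on which the final network for rule extraction is trained and pruned.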