30 results on '"César A. Astudillo"'
Search Results
2. ML models for severity classification and length-of-stay forecasting in emergency units
- Author
-
Jonathan Moya-Carvajal, Francisco Pérez-Galarce, Carla Taramasco, César A. Astudillo, and Alfredo Candia-Véjar
- Subjects
Artificial Intelligence ,General Engineering ,Computer Science Applications - Published
- 2023
3. A Novel Strategy to Classify Chronic Patients at Risk: A Hybrid Machine Learning Approach
- Author
-
Hugo Nuñez Delafuente, César A. Astudillo, and Fabián Silva-Aravena
- Subjects
General Mathematics ,chronic patient ,classification rules ,machine learning ,decision support systems ,Computer Science (miscellaneous) ,Engineering (miscellaneous) - Abstract
Various care processes have been affected by COVID-19. One of the most dramatic has been the care of chronic patients under medical supervision. According to the World Health Organization (WHO), a chronic patient has one or more long-term illnesses, and must be permanently monitored by the health team. In fact, according to the Chilean Ministry of Health (MINSAL), 7 out of 10 chronic patients have suspended their medical check-ups, generating critical situations, such as a greater number of visits to emergency units, expired prescriptions, and higher hospitalization rates. To address this problem, health services in Chile have had to reschedule their scarce medical resources to provide care in all health processes. One element that has been considered is care through telemedicine and patient prioritization. In the latter case, the aim was to provide timely care to critical patients with high severity who require immediate clinical attention. For this reason, in this work, we present the following methodological contributions: first, an unsupervised algorithm that analyzes information from anonymous patients to classify them according to priority levels; and second, rules that allow health teams to understand which variable(s) determine the classification of patients. The results of the proposed methodology allow classifying new patients with 99.96% certainty using a three-level decision tree and five classification rules.
- Published
- 2022
4. Optimizing the search directions of a mixed DDM applied on cracks
- Author
-
Ignacio Fuenzalida-Henriquez, César A. Astudillo, Jorge Hinojosa, and Larry Peña
- Subjects
Work (thermodynamics) ,Control and Optimization ,Optimization algorithm ,Computer science ,Mechanical Engineering ,Linear elasticity ,Aerospace Engineering ,Domain decomposition methods ,Fracture mechanics ,Set (abstract data type) ,Optimization methods ,Electrical and Electronic Engineering ,Algorithm ,Software ,Civil and Structural Engineering - Abstract
Domain Decomposition Methods (DDM) are a set of numerical techniques that efficiently implement parallel computing for the structural analysis of large domains. This work presents the implementation of mixed DDM for linear elasticity problems along with non-linear problems such as crack propagation. In addition, optimization algorithms have been used to find the optimal parameters of the mixed domain decomposition method. Finally, a strategy is proposed in order to implement the optimization methods to obtain a good approximation of the Search Direction parameter at a low computational cost.
- Published
- 2021
5. Patients’ Prioritization on Surgical Waiting Lists: A Decision Support System
- Author
-
César A. Astudillo, Fabián Silva-Aravena, Luis González-Martínez, Eduardo Álvarez-Miranda, and José G. Ledezma
- Subjects
Biopsychosocial model ,Decision support system ,medicine.medical_specialty ,decision support system ,Computer science ,General Mathematics ,prioritization and vulnerability ,02 engineering and technology ,biopsychosocial criteria ,03 medical and health sciences ,0302 clinical medicine ,Intervention (counseling) ,Health care ,0202 electrical engineering, electronic engineering, information engineering ,Computer Science (miscellaneous) ,medicine ,QA1-939 ,Operations management ,030212 general & internal medicine ,Elective surgery ,Engineering (miscellaneous) ,business.industry ,Medical record ,Social environment ,waiting list ,elective surgery ,psychosocial support systems ,Otorhinolaryngology ,020201 artificial intelligence & image processing ,business ,Mathematics - Abstract
Currently, in Chile, more than a quarter-million patients are waiting for an elective surgical intervention. This is a worldwide reality, and it occurs because the demand for healthcare is vastly superior to the clinical resources in public systems. Moreover, this phenomenon has worsened due to the COVID-19 sanitary crisis. In order to reduce the impact of this situation, patients on the waiting lists are ranked according to a priority. However, the existing prioritization strategies are not necessarily systematized, and they usually respond only to clinical criteria, excluding other dimensions such as the personal and social context of patients. In this paper, we present a decision-support system designed for the prioritization of surgical waiting lists based on biopsychosocial criteria. The proposed system features three methodological contributions: first, an ad-hoc medical record form that captures the biopsychosocial condition of the patients; second, a dynamic scoring scheme that recognizes that patients’ conditions evolve differently while waiting for the required elective surgery; and third, a methodology for prioritizing and selecting patients based on the corresponding dynamic scores and additional clinical criteria. The designed decision-support system was implemented in the otorhinolaryngology unit in the Hospital of Talca, Chile, in 2018. When compared to the previous prioritization methodology, the results obtained from the use of the system during 2018 and 2019 show that this new methodology outperforms the previous prioritization method quantitatively and qualitatively. In fact, from 2017 to 2019 the designed system reduced the average number of days on the waiting list from 462 to 282 days.
- Published
- 2021
6. Towards an Integrated Maturity Model of System and E-Business Applications in an Emerging Economy
- Author
-
Jimmy H. Gutiérrez-Bahamondes, Luis González-Martínez, Alejandro Cataldo, Robert J. McQueen, and César A. Astudillo
- Subjects
Electronic business ,Computer science ,Business system planning ,02 engineering and technology ,General Business, Management and Accounting ,Maturity (finance) ,Computer Science Applications ,Capability Maturity Model ,Enterprise system ,Maturity models ,020204 information systems ,Clustering analysis ,0202 electrical engineering, electronic engineering, information engineering ,Information system ,Business systems ,E-Business applications ,Information systems ,Emerging markets ,Cluster analysis ,Data mining ,Industrial organization - Abstract
Although there is a great number of maturity models proposed for Information Systems, most of them have three limitations: (1) they are focused on a single or small subset of companies, (2) they do not address the evolution of enterprise systems and e-business applications, simultaneously, (3) they are only focused on developed countries and do not consider emerging economies. We developed a maturity model of Information Systems that addresses these limitations through a data mining approach. The results showed that clustering analysis was an effective method for discovering similar groups of companies according to the set of enterprise systems and e-business applications that they adopted. Two major conclusions can be outlined: Unlike previous models, it has been shown that companies can be grouped only in three stages of maturity. Furthermore, the evolutionary pattern of systems adopted by companies follows a path oriented to obtain greater efficiencies at the expense of those that strengthen the relationship with customers. The results are relevant to practitioners, researchers and policy makers in emerging economies.
- Published
- 2020
7. Algorithms for the Minmax Regret Path problem with interval data
- Author
-
Alfredo Candia-Véjar, Francisco Pérez-Galarce, César A. Astudillo, and Matthew Bardeen
- Subjects
021103 operations research ,Information Systems and Management ,Computer science ,Node (networking) ,Interval estimation ,0211 other engineering and technologies ,Regret ,02 engineering and technology ,Minimax ,Computer Science Applications ,Theoretical Computer Science ,Artificial Intelligence ,Control and Systems Engineering ,Path (graph theory) ,Simulated annealing ,Shortest path problem ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Heuristics ,Branch and cut ,Algorithm ,Software - Abstract
The Shortest Path in networks is an important problem in combinatorial optimization and has many applications in areas like telecommunications and transportation. It is known that this problem is easy to solve in its classic deterministic version, but it is also known that it is an NP-Hard problem for several generalizations. The Shortest Path Problem consists in finding a simple path connecting a source node and a terminal node in an arc-weighted directed network. In some real-world situations the weights are not completely known, and the problem then becomes one of optimization under uncertainty. It is assumed that an interval estimate is given for each arc length and no further information about the statistical distribution of the weights is known. Uncertainty has been modeled in different ways in optimization. Our aim in this paper is to study the Minmax Regret path with interval data problem by presenting a new exact branch and cut algorithm and, additionally, new heuristics. A set of difficult and large size instances is defined and computational experiments are conducted for the analysis of the different approaches designed to solve the problem. The main contribution of our paper is to provide an assessment of the performance of the proposed algorithms and empirical evidence of the superiority of a simulated annealing approach based on a new neighborhood over the other heuristics proposed.
- Published
- 2018
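The regret criterion named in the abstract above can be illustrated with a small sketch. It is not the paper's branch and cut algorithm or any of its heuristics. It only evaluates the maximum regret of one fixed path, using the standard property of interval-data shortest path problems that the worst-case scenario puts the arcs of the path at their upper bounds and all other arcs at their lower bounds. The graph, the node names and the function names are hypothetical.

```python
import heapq

def shortest_path_cost(arcs, source, target):
    """Dijkstra over a dict {(u, v): cost}; returns the cheapest source-target cost."""
    adjacency = {}
    for (u, v), cost in arcs.items():
        adjacency.setdefault(u, []).append((v, cost))
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == target:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for nxt, cost in adjacency.get(node, []):
            cand = dist + cost
            if cand < best.get(nxt, float("inf")):
                best[nxt] = cand
                heapq.heappush(queue, (cand, nxt))
    return float("inf")

def max_regret(intervals, path, source, target):
    """Maximum regret of a fixed path given {(u, v): (lower, upper)} arc intervals."""
    on_path = set(zip(path, path[1:]))
    # Worst case for this path: its own arcs are as long as possible,
    # every other arc is as short as possible.
    scenario = {arc: (hi if arc in on_path else lo)
                for arc, (lo, hi) in intervals.items()}
    path_cost = sum(scenario[arc] for arc in on_path)
    return path_cost - shortest_path_cost(scenario, source, target)

# Hypothetical four-arc network with two source-target paths.
intervals = {
    ("s", "a"): (1, 4),
    ("a", "t"): (1, 3),
    ("s", "b"): (2, 5),
    ("b", "t"): (2, 2),
}
regret_via_a = max_regret(intervals, ["s", "a", "t"], "s", "t")
regret_via_b = max_regret(intervals, ["s", "b", "t"], "s", "t")
```

In this toy instance the path through `a` has the smaller maximum regret, so it would be the minmax regret choice among the two candidates. Enumerating paths like this does not scale, which is why exact and heuristic methods are needed for large instances.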
8. On the data to know the prioritization and vulnerability of patients on surgical waiting lists
- Author
-
César A. Astudillo, José G. Ledezma, Eduardo Álvarez-Miranda, Luis González-Martínez, and Fabián Silva-Aravena
- Subjects
Prioritization ,Biopsychosocial model ,Decision support system ,medicine.medical_specialty ,Waiting list ,Vulnerability ,Medical decision making ,lcsh:Computer applications to medicine. Medical informatics ,03 medical and health sciences ,0302 clinical medicine ,Biopsichosocial criteria ,Medicine ,lcsh:Science (General) ,Decision Science ,030304 developmental biology ,0303 health sciences ,Multidisciplinary ,business.industry ,Priorization ,medicine.disease ,Otorhinolaryngology ,lcsh:R858-859.7 ,Medical emergency ,business ,030217 neurology & neurosurgery ,lcsh:Q1-390 - Abstract
The data presented in this article are complementary material to our work entitled “A decision support system for prioritization of patients on surgical waiting lists: A biopsychosocial approach”. Together with physicians, we prepared a survey that was used in the otorhinolaryngology unit of the Hospital of Talca for a period of five months, between February 05, 2018 and June 29, 2018. Two hundred and five surveys were collected through 20 biopsychosocial criteria, which allowed measuring the priority and vulnerability of patients on the surgical waiting list. The data allow choosing and preparing patients for surgery according to both a dynamic score and a vulnerability level. Keywords: Waiting list, Biopsichosocial criteria, Priorization, Vulnerability, Medical decision making
- Published
- 2019
9. Data extracted from olive oil mill waste exposed to ambient conditions
- Author
-
Luis González-Martínez, David Gabriel, Diógenes Hernández, Fabián Silva A, and César A. Astudillo
- Subjects
0303 health sciences ,Multidisciplinary ,Extraction (chemistry) ,Forestal ,Oil mill ,Pulp and paper industry ,lcsh:Computer applications to medicine. Medical informatics ,Decomposition ,03 medical and health sciences ,chemistry.chemical_compound ,0302 clinical medicine ,chemistry ,Olive oil mill waste ,Environmental Science ,Odorants ,Ilumina sequencing ,Mill ,Environmental science ,lcsh:R858-859.7 ,Open-air reservoirs ,lcsh:Science (General) ,030217 neurology & neurosurgery ,030304 developmental biology ,Olive oil ,lcsh:Q1-390 - Abstract
Recent studies show that the process of extraction of olive oil results in a large amount of waste. Around 20% olive oil is obtained in the process and the remaining 80% corresponds mainly to two types of waste, known as orujo and alperujo. These residues were stored in pools for 6 months in an uncontrolled environment. The reservoirs are open and generate odorous Volatile Organic Compounds (VOCs) as products of waste decomposition. The data in this article correspond to physical-chemical compounds of olive oil mill waste exposed to ambient conditions. The data were obtained from two different oil mills, namely, Almazara del Pacífico located in the Alto Pangue area, Talca, Chile; and Agricola y Forestal Don Rafael oil mill, Molina, Chile. Samples were extracted directly from the oil mills to fill 200 L plastic containers that simulated the waste storage in oil mill reservoirs. Each sample was identified and standardized to a mass of 150 kg and moved and stored under uncontrolled ambient conditions at the Universidad de Talca, Curicó, Chile. Keywords: Olive oil mill waste, Open-air reservoirs, Odorants, Ilumina sequencing
- Published
- 2019
10. Predicting the stability of human lysozyme mutants using the tree-based classifier TTOSOM
- Author
-
Gonzalo Maldonado, Julio Caballero, César A. Astudillo, and Gonzalo Riadi
- Subjects
0301 basic medicine ,Self-organizing map ,Artificial neural network ,business.industry ,Process Chemistry and Technology ,Machine learning ,computer.software_genre ,01 natural sciences ,0104 chemical sciences ,Computer Science Applications ,Analytical Chemistry ,010404 medicinal & biomolecular chemistry ,03 medical and health sciences ,030104 developmental biology ,Protein stability ,Tree based ,Artificial intelligence ,Class membership ,business ,Classifier (UML) ,computer ,Spectroscopy ,Software ,Mathematics - Abstract
One of the primary goals of applied proteomics is the development of new computational methods for modeling the properties of proteins from the primary structure. In this work, we used the concept of semi-supervised learning, which is a relatively new machine learning philosophy that combines labeled and unlabeled instances simultaneously, to perform classification of protein mutants according to their physical properties. Unlike more traditional methods, it does not demand the specification of the class labels of every sample. This is particularly useful when many exemplars are available but the actual class membership is available for only a marginal subset. In spite of its desirable properties, semi-supervised learning has seldom been applied in molecular biology. In recent years, a novel algorithm capable of performing semi-supervised learning has been proposed. This algorithm, namely the TTOSOM, is a tree-based neural network inspired by the well-known Self-Organizing Maps. In this paper, we use the TTOSOM to predict the stability of human lysozyme mutants. Since lysozyme plays a central role in the immune system, prediction of its structural stability is of primary importance for molecular biology. Our experimental results show that it is possible to predict the stability with accuracy above 64%, outperforming two well-known classifiers. This prediction is based only on historical data, i.e., without the need for expensive chemical substances and human resources.
- Published
- 2017
11. The multiple team formation problem using sociometry
- Author
-
César A. Astudillo, Daniel Mora-Melia, Jimmy H. Gutiérrez, Alfredo Candia-Véjar, and Pablo Ballesteros-Pérez
- Subjects
Mathematical optimization ,021103 operations research ,General Computer Science ,Heuristic (computer science) ,Multiple team formation problem ,0211 other engineering and technologies ,ComputingMilieux_LEGALASPECTSOFCOMPUTING ,02 engineering and technology ,Management Science and Operations Research ,Solver ,Sociometry ,ComputingMilieux_GENERAL ,Variable (computer science) ,Modelling and Simulation ,Modeling and Simulation ,0202 electrical engineering, electronic engineering, information engineering ,Constraint programming ,Heuristics ,020201 artificial intelligence & image processing ,Metaheuristic ,Local search (constraint satisfaction) ,Computer Science(all) ,Mathematics ,Integer (computer science) - Abstract
The Team Formation Problem (TFP) has become a well-known problem in the OR literature over the last few years. In this problem, the allocation of multiple individuals that match a required set of skills as a group must be chosen to maximise one or several positive social attributes. Specifically, the aim of the current research is two-fold. First, two new dimensions of the TFP are added by considering multiple projects and fractions of people's dedication. This new problem is named the Multiple Team Formation Problem (MTFP). Second, an optimisation model consisting of a quadratic objective function, linear constraints and integer variables is proposed for the problem. The optimisation model is solved by three algorithms: a Constraint Programming approach provided by a commercial solver, a Local Search heuristic and a Variable Neighbourhood Search metaheuristic. These three algorithms constitute the first attempt to solve the MTFP, with the Variable Neighbourhood Search metaheuristic being the most efficient in almost all cases. Applications of this problem commonly appear in real-life situations, particularly with the current and ongoing development of social network analysis. Therefore, this work opens multiple paths for future research. Highlights: Optimisation of human resource allocation in multiple simultaneous projects. Time-fraction allocations are now allowed. Comparison of CP, LS and VNS algorithm performance. Proposal of multiple options for future research.
- Published
- 2016
12. Performance Assessment of Classification Methods for the Inductance within a VSI
- Author
-
Colin Bellinger, Yamisleydi Salgueiro, Diego Aldana, César A. Astudillo, and Marco Rivera
- Subjects
business.industry ,Computer science ,Dimensionality reduction ,Linear discriminant analysis ,Machine learning ,computer.software_genre ,Random forest ,Reduction (complexity) ,Support vector machine ,Model predictive control ,Naive Bayes classifier ,Artificial intelligence ,business ,computer ,Curse of dimensionality - Abstract
The non-intrusive monitoring of electrical systems has gained relevance in recent years due to its lower costs and space requirements. Machine learning techniques have proved their ability to predict the parameters under monitoring and consequently improve the performance of power electronics systems. The present work seeks to determine the combination of machine learning techniques and dimensionality reduction that efficiently predicts the inductance value for a Voltage Source Inverter’s Modulated Model Predictive Control (VSI_M2PC). The problem, modeled as a classification one with three classes, has a high dimensionality (5000 attributes). Consequently, its reduction is needed to make it tractable at the cost of slightly sacrificing the accuracy of the model. Seven machine learning methods (Support Vector Machine, K-Nearest Neighbours, Naive Bayes, Linear Discriminant Analysis, Classification and Regression Trees, C4.5 and Random Forest) and three dimensionality reduction strategies (Correlation Elimination, Principal Component Analysis, and Boruta) were experimentally studied on VSI_M2PC Matlab simulations. It was found that Random Forest combined with Boruta provided the best results regarding classification efficiency.
- Published
- 2018
13. Evaluating different families of prediction methods for estimating software project outcomes
- Author
-
Narciso Cerpa, Matthew Bardeen, June M. Verner, and César A. Astudillo
- Subjects
Estimation ,Computer science ,business.industry ,Process (engineering) ,Dimensionality reduction ,020207 software engineering ,02 engineering and technology ,Machine learning ,computer.software_genre ,Outcome (game theory) ,Random forest ,Software ,Hardware and Architecture ,020204 information systems ,Benchmark (surveying) ,0202 electrical engineering, electronic engineering, information engineering ,Survey data collection ,Artificial intelligence ,Data mining ,business ,computer ,Information Systems - Abstract
We compare classifiers using AUC when predicting software project outcome. Attribute selection using Information Gain improves our classifiers' performance. Statistical and ensemble classifiers are robust for predicting project outcome. Random Forest is the most appropriate technique for determining project outcome. Best prediction is achieved with team dynamics, process, and estimation attributes. Software has been developed since the 1960s but the success rate of development projects is still low. Classification models have been used to predict defects and effort estimation, but little work has been done to predict the outcome of these projects. Previous research shows that it is possible to predict outcome using classifiers based on key variables during development, but it is not clear which techniques provide more accurate predictions. We benchmark classifiers from different families to determine the outcome of a software project and identify variables that influence it. A survey-based empirical investigation was used to examine variables contributing to project outcome. Classification models were built and tested to identify the best classifiers for this data by comparing their AUC values. We reduce the dimensionality of the data with Information Gain and build models with the same techniques. We use Information Gain and classification techniques to identify key attributes and their relative importance. We find that four classification techniques provide good results for survey data, regardless of dimensionality reduction. We conclude that Random Forest is the most appropriate technique for predicting project outcome. We identified key attributes which are related to communication, estimation, and process review.
- Published
- 2016
14. Evolution of physical-chemical parameters, microbial diversity and VOC emissions of olive oil mill waste exposed to ambient conditions in open reservoirs
- Author
-
Claudio Tenreiro, César A. Astudillo, E. Fernández-Palacios, D. Hernández, Fernando Cataldo, and David Gabriel
- Subjects
0301 basic medicine ,chemistry.chemical_classification ,Volatile Organic Compounds ,030109 nutrition & dietetics ,Moisture ,Chemistry ,Fatty acid ,010501 environmental sciences ,01 natural sciences ,Decomposition ,Gas Chromatography-Mass Spectrometry ,03 medical and health sciences ,chemistry.chemical_compound ,Phenols ,Environmental chemistry ,Physical chemical ,Odorants ,Olive oil extraction ,Relative humidity ,Waste Management and Disposal ,Olive Oil ,0105 earth and related environmental sciences ,Olive oil - Abstract
In the olive oil extraction process, 20% olive oil is obtained. About 80% corresponds to waste, mainly alperujo and orujo. When these residues are stored in open reservoirs for later stabilization or potential reuse, odorous Volatile Organic Compounds (VOCs) are generated as products of waste decomposition. In this work, these emissions were studied by means of TD-GC/MS in relation to the changes in the physical-chemical (ashes, moisture, total phenols, pH, proteins, fibers, oils, fats) and biological parameters (bacterial and fungal diversity in Illumina platform) of waste for 6 months. The dynamics of these parameters were statistically related to the evolution of environmental variables (temperature, relative humidity, precipitation) and their effects on the most relevant physical-chemical parameters in order to evaluate their incidence in odorant VOCs emissions over time. The results showed a progressive increase in the diversity of both fungi and bacteria that were related, mainly, to a progressive decrease in the concentration of fatty acid methyl esters and the concentration of alkenes in the emissions; and to an increase of odorous compounds, mainly aldehydes, ketones and carboxylic acids, which were responsible for the unpleasant odors of waste. No significant differences were observed between the evolution of orujo characteristics compared to those of alperujo.
- Published
- 2018
15. A Novel Storage Space Allocation Policy for Import Containers
- Author
-
Myriam Gaete, César A. Astudillo, Rosa G. González-Ramírez, and Marcela C. González-Araya
- Subjects
Decision support system ,Operations research ,business.industry ,Computer science ,02 engineering and technology ,Port (computer networking) ,Automation ,Reduction (complexity) ,Yard ,Dwell time ,Statistical classification ,020204 information systems ,Container (abstract data type) ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,business - Abstract
In developing countries, such as those in Latin America, inland flows of container terminals present high levels of uncertainty and variability. This situation occurs mainly due to lack of automation procedures, affecting coordination with the hinterlands. In this article, a methodology based on a dwell time segregated container storage policy is proposed. This methodology considers only import containers, due to the difficulty to determine a segregation criterion, which motivated us to use container dwell time information. Dwell times are discretized to determine classes, so that containers of the same class are assigned to close locations at the yard. The architecture of a decision support system to aid the stacking decisions based on this storage policy is proposed. The port of Arica in Chile is considered as a case study, and a discrete-event simulation model is also proposed to estimate potential benefits of this approach. Numerical results for the case study show a good performance, with potential reduction of the rehandles incurred when containers are retrieved from the yard.
- Published
- 2018
16. Assessing Wheat Traits by Spectral Reflectance: Do We Really Need to Focus on Predicted Trait-Values or Directly Identify the Elite Genotypes Group?
- Author
-
César A. Astudillo, Alejandro del Pozo, Sebastián Romero-Bravo, Gustavo A. Lobos, Iván Matus, Alejandro Escobar, Miguel Garriga, and Félix Estrada
- Subjects
0106 biological sciences ,phenotyping ,reflectance ,Plant Science ,Biology ,lcsh:Plant culture ,01 natural sciences ,Genotype ,Statistics ,lcsh:SB1-1110 ,Leaf area index ,Categorical variable ,Original Research ,business.industry ,Water stress ,food and beverages ,Regression analysis ,04 agricultural and veterinary sciences ,Reflectivity ,Regression ,carbon isotope discrimination ,Biotechnology ,040103 agronomy & agriculture ,Trait ,0401 agriculture, forestry, and fisheries ,high-throughput phenotyping ,phenomic ,business ,010606 plant biology & botany - Abstract
Phenotyping, via remote and proximal sensing techniques, of the agronomic and physiological traits associated with yield potential and drought adaptation could contribute to improvements in breeding programs. In the present study, 384 genotypes of wheat (Triticum aestivum L.) were tested under fully irrigated (FI) and water stress (WS) conditions. The following traits were evaluated and assessed via spectral reflectance: Grain yield (GY), spikes per square meter (SM2), kernels per spike (KPS), thousand-kernel weight (TKW), chlorophyll content (SPAD), stem water soluble carbohydrate concentration and content (WSC and WSCC, respectively), carbon isotope discrimination (Δ13C), and leaf area index (LAI). The performances of spectral reflectance indices (SRIs), four regression algorithms (PCR, PLSR, ridge regression RR, and SVR), and three classification methods (PCA-LDA, PLS-DA, and kNN) were evaluated for the prediction of each trait. For the classification approaches, two classes were established for each trait: The lower 80% of the trait variability range (Class 1) and the remaining 20% (Class 2 or elite genotypes). Both the SRIs and regression methods performed better when data from FI and WS were combined. The traits that were best estimated by SRIs and regression methods were GY and Δ13C. For most traits and conditions, the estimations provided by RR and SVR were the same as, or better than, those provided by the SRIs. PLS-DA showed the best performance among the categorical methods and, unlike the SRI and regression models, most traits were relatively well-classified within a specific hydric condition (FI or WS), proving that the classification approach is an effective tool to be explored in future studies related to genotype selection.
- Published
- 2017
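The two-class labelling described in the abstract above (lower 80% of the trait variability range as Class 1, remaining 20% as Class 2) can be sketched as follows. This is an illustrative reading, not the authors' code: it assumes the elite class sits at the upper end of the range, that a value exactly on the cut-off falls in Class 1, and it uses made-up grain-yield values. The function names are hypothetical.

```python
def elite_threshold(values, elite_fraction=0.2):
    """Cut-off that separates the top `elite_fraction` of the observed range."""
    lowest, highest = min(values), max(values)
    return highest - elite_fraction * (highest - lowest)

def label_genotypes(values, elite_fraction=0.2):
    """Return 2 (elite) for values above the cut-off and 1 otherwise."""
    cut = elite_threshold(values, elite_fraction)
    return [2 if value > cut else 1 for value in values]

# Hypothetical grain-yield measurements, one per genotype.
grain_yield = [2.0, 3.5, 5.0, 6.5, 7.0]
cut = elite_threshold(grain_yield)
labels = label_genotypes(grain_yield)
```

Note that the split is on the range of the trait, not on the count of genotypes, so the elite class does not necessarily contain 20% of the genotypes.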
17. Data for resistance and inductance estimation within a voltage source inverter
- Author
-
Colin Bellinger, Diego Aldana, César A. Astudillo, Yamisleydi Salgueiro, and Marco Rivera
- Subjects
Voltage source inverter data ,0303 health sciences ,Multidisciplinary ,Nonintrusive monitoring ,Computer science ,Photovoltaic system ,Converters ,lcsh:Computer applications to medicine. Medical informatics ,Power (physics) ,Inductance ,03 medical and health sciences ,Motor drive ,Model predictive control ,Engineering ,0302 clinical medicine ,Control theory ,lcsh:R858-859.7 ,lcsh:Science (General) ,High dimensional machine learning ,Row ,030217 neurology & neurosurgery ,Energy (signal processing) ,lcsh:Q1-390 ,030304 developmental biology - Abstract
Power converters are essential for the use of renewable energy resources. For example, a photovoltaic system produces DC energy that is transformed into AC by the voltage source inverter (VSI). This power is used by a motor drive that operates at different speeds, generating variable loads. Two parameters, namely resistance and inductance, are essential to correctly adjust the model predictive control (MPC) in a VSI. In this paper, we describe the data from a VSI that incorporates an MPC. We generate four datasets, each consisting of 399 cases or instances (rows). Two datasets comprise the simulations varying the inductance (continuous and discrete versions) and the other two the simulations varying the resistance (continuous and discrete versions). The motivation behind this data is to support the design and development of nonintrusive models to predict the resistance and inductance of a VSI under different conditions. Keywords: Voltage source inverter data, Nonintrusive monitoring, High dimensional machine learning
- Published
- 2019
18. On achieving semi-supervised pattern recognition by utilizing tree-based SOMs
- Author
-
César A. Astudillo and B. John Oommen
- Subjects
Artificial neural network ,business.industry ,Small number ,Supervised learning ,Pattern recognition ,Semi-supervised learning ,Machine learning ,computer.software_genre ,ComputingMethodologies_PATTERNRECOGNITION ,Artificial Intelligence ,Signal Processing ,One-class classification ,Computer Vision and Pattern Recognition ,Artificial intelligence ,Voronoi diagram ,business ,computer ,Classifier (UML) ,Software ,Mathematics ,Test data - Abstract
Years of research in the field of Pattern Recognition (PR) have led to scores of algorithms which can achieve supervised pattern classification. Such algorithms assume the knowledge of well-defined training sets with a clear specification of the identity of all the training samples. However, more recently, a new stream has emerged, namely, the so-called "semi-supervised" paradigm, i.e., one that uses a combination of labeled and unlabeled samples to perform classification [41]. Classifiers based on the latter do not demand the specification of the class labels of every sample. Rather, a clustering-like mechanism processes the manifold, and attempts to distinguish the training samples into the separate classes, subsequent to which a supervised classifier is derived using a small subset of the training samples whose class identities are known. In this paper we will venture to utilize the Tree-based Topology Oriented SOM (TTOSOM) [3] for semi-supervised pattern classification. We first train a TTOSOM in which the neurons collectively obey the stochastic, topological and structural distribution of all the classes. Subsequently, we make use of the information provided in the labeled dataset. By using this information, we assign a class label to every single node in the Neural Network (NN), which, in turn, partitions the space into its Voronoi regions. On receiving the testing data, the task at hand is rather straightforward. One merely determines the closest neuron to the testing sample and assigns the sample to the corresponding class. The complexity of the testing is linear, not in the cardinality of the training set, but rather in the size of the TTOSOM tree! Our experimental results show that on average, the classification capabilities of our proposed strategy, even with a small number of neurons, are reasonably comparable to those obtained by some of the state-of-the-art classification schemes that only use labeled instances during the training phase. 
The experiments also show that improved levels of accuracy can be obtained by imposing trees with a larger number of nodes.
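The testing step described in this abstract (find the closest neuron, return its class) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the neuron positions, the labels, and the function name `classify` are hypothetical, and Euclidean distance is assumed.

```python
import math

def classify(sample, neurons, labels):
    """Return the label of the neuron closest to `sample`.

    neurons: list of weight vectors (hypothetical, assumed already trained)
    labels:  class label assigned to each neuron using the labeled subset
    """
    # One distance computation per neuron, so the cost grows with the
    # number of neurons and not with the size of the training set.
    best = min(range(len(neurons)), key=lambda i: math.dist(sample, neurons[i]))
    return labels[best]

# Hypothetical three-neuron map in two dimensions.
neurons = [(0.0, 0.0), (1.0, 1.0), (4.0, 4.0)]
labels = ["a", "a", "b"]
print(classify((3.5, 3.0), neurons, labels))
```

Each labeled neuron implicitly owns the Voronoi region of points that are closer to it than to any other neuron, which is why a single nearest-neuron lookup is enough to classify a sample.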
- Published
- 2013
19. Concept Drift Detection Using Online Histogram-Based Bayesian Classifiers
- Author
-
B. John Oommen, César A. Astudillo, Javier I. González, and Anis Yazidi
- Subjects
Concept drift ,Computer science ,business.industry ,Bayesian probability ,Pattern recognition ,02 engineering and technology ,computer.software_genre ,Information theory ,Naive Bayes classifier ,ComputingMethodologies_PATTERNRECOGNITION ,020204 information systems ,Histogram ,0202 electrical engineering, electronic engineering, information engineering ,sort ,020201 artificial intelligence & image processing ,Data mining ,Artificial intelligence ,business ,computer ,Classifier (UML) ,Statistical classifier - Abstract
In this paper, we present a novel algorithm that performs online histogram-based classification, i.e., one specifically designed for the case when the data is dynamic and its distribution is non-stationary. Our method, called the Online Histogram-based Naïve Bayes Classifier (OHNBC), involves a statistical classifier based on the well-established Bayesian theory, but which makes some assumptions with respect to the independence of the attributes. Moreover, this classifier generates a prediction model using uni-dimensional histograms, whose segments or buckets are fixed in terms of their cardinalities but dynamic in terms of their widths. Additionally, our algorithm invokes the principles of information theory to automatically identify changes in the performance of the classifier, and consequently, forces the reconstruction of the classification model in run-time as and when it is needed. These properties have been confirmed experimentally over numerous data sets from different domains. (In the interest of space and brevity, we present here only a subset of the available results. More detailed results are found in [2].) As far as we know, our histogram-based Naïve Bayes classification paradigm for time-varying datasets is both novel and of a pioneering sort.
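The idea of buckets with fixed cardinality but variable width can be illustrated with a small batch example. This is only a sketch of equal-frequency bucketing under simplifying assumptions: it is not the OHNBC, it has no online update or drift detection, and the function names `equal_frequency_edges` and `bucket_index` are hypothetical.

```python
import bisect

def equal_frequency_edges(values, n_buckets):
    """Bucket boundaries chosen so each bucket holds about the same count."""
    ordered = sorted(values)
    n = len(ordered)
    # Boundary k is the value that starts bucket k.
    return [ordered[(k * n) // n_buckets] for k in range(1, n_buckets)]

def bucket_index(x, edges):
    """Index of the bucket that contains x."""
    return bisect.bisect_right(edges, x)

# Skewed sample: the buckets end up with very different widths.
values = [1, 2, 3, 4, 10, 20, 30, 100]
edges = equal_frequency_edges(values, 4)
print(edges)
print([bucket_index(v, edges) for v in values])
```

With this sample each of the four buckets receives two values, while the bucket widths range from a couple of units to several tens of units.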
- Published
- 2016
20. A Cluster Analysis of Stock Market Data Using Hierarchical SOMs
- Author
-
César A. Astudillo, Jorge Poblete, Marina Resta, and B. John Oommen
- Subjects
Artificial neural network ,Computer science ,Mathematical properties ,020206 networking & telecommunications ,02 engineering and technology ,computer.software_genre ,Original data ,0202 electrical engineering, electronic engineering, information engineering ,Cluster (physics) ,020201 artificial intelligence & image processing ,Stock market ,Data mining ,Cluster analysis ,computer ,Stock (geology) - Abstract
The analysis of stock markets has become relevant mainly because of its financial implications. In this paper, we propose a novel methodology for performing a structured cluster analysis of stock market data. Our proposed method uses a tree-based neural network called the TTOSOM. The TTOSOM performs self-organization to construct tree-based clusters of vector data in the multi-dimensional space. The resultant tree possesses interesting mathematical properties such as a succinct representation of the original data distribution, and a preservation of the underlying topology. In order to demonstrate the capabilities of our method, we analyze 206 assets of the Italian stock market. We were able to establish topological relationships between various companies traded on the Italian stock market and visually inspect the resultant taxonomy. The results that we obtained, briefly reported here (but more elaborately in [10]), were amazingly accurate and reflected the real-life relationships between the stocks.
- Published
- 2016
21. Imposing tree-based topologies onto self organizing maps
- Author
-
B. John Oommen and César A. Astudillo
- Subjects
Self-organizing map ,Information Systems and Management ,business.industry ,Computer science ,05 social sciences ,050301 education ,02 engineering and technology ,Network topology ,Information science ,Computer Science Applications ,Theoretical Computer Science ,Artificial Intelligence ,Control and Systems Engineering ,0202 electrical engineering, electronic engineering, information engineering ,020201 artificial intelligence & image processing ,Tree based ,Artificial intelligence ,business ,0503 education ,Software - Abstract
Accepted version of an article from the journal Information Sciences. Definitive published version available on Elsevier Science Direct: http://dx.doi.org/10.1016/j.ins.2011.04.038
- Published
- 2011
22. Pattern Recognition using the TTOCONROT
- Author
-
César A. Astudillo and B. John Oommen
- Subjects
business.industry ,Computer science ,Training phase ,Pattern recognition ,Classification scheme ,Neural network nn ,Artificial intelligence ,business ,Data structure ,Classifier (UML) - Abstract
We present a method that employs a tree-based Neural Network (NN) for performing classification. The novel mechanism, apart from incorporating the information provided by unlabeled and labeled instances, re-arranges the nodes of the tree as per the laws of Adaptive Data Structures (ADSs). Particularly, we investigate the Pattern Recognition (PR) capabilities of the Tree-Based Topology-Oriented SOM (TTOSOM) when Conditional Rotations (CONROT) [8] are incorporated into the learning scheme. The learning methodology inherits all the properties of the TTOSOM-based classifier designed in [4]. However, we now augment it with the property that frequently accessed nodes are moved closer to the root of the tree. Our experimental results show that on average, the classification capabilities of our proposed strategy are reasonably comparable to those obtained by some of the state-of-the-art classification schemes that only use labeled instances during the training phase. The experiments also show that improved levels of accuracy can be obtained by imposing trees with a larger number of nodes.
- Published
- 2015
23. Fast BMU Search in SOMs Using Random Hyperplane Trees
- Author
-
César A. Astudillo and B. John Oommen
- Subjects
Self-organizing map ,Tree (data structure) ,Data point ,Hyperplane ,Logarithm ,Artificial neural network ,Cluster analysis ,Focus (optics) ,Algorithm ,Mathematics - Abstract
One of the most prominent Neural Networks (NNs) reported in the literature is Kohonen's Self-Organizing Map (SOM). In spite of all its desirable capabilities and the scores of reported applications, it, unfortunately, possesses some fundamental drawbacks. Two of these handicaps are the quality of the map learned and the time required to train it. The most demanding phase of the algorithm involves determining the so-called Best Matching Unit (BMU), which requires time that is proportional to the number of neurons in the NN. The focus of this paper is to reduce the time needed for this tedious task, and to attempt to obtain an approximation of the BMU in as little as logarithmic time. To achieve this, we depend heavily on the work of [3,6], where the authors focused on how to accurately learn the data distribution connecting the neurons on a self-organizing tree, and how the learning algorithm, called the Tree-based Topology-Oriented SOM (TTOSOM), can be useful for data clustering [3,6] and classification [5]. We briefly state how we intend to reduce the training time by identifying the BMU efficiently. First, we show how a novel hyperplane-based partitioning scheme can be used to accelerate the task. Unlike the existing hyperplane-based partitioning methods reported in the literature, our algorithm can avoid ill-conditioned scenarios. It is also capable of considering data points that are dynamic. We demonstrate how these hyperplanes can be recursively defined, represented and computed, so as to recursively divide the hyper-space into two halves. As far as we know, the use of random hyperplanes to identify the BMU is both pioneering and novel.
- Published
- 2014
24. Metric Space Searching Based on Random Bisectors and Binary Fingerprints
- Author
-
José María Andrade, César A. Astudillo, and Rodrigo Paredes
- Subjects
Metric space ,Index (economics) ,Computer science ,Scheme (mathematics) ,Nearest neighbor search ,Binary number ,Algorithm - Abstract
We present a novel index for approximate searching in metric spaces based on random bisectors and binary fingerprints. The aim is to deal with scenarios where the main memory available is small. The method was tested on synthetic and real-world metric spaces. Our results show that our scheme outperforms the standard permutant-based index in scenarios where memory is scarce.
- Published
- 2014
25. PRICAI 2014: Trends in Artificial Intelligence
- Author
-
Swakkhar Shatabda, Deborah Richards, Ilya Sinayskiy, W. Kleijn, Ira Puspitasari, Zhendong Niu, Anthony Truskinger, Naoki Fukuta, Abdul Sattar, Salma Jamoussi, Sanparith Marukatat, Ann Nicholson, César A. Astudillo, Mehul Bhatt, M A Hakim Newton, Hamed Hassanzadeh, Joao Leite, Takayuki Ito, Alan Wee-Chung Liew, Donghui Lin, Abhaya Nayak, Doan Nguyen, Jiamou Liu, Paul Compton, Matthias Knorr, Endong Tong, Michael Towsey, Pavel Surynek, Lei Pan, Fenghui Ren, Jakob Suchan, Sanjiang Li, Minjie Zhang, Toru Ishida, Gang Li, Muhammad Tahajjudi Ghifary, Ziheng Wei, Yuki Yamagishi, Mahmood Rashid, Ingrid Zukerman, Jane Hunter, Quan Bai, Zahid Islam, Federico Cerutti, Richi Nayak, Ricardo Gonçalves, Erwin Oh, B. John Oommen, Francesco Petruccione, and Alexander Ferrein
- Subjects
Transitive relation ,Computer science ,business.industry ,Bilingual dictionary ,Constraint satisfaction ,computer.software_genre ,Pivot language ,Semantic similarity ,Complete information ,Artificial intelligence ,Polysemy ,business ,computer ,Word (computer architecture) ,Natural language processing - Abstract
High-quality bilingual dictionaries are rarely available for lower-density language pairs, especially for those that are closely related. Using a third language as a pivot to link two other languages is a well-known solution, and usually requires only two input bilingual dictionaries to automatically induce the new one. This approach, however, produces many incorrect translation pairs because the dictionary entries are normally not transitive due to polysemy and the ambiguous words in the pivot language. Utilizing the complete structures of the input bilingual dictionaries positively influences the result since dropped meanings can be countered. Moreover, an additional input dictionary may provide more complete information for calculating the semantic distance between word senses, which is key to suppressing wrong sense matches. This paper proposes an extended constraint optimization model for inducing new dictionaries of closely related languages from multiple input dictionaries, and its formalization based on Integer Linear Programming. Evaluations indicated that the proposal not only outperforms the baseline method, but also shows improvements in performance and scalability as more dictionaries are utilized.
- Published
- 2014
26. Topology-oriented self-organizing maps: a survey
- Author
-
B. John Oommen and César A. Astudillo
- Subjects
Self-organizing map ,Artificial neural network ,Computer science ,business.industry ,Asset (computer security) ,Machine learning ,computer.software_genre ,Data science ,Resource (project management) ,Artificial Intelligence ,Pattern recognition (psychology) ,Computer Vision and Pattern Recognition ,Tree based ,Artificial intelligence ,Architecture ,business ,computer ,Topology (chemistry) - Abstract
The self-organizing map (SOM) is a prominent neural network model that has found wide application in a spectrum of domains. Accordingly, it has received widespread attention both from the communities of researchers and practitioners. As a result, several variations of the basic architecture have been devised, specifically in the early years of the SOM's evolution, which were introduced so as to address various architectural shortcomings or to explore other structures of the basic model. The overall goal of this survey is to present a comprehensive comparison of these networks, in terms of their primitive components and properties. We dichotomize these schemes as being either tree based or non-tree based. We have embarked on this venture with the hope that since the survey is comprehensive and the bibliography extensive, it will be an asset and resource for future researchers.
- Published
- 2014
27. Using semantic similarity to predict angle and distance of objects in images
- Author
-
Jim Davies, Sterling Somers, Jonathan Gagné, and César A. Astudillo
- Subjects
Computer science ,business.industry ,media_common.quotation_subject ,Analogy ,Creativity ,computer.software_genre ,Visualization ,Presentation ,Semantic similarity ,Artificial intelligence ,business ,computer ,Natural language processing ,media_common - Abstract
A presentation of an Artificial Intelligence (AI) called Visuo that stores and guesses quantitative visual-spatial magnitudes (e.g., sizes of objects). In this analysis, Visuo is used to store polar (angle and distance) relationships between objects in images. It uses a database of tagged images as its memory and approximates unexperienced magnitudes by analogy with semantically related concepts. The analysis shows that transferring information from highly semantically related concepts yields significantly higher accuracy in angle and distance estimations than using items of medium or low semantic similarity.
- Published
- 2011
28. Editorial: Data Mining in Electronic Commerce - Support vs. Confidence
- Author
-
César A. Astudillo, Matthew Bardeen, and Narciso Cerpa
- Subjects
Business information ,Association rule learning ,Data stream mining ,Computer science ,business.industry ,Customer relationship management ,computer.software_genre ,General Business, Management and Accounting ,Computer Science Applications ,World Wide Web ,Web mining ,Business intelligence ,Data mining ,Web service ,business ,Ubiquitous commerce ,computer - Abstract
in the data mining technique, association rules. This article was presented at a conference and never formally published [22]. In the last four years this article has been downloaded nearly twenty thousand times from an open access repository. This interest by researchers and practitioners has motivated us to write this technical editorial. The structure of this editorial will be as follows. In this section we briefly introduce data mining and electronic commerce. In the following section we describe different data mining techniques. In the final section we discuss the effect of support versus confidence in the association rules technique applied to electronic commerce. The data mining process involves searching, selecting, exploring, and modeling large amounts of data to uncover previously unknown patterns that are potentially useful, and ultimately comprehensible information, from large databases. Its goal is to manipulate data into knowledge ([15], [18]-[19], [31], [33]). Pattern extraction is an important process of any data mining technique and it refers to the relationships between subsets of data. Data mining uses different families of computational, statistical and machine learning methods that include statistical analysis, decision trees, neural networks, rule induction and refinement, and graphic visualization among others, to exhaustively explore data to reveal complex relationships that may exist. Although machine learning techniques have been available for a long time, the development of advanced and user-friendly tools for business intelligence [25] has made data mining more attractive and practical for organizations. When these pattern extraction techniques are used correctly, they can be effective tools for extracting useful information from data [35]. The recent wide use of data mining has been due to several factors. The most obvious of these is the large amounts of data that organizations collect during operational transactions.
In the early 90s, credit and insurance companies began using data mining as a means of detecting fraud [28]. Most organizations, irrespective of the industry type, have some form of operational process in which they collect large amounts of data. For example, the retail industry has been using data mining techniques for years to predict what their customers are likely to purchase. The electronic commerce industry was one of the latest to use data mining technology [18]. Electronic commerce is the use of information and communication technologies through the Internet platform to share business information, maintain business relationships, and conduct business transactions. In electronic commerce, different data mining techniques can be used for many purposes. For example, in sales promotion the marketing staff may want to find out which products their customers are more likely to buy together. This information will allow them to place these items in a sales bundle in order to increase revenue ([2], [31]). The use of Web log data makes it possible to understand users' behavior. This data contains information about users' access and may show potential patterns in their behavior, and identify potential customers of electronic commerce. This knowledge is useful to: change marketing strategies; identify segmentation of customers; improve customer retention; predict customer expenditure and market trends; provide personalized services to customers; analyze shopping carts; forecast sales; redesign the website to provide a better service; and/or make better business decisions. This area of data mining has given rise to Web mining, a technique that can be subdivided into Web content mining; Web structure mining; and Web usage mining ([7], [24]). These techniques are also used to extract useful information from Web documents or Web services [5] and are widely used in a variety of applications.
As described above, data mining, and specifically Web data mining technology, plays an important role in electronic commerce. In recent years, with the rapid growth of electronic commerce and the large amounts of data collected through operational transactions, data mining techniques are becoming more useful to discover and understand unknown customer patterns. In the following paragraphs we briefly describe some examples of the application of data mining in electronic commerce. Clustering or grouping electronic commerce customers with similar browsing behaviors permits the identification of their common characteristics, providing a better understanding of customers with the aim of giving them a more appropriate and personalized service. When a vendor knows the customer's needs and interests, they can work on providing a better service and maintaining the customer's relationship with the vendor.
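The two measures named in the title of this editorial can be made concrete with a small worked example. The sketch below uses the standard definitions (support of an itemset is the fraction of transactions containing it; confidence of a rule A → B is support(A ∪ B) / support(A)). The baskets and the function names are hypothetical and are not taken from the editorial.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)

def confidence(transactions, antecedent, consequent):
    """Estimated probability of `consequent` given `antecedent`.

    Assumes the antecedent occurs in at least one transaction.
    """
    both = set(antecedent) | set(consequent)
    return support(transactions, both) / support(transactions, antecedent)

# Hypothetical shopping baskets.
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
print(support(baskets, {"bread", "milk"}))       # share of baskets with both items
print(confidence(baskets, {"bread"}, {"milk"}))  # how often bread buyers also buy milk
```

A rule can have high confidence while resting on very few transactions, which is why the two measures are usually read together when choosing items for a sales bundle.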
- Published
- 2014
29. On Using Adaptive Binary Search Trees to Enhance Self Organizing Maps
- Author
-
B. John Oommen and César A. Astudillo
- Subjects
Red–black tree ,Fractal tree index ,Tree rotation ,K-ary tree ,Binary tree ,Theoretical computer science ,Computer science ,Optimal binary search tree ,Interval tree ,Cartesian tree ,Search tree ,Random binary tree ,Treap ,k-d tree ,Tree traversal ,Binary search tree ,Ternary search tree ,Binary expression tree ,Algorithm ,Self-balancing binary search tree ,Order statistic tree - Abstract
We present a strategy by which a Self-Organizing Map (SOM) with an underlying Binary Search Tree (BST) structure can be adaptively re-structured using conditional rotations. These rotations on the nodes of the tree are local and are performed in constant time, guaranteeing a decrease in the Weighted Path Length (WPL) of the entire tree. As a result, the algorithm, referred to as the Tree-based Topology-Oriented SOM with Conditional Rotations (TTO-CONROT), converges in such a manner that the neurons are ultimately placed in the input space so as to represent its stochastic distribution, and additionally, the neighborhood properties of the neurons suit the best BST that represents the data.
- Published
- 2009
30. A Novel Self Organizing Map Which Utilizes Imposed Tree-Based Topologies
- Author
-
César A. Astudillo and John B. Oommen
- Subjects
Set (abstract data type) ,Self-organizing map ,Tree (data structure) ,Theoretical computer science ,Computer science ,Perspective (graphical) ,ComputingMethodologies_IMAGEPROCESSINGANDCOMPUTERVISION ,Topology (electrical circuits) ,Representation (mathematics) ,Network topology ,Image (mathematics) - Abstract
In this paper we propose a strategy, the Tree-based Topology-Oriented SOM (TTO-SOM), by which we can impose an arbitrary, user-defined, tree-like topology onto the codebooks. Such an imposition enforces a neighborhood phenomenon which is based on the user-defined tree, and consequently renders the so-called bubble of activity to be drastically different from the ones defined in the prior literature. The map learnt as a consequence of training with the TTO-SOM is able to infer both the distribution of the data and its structured topology interpreted via the perspective of the user-defined tree. The TTO-SOM also reveals multi-resolution capabilities, which are helpful for representing the original data set with different numbers of points, without the necessity of recomputing the whole tree. The ability to extract a skeleton, which is a “stick-like” representation of the image in a lower dimensional space, is discussed as well. These properties have been confirmed by our experimental results on a variety of data sets.
- Published
- 2009