29 results for "Asdrúbal López Chau"
Search Results
2. FRAGMENT: A Web Application for Database Fragmentation, Allocation and Replication over a Cloud Environment
- Author
-
Asdrúbal López-Chau, María Antonieta Abud-Figueroa, Lisbeth Rodríguez-Mazahua, Felipe Castro-Medina, and Giner Alor-Hernández
- Subjects
Distributed Computing Environment, General Computer Science, Database, Distributed database, Computer science, Relational database, Fragmentation (computing), Cloud computing, Metadata, Centralized database, Web application, Electrical and Electronic Engineering - Abstract
Fragmentation, allocation, and replication are techniques widely used in relational databases to improve the performance of operations and reduce their cost in distributed environments. This article presents an analysis of different methods for database fragmentation, allocation, and replication, together with a Web application called FRAGMENT that adopts the technique selected in the analysis stage. That technique was chosen because it combines fragmentation and replication, applies to a cloud environment, is easy to implement, focuses on improving the performance of the operations executed on the database, documents everything necessary for its implementation, and is based on a cost model. FRAGMENT analyzes the operations performed on any table of a database, proposes fragmentation schemes based on the most expensive attributes, and allocates and replicates a scheme chosen by the user in a distributed cloud environment. This work also addresses a common problem in fragmentation methods, overlapping fragments, and provides an algorithm to resolve it; the algorithm produces the predicates that define each fragment in the distributed environment. To validate the implemented technique, a second web application is presented, dedicated to simulating operations on sites and producing a log file for the main application. Experiments with the TPC-E benchmark showed lower response times for queries executed against the distributed database generated by FRAGMENT than against a centralized database.
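A minimal sketch of the cost-driven idea described above (not the authors' implementation): rank attributes by the accumulated cost of the operations that filter on them, take the predicates of the most expensive attribute as candidate fragments, and make them disjoint by negating earlier predicates. The log entries, attribute names, and costs are invented for illustration.

```python
from collections import defaultdict

# Hypothetical operation log: (attribute, predicate, cost)
log = [
    ("price", "price < 100", 40.0),
    ("price", "price >= 100", 35.0),
    ("qty",   "qty > 10",     12.0),
]

# Accumulate cost per attribute; the most expensive one drives the scheme.
cost_per_attribute = defaultdict(float)
for attribute, _, cost in log:
    cost_per_attribute[attribute] += cost
target = max(cost_per_attribute, key=cost_per_attribute.get)

# Derive disjoint fragment predicates: each fragment excludes all earlier
# predicates, a naive way to avoid overlapping fragments.
predicates = [p for a, p, _ in log if a == target]
fragments = []
for i, p in enumerate(predicates):
    negated = [f"NOT ({q})" for q in predicates[:i]]
    fragments.append(" AND ".join([p] + negated))
print(fragments)
```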
- Published
- 2020
- Full Text
- View/download PDF
3. A Brief Review of Vertical Fragmentation Methods Considering Multimedia Databases and Content-Based Queries
- Author
-
Felipe Castro-Medina, Lisbeth Rodríguez-Mazahua, Asdrúbal López-Chau, Celia Romero-Torres, Aldo Osmar Ortiz-Ballona, and María Antonieta Abud-Figueroa
- Subjects
Database, Multimedia, Distributed database, Relational database, Computer science, Multimedia database, Query optimization, Web application, Image retrieval - Abstract
Vertical fragmentation is a distributed-database design technique widely used in relational databases to reduce query execution costs. It has also been applied to multimedia databases to exploit its benefits in query optimization. Content-based retrieval is essential in multimedia databases to provide relevant data to the user. This paper presents a review of the literature on vertical fragmentation methods to determine whether they consider multimedia data and content-based queries, are easy to implement, are complete (i.e., the paper includes all the information necessary to implement the method), and provide a cost model for evaluating their results. To meet this objective, we analyzed and classified 37 papers on vertical fragmentation and/or content-based image retrieval. Finally, we selected the best method from our comparative analysis and present a web application architecture for applying vertical fragmentation in multimedia databases to optimize content-based queries.
- Published
- 2021
- Full Text
- View/download PDF
4. An Improvement to FRAGMENT: A Web Application for Database Fragmentation, Allocation, and Replication Over a Cloud Environment
- Author
-
Cuauhtémoc Sánchez-Ramírez, Felipe Castro-Medina, Ulises Juárez-Martínez, Lisbeth Rodríguez-Mazahua, Giner Alor-Hernández, and Asdrúbal López-Chau
- Subjects
Distributed Computing Environment, Database, Computer science, Cloud computing, Replication (computing), Benchmark (computing), Web application - Abstract
Fragmentation plays a very important role in databases because it yields an adequate design in distributed environments such as the cloud, producing benefits in the execution cost of read and write operations. This work presents an improvement to the overlap-resolution algorithm of the fragmentation method introduced in previous work, which also performs allocation and replication. The improvement consists of ordering the predicates by cost before resolving overlaps: predicates are processed in descending cost order, so that although fragments may still overlap, the fragments with a higher cost are kept more intact than those with a lower cost. Experiments with the TPC-E benchmark adapted to a distributed environment demonstrate that the enhanced approach achieves a significant reduction in query response time.
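A brief sketch of the ordering step, under the assumption (taken from the abstract) that predicates are processed in descending cost order so that higher-cost fragments remain intact; the predicates and costs are illustrative.

```python
# Hypothetical (predicate, cost) pairs from the workload analysis.
predicates = [("qty > 10", 12.0), ("price < 100", 40.0), ("price >= 50", 35.0)]

# Sort by descending cost: the most expensive predicate is kept whole,
# and each cheaper predicate absorbs the exclusions.
ordered = sorted(predicates, key=lambda pc: pc[1], reverse=True)
fragments = []
for i, (p, _) in enumerate(ordered):
    negated = [f"NOT ({q})" for q, _ in ordered[:i]]
    fragments.append(" AND ".join([p] + negated))
print(fragments)  # highest-cost fragment first, untouched
```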
- Published
- 2021
- Full Text
- View/download PDF
5. Análisis de la utilidad del Bastón Blanco Inteligente UAEM para personas con discapacidad visual
- Author
-
Valentín Trujillo Mora, Elvira Ivone González Jaimes, Asdrúbal López Chau, and Jorge Bautista López
- Subjects
Software portability, Alarm, White cane, Severe visual impairment, Human–computer interaction, Visually impaired, Computer science, Global Positioning System, Exploratory research, Statistical analysis - Abstract
The objective of this research is to test the usefulness of the UAEM Smart White Cane prototype, equipped with ultrasonic sensors (which trigger alarms and vibrations) and a GPS (Global Positioning System), for users with visual impairment. The technology included in the UAEM Smart White Cane gives visually impaired users several advantages that extend their mobility safely, ultimately improving their quality of life. However, its proper use requires specialized training that helps the user obtain the utility the prototype promises. To test these benefits, an exploratory study was carried out analyzing the training and usage experiences of 20 adult participants with severe visual impairment or blindness. The statistical analysis was descriptive and recorded user satisfaction with the physical characteristics and the benefits offered by the prototype. The results showed an association between the vibrations, the sounds, and the different messages emitted (various obstacles at short or long distances). The UAEM Smart White Cane with ultrasonic sensors and GPS is a prototype that supports safe, autonomous mobility and raises the user's quality of life: the device is light and foldable, its material is resistant, and its sound and vibration attachments simulate a physical map at low cost.
- Published
- 2021
- Full Text
- View/download PDF
6. Comparative Analysis of Decision Tree Algorithms for Data Warehouse Fragmentation
- Author
-
Nidia Rodríguez-Mazahua, Lisbeth Rodríguez-Mazahua, Asdrúbal López-Chau, Giner Alor-Hernández, and S. Gustavo Peláez-Camarena
- Subjects
Data set, C4.5 algorithm, Computer science, Star schema, Information gain ratio, Decision tree, Feature selection, Tuple, Algorithm, Data warehouse - Abstract
One of the main problems faced by Data Warehouse (DW) designers is fragmentation. Several studies have proposed data-mining-based horizontal fragmentation methods that focus on optimizing query response time and execution cost to make the DW more efficient. However, to the best of our knowledge, no horizontal fragmentation technique uses a decision tree to carry out the fragmentation. Given the importance of decision trees in classification, since they obtain pure partitions (subsets of tuples) of a data set using measures such as information gain, gain ratio, and the Gini index, the aim of this work is to use decision trees for DW fragmentation. This chapter presents an analysis of different decision tree algorithms to select the best one for implementing the fragmentation method. The analysis was performed with Weka 3.9.4, considering four evaluation metrics (precision, ROC area, recall, and F-measure) on data sets built from the SSB (Star Schema Benchmark). Several experiments were carried out using two attribute selection methods, Best First and Greedy Stepwise; the data sets were pre-processed with the Class Conditional Probabilities filter, and two data sets (24 and 50 queries) were analyzed with this filter to observe the behavior of the decision tree algorithms on each. From this analysis we determined that for the 24-query data set the best algorithm was RandomTree, since it won under two methods. For the 50-query data set, the best decision tree algorithms were LMT and RandomForest, which obtained the best performance under all methods tested. Finally, J48 was the algorithm selected when neither an attribute selection method nor the Class Conditional Probabilities filter was used; if only the latter is applied to the data set, the best performance is given by the LMT algorithm.
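The chapter's comparison was run in Weka; the following is a hedged re-creation of the same kind of evaluation in scikit-learn, using the four metrics named above on a synthetic data set (the SSB query data sets are not reproduced here).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a query data set.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Evaluate each tree algorithm with precision, recall, F-measure, ROC area.
scoring = ["precision", "recall", "f1", "roc_auc"]
for name, clf in [("CART", DecisionTreeClassifier(random_state=0)),
                  ("RandomForest", RandomForestClassifier(random_state=0))]:
    scores = cross_validate(clf, X, y, cv=5, scoring=scoring)
    print(name, {m: round(scores[f"test_{m}"].mean(), 3) for m in scoring})
```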
- Published
- 2021
- Full Text
- View/download PDF
7. Application of Dynamic Fragmentation Methods in Multimedia Databases: A Review
- Author
-
Jair Cervantes, Felipe Castro-Medina, Isaac Machorro-Cano, Asdrúbal López-Chau, Giner Alor-Hernández, and Lisbeth Rodríguez-Mazahua
- Subjects
Computer science, literature review, hybrid fragmentation, cost model, Database administrator, Review, dynamic fragmentation, multimedia fragmentation, vertical fragmentation, Multimedia, Database, horizontal fragmentation, Fragmentation (computing), Workflow - Abstract
Fragmentation is a design technique widely used in multimedia databases because it produces substantial benefits, reducing response times and lowering the execution cost of each operation performed. Multimedia databases contain data whose main characteristic is large size, so database administrators face an important challenge: they must account for the particular qualities of this non-trivial data. Moreover, these databases undergo changes in their access patterns over time. The fragmentation techniques presented in related studies show adequate workflows, but some do not contemplate changes in access patterns. This paper provides an in-depth review of the literature on dynamic fragmentation of multimedia databases to identify the main challenges, the technologies employed, the types of fragmentation used, and the characteristics of the cost models. The review provides valuable information for database administrators by showing the essential characteristics needed to perform proper fragmentation and to improve the performance of fragmentation schemes. Reducing the cost of fragmentation methods is one of the most desired properties, and to fulfill this objective the surveyed works include cost models covering different qualities. The analysis presents the set of characteristics used in the cost model of each work, to facilitate the creation of a new cost model that includes the most frequently used qualities. In addition, the different data sets or benchmarks used in the testing stage of each analyzed work are presented.
- Published
- 2020
8. Impression analysis of trending topics in Twitter with classification algorithms
- Author
-
David Valle-Cruz, Asdrúbal López-Chau, and Rodrigo Sandoval-Almazan
- Subjects
Information retrieval, Computer science, Sentiment analysis, Impression, Statistical classification, Categorization, Mass media - Abstract
A simple categorization of emotions, or even the use of universal expressions of emotion, is unsuitable for properly identifying the sentiment of posts in some situations. The main goal of this paper is to analyze impressions in Twitter messages about the 19S Mexican earthquake of 2017 through machine learning techniques, specifically classification algorithms. To identify impressions, we applied sentiment analysis based on supervised methods and defined a customized list of terms, which we call impressions, reflecting the nature of tweets related to the event under study. The proposed impression analysis is useful for understanding Twitter messages during different events, since the impressions adapt to each situation and context based on emotional frameworks. We found that Twitter is useful for confirming or disproving information disseminated by the mass media, and above all for asking for help. Analyzing this kind of data in real time would be useful for decision-making. The contribution of this paper is to fill a gap in the sentiment analysis area with the automatic identification of eleven impressions for disaster events on Twitter using machine learning techniques. We call this method impression analysis.
- Published
- 2020
- Full Text
- View/download PDF
9. State Vector Identification of Hybrid Model of a Gas Turbine by Real-Time Kalman Filter
- Author
-
Gustavo Delgado-Reyes, Asdrúbal López-Chau, Jazmin Ramirez-Hernandez, Pedro Guevara-Lopez, Igor Loboda, Leobardo Hernandez-Gonzalez, and Jorge Salvador Valdez-Martinez
- Subjects
Mean squared error, Computer science, real-time, Matrix (mathematics), time constraints, ANSI C, Finite difference, State vector, Kalman filter, single board computer, identification, gas turbine model, Algorithm - Abstract
A model and real-time simulation of a gas turbine engine (GTE) using real-time tasks (RTT) is presented. A Kalman filter is applied to identify the state vector of the GTE model. The resulting algorithms are recursive and multivariable; for this reason, ANSI C libraries were developed for (a) matrix and vector operations, (b) dynamic memory management, (c) simulation of state-space systems, (d) approximation of systems using matrix finite-difference equations, (e) computation of the mean-square-error vector, and (f) state vector identification of dynamic systems through a digital Kalman filter. Simulations were performed on a Raspberry Pi 2® single-board computer (SBC) with a real-time operating system. Execution times were measured to justify the real-time simulation. To validate the results, multiple time plots are analyzed to verify the quality and convergence time of the mean square error.
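A minimal NumPy sketch of the recursive Kalman identification loop described above; the system matrices are illustrative placeholders, not the GTE hybrid model, and the ANSI C / real-time-task machinery is out of scope.

```python
import numpy as np

A = np.array([[1.0, 0.1], [0.0, 0.9]])  # state transition (assumed)
C = np.array([[1.0, 0.0]])              # observation matrix (assumed)
Q = 0.01 * np.eye(2)                    # process noise covariance
R = np.array([[0.1]])                   # measurement noise covariance

x, P = np.zeros((2, 1)), np.eye(2)      # initial estimate and covariance
true_x = np.array([[1.0], [0.5]])
rng = np.random.default_rng(0)
for _ in range(100):
    true_x = A @ true_x                           # simulated plant
    y = C @ true_x + rng.normal(scale=0.3)        # noisy measurement
    x, P = A @ x, A @ P @ A.T + Q                 # predict
    K = P @ C.T @ np.linalg.inv(C @ P @ C.T + R)  # Kalman gain
    x = x + K @ (y - C @ x)                       # update state estimate
    P = (np.eye(2) - K @ C) @ P
print("identified state:", x.ravel())
```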
- Published
- 2020
- Full Text
- View/download PDF
10. A CBIR System for the Recognition of Agricultural Machinery
- Author
-
Silvestre Gustavo Peláez-Camarena, Isaac Machorro-Cano, Lisbeth Rodríguez-Mazahua, Asdrúbal López-Chau, María Antonieta Abud-Figueroa, and Rodolfo Rojas Ruiz
- Subjects
Agricultural machinery, Computer science, Agricultural engineering - Published
- 2018
- Full Text
- View/download PDF
11. Automatic computing of number of clusters for color image segmentation employing fuzzy c-means by extracting chromaticity features of colors
- Author
-
Asdrúbal López-Chau, Arturo Yee-Rendon, Farid García-Lamont, and Jair Cervantes
- Subjects
Pixel, Artificial neural network, Computer science, Pattern recognition, Image segmentation, Color space, Segmentation, Computer vision, Artificial intelligence, Variation of information, Chromaticity - Abstract
In this paper we introduce a method for color image segmentation that automatically computes the number of clusters into which the data (pixels) are divided using fuzzy c-means. In several works the number of clusters is defined by the user. In others it is computed by obtaining the number of dominant colors, determined with unsupervised neural networks (NNs) trained with the image's colors, where the number of dominant colors is given by the number of most-activated neurons. The drawbacks of this approach are: (1) the NN must be retrained every time a new image is given, and (2) because the intensity data of colors are used, despite employing different color spaces, the undesired effects of non-uniform illumination may affect the computation of the number of dominant colors. Our proposal processes images with an unsupervised NN trained beforehand with chromaticity samples of different colors; the number of neurons with the highest activation occurrences defines the number of clusters into which the image is segmented. Because the NN is trained with chromatic color data, it can process any image without retraining, and our approach is, to some extent, robust to non-uniform illumination. We perform experiments on the images of the Berkeley segmentation database using competitive NNs and self-organizing maps, and we compute and compare quantitative evaluations of the segmented images against related works using the probabilistic Rand index and variation of information metrics.
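A compressed sketch of the cluster-counting idea under stated assumptions: fixed chromaticity prototypes stand in for the pre-trained neural network, a histogram of winning prototypes gives the number of dominant colors, and scikit-learn's KMeans substitutes for fuzzy c-means purely to keep the example short.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(1)
# Toy 1-D "hue" pixels drawn from three dominant colors.
pixels = rng.choice([0.05, 0.33, 0.66], size=5000) + rng.normal(0, 0.01, 5000)

prototypes = np.linspace(0, 1, 12)  # stand-in for trained neurons
# "Activation" = nearest prototype per pixel; count activation occurrences.
winners = np.abs(pixels[:, None] - prototypes[None, :]).argmin(axis=1)
counts = np.bincount(winners, minlength=len(prototypes))

k = int((counts > 0.05 * len(pixels)).sum())  # number of dominant neurons
labels = KMeans(n_clusters=k, n_init=10).fit_predict(pixels.reshape(-1, 1))
print("estimated number of clusters:", k)
```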
- Published
- 2018
- Full Text
- View/download PDF
12. Preliminary Results of an Analysis Using Association Rules to Find Relations between Medical Opinions About the non-Realization of Autopsies in a Mexican Hospital
- Author
-
Giner Alor-Hernández, Elayne Rubio Delgado, Silvestre Gustavo Peláez-Camarena, María Antonieta Abud-Figueroa, Lisbeth Rodríguez-Mazahua, Asdrúbal López-Chau, and José Antonio Palet Guzmán
- Subjects
Association rule learning, Computer science, General Medicine - Abstract
In recent years, a significant reduction in the number of autopsies performed in hospitals worldwide has been observed. Since physicians are the people closest to this problem, they can offer information that helps clarify why the practice has declined. In this paper, data mining techniques are applied to analyze medical opinions regarding the realization of autopsies in a hospital in Veracruz, Mexico. The opinions were collected through surveys applied to 85 physicians of the hospital. The result is a model, represented by a set of rules, that suggests some of the factors related to the decrease in the number of autopsies in the hospital, according to the survey respondents.
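As a hedged illustration of the rule-mining step, the sketch below runs Apriori (one family of algorithms used for association rules) over invented survey answers using the mlxtend library; the column names are placeholders, not the actual questionnaire items.

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# Invented one-hot survey answers (True = respondent cited the factor).
answers = pd.DataFrame([
    {"fear_of_litigation": True,  "family_refusal": True,  "high_cost": False},
    {"fear_of_litigation": True,  "family_refusal": False, "high_cost": True},
    {"fear_of_litigation": False, "family_refusal": True,  "high_cost": True},
    {"fear_of_litigation": True,  "family_refusal": True,  "high_cost": True},
])

frequent = apriori(answers, min_support=0.5, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.7)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```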
- Published
- 2017
- Full Text
- View/download PDF
13. Sentiment Analysis of Twitter Data Through Machine Learning Techniques
- Author
-
David Valle-Cruz, Asdrúbal López-Chau, and Rodrigo Sandoval-Almazan
- Subjects
Computer science, Sentiment analysis, Supervised learning, Cloud computing, Machine learning, Sadness, Support vector machine, Naive Bayes classifier, Happiness, Social media, Artificial intelligence - Abstract
Cloud computing is a revolutionary technology for businesses, governments, and citizens. Examples of cloud Software-as-a-Service (SaaS) include banking apps, e-mail, blogs, online news, and social networks. In this chapter, we analyze data sets generated by trending topics on Twitter among Mexican citizens who interacted during the earthquake of September 19, 2017, using sentiment analysis and supervised learning based on Ekman's six-emotion model. We built three classifiers to determine the emotions of tweets belonging to the same topic. The classifiers with the best accuracy for predicting emotions were Naive Bayes and the support vector machine. We found that the most frequent predicted emotions were happiness, anger, and sadness, and that 6.5% of predicted tweets were irrelevant. We provide some recommendations on the use of machine learning techniques in sentiment analysis. Our contribution is the expansion of the emotion range from three (negative, neutral, positive) to six, providing more elements to understand how users interact with social media platforms. Future research will include validation of the method with different data sets and emotions, and the addition of new artificial intelligence techniques to improve accuracy.
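A small sketch of the classifier comparison, assuming a TF-IDF text representation (not specified in the abstract); the tweets and emotion labels below are invented stand-ins for the 19S data set.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

tweets = ["estamos bien, todos a salvo", "se derrumbo un edificio, ayuda",
          "no puedo creerlo, que tristeza", "gracias a los rescatistas"]
emotions = ["happiness", "fear", "sadness", "happiness"]

# Compare the two best-performing classifier families named above.
for name, clf in [("NaiveBayes", MultinomialNB()), ("SVM", LinearSVC())]:
    model = make_pipeline(TfidfVectorizer(), clf).fit(tweets, emotions)
    print(name, model.predict(["que miedo, otra replica"]))
```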
- Published
- 2020
- Full Text
- View/download PDF
14. Design of a Horizontal Data Fragmentation, Allocation and Replication Method in the Cloud
- Author
-
María Antonieta Abud-Figueroa, Asdrúbal López-Chau, Lisbeth Rodríguez-Mazahua, Isaac Machorro-Cano, and Felipe Castro-Medina
- Subjects
Replication method, Distributed database, Computer science, Distributed computing, Fragmentation (computing), Web application, Cloud computing - Abstract
At present, the demand for information in distributed database systems is large and grows day by day. Meanwhile, new challenges arise in improving database performance. Data fragmentation and replication methods play a leading role in distributed systems over the cloud, which is why this paper presents the design of a horizontal fragmentation, allocation, and replication method in the cloud. The research proposes an algorithm to solve the problem of overlapping horizontal fragments within such a method. The design of a Web application that uses the method is also presented; it allows the fragmentation, allocation, and replication scheme proposed by the method to be applied directly to a distributed database.
- Published
- 2019
- Full Text
- View/download PDF
15. Human mimic color perception for segmentation of color images using a three-layered self-organizing map previously trained to classify color chromaticity
- Author
-
Farid García-Lamont, Jair Cervantes, and Asdrúbal López-Chau
- Subjects
Color histogram, Computer science, Color vision, Color balance, HSL and HSV, Color space, Segmentation, Computer vision, Chromaticity, Hue, Color image, Image segmentation, Color quantization, RGB color model, Artificial intelligence - Abstract
Most works addressing segmentation of color images use clustering-based methods; their drawback is that they require a priori knowledge of the number of clusters, so this number is set depending on the nature of the scene in order not to lose its color features. Other works employ different unsupervised-learning methods trained on the colors of the given image, but the classifier must be retrained whenever a new image is given. Humans have the natural capability to: (1) recognize colors using their previous knowledge, that is, they do not need to learn to identify colors every time they observe a new image, and (2) recognize regions or objects within a scene by their chromaticity features. Hence, in this paper we propose to emulate human color perception for color image segmentation. We train a three-layered self-organizing map with chromaticity samples so that the neural network can segment color images by their chromaticity features. Once training is finished, we use the same neural network to process several images without retraining it and without specifying, to any significant extent, the number of colors the image contains. The hue component of colors is extracted by mapping the input image from the RGB space to the HSV space. We test our proposal on the Berkeley segmentation database and compare our results quantitatively with related works; according to this comparison, we claim that our approach is competitive.
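A minimal sketch of the hue-extraction step mentioned above (RGB mapped to HSV and only the hue kept, so intensity does not drive segmentation); the SOM itself is omitted and the image is random toy data.

```python
import colorsys
import numpy as np

rng = np.random.default_rng(0)
image = rng.random((4, 4, 3))  # toy RGB image with values in [0, 1]

# Per-pixel hue via the standard-library RGB -> HSV conversion.
hue = np.apply_along_axis(lambda rgb: colorsys.rgb_to_hsv(*rgb)[0], 2, image)
print(hue.shape, float(hue.min()), float(hue.max()))  # hue values in [0, 1]
```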
- Published
- 2016
- Full Text
- View/download PDF
16. Active rule base development for dynamic vertical partitioning of multimedia databases
- Author
-
Giner Alor-Hernández, Jair Cervantes, Xiaoou Li, Asdrúbal López-Chau, and Lisbeth Rodríguez-Mazahua
- Subjects
Database, Multimedia, Computer science, Response time, Workload, Query optimization, Database design, Workflow, Data mining, Reactive system - Abstract
Currently, vertical partitioning is used in multimedia databases to take advantage of its potential benefits in query optimization. Nevertheless, most vertical partitioning algorithms are static: they optimize a vertical partitioning scheme (VPS) according to a workload, but if the workload changes, the VPS may degrade, resulting in long query response times. This paper presents a set of active rules to perform dynamic vertical partitioning in multimedia databases. First, the rules collect all the information a vertical partitioning algorithm needs as input. They then evaluate this information to determine whether the database has experienced enough changes to trigger a performance evaluator. If so, and if the performance of the database falls below a threshold previously calculated by the rules, the vertical partitioning algorithm is triggered and obtains a new VPS. Finally, the rules materialize the new VPS. Our active rule base is implemented in DYMOND, an active rule-based system for dynamic vertical partitioning of multimedia databases. DYMOND's architecture and workflow are presented in this paper, and a case study is used to clarify and evaluate the functionality of the rule base. Additionally, we performed a qualitative evaluation to compare and assess DYMOND's functionality. The results showed that DYMOND improved query performance in multimedia databases.
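A toy event-condition-action loop mirroring the rule workflow described above (collect workload statistics, check the degradation condition, trigger repartitioning); the thresholds and drift model are invented, and DYMOND's actual rules are far richer.

```python
import random

THRESHOLD = 0.7      # performance floor, assumed precomputed by the rules
performance, changes = 1.0, 0

def repartition():
    print("vertical partitioning algorithm triggered -> new VPS materialized")

random.seed(0)
for query in range(60):                    # event: a query arrives
    changes += random.random() < 0.2       # workload drift accumulates
    performance -= 0.01 * changes          # current VPS degrades
    if changes >= 5 and performance < THRESHOLD:  # condition
        repartition()                      # action: compute and apply new VPS
        performance, changes = 1.0, 0
```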
- Published
- 2016
- Full Text
- View/download PDF
17. Data selection based on decision tree for SVM classification on large data sets
- Author
-
Lisbeth Rodríguez Mazahua, Jair Cervantes, Farid García Lamont, Asdrúbal López-Chau, and J. Sergio Ruíz
- Subjects
Training set, Computer science, Decision tree, Pattern recognition, Machine learning, Support vector machine, Data set, Tree (data structure), Data point, Artificial intelligence - Abstract
Highlights:
- This paper describes the development of an algorithm for training on large data sets.
- The algorithm uses a first stage of SVM with a small data set.
- The algorithm uses decision trees to find the best data points in the entire data set.
- The DT is trained using the SVs and non-SVs found in the first SVM stage.
- In the second SVM stage the training data represent all data points found by the DT.

Support Vector Machine (SVM) has important properties such as a strong mathematical background and a better generalization capability than other classification methods. On the other hand, the major drawback of SVM is its training phase, which is computationally expensive and highly dependent on the size of the input data set. In this study, a new algorithm to speed up the training time of SVM is presented; the method selects a small and representative amount of data from the full data set to improve the training time of SVM. The novel method uses an induction tree to reduce the training data set for SVM, producing a very fast and high-accuracy algorithm. According to the results, the proposed algorithm produces results with similar accuracy and in a faster way than current SVM implementations.
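A compressed scikit-learn sketch of the two-stage idea in the highlights, under the assumption of a linear kernel and a synthetic data set: fit an SVM on a small sample, label the sampled points as SV / non-SV, train a decision tree on those labels, and keep only the points the tree predicts as SV-like for the final SVM.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=20000, random_state=0)
idx = np.random.default_rng(0).choice(len(X), 1000, replace=False)
svm_small = SVC(kernel="linear").fit(X[idx], y[idx])    # first SVM stage

is_sv = np.zeros(len(idx), dtype=int)
is_sv[svm_small.support_] = 1                           # SV / non-SV labels
tree = DecisionTreeClassifier().fit(X[idx], is_sv)      # DT on those labels

keep = tree.predict(X).astype(bool)                     # SV-like points
svm_final = SVC(kernel="linear").fit(X[keep], y[keep])  # second SVM stage
print(f"final SVM trained on {keep.sum()} of {len(X)} points")
```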
- Published
- 2015
- Full Text
- View/download PDF
18. Analysis of Medical Opinions about the Nonrealization of Autopsies in a Mexican Hospital Using Association Rules and Bayesian Networks
- Author
-
Jair Cervantes, José Antonio Palet Guzmán, José Luis Sánchez Cervantes, Silvestre Gustavo Peláez-Camarena, Asdrúbal López-Chau, Lisbeth Rodríguez-Mazahua, and Elayne Rubio Delgado
- Subjects
Information retrieval, Association rule learning, Computer science, Bayesian probability, Bayesian network, C4.5 algorithm, Web application, Software, Natural language - Abstract
This research identifies the factors influencing the reduction of autopsies in a hospital of Veracruz. The study applies data mining techniques such as association rules and Bayesian networks to data sets obtained from physicians' opinions. For the exploration and extraction of knowledge we analyzed algorithms such as Apriori, FPGrowth, PredictiveApriori, Tertius, J48, NaiveBayes, MultilayerPerceptron, and BayesNet, all provided by the WEKA API. To generate the mining models and present the new knowledge in natural language, we also developed a web application. The results presented in this study are those obtained from the best-evaluated algorithms, which were validated by specialists in the field of pathology.
- Published
- 2018
- Full Text
- View/download PDF
19. Contrast Enhancement of RGB Color Images by Histogram Equalization of Color Vectors’ Intensities
- Author
-
Jair Cervantes, Farid García-Lamont, Sergio Ruiz, and Asdrúbal López-Chau
- Subjects
Channel (digital image), Computer science, HSL and HSV, Color space, Grayscale, RGB color model, Computer vision, Artificial intelligence, Chromaticity, Histogram equalization, Hue - Abstract
Histogram equalization (HE) is a technique developed for contrast enhancement of grayscale images. For RGB (red, green, blue) color images, HE is usually applied to the color channels separately; due to the correlation between channels, the chromaticity of colors is modified. To overcome this problem, the colors of the image are usually mapped to a color space where chromaticity and intensity are decoupled, and HE is applied in the intensity channel. Mapping colors between color spaces may involve a huge computational load, because the mathematical operations are not linear. In this paper we present a proposal for contrast enhancement of RGB color images without mapping the colors to a different color space: HE is applied directly to the intensities of the color vectors. We show that the images obtained with our proposal are very similar to the images processed in the HSV (hue, saturation, value) and L*a*b* color spaces.
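A minimal sketch of the core idea, with one assumed detail: intensity is taken here as the mean of the RGB vector. The intensity histogram is equalized and each pixel's RGB vector is rescaled by the resulting gain, which preserves chromaticity because the channel ratios are unchanged.

```python
import numpy as np

def equalize_rgb_intensity(img):
    """img: H x W x 3 uint8 image; returns the contrast-enhanced image."""
    intensity = img.mean(axis=2)                     # assumed intensity measure
    hist, _ = np.histogram(intensity, bins=256, range=(0, 255))
    cdf = hist.cumsum() / hist.sum()                 # equalization mapping
    equalized = 255.0 * cdf[intensity.astype(np.uint8)]
    gain = equalized / np.maximum(intensity, 1e-6)   # per-pixel scale factor
    return np.clip(img * gain[..., None], 0, 255).astype(np.uint8)

img = np.random.default_rng(0).integers(0, 100, (8, 8, 3), dtype=np.uint8)
print(img.mean(), equalize_rgb_intensity(img).mean())  # brighter output
```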
- Published
- 2018
- Full Text
- View/download PDF
20. Support vector machine classification for large datasets using decision tree and Fisher linear discriminant
- Author
-
Xiaoou Li, Asdrúbal López Chau, and Wen Yu
- Subjects
Computer science, Decision tree, Pattern recognition, Machine learning, Linear discriminant analysis, Standard deviation, Support vector machine, Entropy (information theory), Artificial intelligence, Time complexity - Abstract
Training a support vector machine (SVM) on n data points has time complexity between O(n^2) and O(n^3), so most SVM training algorithms are not suitable for large data sets. Decision trees can simplify SVM training; however, classification accuracy drops when there are inseparable points. This paper introduces a novel method for SVM classification: a decision tree is used to detect the low-entropy regions in input space, and Fisher's linear discriminant is applied to detect the data near the support vectors. The experimental results demonstrate that our approach has good classification accuracy and low standard deviation, and that training is significantly faster than with other existing methods.
- Published
- 2014
- Full Text
- View/download PDF
21. Computing the Number of Groups for Color Image Segmentation Using Competitive Neural Networks and Fuzzy C-Means
- Author
-
Sergio Ruiz, Farid García-Lamont, Jair Cervantes, and Asdrúbal López-Chau
- Subjects
Pixel, Computer science, Image segmentation, Color space, Histogram, Segmentation, Computer vision, Artificial intelligence - Abstract
Fuzzy c-means (FCM) is one of the techniques most often employed for color image segmentation; its drawback is that the number of clusters into which the data (pixels' colors) are grouped must be defined a priori. In this paper we present an approach to compute the number of clusters automatically. A competitive neural network (CNN) and a self-organizing map (SOM) are trained with chromaticity samples of different colors; the neural networks then process each pixel of the image to segment, and the activation occurrences of each neuron are collected in a histogram. The number of clusters is set by counting the most-activated neurons and is then adjusted by comparing the similarity of colors. We show successful segmentation results on images of the Berkeley segmentation database, training the CNN and SOM only once, using only chromaticity data.
- Published
- 2016
- Full Text
- View/download PDF
22. Recognition of Mexican Sign Language from Frames in Video Sequences
- Author
-
Jair Cervantes, Farid García-Lamont, Asdrúbal López Chau, Lisbeth Rodríguez-Mazahua, and Arturo Yee Rendón
- Subjects
Computer science, Feature extraction, Video sequence, Support vector machine, Discriminative model, Mexican sign language, Classifier - Abstract
The development of vision systems capable of extracting discriminative features that enhance the generalization power of a classifier is still a very challenging problem. This paper presents a methodology to improve the classification performance of Mexican Sign Language (MSL) recognition. The proposed method explores several frames in the video sequence of each sign; 743 features are extracted from these frames, and a genetic algorithm is employed to select a subset of sensitive features by removing the irrelevant ones, yielding the most discriminative features. Support vector machines (SVMs) are used to classify signs based on these features. The experiments show that the proposed method can successfully recognize MSL, with per-sign accuracy above 97% on average. The proposed feature extraction methodology, together with the GA used to select the most discriminative features, is a promising approach to facilitating the communication of deaf people.
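A toy, mutation-only genetic algorithm for feature selection in the spirit described above: binary chromosomes mask features and fitness is cross-validated SVM accuracy. The data set is synthetic, not the 743 MSL video features, and the real GA settings (crossover, population size) are simplified.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=300, n_features=40, n_informative=8,
                           random_state=0)
rng = np.random.default_rng(0)

def fitness(mask):
    # Accuracy of an SVM restricted to the masked feature subset.
    return cross_val_score(SVC(), X[:, mask], y, cv=3).mean() if mask.any() else 0.0

population = rng.random((12, X.shape[1])) < 0.5          # random chromosomes
for _ in range(10):
    scores = np.array([fitness(m) for m in population])
    parents = population[np.argsort(scores)[-6:]]        # selection
    children = parents[rng.integers(0, 6, 6)].copy()
    flips = rng.random(children.shape) < 0.05            # mutation
    children[flips] = ~children[flips]
    population = np.vstack([parents, children])

best = population[np.argmax([fitness(m) for m in population])]
print("features kept by the best chromosome:", int(best.sum()))
```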
- Published
- 2016
- Full Text
- View/download PDF
23. Is There a Relationship Between Neighborhoods of Minority Class Instances and the Performance of Classification Methods?
- Author
-
Asdrúbal López-Chau, Jair Cervantes, and Farid García-Lamont
- Subjects
Computer science, Minority class, Machine learning, Classification methods, Artificial intelligence - Abstract
The performance of classification methods is notably damaged by imbalanced data sets. Although some studies analyzing this behavior have been carried out before, most of their conclusions are drawn from experiments on synthetic data sets. In this paper, we study the relationship between the performance of five classification methods and the neighborhoods of minority-class instances. According to the experimental results, we found strong empirical evidence that the type of neighborhood of minority-class instances affects classification accuracy; indeed, we observe that the type of neighborhood is more important than the imbalance rate. To validate the results, we use ten real-world imbalanced data sets and measure AUC ROC and true positive rates.
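One common way to type the neighborhoods studied here is by the 5-NN composition of each minority instance (safe / borderline / rare / outlier); the sketch below follows that convention as an assumption, since the abstract does not fix the exact scheme.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import NearestNeighbors

X, y = make_classification(n_samples=400, weights=[0.9, 0.1], random_state=0)
nn = NearestNeighbors(n_neighbors=6).fit(X)      # each point + 5 neighbors
_, idx = nn.kneighbors(X[y == 1])                # neighbors of minority points

types = []
for row in idx:
    same = int((y[row[1:]] == 1).sum())          # minority neighbors among 5
    types.append({5: "safe", 4: "safe", 3: "borderline",
                  2: "borderline", 1: "rare", 0: "outlier"}[same])
print({t: types.count(t) for t in set(types)})
```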
- Published
- 2016
- Full Text
- View/download PDF
24. Color Characterization Comparison for Machine Vision-Based Fruit Recognition
- Author
-
Jair Cervantes, Farid García-Lamont, Asdrúbal López-Chau, and Sergio Ruiz
- Subjects
Machine vision, Computer science, Feature vector, Fourier transform, RGB color model, Computer vision, Artificial intelligence, Chromaticity, Classifier - Abstract
In this paper we present a comparison of three color characterization methods applied to fruit recognition: two are taken from related works and the third is the authors' proposal; in all three, color is represented in the RGB space. The related works characterize colors using their intensity data, but employing intensity data in the RGB space may lead to imprecise color models, because in this space two colors with the same chromaticity but different intensities are different colors. Hence, we introduce a method to characterize the color of objects by extracting the chromaticity of colors, so that intensity does not significantly influence the color extraction. The color characterizations of the two related methods and our proposal are implemented and tested to extract the color features of different fruit classes. The color features are concatenated with shape characteristics, obtained using Fourier descriptors, Hu moments, and four basic geometric features, to form a feature vector. A feed-forward neural network is employed as the classifier; the performance of each method is evaluated on an image database with 12 fruit classes.
- Published
- 2015
- Full Text
- View/download PDF
25. Classification on Imbalanced Data Sets, Taking Advantage of Errors to Improve Performance
- Author
-
Farid García-Lamont, Asdrúbal López-Chau, and Jair Cervantes
- Subjects
Data set, Computer science, Artificial intelligence, Data mining, Machine learning, Imbalanced data - Abstract
Classification methods usually exhibit poor performance when applied to imbalanced data sets. To overcome this problem, several algorithms have been proposed in the last decade. Most of them generate synthetic instances to balance the data set, regardless of the classification algorithm. These methods work reasonably well in most cases; however, they tend to cause over-fitting.
- Published
- 2015
- Full Text
- View/download PDF
26. A Hybrid Algorithm to Improve the Accuracy of Support Vector Machines on Skewed Data-Sets
- Author
-
De-Shuang Huang, Farid García-Lamont, Asdrúbal López Chau, and Jair Cervantes
- Subjects
Computer science, Generalization, Machine learning, Hybrid algorithm, Support vector machine, Hyperplane, Genetic algorithm, Artificial intelligence - Abstract
Over the past few years, it has been shown that the generalization power of Support Vector Machines (SVMs) falls dramatically on imbalanced data sets. In this paper, we propose a new method to improve the accuracy of SVM on imbalanced data sets. First, we use undersampling and SVM to obtain the initial support vectors and a sketch of the hyperplane. These support vectors help to generate new artificial instances, which form the initial population of a genetic algorithm. The genetic algorithm improves the population of artificial instances from one generation to the next and eliminates instances that produce noise in the hyperplane. Finally, the generated and evolved data are added to the original data set to reduce the imbalance and improve the generalization ability of the SVM on skewed data sets.
- Published
- 2014
- Full Text
- View/download PDF
27. Data Selection Using Decision Tree for SVM Classification
- Author
-
Asdrúbal López-Chau, Xiaoou Li, Jair Cervantes, Wen Yu, and Lourdes Lopez Garcia
- Subjects
Computational complexity theory, Computer science, Decision tree, Pattern recognition, Machine learning, Data set, Support vector machine, Entropy (information theory), Artificial intelligence, Cluster analysis, Linear separability - Abstract
Support Vector Machine (SVM) is an important classification method used in many areas. Training an SVM is roughly O(n^2) in time and space, so several methods to reduce the training complexity have been proposed in recent years. Data selection methods for SVM pick the most important examples from the training data set to improve training time. This paper introduces a novel data reduction method that detects clusters and then selects some examples from them. Unlike other state-of-the-art algorithms, the method uses a decision tree to form partitions that are treated as clusters, and then performs a guided random selection of examples from them. Because the clusters discovered by a decision tree can be linearly separable, and taking advantage of the Eidelheit separation theorem, it is possible to reduce the size of training sets by carefully selecting examples. The method was compared with LibSVM on publicly available data sets; experiments demonstrate an important reduction in the size of the training sets while showing only a slight decrease in classifier accuracy.
- Published
- 2012
- Full Text
- View/download PDF
28. Enhancing the Performance of SVM on Skewed Data Sets by Exciting Support Vectors
- Author
-
Jair Cervantes, Farid García Lamont, Asdrúbal López-Chau, and José Hernández Santiago
- Subjects
Computer science, Generalization, Pattern recognition, Data set, Support vector machine, Hyperplane, Artificial intelligence, Sensitivity, Data mining - Abstract
In pattern recognition and data mining, a data set is called skewed or imbalanced if it contains a large number of objects of one type and a very small number of objects of the opposite type. Imbalance represents a challenging problem for most classification methods, because the generalization power achieved by classic classifiers is not good on skewed data sets. Many real data sets are imbalanced, so new methods to face this problem are necessary. The SVM classifier performs exceptionally well on data sets that are not skewed, but for imbalanced sets the optimal separating hyperplane is not enough to achieve acceptable results. In this paper, a novel method that improves the performance of SVM on skewed data sets is presented. The proposed method works by exciting the support vectors and displacing the separating hyperplane towards the majority class. According to the results of experiments on different skewed data sets, the method enhances not only the accuracy but also the sensitivity of the SVM classifier on this kind of data.
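A sketch of the hyperplane-displacement intuition only (not the support-vector excitation itself): after training, the decision threshold is shifted toward the majority class so that more minority instances fall on the positive side. The offset value is illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, weights=[0.9, 0.1], random_state=0)
svm = SVC(kernel="linear").fit(X, y)
scores = svm.decision_function(X)        # signed distance to the hyperplane

offset = 0.5                             # displacement amount (assumed)
tpr = lambda pred: ((pred == 1) & (y == 1)).sum() / (y == 1).sum()
print("default TPR:", tpr((scores > 0).astype(int)))
print("shifted TPR:", tpr((scores + offset > 0).astype(int)))
```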
- Published
- 2012
- Full Text
- View/download PDF
29. Border Samples Detection for Data Mining Applications Using Non Convex Hulls
- Author
-
Asdrúbal López Chau, Wen Yu, Xiaoou Li, Pedro Mejia-Alvarez, and Jair Cervantes
- Subjects
Convex hull, Discretization, Computer science, Image processing, Pattern recognition, Support vector machine, Margin (machine learning), Data mining, Artificial intelligence - Abstract
Border points are instances located at the outer margin of dense clusters of samples. Their detection is important in many areas, such as data mining, image processing, robotics, geographic information systems, and pattern recognition. In this paper we propose a novel method to detect border samples. The method uses a discretization and works on partitions of the set of points; the border samples are then detected by applying an algorithm similar to the one presented in reference [8] to the sides of convex hulls. We apply the novel algorithm to a classification task in data mining; experimental results show the effectiveness of our method.
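A small sketch of the partition-then-hull idea under assumed details: the plane is discretized into grid cells, the convex hull of the points in each occupied cell is computed with SciPy, and the hull vertices are reported as candidate border samples (the non-convex refinement of reference [8] is omitted).

```python
from collections import defaultdict
import numpy as np
from scipy.spatial import ConvexHull

points = np.random.default_rng(0).random((500, 2))

# Discretization: bucket points into a 4 x 4 grid of cells.
cells = defaultdict(list)
for p in points:
    cells[tuple((p // 0.25).astype(int))].append(p)

# Border candidates: convex hull vertices of each occupied cell.
border = []
for cell_points in cells.values():
    arr = np.array(cell_points)
    if len(arr) >= 3:
        border.extend(arr[ConvexHull(arr).vertices])
print("border samples detected:", len(border))
```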
- Published
- 2011
- Full Text
- View/download PDF