11 results for "Sebastián Ventura"
Search Results
2. Mining local periodic patterns in a discrete sequence
- Author
-
Peng Yang, José María Luna, Sebastián Ventura, Rage Uday Kiran, and Philippe Fournier-Viger
- Subjects
Sequence, Information Systems and Management, Period, Computer science, Type, Measure, Variable, Duration, Artificial Intelligence, Control and Systems Engineering, Computer Science Applications, Theoretical Computer Science, Algorithm, Software
- Abstract
Periodic frequent patterns are sets of events or items that appear periodically in a sequence of events or transactions. Many algorithms have been designed to identify periodic frequent patterns in data; however, most assume that the periodic behavior of a pattern does not change much over time. To address this limitation, this paper proposes discovering a novel type of periodic pattern in a sequence of events or transactions, called Local Periodic Patterns (LPPs), which are patterns (sets of events) that exhibit periodic behavior in some non-predefined time intervals. A pattern is said to be a local periodic pattern if it appears regularly and continuously in some time interval(s). Two novel measures are proposed to assess the periodicity and frequency of patterns in time intervals. The maxSoPer (maximal period of spillovers) measure allows detecting time intervals of variable length where a pattern is continuously periodic, while the minDur (minimal duration) measure ensures that those time intervals have a minimum duration. To discover all LPPs, the paper presents three efficient algorithms. An experimental evaluation on real datasets shows that the proposed algorithms are efficient and can provide useful patterns that cannot be found using traditional periodic pattern mining algorithms.
- Published
- 2021
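The abstract above describes maxSoPer and minDur only informally. As a rough illustration (not the paper's actual algorithms or exact measure definitions), one simplified reading is: scan a pattern's occurrence timestamps, cut an interval whenever two consecutive occurrences are more than a period threshold apart, and keep intervals that last long enough. The function name and parameters below are illustrative only:

```python
def local_periodic_intervals(timestamps, max_per, min_dur):
    """Find maximal time intervals in which consecutive occurrences of a
    pattern are never more than max_per apart, keeping only intervals
    that span at least min_dur (a simplified reading of maxSoPer/minDur)."""
    intervals = []
    start = prev = timestamps[0]
    for t in timestamps[1:]:
        if t - prev > max_per:          # periodicity broken: close the interval
            if prev - start >= min_dur:
                intervals.append((start, prev))
            start = t
        prev = t
    if prev - start >= min_dur:         # close the last open interval
        intervals.append((start, prev))
    return intervals

# occurrences of a pattern over time: periodic in [0, 12] and [30, 39]
ts = [0, 3, 6, 9, 12, 30, 33, 36, 39]
print(local_periodic_intervals(ts, max_per=4, min_dur=6))  # → [(0, 12), (30, 39)]
```

A globally periodic-pattern miner would reject this pattern (the 18-unit gap breaks periodicity over the whole sequence), which is exactly the limitation the paper targets.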
3. Interactive multi-objective evolutionary optimization of software architectures
- Author
-
Aurora Ramírez, Sebastián Ventura, and José Raúl Romero
- Subjects
Information Systems and Management, Fitness function, Computer science, Process, Evolutionary algorithm, Software requirements specification, Interactive evolutionary computation, Software metric, Human-in-the-loop, Software engineering, Software, Artificial Intelligence, Control and Systems Engineering, Computer Science Applications, Theoretical Computer Science
- Abstract
While working on a software specification, designers usually need to evaluate different architectural alternatives to be sure that quality criteria are met. Even when these quality aspects can be expressed in terms of multiple software metrics, other qualitative factors cannot be numerically measured; instead, they are extracted from the engineers' know-how and prior experience. In fact, detecting not only strong but also weak points in the different solutions seems to fit better with the way humans make their decisions. Putting the human in the loop brings new challenges to the search-based software engineering field, especially for those human-centered activities within the early analysis phase. This paper explores how interactive evolutionary computation can serve as a basis for integrating human judgment into the search process. An interactive approach is proposed to discover software architectures, in which both quantitative and qualitative criteria are applied to guide a multi-objective evolutionary algorithm. The feedback obtained is incorporated into the fitness function using architectural preferences, allowing the algorithm to discern between promising and poor solutions. Experimentation with real users has revealed that the proposed interaction mechanism can effectively guide the search towards those regions of the search space that are of real interest to the expert.
- Published
- 2018
4. Extremely high-dimensional optimization with MapReduce: Scaling functions and algorithm
- Author
-
Sebastián Ventura, Alberto Cano, and Carlos García-Martínez
- Subjects
Continuous optimization, Information Systems and Management, Optimization problem, Computer science, Computer cluster, Parallel programming model, Benchmark, Memetic algorithm, Heuristics, Curse of dimensionality, Artificial Intelligence, Control and Systems Engineering, Computer Science Applications, Theoretical Computer Science, Algorithm, Software
- Abstract
Large-scale optimization is an active research area in which many algorithms, benchmark functions, and competitions have been proposed to date. However, extremely high-dimensional optimization problems comprising millions of variables demand new approaches to perform effectively, in terms of result quality, and efficiently, in terms of time. Memetic algorithms are popular in continuous optimization, but at such extreme dimensionality they are hampered by the limitations of computational and memory resources, and their heuristics must tackle the immensity of the search space. This work shows how the MapReduce parallel programming model allows scaling to problems with millions of variables, and presents an adaptation of the MA-SW-Chains algorithm to the MapReduce framework. Benchmark functions from the IEEE CEC 2010 and 2013 competitions are considered, and results with 1, 3, and 10 million variables are presented. MapReduce proves to be an effective approach for scaling optimization algorithms to extremely high-dimensional problems, taking advantage of the combined computational and memory resources distributed across a computer cluster.
- Published
- 2017
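To make the MapReduce angle concrete, here is a toy, single-machine sketch of the general idea: evaluating a separable benchmark fitness (the sphere function) chunk by chunk in map/reduce style. A real deployment would distribute the map calls across a cluster, and MA-SW-Chains itself is far more involved; all names below are illustrative:

```python
from functools import reduce

def sphere_partial(chunk):
    """Map step: partial fitness of one slice of the solution vector
    (the sphere benchmark is separable, so slices sum independently)."""
    return sum(x * x for x in chunk)

def mapreduce_fitness(solution, n_chunks=4):
    """Split a high-dimensional solution vector into chunks, map each
    chunk to a partial sum, reduce by addition. On a cluster, the map
    calls would run on different nodes, so no single machine needs to
    hold or process the full million-variable vector."""
    size = max(1, len(solution) // n_chunks)
    chunks = [solution[i:i + size] for i in range(0, len(solution), size)]
    partials = map(sphere_partial, chunks)        # distributed in real MapReduce
    return reduce(lambda a, b: a + b, partials, 0.0)

print(mapreduce_fitness([1.0, 2.0, 3.0, 4.0]))   # → 30.0
```

The same decomposition applies to memory: each node stores only its slice of the population, which is what lets the combined cluster resources handle dimensionalities no single machine could.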
5. Multi-target support vector regression via correlation regressor chains
- Author
-
Alberto Cano, Vojislav Kecman, Gabriella Melki, and Sebastián Ventura
- Subjects
Information Systems and Management, Computational complexity theory, Machine learning, Support vector machine, Regression analysis, Regression, Segmented regression, Errors-in-variables models, Statistical hypothesis testing, Variable, Data mining, Artificial Intelligence, Control and Systems Engineering, Computer Science Applications, Theoretical Computer Science, Software, Mathematics
- Abstract
Multi-target regression is a challenging task that consists of creating predictive models for problems with multiple continuous target outputs. Despite the increasing attention paid to multi-label classification, there are fewer studies concerning multi-target (MT) regression. The current leading MT models are based on ensembles of regressor chains, where random, differently ordered chains of the target variables are created and used to build separate regression models, using the previous target predictions in the chain. The challenge of building MT models stems from trying to capture and exploit possible correlations among the target variables during training. This paper presents three multi-target support vector regression models. The first builds independent, single-target Support Vector Regression (SVR) models for each output variable. The second builds an ensemble of random chains using the first method as a base model. The third calculates the targets' correlations and forms a maximum correlation chain, which is used to build a single chained support vector regression model, improving prediction performance while reducing computational complexity. The experimental study evaluates and compares the performance of the three approaches with seven other state-of-the-art multi-target regressors on 24 multi-target datasets. The experimental results are then analyzed using non-parametric statistical tests. The results show that the maximum correlation SVR approach improves on the performance of ensembles of random chains.
- Published
- 2017
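The chaining mechanism in the abstract above can be sketched compactly. This is a simplified reading, not the paper's method: the greedy ordering heuristic is an assumption (the paper only says a maximum correlation chain is formed), and plain least squares stands in for the SVR base learner so the sketch needs only NumPy. Each model in the chain sees the input features plus the targets earlier in the chain (true values during fit, predictions during predict):

```python
import numpy as np

def max_corr_chain_order(Y):
    """Greedy ordering: start at the target with the highest total
    absolute correlation, then repeatedly append the remaining target
    most correlated with the last one added (an illustrative reading
    of the paper's maximum correlation chain)."""
    C = np.abs(np.corrcoef(Y.T))
    np.fill_diagonal(C, 0.0)
    order = [int(C.sum(axis=1).argmax())]
    remaining = set(range(Y.shape[1])) - {order[0]}
    while remaining:
        nxt = max(remaining, key=lambda j: C[order[-1], j])
        order.append(nxt)
        remaining.remove(nxt)
    return order

class RegressorChain:
    """One chained model per target, in the given order. Least squares
    stands in for the SVR base learner in this sketch."""
    def __init__(self, order):
        self.order, self.coefs = order, []

    def _design(self, Z):
        return np.hstack([Z, np.ones((len(Z), 1))])   # add a bias column

    def fit(self, X, Y):
        Z = X
        for j in self.order:
            w, *_ = np.linalg.lstsq(self._design(Z), Y[:, j], rcond=None)
            self.coefs.append(w)
            Z = np.hstack([Z, Y[:, [j]]])             # feed true target forward
        return self

    def predict(self, X):
        Z, out = X, {}
        for j, w in zip(self.order, self.coefs):
            p = self._design(Z) @ w
            out[j] = p
            Z = np.hstack([Z, p[:, None]])            # feed prediction forward
        return np.column_stack([out[j] for j in sorted(out)])
```

A single chain like this is trained once, which is the source of the complexity reduction relative to an ensemble of many random chains.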
6. Discovering useful patterns from multiple instance data
- Author
-
Alberto Cano, Virgilijus Sakalauskas, Sebastián Ventura, and José María Luna
- Subjects
Information Systems and Management, Association rule learning, Data stream mining, Computer science, Association, Scalability, Quality, Database transaction, Data mining, Curse of dimensionality, Artificial Intelligence, Control and Systems Engineering, Computer Science Applications, Theoretical Computer Science, Software
- Abstract
Association rule mining is one of the most common data mining techniques used to identify and describe interesting relationships between patterns in large datasets, the frequency of an association being defined as the number of transactions that satisfy it. In situations where each transaction includes an undetermined number of instances (e.g., customers' shopping habits, where each transaction represents a different customer with a varying number of instances), the problem cannot be described as a traditional association rule mining problem. The aim of this work is to discover robust and useful patterns from multiple instance datasets, that is, datasets where each transaction may include an undetermined number of instances. We propose a new problem formulation in the data mining framework: multiple-instance association rule mining. The problem definition, an algorithm to tackle the problem, the application fields, and quality measures for the discovered relations are formally described. Experimental results reveal the scalability of the approach across data of different dimensionality. Finally, we apply it to two real-world application fields: (1) analysis of financial data gathered from one of the most important banks in Lithuania; and (2) study of existing relations between records of unemployed people gathered from the Spanish public employment service.
- Published
- 2016
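The key shift described above — from counting transactions to counting bags of instances — can be illustrated with a tiny support function. This is a hedged sketch of one plausible reading of the multiple-instance formulation (a bag supports an itemset if any of its instances contains it), not the paper's formal definition:

```python
def mi_support(bags, itemset):
    """Multiple-instance support sketch: a transaction (bag) counts
    toward an itemset's support if ANY of its instances contains the
    itemset; support is the fraction of bags that do."""
    hits = sum(any(itemset <= inst for inst in bag) for bag in bags)
    return hits / len(bags)

bags = [
    [{"bread", "milk"}, {"beer"}],      # one customer, two shopping visits
    [{"bread"}, {"milk", "eggs"}],      # bread and milk, but in separate visits
    [{"beer", "chips"}],
]
print(mi_support(bags, {"bread", "milk"}))  # → 0.3333333333333333
```

Note how the second customer does not support {bread, milk}: flattening the bags into ordinary transactions would count it, which is exactly why the traditional formulation misrepresents this kind of data.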
7. Effective lazy learning algorithm based on a data gravitation model for multi-label learning
- Author
-
Oscar Reyes, Carlos Morell, and Sebastián Ventura
- Subjects
Information Systems and Management, Multi-label learning, Multi-label classification, Machine learning, Lazy learning, Instance-based learning, Gravitation, Newton's law of universal gravitation, Statistical hypothesis testing, Artificial Intelligence, Control and Systems Engineering, Computer Science Applications, Theoretical Computer Science, Algorithm, Software, Mathematics
- Abstract
In the last decade, an increasing number of real-world problems involving multi-label data have appeared, and multi-label learning has become an important area of research. The data gravitation model is an approach that applies the principles of the universal law of gravitation to solve machine learning problems. One advantage of the data gravitation model, compared with other techniques, is that it is based on simple principles while achieving high levels of performance. This paper presents a multi-label lazy algorithm based on a data gravitation model, named MLDGC. MLDGC directly handles multi-label data and considers each instance an atomic data particle. The proposed multi-label lazy algorithm was evaluated and compared to several state-of-the-art multi-label lazy methods on 34 datasets. The results showed that our proposal outperformed the state-of-the-art lazy methods. The experimental results were validated using non-parametric statistical tests, confirming the effectiveness of this data gravitation model for multi-label lazy learning.
- Published
- 2016
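To give a feel for the gravitational analogy, here is a deliberately simplified classifier in that spirit (not the paper's MLDGC, which uses a more refined neighborhood and weighting scheme): each training instance is a unit-mass particle, and a label is predicted when the "pull" of instances having the label exceeds the pull of instances lacking it:

```python
import numpy as np

def gravitation_predict(X_train, Y_train, x, eps=1e-9):
    """Toy data-gravitation multi-label classifier. Y_train is a binary
    (n_samples, n_labels) matrix; x is a single query point."""
    d2 = ((X_train - x) ** 2).sum(axis=1) + eps   # squared distances to x
    force = 1.0 / d2                              # F = m1*m2 / r^2 with unit masses
    pull_pos = force @ Y_train                    # total pull toward each label
    pull_neg = force @ (1 - Y_train)              # total pull away from it
    return (pull_pos > pull_neg).astype(int)

X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
Y = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])    # two labelled clusters
print(gravitation_predict(X, Y, np.array([0.0, 0.05])))  # → [1 0]
```

Like MLDGC, this is lazy: nothing is trained up front, and all computation happens per query against the stored instances.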
8. An approach for the evolutionary discovery of software architectures
- Author
-
Sebastián Ventura, Aurora Ramírez, and José Raúl Romero
- Subjects
Software Engineering Process Group, Information Systems and Management, Resource-oriented architecture, Computer science, Machine learning, Software sizing, Software system, Software design description, Search-based software engineering, Software development, Software metric, Software framework, Component-based software engineering, Goal-Driven Software Development Process, Software construction, Software design, Software, Artificial Intelligence, Control and Systems Engineering, Computer Science Applications, Theoretical Computer Science
- Abstract
Highlights: a ranking-based EA for the discovery of software architectures is proposed; an expert-oriented model based on a comprehensible encoding and a genetic operator; a complete experimental analysis of the algorithm setup is carried out. Software architectures constitute important analysis artefacts in software projects, as they reflect the main functional blocks of the software. They provide high-level analysis artefacts that are useful when architects need to analyse the structure of working systems. Normally, architects perform this process manually, supported by their prior experience. Even so, the task can be very tedious when the actual design is unclear due to continuous uncontrolled modifications. Since the recent appearance of search-based software engineering, multiple tasks in the area of software engineering have been formulated as complex search and optimisation problems, where evolutionary computation has found a new area of application. This paper explores the design of an evolutionary algorithm (EA) for the discovery of the underlying architecture of software systems. Important efforts have been directed towards the creation of a generic and human-oriented process. Hence, the selection of a comprehensible encoding, a fitness function inspired by accurate software design metrics, and a genetic operator simulating architectural transformations all represent important characteristics of the proposed approach. Finally, a complete parameter study and experimentation have been performed using real software systems, looking for a generic evolutionary approach to help software engineers in their decision-making process.
- Published
- 2015
9. An interpretable classification rule mining algorithm
- Author
-
Alberto Cano, Sebastián Ventura, and Amelia Zafra
- Subjects
Decision support system, Information Systems and Management, Classification rule mining, Machine learning, Knowledge extraction, Multiple classification, Classifier, Evolutionary programming, Statistical hypothesis testing, Interpretability, Data mining, Artificial Intelligence, Control and Systems Engineering, Computer Science Applications, Theoretical Computer Science, Algorithm, Software, Mathematics
- Abstract
Obtaining comprehensible classifiers may be as important as achieving high accuracy in many real-life applications, such as knowledge discovery tools and decision support systems. This paper introduces an efficient Evolutionary Programming algorithm for solving classification problems by means of very interpretable and comprehensible IF-THEN classification rules. This algorithm, called the Interpretable Classification Rule Mining (ICRM) algorithm, is designed to maximize the comprehensibility of the classifier by minimizing the number of rules and the number of conditions. The evolutionary process constructs classification rules using only relevant attributes, avoiding noisy and redundant information. The algorithm is evaluated and compared to nine other well-known classification techniques in 35 varied application domains. Experimental results are validated using several non-parametric statistical tests applied to multiple classification and interpretability metrics. The experiments show that the proposal obtains good results, significantly improving the interpretability measures over the rest of the algorithms while achieving competitive accuracy. This is a significant advantage over other algorithms, as it makes it possible to obtain an accurate and very comprehensible classifier quickly.
- Published
- 2013
10. HyDR-MI : A hybrid algorithm to reduce dimensionality in multiple instance learning
- Author
-
Sebastián Ventura, Amelia Zafra, and Mykola Pechenizkiy
- Subjects
Information Systems and Management, Dimensionality reduction, Supervised learning, Feature selection, Pattern recognition, Machine learning, Hybrid algorithm, Statistical classification, Filter, Feature, Instance-based learning, Artificial Intelligence, Control and Systems Engineering, Computer Science Applications, Theoretical Computer Science, Software, Mathematics
- Abstract
Feature selection techniques have been successfully applied in many applications to make supervised learning more effective and efficient. These techniques have been widely used and studied in traditional supervised learning settings, where each instance is expected to have a label. In multiple instance learning (MIL), each example or bag consists of a variable set of instances, and the label is known for the bag as a whole, but not for the individual instances it contains. Therefore, utilizing these labels for feature selection in MIL becomes less straightforward. In this paper we study a new feature subset selection method for MIL called HyDR-MI (hybrid dimensionality reduction method for multiple instance learning). The hybrid consists of a filter component, based on an extension of the ReliefF algorithm adapted to MIL, and a wrapper component, based on a genetic algorithm that optimizes the search for the best feature subset within the reduced set of features output by the filter component. We conducted an extensive experimental evaluation of our method on five benchmark datasets and 17 classification algorithms for MIL. The results of our study show the potential of the proposed hybrid with respect to the desirable effect it produces: a significant improvement in the predictive performance of many MIL classification techniques compared with filter-based feature selection alone. This is achieved thanks to the possibility of deciding how many of the top-ranked features are useful for each particular algorithm and of discarding redundant attributes.
- Published
- 2013
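The filter-plus-wrapper structure described above can be sketched generically. This is not the published HyDR-MI code: the ranking stage is abstracted away (the paper uses a MIL extension of ReliefF), the fitness callback stands in for the wrapped MIL classifier's performance, and all names and GA settings below are illustrative:

```python
import random

def hydr_mi_style_selection(ranked_features, top_k, fitness,
                            pop=20, gens=30, seed=0):
    """Filter stage: keep the top_k features from a relevance ranking.
    Wrapper stage: a small elitist GA searches bitmask subsets of them,
    scored by a caller-supplied fitness function."""
    rng = random.Random(seed)
    cand = ranked_features[:top_k]                  # filter stage
    def features(mask):
        return [f for f, b in zip(cand, mask) if b]
    population = [[rng.randint(0, 1) for _ in cand] for _ in range(pop)]
    for _ in range(gens):                           # wrapper (GA) stage
        population.sort(key=lambda m: fitness(features(m)), reverse=True)
        parents = population[: pop // 2]            # elitism: best half survives
        children = []
        for _ in range(pop - len(parents)):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(cand))
            child = a[:cut] + b[cut:]               # one-point crossover
            child[rng.randrange(len(cand))] ^= 1    # bit-flip mutation
            children.append(child)
        population = parents + children
    return features(max(population, key=lambda m: fitness(features(m))))
```

The two-stage design is the point: the filter shrinks the search space so the GA explores 2^top_k subsets rather than 2^n, while the wrapper's classifier-specific fitness decides how many of the top-ranked features each algorithm actually benefits from.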
11. G3P-MI: A genetic programming algorithm for multiple instance learning
- Author
-
Amelia Zafra and Sebastián Ventura
- Subjects
Information Systems and Management, Wake-sleep algorithm, Active learning, Computer science, Population-based incremental learning, Stability, Evolutionary algorithm, Genetic programming, Semi-supervised learning, Machine learning, Instance-based learning, Learning classifier system, Weighted Majority Algorithm, Generalization error, Genetic representation, Evolutionary programming, Artificial Intelligence, Control and Systems Engineering, Computer Science Applications, Theoretical Computer Science, Software
- Abstract
This paper introduces a new Grammar-Guided Genetic Programming algorithm for solving multi-instance learning problems. This algorithm, called G3P-MI, is evaluated and compared to other multi-instance classification techniques in different application domains. Computational experiments show that G3P-MI obtains consistently better results than other algorithms in terms of accuracy, sensitivity, and specificity. Moreover, it makes the knowledge discovery process clearer and more comprehensible by expressing information in the form of IF-THEN rules. Our results confirm that evolutionary algorithms are well suited to dealing with multi-instance learning problems.
- Published
- 2010
Discovery Service for Jio Institute Digital Library