14 results on '"Melgar-García, Laura"'
Search Results
2. A novel distributed forecasting method based on information fusion and incremental learning for streaming time series
- Author
-
Melgar-García, Laura, Gutiérrez-Avilés, David, Rubio-Escudero, Cristina, and Troncoso, Alicia
- Published
- 2023
3. Discovering three-dimensional patterns in real-time from data streams: An online triclustering approach
- Author
-
Melgar-García, Laura, Gutiérrez-Avilés, David, Rubio-Escudero, Cristina, and Troncoso, Alicia
- Published
- 2021
4. Comparative Analysis of Deep Learning and Swarm-Optimized Random Forest for Groundwater Spring Potential Identification in Tropical Regions
- Author
-
Nhu, Viet-Ha, Hoa, Pham Viet, Melgar-García, Laura, and Tien Bui, Dieu
- Published
- 2023
5. A novel semantic segmentation approach based on U-Net, WU-Net, and U-Net++ deep learning for predicting areas sensitive to pluvial flood at tropical area
- Author
-
Melgar-García, Laura, Martínez-Álvarez, Francisco, Tien Bui, Dieu, and Troncoso, Alicia
- Published
- 2023
6. Técnicas big data para el procesamiento de flujos de datos masivos en tiempo real
- Author
-
Melgar García, Laura, Troncoso, Alicia, and Rubio Escudero, Cristina
- Subjects
Big data, Data processing, Artificial intelligence
- Abstract
Doctoral Program in Biotechnology, Engineering and Chemical Technology. Research line: Engineering, Data Science and Bioinformatics. Program code: DBI. Line code: 111.
Machine learning techniques have become one of the resources most in demand by companies, owing to the large volume of data that surrounds us these days. The main objective of these technologies is to solve complex problems in an automated way using data. One of the current perspectives of machine learning is the analysis of continuous flows of data, or data streaming. This approach is increasingly requested by enterprises as a result of the large number of information sources producing time-indexed data at high frequency, such as sensors, Internet of Things devices, and social networks. However, research today is more focused on the study of historical data than on data received in streaming. One of the main reasons for this is the enormous challenge that this type of data presents for the modeling of machine learning algorithms. This Doctoral Thesis is presented in the form of a compendium of publications with a total of 10 scientific contributions in international conferences and journals with a high impact index in the Journal Citation Reports (JCR). The research developed during the PhD Program focuses on the study and analysis of real-time or streaming data through the development of new machine learning algorithms. Machine learning algorithms for real-time data require a different type of modeling from the traditional one: the model is updated online to provide accurate responses in the shortest possible time. The main objective of this Doctoral Thesis is to contribute research value to the scientific community through three new machine learning algorithms. These algorithms are big data techniques, and two of them work with online or streaming data. In this way, contributions are made to the development of one of the current trends in Artificial Intelligence.
To this end, algorithms are developed for descriptive and predictive tasks, i.e., unsupervised and supervised learning, respectively. Their common idea is the discovery of patterns in the data. The first technique developed during the dissertation is a triclustering algorithm that produces three-dimensional data clusters in offline or batch mode. This big data algorithm is called bigTriGen. In general terms, an evolutionary metaheuristic is used to search for groups of data with similar patterns. The model applies genetic operators such as selection, crossover, mutation, and evaluation at each iteration. The goal of bigTriGen is to optimize the evaluation function to achieve triclusters of the highest possible quality. It is used as the basis for the second technique implemented during the Doctoral Thesis. The second algorithm focuses on the creation of groups over three-dimensional data received in real time or in streaming. It is called STriGen. Streaming modeling starts from an offline or batch model built using historical data. As soon as this model is created, it starts receiving data in real time. The model is updated in an online or streaming manner to adapt to new streaming patterns. In this way, STriGen is able to detect concept drifts and incorporate them into the model as quickly as possible, thus producing good-quality triclusters in real time. The last algorithm developed in this dissertation follows a supervised learning approach for real-time time series forecasting. It is called StreamWNN. A model is created with historical data based on the k-nearest neighbors (KNN) algorithm. Once the model is created, data starts to be received in real time. The algorithm provides real-time predictions of future data, keeping the model updated incrementally and incorporating streaming patterns identified as novelties.
StreamWNN also identifies anomalous data in real time, allowing this feature to be used as a security measure during its application. The developed algorithms have been evaluated with real data from devices and sensors. These new techniques have proved very useful, providing meaningful triclusters and accurate predictions in real time.
Universidad Pablo de Olavide de Sevilla. Departamento de Deporte e Informática
- Published
- 2023
7. A new big data triclustering approach for extracting three-dimensional patterns in precision agriculture
- Author
-
Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos, Universidad de Sevilla. TIC-254: Data Science and Big Data Lab, Universidad de Sevilla. TIC-134: Sistemas Informáticos, Ministerio de Ciencia e Innovación (MICIN). España, Junta de Andalucía, Fundação para a Ciência e a Tecnologia (FCT), Melgar García, Laura, Gutiérrez Avilés, David, Godinho, María Teresa, Espada, Rita, Brito, Isabel Sofía, Martínez Álvarez, Francisco, Troncoso Lora, Alicia, and Rubio Escudero, Cristina
- Abstract
Precision agriculture focuses on the development of site-specific harvest considering the variability of each crop area. Vegetation indices allow the study and delineation of different characteristics of each field zone that are generally invisible to the naked eye. This paper introduces a new big data triclustering approach based on evolutionary algorithms. The algorithm shows its capability to discover three-dimensional patterns on the basis of vegetation indices from vine crops. Different vegetation indices have been tested to find different patterns in the crops. The results reported for a vineyard crop located in Portugal depict four areas with different moisture stress particularities that can lead to changes in the management of the vineyard. Furthermore, scalability studies have been performed, showing that the proposed algorithm is suitable for dealing with big datasets.
- Published
- 2022
8. Generating a seismogenic source zone model for the Pyrenees: A GIS-assisted triclustering approach
- Author
-
Amaro Mellado, José Lázaro, Melgar García, Laura, Rubio Escudero, Cristina, Gutiérrez Avilés, David, Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos, European Commission (EC), Ministerio de Economía y Competitividad (MINECO). España, and Junta de Andalucía
- Subjects
Triclustering, Data Science, GIS, Seismic sources
- Abstract
Seismogenic source zone models, including the delineation and the characterization, still have a role to play in seismic hazard calculations, particularly in regions with moderate or low to moderate seismicity. Seismic source zones establish areas with common tectonic and seismic characteristics, described by a unique magnitude-frequency distribution. Their definition can be addressed from different views. Traditionally, the source zones have been geographically outlined from seismotectonic data, geological structures, and earthquake catalogs. Geographic information systems (GIS) can be of great help in their definition, as they deal rigorously and less ambiguously with the available geographical data. Moreover, novel computer science approaches are now being employed in their definition. The Pyrenees mountain range, in southwest Europe, is located in a region characterized by low to moderate seismicity. In this study, a method based purely on seismic catalogs, managed with a GIS and a triclustering algorithm, was used to delineate seismogenic zones in the Pyrenees. Based on an updated, reviewed, declustered, extensive, and homogeneous earthquake catalog (including detailed information about each event such as date and time, hypocentral location, and size), a triclustering algorithm has been applied to generate the seismogenic zones. The method seeks seismicity patterns in a quasi-objective manner following an initial assessment as to the best suited seismic parameters. The eight zones identified as part of this study are represented on maps to be analyzed, with the zone extending from the Arudy–Arette region to Bagnères de Bigorre identified as the one with the highest seismic hazard potential. European Commission (EC) 0313-PERSISTAH; Ministerio de Economía y Competitividad TIN2017-88209-C2; Junta de Andalucía US-1263341
- Published
- 2021
9. Nearest Neighbors-Based Forecasting for Electricity Demand Time Series in Streaming
- Author
-
Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos, Ministerio de Ciencia, Innovación y Universidades (MICINN). España, Melgar García, Laura, Gutiérrez Avilés, David, Rubio Escudero, Cristina, and Troncoso Lora, Alicia
- Abstract
This paper presents a new forecasting algorithm for time series in streaming named StreamWNN. The methodology has two well-differentiated stages. In the batch phase, the algorithm searches for the nearest neighbors to generate an initial prediction model. Then, an online phase is carried out when the time series arrives in streaming. In particular, the nearest neighbor of the streaming data within the training set is computed, and the nearest neighbors of this nearest neighbor, previously computed in the batch phase, are used to obtain the predictions. Results using electricity consumption time series are reported, showing a remarkable performance of the proposed algorithm in terms of forecasting errors when compared to a nearest neighbors-based benchmark algorithm. The running times for the predictions are also remarkable.
- Published
- 2021
10. Generating a seismogenic source zone model for the Pyrenees: A GIS-assisted triclustering approach
- Author
-
Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos, European Commission (EC), Ministerio de Economía y Competitividad (MINECO). España, Junta de Andalucía, Amaro Mellado, José Lázaro, Melgar García, Laura, Rubio Escudero, Cristina, and Gutiérrez Avilés, David
- Abstract
Seismogenic source zone models, including the delineation and the characterization, still have a role to play in seismic hazard calculations, particularly in regions with moderate or low to moderate seismicity. Seismic source zones establish areas with common tectonic and seismic characteristics, described by a unique magnitude-frequency distribution. Their definition can be addressed from different views. Traditionally, the source zones have been geographically outlined from seismotectonic data, geological structures, and earthquake catalogs. Geographic information systems (GIS) can be of great help in their definition, as they deal rigorously and less ambiguously with the available geographical data. Moreover, novel computer science approaches are now being employed in their definition. The Pyrenees mountain range, in southwest Europe, is located in a region characterized by low to moderate seismicity. In this study, a method based purely on seismic catalogs, managed with a GIS and a triclustering algorithm, was used to delineate seismogenic zones in the Pyrenees. Based on an updated, reviewed, declustered, extensive, and homogeneous earthquake catalog (including detailed information about each event such as date and time, hypocentral location, and size), a triclustering algorithm has been applied to generate the seismogenic zones. The method seeks seismicity patterns in a quasi-objective manner following an initial assessment as to the best suited seismic parameters. The eight zones identified as part of this study are represented on maps to be analyzed, with the zone extending from the Arudy–Arette region to Bagnères de Bigorre identified as the one with the highest seismic hazard potential.
- Published
- 2021
11. Discovering three-dimensional patterns in real-time from data streams: An online triclustering approach
- Author
-
Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos, Ministerio de Ciencia, Innovación y Universidades (MICINN). España, Melgar García, Laura, Gutiérrez Avilés, David, Rubio Escudero, Cristina, and Troncoso Lora, Alicia
- Abstract
Triclustering algorithms group sets of coordinates of 3-dimensional datasets. In this paper, a new triclustering approach for data streams is introduced. It follows a streaming scheme of learning in two steps: offline and online phases. First, the offline phase provides a summary model with the components of the triclusters. Then, the second stage is the online phase, which deals with data in streaming. This online phase consists of using the summary model obtained in the offline stage to update the triclusters as fast as possible with genetic operators. Results using three types of synthetic datasets and a real-world environmental sensor dataset are reported. The performance of the proposed triclustering streaming algorithm is compared to a batch triclustering algorithm, showing an accurate performance both in terms of quality and running times.
- Published
- 2021
12. High-Content Screening images streaming analysis using the STriGen methodology
- Author
-
Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos, Ministerio de Economía y Competitividad (MINECO). España, Melgar García, Laura, Gutiérrez Avilés, David, Rubio Escudero, Cristina, and Troncoso Lora, Alicia
- Abstract
One of the techniques that provide systematic insights into biological processes is High-Content Screening (HCS). It measures cell phenotypes simultaneously. When analysing these images, features like fluorescent colour, shape, spatial distribution and interaction between components can be found. STriGen, which works in a real-time environment, makes it possible to study the time evolution of these features in real time. In addition, data streaming algorithms are able to process flows of data quickly. In this article, the STriGen (Streaming Triclustering Genetic) algorithm is presented and applied to HCS images. Results show that STriGen finds quality triclusters in HCS images, adapts correctly over time and is faster than re-computing the triclustering algorithm each time a new data stream image arrives.
- Published
- 2020
13. Coronavirus Optimization Algorithm: A Bioinspired Metaheuristic Based on the COVID-19 Propagation Model
- Author
-
Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos, Ministerio de Economía y Competitividad (MINECO). España, Martínez Álvarez, Francisco, Asencio Cortés, Gualberto, Torres, J. F., Gutiérrez Avilés, David, Melgar García, Laura, Pérez Chacón, R., Rubio Escudero, Cristina, Riquelme Santos, José Cristóbal, and Troncoso Lora, Alicia
- Abstract
This study proposes a novel bioinspired metaheuristic simulating how the coronavirus spreads and infects healthy people. From a primary infected individual (patient zero), the coronavirus rapidly infects new victims, creating large populations of infected people who will either die or spread infection. Relevant terms such as reinfection probability, super-spreading rate, social distancing measures, or traveling rate are introduced into the model to simulate the coronavirus activity as accurately as possible. The infected population initially grows exponentially over time, but taking into consideration social isolation measures, the mortality rate, and number of recoveries, the infected population gradually decreases. The coronavirus optimization algorithm has two major advantages when compared with other similar strategies. First, the input parameters are already set according to the disease statistics, preventing researchers from initializing them with arbitrary values. Second, the approach has the ability to end after several iterations, without this value having to be set either. Furthermore, a parallel multivirus version is proposed, where several coronavirus strains evolve over time and explore wider search space areas in fewer iterations. Finally, the metaheuristic has been combined with deep learning models to find optimal hyperparameters during the training phase. As an application case, the problem of electricity load time series forecasting has been addressed, showing quite remarkable performance.
- Published
- 2020
14. Discovering Spatio-Temporal Patterns in Precision Agriculture Based on Triclustering
- Author
-
Universidad de Sevilla. Departamento de Lenguajes y Sistemas Informáticos, Ministerio de Economía y Competitividad (MINECO). España, Fundação para a Ciência e a Tecnologia (FCT), Melgar García, Laura, Godinho, María Teresa, Espada, Rita, Gutiérrez Avilés, David, Brito, Isabel Sofía, Martínez Álvarez, Francisco, Troncoso Lora, Alicia, and Rubio Escudero, Cristina
- Abstract
Agriculture has undergone some very important changes over the last few decades. The emergence and evolution of precision agriculture has made it possible to move from uniform site management to site-specific management, with both economic and environmental advantages. However, to be implemented effectively, site-specific management requires within-field spatial variability to be well known and characterized. In this paper, an algorithm that delineates within-field management zones in a maize plantation is introduced. The algorithm, based on triclustering, mines clusters from temporal remote sensing data. Data from maize crops in Alentejo, Portugal, have been used to assess the suitability of applying triclustering to discover patterns over time that may eventually help farmers to improve their harvests.
- Published
- 2020