26 results for "recurrent models"
Search Results
2. Enhancing Food Ingredient Named-Entity Recognition with Recurrent Network-Based Ensemble (RNE) Model.
- Author
Komariah, Kokoy Siti and Sin, Bong-Kee
- Subjects
RECOMMENDER systems, INFORMATION storage & retrieval systems, MENU planning
- Abstract
Food recipe sharing sites are becoming increasingly popular among people who want to learn how to cook or plan their menu. Through online food recipes, individuals can select ingredients that suit their lifestyle and health condition. Information from online food recipes is useful in developing food-related systems such as recommendations and health care systems. However, the information from online recipes is often unstructured. One way of extracting such information into a well-structured format is the technique called named-entity recognition (NER), which is the process of identifying keywords and phrases in the text and classifying them into a set of predetermined categories, such as location, persons, time, and others. We present a food ingredient named-entity recognition model called RNE (recurrent network-based ensemble methods) to extract the entities from the online recipe. RNE is an ensemble-learning framework using recurrent network models such as RNN, GRU, and LSTM. These models are trained independently on the same dataset and combined to produce better predictions in extracting food entities such as ingredient names, products, units, quantities, and states for each ingredient in a recipe. The experimental findings demonstrate that the proposed model achieves predictions with an F1 score of 96.09% and outperforms all individual models by 0.2% to 0.5% in percentage points. This result indicates that RNE can extract information from food recipes better than a single model. In addition, this information extracted by RNE can be used to support various information systems related to food. [ABSTRACT FROM AUTHOR]
- Published
- 2022
3. Image Segmentation Using Deep Learning: A Survey.
- Author
Minaee, Shervin, Boykov, Yuri, Porikli, Fatih, Plaza, Antonio, Kehtarnavaz, Nasser, and Terzopoulos, Demetri
- Subjects
*IMAGE segmentation, *DEEP learning, *COMPUTER vision, *IMAGE analysis, *IMAGE processing, *IMAGE compression
- Abstract
Image segmentation is a key task in computer vision and image processing with important applications such as scene understanding, medical image analysis, robotic perception, video surveillance, augmented reality, and image compression, among others, and numerous segmentation algorithms are found in the literature. Against this backdrop, the broad success of deep learning (DL) has prompted the development of new image segmentation approaches leveraging DL models. We provide a comprehensive review of this recent literature, covering the spectrum of pioneering efforts in semantic and instance segmentation, including convolutional pixel-labeling networks, encoder-decoder architectures, multiscale and pyramid-based approaches, recurrent networks, visual attention models, and generative models in adversarial settings. We investigate the relationships, strengths, and challenges of these DL-based segmentation models, examine the widely used datasets, compare performances, and discuss promising research directions. [ABSTRACT FROM AUTHOR]
- Published
- 2022
4. Time-warping invariant quantum recurrent neural networks via quantum-classical adaptive gating
- Author
Ivana Nikoloska, Osvaldo Simeone, Leonardo Banchi, and Petar Veličković
- Subjects
quantum machine learning, recurrent models, time-warping, Computer engineering. Computer hardware, TK7885-7895, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Adaptive gating plays a key role in temporal data processing via classical recurrent neural networks (RNNs), as it facilitates retention of past information necessary to predict the future, providing a mechanism that preserves invariance to time-warping transformations. This paper builds on quantum RNNs (QRNNs), a dynamic model with quantum memory, to introduce a novel class of temporal data processing quantum models that preserve invariance to time-warping transformations of the (classical) input-output sequences. The model, referred to as time warping-invariant QRNN (TWI-QRNN), augments a QRNN with a quantum–classical adaptive gating mechanism that chooses whether to apply a parameterized unitary transformation at each time step as a function of the past samples of the input sequence via a classical recurrent model. The TWI-QRNN model class is derived from first principles, and its capacity to successfully implement time-warping transformations is experimentally demonstrated on examples with classical or quantum dynamics.
- Published
- 2023
5. Sequence-to-sequence modeling for graph representation learning
- Author
Aynaz Taheri, Kevin Gimpel, and Tanya Berger-Wolf
- Subjects
Graph representation learning, Deep learning, Graph classification, Recurrent models, Applied mathematics. Quantitative methods, T57-57.97
- Abstract
We propose sequence-to-sequence architectures for graph representation learning in both supervised and unsupervised regimes. Our methods use recurrent neural networks to encode and decode information from graph-structured data. Recurrent neural networks require sequences, so we choose several methods of traversing graphs using different types of substructures with various levels of granularity to generate sequences of nodes for encoding. Our unsupervised approaches leverage long short-term memory (LSTM) encoder-decoder models to embed the graph sequences into a continuous vector space. We then represent a graph by aggregating its graph sequence representations. Our supervised architecture uses an attention mechanism to collect information from the neighborhood of a sequence. The attention module enriches our model in order to focus on the subgraphs that are crucial for the purpose of a graph classification task. We demonstrate the effectiveness of our approaches by showing improvements over the existing state-of-the-art approaches on several graph classification tasks.
- Published
- 2019
6. Enhancing Food Ingredient Named-Entity Recognition with Recurrent Network-Based Ensemble (RNE) Model
- Author
Kokoy Siti Komariah and Bong-Kee Sin
- Subjects
deep learning, ensemble method, food information extraction, named-entity recognition, recurrent models, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999
- Abstract
Food recipe sharing sites are becoming increasingly popular among people who want to learn how to cook or plan their menu. Through online food recipes, individuals can select ingredients that suit their lifestyle and health condition. Information from online food recipes is useful in developing food-related systems such as recommendations and health care systems. However, the information from online recipes is often unstructured. One way of extracting such information into a well-structured format is the technique called named-entity recognition (NER), which is the process of identifying keywords and phrases in the text and classifying them into a set of predetermined categories, such as location, persons, time, and others. We present a food ingredient named-entity recognition model called RNE (recurrent network-based ensemble methods) to extract the entities from the online recipe. RNE is an ensemble-learning framework using recurrent network models such as RNN, GRU, and LSTM. These models are trained independently on the same dataset and combined to produce better predictions in extracting food entities such as ingredient names, products, units, quantities, and states for each ingredient in a recipe. The experimental findings demonstrate that the proposed model achieves predictions with an F1 score of 96.09% and outperforms all individual models by 0.2% to 0.5% in percentage points. This result indicates that RNE can extract information from food recipes better than a single model. In addition, this information extracted by RNE can be used to support various information systems related to food.
- Published
- 2022
7. BS-LSTM: An Ensemble Recurrent Approach to Forecasting Soil Movements in the Real World
- Author
Praveen Kumar, Priyanka Sihag, Pratik Chaturvedi, K.V. Uday, and Varun Dutt
- Subjects
soil movements, time-series forecasting, recurrent models, simple LSTMs, stacked LSTMs, bidirectional LSTMs, Science
- Abstract
Machine learning (ML) proposes an extensive range of techniques, which could be applied to forecasting soil movements using historical soil movements and other variables. For example, researchers have proposed recurrent ML techniques like the long short-term memory (LSTM) models for forecasting time series variables. However, the application of novel LSTM models for forecasting time series involving soil movements is yet to be fully explored. The primary objective of this research is to develop and test a new ensemble LSTM technique (called “Bidirectional-Stacked-LSTM” or “BS-LSTM”). In the BS-LSTM model, forecasts of soil movements are derived from a bidirectional LSTM for a period. These forecasts are then fed into a stacked LSTM to derive the next period’s forecast. For developing the BS-LSTM model, datasets from two real-world landslide sites in India were used: Tangni (Chamoli district) and Kumarhatti (Solan district). The initial 80% of soil movements in both datasets were used for model training and the last 20% of soil movements in both datasets were used for model testing. The BS-LSTM model’s performance was compared to other LSTM variants, including a simple LSTM, a bidirectional LSTM, a stacked LSTM, a CNN-LSTM, and a Conv-LSTM, on both datasets. Results showed that the BS-LSTM model outperformed all other LSTM model variants during training and test in both the Tangni and Kumarhatti datasets. This research highlights the utility of developing recurrent ensemble models for forecasting soil movements ahead of time.
- Published
- 2021
8. Characterizing parking systems from sensor data through a data-driven approach.
- Author
Arjona Martinez, Jamie, Linares, Maria Paz, and Casanovas, Josep
- Subjects
*PARKING facilities, *RECURRENT neural networks, *CITY traffic, *QUALITY of life, *DEEP learning, *INTERNET of things
- Abstract
Nowadays, urban traffic affects the quality of life in cities, and the problem is exacerbated by parking issues: congestion increases due to drivers searching for slots to park. An Internet of Things approach permits drivers to know the parking availability in real time and provides data that can be used to develop predictive models. This can be useful in improving the management of parking areas while having an important effect on traffic. This work begins by describing the state-of-the-art parking predictive models and then introduces the recurrent neural network methods that were used, Long Short-Term Memory and Gated Recurrent Unit, as well as the models developed according to real scenarios in Wattens and Los Angeles. To improve the quality of the models, exogenous variables related to weather and calendar are considered. Finally, the results are described, followed by suggestions for future research. [ABSTRACT FROM AUTHOR]
- Published
- 2021
9. Time-warping invariant quantum recurrent neural networks via quantum-classical adaptive gating
- Author
Nikoloska, Ivana, Simeone, Osvaldo, Banchi, Leonardo, and Velickovic, Petar
- Abstract
Adaptive gating plays a key role in temporal data processing via classical recurrent neural networks (RNNs), as it facilitates retention of past information necessary to predict the future, providing a mechanism that preserves invariance to time warping transformations. This paper builds on quantum RNNs (QRNNs), a dynamic model with quantum memory, to introduce a novel class of temporal data processing quantum models that preserve invariance to time-warping transformations of the (classical) input-output sequences. The model, referred to as time warping-invariant QRNN (TWI-QRNN), augments a QRNN with a quantum–classical adaptive gating mechanism that chooses whether to apply a parameterized unitary transformation at each time step as a function of the past samples of the input sequence via a classical recurrent model. The TWI-QRNN model class is derived from first principles, and its capacity to successfully implement time-warping transformations is experimentally demonstrated on examples with classical or quantum dynamics.
- Published
- 2023
10. Sequential Deep Learning Models for Neonatal Sepsis Detection : A suitability assessment of deep learning models for event detection in physiological data
- Author
Alex Siren, Henrik
- Abstract
Sepsis is a life-threatening condition that neonatal patients are especially susceptible to. Fortunately, improved bedside monitoring has enabled the collection and use of continuous vital signs data for the purpose of detecting conditions such as sepsis. While current research has found some success in reducing mortality in neonatal intensive care units with linear directly interpretable models, such as logistic regression, accurate detection of sepsis from inherently noisy time-series data still remains a challenge. Furthermore, previous research has generally relied on pre-defined features extracted from raw vital signs data, which may not be optimal for the detection task. Therefore, assessing the overall feasibility of sequential deep learning models, such as recurrent and convolutional models, could improve the results of current research. This task was tackled in three phases. Firstly, baseline scores were established with a logistic regression model. Secondly, three common recurrent classifiers were tested on pre-defined window-based features and compared with each other. Thirdly, a convolutional architecture with a recurrent and a non-recurrent classifier was tested on raw low-frequency (1 Hz) signals in order to examine their capability to automatically extract features from the data. The final results from all phases were compared with each other. Results show that recurrent classifiers trained on pre-defined features do outperform automatic feature extraction with the convolutional models. The best model was based on a long short-term memory unit that achieved an area under the receiver operating characteristic curve (AUROC) of 0.806, and outperformed the established baseline results. In comparison with previous research, said model performed on par with the examined simple interpretable baseline models. The low results can likely be attributed to an insufficient sample size of patients with sepsis for the examined models and sub-optimal hyperparameter optimizat…
- Published
- 2022
11. Sekventiella djupinlärningsmodeller för detektering av neonatal sepsis : En lämplighetsbedömning av djupinlärningsmodeller för händelsedetektering i fysiologisk data
- Author
Alex Siren, Henrik
- Subjects
Fysiologisk data, Computer and Information Sciences, CNN-modeller, Convolutional models, Neonatal sepsis, Physiological data, Deep learning, Data- och informationsvetenskap, Recurrent models, RNN-modeller, Djupinlärning
- Abstract
Sepsis is a life-threatening condition that neonatal patients are especially susceptible to. Fortunately, improved bedside monitoring has enabled the collection and use of continuous vital signs data for the purpose of detecting conditions such as sepsis. While current research has found some success in reducing mortality in neonatal intensive care units with linear directly interpretable models, such as logistic regression, accurate detection of sepsis from inherently noisy time-series data still remains a challenge. Furthermore, previous research has generally relied on pre-defined features extracted from raw vital signs data, which may not be optimal for the detection task. Therefore, assessing the overall feasibility of sequential deep learning models, such as recurrent and convolutional models, could improve the results of current research. This task was tackled in three phases. Firstly, baseline scores were established with a logistic regression model. Secondly, three common recurrent classifiers were tested on pre-defined window-based features and compared with each other. Thirdly, a convolutional architecture with a recurrent and a non-recurrent classifier was tested on raw low-frequency (1 Hz) signals in order to examine their capability to automatically extract features from the data. The final results from all phases were compared with each other. Results show that recurrent classifiers trained on pre-defined features do outperform automatic feature extraction with the convolutional models. The best model was based on a long short-term memory unit that achieved an area under the receiver operating characteristic curve (AUROC) of 0.806, and outperformed the established baseline results. In comparison with previous research, said model performed on par with the examined simple interpretable baseline models. The low results can likely be attributed to an insufficient sample size of patients with sepsis for the examined models and sub-optimal hyperparameter optimization due to the number of possible configurations. Further avenues of research include examination of high-frequency data and more complex models for automatic feature extraction.
- Published
- 2022
12. Improved action proposals using fine-grained proposal features with recurrent attention models
- Author
Selen Pehlivan and Jorma Laaksonen (Lecturer Laaksonen Jorma group, Department of Computer Science, Aalto University)
- Subjects
Untrimmed video understanding, Signal Processing, Media Technology, Attention, Temporal action proposal generation, Computer Vision and Pattern Recognition, Recurrent models, Temporal convolution, Electrical and Electronic Engineering
- Abstract
Funding Information: This work has been funded by the Academy of Finland through the projects “Movie Making Finland: Finnish fiction films as audiovisual big data (MoMaF)” (project number 329268) and “Understanding speech and scene with ears and eyes (USSEE)” (project number 345791). We also acknowledge the computational resources provided by the Aalto University’s Aalto Science IT project and CSC–IT Center for Science.
Recent models for the temporal action proposal task show that local properties can be an alternative to the region proposal network (RPN) for generating good proposal candidates on untrimmed videos. In this study, we devise an RPN model with a new two-stage pipeline and a new joint scoring function for temporal proposals. The evaluation of local properties is integrated into our RPN model to search for the best proposal candidates that can be distinguished mainly in fine details of proposal regions. Our network models proposals in multiple scales using two recurrent neural network layers with attention mechanisms. We observe that joint training of the RPN with local clues and multi-scale modeling of proposals with recurrent attention mechanisms improve the performance of the proposal generation task. Our model yields state-of-the-art results on the THUMOS-14 and comparable results on the ActivityNet-1.3 datasets.
- Published
- 2023
13. Designing smart ITS services through innovative data analysis modeling
- Author
Universitat Politècnica de Catalunya. Departament d'Estadística i Investigació Operativa, Casanovas Garcia, Josep, Linares Herreros, María Paz, and Arjona Martínez, Jamie
- Abstract
Nowadays, one of the most important problems in urban areas concerns traffic congestion. This, in turn, has an impact on the economy, nature, human health, city architecture, and many other facets of life. Part of the vehicular traffic in cities is caused by parking space availability. The drivers of private vehicles usually want to leave their vehicles as close as possible to their destination. However, the parking slots are limited and may not be enough to sustain the demand, especially when the destination pertains to an attractive area. Thus, individuals looking for a place to park their vehicles contribute to increasing traffic flow density on roads where the parking demand cannot be satisfied. An Internet of Things (IoT) approach allows us to know the state of the parking system (availability of the parking slots) in real time through wireless networks of sensor devices. An intelligent treatment of this data could generate forecasted information that may be useful in improving management of on-street parking, thus having a notable effect on urban traffic. Smart parking systems first appeared in 2015, with IoT platforms in Santander, San Francisco and Melbourne. That is the year when those cities began to provide on-street real-time parking data in order to offer new services to their citizens. One of the most interesting services that these kinds of platforms can offer is parking availability forecasting, for which the first works in this field studied the temporal and spatial correlations of parking occupancy to support short-term forecasts (no more than 30 minutes). Those short-term forecasts are not useful at all to the end user of this service; thus, the necessary prediction intervals should be at the order of magnitude of hours. In this context, this thesis focuses on using parking and other sources of data to characterize and model different parking systems. 
The methodology used employs novel techniques for providing real-time forecasts of parking availa…
Nowadays, one of the biggest problems in urban areas stems from traffic congestion, which has a high impact on the economy, the environment, health, and other facets of urban life. Part of this congestion often stems from parking space availability: drivers of private vehicles usually want to park as close as possible to their destination, but parking spaces are limited and may not be enough to meet demand. An approach based on the Internet of Things (IoT) lets us know the availability of parking spaces in real time through wireless sensor networks. Intelligent processing of these data can generate information that helps predict future parking demand in the sensor-equipped areas, thereby improving parking management and affecting urban traffic. The first academic works in this area focused on studying the temporal and spatial correlations of parking occupancy to provide short-term forecasts (predictions at most 30 minutes ahead), which are often not useful because the end user prefers estimates of parking availability on the order of hours. In this context, this thesis focuses on using parking data and other sources to characterize and model different parking systems. The methodology employs innovative techniques to provide real-time predictions of parking availability based on sensor data. The models are developed from four methodologies: Autoregressive Integrated Moving Average (ARIMA), Multilayer Perceptron (MLP), Long-Short Term Memory (LSTM), and Gated Recurrent Unit (GRU). The first has been the standard forecasting approach in the literature on Transp… Systems
Postprint (published version)
- Published
- 2021
14. Characterizing parking systems from sensor data through a data-driven approach
- Author
Universitat Politècnica de Catalunya. Doctorat en Estadística i Investigació Operativa, Facultat d'Informàtica de Barcelona, Universitat Politècnica de Catalunya. Departament d'Estadística i Investigació Operativa, Barcelona Supercomputing Center, Universitat Politècnica de Catalunya. IMP - Information Modeling and Processing, Arjona Martínez, Jamie, Linares Herreros, María Paz, and Casanovas Garcia, Josep
- Abstract
Nowadays, urban traffic affects the quality of life in cities, and the problem is exacerbated by parking issues: congestion increases due to drivers searching for slots to park. An Internet of Things approach permits drivers to know the parking availability in real time and provides data that can be used to develop predictive models. This can be useful in improving the management of parking areas while having an important effect on traffic. This work begins by describing the state-of-the-art parking predictive models and then introduces the recurrent neural network methods that were used, Long Short-Term Memory and Gated Recurrent Unit, as well as the models developed according to real scenarios in Wattens and Los Angeles. To improve the quality of the models, exogenous variables related to weather and calendar are considered. Finally, the results are described, followed by suggestions for future research.
This research was funded by Secretaria d’Universitats i Recerca de la Generalitat de Catalunya [2017-SGR-1749] and under the Industrial Doctorate Program [2016-DI-79]. Peer Reviewed. Postprint (author's final draft)
- Published
- 2021
15. BS-LSTM: An Ensemble Recurrent Approach to Forecasting Soil Movements in the Real World
- Author
Pratik Chaturvedi, K. V. Uday, Varun Dutt, Praveen Kumar, and Priyanka Sihag
- Subjects
bidirectional LSTMs, Ensemble forecasting, Computer science, Science, simple LSTMs, stacked LSTMs, Machine learning, time-series forecasting, Model testing, Range (statistics), General Earth and Planetary Sciences, Artificial intelligence, Time series, soil movements, recurrent models
- Abstract
Machine learning (ML) proposes an extensive range of techniques, which could be applied to forecasting soil movements using historical soil movements and other variables. For example, researchers have proposed recurrent ML techniques like the long short-term memory (LSTM) models for forecasting time series variables. However, the application of novel LSTM models for forecasting time series involving soil movements is yet to be fully explored. The primary objective of this research is to develop and test a new ensemble LSTM technique (called “Bidirectional-Stacked-LSTM” or “BS-LSTM”). In the BS-LSTM model, forecasts of soil movements are derived from a bidirectional LSTM for a period. These forecasts are then fed into a stacked LSTM to derive the next period’s forecast. For developing the BS-LSTM model, datasets from two real-world landslide sites in India were used: Tangni (Chamoli district) and Kumarhatti (Solan district). The initial 80% of soil movements in both datasets were used for model training and the last 20% of soil movements in both datasets were used for model testing. The BS-LSTM model’s performance was compared to other LSTM variants, including a simple LSTM, a bidirectional LSTM, a stacked LSTM, a CNN-LSTM, and a Conv-LSTM, on both datasets. Results showed that the BS-LSTM model outperformed all other LSTM model variants during training and test in both the Tangni and Kumarhatti datasets. This research highlights the utility of developing recurrent ensemble models for forecasting soil movements ahead of time.
- Published
- 2021
- Full Text
- View/download PDF
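The chronological 80%/20% train/test split described in the BS-LSTM abstract above can be illustrated with a minimal sketch. The function names, the `lookback` window length, and the sample readings are assumptions made for illustration; the abstract does not specify how the authors windowed their data.

```python
from typing import List, Sequence, Tuple


def chronological_split(series: Sequence[float], train_fraction: float = 0.8
                        ) -> Tuple[List[float], List[float]]:
    """Split a time series in order: the first part trains, the rest tests."""
    cut = int(len(series) * train_fraction)
    return list(series[:cut]), list(series[cut:])


def make_windows(series: Sequence[float], lookback: int
                 ) -> List[Tuple[List[float], float]]:
    """Pair each run of `lookback` past values with the value that follows it."""
    return [
        (list(series[i:i + lookback]), series[i + lookback])
        for i in range(len(series) - lookback)
    ]


# Hypothetical soil-movement readings (illustrative values only).
movements = [0.0, 0.1, 0.1, 0.3, 0.2, 0.4, 0.5, 0.5, 0.7, 0.6]
train, test = chronological_split(movements)
train_windows = make_windows(train, lookback=3)
```

Splitting by position rather than at random keeps every test observation later in time than every training observation, which is the usual precaution in forecasting experiments.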
16. Designing smart ITS services through innovative data analysis modeling
- Author
-
Arjona Martínez, Jamie (ORCID 0000-0003-2585-2392), Casanovas Garcia, Josep, Linares Herreros, María Paz, and Universitat Politècnica de Catalunya. Departament d'Estadística i Investigació Operativa
- Subjects
Series temporales ,Time series ,Predicción de disponibilidad de aparcamiento ,Aprendizaje profundo ,Parking availability forecast ,Matemàtiques i estadística [Àrees temàtiques de la UPC] ,Deep learning ,Recurrent models ,Modelos recurrentes ,Smart cities - Abstract
Nowadays, one of the most important problems in urban areas concerns traffic congestion. This, in turn, has an impact on the economy, nature, human health, city architecture, and many other facets of life. Part of the vehicular traffic in cities is caused by parking space availability. The drivers of private vehicles usually want to leave their vehicles as close as possible to their destination. However, the parking slots are limited and may not be enough to sustain the demand, especially when the destination pertains to an attractive area. Thus, individuals looking for a place to park their vehicles contribute to increasing traffic flow density on roads where the parking demand cannot be satisfied. An Internet of Things (IoT) approach allows us to know the state of the parking system (availability of the parking slots) in real time through wireless networks of sensor devices. An intelligent treatment of this data could generate forecasted information that may be useful in improving management of on-street parking, thus having a notable effect on urban traffic. Smart parking systems first appeared in 2015, with IoT platforms in Santander, San Francisco and Melbourne. That is the year when those cities began to provide on-street real-time parking data in order to offer new services to their citizens. One of the most interesting services that these kinds of platforms can offer is parking availability forecasting, for which the first works in this field studied the temporal and spatial correlations of parking occupancy to support short-term forecasts (no more than 30 minutes). Those short-term forecasts are not useful at all to the end user of this service; thus, the necessary prediction intervals should be at the order of magnitude of hours. In this context, this thesis focuses on using parking and other sources of data to characterize and model different parking systems. 
The methodology used employs novel techniques for providing real-time forecasts of parking availability based on data from sensors with certain inaccuracies due to their mechanical nature. The models are developed from four different methodologies: ARIMA, multilayer perceptron (MLP), long-short term memory (LSTM) and gated recurrent unit (GRU). The first has been the standard approach to forecasting in the ITS literature, while the latter ones have proven to be the best neural network (NN) architectures for solving a wide set of sequential data problems, such as those presented in this work. As far as we know, LSTM and GRU methods (recurrent neural network approaches) have been used recently with good results in traffic forecasting, but not for parking. In addition, we propose using exogenous data such as weather conditions and calendar effects, thereby converting the problem from univariate to multivariate. It is shown here how NN methods naturally handle the increased complexity in the problem. The reason for using exogenous variables is that they can offer relevant information that cannot be inferred from the sensor measurements. The proposed methods have been intensively compared by creating parking models for parking sectors in five cities around the world. The results have been analysed in order to identify and provide exhaustive guidelines and insights into the inner mechanisms of parking systems while also ascertaining how the idiosyncrasies of each method are reflected in the model forecasts. When comparing the results according to their disciplines of origin (ARIMA from statistics and NN methods from machine learning), neither of the proposed methodologies is clearly better than the other, as both can provide forecasts with low error but by different means. 
ARIMA has shown lower error rates in small-sized sectors where the more recent status of the parking system is more relevant, while the NN methods are more capable of providing forecasts for large-sized sectors where patterns depend on long time horizons.
- Published
- 2021
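The step from a univariate to a multivariate problem described in the abstract above (adding weather and calendar effects to the sensor readings) can be sketched as follows. The feature choices, the `build_features` name, and the sample values are assumptions for illustration and are not taken from the thesis.

```python
from datetime import datetime
from typing import List


def build_features(occupancy: float, timestamp: datetime, rain_mm: float) -> List[float]:
    """Combine one sensor reading with calendar and weather inputs."""
    is_weekend = 1.0 if timestamp.weekday() >= 5 else 0.0  # Saturday=5, Sunday=6
    hour_fraction = timestamp.hour / 24.0                  # time of day scaled to [0, 1)
    return [occupancy, hour_fraction, is_weekend, rain_mm]


# Hypothetical reading: 62% occupancy on a Saturday at 18:00 with no rain.
row = build_features(0.62, datetime(2021, 6, 5, 18, 0), rain_mm=0.0)
```

Each time step then carries a vector instead of a single occupancy value, which is the form a multivariate forecasting model would consume.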
17. Design smart ITS services through innovative data analysis modeling
- Author
-
Arjona Martínez, Jamie, Universitat Politècnica de Catalunya. Departament d'Estadística i Investigació Operativa, Casanovas Garcia, Josep, and Linares Herreros, María Paz
- Subjects
Series temporales ,Time series ,Predicción de disponibilidad de aparcamiento ,Aprendizaje profundo ,Matemàtiques i estadística [Àrees temàtiques de la UPC] ,Parking availability forecast ,Deep learning ,Recurrent models ,Modelos recurrentes ,Smart cities - Abstract
Nowadays, one of the most important problems in urban areas concerns traffic congestion. This, in turn, has an impact on the economy, nature, human health, city architecture, and many other facets of life. Part of the vehicular traffic in cities is caused by parking space availability. The drivers of private vehicles usually want to leave their vehicles as close as possible to their destination. However, the parking slots are limited and may not be enough to sustain the demand, especially when the destination pertains to an attractive area. Thus, individuals looking for a place to park their vehicles contribute to increasing traffic flow density on roads where the parking demand cannot be satisfied. An Internet of Things (IoT) approach allows us to know the state of the parking system (availability of the parking slots) in real time through wireless networks of sensor devices. An intelligent treatment of this data could generate forecasted information that may be useful in improving management of on-street parking, thus having a notable effect on urban traffic. Smart parking systems first appeared in 2015, with IoT platforms in Santander, San Francisco and Melbourne. That is the year when those cities began to provide on-street real-time parking data in order to offer new services to their citizens. One of the most interesting services that these kinds of platforms can offer is parking availability forecasting, for which the first works in this field studied the temporal and spatial correlations of parking occupancy to support short-term forecasts (no more than 30 minutes). Those short-term forecasts are not useful at all to the end user of this service; thus, the necessary prediction intervals should be at the order of magnitude of hours. In this context, this thesis focuses on using parking and other sources of data to characterize and model different parking systems. 
The methodology used employs novel techniques for providing real-time forecasts of parking availability based on data from sensors with certain inaccuracies due to their mechanical nature. The models are developed from four different methodologies: ARIMA, multilayer perceptron (MLP), long-short term memory (LSTM) and gated recurrent unit (GRU). The first has been the standard approach to forecasting in the ITS literature, while the latter ones have proven to be the best neural network (NN) architectures for solving a wide set of sequential data problems, such as those presented in this work. As far as we know, LSTM and GRU methods (recurrent neural network approaches) have been used recently with good results in traffic forecasting, but not for parking. In addition, we propose using exogenous data such as weather conditions and calendar effects, thereby converting the problem from univariate to multivariate. It is shown here how NN methods naturally handle the increased complexity in the problem. The reason for using exogenous variables is that they can offer relevant information that cannot be inferred from the sensor measurements. The proposed methods have been intensively compared by creating parking models for parking sectors in five cities around the world. The results have been analysed in order to identify and provide exhaustive guidelines and insights into the inner mechanisms of parking systems while also ascertaining how the idiosyncrasies of each method are reflected in the model forecasts. When comparing the results according to their disciplines of origin (ARIMA from statistics and NN methods from machine learning), neither of the proposed methodologies is clearly better than the other, as both can provide forecasts with low error but by different means. 
ARIMA has shown lower error rates in small-sized sectors where the more recent status of the parking system is more relevant, while the NN methods are more capable of providing forecasts for large-sized sectors where patterns depend on long time horizons.
- Published
- 2021
18. Spatial and temporal integration of visual features
- Author
-
Choung, Oh-Hyeon and Herzog, Michael
- Subjects
Human vision ,Non-retinotopic processing ,Crowding ,Segmentation ,Grouping ,Convolutional Neural Networks ,Recurrent models ,Model of vision ,Feedforward networks ,Perceptual organization - Abstract
Visual processing can be seen as the integration and segmentation of features. Objects are composed of contours, integrated into shapes and segmented from other contours. Information also needs to be integrated to solve the ill-posed problems of vision. For example, in the "color" perception of an object, illuminance needs to be discounted, requiring large-scale integration of luminance values. Whereas there is little controversy about the crucial role of integration, very little is known about how it really works. In this thesis, I focused on large-scale spatiotemporal information using two paradigms. First, I used the Ternus-Pikler display (TPD) to understand non-retinotopic, temporal integration, and then I used crowding to understand spatial integration across, more or less, the entire visual field. Motions of object parts are perceived relative to the specific object. For example, a reflector on a bicycle wheel seems to rotate even though it is cycloidal in retinotopic coordinates. This is because the reflector's motion is subtracted from the bike's horizontal motion. Instead of bike motion, I used the TPD, which is perfectly suited to understand non-retinotopic processing. There are two possibilities of how information may be integrated non-retinotopically: either based on attentional tracking, e.g., of the reflector's motion, or relying on inbuilt automated mechanisms. I showed that attentional tracking does not play a major role for non-retinotopic processing in the TPD. Second, I showed that invisible retinotopic information can strongly modulate the visible, non-retinotopic percept, further supporting automated integration processes. Crowding occurs when the perception of a target deteriorates because of the surrounding elements. It is the standard situation in everyday vision, since elements are rarely encountered in isolation. The classic model of vision integrates information from low-level to high-level feature detectors. 
When flankers are added, this model can only predict a deterioration in performance. However, this prediction was proven wrong, because flankers far from the target can even lead to a release of crowding. Integration across the entire visual field is crucial. Here, I systematically investigated the characteristics of this large-scale integration. First, I dissected complex multi-flanker configurations and showed that low-level aspects play only a minor role. Configural aspects and the Gestalt principle of Prägnanz seem to be involved instead. However, as I showed second, the basic Gestalt principles fail to explain our results. Lastly, I tested several computational models, including one-stage feedforward models that integrate information within a local area or across the whole visual field, and two-stage recursive models that integrate global information and then explicitly segment elements. I showed that all models fail unless they take explicit grouping and segmentation processing into account, as capsule networks and the Laminart model do. Overall, spatial and temporal integration is a rather complex, inbuilt, automated mechanism, and integration occurs across the whole visual field, contrary to most classic and recent models in vision. Moreover, global integration can only be reproduced by two-stage models, which process grouping and segmentation. To better understand perception, we need to consider models that group elements by multiple processes and recursively segment other groups explicitly.
- Published
- 2021
- Full Text
- View/download PDF
19. Recurrent sparse support vector regression machines trained by active learning in the time-domain
- Author
-
Ceperic, V., Gielen, G., and Baric, A.
- Subjects
*SUPPORT vector machines , *ACTIVE learning , *REGRESSION analysis , *TIME-domain analysis , *COMPUTATIONAL complexity , *ALGORITHMS , *MATHEMATICAL models - Abstract
A method for the sparse solution of recurrent support vector regression machines is presented. The proposed method achieves high accuracy relative to model complexity and allows the user to adjust the complexity of the resulting model. Sparsity is guaranteed by limiting the number of training data points used by the support vector regression method. Each training data point is selected based on the accuracy of the fully recurrent model, using the active learning principle applied to successive time-domain data. The user can adjust the training time by selecting how often the hyper-parameters of the algorithm are optimised. The advantages of the proposed method are illustrated on several examples, and the experiments clearly show that it is possible to reduce the number of support vectors and to significantly improve the accuracy versus complexity of recurrent support vector regression machines. [Copyright Elsevier]
- Published
- 2012
- Full Text
- View/download PDF
20. Merge SOM for temporal data
- Author
-
Strickert, Marc and Hammer, Barbara
- Subjects
*SELF-organizing maps , *ARTIFICIAL neural networks , *NEURAL computers , *ROBOTICS - Abstract
The recent merging self-organizing map (MSOM) for unsupervised sequence processing constitutes a fast, intuitive, and powerful unsupervised learning model. In this paper, we investigate its theoretical and practical properties. Particular focus is put on the context established by the self-organizing MSOM, and theoretical results on the representation capabilities and the MSOM training dynamics are presented. For practical studies, the context model is combined with the neural gas vector quantizer to obtain merging neural gas (MNG) for temporal data. The suitability of MNG is demonstrated by experiments with artificial and real-world sequences with one- and multi-dimensional inputs from discrete and continuous domains. [Copyright Elsevier]
- Published
- 2005
- Full Text
- View/download PDF
21. Unsupervised recursive sequence processing
- Author
-
Strickert, Marc, Hammer, Barbara, and Blohm, Sebastian
- Subjects
*DATABASE searching , *VISUAL perception , *NEURAL circuitry , *INFORMATION retrieval - Abstract
The self-organizing map (SOM) is a valuable tool for data visualization and data mining for potentially high-dimensional data of an a priori fixed dimensionality. We investigate SOMs for sequences and propose the SOM-S architecture for sequential data. Sequences of potentially infinite length are recursively processed by integrating the currently presented item and the recent map activation, as proposed in the SOMSD presented in (IEEE Trans. Neural Networks 14(3) (2003) 491). We combine that approach with the hyperbolic neighborhood of Ritter (Proceedings of PKDD-01, Springer, Berlin, 2001, pp. 338–349), in order to account for the representation of possibly exponentially increasing sequence diversification over time. Discrete and real-valued sequences can be processed efficiently with this method, as we show in experiments. Temporal dependencies can be reliably extracted from a trained SOM. U-matrix methods, adapted to sequence-processing SOMs, also allow clusters to be detected for real-valued sequence elements. [Copyright Elsevier]
- Published
- 2005
- Full Text
- View/download PDF
22. A study of the connectionist models for software reliability prediction
- Author
-
Ho, S.L., Xie, M., and Goh, T.N.
- Subjects
*COMPUTER software , *RELIABILITY in engineering , *COMPUTER architecture , *POISSON processes , *ARTIFICIAL neural networks - Abstract
Many software reliability models are available for analysing software failure data; in particular, nonhomogeneous Poisson process (NHPP) models are commonly used. However, difficulties posed by their assumptions, and by the validity and relevance of these assumptions to the real testing environment, have limited their usefulness. The connectionist approach using neural network models is more flexible and relies on less restrictive assumptions. This model-free technique requires only the failure history as input and then develops its own internal model of the failure process. The ability of these models to capture nonlinear patterns and learn from the data makes the approach a valuable alternative methodology for characterising the failure process. In this paper, a modified Elman recurrent neural network for modeling and predicting software failures is investigated. The effects of different feedback weights in the proposed model are also studied. A comparative study of the proposed recurrent architecture, the more popular feedforward neural network, the Jordan recurrent model, and some traditional parametric software reliability growth models is carried out. [Copyright Elsevier]
- Published
- 2003
- Full Text
- View/download PDF
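The role of the feedback weight in an Elman-style recurrent network, mentioned in the abstract above, can be illustrated with a minimal one-unit sketch. This is the standard Elman recurrence under assumed scalar weights, not the authors' modified architecture, and the function and parameter names are hypothetical.

```python
import math
from typing import List


def elman_forecast(inputs: List[float], w_in: float, w_back: float,
                   w_out: float, bias: float = 0.0) -> float:
    """Run a one-unit Elman network over a sequence and return its output.

    The hidden (context) state is fed back at every step, scaled by `w_back`.
    """
    hidden = 0.0
    for x in inputs:
        hidden = math.tanh(w_in * x + w_back * hidden + bias)
    return w_out * hidden


# Hypothetical normalised failure history (illustrative values only).
history = [0.2, 0.5]
no_feedback = elman_forecast(history, w_in=1.0, w_back=0.0, w_out=2.0)
with_feedback = elman_forecast(history, w_in=1.0, w_back=0.8, w_out=2.0)
```

With `w_back=0.0` the output depends only on the last input, so comparing the two calls shows how the feedback weight lets earlier observations influence the prediction.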
23. Modelos Recorrentes para Geração de Fármacos (Recurrent Models for Drug Generation)
- Author
-
Carvalho, Angélica Santos and Arrais, Joel Perdiz
- Subjects
Fragmentation Growing Procedure ,Deep Learning ,Validação ,Drug Discovery ,Recurrent Models ,Validation ,Modelos Recurrentes - Abstract
Master's dissertation in Informatics Engineering presented to the Faculdade de Ciências e Tecnologia. Drug discovery aims to identify potential new medicines through a multidisciplinary process spanning several scientific areas, such as biology, chemistry and pharmacology. Multiple strategies and methodologies have been developed to discover, test and optimise new drugs. However, the path from target identification to an optimal marketable molecule is long. The main purpose of this dissertation is to develop computational models able to propose new drug compounds. To achieve this goal, artificial neural networks were explored and trained to generate new drugs in the form of Simplified Molecular-Input Line-Entry System (SMILES) strings. The neural network models explored were the Recurrent Neural Network (RNN), Long Short-Term Memory (LSTM), Gated Recurrent Unit (GRU) and Bidirectional Long Short-Term Memory (BLSTM). A consistent dataset was chosen, and the SMILES generated by the model were syntactically and biochemically validated. To restrict the generation of SMILES, a technique called the Fragmentation Growing Procedure was used, which makes it possible to choose a fragment and generate SMILES from it. To determine the best-fitting recurrent network and its parameters, several tests were performed; the network in the model that reached the best result, 98% valid SMILES and 93% unique SMILES, was an LSTM with two layers. The generation-restriction technique was applied to the best model and reached 99% valid SMILES and 79% unique SMILES.
- Published
- 2019
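The "valid" and "unique" percentages reported in the abstract above can be illustrated with a minimal sketch. The bracket-balance check is a deliberately crude stand-in for real syntactic and biochemical validation (which would normally use a cheminformatics toolkit), and computing uniqueness over the valid strings only is an assumption; the abstract does not state the exact definitions used.

```python
from typing import Iterable, Tuple


def looks_well_formed(smiles: str) -> bool:
    """Crude syntax check: non-empty, with balanced parentheses and brackets."""
    if not smiles:
        return False
    depth_round = depth_square = 0
    for ch in smiles:
        if ch == "(":
            depth_round += 1
        elif ch == ")":
            depth_round -= 1
        elif ch == "[":
            depth_square += 1
        elif ch == "]":
            depth_square -= 1
        if depth_round < 0 or depth_square < 0:
            return False
    return depth_round == 0 and depth_square == 0


def validity_and_uniqueness(generated: Iterable[str]) -> Tuple[float, float]:
    """Return (valid / generated, distinct valid / valid) as fractions."""
    generated = list(generated)
    valid = [s for s in generated if looks_well_formed(s)]
    if not generated or not valid:
        return 0.0, 0.0
    return len(valid) / len(generated), len(set(valid)) / len(valid)


# Hypothetical generator output; "CC(=O" has an unclosed parenthesis.
sample = ["CCO", "CCO", "c1ccccc1", "CC(=O", "C[NH3+]"]
valid_rate, unique_rate = validity_and_uniqueness(sample)
```

Reporting the two rates separately matters because a generator can score high on validity simply by repeating a few correct strings, which the uniqueness rate would expose.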
24. Recursive multikernel filters exploiting nonlinear temporal structure
- Author
-
Ignacio Santamaria, Simone Scardapane, Steven Van Vaerenbergh, and Universidad de Cantabria
- Subjects
FOS: Computer and information sciences ,Multiple kernel learning ,Computer science ,Hilbert space ,Machine Learning (stat.ML) ,Networking & telecommunications ,Multikernel ,Engineering and technology ,Machine Learning (cs.LG) ,Support vector machine ,Computer Science - Learning ,Kernel (linear algebra) ,Kernel method ,Statistics - Machine Learning ,Electrical engineering, electronic engineering, information engineering ,time-series ,kernel methods ,recurrent models ,Artificial intelligence & image processing ,Online algorithm ,Algorithm ,Reproducing kernel Hilbert space - Abstract
In kernel methods, temporal information on the data is commonly included by using time-delayed embeddings as inputs. Recently, an alternative formulation was proposed by defining a gamma-filter explicitly in a reproducing kernel Hilbert space, giving rise to a complex model where multiple kernels operate on different temporal combinations of the input signal. In the original formulation, the kernels are then simply combined to obtain a single kernel matrix (for instance by averaging), which provides computational benefits but discards important information on the temporal structure of the signal. Inspired by works on multiple kernel learning, we overcome this drawback by considering the different kernels separately. We propose an efficient strategy to adaptively combine and select these kernels during the training phase. The resulting batch and online algorithms automatically learn to process highly nonlinear temporal information extracted from the input signal, which is implicitly encoded in the kernel values. We evaluate our proposal on several artificial and real tasks, showing that it can outperform classical approaches both in batch and online settings. Eusipco 2017
- Published
- 2017
- Full Text
- View/download PDF
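The time-delayed embedding that the abstract above names as the common way to give kernel methods temporal information can be sketched in a few lines. The function name and the embedding dimension are assumptions for illustration; this shows only the conventional baseline input construction, not the recursive multikernel filter the paper proposes.

```python
from typing import List, Sequence


def delay_embed(signal: Sequence[float], dim: int) -> List[List[float]]:
    """Return vectors [x[n], x[n-1], ..., x[n-dim+1]] for every valid n."""
    return [
        [signal[n - k] for k in range(dim)]
        for n in range(dim - 1, len(signal))
    ]


# Each embedded vector would be passed to the kernel in place of a single sample.
embedded = delay_embed([1.0, 2.0, 3.0, 4.0], dim=3)
```

The embedding fixes the amount of history in advance through `dim`, which is the limitation that recursive formulations aim to relax.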
25. Player Identification in Hockey Broadcast Videos.
- Author
-
Chan, Alvin, Levine, Martin D., and Javan, Mehrsan
- Subjects
*CONVOLUTIONAL neural networks , *HOCKEY on television , *HOCKEY players , *COMPUTER vision , *RECURRENT neural networks , *VIDEOS - Abstract
We present a deep recurrent convolutional neural network (CNN) approach to solve the problem of hockey player identification in NHL broadcast videos. Player identification is a difficult computer vision problem mainly because of the players' similar appearance, occlusion, and blurry facial and physical features. However, we can observe players' jersey numbers over time by processing variable length image sequences of players (aka 'tracklets'). We propose an end-to-end trainable ResNet+LSTM network, with a residual network (ResNet) base and a long short-term memory (LSTM) layer, to discover spatio-temporal features of jersey numbers over time and learn long-term dependencies. Additionally, we employ a secondary 1-dimensional convolutional neural network classifier as a late score-level fusion method to classify the output of the ResNet+LSTM network. For this work, we created a new hockey player tracklet dataset that contains sequences of hockey player bounding boxes. This achieves an overall player identification accuracy score over 87% on the test split of our new dataset.
• Recurrent convolutional neural network proposed for hockey player identification.
• Variable length image sequences of player bounding boxes (tracklets) are classified.
• Spatial–temporal information across video frames improves jersey number predictions.
• A secondary classifier added as a late score-level fusion method increases accuracy.
[ABSTRACT FROM AUTHOR]
- Published
- 2021
26. Implementation of recurrent multi-models for system identification
- Author
-
Thiaw, Lamine, Madani, Kurosh, Malti, Rachid, and Sow, Gustave (Laboratoire Images, Signaux et Systèmes Intelligents (LISSI), Université Paris-Est Créteil Val-de-Marne - Paris 12 (UPEC UP12); Laboratoire de l'intégration, du matériau au système (IMS), Université Sciences et Technologies - Bordeaux 1-Institut Polytechnique de Bordeaux-Centre National de la Recherche Scientifique (CNRS))
- Subjects
non-linear systems ,multi-model ,System identification ,recurrent models ,[SPI.AUTO] Engineering Sciences [physics]/Automatic - Abstract
Multi-modeling is a recent tool proposed for modeling complex nonlinear systems by combining a set of relatively simple local models. Due to their simplicity, linear local models are mainly used in such structures. In this work, multi-models with polynomial local models are described and applied to system identification. Model parameters are estimated using least squares algorithms, which considerably reduce computation time compared to iterative algorithms. The proposed methodology is applied to the implementation of recurrent models. NARMAX and NOE multi-models are implemented and compared to their corresponding neural network implementations. The results show that the proposed recurrent multi-model architectures have many advantages over neural network models.
- Published
- 2007