938 results for "time-series data"
Search Results
202. Aggregating Time Series and Tabular Data in Deep Learning Model for University Students’ GPA Prediction
- Author
-
Harjanto Prabowo, Alam Ahmad Hidayat, Tjeng Wawan Cenggoro, Reza Rahutomo, Kartika Purwandari, and Bens Pardamean
- Subjects
Educational data mining, deep learning, GPA prediction, time-series data, tabular data, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Current approaches to university students’ Grade Point Average (GPA) prediction rely on tabular data as input. Intuitively, adding historical GPA data can help to improve the performance of a GPA prediction model. In this study, we present a dual-input deep learning model that is able to simultaneously process time-series and tabular data for predicting student GPA. Our proposed model achieved the best performance among all tested models, with 0.4142 MSE (Mean Squared Error) and 0.418 MAE (Mean Absolute Error) for GPA on a 4.0 scale. It also has the best $R^{2}$-score of 0.4879, which means it explains the true distribution of students’ GPA better than other models.
- Published
- 2021
- Full Text
- View/download PDF
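A minimal sketch of the dual-input idea described in record 202: one branch ingests the GPA history as a sequence, another ingests tabular features, and the two are merged before a regression head. Input shapes, layer sizes, and names are illustrative assumptions, not the authors' published architecture.

```python
# Sketch of a dual-input model: an LSTM branch for historical GPA sequences and a dense branch
# for tabular features, merged before the prediction head (illustrative, not the authors' network).
import tensorflow as tf

seq_in = tf.keras.Input(shape=(6, 1), name="gpa_history")      # e.g. six past semester GPAs (assumed)
tab_in = tf.keras.Input(shape=(10,), name="tabular_features")   # e.g. demographics, admission scores (assumed)

seq_branch = tf.keras.layers.LSTM(32)(seq_in)
tab_branch = tf.keras.layers.Dense(32, activation="relu")(tab_in)
merged = tf.keras.layers.Concatenate()([seq_branch, tab_branch])
output = tf.keras.layers.Dense(1, name="next_gpa")(merged)

model = tf.keras.Model(inputs=[seq_in, tab_in], outputs=output)
model.compile(optimizer="adam", loss="mse", metrics=["mae"])
model.summary()
```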
203. Groundwater Level Prediction Model Using Correlation and Difference Mechanisms Based on Boreholes Data for Sustainable Hydraulic Resource Management
- Author
-
Naeem Iqbal, Anam-Nawaz Khan, Atif Rizwan, Rashid Ahmad, Bong Wan Kim, Kwangsoo Kim, and Do-Hyeun Kim
- Subjects
Groundwater level prediction, machine learning, bagging and boosting, correlation analysis, time-series data, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Drilling data for groundwater extraction change over time due to variations in hydrogeological and weather conditions. At any time, if there is a need to deploy a change in drilling operations, drilling companies keep monitoring the time-series drilling data to make sure it does not introduce any changes or new errors. Therefore, a solution is needed to predict groundwater levels (GWL) and detect a change in boreholes data to improve drilling efficiency. The proposed study presents an ensemble GWL prediction (E-GWLP) model using boosting and bagging models based on stacking techniques to predict GWL for enhancing hydraulic resource management and planning. The proposed research study consists of two modules: descriptive analysis of boreholes data and a GWL prediction model using an ensemble model based on stacking. First, descriptive analysis techniques, such as correlation analysis and difference mechanisms, are applied to investigate boreholes log data for extracting underlying characteristics, which is critical for enhancing hydraulic resource management. Second, an ensemble prediction model is developed based on multiple hydrological patterns using robust machine learning (ML) techniques to predict GWL for enhancing drilling efficiency and water resource management. The architecture of the proposed ensemble model involves three boosting algorithms as base models (level-0) and a bagging algorithm as a meta-model that combines the base models' predictions (level-1). The base models consist of the following boosting algorithms: eXtreme Gradient Boosting (XGBoost), AdaBoost, and Gradient Boosting (GB). The meta-model includes Random Forest (RF) as a bagging algorithm, referred to as the level-1 model. Furthermore, different evaluation metrics are used, including mean absolute error (MAE), mean square error (MSE), root mean square error (RMSE), mean absolute percentage error (MAPE), and R2 score. The performance of the proposed E-GWLP model is compared with existing ensemble and baseline models. The experimental results reveal that the proposed model performed accurately, with MAE, MSE, and RMSE of 0.340, 0.564, and 0.751, respectively. The MAPE and R2 score of our proposed approach are 12.658 and 0.976, respectively, which signifies the importance of our work. Moreover, the experimental results suggest that the E-GWLP model is suitable for sustainable water resource management and improves reservoir engineering.
- Published
- 2021
- Full Text
- View/download PDF
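The level-0 boosting / level-1 bagging stack described in record 203 maps naturally onto scikit-learn's StackingRegressor. A minimal sketch under assumed placeholder data and hyperparameters; it is not the authors' E-GWLP implementation.

```python
# Minimal level-0 boosting / level-1 bagging stack (sketch only, not the authors' E-GWLP code).
import numpy as np
from sklearn.ensemble import (AdaBoostRegressor, GradientBoostingRegressor,
                              RandomForestRegressor, StackingRegressor)
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from xgboost import XGBRegressor  # assumes the xgboost package is installed

# Placeholder data: replace with borehole log features and observed GWL.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 8))
y = X[:, 0] * 2.0 + rng.normal(scale=0.5, size=500)

base_models = [                       # level-0 boosting models
    ("xgb", XGBRegressor(n_estimators=200, max_depth=4)),
    ("ada", AdaBoostRegressor(n_estimators=200)),
    ("gb", GradientBoostingRegressor(n_estimators=200)),
]
stack = StackingRegressor(            # level-1 bagging meta-model
    estimators=base_models,
    final_estimator=RandomForestRegressor(n_estimators=300),
)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
stack.fit(X_tr, y_tr)
pred = stack.predict(X_te)
print("MAE", mean_absolute_error(y_te, pred),
      "RMSE", mean_squared_error(y_te, pred) ** 0.5,
      "R2", r2_score(y_te, pred))
```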
204. A Visual Analytics Interface for Formulating Evaluation Metrics of Multi-Dimensional Time-Series Data
- Author
-
Rei Takami, Hiroki Shibata, and Yasufumi Takama
- Subjects
Data visualization, data analysis, evaluation metrics, graphical user interfaces, human–computer interaction, time-series data, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
A visual analytics (VA) interface for formulating evaluation metrics of multi-dimensional time-series data is proposed. Evaluation metrics such as key performance indicators (KPI) are expected to play an important role in quantitatively evaluating current situations and the quality of target objects. However, it is difficult for even domain experts to formulate metrics, especially for data with complexity related to dimensionality and temporal characteristics. The proposed interface is designed by extending the concept of semantic interaction to consider the temporal characteristics of target data. It represents metrics as a linear combination of data attributes and provides a means for adjusting it through interactive VA. On an animated scatter plot, an analyst can directly manipulate several visualized objects, i.e., a node, a trajectory, and a convex hull, as the group of nodes and trajectories. The result of manipulating the objects is reflected in the linear combination of attributes, which corresponds to an axis of the scatter plot. Using the axes as the output of the analysis, analysts can formulate a metric. The effectiveness of the proposed interface is demonstrated through an example and evaluated by two user experiments on the basis of hypotheses obtained from the example.
- Published
- 2021
- Full Text
- View/download PDF
205. Rainfall prediction: A comparative analysis of modern machine learning algorithms for time-series forecasting
- Author
-
Ari Yair Barrera-Animas, Lukumon O. Oyedele, Muhammad Bilal, Taofeek Dolapo Akinosho, Juan Manuel Davila Delgado, and Lukman Adewale Akanbi
- Subjects
Rainfall prediction, LSTM Networks, Multivariate time-series, Multi-step forecast, Time-series data, Cybernetics, Q300-390, Electronic computers. Computer science, QA75.5-76.95
- Abstract
Rainfall forecasting has gained utmost research relevance in recent times due to its complexities and persistent applications such as flood forecasting and monitoring of pollutant concentration levels, among others. Existing models use complex statistical models that are often too costly, both computationally and financially, or are not applied to downstream applications. Therefore, approaches that use Machine Learning algorithms in conjunction with time-series data are being explored as an alternative to overcome these drawbacks. To this end, this study presents a comparative analysis using simplified rainfall estimation models based on conventional Machine Learning algorithms and Deep Learning architectures that are efficient for these downstream applications. Models based on LSTM, Stacked-LSTM, Bidirectional-LSTM Networks, XGBoost, and an ensemble of Gradient Boosting Regressor, Linear Support Vector Regression, and an Extra-trees Regressor were compared in the task of forecasting hourly rainfall volumes using time-series data. Climate data from 2000 to 2020 from five major cities in the United Kingdom were used. The evaluation metrics of Loss, Root Mean Squared Error, Mean Absolute Error, and Root Mean Squared Logarithmic Error were used to evaluate the models’ performance. Results show that a Bidirectional-LSTM Network can be used as a rainfall forecast model with comparable performance to Stacked-LSTM Networks. Among all the models tested, the Stacked-LSTM Network with two hidden layers and the Bidirectional-LSTM Network performed best. This suggests that models based on LSTM-Networks with fewer hidden layers perform better for this approach, indicating their suitability for budget-conscious rainfall forecasting applications.
- Published
- 2022
- Full Text
- View/download PDF
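Record 205 compares LSTM variants for hourly rainfall; below is a minimal Keras sketch of a Bidirectional-LSTM forecaster over sliding windows. The window length, layer size, and synthetic series are illustrative assumptions rather than the paper's configuration.

```python
# Sketch of an hourly-rainfall forecaster with a Bidirectional LSTM (illustrative, not the paper's model).
import numpy as np
import tensorflow as tf

def make_windows(series, window=24):
    """Turn a 1-D series into (samples, window, 1) inputs and next-step targets."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return np.array(X)[..., None], np.array(y)

rain = np.abs(np.random.default_rng(0).normal(size=5000))  # placeholder hourly rainfall volumes
X, y = make_windows(rain, window=24)

model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(24, 1)),
    tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(64)),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse",
              metrics=[tf.keras.metrics.RootMeanSquaredError()])
model.fit(X, y, epochs=5, batch_size=64, validation_split=0.2, verbose=0)
```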
206. Temporal aggregation bias and Gerrymandering urban time series.
- Author
-
Stehle, Samuel
- Subjects
-
TIME series analysis, GERRYMANDERING, DATA analysis
- Abstract
The Modifiable Areal Unit Problem (MAUP) influences the interpretation of spatial data in that forms of spatial aggregation create scale and segmentation ecological fallacies. This paper explores the extent to which similar scalar and segmentation issues affect the analysis of temporal data. The analogy of gerrymandering in spatial data, which is the purposeful segmentation of space such that the underlying aggregations prove a specific point, is used to demonstrate segmentation and aggregation effects on time series data. To do so, the paper evaluates real-time sound monitoring data for Dublin, Ireland at multiple aggregation scales and segmentations to determine their effects with respect to compliance with European Union regulations concerning acceptable decibel levels. Like the MAUP, increasing scales of temporal aggregation remove extremes at more local scales, which has the effect of reducing measurements of non-compliance. Similarly, and unlike the spatial equivalent, because of circadian human social patterns, segmentation of temporal measurements also has a predictable, and gerrymander-able, effect on the measurement of compliance with ambient sound limits. The effect is computed as the Temporal Aggregation Bias, and strategies which could justify gerrymandering of sound monitoring data are presented. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
207. Managing streamed sensor data for mobile equipment prognostics.
- Author
-
Griffiths, Toby, Corrêa, Débora, Hodkiewicz, Melinda, and Polpo, Adriano
- Subjects
HEALTH management, WIRELESS communications, HEALTH status indicators, LOGISTIC regression analysis, DATA analysis
- Abstract
The ability to wirelessly stream data from sensors on heavy mobile equipment provides opportunities to proactively assess asset condition. However, data analysis methods are challenging to apply due to the size and structure of the data, which contain inconsistent and asynchronous entries, and large periods of missing data. Current methods usually require expertise from site engineers to inform variable selection. In this work, we develop a data preparation method to clean and arrange this streaming data for analysis, including a data-driven variable selection. Data are drawn from a mining industry case study, with sensor data from a primary production excavator over a period of 9 months. Variables include 58 numerical sensors and 40 binary indicators captured in 45-million rows of data describing the conditions and status of different subsystems of the machine. A total of 57% of time stamps contain missing values for at least one sensor. The response variable is drawn from fault codes selected by the operator and stored in the fleet management system. Application to the hydraulic system, for 21 failure events identified by the operator, shows that the data-driven selection contains variables consistent with subject matter expert expectations, as well as some sensors on other systems on the excavator that are less easy to explain from an engineering perspective. Our contribution is to demonstrate a compressed data representation using open-high-low-close and variable selection to visualize data and support identification of potential indicators of failure events from multivariate streamed data. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
208. A Visualization Approach for Monitoring Order Processing in E-Commerce Warehouse.
- Author
-
Tang, Junxiu, Zhou, Yuhua, Tang, Tan, Weng, Di, Xie, Boyang, Yu, Lingyun, Zhang, Huaqiang, and Wu, Yingcai
- Subjects
WAREHOUSES, ORDER picking systems, VISUAL analytics, VISUALIZATION, DATA warehousing, CUSTOMER satisfaction, ELECTRONIC commerce
- Abstract
The efficiency of warehouses is vital to e-commerce. Fast order processing at the warehouses ensures timely deliveries and improves customer satisfaction. However, monitoring, analyzing, and manipulating order processing in the warehouses in real time are challenging for traditional methods due to the sheer volume of incoming orders, the fuzzy definition of delayed order patterns, and the complex decision-making of order handling priorities. In this paper, we adopt a data-driven approach and propose OrderMonitor, a visual analytics system that assists warehouse managers in analyzing and improving order processing efficiency in real time based on streaming warehouse event data. Specifically, the order processing pipeline is visualized with a novel pipeline design based on the sedimentation metaphor to facilitate real-time order monitoring and suggest potentially abnormal orders. We also design a novel visualization that depicts order timelines based on the Gantt charts and Marey's graphs. Such a visualization helps the managers gain insights into the performance of order processing and find major blockers for delayed orders. Furthermore, an evaluating view is provided to assist users in inspecting order details and assigning priorities to improve the processing performance. The effectiveness of OrderMonitor is evaluated with two case studies on a real-world warehouse dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
209. Identification of Key Genes in 'Luang Pratahn', Thai Salt-Tolerant Rice, Based on Time-Course Data and Weighted Co-expression Networks.
- Author
-
Sonsungsan, Pajaree, Chantanakool, Pheerawat, Suratanee, Apichat, Buaboocha, Teerapong, Comai, Luca, Chadchawan, Supachitra, and Plaimas, Kitiporn
- Subjects
GENE regulatory networks, GENETIC variation, RICE, GENES, DATABASES, SALINITY
- Abstract
Salinity is an important environmental factor causing a negative effect on rice production. To prevent salinity effects on rice yields, genetic diversity concerning salt tolerance must be evaluated. In this study, we investigated the salinity responses of rice (Oryza sativa) to determine the critical genes. The transcriptomes of 'Luang Pratahn' rice, a local Thai rice variety with high salt tolerance, were used as a model for analyzing and identifying the key genes responsible for salt-stress tolerance. Based on 3' Tag-Seq data from the time course of salt-stress treatment, weighted gene co-expression network analysis was used to identify key genes in gene modules. We obtained 1,386 significantly differentially expressed genes in eight modules. Among them, six modules indicated a significant correlation within 6, 12, or 48h after salt stress. Functional and pathway enrichment analysis was performed on the co-expressed genes of interesting modules to reveal which genes were mainly enriched within important functions for salt-stress responses. To identify the key genes in salt-stress responses, we considered the two-state co-expression networks, normal growth conditions, and salt stress to investigate which genes were less important in a normal situation but gained more impact under stress. We identified key genes for the response to biotic and abiotic stimuli and tolerance to salt stress. Thus, these novel genes may play important roles in salinity tolerance and serve as potential biomarkers to improve salt tolerance cultivars. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
210. Eigenstate Transition of Multi-Channel Time Series Data around Earthquakes.
- Author
-
Okada, Akihisa and Kaneda, Yoshiyuki
- Abstract
To decrease human and economic damage owing to earthquakes, it is necessary to discover signals preceding earthquakes. We focus on the concept of "early warning signals" developed in bifurcation analysis, in which an increase in the variances of variables precedes its transition. If we can treat earthquakes as one of the transition phenomena that moves from one state to the other state, this concept is useful for detecting earthquakes before they start. We develop a covariance matrix from multi-channel time series data observed by an observatory on the seafloor and calculate the first eigenvalue and corresponding eigenstate of the matrix. By comparing the time dependence of the eigenstate to some past earthquakes, it is shown that the contribution from specific observational channels to the eigenstate increases before earthquakes, and there is a case in which the eigenvalue increases as predicted in early warning signals. This result suggests the first eigenvalue and eigenstate of multi-channel data are useful to identify signals preceding earthquakes. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
211. Periodicity-Oriented Data Analytics on Time-Series Data for Intelligence System.
- Author
-
Kim, Heonho, Yun, Unil, Vo, Bay, Lin, Jerry Chun-Wei, and Pedrycz, Witold
- Abstract
Periodic pattern mining models analyze patterns which occur periodically in a time-series database, such as sensor readings of smartphones and/or Internet of Things devices. The extracted patterns can be utilized for risk prediction, system management, and decision-making. In this article, we propose an efficient periodicity-oriented data analytics approach. It deliberately ignores intermediate events by adopting the concept of flexible periodic patterns, so it can be applied to more diverse real-life scenarios and systems. Moreover, the proposed approach adopts a novel symbol-centered data structure instead of the existing data structures used by state-of-the-art approaches for periodic pattern mining. Performance evaluations on real-life datasets (Diabetes, Oil Prices, and Bike Sharing) and requirements show that our approach has better runtime, memory usage, number of visited patterns, and sensitivity than efficient periodic pattern mining (EPPM) and flexible periodic pattern mining (FPPM), which are the state-of-the-art approaches in the same field. The experimental results show that the proposed algorithm will require less runtime and smaller memory than the existing algorithms on most data and requirements in real life. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
212. Electricity demand and price forecasting model for sustainable smart grid using comprehensive long short term memory.
- Author
-
Fatema, Israt, Kong, Xiaoying, and Fang, Gengfa
- Subjects
DEMAND forecasting, LONG-term memory, SHORT-term memory, ELECTRIC power consumption, ELECTRICITY pricing, STANDARD deviations
- Abstract
This paper proposes an electricity demand and price forecast model for large smart-city datasets using a single comprehensive Long Short-Term Memory (LSTM) model based on a sequence-to-sequence network. Real electricity market data from the Australian Energy Market Operator (AEMO) is used to validate the effectiveness of the proposed model. Several simulations with different configurations are executed on actual data to produce reliable results. The validation results indicate that the devised model is a better option to forecast the electricity demand and price with an acceptably smaller error. A comparison of the proposed model is also provided with a few existing models: Support Vector Machine (SVM), Regression Tree (RT), and Neural Nonlinear Autoregressive network with Exogenous variables (NARX). Compared to SVM, RT, and NARX, the Root Mean Square Error (RMSE) of the proposed forecasting model has been improved by 11.25%, 20%, and 33.5%, respectively, considering demand, and by 12.8%, 14.5%, and 47%, respectively, considering price; similarly, the Mean Absolute Error (MAE) has been improved by 14%, 22.5%, and 32.5%, respectively, considering demand, and by 8.4%, 21%, and 61%, respectively, considering price. Additionally, the proposed model can produce reliable forecast results without large historical datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
213. Fast Piecewise Polynomial Fitting of Time-Series Data for Streaming Computing
- Author
-
Jianhua Gao, Weixing Ji, Lulu Zhang, Senhao Shao, Yizhuo Wang, and Feng Shi
- Subjects
Least squares, piecewise polynomial fitting, streaming computing, time-series data, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
Streaming computing attracts intense attention because of the demand for analyzing massive data in real time. Due to unbounded and continuous input, the volume of streaming data is so high that all the data cannot be permanently stored. Piecewise polynomial fitting is a popular data compression method that approximately represents the raw data stream with multiple polynomials. The polynomial coefficients corresponding to the best-fitting curve can be calculated by the method of least squares, which minimizes the sum of the squared residuals between observed and fitted values. However, built on several matrix calculations, the method of least squares always leads to high time complexity and is difficult to apply to streaming computing. This paper puts forward a fast piecewise polynomial fitting method for time-series data in streaming computing. The input data stream is dynamically segmented according to a given residual bound. Meanwhile, the data points in each segment are fitted using an improved polynomial fitting method, which has less time overhead than general polynomial fitting by reusing the intermediate calculation results. Experimental results on four time-series datasets show that our algorithm can achieve speedups over general piecewise polynomial fitting of up to 2.82x for periodically sampled time-series data and 1.85x for aperiodically sampled time-series data, without affecting the compression ratio and fitting accuracy. Moreover, the event-time latency comparison in a streaming environment indicates that the improved method can sustain higher throughput than general piecewise polynomial fitting with the same latency.
- Published
- 2020
- Full Text
- View/download PDF
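The segmentation logic in record 213 — grow a segment until the least-squares fit exceeds a residual bound, then start a new one — can be illustrated with NumPy. This plain version refits each window from scratch; the paper's contribution is precisely to avoid that by reusing intermediate least-squares results, so treat this as a conceptual sketch only.

```python
# Greedy piecewise polynomial fitting under a residual bound (conceptual sketch; the paper's
# method reuses intermediate least-squares results for speed, which this version does not).
import numpy as np

def piecewise_fit(t, x, degree=2, max_residual=0.1):
    """Split (t, x) into segments whose polynomial fit keeps the max absolute residual bounded."""
    segments = []
    start = 0
    end = start + degree + 2          # smallest window that over-determines the fit
    while end <= len(t):
        coeffs = np.polyfit(t[start:end], x[start:end], degree)
        resid = np.abs(np.polyval(coeffs, t[start:end]) - x[start:end]).max()
        if resid > max_residual:
            # close the segment just before the bound was violated and start a new one there
            segments.append((start, end - 1, np.polyfit(t[start:end - 1], x[start:end - 1], degree)))
            start = end - 1
            end = start + degree + 2
        else:
            end += 1
    segments.append((start, len(t), np.polyfit(t[start:], x[start:], degree)))
    return segments

t = np.linspace(0, 10, 500)
x = np.sin(t) + 0.01 * np.random.default_rng(0).normal(size=t.size)
print(len(piecewise_fit(t, x)), "segments")
```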
214. Enhanced inference of ecological networks by parameterizing ensembles of population dynamics models constrained with prior knowledge
- Author
-
Chen Liao, Joao B. Xavier, and Zhenduo Zhu
- Subjects
Lotka–Volterra model, Time-series data, Summary food web, Ecological network inference, Ensemble method, Invasive species, Ecology, QH540-549.5
- Abstract
Background: Accurate network models of species interaction could be used to predict population dynamics and be applied to manage real world ecosystems. Most relevant models are nonlinear, however, and data available from real world ecosystems are too noisy and sparsely sampled for common inference approaches. Here we improved the inference of generalized Lotka–Volterra (gLV) ecological networks by using a new optimization algorithm to constrain parameter signs with prior knowledge and a perturbation-based ensemble method. Results: We applied the new inference to long-term species abundance data from the freshwater fish community in the Illinois River, United States. We constructed an ensemble of 668 gLV models that explained 79% of the data on average. The models indicated (at a 70% level of confidence) a strong positive interaction from emerald shiner (Notropis atherinoides) to channel catfish (Ictalurus punctatus), which we could validate using data from a nearby observation site, and predicted that the relative abundances of most fish species will continue to fluctuate temporally and concordantly in the near future. The network shows that the invasive silver carp (Hypophthalmichthys molitrix) has much stronger impacts on native predators than on prey, supporting the notion that the invader perturbs the native food chain by replacing the diets of predators. Conclusions: Ensemble approaches constrained by prior knowledge can improve inference and produce networks from noisy and sparsely sampled time series data to fill knowledge gaps on real world ecosystems. Such network models could aid efforts to conserve ecosystems such as the Illinois River, which is threatened by the invasion of the silver carp.
- Published
- 2020
- Full Text
- View/download PDF
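For readers unfamiliar with the generalized Lotka–Volterra (gLV) form inferred in record 214, here is a small simulation sketch of the dynamics dx_i/dt = x_i (r_i + Σ_j A_ij x_j). The two-species parameter values are arbitrary illustrations, not the fitted Illinois River network.

```python
# Generalized Lotka-Volterra dynamics dx_i/dt = x_i * (r_i + sum_j A_ij * x_j),
# simulated for two illustrative species (parameters are made up, not the paper's fits).
import numpy as np
from scipy.integrate import solve_ivp

r = np.array([0.8, -0.4])                  # intrinsic growth rates
A = np.array([[-1.0, -0.3],                # interaction matrix (negative diagonal = self-limiting)
              [0.5, -0.8]])

def glv(t, x):
    return x * (r + A @ x)

sol = solve_ivp(glv, (0.0, 50.0), y0=[0.2, 0.1], dense_output=True)
print(sol.y[:, -1])                        # abundances at the end of the simulation
```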
215. Managing streamed sensor data for mobile equipment prognostics
- Author
-
Toby Griffiths, Débora Corrêa, Melinda Hodkiewicz, and Adriano Polpo
- Subjects
Asset health, LASSO logistic regression, mobile fleet prognostics, sensor data, Streaming data, time-series data, Engineering (General). Civil engineering (General), TA1-2040
- Abstract
The ability to wirelessly stream data from sensors on heavy mobile equipment provides opportunities to proactively assess asset condition. However, data analysis methods are challenging to apply due to the size and structure of the data, which contain inconsistent and asynchronous entries, and large periods of missing data. Current methods usually require expertise from site engineers to inform variable selection. In this work, we develop a data preparation method to clean and arrange this streaming data for analysis, including a data-driven variable selection. Data are drawn from a mining industry case study, with sensor data from a primary production excavator over a period of 9 months. Variables include 58 numerical sensors and 40 binary indicators captured in 45-million rows of data describing the conditions and status of different subsystems of the machine. A total of 57% of time stamps contain missing values for at least one sensor. The response variable is drawn from fault codes selected by the operator and stored in the fleet management system. Application to the hydraulic system, for 21 failure events identified by the operator, shows that the data-driven selection contains variables consistent with subject matter expert expectations, as well as some sensors on other systems on the excavator that are less easy to explain from an engineering perspective. Our contribution is to demonstrate a compressed data representation using open-high-low-close and variable selection to visualize data and support identification of potential indicators of failure events from multivariate streamed data.
- Published
- 2022
- Full Text
- View/download PDF
216. A Cryptocurrency Price Prediction Model using Deep Learning
- Author
-
V. Akila, M.V.S. Nitin, I. Prasanth, Reddy M. Sandeep, and G. Akash Kumar
- Subjects
cryptocurrency, change point detection algorithms, bitcoin, long short-term memory, time-series data, Environmental sciences, GE1-350
- Abstract
Cryptocurrencies have gained immense popularity in recent years as an emerging asset class, and their prices are known to be highly volatile. Predicting cryptocurrency prices is a difficult task due to their complex nature and the absence of a central authority. In this paper, our proposal is to employ Long Short-Term Memory (LSTM) networks, a type of deep learning technique, to forecast the prices of cryptocurrencies. We use historical price data and technical indicators as inputs to the LSTM model, which learns the underlying patterns and trends in the data. To improve the accuracy of the predictions, we also incorporate a Change Point Detection (CPD) technique using the Pruned Exact Linear Time (PELT) algorithm. This method allows us to detect significant changes in cryptocurrency prices and adjust the LSTM model accordingly, leading to better predictions. We evaluate our approach predominantly on Bitcoin cryptocurrency, but the model can be implemented on other cryptocurrencies provided there are valid historical price data. Our experimental results show that our proposed model outperforms the baseline LSTM algorithm, achieving higher accuracy and better performance in terms of Mean Absolute Error (MAE), Mean Square Error (MSE), and Root Mean Square Error (RMSE). Our research findings suggest that combining deep learning techniques such as LSTM with change point detection techniques such as PELT can improve cryptocurrency price prediction accuracy and have practical implications for investors, traders, and financial analysts.
- Published
- 2023
- Full Text
- View/download PDF
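The change-point step in record 216 uses the PELT algorithm, for which the ruptures package provides a common implementation. A hedged sketch, assuming prices is a 1-D array of historical closing values; the cost model and penalty value are illustrative choices, not the authors' settings.

```python
# Detect change points in a price series with PELT (sketch; penalty and cost model are assumptions).
import numpy as np
import ruptures as rpt

prices = np.cumsum(np.random.default_rng(0).normal(size=1000))  # placeholder for closing prices

algo = rpt.Pelt(model="rbf", min_size=10).fit(prices)
breakpoints = algo.predict(pen=10)   # indices where the statistical regime changes
print(breakpoints)

# The detected breakpoints can then be used to segment the series (or as extra features)
# before training the LSTM forecaster described in the abstract.
```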
217. Binned Data Provide Better Imputation of Missing Time Series Data from Wearables
- Author
-
Shweta Chakrabarti, Nupur Biswas, Khushi Karnani, Vijay Padul, Lawrence D. Jones, Santosh Kesari, and Shashaanka Ashili
- Subjects
imputation, missing data, time-series data, binning, wearables, Chemical technology, TP1-1185
- Abstract
The presence of missing values in a time-series dataset is a very common and well-known problem. Various statistical and machine learning methods have been developed to overcome this problem, with the aim of filling in the missing values in the data. However, the performances of these methods vary widely, showing a high dependence on the type of data and correlations within the data. In our study, we applied some of the well-known imputation methods, such as expectation maximization, k-nearest neighbor, iterative imputer, random forest, and simple imputer, to impute missing data obtained from smart, wearable health trackers. In this manuscript, we proposed the use of data binning for imputation. We showed that the use of data binned around the missing time interval provides a better imputation than the use of a whole dataset. Imputation was performed for 15 min and 1 h of continuous missing data. We used a dataset with different bin sizes, such as 15 min, 30 min, 45 min, and 1 h, and we carried out evaluations using root mean square error (RMSE) values. We observed that the expectation maximization algorithm worked best for the use of binned data. This was followed by the simple imputer, iterative imputer, and k-nearest neighbor, whereas the random forest method had no effect on data binning during imputation. Moreover, the smallest bin sizes of 15 min and 1 h were observed to provide the lowest RMSE values for the majority of the time frames during the imputation of 15 min and 1 h of missing data, respectively. Although applicable to digital health data, we think that this method will also find applicability in other domains.
- Published
- 2023
- Full Text
- View/download PDF
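A minimal sketch of the binning idea in record 217: impute a missing stretch from a local window of surrounding samples rather than from the whole series. The window size, the two synthetic channels, and the use of scikit-learn's KNNImputer are assumptions for illustration, not the study's exact protocol.

```python
# Impute a missing stretch of wearable data using only a local bin around it (illustrative sketch).
import numpy as np
from sklearn.impute import KNNImputer

rng = np.random.default_rng(0)
t = np.arange(2000)
heart_rate = 70 + 5 * np.sin(t / 50) + rng.normal(size=t.size)
steps = 20 + 10 * np.sin(t / 50 + 0.3) + rng.normal(size=t.size)
data = np.column_stack([heart_rate, steps])

observed = data.copy()
observed[1000:1015, 0] = np.nan            # 15 consecutive missing heart-rate samples

def impute_binned(data, start, stop, bin_size=60):
    """Impute the missing block using only bin_size rows on each side of it."""
    lo, hi = max(0, start - bin_size), min(len(data), stop + bin_size)
    window = data[lo:hi]
    filled = KNNImputer(n_neighbors=5).fit_transform(window)
    out = data.copy()
    out[lo:hi] = filled
    return out

imputed = impute_binned(observed, 1000, 1015)
rmse = np.sqrt(np.mean((imputed[1000:1015, 0] - heart_rate[1000:1015]) ** 2))
print("local-bin RMSE:", rmse)
```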
218. Identification of Key Genes in ‘Luang Pratahn’, Thai Salt-Tolerant Rice, Based on Time-Course Data and Weighted Co-expression Networks
- Author
-
Pajaree Sonsungsan, Pheerawat Chantanakool, Apichat Suratanee, Teerapong Buaboocha, Luca Comai, Supachitra Chadchawan, and Kitiporn Plaimas
- Subjects
salt tolerant rice, 3' Tag Seq, time-series data, weighted co-expression network, two-state co-expression network, network-based analysis, Plant culture, SB1-1110
- Abstract
Salinity is an important environmental factor causing a negative effect on rice production. To prevent salinity effects on rice yields, genetic diversity concerning salt tolerance must be evaluated. In this study, we investigated the salinity responses of rice (Oryza sativa) to determine the critical genes. The transcriptomes of ‘Luang Pratahn’ rice, a local Thai rice variety with high salt tolerance, were used as a model for analyzing and identifying the key genes responsible for salt-stress tolerance. Based on 3' Tag-Seq data from the time course of salt-stress treatment, weighted gene co-expression network analysis was used to identify key genes in gene modules. We obtained 1,386 significantly differentially expressed genes in eight modules. Among them, six modules indicated a significant correlation within 6, 12, or 48h after salt stress. Functional and pathway enrichment analysis was performed on the co-expressed genes of interesting modules to reveal which genes were mainly enriched within important functions for salt-stress responses. To identify the key genes in salt-stress responses, we considered the two-state co-expression networks, normal growth conditions, and salt stress to investigate which genes were less important in a normal situation but gained more impact under stress. We identified key genes for the response to biotic and abiotic stimuli and tolerance to salt stress. Thus, these novel genes may play important roles in salinity tolerance and serve as potential biomarkers to improve salt tolerance cultivars.
- Published
- 2021
- Full Text
- View/download PDF
219. Growth and instability analysis of major crops in Nepal
- Author
-
Priyambada Joshi, Pramod Gautam, and Pramila Wagle
- Subjects
Compound growth rate, Instability index, Production performance, Time-series data, Major crops, Agriculture (General), S1-972, Nutrition. Foods and food supply, TX341-641
- Abstract
Agriculture production growth is dependent on various factors such as policies, inputs, and the environment. It is imperative to compute growth rates for the formulation of plans, policies, and strategies to boost production performance. This study examined the compound growth rate and instability index of major crops in Nepal. The time-series data of 29 years from 1990/91 to 2018/19 was analysed by dividing the total time period into three sub-periods corresponding to the three decades. The instability of most of the crops' area, production, and productivity declined during the second sub-period and increased in the third sub-period. The growth rates of the area, production, and productivity of the crops declined during the second sub-period except for maize, potato, and vegetables. During the third sub-period, only the area growth rates of sugarcane, oilseeds, and maize rose, whereas the growth rate of production increased for sugarcane, oilseed, lentil, and rice. For better production performance of crops, crop-specific strategies regarding area expansion and technological intervention should be promulgated.
- Published
- 2021
- Full Text
- View/download PDF
220. Synthetic data generation methods in healthcare: A review on open-source tools and methods.
- Author
-
Pezoulas VC, Zaridis DI, Mylona E, Androutsos C, Apostolidis K, Tachos NS, and Fotiadis DI
- Abstract
Synthetic data generation has emerged as a promising solution to overcome the challenges posed by data scarcity and privacy concerns, as well as to address the need for training artificial intelligence (AI) algorithms on unbiased data with sufficient sample size and statistical power. Our review explores the application and efficacy of synthetic data methods in healthcare considering the diversity of medical data. To this end, we systematically searched the PubMed and Scopus databases with a great focus on tabular, imaging, radiomics, time-series, and omics data. Studies involving multi-modal synthetic data generation were also explored. The type of method used for the synthetic data generation process was identified in each study and was categorized into statistical, probabilistic, machine learning, and deep learning. Emphasis was given to the programming languages used for the implementation of each method. Our evaluation revealed that the majority of the studies utilize synthetic data generators to: (i) reduce the cost and time required for clinical trials for rare diseases and conditions, (ii) enhance the predictive power of AI models in personalized medicine, (iii) ensure the delivery of fair treatment recommendations across diverse patient populations, and (iv) enable researchers to access high-quality, representative multimodal datasets without exposing sensitive patient information, among others. We underline the wide use of deep learning based synthetic data generators in 72.6% of the included studies, with 75.3% of the generators being implemented in Python. A thorough documentation of open-source repositories is finally provided to accelerate research in the field.
- Published
- 2024
- Full Text
- View/download PDF
221. Benchmarking online sequence-to-sequence and character-based handwriting recognition from IMU-enhanced pens
- Author
-
Ott, Felix, Rügamer, David, Heublein, Lucas, Hamann, Tim, Barth, Jens, Bischl, Bernd, and Mutschler, Christopher
- Published
- 2022
- Full Text
- View/download PDF
222. A deep learning approach using graph convolutional networks for slope deformation prediction based on time-series displacement data.
- Author
-
Ma, Zhengjing, Mei, Gang, Prezioso, Edoardo, Zhang, Zhongjian, and Xu, Nengxiong
- Subjects
-
DEEP learning, PREDICTION models, PROPERTY damage, FORECASTING, TIME series analysis
- Abstract
Slope deformation prediction is crucial for early warning of slope failure, which can prevent property damage and save human life. Existing predictive models focus on predicting the displacement of a single monitoring point based on time series data, without considering spatial correlations among monitoring points, which makes it difficult to reveal the displacement changes in the entire monitoring system and ignores the potential threats from nonselected points. To address the above problem, this paper presents a novel deep learning method for predicting the slope deformation, by considering the spatial correlations between all points in the entire displacement monitoring system. The essential idea behind the proposed method is to predict the slope deformation based on the global information (i.e., the correlated displacements of all points in the entire monitoring system), rather than based on the local information (i.e., the displacements of a specified single point in the monitoring system). In the proposed method, (1) a weighted adjacency matrix is built to interpret the spatial correlations between all points, (2) a feature matrix is assembled to store the time-series displacements of all points, and (3) one of the state-of-the-art deep learning models, i.e., T-GCN, is developed to process the above graph-structured data consisting of two matrices. The effectiveness of the proposed method is verified by performing predictions based on a real dataset. The proposed method can be applied to predict time-dependency information in other similar geohazard scenarios, based on time-series data collected from multiple monitoring points. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
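Record 222 feeds a T-GCN two structures: a weighted adjacency matrix over monitoring points and a feature matrix of their displacement histories. A NumPy sketch of assembling both; the Gaussian-kernel distance weighting is an assumed choice, not necessarily the authors' scheme.

```python
# Build graph-structured inputs for a T-GCN-style model: a weighted adjacency matrix over
# monitoring points and a (points x time steps) displacement feature matrix. Illustrative only.
import numpy as np

rng = np.random.default_rng(0)
n_points, n_steps = 12, 200
coords = rng.uniform(0, 100, size=(n_points, 2))                         # placeholder point locations
displacements = np.cumsum(rng.normal(size=(n_points, n_steps)), axis=1)  # feature matrix

# Weighted adjacency from pairwise distances (Gaussian kernel; an assumed weighting choice).
dists = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
sigma = dists.std()
adjacency = np.exp(-(dists ** 2) / (2 * sigma ** 2))
np.fill_diagonal(adjacency, 0.0)

print(adjacency.shape, displacements.shape)   # (12, 12) and (12, 200), ready for a T-GCN
```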
223. Time-series data dynamic density clustering.
- Author
-
Chen, Hao, Xia, Yu, Pan, Yuekai, and Yang, Qing
- Subjects
-
DENSITY
- Abstract
In many clustering problems, the whole data is not always static. Over time, part of it is likely to be changed, such as updated, erased, etc. Under this effect, the timeline can be divided into multiple time segments. And the data at each time slice is static. Then, the data along the timeline shows a series of dynamic intermediate states. The union set of data from all time slices is called the time-series data. Obviously, the traditional clustering process does not apply directly to the time-series data. Meanwhile, repeating the clustering process at every time slice is tremendously costly. In this paper, we analyze the transition rules of the data set and cluster structure when the time slice shifts to the next. We find there is a distinct correlation of data set and succession of cluster structure between two adjacent time slices, which means we can use it to reduce the cost of the whole clustering process. Inspired by this, we propose a dynamic density clustering method (DDC) for time-series data. In the simulations, we choose 6 representative problems to construct the time-series data for testing DDC. The results show DDC can get high accuracy results for all 6 problems while reducing the overall cost markedly. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
224. Long-Term Changes in the Wintering Population of the Dalmatian Pelican along the Black Sea-Mediterranean Flyway.
- Author
-
Barboutis, Christos, Kassara, Christina, Alexandrou, Olga, and Catsadorakis, Giorgos
- Abstract
Understanding spatiotemporal dynamics in wildlife populations is of paramount importance for their effective conservation, however longitudinal studies are relatively scarce for most animal groups. Waterbirds are an exception however, since midwinter surveys have been implemented in most areas of the world for over four decades. The Dalmatian Pelican Pelecanus crispus is a globally threatened emblematic wetland species of the Palearctic, with a wide distribution in Europe and Asia. Its global population is divided into three distinct groups that coincide with the Black Sea-Mediterranean Flyway, the Central Asian Flyway and the East Asian Flyway. In this study we used International Waterfowl Census data to assess long-term changes in the wintering population of the Dalmatian Pelican pertaining to the Black Sea-Mediterranean Flyway. We report national and regional population trends in SE Europe and Turkey and explore spatiotemporal patterns in the wintering numbers and distribution of the species in relation to climate variability during the last two decades. Our key findings suggest that during the past 30 years the abundance of wintering Pelicans increased across the entire study area. Within the eastern subpopulation this increase was most accentuated in the northern edge of the species' wintering distribution, which was associated with a local warming trend, and was coupled with a north-eastern shift in the distribution pattern, yet not driven by climate conditions. Other contributing factors, such as winter site fidelity, local food availability, finer scale climatic and habitat conditions, but also carry-over effects should be considered in future studies. Given the advancement of first laying dates in Dalmatian Pelicans in almost all breeding sites and the strict timing of IWC counts, we also propose the implementation of species-specific winter surveys, independently from IWC, to obtain a more thorough understanding of the dynamics of the Dalmatian Pelican's wintering population. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
225. Wetland Monitoring: Reporting
- Author
-
Crossman, Neville D., Stratford, Charlie J., Finlayson, C. Max, editor, Everard, Mark, editor, Irvine, Kenneth, editor, McInnes, Robert J., editor, Middleton, Beth A., editor, van Dam, Anne A., editor, and Davidson, Nick C., editor
- Published
- 2018
- Full Text
- View/download PDF
226. Outlier Detection in Time-Series Data: Specific to Nearly Uniform Signals from the Sensors
- Author
-
Suman, Sourabh, Rajathilagam, B., Tavares, João Manuel R.S., Series Editor, Jorge, Renato Natal, Series Editor, Hemanth, D. Jude, editor, and Smys, S., editor
- Published
- 2018
- Full Text
- View/download PDF
227. A Topic Structuration Method on Time Series for a Meeting from Text Data
- Author
-
Okada, Ryotaro, Nakanishi, Takafumi, Tanaka, Yuichi, Ogasawara, Yutaka, Ohashi, Kazuhiro, Kacprzyk, Janusz, Series editor, and Lee, Roger, editor
- Published
- 2018
- Full Text
- View/download PDF
228. Inferring Transcriptional Dynamics with Time-Dependent Reaction Rates Using Stochastic Simulation
- Author
-
Shetty, Keerthi S., Annappa, B., Kacprzyk, Janusz, Series Editor, Pal, Nikhil R., Advisory Editor, Bello Perez, Rafael, Advisory Editor, Corchado, Emilio S., Advisory Editor, Hagras, Hani, Advisory Editor, Kóczy, László T., Advisory Editor, Kreinovich, Vladik, Advisory Editor, Lin, Chin-Teng, Advisory Editor, Lu, Jie, Advisory Editor, Melin, Patricia, Advisory Editor, Nedjah, Nadia, Advisory Editor, Nguyen, Ngoc Thanh, Advisory Editor, Wang, Jun, Advisory Editor, Sa, Pankaj Kumar, editor, Bakshi, Sambit, editor, Hatzilygeroudis, Ioannis K., editor, and Sahoo, Manmath Narayan, editor
- Published
- 2018
- Full Text
- View/download PDF
229. WAMS/SCADA Data Fusion Method Study Based on Time-Series Data Correlation Mining
- Author
-
Zhao, LiJin, Huang, Liang, Lv, Qiansu, Yang, Tao, Wei, Daqian, Kacprzyk, Janusz, Series editor, Pal, Nikhil R., Advisory editor, Bello Perez, Rafael, Advisory editor, Corchado, Emilio S., Advisory editor, Hagras, Hani, Advisory editor, Kóczy, László T., Advisory editor, Kreinovich, Vladik, Advisory editor, Lin, Chin-Teng, Advisory editor, Lu, Jie, Advisory editor, Melin, Patricia, Advisory editor, Nedjah, Nadia, Advisory editor, Nguyen, Ngoc Thanh, Advisory editor, Wang, Jun, Advisory editor, Hu, Zhengbing, editor, Petoukhov, Sergey, editor, and He, Matthew, editor
- Published
- 2018
- Full Text
- View/download PDF
230. Time series modeling and forecasting of epidemic spreading processes using deep transfer learning.
- Author
-
Xue, Dong, Wang, Ming, Liu, Fangzhou, and Buss, Martin
- Subjects
-
SARS Epidemic, 2002-2003, DEEP learning, HEBBIAN memory, TIME series analysis, EPIDEMICS, CONVOLUTIONAL neural networks, KNOWLEDGE transfer, CHARGE transfer
- Abstract
Traditional data-driven methods for modeling and predicting epidemic spreading typically operate in an independent and identically distributed setting. However, epidemic spreading on complex networks exhibits significant heterogeneity across different phases, regions, and viruses, indicating that epidemic time series may not be independent and identically distributed due to temporal and spatial variations. In this article, a novel deep transfer learning method integrating convolutional neural networks (CNNs) and bi-directional long short-term memory (BiLSTM) networks is proposed to model and forecast epidemics with heterogeneous data. The proposed method combines a CNN-based layer for local feature extraction, a BiLSTM-based layer for temporal analysis, and a fully connected layer for prediction, and employs transfer learning to enhance the generalization ability of the CNN-BiLSTM model. To improve prediction performance, hyperparameter tuning is conducted using particle swarm optimization during model training. Finally, we adopt the proposed approach to characterize the spatio-temporal spreading dynamics of COVID-19 and infer the pathological heterogeneity among epidemics of severe acute respiratory syndrome (SARS), influenza A (H1N1), and COVID-19. The comprehensive results demonstrate the effectiveness of the proposed approach in exploring the spatiotemporal variations in the spread of epidemics and characterizing the epidemiological features of different viruses. Moreover, the proposed method can reduce modeling and prediction errors in epidemic spread to some extent. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
231. Small perturbations are enough: Adversarial attacks on time series prediction.
- Author
-
Wu, Tao, Wang, Xuechun, Qiao, Shaojie, Xian, Xingping, Liu, Yanbing, and Zhang, Liang
- Subjects
-
TIME series analysis, PREDICTION models, FORECASTING, DEEP learning, IMAGE processing, DATA mining
- Abstract
Time-series data are widespread in real-world industrial scenarios. To recover and infer missing information in real-world applications, the problem of time-series prediction has been widely studied as a classical research topic in data mining. Deep learning architectures have been viewed as next-generation time-series prediction models. However, recent studies have shown that deep learning models are vulnerable to adversarial attacks. In this study, we prospectively examine the problem of time-series prediction adversarial attacks and propose an attack strategy for generating an adversarial time series by adding malicious perturbations to the original time series to deteriorate the performance of time-series prediction models. Specifically, a perturbation-based adversarial example generation algorithm is proposed using the gradient information of the prediction model. In practice, unlike the imperceptibility to humans in the field of image processing, time-series data are more sensitive to abnormal perturbations and there are more stringent requirements regarding the amount of perturbations. To address this challenge, we craft an adversarial time series based on the importance measurement to slightly perturb the original data. Based on comprehensive experiments conducted on real-world time-series datasets, we verify that the proposed adversarial attack methods not only effectively fool the target time-series prediction model LSTNet but also attack state-of-the-art CNN-, RNN-, and MHANET-based models. Meanwhile, the results show that the proposed methods achieve good transferability. That is, the adversarial examples generated for a specific prediction model can significantly affect the performance of the other methods. Moreover, through a comparison with existing adversarial attack approaches, we can see that much smaller perturbations are sufficient for the proposed importance-measurement based adversarial attack method. The methods described in this paper are significant in understanding the impact of adversarial attacks on time-series prediction and promoting the robustness of such prediction technologies. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
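The gradient-based perturbation in record 231 follows the familiar fast-gradient-sign pattern. A hedged PyTorch sketch against a toy forecaster; the epsilon budget and model are placeholders, and the paper's importance-measurement weighting of perturbations is omitted.

```python
# Gradient-sign perturbation of an input window to degrade a forecaster (FGSM-style sketch;
# the paper additionally weights perturbations by an importance measure, omitted here).
import torch
import torch.nn as nn

class TinyForecaster(nn.Module):
    """Stand-in prediction model: LSTM over a univariate window, one-step-ahead output."""
    def __init__(self):
        super().__init__()
        self.lstm = nn.LSTM(input_size=1, hidden_size=32, batch_first=True)
        self.head = nn.Linear(32, 1)
    def forward(self, x):
        out, _ = self.lstm(x)
        return self.head(out[:, -1])

model = TinyForecaster()
x = torch.randn(8, 48, 1, requires_grad=True)   # batch of 48-step input windows
y = torch.randn(8, 1)                           # true next values (placeholder)

loss = nn.functional.mse_loss(model(x), y)
loss.backward()

epsilon = 0.05                                   # perturbation budget (assumed)
x_adv = (x + epsilon * x.grad.sign()).detach()   # adversarial windows
print(nn.functional.mse_loss(model(x_adv), y).item())
```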
232. Time‐course in attractiveness of pheromone lure on the smaller tea tortrix moth: A generalized additive mixed model approach.
- Author
-
Sudo, Masaaki, Sato, Yasushi, and Yorozuya, Hiroshi
- Subjects
-
MOTHS, INSECT traps, PHEROMONE traps, TEA, INSECT pests, POPULATION ecology, ANIMAL offspring sex ratio
- Abstract
Long‐term pest insect monitoring in agriculture and forestry has advanced population ecology. However, the discontinuation of research materials such as pheromone lure products jeopardizes data collection continuity, which constrains the utilization of the industrial datasets in ecology. Three pheromone lures against the smaller tea tortrix moth Adoxophyes honmai Yasuda (Lepidoptera; Tortricidae) were available but one was recently discontinued. Hence, a statistical method is required to convert data among records of moths captured with different lures. We developed several generalized additive mixed models (GAMM) separating temporal fluctuation in the background male density during trapping and attenuation of lure attractiveness due to aging or air exposure after settlement. We collected multisite trap data over four moth generations. The lures in each of these were unsealed at different times before trap settlement. We used cross‐validation to select the model with the best generalization performance. The preferred GAMM had nonlinear density fluctuation terms and lure attractiveness decreased exponentially after unsealing. The attenuation rates varied among lures. A light trap dataset near the pheromone traps was a candidate for a male density predictor. Nevertheless, there was only a weak correlation between trap yields, suggesting the difficulty of data conversion between the traps differing in attraction mechanisms. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
233. Is it possible to forecast KOSPI direction using deep learning methods?
- Author
-
Songa Choi and Jongwoo Song
- Subjects
DEEP learning, STOCK market index options, SHORT-term memory
- Abstract
Deep learning methods have been developed, used in various fields, and they have shown outstanding performances in many cases. Many studies predicted a daily stock return, a classic example of time-series data, using deep learning methods. We also tried to apply deep learning methods to Korea’s stock market data. We used Korea’s stock market index (KOSPI) and several individual stocks to forecast daily returns and directions. We compared several deep learning models with other machine learning methods, including random forest and XGBoost. In regression, long short term memory (LSTM) and gated recurrent unit (GRU) models are better than other prediction models. For the classification applications, there is no clear winner. However, even the best deep learning models cannot predict significantly better than the simple base model. We believe that it is challenging to predict daily stock return data even if we use the latest deep learning methods. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
234. Are there long-term temporal trends of size composition and the length– weight relationship? Results for chokka squid Loligo reynaudii during the peak spawning season off the south coast of South Africa.
- Author
-
Lipiński, MR, Mmethi, MA, Yemane, D, Githaiga-Mwicigi, J, and Sauer, WHH
- Subjects
-
SQUIDS, COASTS, CECUM, STOMACH, OCEAN temperature, OTOLITHS
- Abstract
Temporal trends in the size composition (length frequency) and length–weight (L–W) relationship of chokka squid Loligo reynaudii on the south coast of South Africa were assessed over periods spanning 22 years: length frequencies from 1996 to 2017 (with 15 years represented); and L–W relationships over 9 years between 1994 and 2016. To allow for comparison, identical data selection and processing was adopted for all years considered (i.e. identical period of 60 days in spring–summer; the same depths and areas; chokka with empty stomachs; and squid of the same maturity stage). Although there were no significant long-term temporal trends in the mean lengths, there was a significant short-term drop in the mean lengths over the years 2014–2017 (especially in females), which could not be attributed with certainty to any cause. A tentative explanation is that this drop might be linked to the introduction of an additional closed season in these years. The estimated parameters of the L–W relationship also revealed no trend over the years considered. Investigation of the caecum colour, which indicates the state of starvation (white: 8 h on average after food ingestion; yellow: 6 to 7 h after food ingestion), showed significantly more starving males than starving females. Starvation of males on the spawning grounds might be associated with the spawning behaviour of chokka. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
235. Regional‐scale forest restoration effects on ecosystem resiliency to drought: a synthesis of vegetation and moisture trends on Google Earth Engine.
- Author
-
Sankey, Temuulen, Belmonte, Adam, Massey, Richard, Leonard, Jackson, Disney, Mat, and Armenteras, Dolors
- Subjects
FOREST restoration, WILDFIRE prevention, FUEL reduction (Wildfire prevention), NORMALIZED difference vegetation index, DROUGHTS, FOREST health, LANDSAT satellites, MOISTURE
- Abstract
Large-scale changes in forest structure and ecological function throughout western North America have led to increased frequency, size, and severity of wildfires. The US Forest Service is implementing state-wide forest restoration initiatives to reduce wildfire hazards and improve forest health. We provide a synthesis of pre- and post-treatment forest vegetation and ecosystem moisture trends between 1990 and 2017 in Arizona, the first US state where the initiative has implemented a variety of thinning and burning methods in over 1,200 polygon areas across 3.5 million ha. Using 4,426 Landsat satellite images on Google Earth Engine, we calculated normalized difference moisture index (NDMI), normalized difference water index (NDWI), and normalized difference vegetation index (NDVI) to create dense time-series datasets. The indices and the 1990-2017 annual total precipitation dataset were then examined using a Mann–Kendall tau test to identify statistically significant upward and downward trends for each pixel. Our results indicate that much of the study region was experiencing drought conditions prior to restoration treatments and NDVI values were significantly decreasing, especially during the dry spring season. However, both NDMI and NDWI trends indicate that the forest restoration treatments have contributed to increased total ecosystem moisture, while precipitation in the post-treatment period exhibits stable trends. Forest restoration treatments appear to have improved the overall forest health and resiliency to drought, especially during the dry spring season, when forests are most vulnerable to water stress and wildfire risks. Our results on the spatial patterns and long-term trends in these variables can inform the currently ongoing and future restoration treatments to better target the treatment strategy across the southwestern USA. Google Earth Engine enabled our synthesis of these long-term trends over the large region and will enhance our continued monitoring in the coming decade. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
236. The Effects of Key Influencing Factors on Unsafe Events in Airport Flight Areas According to Time Series Data.
- Author
-
Shao, Quan, Liu, Haoran, Zhu, Pei, Wei, Wei, Zhou, Hang, and Yang, Mingming
- Subjects
TIME series analysis, IMPULSE response, CONSUMPTION (Economics), ECONOMETRIC models, TRANSPORT planes - Abstract
This study applied modern econometric models to analyze the factors affecting the number of unsafe events (NUE) in the airport flight area using time series data from 1993 to 2017. Influencing factors considered in this article include gross domestic product (GDP) per capita (GDPPC), household consumption level (HCL), civil aviation passenger turnover (CAPT), the number of civil aviation transport aircraft (NCATA), the total population of the whole country, and the number of civil aviation employees (NCAE). First, the Johansen cointegration test results demonstrate that HCL, NCATA, and NCAE have long‐term effects on unsafe events: a 1% increase in HCL corresponds to an average 0.262% increase in NUE, a 1% increase in NCATA corresponds to an average 2.339% increase in NUE, and a 1% increase in NCAE corresponds to an average 2.202% decrease in NUE, with the other variables held fixed. The analysis results based on the vector error correction model suggest that CAPT has short‐term effects on unsafe events and is positively correlated with them. The results of the impulse response function also indicate that the impact of NUE in the previous period on the change of NUE gradually weakens and finally tends to be stable. Finally, NCATA is found to significantly drive changes in NUE. Similarly, the results of variance decomposition indicate that NCATA has the greatest contribution to the change of NUE, followed by NUE in the previous period. The findings reveal the effects of key factors on the change of unsafe events in airport flight areas, thus providing a valuable theoretical basis for preventing the occurrence of unsafe incidents. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
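The cointegration workflow named in the record above, a Johansen test followed by a vector error correction model, can be outlined with statsmodels; the sketch below uses synthetic placeholder series whose names merely echo the abstract and is not the authors' code or data.

```python
import numpy as np
import pandas as pd
from statsmodels.tsa.vector_ar.vecm import coint_johansen, VECM

rng = np.random.default_rng(2)
n = 25  # 1993-2017 gives 25 annual observations

# Synthetic log-levels sharing a common stochastic trend (placeholders, not the study data).
trend = np.cumsum(rng.normal(0.02, 0.05, n))
data = pd.DataFrame({
    "log_NUE":   trend + rng.normal(0, 0.05, n),
    "log_HCL":   0.5 * trend + rng.normal(0, 0.05, n),
    "log_NCATA": 0.8 * trend + rng.normal(0, 0.05, n),
})

# Johansen test for the number of cointegrating relations (constant term, one lagged difference).
jres = coint_johansen(data, det_order=0, k_ar_diff=1)
print("trace statistics:", jres.lr1)
print("5% critical values:", jres.cvt[:, 1])

# Vector error correction model with one cointegrating relation.
vecm = VECM(data, k_ar_diff=1, coint_rank=1, deterministic="co").fit()
print(vecm.summary())
```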
237. Analyzing entropy features in time-series data for pattern recognition in neurological conditions.
- Author
-
Huang, Yushan, Zhao, Yuchen, Capstick, Alexander, Palermo, Francesca, Haddadi, Hamed, and Barnaghi, Payam
- Abstract
In the field of medical diagnosis and patient monitoring, effective pattern recognition in neurological time-series data is essential. Traditional methods, predominantly based on statistical or probabilistic learning and inference, often struggle with multivariate, multi-source, state-varying, and noisy data while also posing privacy risks due to excessive information collection and modeling. Furthermore, these methods often overlook critical statistical information, such as the distribution of data points and inherent uncertainties. To address these challenges, we introduce an information theory-based pipeline that leverages specialized features to identify patterns in neurological time-series data while minimizing privacy risks. We incorporate various entropy methods based on the characteristics of the different scenarios and entropy measures. For stochastic state transition applications, we incorporate Shannon's entropy, entropy rates, entropy production, and the von Neumann entropy of Markov chains. When state modeling is impractical, we select and employ approximate entropy, increment entropy, dispersion entropy, phase entropy, and slope entropy. The pipeline's effectiveness and scalability are demonstrated through pattern analysis in a dementia care dataset, an epilepsy dataset, and a myocardial infarction dataset. The results indicate that our information theory-based pipeline can improve the recall rate, F1 score, and accuracy across various models by up to 13.08 percentage points on average, while enhancing inference efficiency by reducing the number of model parameters by an average of 3.10 times. Thus, our approach opens a promising avenue for improved, efficient pattern recognition in medical time-series data that takes critical statistical information into account. • Time-series analysis method for neurology using information theory. • Incorporated entropy features to extract versatile high-level attributes. • Tested scalability on three datasets, one from our dementia healthcare platform. • Improved recall, F1 score, and accuracy by up to 13.08 percentage points over existing baselines. • Efficiency improved by reducing parameters 3.10x on epilepsy and heart datasets. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
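Two of the entropy features named in the record above, Shannon entropy and approximate entropy, can be computed directly from a one-dimensional signal; a minimal sketch follows (synthetic signal, not the authors' pipeline).

```python
import numpy as np

def shannon_entropy(x, bins=16):
    """Shannon entropy (bits) of a signal's empirical amplitude distribution."""
    counts, _ = np.histogram(x, bins=bins)
    p = counts / counts.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def approximate_entropy(x, m=2, r=None):
    """Approximate entropy ApEn(m, r) of a 1-D signal (Pincus, 1991)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    if r is None:
        r = 0.2 * np.std(x)

    def phi(m):
        # Embed the signal into overlapping templates of length m.
        templates = np.array([x[i:i + m] for i in range(n - m + 1)])
        # Chebyshev distance between every pair of templates.
        dists = np.max(np.abs(templates[:, None, :] - templates[None, :, :]), axis=2)
        counts = np.mean(dists <= r, axis=1)
        return np.mean(np.log(counts))

    return phi(m) - phi(m + 1)

rng = np.random.default_rng(3)
signal = np.sin(np.linspace(0, 20 * np.pi, 500)) + 0.1 * rng.normal(size=500)
print("Shannon entropy:", round(shannon_entropy(signal), 3))
print("Approximate entropy:", round(approximate_entropy(signal), 3))
```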
238. Enabling global interpolation, derivative estimation and model identification from sparse multi-experiment time series data via neural ODEs.
- Author
-
Bradley, William, Volkovinsky, Ron, and Boukouvala, Fani
- Subjects
*INTERPOLATION, *SYSTEM dynamics, *PARAMETER estimation, *SYSTEM analysis, *SENSITIVITY analysis - Abstract
Estimation of the rate of change of a system's states from state measurements is a key step in several system analysis and model-building workflows. While numerous interpolating models exist for inferring derivatives of time series when data are disturbed by noise or stochasticity, general-purpose methods for estimating derivatives from sparse time series datasets are largely lacking. A notable weakness of current methods, which are largely local, is their inability to globally fit data arising from non-identical initial conditions (i.e., multiple experiments or trajectories). In this contribution, Neural ODEs (NODEs) are demonstrated to close this gap. Through a series of benchmarks, we show that, because of the differential formulation of NODEs, these data smoothers can infer system dynamics from sparse data, even when accurate interpolation by algebraic methods is unlikely or fundamentally impossible. Through the presented case studies for derivative estimation and model identification, we discuss the advantages and limitations of our proposed workflow and identify cases where NODEs lead to statistically significant improvements. In summary, the proposed method is shown to be advantageous when inferring derivatives from sparse data stratified across multiple experiments and serves as a foundation for further model development and analysis methods (e.g., parameter estimation, model identification, sensitivity analysis). [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
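The core idea in the record above, fitting one neural ODE globally to sparse observations from several initial conditions and reading derivatives off the learned right-hand side, can be sketched with the torchdiffeq package (assumed installed; synthetic data, not the paper's benchmarks).

```python
import torch
import torch.nn as nn
from torchdiffeq import odeint  # assumes the torchdiffeq package is installed

class ODEFunc(nn.Module):
    """Small neural network parameterizing dy/dt = f(y)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))

    def forward(self, t, y):
        return self.net(y)

# Sparse synthetic observations of y' = -0.5*y from two experiments with
# different initial conditions (placeholders, not the paper's benchmarks).
t_obs = torch.tensor([0.0, 0.5, 1.5, 3.0, 5.0])
y0 = torch.tensor([[2.0], [-1.0]])                      # two initial conditions
y_obs = y0.unsqueeze(0) * torch.exp(-0.5 * t_obs).view(-1, 1, 1)

func = ODEFunc()
optimizer = torch.optim.Adam(func.parameters(), lr=0.01)

for step in range(500):
    optimizer.zero_grad()
    # One global model integrated from both initial conditions simultaneously.
    y_pred = odeint(func, y0, t_obs)
    loss = torch.mean((y_pred - y_obs) ** 2)
    loss.backward()
    optimizer.step()

# The fitted right-hand side provides derivative estimates at any state.
with torch.no_grad():
    print("estimated dy/dt at y=1:", func(0.0, torch.tensor([[1.0]])).item())
```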
239. Analysis of Social, Economic and Population in Central Java Using the Dynamic Data Panel Simultaneous Equation Model
- Author
-
Supriyanto Supriyanto
- Subjects
Dynamic data panel, Simultaneous Equations, Variable Instrument, cross-sectional data, time-series data, Social Sciences
Single-equation models, which are often used, ignore the interdependence between response variables. Variables that have a two-way relationship are frequently encountered. These interrelated two-way relationships can be summarized in a simultaneous equation model system. The relationships between variables are, in fact, dynamic. In a system of simultaneous equations with dynamic panel data, each structural equation is a dynamic panel data regression equation. Estimation using Ordinary Least Squares (OLS) in the dynamic panel data model results in biased and inconsistent estimators because the lagged dependent variable is correlated with the error term. First differencing is used in dynamic panel models to eliminate individual effects. Instrumental variables, namely variables that are uncorrelated with the errors, are then needed. Therefore, dynamic panel data models are more suitable for analyzing poverty and social change. From the simultaneous equation model obtained, the dominant factors affecting the level of poverty in Central Java Province are the unemployment rate, Human Development Index, labor force participation rate, population, and Gross Regional Domestic Product.
- Published
- 2019
- Full Text
- View/download PDF
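The first-differencing-plus-instruments strategy described in the record above is often implemented as an Anderson–Hsiao estimator, instrumenting the differenced lagged dependent variable with the second lag of the level; a minimal sketch with the linearmodels package (assumed installed) on simulated placeholder data follows, and it is not the authors' estimator.

```python
import numpy as np
import pandas as pd
from linearmodels.iv import IV2SLS  # assumes the linearmodels package is installed

rng = np.random.default_rng(4)
n_units, n_periods, rho = 100, 8, 0.6

# Simulate a dynamic panel y_it = rho*y_i,t-1 + alpha_i + e_it (placeholder data).
alpha = rng.normal(0, 1, n_units)
y = np.zeros((n_units, n_periods))
for t in range(1, n_periods):
    y[:, t] = rho * y[:, t - 1] + alpha + rng.normal(0, 1, n_units)

# First-difference to remove the fixed effect alpha_i, then instrument the
# differenced lag dy_i,t-1 with the level y_i,t-2 (Anderson-Hsiao estimator).
rows = []
for i in range(n_units):
    for t in range(3, n_periods):
        rows.append({
            "dy": y[i, t] - y[i, t - 1],
            "dy_lag": y[i, t - 1] - y[i, t - 2],
            "y_lag2": y[i, t - 2],
        })
panel = pd.DataFrame(rows)

res = IV2SLS(dependent=panel["dy"], exog=None,
             endog=panel["dy_lag"], instruments=panel["y_lag2"]).fit()
print(res.params)  # the coefficient on dy_lag estimates rho
```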
240. A Real-Time Crowdsensing Framework for Potential COVID-19 Carrier Detection Using Wearable Sensors
- Author
-
Harsh Mankodiya, Priyal Palkhiwala, Rajesh Gupta, Nilesh Kumar Jadav, Sudeep Tanwar, Bogdan-Constantin Neagu, Gheorghe Grigoras, Fayez Alqahtani, and Ahmed M. Shehata
- Subjects
machine learning, crowdsensing, object detection, support vector machine, time-series data, wearable device, Mathematics, QA1-939
Artificial intelligence has been utilized extensively in the healthcare sector for the last few decades to simplify medical procedures, such as diagnosis, prognosis, drug discovery, and many more. With the spread of the COVID-19 pandemic, more methods for detecting and treating COVID-19 infections have been developed. Several projects involving considerable use of artificial intelligence have been researched and put into practice. Crowdsensing is an example of an application in which artificial intelligence is employed to detect the presence of a virus in an individual based on their physiological parameters. A solution is proposed to detect potential COVID-19 carriers in crowded premises of a closed campus area, for example, hospitals, corridors, and company premises. Sensor-based wearable devices are utilized to obtain measurements of various physiological indicators (or parameters) of an individual. A machine-learning-based model is proposed for COVID-19 prediction with these parameters as input. The wearable device dataset was used to train four different machine learning algorithms. The support vector machine, which performed the best, achieved an F1-score of 96.64% and an accuracy of 96.57%. Moreover, the wearable device is used to retrieve the coordinates of a potential COVID-19 carrier, and the YOLOv5 object detection method is used to perform real-time visual tracking on a closed-circuit television video feed.
- Published
- 2022
- Full Text
- View/download PDF
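The best-performing model in the record above is a support vector machine on wearable-sensor features; a minimal scikit-learn sketch of such a classifier follows (synthetic features and labels, not the study's dataset).

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, f1_score

rng = np.random.default_rng(5)

# Placeholder physiological features (e.g. heart rate, SpO2, temperature) and labels;
# not the study's wearable dataset.
X = rng.normal(size=(1000, 3))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 0.5, 1000) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# Scale features and fit an RBF-kernel support vector classifier.
model = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
model.fit(X_train, y_train)

pred = model.predict(X_test)
print("accuracy:", round(accuracy_score(y_test, pred), 4))
print("F1 score:", round(f1_score(y_test, pred), 4))
```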
241. Anomaly Detection from Kepler Satellite Time-Series Data
- Author
-
Grabaskas, Nathaniel, Si, Dong, Hutchison, David, Series editor, Kanade, Takeo, Series editor, Kittler, Josef, Series editor, Kleinberg, Jon M., Series editor, Mattern, Friedemann, Series editor, Mitchell, John C., Series editor, Naor, Moni, Series editor, Pandu Rangan, C., Series editor, Steffen, Bernhard, Series editor, Terzopoulos, Demetri, Series editor, Tygar, Doug, Series editor, Weikum, Gerhard, Series editor, and Perner, Petra, editor
- Published
- 2017
- Full Text
- View/download PDF
242. Stock market analysis using candlestick regression and market trend prediction (CKRM).
- Author
-
Ananthi, M. and Vijayakumar, K.
- Abstract
Stock market data are time-series data in which stock values vary with time. Stock market prediction is an endeavor to assess the future value of a company's stock, which can increase an investor's profit. Accurate stock market prediction is still a challenging task. The proposed system predicts the stock price of any company specified by the user for the next few days. Using the predicted stock price and datasets collected from various sources regarding a certain equity, the overall sentiment of the stock is predicted. The prediction of the stock price is done by regression and candlestick pattern detection. The proposed system generates signals on the candlestick chart, which allows market movement to be predicted with sufficient accuracy for the user to judge whether a stock is a buy or a sell and whether to short the stock or go long by delivery. The prediction accuracy has been analyzed and improved to 85% using machine learning algorithms. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
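The abstract above does not detail the CKRM signal rules, so as a generic illustration of candlestick pattern detection, the sketch below flags a bullish engulfing pattern in OHLC data with pandas (placeholder prices).

```python
import pandas as pd

# Placeholder daily OHLC data; not the study's dataset.
ohlc = pd.DataFrame({
    "open":  [100, 98, 97, 95, 99],
    "high":  [101, 99, 98, 101, 102],
    "low":   [97, 94, 94, 94, 98],
    "close": [98, 95, 95, 100, 101],
})

# A bullish engulfing signal: a down candle followed by an up candle whose body
# completely covers the previous body.
prev = ohlc.shift(1)
bullish_engulfing = (
    (prev["close"] < prev["open"])          # previous candle closed down
    & (ohlc["close"] > ohlc["open"])        # current candle closed up
    & (ohlc["open"] <= prev["close"])       # current body opens at/below prior close
    & (ohlc["close"] >= prev["open"])       # and closes at/above prior open
)
print(ohlc.assign(bullish_engulfing=bullish_engulfing))
```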
243. Remaining useful life prediction via long‐short time memory neural network with novel partial least squares and genetic algorithm.
- Author
-
Yang, Ke, Wang, Yong‐jian, Yao, Yu‐nan, and Fan, Shi‐dong
- Abstract
Advancements in information technology have made various industrial equipment increasingly sophisticated in recent years. The remaining useful life (RUL) of equipment plays a crucial role in the industrial process. It is difficult to establish a functional RUL model, as it requires the fusion of time-series data across different scales. This paper proposes a long short-term memory neural network that integrates a novel partial least squares method based on a genetic algorithm (GAPLS-LSTM). The parameters are first analyzed by PLS to obtain the parameter fusion function of the health index (HI). The GA then searches for the optimal coefficients of the function, and the expected HI values are calculated with the fusion function. Finally, the RUL of the equipment is predicted with the LSTM method. The proposed GAPLS-LSTM was validated on RUL prediction for a marine auxiliary engine by comparison against the GAPLS-BP and GAPLS-RNN methods. The results show that the proposed method is capable of effective RUL prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
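The PLS step in the record above fuses multi-scale sensor channels into a health index before an LSTM forecasts the RUL; a minimal sketch of that fusion step with scikit-learn on placeholder degradation data follows (the GA coefficient search and the LSTM are omitted).

```python
import numpy as np
from sklearn.cross_decomposition import PLSRegression

rng = np.random.default_rng(6)
n_cycles, n_sensors = 300, 6

# Placeholder degradation data: sensor readings drift as the equipment ages.
age = np.linspace(0, 1, n_cycles)                      # normalized service time
sensors = age[:, None] * rng.uniform(0.5, 2.0, n_sensors) \
          + rng.normal(0, 0.05, (n_cycles, n_sensors))

# Partial least squares fuses the sensor channels into one latent component,
# used here as a health index (HI) tracked against equipment age.
pls = PLSRegression(n_components=1)
pls.fit(sensors, age)
health_index = pls.transform(sensors).ravel()

print("correlation between fused HI and true degradation:",
      round(np.corrcoef(health_index, age)[0, 1], 3))
# An LSTM (e.g. in PyTorch or Keras) would then be trained on the HI sequence
# to forecast remaining useful life; that step is omitted here for brevity.
```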
244. Demand Forecasting Tool For Inventory Control Smart Systems.
- Author
-
Benhamida, Fatima Zohra, Kaddouri, Ouahiba, Ouhrouche, Tahar, Benaichouche, Mohammed, Casado-Mansilla, Diego, and López-de-Ipiña, Diego
- Subjects
DEMAND forecasting, INVENTORY control, INVENTORY management systems, PRODUCTION planning, SUPPLY chains - Abstract
With the availability of data and the increasing capabilities of data processing tools, many businesses are leveraging historical sales and demand data to implement smart inventory management systems. Demand forecasting is the process of estimating the consumption of products or services for future time periods. It plays an important role in the field of inventory control and the supply chain, since it enables production and supply planning and can therefore reduce delivery times and optimize supply chain decisions. This paper presents an extensive literature review of demand forecasting methods for time-series data. Based on the analysis results and findings, a new demand forecasting tool for inventory control is proposed. First, a forecasting pipeline is designed to allow selecting the most accurate demand forecasting method. The proposed solution is validated on the Stock&Buy case study, a growing online retail platform. To this end, two new methods are proposed: (1) a hybrid method, Comb-TSB, for intermittent and lumpy demand patterns, which automatically selects the most accurate model among a set of methods; and (2) a clustering-based approach (ClustAvg) to forecast demand for new products which have very few or no sales history data. The evaluation process showed that the proposed tool achieves good forecasting accuracy by selecting the most appropriate forecasting method for each product. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
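The abstract above names Comb-TSB without detailing it; the underlying TSB (Teunter-Syntetos-Babai) method is a standard intermittent-demand forecaster that tracks the probability of a demand occurrence and the demand size separately. A minimal sketch with placeholder demand history and illustrative smoothing constants:

```python
def tsb_forecast(demand, alpha=0.2, beta=0.1):
    """Teunter-Syntetos-Babai (TSB) forecasts for an intermittent demand series.

    p tracks the probability of a demand occurrence, z the demand size;
    the one-step-ahead forecast is p * z.
    """
    nonzero = [d for d in demand if d > 0]
    p = sum(1 for d in demand if d > 0) / len(demand) or 0.5   # initial probability
    z = (sum(nonzero) / len(nonzero)) if nonzero else 1.0       # initial demand size
    forecasts = []
    for d in demand:
        forecasts.append(p * z)                 # forecast made before observing d
        if d > 0:
            p += beta * (1 - p)
            z += alpha * (d - z)
        else:
            p += beta * (0 - p)
    return forecasts, p * z                     # in-sample forecasts, next-period forecast

# Placeholder intermittent demand history (units sold per week).
history = [0, 0, 4, 0, 0, 0, 6, 0, 3, 0, 0, 5]
in_sample, next_week = tsb_forecast(history)
print("next-period forecast:", round(next_week, 2))
```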
245. Predicting Box-Office Markets with Machine Learning Methods
- Author
-
Dawei Li and Zhi-Ping Liu
- Subjects
box-office prediction, economic systems, time-series data, support vector machine, machine learning, Science, Astrophysics, QB460-466, Physics, QC1-999
The accurate prediction of gross box-office markets is of great benefit for investment and management in the movie industry. In this work, we propose a machine learning-based method for predicting the movie box-office revenue of a country based on empirical comparisons of eight methods with diverse combinations of economic factors. Specifically, we achieved a relative root mean squared error of 0.056 in the US and of 0.183 in China for the two case studies of movie markets in time-series forecasting experiments from 2013 to 2016. We concluded that the support-vector-machine-based method using gross domestic product achieved the best prediction performance while relying only on easily available economic factors. The computational experiments and comparison studies provided evidence for the effectiveness and advantages of our proposed prediction strategy. In the validation process of the predicted total box-office markets in 2017, the error rates were 0.044 in the US and 0.066 in China. In the consecutive predictions of nationwide box-office markets in 2018 and 2019, the mean relative absolute percentage errors achieved were 0.041 and 0.035 in the US and China, respectively. The precise predictions, in both the training and validation data, demonstrate the efficiency and versatility of our proposed method.
- Published
- 2022
- Full Text
- View/download PDF
246. Parcel-Level Mapping of Horticultural Crop Orchards in Complex Mountain Areas Using VHR and Time-Series Images
- Author
-
Shuhui Jiao, Dingxiang Hu, Zhanfeng Shen, Haoyu Wang, Wen Dong, Yifei Guo, Shuo Li, Yating Lei, Wenqi Kou, Jian Wang, Huimei He, and Yanming Fang
- Subjects
precise parcel extraction, parcel-level mapping, horticultural crop orchard, deep learning, time-series data, Science
Accurate and reliable farmland crop mapping is an important foundation for relevant departments to carry out agricultural management, crop planting structure adjustment and ecological assessment. The current crop identification work mainly focuses on conventional crops, and there are few studies on parcel-level mapping of horticultural crops in complex mountainous areas. Using Miaohou Town, China, as the research area, we developed a parcel-level method for the precise mapping of horticultural crops in complex mountainous areas using very-high-resolution (VHR) optical images and Sentinel-2 optical time-series images. First, based on the VHR images with a spatial resolution of 0.55 m, the complex mountainous areas were divided into subregions with their own independent characteristics according to a zoning and hierarchical strategy. The parcels in the different study areas were then divided into plain, greenhouse, slope and terrace parcels according to their corresponding parcel characteristics. The edge-based model RCF and texture-based model DABNet were subsequently used to extract the parcels according to the characteristics of different regions. Then, Sentinel-2 images were used to construct the time-series characteristics of different crops, and an LSTM algorithm was used to classify crop types. We then designed a parcel filling strategy to determine the categories of parcels based on the classification results of the time-series data, and accurate parcel-level mapping of a horticultural crop orchard in a complex mountainous area was finally achieved. Based on visual inspection, this method appears to effectively extract farmland parcels from VHR images of complex mountainous areas. The classification accuracy reached 93.01%, and the Kappa coefficient was 0.9015. This method thus serves as a methodological reference for parcel-level horticultural crop mapping and can be applied to the development of local precision agriculture.
- Published
- 2022
- Full Text
- View/download PDF
247. Errors-in-Variables Modeling of Personalized Treatment-Response Trajectories.
- Author
-
Zhang, Guangyi, Ashrafi, Reza A., Juuti, Anne, Pietilainen, Kirsi, and Marttinen, Pekka
- Subjects
MEASUREMENT errors, ERRORS-in-variables models, BLOOD sugar measurement, PARAMETRIC equations, SIMULATED patients - Abstract
Estimating the impact of a treatment on a given response is needed in many biomedical applications. However, methodology is lacking for the case when the response is a continuous temporal curve, treatment covariates suffer extensively from measurement error, and even the exact timing of the treatments is unknown. We introduce a novel method for this challenging scenario. We model personalized treatment-response curves as a combination of parametric response functions, hierarchically sharing information across individuals, and a sparse Gaussian process for the baseline trend. Importantly, our model accounts for errors not only in treatment covariates, but also in treatment timings, a problem arising in practice for example when data on treatments are based on user self-reporting. We validate our model with simulated and real patient data, and show that in a challenging application of estimating the impact of diet on continuous blood glucose measurements, accounting for measurement error significantly improves estimation and prediction accuracy. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
248. Identifying temporal pathways using biomarkers in the presence of latent non-Gaussian components.
- Author
-
Xie S, Zeng D, and Wang Y
- Subjects
- Humans, Normal Distribution, Attention Deficit Disorder with Hyperactivity, Time Factors, Biometry methods, Biomarkers analysis, Algorithms, Computer Simulation, Models, Statistical
- Abstract
Time-series data collected from a network of random variables are useful for identifying temporal pathways among the network nodes. Observed measurements may contain multiple sources of signals and noises, including Gaussian signals of interest and non-Gaussian noises such as artifacts, structured noise, and other unobserved factors (e.g., genetic risk factors, disease susceptibility). Existing methods, including vector autoregression (VAR) and dynamic causal modeling, do not account for unobserved non-Gaussian components. Furthermore, existing methods cannot effectively distinguish contemporaneous relationships from temporal relations. In this work, we propose a novel method to identify latent temporal pathways using time-series biomarker data collected from multiple subjects. The model adjusts for the non-Gaussian components and separates the temporal network from the contemporaneous network. Specifically, an independent component analysis (ICA) is used to extract the unobserved non-Gaussian components, and residuals are used to estimate the contemporaneous and temporal networks among the node variables based on the method of moments. The algorithm is fast and can easily scale up. We derive the identifiability and the asymptotic properties of the temporal and contemporaneous networks. We demonstrate the superior performance of our method by extensive simulations and an application to a study of attention-deficit/hyperactivity disorder (ADHD), where we analyze the temporal relationships between brain regional biomarkers. We find that temporal network edges spanned different brain regions, while most contemporaneous network edges were bilateral between the same regions and belonged to a subset of the functional connectivity network. (© The Author(s) 2024. Published by Oxford University Press on behalf of The International Biometric Society.)
- Published
- 2024
- Full Text
- View/download PDF
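The two ingredients named in the record above, ICA extraction of non-Gaussian components followed by estimation of temporal relations from the residuals, can be illustrated with scikit-learn and statsmodels; the sketch below is a simplified stand-in for the authors' method-of-moments estimator and uses placeholder data.

```python
import numpy as np
from sklearn.decomposition import FastICA
from statsmodels.tsa.api import VAR

rng = np.random.default_rng(7)
n_obs, n_nodes = 500, 3

# Placeholder multivariate biomarker series contaminated by a shared
# non-Gaussian (Laplace) artifact; not the ADHD study data.
artifact = rng.laplace(size=(n_obs, 1))
signals = rng.normal(size=(n_obs, n_nodes))
signals[1:, 0] += 0.4 * signals[:-1, 1]            # a lagged (temporal) influence
observed = signals + artifact * np.array([[1.0, 0.8, 0.6]])

# Step 1: extract a non-Gaussian component with ICA and remove its contribution.
ica = FastICA(n_components=1, random_state=0)
sources = ica.fit_transform(observed)              # estimated artifact time course
cleaned = observed - sources @ ica.mixing_.T

# Step 2: fit a VAR(1) on the cleaned series; off-diagonal lag-1
# coefficients indicate temporal pathways between nodes.
var_res = VAR(cleaned).fit(maxlags=1)
print(var_res.coefs[0].round(2))                   # lag-1 coefficient matrix
```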
249. Identification and validation of sepsis subphenotypes using time-series data.
- Author
-
Hao C, Hao R, Zhao H, Zhang Y, Sheng M, and An Y
- Abstract
Purpose: The recognition of sepsis as a heterogeneous syndrome necessitates identifying distinct subphenotypes to select targeted treatment. Methods: Patients with sepsis from the MIMIC-IV database (2008-2019) were randomly divided into a development cohort (80%) and an internal validation cohort (20%). Patients with sepsis from the ICU database of Peking University People's Hospital (2008-2022) were included in the external validation cohort. Time-series k-means clustering analysis with dynamic time warping was performed to develop and validate sepsis subphenotypes by analyzing the trends of 21 vital signs and laboratory indicators within 24 h after sepsis onset. Inflammatory biomarkers were compared in the ICU database of Peking University People's Hospital, whereas treatment heterogeneity was compared in the MIMIC-IV database. Findings: Three subphenotypes were identified in the development cohort. Type A patients (N = 2525, 47%) exhibited stable vital signs and fair organ function, type B (N = 1552, 29%) exhibited an obvious inflammatory response and stable organ function, and type C (N = 1251, 24%) exhibited severely impaired organ function with a deteriorating tendency. Type C demonstrated the highest mortality rate (33%) and levels of inflammatory biomarkers, followed by type B (24%), whereas type A exhibited the lowest mortality rate (11%) and levels of inflammatory biomarkers. These subphenotypes were confirmed in both the internal and external cohorts, demonstrating similar features and comparable mortality rates. In type C patients, survivors had significantly lower fluid intake within 24 h after sepsis onset (median 2891 mL, interquartile range (IQR) 1530-5470 mL) than that in non-survivors (median 4342 mL, IQR 2189-7305 mL). For types B and C, survivors showed a higher proportion of indwelling central venous catheters (p < 0.05). Conclusion: Three novel phenotypes of patients with sepsis were identified and validated using time-series data, revealing significant heterogeneity in inflammatory biomarkers and treatments as well as consistency across cohorts. Competing Interests: The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper. (© 2024 Published by Elsevier Ltd.)
- Published
- 2024
- Full Text
- View/download PDF
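The time-series k-means clustering with dynamic time warping described in the record above is available in the tslearn package (assumed installed); a minimal sketch on synthetic 24-hour vital-sign trajectories follows (placeholders, not MIMIC-IV data).

```python
import numpy as np
from tslearn.clustering import TimeSeriesKMeans  # assumes the tslearn package is installed

rng = np.random.default_rng(8)
hours = np.arange(24)

# Placeholder 24-hour heart-rate trajectories for 60 patients: a stable group,
# a rising (inflammatory) group, and a deteriorating group. Not MIMIC-IV data.
stable = 80 + rng.normal(0, 2, (20, 24))
rising = 85 + 0.8 * hours + rng.normal(0, 2, (20, 24))
deteriorating = 90 + 1.5 * hours + rng.normal(0, 3, (20, 24))
X = np.vstack([stable, rising, deteriorating])[:, :, None]   # (n_series, n_timesteps, 1)

# k-means on the trajectories using dynamic time warping as the distance.
km = TimeSeriesKMeans(n_clusters=3, metric="dtw", random_state=0)
labels = km.fit_predict(X)
print("cluster sizes:", np.bincount(labels))
```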
250. Identifying Qualitative Between-Subject and Within-Subject Variability: A Method for Clustering Regime-Switching Dynamics
- Author
-
Lu Ou, Alejandro Andrade, Rosa A. Alberto, Arthur Bakker, and Timo Bechger
- Subjects
clustering, regime-switching model, functional data analysis, time-series data, dynamic model, Psychology, BF1-990
Technological advancement provides an unprecedented amount of high-frequency data on human dynamic processes. In this paper, we introduce an approach for characterizing qualitative between-subject and within-subject variability from quantitative changes in multi-subject time-series data. We present the statistical model and examine the strengths and limitations of the approach in potential applications using Monte Carlo simulations. We illustrate its usage by characterizing clusters of dynamics with phase transitions, using real-time hand movement data collected on an embodied learning platform designed to foster mathematical learning.
- Published
- 2020
- Full Text
- View/download PDF