19,890 results for "univariate"
Search Results
52. Similarity-Based Data-Fusion Schemes for Missing Data Imputation in Univariate Time Series Data
- Author
-
Nickolas, S., Shobha, K., Bansal, Jagdish Chand, Series Editor, Deep, Kusum, Series Editor, Nagar, Atulya K., Series Editor, Dave, Mayank, editor, Garg, Ritu, editor, Dua, Mohit, editor, and Hussien, Jemal, editor
- Published
- 2021
- Full Text
- View/download PDF
53. Time Series Analysis on Univariate and Multivariate Variables: A Comprehensive Survey
- Author
-
Beeram, Satyanarayana Reddy, Kuchibhotla, Swarna, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Satapathy, Suresh Chandra, editor, Bhateja, Vikrant, editor, Ramakrishna Murty, M., editor, Gia Nhu, Nguyen, editor, and Jayasri Kotti, editor
- Published
- 2021
- Full Text
- View/download PDF
54. Univariate Feature Selection Techniques for Classification of Epileptic EEG Signals
- Author
-
Kar, Moushmi, Dewangan, Laxmikant, Lovell, Nigel H., Advisory Editor, Oneto, Luca, Advisory Editor, Piotto, Stefano, Advisory Editor, Rossi, Federico, Advisory Editor, Samsonovich, Alexei V., Advisory Editor, Babiloni, Fabio, Advisory Editor, Liwo, Adam, Advisory Editor, Magjarevic, Ratko, Advisory Editor, Rizvanov, Albert A., editor, Singh, Bikesh Kumar, editor, and Ganasala, Padma, editor
- Published
- 2021
- Full Text
- View/download PDF
55. Geostatistical Analysis of Suspended Particulate Matter Along the North-Western Coastal Waters of Bay of Bengal
- Author
-
Basu, Atreya, Mukhopadhaya, Sayan, Gupta, Kaushik, Mitra, Debasish, Chattoraj, Shovan Lal, Mukhopadhyay, Anirban, Das, Sourav, editor, and Ghosh, Tuhin, editor
- Published
- 2021
- Full Text
- View/download PDF
56. Umformer: A Transformer Dedicated to Univariate Multistep Prediction
- Author
-
Min Li, Qinghui Chen, Gang Li, and Delong Han
- Subjects
Multi-step, univariate, time series forecasting, transformers, Electrical engineering. Electronics. Nuclear engineering, TK1-9971 - Abstract
Univariate multi-step time series forecasting (UMTF) has many applications, such as forecasting access traffic. Solving the UMTF problem requires efficiently capturing key information in univariate data and improving the accuracy of multi-step forecasting. The advent of deep learning (DL) enables multi-level, high-performance prediction from complex multivariate inputs, but research on the UMTF problem remains scarce, and existing methods cannot satisfy recent univariate forecasting tasks in terms of forecasting accuracy and efficiency. This paper proposes a Transformer-based univariate multi-step forecasting model: Umformer. The contributions include: (1) To maximize the information obtained from a single variable, we propose a Prophet-based method for variable extraction, additionally considering some correlated variables for accurate predictions. (2) Gated linear unit variants with three weight matrices (GLUV3) are designed as a gating mechanism to improve selective memory in long sequences, thereby obtaining more helpful information from a limited number of univariate variables and improving prediction accuracy. (3) A Shared Double-heads Probsparse Attention (SDHPA) mechanism reduces the memory footprint and improves attention awareness. We combine recent DL research results to achieve high-precision prediction in UMTF. Extensive experiments on public datasets from five different domains show, across five metrics, that Umformer is significantly better than existing methods. We offer a more efficient solution for UMTF.
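The gating idea behind GLUV3 can be illustrated with the standard two-matrix gated linear unit; the paper's variant adds a third weight matrix, which is not reproduced here. A minimal pure-Python sketch, where the vectors `w_value` and `w_gate` are illustrative placeholders rather than the paper's parameters:

```python
import math

def glu(x, w_value, w_gate):
    """Standard gated linear unit: a linear value path modulated by a
    sigmoid gate, letting the network selectively pass information."""
    value = sum(a * b for a, b in zip(x, w_value))
    gate = 1.0 / (1.0 + math.exp(-sum(a * b for a, b in zip(x, w_gate))))
    return value * gate

# With a zero gate projection the sigmoid is 0.5, so half the value passes.
out = glu([1.0, 2.0], [0.5, 0.5], [0.0, 0.0])
```

The gate is what provides the "selective memory" behaviour the abstract describes: inputs the gate projection scores low are attenuated toward zero.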
- Published
- 2022
- Full Text
- View/download PDF
57. Spatiotemporal Assessment of Satellite Image Time Series for Land Cover Classification Using Deep Learning Techniques: A Case Study of Reunion Island, France.
- Author
-
Navnath, Naik Nitesh, Chandrasekaran, Kandasamy, Stateczny, Andrzej, Sundaram, Venkatesan Meenakshi, and Panneer, Prabhavathy
- Subjects
- *REMOTE-sensing images, *ARTIFICIAL neural networks, *ZONING, *DEEP learning, *CONVOLUTIONAL neural networks, *TIME series analysis, *RECURRENT neural networks - Abstract
Current Earth observation systems generate massive amounts of satellite image time series (SITS) to keep track of geographical areas over time and to monitor and identify environmental and climate change. Efficiently analyzing such data remains an unresolved issue in remote sensing. In classifying land cover, utilizing SITS rather than a single image can help differentiate between classes because of their varied temporal patterns. The aim was to forecast the land cover class of a group of pixels as a multi-class single-label classification problem given their time series gathered from satellite images. In this article, we exploit SITS to assess the capability of several spatial and temporal deep learning models with the proposed architecture. The models implemented are the bidirectional gated recurrent unit (GRU), temporal convolutional neural networks (TCNN), GRU + TCNN, attention on TCNN, and attention on GRU + TCNN. The proposed architecture integrates univariate, multivariate, and pixel coordinates for Reunion Island land-cover classification (LCC). The evaluation of the proposed architecture with deep neural networks on the test dataset determined that blending univariate and multivariate inputs with a recurrent neural network and pixel coordinates achieved increased accuracy, with higher F1 scores for each class label. The results suggest that the models also performed exceptionally well when executed in a partitioned manner for the LCC task compared to the temporal models. This study demonstrates that deep learning approaches paired with spatiotemporal SITS data address the difficult task of cost-effectively classifying land cover, contributing to a sustainable environment. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
58. EEG Forecasting With Univariate and Multivariate Time Series Using Windowing and Baseline Method.
- Author
-
Bukhari, Syed Ahmad Chan
- Subjects
EPILEPSY, SPASMS, ELECTROENCEPHALOGRAPHY, MULTIVARIATE analysis, TIME series analysis - Abstract
People suffering from epilepsy are in great need of precautionary measures. The only way to provide such precaution is to find methods that let them know in advance when a seizure will occur. Using electroencephalogram (EEG) data, the authors developed a forecasting method using a simple LSTM with a windowing technique. The window length was initially set to five time steps and was increased one step at a time; the number of correct predictions increased with the window length. When the length reached 20 time steps, the model gave impressive results in predicting the future EEG value. The neural network learns the past 20 time steps to forecast the future EEG in two ways: in the univariate method, only one attribute is used as the basis to predict the future value, whereas in the multivariate method, 42 features are used. The multivariate method is more powerful and yields predictions almost equal to the actual target value. The univariate method achieved an accuracy of about 70%, whereas the multivariate method achieved 90%. [ABSTRACT FROM AUTHOR]
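The windowing technique described above, where the last w observations form the input and the next value is the target, can be sketched as follows. A toy series stands in for the EEG data, and a window of 3 replaces the paper's final window of 20:

```python
def make_windows(series, window):
    """Turn a univariate series into (input window, next value) pairs,
    the supervised framing used by windowed LSTM forecasting."""
    X, y = [], []
    for i in range(len(series) - window):
        X.append(series[i:i + window])
        y.append(series[i + window])
    return X, y

series = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
X, y = make_windows(series, window=3)
```

Each pair in `(X, y)` is one training example; growing `window` from 5 to 20, as the paper reports, simply lengthens each input row.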
- Published
- 2022
- Full Text
- View/download PDF
59. Mixed data generation packages and related computational tools in R.
- Author
-
Demirtas, H. and Gao, R.
- Subjects
- *RANDOM numbers - Abstract
This paper is concerned with providing some computation-related details of the 16 R packages that have been developed by Demirtas and his colleagues in the context of random number generation. The dominant theme is multivariate mixed data generation. However, univariate and multivariate data generation from different distributions, as well as some other tools such as modeling the correlation transitions in latency and discretization domains, are also included. This is intended for interested people who would benefit from access to a comprehensive set of data simulation tools in one single place. While the focus is on conceptual and implementation issues, the ideas are supported by appropriate references for methodological development. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
60. Predicting Saudi Stock Market Index by Using Multivariate Time Series Based on Deep Learning
- Author
-
Mutasem Jarrah and Morched Derbali
- Subjects
deep learning, predictions, time series, LSTM, multivariate, univariate, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999 - Abstract
Time-series (TS) predictions use historical data to forecast future values. Various industries, including stock market trading, power load forecasting, medical monitoring, and intrusion detection, frequently rely on this method. The prediction of stock-market prices is significantly influenced by multiple variables, such as the performance of other markets and the economic situation of a country. This study focuses on predicting the indices of the stock market of the Kingdom of Saudi Arabia (KSA) using various variables, including opening, lowest, highest, and closing prices. Successfully achieving investment goals depends on selecting the right stocks to buy, sell, or hold. The output of this project is the projected closing prices over the next seven days, which aids investors in making informed decisions. Exponential smoothing (ES) was employed to eliminate noise from the input data obtained from the Saudi Stock Exchange, also known as Tadawul. Subsequently, a sliding-window method with five steps was applied to transform the task of time series forecasting into a supervised learning problem. Finally, a multivariate long short-term memory (LSTM) deep-learning (DL) algorithm was employed to predict stock market prices. The proposed multivariate LSTM DL model achieved a prediction rate of 97.49%, compared with 92.19% for the univariate model, demonstrating its effectiveness in stock market price forecasting. These results also highlight the accuracy of DL and the value of using multiple information sources in stock-market prediction.
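The preprocessing pipeline, simple exponential smoothing to suppress noise followed by a five-step sliding window that recasts forecasting as supervised learning, might be sketched like this. The smoothing constant `alpha` is an assumed illustrative value, not taken from the paper:

```python
def exponential_smoothing(series, alpha=0.5):
    """Simple ES: s_t = alpha * x_t + (1 - alpha) * s_{t-1}."""
    smoothed = [series[0]]
    for x in series[1:]:
        smoothed.append(alpha * x + (1 - alpha) * smoothed[-1])
    return smoothed

def sliding_window(series, steps=5):
    """Turn a series into (past `steps` values -> next value) pairs."""
    return [(series[i:i + steps], series[i + steps])
            for i in range(len(series) - steps)]

smoothed = exponential_smoothing([1.0, 3.0, 5.0])
pairs = sliding_window(list(range(7)), steps=5)
```

The `(window, target)` pairs are then what a supervised learner such as an LSTM is trained on.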
- Published
- 2023
- Full Text
- View/download PDF
61. Time Series Missing Value Prediction: Algorithms and Applications
- Author
-
Dubey, Aditya, Rasool, Akhtar, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Badica, Costin, editor, Liatsis, Panos, editor, Kharb, Latika, editor, and Chahal, Deepak, editor
- Published
- 2020
- Full Text
- View/download PDF
62. Brain Tumor Detection and Classification Using Machine Learning
- Author
-
Pritanjli, Doegar, Amit, Bansal, Jagdish Chand, Series Editor, Deep, Kusum, Series Editor, Nagar, Atulya K., Series Editor, Sharma, Harish, editor, Pundir, Aditya Kumar Singh, editor, Yadav, Neha, editor, Sharma, Ajay, editor, and Das, Swagatam, editor
- Published
- 2020
- Full Text
- View/download PDF
63. Natural Time Series Parameters Forecasting: Validation of the Pattern-Sequence-Based Forecasting (PSF) Algorithm; A New Python Package.
- Author
-
Shende, Mayur Kishor, Salih, Sinan Q., Bokde, Neeraj Dhanraj, Scholz, Miklas, Oudah, Atheer Y., and Yaseen, Zaher Mundher
- Subjects
PYTHON programming language, TIME series analysis, ALGORITHMS, CLIMATOLOGISTS, REFERENCE values, FORECASTING, BOX-Jenkins forecasting - Abstract
Climate change has contributed substantially to weather and land characteristic phenomena. Accurate time series forecasting of climate and land parameters is highly essential in the modern era for climatologists. This paper provides a brief introduction to the pattern-sequence-based forecasting (PSF) algorithm and its implementation in Python. The PSF algorithm aims to forecast future values of a univariate time series. It is divided into two major processes: the clustering of data and prediction. The clustering part includes the selection of an optimum value for the number of clusters and labeling the time series data. The prediction part consists of the selection of a window size and the prediction of future values with reference to past patterns. The package aims to ease the use and implementation of PSF for Python users, and it provides results similar to the PSF package available in R. Finally, the results of the proposed Python package are compared with results of the PSF and ARIMA methods in R. One issue with PSF is that forecasting performance degrades if the time series has a positive or negative trend. To overcome this problem, difference pattern-sequence-based forecasting (DPSF) was proposed; the Python package also implements the DPSF method. In this method, the time series data are first differenced. Then, the PSF algorithm is applied to the differenced time series. Finally, the original and predicted values are restored by applying the reverse of the differencing process. The proposed methodology is tested on several complex climate and land processes and its potential is evidenced. [ABSTRACT FROM AUTHOR]
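The prediction step of PSF, matching the most recent sequence of cluster labels against past occurrences and averaging the values that followed, can be sketched in pure Python. The labels, values, and window size below are toy inputs, not output of the package:

```python
def psf_predict(labels, values, w):
    """Toy PSF prediction: find earlier occurrences of the last w
    cluster labels and average the values that followed each match."""
    pattern = labels[-w:]
    followers = [values[i + w] for i in range(len(labels) - w)
                 if labels[i:i + w] == pattern]
    return sum(followers) / len(followers) if followers else None

labels = [0, 1, 0, 1, 0, 1]          # cluster label per time step
values = [10.0, 20.0, 10.0, 20.0, 10.0, 20.0]
forecast = psf_predict(labels, values, w=2)
```

In the real algorithm the labels come from clustering the series first; this sketch only shows the pattern-sequence matching.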
- Published
- 2022
- Full Text
- View/download PDF
64. Quantitative Comparison of Statistical Methods for Analyzing Human Metabolomics Data.
- Author
-
Henglin, Mir, Claggett, Brian L., Antonelli, Joseph, Alotaibi, Mona, Magalang, Gino Alberto, Watrous, Jeramie D., Lagerborg, Kim A., Ovsak, Gavin, Musso, Gabriel, Demler, Olga V., Vasan, Ramachandran S., Larson, Martin G., Jain, Mohit, and Cheng, Susan
- Subjects
FALSE discovery rate, METABOLOMICS, LATENT structure analysis, STATISTICAL learning, SMALL molecules, STATISTICAL power analysis, NUMBER theory - Abstract
Emerging technologies now allow for mass spectrometry-based profiling of thousands of small molecule metabolites ('metabolomics') in an increasing number of biosamples. While offering great promise for insight into the pathogenesis of human disease, standard approaches have not yet been established for statistically analyzing increasingly complex, high-dimensional human metabolomics data in relation to clinical phenotypes, including disease outcomes. To determine optimal approaches for analysis, we formally compare traditional and newer statistical learning methods across a range of metabolomics dataset types. In simulated and experimental metabolomics data derived from large population-based human cohorts, we observe that with an increasing number of study subjects, univariate compared to multivariate methods result in an apparently higher false discovery rate as represented by substantial correlation between metabolites directly associated with the outcome and metabolites not associated with the outcome. Although the higher frequency of such associations would not be considered false in the strict statistical sense, it may be considered biologically less informative. In scenarios wherein the number of assayed metabolites increases, as in measures of nontargeted versus targeted metabolomics, multivariate methods performed especially favorably across a range of statistical operating characteristics. In nontargeted metabolomics datasets that included thousands of metabolite measures, sparse multivariate models demonstrated greater selectivity and lower potential for spurious relationships. When the number of metabolites was similar to or exceeded the number of study subjects, as is common with nontargeted metabolomics analysis of relatively small cohorts, sparse multivariate models exhibited the most-robust statistical power with more consistent results. These findings have important implications for metabolomics analysis in human disease. [ABSTRACT FROM AUTHOR]
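The univariate strategy this comparison refers to, associating each metabolite with the outcome one at a time and ranking by strength of association, can be sketched as a simple Pearson-correlation screen. The data are toy values, and a real analysis would add multiple-testing control:

```python
def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5

def univariate_screen(metabolites, outcome):
    """Rank metabolite names by |r| with the outcome -- the univariate
    approach contrasted with sparse multivariate models in the paper."""
    return sorted(metabolites,
                  key=lambda name: -abs(pearson(metabolites[name], outcome)))

metabolites = {"m1": [1.0, 2.0, 3.0, 4.0], "m2": [4.0, 1.0, 3.0, 2.0]}
outcome = [1.0, 2.0, 3.0, 4.0]
ranking = univariate_screen(metabolites, outcome)
```

Correlated metabolites will all score highly under this screen, which is exactly the source of the "apparently higher false discovery rate" the abstract describes.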
- Published
- 2022
- Full Text
- View/download PDF
65. On-Farm Phenotypic Characterization of Indigenous Chicken, in Dire and Yabello Districts, Borena Zone, Oromia Regional State, Ethiopia
- Author
-
Dabasa Wario, Yosef Tadesse, and Shashi Bhushan Singh Yadav
- Subjects
borena zone, characterization, morphological trait, morphometric traits, multivariate, plumage, scavenging, univariate, Genetics, QH426-470 - Abstract
This study was conducted in two districts of the Borena zone (Ethiopia) with the objective of phenotypically characterizing the indigenous chicken types in the study sites. The study involved both qualitative and quantitative research. A total of 480 chickens (144 male and 336 female) aged more than 6 months were considered for the quantitative study. Descriptive statistics, frequency procedures, the general linear model, and univariate and multivariate analyses were carried out in SAS 9.1.3; the SPSS package was used to analyze the qualitative data. Qualitative traits such as plumage color, comb type, shank color, eye color, earlobe color, and skin color were studied. Quantitative traits included body weight and linear morphometric measurements such as shank length, body length, wattle length, wingspan, chest circumference, comb width, and comb length. The results revealed that white, red, and brown plumage colors dominated in the study area. The local chickens showed variation in shank color, skin color, comb type, and eye color; white shanks, white skin, single combs, and red earlobes predominated across both study districts. The mean body weights of indigenous male and female chickens were 1.623 ± 0.229 kg and 1.313 ± 0.81 kg, respectively. Large combs, wattles, and long legs were observed in the study areas. Generally, morphological and morphometric variations were observed between and within the indigenous chicken populations, which suggests that there is an opportunity for genetic improvement through selection.
- Published
- 2021
- Full Text
- View/download PDF
66. Generative Approach For Multivariate Signals
- Author
-
Sawant, Vinay, Bhende, Renu, Sawant, Vinay, and Bhende, Renu
- Abstract
In this thesis, we explored the generative adversarial network called uTSGAN to generate patterns from a multivariate CAN bus time series dataset. Since the given data are unlabelled, unprocessed, and highly imbalanced, containing a large number of missing values, we had to define and discard a few timestamps and limit the focus of the study to a reduced subset involving patterns of 10-second window size, which are categorised and clustered into majority and minority classes. To generate such an imbalanced set, we used the image-based time series GAN uTSGAN, which transforms a time sequence into a spectrogram image and back to a time sequence within the same GAN framework. For comparison, we also prepared a resampled (balanced) dataset from the imbalanced set for alternative experiments; this comparison evaluates the conventional resampling approach against the baseline as well as our novel implementations. We propose two new methods using "cluster density based" and "sample loss based" techniques. Throughout the experimentation, the "cluster density based" GANs consistently achieved better results on several common and uncommon evaluations for multivariate and highly imbalanced sample sets. Among the regular evaluation methods, classification metrics such as balanced accuracy and precision provide a better understanding of the results. The TRTS balanced accuracy and precision of the "cluster density based" GAN reach over 82% and 90%, improvements of 20-30% and 14-18% respectively over the baseline; the TSTR balanced accuracy of the "cluster density based" GAN increased by 10.6% over the baseline, and it shows slightly better precision than the baseline when compared on generated results from univariate experiments. Secondly, the alternative "resampling based" implementations show values similar to the baseline in TRTS and TSTR classifications. Simultaneously, more distinguished results are se
- Published
- 2024
67. Copula Modelling to Analyse Financial Data.
- Author
-
Dewick, Paul R. and Liu, Shuangzhe
- Subjects
ECONOMIC models - Abstract
Copula modelling is a popular tool for analysing the dependencies between variables. It allows the investigation of tail dependencies, which is of particular interest in risk and survival applications, and is of specific interest to economic and financial modelling as it can help predict financial contagion and periods of "boom" or "bust". Bivariate copula modelling has a rich variety of copulas that may be chosen to represent the modelled dataset dependencies and possible extreme events that may lie within the dataset tails. Financial copula modelling tends to diverge, as this richness of copula types within the literature may not be well realised when the two different types of modelling, non-time-series and time-series, are undertaken differently. This paper investigates standard copula modelling and financial copula modelling and shows why the modelling strategies using time-series and non-time-series copula modelling are undertaken with different methods. This difference, apart from the issues surrounding the time-series component, is mostly due to standard copula modelling having the ability to use empirical CDFs for the probability integral transformation. Financial time-series copula modelling uses pseudo-CDFs because the standardized time-series residuals are centred around zero; the standardized residuals inhibit the estimation of the possible distributions required for constructing the copula model in the usual manner. [ABSTRACT FROM AUTHOR]
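The probability integral transformation via an empirical CDF that the paper contrasts with pseudo-CDFs can be sketched as follows, assuming distinct observations (ties would need average ranks):

```python
def empirical_pit(sample):
    """Map each observation to its empirical CDF value, rank / (n + 1),
    the probability integral transform applied before fitting a copula.
    Dividing by n + 1 keeps values strictly inside (0, 1)."""
    n = len(sample)
    rank = {v: r for r, v in enumerate(sorted(sample), start=1)}
    return [rank[v] / (n + 1) for v in sample]

u = empirical_pit([3.0, 1.0, 2.0])
```

After this transform each margin is approximately uniform on (0, 1), which is the form copula estimation expects.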
- Published
- 2022
- Full Text
- View/download PDF
68. SMOOTHING IN NEURAL NETWORK FOR UNIVARIAT TIME SERIES DATA FORECASTING
- Author
-
Nurfia Oktaviani Syamsiah and Indah Purwandani
- Subjects
smoothing, univariate, time series, neural network, Electronic computers. Computer science, QA75.5-76.95, Computer engineering. Computer hardware, TK7885-7895 - Abstract
Time series data is interesting research material for many people. Many models have been produced, but truly optimal accuracy has not yet been obtained. Neural networks are widely used because of their ability to capture non-linear relationships between data. This study combines a neural network with exponential smoothing to produce higher accuracy. Exponential smoothing, one of the best linear methods, is used to transform the data set; the transformed data set is then used to train and test the neural network model. The resulting models are evaluated using the standard error measure root mean square error (RMSE), compared by their RMSE values, and subjected to a t-test. The proposed ES-NN model proved to have better predictive results than using either method alone.
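The evaluation metric used here, root mean square error, is simple to state in pure Python (the two series below are illustrative):

```python
import math

def rmse(actual, predicted):
    """Root mean square error between two equal-length series."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted))
                     / len(actual))

err = rmse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

Lower RMSE means the model's predictions sit closer to the observed series, which is the basis of the comparison (and subsequent t-test) described above.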
- Published
- 2020
- Full Text
- View/download PDF
69. Assessing the Limitations of Relief-Based Algorithms in Detecting Higher-Order Interactions.
- Author
-
Freda PJ, Ye S, Zhang R, Moore JH, and Urbanowicz RJ
- Abstract
Background: The investigation of epistasis becomes increasingly complex as more loci are considered due to the exponential expansion of possible interactions. Consequently, selecting key features that influence epistatic interactions is crucial for effective downstream analyses. Recognizing this challenge, this study investigates the efficiency of Relief-Based Algorithms (RBAs) in detecting higher-order epistatic interactions, which may be critical for understanding the genetic architecture of complex traits. RBAs are uniquely non-exhaustive, eliminating the need to construct features for every possible interaction and thus improving computational tractability. Motivated by previous research indicating that some RBAs rank predictive features involved in higher-order epistasis as highly negative, we explore the utility of absolute-value ranking of RBA feature weights as an alternative method to capture complex interactions. We evaluate ReliefF, MultiSURF, and MultiSURFstar on simulated genetic datasets that model various patterns of genotype-phenotype associations, including 2-way to 5-way genetic interactions, and compare their performance to two control methods: a random shuffle and mutual information. Results: Our findings indicate that while RBAs effectively identify lower-order (2- to 3-way) interactions, their capability to detect higher-order interactions is significantly limited, primarily by large feature counts but also by signal noise. Specifically, we observe that RBAs succeed in detecting fully penetrant 4-way XOR interactions using an absolute-value ranking approach, but only in datasets with a minimal number of total features. Conclusions: These results highlight the inherent limitations of current RBAs and underscore the need for enhanced detection capabilities for the investigation of epistasis, particularly in datasets with large feature counts and complex higher-order interactions. Competing Interests: The authors declare that they have no competing interests.
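The absolute-value ranking explored above, which keeps strongly negative Relief weights near the top instead of discarding them, amounts to a one-line sort. The weights below are made-up illustrative values, not outputs of ReliefF:

```python
def rank_by_abs_weight(weights):
    """Order features by |weight|, descending, so features with large
    negative Relief scores (as seen for higher-order epistatic features)
    are still retained for downstream analysis."""
    return sorted(weights, key=lambda f: -abs(weights[f]))

ranking = rank_by_abs_weight({"snp1": 0.10, "snp2": -0.90, "snp3": 0.30})
```

Under a conventional descending sort, `snp2` would rank last; under absolute-value ranking it ranks first.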
- Published
- 2024
- Full Text
- View/download PDF
70. A Census of Cades Cove Through Gravestones
- Author
-
Foster, Gary S., Lovekamp, William E., Foster, Gary S., and Lovekamp, William E.
- Published
- 2019
- Full Text
- View/download PDF
71. Regionalization of Drought across Pakistan
- Author
-
Tausif Khan, Zeeshan Waheed, Muhammad Nauman Altaf, Muhammad Naveed Anjum, Fiaz Hussain, and Muhammad Azam
- Subjects
HCPC, regionalization, SPEI, univariate, homogeneity, discordancy test, Environmental sciences, GE1-350 - Abstract
Due to Pakistan’s complex hydro-climatic and topographical features, drought is a severe problem, and it is necessary to regionalize various topographical and hydrometeorological occurrences into uniform zones. The regionalization of clusters across Pakistan was examined and analyzed using hierarchical classification on principal components (HCPC). Five statistically homogeneous zones were delineated and validated through cluster validation indices. Univariate discordancy tests were run using drought severity and duration as inputs. Drought was regionalized for SPEI time scales over 12 months, indicating regional discordancy in cluster 4, while cluster 2 had a smaller number of stations, which were further adjusted to ensure homogeneity. The results of this research might be utilized to provide the fundamental information needed to develop a regional drought mitigation plan.
- Published
- 2022
- Full Text
- View/download PDF
72. An unsupervised neural network approach for imputation of missing values in univariate time series data.
- Author
-
Savarimuthu, Nickolas and Karesiddaiah, Shobha
- Subjects
MULTIPLE imputation (Statistics), MISSING data (Statistics), TIME series analysis, STANDARD deviations, STATISTICAL correlation - Abstract
Summary: Handling missing values in time series data plays a key role in prediction and forecasting, as complete and clean historical data help to achieve higher accuracy. Numerous research works address multivariate time series imputation, but imputation in univariate time series data is the least considered due to the unavailability of correlated variables. This article proposes an iterative imputation algorithm that clusters univariate time series data, considering the trend, seasonality, cyclical, and residual features of the data. The proposed method uses a similarity-based nearest-neighbour imputation approach on each cluster to fill missing values. It is evaluated on publicly available datasets from the Data Market repository and the UCI repository by randomly simulating missing patterns under low, moderate, and high missingness rates throughout the data series. The outcome is evaluated with the imputeTestbench package using root mean squared error as the error metric and validated through prediction accuracy and the concordance correlation coefficient statistical test. Experimental results show that the proposed imputation technique produces values closer to the original time series data, resulting in lower error rates compared with other existing imputation methods. [ABSTRACT FROM AUTHOR]
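The nearest-neighbour idea can be illustrated with a minimal gap-filling routine that averages the closest observed values on either side of a gap. The paper's method is richer (clustering on trend, seasonality, and residual features first), so this is only a stand-in:

```python
def impute(series):
    """Fill None entries with the mean of the nearest observed
    neighbours on each side -- a minimal nearest-neighbour imputer."""
    out = list(series)
    for i, v in enumerate(out):
        if v is None:
            left = next((out[j] for j in range(i - 1, -1, -1)
                         if out[j] is not None), None)
            right = next((out[j] for j in range(i + 1, len(out))
                          if out[j] is not None), None)
            neighbours = [x for x in (left, right) if x is not None]
            out[i] = sum(neighbours) / len(neighbours)
    return out

filled = impute([1.0, None, 3.0, None])
```

A trailing gap falls back to the nearest single neighbour, since only one side has an observed value.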
- Published
- 2021
- Full Text
- View/download PDF
73. Natural Time Series Parameters Forecasting: Validation of the Pattern-Sequence-Based Forecasting (PSF) Algorithm; A New Python Package
- Author
-
Mayur Kishor Shende, Sinan Q. Salih, Neeraj Dhanraj Bokde, Miklas Scholz, Atheer Y. Oudah, and Zaher Mundher Yaseen
- Subjects
forecasting, univariate, time series, Python, PSF, Technology, Engineering (General). Civil engineering (General), TA1-2040, Biology (General), QH301-705.5, Physics, QC1-999, Chemistry, QD1-999 - Abstract
Climate change has contributed substantially to weather and land characteristic phenomena. Accurate time series forecasting of climate and land parameters is highly essential in the modern era for climatologists. This paper provides a brief introduction to the pattern-sequence-based forecasting (PSF) algorithm and its implementation in Python. The PSF algorithm aims to forecast future values of a univariate time series. It is divided into two major processes: the clustering of data and prediction. The clustering part includes the selection of an optimum value for the number of clusters and labeling the time series data. The prediction part consists of the selection of a window size and the prediction of future values with reference to past patterns. The package aims to ease the use and implementation of PSF for Python users, and it provides results similar to the PSF package available in R. Finally, the results of the proposed Python package are compared with results of the PSF and ARIMA methods in R. One issue with PSF is that forecasting performance degrades if the time series has a positive or negative trend. To overcome this problem, difference pattern-sequence-based forecasting (DPSF) was proposed; the Python package also implements the DPSF method. In this method, the time series data are first differenced. Then, the PSF algorithm is applied to the differenced time series. Finally, the original and predicted values are restored by applying the reverse of the differencing process. The proposed methodology is tested on several complex climate and land processes and its potential is evidenced.
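The DPSF wrapper described above, difference the series, forecast the differences, then undo the differencing, relies on two small transforms that can be sketched directly (toy values; the inner PSF forecasting step is omitted):

```python
def difference(series):
    """First-order differencing, which removes a linear trend."""
    return [b - a for a, b in zip(series, series[1:])]

def undifference(last_observed, diffs):
    """Invert differencing by cumulatively adding (predicted) differences
    onto the last observed value of the original series."""
    out, level = [], last_observed
    for d in diffs:
        level += d
        out.append(level)
    return out

diffs = difference([1.0, 4.0, 9.0])
restored = undifference(9.0, [3.0, 5.0])
```

In DPSF, `diffs` would be forecast by PSF and `undifference` would map those forecasts back to the original scale.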
- Published
- 2022
- Full Text
- View/download PDF
74. Quantitative Comparison of Statistical Methods for Analyzing Human Metabolomics Data
- Author
-
Mir Henglin, Brian L. Claggett, Joseph Antonelli, Mona Alotaibi, Gino Alberto Magalang, Jeramie D. Watrous, Kim A. Lagerborg, Gavin Ovsak, Gabriel Musso, Olga V. Demler, Ramachandran S. Vasan, Martin G. Larson, Mohit Jain, and Susan Cheng
- Subjects
metabolomics, statistical methods, univariate, multivariate, Microbiology, QR1-502 - Abstract
Emerging technologies now allow for mass spectrometry-based profiling of thousands of small molecule metabolites (‘metabolomics’) in an increasing number of biosamples. While offering great promise for insight into the pathogenesis of human disease, standard approaches have not yet been established for statistically analyzing increasingly complex, high-dimensional human metabolomics data in relation to clinical phenotypes, including disease outcomes. To determine optimal approaches for analysis, we formally compare traditional and newer statistical learning methods across a range of metabolomics dataset types. In simulated and experimental metabolomics data derived from large population-based human cohorts, we observe that with an increasing number of study subjects, univariate compared to multivariate methods result in an apparently higher false discovery rate as represented by substantial correlation between metabolites directly associated with the outcome and metabolites not associated with the outcome. Although the higher frequency of such associations would not be considered false in the strict statistical sense, it may be considered biologically less informative. In scenarios wherein the number of assayed metabolites increases, as in measures of nontargeted versus targeted metabolomics, multivariate methods performed especially favorably across a range of statistical operating characteristics. In nontargeted metabolomics datasets that included thousands of metabolite measures, sparse multivariate models demonstrated greater selectivity and lower potential for spurious relationships. When the number of metabolites was similar to or exceeded the number of study subjects, as is common with nontargeted metabolomics analysis of relatively small cohorts, sparse multivariate models exhibited the most-robust statistical power with more consistent results. These findings have important implications for metabolomics analysis in human disease.
- Published
- 2022
- Full Text
- View/download PDF
75. Accelerated Single Linkage Algorithm using the farthest neighbour principle.
- Author
-
Banerjee, Payel, Chakrabarti, Amlan, and Ballabh, Tapas Kumar
- Subjects
- *
HIERARCHICAL clustering (Cluster analysis) , *ALGORITHMS , *BIG data , *NEIGHBORS - Abstract
The Single Linkage algorithm is a hierarchical clustering method that is poorly suited to large datasets because of its high convergence time. This paper proposes an efficient accelerated technique for clustering univariate data with a merging threshold. It is a two-stage algorithm whose first stage is an incremental pre-clustering step that uses the farthest-neighbour principle to partially cluster the database while scanning it only once. The algorithm uses the Segment Addition Postulate as the main tool for accelerating the pre-clustering stage, and its incremental nature makes it suitable for partially clustering streaming data as it is collected. The second stage merges these pre-clusters to produce the final set of Single Linkage clusters by comparing only the largest and smallest datum of each pre-cluster, thereby converging faster than methods in which all members of the clusters take part in each clustering action. The algorithm also suits fast-changing dynamic databases, since it can cluster newly added data without revisiting all the data in the database. Experiments conducted on various datasets confirm that the proposed algorithm outperforms its well-known variants. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
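Entry 75's setting, single linkage on univariate data with a merging threshold, has a convenient equivalent form: sort the data and cut at consecutive gaps larger than the threshold, so each cluster is fully summarised by its smallest and largest member (the property the second stage exploits). A minimal illustrative sketch, not the authors' accelerated implementation; the function name is hypothetical:

```python
def single_linkage_1d(values, threshold):
    """Single-linkage clusters of univariate data under a merging threshold.

    For sorted univariate data, single linkage with threshold t is equivalent
    to cutting at consecutive gaps larger than t, so each resulting cluster
    is fully described by its smallest and largest member.
    """
    clusters = []
    for x in sorted(values):
        if clusters and x - clusters[-1][-1] <= threshold:
            clusters[-1].append(x)      # gap small enough: extend cluster
        else:
            clusters.append([x])        # gap too large: start a new cluster
    return clusters

clusters = single_linkage_1d([5.1, 1.0, 9.7, 1.2, 5.0], threshold=1.0)
```

Because only each cluster's endpoints matter, two clusters can later be merged by comparing their extreme members alone, which is the source of the speed-up described in the abstract.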
76. Healing of Apical Periodontitis after Minimally Invasive Endodontics therapy using Er,Cr:YSGG laser: A Prospective Clinical Study.
- Author
-
Shaheed, Ali A., Jawad, Hussien A., Hussain, Basima M. A., and Said, Ahmed M.
- Subjects
- *
PERIAPICAL periodontitis , *SALINE solutions , *LASERS , *ROOT canal treatment , *LONGITUDINAL method , *PERIAPICAL diseases - Abstract
The aim of the present study was to clarify the healing percentages over a 6-month period after clinical endodontic treatment in cases treated with an Er,Cr:YSGG laser (Waterlase MD; Biolase Technology, Inc, San Clemente, CA) and filled with two obturation techniques. The study group was composed of 40 patients referred for endodontic treatment and diagnosed with apical periodontitis. They received minimally invasive nonsurgical root canal treatment with ProTaper Next instruments (Dentsply Maillefer, Ballaigues, Switzerland) and copious irrigation with 2 mL of 5% NaOCl. After instrumentation, laser irradiation was performed for smear layer removal with an Er,Cr:YSGG laser at a 2,780 nm wavelength, using radial firing tips RFT2 and RFT3 (diameter 200 μm for the apical and middle thirds and 320 μm for the coronal third, respectively). After laser irradiation, a final irrigation was done with 5 mL of saline solution, and the root canal was then disinfected with the same laser device. The subjects were divided into two groups: the first was obturated with a carrier-based technique (GuttaCore, Dentsply Maillefer, Ballaigues, Switzerland) and the other with the cold lateral compaction technique; AH Plus sealer (Dentsply Tulsa Dental Specialties, Tulsa, OK) was used in both groups. Healing of the different apical periodontitis cases was evaluated clinically and radiographically using Periapical Index (PAI) scoring, which distinguishes three conditions: healed, healing, or diseased. Successful cases, comprising the healed and healing conditions, indicated the success of root canal treatment. Statistical analysis was done using the independent t-test, univariate (one-way ANOVA) test, and Pearson coefficient. Differences between variables were set as significant at 5% (P≤0.05) and highly significant at 1% (P≤0.01). The prognosis of healing rates was compared over time.
Forty patients were followed up at three recall periods: 1 month, 3 months, and 6 months after treatment. The success rate of root canal treatment for these periods was 67.5%, 82.5%, and 97.5%, respectively. In conclusion, teeth with different apical diagnoses can be treated with the Er,Cr:YSGG laser, which showed a tremendous degree of root canal treatment success over the 6 months after treatment. The Er,Cr:YSGG laser permits a rapid rate of healing with a predictable outcome. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
77. Combining rules for F- and Beta-statistics from multiply-imputed data
- Author
-
Ashok Chaurasia
- Subjects
Statistics and Probability ,Economics and Econometrics ,education.field_of_study ,Combining rules ,Covariance matrix ,Computer science ,05 social sciences ,Population ,Univariate ,050401 social sciences methods ,Estimator ,Inference ,Missing data ,01 natural sciences ,010104 statistics & probability ,0504 sociology ,Statistics ,0101 mathematics ,Statistics, Probability and Uncertainty ,education ,Type I and type II errors - Abstract
Missing values in data impede the task of inference for population parameters of interest. Multiple Imputation (MI) is a popular method for handling missing data since it accounts for the uncertainty of missing values. Inference in MI involves combining point and variance estimates from each imputed dataset via Rubin’s rules. A sufficient condition for these rules is that the estimator is approximately (multivariate) normally distributed. However, these traditional combining rules become computationally cumbersome for multicomponent parameters of interest, and unreliable at high rates of missingness (due to an unstable variance matrix). New combining rules for univariate F- and Beta-statistics from multiply-imputed data are proposed for decisions about multicomponent parameters. The proposed combining rules have the advantage of being computationally convenient, since they only involve univariate F- and Beta-statistics, while providing the same inferential reliability as the traditional multivariate combining rules. A simulation study is conducted to demonstrate that the proposed method has good statistical properties, maintaining low type I and type II error rates at relatively large proportions of missingness. The general applicability of the proposed method is demonstrated within a lead exposure study to assess the association between lead exposure and neurological motor function.
- Published
- 2023
- Full Text
- View/download PDF
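For context on the traditional rules that entry 77 builds on, Rubin's rules pool a scalar estimate across m imputed datasets as sketched below; the paper's proposed F- and Beta-statistic rules are more involved. A minimal sketch for the univariate case, with hypothetical names:

```python
import statistics

def rubin_pool(estimates, variances):
    """Pool a scalar estimate across m imputed datasets via Rubin's rules."""
    m = len(estimates)
    qbar = sum(estimates) / m             # pooled point estimate
    ubar = sum(variances) / m             # average within-imputation variance
    b = statistics.variance(estimates)    # between-imputation variance
    total = ubar + (1 + 1 / m) * b        # total variance of qbar
    return qbar, total

# three imputed datasets, each yielding a point estimate and its variance
qbar, total_var = rubin_pool([1.1, 0.9, 1.0], [0.040, 0.050, 0.045])
```

The total variance term `(1 + 1/m) * b` is what grows unstable at high missingness rates, the problem the proposed combining rules are designed to avoid.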
78. Maximum Cranial Circumference: A Predictor of Sexual Dimorphism of Human Skull
- Author
-
Petkar, Madhusudan R., Datir, Sandesh B., Makhani, Chandeep Singh, Farooqui, Jamebaseer, Bangal, Rajendra S., and Chavan, Kalidas D.
- Published
- 2018
- Full Text
- View/download PDF
79. A network-level test of the role of the co-activated default mode network in episodic recall and social cognition
- Author
-
Richard J. Binney, Grace E. Rice, Rebecca L. Jackson, Gina F. Humphreys, Matthew A. Lambon Ralph, Lambon Ralph, Matthew [0000-0001-5907-2488], and Apollo - University of Cambridge Repository
- Subjects
Social Cognition ,Brain Mapping ,Recall ,Episodic memory ,Cognitive Neuroscience ,Univariate ,Brain ,Experimental and Cognitive Psychology ,Independent component analysis ,Task (project management) ,Neuropsychology and Physiological Psychology ,Social cognition ,Theory of mind ,Mental Recall ,Default mode network ,Humans ,Psychology ,Cognitive psychology ,Resting-state networks - Abstract
Resting-state network research is extremely influential, yet the functions of many networks remain unknown. In part, this is due to typical (e.g., univariate) analyses testing the function of individual regions rather than the full set of co-activated regions that form a network. Connectivity is dynamic, and the function of a region may change based on its current connections. Therefore, determining the function of a network requires assessment at the network level. Yet popular theories implicating the default mode network (DMN) in episodic memory and social cognition rest principally upon analyses performed at the level of individual brain regions. Here we use independent component analysis to formally test the role of the DMN in episodic and social processing at the network level. As well as an episodic retrieval task, two independent datasets were employed to assess DMN function across the breadth of social cognition: a person knowledge judgement and a theory of mind task. Each task dataset was separated into networks of co-activated regions. In each, the co-activated DMN was identified through comparison to an a priori template, and its relation to the task model was assessed. This co-activated DMN did not show greater activity in episodic or social tasks than in high-level baseline conditions. Thus, no evidence was found to support hypotheses that the co-activated DMN is involved in explicit episodic or social processing tasks at the network level. The networks associated with these processes are described. Implications for prior univariate findings and the functional significance of the co-activated DMN are considered.
- Published
- 2023
80. Univariate and multivariate spatial models of health facility utilisation for childhood fevers in an area on the coast of Kenya
- Author
-
Paul O. Ouma, Nathan O. Agutu, Robert W. Snow, and Abdisalan M. Noor
- Subjects
Utilisation ,Fever ,Univariate ,Multivariate ,Computer applications to medicine. Medical informatics ,R858-859.7 - Abstract
Abstract Background Precise quantification of health service utilisation is important for the estimation of disease burden and allocation of health resources. Current approaches to mapping health facility utilisation rely on spatial accessibility alone as the predictor. However, other spatially varying social, demographic and economic factors may affect the use of health services, and excluding them can lead to inaccurate estimation of health facility utilisation. Here, we compare the accuracy of a univariate spatial model, developed only from estimated travel time, to a multivariate model that also includes relevant social, demographic and economic factors. Methods A theoretical surface of travel time to the nearest public health facility was developed. Travel times were assigned to each child reported to have had fever in the Kenya Demographic and Health Survey of 2014 (KDHS 2014). The relationship of child treatment-seeking for fever with travel time and with household and individual factors from the KDHS 2014 was determined using multilevel mixed modelling. Bayesian information criterion (BIC) and likelihood ratio test (LRT) checks were carried out to measure how the selected factors improve the parsimony and goodness of fit of the travel-time model. Using the mixed model, a univariate spatial model of health facility utilisation was fitted with travel time as the sole predictor. The mixed model was also used to compute a multivariate spatial model of utilisation, using travel time and modelled surfaces of the selected household and individual factors as predictors. The univariate and multivariate spatial models were then compared using the area under the receiver operating characteristic curve (AUC) and a percent correct prediction (PCP) test. Results The best-fitting multivariate model had travel time, household wealth index and number of children in the household as predictors. These factors reduced the BIC of the travel-time model from 4008 to 2959, a change confirmed by the LRT.
Although the two modelled probability surfaces were highly correlated (adjusted R² = 88%), the multivariate model had a better AUC than the univariate model (0.83 versus 0.73) and a better PCP (0.61 versus 0.45). Conclusion Our study shows that a model using travel time together with household- and individual-level socio-demographic factors estimates the use of health facilities for the treatment of childhood fever more accurately than one relying on travel time alone.
- Published
- 2017
- Full Text
- View/download PDF
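The AUC comparison in entry 80 can be computed from predicted utilisation probabilities via the rank (Mann-Whitney) formulation. A minimal sketch with made-up scores, not the study's data:

```python
def auc(scores, labels):
    """AUC via the Mann-Whitney formulation: the probability that a randomly
    chosen positive case is scored higher than a randomly chosen negative one,
    with ties counted as half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# hypothetical predicted probabilities of facility use (1 = used, 0 = did not)
labels = [1, 1, 1, 0, 0, 0]
univariate_scores = [0.9, 0.4, 0.7, 0.6, 0.3, 0.2]
multivariate_scores = [0.9, 0.6, 0.8, 0.5, 0.3, 0.2]

auc_uni = auc(univariate_scores, labels)
auc_multi = auc(multivariate_scores, labels)
```

Here the richer score set separates positives from negatives better, mirroring the 0.83-versus-0.73 gap the study reports.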
81. Sampling properties of the Bayesian posterior mean with an application to WALS estimation
- Author
-
Giuseppe De Luca, Jan R. Magnus, Franco Peracchi, Econometrics and Data Science, Giuseppe De Luca, Jan R Magnu, and Franco Peracchi
- Subjects
Economics and Econometrics ,WALS ,SDG 16 - Peace ,Settore SECS-P/05 ,Monte Carlo method ,Bayesian probability ,Posterior probability ,Settore SECS-P/05 - Econometria ,Double-shrinkage estimators ,01 natural sciences ,Least squares ,010104 statistics & probability ,Frequentist inference ,0502 economics and business ,Statistics ,Posterior moments and cumulants ,Statistics::Methodology ,0101 mathematics ,double-shrinkage estimator ,050205 econometrics ,Mathematics ,Location model ,Applied Mathematics ,05 social sciences ,SDG 16 - Peace, Justice and Strong Institutions ,Univariate ,Sampling (statistics) ,Estimator ,Variance (accounting) ,Justice and Strong Institutions ,Sample size determination ,posterior moments and cumulant ,Normal location model - Abstract
Many statistical and econometric learning methods rely on Bayesian ideas, often applied or reinterpreted in a frequentist setting. Two leading examples are shrinkage estimators and model averaging estimators, such as weighted-average least squares (WALS). In many instances, the accuracy of these learning methods in repeated samples is assessed using the variance of the posterior distribution of the parameters of interest given the data. This may be permissible when the sample size is large because, under the conditions of the Bernstein–von Mises theorem, the posterior variance agrees asymptotically with the frequentist variance. In finite samples, however, things are less clear. In this paper we explore this issue by first considering the frequentist properties (bias and variance) of the posterior mean in the important case of the normal location model, which consists of a single observation on a univariate Gaussian distribution with unknown mean and known variance. Based on these results, we derive new estimators of the frequentist bias and variance of the WALS estimator in finite samples. We then study the finite-sample performance of the proposed estimators by a Monte Carlo experiment with a design derived from a real data application about the effect of abortion on crime rates.
- Published
- 2022
- Full Text
- View/download PDF
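In the normal location model that entry 81 starts from, the posterior mean under a conjugate normal prior is a linear shrinkage of the observation, which is what makes its frequentist bias and variance tractable. A minimal sketch with hypothetical names:

```python
def posterior_mean(x, sigma2, mu0, tau2):
    """Posterior mean of the unknown mean theta given one observation
    x ~ N(theta, sigma2) and a conjugate prior theta ~ N(mu0, tau2):
    a shrinkage of the observation toward the prior mean mu0."""
    w = tau2 / (tau2 + sigma2)       # weight placed on the data
    return w * x + (1 - w) * mu0

# Frequentist view at a fixed true theta:
#   bias     = E[posterior_mean] - theta = (1 - w) * (mu0 - theta)
#   variance = w**2 * sigma2  (smaller than the sampling variance sigma2)
pm = posterior_mean(2.0, 1.0, 0.0, 1.0)   # equal weights: shrinks 2.0 halfway to 0
```

The bias-variance trade-off in the comments is exactly the finite-sample tension the paper studies for WALS.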
82. Diagnostic Efficiency of Diffusion Sequences and a Clinical Nomogram for Detecting Lymph Node Metastases from Rectal Cancer
- Author
-
Qing Xu, Yang Li, Jing Yu, Ming Lu, Hongyuan Shi, and Chen Wang
- Subjects
medicine.medical_specialty ,Multivariate analysis ,Rectal Neoplasms ,business.industry ,Univariate ,Nomogram ,Logistic regression ,Motion ,Nomograms ,Diffusion Magnetic Resonance Imaging ,medicine.anatomical_structure ,Lymphatic Metastasis ,Rectal Adenocarcinoma ,Humans ,Medicine ,Radiology, Nuclear Medicine and imaging ,Radiology ,business ,Lymph node ,Intravoxel incoherent motion ,Retrospective Studies ,Diffusion MRI - Abstract
Rationale and Objectives First, to evaluate and compare three diffusion sequences (standard DWI, IVIM, and DKI) for nodal staging; second, to combine the DWI and anatomic information to assess metastatic lymph nodes (LNs). Materials and Methods We retrospectively identified 136 patients with rectal adenocarcinoma who met the inclusion criteria. The three diffusion sequences (standard DWI, IVIM, and DKI) were performed, and quantitative parameters were evaluated. Univariate and multivariate analyses were used to assess the associations between the anatomic and DWI information and LN pathology. Multivariate logistic regression was used to identify independent risk factors. A nomogram model was established, and model performance was evaluated by the concordance index (c-index) and calibration curve. Results There was a statistical difference in the variables (LN long diameter, LN short diameter, LN boundary, LN signal, peri-LN signal intensity, ADC-1000, ADC-1400, ADC-2000, Kapp and D) between metastatic and non-metastatic LNs for the training and validation cohorts (p Conclusion The diagnostic efficiency of the IVIM and DKI models was not significantly improved compared to conventional DWI. The diagnostic accuracy for metastatic LNs can be enhanced using the nomogram model, leading to a rational therapeutic choice.
- Published
- 2022
- Full Text
- View/download PDF
83. Spirituality and the quality of life of cancer patients: An antidepressant effect
- Author
-
Khalil Honein, G. Rached, Fady Haddad, K. Jradi, Moussa Riachy, E. Mikhael, T. Ghayad, D. Chelala, Ghassan Sleilaty, Georges Dabar, Sami Richa, and M. Mekhael
- Subjects
Gerontology ,Univariate analysis ,Multivariate analysis ,business.industry ,Univariate ,Cancer ,Pilot Projects ,medicine.disease ,Antidepressive Agents ,Psychiatry and Mental health ,Cross-Sectional Studies ,Arts and Humanities (miscellaneous) ,Quality of life ,Neoplasms ,Surveys and Questionnaires ,Adaptation, Psychological ,Chronic Disease ,Spirituality ,Quality of Life ,Humans ,Medicine ,business ,Psychosocial ,Depression (differential diagnoses) - Abstract
Objectives Cancer is the second leading cause of mortality in the world and represents an economic, social and psychological burden. Scientific studies have focused on patients' psychosocial coping mechanisms and on factors improving their quality of life. The aim of the present study is therefore to analyze the influence spirituality may have on the quality of life of Lebanese cancer patients and to identify whether any influence on quality of life is mediated through decreased depression. Methods This is a cross-sectional study targeting cancer patients in the hemato-oncology department of the Hotel-Dieu de France Hospital (Beirut, Lebanon). It is based on a questionnaire composed of three parts: EQ-5D-5L, PHQ-9, and FACIT-Sp-12. A control group of patients suffering from chronic diseases and treated in the same hospital was also questioned. Univariate and multivariate analyses were conducted to assess the relationships between the different questionnaires for controls and for cancer patients. Results Thirty-nine cancer patients and eight control patients were questioned. In the univariate analysis, there was no relationship between depression and spirituality, nor between spirituality and quality of life. After controlling for depression, an inverse correlation between quality of life and spirituality was shown. Conclusions Our study is a pilot study which, for the first time, investigates the implication of depression in a “spirituality-quality of life” association. There is no clear association of spirituality with quality of life. In fact, the physical and psychological burden of chronically ill patients could exceed, and render insignificant, any possible impact of spirituality on quality of life.
- Published
- 2022
- Full Text
- View/download PDF
84. Statistical depth for fuzzy sets
- Author
-
Luis González-De La Fuente, Alicia Nieto-Reyes, Pedro Terán, and Universidad de Cantabria
- Subjects
Multivariate statistics ,Fuzzy data ,Nonparametric statistics ,Logic ,Generalization ,Fuzzy random variable ,Fuzzy set ,Univariate ,Statistical depth ,Space (mathematics) ,Fuzzy logic ,Algebra ,Artificial Intelligence ,Tukey depth ,Probability distribution ,Centrality ,Mathematics - Abstract
Statistical depth functions provide a way to order the elements of a space by their centrality in a probability distribution. This has been very successful in generalizing non-parametric, order-based statistical procedures from univariate to multivariate and (more recently) to functional spaces. We introduce two general definitions of statistical depth which are adapted to fuzzy data. For that purpose, two concepts of symmetric fuzzy random variables are introduced and studied. Furthermore, a generalization of Tukey's halfspace depth to the fuzzy setting is presented and, through a detailed study of its properties, proved to satisfy the above notions. A. Nieto-Reyes and L. Gonzalez are supported by the Spanish Ministerio de Economía, Industria y Competitividad grant MTM2017-86061-C2-2-P. P. Terán is supported by the Ministerio de Economía y Competitividad grant MTM2015-63971-P, the Ministerio de Ciencia, Innovación y Universidades grant PID2019-104486GB-I00 and the Consejería de Empleo, Industria y Turismo del Principado de Asturias grant GRUPIN-IDI2018-000132.
- Published
- 2022
- Full Text
- View/download PDF
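For intuition on entry 84: the crisp univariate version of Tukey's halfspace depth is simply the smaller of the two closed tail proportions at a point, and fits in a few lines; the paper's contribution is extending this notion to fuzzy random variables. An illustrative sketch:

```python
def tukey_depth_1d(x, sample):
    """Empirical halfspace (Tukey) depth of point x in a univariate sample:
    the smaller of the two closed tail proportions at x. Depth is maximal
    at the median and decays toward zero in the tails."""
    n = len(sample)
    left = sum(1 for v in sample if v <= x) / n
    right = sum(1 for v in sample if v >= x) / n
    return min(left, right)

depth_med = tukey_depth_1d(3, [1, 2, 3, 4, 5])   # deepest point: the median
```

Ordering points by this depth recovers the centre-outward ordering that order-based nonparametric procedures rely on.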
85. An approach for evolving neuro-fuzzy forecasting of time series based on parallel recursive singular spectrum analysis
- Author
-
Ginalber Luiz de Oliveira Serra and Selmo Eduardo Rodrigues Júnior
- Subjects
Fuzzy rule ,Series (mathematics) ,Relation (database) ,Neuro-fuzzy ,Logic ,Data stream mining ,Univariate ,computer.software_genre ,Artificial Intelligence ,Data mining ,Time series ,Singular spectrum analysis ,computer ,Mathematics - Abstract
Time series forecasting is an important research topic applied to various areas of human knowledge, such as economics, medicine, meteorology, and engineering. The forecasting results can assist the decision-making process, providing useful projections to specialists. In this paper, a hybrid approach named Parallel Recursive Singular Spectrum Analysis and Evolving Neuro-Fuzzy Network (PRSSA+ENFN) for forecasting univariate and multivariate experimental time series, with learning from data streams composed of time series samples, is presented. The adopted methodology considers an evolving neuro-fuzzy basis, i.e., the PRSSA+ENFN adapts, extends, and evolves the fuzzy rule structure from new incoming data of the time series. The time series is subdivided into unobservable components, which are patterns contained in the data, and the neuro-fuzzy network forecasts these component samples so as to reconstruct the original time series with future values. An alternative for determining the specific parameters of PRSSA+ENFN is the application of a genetic algorithm. Four multi-step-ahead forecasting experiments are outlined with univariate and multivariate time series. Finally, some results competitive with other methods in the literature are discussed.
- Published
- 2022
- Full Text
- View/download PDF
86. On a Sparse Shortcut Topology of Artificial Neural Networks
- Author
-
Hengtao Guo, Fenglei Fan, Qikui Zhu, Ge Wang, Hengyong Yu, Pingkun Yan, and Dayang Wang
- Subjects
FOS: Computer and information sciences ,Computer Science - Machine Learning ,Network architecture ,Artificial neural network ,Continuous function ,Computer science ,Generalization ,Univariate ,Contrast (statistics) ,Machine Learning (stat.ML) ,Topology (electrical circuits) ,Topology ,Machine Learning (cs.LG) ,Statistics - Machine Learning ,Generalizability theory - Abstract
In established network architectures, shortcut connections are often used to take the outputs of earlier layers as additional inputs to later layers. Despite the extraordinary effectiveness of shortcuts, there remain open questions on the mechanism and characteristics. For example, why are shortcuts powerful? Why do shortcuts generalize well? In this paper, we investigate the expressivity and generalizability of a novel sparse shortcut topology. First, we demonstrate that this topology can empower a one-neuron-wide deep network to approximate any univariate continuous function. Then, we present a novel width-bounded universal approximator in contrast to depth-bounded universal approximators and extend the approximation result to a family of equally competent networks. Furthermore, with generalization bound theory, we show that the proposed shortcut topology enjoys excellent generalizability. Finally, we corroborate our theoretical analyses by comparing the proposed topology with popular architectures, including ResNet and DenseNet, on well-known benchmarks and perform a saliency map analysis to interpret the proposed topology. Our work helps enhance the understanding of the role of shortcuts and suggests further opportunities to innovate neural architectures.
- Published
- 2022
- Full Text
- View/download PDF
87. Air quality prediction in metropolitan areas using deep learning methods
- Author
-
Ionascu, Augustin Ionut and Ionascu, Augustin Ionut
- Abstract
The rapid growth of the world's urban population shows that people are increasingly moving to cities. In recent decades, the frequent occurrence of smog caused by increasing industrialization has brought environmental pollution to record highs. The need for air quality forecasting models therefore arises when the ambient air contains gases, dust particles, smoke or odors in quantities large enough to be harmful to organic life. Accurate forecasts help people anticipate environmental conditions and act accordingly to decrease dangerous pollution levels, reducing health impacts and associated costs. Rather than investigating deterministic models that attempt to simulate physical processes through complex mathematical simulations, this paper focuses on statistical methods, studying historical information and extracting information from data patterns. In the search for new reliable air quality forecasting methods, the goal was to develop and test an artifact based on the Transformer architecture, a technique initially developed for natural language processing tasks. Testing was performed against well-established recurrent and convolutional deep-learning models successfully implemented in many applications, including time-series forecasting. Two different Transformer models were tested: one using time embeddings in the same manner as the original paper, while the second adapted the Time2Vec method. The obtained results reveal that, even though not necessarily better than the reference models, both Transformers could output accurate predictions and perform almost as well as the recurrent and convolutional models.
- Published
- 2023
88. A Bivariate Copula Approach to Extreme Water Level Estimation: For the city of Venice
- Author
-
Draisma, Max (author) and Draisma, Max (author)
- Abstract
Understanding the factors that drive extreme water levels is key to an accurate assessment of flood hazard. The city of Venice has always been affected by flooding due to extreme water levels. In this study, we examine the factors driving and influencing extreme water levels in the Venice lagoon, aiming at deriving accurate extreme water level estimates in the Venice lagoon. Due to the shallowness of the Venice lagoon, extreme water levels are influenced by both atmospheric forcing (surge) and water level of the lagoon (tide and bottom level) and interactions between these two. Furthermore, these extreme water levels have been changing over time due to variations in the bottom level. These variations are reportedly due to local (anthropogenic and natural) subsidence and sea level rise. In this study we resort to the available long-term water level observations of the Punta della Salute tide-gauge. Given the effects of subsidence and sea level rise in these data, we start by homogenizing the data by removing these trends and jumps from the time-series. Using the homogenized time-series, we study the influence of the dependence between tide and surge components on the extreme water level estimates. Finally, we quantify the effect in the estimates of modelling this dependence in the extreme value models. To homogenize the data and better understand the underlying trends, a time-series analysis was performed on the time-series of water level observations. Mann-Kendall tests for monotonic trend were performed, followed by an analysis using changepoint detection methods. Changepoint detection was performed using the RHtest and BEAST methods on the Punta della Salute time-series as well as time-series from neighbouring tide-gauge stations. Ultimately trend decomposition using the BEAST method was used to detrend and homogenize the Punta della Salute time-series. 
After detrending, the tide and surge components were separated using tidal harmonic analysis and re, Civil Engineering
- Published
- 2023
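The Mann-Kendall test for monotonic trend used in entry 88 counts concordant minus discordant pairs and applies a normal approximation. A minimal sketch without the tie correction (illustrative only, not the study's RHtest/BEAST pipeline):

```python
from math import sqrt, erf

def mann_kendall(series):
    """Mann-Kendall test for a monotonic trend in a univariate series.
    Returns the S statistic and a two-sided normal-approximation p-value.
    No tie correction is applied, so this suits continuous data."""
    n = len(series)
    s = sum(
        (series[j] > series[i]) - (series[j] < series[i])  # +1 concordant, -1 discordant
        for i in range(n - 1)
        for j in range(i + 1, n)
    )
    var_s = n * (n - 1) * (2 * n + 5) / 18
    if s > 0:
        z = (s - 1) / sqrt(var_s)       # continuity correction
    elif s < 0:
        z = (s + 1) / sqrt(var_s)
    else:
        z = 0.0
    p = 1 - erf(abs(z) / sqrt(2))       # two-sided p-value under N(0, 1)
    return s, p

s, p = mann_kendall(list(range(1, 11)))   # a strictly increasing toy series
```

A strictly increasing series of length n attains the maximum S = n(n-1)/2 and a correspondingly small p-value.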
89. Sire evaluation models for estimating breeding values of Mehsana buffaloes
- Author
-
R N SATHWARA, J P GUPTA, J D CHAUDHARI, B M PRAJAPATI, A K SRIVASTAVA, H D CHAUHAN, and P A PATEL
- Subjects
Animal model ,Bivariate ,Breeding value ,Sire model ,Univariate ,Animal culture ,SF1-1100 - Abstract
First lactation data on 7,782 Mehsana buffaloes sired by 184 sires, maintained at the Dudhsagar Research and Development Association, Dudhsagar Dairy, Mehsana over a period of 24 years (1989–2012), were used to estimate least-squares means (LSM) and breeding values for first lactation fat yield (FLFY) and average fat percentage (AFP) using univariate and bivariate models with the help of the WOMBAT software. The effectiveness of the different sire evaluation models for FLFY and AFP was compared on the basis of error variance, coefficient of variation (CV%), R2-value, AIC, BIC and Spearman’s rank correlation. The average estimates of FLFY and AFP were 135.04±0.57 kg and 7.11±0.11% in Mehsana buffaloes. These estimates were significantly affected by period and season of calving, and by age at first calving group. The average expected breeding values of Mehsana buffalo bulls for FLFY and AFP were 133.24 kg and 7.14% using the sire model (BLUP-SM), 135.71 kg and 7.22% using the univariate animal model (BLUP-U-AM), and 133.23 kg and 7.14% using the bivariate animal model (BLUP-B-AM). Spearman’s rank correlation indicated similar rankings by BLUP-U-AM and BLUP-B-AM. The animal model had a wider range of breeding values, indicating its greater differentiating ability. Based on error variance, AIC, BIC, R2 and CV, the animal model was found to be superior to the sire model.
- Published
- 2019
- Full Text
- View/download PDF
90. otsad: A package for online time-series anomaly detectors.
- Author
-
Iturria, Alaiñe, Carrasco, Jacinto, Charramendieta, Santi, Conde, Angel, and Herrera, Francisco
- Subjects
- *
DETECTORS , *ONLINE algorithms , *PACKAGING , *UNIVARIATE analysis , *ANOMALY detection (Computer security) , *TIME series analysis - Abstract
This paper presents otsad, the first R package to implement a set of novel online detection algorithms for univariate time series. The package also provides advanced functionalities, such as a new false-positive reduction algorithm and the novel NAB detector measurement technique, which is specifically designed to evaluate online time-series anomaly detectors. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
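As a toy illustration of the online detection task that entry 90's otsad package addresses, the sketch below flags points whose z-score against a sliding window of recent history is extreme. This is a generic rolling z-score detector, not one of the package's algorithms, and all names are hypothetical:

```python
from collections import deque

class RollingZScoreDetector:
    """Toy online univariate anomaly detector: flags a point whose z-score
    against a sliding window of recent history exceeds a threshold.
    Illustrative only; not one of the otsad algorithms."""

    def __init__(self, window=30, z_thresh=3.0, warmup=5):
        self.buf = deque(maxlen=window)   # bounded history: O(window) memory
        self.z_thresh = z_thresh
        self.warmup = warmup              # minimum history before scoring

    def update(self, x):
        """Process one new observation; return True if it looks anomalous."""
        is_anomaly = False
        if len(self.buf) >= self.warmup:
            mean = sum(self.buf) / len(self.buf)
            std = (sum((v - mean) ** 2 for v in self.buf) / len(self.buf)) ** 0.5
            if std > 0 and abs(x - mean) / std > self.z_thresh:
                is_anomaly = True
        self.buf.append(x)
        return is_anomaly

det = RollingZScoreDetector(window=10)
flags = [det.update(v) for v in [10.0, 10.2, 9.8] * 5 + [100.0]]  # spike at the end
```

Processing one point at a time with bounded memory is what makes a detector "online" in the sense the package targets.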
91. Nonparametric (distribution-free) control charts: An updated overview and some results.
- Author
-
Chakraborti, S. and Graham, M. A.
- Subjects
QUALITY control charts ,QUALITY control - Abstract
Control charts that are based on assumption(s) of a specific form for the underlying process distribution are referred to as parametric control charts. There are many applications where there is insufficient information to justify such assumption(s) and, consequently, control charting techniques with a minimal set of distributional assumption requirements are in high demand. To this end, nonparametric or distribution-free control charts have been proposed in recent years. The charts have stable in-control properties, are robust against outliers and can be surprisingly efficient in comparison with their parametric counterparts. Chakraborti and some of his colleagues provided review papers on nonparametric control charts in 2001, 2007 and 2011, respectively. These papers have been received with considerable interest and attention by the community. However, the literature on nonparametric statistical process/quality control/monitoring has grown exponentially and because of this rapid growth, an update is deemed necessary. In this article, we bring these reviews forward to 2017, discussing some of the latest developments in the area. Moreover, unlike the past reviews, which did not include the multivariate charts, here we review both univariate and multivariate nonparametric control charts. We end with some concluding remarks. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
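As a concrete illustration of the distribution-free idea surveyed above, consider a basic sign chart: the count of subgroup observations above the in-control median follows Binomial(n, 0.5) regardless of the process distribution, so control limits need no parametric assumption. A minimal Python sketch (function names and the tail level `alpha` are illustrative choices, not from the review):

```python
from math import comb

def sign_chart_limits(n, alpha=0.005):
    """Two-sided limits for the count of subgroup observations above the
    in-control median; each tail probability is kept at or below alpha."""
    def cdf(k):  # Binomial(n, 0.5) CDF
        return sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    lcl = 0
    while cdf(lcl) < alpha:
        lcl += 1
    ucl = n
    while 1 - cdf(ucl - 1) < alpha:
        ucl -= 1
    return lcl, ucl

def sign_chart(subgroups, median0):
    """Signal when the above-median count falls outside the limits."""
    signals = []
    for sg in subgroups:
        lcl, ucl = sign_chart_limits(len(sg))
        stat = sum(1 for x in sg if x > median0)
        signals.append(stat < lcl or stat > ucl)
    return signals
```

Because the in-control distribution of the statistic is known exactly, the false-alarm rate is guaranteed for any continuous process distribution, which is precisely the "stable in-control properties" the abstract refers to.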
92. Limitations of univariate linear bias correction in yielding cross‐correlation between monthly precipitation and temperature.
- Author
-
Bhowmik, R. Das and Sankarasubramanian, A.
- Subjects
- *
STATISTICAL bias , *UNIVARIATE analysis , *METEOROLOGICAL precipitation , *CONFIDENCE intervals , *REGRESSION analysis , *U.S. states - Abstract
Statistical bias correction techniques are commonly used in climate model projections to reduce systematic biases. Among the several bias correction techniques, univariate linear bias correction (e.g., quantile mapping) is the most popular, given its simplicity. Univariate linear bias correction can accurately reproduce the observed mean of a given climate variable. However, when performed separately on multiple variables, it does not yield the observed multivariate cross‐correlation structure. In the current study, we consider the intrinsic properties of two candidate univariate linear bias‐correction approaches (simple linear regression and asynchronous regression) in estimating the observed cross‐correlation between precipitation and temperature. Two linear regression models are applied separately to both the observed and the projected variables. The analytical solution suggests that, because of their linearity, the two candidate approaches simply reproduce the cross‐correlation from the general circulation models (GCMs) in the bias‐corrected data set. Our study adopts two frameworks, based on the Fisher z‐transformation and bootstrapping, to provide 95% lower and upper confidence limits (referred to as the permissible bound) for the GCM cross‐correlation. Beyond the permissible bound, the raw/bias‐corrected GCM cross‐correlation differs significantly from the observed one. The two frameworks are applied to three GCMs from the CMIP5 multimodel ensemble over the coterminous United States. 
We found that (a) the univariate linear techniques fail to reproduce the observed cross‐correlation in the bias‐corrected data set over 90% (30–50%) of the grid points where the multivariate skewness coefficient values are substantial (small) and significantly (not significantly) different from zero; (b) the performance of the univariate linear techniques under bootstrapping (Fisher z‐transformation) remains uniform (non‐uniform) across climate regions, months, and GCMs; (c) grid points where the observed cross‐correlation is statistically significant witness a failure fraction of around 0.2 (0.8) under the Fisher z‐transformation (bootstrapping). The importance of reproducing cross‐correlations is also discussed, along with an enquiry into multivariate approaches that could potentially address the bias in yielding cross‐correlations. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
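The Fisher z-transformation framework described in the abstract can be illustrated concretely: transform the observed correlation, build a 95% interval on the z scale, and back-transform to obtain the "permissible bound" against which a GCM cross-correlation is compared. A minimal sketch, with hypothetical function names; the paper's bootstrapping framework is not reproduced here.

```python
import math

def fisher_z_bounds(r_obs, n, z_crit=1.96):
    """95% permissible bound for a correlation estimated from n pairs,
    via the Fisher z-transformation: z = atanh(r), SE = 1/sqrt(n - 3)."""
    z = math.atanh(r_obs)
    se = 1.0 / math.sqrt(n - 3)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

def correlation_biased(r_obs, r_gcm, n):
    """True when the GCM cross-correlation falls outside the permissible
    bound built around the observed cross-correlation."""
    lo, hi = fisher_z_bounds(r_obs, n)
    return not (lo <= r_gcm <= hi)
```

For example, with an observed P-T correlation of -0.5 over 100 pairs, a GCM correlation of -0.1 lies outside the bound and would be flagged, while -0.45 would not.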
93. A influência de outliers nos estudos métricos da informação: uma análise de dados univariados.
- Author
-
Lima, Luís Fernando Maia, Masson Maroldi, Alexandre, da Silva, Dávilla Vieira Odízio, Hayashi, Carlos Roberto Massao, and Innocentini Hayashi, Maria Cristina Piumbato
- Abstract
This paper presents a new formula for detecting outliers through Exploratory Data Analysis that takes data asymmetry into account. The effect of removing outliers from the original dataset was also assessed. The new formula was applied to three datasets published in the literature on metric studies of information. The first dataset contained five lower outliers. The average of the aggregate data conveyed the false impression that 40 universities, out of a total of 49, were above average; removing the five lower outliers produced a new average above which only 22 universities remained. In the second dataset, there were five lower outliers and one upper outlier; in this case, the upper outlier partially offset the effect of the lower outliers. In the third dataset, five upper outliers and one lower outlier were detected. The average of the aggregate data indicated that 10 universities were above average, but after removing the six outliers, 28 universities were above the new average. For all three datasets, the assessment demonstrated the effect of outliers on interval estimation (statistical inference): removing them produced a mean and standard deviation that were more representative of the sample analyzed. It thus became evident how outliers can influence results and conclusions in metric studies of information. The formula for outlier detection, however, remains open to future research. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
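The paper's asymmetry-adjusted formula is not given in the abstract; as a baseline illustration of how removing outliers shifts the mean in exactly the way described, here is the classic Tukey-fences rule in Python (the fence constant k = 1.5 is the conventional choice, not the paper's formula):

```python
def tukey_outliers(data, k=1.5):
    """Classic Tukey fences: keep points inside [Q1 - k*IQR, Q3 + k*IQR].
    Returns (kept, removed)."""
    s = sorted(data)
    def quantile(p):  # linear interpolation between order statistics
        idx = p * (len(s) - 1)
        lo = int(idx)
        frac = idx - lo
        return s[lo] + frac * (s[min(lo + 1, len(s) - 1)] - s[lo])
    q1, q3 = quantile(0.25), quantile(0.75)
    iqr = q3 - q1
    lo, hi = q1 - k * iqr, q3 + k * iqr
    return ([x for x in data if lo <= x <= hi],
            [x for x in data if x < lo or x > hi])
```

On a toy sample with one gross upper outlier, e.g. `[10, 12, 11, 13, 12, 11, 14, 13, 12, 95]`, the aggregate mean is 20.3 and nine of the ten values sit "below average"; dropping the flagged value 95 yields a mean of 12.0 that actually represents the sample, the same effect the paper reports for university rankings.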
94. Sleeve Gastrectomy Weight Loss and the Preoperative and Postoperative Predictors: a Systematic Review.
- Author
-
Cottam, Samuel, Cottam, Daniel, and Cottam, Austin
- Subjects
WEIGHT loss ,SLEEVE gastrectomy ,META-analysis ,PREDICTION models ,MULTIVARIATE analysis - Abstract
The sleeve gastrectomy (SG) is the most popular weight loss procedure in the USA. Despite its popularity, little is definitively known about the variables that correlate with weight loss. We performed a literature search for studies reporting variables that correlated with weight loss following SG. Forty-eight articles were identified and included. These articles covered 36 different factors predictive of weight loss but included only five predictive models. Only 12.5% of the multivariate analyses evaluated reported their results sufficiently. The factors that predict weight loss following SG cannot be established conclusively owing to inconsistent reporting and methodological flaws in the analyses. Reporting of predictive factors should be standardized, and methods should be changed so that physicians can use the data presented. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
95. The role of cross-correlation between precipitation and temperature in basin-scale simulations of hydrologic variables.
- Author
-
Seo, S.B., Das Bhowmik, R., Sankarasubramanian, A., Mahinthakumar, G., and Kumar, M.
- Subjects
- *
CROSS correlation , *METEOROLOGICAL precipitation , *TEMPERATURE , *CLIMATE change , *STANDARD deviations - Abstract
Highlights • Precipitation (P)-Temperature (T) cross-correlation structure impacts hydrologic simulations. • Simulation of surface/sub-surface variables is improved by reproducing P-T cross-correlation. • Multivariate pre-processing approaches are essential for climate change impact studies. Abstract Uncertainty in climate forcings causes significant uncertainty in estimating streamflow and other land-surface fluxes in hydrologic model simulations. Earlier studies primarily analyzed the importance of reproducing the cross-correlation between precipitation and temperature (P-T cross-correlation) using various downscaling and weather-generator schemes, leaving open how biased estimates of P-T cross-correlation impact the simulation of streamflow and other hydrologic variables. The current study investigates the impacts of biased P-T cross-correlation on hydrologic variables using a fully coupled hydrologic model (Penn State Integrated Hydrologic Model, PIHM). For this purpose, a synthetic weather generator was developed to generate multiple realizations of daily climate forcings for a specified P-T cross-correlation. We then analyzed how reproducing or neglecting P-T cross-correlation in climate forcings affects the accuracy of a hydrologic simulation. A total of 50 synthetic data sets of daily climate forcings with different P-T cross-correlations were forced into PIHM to estimate streamflow, soil moisture, and groundwater level under humid (Haw River basin in NC, USA) and arid (Lower Verde River basin in AZ, USA) hydroclimate settings. Results show that climate forcings reproducing the P-T cross-correlation yield smaller root mean square errors in simulated hydrologic variables (primarily the sub-surface variables) than climate forcings that neglect it. Impacts of P-T cross-correlation on hydrologic simulations were pronounced for low-flow and sub-surface variables but less significant for flow variables that exhibit higher variability. 
We found that hydrologic variables with lower internal variability (for example, groundwater level and soil-moisture depth) are susceptible to bias in the P-T cross-correlation. These findings have potential implications for using univariate linear downscaling techniques to bias-correct GCM forcings, since univariate linear bias-correction techniques reproduce the GCM-estimated P-T cross-correlation without correcting its bias. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
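The synthetic weather generator's core requirement, producing forcings with a specified P-T cross-correlation, can be sketched for standardized anomalies with the usual bivariate-normal construction. This is a deliberate simplification: a real generator must also respect precipitation intermittency and the marginal distributions, which this sketch ignores.

```python
import math
import random

def correlated_pt(n, rho, seed=0):
    """Standardized precipitation/temperature anomalies with a prescribed
    cross-correlation rho: t = rho*p + sqrt(1 - rho^2)*noise."""
    rng = random.Random(seed)
    p = [rng.gauss(0, 1) for _ in range(n)]
    t = [rho * pi + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for pi in p]
    return p, t

def corr(x, y):
    """Sample Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = math.sqrt(sum((a - mx) ** 2 for a in x) / n)
    sy = math.sqrt(sum((b - my) ** 2 for b in y) / n)
    return cov / (sx * sy)
```

Generating an ensemble of such series while sweeping `rho` is the kind of controlled experiment the study uses to isolate the effect of the cross-correlation on simulated hydrologic variables.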
96. Ridge Regression Based Subdivision Schemes for Noisy Data.
- Author
-
Asghar, Muhammad and Mustafa, Ghulam
- Subjects
UNIVARIATE analysis ,BIVARIATE analysis ,RIDGE regression (Statistics) ,LEAST squares ,POLYNOMIALS - Abstract
In this article, a generalized algorithm for curve and surface design is presented, based on Ridge regression. The subdivision schemes generated by the proposed algorithm are less sensitive to outliers, and their quality is better than that of least-squares-based subdivision schemes. Moreover, least-squares-based subdivision schemes are a special case of the proposed schemes. [ABSTRACT FROM AUTHOR]
- Published
- 2019
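The ridge idea underlying such schemes can be illustrated on the simplest univariate case, a straight-line fit to data: the penalty `lam` shrinks the coefficients and damps the response to noise, and `lam = 0` recovers ordinary least squares, mirroring the special-case remark above. A minimal sketch, not the authors' subdivision masks; note that this toy version penalizes the intercept as well.

```python
def ridge_line_fit(xs, ys, lam=0.0):
    """Ridge-regularised line fit y ~ b0 + b1*x via the 2x2 normal
    equations (X^T X + lam*I) beta = X^T y."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    a11, a12 = n + lam, sx       # first row of X^T X + lam*I
    a21, a22 = sx, sxx + lam     # second row
    det = a11 * a22 - a12 * a21
    b0 = (a22 * sy - a12 * sxy) / det
    b1 = (a11 * sxy - a21 * sy) / det
    return b0, b1
```

A subdivision scheme built this way fits such a penalized model to a window of control points at each refinement step, which is why the refined curve responds less to outlying points than a pure least-squares scheme.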
97. Dynamic functional time-series forecasts of foreign exchange implied volatility surfaces
- Author
-
Fearghal Kearney and Han Lin Shang
- Subjects
Multivariate statistics, Implied volatility, Stochastic processes, Economics, Econometrics, Trading strategy, Business and International Management, Functional principal component analysis, Series (mathematics), Long-run covariance, Univariate, Augmented common factor method, Univariate time-series forecasting, Foreign exchange, Statistics - Applications (stat.AP), Statistics - Computation (stat.CO), Quantitative Finance - Statistical Finance (q-fin.ST), MSC 62M20, 60G25 - Abstract
This paper presents static and dynamic versions of univariate, multivariate, and multilevel functional time-series methods to forecast implied volatility surfaces in foreign exchange markets. We find that dynamic functional principal component analysis generally improves out-of-sample forecast accuracy. More specifically, the dynamic univariate functional time-series method shows the greatest improvement. Our models lead to multiple instances of statistically significant improvements in forecast accuracy for daily EUR-USD, EUR-GBP, and EUR-JPY implied volatility surfaces across various maturities, when benchmarked against established methods. A stylised trading strategy is also employed to demonstrate the potential economic benefits of our proposed approach. (52 pages, 5 figures; to appear in the International Journal of Forecasting.)
- Published
- 2022
- Full Text
- View/download PDF
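The univariate functional time-series pipeline, decompose the observed curves into principal components, forecast the score series, reconstruct the next curve, can be sketched in miniature. Below is a pure-Python rank-1 PCA via power iteration with a naive last-value score forecast; the paper's dynamic FPCA and its proper score-forecasting models are not reproduced.

```python
def pca1_forecast(curves, iters=200):
    """Forecast the next curve: mean + leading-PC reconstruction, with
    the score series forecast by a random-walk (last value) rule."""
    n, m = len(curves), len(curves[0])
    mean = [sum(c[j] for c in curves) / n for j in range(m)]
    X = [[c[j] - mean[j] for j in range(m)] for c in curves]  # centred
    v = [1.0] * m  # power iteration for the leading right singular vector
    for _ in range(iters):
        s = [sum(X[i][j] * v[j] for j in range(m)) for i in range(n)]   # X v
        w = [sum(X[i][j] * s[i] for i in range(n)) for j in range(m)]   # X^T X v
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    scores = [sum(X[i][j] * v[j] for j in range(m)) for i in range(n)]
    return [mean[j] + scores[-1] * v[j] for j in range(m)]
```

In a serious implementation the score series would be forecast with a univariate time-series model (e.g. ARIMA) rather than the last observed value, and several components would be retained.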
98. Wide-range model predictive control for aero-engine transient state
- Author
-
Bing Yu, Hongwei Ke, Tianhong Zhang, and Zhouyang Li
- Subjects
Schedule ,Transient state ,Model predictive control ,Flight envelope ,Control theory ,Computer science ,Mechanical Engineering ,Range (statistics) ,Univariate ,Aerospace Engineering ,Field (computer science) - Abstract
To perform transient-state control of an aero-engine, a structure combining a linear controller with a min-max selector is widely adopted; this structure is inherently conservative and therefore limits fulfillment of the engine's potential. Model predictive control is a new control method with vast application prospects in the field of aero-engine control. This paper therefore proposes a wide-range model predictive controller that can control the engine over a wide range within the flight envelope. The paper first introduces the engine parameters and the model prediction algorithm used by the controller, then presents a wide-range model predictive controller with a three-layer nested structure: univariate controller, nominal-point controller, and wide-range controller, from inside to outside. Finally, by analyzing and verifying the effectiveness of the univariate controller for small-range variations and of the wide-range model predictive controller for large-range parameter variations, it is demonstrated that the controller can schedule its output based on inlet altitude, Mach number, and low-pressure shaft corrected speed while ensuring that limits are not exceeded. It is concluded that the designed wide-range model predictive controller offers good dynamic performance and safety.
- Published
- 2022
- Full Text
- View/download PDF
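The flavor of model predictive control on a univariate loop can be shown with a one-step scalar example: for a plant x_{k+1} = a*x_k + b*u_k, minimizing a quadratic tracking cost subject to an input limit gives a closed-form clipped control law. The plant and cost numbers below are illustrative toy values, not engine parameters.

```python
def mpc_step(x, r, a=0.9, b=0.5, q=0.01, u_min=-1.0, u_max=1.0):
    """One-step MPC for x_{k+1} = a*x + b*u: minimise
    (x_{k+1} - r)^2 + q*u^2, then clip u to its actuator limits."""
    u = b * (r - a * x) / (b * b + q)  # unconstrained optimum
    return max(u_min, min(u_max, u))   # enforce the input constraint

def simulate(x0, r, steps=50, a=0.9, b=0.5):
    """Closed-loop simulation of the one-step MPC tracking setpoint r."""
    x = x0
    for _ in range(steps):
        u = mpc_step(x, r, a=a, b=b)
        x = a * x + b * u
    return x
```

Even this toy shows the key MPC property the abstract relies on: constraints (here the clip on `u`) are handled inside the optimization rather than by a conservative min-max selector bolted onto a linear controller. A full controller would optimize over a multi-step horizon with state limits as well.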
99. Remodelling State-Space Prediction With Deep Neural Networks for Probabilistic Load Forecasting
- Author
-
B. K. Panigrahi, Ponnuthurai Nagaratnam Suganthan, Abbas Khosravi, and Parul Arora
- Subjects
Multivariate statistics ,Power transmission ,Control and Optimization ,Computer science ,Computation ,Probabilistic logic ,Univariate ,computer.software_genre ,Computer Science Applications ,Computational Mathematics ,Variable (computer science) ,Recurrent neural network ,Artificial Intelligence ,State space ,Data mining ,computer - Abstract
Probabilistic load forecasting (PLF) has become necessary for power system operators to plan efficiently across power transmission and distribution systems. However, few PLF models exist, and those that do require substantial computation time and are inefficient, especially with multiple loads. This paper proposes a novel algorithm for spatially correlated multiple loads in which a global parameter is learned from the state-space parameters of individual loads by combining deep neural networks and state-space models. The proposed model employs the complex-pattern learning capabilities of recurrent neural networks and the temporal-pattern extraction of innovation state-space models. It is tested on the GEFCom-14 and ISO-NE datasets, one with a single load and one with multiple loads. Different case studies examine the role of temperature in load forecasting. It is observed that for multivariate loads the temperature variable makes little difference in PLF, whereas for univariate loads the forecasting results are four times better. The proposed method is highly interpretable and can be employed both in areas where limited training data is available and in areas where colossal data is available. The proposed model outperforms several benchmarks from the literature on the same datasets.
- Published
- 2022
- Full Text
- View/download PDF
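Probabilistic forecasts such as these are commonly scored with the pinball (quantile) loss, used, for example, in the GEFCom competitions; lower is better, and the loss penalizes under- and over-prediction asymmetrically according to the target quantile. A minimal sketch (the `mean_pinball` helper is an illustrative name, not from the paper):

```python
def pinball_loss(y_true, y_pred, tau):
    """Pinball loss for a single tau-quantile forecast: tau*(y - q) when
    the forecast is below the outcome, (1 - tau)*(q - y) when above."""
    diff = y_true - y_pred
    return tau * diff if diff >= 0 else (tau - 1) * diff

def mean_pinball(ys, qs, tau):
    """Average pinball loss over a series of quantile forecasts."""
    return sum(pinball_loss(y, q, tau) for y, q in zip(ys, qs)) / len(ys)
```

For tau = 0.9, under-forecasting by 2 costs 1.8 while over-forecasting by 2 costs only 0.2, which is what pushes the forecast toward the 90th percentile of the predictive distribution.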
100. Sexing of skull by metric analysis of hard palate
- Author
-
Petkar, Madhusudan R., Makhani, Chandeep Singh, Datir, Sandesh B., Farooqui, Jamebaseer, Bangal, Rajendra S., and Chavan, Kalidas D.
- Published
- 2017