19,890 results for "univariate"
Search Results
2. Long Short-Term Memory and Discrete Wavelet Transform based Univariate Stock Market Prediction Model.
- Author
Jarrah, Mutasem
- Subjects
LONG short-term memory, BOX-Jenkins forecasting, DISCRETE wavelet transforms, DEEP learning, TIME series analysis
- Abstract
Analyzing financial situations in the current scenario is difficult, as it requires understanding the quality and value of investments. This study predicted the movement of stock prices in the Saudi Arabian stock market (Tadawul) over a one-week period using a proposed integrated model of Long Short-Term Memory (LSTM), which combines LSTM, Discrete Wavelet Transform (DWT), and Autoregressive Integrated Moving Average (ARIMA). Historical closing prices of a group of four companies listed on Tadawul were used as input for the proposed LSTM model, which consists of memory units capable of storing long time periods. Once the LSTM model predicted the closing values of stocks in Tadawul, they were further analyzed using the ARIMA model. The prediction accuracy of the proposed LSTM model and the traditional ARIMA model were 97.54% and 96.29% respectively. Therefore, the proposed integrated model of LSTM is considered a useful tool for predicting stock market values. The results emphasize the significance of Deep Learning (DL) and leveraging multiple information sources in predicting stock prices. [ABSTRACT FROM AUTHOR]
- Published
- 2024
3. From simple to complex: a sequential method for enhancing time series forecasting with deep learning.
- Author
Jiménez-Navarro, M J, Martínez-Ballesteros, M, Martínez-Álvarez, F, Troncoso, A, and Asencio-Cortés, G
- Subjects
TIME series analysis, DEEP learning, NONLINEAR functions, LEARNING, FORECASTING
- Abstract
Time series forecasting is a well-known deep learning application field in which previous data are used to predict the future behavior of the series. Recently, several deep learning approaches have been proposed in which several nonlinear functions are applied to the input to obtain the output. In this paper, we introduce a novel method to improve the performance of deep learning models in time series forecasting. This method divides the model into hierarchies or levels from simpler to more complex ones. Simpler levels handle smoothed versions of the input, whereas the most complex level processes the original time series. This method follows the human learning process where general/simpler tasks are performed first, and afterward, more precise/harder ones are accomplished. Our proposed methodology has been applied to the LSTM architecture, showing remarkable performance in various time series. In addition, a comparison is reported including a standard LSTM and novel methods such as DeepAR, Temporal Fusion Transformer, NBEATS and Echo State Network. [ABSTRACT FROM AUTHOR]
- Published
- 2024
4. QUALITY CERTIFICATION AND BUSINESS PERFORMANCE: THE PORTUGUESE LARGE FIRMS' CASE.
- Author
Carvalho Peres Moreira, Cândido Jorge, Peres Terrinca, Catarina Carvalho, Custódio Cristóvão, Domingos, Guerreiro Antão, Mário Alexandre, Baptista Pinheiro, Pedro Miguel, and Afonso Geraldes, João Manuel
- Subjects
SUSTAINABLE development, BUSINESS failures, PRIVATE sector, BUSINESS size, VALUE creation, CORPORATE sustainability
- Abstract
Copyright of Environmental & Social Management Journal / Revista de Gestão Social e Ambiental is the property of Environmental & Social Management Journal and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2024
5. Comparative Analysis of Univariate and Deep Learning Models for Stock Market Prediction in Frontier Markets: A Case Study of Pakistan, Bangladesh, and Sri Lanka
- Author
Rabia Sabri and Sobia Iqbal
- Subjects
stock market forecasting, frontier market, univariate, multivariate, arima, theta deep learning models, ensemble learning techniques, Business, HF5001-6182, Economic theory. Demography, HB1-3840
- Abstract
This research focuses on the stock exchanges of Sri Lanka, Pakistan, and Bangladesh because of their significant presence in the Asian region and the unique challenges and opportunities these frontier markets present. The study compares the traditional stationary models, Autoregressive Integrated Moving Average (ARIMA) and Theta, with the contemporary deep learning model Long Short-Term Memory (LSTM) with 1D Convolutional Neural Network (CNN) support. The study covers historical data from 2020–2023 and includes in-sample and out-of-sample forecasting for 2024–2025. The time series comprises linear and nonlinear datasets to capture a wider range of factors influencing the market. Evaluation metrics are used to balance the prediction model's accuracy against the intricate dynamics of the markets. The conventional time series models have the advantages of interpretability and computational efficiency, but the combined CNN-LSTM model exhibits significantly superior prediction accuracy. Integrating advanced techniques with traditional statistical methods provides a more comprehensive and accurate forecast that captures the complex intricacies of stock indices. The ensemble model approach can improve predictive performance and help stabilize the market ecosystem.
- Published
- 2024
6. Assessing the limitations of relief-based algorithms in detecting higher-order interactions.
- Author
Freda, Philip J., Ye, Suyu, Zhang, Robert, Moore, Jason H., and Urbanowicz, Ryan J.
- Subjects
FEATURE selection, ABSOLUTE value, REPUTATION, LOCUS (Genetics), ALGORITHMS
- Abstract
Background: Epistasis, the interaction between genetic loci where the effect of one locus is influenced by one or more other loci, plays a crucial role in the genetic architecture of complex traits. However, as the number of loci considered increases, the investigation of epistasis becomes exponentially more complex, making the selection of key features vital for effective downstream analyses. Relief-Based Algorithms (RBAs) are often employed for this purpose due to their reputation as "interaction-sensitive" algorithms and uniquely non-exhaustive approach. However, the limitations of RBAs in detecting interactions, particularly those involving multiple loci, have not been thoroughly defined. This study seeks to address this gap by evaluating the efficiency of RBAs in detecting higher-order epistatic interactions. Motivated by previous findings that suggest some RBAs may rank predictive features involved in higher-order epistasis negatively, we explore the potential of absolute value ranking of RBA feature weights as an alternative approach for capturing complex interactions. In this study, we assess the performance of ReliefF, MultiSURF, and MultiSURFstar on simulated genetic datasets that model various patterns of genotype-phenotype associations, including 2-way to 5-way genetic interactions, and compare their performance to two control methods: a random shuffle and mutual information. Results: Our findings indicate that while RBAs effectively identify lower-order (2 to 3-way) interactions, their capability to detect higher-order interactions is significantly limited, primarily by large feature count but also by signal noise. Specifically, we observe that RBAs are successful in detecting fully penetrant 4-way XOR interactions using an absolute value ranking approach, but this is restricted to datasets with only 20 total features. 
Conclusions: These results highlight the inherent limitations of current RBAs and underscore the need for the development of Relief-based approaches with enhanced detection capabilities for the investigation of epistasis, particularly in datasets with large feature counts and complex higher-order interactions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
7. Comparative assessment of univariate and multivariate imputation models for varying lengths of missing rainfall data in a humid tropical region: a case study of Kozhikode, Kerala, India.
- Author
Kannegowda, Naveena, Udayar Pillai, Surendran, Kommireddi, Chinni Venkata Naga Kumar, and Fousiya
- Subjects
MISSING data (Statistics), WEATHER & climate change, STANDARD deviations, RAINFALL, CLIMATE change forecasts, KALMAN filtering, WEATHER forecasting
- Abstract
Accurate measurement of meteorological parameters is crucial for weather forecasting and climate change research. However, missing observations in rainfall data can pose a challenge to these efforts. Traditional methods of imputation can lead to increased uncertainty in predictions. Additionally, varying lengths of missing data and nonlinearity in rainfall distribution make it difficult to rely on a single imputation method in all situations. To address this issue, our study compared univariate and multivariate imputation models for different lengths of missing daily rainfall observations in a humid tropical region. We used 33 years of weather data from Kozhikode, an urban city in Kerala region, and evaluated the selected models using accuracy measures such as Root Mean Square Error (RMSE), Mean Absolute Error (MAE), Nash–Sutcliffe Efficiency (NSE) and Mean Absolute Relative Error (MARE). Among the considered univariate and multivariate imputation models, Kalman filter coupled time series models like Kalman–ARIMA (mean RMSE = 11.90, mean MAE = 4.46) and Kalman Smoothing with structure time series (mean RMSE = 11.37, mean MAE = 5.28) were found to be best for small (< 7 days) range imputation of rainfall data. Random Forest (mean RMSE = 16.57, mean MAE = 8.0) and Kalman Smoothing with structure time series (mean RMSE = 16.84, mean MAE = 8.09) performed well for medium range (8–15 days) of rainfall imputation. Random Forest technique was found to be suitable for large (≤ 30 days) (mean RMSE = 15.45, mean MAE = 6.77), and very large (> 30 days) (mean RMSE = 12.91, mean MAE = 3.42) missing length groups and Kalman–ARIMA performed best for mixed day series (RMSE = 9.7, MAE = 3.52). NSE and MARE values for different gap margins in rainfall data (≥ 1 mm) suggest that Kalman Smoothing (KS) connected models, as a representative univariate model, perform exceptionally well when dealing with a small number of missing observations.
Notably, multivariate models like Principal Component Analysis (PCA) and Random Forest outperformed univariate models for medium to large gap margins. Considering these findings, utilizing multivariate techniques is recommended for imputing a large number of missing rainfall values and univariate models can be limited for small range of rainfall missing data imputation. The identified imputation models provide effective solutions for filling missing data of various lengths in all stations' datasets in humid tropical regions, thus enhancing rainfall-related analysis and enabling more accurate weather forecasts and climate change research. [ABSTRACT FROM AUTHOR]
- Published
- 2024
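The four accuracy measures named in the abstract above (RMSE, MAE, NSE, MARE) can be sketched as follows. This is a minimal illustration using common textbook definitions, not the authors' code: the function names are hypothetical, and the MARE form (mean of absolute error divided by the observed value) is an assumption, since definitions vary between papers. It requires nonzero observations, which is consistent with the abstract restricting that analysis to rainfall values of at least 1 mm.

```python
import math

def rmse(observed, predicted):
    """Root Mean Square Error."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)

def mae(observed, predicted):
    """Mean Absolute Error."""
    n = len(observed)
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / n

def nse(observed, predicted):
    """Nash-Sutcliffe Efficiency: 1 - SS_residual / SS_total (1.0 is a perfect fit)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot

def mare(observed, predicted):
    """Mean Absolute Relative Error (assumed form; observed values must be nonzero)."""
    n = len(observed)
    return sum(abs(o - p) / abs(o) for o, p in zip(observed, predicted)) / n
```

Each function takes two equal-length sequences, the observed values that were artificially masked and the values an imputation model filled in, so any two imputation methods can be compared on the same gaps.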
8. The potential of odontometric measurements for sex and age determination through cast analysis in the North Indian population.
- Author
Jain, Ayushi, Devi, Priya, Gupta, Shalini, Dewan, Rakesh, and Shakya, Pratibha
- Subjects
MOLARS, DENTAL casting, DIMORPHISM (Biology), AGE groups, UNIVARIATE analysis
- Abstract
Objectives: Various dental morphological features, as well as odontometric parameters, have been used for gender determination such as mandibular and maxillary canine indices, mandibular canine dimensions, maxillary canine dimension, maxillary first molar dimensions, and the cumulative dimension of all teeth. However, the results are variable in the studies. This study aims to evaluate the parameters mentioned above along with the inclusion of more odontometric parameters for age and sex estimation. Methods: In this study, some principal maxillary and mandibular odontometric measurements were considered for 360 individuals (190 males and 170 females) and recorded using a digital vernier caliper from the casts. The parameters were correlated with the age and gender of individuals with the help of ANOVA and independent t-test respectively. Results: Using discriminant statistical analysis, the present data were analyzed. This study concludes that intercanine and intermolar width for the second molar was higher in males than females. In contrast, all other parameters have a comparable mean in both samples. With the help of the Pearson correlation, MDR6 was found to be significantly associated (p<0.05) with age. Additionally, univariate analysis was employed to evaluate age prediction for both the maxillary and mandibular cases. Conclusion: This study concludes that odontometric measurements could be helpful in the identification of an individual by estimating age and gender. In the maxillary and mandibular arches, it was discovered that females had greater intercanine and intermolar widths for the second molar than males. Additionally, an individual's estimated age may be accurately predicted by measuring the mesiodistal width of their right first molar. [ABSTRACT FROM AUTHOR]
- Published
- 2024
9. Estimation of additive and maternal covariance of production traits in Murrah buffalo.
- Author
Sharma, Smriti, Dhaka, Surender Singh, and Patil, Chandrashekhar Santosh
- Subjects
MILK yield, LIVESTOCK productivity, CONFIRMATORY factor analysis, MULTIVARIATE analysis, ADDITIVES, LIKELIHOOD ratio tests
- Abstract
The study was done to determine additive, maternal and common permanent environmental effects and the best‐suited model for some production traits in Murrah buffaloes, using six univariate animal models that differed in the (co)variance components fitted and assessing the importance of the maternal effect with a likelihood ratio test. Data from 614 Murrah buffaloes related to production traits were collected from history pedigree sheets maintained at the buffalo farm, Department of Livestock Production and Management (LPM), LUVAS, Hisar. The production traits under this study were 305 days milk yield (305DMY), peak yield (PY), lactation length (LL), dry period (DP), lactation milk yield (LMY) and wet average (WA). The heritability estimates were in the range of 0.33–0.44 for 305DMY, 0.25–0.51 for PY, 0.05–0.13 for LL, 0.03–0.23 for DP, 0.17–0.40 for LMY and 0.37–0.66 for WA. Model 1 was considered best for most of the traits, viz., 305DMY, PY, LL, LMY and WA followed by model 2 for DP. Covariance and correlated values within the traits caused inflation of heritability in model 3 and model 6. The maximum covariance between the additive and maternal effect was found in trait LMY, which was 14,183.90 in model 3 and the minimum value was also reported in the same trait for model 6, valued at −3522.37. Multivariate analysis showed that all production traits were moderately to highly and positively correlated with each other except for DP, which showed low, negative genetic and phenotypic correlations. Spearman's rank correlation coefficients of breeding value among all six models were high and significant, ranging from 0.78 to 1.00 for all the traits except DP; therefore, any of the models could be taken into account depending upon the availability of data. [ABSTRACT FROM AUTHOR]
- Published
- 2024
10. Comparative Analysis of Univariate and Deep Learning Models for Stock Market Prediction in Frontier Markets: A Case Study of Pakistan, Bangladesh, and Sri Lanka.
- Author
Sabri, Rabia and Iqbal, Sobia
- Subjects
ENSEMBLE learning, DEEP learning, STOCK price indexes, MARKETING forecasting, ECONOMIC trends
- Abstract
In this study, we analyze stock exchanges of Sri Lanka, Pakistan and Bangladesh, exploring both traditional models (ARIMA and Theta) and modern deep learning models (LSTM with 1D CNN) for stock market prediction. The study uses historical data from 2020-2023 and forecasts market trends for 2024-2025. CNN LSTM combinations demonstrate superior predictive accuracy in predicting market dynamics, while traditional models are more interpretable and computationally efficient. This research shows the advantages of integrating advanced techniques in combination with traditional methods to more effectively capture the complexities of stock indices. The ensemble model approach could both improve predictive performance and market stability. [ABSTRACT FROM AUTHOR]
- Published
- 2024
11. Addressing the Non-Stationarity and Complexity of Time Series Data for Long-Term Forecasts.
- Author
Baidya, Ranjai and Lee, Sang-Woong
- Subjects
TIME complexity
- Abstract
Real-life time series datasets exhibit complications that hinder the study of time series forecasting (TSF). These datasets inherently exhibit non-stationarity as their distributions vary over time. Furthermore, the intricate inter- and intra-series relationships among data points pose challenges for modeling. Many existing TSF models overlook one or both of these issues, resulting in inaccurate forecasts. This study proposes a novel TSF model designed to address the challenges posed by real-life data, delivering accurate forecasts in both multivariate and univariate settings. First, we propose methods termed "weak-stationarizing" and "non-stationarity restoring" to mitigate distributional shift. These methods enable the removal and restoration of non-stationary components from individual data points as needed. Second, we utilize the spectral decomposition of weak-stationary time series to extract informative features for forecasting. To learn features from the spectral decomposition of weak-stationary time series, we exploit a mixer architecture to find inter- and intra-series dependencies from the unraveled representation of the overall time series. To ensure the efficacy of our model, we conduct comparative evaluations against state-of-the-art models using six real-world datasets spanning diverse fields. Across each dataset, our model consistently outperforms or yields comparable results to existing models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
12. A Novel Dynamic Programming Method for Non-parametric Data Discretization
- Author
Quoc Trung, Bui, Hoang Minh, Vuong, Thi Hoai Linh, Nguyen, Thi Mai Anh, Bui, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Nguyen, Ngoc Thanh, editor, Chbeir, Richard, editor, Manolopoulos, Yannis, editor, Fujita, Hamido, editor, Hong, Tzung-Pei, editor, Nguyen, Le Minh, editor, and Wojtkiewicz, Krystian, editor
- Published
- 2024
13. A Comparative Study of Univariate and Multivariate Time Series Forecasting for CPO Prices Using Machine Learning Techniques
- Author
Mohd Fuad, Juz Nur Fatiha Deena, Ibrahim, Zaidah, Adam, Noor Latiffah, Mat Diah, Norizan, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Badioze Zaman, Halimah, editor, Robinson, Peter, editor, Smeaton, Alan F., editor, De Oliveira, Renato Lima, editor, Jørgensen, Bo Nørregaard, editor, K. Shih, Timothy, editor, Abdul Kadir, Rabiah, editor, Mohamad, Ummul Hanan, editor, and Ahmad, Mohammad Nazir, editor
- Published
- 2024
14. From big data to big insights: statistical and bioinformatic approaches for exploring the lipidome.
- Author
Chappel, Jessie R., Kirkwood-Donelson, Kaylie I., Reif, David M., and Baker, Erin S.
- Subjects
BIG data, DATA structures, TASK analysis, DEEP learning, ENERGY storage, DATA analysis
- Abstract
The goal of lipidomic studies is to provide a broad characterization of cellular lipids present and changing in a sample of interest. Recent lipidomic research has significantly contributed to revealing the multifaceted roles that lipids play in fundamental cellular processes, including signaling, energy storage, and structural support. Furthermore, these findings have shed light on how lipids dynamically respond to various perturbations. Continued advancement in analytical techniques has also led to improved abilities to detect and identify novel lipid species, resulting in increasingly large datasets. Statistical analysis of these datasets can be challenging not only because of their vast size, but also because of the highly correlated data structure that exists due to many lipids belonging to the same metabolic or regulatory pathways. Interpretation of these lipidomic datasets is also hindered by a lack of current biological knowledge for the individual lipids. These limitations can therefore make lipidomic data analysis a daunting task. To address these difficulties and shed light on opportunities and also weaknesses in current tools, we have assembled this review. Here, we illustrate common statistical approaches for finding patterns in lipidomic datasets, including univariate hypothesis testing, unsupervised clustering, supervised classification modeling, and deep learning approaches. We then describe various bioinformatic tools often used to biologically contextualize results of interest. Overall, this review provides a framework for guiding lipidomic data analysis to promote a greater assessment of lipidomic results, while understanding potential advantages and weaknesses along the way. [ABSTRACT FROM AUTHOR]
- Published
- 2024
15. Finfish bycatch diversity of trawl fishery of Nagapattinam Coast, Tamil Nadu, South India
- Author
Santhoshkumar, S., Jawahar, P., Srinivasan, A., Jayakumar, N., and Subburaj, A.
- Published
- 2023
16. Univariate and multivariate short-term solar power forecasting of 25MWac Pasir Gudang utility-scale photovoltaic system using LSTM approach
- Author
Noor Hasliza Abdul Rahman, Mohamad Zhafran Hussin, Shahril Irwan Sulaiman, Muhammad Asraf Hairuddin, and Ezril Hisham Mat Saat
- Subjects
Large-scale solar, LSTM, Univariate, Multivariate, Solar power forecasting, Electrical engineering. Electronics. Nuclear engineering, TK1-9971
- Abstract
The generation of solar photovoltaic (PV) systems in Malaysia has great potential due to the abundance of sunlight and high irradiation level. Malaysia’s unique location near the equator makes solar energy the most attractive option for future energy sources. However, the erratic weather leads to variability of power generation, especially during high feed-in of solar energy, which will eventually affect grid system stability. Hence, solar power forecasting is critical, especially in operating utility-scale photovoltaic systems or large-scale solar (LSS) plants. This paper presents an approach to forecasting solar power generation 10 min to 180 min ahead with univariate and multivariate models based on the Long Short-Term Memory (LSTM) technique. The model performances are evaluated based on a real dataset from the 25MWac Pasir Gudang LSS plant from July 2021 to May 2022. The results show that the univariate LSTM model outperformed the multivariate model for short-term forecasting (10-min to 50-min ahead) by 2.09% of RMSE. However, the multivariate model outperformed the univariate model for the longer forecasting horizon at 180-min ahead by 34.87% of RMSE. The forecasting output for the multivariate model that only uses historical meteorological data is less reliable than the multivariate model that uses historical meteorological and AC power output data. This research finding is envisaged to provide benefits to the grid system operation, planning, maintenance, and scheduling, thereby improving the reliability of the LSS plant.
- Published
- 2023
17. Predicted risk factors associated with secondary infertility in women: univariate and multivariate logistic regression analyses
- Author
Wafa Fatima, Abdul Majeed Akhtar, Asif Hanif, Aima Gilani, and Syed Muhammad Yousaf Farooq
- Subjects
secondary infertility, logistic regression analysis, multivariate, univariate, artificial neural network (ANN), Medicine (General), R5-920
- Abstract
Introduction: Infertile women are those who regularly engage in unprotected intercourse for a period of at least 1 year and are unable to become clinically pregnant. Primary infertility means the inability of couples to conceive, without any previous successful pregnancies. Secondary Infertility refers to the inability to get pregnant for 12 months, after having a previous pregnancy for one time at least. The objectives of the current study were to analyze risk factors for secondary infertility and compare the predictive accuracy of artificial neural network (ANN) and multiple logistic regression models. Methods: The study was conducted at The University Institute of Public Health collecting data from Gilani Ultrasound Center 18 months after approval of synopsis. A total of 690 women (345 cases and 345 controls) were selected. The women selected for the case group had to be 20–45 years of age, had any parity, and had a confirmed diagnosis of secondary infertility. Results: Multiple logistic regression (MLR) and ANN were used. The chance of secondary infertility was 2.91 times higher in women living in a joint family [odds ratio (OR) = 2.91; 95% confidence interval (CI) (1.91, 4.44)] and was also 2.35 times higher for those women who had relationship difficulties with their husband [OR = 2.35; 95% CI (1.18, 4.70)]. Marriage at an earlier age was associated with secondary infertility with β being negative and OR being < 1 [OR = 0.94; 95% CI (0.88, 0.99)]. For the logistic regression model, the area under the receiver operative characteristic curve (ROC) was 0.852 and the artificial neural network was 0.87, which was better than logistic regression. Discussion: Identified risk factors of secondary infertility are mostly modifiable and can be prevented by managing these risk factors.
- Published
- 2024
18. Characterization of Six Lobster Species of Genus Panulirus (Decapoda, Palinuridae) from Aceh Waters, Indonesia Based on Morphometric Analysis
- Author
I. Irfannur, S. Saputra, M. Muliari, Y. Akmal, and A. S. Batubara
- Subjects
characters, significantly, comprehensive, univariate, multivariate, Zoology, QL1-991
- Abstract
Aceh Province is a potential area for the exploitation of Panulirus, with six species of Panulirus inhabiting coastal areas and coral ecosystems in Aceh Province including P. homarus (Linnaeus, 1758), P. longipes (A. Milne Edwards, 1868), P. ornatus (Fabricius, 1798), P. penicillatus (Olivier, 1791), P. polyphagus (Herbst, 1793), and P. versicolor (Latreille, 1804). This study aims to characterise six species of Panulirus originating from Aceh to support future management and conservation efforts. This research was conducted from 2022‒2023 at Simeulue Island (P. homarus, P. longipes, P. penicillatus, and P. versicolor) and Aceh Jaya Regency (P. ornatus and P. polyphagus), Aceh Province, Indonesia. The collected samples were then transported to the Aquaculture Integrated Laboratory, Almuslim University, Indonesia for further analysis. The collected lobsters were of mature size (body weight and total length reaching 500 g and 18‒25 cm) with a total of 10 individuals per species. A total of 58 morphometric characters were measured, of which total length (TL) was used as the coefficient of data transformation, so only 57 characters were subjected to statistical tests. Statistical analysis of the measured morphometric characters was performed using univariate ANOVA (analysis of variance) and multivariate DFA (discriminant function analysis) methods using SPSS Ver. 22. Univariate and multivariate morphometric analysis allowed the classification of six Panulirus species based on their specific characters. A total of 51 out of 57 morphometric characters were significantly different (P < 0.05), while only six characters were not significantly different. Panulirus ornatus is the species with the highest species distance compared to the other five Panulirus species based on the DFA scatter plot. Morphometric analysis to differentiate the six Panulirus species provides more comprehensive information on key morphological identification characters.
- Published
- 2024
19. IDENTIFICATION OF MULTIVARIATE OUTLIER IN DIAGNOSIS OF BREAST CANCER.
- Author
Kannan, K. Senthamarari and Anitha, S.
- Abstract
Outlier detection is a vital step in many data mining applications. There are numerous ways available to discover outliers in a given data set. Outlier analysis is a fascinating field of data mining. The box-plot is used to identify outliers in a clinical data set. Mahalanobis distances are used to find multivariate outliers, while box plots are used to quickly find univariate outliers. The tests make use of breast tissue data. The elimination of outliers improved the descriptive categorization accuracy of Decision Tree, Discriminant analysis functions, and the closest neighbour approach while decreasing their predictive capacity. Outliers were also assessed personally by qualified physicians, who decided that the majority of the multivariate outliers were actual outliers in their profession. On univariate outliers, the experts occasionally differed with the strategy. When severe values are anticipated, this will happen in diagnostic groups with heterogeneity. The method can be used to identify suspect data or to collect anomalous cases for future examination. [ABSTRACT FROM AUTHOR]
- Published
- 2023
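The two techniques contrasted in the abstract above, box plots for univariate outliers and Mahalanobis distances for multivariate ones, can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the authors' procedure: the function names are hypothetical, the box-plot rule uses the conventional 1.5 × IQR fences, and the distance is written out for two features only so that it needs nothing beyond the standard library.

```python
import statistics

def boxplot_outliers(values, k=1.5):
    """Return values outside the box-plot fences [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

def mahalanobis_sq_2d(points):
    """Squared Mahalanobis distance of each 2-D point from the sample mean,
    using the sample covariance matrix (divisor n - 1)."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2  # zero if the two features are perfectly collinear
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # d^2 = [dx dy] S^{-1} [dx dy]^T with the 2x2 inverse expanded by hand
        distances.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return distances
```

A point such as (2, 4) among points lying on the line y = x is unremarkable on either axis alone, so the box-plot rule does not flag it, yet it receives the largest Mahalanobis distance because it breaks the correlation between the two features. A cut-off for the distance (for example a chi-square quantile) is left to the analyst and is not taken from the abstract.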
20. Univariate
- Author
Frery, Alejandro C., Finkl, Charles W., Series Editor, Fairbridge, Rhodes W., Series Editor, Daya Sagar, B. S., editor, Cheng, Qiuming, editor, McKinley, Jennifer, editor, and Agterberg, Frits, editor
- Published
- 2023
21. Univariate, Multivariate, and Ensemble of Multilayer Perceptron Models for Landslide Movement Prediction: A Case Study of Mandi
- Author
Priyanka, Kumar, Praveen, Devi, Arti, Akshay, K., Gaurav, G., Uday, K. V., Dutt, Varun, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Garg, Deepak, editor, Narayana, V. A., editor, Suganthan, P. N., editor, Anguera, Jaume, editor, Koppula, Vijaya Kumar, editor, and Gupta, Suneet Kumar, editor
- Published
- 2023
22. Strain Prediction of a Bridge Deploying Autoregressive Models with ARIMA and Machine Learning Algorithms
- Author
Psathas, Anastasios Panagiotis, Iliadis, Lazaros, Papaleonidas, Antonios, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Barbosa, Simone Diniz Junqueira, Editorial Board Member, Chen, Phoebe, Editorial Board Member, Cuzzocrea, Alfredo, Editorial Board Member, Du, Xiaoyong, Editorial Board Member, Kara, Orhun, Editorial Board Member, Liu, Ting, Editorial Board Member, Sivalingam, Krishna M., Editorial Board Member, Slezak, Dominik, Editorial Board Member, Washio, Takashi, Editorial Board Member, Yang, Xiaokang, Editorial Board Member, Yuan, Junsong, Editorial Board Member, Iliadis, Lazaros, editor, Maglogiannis, Ilias, editor, Alonso, Serafin, editor, Jayne, Chrisina, editor, and Pimenidis, Elias, editor
- Published
- 2023
- Full Text
- View/download PDF
23. Multivariate and Univariate Anomaly Detection in Machine Learning: A Bibliometric Analysis
- Author
-
Guembe, Blessing, Azeta, Ambrose, Misra, Sanjay, Garg, Lalit, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Garg, Lalit, editor, Sisodia, Dilip Singh, editor, Kesswani, Nishtha, editor, Vella, Joseph G., editor, Brigui, Imene, editor, Misra, Sanjay, editor, and Singh, Deepak, editor
- Published
- 2023
- Full Text
- View/download PDF
24. Multivariate Long-Term Forecasting of T1DM: A Hybrid Econometric Model-Based Approach
- Author
-
Phadke, Rekha, Nagaraj, H. C., Angrisani, Leopoldo, Series Editor, Arteaga, Marco, Series Editor, Panigrahi, Bijaya Ketan, Series Editor, Chakraborty, Samarjit, Series Editor, Chen, Jiming, Series Editor, Chen, Shanben, Series Editor, Chen, Tan Kay, Series Editor, Dillmann, Rüdiger, Series Editor, Duan, Haibin, Series Editor, Ferrari, Gianluigi, Series Editor, Ferre, Manuel, Series Editor, Hirche, Sandra, Series Editor, Jabbari, Faryar, Series Editor, Jia, Limin, Series Editor, Kacprzyk, Janusz, Series Editor, Khamis, Alaa, Series Editor, Kroeger, Torsten, Series Editor, Li, Yong, Series Editor, Liang, Qilian, Series Editor, Martín, Ferran, Series Editor, Ming, Tan Cher, Series Editor, Minker, Wolfgang, Series Editor, Misra, Pradeep, Series Editor, Möller, Sebastian, Series Editor, Mukhopadhyay, Subhas, Series Editor, Ning, Cun-Zheng, Series Editor, Nishida, Toyoaki, Series Editor, Oneto, Luca, Series Editor, Pascucci, Federica, Series Editor, Qin, Yong, Series Editor, Seng, Gan Woon, Series Editor, Speidel, Joachim, Series Editor, Veiga, Germano, Series Editor, Wu, Haitao, Series Editor, Zamboni, Walter, Series Editor, Zhang, Junjie James, Series Editor, Shetty, N. R., editor, Patnaik, L. M., editor, and Prasad, N. H., editor
- Published
- 2023
- Full Text
- View/download PDF
25. Statistical methods and resources for biomarker discovery using metabolomics
- Author
-
Najeha R. Anwardeen, Ilhame Diboun, Younes Mokrab, Asma A. Althani, and Mohamed A. Elrayess
- Subjects
Metabolomics ,Metabolomics tools ,Statistical methods ,Analytical workflow ,Univariate ,Multivariate ,Computer applications to medicine. Medical informatics ,R858-859.7 ,Biology (General) ,QH301-705.5 - Abstract
Abstract Metabolomics is a dynamic tool for elucidating biochemical changes in human health and disease. Metabolic profiles provide a close insight into physiological states and are highly volatile to genetic and environmental perturbations. Variation in metabolic profiles can inform mechanisms of pathology, providing potential biomarkers for diagnosis and assessment of the risk of contracting a disease. With the advancement of high-throughput technologies, large-scale metabolomics data sources have become abundant. As such, careful statistical analysis of intricate metabolomics data is essential for deriving relevant and robust results that can be deployed in real-life clinical settings. Multiple tools have been developed for both data analysis and interpretations. In this review, we survey statistical approaches and corresponding statistical tools that are available for discovery of biomarkers using metabolomics.
- Published
- 2023
- Full Text
- View/download PDF
26. Addressing the Non-Stationarity and Complexity of Time Series Data for Long-Term Forecasts
- Author
-
Ranjai Baidya and Sang-Woong Lee
- Subjects
time series forecasting ,non-stationary ,weak-stationary ,multivariate ,univariate ,spectral decomposition ,Technology ,Engineering (General). Civil engineering (General) ,TA1-2040 ,Biology (General) ,QH301-705.5 ,Physics ,QC1-999 ,Chemistry ,QD1-999 - Abstract
Real-life time series datasets exhibit complications that hinder the study of time series forecasting (TSF). These datasets inherently exhibit non-stationarity as their distributions vary over time. Furthermore, the intricate inter- and intra-series relationships among data points pose challenges for modeling. Many existing TSF models overlook one or both of these issues, resulting in inaccurate forecasts. This study proposes a novel TSF model designed to address the challenges posed by real-life data, delivering accurate forecasts in both multivariate and univariate settings. First, we propose methods termed “weak-stationarizing” and “non-stationarity restoring” to mitigate distributional shift. These methods enable the removal and restoration of non-stationary components from individual data points as needed. Second, we utilize the spectral decomposition of weak-stationary time series to extract informative features for forecasting. To learn features from the spectral decomposition of weak-stationary time series, we exploit a mixer architecture to find inter- and intra-series dependencies from the unraveled representation of the overall time series. To ensure the efficacy of our model, we conduct comparative evaluations against state-of-the-art models using six real-world datasets spanning diverse fields. Across each dataset, our model consistently outperforms or yields comparable results to existing models.
- Published
- 2024
- Full Text
- View/download PDF
27. Evaluation of stability parameters for the selection of stable and superior sunflower genotypes
- Author
-
Birhanu Mengistu and Mohammed Abu
- Subjects
AMMI ,nonparametric ,spearman correlation ,sunflower ,Univariate ,Agriculture ,Food processing and manufacture ,TP368-456 - Abstract
Abstract The study was conducted to evaluate the performance of 10 sunflower genotypes in 12 environments during 2017 and 2018 cropping seasons with RCBD design. Twenty-nine parametric and non-parametric measures were compared for the stability of seed yield. Highly significant (p 0.001) interactions were found in the combined analyses of variance, with environmental factors contributing to 46.5% of the total variation. The Spearman correlation analysis determines the existence of a positive and significant (p
- Published
- 2023
- Full Text
- View/download PDF
28. AN EXPERIMENTAL INVESTIGATION TO DETERMINE THE FAILURE LOAD OF OPTIMAL HAMMER HEAD PIER CAP AND INTERPOLATE USING UNIVARIATE SPLINES MACHINE LEARNING ALGORITHM.
- Author
-
CHANDRASEKHAR, K. N. V., ESHWARI, D., CHARY, V. NIRANJANA, VARMA, C. SAI ANKITHA, VENKATSAI, G., SAMYUKTHA, C., KUMAR, P. ARUN, REDDY, D. INDU, DHANUSH, N., and BHARATHI, M.
- Subjects
MACHINE learning ,PIERS ,HIGH strength concrete ,FOUNDRY sand ,SPLINES ,INDUSTRIAL wastes - Abstract
The application of structural optimisation is fast changing the thinking of design engineers. With the rapid development of high strength concrete and other materials that can resist higher loads, the task of reducing the weight of a structure can be addressed with ease. The present study is an application of topology optimisation of continuum structures. The design domain is modelled using first order basis splines, and optimisation is performed using optimality criteria with minimisation of strain energy as the objective function. A model pier cap is chosen with the standard dimensions of 3 feet x 9 in x 4 in. The size and location of openings are determined using topology optimisation and drawn in AutoCAD® software. The casting is done using concrete with different percentages of replacement of cement and fine aggregate. Cement is partially replaced using Alcofine and fine aggregate is partially replaced using waste foundry sand. The foundry sand is an industrial waste obtained from the foundry industry located at Balanagar, Hyderabad. Four specimen beams are cast and tested in the laboratory. Steel fibres are used to take care of the tensile stresses produced within the beam. The analysis done here can be applied to materials other than concrete as well. The failure load is determined in the laboratory for each sample. The interpolation of failure load is done using Python code run on the Anaconda Jupyter® platform to determine the value of failure load for any percentage of replacement between 0 and 10%. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
29. euMMD: efficiently computing the MMD two-sample test statistic for univariate data.
- Author
-
Bodenham, Dean A. and Kawahara, Yoshinobu
- Abstract
The maximum mean discrepancy (MMD) test is a nonparametric kernelised two-sample test that, when using a characteristic kernel, can detect any distributional change between two samples. However, when the total number of $d$-dimensional observations is $n$, direct computation of the test statistic is $O(dn^2)$. While approximations with lower computational complexity are known, more efficient methods for computing the exact test statistic are unknown. This paper provides an exact method for computing the MMD test statistic for the univariate case in $O(n \log n)$ using the Laplacian kernel. Furthermore, this exact method is extended to an approximate method for $d$-dimensional real-valued data also with complexity log-linear in the number of observations. Experiments show that this approximate method can have good statistical performance when compared to the exact test, particularly in cases where $d > n$. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
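For orientation, the direct quadratic-time computation that the abstract of entry 29 takes as its baseline might look like the sketch below for univariate data. This is not the euMMD algorithm from the paper; it is the naive biased estimate of the squared MMD, and the kernel parameterisation `exp(-beta * |x - y|)`, the parameter name `beta`, and the function names are assumptions made for illustration.

```python
import math


def laplacian_kernel(x, y, beta=1.0):
    """Laplacian kernel on the real line; beta is an assumed scale parameter."""
    return math.exp(-beta * abs(x - y))


def mmd2_biased(xs, ys, beta=1.0):
    """Naive biased estimate of squared MMD between two univariate samples.

    Every pair of observations is visited, so the cost is quadratic in the
    total sample size (the direct computation the abstract refers to).
    """
    def mean_kernel(a, b):
        total = sum(laplacian_kernel(u, v, beta) for u in a for v in b)
        return total / (len(a) * len(b))

    return mean_kernel(xs, xs) + mean_kernel(ys, ys) - 2.0 * mean_kernel(xs, ys)


# Hypothetical samples: identical samples give 0, well-separated ones do not.
same = mmd2_biased([0.0, 0.1, 0.2], [0.0, 0.1, 0.2])
apart = mmd2_biased([0.0, 0.1, 0.2], [5.0, 5.1, 5.2])
print(same, apart)
```

The nested loop over all pairs is what makes this approach expensive for large samples, which is the motivation the abstract gives for a log-linear exact method.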
30. Large and moderate deviations for a discrete-time marked Hawkes process.
- Author
-
Wang, Haixu
- Subjects
- *
LARGE deviations (Mathematics) , *STOCHASTIC models , *DATA recorders & recording - Abstract
The Hawkes process is a continuous-time stochastic model that captures temporal stochastic self-exciting phenomena. In particular, the linear Hawkes process has been well studied and widely used in practice because of its mathematical tractability. However, in some contexts, a Hawkes model is not directly applicable because data are recorded in a discrete-time scheme or in an aggregated way. Thus, a discrete-time Hawkes model is appealing for applications. In this paper, we study large and moderate deviations for a discrete-time marked Hawkes process first proposed in Xu, Zhu, and Wang (2020). [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
31. Predicting Saudi Stock Market Index by Using Multivariate Time Series Based on Deep Learning.
- Author
-
Jarrah, Mutasem and Derbali, Morched
- Subjects
STOCK price forecasting ,STOCK price indexes ,TIME series analysis ,DEEP learning ,SUPERVISED learning ,STATISTICAL smoothing ,DECISION making - Abstract
Time-series (TS) prediction uses historical data to forecast future values. Various industries, including stock market trading, power load forecasting, medical monitoring, and intrusion detection, frequently rely on this method. The prediction of stock-market prices is significantly influenced by multiple variables, such as the performance of other markets and the economic situation of a country. This study focuses on predicting the indices of the stock market of the Kingdom of Saudi Arabia (KSA) using various variables, including opening, lowest, highest, and closing prices. Successfully achieving investment goals depends on selecting the right stocks to buy, sell, or hold. The output of this project is the projected closing prices over the next seven days, which aids investors in making informed decisions. Exponential smoothing (ES) was employed to eliminate noise from the input data, which were obtained from the Saudi Stock Exchange, also known as Tadawul. Subsequently, a sliding-window method with five steps was applied to transform the task of time series forecasting into a supervised learning problem. Finally, a multivariate long short-term memory (LSTM) deep-learning (DL) algorithm was employed to predict stock market prices. The proposed multivariate LSTM DL model achieved a prediction rate of 97.49%, compared with 92.19% for the univariate model, demonstrating its effectiveness in stock market price forecasting. These results also highlight the accuracy of DL and the utilization of multiple information sources in stock-market prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
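The sliding-window step described in the abstract of entry 31 can be sketched as follows. The window length of five follows the abstract; the one-step default horizon, the function name `sliding_window`, and the sample prices are assumptions, since the abstract does not describe how the seven-day forecast is framed.

```python
def sliding_window(series, window=5, horizon=1):
    """Turn a univariate series into (features, targets) pairs.

    Each pair holds `window` consecutive past values as features and the
    following `horizon` values as targets.
    """
    pairs = []
    for start in range(len(series) - window - horizon + 1):
        features = series[start:start + window]
        targets = series[start + window:start + window + horizon]
        pairs.append((features, targets))
    return pairs


# Hypothetical closing prices.
prices = [10.0, 10.2, 10.1, 10.4, 10.6, 10.5, 10.8, 11.0]
for features, targets in sliding_window(prices):
    print(features, "->", targets)
```

Each pair can then be fed to any supervised regressor; a series shorter than `window + horizon` simply yields no pairs.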
32. Univariate and multivariate analysis of risk factors of gestational diabetes mellitus among pregnant women attending antenatal clinic at three urban health centers of Belagavi - A cross sectional study.
- Author
-
Kansal, Divyae, Kundu, Manish, Singh, Manisha, and Kambar, Sanjay
- Subjects
- *
GESTATIONAL diabetes , *PREGNANT women , *UNIVARIATE analysis , *FACTOR analysis , *MULTIVARIATE analysis , *ECLAMPSIA - Abstract
Background and Objective: Gestational diabetes mellitus (GDM) is defined as any degree of glucose intolerance with onset or first recognition during pregnancy, with or without remission after the end of pregnancy. GDM is associated with an increased incidence of maternal hypertension, pre-eclampsia, obstetric intervention, and risk of developing diabetes mellitus (DM) in later life. The present study was conducted to identify risk factors associated with GDM using univariate and multivariate analysis, and to determine the prevalence of GDM among pregnant women attending the antenatal clinics of three Urban Health Centres (UHCs). Methods: This one-year cross-sectional study was done in three UHCs (Ram Nagar, Ashok Nagar and Rukmini Nagar), which are the urban field practice area of Jawaharlal Nehru Medical College, Belagavi. Data were collected from 360 pregnant women attending the antenatal clinics of the three UHCs. Information on socio-demographic details and risk factors associated with GDM was obtained. Univariate and multivariate analyses of risk factors for GDM among pregnant women were done. The Diabetes in Pregnancy Study Group India (DIPSI) criteria were used to diagnose GDM. Results: The mean age of the 360 study participants was 24.3±3.92 years. The prevalence of GDM in this study was 12.2%. In univariate analysis, risk factors such as age, socio-economic status, gravida, previous history of abortion, family history of diabetes, and physical activity were significantly associated with GDM. In multivariate analysis, socio-economic status, family history of diabetes, and physical activity were significantly associated with GDM. Conclusion: In this study there is a greater risk of GDM in women with increasing age, higher parity, increasing BMI and a family history of diabetes mellitus. 
There is a need for universal screening to pick up risk factors to prevent gestational diabetes mellitus. [ABSTRACT FROM AUTHOR]
- Published
- 2023
33. Importance of Copula-Based Bivariate Rainfall Intensity-Duration-Frequency Curves for an Urbanized Catchment Incorporating Climate Change.
- Author
-
Suresh, Amrutha and Pekkat, Sreeja
- Subjects
RAINFALL ,CLIMATE change ,MARKOV chain Monte Carlo ,HYDRAULIC structures ,BIVARIATE analysis - Abstract
The intensity-duration-frequency (IDF) curve is a critical input for designing hydraulic infrastructure such as stormwater drainage systems. The last few decades have witnessed drastic changes in rainfall patterns and changes in their extremes, mostly associated with the impact of climate change. Therefore, it is important to study the possible drift in the IDF curve associated with the changing trends in historical rainfall data and future climatic conditions. Based on the understanding of the negative correlation between rainfall intensity and duration, this study used a bivariate copula-based approach for IDF curve development (considering historical rainfall data) and compared it with the conventional empirical models and univariate frequency analysis. The study identified the Frank copula (from 24 candidate copulas) with its parameters estimated by Bayesian inference and a hybrid-evolution Monte Carlo Markov Chain. In addition, the IDF curve was developed for four future climate scenarios corresponding to three different time periods. The proposed methodology was demonstrated for an urban catchment in Northeast India, for which the future climate scenarios were not considered for previous IDF curve development. The rainfall intensity from one of the empirical models and univariate analysis compared well with the corresponding values obtained using the IDF developed using bivariate analysis for short durations (≤3 h). Both the time periods and different climate scenarios had a significant influence on the rainfall intensities compared to the historical bivariate IDF data. The recommendation from this study for this area is that to account for the influence of near future climate change on infrastructure with a design life of 50 years and <50 years, the representative concentration pathway (RCP) 6.0 scenario would yield the critical IDF curve. 
For long-term planning with a design life >50 years, it is desirable to consider the IDF curve based on the RCP 8.5 scenario. For the intermittent period P2 (2048–2074), both RCP 6.0 and RCP 8.5 exhibited a similar trend in terms of rainfall intensity. The results from this study and the literature recommend comprehensive studies on specific urban catchments to incorporate the impact of climate change on the IDF curve for assessing the adequacy of hydraulic structures from a future perspective. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
34. Statistical methods and resources for biomarker discovery using metabolomics.
- Author
-
Anwardeen, Najeha R., Diboun, Ilhame, Mokrab, Younes, Althani, Asma A., and Elrayess, Mohamed A.
- Subjects
METABOLOMICS ,ECOLOGICAL disturbances ,BIOMARKERS ,LATENT structure analysis ,RISK assessment ,DATA analysis ,STATISTICS - Abstract
Metabolomics is a dynamic tool for elucidating biochemical changes in human health and disease. Metabolic profiles provide a close insight into physiological states and are highly volatile to genetic and environmental perturbations. Variation in metabolic profiles can inform mechanisms of pathology, providing potential biomarkers for diagnosis and assessment of the risk of contracting a disease. With the advancement of high-throughput technologies, large-scale metabolomics data sources have become abundant. As such, careful statistical analysis of intricate metabolomics data is essential for deriving relevant and robust results that can be deployed in real-life clinical settings. Multiple tools have been developed for both data analysis and interpretations. In this review, we survey statistical approaches and corresponding statistical tools that are available for discovery of biomarkers using metabolomics. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
35. Statistical Editor’s Practical Advice for Data Analysis
- Author
-
Abu-Zidan, Fikri M., Coccolini, Federico, Series Editor, Coimbra, Raul, Series Editor, Kirkpatrick, Andrew W., Series Editor, Di Saverio, Salomone, Series Editor, Ansaloni, Luca, Editorial Board Member, Balogh, Zsolt, Editorial Board Member, Biffl, Walt, Editorial Board Member, Catena, Fausto, Editorial Board Member, Davis, Kimberly, Editorial Board Member, Ferrada, Paula, Editorial Board Member, Fraga, Gustavo, Editorial Board Member, Ivatury, Rao, Editorial Board Member, Kluger, Yoram, Editorial Board Member, Leppaniemi, Ari, Editorial Board Member, Maier, Ron, Editorial Board Member, Moore, Ernest E., Editorial Board Member, Napolitano, Lena, Editorial Board Member, Peitzman, Andrew, Editorial Board Member, Reilly, Patrick, Editorial Board Member, Rizoli, Sandro, Editorial Board Member, Sakakushev, Boris E., Editorial Board Member, Sartelli, Massimo, Editorial Board Member, Scalea, Thomas, Editorial Board Member, Spain, David, Editorial Board Member, Stahel, Philip, Editorial Board Member, Sugrue, Michael, Editorial Board Member, Velmahos, George, Editorial Board Member, Weber, Dieter, Editorial Board Member, Ceresoli, Marco, editor, Abu-Zidan, Fikri M., editor, and Staudenmayer, Kristan L., editor
- Published
- 2022
- Full Text
- View/download PDF
36. Autoregressive Deep Learning Models for Bridge Strain Prediction
- Author
-
Psathas, Anastasios Panagiotis, Iliadis, Lazaros, Achillopoulou, Dimitra V., Papaleonidas, Antonios, Stamataki, Nikoleta K., Bountas, Dimitris, Dokas, Ioannis M., Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Iliadis, Lazaros, editor, Jayne, Chrisina, editor, Tefas, Anastasios, editor, and Pimenidis, Elias, editor
- Published
- 2022
- Full Text
- View/download PDF
37. Efficiency of Linear Univariate Programming Method in Estimating the Parameters Reflecting the Behavior of R.C.C Beam Along the Span
- Author
-
Rakesh, R., Puttaraju, Kolhe, Mohan Lal, editor, Jaju, S. B., editor, and Diagavane, P. M., editor
- Published
- 2022
- Full Text
- View/download PDF
38. Time Series Analysis : Forecasting Tourism Demand with Time Series Analysis
- Author
-
Onder, Irem, Wei, Wenqi, Egger, Roman, Series Editor, and Gretzel, Ulrike, Series Editor
- Published
- 2022
- Full Text
- View/download PDF
39. Artificial Neural Networks and Deep Learning for Genomic Prediction of Binary, Ordinal, and Mixed Outcomes
- Author
-
Montesinos López, Osval Antonio, Montesinos López, Abelardo, Crossa, Jose, Montesinos López, Osval Antonio, Montesinos López, Abelardo, and Crossa, José
- Published
- 2022
- Full Text
- View/download PDF
40. Random Forest for Genomic Prediction
- Author
-
Montesinos López, Osval Antonio, Montesinos López, Abelardo, Crossa, Jose, Montesinos López, Osval Antonio, Montesinos López, Abelardo, and Crossa, José
- Published
- 2022
- Full Text
- View/download PDF
41. Applicability of AutoML to Modeling of Time-Series Data
- Author
-
Kancharla, Ajanta, Raghu Kishore, N., Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Yang, Xin-She, editor, Sherratt, Simon, editor, Dey, Nilanjan, editor, and Joshi, Amit, editor
- Published
- 2022
- Full Text
- View/download PDF
42. Comparing Single and Multiple Imputation Approaches for Missing Values in Univariate and Multivariate Water Level Data.
- Author
-
Umar, Nura and Gray, Alison
- Subjects
STANDARD deviations ,MISSING data (Statistics) ,WATER levels ,DECOMPOSITION method ,RANDOM forest algorithms - Abstract
Missing values in water level data is a persistent problem in data modelling and especially common in developing countries. Data imputation has received considerable research attention, to raise the quality of data in the study of extreme events such as flooding and droughts. This article evaluates single and multiple imputation methods used on monthly univariate and multivariate water level data from four water stations on the rivers Benue and Niger in Nigeria. The missing completely at random, missing at random and missing not at random data mechanisms were each considered. The best imputation method is identified using two error metrics: root mean square error and mean absolute percentage error. For the univariate case, the seasonal decomposition method is best for imputing missing values at various missingness levels for all three missing mechanisms, followed by Kalman smoothing, while random imputation is much poorer. For instance, for 5% missing data for the Kainji water station, missing completely at random, the Kalman smoothing, random and seasonal decomposition methods had average root mean square errors of 13.61, 102.60 and 10.46, respectively. For the multivariate case, missForest is best, closely followed by k nearest neighbour for the missing completely at random and missing at random mechanisms, and k nearest neighbour is best, followed by missForest, for the missing not at random mechanism. The random forest and predictive mean matching methods perform poorly in terms of the two metrics considered. For example, for 10% missing data missing completely at random for the Ibi water station, the average root mean square errors for random forest, k nearest neighbour, missForest and predictive mean matching were 22.51, 17.17, 14.60 and 25.98, respectively. 
The results indicate that the seasonal decomposition method, and missForest or k nearest neighbour methods, can impute univariate and multivariate water level missing data, respectively, with higher accuracy than the other methods considered. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
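The two error metrics named in the abstract of entry 42, root mean square error and mean absolute percentage error, can be written out as a short sketch. The function names and the sample water levels are hypothetical, and MAPE is expressed in percent here, which is a common convention but an assumption about the article's usage.

```python
import math


def rmse(actual, imputed):
    """Root mean square error between true and imputed values."""
    squared = [(a - p) ** 2 for a, p in zip(actual, imputed)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, imputed):
    """Mean absolute percentage error, in percent.

    Assumes no true value is zero, otherwise the ratio is undefined.
    """
    ratios = [abs((a - p) / a) for a, p in zip(actual, imputed)]
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical true water levels and the values an imputation method produced.
true_levels = [100.0, 200.0]
imputed_levels = [110.0, 180.0]
print(rmse(true_levels, imputed_levels), mape(true_levels, imputed_levels))
```

Lower values of either metric indicate imputed values closer to the held-out true values, which is how the abstract ranks the competing methods.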
43. Algorithm 1031: MQSI—Monotone Quintic Spline Interpolation.
- Author
-
LUX, THOMAS, WATSON, LAYNE T., CHANG, TYLER, and THACKER, WILLIAM
- Subjects
- *
INTERPOLATION , *SPLINES , *ALGORITHMS , *FORTRAN , *SENSITIVITY analysis , *PERFORMANCE theory - Abstract
MQSI is a Fortran 2003 subroutine for constructing monotone quintic spline interpolants to univariate monotone data. Using sharp theoretical monotonicity constraints, first and second derivative estimates at data provided by a quadratic facet model are refined to produce a univariate $C^2$ monotone interpolant. Algorithm and implementation details, complexity and sensitivity analyses, usage information, a brief performance study, and comparisons with other spline approaches are included. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
44. Petroleum Price Prediction with CNN-LSTM and CNN-GRU Using Skip-Connection.
- Author
-
Kim, Gun Il and Jang, Beakcheol
- Subjects
- *
PETROLEUM sales & prices , *RECURRENT neural networks , *CONVOLUTIONAL neural networks , *STANDARD deviations , *PEARSON correlation (Statistics) - Abstract
Crude oil plays an important role in the global economy, as it contributes one-third of the energy consumption worldwide. However, despite its importance in policymaking and economic development, forecasting its price is still challenging due to its complexity and irregular price trends. Although a significant amount of research has been conducted to improve forecasting using external factors as well as machine-learning and deep-learning models, only a few studies have used hybrid models to improve prediction accuracy. In this study, we propose a novel hybrid model that captures the finer details and interconnections between multivariate factors to improve the accuracy of petroleum oil price prediction. Our proposed hybrid model integrates a convolutional neural network and a recurrent neural network with skip connections and is trained using petroleum oil prices and external data directly accessible from the official website of South Korea's national oil corporation and the official Yahoo Finance site. We compare the performance of our univariate and multivariate models in terms of the Pearson correlation, mean absolute error, mean squared error, root mean squared error, and R squared ($R^2$) evaluation metrics. Our proposed models exhibited significantly better performance than the existing models based on long short-term memory and gated recurrent units, showing correlations of 0.985 and 0.988, respectively, for 10-day price predictions and obtaining better results for longer prediction periods when compared with other deep-learning models. We validated that our proposed model with skip connections outperforms the benchmark models and showed that the convolutional neural network using gated recurrent units with skip connections is superior to the compared models. 
The findings suggest that, to some extent, relying on a single source of data is ineffective in predicting long-term changes in oil prices, and thus, to develop a better prediction model based on time-series based data, it is necessary to take a multivariate approach and develop an efficient computational model with skip connections. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
45. Meta-Analysis and Multivariate GWAS Analyses in 77,850 Individuals of African Ancestry Identify Novel Variants Associated with Blood Pressure Traits.
- Author
-
Udosen, Brenda, Soremekun, Opeyemi, Kamiza, Abram, Machipisa, Tafadzwa, Cheickna, Cisse, Omotuyi, Olaposi, Soliman, Mahmoud, Wélé, Mamadou, Nashiru, Oyekanmi, Chikowore, Tinashe, and Fatumo, Segun
- Subjects
- *
DIASTOLIC blood pressure , *MULTIVARIATE analysis , *HYPERTENSION , *DISEASE risk factors , *GENETIC variation , *BLOOD pressure - Abstract
High blood pressure (HBP) has been implicated as a major risk factor for cardiovascular diseases in several populations, including individuals of African ancestry. Despite the elevated burden of HBP-induced cardiovascular diseases in Africa and other populations of African descent, limited genetic studies have been carried out to explore the genetic mechanism driving this phenomenon. We performed genome-wide association univariate and multivariate analyses of both systolic (SBP) and diastolic blood pressure (DBP) traits in 77,850 individuals of African ancestry. We used summary statistics data from six independent cohorts, including the African Partnership for Chronic Disease Research (APCDR), the UK Biobank, and the Million Veteran Program (MVP). FUMA was used to annotate, prioritize, visualize, and interpret our findings to gain a better understanding of the molecular mechanism(s) underlying the genetics of BP traits. Finally, we undertook a Bayesian fine-mapping analysis to identify potential causal variants. Our meta-analysis identified 10 independent variants associated with SBP and 9 with DBP traits. Whilst our multivariate GWAS method identified 21 independent signals, 18 of these SNPs have been previously identified. SBP was linked to gene sets involved in biological processes such as synapse assembly and cell–cell adhesion via plasma membrane adhesion. Of the 19 independent SNPs identified in the BP meta-analysis, only 11 variants had posterior probability (PP) of > 50%, including one novel variant: rs562545 (MOBP, PP = 77%). To facilitate further research and fine-mapping of high-risk loci/variants in highly susceptible groups for cardiovascular disease and other related traits, large-scale genomic datasets are needed. Our findings highlight the importance of including ancestrally diverse populations in large GWASs and the need for diversity in genetic research. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
46. Comparing statistical process control charts for fault detection in wastewater treatment
- Author
-
H. L. Marais, V. Zaccaria, and M. Odlare
- Subjects
fault detection ,process efficiency ,process monitoring ,statistical control charts ,univariate ,wastewater treatment ,Environmental technology. Sanitary engineering ,TD1-1066 - Abstract
Fault detection is an important part of process supervision, especially in processes where there are strict requirements on the process outputs, as in wastewater treatment. Statistical control charts such as Shewhart charts, cumulative sum (CUSUM) charts, and exponentially weighted moving average (EWMA) charts are common univariate fault detection methods. These methods have different strengths and weaknesses that are dependent on the characteristics of the fault. To account for this, the methods in their base forms were tested with drift and bias sensor faults of different sizes to determine the overall performance of each method. Additionally, the faults were detected using two different sensors in the system to see how the presence of active process control influenced fault detectability. The EWMA method performed best for both fault types, specifically the drift faults, with a low false alarm rate and good detection time in comparison to the other methods. It was shown that decreasing the detection time can effectively reduce excess energy consumption caused by sensor faults. Additionally, it was shown that monitoring a manipulated variable has advantages over monitoring a controlled variable as set-point tracking hides faults on controlled variables; lower missed detection rates are observed using manipulated variables. HIGHLIGHTS: The best fault detection performance was obtained with the EWMA chart. Manipulated variable monitoring improves controlled variable sensor fault detection. Fault detection in wastewater treatment processes can improve energy efficiency.
- Published
- 2022
- Full Text
- View/download PDF
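As a rough illustration of the EWMA chart named in the abstract of record 46, the sketch below applies the textbook EWMA recursion and its time-varying control limits to a synthetic signal. It is not the authors' implementation: the function names `ewma_chart` and `first_alarm`, the smoothing weight `lam = 0.2`, the limit width `L = 3`, and the step-bias test signal are all assumptions chosen for the example.

```python
import numpy as np

def ewma_chart(x, mu0, sigma0, lam=0.2, L=3.0):
    """Return the EWMA statistic and its lower/upper control limits.

    mu0 and sigma0 are the in-control mean and standard deviation,
    assumed known here (in practice they are estimated from fault-free data).
    """
    x = np.asarray(x, dtype=float)
    z = np.empty_like(x)
    prev = mu0  # the chart starts at the in-control mean
    for t, xt in enumerate(x):
        prev = lam * xt + (1.0 - lam) * prev
        z[t] = prev
    # Time-varying half-width of the control band for sample numbers 1..n.
    n = np.arange(1, len(x) + 1)
    half = L * sigma0 * np.sqrt(lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * n)))
    return z, mu0 - half, mu0 + half

def first_alarm(z, lcl, ucl):
    """Index of the first sample outside the limits, or None if there is none."""
    out = np.flatnonzero((z < lcl) | (z > ucl))
    return int(out[0]) if out.size else None

# Hypothetical sensor signal: in control for 50 samples, then a constant bias.
signal = np.concatenate([np.zeros(50), np.full(50, 5.0)])
z, lcl, ucl = ewma_chart(signal, mu0=0.0, sigma0=1.0)
print("first alarm at sample index:", first_alarm(z, lcl, ucl))
```

A smaller `lam` gives more smoothing, which tends to help with slow drifts at the cost of a slower response to large abrupt faults; that trade-off is one reason chart comparisons of this kind depend on the fault type.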
47. Gaining or Losing Perspective for Piecewise-Linear Under-Estimators of Convex Univariate Functions.
- Author
-
Lee, Jon, Skipper, Daphne, Speakman, Emily, and Xu, Luze
- Subjects
- *
CONVEX functions , *VARIABLE costs , *CONVEX domains , *OVERHEAD costs , *COMBINATORIAL optimization - Abstract
We study mixed-integer nonlinear optimization (MINLO) formulations of the disjunction $x \in \{0\} \cup [\ell, u]$, where $z$ is a binary indicator for $x \in [\ell, u]$ ($0 \le \ell < u$), and $y$ "captures" $f(x)$, which is assumed to be convex and positive on its domain $[\ell, u]$, but otherwise $y = 0$ when $x = 0$. This model is very useful in nonlinear combinatorial optimization, where there is a fixed cost $c$ for operating an activity at level $x$ in the operating range $[\ell, u]$, and then, there is a further (convex) variable cost $f(x)$. So the overall cost is $cz + f(x)$. In applied situations, there can be $N$ 4-tuples $(f, \ell, u, c)$, and associated $(x, y, z)$, and so, the combinatorial nature of the problem is that for any of the $2^N$ choices of the binary $z$-variables, the non-convexity associated with each of the $(f, \ell, u)$ goes away. We study relaxations related to the perspective transformation of a natural piecewise-linear under-estimator of $f$, obtained by choosing linearization points for $f$. Using 3-d volume (in $(x, y, z)$) as a measure of the tightness of a convex relaxation, we investigate relaxation quality as a function of $f$, $\ell$, $u$, and the linearization points chosen. We make a detailed investigation for convex power functions $f(x) := x^p$, $p > 1$. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
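To make the objects in the abstract of record 47 more concrete, the following sketch builds a piecewise-linear under-estimator of $f(x) = x^p$ from tangent lines and then applies the perspective transformation to each linear piece. It rests on assumptions not stated in the abstract: that the under-estimator is the pointwise maximum of tangents at the linearization points, and that the perspective of an affine piece $ax + b$ is $ax + bz$. The function names and the example values are hypothetical.

```python
def tangent_cuts(p, points):
    """(slope, intercept) of the tangent to f(x) = x**p at each linearization point."""
    cuts = []
    for xi in points:
        slope = p * xi ** (p - 1)
        intercept = xi ** p - slope * xi
        cuts.append((slope, intercept))
    return cuts

def under_estimator(x, cuts):
    """Pointwise maximum of the tangents; lies on or below the convex f."""
    return max(a * x + b for a, b in cuts)

def perspective_lower_bound(x, z, cuts):
    """Smallest y satisfying the perspective cuts y >= a*x + b*z.

    For z = 1 this equals the under-estimator at x; for x = z = 0 it is 0,
    matching the requirement that y = 0 when the activity is switched off.
    """
    return max(a * x + b * z for a, b in cuts)

# Illustrative values: f(x) = x**2 on [1, 3], linearized at 1, 2 and 3.
cuts = tangent_cuts(2, [1.0, 2.0, 3.0])
for x in (1.0, 1.5, 2.5, 3.0):
    print(x, under_estimator(x, cuts), x ** 2)
```

In a relaxation, each cut would be imposed together with $\ell z \le x \le uz$ and $0 \le z \le 1$; adding linearization points tightens the under-estimator, and the abstract's 3-d volume measure is one way to quantify that gain.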
48. The influence of quality certification on the performance of large Portuguese companies [A Influência da certificação de qualidade na performance das grandes empresas portuguesas].
- Author
-
Peres Moreira, Cândido Jorge, Baptista Pinheiro, Pedro Miguel, Carvalho Terrinca, Catarina, Custodio Cristovão, Domingos, Afonso Geraldes, João Manuel, and Guerreiro Antão, Mario Alexandre
- Abstract
Copyright of GeSec: Revista de Gestao e Secretariado is the property of Sindicato das Secretarias e Secretarios do Estado de Sao Paulo (SINSESP) and its content may not be copied or emailed to multiple sites or posted to a listserv without the copyright holder's express written permission. However, users may print, download, or email articles for individual use. This abstract may be abridged. No warranty is given about the accuracy of the copy. Users should refer to the original published version of the material for the full abstract. (Copyright applies to all Abstracts.)
- Published
- 2023
- Full Text
- View/download PDF
49. Usage of Clustering and Weighted Nearest Neighbors for Efficient Missing Data Imputation of Microarray Gene Expression Dataset.
- Author
-
Dubey, Aditya and Rasool, Akhtar
- Subjects
- *
GENE expression , *MISSING data (Statistics) , *MULTIPLE imputation (Statistics) , *NEAREST neighbor analysis (Statistics) , *HUMAN error , *RESEARCH methodology , *NEIGHBORS - Abstract
A complete dataset is essential for most bioinformatics analytical techniques, including gene expression data categorization, prognosis, and prediction. Due to sensor malfunction, software inability, or human error, the gene sample value may be missing. In gene expression experiments, missing data has a massive effect on analyzing the data obtained. Consequently, this has become a crucial issue requiring an efficient imputation technique to address. This research provided a technique for predicting missing values by using clustering and top K closest neighbor techniques that consider the local similarity pattern. The K‐means method is integrated with a spectral clustering methodology. After optimizing the clustering parameters, cluster size, and weighting criteria, missed gene sample values are estimated. The top K closest neighbor method uses weighted distance to predict the missed gene sample value falling in a specific cluster. Experimental outcomes show that the suggested imputation methodology generates efficient predictions compared to existing imputation techniques. In this research, microarray datasets comprising information from various cancers and tumors are used to experiment with the imputation performance. The primary contribution of this work is that even if the microarray dataset has varied dimensions and features, local similarity‐based approaches may be employed for missing value prediction. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
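The abstract of record 49 describes estimating missing values from the top K closest neighbors within a cluster, weighted by distance. The sketch below shows one simple way that idea could look. It is not the authors' algorithm: cluster labels are taken as a given input rather than produced by the K-means and spectral clustering step, the inverse-distance weighting and root-mean-square distance over shared columns are assumptions, and the function name `impute_weighted_knn` is hypothetical.

```python
import numpy as np

def impute_weighted_knn(X, labels, k=3, eps=1e-9):
    """Fill NaNs using the k nearest rows from the same cluster.

    X      : 2-D array (samples x features) with NaN marking missing values.
    labels : cluster label per row, assumed to come from a prior clustering step.
    Entries with no usable donor in their cluster are left as NaN.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    out = X.copy()
    for i, j in zip(*np.where(np.isnan(X))):
        # Donors: rows in the same cluster that have column j observed.
        donors = np.flatnonzero((labels == labels[i]) & ~np.isnan(X[:, j]))
        dists, vals = [], []
        for d in donors:
            shared = ~np.isnan(X[i]) & ~np.isnan(X[d])
            if not shared.any():
                continue  # nothing to compare on
            diff = X[i, shared] - X[d, shared]
            dists.append(np.sqrt(np.mean(diff ** 2)))
            vals.append(X[d, j])
        if not vals:
            continue
        dists, vals = np.array(dists), np.array(vals)
        nearest = np.argsort(dists)[:k]
        weights = 1.0 / (dists[nearest] + eps)  # closer donors count more
        out[i, j] = np.sum(weights * vals[nearest]) / np.sum(weights)
    return out

# Tiny illustrative matrix with two obvious clusters.
X = [[1.0, 2.0, 3.0], [1.0, 2.0, np.nan], [10.0, 20.0, 30.0], [10.0, 20.0, np.nan]]
print(impute_weighted_knn(X, labels=[0, 0, 1, 1], k=1))
```

Restricting donors to the same cluster is what captures the "local similarity pattern" mentioned in the abstract; with a single global neighbor search, a row could borrow values from samples with a very different expression profile.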
50. Idle State Detection with an Autoregressive Multiple Model Probabilistic Framework in SSVEP-Based Brain-Computer Interfaces
- Author
-
Zerafa, Rosanne, Camilleri, Tracey, Falzon, Owen, Camilleri, Kenneth P., Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Ye, Xuesong, editor, Soares, Filipe, editor, De Maria, Elisabetta, editor, Gómez Vilda, Pedro, editor, Cabitza, Federico, editor, Fred, Ana, editor, and Gamboa, Hugo, editor
- Published
- 2021
- Full Text
- View/download PDF