76,243 results for "Régression"
Search Results
2. Sex Differences in Hydration Biomarkers and Test–Retest Reliability Following Passive Dehydration.
- Author
Doherty, Colin S., Fortington, Lauren V., and Barley, Oliver R.
- Subjects
- *BIOMARKERS, *HYDRATION, *STATISTICAL reliability, *HEMATOCRIT, *REGRESSION analysis, *SEX distribution, *DEHYDRATION, *DESCRIPTIVE statistics, *INTRACLASS correlation, *BODY mass index, *OSMOLAR concentration
- Abstract
This study investigated (a) differences between males and females for changes in serum, tear, and urine osmolality, hematocrit, and urine specific gravity following acute passive dehydration and (b) assessed the reliability of these biomarkers separately for each sex. Fifteen males (age: 26.3 ± 3.5 years, body mass: 76 ± 7 kg) and 15 females (age: 28.8 ± 6.4 years, body mass: 63 ± 7 kg) completed a sauna protocol twice (5–28 days apart), aiming for 4% body mass loss (BML). Urine, blood, and tear markers were collected pre- and postdehydration, and change scores were calculated. Male BML was significantly greater than that of females in Trial 1 (3.53% ± 0.55% vs. 2.53% ± 0.43%, p <.001) and Trial 2 (3.36% ± 0.66% vs. 2.53% ± 0.44%, p =.01). Despite significant differences in BML, change in hematocrit was the only change marker that displayed a significant difference in Trial 1 (males: 3% ± 1%, females: 2% ± 1%, p =.004) and Trial 2 (males: 3% ± 1%, females: 1% ± 1%, p =.008). Regression analysis showed a significant effect for sex (male) predicting change in hematocrit (β = 0.8, p =.032) and change in serum osmolality (β = −3.3, p =.005) when controlling for BML but not for urinary or tear measures. The intraclass correlation coefficients for females (ICC 2,1) were highest for change in urine specific gravity (ICC =.62, p =.006) and lowest for change in tear osmolarity (ICC = −.14, p =.689), whereas for males, they were highest for post-hematocrit (ICC =.65, p =.003) and lowest for post-tear osmolarity (ICC =.18, p =.256). Generally, biomarkers showed lower test–retest reliability in males compared with females but, overall, were classified as poor–moderate in both sexes. These findings suggest that the response and reliability of hydration biomarkers are sex specific and highlight the importance of accounting for BML differences. [ABSTRACT FROM AUTHOR]
- Published
- 2024
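The test–retest reliability statistic reported in the entry above, ICC(2,1), can be computed directly from the two-way ANOVA mean squares. A minimal sketch with illustrative data, not the study's:

```python
import numpy as np

def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.
    scores: (n_subjects, k_trials) array."""
    scores = np.asarray(scores, dtype=float)
    n, k = scores.shape
    grand = scores.mean()
    row_means = scores.mean(axis=1)
    col_means = scores.mean(axis=0)
    # Sums of squares for the two-way ANOVA decomposition
    ss_rows = k * ((row_means - grand) ** 2).sum()
    ss_cols = n * ((col_means - grand) ** 2).sum()
    ss_total = ((scores - grand) ** 2).sum()
    ss_err = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-subjects mean square
    msc = ss_cols / (k - 1)               # between-trials mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Perfectly reproducible trials give ICC = 1
same = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]])
print(icc_2_1(same))  # → 1.0
```

Values below about .5, as for most biomarkers in the study, indicate poor single-measure reliability.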
3. Prediction of Model Generated Patellofemoral Joint Contact Forces Using Principal Component Prediction and Reconstruction.
- Author
Ashall, Myles, Wheatley, Mitchell G.A., Saliba, Chris, Deluzio, Kevin J., and Rainbow, Michael J.
- Subjects
KNEE joint, BIOLOGICAL models, GAIT in humans, REGRESSION analysis, BIOFEEDBACK training, WALKING, FACTOR analysis, DIAGNOSIS, PREDICTION models, THREE-dimensional printing, KINEMATICS
- Abstract
It is not currently possible to directly and noninvasively measure in vivo patellofemoral joint contact force during dynamic movement; therefore, indirect methods are required. Simple models may be inaccurate because patellofemoral contact forces vary for the same knee flexion angle, and the patellofemoral joint has substantial out-of-plane motion. More sophisticated models use 3-dimensional kinematics and kinetics coupled to a subject-specific anatomical model to predict contact forces; however, these models are time consuming and expensive. We applied a principal component analysis prediction and regression method to predict patellofemoral joint contact forces derived from a robust musculoskeletal model using exclusively optical motion capture kinematics (external approach), and with both patellofemoral and optical motion capture kinematics (internal approach). We tested this on a heterogeneous population of asymptomatic subjects (n = 8) during ground-level walking (n = 12). We developed equations that successfully capture subject-specific gait characteristics, with the internal approach outperforming the external. These approaches were compared with a knee-flexion-based model from the literature (the Brechter model). Both outperformed the Brechter model in interquartile range, limits of agreement, and the coefficient of determination. The equations generated by these approaches are less computationally demanding than a musculoskeletal model and may act as an effective tool in future rapid gait analysis and biofeedback applications. [ABSTRACT FROM AUTHOR]
- Published
- 2023
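The prediction-and-reconstruction idea described above (PCA on the output waveforms, regression from input features to PC scores, then reconstruction of full waveforms) can be sketched with synthetic stand-ins; none of the dimensions or data below come from the study:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins: X = per-trial gait features, Y = contact-force waveforms
n, T = 40, 101
X = rng.normal(size=(n, 3))
Y = X @ rng.normal(size=(3, T)) + 0.05 * rng.normal(size=(n, T))

# 1) PCA on the waveforms via SVD of the centred data
Y_mean = Y.mean(axis=0)
U, s, Vt = np.linalg.svd(Y - Y_mean, full_matrices=False)
m = 3                                     # retained principal components
scores = (Y - Y_mean) @ Vt[:m].T

# 2) Linear regression from features to PC scores
Xd = np.column_stack([np.ones(n), X])
B, *_ = np.linalg.lstsq(Xd, scores, rcond=None)

# 3) Predict scores, then reconstruct full waveforms from the retained PCs
pred_scores = Xd @ B
Y_hat = Y_mean + pred_scores @ Vt[:m]
print(np.corrcoef(Y.ravel(), Y_hat.ravel())[0, 1])  # high: data are near rank 3
```

Once the regression coefficients and PC basis are stored, prediction is a pair of matrix products, which is the computational saving the abstract points to.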
4. Regression Model Approach Towards Concrete Compressive Strength Prediction and Evaluation
- Author
Mahesh, Vijayalakshmi G. V., Achyutha Gowda, CP, Krishna, Alla Vamsi, Kumar, Leti Manish, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Geetha, R., editor, Dao, Nhu-Ngoc, editor, and Khalid, Saeed, editor
- Published
- 2025
5. Reg-TTA3D: Better Regression Makes Better Test-Time Adaptive 3D Object Detection
- Author
Yuan, Jiakang, Zhang, Bo, Gong, Kaixiong, Yue, Xiangyu, Shi, Botian, Qiao, Yu, Chen, Tao, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Leonardis, Aleš, editor, Ricci, Elisa, editor, Roth, Stefan, editor, Russakovsky, Olga, editor, Sattler, Torsten, editor, and Varol, Gül, editor
- Published
- 2025
6. Adversarial Robustness Certification for Bayesian Neural Networks
- Author
Wicker, Matthew, Patane, Andrea, Laurenti, Luca, Kwiatkowska, Marta, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Platzer, André, editor, Rozier, Kristin Yvonne, editor, Pradella, Matteo, editor, and Rossi, Matteo, editor
- Published
- 2025
7. Machine Learning Concepts
- Author
Gupta, Pramod, Sehgal, Naresh Kumar, and Acken, John M.
- Published
- 2025
8. Therapies for Down Syndrome Regression Disorder
- Author
Children's Hospital Los Angeles
- Published
- 2024
9. Solution path algorithm for distributionally robust regression.
- Author
Tang, Guangrui and Fan, Neng
- Subjects
- *ROBUST optimization, *MATHEMATICAL optimization, *REGRESSION analysis, *ALGORITHMS
- Abstract
In this paper, we propose a general distributionally robust regression model based on distributionally robust optimization theory. The proposed model has a piecewise linear loss function and an elastic net penalty term, and it generalizes many other regression models. We prove the piecewise linear property of the optimal solutions to this model, which enables us to develop a solution path algorithm for hyperparameter tuning. A Doubly regularized Least Absolute Deviations (DrLAD) regression model is proposed based on this framework, and a solution path algorithm is developed to speed up the tuning of its two hyperparameters. Numerical experiments are conducted to validate the performance of this model and the computational efficiency of the solution path algorithm. [ABSTRACT FROM AUTHOR]
- Published
- 2024
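The piecewise linear loss at the core of the DrLAD model above is the least absolute deviations criterion, which has a standard linear programming reformulation. A minimal sketch of that core fit on synthetic data; the elastic net penalty and the solution path machinery of the paper are not reproduced:

```python
import numpy as np
from scipy.optimize import linprog

def lad_fit(X, y):
    """Least absolute deviations via its standard LP reformulation:
    minimize sum(u) subject to -u <= y - X b <= u."""
    n, p = X.shape
    c = np.concatenate([np.zeros(p), np.ones(n)])   # objective: sum of slacks u
    I = np.eye(n)
    A_ub = np.block([[X, -I], [-X, -I]])            # X b - u <= y ; -X b - u <= -y
    b_ub = np.concatenate([y, -y])
    bounds = [(None, None)] * p + [(0, None)] * n   # coefficients free, slacks >= 0
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    return res.x[:p]

rng = np.random.default_rng(0)
X = np.column_stack([np.ones(50), rng.normal(size=50)])
y = X @ np.array([1.0, 2.0]) + rng.laplace(scale=0.1, size=50)
print(lad_fit(X, y))  # ≈ [1.0, 2.0]
```

The L1-type penalties of the elastic net add further linear terms and constraints to the same LP, which is what makes the optimal solution piecewise linear in the hyperparameters.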
10. Learning curve analyses in spine surgery: a systematic simulation-based critique of methodologies.
- Author
McNamee, Conor, Keraidi, Salman, McDonnell, Jake, Kelly, Andrew, Wall, Julia, Darwish, Stacey, and Butler, Joseph S.
- Subjects
- *LEARNING curve, *CUSUM technique, *HOLISTIC education, *EDUCATIONAL outcomes, *SAMPLE size (Statistics), *SPINAL surgery
- Abstract
Various statistical approaches exist to delineate learning curves in spine surgery. Techniques range from dividing cases into intervals for metric comparison, to employing regression and cumulative summation (CUSUM) analyses. However, their inherent inconsistencies and methodological flaws limit their comparability and reliability. To critically evaluate the methodologies used in existing literature for studying learning curves in spine surgery and to provide recommendations for future research. Systematic literature review. A comprehensive literature search was conducted using PubMed, Embase, and Scopus databases, covering articles from January 2010 to September 2023. For inclusion, articles had to evaluate the change in a metric of performance during human spine surgery across time/a case series. Results had to be reported in sufficient detail to allow for evaluation of individual performance rather than group/institutional performance. Articles were excluded if they included cadaveric/nonhuman subjects, aggregated performance data or no way to infer change across a number of cases. Risk of bias was assessed using the Risk of Bias in Nonrandomized Studies of Interventions (ROBINS-I) tool. Surgical data were simulated using Python 3 and then examined via multiple commonly used analytic approaches including division into consecutive intervals, regression and CUSUM techniques. Results were qualitatively assessed to determine the effectiveness and limitations of each approach in depicting a learning curve. About 113 studies met inclusion criteria. The majority of the studies were retrospective and evaluated a single-surgeon's experience. Methods varied considerably, with 66 studies using a single proficiency metric and 47 using more than 1. Operating time was the most commonly used metric. Interval division was the simplest and most commonly used method yet inherent limitations prevent collective synthesis. 
Regression may accurately describe the learning curve but in practice is hampered by sample size and model choice. CUSUM analyses are of widely varying quality; some are fundamentally flawed and widely misinterpreted, while others provide a reliable view of the learning process. There is considerable variation in the quality of existing studies on learning curves in spine surgery. CUSUM analyses, when correctly applied, offer the most reliable estimates. To improve the validity and comparability of future studies, adherence to methodological guidelines is crucial. Multiple or composite performance metrics are necessary for a holistic understanding of the learning process. [ABSTRACT FROM AUTHOR]
- Published
- 2024
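A CUSUM learning curve of the kind critiqued above charts the cumulative deviation of observed outcomes from an acceptable failure rate; the turning point is often read as the end of the learning phase. A minimal sketch on simulated cases (all parameters below are illustrative, not from any study):

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical simulated case series: failure probability falls with experience
n_cases = 100
p_fail = np.linspace(0.4, 0.05, n_cases)
failures = rng.random(n_cases) < p_fail

acceptable = 0.15                          # acceptable failure rate (target)
cusum = np.cumsum(failures - acceptable)   # rises while worse than target, falls once better

turning_point = int(np.argmax(cusum))      # crude estimate of where proficiency is reached
print(turning_point)
```

This also illustrates the misinterpretation risk the review raises: the curve's shape depends directly on the chosen `acceptable` rate, so an arbitrary target can manufacture or hide an apparent learning phase.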
11. Understanding the energy behavior of households in the mountainous town of Metsovo, Greece.
- Author
Balaskas, Anastasios, Karani, Ioanna, Katsoulakos, Nikolas, Damigos, Dimitris, and Kaliampakos, Dimitris
- Abstract
This article is a methodical attempt to understand the factors that influence energy consumption in households in the mountainous settlement of Metsovo, Greece. So far, most of the research on the settlement has indirectly approached the investigation of the factors that shape the energy behavior of households. In the present research, the identification of factors is directly approached through linear regression and clustering methods. Income, heating system, and household size were identified as the main factors influencing household energy expenditure. Since mountain areas are plagued by energy poverty, the study of household energy behavior inevitably highlights aspects of this phenomenon. By highlighting these factors and the spatial dimension of energy consumption (i.e., higher thermal energy needs in mountain areas), it was possible to suggest more targeted measures specifically designed for mountain areas, complementing the existing energy policy. [ABSTRACT FROM AUTHOR]
- Published
- 2024
12. Geological Setting and Formation of the Erosional Structure of Upper Miocene Deposits in Western Ciscaucasia.
- Author
Postnikova, I. S., Patina, I. S., and Gorkin, G. M.
- Subjects
- *GEOLOGICAL formations, *VALLEYS, *MIOCENE Epoch, *AREA studies, *SEDIMENTS
- Abstract
The results of regional studies of Early Miocene deposits in Western Ciscaucasia, based on the seismic stratigraphic analysis, are presented. The spatial pattern of sediment accumulation is analyzed and the paleogeographic conditions during the Late Miocene regressive stages in Western Ciscaucasia are refined. Erosional incisions of several levels, developed during the base-level fall in the course of major regressions in the studied time interval, were identified. Based on the spatial correlation of paleoincisions using a chosen series of intersecting seismic profiles, the buried Paleo-Don and Paleo-Donets river valleys were reconstructed at the Sarmatian‒Meotian boundaries and within the Meotian‒upper Pontian interval. [ABSTRACT FROM AUTHOR]
- Published
- 2024
13. A trend-residual decomposition-based modelling approach for estimating Türkiye's total health expenditure.
- Author
Yardımcı, Rezzan and Boğar, Eşref
- Abstract
Accurate forecasting of health expenditures is a fundamental issue for the sustainability of health systems and policies. In this study, a trend-residual decomposition-based model is proposed to forecast Türkiye's total healthcare expenditure with high accuracy. The proposed model has a two-stage forecasting procedure. In the first stage, the trend of the health expenditure time series is determined using polynomial regression. In the second stage, a residual model, with linear parameters optimized by the least squares method and non-linear parameters optimized by a neural network algorithm, is proposed to model the detrended part of the time series. Using healthcare expenditure data for the years 1999-2021, the performance of the proposed model is compared with grey models, regression models, exponential smoothing models, and ARIMA models. The results obtained by using the years 1999-2017 for training and the years 2018-2021 for testing demonstrate that the proposed model has better modeling and forecasting performance than the other models. Accordingly, Türkiye's total healthcare expenditure for the years 2022-2030 has been forecast with the proposed model, and it is predicted to reach 2.2 trillion TL in 2030. [ABSTRACT FROM AUTHOR]
- Published
- 2024
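The two-stage procedure described above (fit a polynomial trend, then model the residuals separately) can be sketched in a few lines. The series below is synthetic and the residual model is left as a placeholder; only the decomposition step is shown:

```python
import numpy as np

# Hypothetical expenditure-like series: quadratic trend plus noise (illustrative only)
years = np.arange(1999, 2022)
rng = np.random.default_rng(1)
t = years - years.min()
y = 0.5 * t**2 + 10 * t + 100 + rng.normal(0, 5, years.size)

# Stage 1: polynomial trend via least squares
coeffs = np.polyfit(t, y, deg=2)
trend = np.polyval(coeffs, t)

# Stage 2 would model these detrended residuals (e.g. with a neural network)
residuals = y - trend

# Extrapolating the trend component alone out to 2030
t_future = np.arange(2030 - years.min() + 1)
forecast_trend = np.polyval(coeffs, t_future)
print(forecast_trend[-1])
```

The point of the decomposition is that the residual series is stationary-looking and small relative to the trend, so a flexible non-linear model can be fit to it without having to capture the long-run growth at the same time.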
14. Machine learning-based weather forecasting using radiosonde observations.
- Author
Göğen, Eralp and Güney, Selda
- Abstract
From the past to the present, weather forecasting has held significant importance for humanity. Precise weather forecasting enables precautions against natural disasters such as floods, tsunamis, etc., thereby minimizing the adverse effects that may arise. In this study, weather prediction is conducted using radiosonde data. Within this prediction, estimates of both the highest and lowest temperatures are made employing machine learning algorithms. Unlike previous temperature prediction studies in the literature, a three-year dataset of radiosonde observations is utilized. This dataset, measured at intervals of 1 mbar up to an altitude of 40 km above the ground, allows for a more accurate modeling of the atmosphere compared to other studies in the literature. In this model, predictions of the highest and lowest temperatures for the next day are made. At this stage, the effects of normalization, feature extraction, and feature selection on the results are analyzed, and the most suitable model for prediction is determined. The software, implemented in the MATLAB environment, compares different regression methods. As a result of these analyses, using the Gaussian Process Regression (GPR) method, the next day's highest temperature is predicted with the highest accuracy, with a root mean square deviation of 1.2. Using the same method, the lowest temperature is predicted with a root mean square deviation of 2.4. The results indicate more successful temperature predictions compared to studies in the literature. [ABSTRACT FROM AUTHOR]
- Published
- 2024
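Gaussian process regression, which the study above found most accurate, has a closed-form posterior mean. A self-contained numpy sketch with made-up one-dimensional data (the study's radiosonde features are far higher-dimensional, and its kernel and hyperparameters are not stated here):

```python
import numpy as np

def gpr_predict(X_train, y_train, X_test, length=1.0, sigma_n=0.1):
    """GP regression posterior mean with an RBF kernel:
    mean = K_*  (K + sigma_n^2 I)^{-1} y."""
    def rbf(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
        return np.exp(-0.5 * d2 / length**2)
    K = rbf(X_train, X_train) + sigma_n**2 * np.eye(len(X_train))
    K_star = rbf(X_test, X_train)
    alpha = np.linalg.solve(K, y_train)   # (K + sigma_n^2 I)^{-1} y
    return K_star @ alpha

X = np.linspace(0, 6, 30)[:, None]
y = np.sin(X).ravel()
X_new = np.array([[1.5], [4.5]])
print(gpr_predict(X, y, X_new))  # ≈ [sin(1.5), sin(4.5)]
```

In practice the length-scale and noise level would be tuned (e.g. by marginal likelihood), which is the part MATLAB's fitted GPR models automate.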
15. Performance Evaluation and Comparison of Deep Neural Network Models for African Soil Properties Prediction.
- Author
Egeonu, Darlington and Jia, Bochen
- Subjects
- *ARTIFICIAL neural networks, *DISCRETE wavelet transforms, *STANDARD deviations, *CHEMICAL sample preparation, *SOIL scientists
- Abstract
The significance of soil in agriculture and human survival cannot be overstated. Given its limited supply, improving soil properties is imperative. This study addresses this challenge by predicting five major soil properties—organic carbon, pH values, Mehlich-3 extractable calcium, Mehlich-3 extractable phosphorus, and sand content—utilizing mid-infrared absorbance measurements from the African Soil Information Service (AfSIS) dataset, covering non-desert regions of Africa. With a pressing need for a reliable and efficient model to predict African soil properties using spectral measurements, this study fills a crucial gap, addressing the scarcity of functional soil property databases in Africa. The developed model eliminates costly soil sample preparation and lengthy chemical analysis, applicable in both onsite and laboratory settings for determining soil functional properties. By employing stacked autoencoders for feature dimensionality reduction and combining discrete wavelet transform with two feature selection methods (stepwise regression and random forest) to build robust multi-layer perceptron (MLP) models, the study offers a comprehensive approach. Evaluation metrics including root mean square error (RMSE), scatter index (SI), Variance Accounted For (VAF), Nash-Sutcliffe Model Efficiency (NSE), and Correlation Coefficient (R) demonstrate the superior performance of the MLP model informed by stacked autoencoder-selected features, outperforming models informed by wavelet-transformed features and partial least square regression. This best-performing autoencoder-based model presents a valuable tool for soil scientists tasked with modeling African soil properties. [ABSTRACT FROM AUTHOR]
- Published
- 2024
16. Estimation of nitrate content in tomato using image features.
- Author
Nassiri, Seyed Mehdi, Nematollahi, Mohammad Amin, Jafari, Abdolabbas, and Salamrudi, Peyman
- Subjects
- *ARTIFICIAL neural networks, *NITROGEN fertilizers, *TOMATOES, *NITRATES, *AGRICULTURAL productivity, *IMAGE processing
- Abstract
The improper use of chemical fertilizers in crop production can result in unsafe food sources for consumers. This research focuses on estimating the accumulation of nitrate in tomatoes by analyzing images of tomato tissues. The experiments were conducted using a completely randomized design with four nitrogen levels: 400, 800, 1200, and 1600 kg/ha. Fifty samples were randomly selected from each treatment to create images for feature processing and develop a prediction model. The samples were sliced to a consistent thickness, and their images were prepared. The nitrate contents of the same samples were then measured in the laboratory. Color features, including R, G, and B color components, as well as non-color features such as white pixel area (WPA), total slice surface area (TSA), and the ratio of white pixel area to total slice surface area (WPA/TSA), were extracted from the images. The results showed that the nitrate content of the samples increased significantly (P<0.05) in response to the applied nitrogen fertilizer, with measurements of 1.6%, 2.7%, 2.8%, and 3.3%, respectively. Moreover, a strong correlation was found between the color components, WPA, TSA, WPA/TSA, and nitrate accumulation in the samples. Multiple regression and multilayer perceptron neural network (MLP) models were employed to predict the nitrate content. The best subset method was used to build an appropriate regression model. Various topologies and transfer functions were applied to identify the best MLP model. The results indicated that an MLP model with a 3-15-1 topology and the lowest mean relative percentage error (MRPE) was the most accurate neural network model. The final regression and neural network models were validated using 60 intact samples. The neural network model achieved an MRPE of approximately 3.5%, demonstrating its precise estimation of nitrate contents compared to the regression model with an MRPE of around 5.2%. [ABSTRACT FROM AUTHOR]
- Published
- 2024
17. Understanding Academic and Athletic Transfer Patterns for Latina/o College Athletes.
- Author
Ortega, Guillermo
- Subjects
- *COLLEGE athletes, *COACHES (Athletics), *UNIVERSITY faculty, *ATHLETES, *GENDER
- Abstract
Using the Student Athlete Climate Dataset, this paper examined factors associated with Latina/o college athletes' intent to transfer for academic and athletic reasons. This study offers insight regarding how gender, NCAA Division, and geographic location can influence Latina/o college athletes' decision to transfer. In addition, the roles of faculty members and head coaches were significant in Latina/o college athletes' intent to transfer for athletic reasons. [ABSTRACT FROM AUTHOR]
- Published
- 2024
18. School-Level Longitudinal Predictors of Alcohol, Cigarette, and Marijuana Use.
- Author
Hansen, William B., Beamon, Emily, Orsini, Muhsin Michael, and Wyrick, David L.
- Subjects
- *BINGE drinking, *DRUG accessibility, *INDEPENDENT variables, *ALCOHOL drinking, *SOCIAL norms
- Abstract
This study analyzed measures aggregated at the school level to identify key predictors of drinking alcohol, binge drinking, smoking cigarettes, and using marijuana. Using data collected from 6th through 12th grade students between 2011 and 2015, we identify school-level variables that predict school-level prevalence in the subsequent year. Data included prior year assessments of: (1) school-wide prevalence, (2) perceived ease of access to drugs, (3) perceived adult disapproval of drug use, (4) perceived peer disapproval of drug use, and (5) perceived prevalence of drug use. We regressed grade-level behaviors on predictor variables from the previous school year. In middle schools, prior grade prevalence and prior grade perceived norms were significant predictors of subsequent grade prevalence. For high schools, prior year prevalence, aggregated peer norms, and perceived ease of access predicted subsequent use. These analyses provide evidence that a school's culture is predictive of changes in prevalence over time. [ABSTRACT FROM AUTHOR]
- Published
- 2024
19. The Effect of One Year Aneurysm Sac Dynamics on Five Year Mortality and Continued Aneurysm Sac Evolution.
- Author
Li, Chun, Böckler, Dittmar, Rastogi, Vinamr, Verhagen, Hence J.M., Loftus, Ian M., Reijnen, Michel M.P.J., Arko, Frank R., Guo, Jia, and Schermerhorn, Marc L.
- Abstract
One year aneurysm sac dynamics after endovascular abdominal aortic aneurysm repair (EVAR) were independently associated with a greater all cause mortality risk in prior registry studies but were limited in completeness and granularity. This retrospective analysis aimed to study the impact of sac dynamics on survival within the Endurant Stent Graft Global Registry (ENGAGE) with five year follow up. A total of 1,263 subjects were enrolled in the ENGAGE Registry between March 2009 and April 2011. One year aneurysm sac changes were calculated between the one month post-operative imaging scans and the scan closest to the time of one year follow up. Sac regression was defined as a sac decrease of ≥ 5 mm and sac expansion as aneurysm sac growth of ≥ 5 mm. The primary outcome was the five year all cause mortality rate. Kaplan–Meier estimates for freedom from all cause death were calculated. Multivariable Cox regression was used to determine the association between sac dynamics and all cause death. At one year, 441 of the 949 study participants with appropriate imaging (46%) had abdominal aortic aneurysm sac regression, 462 (49%) remained stable, and 46 (4.8%) had sac expansion. For patients with sac regression, the five year all cause mortality rate was 20%, compared with 28% for the stable sac (p =.007) and 37% for the sac expansion (p =.010) cohorts. After adjustment, the sac expansion and stable sac cohorts were associated with a greater all cause mortality rate (expansion: hazard ratio [HR] 1.8; 95% CI 1.1 – 3.2; p =.032; stable: HR 1.4; 95% CI 1.1 – 1.9; p =.019). In the ENGAGE Global Registry, the one year rate of sac regression was 46%, and one year sac regression was observed to be associated with greater five year survival, corroborating prior findings using data from vascular registries. Sac regression could become the new standard for success after EVAR. [ABSTRACT FROM AUTHOR]
- Published
- 2024
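The Kaplan–Meier estimates reported above follow the standard product-limit formula: at each event time, survival is multiplied by (1 − deaths/at-risk). A minimal sketch on toy data, not the registry's:

```python
import numpy as np

def kaplan_meier(time, event):
    """Kaplan-Meier survival estimate S(t) at each distinct event time.
    time: follow-up times; event: 1 = death observed, 0 = censored."""
    time, event = np.asarray(time, float), np.asarray(event, int)
    S, out = 1.0, []
    for t in np.unique(time[event == 1]):
        at_risk = (time >= t).sum()                    # still under follow-up at t
        deaths = ((time == t) & (event == 1)).sum()
        S *= 1 - deaths / at_risk                      # product-limit update
        out.append((t, S))
    return out

# No censoring, one death per time point: S steps down 0.75, 0.5, 0.25, 0.0
print(kaplan_meier([1, 2, 3, 4], [1, 1, 1, 1]))
```

Censored subjects (event = 0) leave the risk set without forcing a step down, which is why KM handles incomplete five year follow-up.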
20. Mapping the World Health Organization Disability Assessment Scale 2.0 to the EQ-5D-5L in patients with mental disorders.
- Author
Abdin, Edimansyah, Seet, Vanessa, Jeyagurunathan, Anitha, Tan, Sing Chik, Mohmad Khalid, Muhammad Iskandar Shah, Mok, Yee Ming, Verma, Swapna, and Subramaniam, Mythily
- Abstract
Objective: The current study aims to develop an algorithm for mapping the WHODAS 2.0 to the EQ-5D-5L for patients with mental disorders. Methods: This cross-sectional study was conducted at the Institute of Mental Health and Community Wellness Clinics in Singapore between June 2019 and November 2022. We applied four regression methods, namely Ordinary Least Squares (OLS) regression, the Tobit regression model, robust regression with the MM estimator (MM), and the adjusted limited dependent variable mixture model (ALDVMM), to map EQ-5D-5L utility scores from the WHODAS 2.0. Results: A total of 797 participants were included. The mean EQ-5D-5L utility and WHODAS 2.0 total scores were 0.615 (SD = 0.342) and 11.957 (SD = 8.969), respectively. We found that the EQ-5D-5L utility score was best predicted by the robust regression model with the MM estimator. Our findings suggest that the WHODAS 2.0 total scores were significantly and inversely associated with the EQ-5D-5L utility scores. Conclusion: This study provides a mapping algorithm for converting WHODAS 2.0 scores into EQ-5D-5L utility scores, which can be implemented using a simple online calculator in the following web application: . [ABSTRACT FROM AUTHOR]
- Published
- 2024
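The study above selects robust regression with an MM estimator. As a rough stand-in (an M-type fit with a Huber loss via scipy, not a true MM estimator), the sketch below shows the basic mechanic such a mapping relies on: outlying responses are down-weighted so the score-to-utility slope is not dragged by them. All numbers are invented:

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(0)

# Hypothetical stand-ins for WHODAS 2.0 totals and EQ-5D-5L utilities
whodas = rng.uniform(0, 40, 200)
utility = 0.9 - 0.02 * whodas + rng.normal(0, 0.05, 200)
utility[:10] += 0.5                         # a few outlying responses

def residuals(beta):
    return (beta[0] + beta[1] * whodas) - utility

# Robust fit: the Huber loss grows linearly beyond f_scale, down-weighting outliers
fit = least_squares(residuals, x0=[0.5, 0.0], loss="huber", f_scale=0.1)
print(fit.x)  # slope ≈ -0.02: utility falls as WHODAS disability rises
```

The inverse association in the result mirrors the paper's finding that higher WHODAS 2.0 scores map to lower EQ-5D-5L utilities.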
21. Testing for Equivalence of Pre-Trends in Difference-in-Differences Estimation.
- Author
Dette, Holger and Schumann, Martin
- Subjects
FIXED effects model, REGRESSION analysis, NULL hypothesis, MATHEMATICAL statistics, RESEARCH personnel
- Abstract
The plausibility of the "parallel trends assumption" in Difference-in-Differences estimation is usually assessed by a test of the null hypothesis that the difference between the average outcomes of both groups is constant over time before the treatment. However, failure to reject the null hypothesis does not imply the absence of differences in time trends between both groups. We provide equivalence tests that allow researchers to find evidence in favor of the parallel trends assumption and thus increase the credibility of their treatment effect estimates. While we motivate our tests in the standard two-way fixed effects model, we discuss simple extensions to settings in which treatment adoption is staggered over time. [ABSTRACT FROM AUTHOR]
- Published
- 2024
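An equivalence test of the kind proposed above can be built from two one-sided t-tests (TOST): reject "the pre-trend difference is at least δ in magnitude" in both directions, so that a small p-value is evidence *for* near-parallel trends. This is a generic TOST sketch, not the authors' test statistic:

```python
import numpy as np
from scipy import stats

def tost(diff, margin):
    """Two one-sided tests: H0 |mean| >= margin vs H1 |mean| < margin.
    Returns the TOST p-value (the larger of the two one-sided p-values)."""
    n = len(diff)
    m, se = np.mean(diff), np.std(diff, ddof=1) / np.sqrt(n)
    t_lower = (m + margin) / se                 # tests mean <= -margin
    t_upper = (m - margin) / se                 # tests mean >= +margin
    p_lower = 1 - stats.t.cdf(t_lower, n - 1)
    p_upper = stats.t.cdf(t_upper, n - 1)
    return max(p_lower, p_upper)

# Pre-period between-group trend differences close to zero (synthetic)
rng = np.random.default_rng(0)
diff = rng.normal(0.0, 0.05, 30)
print(tost(diff, margin=0.2))  # small p: evidence of near-parallel pre-trends
```

Note the reversal relative to the conventional pre-trends test: here failing to find a small p-value means equivalence is *not* demonstrated, which is exactly the inferential gap the paper addresses.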
22. Lessons on Datasets and Paradigms in Machine Learning for Symbolic Computation: A Case Study on CAD.
- Author
del Río, Tereso and England, Matthew
- Abstract
Symbolic Computation algorithms and their implementation in computer algebra systems often contain choices which do not affect the correctness of the output but can significantly impact the resources required: such choices can benefit from having them made separately for each problem via a machine learning model. This study reports lessons on such use of machine learning in symbolic computation, in particular on the importance of analysing datasets prior to machine learning and on the different machine learning paradigms that may be utilised. We present results for a particular case study, the selection of variable ordering for cylindrical algebraic decomposition, but expect that the lessons learned are applicable to other decisions in symbolic computation. We utilise an existing dataset of examples derived from applications which was found to be imbalanced with respect to the variable ordering decision. We introduce an augmentation technique for polynomial systems problems that allows us to balance and further augment the dataset, improving the machine learning results by 28% and 38% on average, respectively. We then demonstrate how the existing machine learning methodology used for the problem—classification—might be recast into the regression paradigm. While this does not have a radical change on the performance, it does widen the scope in which the methodology can be applied to make choices. [ABSTRACT FROM AUTHOR]
- Published
- 2024
23. Maximum a posteriori estimation and filtering algorithm for numerical label noise.
- Author
Jiang, Gaoxia, Li, Zhengying, and Wang, Wenjian
- Subjects
GAUSSIAN mixture models, DIGITAL filters (Mathematics), REGRESSION analysis, DATA quality, NOISE
- Abstract
Data quality, especially label quality, may have a significant impact on the prediction accuracy in supervised learning. Training on datasets with label noise causes a degradation in performance and a reduction in prediction accuracy. To overcome the numerical label noise problem in regression, we estimate the posterior distribution of the true label through the Gaussian mixture model (GMM). Then, label noise estimation is proposed by integrating the idea of maximum a posteriori (MAP) estimation with the posterior distribution. In addition, a noise filtering algorithm with MAP estimation (MAPNF) is designed by combining the optimal sample selection framework with the estimator. Extensive experiments are carried out on benchmark datasets and an age estimation dataset to verify the effectiveness of MAPNF. The results on benchmark datasets show that MAPNF outperforms other recent filtering algorithms in improving the generalization performance of different regression models, including noise-sensitive models and noise-robust models. The model error can be reduced by 29.7% to 69.6%. Our proposed approach can also identify erroneous labels in an age estimation dataset (18,424 samples in total). The model trained on the filtered dataset (19% of the data removed) reduces the test error on the dataset by at least 2.68%. The results demonstrate a less-is-better effect by achieving lower prediction errors with fewer high-quality samples. It can be concluded that MAPNF can effectively identify label noise and optimize the data quality. [ABSTRACT FROM AUTHOR]
- Published
- 2024
24. WSMOTER: a novel approach for imbalanced regression.
- Author
Camacho, Luís and Bacao, Fernando
- Subjects
MACHINE learning, CLASSIFICATION, FORECASTING
- Abstract
Although the imbalanced learning problem is best known in the context of classification tasks, it also affects other areas of learning algorithms, such as regression. For regression, the problem is characterized by the existence of a continuous target variable domain and the need for models capable of making accurate predictions about rare events. Furthermore, such rare events with a real-value target are often the ones with greater interest in having models that can predict them. In this paper, we propose the novel approach WSMOTER (Weighting SMOTE for Regression) to tackle the imbalanced regression problem, which, according to the experimental work we present, outperforms currently available solutions to the problem. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
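WSMOTER's weighting scheme is detailed in the paper; the SMOTE-for-regression step it builds on — generating a synthetic example by interpolating a rare sample with a neighbour in both feature and target space — can be sketched as follows (the function name and interpolation scheme are illustrative, not the authors' code):

```python
import random

def smote_regression_sample(x_a, y_a, x_b, y_b, rng):
    """Create one synthetic example by interpolating a rare point (x_a, y_a)
    with a neighbour (x_b, y_b), using one factor for features and target."""
    lam = rng.random()  # interpolation factor in [0, 1)
    x_new = [xa + lam * (xb - xa) for xa, xb in zip(x_a, x_b)]
    y_new = y_a + lam * (y_b - y_a)  # continuous target, same factor
    return x_new, y_new

rng = random.Random(42)
x_new, y_new = smote_regression_sample([1.0, 2.0], 10.0, [3.0, 4.0], 20.0, rng)
```

Each synthetic point lies on the segment between its two parents, with the continuous target interpolated by the same factor — the property that distinguishes regression-oriented oversampling from classification SMOTE, which simply copies the minority label.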
25. Gender-based differences in the association of self-reported sleep duration with cardiovascular disease and diabetes.
- Author
-
Chen, Yufei, Yu, Guoqing, Zhang, Xi, Cai, Yimeng, Hu, Tian, and Xue, Rui
- Abstract
Objective: Insufficient or prolonged daily sleep may contribute to the onset of cardiovascular disease and diabetes, and there may be some variability between genders; however, current research evidence is limited. We aimed to investigate the effects of gender on the association between self-reported sleep duration and the prevalence of cardiovascular disease and diabetes. Research design and methods: This study is a population-based, cross-sectional analysis. Data came from a nationally representative sample of US adults in the National Health and Nutrition Examination Survey (NHANES, 2005–2020); after excluding records with missing values for any variable, 13,002 participants remained (6,774 men and 6,228 women). Self-reported sleep duration was obtained using a habitual sleep questionnaire. Logistic regression models investigated the associations between gender-specific self-reported sleep duration, CVDs, and diabetes events. Results: Among all participants, compared with sleeping 7–8 h/day, the multivariable-adjusted odds ratios significantly associated with <7 h/day and >8 h/day, respectively, were 1.43 [1.15, 1.78] and 1.34 [1.01, 1.76] for CHF, 1.62 [1.28, 2.06] for angina, 1.42 [1.17, 1.71] for heart attack, 1.38 [1.13, 1.70] and 1.54 [1.20, 1.97] for stroke, and 1.21 [1.09, 1.35] and 1.28 [1.11, 1.48] for diabetes. In men, CHF (1.67 [1.21, 2.14]), angina (1.66 [1.18, 2.15]), stroke (1.55 [1.13, 1.97]), and diabetes (1.15 [1.00, 1.32]) were significantly associated with <7 h/day, while stroke (1.73 [1.16, 2.32]) and diabetes (1.32 [1.06, 1.52]) were significantly associated with >8 h/day. In women, angina (1.83 [1.16, 2.50]), heart attack (1.63 [1.11, 2.15]), and diabetes (1.32 [1.11, 1.54]) were significantly associated with <7 h/day, while diabetes (1.31 [1.03, 1.59]) was significantly associated with >8 h/day. Conclusion: Self-reported long and short sleep duration was independently associated with the risk of some CVDs and diabetes.
However, sleep duration and gender did not have multiplicative or additive interactions with the onset of diabetes and CVDs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
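Odds ratios like those reported above are the exponentials of logistic-regression coefficients, with confidence limits obtained on the log scale. A minimal sketch — the coefficient and standard error below are hypothetical values chosen so the output reproduces the first reported CHF estimate (1.43 [1.15, 1.78]):

```python
import math

def odds_ratio_ci(beta, se, z=1.96):
    """Convert a logistic-regression coefficient and its standard error into
    an odds ratio with a 95% confidence interval (exponentiated Wald limits)."""
    return (math.exp(beta),
            math.exp(beta - z * se),
            math.exp(beta + z * se))

# Hypothetical coefficient for short sleep (<7 h/day) predicting CHF
or_, lo, hi = odds_ratio_ci(beta=0.3577, se=0.111)
```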
26. FAStEN: An Efficient Adaptive Method for Feature Selection and Estimation in High-Dimensional Functional Regressions.
- Author
-
Boschi, Tobia, Testa, Lorenzo, Chiaromonte, Francesca, and Reimherr, Matthew
- Subjects
- *
REGRESSION analysis , *MACHINE learning , *PARAMETER estimation , *FUNCTIONAL analysis , *DATA analysis , *FEATURE selection - Abstract
Functional regression analysis is an established tool for many contemporary scientific applications. Regression problems involving large and complex data sets are ubiquitous, and feature selection is crucial for avoiding overfitting and achieving accurate predictions. We propose a new, flexible and ultra-efficient approach to perform feature selection in a sparse high dimensional function-on-function regression problem, and we show how to extend it to the scalar-on-function framework. Our method, called FAStEN, combines functional data, optimization, and machine learning techniques to perform feature selection and parameter estimation simultaneously. We exploit the properties of Functional Principal Components and the sparsity inherent to the Dual Augmented Lagrangian problem to significantly reduce computational cost, and we introduce an adaptive scheme to improve selection accuracy. In addition, we derive asymptotic oracle properties, which guarantee estimation and selection consistency for the proposed FAStEN estimator. Through an extensive simulation study, we benchmark our approach to the best existing competitors and demonstrate a massive gain in terms of CPU time and selection performance, without sacrificing the quality of the coefficients’ estimation. The theoretical derivations and the simulation study provide a strong motivation for our approach. Finally, we present an application to brain fMRI data from the AOMIC PIOP1 study. Complete FAStEN code is provided at https://github.com/IBM/funGCN. Supplementary materials for this article are available online. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Identifying biological markers and sociodemographic factors that influence the gap between phenotypic and chronological ages.
- Author
-
Pala, Daniele, Xu, Jia, Xie, Yuezhi, Zhang, Yuqin, and Shen, Li
- Subjects
- *
AGE , *OLDER people , *ENVIRONMENTAL exposure , *SOCIODEMOGRAPHIC factors , *NEURODEGENERATION - Abstract
Introduction: The world’s population is aging rapidly, leading to increased public health and economic burdens due to age-related cardiovascular and neurodegenerative diseases. Early risk detection is essential for prevention and for improving the quality of life of elderly individuals. Moreover, health risks associated with aging are not directly tied to chronological age but are also influenced by a combination of environmental exposures. Past research has introduced the concept of “Phenotypic Age,” which combines age with biomarkers to estimate an individual’s health risk. Methods: This study explores which factors contribute most to the gap between chronological and phenotypic ages. We combined ten machine learning regression techniques applied to the NHANES dataset, containing demographic, laboratory and socioeconomic data from 41,474 patients, to identify the most important features. We then used clustering analysis and a mixed-effects model to stratify by sex, ethnicity, and education. Results: We identified 28 demographic, biological and environmental factors related to a significant gap between phenotypic and chronological ages. Stratifying by sex, education and ethnicity, we found statistically significant differences in the outcome distributions. Conclusion: By showing that health risk prevention should consider both biological and sociodemographic factors, we offer a new approach to predicting aging rates and potentially improving targeted prevention strategies for age-related conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
28. Some approximations to the path formula for some nonlinear models.
- Author
-
Kartsonaki, Christiana
- Subjects
- *
PROPORTIONAL hazards models , *LOGISTIC regression analysis - Abstract
In linear least squares regression there exists a simple decomposition of the effect of an exposure on an outcome into two parts in the presence of an intermediate variable. This decomposition is described and then analogous decompositions for other models are examined, namely for logistic regression and proportional hazards models. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
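The decomposition referenced above is exact in linear least squares: the total effect of the exposure equals the direct effect plus the product of the exposure→mediator and mediator→outcome coefficients. A small numerical check on simulated data (the toy data and the tiny OLS solver are ours, not the paper's):

```python
import random

def ols(X, y):
    """Least-squares coefficients [intercept, slopes...] via normal equations
    solved by Gauss-Jordan elimination (fine for these tiny systems)."""
    rows = [[1.0] + list(r) for r in X]
    p = len(rows[0])
    A = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    b = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    for i in range(p):
        piv = A[i][i]
        A[i] = [v / piv for v in A[i]]
        b[i] /= piv
        for k in range(p):
            if k != i:
                f = A[k][i]
                A[k] = [vk - f * vi for vk, vi in zip(A[k], A[i])]
                b[k] -= f * b[i]
    return b

# Toy data: exposure x, mediator m depends on x, outcome y depends on both
rng = random.Random(1)
x = [rng.random() for _ in range(50)]
m = [2 * xi + rng.gauss(0, 0.1) for xi in x]
y = [1 + 3 * xi + 4 * mi + rng.gauss(0, 0.1) for xi, mi in zip(x, m)]

c_total = ols([[xi] for xi in x], y)[1]       # total effect: y ~ x
a = ols([[xi] for xi in x], m)[1]             # path x -> m
_, c_direct, bcoef = ols([[xi, mi] for xi, mi in zip(x, m)], y)  # y ~ x + m
# OLS identity: c_total == c_direct + a * bcoef (exactly, up to rounding)
```

For logistic regression and proportional hazards models the analogous decomposition is not exact, which is what motivates the approximations examined in the paper.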
29. Multivariate Assessment of Variability and Relationships of Soil Parameters Under Semi-Arid Alfisols Using Principal Component Analysis.
- Author
-
Sathish, A. and Patel, Veerendra
- Subjects
- *
ALFISOLS , *PRINCIPAL components analysis , *COPPER , *SOIL fertility , *SOILS - Abstract
Multivariate relationships and variability of initial and post-harvest soil parameters in the North and South Transects of Bengaluru during 2017 and 2018 are assessed. North (South) Transect soils indicated initial pH of 5.9 (7.1), EC of 0.17 (0.16) dS m−1, N of 376.9 (415.3) kg/ha, P2O5 of 83.18 (64.31) kg/ha, K2O of 169.7 (247.3) kg/ha, Fe of 11.78 (10.86) ppm, Zn of 1.59 (1.13) ppm, Cu of 0.71 (0.6) ppm, Mn of 8.24 (7.74) ppm and B of 0.24 (0.21) ppm; compared to post-harvest pH of 6.1 (6.8), EC of 0.26 (0.19) dS m−1, N of 352.5 (383.4) kg/ha, P2O5 of 68.05 (54.14) kg/ha, K2O of 153.5 (222.1) kg/ha, Fe of 9.91 (8.75) ppm, Zn of 1.35 (0.93) ppm, Cu of 0.58 (0.53) ppm, Mn of 6.88 (6.72) ppm and B of 0.20 (0.19) ppm. Four PCs of post-harvest parameters explained a maximum variance of 59.85% in the North compared to 69.94% in the South Transects. Soil pH (−0.623), Fe (0.648), Zn (−0.603) on PC-1; Cu (0.628) on PC-2; EC (0.613) on PC-3; and B (−0.657) on PC-4 were significant in North, while pH (−0.715), Fe (0.891), Zn (−0.624), Mn (0.835) on PC-1; EC (0.771), N (0.618) on PC-2; K2O (0.718), B (0.774) on PC-3; P2O5 (0.673), Cu (0.647) on PC-4 were significant in South Transect. It may be recommended that the package of practices of crops should be modified by considering variability, relationships, loadings, and buildup of soil fertility parameters under semi-arid Alfisols. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
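Loadings like those reported on PC-1 above are the entries of the leading eigenvector of the correlation matrix of the standardized variables. A self-contained sketch using power iteration on simulated "soil-like" data (the three variables and their correlations are invented for illustration, echoing the opposite signs of pH/Zn versus Fe reported above; the sign of an eigenvector is arbitrary, so only relative signs are meaningful):

```python
import random

def first_pc(data, iters=200):
    """Leading principal component of standardized data via power iteration
    on the correlation matrix."""
    n, p = len(data), len(data[0])
    means = [sum(r[j] for r in data) / n for j in range(p)]
    sds = [max((sum((r[j] - means[j]) ** 2 for r in data) / n) ** 0.5, 1e-12)
           for j in range(p)]
    Z = [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in data]
    C = [[sum(Z[i][a] * Z[i][b] for i in range(n)) / n for b in range(p)]
         for a in range(p)]
    v = [1.0] * p
    for _ in range(iters):
        w = [sum(C[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v  # unit-norm loadings of PC1 (sign arbitrary)

rng = random.Random(0)
# Toy data: "pH" and "Zn" move together, "Fe" moves opposite to both
rows = []
for _ in range(100):
    t = rng.gauss(0, 1)
    rows.append([t + rng.gauss(0, 0.2),    # pH
                 -t + rng.gauss(0, 0.2),   # Fe
                 t + rng.gauss(0, 0.2)])   # Zn
loadings = first_pc(rows)
```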
30. Predicting Cytotoxicity of Nanoparticles: A Meta-Analysis Using Machine Learning.
- Author
-
Masarkar, Ashish, Maparu, Auhin Kumar, Nukavarapu, Yaswanth Sai, and Rai, Beena
- Abstract
Cytotoxicity evaluation of nanoparticles (NPs) is regarded as a crucial step for their successful application in the biomedical industry. However, conventional experimental methodologies for cytotoxicity measurements are often expensive, time-consuming, and demand intense training in cell culture. In this study, we developed generalized machine learning (ML) models for both qualitative and quantitative prediction of cytotoxicity across a wide variety of NPs. In particular, a meta-analysis of cytotoxicity data was conducted from published literature on metallic, metal oxide, polymer, and carbon-based NPs, leading to the development of random forest-based regression and classification models for predicting cell viability from physicochemical properties of NPs, cellular attributes, and testing conditions. Our feature importance analysis showed that accurately predicting the cytotoxicity of NPs using the regression model requires knowledge of their composition, concentration, zeta potential, and size, as well as exposure time, toxicity assay, and tissue type. Interestingly, among these attributes, the information about composition of NPs or tissue type was not needed for achieving high accuracy in the qualitative prediction of cytotoxicity using the classification model, indicating its superior robustness compared to the regression model. These findings may encourage future researchers to employ ML models more effectively and frequently to reliably assess the safety of NPs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. Fitting of Dynamic Models for Continuous Adsorption: A Perspective.
- Author
-
Zhi, Lee Lin, Amran, Fadina, and Zaini, Muhammad Abbas Ahmad
- Subjects
- *
RHODAMINE B , *DYNAMIC models , *WASTEWATER treatment , *ACTIVATED carbon , *MATHEMATICAL models - Abstract
Continuous mode of adsorption using fixed bed column has been widely applied for wastewater treatment. Several dynamic mathematical models were used to analyze the adsorptive properties and breakthrough curve of the column system. This paper aims to highlight the underlying principles and applicability of three commonly used dynamic models, namely Thomas, Bohart-Adams and Yoon Nelson in estimating the behavior of continuous adsorption and its properties. The model equations were analyzed to unlock the root of derivation, and the findings from literature were summarized, compared and discussed. As opposed to the results from literature, it was revealed that the models studied are actually identical with direct relationships among the rate constants; hence, they must provide the same fitting regression. Three dynamic models were applied in a case study on column adsorption of rhodamine B by palm kernel shell activated carbon (surface area = 1775 m2/g), from which the results were presented and discussed by the analysis of breakthrough and fitting of dynamic models. The breakthrough time decreased with increasing initial dye concentration and flow rate, and
vice versa for bed height. The applicability of these dynamic models was demonstrated by the close agreement of q0 from the models (33.9 mg/g − 69.3 mg/g) with the experimental values (36.8 mg/g − 70.9 mg/g) with low percentage error. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
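The claimed identity among the models can be checked directly: with k_YN = k_Th·C0 and τ = q0·m/(Q·C0), the Thomas and Yoon–Nelson equations generate the same breakthrough curve. A sketch with hypothetical column parameters (the values and units below are illustrative, not from the paper's case study):

```python
import math

def thomas(t, k_th, q0, m, Q, c0):
    """Thomas breakthrough curve C/C0 at time t."""
    return 1.0 / (1.0 + math.exp(k_th * q0 * m / Q - k_th * c0 * t))

def yoon_nelson(t, k_yn, tau):
    """Yoon-Nelson breakthrough curve C/C0 at time t."""
    return 1.0 / (1.0 + math.exp(k_yn * (tau - t)))

# Hypothetical column parameters (illustrative units)
k_th, q0, m, Q, c0 = 0.002, 50.0, 2.0, 0.01, 100.0  # L/(mg*min), mg/g, g, L/min, mg/L
k_yn = k_th * c0               # 1/min
tau = q0 * m / (Q * c0)        # min: time to 50% breakthrough
curve_t = [thomas(t, k_th, q0, m, Q, c0) for t in range(0, 201, 10)]
curve_yn = [yoon_nelson(t, k_yn, tau) for t in range(0, 201, 10)]
```

Because the exponents coincide term by term, fitting either model to the same data must return the same regression quality — the point the perspective argues.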
32. Prevalence and Predictors of Diabetic Retinopathy, Its Progression and Regression in Indian Children and Youth With Type-1 Diabetes.
- Author
-
Oza, Chirantap, Khadilkar, Anuradha, Bhor, Shital, Curran, Katie, Sambare, Chitra, Ladkat, Dipali, Bettiol, Alessandra, Quinn, Michael, Sproule, Alan, Willoughby, Colin, and Peto, Tunde
- Subjects
- *
TYPE 1 diabetes , *RISK assessment , *MEDICAL protocols , *RESEARCH funding , *DIABETIC retinopathy , *HYPERTENSION , *GLYCEMIC control , *DISEASE remission , *DESCRIPTIVE statistics , *AGE distribution , *MEDICAL screening , *RETINA , *TRIGLYCERIDES , *DISEASE progression , *DISEASE risk factors , *DISEASE complications , *ADOLESCENCE , *CHILDREN - Abstract
Objective: There are very few reports on the prevalence of diabetic retinopathy (DR) in children and youth with type-1 diabetes (T1D). Studies have also found very low rates of referral for DR screening in children and youth with T1D. We conducted this study to determine the prevalence of DR, to study the reliability of ISPAD screening recommendations and to identify predictors of DR, its progression and regression in Indian children and youth with T1D. Methods: This study included 882 children and youth with T1D. Demographic data, anthropometry, blood pressure, sexual maturity rating, ophthalmological examination (slit lamp for cataract) and biochemical measurements were performed using standard protocols. Fundus images were captured using the Forus Health 3netra classic digital non-mydriatic fundus camera by the same experienced operator. De-identified images were assessed by a senior grader and ophthalmologist (Belfast Ophthalmic Reading Center). Severity of DR was graded as per the UK National Health Service (NHS) DR classification scale. Result: We report 6.4% and 0.2% prevalence of DR and cataract in Indian children and youth with T1D, respectively. All the subjects with DR had early non-proliferative DR. We report that amongst subjects with DR, only 2 subjects were aged less than 11 years and had duration of illness less than 2 years. Presence of hypertension and older age were significant predictors of DR (P <.05). Subjects with DR had significantly higher triglyceride concentrations (P <.05), of these, 6.9% had progression and 2.9% had regression at 1 year follow up; the change in glycaemic control was a significant positive predictor of progression of DR (P <.05). None of the participants included in the study progressed to develop sight-threatening DR. Conclusion: DR is not uncommon in Indian children and youth with T1D, thus screening for DR needs to be initiated early, particularly in older individuals with higher disease duration. 
Controlling blood pressure and triglyceride concentrations may prevent occurrence of DR. Improving glycaemic control may prevent progression of DR in Indian children and youth with T1D. Plain Language Summary: Diabetic retinopathy in Indian children with Type 1 Diabetes We found that 6.4% and 0.2% Indian children and youth with type-1 diabetes had diabetic retinopathy and cataract respectively. We report that amongst subjects with DR, only 2 subjects were aged less than 11 years and had duration of illness less than 2 years. Thus, International Society for Paediatric and Adolescent Diabetes (ISPAD) screening criteria must be implemented by all centres to avoid missing cases. Presence of high blood pressure, high triglyceride levels and older age were significant predictors of DR. Of the subjects with DR, 6.9% had progression and 2.9% had regression at 1 year follow up; the change in glycaemic control was a significant positive predictor of progression of DR. None of the participants included in the study progressed to develop sight-threatening DR. DR is not uncommon in Indian children and youth with T1D, hence, screening for DR needs to be initiated early, particularly in older individuals with higher disease duration. Controlling blood pressure and triglyceride concentrations may prevent occurrence of DR. Improving glycaemic control may prevent progression of DR in Indian children and youth with T1D. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
33. Multivariate analysis of road crashes involving two-wheelers at Vienna's roads.
- Author
-
Magusic, Radmila
- Subjects
SAFETY regulations ,ROAD safety measures ,WEATHER ,MULTIVARIATE analysis ,WOUNDS & injuries - Abstract
In order to determine the critical elements influencing the frequency and severity of two-wheeler-related traffic crashes, this study examines eleven years of such incidents in Vienna. Applying sophisticated multivariate statistical approaches to a comprehensive dataset covering rider demographics, weather conditions, vehicle features, and crash circumstances, it reveals intricate correlations and interactions among these elements, and asks whether there are significant gender- and age-based differences in the conditions under which crashes occur and in the resulting injury severity. Multiple regression identifies statistically grounded fields for action, providing the central answer of this research: why there are so many crashes and what the leading causes of injury to two-wheeler riders are. The research yields insights that policymakers and road-safety practitioners can use to improve two-wheeler safety regulations and lower the number of serious injuries and fatalities. Highlights: Only one rider in every fourteen escapes injury. Every second two-wheel rider sustains a light injury, and every eighth a severe one. Lane type, crash circumstances, and consequences prove to be significant predictors of crashes. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
34. INTERCONNECTIONS AND INTERDEPENDENCIES OF ECONOMIC DEVELOPMENT AND SHADOW BANKING SECTOR IN DEVELOPING AND TRANSITIONAL ECONOMIES.
- Author
-
Yao LIANG, Xu JIN, and AZIMZADEH, Aslan Javid
- Subjects
- *
SHADOW banking system , *BANKING industry , *BANK loans , *NONBANK financial institutions , *TRANSITION economies - Abstract
The research objective is defined as the identification and confirmation of empirical relationships between shadow banking activities and economic development in developing and transitional economies to establish a theoretical basis for minimizing potential risks associated with shadow banking. The methodological design is based on a quantitative approach, implemented through correlation-regression analysis and ARIMA forecasting methods. The research findings confirm Hypothesis 1: China's shadow banking is closely interconnected with the country's economic development. However, Hypothesis 2 (the reduction of shadow banking in China contributes to per capita GDP growth) is only supported for specific structural elements of shadow banking that contribute to economic overheating. In contrast, for other structural elements, such as entrusted loans, a strong direct correlation exists, promoting a positive impact of shadow banking on the country's economic development. This highlights the need for a highly balanced state policy to minimize shadow banking risks. The research results can be valuable for professionals in public administration and academic researchers, particularly in terms of shaping future research directions. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
35. Methods and reagent-lot comparisons by regression analysis: Sample size considerations.
- Author
-
Sadler, William A
- Subjects
- *
CONFIDENCE regions (Mathematics) , *CONFIDENCE intervals , *REGRESSION analysis , *COMPUTER software developers , *SAMPLE size (Statistics) - Abstract
Background: Parametric regression analysis is widely used in methods comparisons and more recently in checking the concordance of test results following receipt of new reagent lots. The greater frequency of reagent-lot evaluations increases pressure to detect bias with smallest possible sample sizes (i.e. smallest consumption of time and resources). This study revisits bias detection using the joint slope, intercept confidence region as an alternative to slope and intercept confidence intervals. Methods: Four cases were considered representing constant errors, proportional errors (constant CV) and two more complicated error patterns typical of immunoassays. Maximum:minimum range ratios varied from 2:1 to 2000:1. After setting a maximum tolerable difference a series of slope, intercept combinations, each of which predicted the critical difference, were systematically evaluated in simulations which determined the minimum sample size required to detect the difference, firstly using slope, intercept confidence intervals and secondly using the joint slope, intercept confidence region. Results: At small to moderate range ratios, bias detection by joint confidence region required greatly reduced sample sizes to the extent that it should encourage reagent-lot evaluations or, alternatively, transform those already routinely performed into considerably less costly exercises. Conclusions: While some software is available to calculate joint confidence regions in real-life analyses, shifting this testing method into the mainstream will require a greater number of software developers incorporating the necessary code into their regression programs. The computer program used to conduct this study is freely available and can be used to model any laboratory test. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
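The joint test the paper revisits compares the quadratic form (β̂ − β₀)ᵀXᵀX(β̂ − β₀)/(2s²) against an F(2, n−2) critical value, instead of checking the slope and intercept confidence intervals separately. A sketch for a simple reagent-lot comparison (the simulated data, the 3% proportional bias, and the hard-coded critical value of roughly 3.34 for F(2, 28) at the 95% level are our assumptions):

```python
import random

def joint_F_stat(x, y, b0, b1):
    """F statistic for H0: (intercept, slope) == (b0, b1) in simple regression,
    using the joint confidence-region quadratic form d' (X'X) d / (2 s^2)."""
    n = len(x)
    sx, sxx = sum(x), sum(v * v for v in x)
    sy, sxy = sum(y), sum(a * b for a, b in zip(x, y))
    slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    inter = (sy - slope * sx) / n
    resid = [yi - inter - slope * xi for xi, yi in zip(x, y)]
    s2 = sum(r * r for r in resid) / (n - 2)
    d0, d1 = inter - b0, slope - b1
    # quadratic form with X'X = [[n, sum(x)], [sum(x), sum(x^2)]]
    qf = n * d0 * d0 + 2 * sx * d0 * d1 + sxx * d1 * d1
    return qf / (2 * s2)

rng = random.Random(3)
x = [rng.uniform(1, 10) for _ in range(30)]        # reference-lot results
y = [1.03 * xi + rng.gauss(0, 0.05) for xi in x]   # new lot: 3% proportional bias
F = joint_F_stat(x, y, 0.0, 1.0)                   # test identity line (0, 1)
F_crit = 3.34  # approx. 95% point of F(2, 28) (hard-coded assumption)
```

The bias is flagged when F exceeds the critical value; the paper's point is that this joint criterion typically needs far smaller samples than separate slope and intercept intervals.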
36. Factors of presenteeism and its association with detrimental effects among employees in Switzerland working in different sectors – a cross-sectional study using a multi-item instrument.
- Author
-
Gerlach, Maisa, Blozik, Eva, Meichtry, André, Hägerbäumer, Miriam, Kilcher, Gablu, and Golz, Christoph
- Subjects
- *
LINEAR statistical models , *JOB satisfaction , *INDUSTRIAL hygiene , *CORPORATE culture , *PRESENTEEISM (Labor) - Abstract
Purpose: Presenteeism, the phenomenon of employees working despite illness, is a significant issue globally, impacting individual well-being and organizational efficiency. This study examines presenteeism among Swiss employees, exploring its occurrence, primary factors, reasons, and impact on employees' health. Methods: This study used cross-sectional data from 1,521 employees in different sectors in Switzerland. Descriptive statistics and multiple linear models for influencing factors and detrimental effects, such as burnout symptoms, job satisfaction, general health, and quality of life, were calculated for data analysis. Presenteeism was measured using the Hägerbäumer multi-item scale, ranging from 1 = "Never in case of illness" to 5 = "Very often in case of illness." Results: Employees reported that, in case of illness, they rarely worked during the last 12 months (M = 2.04, SD = 1.00). A positive approach to presenteeism in the team was associated with less presenteeism (β = -0.07), while a problematic leadership culture in dealing with presenteeism was associated with increased presenteeism (β = 0.10). In addition to well-known factors, presenteeism was significantly associated with burnout symptoms (β = 1.49), general health status (β = -1.5), and quality of life (β = -0.01). Conclusion: The study offers insights into the phenomenon of presenteeism among Swiss employees in various sectors by applying a multi-item scale for presenteeism. The findings indicate that a positive team dynamic and organizational culture may significantly reduce presenteeism. Presenteeism behavior is a significant factor in adverse outcomes. This highlights the importance of acknowledging presenteeism in the context of occupational health. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
37. Statistical inference for sketching algorithms.
- Author
-
Browne, Ryan P and Andrews, Jeffrey L
- Subjects
- *
INFERENTIAL statistics , *DATA modeling , *REGRESSION analysis , *ALGORITHMS - Abstract
Sketching algorithms use random projections to generate a smaller sketched data set, often for the purposes of modelling. Complete and partial sketch regression estimates can be constructed using information from only the sketched data set or a combination of the full and sketched data sets. Previous work has obtained the distribution of these estimators under repeated sketching, along with the first two moments for both estimators. Using an alternative approach, we derive the distribution of the complete sketch estimator and additionally consider the error term under both repeated sketching and repeated sampling. Importantly, we obtain pivotal quantities which are based solely on the sketched data set—specifically not requiring information from the full data model fit. These pivotal quantities can be used for inference on the full data set regression estimates or the model parameters. For partial sketching, we derive pivotal quantities for a marginal test and an approximate distribution for the partial sketch under repeated sketching or repeated sampling—again avoiding reliance on a full data model fit. We extend these results to include the Hadamard and Clarkson–Woodruff sketches then compare them in a simulation study. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
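A "complete sketch" regression estimate of the kind analysed above is formed by projecting both the design matrix and the response with a random matrix and solving least squares on the compressed data. A minimal Gaussian-projection sketch (the paper also treats Hadamard and Clarkson–Woodruff sketches; the Gaussian choice, sizes, and data here are illustrative assumptions):

```python
import random

def ols2(u, v, y):
    """OLS of y on two columns u, v (no separate intercept term),
    via the 2x2 normal equations solved in closed form."""
    uu = sum(a * a for a in u)
    uv = sum(a * b for a, b in zip(u, v))
    vv = sum(b * b for b in v)
    uy = sum(a * c for a, c in zip(u, y))
    vy = sum(b * c for b, c in zip(v, y))
    det = uu * vv - uv * uv
    return ((uy * vv - vy * uv) / det, (uu * vy - uv * uy) / det)

rng = random.Random(7)
n, k = 500, 100                                   # full size n, sketch size k
x = [rng.uniform(0, 1) for _ in range(n)]
y = [2.0 + 5.0 * xi + rng.gauss(0, 0.1) for xi in x]
ones = [1.0] * n

b_full = ols2(ones, x, y)                         # full-data estimate

# Complete sketch: k Gaussian random projections of [1 x] and y
S = [[rng.gauss(0, 1) / k ** 0.5 for _ in range(n)] for _ in range(k)]
su = [sum(S[i][j] * ones[j] for j in range(n)) for i in range(k)]
sv = [sum(S[i][j] * x[j] for j in range(n)) for i in range(k)]
sy = [sum(S[i][j] * y[j] for j in range(n)) for i in range(k)]
b_sketch = ols2(su, sv, sy)                       # sketched estimate
```

The pivotal quantities derived in the paper make it possible to attach inference to `b_sketch` without ever computing `b_full`, which is the practical appeal when n is very large.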
38. Estimation and prediction with data quality indexes in linear regressions.
- Author
-
Chatelain, P. and Milhaud, X.
- Subjects
- *
DATA quality , *ACQUISITION of data , *FORECASTING , *CONFIDENCE , *ALGORITHMS - Abstract
Although many statistical applications brush the question of data quality aside, it is a fundamental concern inherent to external data collection. In this paper, data quality relates to the confidence one can have in the covariate values in a regression framework. More precisely, we study how to integrate the data-quality information given by an (n × p) matrix, with n the number of individuals and p the number of explanatory variables. In this view, we suggest a latent variable model that drives the generation of the covariate values, and introduce a new algorithm that takes all of this information into account for prediction. Our approach provides unbiased estimators of the regression coefficients and allows predictions adapted to a given quality pattern. The usefulness of our procedure is illustrated through simulations and real-life applications. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
39. Analyzing Wav2Vec 1.0 Embeddings for Cross-Database Parkinson's Disease Detection and Speech Features Extraction.
- Author
-
Klempíř, Ondřej and Krupička, Radim
- Subjects
- *
MACHINE learning , *PARKINSON'S disease , *NOSOLOGY , *DEEP learning , *SPEECH - Abstract
Advancements in deep learning speech representations have facilitated the effective use of extensive unlabeled speech datasets for Parkinson's disease (PD) modeling with minimal annotated data. This study employs the non-fine-tuned wav2vec 1.0 architecture to develop machine learning models for PD speech diagnosis tasks, such as cross-database classification and regression to predict demographic and articulation characteristics. The primary aim is to analyze overlapping components within the embeddings on both classification and regression tasks, investigating whether latent speech representations in PD are shared across models, particularly for related tasks. Firstly, evaluation using three multi-language PD datasets showed that wav2vec accurately detected PD based on speech, outperforming feature extraction using mel-frequency cepstral coefficients in the proposed cross-database classification scenarios. In cross-database scenarios using Italian and English-read texts, wav2vec demonstrated performance comparable to intra-dataset evaluations. We also compared our cross-database findings against those of other related studies. Secondly, wav2vec proved effective in regression, modeling various quantitative speech characteristics related to articulation and aging. Ultimately, subsequent analysis of important features examined the presence of significant overlaps between classification and regression models. The feature importance experiments discovered shared features across trained models, with increased sharing for related tasks, further suggesting that wav2vec contributes to improved generalizability. The study proposes wav2vec embeddings as a next promising step toward a speech-based universal model to assist in the evaluation of PD. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
40. A Deep Learning Approach to Distance Map Generation Applied to Automatic Fiber Diameter Computation from Digital Micrographs.
- Author
-
Alejo Huarachi, Alain M. and Beltrán Castañón, César A.
- Subjects
- *
CONVOLUTIONAL neural networks , *TEXTILE fiber industry , *SYNTHETIC textiles , *COMPUTER vision , *ANIMAL fibers , *DEEP learning - Abstract
Precise measurement of fiber diameter in animal and synthetic textiles is crucial for quality assessment and pricing; however, traditional methods often struggle with accuracy, particularly when fibers are densely packed or overlapping. Current computer vision techniques, while useful, have limitations in addressing these challenges. This paper introduces a novel deep-learning-based method to automatically generate distance maps of fiber micrographs, enabling more accurate fiber segmentation and diameter calculation. Our approach utilizes a modified U-Net architecture, trained on both real and simulated micrographs, to regress distance maps. This allows for the effective separation of individual fibers, even in complex scenarios. The model achieves a mean absolute error (MAE) of 0.1094 and a mean square error (MSE) of 0.0711, demonstrating its effectiveness in accurately measuring fiber diameters. This research highlights the potential of deep learning to revolutionize fiber analysis in the textile industry, offering a more precise and automated solution for quality control and pricing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
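The regression target here is a classical distance map: each fiber pixel's distance to the nearest background pixel, whose ridge (maximum) values are roughly half the local fiber thickness. A minimal sketch of such a map via multi-source BFS with 4-connectivity (the toy mask is ours, and this city-block version is only an approximation of the transform a U-Net like the one above might regress):

```python
from collections import deque

def distance_map(mask):
    """Multi-source BFS distance (in pixels, 4-connectivity) from each
    foreground pixel of a binary mask to the nearest background pixel."""
    h, w = len(mask), len(mask[0])
    dist = [[0 if mask[r][c] == 0 else None for c in range(w)] for r in range(h)]
    q = deque((r, c) for r in range(h) for c in range(w) if mask[r][c] == 0)
    while q:
        r, c = q.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                q.append((nr, nc))
    return dist

# A horizontal "fiber" 3 pixels thick (rows 2-4) in a 7x9 binary mask
mask = [[1 if 2 <= r <= 4 else 0 for c in range(9)] for r in range(7)]
dmap = distance_map(mask)
# Centre-line values peak at 2, so the ridge recovers the fiber thickness
```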
41. An investigation into dynamic behaviour of reconstituted and undisturbed fine-grained soil during triaxial and simple shear.
- Author
-
Önalp, Akın, Özocak, Aşkın, Bol, Ertan, Sert, Sedat, Arslan, Eylem, and Ural, Nazile
- Subjects
- *
DYNAMIC testing , *CONFORMANCE testing , *EARTHQUAKES , *CLAY , *SOILS - Abstract
This study aims to evaluate the factors controlling the sensitivity of fine-grained soils to seismic stresses and revise the criteria previously proposed by the authors to diagnose liquefaction. To this end, dynamic tests have been performed on artificial mixes as well as natural soils from a wide area of an earthquake devastated city (Adapazari) using two types of dynamic testing. Studies have led to findings suggesting that the gray area between susceptible and non-susceptible soils proposed by several investigators in the past can now be dispensed with. Although physical properties of fine-grained soil supply sufficient information for diagnosis, the dynamic simple shear test is found to be a convenient and rapid way to confirm the judgement. However, it has been seen that dynamic testing alone may not be the last word in the determination of liquefaction, and physical properties should also be addressed. Anomalies observed in test results are also discussed. Conclusions show significant differences from existing proposed criteria in the literature. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
42. Statistical Modeling of Asphalt Pavement Surface Friction Based on Aggregate Fineness Modulus and Asphalt Mix Volumetrics.
- Author
-
Alsheyab, Mohammad Ahmad and Khasawneh, Mohammad Ali
- Subjects
- *
ASPHALT modifiers , *ASPHALT pavements , *SURFACE texture , *STATISTICAL models , *ENGINEERING design - Abstract
Predicting pavement surface friction during the design stage allows engineers to optimize the design of the roadway to provide the appropriate level of friction for the intended use of the road in a safe and cost-effective manner. The main goal of the study is to propose a methodology to predict pavement surface friction during the design stage. Thus, this study analyzes the role of aggregate Fineness Modulus (FM) and Hot Mix Asphalt (HMA) volumetrics, including Air Voids Volume (Va) and Effective Binder Volume (Vbe), in shaping the surface texture. Surface frictional properties were evaluated using the British Pendulum Test (BPT) and the Sand Patch Test (SPT). The data were analyzed using the analysis of variance (ANOVA) test. Several statistical modeling techniques including Multiple Linear (ML) regression, Non-Linear Stepwise (NLSW) regression with all possible interactions, Non-Linear Beta (NLB) regression, Non-Linear Curve Fitting (NLCF) regression, and multilayer neural network (MNN) were utilized. Models were evaluated using synthetic data and compared using Post-Hoc analysis. The study evaluated nine types of mixes including different gradations with different Nominal Maximum Aggregate Sizes (NMAS) and several asphalt modifiers. The results revealed that Mean Texture Depth (MTD) and British Pendulum Number (BPN) values are primarily influenced by FM, followed by Va and Vbe, respectively. According to ANOVA results, two-way interactions were significant for both MTD and BPN only when FM interacted with either Va or Vbe. MNN models had the highest Coefficient of Determination (R²) values for both MTD and BPN. However, the sensitivity analysis and the Post-Hoc analysis revealed that, due to the low number of data used to generate the models, statistical regression methods had comparable results and produced more accurate predictions than MNN. The NLCF was found to be the most reliable model for predicting both BPN and MTD. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
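As a cut-down illustration of the ML regression step above, a one-predictor least-squares fit of BPN on FM can be written in closed form. The FM/BPN pairs below are hypothetical, not the study's measurements:

```python
# Simple least-squares fit y = b0 + b1*x, a one-predictor reduction of the
# multiple linear (ML) regression used for BPN; data are synthetic.

def ols_fit(x, y):
    """Return intercept b0 and slope b1 of the least-squares line."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    b1 = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / \
         sum((xi - mx) ** 2 for xi in x)
    b0 = my - b1 * mx
    return b0, b1

# Hypothetical fineness modulus (FM) and British Pendulum Number (BPN) pairs.
fm  = [4.2, 4.6, 5.0, 5.4, 5.8]
bpn = [62.0, 65.1, 67.9, 71.2, 74.0]

b0, b1 = ols_fit(fm, bpn)
print(round(b1, 3))  # slope: change in BPN per unit FM
```

The study's actual models add Va, Vbe, and their interactions with FM as further predictors; the fitting principle is the same.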
43. On inference in high-dimensional logistic regression models with separated data.
- Author
-
Lewis, R M and Battey, H S
- Subjects
- *
REGRESSION analysis , *MAXIMUM likelihood statistics , *SAMPLE size (Statistics) , *ASYMPTOTES , *PROBABILITY theory - Abstract
Direct use of the likelihood function typically produces severely biased estimates when the dimension of the parameter vector is large relative to the effective sample size. With linearly separable data generated from a logistic regression model, the loglikelihood function asymptotes and the maximum likelihood estimator does not exist. We show that an exact analysis for each regression coefficient produces half-infinite confidence sets for some parameters when the data are separable. Such conclusions are not vacuous, but an honest portrayal of the limitations of the data. Finite confidence sets are only achievable when additional, perhaps implicit, assumptions are made. Under a notional double-asymptotic regime in which the dimension of the logistic coefficient vector increases with the sample size, the present paper considers the implications of enforcing a natural constraint on the vector of logistic transformed probabilities. We derive a relationship between the logistic coefficients and a notional parameter obtained as a probability limit of an ordinary least-squares estimator. The latter exists even when the data are separable. Consistency is ascertained under weak conditions on the design matrix. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
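The non-existence of the maximum likelihood estimator under separation can be seen directly: for separable data, scaling a separating coefficient upward only increases the log-likelihood, so it asymptotes and is never maximized at a finite point. A toy one-dimensional check:

```python
# For linearly separable data the logistic log-likelihood has no maximizer:
# scaling a separating coefficient upward monotonically increases it.
import math

x = [-2.0, -1.0, 1.0, 2.0]   # perfectly separated at 0
y = [0, 0, 1, 1]

def loglik(beta):
    """Logistic log-likelihood for a single slope, no intercept."""
    ll = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-beta * xi))
        ll += math.log(p) if yi == 1 else math.log(1 - p)
    return ll

lls = [loglik(b) for b in (1.0, 5.0, 25.0)]
print(lls)  # strictly increasing toward 0: no finite MLE
```

This is why the paper's exact analysis yields half-infinite confidence sets for some coefficients, and why its least-squares-based notional parameter, which exists even under separation, is useful.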
44. IoT-Enabled Smart Mental Health Assessment Using Deep Hybrid Regression Models Over Actigraph-Based Sequential Motor Activity Data.
- Author
-
Arora, Anshika
- Subjects
- *
REGRESSION analysis , *SLEEP quality , *MENTAL health , *STATISTICAL significance , *FEATURE extraction - Abstract
This research puts forward novel deep hybrid frameworks for assessment of mental health indicators, namely depression and sleep quality, using IoT-based motor activity recordings. The study employs long short-term memory (LSTM) to extract high-level features from sequential motor activity data, which are then combined with statistical features of the raw data to form a hybrid feature extraction model. The combined feature vector obtained via the hybrid feature extractor is fed into four regression models, namely linear regression (LR), sequential minimal optimization regression (SMOR), random forest regression (RFR) and adaptive neuro-fuzzy inference system (ANFIS), forming four deep hybrid regression models, namely LSTM-LR, LSTM-SMOR, LSTM-RFR, and LSTM-ANFIS, for prediction of depression and sleep quality. The proposed deep hybrid frameworks are validated on benchmark datasets, namely the Depresjon dataset and the MESA actigraphy dataset, and the best performance is observed by LSTM-RFR with an adjusted R² value of 74.19% on the MESA dataset. On validating the significance of statistical features, it has been observed that all the models show significant improvement in performance when the statistical features are combined with the high-level features, with the largest reduction in mean absolute error being 4.0436 and the largest increase in the R² statistic being 0.1651. Model building times of the four regression methods were also compared; LR was fastest, requiring only 0.01 s for training and testing. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
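The hybrid feature vector described above is a concatenation of hand-crafted statistics of the raw actigraph sequence with learned sequence features. A minimal sketch, where the LSTM output is a stand-in list (in the paper it comes from a trained LSTM):

```python
# Sketch of the hybrid feature vector: statistical features of the raw
# motor-activity sequence concatenated with learned sequence features.
import math

def statistical_features(seq):
    """Mean, standard deviation, min, and max of a raw activity sequence."""
    n = len(seq)
    mean = sum(seq) / n
    var = sum((v - mean) ** 2 for v in seq) / n
    return [mean, math.sqrt(var), min(seq), max(seq)]

activity = [0, 3, 5, 2, 0, 0, 7, 4]      # hypothetical motor-activity counts
lstm_features = [0.12, -0.40, 0.88]      # placeholder for an LSTM hidden state

hybrid = statistical_features(activity) + lstm_features
print(len(hybrid))  # combined vector fed to the downstream regressor (e.g. RFR)
```

The feature names and dimensions here are assumptions for illustration; the paper's exact statistical feature set is not enumerated in the abstract.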
45. The Montreal Cognitive Assessment: Norms and Reliable Change Indices for Standard and MoCA-22 Administrations.
- Author
-
Ratcliffe, Lauren N, Hale, Andrew C, McDonald, Taylor, Hewitt, Kelsey C, Nguyen, Christopher M, Spencer, Robert J, and Loring, David W
- Subjects
- *
MONTREAL Cognitive Assessment , *INDIGENOUS peoples , *ALZHEIMER'S disease , *REFERENCE values , *CRONBACH'S alpha - Abstract
Objective The Montreal Cognitive Assessment (MoCA) is among the most frequently administered cognitive screening tests, yet demographically diverse normative data are needed for repeated administrations. Method Data were obtained from 18,410 participants using the National Alzheimer's Coordinating Center Uniform Data Set. We developed regression-based norms using Tobit regression to account for ceiling effects, explored test–retest reliability of total and domain scores stratified by age and diagnosis with Cronbach's alpha, and reported the cumulative change frequencies for individuals with serial MoCA administrations to gauge expected change. Results Strong ceiling effects and negative skew were observed at the total score, domain, and item levels for the cognitively normal group, and performances became more normally distributed as the degree of cognitive impairment increased. In regression models, years of education was associated with higher MoCA scores, whereas older age, male sex, Black and American Indian or Alaska Native race, and Hispanic ethnicity were associated with lower predicted scores. Temporal stability was adequate and good at the total score level for the cognitively normal and cognitive disorders groups, respectively, but fell short of reliability standards at the domain level. Conclusions MoCA total scores are adequately reproducible among those with cognitive diagnoses, but domain scores are unstable. Robust regression-based norms should be used to adjust for demographic performance differences, and the limited reliability, along with the ceiling effects and negative skew, should be considered when interpreting MoCA scores. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
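Regression-based norming as described above compares an observed score with the score predicted from demographics, expressed in residual-standard-deviation units. A sketch of that final step with made-up coefficients (the paper's Tobit-based coefficients are not reproduced here):

```python
# Demographically adjusted z-score: (observed - predicted) / residual SD.
# All coefficients below are illustrative, not the published norms.

def demographically_adjusted_z(observed, age, edu_years,
                               intercept=24.0, b_age=-0.05, b_edu=0.3,
                               residual_sd=2.5):
    """z-score of an observed MoCA total relative to a regression norm."""
    predicted = intercept + b_age * age + b_edu * edu_years
    return (observed - predicted) / residual_sd

z = demographically_adjusted_z(observed=24, age=70, edu_years=16)
print(round(z, 2))  # negative: slightly below the demographic expectation
```

Tobit regression changes how the coefficients are estimated (censoring the ceiling at 30), not how the resulting norm is applied.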
46. Retail banking closures in the United Kingdom. Are neighbourhood characteristics associated with retail bank branch closures?
- Author
-
Clark, Stephen, Newing, Andy, Hood, Nick, and Birkin, Mark
- Subjects
- *
BANKING industry , *RETAIL banking , *BRANCH banks , *WITHDRAWAL of funds , *NEIGHBORHOODS - Abstract
The United Kingdom retail banking sector has been through many changes over the past decade, driven by economic, technological and pandemic factors. This is perhaps most evident through the rationalisation of the branch networks, primarily through branch closures. There is a concern that these closures can have differential effects on certain sections of society, reducing or eliminating their access to cash and financial services. In this study we utilise comprehensive branch‐level data and a discrete time‐hazard model to identify whether certain neighbourhood types have been disproportionately affected by these closures during the period 2015 to 2021. We find that building society branches are least likely to close, and that competition and the presence of alternative banking facilities (including Post Office branches) influence likelihood of closure. Crucially, neighbourhood type matters, with rural communities continuing to be most impacted, along with those where the predominant work activities are less concentrated, or largely absent. However, in contrast to earlier studies, we find that affluent neighbourhoods were more at risk of branch closure than more diverse and economically challenged neighbourhoods. We conclude by considering these findings in the context of recently introduced legislation to protect access to basic banking and cash withdrawal and deposit facilities. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
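The discrete time-hazard quantity underlying the model above is simple: among branches still open at the start of each period, the fraction that close during it. A sketch with hypothetical counts (the study then models this hazard as a function of neighbourhood covariates):

```python
# Empirical discrete-time hazard: h_t = d_t / n_t, the fraction of branches
# at risk in period t that close during it. Counts are illustrative.

def discrete_hazard(at_risk, closures):
    """Per-period closure hazard from at-risk and event counts."""
    return [d / n for d, n in zip(closures, at_risk)]

years    = [2015, 2016, 2017]
at_risk  = [1000, 940, 870]   # branches open at the start of each year
closures = [60, 70, 45]

for yr, h in zip(years, discrete_hazard(at_risk, closures)):
    print(yr, round(h, 3))
```

In the regression form, each branch-year becomes one row and the hazard is modelled (typically via a logit link) on branch and neighbourhood characteristics.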
47. Efficient Kirszbraun extension with applications to regression.
- Author
-
Zaichyk, Hananel, Biess, Armin, Kontorovich, Aryeh, and Makarychev, Yury
- Subjects
- *
INTERIOR-point methods , *SUPERVISED learning , *HILBERT space , *FORECASTING - Abstract
We introduce a framework for performing vector-valued regression in finite-dimensional Hilbert spaces. Using Lipschitz smoothness as our regularizer, we leverage Kirszbraun's extension theorem for off-data prediction. We analyze the statistical and computational aspects of this method—to our knowledge, its first application to supervised learning. We decompose this task into two stages: training (which corresponds operationally to smoothing/regularization) and prediction (which is achieved via Kirszbraun extension). Both are solved algorithmically via a novel multiplicative weight updates (MWU) scheme, which, for our problem formulation, achieves significant runtime speedups over generic interior point methods. Our empirical results indicate a dramatic advantage over standard off-the-shelf solvers in our regression setting. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
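For real-valued targets, the extension step the paper describes has a classical closed form: the McShane–Whitney midpoint extension, which preserves the Lipschitz constant and agrees with the data (Kirszbraun's theorem is the vector-valued Hilbert-space generalisation, which needs the more involved algorithmic machinery the paper develops). A scalar sketch with illustrative data:

```python
# McShane-Whitney midpoint extension: the scalar analogue of the Kirszbraun
# extension step. Given L-Lipschitz data (xs, ys), f below is defined
# everywhere, agrees with ys on xs, and is itself L-Lipschitz.

def lipschitz_extend(xs, ys, L):
    def f(x):
        upper = min(yi + L * abs(x - xi) for xi, yi in zip(xs, ys))  # McShane
        lower = max(yi - L * abs(x - xi) for xi, yi in zip(xs, ys))  # Whitney
        return 0.5 * (upper + lower)
    return f

xs, ys = [0.0, 1.0, 2.0], [0.0, 1.0, 0.5]  # illustrative 1-Lipschitz data
f = lipschitz_extend(xs, ys, L=1.0)
print(f(1.0))   # reproduces the training value at a data point
print(f(1.5))   # off-data prediction
```

The training stage in the paper corresponds to first smoothing ys so the data are L-Lipschitz; the prediction stage corresponds to evaluating an extension like f.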
48. Morphological and Productive Correlations of Cutting Pennisetum Varieties Under Conditions of Peruvian Humid Tropics.
- Author
-
Pinchi-Carbajal, S. F., Quispe-Ccasa, H. A., Ampuero-Trigoso, G., Nolasco-Lozano, E., and Saucedo-Uriarte, J. A.
- Subjects
- *
PENNISETUM , *BIOMASS production , *LIVESTOCK farms , *CHLOROPHYLL , *GRASSES - Abstract
Livestock farming in the Peruvian tropics is based on the use of grazing forage, but cutting grasses offers greater productivity and seasonality advantages. In this study, the morphological and productive characteristics of King Grass Morado (KGM), Cuba OM-22 (CU), and Maralfalfa (MA) were evaluated and correlated with chlorophyll content under Peruvian humid tropic conditions. Five plots of 1 ha each were installed for the three Pennisetum varieties (2-1-2), with three samples per plot. No significant differences were found in plant height, leaf length, number of nodes, number of leaves/stem, number of stems, stem circumference, length of nodes, leaf, stems, and total weight, chlorophyll index (atLEAF CLOR), performance index (API), and dry matter. KGM stood out in tillering (12.86) (p<0.01), but CU and MA showed greater leaf width (4.16 and 4.42 cm, respectively) (p<0.05). The calculated biomass production was 40.3 t/ha for KGM, 24.5 t/ha for MA, and 76.5 t/ha for CU. MA had higher nitrogen (0.70%) and protein (4.33%) contents (p<0.01). The correlations were significant between stem height with the number of nodes and leaf width, stem circumference with stem, leaf, and total weight (p<0.05), and nitrogen and protein content were estimated with the atLEAF CLOR and API values of the basal leaves with R² = 0.548 and R² = 0.563, respectively (p<0.05). In conclusion, KGM, CU, and MA differed in some morphological and productive variables and were correlated with others; furthermore, the protein content could be estimated with the atLEAF CLOR and API values in these Pennisetum varieties. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
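The trait correlations reported above are Pearson coefficients. A minimal sketch of the statistic with hypothetical paired trait values (not the study's measurements):

```python
# Pearson correlation between two plant traits; the paired values below
# are hypothetical stand-ins for measurements like stem height and node count.
import math

def pearson_r(x, y):
    """Sample Pearson correlation coefficient of paired observations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

stem_height = [2.1, 2.4, 2.8, 3.0, 3.3]   # m, hypothetical
node_count  = [14, 15, 17, 18, 20]

r = pearson_r(stem_height, node_count)
print(round(r, 3))  # close to 1: strongly positively correlated traits
```

The R² values for estimating protein from atLEAF CLOR and API are simply the squares of such correlations in the simple-regression case.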
49. Modeling and optimization of WEDM machining of armour steel using modified crow search algorithm approach.
- Author
-
Gupta, Rajesh, Agrawal, Sunil, and Singh, Pushpendra
- Abstract
Armour steel is a type of steel that is often used in armoured vehicles, military equipment, and structural components that require a high level of resistance to penetration. Because of its high strength and hardness, cutting armour steel presents various obstacles. To overcome these challenges in armour steel cutting, innovative cutting methods, specialized equipment, and careful process planning are required. The current work discusses an experimental investigation that focuses on input process parameters of wire electrical discharge machining and on multi-objective optimization to obtain the best cutting rate (CR), surface roughness (SR), wire electrode temperature (TE) and material removal rate (MRR) for armour steel. The fractional factorial approach has been used in the investigation, with pulse off time (B), pulse on time (A), spark voltage (D), peak current (C), and wire feed (E) as machining parameters and workpiece thickness (F) as a material parameter. The main and interaction impacts of the input parameters on the response parameters have been examined using the main effect plot, interaction plot, and ANOVA, followed by the development of regression modeling. The research revealed that pulse on time and workpiece thickness have the most significant contributions to CR and SR, with 55.25% and 21.77% contributions for CR and 67.87% and 8.72% contributions for SR, respectively. Pulse off time and spark voltage are the major contributors for TE, with 33.58% and 26.30% respectively, and pulse on time is the major contributor for MRR with 70.04%. The ideal input parameters for CR [0.71 (mm/min)], SR [2.46 (microns)], TE [52 (°C)] and MRR [23.85 (mm³/min)] have been found to be A2B1C2D1E1F1, A1B2C2D2E1F2, A1B2C1D1E1F2 and A1B2C1D1E2F1, respectively. The modified crow search algorithm (MCSA) has been used in this study for single and multi-objective optimization, and its results are contrasted with other methods such as Rao-1 and the Shuffled Frog Leaping Algorithm (SFLA). The present investigation demonstrates that the MCSA technique exceeds the Rao-1 and SFLA techniques in terms of producing globally optimal outcomes for the specific problem under examination. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
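The optimizer family used above can be sketched compactly. Below is a minimal, unmodified crow search algorithm run on a toy sphere function; the authors' specific MCSA modifications and the WEDM objective are not reproduced, and all parameter values are illustrative:

```python
# Minimal crow search algorithm (CSA) on a toy objective. Each crow keeps a
# memory of its best-found position; it either chases a random crow's memory
# or, if that crow is "aware" of being followed, relocates randomly.
import random

def csa_minimize(obj, dim=2, n_crows=10, iters=200,
                 flight_len=2.0, awareness=0.1, lo=-5.0, hi=5.0, seed=1):
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_crows)]
    mem = [p[:] for p in pos]                      # each crow's best-known spot
    for _ in range(iters):
        for i in range(n_crows):
            j = rng.randrange(n_crows)             # crow i picks crow j to follow
            if rng.random() > awareness:           # j unaware: move toward j's memory
                new = [pos[i][k] + rng.random() * flight_len *
                       (mem[j][k] - pos[i][k]) for k in range(dim)]
            else:                                  # j aware: random relocation
                new = [rng.uniform(lo, hi) for _ in range(dim)]
            if all(lo <= v <= hi for v in new):    # accept only feasible moves
                pos[i] = new
                if obj(new) < obj(mem[i]):         # update memory if improved
                    mem[i] = new[:]
    return min(mem, key=obj)

sphere = lambda v: sum(x * x for x in v)           # toy objective, optimum at 0
best = csa_minimize(sphere)
print(best)
```

Multi-objective use, as in the paper, additionally requires scalarizing or Pareto-ranking the CR, SR, TE, and MRR responses.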
50. A kinetic study and thermometric analysis on waste cooking oil.
- Author
-
Kumar, Vishal, Das, Debasish, and Mahto, Vijay Kumar
- Abstract
The objective of this research work was to perform a kinetic study and thermometric analysis on waste cooking oil, and the outcomes are reported here. Biodiesel was obtained through base-catalyzed transesterification carried out at 50 and 60 °C, with a methanol/oil molar ratio of 9:1 and a potassium hydroxide (KOH) catalyst concentration of 0.45% w/w, at different stirring speeds and reaction times. The experimental data from the transesterification reaction were fitted to first- and second-order kinetic models, and the coefficient of determination (R²) showed that the first-order model was a better fit for the experimental kinetic behavior. The Arrhenius equation was applied to calculate the activation energy and the frequency factor of the system, yielding 45.76 kJ/mol and 4.95 × 10⁵ min⁻¹, respectively. Thermodynamic parameters (Gibbs free energy, enthalpy, and entropy) were also evaluated at different temperatures. Results show that biodiesel obtained from waste cooking oil performed well and can be used as an excellent substitute for fossil fuels. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
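The Arrhenius step above has a simple two-temperature form: from k = A·exp(-Ea/(R·T)), rate constants at two temperatures give Ea = R·ln(k2/k1) / (1/T1 − 1/T2), and A follows by back-substitution. The rate constants below are hypothetical, not the paper's fitted values:

```python
# Two-temperature Arrhenius estimate of activation energy Ea and
# frequency factor A from first-order rate constants. k1 and k2 below
# are illustrative, not the paper's fitted values.
import math

R = 8.314  # gas constant, J/(mol*K)

def activation_energy(k1, T1, k2, T2):
    """Ea in J/mol from rate constants at two absolute temperatures."""
    return R * math.log(k2 / k1) / (1.0 / T1 - 1.0 / T2)

def frequency_factor(k, T, Ea):
    """Pre-exponential factor A = k * exp(Ea / (R*T))."""
    return k * math.exp(Ea / (R * T))

T1, T2 = 323.15, 333.15          # 50 and 60 degC in kelvin
k1, k2 = 0.012, 0.020            # hypothetical first-order k (1/min)
Ea = activation_energy(k1, T1, k2, T2)
print(round(Ea / 1000, 1), "kJ/mol")
```

With more than two temperatures, the same quantities come from a linear fit of ln k against 1/T (slope −Ea/R, intercept ln A), which is how fitted kinetic data are usually handled.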