Search Results (42 results)
2. The State of Play of Reproducibility in Statistics: An Empirical Analysis.
- Author: Xiong, Xin and Cribben, Ivor
- Subjects: FUNCTIONAL magnetic resonance imaging; SCIENTIFIC method; COMPUTER programming; REPRODUCIBLE research; STATISTICS
- Abstract
Reproducibility, the ability to reproduce the results of published papers or studies using their computer code and data, is a cornerstone of reliable scientific methodology. Studies whose results cannot be reproduced by the scientific community should be treated with caution. Over the past decade, the importance of reproducible research has been frequently stressed in a wide range of scientific journals such as Nature and Science and international magazines such as The Economist. However, multiple studies have demonstrated that scientific results are often not reproducible across research areas such as psychology and medicine. Statistics, the science concerned with developing and studying methods for collecting, analyzing, interpreting and presenting empirical data, prides itself on its openness when it comes to sharing both computer code and data. In this article, we examine reproducibility in the field of statistics by attempting to reproduce the results in 93 papers using functional magnetic resonance imaging (fMRI) data published in prominent journals during the 2010–2021 period. Overall, from both the computer code and the data perspective, among all 93 examined papers we could reproduce the results in only 14 (15.1%): that is, the papers provided both executable computer code (or software) and the real fMRI data, and our results matched those in the paper. Finally, we conclude with some author-specific and journal-specific recommendations to improve research reproducibility in statistics. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
3. Five Hands-on Experiments for a Design of Experiments Course.
- Author: Woodard, Victoria
- Subjects: EXPERIMENTAL design; DATA scrubbing; STATISTICAL learning; EDUCATIONAL outcomes; STATISTICS; ACQUISITION of data
- Abstract
In many collegiate-level statistics courses, the learning outcomes focus on the analysis of data after it has been collected. Students are provided with clean data sets from previous studies to practice statistical analysis, but gain little to no appreciation of the amount of time and effort that goes into collecting good data. To address these deficits at the author's institution, a design of experiments course was created that provided students with a more hands-on learning experience of the statistical process, especially as it pertains to data collection. This paper focuses on five of the experiments that students designed and implemented during the course, and offers some suggestions to instructors who may wish to use these experiments in their own courses. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
4. A semi-analytical solution to the maximum-likelihood fit of Poisson data to a linear model using the Cash statistic.
- Author: Bonamente, Massimiliano and Spence, David
- Subjects: DATA modeling; INDEPENDENT variables; ANALYTICAL solutions; PARAMETER estimation
- Abstract
The Cash statistic, also known as the C statistic, is commonly used for the analysis of low-count Poisson data, including data with null counts for certain values of the independent variable. The use of this statistic is especially attractive for low-count data that cannot be combined, or re-binned, without loss of resolution. This paper presents a new maximum-likelihood solution for the best-fit parameters of a linear model using the Poisson-based Cash statistic. The solution presented in this paper provides a new and simple method to measure the best-fit parameters of a linear model for any Poisson-based data, including data with null counts. In particular, the method enforces the requirement that the best-fit linear model be non-negative throughout the support of the independent variable. The method is summarized in a simple algorithm to fit Poisson counting data of any size and counting rate with a linear model, bypassing entirely the use of the traditional χ² statistic. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
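The Cash-statistic fit summarized in entry 4 above can be sketched as follows. This is only an illustration, not the paper's method: it minimizes C = 2 Σ (m_i − n_i + n_i ln(n_i/m_i)) numerically over a linear model m(x) = a + bx, whereas the paper derives a semi-analytical maximum-likelihood solution; the simulated data and all names here are our own.

```python
import numpy as np
from scipy.optimize import minimize

def cash_stat(params, x, n):
    """Cash statistic C = 2*sum(m - n + n*ln(n/m)); the n*ln(n) term is 0 when n == 0."""
    a, b = params
    m = a + b * x                      # linear model prediction
    if np.any(m <= 0):                 # model must stay positive over the support
        return np.inf
    with np.errstate(divide="ignore", invalid="ignore"):
        term = np.where(n > 0, n * np.log(n / m), 0.0)
    return 2.0 * np.sum(m - n + term)

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 50)
n = rng.poisson(2.0 + 0.5 * x)        # low-count Poisson data, may include zeros

res = minimize(cash_stat, x0=[1.0, 0.1], args=(x, n), method="Nelder-Mead")
a_hat, b_hat = res.x                  # best-fit intercept and slope
```

Note that, unlike χ², the Cash statistic needs no re-binning: zero-count bins contribute only their model term.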
5. Data mining and statistics issues of precision and intelligent agriculture based on big data analysis.
- Author: Rao, Zhiwei and Yuan, Jie
- Subjects: DATA mining; STATISTICAL accuracy; BIG data; PRECISION farming; DATABASES; STATISTICS
- Abstract
To improve the effectiveness of agricultural data mining and statistics in the Internet era, this paper combines data mining technology with modern agricultural information to address the poor timeliness and incompleteness of collected and sorted agricultural information. Moreover, this paper innovates in time series representation and time series measurement, and uses custom methods to improve agricultural data mining and statistical algorithms. In addition, this paper identifies reasonable mining methods for acquiring knowledge through data analysis. Finally, this paper combines these elements with practical needs to construct a data mining and statistical analysis model for precision and intelligent agriculture based on big data analysis, and designs experiments to verify the model's performance. The research results show that the model constructed in this paper is effective. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
6. Regression models using the LINEX loss to predict lower bounds for the number of points for approximating planar contour shapes.
- Author: Jayasinghe, J. M. Thilini, Ellingson, Leif, and Prematilake, Chalani
- Subjects: REGRESSION analysis; INDEPENDENT variables; LEAST squares; APPROXIMATION error; STATISTICS; SAMPLING errors
- Abstract
Researchers in statistical shape analysis often analyze outlines of objects. Even though these contours are infinite-dimensional in theory, they must be discretized in practice. When discretizing, it is important to reduce the number of sampling points considerably to lower computational costs, but not so far that the approximation error becomes too large. Unfortunately, determining the minimum number of points needed to sufficiently approximate the contours is computationally expensive. In this paper, we fit regression models to predict these lower bounds using characteristics of the contours that are computationally cheap as predictor variables. However, least squares regression is inadequate for this task because it treats overestimation and underestimation equally, but underestimation of lower bounds is far more serious. Instead, to fit the models, we use the LINEX loss function, which allows us to penalize underestimation at an exponential rate while penalizing overestimation only linearly. We present a novel approach to select the shape parameter of the loss function and tools for analyzing how well the model fits the data. Through validation methods, we show that the LINEX models work well for reducing the underestimation of the lower bounds. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
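The LINEX loss described in entry 6 above can be illustrated with a minimal regression sketch, assuming the standard form L(r) = exp(a·r) − a·r − 1 on the residual r, with shape parameter a < 0 so that underestimation (negative residuals) is penalized exponentially and overestimation roughly linearly. The data, function names, and generic numerical minimizer are illustrative, not the paper's procedure.

```python
import numpy as np
from scipy.optimize import minimize

def linex(residual, a):
    """LINEX loss exp(a*r) - a*r - 1. With a < 0, negative residuals
    (underestimates) are penalized exponentially, positive ones ~linearly."""
    r = a * residual
    return np.exp(r) - r - 1.0

def linex_fit(X, y, a=-1.0):
    """Fit intercept + slope by minimizing total LINEX loss, starting from least squares."""
    X1 = np.column_stack([np.ones(len(X)), X])
    obj = lambda beta: np.sum(linex(X1 @ beta - y, a))
    beta0 = np.linalg.lstsq(X1, y, rcond=None)[0]
    return minimize(obj, beta0, method="Nelder-Mead").x

rng = np.random.default_rng(0)
X = rng.uniform(0, 5, 100)
y = 2.0 + 1.5 * X + rng.normal(0, 1.0, 100)

beta_ls = np.linalg.lstsq(np.column_stack([np.ones(100), X]), y, rcond=None)[0]
beta_lx = linex_fit(X, y, a=-1.0)
```

Relative to least squares, the asymmetric penalty shifts the fitted line upward, which is exactly the conservatism one wants when predicting a lower bound.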
7. Statistical analysis of infrared thermogram for CNN-based electrical equipment identification methods.
- Author: Sheng Han, Fan Yang, Hui Jiang, Gang Yang, Dawei Wang, and Na Zhang
- Subjects: DIGITAL images; TEMPERATURE measuring instruments; AUTOMATIC identification; CONVOLUTIONAL neural networks; STATISTICS
- Abstract
It is essential to develop infrared (IR) thermogram identification technologies to establish automatic diagnosis systems in power substations. Convolutional neural network (CNN) based methods show the highest accuracy in this field. The IR thermograms of electrical equipment are very different from general digital images, which means the present methods need further improvement. For data-driven CNN methods, it is necessary to study the characteristics of the IR data. This paper collected 11,817 thermograms from substations and structured the dataset according to equipment types. The statistical features of mean, variance, skewness, kurtosis and contrast are analyzed and compared with five other image datasets. Several tricks are revealed by the analysis and tested on CNN models. Firstly, greyscaling the Iron pseudo-color images extracts the temperature information and makes it possible to design models with fewer channels; tests show this could reduce computational costs by over 35%. Secondly, the sparse color and edge information of thermograms makes it necessary to keep the original aspect ratio; the image preprocessing method of cropping shows better performance than padding and rescaling. Thirdly, 0-1 normalization can speed up the training process by about 100 epochs, which is related to the particular background of thermograms. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
8. Machine Learning Based Method for Deciding Internal Value of Talent.
- Author: Loyarte-López, Edurne and García-Olaizola, Igor
- Subjects: JOB evaluation; ARTIFICIAL intelligence; EMPLOYEE motivation; STATISTICS; MACHINE learning; DECISION making
- Abstract
This paper presents a machine-learning-based method for evaluating the internal value of talent in any organization and for evaluating salary criteria. The study assumes the design and development of a salary predictor, based on artificial intelligence technologies, to help determine the internal value of employees and guarantee internal equity in the organization. The aim of the study is to achieve internal equity, which is a critical element that directly affects employees' motivation. We implemented and validated the method with 130 employees and more than 70 talent acquisition cases at a Basque technology research organization during the years 2021 and 2022. The proposed method is based on statistical data assessment and machine-learning-based regression. We found that while most organizations have established variables for job evaluation as well as salary increments for staff according to their contribution to the organization, only a few employ tools to support equitable internal compensation. This study presents a successful real case of artificial intelligence application in which machine learning techniques help managers make the most equitable and least biased salary decisions possible, based on data. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
9. Impact of information and Lévy noise on stochastic COVID-19 epidemic model under real statistical data.
- Author: Liu, Peijiang, Huang, Lifang, Din, Anwarud, and Huang, Xiangxiang
- Subjects: COVID-19 pandemic; STATISTICS; COMPUTER simulation; COVID-19; EPIDEMICS
- Abstract
In this paper, we consider the dynamical behaviour of a stochastic coronavirus (COVID-19) susceptible-infected-removed epidemic model with the inclusion of the influence of information intervention and Lévy noise. The existence and uniqueness of the model positive solution are proved. Then, we establish a stochastic threshold as a sufficient condition for the extinction and persistence in mean of the disease. Based on the available COVID-19 data, the parameters of the model were estimated and we fit the model with real statistics. Finally, numerical simulations are presented to support our theoretical results. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
10. Statistical analysis of material properties and recommended values for the assessment of RC structures in Australia.
- Author: Menegon, Scott J., Tsang, Hing-Ho, Wilson, John L., and Lam, Nelson T. K.
- Subjects: MECHANICAL properties of condensed matter; VALUATION of real property; STATISTICS; NONLINEAR analysis; CONCRETE testing
- Abstract
This paper presents a statistical analysis of the actual mean material properties for typical grades of concrete and reinforcement available in Australia. The analysis was performed using a database of 3,447 concrete cylinder test results and 15,201 reinforcement tensile test results. The test results were taken over a period from 2009 to 2021. The paper provides a summary of the mean values and respective coefficient of variation values for the different grades of concrete and reinforcement that make up the database. Distinctive manufacturing trends and variability between suppliers were observed for reinforcement samples and appropriate recommendations have been proposed. Researchers or designers can adopt these values to undertake probabilistic assessments of RC structures. The paper also provides recommendations for mean material properties for the purpose of undertaking non-linear analysis of RC structures. The database of test results also includes early age strength data for concrete, which have been used to provide recommendations for predicting the early age strength of various standard grades of concrete available in Australia. The paper also presents an assessment of the theoretical characteristic values from the database for the various grades of reinforcement and concrete and how they compare to respective specified code requirements. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
11. The CUSUM statistics of change-point models based on dependent sequences.
- Author: Ding, Saisai, Fang, Hongyan, Dong, Xiang, and Yang, Wenzhi
- Subjects: CHANGE-point problems; CORPORATE finance; STATISTICS
- Abstract
In this paper, we investigate mean change-point models based on associated sequences. Under some weak conditions, we obtain a limit distribution of the CUSUM statistic, which can be used to judge whether the mean change amount δ_n satisfies n^(1/2) δ_n = o(1). We also study the consistency of sample covariances and change-point location statistics. Based on normal and lognormal data, simulations of empirical sizes, empirical powers and convergence are presented to test our results. As an important application, we use CUSUM statistics to perform mean change-point analysis on a financial series. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
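A minimal sketch of the CUSUM change-point statistic discussed in entry 11 above, assuming the classical form max_k |S_k − (k/n)S_n| / (σ̂√n) for independent data; the paper's setting of associated (dependent) sequences would replace the naive σ̂ with a long-run variance estimate. Data and names are illustrative.

```python
import numpy as np

def cusum_change_point(x):
    """CUSUM statistic max_k |S_k - (k/n)*S_n| / (sigma*sqrt(n)) and the
    argmax as the estimated change-point location."""
    n = len(x)
    k = np.arange(1, n + 1)
    S = np.cumsum(x)
    dev = np.abs(S - k / n * S[-1])
    sigma = np.std(x, ddof=1)   # crude scale; dependent data needs a long-run variance
    stat = dev.max() / (sigma * np.sqrt(n))
    khat = int(dev.argmax()) + 1
    return stat, khat

rng = np.random.default_rng(42)
# mean shifts from 0 to 1.5 after observation 100
x = np.concatenate([rng.normal(0.0, 1.0, 100), rng.normal(1.5, 1.0, 100)])
stat, khat = cusum_change_point(x)
```

A large value of `stat` signals a mean change, and `khat` estimates where it occurred.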
12. Distribution of the C statistic with applications to the sample mean of Poisson data.
- Author: Bonamente, Massimiliano
- Subjects: ASTRONOMICAL observations; PARAMETER estimation; NULL hypothesis; DEGREES of freedom
- Abstract
The C statistic, also known as the Cash statistic, is often used in astronomy for the analysis of low-count Poisson data. The main advantage of this statistic, compared to the more commonly used χ² statistic, is its applicability without the need to combine data points. This feature has made the C statistic a very useful method for analyzing Poisson data that have small (or even null) counts in each resolution element. One of the challenges of the C statistic is that its probability distribution, under the null hypothesis that the data follow a parent model, is not known exactly. This paper presents an effort towards improving our understanding of the C statistic by studying (a) the distribution of the C statistic for a fully specified model, (b) the distribution of Cmin resulting from a maximum-likelihood fit to a simple one-parameter constant model, i.e., a model that represents the sample mean of N Poisson measurements, and (c) the distribution of the associated ΔC statistic that is used for parameter estimation. The results confirm the expectation that, in the high-count limit, both the C statistic and Cmin have the same mean and variance as a χ² statistic with the same number of degrees of freedom. It is also found that, in the low-count regime, the expectation of the C statistic and Cmin can be substantially lower than for a χ² distribution. The paper makes use of recent X-ray observations of the astronomical source PG 1116+215 to illustrate the application of the C statistic to Poisson data. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
13. Interviews of Notable Statistics and Data Science Educators.
- Author: Horton, Nicholas
- Subjects: DATA science; EDUCATORS; STATISTICS; SCIENCE education; EUGENICS
- Abstract
The Journal of Statistics and Data Science Education regularly publishes interviews with notable statistics and data science educators. These interviews aim to provide insight into the development of statistics and data science education, offer a historical perspective, and humanize the field through personal anecdotes. Allan Rossman has been conducting these interviews since 2011, profiling a diverse range of individuals who have made significant contributions to the field. Taylor & Francis has made the complete set of interviews easily accessible through centralized lists. The current issue of the journal features papers on various topics, including the role of eugenics in the development of statistics and strategies for addressing this history in the classroom. [Extracted from the article]
- Published
- 2024
- Full Text
- View/download PDF
14. Disaggregating population data for assessing progress of SDGs: methods and applications.
- Author: Qiu, Yue, Zhao, Xuesheng, Fan, Deqin, Li, Songnian, and Zhao, Yijing
- Subjects: POPULATION statistics; STATISTICS; DATA distribution; SUSTAINABLE development; DATA analysis; FISH populations
- Abstract
Rapid population growth has had a significant impact on society, the economy and the environment, which will challenge the achievement of the United Nations Sustainable Development Goals (SDGs). Spatially accurate and detailed population distribution data are essential for measuring the impact of population growth and tracking progress on the SDGs. However, most population data are evenly distributed within administrative units and thus seriously lack spatial detail. There are scale differences between population statistical data and geospatial data, which makes data analysis and related research difficult. The disaggregation method is an effective way to obtain the spatial distribution of the population with greater granularity. It can also transform statistical population data from irregular administrative units into regular grids to characterize the spatial distribution of the population while preserving the original population count. This paper summarizes research advances in population disaggregation in terms of methodology, ancillary data, and products, and discusses the role of spatial disaggregation of population statistical data in monitoring and evaluating SDG indicators. Furthermore, future work is proposed from two perspectives: challenges with spatial disaggregation and disaggregated population as an Essential SDG Variable (ESDGV). [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
15. How microbreweries flooded Europe: mapping a new phenomenon in the beer industry.
- Author: Materna, Kryštof, Bernhäuserová, Veronika, Hasman, Jiří, and Hána, David
- Subjects: BEER industry; MICROBREWERIES; BREWERIES; STATISTICS; BEER
- Abstract
Europe has experienced a major boom of new breweries over the last thirty years, with thousands of new breweries being set up, even in regions where brewing has no history. So far, however, this microbrewing wave has not been systematically mapped. This paper presents a unique database of European breweries from 1990–2020. Using a series of maps and statistical analyses, it shows how breweries have gradually spread across Europe. Initially, microbreweries were being established in countries that are in a declining stage of the beer life-cycle from industrial breweries. After 2005 (and particularly in the 2010s), breweries reached other regions through neighbouring and hierarchical spatial diffusion. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
16. Linear censored regression models with skew scale mixtures of normal distributions.
- Author: Guzmán, Daniel C. F., Ferreira, Clécio S., and Zeller, Camila B.
- Subjects: GAUSSIAN distribution; REGRESSION analysis; MODELS & modelmaking; SKEWNESS (Probability theory); STATISTICS; ALGORITHMS
- Abstract
A special source of difficulty in statistical analysis is the possibility that some subjects may not have a complete observation of the response variable. Such incomplete observation of the response variable is called censoring. Censoring can occur for a variety of reasons, including limitations of measurement equipment, design of the experiment, and non-occurrence of the event of interest until the end of the study. In the presence of censoring, the dependence of the response variable on the explanatory variables can be explored through regression analysis. In this paper, we examine the censoring problem in the context of a class of asymmetric distributions; that is, we propose a linear regression model with censored responses based on skew scale mixtures of normal distributions. We develop a Monte Carlo EM (MCEM) algorithm to perform maximum likelihood inference of the parameters in the proposed linear censored regression models with skew scale mixtures of normal distributions. The MCEM algorithm is discussed with an emphasis on the skew-normal, skew Student-t-normal, skew-slash and skew-contaminated normal distributions. To examine the performance of the proposed method, we present some simulation studies and analyze a real dataset. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
17. Computing in the Statistics Curricula: A 10-Year Retrospective.
- Author: Hardin, Johanna, Horton, Nicholas J., Nolan, Deborah, and Lang, Duncan Temple
- Subjects: DATA science; SPECIAL education; SCIENCE education; STATISTICS; CURRICULUM
- Abstract
The Journal of Statistics and Data Science Education special issue on "Computing in the Statistics and Data Science Curriculum" features a set of papers that provide a mosaic of curricular innovations and approaches that embrace computing. As we reviewed the papers, we felt that this collection would benefit from the perspective of the authors of the landmark "Computing in the Statistics Curricula" (TAS 2010) paper. We asked Deb and Duncan to take this opportunity to reflect on the landscape when they wrote the paper, to comment on the current situation, and to speculate on the future. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
18. Statistical Literacy--Misuse of Statistics and Its Consequences.
- Author: Johannssen, Arne, Chukhrova, Nataliya, Schmal, Friederike, and Stabenow, Kevin
- Subjects: STATISTICAL literacy; EDUCATION statistics; SCIENTIFIC communication; PEER review of students; STATISTICS
- Abstract
Although statistical literacy has become a key competence in today's data-driven society, it is usually not a part of statistics education. To address this issue, we propose an innovative concept for a conference-like seminar on the topic of statistical literacy. This seminar draws attention to the relevance and importance of statistical literacy, and moreover, students are made aware of the process of science communication and are introduced to the peer review process for the assessment of scientific papers. In the summer term 2020, the seminar was conducted as a joint project by the University of Hamburg, the University of Muenster, and the Joachim Herz Foundation. In this article, we present the concept of the seminar and our experience with this concept in the summer term 2020. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
19. Asymptotic normality of the test statistics for the unified relative dispersion and relative variation indexes.
- Author: Touré, Aboubacar Y., Dossou-Gbété, Simplice, and Kokonendji, Célestin C.
- Subjects: ASYMPTOTIC distribution; BINOMIAL distribution; ASYMPTOTIC normality; POISSON distribution; STATISTICS; EXPONENTIAL families (Statistics); DISPERSION (Chemistry)
- Abstract
Dispersion indexes with respect to the Poisson and binomial distributions are widely used to assess the conformity of the underlying distribution of an observed sample of counts with one or the other of these theoretical distributions. Recently, the exponential variation index has been proposed as an extension to nonnegative continuous data. This paper studies a unified definition of these indexes with respect to the relative variability of a nonnegative natural exponential family of distributions through its variance function. We establish the strong consistency of the plug-in estimators of the indexes as well as their asymptotic normality. Since the exact distributions of the estimators are not available in closed form, we consider tests of hypotheses relying on these estimators as test statistics with their asymptotic distributions. Simulation studies globally suggest good behaviour of these hypothesis-testing procedures. Worked examples are analysed, including lesser-known cases such as the negative binomial and inverse Gaussian, and improving on the very common case of the Poisson dispersion index. Concluding remarks are made with suggestions of possible extensions. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
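The Poisson dispersion index from entry 19 above can be sketched with a simple asymptotic test, assuming the standard normal approximation sqrt((n−1)/2)·(DI − 1) ≈ N(0, 1) under equidispersion; the paper's unified exponential-family index is not reproduced here, and the data and names are illustrative.

```python
import numpy as np
from scipy import stats

def poisson_dispersion_test(x):
    """Dispersion index DI = s^2 / xbar and an asymptotic z-test of
    H0: DI = 1 (equidispersion), using sqrt((n-1)/2)*(DI - 1) ~ N(0,1)."""
    n = len(x)
    di = np.var(x, ddof=1) / np.mean(x)
    z = np.sqrt((n - 1) / 2.0) * (di - 1.0)
    pval = 2.0 * stats.norm.sf(abs(z))
    return di, z, pval

rng = np.random.default_rng(7)
eq = rng.poisson(3.0, 500)                       # equidispersed: DI near 1
over = rng.negative_binomial(5, 5 / 8, 500)      # NB with mean 3, variance 4.8: DI near 1.6

di_eq, z_eq, p_eq = poisson_dispersion_test(eq)
di_ov, z_ov, p_ov = poisson_dispersion_test(over)
```

The negative binomial sample, one of the abstract's lesser-known cases, is flagged as overdispersed relative to the Poisson, while the Poisson sample is not.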
20. Computing the exact distribution of the Bartlett's test statistic by numerical inversion of its characteristic function.
- Author: Witkovský, Viktor
- Subjects: CHARACTERISTIC functions; DISTRIBUTION (Probability theory); MATHEMATICAL statistics; STATISTICAL software; INTEGRATED software; FISHER exact test; STATISTICS
- Abstract
Application of the exact statistical inference frequently leads to non-standard probability distributions of the considered estimators or test statistics. The exact distributions of many estimators and test statistics can be specified by their characteristic functions, as is the case for the null distribution of the Bartlett's test statistic. However, analytical inversion of the characteristic function, if possible, frequently leads to complicated expressions for computing the distribution function and the corresponding quantiles. An efficient alternative is the well-known method based on numerical inversion of the characteristic functions, which is, however, ignored in popular statistical software packages. In this paper, we present the explicit characteristic function of the corrected Bartlett's test statistic together with the computationally fast and efficient implementation of the approach based on numerical inversion of this characteristic function, suggested for evaluating the exact null distribution used for testing homogeneity of variances in several normal populations, with possibly unequal sample sizes. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
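The numerical characteristic-function inversion underlying entry 20 above can be demonstrated with the Gil-Pelaez formula F(x) = 1/2 − (1/π) ∫₀^∞ Im[e^{−itx} φ(t)]/t dt. As a stand-in for the more involved characteristic function of the corrected Bartlett statistic, this sketch inverts the χ² characteristic function, for which the exact CDF is available as a check; the inversion routine itself is generic, and the truncation limits are our own choices.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

def cdf_from_cf(cf, x):
    """Gil-Pelaez inversion: F(x) = 1/2 - (1/pi) * int_0^inf Im(e^{-itx} cf(t)) / t dt.
    The integral is truncated; the tail is negligible for fast-decaying CFs."""
    integrand = lambda t: np.imag(np.exp(-1j * t * x) * cf(t)) / t
    val, _ = quad(integrand, 1e-10, 50.0, limit=200)
    return 0.5 - val / np.pi

k = 4
chi2_cf = lambda t: (1 - 2j * t) ** (-k / 2)   # characteristic function of chi2 with k df

x = 5.0
approx = cdf_from_cf(chi2_cf, x)               # numerically inverted CDF value
exact = stats.chi2.cdf(x, k)                   # closed-form reference
```

Plugging the explicit CF of the corrected Bartlett statistic into the same routine would yield its exact null distribution, as the paper advocates.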
21. Assigning scores for ordered categorical responses.
- Author: Fernández, Daniel, Liu, Ivy, Costilla, Roy, and Gu, Peter Yongqi
- Subjects: STATISTICS; TALLIES
- Abstract
Deciding on the best statistical method to apply when the response variable is ordinal is essential because the way the categories are ordered in the data is relevant as it could change the results of the analysis. Although the models for continuous variables have similarities to those for ordinal variables, this paper presents the advantages of the use of the ordering information on the outcomes with methods developed for modeling ordinal data such as the ordered stereotype model. The novelty of this article lies in showing the dangers of assigning equally spaced scores to ordered response categories in statistical analysis, which are illustrated with a simulation study and a case study. We propose a new way to use the score parameters, which incorporates the fitted spacing dictated by the data. Additionally, this article uses score parameter estimates in the ordered stereotype model to propose a new measure to calculate continuous medians in the raw data: the adjusted c-median. It benefits the general audience who can easily understand the median as a summary statistic. Supplementary materials for this article are available online. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
22. A new two-parameter exponentiated discrete Lindley distribution: properties, estimation and applications.
- Author: El-Morshedy, M., Eliwa, M. S., and Nagy, H.
- Subjects: WEIBULL distribution; DISTRIBUTION (Probability theory); STATISTICAL models; ORDER statistics; KURTOSIS; STATISTICS; DISCRETE systems
- Abstract
This paper introduces a new two-parameter exponentiated discrete Lindley distribution. A wide range of its structural properties are investigated, including the shape of the probability mass function, hazard rate function, moments, skewness, kurtosis, stress–strength reliability, mean residual lifetime, mean past lifetime, order statistics and L-moment statistics. The hazard rate function can be increasing, decreasing, decreasing–increasing–decreasing, increasing–decreasing–increasing, unimodal, bathtub, or J-shaped depending on its parameter values. Two methods are used herein to estimate the model parameters, namely maximum likelihood and the proportion method. A detailed simulation study is carried out to examine the bias and mean square error of the maximum likelihood and proportion estimators. The flexibility of the proposed model is demonstrated using four distinctive data sets. It can serve as an alternative to other lifetime distributions in the existing statistical literature for modeling positive real data in many areas. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
23. Discussion of LESA: Longitudinal Elastic Shape Analysis of Brain Subcortical Structures.
- Author: Aston, John A. D. and Lila, Eardi
- Subjects: ELASTIC analysis (Engineering); BRAIN anatomy; STATISTICS; PHYSICAL sciences; RANDOM effects model
- Abstract
We congratulate the authors on an interesting paper which combines the ideas of shape analysis with longitudinal data analysis. This, of course, loses some of the elegance of the SRNF framework, but at a considerable ease of preserving the topology of the data and avoiding Euclidean approximations. It should, however, be noted that such approximations have ultimately allowed the authors to conduct a comprehensive analysis of thousands of images from multiple studies, which further substantiates their findings. [Extracted from the article]
- Published
- 2023
- Full Text
- View/download PDF
24. Dead or alive? Pitfall of survival analysis with TCGA datasets.
- Author: Idogawa, Masashi, Koizumi, Masayo, Hirano, Tomomi, Tange, Shoichiro, Nakase, Hiroshi, and Tokino, Takashi
- Subjects: SURVIVAL analysis (Biometry); STATISTICS
- Abstract
We often encounter situations in which TCGA data analyzed in papers we read or reviewed cannot be reproduced, even when the same TCGA datasets are used, especially in survival analyses. We therefore attempted to confirm the data sources for TCGA survival analyses and found that several websites used to analyze the survival data of TCGA datasets handle the survival data inappropriately, causing differences in statistical analyses. This leads to misinterpretation of results, because the survival-analysis figures in several papers are sometimes exactly as generated by these sites, and the results depend only on the tools these sites provide. We would like to make this situation widely known and raise the problem for the sake of scientific soundness. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
25. Statistical analysis of trace metals content of cocaine using inductively coupled plasma-mass spectrometry calibrations.
- Author
-
Bentil, Edward, Asiedu, Nana, Ataki, James, and Wong, Bryan M.
- Subjects
- *
TRACE analysis , *METAL analysis , *COCAINE , *TRACE metals , *STATISTICS , *INDUCTIVELY coupled plasma mass spectrometry , *HEAVY metals , *SPECTROMETRY - Abstract
The trafficking of cocaine has become a global challenge, and Ghana is no exception. Cocaine is a whitish powder produced by both natural and synthetic means. This paper studied the metal content of cocaine seized in Ghana and used the data for batch identification. Ten metals, namely Pb, Cu, Mg, Mn, Cr, As, Ni, Fe, Co and Ca, were analyzed in 37 samples collected from 2010 to 2014. The metals were analyzed using ICP-MS and the data were examined with statistical tools. The results showed that calcium recorded the highest amount in all the samples, with a mean of 64.94 mg/kg, followed by magnesium, zinc and iron with mean values of 24.35 mg/kg, 6.25 mg/kg and 2.65 mg/kg, respectively. In the within-seizure classification under class A, all sample pairs showed significant differences at a confidence level of 95%. Of the three sample pairs under class B in the within-seizure classification, one pair (103A and 105B) showed no significant differences even though the samples came from two different packages in the same seizure. Five samples from five different seizures also showed significant differences among them, indicating that they came from different batches or origins. It is confirmed that the seized cocaine contained poisonous heavy metals such as lead, arsenic and chromium in amounts that could affect the user. Based on the data gathered from the within-seizure class A group, it is proposed that missing cocaine could be identified by its metal content. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
26. Randomised trials in education in the USA.
- Author
-
Hedges, Larry V. and Schauer, Jacob
- Subjects
- *
RANDOMIZED controlled trials , *UNITED States education system , *PSYCHOLOGISTS , *EDUCATION research , *SOCIAL services - Abstract
Background and purpose: Studies of education and learning that were described as experiments have been carried out in the USA by educational psychologists since about 1900. In this paper, we discuss the history of randomised trials in education in the USA in terms of five historical periods. In each period, the use of randomised trials was motivated by the research interests and conditions of the era. We have characterised these periods in terms of decades with sharp boundaries as a convenience. Sources of evidence and main arguments: Although some of the early studies used random allocation (and even random allocation of clusters such as schools), early researchers did not clearly understand the role of randomisation or clearly distinguish it from methods such as alternation. In 1940, E. F. Lindquist published an important book whose goal was to translate R. A. Fisher’s ideas into language congenial to education researchers, but this had little impact on education research outside of psychology. There was a substantial increase in the number of randomised trials during the period from 1960 to 1980, as the US government enacted and evaluated a variety of social programmes. This was followed by a dramatic decrease during the period from 1980 to 2000, amid debates about the relevance of randomised trials in education research. The creation of the US Institute of Education Sciences in 2002 provided major financial and administrative support for randomised trials, which has led to a large number of trials being conducted since that time. Conclusions: These developments suggest that there is a promising future for randomised trials in the USA. American education scientists must remain committed to explaining why evidence from randomised field trials has an indispensable role to play in making wise decisions about education policy and advancing our capacity to improve education for a productive workforce and a successful society. [ABSTRACT FROM AUTHOR]
- Published
- 2018
- Full Text
- View/download PDF
27. Rejoinder.
- Author
-
Zhou, Yang, Xue, Lirong, Shi, Zhengyu, Wu, Libo, and Fan, Jianqing
- Subjects
- *
STATISTICS , *BIG data - Abstract
We gratefully thank Professors Wei Tu, Bei Jiang, Linglong Kong, and Professor Sudipto Banerjee for their insightful comments and discussion of our paper. We agree that converting data to similar resolutions would lose some statistical information from our data. Inferring the occupancy status of houses sharing one location might require more information at the individual level, which is the primary data barrier in social science. [Extracted from the article]
- Published
- 2022
- Full Text
- View/download PDF
28. Comment on: "Confidence Intervals for Nonparametric Empirical Bayes Analysis" by Ignatiadis and Wager.
- Author
-
Imbens, Guido
- Subjects
- *
EMPIRICAL Bayes methods , *VALUE-added assessment (Education) , *CLINICAL drug trials , *STATISTICS , *PROBABILITY theory - Abstract
I want to congratulate Nikolaos Ignatiadis and Stefan Wager on a very stimulating paper on a timely and important topic. In these cases, getting accurate confidence intervals is of first-order importance, and the methods Ignatiadis and Wager develop are likely to be useful. [Extracted from the article]
- Published
- 2022
- Full Text
- View/download PDF
29. An asymptotic distribution of compound Poisson distribution.
- Author
-
Shimizu, Eiji and Shiraishi, Hiroshi
- Subjects
- *
STATISTICS , *GAUSSIAN distribution , *DISTRIBUTION (Probability theory) , *POISSON distribution , *MATHEMATICAL inequalities - Abstract
Much of the statistical theory of asymptotic distributions is based on the normal distribution, or on parameters that are difficult to determine. In this paper we consider the compound Poisson distribution and introduce a theoretical approach that requires little of either, instead using an inequality for the best estimate, to obtain more convenient results. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
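The compound Poisson setup in the abstract above can be illustrated with a small simulation. This is a generic sketch with assumed parameters, not the paper's method: it checks Wald's identity, that S = X_1 + ... + X_N with N ~ Poisson(lam) has mean lam * E[X].

```python
import math
import random

random.seed(0)

def poisson(lam):
    """Knuth's multiplicative Poisson sampler (requires lam > 0)."""
    L = math.exp(-lam)
    k, p = 0, 1.0
    while p > L:
        k += 1
        p *= random.random()
    return k - 1

# Assumed parameters for illustration only
lam, mean_x = 3.0, 2.0     # N ~ Poisson(3); X exponential with E[X] = 2
n_sims = 20000

total = 0.0
for _ in range(n_sims):
    n = poisson(lam)
    total += sum(random.expovariate(1 / mean_x) for _ in range(n))

emp_mean = total / n_sims
print(emp_mean)            # close to lam * mean_x = 6 (Wald's identity)
```

With 20,000 replications the Monte Carlo standard error is well under 0.1 here, so the empirical mean lands close to the theoretical value of 6.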
30. Review of statistical actuarial risk modelling.
- Author
-
Shiraishi, Hiroshi and Lu, Zudi
- Subjects
- *
ACTUARIAL risk , *DIVIDENDS , *STATISTICS , *PROBABILITY theory , *MATHEMATICAL functions - Abstract
In this paper, we review some results for insurance risk theory. We first introduce a variety of the insurance risk models proposed thus far. Then, we show that the expected discounted penalty function (the so-called Gerber-Shiu function) can describe some risk indicators. Next, the dividend problem is discussed; more precisely, the (approximated) optimal dividend barrier is derived and other extended dividend strategies introduced. In addition, some modified models depending on reinsurance or tax are introduced. Finally, we discuss the statistical estimation of the ruin probability and the Gerber-Shiu function. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
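The ruin probability that the review above closes with can be sketched by simulation. This is a generic illustration of the classical Cramér-Lundberg model with made-up parameters, not code or results from the review: the surplus is U(t) = u + c*t minus accumulated claims, with Poisson arrivals and exponential claim sizes, and ruin can only occur at claim instants.

```python
import random

random.seed(1)

def ruin_prob(u, c, lam, mean_claim, horizon, n_sims=5000):
    """Monte Carlo finite-horizon ruin probability for the classical model."""
    ruins = 0
    for _ in range(n_sims):
        t, claims = 0.0, 0.0
        while True:
            t += random.expovariate(lam)       # next claim arrival time
            if t > horizon:
                break                          # survived the horizon
            claims += random.expovariate(1 / mean_claim)
            if u + c * t - claims < 0:         # check surplus at claim instant
                ruins += 1
                break
    return ruins / n_sims

# Made-up parameters: 20% safety loading, since c > lam * mean_claim
p_low  = ruin_prob(u=10, c=1.2, lam=1.0, mean_claim=1.0, horizon=100)
p_high = ruin_prob(u=30, c=1.2, lam=1.0, mean_claim=1.0, horizon=100)
print(p_low, p_high)   # more initial capital -> lower ruin probability
```

For exponential claims the infinite-horizon ruin probability has a known closed form, so a simulation like this is mainly useful as a sanity check and for models where no closed form exists.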
31. Statistical considerations for occupational and environmental physiology.
- Author
-
Curran-Everett, Douglas
- Subjects
- *
ECOPHYSIOLOGY , *STATISTICS , *ACQUISITION of data , *DATA analysis , *PHYSIOLOGISTS - Abstract
The article comments on the research paper "Basic statistical considerations for occupational and environmental physiology: the Temperature toolbox" by A. R. Caldwell and S. N. Cheuvront in the current issue of the journal. Topics include the importance of experimental design, data collection, data analysis, and dissemination of results, and the review's role as a guide for physiologists to avoid common pitfalls and mistakes as they plan and analyze their studies.
- Published
- 2019
- Full Text
- View/download PDF
32. Exploring models for the roles of health systems' responsiveness and social determinants in explaining universal health coverage and health outcomes.
- Author
-
Valentine, Nicole Britt and Bonsel, Gouke J.
- Subjects
- *
STATISTICAL correlation , *INSURANCE , *HEALTH insurance , *LONGITUDINAL method , *EVALUATION of medical care , *REGRESSION analysis , *STATISTICS , *SURVEYS , *EMPIRICAL research , *MULTIPLE regression analysis , *HEALTH & social status , *DESCRIPTIVE statistics , *HEALTH impact assessment - Abstract
Background: Intersectoral perspectives of health are present in the rhetoric of the sustainable development goals. Yet descriptions of systematic approaches for an intersectoral monitoring vision, joining determinants of health with barriers or facilitators to accessing healthcare services, are lacking. Objective: To explore models of associations between health outcomes and health service coverage, and health determinants and health systems responsiveness, and thereby to contribute to monitoring, analysis, and assessment approaches informed by an intersectoral vision of health. Design: The study is designed as a series of ecological, cross-country regression analyses, covering between 23 and 57 countries with dependent health variables concentrated on the years 2002-2003. Countries cover a range of development contexts. Health outcome and health service coverage dependent variables were derived from World Health Organization (WHO) information sources. Predictor variables representing determinants are derived from the WHO and World Bank databases; variables used for health systems' responsiveness are derived from the WHO World Health Survey. Responsiveness is a measure of acceptability of health services to the population, complementing financial health protection. Results: Health determinants' indicators -- access to improved drinking-water sources, accountability, and average years of schooling -- were statistically significant in particular health outcome regressions. Statistically significant coefficients were more common for mortality rate regressions than for coverage rate regressions. Responsiveness was systematically associated with poorer health and health service coverage. With respect to levels of inequality in health, the indicator of responsiveness problems experienced by the unhealthy poor groups in the population was statistically significant for regressions on measles vaccination inequalities between rich and poor. 
For the broader determinants, the Gini mattered most for inequalities in child mortality; education mattered more for inequalities in births attended by skilled personnel. Conclusions: This paper adds to the literature on comparative health systems research. National and international health monitoring frameworks need to incorporate indicators on trends in and impacts of other policy sectors on health. This will empower the health sector to carry out public health practices that promote health and health equity. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
33. Empirical statistical characterization and regionalization of amplitude–duration–frequency curves for extreme peak flows in the Lake Victoria Basin, East Africa.
- Author
-
Onyutha, Charles and Willems, Patrick
- Subjects
- *
EMPIRICAL research , *PROBLEM solving , *STATISTICS , *WATERSHEDS , *REGRESSION analysis - Abstract
This paper focuses on a regionalization attempt to partly solve data limitation problems in statistical analysis of high flows to derive discharge–duration–frequency (QDF) relationships. The analysis is based on 24 selected catchments in the Lake Victoria Basin (LVB) in East Africa. Characteristics of the theoretical QDF relationships were parameterized to capture their slopes of extreme value distributions (evd), tail behaviour and scaling measures. To enable QDF estimates to be obtained for ungauged catchments, interdependence relationships between the QDF parameters were identified, and regional regression models were developed to explain the regional difference in these parameters from physiographic characteristics. In validation of the regression models, from the lowest (5 years) to the highest (25 years) return periods considered, the percentage bias in the QDF estimates ranged from –2% for the 5-year return period to 27% for the 25-year return period. (Editor: D. Koutsoyiannis) [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
34. Contraceptive use and distribution of high-risk births in Nigeria: a sub-national analysis.
- Author
-
Akinyemi, Akanni, Adedini, Sunday, Hounton, Sennen, Akinlo, Ambrose, Adedeji, Olanike, Adonri, Osarenti, Friedman, Howard, Shiferaw, Solomon, Maïga, Abdoulaye, Amouzou, Agbessi, and Barros, Aluisio J. D.
- Subjects
- *
CONTRACEPTIVE drugs , *CONFIDENCE intervals , *HIGH-risk pregnancy , *MULTIVARIATE analysis , *REGRESSION analysis , *STATISTICS , *SURVEYS , *FAMILY planning , *DESCRIPTIVE statistics , *ODDS ratio , *THERAPEUTICS - Abstract
Background: Family planning expansion has been identified as an impetus to harnessing Nigeria's demographic dividend. However, there is a need for data to address pockets of inequality and to better understand cultural and social factors affecting contraceptive use and health benefits. This paper contributes to addressing these needs by providing evidence on the trends and sub-national patterns of modern contraceptive prevalence in Nigeria and the association between contraceptive use and high-risk births in Nigeria. Design: The study utilised women's data from the last three Demographic and Health Surveys (2003, 2008, and 2013) in Nigeria. The analysis involved descriptive, bivariate, and multivariate analyses. The multivariate analyses were performed to examine the relationship between high-risk births and contraceptive use. Associations were examined using Poisson regression. Results: Findings showed that respondents in avoidable high-risk birth categories were less likely to use contraceptives compared to those at no risk [rate ratio 0.82, confidence interval: 0.76-0.89, p < 0.001]. Education and wealth index consistently predicted significant differences in contraceptive use across the models. Conclusions: The results of this study suggest that women in the high-risk birth categories were significantly less likely to use a modern method of contraception relative to those categorised as having no risk. However, there are huge sub-national variations at regional and state levels in contraceptive prevalence and subsequent high-risk births. These results further strengthen evidence-based justification for increased investments in family planning programmes at the state and regional levels, particularly regions and states with high unmet needs for family planning. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
35. The health system consequences of agency nursing and moonlighting in South Africa.
- Author
-
Rispel, Laetitia C. and Blaauw, Duane
- Subjects
- *
NURSES , *LEADERSHIP , *QUESTIONNAIRES , *STATISTICAL sampling , *SELF-evaluation , *STATISTICS , *SURVEYS , *MULTIPLE regression analysis , *NURSES' associations , *DATA analysis software - Abstract
Background: Worldwide, there is an increased reliance on casual staff in the health sector. Recent policy attention in South Africa has focused on the interrelated challenges of agency nursing and moonlighting in the health sector. Objective: This paper examines the potential health system consequences of agency nursing and moonlighting among South African nurses. Methods: During 2010, a cluster random sample of 80 hospitals was selected in four South African provinces. On the survey day, all nurses providing clinical care completed a self-administered questionnaire after giving informed consent. The questionnaire obtained information on socio-demographics, involvement in agency nursing and moonlighting, and self-reported indicators of potential health system consequences of agency nursing and moonlighting. A weighted analysis was done using STATA® 13. Results: In the survey, 40.7% of nurses reported moonlighting or working for an agency in the preceding year. Of all participants, 51.5% reported feeling too tired to work, 11.5% paid less attention to nursing work on duty, and 10.9% took sick leave when not actually sick in the preceding year. Among the moonlighters, 11.9% had taken vacation leave to do agency work or moonlighting, and 9.8% reported conflicting schedules between their primary and secondary jobs. In the bivariate analysis, moonlighting nurses were significantly more likely than non-moonlighters to take sick leave when not sick (p=0.011) and to pay less attention to nursing work on duty (p=0.035). However, in a multiple logistic regression analysis, the differences between moonlighters and non-moonlighters did not remain statistically significant after adjusting for other sociodemographic variables. Conclusion: Although moonlighting did not emerge as a statistically significant predictor, the reported health system consequences are serious. 
A combination of strong nursing leadership, effective management, and consultation with and buy-in from front-line nurses is needed to counteract the potential negative health system consequences of agency nursing and moonlighting. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
36. A Commentary on Statistical Assessment of Violence Recidivism Risk.
- Author
-
Imrey, Peter B. and Dawid, A. Philip
- Subjects
- *
VIOLENT crimes , *RECIDIVISM , *STATISTICS , *RISK assessment , *PREDICTION models - Abstract
Increasing integration and availability of data on large groups of persons has been accompanied by proliferation of statistical and other algorithmic prediction tools in banking, insurance, marketing, medicine, and other fields (see, e.g., Steyerberg 2009a, b). Controversy may ensue when such tools are introduced to fields traditionally reliant on individual clinical evaluations. Such controversy has arisen about "actuarial" assessments of violence recidivism risk, that is, the probability that someone found to have committed a violent act will commit another during a specified period. Recently, Hart, Michie, and Cooke (2007a) and subsequent papers from these authors in several reputable journals have claimed to demonstrate that statistical assessments of such risks are inherently too imprecise to be useful, using arguments that would seem to apply to statistical risk prediction quite broadly. This commentary examines these arguments from a technical statistical perspective, and finds them seriously mistaken in many particulars. They should play no role in reasoned discussions of violence recidivism risk assessment. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
37. Rejoinder.
- Author
-
Wood, Simon N., Pya, Natalya, and Säfken, Benjamin
- Subjects
- *
BOUNDARY element methods , *NUMERICAL analysis , *STATISTICS , *MATHEMATICAL statistics , *REGRESSION analysis , *ANALYSIS of covariance - Abstract
The article focuses on the study regarding the boundary of the smoothing parameter space in statistical analysis. It mentions several papers by different authors that featured different approaches and methods for determining smoothing parameters on the edge of the feasible parameter space. It also describes the researchers' proposed fixes, which offer substantial improvement by smoothing the transition of the estimates from zero to nonzero smoothing penalties.
- Published
- 2016
- Full Text
- View/download PDF
38. RETRACTED ARTICLE: A novel alpha power transformed exponential distribution with real-life applications.
- Author
-
Ijaz, Muhammad, Mashwani, Wali Khan, Göktaş, Atilla, and Unvan, Yuksel Akay
- Subjects
- *
DISTRIBUTION (Probability theory) , *WEIBULL distribution , *GAMMA distributions , *EDITORIAL policies , *STATISTICS , *ONLINE comments , *ELECTRONIC journals - Abstract
We, the Editor-in-Chief and Publisher of Journal of Applied Statistics have retracted the following article, which was due to appear in a special issue: Muhammad Ijaz, Wali Khan Mashwani, Atilla Göktaş & Yuksel Akay Unvan (2021): A novel alpha power transformed exponential distribution with real-life applications, Journal of Applied Statistics. DOI: . The Editor-in-Chief and the Publisher are cognisant of clear evidence that the findings presented are unreliable. The probability distribution is only valid if α > 1 and numerous mathematical properties in Section 2 have been shown to be incorrect. This has then impacted at least two figures in the article. We are further cognisant that the article contained a number of similarities to previously published papers where some of the findings had been published without proper cross-referencing including: Gupta, R.D. and Kundu, D. (2001), Exponentiated Exponential Family: An Alternative to Gamma and Weibull Distributions. Biom. J., 43: 117–130. We have been informed in our decision-making by our corrections and editorial policies and the Committee on Publication Ethics (COPE) guidelines on retractions. The retracted article will remain online to maintain the scholarly record, but it will be digitally watermarked on each page as 'Retracted'. The Editor-in-Chief and the Publisher would like to thank the anonymous reader/s for their comments which alerted JAS to these major errors in the first instance. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
39. Comment.
- Author
-
Hardt, Moritz
- Subjects
- *
STATISTICS , *ESTIMATION theory , *MATHEMATICAL bounds , *MACHINE learning , *PERTURBATION theory - Abstract
The author offers a commentary on an article by Professors J. Duchi et al., which developed packing arguments and perturbation mechanisms for differential privacy in the minimax framework for statistical estimation, leading to a range of minimax optimal estimation rates. He discusses the prevalent models for designing differentially private algorithms, how minimax lower bounds are developed, and the attention received by the authors' sequence of papers in the machine learning community.
- Published
- 2018
- Full Text
- View/download PDF
40. Stacy-static code analysis for enhanced vulnerability detection.
- Author
-
Lathar, Pankaj, Shah, Raunak, and K G, Srinivasa
- Subjects
- *
FLOWGRAPHS , *COMPUTER science , *MATHEMATICS , *STATISTICS , *SYSTEM analysis - Abstract
Computer program analysis refers to the automatic analysis of the behavior of a user-defined program. An application of program analysis is to determine the quality of source code. Humans are prone to errors and, in most cases, the penalty of deploying low-quality code is very high for a large organization. These errors often give rise to potential security vulnerabilities in an application, which could be exploited by malicious users. In this paper, we present Stacy, a tool that statically detects potential security vulnerabilities present in input source code. Static program analysis is the examination of source code prior to its execution. Our tool attempts to predict the behavior of a program before it is deployed. Stacy uses novel techniques to detect the primary sources of vulnerability in the source code of a program and informs the developer. [ABSTRACT FROM AUTHOR]
- Published
- 2017
- Full Text
- View/download PDF
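The static-detection idea in the abstract above can be illustrated with a toy example. This is a generic sketch, not the Stacy tool or its techniques: it scans C source text for library calls commonly associated with buffer overflows, without ever executing the program, which is the defining property of static analysis.

```python
import re

# Risky C library calls with advice (illustrative list, not exhaustive)
RISKY_CALLS = {
    "strcpy":  "no bounds check; prefer strncpy or strlcpy",
    "gets":    "unbounded read; prefer fgets",
    "sprintf": "no bounds check; prefer snprintf",
}

def scan(source):
    """Return (line number, function, advice) for each risky call found."""
    findings = []
    for lineno, line in enumerate(source.splitlines(), 1):
        for func, advice in RISKY_CALLS.items():
            if re.search(rf"\b{func}\s*\(", line):
                findings.append((lineno, func, advice))
    return findings

c_code = """#include <string.h>
void copy(char *dst, const char *src) {
    strcpy(dst, src);   /* potential buffer overflow */
}"""

for lineno, func, advice in scan(c_code):
    print(f"line {lineno}: {func} -- {advice}")
```

Real static analyzers work on parsed syntax trees and data-flow graphs rather than regular expressions, but the input/output contract is the same: source code in, a list of located findings out.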
41. Analysis of Thursday Night NFL Winning Margins.
- Author
-
Vaughan, Timothy S.
- Subjects
- *
GAMES , *STATISTICS , *BIG data , *GRAPHIC methods - Abstract
This paper introduces a dataset and associated analysis of the scores of National Football League (NFL) games over the 2012 and 2013 seasons and the first five weeks of the 2014 season. In the face of current media attention to “lopsided” scores in Thursday night games in the early part of the 2014 season, t-test results indicate no statistically significant difference between the winning margins in Sunday games vs. Thursday games during the 2012 and 2013 seasons. Interestingly, there is a statistically significant difference between the Sunday vs. Thursday game margins over the first five weeks of the 2014 season. Moreover, statistical process control methods suggest an “out of control” condition for Thursday night game margins during the first five weeks of the 2014 season. The exercise provides students with an opportunity to apply a variety of hypothesis testing and graphical analysis tools to a question of current interest in the popular media. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF