37 results
Search Results
2. What Have We (Not) Learnt from Millions of Scientific Papers with P Values?
- Author
- Ioannidis, John P. A.
- Subjects
- *P-value (Statistics), *STATISTICAL bias, *STATISTICAL hypothesis testing, *NULL hypothesis, *INFERENTIAL statistics, *BAYESIAN analysis
- Abstract
P values linked to null hypothesis significance testing (NHST) is the most widely (mis)used method of statistical inference. Empirical data suggest that across the biomedical literature (1990–2015), when abstracts use P values 96% of them have P values of 0.05 or less. The same percentage (96%) applies for full-text articles. Among 100 articles in PubMed, 55 report P values, while only 4 present confidence intervals for all the reported effect sizes, none use Bayesian methods and none use false-discovery rate. Over 25 years (1990–2015), use of P values in abstracts has doubled for all PubMed, and tripled for meta-analyses, while for some types of designs such as randomized trials the majority of abstracts report P values. There is major selective reporting for P values. Abstracts tend to highlight most favorable P values and inferences use even further spin to reach exaggerated, unreliable conclusions. The availability of large-scale data on P values from many papers has allowed the development and applications of methods that try to detect and model selection biases, for example, p-hacking, that cause patterns of excess significance. Inferences need to be cautious as they depend on the assumptions made by these models and can be affected by the presence of other biases (e.g., confounding in observational studies). While much of the unreliability of past and present research is driven by small, underpowered studies, NHST with P values may be also particularly problematic in the era of overpowered big data. NHST and P values are optimal only in a minority of current research. Using a more stringent threshold, as in the recently proposed shift from P < 0.05 to P < 0.005, is a temporizing measure to contain the flood and death-by-significance. NHST and P values may be replaced in many fields by other, more fit-for-purpose, inferential methods. However, curtailing selection biases requires additional measures, beyond changes in inferential methods, and in particular reproducible research practices. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
4. A Multi-Method Data Science Pipeline for Analyzing Police Service.
- Author
- Haensch, Anna, Gordon, Daanika, Knudson, Karin, and Cheng, Justina
- Abstract
Despite the fact that most police departments in the U.S. serve jurisdictions with fewer than 10,000 residents, policing practices in small towns are understudied. This is due in part to data limitations and technological barriers that exist in the small-town context. In this paper we focus on one small town police department in New England with a history of misconduct, and develop a comprehensive data science pipeline that addresses the stages from design and collection to reporting. We present the reader with specific tools in the open-source Python ecosystem for replicating this pipeline. Once these data are processed, we perform two statistical analyses in an attempt to better understand the provisions of service by the small-town police department of focus. First, we perform ecological inference to estimate the rate at which residents are placing calls for service. Second, we model wait times using a negative binomial regression model to account for overdispersion in the data. We discuss data and model limitations arising through the pipeline creation and analysis process. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
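Editor's illustration: the abstract above mentions modeling wait times with a negative binomial regression, fit with open-source Python tools, to handle overdispersion. The sketch below is a minimal, self-contained example of that kind of model using statsmodels; the field names and numbers are invented stand-ins, not the authors' pipeline or data.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

# Simulated stand-in for call-for-service records (hypothetical fields).
rng = np.random.default_rng(0)
calls = pd.DataFrame({
    "priority": rng.integers(1, 4, 500),  # 1 = most urgent call type
    "night": rng.integers(0, 2, 500),     # 1 = call placed at night
})
mu = np.exp(1.0 + 0.6 * calls["priority"] - 0.3 * calls["night"])
calls["wait_minutes"] = rng.negative_binomial(2, 2 / (2 + mu))  # overdispersed counts with mean mu

# A negative binomial GLM captures the extra-Poisson variation in wait times.
fit = smf.glm("wait_minutes ~ priority + night", data=calls,
              family=sm.families.NegativeBinomial(alpha=0.5)).fit()
print(fit.summary())
```

The same fit with a Poisson family would understate the standard errors whenever the variance exceeds the mean, which is the overdispersion the abstract refers to.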
5. Integrative data analysis where partial covariates have complex non-linear effects by using summary information from an external data.
- Author
- Liang, Jia, Chen, Shuo, Kochunov, Peter, Hong, L. Elliot, and Chen, Chixiang
- Abstract
A full parametric and linear specification may be insufficient to capture complicated patterns in studies exploring complex features, such as those investigating age-related changes in brain functional abilities. Alternatively, a partially linear model (PLM) consisting of both parametric and non-parametric elements may have a better fit. This model has been widely applied in economics, environmental science, and biomedical studies. In this paper, we introduce a novel statistical inference framework that equips PLM with high estimation efficiency by effectively synthesizing summary information from external data into the main analysis. Such an integrative scheme is versatile in assimilating various types of reduced models from the external study. The proposed method is shown to be theoretically valid and numerically convenient, and it ensures a high-efficiency gain compared to classic methods in PLM. Our method is further validated using two data applications by evaluating the risk factors of brain imaging measures and blood pressure. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
6. Sequential Selection for Minimizing the Variance with Application to Crystallization Experiments.
- Author
- Kerfonta, Caroline M., Kim, Sunuk, Chen, Ye, Zhang, Qiong, and Jiang, Mo
- Abstract
For many crystal-based products (e.g., pharmaceuticals, energy storage), the size uniformity is not only a key quality attribute, but sometimes also an indicator of other attributes such as solid purity. This paper proposes a sequential selection approach to find a proper experimental setting that leads to high uniformity, or equivalently, small variance for crystal sizes, from the advanced slug flow reaction crystallization process of a model crystal, called manganese oxalate hydrate. The proposed sequential selection approach contains a Bayesian adaptive method to incorporate new uniformity measurements in each step and two design acquisition functions to improve the selection of the most promising experimental setting in terms of minimizing the variance. We study the performance of the proposed approach through multiple synthetic numerical studies, as well as a case study based on data from slug flow crystallization experiments. Throughout these studies, the proposed approach shows competitive performance in identifying the best experimental setting. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
7. Einstein's First Published Paper.
- Author
- Iglewicz, Boris
- Subjects
- *MATHEMATICAL statistics, *RESEARCH, *DATA analysis, *STATISTICS, *STATISTICAL correlation, *DATA recorders & recording, *ESTIMATION theory, *LEAST squares
- Abstract
This article reviews Albert Einstein's first published paper, submitted for publication in 1900. At that time, Einstein was 21 and a recent college graduate. His paper uses modeling and least squares to analyze data in support of a scientific proposition. Einstein is shown to be well trained, for his day, in using statistics as a tool in his scientific research. This paper also shows his ability to make trivial arithmetic mistakes and some clumsiness in data recording. A major aim of this article is to help provide a better appreciation of Einstein as an active user of statistical arguments in this and other of his important publications. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
8. Rethinking the Paper Helicopter: Combining Statistical and Engineering Knowledge.
- Author
- Annis, David H.
- Subjects
- *ENGINEERING, *EXPERIMENTAL design, *SCIENTIFIC method, *SCIENTIFIC experimentation, *MATHEMATICAL optimization, *ANALYSIS of variance
- Abstract
Box's paper helicopter has been used to teach experimental design for more than a decade. It is simple, inexpensive, and provides real data for an involved, multifactor experiment. Unfortunately it can also further an all-too-common practice that Professor Box himself has repeatedly cautioned against, namely ignoring the fundamental science while rushing to solve problems that may not be sufficiently understood. Often this slighting of the science so as to get on with the statistics is justified by referring to Box's oft-quoted maxim that "All models are wrong, however some are useful." Nevertheless, what is equally true, to paraphrase both Professor Box and George Orwell, is that "All models are wrong, but some are more wrong than others." To experiment effectively it is necessary to understand the relevant science so as to distinguish between what is usefully wrong, and what is dangerously wrong. This article presents an improved analysis of Box's helicopter problem relying on statistical and engineering knowledge and shows that this leads to an enhanced paper helicopter, requiring fewer experimental trials and achieving superior performance. In fact, of the 20 experimental trials run for validation–10 each of the proposed aerodynamic design and the conventional full factorial optimum–the longest 10 flight times all belong to the aerodynamic optimum, while the shortest 10 all belong to the conventional full factorial optimum. I further discuss how ancillary engineering knowledge can be incorporated into thinking about–and teaching–experimental design. [ABSTRACT FROM AUTHOR]
- Published
- 2005
- Full Text
- View/download PDF
9. Identifying Key Statistical Papers From 1985 to 2002 Using Citation Data for Applied Biostatisticians.
- Author
- Schell, Michael J.
- Subjects
- *STATISTICS, *LIFE sciences, *BIOMETRY, *MATHEMATICAL analysis
- Abstract
Dissemination of ideas from theory to practice is a significant challenge in statistics. Quick identification of articles useful to practitioners would greatly assist in this dissemination, thereby improving science. This article uses the citation count history of articles to identify key papers from 1985 to 2002 from 12 statistics journals for applied biostatisticians. One feature requiring attention in order to appropriately rank an article's impact is assessment of the citation accrual patterns over time. Citation counts in statistics differ dramatically from fields such as medicine. In statistics, most articles receive few citations, with 15-year-old articles from five key journals receiving a median of 13 citations compared to 66 in the Journal of Clinical Oncology. However, statistics articles in the top 2%-3% continue to gain citations at a high rate past 15 years, exceeding those in JCO, whose counts slow dramatically around 8 years past publication. Articles with the highest expected applied uses 20 years post publication were identified using joinpoint regression. In this evaluation, the fraction of citations that represent applied use was defined and estimated. The false discovery rate, quantification of heterogeneity in meta-analysis, and generalized estimating equations rank as the ideas with the greatest estimated applied impact. [ABSTRACT FROM AUTHOR]
- Published
- 2010
- Full Text
- View/download PDF
10. The State of Play of Reproducibility in Statistics: An Empirical Analysis.
- Author
- Xiong, Xin and Cribben, Ivor
- Subjects
- *FUNCTIONAL magnetic resonance imaging, *SCIENTIFIC method, *COMPUTER programming, *REPRODUCIBLE research, *STATISTICS
- Abstract
Reproducibility, the ability to reproduce the results of published papers or studies using their computer code and data, is a cornerstone of reliable scientific methodology. Studies where results cannot be reproduced by the scientific community should be treated with caution. Over the past decade, the importance of reproducible research has been frequently stressed in a wide range of scientific journals such as Nature and Science and international magazines such as The Economist. However, multiple studies have demonstrated that scientific results are often not reproducible across research areas such as psychology and medicine. Statistics, the science concerned with developing and studying methods for collecting, analyzing, interpreting and presenting empirical data, prides itself on its openness when it comes to sharing both computer code and data. In this article, we examine reproducibility in the field of statistics by attempting to reproduce the results in 93 published papers in prominent journals using functional magnetic resonance imaging (fMRI) data during the 2010–2021 period. Overall, from both the computer code and the data perspective, among all the 93 examined papers, we could only reproduce the results in 14 (15.1%) papers, that is, the papers provide both executable computer code (or software) with the real fMRI data, and our results matched the results in the paper. Finally, we conclude with some author-specific and journal-specific recommendations to improve the research reproducibility in statistics. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
11. The blind paper cutter: Teaching about variation, bias, stability, and process control.
- Author
- Stone, Richard A.
- Subjects
- *STUDENT activities, *STATISTICS education
- Abstract
Provides teachers with a student activity to help reinforce learning about variation, bias, stability and other statistical quality control concepts. Product tangible sequences generation through blind paper cutting; Rear-view process; Monitoring techniques; Measurement process.
- Published
- 1998
- Full Text
- View/download PDF
12. A Prediction Tournament Paradox.
- Author
- Aldous, David J.
- Subjects
- *TOURNAMENTS, *FUTUROLOGISTS, *PARADOX, *FORECASTING, *SPORTS forecasting
- Abstract
In a prediction tournament, contestants "forecast" by asserting a numerical probability for each of (say) 100 future real-world events. The scoring system is designed so that (regardless of the unknown true probabilities) more accurate forecasters will likely score better. This is true for one-on-one comparisons between contestants. But consider a realistic-size tournament with many contestants, with a range of accuracies. It may seem self-evident that the winner will likely be one of the most accurate forecasters. But, in the setting where the range extends to very accurate forecasters, simulations show this is mathematically false, within a somewhat plausible model. Even outside that setting the winner is less likely than intuition suggests to be one of the handful of best forecasters. Though implicit in recent technical papers, this paradox has apparently not been explicitly pointed out before, though it is easily explained. It perhaps has implications for the ongoing IARPA-sponsored research programs involving forecasting. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
13. Objective Bayesian testing for the correlation coefficient under divergence-based priors.
- Author
- Peng, Bo and Wang, Min
- Subjects
- *STATISTICAL correlation, *GAUSSIAN distribution, *NULL hypothesis, *BIVARIATE analysis
- Abstract
The correlation coefficient is a commonly used criterion to measure the strength of a linear relationship between two quantitative variables. For a bivariate normal distribution, numerous procedures have been proposed for testing a precise null hypothesis of the correlation coefficient, whereas the construction of flexible procedures for testing a set of (multiple) precise and/or interval hypotheses has received less attention. This paper fills the gap by proposing an objective Bayesian testing procedure using the divergence-based priors. The proposed Bayes factors can be used for testing any combination of precise and interval hypotheses and also allow a researcher to quantify evidence in the data in favor of the null or any other hypothesis under consideration. An extensive simulation study is conducted to compare the performances between the proposed Bayesian methods and some existing ones in the literature. Finally, a real-data example is provided for illustrative purposes. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
14. The Johnson System of Frequency Curves—Historical, Graphical, and Limiting Perspectives.
- Author
- van Dorp, Johan René and Jones, M. C.
- Subjects
- *PROBABILITY density function, *KURTOSIS, *GEOMETRIC series, *SKEWNESS (Probability theory), *GAUSSIAN distribution, *FREQUENCY curves, *TWENTIETH century
- Abstract
The idea of transforming one random variate to another with a more convenient density has been developed in the first half of the 20th century. In his thesis, Norman L. Johnson (1917–2004) developed a pioneering system of transformations of the standard normal distribution which gained substantial popularity in the second half of the 20th century and beyond. In Johnson's 1949 Biometrika paper entitled Systems of frequency curves generated by methods of translation, summarizing that thesis, one of his primary interests was the behavior of the shape of the probability density functions as their parameter values change. Herein, we attempt to further elucidate this behavior through a series of geometric expositions of that transformation process. In these expositions insight is obtained into the behavior of Johnson's density functions, and their skewness and kurtosis, as they converge to their limiting distributions, a topic which received little attention. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
15. Interval Estimation for the Correlation Coefficient.
- Author
- Hu, Xinjie, Jung, Aekyung, and Qin, Gengsheng
- Subjects
- *STATISTICAL correlation, *INTRACLASS correlation, *RANDOM variables, *GAUSSIAN distribution, *CONFIDENCE intervals, *PROBABILITY theory
- Abstract
The correlation coefficient (CC) is a standard measure of a possible linear association between two continuous random variables. The CC plays a significant role in many scientific disciplines. For a bivariate normal distribution, there are many types of confidence intervals for the CC, such as z-transformation and maximum likelihood-based intervals. However, when the underlying bivariate distribution is unknown, the construction of confidence intervals for the CC is not well-developed. In this paper, we discuss various interval estimation methods for the CC. We propose a generalized confidence interval for the CC when the underlying bivariate distribution is a normal distribution, and two empirical likelihood-based intervals for the CC when the underlying bivariate distribution is unknown. We also conduct extensive simulation studies to compare the new intervals with existing intervals in terms of coverage probability and interval length. Finally, two real examples are used to demonstrate the application of the proposed methods. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
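Editor's note: the z-transformation interval that the abstract contrasts with the proposed methods is the classical Fisher construction under bivariate normality (a textbook result, not one of the new intervals in the paper). With sample correlation $r$ from $n$ pairs,
$$z = \tfrac{1}{2}\log\frac{1+r}{1-r}, \qquad z \approx \mathcal{N}\!\left(\tfrac{1}{2}\log\frac{1+\rho}{1-\rho},\ \tfrac{1}{n-3}\right),$$
so an approximate $100(1-\alpha)\%$ interval for $\rho$ is $\left[\tanh\!\left(z - z_{1-\alpha/2}/\sqrt{n-3}\right),\ \tanh\!\left(z + z_{1-\alpha/2}/\sqrt{n-3}\right)\right]$.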
16. The Analysis of Survey Data with Framing Effects.
- Author
- Goldin, Jacob and Reck, Daniel
- Subjects
- *DATA analysis
- Abstract
A well-known difficulty in survey research is that respondents' answers to questions can depend on arbitrary features of a survey's design, such as the wording of questions or the ordering of answer choices. In this paper, we describe a novel set of tools for analyzing survey data characterized by such framing effects. We show that the conventional approach to analyzing data with framing effects—randomizing survey-takers across frames and pooling the responses—generally does not identify a useful parameter. In its place, we propose an alternative approach and provide conditions under which it identifies the responses that are unaffected by framing. We also present several results for shedding light on the population distribution of the individual characteristic the survey is designed to measure. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
17. A Probabilistic Approach to The Moments of Binomial Random Variables and Application.
- Author
- Nguyen, Duy
- Subjects
- *DISTRIBUTION (Probability theory), *RANDOM variables, *PROBABILISTIC number theory
- Abstract
In this paper, we provide a closed form formula for the moments of binomial random variables using a probabilistic approach. As an interesting application, we give a closed form formula for the sum 1^k + 2^k + 3^k + ... + n^k. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
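For context (a standard textbook identity, not the formula derived in the paper), the factorial moments of a binomial variable already have a simple closed form: for $X \sim \mathrm{Binomial}(n,p)$,
$$\mathbb{E}\bigl[X(X-1)\cdots(X-r+1)\bigr] = n(n-1)\cdots(n-r+1)\,p^{r},$$
so, for instance, $\mathbb{E}[X^{2}] = n(n-1)p^{2} + np$.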
18. The Impact of Results Blind Science Publishing on Statistical Consultation and Collaboration.
- Author
- Locascio, Joseph J.
- Subjects
- *SCIENCE publishing, *STATISTICAL consultants, *PUBLICATION bias, *STATISTICAL significance, *STATISTICAL hypothesis testing, *NULL hypothesis
- Abstract
The author has previously proposed results blind manuscript evaluation (RBME) as a method of ameliorating often cited problems of statistical inference and scientific publication, notably publication bias, overuse/misuse of null hypothesis significance testing (NHST), and irreproducibility of reported scientific results. In RBME, manuscripts submitted to scientific journals are assessed for suitability for publication without regard to their reported results. Criteria for publication are based exclusively on the substantive importance of the research question addressed in the study, conveyed in the Introduction section of the manuscript, and the quality of the methodology, as reported in the Methods section. Practically, this policy is implemented by a two stage process whereby the editor initially distributes only the Introduction and Methods sections of a submitted manuscript to reviewers and a provisional decision regarding acceptance is made, followed by a second stage in which the complete manuscript is distributed for review but only if the decision of the first stage is for acceptance. The present paper expands upon this recommendation by addressing implications of this proposed policy with respect to statistical consultation and collaboration in research. It is suggested that under RBME, statisticians will become more integrated into research endeavors and called upon sooner for their input. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
20. Experience Simpson's Paradox in the Classroom.
- Author
- Gou, Jiangtao and Zhang, Fengqing (Zoe)
- Subjects
- *PROBABILITY theory, *STATISTICAL models, *STATISTICAL analysis in sports, *PSYCHOLOGY of students, *STATISTICS education (Elementary)
- Abstract
Simpson's paradox is a challenging topic to teach in an introductory statistics course. To motivate students to understand this paradox both intuitively and statistically, this article introduces several new ways to teach Simpson's paradox. We design a paper toss activity between instructors and students in class to engage students in the learning process. We show that Simpson's paradox widely exists in basketball statistics, and thus instructors may consider looking for Simpson's paradox in their own school basketball teams as examples to motivate students’ interest. A new probabilistic explanation of Simpson's paradox is provided, which helps foster students’ statistical understanding. Supplementary materials for this article are available online. [ABSTRACT FROM PUBLISHER]
- Published
- 2017
- Full Text
- View/download PDF
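As a quick illustration of the reversal the article builds its classroom activity around, the following constructed shooting-percentage example (numbers invented for illustration, not taken from the article) shows player A ahead in each half of a season yet behind overall:

```python
# Constructed example of Simpson's paradox: A shoots better in each half,
# yet B has the better overall percentage because of how attempts are distributed.
halves = {
    "first half":  {"A": (30, 100), "B": (2, 10)},    # (made, attempted)
    "second half": {"A": (9, 10),   "B": (80, 100)},
}

for half, players in halves.items():
    for name, (made, att) in players.items():
        print(f"{half:12s} {name}: {made}/{att} = {made / att:.3f}")

for name in ("A", "B"):
    made = sum(halves[h][name][0] for h in halves)
    att = sum(halves[h][name][1] for h in halves)
    print(f"{'overall':12s} {name}: {made}/{att} = {made / att:.3f}")
```

Here A leads 0.300 vs 0.200 and 0.900 vs 0.800 within halves, but trails 0.355 vs 0.745 overall.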
21. Peirce, Youden, and Receiver Operating Characteristic Curves.
- Author
- Baker, Stuart G. and Kramer, Barnett S.
- Subjects
- *STATISTICIANS, *TORNADOES, *PREDICTION theory, *DIAGNOSIS, *DIAGNOSTIC imaging, *NONINVASIVE diagnostic tests, *STATISTICAL decision making, *RESEARCH & development
- Abstract
Two recent articles in The American Statistician discussed an 1884 paper by Charles Sanders Peirce on evaluating the accuracy of tornado predictions, and focused on the connection between Peirce's ‘science of the method’ and Pearson's coefficient and Cohen's kappa (Rovine and Anderson 2004; Loken and Rovine 2006). In terms of later statistical developments, perhaps two more important contributions of Peirce's paper are (1) the identity between Peirce's ‘science of the method’ and the later Youden index and (2) the close relationship between Peirce's ‘utility of the method’ of prediction and the expected utility of a medical diagnostic test. Each of these contributions is discussed in turn, followed by a discussion of the roles they play in evaluating diagnostic tests with ordered or continuous results. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
22. The Use of Statistics in Medical Research: A Comparison of The New England Journal of Medicine and Nature Medicine.
- Author
- Strasak, Alexander M., Zaman, Qamruz, Marinell, Gerhard, Pfeiffer, Karl P., and Ulmer, Hanno
- Subjects
- *MEDICAL research, *MEDICAL literature, *QUANTITATIVE research, *STATISTICAL standards, *STATISTICAL significance, *DOCUMENTATION
- Abstract
There is widespread evidence of the extensive use of statistical methods in medical research. Just the same, standards are generally low and a growing body of literature points to statistical errors in most medical journals. However, there is no comprehensive study contrasting the top medical journals of basic and clinical science for recent practice in their use of statistics. All original research articles in Volume 10, Numbers 1-6 of Nature Medicine (Nat Med) and Volume 350, Numbers 1-26 of The New England Journal of Medicine (NEJM) were screened for their statistical content. Types, frequencies, and complexity of applied statistical methods were systematically recorded. A 46-item checklist was used to evaluate statistical quality for a subgroup of papers. 94.5 percent (95% CI 87.6-98.2) of NEJM articles and 82.4 percent (95% CI 65.5-93.2) of Nat Med articles contained inferential statistics. NEJM papers were significantly more likely to use advanced statistical methods (p < 0.0001). Statistical errors were identified in a considerable proportion of articles, although not always serious in nature. Documentation of applied statistical methods was generally poor and insufficient, particularly in Nat Med. Compared to 1983, a vast increase in usage and complexity of statistical methods could be observed for NEJM papers. This does not necessarily hold true for Nat Med papers, as the results of the study indicate that basic science sticks with basic analysis. As statistical errors seem to remain common in medical literature, closer attention to statistical methodology should be seriously considered to raise standards. [ABSTRACT FROM AUTHOR]
- Published
- 2007
- Full Text
- View/download PDF
23. Undergraduate Programs and the Future of Academic Statistics.
- Author
- Moore, David S.
- Subjects
- *COLLEGE curriculum, *MATHEMATICS, *STATISTICS, *HIGHER education
- Abstract
The following three articles are the results of papers presented and discussed at a symposium entitled "Improving the Workforce of the Future: Opportunities in Undergraduate Education," held August 12-13, 2000, in Indianapolis, IN. This symposium was sponsored by the American Statistical Association through its Undergraduate Statistics Education Initiative. The first article was the keynote address at the symposium. The latter two articles are position papers that were developed in part through discussions among representatives from liberal arts colleges, research universities, industry, and government at a May 1999 meeting at the ASA Headquarters and a subsequent April 2000 workshop in Alexandria, VA. This workshop was partially supported by the National Science Foundation. Other position papers based on symposium discussions are being developed and are scheduled to appear in the Journal of Statistical Education. [ABSTRACT FROM AUTHOR]
- Published
- 2001
- Full Text
- View/download PDF
24. A Unified Approach to Authorship Attribution and Verification.
- Author
- Puig, Xavier, Font, Martí, and Ginebra, Josep
- Subjects
- *ATTRIBUTION of authorship, *AUTHORS, *LITERARY style, *BAYESIAN analysis, *MULTINOMIAL distribution, *SIMULATION methods & models
- Abstract
In authorship attribution, one assigns texts from an unknown author to either one of two or more candidate authors by comparing the disputed texts with texts known to have been written by the candidate authors. In authorship verification, one decides whether a text or a set of texts could have been written by a given author. These two problems are usually treated separately. By assuming an open-set classification framework for the attribution problem, contemplating the possibility that none of the candidate authors is the unknown author, the verification problem becomes a special case of attribution problem. Here both problems are posed as a formal Bayesian multinomial model selection problem and are given a closed-form solution, tailored for categorical data, naturally incorporating text length and dependence in the analysis, and coping well with settings with a small number of training texts. The approach to authorship verification is illustrated by exploring whether a court ruling sentence could have been written by the judge that signs it, and the approach to authorship attribution is illustrated by revisiting the authorship attribution of the Federalist papers and through a small simulation study. [ABSTRACT FROM AUTHOR]
- Published
- 2016
- Full Text
- View/download PDF
25. The Mean Value Theorem and Taylor’s Expansion in Statistics.
- Author
- Feng, Changyong, Wang, Hongyue, Han, Yu, Xia, Yinglin, and Tu, Xin M.
- Subjects
- *MEAN value theorems, *NONLINEAR estimation, *GENERALIZED estimating equations, *MAXIMUM likelihood statistics, *TAYLOR'S series, *ASYMPTOTIC distribution
- Abstract
The mean value theorem and Taylor’s expansion are powerful tools in statistics that are used to derive estimators from nonlinear estimating equations and to study the asymptotic properties of the resulting estimators. However, the mean value theorem for a vector-valued differentiable function does not exist. Our survey shows that this nonexistent theorem has been used for a long time in statistical literature to derive the asymptotic properties of estimators and is still being used. We review several frequently cited papers and monographs that have misused this “theorem” and discuss the flaws in these applications. We also offer methods to fix such errors. [ABSTRACT FROM AUTHOR]
- Published
- 2013
- Full Text
- View/download PDF
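A standard counterexample (textbook material, not reproduced from the paper) makes the nonexistence concrete: for $f(t) = (\cos t, \sin t)$ on $[0, 2\pi]$,
$$f(2\pi) - f(0) = (0,0), \qquad\text{while}\qquad (2\pi - 0)\,f'(c) = 2\pi(-\sin c, \cos c) \neq (0,0) \ \text{for every } c,$$
so no single intermediate point $c$ satisfies $f(b) - f(a) = f'(c)(b - a)$ for this vector-valued function, even though each coordinate separately obeys the scalar mean value theorem.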
26. AP Statistics: Building Bridges Between High School and College Statistics Education.
- Author
- Franklin, Christine
- Subjects
- *STATISTICS education, *ADVANCED placement programs (Education), *COLLEGE graduates, *HIGH school graduates, *TEACHER education, *UNIVERSITIES & colleges
- Abstract
After providing a brief history of the AP Statistics program and a description of the AP Statistics course content, exam and grading, the paper presents a discussion of current challenges for statistics education in the schools and a look at opportunities for the statistics profession, especially college faculty, to aid the AP Statistics program so as to improve statistics teaching in both venues and thus strengthen the quantitative literacy of future generations of high school or college graduates. This article has supplementary material online. [ABSTRACT FROM AUTHOR]
- Published
- 2011
- Full Text
- View/download PDF
27. Regression to the Mean, Murder Rates, and Shall-Issue Laws.
- Author
- Grambsch, Patricia
- Subjects
- *MURDER, *FINANCIAL liberalization, *WEAPON laws, *LIBERALISM, *GOVERNMENT policy, *REGRESSION (Psychology), *LICENSES
- Abstract
The relationship between state murder rates and the liberalization of conditions under which a citizen can obtain a permit to carry a concealed weapon (shall-issue laws) is controversial and important for policy. Many analyses have been done during the last decade, but regression to the mean has been ignored with the exception of two papers which concluded that it did not matter. We consider state murder rates for 1976-2001 and compare relative murder rate slopes (relative to the U.S. murder rate) for the five years following state adoption of shall-issue laws to the five years preceding for the 25 states becoming shall-issue in 1981-1996. We find strong evidence for regression to the mean. Using both a random and a fixed effects model, we compare analyses ignoring the regression effect via a paired t-test to those controlling for it by conditioning on the pre shall-issue slopes. We find that controlling for regression to the mean changes the sign of the estimated intervention effect on murder rate slopes from negative to positive, has strong impact on statistical significance, and gives no support to the hypothesis that shall-issue laws have beneficial effects in reducing murder rates. [ABSTRACT FROM AUTHOR]
- Published
- 2008
- Full Text
- View/download PDF
28. Teaching Experiences With a Course on "Web-Based Statistics".
- Author
- Symanzik, Jürgen and Vukasinovic, Natascha
- Subjects
- *STATISTICS education, *WORLD Wide Web
- Abstract
Many statistics courses have been taught that make use of Web-based statistical tools such as teachware tools, electronic textbooks, and statistical software on the Web. However, to our best knowledge, there has been no course before where statistical issues and the Web have been discussed systematically. This article provides an overview on our "Web-Based Statistics" course aimed at advanced undergraduate and beginning graduate students, including detailed discussions of lecture topics, homework assignments, and student projects. We discuss references (papers and URLs) useful for such a course and summarize students' feedback. We finish this article with recommendations for future similar courses. [ABSTRACT FROM AUTHOR]
- Published
- 2003
- Full Text
- View/download PDF
29. On the Computation of Gauss Hypergeometric Functions.
- Author
- Nadarajah, Saralees
- Subjects
- *GAUSSIAN processes, *HYPERGEOMETRIC functions, *POCKET calculators, *NUMERICAL calculations
- Abstract
The pioneering study undertaken by Liang et al. in 2008 (Journal of the American Statistical Association, 103, 410–423) and the hundreds of papers citing that work make use of certain hypergeometric functions. Liang et al. and many others claim that the computation of the hypergeometric functions is difficult. Here, we show that the hypergeometric functions can in fact be reduced to simpler functions that can often be computed using a pocket calculator. [ABSTRACT FROM AUTHOR]
- Published
- 2015
- Full Text
- View/download PDF
30. Cuthbert Daniel: Industrial statistician.
- Author
- Hunter, J. Stuart
- Subjects
- *STATISTICIANS, *DEATH
- Abstract
Profiles Cuthbert Daniel, an industrial statistician, who died on August 8, 1997. Contributions to applied statistics; Educational background; Career history; Papers written by Daniel.
- Published
- 1998
- Full Text
- View/download PDF
31. Integrating scientific writing into a statistics curriculum: A course in statistically based...
- Author
- Samsa, Gregory and Oddone, Eugene Z.
- Subjects
- *MEDICAL literature, *TECHNICAL writing education, *STATISTICS education, *STATISTICS
- Abstract
Describes a course in writing and critical appraisal of medical papers that uses statistics. Relationship of the course to a better integration of scientific writing into the statistics curriculum; Role of writing in statistical education; Model of scientific writing.
- Published
- 1994
- Full Text
- View/download PDF
32. A Historical Note on Zero Correlation and Independence.
- Author
- David, Herbert A.
- Subjects
- *STATISTICAL correlation, *T-test (Statistics), *DISTRIBUTION (Probability theory), *STATISTICAL significance, *PROBABILITY theory, *STATISTICAL hypothesis testing, *ANALYSIS of variance, *MATHEMATICAL statistics
- Abstract
Ever since the introduction of the correlation coefficient in 1888, there has been some confusion between zero correlation and statistical independence. We examine this, with emphasis on Student's famous 1908 paper leading to the t-test, and indicate some subsequent developments. [ABSTRACT FROM AUTHOR]
- Published
- 2009
- Full Text
- View/download PDF
33. R. A. Fisher's Life and Death in Australia, 1959-1962.
- Author
- Ludbrook, John
- Subjects
- *STATISTICIANS, *GENETICISTS, *BIOGRAPHICAL sources, *WEBSITES
- Abstract
In retirement, Fisher went to live in Adelaide, South Australia, at the invitation of the statistician E. A. Cornish and the geneticist J. H. Bennett. He died in Adelaide, following an operation for colon cancer, on July 29, 1962. During his life, Fisher engaged in vigorous controversy with Karl Pearson, Jerzy Neyman, and W. S. Gosset, to name but a few. After Fisher's death, his family donated his book copyrights and other intellectual and personal material to the University of Adelaide. This has resulted in the republication of his major books and scientific correspondence, and in a Web site from which can be downloaded most of his published papers. [ABSTRACT FROM AUTHOR]
- Published
- 2005
- Full Text
- View/download PDF
34. History Corner.
- Author
- Scheuren, Fritz
- Subjects
- *STATISTICS, *SURVEYS, *STATISTICIANS, *RESEARCH, *STATISTICAL sampling
- Abstract
Recalls statistical research papers previously published in the United States. Survey research; Sampling; Technical problems of statisticians at the U.S. Census Bureau.
- Published
- 2004
- Full Text
- View/download PDF
35. Rudolf Wolf's contribution to the Buffon needle problem (an early Monte Carlo experiment) and...
- Author
- Riedwyl, Hans
- Subjects
- WOLF, Rudolf
- Abstract
Discusses Rudolf Wolf's 1850 paper on the demonstration of the principle of estimating a parameter in a case where it is not easy to give a mathematical model. Wolf's fundamental work on sunspot periodicity; Focus on Wolf's experiment with Buffon's needle.
- Published
- 1990
- Full Text
- View/download PDF
36. Letters to the Editors.
- Author
- Ryan, Thomas P., Woodall, William H., and Schell, Michael J.
- Subjects
- *LETTERS to the editor, *BIOMETRY
- Abstract
A letter to the editor is presented in response to the article "Identifying Key Statistical Papers From 1985 to 2002 Using Citation Data for Applied Biostatisticians," by Michael J. Schell in the 2010 issue.
- Published
- 2011
- Full Text
- View/download PDF
37. History Corner.
- Author
- Scheuren, Fritz
- Subjects
- *MATHEMATICS, *STATISTICAL sampling, *STATISTICS, *PERIODICALS, *HIGH technology
- Abstract
The article introduces previously published articles related to statistics. The article by G.A. McIntyre was the first to introduce what has become known as ranked set sampling. Published in the "Australian Journal of Agricultural Research" in 1952, the paper's initial application was to pasture measurement. A form of double sampling is employed and, under fairly general conditions, gains can be made in sampling efficiency. In the August 2004 issue of the periodical "The American Statistician," there was an article by Gabor J. Székely and Donald St. P. Richards on an application of the St. Petersburg Paradox to the crash of high-tech stocks in 2000.
- Published
- 2005
- Full Text
- View/download PDF