743 results for "Statistics -- Analysis"
Search Results
2. Socio-economic trends in the Canadian North: comparing the Provincial and Territorial Norths
- Author
- Southcott, Chris
- Subjects
Northern Canada -- Economic aspects -- Social aspects; Statistics -- Analysis; Social science research; Social classes -- Research; General interest; News, opinion and commentary
- Abstract
While there has been a recent increase in social research relating to Canada's Territorial North, there is a relative poverty of research dealing with the Provincial North. That [...]
- Published
- 2014
3. Measuring defense: entering the zones of fielding statistics
- Author
- Basco, Dan and Zimmerman, Jeff
- Subjects
Fielding (Baseball) -- Measurement; Statistics -- Analysis; History; Sports and fitness; Sports, sporting goods and toys industry
- Abstract
Doug Glanville in his new baseball memoir notes that many players, 'rewarded with huge contracts because of their offensive prowess, ... have developed a kind of attention deficit disorder when [...]
- Published
- 2010
4. When is statistical evidence superior to anecdotal evidence in supporting probability claims? The role of argument type
- Author
- Hoeken, Hans and Hustinx, Lettica
- Subjects
Statistics -- Analysis; Inductive reasoning -- Research; Psychology and mental health
- Abstract
Under certain conditions, statistical evidence is more persuasive than anecdotal evidence in supporting a claim about the probability that a certain event will occur. In three experiments, it is shown that the type of argument is an important condition in this respect. If the evidence is part of an argument by generalization, statistical evidence is more persuasive compared with anecdotal evidence (Experiments 1 and 2). In the case of argument by analogy, statistical and anecdotal evidence are equally persuasive (Experiments 2 and 3). However, if the case in the anecdotal evidence is dissimilar from the case in the claim, statistical evidence is again more persuasive (Experiment 3). The implications of these results for the concept of argument quality are discussed. doi: 10.1111/j.1468-2958.2009.01360.x
- Published
- 2009
5. Rethinking data analysis -- part two
- Author
- Kent, Ray
- Subjects
Statistics -- Analysis; Marketing research -- Methods; Advertising, marketing and public relations; Business; Business, international
- Published
- 2009
6. Experience and problem representation in statistics
- Author
- Rabinowitz, Mitchell and Hogan, Tracy M.
- Subjects
Methodology -- Analysis; Statistics -- Analysis; Psychology and mental health
- Abstract
This research investigated experience level differences in problem representation in statistics. A triad judgment task was designed so that source problems shared either surface similarity (story narrative) or structural (inferential level) features (t test, correlation, or chi-square) with the target problem. Graduate students with varying levels of experience in statistics were asked to choose which source problem 'goes best' with the target problem for each triad. Given a choice between a problem that shares surface-level characteristics and one that shares inferential-level characteristics, students who had taken 0 to 4 courses in statistics tended to represent problems on the basis of surface-level features. Students who had more than 4 courses did not consistently make choices on the basis of surface-level features, nor did they consistently rely on structural features. However, all students with statistics course backgrounds noticed structural features when competition between different types of features was eliminated. The role of surface and structural features in determining problem representations is discussed.
- Published
- 2008
7. Lies, damned lies, and statistics: epistemology and fiction in Defoe's A Journal of the Plague Year
- Author
- Seager, Nicholas
- Subjects
A Journal of the Plague Year (Novel) -- Criticism and interpretation; Literary techniques -- Analysis; Novelists -- Criticism and interpretation -- Works -- Analysis; Statistics -- Analysis; Literature/writing; Criticism and interpretation; Analysis; Works
- Abstract
This article considers Defoe's use of statistical data in his historical novel A Journal of the Plague Year, a device generally considered as a means of supplying a work of [...]
- Published
- 2008
8. Estimating time to event from longitudinal categorical data: an analysis of multiple sclerosis progression
- Author
- Mandel, Micha, Gauthier, Susan A., Guttmann, Charles R.G., Weiner, Howard L., and Betensky, Rebecca A.
- Subjects
Multiple sclerosis -- Influence; Multiple sclerosis -- Development and progression; Statistics -- Usage; Statistics -- Influence; Statistics -- Analysis; Mathematics
- Abstract
The expanded disability status scale (EDSS) is an ordinal score that measures progression in multiple sclerosis (MS). Progression is defined as reaching EDSS of a certain level (absolute progression) or increasing EDSS by one point (relative progression). Survival methods for time to progression are not adequate for such data because they do not exploit the EDSS level at the end of follow-up. Instead, we suggest a Markov transitional model applicable for repeated categorical or ordinal data. This approach enables derivation of covariate-specific survival curves, obtained after estimation of the regression coefficients and manipulations of the resulting transition matrix. Large-sample theory and resampling methods are employed to derive pointwise confidence intervals, which perform well in simulation. Methods for generating survival curves for time to EDSS of a certain level, time to increase EDSS by at least one point, and time to two consecutive visits with EDSS greater than 3 are described explicitly. The regression models described are easily implemented using standard software packages. Survival curves are obtained from the regression results using packages that support simple matrix calculation. We present and demonstrate our method on data collected at the Partners Multiple Sclerosis Center in Boston. We apply our approach to progression defined by time to two consecutive visits with EDSS greater than 3 and calculate crude (without covariates) and covariate-specific curves. KEY WORDS: Markov model; Multistate model; Ordinal response; Pointwise confidence interval; Survival curve; Time series; Transition model.
- Published
- 2007
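The abstract above describes a concrete last step: once a one-visit transition matrix over EDSS states has been estimated, time-to-event curves follow from matrix manipulation. A minimal sketch of that step, assuming a hypothetical, already-estimated transition matrix and treating a single visit in the top state as the event (the paper's two-consecutive-visits definition and covariate-specific matrices are omitted):

```python
import numpy as np

# Hypothetical one-visit transition matrix over four ordinal EDSS bands;
# state 3 ("EDSS > 3") is made absorbing, so the probability mass in it
# after t steps equals P(progression occurred by visit t).
P = np.array([
    [0.85, 0.10, 0.04, 0.01],
    [0.10, 0.75, 0.10, 0.05],
    [0.02, 0.10, 0.73, 0.15],
    [0.00, 0.00, 0.00, 1.00],
])

def time_to_event_curve(P, start_state=0, n_visits=20, event_state=3):
    """Cumulative probability of reaching the event state by each visit."""
    dist = np.zeros(P.shape[0])
    dist[start_state] = 1.0
    curve = []
    for _ in range(n_visits):
        dist = dist @ P  # advance the chain by one visit
        curve.append(dist[event_state])
    return np.array(curve)

print(time_to_event_curve(P))  # the survival-type curve is 1 minus these values
```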
9. Selection, growth and the size distribution of firms
- Author
- Luttmer, Erzo G. J.
- Subjects
Corporate growth -- Forecasts and trends; Econometric models -- Usage; Statistics -- Analysis; Industry growth; Market trend/market analysis; Business; Economics
- Abstract
The analysis of data from American companies, to develop an econometric model of balanced growth consistent with the size distribution of the firms, is presented.
- Published
- 2007
10. The design and precision of data-fusion studies
- Author
- Sharot, Trevor
- Subjects
Electronic data processing -- Methods; Marketing research -- Forecasts and trends; Statistics -- Analysis; Market trend/market analysis; Advertising, marketing and public relations; Business; Business, international
- Abstract
The development of a tool to determine the effective sample size during the creation of fused datasets for market research is described. The utility of this tool for calculating confidence intervals and testing the data to deliver precise outputs is discussed.
- Published
- 2007
11. Wealth accumulation and distribution in urban China
- Author
- Meng, Xin
- Subjects
China -- Economic policy; Income distribution -- Forecasts and trends; Statistics -- Analysis; Wages -- Forecasts and trends; Salary; Market trend/market analysis; Business; Economics; Social sciences
- Abstract
The effect of economic reforms on the earning potential of and income distribution among Chinese people living in urban areas is described. The statistical analysis of income and investments of Chinese city-dwellers between 1995 and 2002 is presented.
- Published
- 2007
12. Forecasting regional employment with shift-share and ARIMA modeling
- Author
- Mayor, Matias, Lopez, Ana Jesus, and Perez, Rigoberto
- Subjects
Asturias, Spain -- Economic policy; Econometric models -- Usage; Economic stabilization -- Forecasts and trends; Statistics -- Analysis; Market trend/market analysis; Regional focus/area studies
- Abstract
The utility of shift-share models and autoregressive integrated moving average models for the analysis of various economic scenarios, based on the statistics from an economically active population survey conducted in Asturias, Spain, is examined.
- Published
- 2007
13. Are regional incomes in the US converging? A non-linear perspective
- Author
- Christopoulos, Dimitris K. and Tsionas, Efthymios G.
- Subjects
Econometric models -- Usage; Income distribution -- Forecasts and trends; Statistics -- Analysis; Market trend/market analysis; Regional focus/area studies
- Abstract
Exponential smooth transition autoregressive models of the non-linearities in the convergence of regional incomes in the United States are presented.
- Published
- 2007
14. Geography and economic performance: exploratory spatial data analysis for Great Britain
- Author
- Patacchini, Eleonora and Rice, Patricia
- Subjects
United Kingdom -- Economic policy; Functional representation -- Economic aspects; Income distribution -- Forecasts and trends; Statistics -- Analysis; Market trend/market analysis; Regional focus/area studies
- Abstract
The exploratory spatial analysis of statistical data, to determine the correlation between occupational composition and regional disparities in the wages and productivity of British workers, is presented.
- Published
- 2007
15. Political uncertainty's effect on judicial recruitment and retention: Japan in the 1990s
- Author
- Ramseyer, J. Mark and Rasmussen, Eric B.
- Subjects
Japan -- Domestic policy; Judicial selection -- Forecasts and trends; Judicial selection -- Political aspects; Statistics -- Analysis; Market trend/market analysis; Business; Economics
- Abstract
The statistical analysis of the effect of political uncertainty on the trends of judicial recruitment and retention in Japan, during the 1990s, is presented.
- Published
- 2007
16. Statistical analysis of in-service pavement performance data for LTPP SPS-1 and SPS-2 experiments
- Author
- Haider, Syed Waqar, Chatti, Karim, Buch, Neeraj, Lyles, Richard W., Pulipaka, Aswani S., and Gilliland, Dennis
- Subjects
Pavements -- Mechanical properties; Statistics -- Analysis; Pavements -- Performance; Pavements -- Measurement; Engineering and manufacturing industries; Science and technology; Transportation industry
- Abstract
Observational or experimental studies are designed to investigate the effects of various factors on a response variable. This distinction is important because the latter studies are assumed to provide a firmer basis for establishing cause-and-effect relationships. However, experimental studies involving in-service pavement sections present certain concerns in statistical analyses, which are addressed in this paper. The challenges presented by the in-service pavements data included: (1) outlier issues; (2) quantification of performance; and (3) the lack of measurable distresses due to the 'young' age of test sections. Experiment-related issues included: (1) wide variation in traffic levels and ages among the test sites and (2) an unbalanced distribution of test sites among climatic zones and subgrade types. The importance of selecting appropriate analytical methods for obtaining reliable results is discussed in this paper. Though most of the methods that were applied for the analyses are well established, the choice of magnitude- versus frequency-based methods was driven by the extent and occurrence of distresses. Based on the data, frequency-based methods such as linear discriminant analysis and binary logistic regression lend themselves well to explaining trends associated with distresses with reasonable occurrence but lower magnitude, while a magnitude-based method like analysis of variance is more appropriate for evaluating distresses with high numbers of occurrence and magnitude. DOI: 10.1061/(ASCE)0733-947X(2007)133:6(378) CE Database subject headings: Pavements; Statistics; Data analysis; Performance characteristics.
- Published
- 2007
17. Use of nondestructive test deflection data for predicting airport pavement performance
- Author
- Gopalakrishnan, Kasthurirrangan and Thompson, Marshall R.
- Subjects
Airports -- Design and construction; Non-destructive testing -- Usage; Statistics -- Analysis; Pavements -- Performance; Pavements -- Measurement; Engineering and manufacturing industries; Science and technology; Transportation industry
- Abstract
Surface deflections using nondestructive tests (NDTs) were measured prior to and throughout the traffic testing at the U.S. Federal Aviation Administration's National Airport Pavement Test Facility (NAPTF). The first series of traffic tests involved repeated loading of six-wheel Boeing 777 and four-wheel Boeing 747 test gears on two different lanes until the pavements were deemed failed. The NAPTF structural failure criterion was defined as at least 25.4 mm (1 in.) surface upheaval adjacent to the traffic lane. A predetermined wander sequence was applied. Two low-strength subgrade and two medium-strength subgrade flexible pavement test sections were tested. Transverse surface profiles were measured periodically to monitor the progression of permanent deformation in pavements. Deflection basin parameters derived from NDT surface deflections were related to pavement rutting performance. An airport pavement functional failure criterion, defined in terms of number of traffic load repetitions to reach specific rut depth levels, was used in characterizing the structural response-performance relations. DOI: 10.1061/(ASCE)0733-947X(2007)133:6(389) CE Database subject headings: Deflection; Predictions; Airports; Pavements; Full-scale tests; Nondestructive tests; Performance characteristics.
- Published
- 2007
18. Inside the family firm: the role of families in succession decisions and performance
- Author
- Bennedsen, Morten, Nielsen, Kasper Meisner, Perez-Gonzalez, Francisco, and Wolfenzon, Daniel
- Subjects
Family corporations -- Management; Family-owned business enterprises -- Management; Statistics -- Analysis; Succession planning (Business) -- Forecasts and trends; Company business management; Market trend/market analysis; Business; Economics
- Abstract
The statistical analysis of the management of family-owned businesses in Denmark, to determine the impact of family characteristics on firm performance and the trends of succession planning in family-owned businesses, is presented.
- Published
- 2007
19. Relationship-specificity, incomplete contracts, and the pattern of trade
- Author
- Nunn, Nathan
- Subjects
Capital investments -- Economic aspects; Statistics -- Analysis; Contracts -- Management; Contracts -- Forecasts and trends; Capital investment; Market trend/market analysis; Business; Economics
- Abstract
The development of a variable to measure the impact of relation-specific investments and the enforcement of written contracts by national governments is described. The usage of this variable to analyze data on trade flows and judicial quality in different countries is discussed.
- Published
- 2007
20. Does sickness absence increase the risk of unemployment?
- Author
- Hesselius, Patrick
- Subjects
Diseases -- Sweden; Diseases -- Economic aspects; Statistics -- Analysis; Workers -- Health aspects; Business; Social sciences
- Abstract
The analysis of Swedish panel data, to determine the impact of the duration of absenteeism due to sickness on the probability of future unemployment among Swedish workers, is presented.
- Published
- 2007
21. Are people inequality averse, or do they prefer redistribution by the state? Evidence from German longitudinal data on life satisfaction
- Author
- Schwarze, Johannes and Harpfer, Marco
- Subjects
Germany -- Economic aspects; Income distribution -- Economic aspects; Statistics -- Analysis; Business; Social sciences
- Abstract
The analysis of panel data from the German Socio-Economic Panel Study held from 1985 to 1988, to determine the extent of inequality aversion among the German people, is presented. The economic aspects of the redistribution of wealth by the German provincial governments to reduce economic inequality are described.
- Published
- 2007
22. How wages change: micro evidence from the international wage flexibility project
- Author
- Dickens, William T., Goette, Lorenz, Groshen, Erica L., Holden, Steinar, Messina, Julian, Schweitzer, Mark E., Turunen, Jarkko, and Ward, Melanie E.
- Subjects
Statistics -- Analysis; Workers' compensation -- Forecasts and trends; Workers' compensation -- International aspects; Market trend/market analysis; Economics
- Abstract
The analysis of data from the international wage flexibility project, to determine the downward wage rigidity of workers who do not change jobs frequently, is presented.
- Published
- 2007
23. Bayesian wombling: curvilinear gradient assessment under spatial process models
- Author
- Banerjee, Sudipto and Gelfand, Alan E.
- Subjects
Coordinates, Curvilinear -- Analysis; Bayesian statistical decision theory -- Usage; Statistics -- Analysis; Spatial analysis (Statistics); Mathematics
- Abstract
Large-scale inference for random spatial surfaces over a region using spatial process models has been well studied. Under such models, local analysis of the surface (e.g., gradients at given points) has received recent attention. A more ambitious objective is to move from points to curves, to attempt to assign a meaningful gradient to a curve. For a point, if the gradient in a particular direction is large (positive or negative), then the surface is rapidly increasing or decreasing in that direction. For a curve, if the gradients in the direction orthogonal to the curve tend to be large, then the curve tracks a path through the region where the surface is rapidly changing. In the literature, learning about where the surface exhibits rapid change is called wombling, and a curve such as we have described is called a wombling boundary. Existing wombling methods have focused mostly on identifying points and then connecting these points using an ad hoc algorithm to create curvilinear wombling boundaries. Such methods are not easily incorporated into a statistical modeling setting. The contribution of this article is to formalize the notion of a curvilinear wombling boundary in a vector analytic framework using parametric curves and to develop a comprehensive statistical framework for curvilinear boundary analysis based on spatial process models for point-referenced data. For a given curve that may represent a natural feature (e.g., a mountain, a river, or a political boundary), we address the issue of testing or assessing whether it is a wombling boundary. Our approach is applicable to both spatial response surfaces and, often more appropriately, spatial residual surfaces. We illustrate our methodology with a simulation study, a weather dataset for the state of Colorado, and a species presence/absence dataset from Connecticut. KEY WORDS: Arc-length measure; Bayesian modeling; Directional derivative; Flux; Gaussian process; Line integral; Parametric curve; Wombling.
- Published
- 2006
24. Focused information criteria and model averaging for the Cox hazard regression model
- Author
- Hjort, Nils Lid and Claeskens, Gerda
- Subjects
Regression analysis -- Usage; Statistics -- Analysis; Mathematics
- Abstract
This article is concerned with variable selection methods for the Cox proportional hazards regression model. Including excessive covariates causes extra variability and inflated confidence intervals for regression parameters; thus regimes for discarding the less informative ones are needed. Our framework has p covariates designated as 'protected,' while variables from a further set of q covariates are examined for possible inclusion or exclusion. We develop a focused information criterion (FIC) that, for a given interest parameter, finds the best subset of covariates. Thus the FIC might find that the best model for predicting median survival time is different from the best model for estimating survival probabilities, and from the best overall model for analyzing women's survival. Methodology is also developed for model averaging, wherein the final estimate of a quantity is a weighted average of estimates computed for a range of submodels. Our methods are illustrated in simulations and for a survival study of Danish skin cancer patients. KEY WORDS: Akaike information criterion; Covariate selection; Cox regression; Focused information criteria; Median survival time; Model averaging.
- Published
- 2006
25. Terror and trade of individual investors
- Author
- Levy, Ori and Galili, Itai
- Subjects
Investments -- Forecasts and trends; Investors -- Beliefs, opinions and attitudes; Statistics -- Analysis; Terror -- Economic aspects; Terror -- Analysis; Market trend/market analysis; Business; Social sciences
- Abstract
The impact of terror on the emotions and trading decisions of individual investors is described. The analysis of data from three thousand households, for this purpose, is presented.
- Published
- 2006
26. Linking work design to mass customization: a sociotechnical systems perspective
- Author
- Liu, Gensheng, Shah, Rachna, and Schroeder, Roger G.
- Subjects
Economics -- Usage; Production management -- Methods; Statistics -- Analysis; Work design -- Forecasts and trends; Market trend/market analysis; Business; Business, general
- Abstract
The analysis of the relation between work design and mass customization, based on the application of the sociotechnical systems theory for the evaluation of the results of a survey, is presented.
- Published
- 2006
27. Tukey's paper after 40 years
- Author
- Mallows, Colin, Brillinger, David R., Buja, Andreas, Efron, Bradley, Huber, Peter J., and Landwehr, James M.
- Subjects
Statisticians -- Works; Statistics -- Analysis; Engineering and manufacturing industries; Mathematics; Science and technology
- Abstract
A brief overview of Tukey's paper 'The Future of Data Analysis', 40 years on, is given by various authors, who consider the debates over whether statistics is a science and ways to attract bright students by showing the excitement and rewards of applied work. It is argued that, beyond the ideas of data analysis, one should look at how data are analyzed by others, and that the key concept should be statistical thinking.
- Published
- 2006
28. Power priors and their use in clinical trials
- Author
- De Santis, Fulvio
- Subjects
Statistics -- Analysis; Statistics -- Usage; Clinical trials; Science and technology; Social sciences
- Abstract
This article reviews power priors, a class of prior distributions for an unknown parameter that exploits information from results of previous, similar studies, a situation arising often in clinical trials. The article shows that, for independent and identically distributed historical data, a basic formulation of power priors (geometric priors) can be obtained as the result of a prior updating-and-combining process based on training samples of iid historical data. This formulation gives an operational justification to power priors. It also allows us to relate the discount scalar quantity controlling the influence of historical information on final inference to the size of training samples. Properties of power priors and their extension to more complex set-ups are discussed. Then several examples are provided of their use in the analysis of clinical trials data. The approach is shown to be appropriate for handling problems arising when information is combined from different studies, such as lack of exchangeability between preceding and current data, and the risk that prior information overwhelms evidence from the study in question. KEY WORDS: Conjugate analysis; Elicitation; Geometric priors; Objective Bayesian analysis; Prediction; Training sample.
- Published
- 2006
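The power prior described in the entry above has a simple closed form in conjugate settings: the historical likelihood enters the prior raised to the discount scalar a0 in [0, 1]. A minimal sketch for a binomial response rate with a Beta baseline prior (the conjugate setup and the numbers are illustrative assumptions, not the article's general framework):

```python
from scipy import stats

def power_prior_posterior(x, n, x_hist, n_hist, a0, alpha0=1.0, beta0=1.0):
    """Beta posterior for a binomial proportion under a power prior.

    The historical likelihood Binomial(x_hist; n_hist, p) is raised to the
    discount scalar a0 (0 = ignore history, 1 = pool fully) and combined
    with a Beta(alpha0, beta0) baseline prior and the current-trial data.
    """
    alpha = alpha0 + a0 * x_hist + x
    beta = beta0 + a0 * (n_hist - x_hist) + (n - x)
    return stats.beta(alpha, beta)

# hypothetical current trial (12/40) discounting a historical study (30/100)
post = power_prior_posterior(x=12, n=40, x_hist=30, n_hist=100, a0=0.5)
print(post.mean(), post.interval(0.95))  # posterior mean and 95% interval
```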
29. A statistical approach to identifying poorly performing countries
- Author
- Anderson, Edward and Morrissey, Oliver
- Subjects
Developing countries -- Evaluation; Statistics -- Analysis; Economics; Political science; Regional focus/area studies
- Abstract
A study examining whether it is possible to identify poorly performing countries among developing countries based exclusively on statistical information is presented.
- Published
- 2006
30. Interpretation of subgroup results in clinical trial publications: insights from a survey of medical specialists in Ontario, Canada
- Author
- Parker, Andrea B. and Naylor, C. David
- Subjects
Cardiologists -- Beliefs, opinions and attitudes; Cardiologists -- Research; Cardiovascular research -- Information management; Statistics -- Analysis; Company systems management; Health
- Published
- 2006
31. Data is a commodity, but insight is gold
- Author
- Julka, Samantha
- Subjects
Surveys -- Methods; Insight -- Usage; Statistics -- Analysis; Business; Business, regional
- Abstract
A professor once told me, 'Data is a commodity; it's the insight that's gold.' The pandemic has changed the way we see everything, and right now, many people are reviewing [...]
- Published
- 2021
32. A robust measure of skewness
- Author
- Brys, G., Hubert, M., and Struyf, A.
- Subjects
Statistics -- Analysis; Statistics -- Models; Mathematics; Science and technology
- Abstract
The asymmetry of a univariate continuous distribution is commonly measured by the classical skewness coefficient. Because this estimator is based on the first three moments of the dataset, it is [...]
- Published
- 2004
33. A diagnostic plot for estimating the tail index of a distribution
- Author
- de Sousa, Bruno and Michailidis, George
- Subjects
Distribution (Probability theory) -- Analysis; Statistics -- Analysis; Mathematics; Science and technology
- Abstract
The problem of estimating the tail index in heavy-tailed distributions is very important in many applications. We propose a new graphical method that deals with this problem by selecting an [...]
- Published
- 2004
34. LOTUS: an algorithm for building accurate and comprehensible logistic regression trees
- Author
- Chan, Kin-Yee and Loh, Wei-Yin
- Subjects
Statistics -- Analysis; Regression analysis -- Analysis; Mathematics; Science and technology
- Abstract
Logistic regression is a powerful technique for fitting models to data with a binary response variable, but the models are difficult to interpret if collinearity, nonlinearity, or interactions are present. [...]
- Published
- 2004
35. Weber correspondence analysis: the one-dimensional case
- Author
- de Leeuw, Jan and Michailidis, George
- Subjects
Statistics -- Analysis; Mathematics; Science and technology
- Abstract
1. INTRODUCTION Correspondence analysis or CA can be interpreted as a technique for drawing weighted bipartite graphs (Michailidis and de Leeuw 2001). In the adjacency matrix of the bipartite graph [...]
- Published
- 2004
36. Statistical simulations on parallel computers
- Author
- Sevcikova, Hana
- Subjects
Statistical software -- Analysis; Statistics -- Analysis; Statistical/mathematical software; Mathematics; Science and technology
- Abstract
The potential benefits of parallel computing for time-consuming statistical applications are well known, but have not been widely realized in practice, perhaps in part due to associated technical obstacles. This [...]
- Published
- 2004
37. Clustering visualizations of multidimensional data
- Author
- Hurley, Catherine B.
- Subjects
Statistics -- Analysis; Multidimensional scaling -- Analysis; Mathematics; Science and technology
- Abstract
Many graphical methods for displaying multivariate data consist of arrangements of multiple displays of one or two variables: scatterplot matrices and parallel coordinates plots are two such methods. In principle [...]
- Published
- 2004
38. Evolutionary simulated annealing with application to image restoration
- Author
- Gluhovsky, Ilya
- Subjects
Statistics -- Analysis; Simulated annealing (Mathematics) -- Analysis; Mathematics; Science and technology
- Abstract
Simulated annealing is a randomized algorithm proposed for finding a global optimum in large problems where a target function may have many local extrema. This article considers a modification of [...]
- Published
- 2004
39. Asymmetric linear dimension reduction for classification
- Author
- Hennig, Christian
- Subjects
Statistics -- Analysis; Statistics -- Models; Mathematics; Science and technology
- Abstract
This article discusses methods to project a p-dimensional dataset with classified points from s known classes onto a lower dimensional hyperplane so that the classes appear optimally separated. Such projections [...]
- Published
- 2004
40. CARTscans: a tool for visualizing complex models
- Author
- Nason, Martha, Emerson, Scott, and LeBlanc, Michael
- Subjects
Statistics -- Analysis; Statistics -- Models; Mathematics; Science and technology
- Abstract
We present CARTscans, a graphical tool that displays predicted values across a four-dimensional subspace. We show how these plots are useful for understanding the structure and relationships between variables in [...]
- Published
- 2004
41. Exploratory data analysis for complex models
- Author
- Gelman, Andrew
- Subjects
Bayesian statistical decision theory -- Analysis; Statistics -- Analysis; Statistics -- Models; Mathematics; Science and technology
- Abstract
'Exploratory' and 'confirmatory' data analysis can both be viewed as methods for comparing observed data to what would be obtained under an implicit or explicit statistical model. For example, many [...]
- Published
- 2004
42. Discussion
- Author
- Buja, Andreas
- Subjects
Statistics -- Analysis; Mathematics; Science and technology
- Abstract
Gelman's article is a thought-provoking mix of opinions and creative methodology. I agree with Gelman that the disjunction of models and exploratory data analysis (EDA) in mainstream statistics is unsound. [...]
- Published
- 2004
43. A random pattern-mixture model for longitudinal data with dropouts
- Author
- Guo, Wensheng, Ratcliffe, Sarah J., and ten Have, Thomas T.
- Subjects
Statistics -- Analysis; Longitudinal method -- Usage; Mathematics
- Abstract
Pattern-mixture models are frequently used for longitudinal data analysis with dropouts because they do not require explicit specification of the dropout mechanism. These models stratify the data according to time to dropout and formulate a model for each stratum. This usually results in underidentifiability, because we need to estimate many pattern-specific parameters even though the eventual interest is usually on the marginal parameters. In this article we extend this framework to a random pattern-mixture model, where the pattern-specific parameters are treated as nuisance parameters and modeled as random instead of fixed. The pattern is defined according to a surrogate for the dropout process. A constraint is then put on the pattern by linking it to the time to dropout using a random-effects survival model. We assume, conditional on the latent pattern effects, that the longitudinal outcome and the dropout process are independent. This model retains the robustness of the traditional pattern-mixture models, while avoiding the overparameterization problem. When we define each subject as a separate stratum, this model reduces to the shared parameter model. Maximum likelihood estimates are obtained using an EM Newton-Raphson algorithm. We apply the method to the depression data from the Prevention of Suicide in Primary Care Elderly Collaborative Trial (PROSPECT). We show that when the dropout information is adjusted for under the proposed model, the treatment seems to reduce depression in the elderly. KEY WORDS: Dropout; EM algorithm; Mixed-effects model; Pattern-mixture model., 1. INTRODUCTION Many longitudinal studies suffer from attrition, which can cause bias in the analysis if the dropouts are informative. To account for informative dropout, a number of model-based approaches [...]
- Published
- 2004
44. Smooth design-adapted wavelets for nonparametric stochastic regression
- Author
- Delouille, V., Simoens, J., and von Sachs, R.
- Subjects
Regression analysis -- Methods; Statistics -- Analysis; Mathematics
- Abstract
We treat nonparametric stochastic regression using smooth design-adapted wavelets built by means of the lifting scheme. The proposed method automatically adapts to the nature of the regression problem, that is, to the irregularity of the design, to data on the interval, and to arbitrary sample sizes (which do not need to be a power of 2). As such, this method provides a uniform solution to the usual criticisms of first-generation wavelet estimators. More precisely, starting from the unbalanced Haar basis orthogonal with respect to the empirical design measure, we use weighted average interpolation to construct biorthogonal wavelets with a higher number of vanishing analyzing moments. We include a lifting step that improves the conditioning through constrained local semiorthogonalization. We propose a wavelet thresholding algorithm and show its numerical performance both on real data and in simulations including white, correlated, and heteroscedastic noise. KEY WORDS: Biorthogonal wavelet transform; Heteroscedastic data; Irregular design; Lifting scheme; Weighted average-interpolation., 1. INTRODUCTION Nonparametric curve estimation by wavelets has been treated in numerous articles in various setups. These range from the simple Gaussian iid error situation to more complicated data structures [...]
- Published
- 2004
45. New estimation and model selection procedures for semiparametric modeling in longitudinal data analysis
- Author
- Fan, Jianqing and Li, Runze
- Subjects
Statistics -- Analysis; Statistics -- Methods; Mathematics
- Abstract
Semiparametric regression models are very useful for longitudinal data analysis. The complexity of semiparametric models and the structure of longitudinal data pose new challenges to parametric inferences and model selection that frequently arise from longitudinal data analysis. In this article, two new approaches are proposed for estimating the regression coefficients in a semiparametric model. The asymptotic normality of the resulting estimators is established. An innovative class of variable selection procedures is proposed to select significant variables in the semiparametric models. The proposed procedures are distinguished from others in that they simultaneously select significant variables and estimate unknown parameters. Rates of convergence of the resulting estimators are established. With a proper choice of regularization parameters and penalty functions, the proposed variable selection procedures are shown to perform as well as an oracle estimator. A robust standard error formula is derived using a sandwich formula and is empirically tested. Local polynomial regression techniques are used to estimate the baseline function in the semiparametric model. KEY WORDS: Local polynomial regression; Partial linear model; Penalized least squares; Profile least squares; Smoothly clipped absolute deviation., 1. INTRODUCTION Longitudinal data are often highly unbalanced because the data were collected at irregular and possibly subject-specific time points. Due to their unbalanced nature, it is difficult to directly [...]
- Published
- 2004
46. Exact and approximate inferences for nonlinear mixed-effects models with missing covariates
- Author
- Wu, Lang
- Subjects
Analysis of variance -- Methods; Statistics -- Analysis; Mathematics
- Abstract
Nonlinear mixed-effects (NLME) models are popular in many longitudinal studies, including human immunodeficiency virus (HIV) viral dynamics, pharmacokinetic analyses, and studies of growth and decay. In practice, covariates in these studies often contain missing data, and so standard complete-data methods are not directly applicable. In this article we propose Monte Carlo parameter-expanded (PX)-EM algorithms for exact and approximate likelihood inferences for NLME models with missing covariates when the missing-data mechanism is ignorable. We allow arbitrary missing-data patterns and allow the covariates to be categorical, continuous, and mixed. The PX-EM algorithm maintains the simplicity and stability of the standard EM algorithm and may converge much faster than EM. The approximate method is computationally more efficient and may be preferable to the exact method when the exact method exhibits convergence problems, such as slow convergence or nonconvergence. It becomes an exact method for linear mixed-effects models and certain NLME models with missing covariates. We also discuss several sampling methods and convergence of the Monte Carlo (PX) EM algorithms. We illustrate the methods using a real data example from the study of HIV viral dynamics and compare the methods via a simulation study. KEY WORDS: EM algorithm; Gibbs sampling; Importance sampling; Linearization; PX-EM algorithm; Rejection sampling., 1. INTRODUCTION Nonlinear mixed-effects (NLME) models, or hierarchical nonlinear models, are popular in many longitudinal studies such as human immunodeficiency virus (HIV) viral dynamics, pharmacokinetic analyses, and studies of growth [...]
- Published
- 2004
47. Variable selection and model building via likelihood basis pursuit
- Author
- Zhang, Hao Helen, Wahba, Grace, Lin, Yi, Voelker, Meta, Ferris, Michael, Klein, Ronald, and Klein, Barbara
- Subjects
Statistics -- Analysis; Variables (Mathematics) -- Analysis; Variables (Mathematics) -- Methods; Mathematics
- Abstract
This article presents a nonparametric penalized likelihood approach for variable selection and model building, called likelihood basis pursuit (LBP). In the setting of a tensor product reproducing kernel Hilbert space, we decompose the log-likelihood into the sum of different functional components such as main effects and interactions, with each component represented by appropriate basis functions. Basis functions are chosen to be compatible with variable selection and model building in the context of a smoothing spline ANOVA model. Basis pursuit is applied to obtain the optimal decomposition in terms of having the smallest l_1 norm on the coefficients. We use the functional L_1 norm to measure the importance of each component and determine the 'threshold' value by a sequential Monte Carlo bootstrap test algorithm. As a generalized LASSO-type method, LBP produces shrinkage estimates for the coefficients, which greatly facilitates the variable selection process and provides highly interpretable multivariate functional estimates at the same time. To choose the regularization parameters appearing in the LBP models, generalized approximate cross-validation (GACV) is derived as a tuning criterion. To make GACV widely applicable to large datasets, its randomized version is proposed as well. A technique 'slice modeling' is used to solve the optimization problem and makes the computation more efficient. LBP has great potential for a wide range of research and application areas such as medical studies, and in this article we apply it to two large ongoing epidemiologic studies, the Wisconsin Epidemiologic Study of Diabetic Retinopathy (WESDR) and the Beaver Dam Eye Study (BDES). KEY WORDS: Generalized approximate cross-validation; LASSO; Monte Carlo bootstrap test; Nonparametric variable selection; Slice modeling; Smoothing spline ANOVA., 1. INTRODUCTION Variable selection, or dimension reduction, is fundamental to multivariate statistical model building. Not only does judicious variable selection improve the model's predictive ability, it also generally provides a [...]
- Published
- 2004
48. The effect of dependence on confidence intervals for a population proportion
- Author
- Miao, Weiwen and Gastwirth, Joseph L.
- Subjects
Statistics -- Models; Statistics -- Analysis; Science and technology; Social sciences
- Abstract
The binomial model is widely used in statistical applications. Usually, the success probability, p, and its associated confidence interval are estimated from a random sample. Thus, the observations are independent and identically distributed. Motivated by a legal case where some grand jurors could serve a second year, this article shows that when the observations are dependent, even slightly, the coverage probabilities of the usual confidence intervals can deviate noticeably from their nominal level. Several modified confidence intervals that incorporate the dependence structure are proposed and examined. Our results show that the modified Wilson, Agresti-Coull, and Jeffreys confidence intervals perform well and can be recommended for general use. KEY WORDS: Coverage probability; Dependent observations; Expected length of confidence interval; Jury discrimination., 1. INTRODUCTION Let X_1, X_2, ..., X_n be binomial random variables with common unknown success probability p. Let n_i be the number of trials for each X_i and N = n_1 [...]
- Published
- 2004
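For reference alongside the entry above, these are the classical Wilson and Agresti-Coull intervals that the article modifies for dependence; the sketch below implements only the standard independent-trials forms, not the paper's dependence-adjusted versions:

```python
import math

def wilson_interval(x, n, z=1.96):
    """Wilson score interval for a binomial proportion (independent trials)."""
    p = x / n
    center = (p + z**2 / (2 * n)) / (1 + z**2 / n)
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / (1 + z**2 / n)
    return center - half, center + half

def agresti_coull_interval(x, n, z=1.96):
    """Agresti-Coull interval: add z^2/2 pseudo-successes and failures."""
    n_t = n + z**2
    p_t = (x + z**2 / 2) / n_t
    half = z * math.sqrt(p_t * (1 - p_t) / n_t)
    return p_t - half, p_t + half

print(wilson_interval(18, 50))         # e.g. 18 successes in 50 trials
print(agresti_coull_interval(18, 50))
```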
49. Bootstrap methods for developing predictive models
- Author
- Austin, Peter C. and Tu, Jack V.
- Subjects
Statistics -- Analysis; Statistics -- Methods; Science and technology; Social sciences
- Abstract
Researchers frequently use automated model selection methods such as backwards elimination to identify variables that are independent predictors of an outcome under consideration. We propose using bootstrap resampling in conjunction with automated variable selection methods to develop parsimonious prediction models. Using data on patients admitted to hospital with a heart attack, we demonstrate that selecting those variables that were identified as independent predictors of mortality in at least 60% of the bootstrap samples resulted in a parsimonious model with excellent predictive ability. KEY WORDS: Acute myocardial infarction; Epidemiological research; Mortality; Multivariate analysis; Regression models; Variable selection., 1. INTRODUCTION Researchers frequently develop regression models to predict dichotomous outcomes. Investigators need to maintain a balance between including too many variables and model parsimony (Murtaugh 1998; Wears and Lewis [...]
- Published
- 2004
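The procedure in the abstract above is easy to emulate: rerun an automated selection step on each bootstrap resample and retain variables selected in at least 60% of replicates. A minimal sketch, with a univariable t-test screen standing in for the paper's backwards elimination (the screen, threshold, and toy data are illustrative assumptions):

```python
import numpy as np
from scipy import stats

def bootstrap_selection(X, y, n_boot=200, alpha=0.05, keep_frac=0.6, seed=0):
    """Fraction of bootstrap samples in which each column of X is 'selected'.

    Selection here is a univariable two-sample t-test per predictor against
    the binary outcome y, a stand-in for automated backwards elimination.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    counts = np.zeros(p)
    for _ in range(n_boot):
        idx = rng.integers(0, n, size=n)  # resample rows with replacement
        Xb, yb = X[idx], y[idx]
        for j in range(p):
            _, pval = stats.ttest_ind(Xb[yb == 1, j], Xb[yb == 0, j])
            counts[j] += pval < alpha
    freq = counts / n_boot
    return np.flatnonzero(freq >= keep_frac), freq

# toy data: only the first of five predictors is related to the outcome
rng = np.random.default_rng(1)
y = rng.integers(0, 2, size=300)
X = rng.normal(size=(300, 5))
X[:, 0] += y  # signal in column 0
selected, freq = bootstrap_selection(X, y)
print(selected, np.round(freq, 2))
```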
50. New developments in the use of location quotients to estimate regional input-output coefficients and multipliers
- Author
- Tohmo, Timo
- Subjects
Economic indicators -- Analysis; Regional development -- Analysis; Statistics -- Analysis; Regional focus/area studies
- Abstract
TOHMO T. (2004) New developments in the use of location quotients to estimate regional input-output coefficients and multipliers, Reg. Studies 38, 43-54. This study compares the survey-based regional input-output coefficients and production multipliers published by Statistics Finland (2000) with estimates obtained by applying location quotients (LQs) to national data. The consequences of using alternative adjustment formulae, the SLQ, CILQ and FLQ, are illustrated by an input-output model constructed for the Keski-Pohjanmaa (K-P) region. The results indicate that the SLQ and CILQ both produce highly misleading regional input-output coefficients and multipliers. These adjustment formulae are clearly not good enough for the purposes of making local policy and regional planning. The FLQ formula (β = 1) yields much better regional input-output coefficients and multipliers than the SLQ and CILQ. The FLQ gives very good estimates for regional multipliers in nearly all industries. The difference between the multipliers generated by the FLQ and the survey-based K-P regional multipliers is on average about -0.3%. The multipliers for the K-P region are typically much lower than for Finland as a whole, indicating that the economic structure of the K-P region is dependent on that of other regions. Hence there is a need to make proper allowance for interregional trade. In the case of the K-P region, the FLQ with β = 1 was able to offset the tendency of the CILQ to generate excessively large regional multipliers. Keywords: Input-output analysis; Location quotients; FLQ; Regional multipliers. JEL classifications: C67, R15, R20
- Published
- 2004
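The quotients compared in the entry above have standard textbook forms: SLQ_i = (e_i^R/E^R)/(e_i^N/E^N), CILQ_ij = SLQ_i/SLQ_j, and Flegg's FLQ, which scales the CILQ by λ* = (log2(1 + E^R/E^N))^β. A minimal sketch under those common definitions (the diagonal convention and β parameterization are assumptions; Tohmo's exact specification may differ):

```python
import numpy as np

def location_quotients(e_reg, e_nat, beta=1.0):
    """SLQ, CILQ, and FLQ matrices from regional/national employment by sector.

    e_reg, e_nat: 1-D arrays of sectoral employment. The FLQ scales the CILQ
    by lambda* = (log2(1 + E_R/E_N))**beta; diagonal cells use SLQ * lambda*.
    """
    E_R, E_N = e_reg.sum(), e_nat.sum()
    slq = (e_reg / E_R) / (e_nat / E_N)
    cilq = slq[:, None] / slq[None, :]  # CILQ_ij = SLQ_i / SLQ_j
    lam = np.log2(1 + E_R / E_N) ** beta
    flq = cilq * lam
    np.fill_diagonal(flq, slq * lam)    # a common convention for i == j
    return slq, cilq, flq

def regionalize(a_nat, lq):
    """Regional coefficients: scale national ones down only where LQ < 1."""
    return a_nat * np.minimum(lq, 1.0)

e_reg = np.array([4.0, 1.0, 2.0])       # toy sectoral employment, region
e_nat = np.array([40.0, 30.0, 30.0])    # toy sectoral employment, nation
slq, cilq, flq = location_quotients(e_reg, e_nat)
print(np.round(flq, 3))
```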