681 results on '"Regression analysis -- Models"'
Search Results
2. Goldilocks vs. Robin Hood: Using Shape-constrained Regression to Evaluate u-Shaped (Or Inverse u-Shaped) Theories in Data
- Author
-
Ganz, Scott C.
- Subjects
Regression analysis -- Models ,Mathematical statistics -- Usage ,Algorithms -- Usage ,Algorithm ,Social sciences
- Abstract
Theories that predict u-shaped and inverse u-shaped relationships are ubiquitous throughout the social sciences. As a result of this widespread interest in identifying u-shaped and inverse u-shaped relationships in data and the well-known problems with standard parametric approaches based on quadratic regression models, there has been considerable recent interest in finding new ways to evaluate such theories using semi-parametric and non-parametric methods. In this paper, I propose a new method for evaluating these theories, which I call the 'Goldilocks' algorithm. The algorithm is so named because it involves estimating three models in order to evaluate a u-shaped or inverse u-shaped hypothesis. One model is too flexible ('too hot') because it permits multiple inflection points in the expected relationship between x and y. One is too inflexible ('too cold') because it does not permit any inflection points. The final model ('just right') permits exactly one inflection point. In a simulation study based on 234 monotonic-increasing or inverse u-shaped functional forms and over 100 thousand simulated datasets, I show that my proposed algorithm outperforms the current favored method for testing u-shaped and inverse u-shaped hypotheses, called the 'Robin Hood' algorithm, in terms of controlling the false rejection rate and the power of the test. Key words: Statistics: nonparametric; Simulation: statistical analysis; Organizational studies: strategy, 1. Introduction Theories that predict u-shaped and inverse u-shaped relationships are ubiquitous throughout the social sciences (see, e.g. Haans et al. 2016, Lind and Mehlum 2010, Simonsohn 2018). In organizations [...]
- Published
- 2022
3. Multiple Organization Goals with Feedback from Shared Technological Task Environments
- Author
-
Hu, Songcui and Bettis, Richard A.
- Subjects
Technological innovations -- Analysis ,Regression analysis -- Models ,Environmental engineering -- Analysis ,Organizational learning -- Analysis ,Business, general ,Social sciences
- Abstract
Goals and the performance feedback on those goals are fundamental to organizational learning and adaptation. However, most research has focused on single overall, high-level organizational goals while ignoring important operational goals farther down in the goal hierarchy. This paper explores the important issue of interdependent feedback on multiple operational goals with shared task environments. We conjecture about the impact of shared technological task environments on feedback across goals. We then empirically examine these conjectures using panel vector autoregression (PVAR) analysis of performance feedback from three strategically important operational goals with shared technological task environments in the automobile industry. We find that interdependent feedback can lead to severe and misleading confusion regarding learning from feedback on such goals with shared task environments. Then, we discuss the implications of our findings. These include the following: the absolute intractability of the problem of meeting multiple goals with interdependent task environments as the number of goals increases; limits on the modularity of organization structure; and severe challenges in ex post credit assignment and ex ante planning when goals share technological task environments. Finally, we discuss the application of PVAR to interdependent feedback problems in organizations. Supplemental Material: The online appendices are available at https://doi.org/10.1287/orsc.2018.1207. Keywords: goal feedback interdependency * shared technological task environments * intractability of engineering design * organization design * panel vector autoregression (PVAR), Coordination among many interdependent actors in complex product development projects is recognized as a key activity in organization theory. --Mihm, Loch, and Huchzermeier (1) Introduction Goals, also called objectives or [...]
- Published
- 2018
4. Monetary transmission channels in Bangladesh: Evidence from a floating exchange rate regime
- Author
-
Younus, Sayera
- Subjects
Foreign exchange rates -- Forecasts and trends -- Analysis ,Monetary policy -- Analysis ,Regression analysis -- Models ,Market trend/market analysis ,Business ,Economics ,Business, international ,Regional focus/area studies
- Abstract
ABSTRACT The intention of this study is to examine the effectiveness of the monetary transmission channels in Bangladesh. A five-variable unrestricted Vector Auto Regression (VAR) technique is used to examine the [...]
- Published
- 2017
5. Logistics capability, logistics performance, and the moderating effect of firm size: Empirical evidence from east coast Malaysia
- Author
-
Zawawi, Nur Fadiah Binti Mohd, Wahab, Sazali Abdul, and Mamun, Abdullah Al
- Subjects
Industrial productivity -- Analysis ,Cost reduction -- Analysis ,Business logistics -- Analysis ,Regression analysis -- Models ,Productivity ,Business ,Economics ,Business, international ,Regional focus/area studies
- Abstract
ABSTRACT Based on the underlying Resource-Based View (RBV) perspective, the main objective of this study is to empirically examine the relationship of logistics capability and logistics performance, and the moderating [...]
- Published
- 2017
6. Determinants of financial inclusion in Bangladesh: Dynamic GMM & Quantile Regression approach
- Author
-
Uddin, Ajim, Chowdhury, Mohammad Ashraful Ferdous, and Islam, Md. Nazrul
- Subjects
Financial inclusion -- Analysis ,Economic growth -- Analysis ,Regression analysis -- Models ,Business ,Economics ,Business, international ,Regional focus/area studies
- Abstract
ABSTRACT Financial Inclusion has been recently acknowledged as a key enabler for reducing poverty and improving prosperity. However, more than 50% adults of the poorest households are still unbanked globally. [...]
- Published
- 2017
7. Use of random forests regression for predicting IRI of asphalt pavements
- Author
-
Gong, Hongren, Sun, Yiren, Shu, Xiang, and Huang, Baoshan
- Subjects
Asphalt pavements -- Mechanical properties -- Analysis ,Concrete cracking -- Analysis ,Regression analysis -- Models ,Algorithms ,Machine learning ,Databases ,Data mining ,Forests ,Business ,Construction and materials industries
- Abstract
ABSTRACT Random forest is a powerful machine learning algorithm with demonstrated success. In this study, the authors developed a random forests regression (RFR) model to estimate the international roughness index (IRI) [...]
- Published
- 2018
8. Data from Fasa University Broaden Understanding of Data Visualization (Using the multiple linear regression based on the relative importance metric and data visualization models for assessing the ability of drought indices)
- Subjects
Visualization (Computers) -- Models ,Regression analysis -- Models ,Computers
- Abstract
2023 OCT 10 (VerticalNews) -- By a News Reporter-Staff News Editor at Information Technology Newsweekly -- Investigators publish new report on data visualization. According to news reporting originating from Fasa, [...]
- Published
- 2023
9. Capital structure and corporate governance
- Author
-
Naseem, Muhammad Akram, Zhang, Huanping, Malik, Fizzah, and Rehman, Ramiz-Ur
- Subjects
Corporate governance -- Analysis ,Return on assets -- Analysis ,Capital structure -- Analysis ,Regression analysis -- Models ,Business ,Economics ,Business, international ,Regional focus/area studies
- Abstract
ABSTRACT Capital structure determination is considered as one of the key corporate financing decisions and managers often face difficulty in finding the optimal one. There are various theories regarding this [...]
- Published
- 2017
10. Generalized least squares and weighted least squares estimation methods for distributional parameters
- Author
-
Kantar, Yeliz Mert
- Published
- 2015
11. Does interest rate shocks transmit from United States to Ghana? Evidence from vector auto-regression
- Author
-
Oguanobi, Chibuike R., Akamobi, Anthony A., Ifebi, Ogonna E., and Maduka, Anne C.
- Subjects
Macroeconomics -- Analysis ,Interest rates -- Forecasts and trends -- Analysis ,Developing countries -- Economic aspects ,Regression analysis -- Models ,Market trend/market analysis ,Business ,Economics ,Business, international ,Regional focus/area studies
- Abstract
In the heat of severe global macroeconomic volatility, monetary authorities in the developing world are faced with the challenge of identifying the sources of such volatilities in their countries. Previous studies on developed countries have attributed home country macroeconomic shocks to monetary policy shocks in foreign countries. Consequently, it was suspected that interest rate shocks in trading partner countries such as the United States might be contributing to macroeconomic shocks in Ghana. Hence, the need to replicate such studies on Ghana becomes imperative. Therefore, this paper aims at (i) ascertaining whether Ghana's interest rate respond to the interest rate shocks in USA (ii) finding out if interest rate shocks in USA affect other basic macroeconomic variables of the Ghanaian economy. Data on four Ghanaian variables of real gross domestic product, consumer price index, exchange rate and interest rate were collected. Data on U.S variables (federal fund rate and the world consumer price index) were also collected. All data series were annual and span the period, 1983 to 2011. A VAR model of the Ghanaian economy was specified assuming the U.S variables to be exogenous, affecting the vector of endogenous Ghanaian variables contemporaneously. To automatically resolve the problem of unit root found in our series when the mean reversion status of the series were checked, a VECM of the VAR model was estimated. The study further generated the impulse responses (IR) of the Ghanaian variables to variables of the United States. The impulse response analysis of our VAR model shows that in general, interest rate shocks in the United States as well as shocks in the global consumer prices led to insignificant fluctuations in the Ghanaian macro economy. 
While the countries' real gross domestic product, interest rates and consumer price indices responded insignificantly to shocks originating from the United States (the upper standard error band fluctuated between 0.00 and 0.50 and the lower standard error band fluctuated between -0.00 and -0.40), its exchange rate responded significantly to shocks from U.S exchange rates (the upper standard error band took off from 0.00, went down to -0.03 before going up to 0.02 while its lower standard error band fluctuated between -0.00 and -0.4). By implication, macroeconomic shocks in Ghana are mostly home-made. To avoid being misled into wrong policy decisions, policy makers in Ghana were therefore advised to always strike a balance between imported and home-made shocks when allocating their policy making resources. JEL Classifications: F36, F41, F65 Keywords: Interest rate, shocks, International transmission, Ghana, U.S.A., VAR., INTRODUCTION Macroeconomic theory asserts that economic policy shocks are transmitted from one country to another. This assertion emphasizes the international transmission effects of monetary policy shocks. Over the decades, one [...]
- Published
- 2015
12. Complex samples and regression-based inference: considerations for consumer researchers
- Author
-
Nielsen, Robert B. and Seay, Martin C.
- Subjects
Regression analysis -- Models ,Marketing research -- Forecasts and trends -- Evaluation ,Market trend/market analysis ,Advertising, marketing and public relations ,Business, general ,Business
- Abstract
This article demonstrates that researchers who treat data collected via complex sampling procedures as if they were collected via simple random sample (SRS) may draw improper inferences when estimating regression [...]
- Published
- 2014
13. OR forum--an algorithmic approach to linear regression
- Author
-
Bertsimas, Dimitris and King, Angela
- Subjects
Regression analysis -- Models ,Linear programming -- Analysis ,Mathematical optimization -- Analysis ,Business ,Mathematics
- Abstract
Linear regression models are traditionally built through trial and error to balance many competing goals such as predictive power, interpretability, significance, robustness to error in data, and sparsity, among others. [...]
- Published
- 2016
14. Risk estimation via regression
- Author
-
Broadie, Mark, Du, Yiping, and Moallemi, Ciamac C.
- Subjects
Regression analysis -- Models ,Financial risk -- Analysis ,Monte Carlo method -- Models -- Analysis ,Business ,Mathematics
- Abstract
We introduce a regression-based nested Monte Carlo simulation method for the estimation of financial risk. An outer simulation level is used to generate financial risk factors and an inner simulation [...]
- Published
- 2015
15. Managing trade-in programs based on product characteristics and customer heterogeneity in business-to-business markets
- Author
-
Li, Kate J., Fong, Duncan K.H., and Xu, Susan H.
- Subjects
Business-to-business market -- Forecasts and trends -- Management ,Regression analysis -- Models ,Cluster analysis ,Company business management ,Business to business market ,Business
- Abstract
Trade-in programs are offered extensively in business-to-business (B2B) markets. The success of such programs depends on well-designed and executed trade-in policies as well as accurate prediction of return flow to [...]
- Published
- 2011
16. Least absolute relative error estimation
- Author
-
Chen, Kani, Guo, Shaojun, Lin, Yuanyuan, and Ying, Zhiliang
- Subjects
Regression analysis -- Models ,Logarithmic functions -- Models ,Weighting (Statistics) -- Usage ,Mathematics
- Abstract
Multiplicative regression model or accelerated failure time model, which becomes linear regression model after logarithmic transformation, is useful in analyzing data with positive responses, such as stock prices or life times, that are particularly common in economic/financial or biomedical studies. Least squares or least absolute deviation are among the most widely used criterions in statistical estimation for linear regression model. However, in many practical applications, especially in treating, for example, stock price data, the size of relative error, rather than that of error itself, is the central concern of the practitioners. This paper offers an alternative to the traditional estimation methods by considering minimizing the least absolute relative errors for multiplicative regression models. We prove consistency and asymptotic normality and provide an inference approach via random weighting. We also specify the error distribution, with which the proposed least absolute relative errors estimation is efficient. Supportive evidence is shown in simulation studies. Application is illustrated in an analysis of stock returns in Hong Kong Stock Exchange. KEY WORDS: Logarithm transformation; Multiplicative regression model; Random weighting.
- Published
- 2010
17. Model to describe the binodal curve on a type 1 ternary phase diagram
- Author
-
Lee, Kenneth Y.
- Subjects
Regression analysis -- Models ,Water, Underground -- Contamination ,Graphic methods -- Research ,Engineering and manufacturing industries ,Environmental issues
- Abstract
A regression curve based on a Weibull distribution model is generated to describe the binodal curve on a three component liquid-liquid ternary phase diagram. The methodology involves transforming the experimental phase partitioning data points of each mixture from the ternary phase diagram to equivalent Cartesian coordinates. The regression curve is then generated, and the regression curve becomes the Weibull-derived binodal curve by superimposing the regression curve onto the original ternary phase diagram. A total of seven regression curves for seven ternary mixtures are generated in this study. From the regression analysis, the resulting $R^2$ value for each mixture is very close to one, indicating that there is strong correlation between the regression curve and the transformed experimental data points. The method developed in this research allows researchers to empirically describe a binodal curve, which is useful in analyzing and predicting the various phase partitioning scenarios of a three-component liquid-liquid system. DOI: 10.1061/(ASCE)EE.1943-7870.0000196 CE Database subject headings: Groundwater pollution; Equilibrium; Regression models; Nonaqueous phase liquids. Author keywords: Groundwater pollution; Nonaqueous phase liquid; Equilibrium; Regression models.
- Published
- 2010
18. Multiple linear regression model approach for aerosol dispersion in ventilated spaces using computational fluid dynamics and dimensional analysis
- Author
-
Hoque, Shamia, Farouk, Bakhtier, and Haas, Charles N.
- Subjects
Fluid dynamics -- Research ,Regression analysis -- Models ,Aerosols -- Chemical properties ,Aerosols -- Environmental aspects ,Ventilation -- Research ,Dimensional analysis -- Research ,Engineering and manufacturing industries ,Environmental issues
- Abstract
Aerosol dispersion in living spaces especially bioaerosols, due to accidents or deliberate acts, is of significant current interest. Computational fluid dynamics (CFD) provides an accurate and detailed platform to study the influence of different parameters on aerosol distribution in indoor spaces. The simulations however are time consuming and site-specific. The work here introduces an approach toward addressing this challenge. During emergencies, an accurate, quicker, and more general model is required to give rapid answers to first responders. Significant parameters influencing aerosol behavior in an office room were identified and through dimensional analysis, nine dimensionless groups were developed. Fractional factorial design was used to build sixteen scenarios to explore the design space. These scenarios were then simulated using a comprehensive CFD model. Large eddy simulation with the Smagorinsky subgrid scale model was applied to compute the airflow. Aerosols were modeled as a dispersed solid phase using the Lagrangian treatment. The influence of the dimensionless groups on the temporal variation of the number of aerosols in the room and the spatial distribution of the particles in the room was analyzed. The results showed that all the identified dimensionless groups were significant. Multiple linear regression models were developed for the prediction of the number of aerosols in the room and their spatial distribution as a function of the significant parameters influencing aerosol transport. The linear models accurately predicted the data on which they were based but did not predict the results of the independent tests as well. The limited predictive ability of the model showed that the relationships between the dimensionless groups are nonlinear and a higher level of experimental design will have to be applied to better explore the design space. 
DOI: 10.1061/(ASCE)EE.1943-7870.0000201 CE Database subject headings: Air flow; Computational fluid dynamics technique; Design; Dimensional analysis; Regression models. Author keywords: Aerosols; Air flow; Computational fluid dynamics; Design; Dimensional analysis; Regression models.
- Published
- 2010
19. Inference in semiparametric regression models under partial questionnaire design and nonmonotone missing data
- Author
-
Chatterjee, Nilanjan and Li, Yan
- Subjects
Missing observations (Statistics) -- Usage ,Questionnaires -- Analysis ,Regression analysis -- Models ,Mathematics
- Abstract
In epidemiologic studies, partial questionnaire design (PQD) can reduce cost, time, and other practical burdens associated with lengthy questionnaires by assigning different subsets of the questionnaire to different, but overlapping, subsets of the study participants. In this article, we describe methods for semiparametric inference for regression model under PQD and other study settings that can generate nonmonotone missing data in covariates. In particular, motivated from methods for multiphase designs, we develop three estimators, namely mean score, pseudo-likelihood, and semiparametric maximum likelihood, each of which has some unique advantages. We develop the asymptotic theory and a sandwich variance estimator for each of the estimators under the underlying semiparametric model that allows the distribution of the covariates to remain nonparametric. We study the finite sample performances and relative efficiencies of the methods using simulation studies. We illustrate the methods using data from a case-control study of non-Hodgkin's lymphoma where the data on the main chemical exposures of interest are collected using two different instruments on two different, but overlapping, subsets of the participants. This article has supplementary material online. KEY WORDS: Mean score; Multiphase design; Outcome dependent sampling; Pseudo-likelihood
- Published
- 2010
20. Analysis of the nonlinear response of electricity prices to fundamental and strategic factors
- Author
-
Chen, D. and Bunn, D.W.
- Subjects
Electric power -- Prices and rates ,Electric power -- Supply and demand ,Electric utilities -- Forecasts and trends ,Electric utilities -- Market share ,Nonlinear theories -- Analysis ,Regression analysis -- Models ,Company pricing policy ,Market trend/market analysis ,Company market share ,Business ,Electronics ,Electronics and electrical industries
- Published
- 2010
21. Strong approach of quasi-maximum likelihood estimators in a semi-parametric regression model
- Author
-
Chang, Zhenhai, Liu, Wei, and Zhang, Desheng
- Subjects
Regression analysis -- Models ,Parameter estimation -- Research ,Convergence (Mathematics) -- Research ,High technology industry ,Business, international ,Law
- Abstract
This paper considers a semi-parametric regression model with fixed design points. Firstly, the estimators [??] and [??](*) of, respectively, $\beta$ and g(*) are derived by using the weight function and quasi-maximum likelihood methods. Then, under proper conditions, strong consistencies of these estimators are established and strong convergence rates of the estimators [??] and [??](*) are obtained respectively. Keywords Semi-parametric regression model, quasi-maximum likelihood estimator, consistency, convergence rate. §1. Introduction Consider a semi-parametric regression model $y_k = X_k^T \beta + g(t_k) + e_k$, $k = 1, 2, \ldots, n$. (1) where $\beta \in \mathbb{R}^d$ is an unknown regression [...]
- Published
- 2010
22. Linear regression models and neural networks for the fast emulation of a molecular absorption code
- Author
-
Euvrard, Guillaume, Rivals, Isabelle, Huet, Thierry, Lefebvre, Sidonie, and Simoneau, Pierre
- Subjects
Regression analysis -- Models ,Code generators -- Design and construction ,Program generators -- Design and construction ,Neural networks -- Usage ,Code generator ,Neural network ,Astronomy ,Physics
- Abstract
The background scene generator MATISSE, whose main functionality is to generate natural background radiance images, makes use of the so-called Correlated K (CK) model. It necessitates either loading or computing thousands of CK coefficients for each atmospheric profile. When the CK coefficients cannot be loaded, the computation time becomes prohibitive. The idea developed in this paper is to substitute fast approximate models for the exact CK generator; using the latter, a representative set of numerical examples is built and used to train linear or nonlinear regression models. The resulting models enable an accurate CK coefficient computation for all the profiles of an image in a reasonable time. OCIS codes: 000.4430, 010.1030, 010.5620, 200.4260, 010.1300, 110.2960.
- Published
- 2009
23. Estimation of the mean and variance response surfaces when the means and variances of the noise variables are unknown
- Author
-
Tan, Matthias Hwai Yong and Ng, Szu Hui
- Subjects
Algorithms -- Analysis ,Algorithms -- Models ,Robust statistics -- Analysis ,Regression analysis -- Models ,Algorithm ,Business ,Engineering and manufacturing industries
- Abstract
The means and variances of noise variables are typically assumed known in the design and analysis of robust design experiments. However, these parameters are often not known with certainty and estimated with field data. Standard experimentation and optimization conducted with the estimated parameters can lead to results that are far from optimal due to variability in the data. In this paper, the estimation of the mean and variance response surfaces are considered using a combined array experiment in which estimates of the means and variances of the noise variables are obtained from random samples. The effects of random sampling error on the estimated mean and variance models are studied and a method to guide the design of the sampling effort and experiment to improve the estimation of the models is proposed. Mathematical programs are formulated to find the sample sizes for the noise variables and number of factorial, axial and center point replicates for a mixed resolution design that minimize the average variances of the estimators for the mean and variance models. Furthermore, an algorithm is proposed to find the optimal design and sample sizes given a candidate set of design points. [Supplementary materials are available for this article. Go to the publisher's online edition of IIE Transactions for the following free supplemental resource: Appendix] Keywords: Robust parameter design, combined array designs, estimation of mean and variance models, optimal sample sizes, 1. Introduction Robust Parameter Design (RPD) is a quality improvement methodology based on design of experiments for designing products and processes that are insensitive to variation in noise variables. The [...]
- Published
- 2009
24. Application and comparison of robust linear regression methods for trend estimation
- Author
-
Muhlbauer, Andreas, Spichtinger, Peter, and Lohmann, Ulrike
- Subjects
Regression analysis -- Methods ,Regression analysis -- Models ,Precipitation (Meteorology) -- Forecasts and trends ,Atmospheric temperature -- Forecasts and trends ,Time-series analysis -- Methods ,Meteorological research -- Methods ,Meteorological research -- Comparative analysis ,Market trend/market analysis ,Earth sciences
- Abstract
In this study, robust parametric regression methods are applied to temperature and precipitation time series in Switzerland and the trend results are compared with trends from classical least squares (LS) regression and nonparametric approaches. It is found that in individual time series statistically outlying observations are present that influence the LS trend estimate severely. In some cases, these outlying observations lead to an over-/underestimation of the trends or even to a trend masking. In comparison with the classical LS method and standard nonparametric techniques, the use of robust methods yields more reliable trend estimations and outlier detection.
- Published
- 2009
25. Smoothing spline semiparametric nonlinear regression models
- Author
-
Wang, Yuedong and Ke, Chunlei
- Subjects
Hilbert space -- Analysis ,Maximum likelihood estimates (Statistics) -- Usage ,Regression analysis -- Models ,Mathematics ,Science and technology
- Published
- 2009
26. Modelling the gender pay gap in the UK: 1998 to 2006
- Author
-
Barnard, Andrew
- Subjects
United Kingdom. Office for National Statistics -- Reports ,Regression analysis -- Models ,Wages -- Laws, regulations and rules ,Wages -- Demographic aspects ,Salary ,Government regulation ,Business, international
- Published
- 2008
27. Robust and efficient adaptive estimation of binary-choice regression models
- Author
-
Cizek, Pavel
- Subjects
Maximum likelihood estimates (Statistics) -- Models ,Regression analysis -- Models ,Robust statistics -- Models ,Mathematics
- Abstract
The binary-choice regression models, such as probit and logit, are used to describe the effect of explanatory variables on a binary response variable. Typically estimated by the maximum likelihood method, estimates are very sensitive to deviations from a model, such as heteroscedasticity and data contamination. At the same time, the traditional robust (high-breakdown point) methods, such as the maximum trimmed likelihood, are not applicable because, by trimming observations, they induce nonidentification of parameter estimates. To provide a robust estimation method for binary-choice regression, we consider a maximum symmetrically trimmed likelihood estimator (MSTLE) and design a parameter-free adaptive procedure for choosing the amount of trimming. The proposed adaptive MSTLE preserves the robust properties of the original MSTLE, significantly improves the finite-sample behavior of MSTLE, and also ensures the asymptotic equivalence of the MSTLE and maximum likelihood estimator under no contamination. The results concerning the trimming identification, robust properties, and asymptotic distribution of the proposed method are accompanied by simulation experiments and an application documenting the finite-sample behavior of some existing and the proposed methods. KEY WORDS: Binary-choice regression; Breakdown point; Maximum likelihood estimation; Robust estimation; Trimming.
- Published
- 2008
28. Penalized estimating functions and variable selection in semiparametric regression models
- Author
-
Johnson, Brent A., Lin, D.Y., and Zeng, Donglin
- Subjects
Parameter estimation -- Analysis ,Regression analysis -- Models ,Mathematics
- Abstract
We propose a general strategy for variable selection in semiparametric regression models by penalizing appropriate estimating functions. Important applications include semiparametric linear regression with censored responses and semiparametric regression with missing predictors. Unlike the existing penalized maximum likelihood estimators, the proposed penalized estimating functions may not pertain to the derivatives of any objective functions and may be discrete in the regression coefficients. We establish a general asymptotic theory for penalized estimating functions and present suitable numerical algorithms to implement the proposed estimators. In addition, we develop a resampling technique to estimate the variances of the estimated regression coefficients when the asymptotic variances cannot be evaluated directly. Simulation studies demonstrate that the proposed methods perform well in variable selection and variance estimation. We illustrate our methods using data from the Paul Coverdell Stroke Registry. KEY WORDS: Accelerated failure time model; Buckley-James estimator; Censoring; Least absolute shrinkage and selection operator; Least squares; Linear regression; Missing data; Smoothly clipped absolute deviation.
- Published
- 2008
29. Using panel data analysis to estimate DEA confidence intervals adjusted for the environment
- Author
-
Barnum, Darold T., Gleason, John M., and Hemily, Brendon
- Subjects
Panel analysis -- Methods ,Data envelopment analysis -- Methods ,Cost control -- Methods ,Public transportation -- Equipment and supplies ,Public transportation -- Environmental aspects ,Metropolitan areas -- Environmental aspects ,Regression analysis -- Models ,Cost reduction ,Engineering and manufacturing industries ,Science and technology ,Transportation industry
- Abstract
This paper illustrates three concepts new to the data envelopment analysis (DEA) literature, and applies them to data from Canadian paratransit agencies. First, it predicts valid confidence intervals and trends for each agency's true efficiency. Second, it uses panel data analysis methodology, a set of statistical procedures that are more likely to produce valid estimates than those commonly used in DEA studies. Third, it uses a new method of identifying and adjusting for environmental effects that has more power than conventional procedures. CE Database subject headings: Cost control; Data analysis; Economic models; Environmental issues; Performance characteristics; Public transportation; Urban areas; Regression models.
- Published
- 2008
30. Application of multidimensional selective item response regression model for studying multiple gene methylation in SV40 oncogenic pathways
- Author
-
Lin, Haiqun, Feng, Ziding, Yu, Yan, Zheng, Yingye, Shivapurkar, Narayan, and Gazdar, Adi F.
- Subjects
Methylation -- Analysis ,Oncogenic viruses -- Analysis ,Regression analysis -- Models ,Mathematics - Abstract
Alteration of gene methylation patterns has been reported to be involved in the early onsets of many human malignancies. Many exogenous risk factors, such as cigarette smoke, dietary additives, chemical exposures, radiation, and biologic agents including viral infection, are involved in the methylation pathways of cancers. We propose a multidimensional selective item response regression model to describe and test how a risk factor may alter molecular pathways involving aberrant methylation of multiple genes in oncogenesis. Our modeling framework is built on an item response model for multivariate dichotomous responses of high dimension, such as aberrant methylation of multiple tumor-suppressor genes, but we allow risk factors such as SV40 viral infection to alter the distribution of the latent factors that subsequently affect the outcome of cancer. We postulate empirical identification conditions under our model formulation. Moreover, we do not prespecify the links between the multiple dichotomous methylation responses and the latent factors, but rather conduct specification searches with a genetic algorithm to discover the links. Parameter estimation through maximum likelihood and specification searches in models with multidimensional latent factors for multivariate binary responses have become practical only recently, due to modern statistical computing development. We illustrate our proposal with the biological finding that simultaneous methylation of multiple tumor-suppressor genes is associated with the presence of SV40 viral sequences and with the cancer status of lymphoma/leukemia. We are able to test whether the data are consistent with the causal hypothesis that SV40 induces aberrant methylation of multiple genes in its oncogenic pathways. At the same time, we are able to evaluate the role of SV40 in the methylation pathway and to determine whether the methylation pathway is responsible for the development of leukemia/lymphoma. 
KEY WORDS: Biomarker; Causal pathway; Factor analysis; Genetic algorithm; Identification; Item response; Joint model; Latent variable; Specification search.
- Published
- 2008
31. Comparing the performance of bus routes after adjusting for the environment using data envelopment analysis
- Author
-
Barnum, Darold T., Tandon, Sonali, and McNeil, Sue
- Subjects
Public transportation -- Economic aspects ,Public transportation -- Social aspects ,Metropolitan areas -- Economic aspects ,Metropolitan areas -- Social aspects ,Regression analysis -- Models ,Electronic data processing -- Methods ,Engineering and manufacturing industries ,Science and technology ,Transportation industry - Abstract
Public transit managers strive to attain multiple goals with tightly constrained resources. Ratio analysis has evolved into a powerful tool for dealing with these goals and constraints. Ratio analysis provides analytical methods for comparing the performance of multiple agencies, as well as the performance of subunits within a particular agency, in order to identify opportunities for improvement. One ratio analysis procedure that has become increasingly popular is data envelopment analysis (DEA). DEA yields a single, comprehensive measure of performance, the ratio of the aggregated, weighted outputs to aggregated, weighted inputs. This paper makes two contributions to the practice of transit performance evaluation using DEA. First, instead of using DEA to compare the performance of multiple transit systems, it uses DEA to compare the performance of multiple bus routes of one transit system. Second, it introduces a new procedure for adjusting the raw DEA scores that modifies these scores to account for the environmental influences that are beyond the control of the transit agency. DOI: 10.1061/(ASCE)0733-947X(2008)134:2(77) CE Database subject headings: Cost control; Data analysis; Economic models; Environmental issues; Public transportation; Regression models; Urban areas; Performance characteristics.
- Published
- 2008
32. Comparison of linear and mixed-effect regression models and a k-nearest neighbour approach for estimation of single-tree biomass
- Author
-
Fehrmann, Lutz, Lehtonen, Aleksi, Kleinn, Christoph, and Tomppo, Erkki
- Subjects
Plant biomass -- Models ,Regression analysis -- Models ,Pine -- Properties -- Models ,Spruce -- Properties -- Models ,Earth sciences ,Models ,Properties - Abstract
Abstract: Allometric biomass models for individual trees are typically specific to site conditions and species. They are often based on a low number of easily measured independent variables, such as [...]
- Published
- 2008
33. Regression analysis of longitudinal data in the presence of informative observation and censoring times
- Author
-
Sun, Jianguo, Sun, Liuquan, and Liu, Dandan
- Subjects
Tumors -- Research ,Bladder cancer -- Research ,Regression analysis -- Models ,Mathematics - Abstract
Longitudinal data frequently occur in many studies, such as longitudinal follow-up studies. To develop statistical methods and theory for the analysis of these data, independent or noninformative observation and censoring times are typically assumed, which naturally leads to inference procedures conditional on observation and censoring times. But in many situations this may not be true or realistic; that is, longitudinal responses may be correlated with observation times as well as censoring times. This article considers the analysis of longitudinal data where these correlations may exist and proposes a joint modeling approach that uses some latent variables to characterize the correlations. For inference about regression parameters, estimating equation approaches are developed and both large-sample and finite-sample properties of the proposed estimators are established. In addition, some graphical and numerical procedures are presented for model checking. The methodology is applied to a bladder cancer study that motivated this investigation. KEY WORDS: Estimating equation; Informative observation process; Joint modeling; Latent variables.
- Published
- 2007
34. Bayesian degradation modeling in accelerated pavement testing with estimated transformation parameter for the response
- Author
-
Onar, Arzu, Thomas, Fridtjof, Choubane, Bouzid, and Byron, Tom
- Subjects
Monte Carlo method -- Usage ,Bayesian statistical decision theory -- Usage ,Regression analysis -- Models ,Pavements -- Performance ,Pavements -- Testing ,Engineering and manufacturing industries ,Science and technology ,Transportation industry - Abstract
We discuss Bayesian degradation models that were developed for flexible pavements based on accelerated pavement testing with the heavy vehicle simulator. The models are fitted to data from the Florida Department of Transportation, where rutting performance of three binder types was tested under three temperature settings. The analysis utilizes Bayesian linear mixed-effects models for longitudinal degradation data where the parameter estimates and their posterior marginal distributions are obtained via a Markov chain Monte Carlo (MCMC) technique. The linearity in this model is achieved by utilizing a covariate-dependent Box-Cox transformation of the response variable, where the transformation parameter is estimated as part of the modeling procedure. The paper illustrates the various forms of useful inference that can easily be obtained via the output from the MCMC chains and provides insights regarding the accelerated test experiment at hand. As expected, the results suggest that rut depth development is affected both by the binder type, as well as the test temperature. What is more, the conditional inference made possible by the Bayesian approach utilized here clearly demonstrates the dependence of the inference for the covariate effects on the value of the Box-Cox transformation parameter. Hence the transformation of the response variable is an important step in model building that has to be carefully considered. CE Database subject headings: Pavement design; Statistics; Regression models; Parameters; Transformations.
- Published
- 2007
35. Evaluation and modeling of repeated load test data of asphalt concrete for mechanistic-empirical pavement design
- Author
-
Price, Stephen, Mehta, Yusuf, and McCarthy, Leslie Myers
- Subjects
Asphalt concrete -- Mechanical properties ,Regression analysis -- Models ,Pavements -- Live loads ,Pavements -- Evaluation ,Engineering and manufacturing industries ,Science and technology - Abstract
The Mechanistic-Empirical Pavement Design Guide (MEPDG) made available by the National Cooperative Highway Research Program in 2004 uses a power model that incorporates dynamic modulus (DM) as a factor in predicting rutting performance of asphalt concrete. The rutting model is empirical and the rutting profiles obtained from the model are based largely on the adjustment of calibration constants, not the DM. Various DM values would be capable of producing similar rutting profiles based on the current model. This study evaluates using accumulated strain data obtained from the Repeated Load Test (RLT) as an alternative to using DM. In addition, a model to represent the accumulated strain data obtained from the RLT is developed. The accuracy and behavior of the current model and the model developed in this study in representing a mix from Louisiana are compared. Alternatives for incorporating the RLT data into the MEPDG are also presented. DOI: 10.1061/(ASCE)0899-1561(2007)19:11(993) CE Database subject headings: Pavement design; Cyclic loads; Load tests; Regression models; Repeated loads; Asphalt; Concrete.
- Published
- 2007
36. Variable selection in finite mixture of regression models
- Author
-
Khalili, Abbas and Chen, Jiahua
- Subjects
Finite groups -- Analysis ,Poisson distribution -- Analysis ,Regression analysis -- Models ,Mathematics - Abstract
In the applications of finite mixture of regression (FMR) models, often many covariates are used, and their contributions to the response variable vary from one component to another of the mixture model. This creates a complex variable selection problem. Existing methods, such as the Akaike information criterion and the Bayes information criterion, are computationally expensive as the number of covariates and components in the mixture model increases. In this article we introduce a penalized likelihood approach for variable selection in FMR models. The new method introduces penalties that depend on the size of the regression coefficients and the mixture structure. The new method is shown to be consistent for variable selection. A data-adaptive method for selecting tuning parameters and an EM algorithm for efficient numerical computations are developed. Simulations show that the method performs very well and requires much less computing power than existing methods. The new method is illustrated by analyzing two real data sets. KEY WORDS: EM algorithm; LASSO; Mixture model; Penalty method; SCAD.
- Published
- 2007
37. Longitudinal studies with outcome-dependent follow-up: models and Bayesian regression
- Author
-
Ryu, Duchwan, Sinha, Debajyoti, Mallick, Bani, Lipsitz, Stuart R., and Lipshultz, Steven E.
- Subjects
Bayesian statistical decision theory -- Models ,Longitudinal method -- Analysis ,Regression analysis -- Models ,Mathematics - Abstract
We propose Bayesian parametric and semiparametric partially linear regression methods to analyze the outcome-dependent follow-up data when the random time of a follow-up measurement of an individual depends on the history of both observed longitudinal outcomes and previous measurement times. We begin with the investigation of the simplifying assumptions of Lipsitz, Fitzmaurice, Ibrahim, Gelber, and Lipshultz, and present a new model for analyzing such data by allowing subject-specific correlations for the longitudinal response and by introducing a subject-specific latent variable to accommodate the association between the longitudinal measurements and the follow-up times. An extensive simulation study shows that our Bayesian partially linear regression method facilitates accurate estimation of the true regression line and the regression parameters. We illustrate our new methodology using data from a longitudinal observational study. KEY WORDS: Bayesian cubic smoothing spline; Latent variable; Partially linear model.
- Published
- 2007
38. On directional regression for dimension reduction
- Author
-
Li, Bing and Wang, Shaoli
- Subjects
Parameter estimation -- Analysis ,Regression analysis -- Models ,Mathematics - Abstract
We introduce directional regression (DR) as a method for dimension reduction. Like contour regression, DR is derived from empirical directions, but achieves higher accuracy and requires substantially less computation. DR naturally synthesizes the dimension reduction estimators based on conditional moments, such as sliced inverse regression and sliced average variance estimation, and in doing so combines the advantages of these methods. Under mild conditions, it provides exhaustive and √n-consistent estimate of the dimension reduction space. We develop the asymptotic distribution of the DR estimator, and from that a sequential test procedure to determine the dimension of the central space. We compare the performance of DR with that of existing methods by simulation and find strong evidence of its advantage over a wide range of models. Finally, we apply DR to analyze a data set concerning the identification of hand-written digits. KEY WORDS: Contour regression; Exhaustive estimation; Efficiency; Sliced inverse regression; Sliced average variance estimation.
- Published
- 2007
39. Behavioral transition: a framework for the construction conflict--tension relationship
- Author
-
Yiu, Tak Wing and Cheung, Sai On
- Subjects
Regression analysis -- Models ,Conflict management -- Methods ,Human acts -- Evaluation ,Human behavior -- Evaluation ,Business ,Electronics and electrical industries ,Engineering and manufacturing industries - Abstract
Conflicts are inevitable in construction projects. One of the reasons is that all construction projects involve complex human interactions. Previous studies have shown that behavioral states can respond dynamically as the magnitude of a conflict increases. This has been empirically demonstrated using a catastrophe-theory-based, three-variable system involving the level of construction conflict, the level of tension, and the amount of behavioral flexibility (Yiu and Cheung, 2006). This paper reports on a study that builds on the above-mentioned study by Yiu and Cheung, and examines the application of moderated multiple regression (MMR) to the three-variable system. It was found that not all MMR models display a significant moderating effect. Two out of six MMR models were found to be significant in their effect. These models affirm that the nature of the relationship between the degree of uncertainty and adversarial attitudes (or mistrust level) varies, depending on the behavioral flexibility of the parties. Disordinal interactions were also found, suggesting that the interaction between behavioral flexibility and the conflict-tension relationship can change radically. Critical points for the degree of uncertainty were also able to be calculated. Beyond these points, even a flexible individual may find difficulty in minimizing or resolving construction conflicts. As such, it is suggested that such radical changes could be prevented by minimizing the degree of uncertainty in construction projects. Index Terms--Behavioral flexibility, construction conflicts, moderated multiple regression (MMR), tension.
- Published
- 2007
40. Evaluating prediction rules for t-year survivors with censored regression models
- Author
-
Uno, Hajime, Cai, Tianxi, Tian, Lu, and Wei, L.J.
- Subjects
Regression analysis -- Models ,Regression analysis -- Usage ,Mathematics - Abstract
Suppose that we are interested in establishing simple but reliable rules for predicting future t-year survivors through censored regression models. In this article we present inference procedures for evaluating such binary classification rules based on various prediction precision measures quantified by the overall misclassification rate, sensitivity and specificity, and positive and negative predictive values. Specifically, under various working models, we derive consistent estimators for the above measures through substitution and cross-validation estimation procedures. Furthermore, we provide large-sample approximations to the distributions of these nonsmooth estimators without assuming that the working model is correctly specified. Confidence intervals, for example, for the difference of the precision measures between two competing rules can then be constructed. All of the proposals are illustrated with real examples, and their finite-sample properties are evaluated through a simulation study. KEY WORDS: Cross-validation; Gene expression; Model selection; Positive and negative predictive values; Prediction error; Receiver operating characteristic curve; Survival analysis.
- Published
- 2007
41. Partially linear hazard regression for multivariate survival data
- Author
-
Cai, Jianwen, Fan, Jianqing, Jiang, Jiancheng, and Zhou, Haibo
- Subjects
Algorithms -- Usage ,Regression analysis -- Models ,Algorithm ,Mathematics - Abstract
This article studies estimation of partially linear hazard regression models for multivariate survival data. A profile pseudo-partial likelihood estimation method is proposed under the marginal hazard model framework. The estimation on the parameters for the linear part is accomplished by maximization of a pseudo-partial likelihood profiled over the nonparametric part. This enables us to obtain √n-consistent estimators of the parametric component. Asymptotic normality is obtained for the estimates of both the linear and nonlinear parts. The new technical challenge is that the nonparametric component is indirectly estimated through its integrated derivative function from a local polynomial fit. An algorithm of fast implementation of our proposed method is presented. Consistent standard error estimates using sandwich-type ideas are also developed, which facilitates inferences for the model. It is shown that the nonparametric component can be estimated as well as if the parametric components were known and the failure times within each subject were independent. Simulations are conducted to demonstrate the performance of the proposed method. A real dataset is analyzed to illustrate the proposed methodology. KEY WORDS: Local pseudo-partial likelihood; Marginal hazard model; Multivariate failure time; Partially linear; Profile pseudo-partial likelihood.
- Published
- 2007
42. Random effects Weibull regression model for occupational lifetime
- Author
-
So Young Sohn, In Sang Chang, and Tae Hee Moon
- Subjects
Brain drain -- Control ,Employee retention -- Research ,Human resource management -- Methods ,Regression analysis -- Models ,Business ,Business, general ,Business, international - Abstract
A random effects Weibull regression model for predicting the occupational life expectancy of employees, in support of efficient human resource management, is presented. The proposed model can help to control brain drain.
- Published
- 2007
43. Residual (Sur)Realism
- Author
-
Stefanski, Leonard A.
- Subjects
Regression analysis -- Appreciation ,Regression analysis -- Models ,Science and technology ,Social sciences - Abstract
KEY WORDS: Added-variable plot; Backward selection; Forward selection; Hidden image; Hidden message; Linear regression; Model selection; Partial regression plot; Residual plots; Variable selection.
- Published
- 2007
44. Natural conjugate priors for the instrumental variables regression model applied to the Angrist--Krueger data
- Author
-
Hoogerheide, Lennart, Kleibergen, Frank, and van Dijk, Herman K.
- Subjects
Bayesian statistical decision theory -- Models ,Regression analysis -- Models ,Maximum likelihood estimates (Statistics) -- Analysis ,Business ,Economics - Abstract
We propose a natural conjugate prior for the instrumental variables regression model. The prior is a natural conjugate one since the marginal prior and posterior of the structural parameter have the same functional expressions which directly reveal the update from prior to posterior. The Jeffreys prior results from a specific setting of the prior parameters and results in a marginal posterior of the structural parameter that has an identical functional form as the sampling density of the limited information maximum likelihood estimator. We construct informative priors for the Angrist--Krueger [1991. Does compulsory school attendance affect schooling and earnings? Quarterly Journal of Economics 106, 979-1014] data and show that the marginal posterior of the return on education in the US coincides with the marginal posterior from the Southern region when we use the Jeffreys prior. This result occurs since the instruments are the strongest in the Southern region and the posterior using the Jeffreys prior, identical to maximum likelihood, focusses on the strongest available instruments. We construct informative priors for the other regions that make their posteriors of the return on education similar to that of the US and the Southern region. These priors show the amount of prior information needed to obtain comparable results for all regions. JEL classification: C11 Keywords: Instrumental variables; Rank reduction; Natural conjugate prior; Bayesian analysis
- Published
- 2007
45. A simple graphical decision aid for the placement of elderly people in long-term care
- Author
-
Xie, H., Chaussalet, T.J., Thompson, W.A., and Millard, P.H.
- Subjects
Long-term care of the sick -- Management ,Decision-making -- Models ,Decision support systems -- Technology application ,Regression analysis -- Models ,Graphic methods -- Usage ,Decision support software ,Company business management ,Technology application ,Business ,Business, general - Abstract
This paper describes the construction of a graphical decision tool to aid placement decisions of a multidisciplinary review panel for admissions to long-term care in a London borough in the UK. First we construct a prediction model of placement decisions based on an applicant's attributes. Using data from the London borough, a composite model comprising syndromic decision rules followed by a two-stage hierarchical logistic regression model is proposed. The model proved to be robust in differentiating cases needing residential home care and nursing home care. Placement outcomes generated by the model are then represented graphically on a triangle plot. This approach could potentially be used as a decision support tool by managers of long-term care for continuous monitoring and assessment of the appropriateness of placements with respect to residents' needs. doi: 10.1057/palgrave.jors.2602179 Published online 3 May 2006 Keywords: decision support; graphical approach; health; long-term care
- Published
- 2007
46. Bayesian CART: Prior specification and posterior simulation
- Author
-
Yuhong Wu, Tjelmeland, Hakon, and West, Mike
- Subjects
Regression analysis -- Models ,Bayesian statistical decision theory -- Usage ,Trees (Graph theory) -- Analysis ,Mathematics ,Science and technology - Abstract
Advances in the Bayesian formulation for CART (classification and regression tree) models with a new prior specification for the tree structure and a new radical restructure tree Metropolis-Hastings move for CART trees are discussed. Aspects of robustness and prediction in these models under the new formulation are explored to demonstrate the major improvements in the convergence and mixing properties of the posterior simulators using the new method.
- Published
- 2007
47. Modeling transformation in CEECs using smooth transitions
- Author
-
Foster, Neil and Stehrer, Robert
- Subjects
Europe -- Economic aspects ,Economic development -- Forecasts and trends ,Regression analysis -- Usage ,Regression analysis -- Models ,Market trend/market analysis ,Business ,Economics - Abstract
A study of the economies of Central and Eastern Europe, using a logistic smooth transition regression model, is presented.
- Published
- 2007
48. Power transformation toward a linear regression quantile
- Author
-
Mu, Yunming and He, Xuming
- Subjects
Quantile regression -- Models ,Quantile regression -- Analysis ,Regression analysis -- Models ,Regression analysis -- Usage ,Transformations (Mathematics) -- Methods ,Mathematics - Abstract
In this article we consider the linear quantile regression model with a power transformation on the dependent variable. Like the classical Box-Cox transformation approach, it extends the applicability of linear models without resorting to nonparametric smoothing, but transformations on the quantile models are more natural due to the equivariance property of the quantiles under monotone transformations. We propose an estimation procedure and establish its consistency and asymptotic normality under some regularity conditions. The objective function employed in the estimation can also be used to check inadequacy of a power-transformed linear quantile regression model and to obtain inference on the transformation parameter. The proposed approach is shown to be valuable through illustrative examples. KEY WORDS: Box-Cox power transformation; Conditional and unconditional inference; Cusum process; Empirical process; Lack of fit; Quantile regression; V statistic.
- Published
- 2007
49. Minimum area confidence set optimality for confidence bands in simple linear regression
- Author
-
Liu, W. and Hayter, A.J.
- Subjects
Regression analysis -- Research ,Regression analysis -- Models ,Confidence intervals -- Analysis ,Statistical hypothesis testing -- Methods ,Mathematics - Abstract
The average width of a simultaneous confidence band has been used by several authors (e.g., Naiman and Piegorsch) as a criterion for the comparison of different confidence bands. In this article the area of the confidence set that corresponds to a confidence band is used as a new criterion. For simple linear regression, comparisons have been carried out under this new criterion between hyperbolic bands, two-segment bands, and three-segment bands, which include constant width bands as special cases. It is found that if one requires a confidence band over the whole range of the covariate, then the best confidence band is given by the Working and Hotelling hyperbolic band. Furthermore, if one needs a confidence band over a finite interval of the covariate, then a restricted hyperbolic band can again be recommended, although a three-segment band may be very slightly superior in certain cases. KEY WORDS: Confidence bands; Confidence sets; Probability inequalities; Minimum area; Simple linear regression.
- Published
- 2007
50. The AIC criterion and symmetrizing the Kullback-Leibler divergence
- Author
-
Seghouane, Abd-Krim and Amari, Shun-Ichi
- Subjects
Harmonic functions -- Usage ,Regression analysis -- Models ,Neural networks -- Research ,Neural network ,Business ,Computers ,Electronics ,Electronics and electrical industries - Abstract
The Akaike information criterion (AIC) is a widely used tool for model selection. AIC is derived as an asymptotically unbiased estimator of a function used for ranking candidate models which is a variant of the Kullback--Leibler divergence between the true model and the approximating candidate model. Despite the Kullback--Leibler's computational and theoretical advantages, what can become inconvenient in model selection applications is their lack of symmetry. Simple examples can show that reversing the role of the arguments in the Kullback--Leibler divergence can yield substantially different results. In this paper, three new functions for ranking candidate models are proposed. These functions are constructed by symmetrizing the Kullback--Leibler divergence between the true model and the approximating candidate model. The operations used for symmetrizing are the average, geometric, and harmonic means. It is found that the original AIC criterion is an asymptotically unbiased estimator of these three different functions. Using one of these proposed ranking functions, an example of new bias correction to AIC is derived for univariate linear regression models. A simulation study based on polynomial regression is provided to compare the different proposed ranking functions with AIC and the new derived correction with AICc. Index Terms--Akaike information criterion (AIC), geometric and harmonic means, Kullback-Leibler divergence, model selection.
- Published
- 2007