56 results for "Bivariate data"
Search Results
2. A convenient tool for bivariate data analysis and bar graph plotting with R
- Author
-
Megan Cross, Andreas Hofmann, Malte A. Karow, Jan Hendrik Straub, Christoph S. Clemen, and Ludwig Eichinger
- Subjects
0303 health sciences ,Java ,business.industry ,Computer science ,Programming language ,Bar chart ,05 social sciences ,Scalable Vector Graphics ,050301 education ,computer.file_format ,computer.software_genre ,Biochemistry ,Pipeline (software) ,law.invention ,03 medical and health sciences ,Software ,Bivariate data ,law ,business ,0503 education ,Molecular Biology ,computer ,030304 developmental biology ,computer.programming_language ,Graphical user interface - Abstract
The Java software jBar consists of a graphical user interface that allows the user to customize and assemble an included script for R. The scripted R pipeline calculates means and standard errors/deviations for replicates of numerical bivariate data and generates presentations in the form of bar graphs. A two-sided Student's t test is carried out against a user-selected reference and p-values are calculated. The user can enter the data conveniently through the built-in spreadsheet and configure the R pipeline in the graphical user interface. The configured R script is written into a file and then executed. Bar graphs can be generated as static PNG, PDF, and SVG files or as interactive HTML widgets. © 2019 International Union of Biochemistry and Molecular Biology, 47(2): 207-210, 2019.
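The calculations the scripted pipeline performs (group means, standard errors, and a two-sided Student's t test against a user-selected reference) can be sketched as follows. jBar generates R code; this is only an illustrative Python sketch with made-up replicate values, and the two-sided p-value would then be read from the t distribution with n1 + n2 - 2 degrees of freedom.

```python
from math import sqrt
from statistics import mean, stdev

def summarize(replicates):
    """Mean and standard error of one group of replicate measurements."""
    return mean(replicates), stdev(replicates) / sqrt(len(replicates))

def t_statistic(sample, reference):
    """Two-sample (pooled-variance) Student's t statistic of `sample`
    against the user-selected `reference` group."""
    n1, n2 = len(sample), len(reference)
    m1, m2 = mean(sample), mean(reference)
    pooled = ((n1 - 1) * stdev(sample) ** 2
              + (n2 - 1) * stdev(reference) ** 2) / (n1 + n2 - 2)
    return (m1 - m2) / sqrt(pooled * (1 / n1 + 1 / n2))

reference = [1.0, 1.2, 0.9, 1.1]   # made-up replicate values
treatment = [1.8, 2.1, 1.9, 2.0]
m, se = summarize(treatment)
t = t_statistic(treatment, reference)
```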
- Published
- 2019
- Full Text
- View/download PDF
3. Modeling conditional reference regions: application to glycemic markers
- Author
-
Óscar Lado-Baleato, Francisco Gude, Javier Roca-Pardiñas, Carmen Cadarso-Suárez, and Universidade de Santiago de Compostela. Departamento de Estatística, Análise Matemática e Optimización
- Subjects
Blood Glucose ,Statistics and Probability ,Models, Statistical ,Conditional reference regions ,Epidemiology ,Diabetes ,Normal Distribution ,Statistical model ,Bivariate analysis ,Regression ,Flexible additive predictors ,Bivariate data ,Covariate ,Statistics ,Kernel smoother ,Humans ,Kernel smoothing ,Reference Region ,Backfitting algorithm ,Additive model ,Algorithms ,Biomarkers ,Mathematics - Abstract
Many clinical decisions are based on the results of continuous diagnostic tests. Usually, only the results of a single test are taken into consideration, the interpretation of which requires a reference range for the healthy population. However, the use of two different tests can be necessary in the diagnosis of certain diseases, which requires that a bivariate reference region be available for their interpretation. It should also be remembered that reference regions may depend on patient variables (e.g., age and sex) independent of the suspected disease. However, few proposals have been made regarding the statistical modeling of such reference regions, and those put forward have always assumed a Gaussian distribution, which can be rather restrictive. The present work describes a new statistical method that allows such reference regions to be estimated without requiring the results to be normally distributed. The proposed method is based on a bivariate location-scale model that provides probabilistic regions covering a specific percentage of the bivariate data, dependent on certain covariates. The reference region is estimated nonparametrically, and the nonlinear effects of continuous covariates are estimated via polynomial kernel smoothers in additive models. The bivariate model is fitted using a backfitting algorithm, and the optimal smoothing parameters of the kernel smoothers are selected by cross-validation. The model performed satisfactorily in simulation studies under non-Gaussian conditions. Finally, the proposed methodology was found to be useful in estimating a reference region for two continuous diagnostic tests for diabetes (fasting plasma glucose and glycated hemoglobin), taking into account the age of the patient.
Óscar Lado-Baleato is funded by a predoctoral grant (ED481A-2018) from the Galician Government (Plan I2C)-Xunta de Galicia. This research was also supported by grants from the Carlos III Health Institute, Spain (PI16/01404 and RD16/0017/0018), and by the project MTM2017-83513-R, cofinanced by the Ministry of Economy and Competitiveness (Spain) and the European Regional Development Fund (FEDER). This work was also supported by grants from the Galician Government: RED INBIOEST (ED341D-R2016/032), Grupo de Referencia Competitiva (ED431C 2016-025), and Grupo de Potencial Crecimiento (IN607B 2018-1). Javier Roca-Pardiñas acknowledges financial support from Grant MTM2017-89422-P (MINECO/AEI/FEDER, UE).
- Published
- 2021
4. Overlapping Hot Spots?
- Author
-
Cory P. Haberman
- Subjects
Strategic planning ,Public Administration ,Jurisdiction ,05 social sciences ,Resource efficiency ,Univariate ,Crime analysis ,Plan (drawing) ,Criminology ,Computer security ,computer.software_genre ,Bivariate data ,050501 criminology ,Business ,Law ,computer ,0505 law ,Criminal justice - Abstract
Research Summary In this study, the extent to which hot spots of different crime types overlapped spatially in Philadelphia, PA, was examined. Multiple techniques were used to identify crime hot spots for 11 different crime types. Univariate and bivariate statistics were also used to quantify the extent to which hot spots across the 11 crime types overlapped spatially. Hot spots of different crime types were not found to overlap much. Policy Implications The results raise concerns regarding the resource efficiency of hot-spots policing for addressing all crime types. Police commanders will need to consider how the extent of hot-spot overlap in their jurisdiction should be incorporated into their strategic plans to meet their organizational goals. Furthermore, if police departments plan to use hot-spots policing to address all crime types, then many local criminal justice systems would need an infusion of resources.
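The degree of spatial overlap between two crime types' hot spots can be quantified, for example, with a Jaccard index over the grid cells flagged as hot; the cell coordinates below are invented for illustration and are not from the Philadelphia data.

```python
def jaccard(cells_a, cells_b):
    """Overlap of two hot-spot cell sets: intersection over union.
    0 means no shared hot-spot cells, 1 means identical hot spots."""
    a, b = set(cells_a), set(cells_b)
    return len(a & b) / len(a | b) if (a | b) else 0.0

# Hypothetical (row, column) grid cells flagged as hot for two crime types.
robbery = {(3, 4), (3, 5), (7, 2)}
burglary = {(3, 5), (7, 2), (8, 8), (9, 1)}
overlap = jaccard(robbery, burglary)
```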
- Published
- 2017
- Full Text
- View/download PDF
5. Modeling Discrete Bivariate Data with Applications to Failure and Count Data
- Author
-
Gianpaolo Pulcini, Hyunju Lee, and Ji Hwan Cha
- Subjects
0209 industrial biotechnology ,Class (set theory) ,02 engineering and technology ,Bivariate analysis ,Management Science and Operations Research ,Poisson distribution ,01 natural sciences ,010104 statistics & probability ,symbols.namesake ,020901 industrial engineering & automation ,Distribution (mathematics) ,Bivariate data ,Joint probability distribution ,Statistics ,symbols ,0101 mathematics ,Safety, Risk, Reliability and Quality ,Random variable ,Mathematics ,Count data - Abstract
In this study, we propose a new class of flexible bivariate distributions for discrete random variables. The proposed class of distributions is based on the notion of the conditional failure rate for a discrete-type random variable. We derive general formulae for the joint distributions belonging to the proposed class, which, unlike other discrete bivariate models already proposed in the literature, such as the well-known Holgate bivariate Poisson distribution, can model both positive and negative dependence. We also discuss general statistical properties of the proposed class. Specific families of bivariate distributions can be generated from the general class proposed in this paper simply by specifying the ‘baseline distributions’. Furthermore, specific discrete bivariate distributions belonging to the proposed class are applied to analyze three real data sets, and the results are compared with those obtained from conventional models. Copyright © 2017 John Wiley & Sons, Ltd.
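The building block of the class, the conditional failure rate of a discrete random variable, is r(k) = P(X = k) / P(X >= k). A minimal Python sketch of that quantity (not of the authors' joint construction) is:

```python
def discrete_failure_rate(pmf):
    """Conditional failure rate r(k) = P(X = k) / P(X >= k) for a
    discrete random variable with probabilities pmf[0], pmf[1], ..."""
    rates, tail = [], sum(pmf)
    for p in pmf:
        rates.append(p / tail if tail > 0 else 0.0)
        tail -= p
    return rates

# Geometric-type pmf: the failure rate is constant until the final support point.
rates = discrete_failure_rate([0.5, 0.25, 0.125, 0.125])
```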
- Published
- 2017
- Full Text
- View/download PDF
6. A new permutation technique to explore and control for spatial autocorrelation
- Author
-
Reinder Radersma and Ben C. Sheldon
- Subjects
Bootstrapping (electronics) ,Bivariate data ,Autocorrelation technique ,Ecological Modeling ,Statistics ,Univariate ,Test statistic ,Bivariate analysis ,Spatial dependence ,Spatial analysis ,Ecology, Evolution, Behavior and Systematics ,Mathematics - Abstract
1. Permutation tests are important in ecology and evolution as they enable robust analysis of small sample sizes and control for various forms of dependencies among observations. A common source of dependence is spatial autocorrelation. Accounting for spatial autocorrelation is often crucial, because many ecological and evolutionary processes are spatially restricted, such as gene flow, dispersal, mate choice, inter- and intraspecific competition, mutualism and predation. 2. Here we discuss various ways of controlling for spatial autocorrelation in permutation tests; we highlight their particular properties and assumptions and introduce a new permutation technique which explores and controls for spatial autocorrelation: the floating grid permutation technique (FGPT). 3. The FGPT is a method to randomize observations with known geographical locations. Within the randomization process, the probability that an observation is assigned to any of the spatial locations is a negative function of the distance between its original and assigned location. The slope of this function depends on a preset parameter, and by exploring its parameter space, non-random ecological and evolutionary processes can be both assessed and controlled for at multiple spatial scales. 4. We show that the FGPT has acceptable type I error rates. We applied the FGPT to simulated univariate and bivariate data sets in which both negative and positive spatial autocorrelation were present. In comparison with a method that uses eigenvector decomposition to separate negative from positive spatial autocorrelation, the FGPT performed better for negative spatial autocorrelation alone, equally for positive spatial autocorrelation alone, and equally or slightly worse for simultaneous negative and positive spatial autocorrelation. For the bivariate data, it performed equally to a bootstrapping technique in which sampling probabilities were weighted by distance.
The FGPT offers great flexibility for application to bivariate (e.g. dyadic interactions) and multivariate observations (e.g. genetic marker-based relatedness measures) and allows a free choice of test statistic. It also has the potential to identify two spatial autocorrelation patterns, even if both result in positive spatial autocorrelation, given that they operate at different spatial scales. 5. The floating grid permutation technique is available as the R package fgpt on CRAN.
- Published
- 2015
- Full Text
- View/download PDF
7. Correlation estimation with singly truncated bivariate data
- Author
-
Namseon Beck, Taesung Park, Jongho Im, Eunyong Ahn, and Jae Kwang Kim
- Subjects
Statistics and Probability ,Coefficient of determination ,Correlation coefficient ,Epidemiology ,Truncated regression model ,05 social sciences ,01 natural sciences ,010104 statistics & probability ,Bivariate data ,0502 economics and business ,Linear regression ,Statistics ,Consistent estimator ,Ordinary least squares ,0101 mathematics ,Simple linear regression ,050205 econometrics ,Mathematics - Abstract
Correlation coefficient estimates are often attenuated for truncated samples in the sense that the estimates are biased towards zero. Motivated by real data collected in South Sudan, we consider correlation coefficient estimation with singly truncated bivariate data. By considering a linear regression model in which a truncated variable is used as an explanatory variable, a consistent estimator for the regression slope can be obtained from the ordinary least squares method. A consistent estimator of the correlation coefficient is then obtained by multiplying the regression slope estimator by the variance ratio of the two variables. Results from two limited simulation studies confirm the validity and robustness of the proposed method. The proposed method is applied to the South Sudanese children's anthropometric and nutritional data collected by World Vision. Copyright © 2017 John Wiley & Sons, Ltd.
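The key step, converting a regression slope into a correlation, rests on the identity r = b · sd(x)/sd(y). The sketch below illustrates that identity on complete data; the paper's contribution, that the slope remains consistent when the explanatory variable is truncated, is not reproduced here.

```python
from statistics import mean, stdev

def ols_slope(x, y):
    """Ordinary least squares slope of y regressed on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def correlation_from_slope(x, y):
    """Correlation recovered from the slope: r = b * sd(x) / sd(y)."""
    return ols_slope(x, y) * stdev(x) / stdev(y)
```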
- Published
- 2017
- Full Text
- View/download PDF
8. Multiplicative by nature: Logarithmic transformation in allometry
- Author
-
Gary C. Packard
- Subjects
Heteroscedasticity ,Logarithm ,Scale (ratio) ,Multiplicative function ,Statistical model ,Biology ,Residual ,Bivariate data ,Genetics ,Molecular Medicine ,Applied mathematics ,Animal Science and Zoology ,Allometry ,Ecology, Evolution, Behavior and Systematics ,Developmental Biology - Abstract
The traditional allometric method, which is at the heart of research paradigms used by comparative biologists around the world, entails fitting a straight line to logarithmic transformations of the original bivariate data and then back-transforming the resulting equation to form a two-parameter power function on the arithmetic scale. The method has the dual advantages of enabling investigators to fit statistical models that describe multiplicative growth while simultaneously addressing the multiplicative nature of residual variation in response variables (heteroscedasticity). However, important assumptions of the traditional method are seldom assessed in contemporary practice. When the assumptions are not met, mean functions may fail to capture the dominant pattern in the original data, and an incorrect form for the error may be imposed upon the fitted model. A worked example from metabolic allometry in doves and pigeons illustrates both the power of newer statistical procedures and the limitations of the traditional allometric method.
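The traditional method itself is compact: fit a straight line to the log-transformed data, then back-transform log(y) = log(a) + b·log(x) into the power function y = a·x^b. A toy Python illustration of the traditional fit (not of the newer procedures the article advocates):

```python
from math import exp, log
from statistics import mean

def fit_power_function(x, y):
    """Regress log(y) on log(x) and back-transform to y = a * x**b."""
    lx, ly = [log(v) for v in x], [log(v) for v in y]
    mx, my = mean(lx), mean(ly)
    b = (sum((u - mx) * (v - my) for u, v in zip(lx, ly))
         / sum((u - mx) ** 2 for u in lx))
    a = exp(my - b * mx)
    return a, b

# Noise-free data generated from y = 2 * x**0.75 are recovered exactly.
x = [1.0, 2.0, 4.0, 8.0]
y = [2.0 * v ** 0.75 for v in x]
a, b = fit_power_function(x, y)
```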
- Published
- 2014
- Full Text
- View/download PDF
9. Identical Twins Raised Apart
- Author
-
David L. Farnsworth
- Subjects
Statistics and Probability ,Set (abstract data type) ,Data set ,Bivariate data ,Intelligence quotient ,Mathematics education ,Statistical analysis ,Predictor variables ,Mathematics instruction ,Identical twins ,Education ,Mathematics - Abstract
Summary This article describes a bivariate data set that is interesting to students. Indeed, this particular data set, which involves twins and IQ, has sparked more student interest than any other set that I have presented. Specific uses of the data set are presented.
- Published
- 2014
- Full Text
- View/download PDF
10. Graphical and formal statistical tools for the symmetry of bivariate copulas
- Author
-
Jean-François Quessy and Tarik Bahraoui
- Subjects
Statistics and Probability ,education.field_of_study ,Bivariate data ,Copula (linguistics) ,Population ,Statistics ,Applied mathematics ,Bivariate analysis ,Statistics, Probability and Uncertainty ,education ,Mathematics - Abstract
Statistical tools are developed to check whether the underlying copula of a pair of random variables is symmetric. The proposed methods are based on the theoretical and empirical versions of the C-power functions introduced and formally studied by Bahraoui & Quessy (2013). On the one hand, a methodology is developed for testing the null hypothesis that the copula of a given population is symmetric; to this end, a sequential testing procedure is proposed in which, at each level, the p-value is estimated with the help of the multiplier bootstrap method. On the other hand, a related graphical method is proposed in order to gauge the degree of asymmetry in bivariate data. The good properties of the methods in small samples are investigated with the help of Monte Carlo simulations under various scenarios of symmetric and asymmetric dependence. The newly introduced procedures are used to analyse the Nutrient and Walker Lake data sets. The Canadian Journal of Statistics 41: 637–656; 2013 © 2013 Statistical Society of Canada
- Published
- 2013
- Full Text
- View/download PDF
11. Firework Plot as a Graphical Exploratory Data Analysis Tool for Evaluating the Impact of Outliers in Data Exploration and Regression
- Author
-
Dae-Heung Jang and Christine M. Anderson-Cook
- Subjects
Engineering ,business.industry ,Univariate ,Management Science and Operations Research ,Standard deviation ,Regression ,Exploratory data analysis ,Standard error ,Bivariate data ,Statistics ,Linear regression ,Outlier ,Econometrics ,Safety, Risk, Reliability and Quality ,business - Abstract
Outliers can distort many measures for data analysis. We propose a new set of graphical summaries, called firework plots, as simple tools for evaluating the impact of outliers in data exploration and regression assessment. One variation of the plot focuses on the impact of extreme observations on the mean and standard deviation by using curves that trace the relative contribution to the overall summary as weights for individual observations are changed from 1 to 0 in a univariate data set. Similarly, other variations for bivariate data allow examination of the impact of changing weights on combinations of the correlation coefficient and mean with two- or three-dimensional firework plots. One variation of the plot focuses on the impact on the estimated intercept, the estimated slope, and the estimated standard deviation by using curves based on the relative contribution to the overall summary as weights for individual observations are changed from 1 to 0 in a simple linear regression analysis. Similarly, other variations for a multiple regression allow the practitioner to examine the impact of changing weights on combinations of the estimated regression coefficients and the standard error with the pairwise firework plot matrix. Copyright © 2013 John Wiley & Sons, Ltd.
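The univariate version of the idea can be sketched directly: give one observation a weight that slides from 1 to 0 and trace the weighted mean and standard deviation along the way; the traced path is one "spark" of a firework plot. Names and data below are illustrative, not from the authors' implementation.

```python
from math import sqrt

def weighted_summary(xs, weights):
    """Weighted mean and standard deviation with weights in [0, 1]."""
    total = sum(weights)
    m = sum(w * x for w, x in zip(weights, xs)) / total
    var = sum(w * (x - m) ** 2 for w, x in zip(weights, xs)) / (total - 1)
    return m, sqrt(var)

def trace_observation(xs, index, steps=5):
    """(mean, sd) pairs as observation `index` is down-weighted from 1 to 0."""
    path = []
    for k in range(steps + 1):
        weights = [1.0] * len(xs)
        weights[index] = 1.0 - k / steps
        path.append(weighted_summary(xs, weights))
    return path

data = [10.0, 11.0, 9.0, 30.0]          # 30.0 plays the outlier
path = trace_observation(data, index=3)
```

As the outlier's weight falls to zero, the mean drops from 15 to 10 and the standard deviation shrinks, making the outlier's contribution visible.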
- Published
- 2013
- Full Text
- View/download PDF
12. Recent and future climate extremes arising from changes to the bivariate distribution of temperature and precipitation in Bavaria, Germany
- Author
-
Nicole Estrella and Annette Menzel
- Subjects
Atmospheric Science ,Bivariate data ,Climatology ,Mode (statistics) ,Period (geology) ,Environmental science ,Climate change ,Precipitation ,Reference Period ,Future climate ,Downscaling - Abstract
This study assesses recent and future changes in joint temperature and precipitation regimes at 28 sites in Bavaria, based on statistical downscaling of the ECHAM4 B2 Scenario (WETTREG). Monthly meteorological data of three future decades (2021–2030, 2031–2040 and 2041–2050) were compared with modelled data for the reference period 1971–2000. The bivariate data were classified into four weather types or modes (cooler/drier, cooler/wetter, warmer/wetter and warmer/drier relative to the 1971–2000 reference period). For the three future decades, a systematic change at all locations throughout the year was revealed. Both the cooler modes are predicted to occur less frequently in future decades; changes in warmer modes differ seasonally: In winter months, they are predicted to occur in the warmer/wetter mode, whereas in spring and summer months in the warmer/drier mode. Climatological extremes were defined as monthly weather events outside the 90% confidence ellipse of a Gaussian model of events in the reference period. An increase of extremes in the decade 2041–2050 was striking for June, which is predicted to be warmer and drier for half of the years. The same decade is predicted to experience the most extreme warm and wet modes, one in three Februarys and one in two winters are likely to be in this mode. The magnitude of these future changes in climate was related to the present mean annual temperature of the stations by rank regression. Stations with currently cooler climates are more likely to retain the normal conditions of the baseline period in contrast to currently warmer stations where the increase of extreme warm and wet conditions in summer and winter is expected to be larger. Copyright © 2012 Royal Meteorological Society
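The four-mode classification reduces each month to a comparison of its temperature and precipitation against the reference-period values; a minimal sketch, assuming the reference mean itself serves as the threshold:

```python
def classify_mode(temp, precip, ref_temp, ref_precip):
    """Assign one of the four weather modes relative to reference values."""
    t = "warmer" if temp > ref_temp else "cooler"
    p = "wetter" if precip > ref_precip else "drier"
    return f"{t}/{p}"

# Hypothetical monthly values against hypothetical 1971-2000 reference means.
mode = classify_mode(temp=2.5, precip=80.0, ref_temp=1.0, ref_precip=60.0)
```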
- Published
- 2012
- Full Text
- View/download PDF
13. Two-sample nonparametric likelihood inference based on incomplete data with an application to a pneumonia study
- Author
-
Shuling Liu, Lili Tian, Jihnhee Yu, and Albert Vexler
- Subjects
Statistics and Probability ,Clinical Trials as Topic ,Likelihood Functions ,Nonparametric statistics ,Inference ,Pneumonia ,General Medicine ,Bronchoalveolar Lavage ,Models, Biological ,law.invention ,Treatment Outcome ,Empirical likelihood ,Goodness of fit ,Randomized controlled trial ,Bivariate data ,Sample size determination ,law ,Statistics ,Econometrics ,Humans ,Statistics, Probability and Uncertainty ,Monte Carlo Method ,Parametric statistics ,Mathematics - Abstract
The clinical pulmonary infection score (CPIS) and bronchoalveolar lavage (BAL) are important diagnostic variables of pneumonia for forcefully ventilated patients who are susceptible to nosocomial infection. Because of its invasive nature, BAL is performed for patients only if the CPIS is greater than a certain threshold value. Thus, CPIS and BAL are closely related, yet BAL values are substantially missing. In a randomized clinical trial, the control and oral treatment groups were compared based on the outcomes from these procedures. Because of the relevance of both outcomes with respect to evaluating the efficacy of treatments, we propose and examine a nonparametric test based on these outcomes, which employs the empirical likelihood methodology. While efficient parametric methods are available when data are observed incompletely, performing appropriate goodness-of-fit tests to justify the parametric assumptions is difficult. Our motivation is to provide an approach based on no particular distributional assumption, which enables us to use all observed bivariate data, whether completed or not in an approximate likelihood manner. A broad Monte Carlo study evaluates the asymptotic properties and efficiency of the proposed method based on various sample sizes and underlying distributions. The proposed technique is applied to a data set from a pneumonia study demonstrating its practical worth.
- Published
- 2010
- Full Text
- View/download PDF
14. A new bivariate generalized Poisson distribution
- Author
-
Felix Famoye
- Subjects
Statistics and Probability ,Bivariate analysis ,Poisson distribution ,Pearson product-moment correlation coefficient ,Statistics::Computation ,symbols.namesake ,Bivariate data ,Joint probability distribution ,Statistics ,symbols ,Zero-inflated model ,Computer Science::Symbolic Computation ,Statistics, Probability and Uncertainty ,Index of dispersion ,Marginal distribution ,Mathematics - Abstract
In this paper, a new bivariate generalized Poisson distribution (GPD) that allows any type of correlation is defined and studied. The marginal distributions of the bivariate model are the univariate GPDs. The parameters of the bivariate distribution are estimated by using the moment and maximum likelihood methods. Some test statistics are discussed and one numerical data set is used to illustrate the applications of the bivariate model.
- Published
- 2010
- Full Text
- View/download PDF
15. Bivariate smoothing of mortality surfaces with cohort and period ridges
- Author
-
Leonie Tickle, Rob J. Hyndman, and Alexander Dokumentov
- Subjects
Statistics and Probability ,05 social sciences ,Bivariate analysis ,01 natural sciences ,010104 statistics & probability ,Cohort effect ,Bivariate data ,0502 economics and business ,Cohort ,Period effects ,0101 mathematics ,Statistics, Probability and Uncertainty ,Period (music) ,Smoothing ,050205 econometrics ,Mathematics ,Demography ,Graduation
- Published
- 2018
- Full Text
- View/download PDF
16. Creating Realistic Data Sets with Specified Properties via Simulation
- Author
-
Robert Goldman and John D. McKenzie
- Subjects
Statistics and Probability ,Data set ,Set (abstract data type) ,Bivariate data ,Correlation coefficient ,Statistics ,Univariate ,Bivariate analysis ,Outcome (probability) ,Standard deviation ,Education ,Mathematics - Abstract
Summary We explain how to simulate both univariate and bivariate raw data sets having specified values for common summary statistics. The first example illustrates how to ‘construct’ a data set having prescribed values for the mean and the standard deviation – for a one-sample t test with a specified outcome. The second shows how to create a bivariate data set with a specified correlation coefficient.
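Both constructions can hit the targets exactly rather than approximately: standardize a random draw and rescale for the univariate case; for the bivariate case, mix a standardized x with a residual orthogonalized against it, y = r·x + sqrt(1 − r²)·e. The Python sketch below is one way to do this, not necessarily the authors' procedure.

```python
import random
from math import sqrt
from statistics import mean, stdev

def standardize(z):
    """Rescale to sample mean exactly 0 and sample standard deviation exactly 1."""
    m, s = mean(z), stdev(z)
    return [(v - m) / s for v in z]

def with_mean_sd(n, target_mean, target_sd, rng):
    """Raw data with exactly the prescribed sample mean and standard deviation."""
    z = standardize([rng.gauss(0, 1) for _ in range(n)])
    return [target_mean + target_sd * v for v in z]

def with_correlation(n, r, rng):
    """Bivariate raw data whose sample correlation coefficient is exactly r."""
    x = standardize([rng.gauss(0, 1) for _ in range(n)])
    e = standardize([rng.gauss(0, 1) for _ in range(n)])
    b = sum(a * c for a, c in zip(x, e)) / sum(a * a for a in x)
    e = standardize([c - b * a for a, c in zip(x, e)])   # orthogonal to x
    y = [r * a + sqrt(1 - r * r) * c for a, c in zip(x, e)]
    return x, y

rng = random.Random(1)
sample = with_mean_sd(10, 50.0, 8.0, rng)
x, y = with_correlation(10, 0.6, rng)
```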
- Published
- 2009
- Full Text
- View/download PDF
17. A New Way to Teach (or Compute) Pearson's r Without Reliance on Cross-Products
- Author
-
Schuyler W. Huck, Bixiang Ren, and Hongwei Yang
- Subjects
Statistics and Probability ,Bivariate data ,Scatter plot ,Statistics ,Geodetic datum ,Difficulty seeing ,Cross product ,Link (knot theory) ,Education ,Mathematics - Abstract
Summary Many students have difficulty seeing the conceptual ‘link’ between bivariate data displayed in a scatterplot and the statistical summary of the relationship, r. This article shows how to teach (and compute) r such that each datum's ‘direct’ and ‘indirect’ influences are made apparent and used in a new formula for calculating Pearson's r.
- Published
- 2007
- Full Text
- View/download PDF
18. BIVARIATE FREQUENCY ANALYSIS OF FLOODS USING COPULAS
- Author
-
Jenq Tzong Shiau, Chang Tai Tsai, and Hsin Yi Wang
- Subjects
Return period ,Ecology ,fungi ,Univariate ,food and beverages ,Bivariate analysis ,humanities ,Univariate distribution ,Bivariate data ,Joint probability distribution ,parasitic diseases ,Statistics ,Flood mitigation ,Marginal distribution ,geographic locations ,Earth-Surface Processes ,Water Science and Technology ,Mathematics - Abstract
Bivariate flood frequency analysis offers improved understanding of the complex flood process and useful information in preparing flood mitigation measures. However, difficulties arise from limited bivariate distribution functions available to jointly model the correlated flood peak and volume that have different univariate marginal distributions. Copulas are functions that link univariate distribution functions to form bivariate distribution functions, which can overcome such difficulties. The objective of this study was to analyze bivariate frequency of flood peak and volume using copulas. Separate univariate distributions of flood peak and volume are first fitted from observed data. Copulas are then employed to model the dependence between flood peak and volume and join the predetermined univariate marginal distributions to construct the bivariate distribution. The bivariate probabilities and associated return periods are calculated in terms of univariate marginal distributions and copulas. The advantage of using copulas is that they can separate the effect of dependence from the effects of the marginal distributions. In addition, explicit relationships between joint and univariate return periods are made possible when copulas are employed to construct bivariate distribution of floods. The annual floods of Tongtou flow gauge station in the Jhuoshuei River, Taiwan, are used to illustrate bivariate flood frequency analysis.
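As a concrete illustration of the construction, here is the Gumbel-Hougaard copula (a family often used for positively dependent flood peak and volume, though not necessarily the one chosen in this study) together with the joint "both variables exceed their quantiles" return period:

```python
from math import exp, log

def gumbel_copula(u, v, theta):
    """Gumbel-Hougaard copula C(u, v); theta = 1 is independence,
    larger theta means stronger positive dependence."""
    return exp(-(((-log(u)) ** theta + (-log(v)) ** theta) ** (1.0 / theta)))

def joint_return_period(u, v, theta, mu=1.0):
    """Return period of 'peak exceeds its u-quantile AND volume exceeds its
    v-quantile': T = mu / (1 - u - v + C(u, v)), mu = mean interarrival time."""
    return mu / (1.0 - u - v + gumbel_copula(u, v, theta))
```

With theta = 1 the copula factorizes as C(u, v) = u·v, so two independent 100-year events jointly recur only every 10,000 years; dependence (theta > 1) shortens that joint return period.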
- Published
- 2006
- Full Text
- View/download PDF
19. Goodness-of-fit Procedures for Copula Models Based on the Probability Integral Transformation
- Author
-
Bruno Rémillard, Jean-François Quessy, and Christian Genest
- Subjects
Statistics and Probability ,Bivariate data ,Weak convergence ,Goodness of fit ,Model selection ,Copula (linguistics) ,Statistics ,Test statistic ,Applied mathematics ,Probability distribution ,Statistics, Probability and Uncertainty ,Statistical hypothesis testing ,Mathematics - Abstract
Wang & Wells [J. Amer. Statist. Assoc. 95 (2000) 62] describe a non-parametric approach for checking whether the dependence structure of a random sample of censored bivariate data is appropriately modelled by a given family of Archimedean copulas. Their procedure is based on a truncated version of the Kendall process introduced by Genest & Rivest [J. Amer. Statist. Assoc. 88 (1993) 1034] and later studied by Barbe et al. [J. Multivariate Anal. 58 (1996) 197]. Although Wang & Wells (2000) determine the asymptotic behaviour of their truncated process, their model selection method is based exclusively on the observed value of its L2-norm. This paper shows how to compute asymptotic p-values for various goodness-of-fit test statistics based on a non-truncated version of Kendall's process. Conditions for weak convergence are met in the most common copula models, whether Archimedean or not. The empirical behaviour of the proposed goodness-of-fit tests is studied by simulation, and power comparisons are made with a test proposed by Shih [Biometrika 85 (1998) 189] for the gamma frailty family.
- Published
- 2006
- Full Text
- View/download PDF
20. A Class of Goodness of Fit Tests for a Copula Based on Bivariate Right-Censored Data
- Author
-
John P. Klein, Mei-Jie Zhang, Per Kragh Andersen, Youyi Shu, and Claus Thorn Ekstrøm
- Subjects
Male ,Statistics and Probability ,Statistics::Theory ,Denmark ,Longevity ,Copula (linguistics) ,Bivariate analysis ,Sex Factors ,Goodness of fit ,Bivariate data ,Joint probability distribution ,Statistics ,Twins, Dizygotic ,Econometrics ,Humans ,Mathematics ,Models, Statistical ,Twins, Monozygotic ,General Medicine ,Survival Analysis ,Statistics::Computation ,Survival function ,Multivariate Analysis ,Female ,Statistics, Probability and Uncertainty ,Marginal distribution ,Parametric family ,Monte Carlo Method ,Algorithms ,Statistical Distributions - Abstract
The copula of a bivariate distribution, constructed by making marginal transformations of each component, captures all the information in the bivariate distribution about the dependence between two variables. For frailty models for bivariate data the choice of a family of distributions for the random frailty corresponds to the choice of a parametric family for the copula. A class of tests of the hypothesis that the copula is in a given parametric family, with unspecified association parameter, based on bivariate right censored data is proposed. These tests are based on first making marginal Kaplan-Meier transformations of the data and then comparing a non-parametric estimate of the copula to an estimate based on the assumed family of models. A number of options are available for choosing the scale and the distance measure for this comparison. Significance levels of the test are found by a modified bootstrap procedure. The procedure is used to check the appropriateness of a gamma or a positive stable frailty model in a set of survival data on Danish twins.
- Published
- 2005
- Full Text
- View/download PDF
21. A bivariate Poisson count data model using conditional probabilities
- Author
-
Erik Plug and Peter Berkhout
- Subjects
Statistics and Probability ,Conditional probability ,Bivariate analysis ,Conditional probability distribution ,Poisson distribution ,Statistics::Computation ,symbols.namesake ,Bivariate data ,Joint probability distribution ,Statistics ,symbols ,Statistics, Probability and Uncertainty ,Conditional variance ,Mathematics ,Count data - Abstract
The applied econometrics of bivariate count data predominantly focus on a bivariate Poisson density with a correlation structure that is very restrictive. The main limitation is that this bivariate distribution excludes zero and negative correlation. This paper introduces a new model which allows for a more flexible correlation structure. To this end the joint density is decomposed by means of the multiplication rule in marginal and conditional densities. Simulation experiments and an application of the model to recreational data are presented.
- Published
- 2004
- Full Text
- View/download PDF
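The multiplication-rule decomposition described above is easy to illustrate: draw y1 from a Poisson marginal and y2 from a conditional Poisson whose mean decreases in y1, which produces the negative correlation the classic bivariate Poisson excludes. The log-linear conditional mean below is an illustrative choice, not necessarily the paper's specification:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 20000

# f(y1, y2) = f(y1) * f(y2 | y1): Poisson marginal for y1, conditional
# Poisson for y2 with a mean that decreases in y1
lam1 = 2.0
y1 = rng.poisson(lam1, n)
y2 = rng.poisson(np.exp(1.0 - 0.3 * y1))  # negative dependence via the link

corr = np.corrcoef(y1, y2)[0, 1]
print(round(corr, 3))  # clearly negative
```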
22. RECRUITERS' USE OF GPA IN INITIAL SCREENING DECISIONS: HIGHER GPAs DON'T ALWAYS MAKE THE CUT
- Author
-
Mary L. Connerley, Nicholas C. D'angelo, Kevin D. Carlson, Ross L. Mecham, and Arlise P. McKinney
- Subjects
Organizational Behavior and Human Resource Management ,stomatognathic system ,Bivariate data ,Econometrics ,Sampling error ,Decision rule ,Psychology ,Applied Psychology ,Plot (graphics) - Abstract
The relationship between college grade point average (GPA) and recruiters' initial screening decisions was examined using data from 548 job postings in a college recruitment program. Results indicate that in-major GPA is more strongly associated with screening decisions (p = 0.18, SDP = 0.200) than is overall GPA (p = 0.06, SDP = 0.187), but the magnitudes of the relationships varied across decision sets, including a larger number of negative values than would be expected from sampling error alone. Subsequent examination of the bivariate data identified 6 different plot types, suggesting that recruiters use a variety of GPA decision rules to initially screen applicants in college recruiting. The most common plot type, found in 42% of the decision sets, suggests that recruiters do not use GPA in screening decisions, but a surprising 81 of 548 decision sets indicated that recruiters selected against applicants with high GPAs. Evidence that organizations recruiting for the same job produced different plot types suggests that the use of GPA data in initial screening decisions may be idiosyncratic to individual recruiters.
- Published
- 2003
- Full Text
- View/download PDF
23. Analysis of down times of jib cranes - A stochastic approach
- Author
-
Avinash Dharmadhikari and S.R. Deshmukh
- Subjects
Data set ,Mathematical optimization ,Bivariate data ,Stochastic modelling ,Stochastic process ,Computer science ,Modeling and Simulation ,Complex system ,Ocean Engineering ,Bivariate analysis ,Management Science and Operations Research ,Realization (systems) ,Maintenance Problem - Abstract
In this paper a case study dealing with the maintenance problem of jib cranes is presented. A jib crane is viewed as a complex system whose performance is observed as a single realization over a period of time. After pointing out the limitations of existing stochastic models for analyzing the observed realization, a new family of bivariate stochastic processes is introduced. The jib crane data are analyzed using the new model and cross-validated using part of the data set. It is noted that the new family of stochastic processes is useful for analyzing bivariate data where one of the variables is finitely valued and the other is nonnegative and continuous. © 2002 Wiley Periodicals, Inc. Naval Research Logistics 49: 231-243, 2002; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/nav.10006
- Published
- 2002
- Full Text
- View/download PDF
24. Tail-dependence in stock-return pairs
- Author
-
Christoph Kuzmics and Ines Fortin
- Subjects
jel:C52 ,media_common.quotation_subject ,jel:C51 ,Tail dependence ,jel:C12 ,jel:C32 ,Stock return ,Value-at-Risk, Copula, Non-normal bivariate GARCH, Asymmetric dependence, Profile likelihood-ratio test ,General Business, Management and Accounting ,Stock market index ,Copula (probability theory) ,jel:G15 ,Bivariate data ,Joint probability distribution ,Statistics ,Econometrics ,Economics ,Finance ,Normality ,Value at risk ,media_common - Abstract
The empirical joint distribution of return pairs on stock indices displays high tail-dependence in the lower tail and low tail-dependence in the upper tail. The presence of tail-dependence is not compatible with the assumption of (conditional) joint normality, and the presence of asymmetric tail-dependence is not compatible with the assumption of a joint Student-t distribution. A general test of one dependence structure versus another via the profile likelihood is described and employed in a bivariate GARCH model, where the joint distribution of the disturbances is split into its marginals and its copula. The copula used in the paper allows for the existence of lower tail-dependence and for asymmetric tail-dependence, and encompasses the normal or t-copula, depending on the benchmark tested. The model is estimated using bivariate data on a set of European stock indices. We find that the assumption of normal or Student-t dependence is easily rejected in favour of an asymmetrically tail-dependent distribution. Copyright © 2002 John Wiley & Sons, Ltd.
- Published
- 2002
- Full Text
- View/download PDF
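A finite-sample estimate of lower tail-dependence makes the contrast above concrete: a Clayton sample keeps substantial joint mass in the lower corner, while a correlated normal sample does not. The copula choice, threshold q, and parameter values are illustrative assumptions, not the paper's model:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Clayton copula (theta = 2) has lower tail dependence 2**(-1/theta) ~ 0.71
theta = 2.0
w = rng.gamma(1.0 / theta, 1.0, n)
u = (1.0 - np.log(rng.uniform(size=n)) / w) ** (-1.0 / theta)
v = (1.0 - np.log(rng.uniform(size=n)) / w) ** (-1.0 / theta)

# Bivariate normal with correlation 0.7: tail dependence is exactly zero
rho = 0.7
z1 = rng.standard_normal(n)
z2 = rho * z1 + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(n)

def lower_td(a, b, q=0.01):
    # Finite-sample estimate of P(A below its q-quantile | B below its q-quantile)
    qa, qb = np.quantile(a, q), np.quantile(b, q)
    return np.mean((a <= qa) & (b <= qb)) / q

td_clayton, td_normal = lower_td(u, v), lower_td(z1, z2)
print(round(td_clayton, 2), round(td_normal, 2))
```

At any fixed q the normal pair still shows some joint tail mass; the defining difference is that its estimate decays to zero as q shrinks, while the Clayton estimate does not.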
25. A Bayesian analysis of bivariate ordinal data: Wisconsin epidemiologic study of diabetic retinopathy revisited
- Author
-
Kalyan Das and Atanu Biswas
- Subjects
Adult ,Male ,Statistics and Probability ,Ordinal data ,Diabetic Retinopathy ,Models, Statistical ,Epidemiology ,Computer science ,Bayesian probability ,Ordinal analysis ,Bayes Theorem ,Numerical Analysis, Computer-Assisted ,Bivariate analysis ,Latent variable ,Models, Biological ,Bayes' theorem ,Diabetes Mellitus, Type 1 ,Wisconsin ,Bivariate data ,Econometrics ,Humans ,Female ,Age of Onset ,Latent variable model - Abstract
In many biomedical experiments one may encounter bivariate data that are component-wise ordinal. The data set of the ophthalmologic experiment of the Wisconsin Epidemiologic Study of Diabetic Retinopathy (WESDR) is an example of such data. Several authors have considered the analysis of such data from different viewpoints. The present work reviews the existing literature based on the WESDR data and, on the basis of some latent variables, provides a technique for analysing such data more easily in a Bayesian framework. Computation supports the methodology to a great extent. A comparison between our approach and the likelihood-based approach considered by Kim has also been made.
- Published
- 2002
- Full Text
- View/download PDF
26. Bivariate Normal Distribution
- Author
-
Donald F. Morrison
- Subjects
Correlation coefficient ,Regression function ,Matrix t-distribution ,Probability density function ,Multivariate normal distribution ,Function (mathematics) ,Conditional expectation ,Normal-gamma distribution ,Pearson product-moment correlation coefficient ,Statistics::Computation ,symbols.namesake ,Bivariate data ,Student's t-distribution ,Statistics ,symbols ,Matrix normal distribution ,Conditional variance ,Variance function ,Variable (mathematics) ,Mathematics - Abstract
The bivariate normal distribution is defined by its density function. The regression function of one variable on the other is given by the linear conditional mean function. The conditional variance is expressed in terms of the correlation coefficient of the variables. References to tables of the bivariate normal distribution are included. Keywords: bivariate normal; regression function; normal and multivariate normal distributions
- Published
- 2014
- Full Text
- View/download PDF
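The conditional moments referenced in the abstract are, in the usual notation, E[Y|X=x] = mu_y + rho (sigma_y/sigma_x)(x - mu_x) and Var[Y|X=x] = sigma_y^2 (1 - rho^2). A minimal simulation check, with illustrative parameter values:

```python
import numpy as np

# Conditional moments of the bivariate normal:
#   E[Y | X = x]   = mu_y + rho * (sig_y / sig_x) * (x - mu_x)
#   Var[Y | X = x] = sig_y**2 * (1 - rho**2)
mu_x, mu_y, sig_x, sig_y, rho = 1.0, -2.0, 2.0, 0.5, 0.6

rng = np.random.default_rng(3)
n = 200_000
x = mu_x + sig_x * rng.standard_normal(n)
y = mu_y + sig_y * (rho * (x - mu_x) / sig_x
                    + np.sqrt(1.0 - rho ** 2) * rng.standard_normal(n))

# Empirical conditional moments from a thin slice around x0
x0 = 2.0
sl = np.abs(x - x0) < 0.05
cond_mean, cond_var = y[sl].mean(), y[sl].var()

theory_mean = mu_y + rho * sig_y / sig_x * (x0 - mu_x)
theory_var = sig_y ** 2 * (1.0 - rho ** 2)
print(round(cond_mean, 3), round(theory_mean, 3))
print(round(cond_var, 3), round(theory_var, 3))
```

Note the linearity of the conditional mean in x and the fact that the conditional variance does not depend on x at all, both special to the normal case.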
27. A Diagnostic for Assessing the Influence of Cases on the Prediction of Missing Data
- Author
-
Joseph E. Cavanaugh and Jacob J Oleson
- Subjects
Statistics and Probability ,Statistical model ,Conditional probability distribution ,Density estimation ,Missing data ,computer.software_genre ,Bivariate data ,Statistics ,Expectation–maximization algorithm ,Imputation (statistics) ,Data mining ,Time series ,computer ,Mathematics - Abstract
Summary. An important aspect of statistical modelling involves the identification of cases that have a significant influence on certain inferential results. In modelling problems where data are missing, the predicted values for the missing observations are frequently of interest. To assist in the identification of cases that substantially affect these predicted values, we introduce a case deletion diagnostic which is often conveniently evaluated in the setting of the EM algorithm. Our diagnostic is defined as the Kullback-Leibler information between two versions of the conditional density of the missing data given the observed data: one based on the parameter estimates arising from the full data set; the other based on the parameter estimates arising from the case-deleted data set. We outline the computation of the diagnostic for two Gaussian frameworks: for bivariate data applications in which some of the data pairs are incomplete, and for time series forecasting applications in which the missing observations are future realizations of the series. Our analyses involve bivariate data from the 1998 American Major League Baseball season and a time series consisting of cardiovascular mortality readings from the Los Angeles area.
- Published
- 2001
- Full Text
- View/download PDF
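The diagnostic's building block, the Kullback-Leibler information between two conditional densities of the missing data, has a closed form when both densities are Gaussian. The two parameter sets below are hypothetical stand-ins for full-data and case-deleted estimates:

```python
import numpy as np

def kl_gauss(mu0, var0, mu1, var1):
    # KL( N(mu0, var0) || N(mu1, var1) ) in nats
    return 0.5 * (np.log(var1 / var0) + (var0 + (mu0 - mu1) ** 2) / var1 - 1.0)

# Toy version of the diagnostic: conditional density of a missing value
# under full-data estimates vs. case-deleted estimates (hypothetical numbers)
full = dict(mu=1.2, var=0.8)
deleted = dict(mu=0.9, var=1.1)
influence = kl_gauss(full["mu"], full["var"], deleted["mu"], deleted["var"])
print(round(influence, 4))
```

A large value flags the deleted case as influential for the prediction of the missing observation; in practice the parameter estimates on each side come from separate EM runs.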
28. Bivariate ZIP Models
- Author
-
Jean François Walhin
- Subjects
Statistics and Probability ,General Medicine ,Bivariate analysis ,Poisson distribution ,Statistics::Computation ,symbols.namesake ,Bivariate data ,Joint probability distribution ,Statistics ,symbols ,Zero-inflated model ,Computer Science::Symbolic Computation ,Poisson regression ,Statistics, Probability and Uncertainty ,Statistical theory ,Likelihood function ,Mathematics - Abstract
The Zero-Inflated Poisson model is extended into a bivariate form. Three new bivariate models are considered. Parameters are estimated by maximum likelihood. Two numerical examples are discussed.
- Published
- 2001
- Full Text
- View/download PDF
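The univariate Zero-Inflated Poisson that the paper extends mixes a point mass at zero with an ordinary Poisson count. A minimal sketch of its pmf, with illustrative parameter values:

```python
from math import exp, factorial

def zip_pmf(k, pi0, lam):
    # Zero-inflated Poisson: extra point mass pi0 at zero, Poisson(lam) otherwise
    poisson = exp(-lam) * lam ** k / factorial(k)
    return pi0 * (k == 0) + (1.0 - pi0) * poisson

pi0, lam = 0.3, 1.5
total = sum(zip_pmf(k, pi0, lam) for k in range(60))
p_zero = zip_pmf(0, pi0, lam)
plain_zero = exp(-lam)
print(round(total, 6), round(p_zero, 4), round(plain_zero, 4))
```

The pmf sums to one and puts strictly more mass at zero than a plain Poisson with the same rate, which is exactly the feature the bivariate extensions carry over to pairs of counts.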
29. A computationally simple test of homogeneity of odds ratios for twin data
- Author
-
Changzhong Chen, Frank B. Hu, Rebecca A. Betensky, Camille A. Jones, Xiping Xu, Binyan Wang, and James I. Hudson
- Subjects
Bivariate data ,Epidemiology ,Dizygotic twin ,Statistics ,Covariate ,Econometrics ,Monozygotic twin ,Bivariate analysis ,Logistic regression ,Generalized estimating equation ,Twin study ,Genetics (clinical) ,Mathematics - Abstract
It is of interest to compare measures of association of binary traits among samples of bivariate data. One example is the comparison of association within a sample of monozygotic (MZ) twins to that within a sample of dizygotic (DZ) twins. A larger association in the MZ twins suggests that the trait of interest may have a genetic component. The bivariate data in this example are binary traits for the twins in each pair. Another example is the comparison of a measure of Hardy-Weinberg disequilibrium across several populations. The bivariate data in this example are the two alleles comprising the genotype of interest. We propose using derived logistic regression equations from the full exponential model for the bivariate outcomes to test for homogeneity. We adjust for correlation among outcomes via generalized estimating equations. This modeling approach allows for adjustment for individual-level and pair-level covariates and thereby allows for testing for gene x environment interactions. Further, we extend the model to allow for simultaneous analysis of two diseases, which allows for testing for a genetic component to the coaggregation of two diseases. In contrast to approaches proposed by previous authors, no special software is required; our approach can be easily implemented in standard software packages. We compare our results to those of other methods proposed in the literature for data from the Vietnam Era Twins Study. We apply our methods also to the Anqing Twin Study and to data on major depression and generalized anxiety disorder from the Virginia Twin Register.
- Published
- 2001
- Full Text
- View/download PDF
30. Least squares estimation of variance components for linkage
- Author
-
Christopher I. Amos, Xiangjun Gu, Barry R. Davis, and Jianfang Chen
- Subjects
Mixed model ,Multivariate statistics ,Bivariate data ,Epidemiology ,Estimation theory ,Statistics ,Estimator ,Variance components ,Multivariate normal distribution ,Least squares ,Genetics (clinical) ,Mathematics - Abstract
We develop least squares (LS) procedures for variance components estimation in genetic linkage studies. The LS procedure is expressed by simple expressions, and does not require inversion of large matrices. Simulations comparing LS with maximum likelihood (ML) procedures for normal data show that both yield unbiased estimators, but the efficiency of the LS procedure was less than 50% of the ML procedures. For bivariate normal data, the efficiency of the LS procedure relative to the ML method was better, generally over 60%. For skewed data, the LS method was markedly more efficient than ML for parameter estimation. The LS method was computationally rapid, over 4,000 times faster than ML estimation for bivariate data. Because ML estimation is time consuming, LS methods are suggested for initial interval mapping with multivariate data. Genet. Epidemiol. 19(Suppl 1):S1–S7, 2000. © 2000 Wiley-Liss, Inc.
- Published
- 2000
- Full Text
- View/download PDF
31. Statistical analysis of nonlinear parameter estimation for monod biodegradation kinetics using bivariate data
- Author
-
Christopher D. Knightes and Catherine A. Peters
- Subjects
Bivariate data ,Estimation theory ,Chemistry ,Covariance matrix ,Non-linear least squares ,Statistics ,Bioengineering ,Bivariate analysis ,Covariance ,Applied Microbiology and Biotechnology ,Nonlinear regression ,Confidence interval ,Biotechnology - Abstract
A nonlinear regression technique for estimating the Monod parameters describing biodegradation kinetics is presented and analyzed. Two model data sets were taken from a study of aerobic biodegradation of the polycyclic aromatic hydrocarbons (PAHs), naphthalene and 2-methylnaphthalene, as the growth-limiting substrates, where substrate and biomass concentrations were measured with time. For each PAH, the parameters estimated were: qmax, the maximum substrate utilization rate per unit biomass; KS, the half-saturation coefficient; and Y, the stoichiometric yield coefficient. Estimating parameters when measurements have been made for two variables with different error structures requires a technique more rigorous than least squares regression. An optimization function is derived from the maximum likelihood equation assuming an unknown, nondiagonal covariance matrix for the measured variables. Because the derivation is based on an assumption of normally distributed errors in the observations, the error structures of the regression variables were examined. Through residual analysis, the errors in the substrate concentration data were found to be distributed lognormally, demonstrating a need for log transformation of this variable. The covariance between ln C and X was found to be small but significantly nonzero at the 67% confidence level for NPH and at the 94% confidence level for 2MN. The nonlinear parameter estimation yielded unique values for qmax, KS, and Y for naphthalene. Thus, despite the low concentrations of this sparingly soluble compound, the data contained sufficient information for parameter estimation. For 2-methylnaphthalene, the values of qmax and KS could not be estimated uniquely; however, qmax/KS was estimated. To assess the value of including the relatively imprecise biomass concentration data, the results from the bivariate method were compared with a univariate method using only the substrate concentration data. The results demonstrated that the bivariate data yielded a better confidence in the estimates and provided additional information about the model fit and model adequacy. The combination of the value of the bivariate data set and their nonzero covariance justifies the need for maximum likelihood estimation over the simpler nonlinear least squares regression. © 2000 John Wiley & Sons, Inc. Biotechnol Bioeng 69: 160-170, 2000.
- Published
- 2000
- Full Text
- View/download PDF
32. Linear regression for bivariate censored data via multiple imputation
- Author
-
Charles Kooperberg and Wei Pan
- Subjects
Male ,Statistics and Probability ,Epidemiology ,Linear model ,Univariate ,Ear ,Bivariate analysis ,Generalized least squares ,Anti-Bacterial Agents ,Otitis Media ,Bivariate data ,Joint probability distribution ,Linear regression ,Statistics ,Econometrics ,Humans ,Prednisone ,Computer Simulation ,Female ,Marginal distribution ,Child ,Monte Carlo Method ,Proportional Hazards Models ,Mathematics - Abstract
Bivariate survival data arise, for example, in twin studies and studies of both eyes or ears of the same individual. Often it is of interest to regress the survival times on a set of predictors. In this paper we extend Wei and Tanner's multiple imputation approach for linear regression with univariate censored data to bivariate censored data. We formulate a class of censored bivariate linear regression methods by iterating between the following two steps: (1) the data are augmented by imputing survival times for censored observations; (2) a linear model is fit to the imputed complete data. We consider three different methods to implement these two steps. In particular, the marginal (independence) approach ignores the possible correlation between two survival times when estimating the regression coefficient. To improve the efficiency, we propose two methods that account for the correlation between the survival times. First, we improve the efficiency by using generalized least squares regression in step 2. Second, instead of generating data from an estimate of the marginal distribution, we generate data from a bivariate log-spline density estimate in step 1. Through simulation studies we find that the performance of the two methods that take the dependence into account is close, and that they are both more efficient than the marginal approach. The methods are applied to a data set from an otitis media clinical trial. Copyright © 1999 John Wiley & Sons, Ltd.
- Published
- 1999
- Full Text
- View/download PDF
33. An extension of Kendall's coefficient of concordance to bivariate interval censored data
- Author
-
Rebecca A. Betensky and Dianne M. Finkelstein
- Subjects
Statistics and Probability ,Bivariate data ,Epidemiology ,Joint probability distribution ,Concordance ,Statistics ,Econometrics ,Nonparametric statistics ,Interval (graph theory) ,Extension (predicate logic) ,Bivariate analysis ,Kendall s ,Mathematics - Abstract
Non-parametric tests of independence, as well as accompanying measures of association, are essential tools for the analysis of bivariate data. Such tests and measures have been developed for uncensored and right censored failure time data, but have not been developed for interval censored failure time data. Bivariate interval censored data arise in AIDS studies in which screening tests for early signs of viral and bacterial infection are done at clinic visits. Because of missed clinic visits, the actual times of first positive screening tests are interval censored. To handle such data, we propose an extension of Kendall's coefficient of concordance. We apply it to data from an AIDS study that recorded times of shedding of cytomegalovirus (CMV) and times of colonization of mycobacterium avium complex (MAC). We examine the performance of our proposed measure through a simulation study.
- Published
- 1999
- Full Text
- View/download PDF
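For fully observed (uncensored) pairs, the coefficient being extended reduces to ordinary Kendall's tau over concordant and discordant pairs. A naive O(n^2) sketch; the paper's contribution is handling interval-censored pairs, where some comparisons become indeterminate:

```python
import numpy as np

def kendall_tau(x, y):
    # Kendall's tau for fully observed pairs: (concordant - discordant) / C(n, 2).
    # With interval censoring some pairwise comparisons are indeterminate;
    # here every comparison is determinate.
    x, y = np.asarray(x), np.asarray(y)
    n = len(x)
    s = sum(np.sign(x[i] - x[j]) * np.sign(y[i] - y[j])
            for i in range(n) for j in range(i + 1, n))
    return 2.0 * s / (n * (n - 1))

print(kendall_tau([1, 2, 3, 4, 5], [1, 3, 2, 5, 4]))  # 8 concordant, 2 discordant
```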
34. A non-parametric maximum likelihood estimator for bivariate interval censored data
- Author
-
Dianne M. Finkelstein and Rebecca A. Betensky
- Subjects
Statistics and Probability ,Epidemiology ,Univariate ,Nonparametric statistics ,Estimator ,Interval (mathematics) ,Bivariate analysis ,Statistics::Computation ,Bivariate data ,Joint probability distribution ,Nelson–Aalen estimator ,Statistics ,Econometrics ,Statistics::Methodology ,Computer Science::Symbolic Computation ,Mathematics - Abstract
We derive a non-parametric maximum likelihood estimator for bivariate interval censored data using standard techniques for constrained convex optimization. Our approach extends those taken for univariate interval censored data. We illustrate the estimator with bivariate data from an AIDS study.
- Published
- 1999
- Full Text
- View/download PDF
35. Fisher's Information on the Correlation Coefficient in Bivariate Logistic Models
- Author
-
Peter G. Moffatt and Murray D. Smith
- Subjects
Statistics and Probability ,Correlation coefficient ,Fisher transformation ,Bivariate analysis ,Censoring (statistics) ,Pearson product-moment correlation coefficient ,Statistics::Computation ,symbols.namesake ,Bivariate data ,Joint probability distribution ,Statistics ,Econometrics ,symbols ,Computer Science::Symbolic Computation ,Statistics, Probability and Uncertainty ,Fisher information ,Mathematics - Abstract
From a theoretical perspective, the paper considers the properties of the maximum likelihood estimator of the correlation coefficient, principally regarding precision, in various types of bivariate model which are popular in the applied literature. The models are: 'Full-Full', in which both variables are fully observed; 'Censored-Censored', in which both of the variables are censored at zero; and finally, 'Binary-Binary', in which both variables are observed only in sign. For analytical convenience, the underlying bivariate distribution which is assumed in each of these cases is the bivariate logistic. A central issue is the extent to which censoring reduces the level of Fisher's information pertaining to the correlation coefficient, and therefore reduces the precision with which this important parameter can be estimated.
- Published
- 1999
- Full Text
- View/download PDF
36. Using Median Lines in Robust Bivariate Data Analysis
- Author
-
Jochem König, Thomas Georg, and Uwe Feldmann
- Subjects
Statistics and Probability ,Exploratory data analysis ,Bivariate data ,Goodness of fit ,Joint probability distribution ,Principal component analysis ,Statistics ,Outlier ,General Medicine ,Bivariate analysis ,Statistics, Probability and Uncertainty ,Residual ,Mathematics - Abstract
Methods for the robust comparison of bivariate errors-in-variables data are considered. The concept of median lines is introduced for the robust estimation of principal components. Median lines separate the bivariate sample space into two equally sized parts. Statistical properties of the model parameters are derived. Robust residual analysis assesses linear relationships as well as goodness of fit and allows for the detection of potential outliers. Special emphasis is laid on graphical methods. A bivariate box-plot is proposed for exploratory data analysis. The median lines procedure is illustrated by a real example.
- Published
- 1998
- Full Text
- View/download PDF
37. Fitting continuous bivariate distributions to data
- Author
-
Ali S. Hadi, Enrique Castillo, and José María Sarabia
- Subjects
Statistics and Probability ,Logistic distribution ,Bivariate data ,Joint probability distribution ,Cumulative distribution function ,Statistics ,Probability distribution ,Bivariate analysis ,Marginal distribution ,Random variable ,Mathematics - Abstract
SUMMARY In this paper we are concerned with fitting bivariate distributions to data. Let X and Y be bivariate random variables with cumulative distribution function (CDF) F(x, y; θ), where θ is an unknown, possibly vector-valued, parameter. To estimate θ on the basis of an observed random sample from F(x, y; θ), we first express the predicted values of the random variables as functions of θ; then an estimate of θ is obtained by minimizing the sum of the squares of the distances between the observed and predicted values. In the univariate case, this is straightforward because the inverse of the CDF is a single point, but in the bivariate case the inverse is a surface. We present ways of expressing the predicted values in the bivariate case as functions of θ. The idea is to use the joint and marginal CDFs as the basis for calculating the predicted values as functions of θ, thereby extending the univariate to the bivariate case. The method is illustrated by applications to several bivariate distributions such as the bivariate logistic, Pareto and exponential distributions. Simulation results indicate that the method performs well. The method is also applied to an example of real data. Finally, we briefly discuss possible extensions to the multivariate case.
- Published
- 1997
- Full Text
- View/download PDF
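In the univariate case, which the abstract calls straightforward, the predicted values are simply quantiles of the assumed family at plotting positions. A sketch for an exponential sample, assuming scipy is available; the bivariate surface-inversion step is the paper's contribution and is not reproduced here:

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(4)
x = np.sort(rng.exponential(scale=2.0, size=500))
pp = np.arange(1, len(x) + 1) / (len(x) + 1.0)   # plotting positions

# Predicted values are the quantiles F^{-1}(p; theta) of the assumed family
# (exponential with unknown scale); minimize the squared distances
def sse(scale):
    pred = -scale * np.log1p(-pp)                # exponential quantile function
    return np.sum((x - pred) ** 2)

res = minimize_scalar(sse, bounds=(0.1, 10.0), method="bounded")
print(round(res.x, 2))  # close to the true scale of 2.0
```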
38. A Likelihood Approach to Analysing Longitudinal Bivariate Binary Data
- Author
-
Jennifer S. K. Chan, Anthony Y. C. Kuk, and Jimmy D. Bell
- Subjects
Statistics and Probability ,Methadone maintenance ,Model selection ,Logit ,General Medicine ,Bivariate analysis ,Bivariate data ,Covariate ,Statistics ,Econometrics ,Statistics, Probability and Uncertainty ,Akaike information criterion ,Likelihood function ,Mathematics - Abstract
To study the effect of methadone treatment in reducing multiple drug use, say heroin and benzodiazepines, while controlling for their possible interaction, we analyse the results of urine drug screens from patients in treatment at a Sydney clinic in 1986. Weekly tests are either positive or negative for each type of drug, and a bivariate binary model was developed to analyse such repeated bivariate binary outcomes. It models simultaneously the logit of each type of drug use and their log odds ratio linearly in some covariates. The serial correlation within subject is accounted for by including the 'previous outcome' of both drugs and their interaction as covariates. Our main conclusion is that drug use is reduced over time and the interaction between dose and time effects is not significant. It also suggests that while methadone maintenance is effective in reducing heroin use (Chan et al., 1995), it does not suppress non-opioid drug use. Concerning the association between the two drugs, it is found that the present strength of their association depends on the previous outcomes only through a measure of concordance. The proposed model has a tractable likelihood function and so a full likelihood analysis is possible. It can be easily extended to incorporate mixture effects. The EM algorithm is used for the estimation of parameters in the mixture model and model selection can be based on the Akaike Information Criterion.
- Published
- 1997
- Full Text
- View/download PDF
39. Parametric Methods for Bivariate Quantile-Partitioned Tables and the Efficiency of Corresponding Nonparametric Methods
- Author
-
Mitchell H. Gail and Craig B. Borkowf
- Subjects
Statistics and Probability ,Nonparametric statistics ,General Medicine ,Bivariate analysis ,Bivariate data ,Joint probability distribution ,Statistics ,Parametric model ,Econometrics ,Multinomial distribution ,Statistics, Probability and Uncertainty ,Parametric statistics ,Mathematics ,Quantile - Abstract
In order to study the agreement between two continuous measurements on a sample of individuals, epidemiologists sometimes partition the original bivariate data into categories defined by the empirical quantiles of the two marginal distributions. The counts in the resulting two-way contingency table have the empirical bivariate quantile-partitioned (EBQP) distribution rather than the conventional multinomial distribution. Borkowf et al. (1997) developed the asymptotic theory and inferential procedures for estimates of measures of agreement calculated from EBQP tables. In this paper, we develop parametric methods for bivariate quantile-partitioned (BQP) tables. We use these methods to study the efficiency of nonparametric (EBQP) methods and to improve the precision of estimates of measures of agreement. We present computational methods for estimating certain measures of agreement (kappa, weighted kappa, and row proportions) together with their variances. Numerical studies and an example show that nonparametric (EBQP) estimates of kappa and certain row proportions tend to be quite inefficient compared to parametric estimates, whereas nonparametric estimates of weighted kappa can be relatively more efficient for some underlying distributions. Thus, investigators should weigh the efficiency advantages of parametric estimates against possible biases that can result from choosing an inappropriate parametric model when they employ BQP tables to study agreement.
- Published
- 1997
- Full Text
- View/download PDF
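The nonparametric side of this comparison starts from a quantile-partitioned table and a plug-in estimate of kappa. The tertile partition, the simulated correlation, and the use of unweighted kappa below are illustrative choices, not the paper's parametric machinery:

```python
import numpy as np

def cohens_kappa(table):
    # Unweighted kappa from a square contingency table of counts
    table = np.asarray(table, dtype=float)
    n = table.sum()
    po = np.trace(table) / n                       # observed agreement
    pe = (table.sum(1) @ table.sum(0)) / n ** 2    # chance agreement
    return (po - pe) / (1.0 - pe)

# Quantile-partition two correlated measurements into tertiles, then cross-tabulate
rng = np.random.default_rng(5)
z = rng.standard_normal((2000, 2))
x = z[:, 0]
y = 0.8 * x + 0.6 * z[:, 1]                        # correlation 0.8
cx = np.digitize(x, np.quantile(x, [1 / 3, 2 / 3]))
cy = np.digitize(y, np.quantile(y, [1 / 3, 2 / 3]))
table = np.zeros((3, 3))
for i, j in zip(cx, cy):
    table[i, j] += 1

k = cohens_kappa(table)
print(round(k, 2))
```

Because the cut points are themselves estimated from the data, the counts follow the EBQP distribution rather than the multinomial, which is why the naive multinomial variance for this kappa estimate is not correct.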
40. Bivariate Data: Lessons from Students' Coursework
- Author
-
Roger Porkess
- Subjects
Statistics and Probability ,Secondary education ,Bivariate data ,Coursework ,education ,Mathematics education ,Econometrics ,Regression analysis ,Psychology ,Regression ,Education ,Causal model - Abstract
Summary: This article examines some of the difficulties frequently encountered by students when analysing bivariate data and suggests how they might be overcome.
- Published
- 1996
- Full Text
- View/download PDF
41. Detecting patterns of bivariate mean vectors using model-selection criteria
- Author
-
Chuen-Chuen Joyce Huang and C. Mitchell Dayton
- Subjects
Statistics and Probability ,Model selection ,Null (mathematics) ,General Medicine ,Bivariate analysis ,Arts and Humanities (miscellaneous) ,Bivariate data ,Robustness (computer science) ,Homogeneous ,Statistics ,Kurtosis ,Akaike information criterion ,General Psychology ,Mathematics - Abstract
Three model-selection criteria, (1) AIC (Akaike, 1973), (2) SIC (Schwarz, 1978), and (3) CAIC (Bozdogan, 1987), were evaluated for detecting patterns of differences among mean vectors with bivariate data. This simulation study involved three and five group cases with equal-sized samples of 10, 20 and 50. Ten different distributions, including situations with extreme skewness and kurtosis, were generated to assess the robustness of the criteria with respect to non-normality. Both homogeneous and heterogeneous covariances cases were examined. The three model-selection criteria were compared in terms of correct decision rates based on 500 replications for each condition studied. Results indicate that all three criteria are relatively robust with respect to non-normality. SIC and CAIC performed especially well for large sample sizes when the true model contained only one or two clusters of homogeneous mean vectors. Overall, AIC tended to be superior to SIC and CAIC for homogeneous cases when the null case was excluded and, in general, for heterogeneous cases.
- Published
- 1995
- Full Text
- View/download PDF
42. On Countable Mixtures of Bivariate Binomial Distributions
- Author
-
H. Papageorgiou and Katerina M. David
- Subjects
Statistics and Probability ,Binomial distribution ,Bivariate data ,Joint probability distribution ,Statistics ,Negative binomial distribution ,Statistical parameter ,Probability distribution ,General Medicine ,Bivariate analysis ,Conditional probability distribution ,Statistics, Probability and Uncertainty ,Mathematics - Abstract
A unified treatment is given for mixtures of bivariate binomial distributions with respect to their index parameter(s). The use of probability generating functions is employed and a number of interesting properties including probabilities, factorial moments, factorial cumulants and conditional distributions are derived. Five classes of such mixtures are examined and several well known bivariate discrete distributions are used as illustrative examples. Biological applications are indicated including the fit of three bivariate distributions to an actual set of human family data.
- Published
- 1994
- Full Text
- View/download PDF
43. Descriptive statistics of large data sets by scatter plots, an exploratory approach
- Author
-
W.J.J. Rey
- Subjects
Statistics and Probability ,Bivariate data ,Scatter plot ,Statistics ,Binary number ,Partial regression plot ,Statistics, Probability and Uncertainty ,Minimum spanning tree ,Statistical graphics ,Tree (graph theory) ,Statistic ,Mathematics - Abstract
In the analysis of large tables of M variables on N observations one is interested in the relations between the variables, and it is usual to inspect the M(M-1)/2 scatter plots of N points. Clearly, the scatter plot approach relies on visual inspection and is to be preferred in so far as it is applicable to detect simple relations, namely when M is small. Other approaches are needed for large values of M. We consider that only the relatively few scatter plots that present a ‘structure’ are of interest for an exploratory analysis, and by ‘structure’ we mean a domain of especially high local density in the plot. Based on this concept, we propose a method constructed around two steps: the selection of the possibly interesting pairs of variables and the validation of the corresponding scatter plots. The selection of the pairs results from an algorithm based on a binary partitioning tree. The validation of the corresponding scatter plots enables the production of only those in which a structure is found; the recognition of a structure is derived from a statistic based on the length of the Minimum Spanning Tree constructed on the N points of the candidate scatter plot. For illustration, we report on an industrial application where the method is routinely applied for exploratory purposes.
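The MST-based validation statistic can be illustrated with a short sketch: compute the total Euclidean Minimum Spanning Tree length over the N points; a scatter plot containing a dense ‘structure’ yields a noticeably shorter tree than the same number of points spread evenly. The plain Prim's algorithm below is an illustration only; the paper's calibration of the statistic is not reproduced.

```python
from math import hypot

def mst_length(points):
    """Total edge length of the Euclidean minimum spanning tree
    of a list of (x, y) points, via Prim's algorithm in O(N^2)."""
    n = len(points)
    in_tree = [False] * n
    dist = [float("inf")] * n   # cheapest connection to the growing tree
    dist[0] = 0.0
    total = 0.0
    for _ in range(n):
        # attach the point closest to the tree built so far
        u = min((i for i in range(n) if not in_tree[i]), key=dist.__getitem__)
        in_tree[u] = True
        total += dist[u]
        ux, uy = points[u]
        for v in range(n):
            if not in_tree[v]:
                d = hypot(points[v][0] - ux, points[v][1] - uy)
                if d < dist[v]:
                    dist[v] = d
    return total
```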
- Published
- 1992
- Full Text
- View/download PDF
44. Testing For Independence of Observations in Animal Movements
- Author
-
Norman A. Slade and Robert K. Swihart
- Subjects
Local convex hull ,Bivariate data ,Statistical assumption ,Ecology ,Statistics ,Autocorrelation ,Bivariate analysis ,Null hypothesis ,Ecology, Evolution, Behavior and Systematics ,Independence (probability theory) ,Test (assessment) ,Mathematics - Abstract
Many analyses of animal movements assume that an animal's position at time t + 1 is independent of its position at time t, but no statistical procedure exists to test this assumption with bivariate data. Using empirically derived critical values for the ratio of mean squared distance between successive observations to mean squared distance from the center of activity, we demonstrate a bivariate test of the independence assumption first proposed by Schoener. For cases in which the null hypothesis of independence is rejected, we present a procedure for determining the time interval at which autocorrelation becomes negligible. To illustrate implementation of the test, locational data obtained from a radio-tagged adult female cotton rat (Sigmodon hispidus) were used. The test can be used to design an efficient sampling schedule for movement studies, and it is also useful in revealing behavioral phenomena such as home range shifting and any tendency of animals to follow prescribed routes in their daily activities. Further, the test may provide a means of examining how an animal's use of space is affected by its internal clock.
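The ratio statistic described above (often written t²/r²) is simple to compute from a sequence of fixes; under independence its expected value is near 2, and positive autocorrelation pushes it toward 0. A sketch, assuming equally spaced observations (the paper's empirically derived critical values are not reproduced):

```python
def schoener_ratio(xs, ys):
    """Ratio of mean squared distance between successive fixes (t^2)
    to mean squared distance from the centre of activity (r^2)."""
    n = len(xs)
    cx, cy = sum(xs) / n, sum(ys) / n
    t2 = sum((xs[i + 1] - xs[i]) ** 2 + (ys[i + 1] - ys[i]) ** 2
             for i in range(n - 1)) / (n - 1)
    r2 = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in zip(xs, ys)) / n
    return t2 / r2
```

An animal that ping-pongs between two sites gives a ratio above 2, while one drifting slowly through its range gives a ratio well below 2.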
- Published
- 1985
- Full Text
- View/download PDF
45. SOME CHARACTERIZATIONS OF THE BIVARIATE DISTRIBUTION OF INDEPENDENT POISSON VARIABLES
- Author
-
D. N. Shanbhag and G. Rajamannar
- Subjects
Statistics and Probability ,symbols.namesake ,Bivariate data ,Joint probability distribution ,Statistics ,symbols ,Zero-inflated model ,Bivariate analysis ,Poisson distribution ,Compound probability distribution ,Normal-gamma distribution ,Pearson product-moment correlation coefficient ,Mathematics - Published
- 1974
- Full Text
- View/download PDF
46. Linear relations in biomechanics: the statistics of scaling functions
- Author
-
J. M. V. Rayner
- Subjects
Multivariate statistics ,Observational error ,Bivariate data ,Simple (abstract algebra) ,Ecology ,Applied mathematics ,Animal Science and Zoology ,Allometry ,Variation (game tree) ,Biology ,Scaling ,Ecology, Evolution, Behavior and Systematics ,Regression - Abstract
The problem of fitting a linear relation to a bivariate data cluster obtained from morphometric measurement or from experiment is formulated rigorously, and a family of solutions (the general structural relation, g.s.r.) is derived. The regression, reduced major axis and major axis models are special cases of this model; it permits a more realistic treatment of the errors in the variates, in particular when the errors are correlated, which is particularly important in the many biological situations in which the variates contain uncontrolled real variation in addition to measurement errors. The analysis is particularly directed to the testing of hypotheses about scaling relations derived from biomechanical theory. By making different assumptions about the configuration of the errors, the g.s.r. can also be used to test for transposition allometry and for the significance of an estimated or hypothesized gradient. The model generalizes simply to multivariate problems. Application is demonstrated with examples drawn from the study of bird flight mechanics. Finally, it is demonstrated that since observed quantities correspond to peaks of adaptation or of selective fitness, scaling relations are determined primarily by scale variation of constraints on adaptation and behaviour, and are the result of a variety of interacting factors rather than a response to a single selective force described by one simple hypothesis.
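The familiar special cases mentioned above are easy to exhibit: with sample moments sxx, syy, sxy, the OLS (regression) slope is sxy/sxx and the reduced-major-axis slope is ±√(syy/sxx); each corresponds to a different assumed error-variance ratio in the structural relation. A minimal sketch (`line_fits` is an illustrative name, not the paper's g.s.r. machinery):

```python
from math import sqrt

def line_fits(xs, ys):
    """OLS and reduced-major-axis slopes for a bivariate data cluster.
    OLS puts all error in y; RMA treats the errors symmetrically."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    ols = sxy / sxx
    rma = (1.0 if sxy >= 0 else -1.0) * sqrt(syy / sxx)
    return ols, rma
```

On noisy data |OLS slope| ≤ |RMA slope| (the two agree only for a perfect line), which is one reason the choice of error model matters when testing a hypothesized scaling exponent.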
- Published
- 1985
- Full Text
- View/download PDF
47. A Method of Estimation for Some Bivariate Discrete Distributions
- Author
-
C. D. Kemp and H. Papageorgiou
- Subjects
Statistics and Probability ,Hermite polynomials ,Negative binomial distribution ,Estimator ,General Medicine ,Bivariate analysis ,Poisson distribution ,Moment (mathematics) ,symbols.namesake ,Bivariate data ,Statistics ,symbols ,Multinomial distribution ,Statistics, Probability and Uncertainty ,Mathematics - Abstract
As a method of estimating the parameters of bivariate discrete distributions from sample data, maximum likelihood usually involves iterative procedures which are complicated and relatively time-consuming even on modern computers. Simple moment estimators, though quick and easy, are often inefficient, and so alternative procedures have been investigated. In the present paper we introduce a simple estimation procedure which utilizes both marginal and conditional means, using the bivariate Poisson, negative binomial, Hermite and Neyman A distributions as illustrations. In addition, the bivariate negative trinomial distribution is fitted to a set of human family data and the correlation between the numbers of two types of children in a family is examined.
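For the common trivariate-reduction bivariate Poisson (X = U1 + U3, Y = U2 + U3 with Ui independent Poisson(λi)), plain moment estimators follow from E[X] = λ1 + λ3, E[Y] = λ2 + λ3, and Cov(X, Y) = λ3. The sketch below shows only this quick moment approach, not the paper's marginal-plus-conditional-means procedure:

```python
def bivariate_poisson_moments(xs, ys):
    """Moment estimates (lam1, lam2, lam3) for the bivariate Poisson
    X = U1 + U3, Y = U2 + U3: lam3 = Cov(X, Y),
    lam1 = mean(X) - lam3, lam2 = mean(Y) - lam3."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n
    return mx - cov, my - cov, cov
```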
- Published
- 1988
- Full Text
- View/download PDF
48. Kernel-Based Density Estimates from Incomplete Data
- Author
-
G. M. Mill and D. M. Titterington
- Subjects
Statistics and Probability ,Multivariate statistics ,Bivariate data ,Kernel (statistics) ,Statistics ,Imputation (statistics) ,Density estimation ,Missing data ,Mathematics - Published
- 1983
- Full Text
- View/download PDF
49. A Test of Fit for Bivariate Distributions
- Author
-
Ram C. Dahiya and John Gurland
- Subjects
Statistics and Probability ,010102 general mathematics ,Multivariate normal distribution ,Bivariate analysis ,01 natural sciences ,Test (assessment) ,010104 statistics & probability ,Distribution (mathematics) ,Bivariate data ,Statistics ,Chi-square test ,Null distribution ,Test statistic ,0101 mathematics ,Mathematics - Abstract
Tests of fit based on generalized minimum chi-square techniques are developed for bivariate distributions. The asymptotic null distribution of the test statistic is chi square, while the asymptotic non-null distribution turns out to be that of a weighted sum of independent non-central chi square variates. The special case of testing the fit of a bivariate normal distribution is investigated in detail, and the power is obtained for several alternative families of bivariate distributions.
- Published
- 1973
- Full Text
- View/download PDF
50. A BIVARIATE SPATIAL REGRESSION OPERATOR
- Author
-
Leslie Curry
- Subjects
Polynomial regression ,Geography ,Bivariate data ,Bayesian multivariate linear regression ,Geography, Planning and Development ,Statistics ,Local regression ,Cross-sectional regression ,Segmented regression ,Factor regression model ,Earth-Surface Processes ,Nonparametric regression - Published
- 1972
- Full Text
- View/download PDF