The mutual independence hypothesis for categorical data in complex sampling schemes
- Source :
- Biometrika. 74:857-862
- Publication Year :
- 1987
- Publisher :
- Oxford University Press (OUP), 1987.
Abstract
- SUMMARY This paper is concerned with the analysis of contingency tables which are constructed using categorical data from a complex sampling scheme. For the mutual independence hypothesis, we extend Tavaré's (1983) results on two-dimensional contingency tables to higher dimensions. In particular, with respect to Pearson's statistic for testing the independence of two variables, we strengthen the robustness result proven by Tavaré, thus making it applicable more widely.
This paper is concerned with the analysis of contingency tables which are constructed using categorical data from a complex sampling scheme, i.e. from data containing some form of between-observation dependence which renders the multinomial sampling scheme invalid. Throughout, we are interested in how departures from multinomial sampling affect Pearson's statistic, X², which is appropriate for multinomial data. In the literature to date, two distinct approaches to the problem of categorical data from complex sampling schemes have been discussed. First, Holt, Scott & Ewings (1980) demonstrate that, for survey data, analyses based on χ² can be extremely misleading and propose the one-moment adjustment of X². Related references are Rao & Scott (1981, 1984) and Bedrick (1983). Secondly, Altham (1976), Cohen (1976) and Brier (1980) model inherent clustering in the sample. They show that to avoid invalid analyses, some adjustment of X² is then necessary. More recently Tavaré & Altham (1983) and Tavaré (1983) have considered two-dimensional contingency tables which are constructed from data generated by Markov chains. Analytic results for the null asymptotic distribution of X² are obtained and, once more, it is shown that some adjustment of X² will usually be necessary. However, Tavaré (1983) also obtains a very interesting robustness result: let Y₁ and Y₂ be independent variables, where Y₁ is an independent trials process and Y₂ is an arbitrary, irreducible first-order Markov chain. Then the null asymptotic distribution of X², for testing the independence of Y₁ and Y₂, is χ² on the appropriate degrees of freedom. In other words, no adjustment of X² is necessary. Of the literature cited above, the present
- Subjects :
- Statistics and Probability
Contingency table
Variables
Markov chain
Applied Mathematics
General Mathematics
Asymptotic distribution
Agricultural and Biological Sciences (miscellaneous)
Statistics
Econometrics
Multinomial distribution
Statistics, Probability and Uncertainty
General Agricultural and Biological Sciences
Categorical variable
Statistic
Statistical hypothesis testing
Mathematics
Details
- ISSN :
- 1464-3510 and 0006-3444
- Volume :
- 74
- Database :
- OpenAIRE
- Journal :
- Biometrika
- Accession number :
- edsair.doi...........e84a252289db24eb35c08f59f11c314f