The mutual independence hypothesis for categorical data in complex sampling schemes

Authors :
B. T. Porteous
Source :
Biometrika. 74:857-862
Publication Year :
1987
Publisher :
Oxford University Press (OUP), 1987.

Abstract

SUMMARY This paper is concerned with the analysis of contingency tables which are constructed using categorical data from a complex sampling scheme. For the mutual independence hypothesis, we extend Tavaré's (1983) results on two-dimensional contingency tables to higher dimensions. In particular, with respect to Pearson's statistic for testing the independence of two variables, we strengthen the robustness result proven by Tavaré, thus making it applicable more widely.

This paper is concerned with the analysis of contingency tables which are constructed using categorical data from a complex sampling scheme, i.e. from data containing some form of between-observation dependence which renders the multinomial sampling scheme invalid. Throughout, we are interested in how departures from multinomial sampling affect Pearson's statistic, X², which is appropriate for multinomial data. In the literature to date, two distinct approaches to the problem of categorical data from complex sampling schemes have been discussed. First, Holt, Scott & Ewings (1980) demonstrate that, for survey data, analyses based on χ² can be extremely misleading and propose the one-moment adjustment of X². Related references are Rao & Scott (1981, 1984) and Bedrick (1983). Secondly, Altham (1976), Cohen (1976) and Brier (1980) model inherent clustering in the sample. They show that to avoid invalid analyses, some adjustment of X² is then necessary. More recently Tavaré & Altham (1983) and Tavaré (1983) have considered two-dimensional contingency tables which are constructed from data generated by Markov chains. Analytic results for the null asymptotic distribution of X² are obtained and, once more, it is shown that some adjustment of X² will usually be necessary. However, Tavaré (1983) also obtains a very interesting robustness result: let Y1 and Y2 be independent variables, where Y1 is an independent trials process and Y2 is an arbitrary, irreducible first-order Markov chain. Then, the null asymptotic distribution of X², for testing the independence of Y1 and Y2, is χ² on the appropriate degrees of freedom. In other words, no adjustment of X² is necessary. Of the literature cited above, the present
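The robustness setting described in the abstract can be explored with a small simulation. The sketch below is illustrative only and is not taken from the paper: the function names `pearson_x2` and `simulate_table`, the binary state spaces, and the parameters `p1` and `stay` are all assumptions chosen for the example. It computes Pearson's statistic for a two-way table and cross-classifies an independent trials process Y1 with a two-state first-order Markov chain Y2 generated independently of Y1.

```python
import random


def pearson_x2(table):
    """Pearson's X^2 for a two-way table of counts, given as a list of rows."""
    n = sum(sum(row) for row in table)
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    x2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            expected = row_tot[i] * col_tot[j] / n
            if expected > 0:  # skip cells whose margin is empty
                x2 += (obs - expected) ** 2 / expected
    return x2


def simulate_table(n, p1, stay, rng):
    """Cross-classify n paired binary observations (Y1, Y2).

    Y1: independent trials with P(Y1 = 1) = p1            (assumed for illustration)
    Y2: two-state first-order Markov chain that keeps its
        current state with probability `stay`, 0 < stay < 1 (assumed for illustration)
    """
    table = [[0, 0], [0, 0]]
    y2 = rng.randrange(2)
    for _ in range(n):
        y1 = 1 if rng.random() < p1 else 0
        table[y1][y2] += 1
        if rng.random() >= stay:  # switch state with probability 1 - stay
            y2 = 1 - y2
    return table


rng = random.Random(0)
stats = [pearson_x2(simulate_table(1000, 0.4, 0.9, rng)) for _ in range(200)]
print(sum(stats) / len(stats))  # average of the simulated X^2 values
```

If the robustness result quoted above applies to this setup, the printed average should lie close to 1, the mean of a χ² variable on one degree of freedom, even though Y2 is strongly serially dependent. Making Y1 a Markov chain as well would be one way to see where an adjustment of X² becomes necessary.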

Details

ISSN :
1464-3510 and 0006-3444
Volume :
74
Database :
OpenAIRE
Journal :
Biometrika
Accession number :
edsair.doi...........e84a252289db24eb35c08f59f11c314f