
Dichotomization of Multilevel Variables to Detect Hidden Associations

Authors :
Asdrúbal López-Chau
Lisbeth Rodriguez-Mazahua
Farid García-Lamont
Maricela Quintana-López
Carlos A. Rojas-Hernández
Source :
Applied Sciences, Vol 12, Iss 24, p 12929 (2022)
Publication Year :
2022
Publisher :
MDPI AG, 2022.

Abstract

A test of independence is commonly used to determine differences (or associations) between samples measured at the nominal level. Fisher’s exact test and the Chi-square test are two of the most widely applied tests of independence in data analysis, in areas such as information technologies, biostatistics, psychology and health sciences. In some cases, contingency tables with null entries (also called random zeros) arise, particularly when the number of samples is small and the variables analyzed are multilevel. This is a problem because tests of independence produce unreliable results when one or more entries in a contingency table are zero or small. In this paper, we propose a method to address that issue. The method merges one or more levels of the variables analyzed to create contingency tables with only one degree of freedom, which avoids applying a test of independence to contingency tables with random zeros. The Python source code of the method is publicly available. The results obtained with our method give a complete picture of the associations between the variables of a data set. To show the effectiveness of our approach in finding dependencies between variables, we use four publicly available data sets.
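The merging idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' published code: the function names (`bipartitions`, `fisher_exact_2x2`, `dichotomized_tests`) are hypothetical, and the sketch assumes that every split of a variable's levels into a group and its complement is tried, which may differ from the strategy used in the paper. Each split of both variables gives a 2x2 table (one degree of freedom), which is then tested with a two-sided Fisher's exact test computed directly from the hypergeometric distribution.

```python
from itertools import combinations
from math import comb


def bipartitions(levels):
    """Yield each split of `levels` into a group and its non-empty complement, once."""
    levels = sorted(levels)
    if not levels:
        return
    first, rest = levels[0], levels[1:]
    # Always keeping `first` in the group avoids yielding a split and its mirror image.
    for r in range(len(rest) + 1):
        for extra in combinations(rest, r):
            group = frozenset((first, *extra))
            if len(group) < len(levels):  # the complement must not be empty
                yield group


def fisher_exact_2x2(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of seeing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


def dichotomized_tests(xs, ys):
    """Test every pair of dichotomizations of two nominal variables (assumed strategy)."""
    results = []
    for gx in bipartitions(set(xs)):
        for gy in bipartitions(set(ys)):
            a = sum(1 for x, y in zip(xs, ys) if x in gx and y in gy)
            b = sum(1 for x, y in zip(xs, ys) if x in gx and y not in gy)
            c = sum(1 for x, y in zip(xs, ys) if x not in gx and y in gy)
            d = len(xs) - a - b - c
            p = fisher_exact_2x2(a, b, c, d)
            results.append((sorted(gx), sorted(gy), (a, b, c, d), p))
    # Smallest p-values first: the merged tables with the strongest evidence of association.
    return sorted(results, key=lambda r: r[3])


if __name__ == "__main__":
    # Toy data invented for illustration only.
    xs = ["a", "a", "a", "b", "b", "c"]
    ys = ["u", "u", "u", "v", "v", "v"]
    for gx, gy, table, p in dichotomized_tests(xs, ys):
        print(gx, gy, table, round(p, 3))
```

Because many 2x2 tables are tested on the same data, a practical analysis would probably also apply a multiple-comparison correction to the resulting p-values; the sketch leaves that out.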

Details

Language :
English
ISSN :
2076-3417
Volume :
12
Issue :
24
Database :
Directory of Open Access Journals
Journal :
Applied Sciences
Publication Type :
Academic Journal
Accession number :
edsdoj.5205beb4e5744a0be914ef4da46153e
Document Type :
article
Full Text :
https://doi.org/10.3390/app122412929