Universidade de Santiago de Compostela. Departamento de Ciencias Forenses, Anatomía Patolóxica, Xinecoloxía e Obstetricia, e Pediatría, Universidade de Santiago de Compostela. Departamento de Estatística, Análise Matemática e Optimización, Universidade de Santiago de Compostela. Departamento de Matemáticas, Universidade de Santiago de Compostela. Instituto de Ciencias Forenses “Luis Concheiro”(INCIFOR), Ambroa Conde, Adrián, Casares de Cal, María Ángeles, Gómez Tato, Antonio, Robinson, Oliver, Mosquera Miguel, Ana, de la Puente Vila, María del Carmen, Ruiz Ramírez, Jorge, Phillips, Christopher Paul, Lareu Huidobro, María Victoria, Freire Aradas, Ana María
DNA methylation has become a biomarker of great interest in the forensic and clinical fields. In criminal investigations, the study of this epigenetic marker has enabled the development of DNA intelligence tools that provide information useful to investigators, such as age prediction. Similarly, when the origin of a sample at a crime scene is unknown, inferring lifestyle traits of an individual, such as tobacco use and alcohol consumption, could provide relevant information to help identify DNA donors. In the clinical domain, predicting these consumption habits could help identify people at risk or better establish the causes of different pathologies. In the present study, DNA methylation data from the UK AIRWAVE study were used to build two binomial logistic models for the inference of smoking and drinking status. A total of 348 individuals (116 non-smokers, 116 former smokers and 116 smokers) and 237 individuals (79 non-drinkers, 79 moderate drinkers and 79 drinkers) were used to develop the tobacco and alcohol consumption prediction models, respectively. The tobacco prediction model comprised two CpGs (cg05575921 in AHRR and cg01940273) and the alcohol prediction model three CpGs (cg06690548 in SLC7A11, cg0886875 and cg21294714 in MIR4435-2HG), providing correct classification rates of 86.49% and 74.26%, respectively. The models were validated using leave-one-out cross-validation. Additionally, two independent testing sets were assessed for tobacco and alcohol consumption. Considering that the consumption of these substances could underlie accelerated epigenetic ageing patterns, the effect of these lifestyles on the prediction of age was evaluated. To that end, a quantile regression model based on previous studies was generated, and the potential effect of tobacco and alcohol consumption on epigenetic age was assessed.
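As an illustration of the modelling approach named in the abstract, the following minimal sketch fits a binomial logistic model on two methylation predictors and estimates its correct-classification rate by leave-one-out cross-validation. It is not the authors' pipeline: the data are synthetic, the group means and standard deviation are invented for the example, the two predictors are unnamed hypothetical CpGs, the binary labels are an assumed two-group coding, and the plain gradient-descent fit stands in for whatever statistical software was actually used.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def standardize(train, rows):
    """Scale rows with the column means and SDs of the training fold only."""
    cols = list(zip(*train))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]


def fit_logistic(X, y, lr=0.5, epochs=200):
    """Fit a binomial logistic model by batch gradient descent.

    Returns the weights as [intercept, w1, ..., wk].
    """
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))) - yi
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


def predict(w, x):
    """Return class 1 if the predicted probability is at least 0.5, else 0."""
    p = sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))
    return 1 if p >= 0.5 else 0


def loocv_accuracy(X, y):
    """Leave-one-out CV: refit on n-1 samples, test on the held-out one."""
    hits = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        w = fit_logistic(standardize(X_train, X_train), y_train)
        x_test = standardize(X_train, [X[i]])[0]
        hits += int(predict(w, x_test) == y[i])
    return hits / len(X)


def make_synthetic(n_per_class=30, seed=0):
    """Synthetic beta values for two hypothetical CpGs (assumed means and SD)."""
    rng = random.Random(seed)
    X, y = [], []
    for label, (m1, m2) in ((0, (0.85, 0.65)), (1, (0.60, 0.50))):
        for _ in range(n_per_class):
            X.append([rng.gauss(m1, 0.05), rng.gauss(m2, 0.05)])
            y.append(label)
    return X, y


X, y = make_synthetic()
accuracy = loocv_accuracy(X, y)
print(f"LOOCV correct-classification rate (synthetic data): {accuracy:.2%}")
```

Standardizing inside each fold keeps the held-out sample from influencing the scaling, so the cross-validated rate is not optimistically biased by information from the test observation.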