35 results on '"Aguiar-Pulido, V."'
Search Results
2. List of Contributors
- Author
-
Aguiar-Pulido, V., primary, Ayres, E., additional, Cimino, J.J., additional, de Fátima Marin, H., additional, de Holanda Albuquerque, R., additional, de Quiros, F.G.B., additional, Degoulet, P., additional, Delaney, C., additional, Ed Hammond, W., additional, Gattini, C.H., additional, Gutierrez, M.A., additional, Luna, D., additional, Martin-Sanchez, F., additional, Massad, E., additional, Moreno, R.A., additional, Novillo-Ortiz, D., additional, Pillay, R., additional, Quintana, Y., additional, Ramos, M.P., additional, Rebelo, M.S., additional, Rodrigues, R.J., additional, Safran, C., additional, Sigulem, D., additional, and Wen, C.L., additional
- Published
- 2017
- Full Text
- View/download PDF
3. Chapter 9 - Analytics and Decision Support Systems in Global Health Informatics
- Author
-
Martin-Sanchez, F. and Aguiar-Pulido, V.
- Published
- 2017
- Full Text
- View/download PDF
4. Secondary Use and Analysis of Big Data Collected for Patient Care
- Author
-
Martin-Sanchez, F J, Aguiar-Pulido, V, Lopez-Campos, G H, Peek, N, and Sacchi, L
- Subjects
- Journal Article
- Abstract
Objectives: To identify common methodological challenges and review relevant initiatives related to the re-use of patient data collected in routine clinical care, as well as to analyze the economic benefits derived from the secondary use of this data. Through the use of several examples, this article aims to provide a glimpse into the different areas of application, namely clinical research, genomic research, study of environmental factors, and population and health services research. This paper describes some of the informatics methods and Big Data resources developed in this context, such as electronic phenotyping, clinical research networks, biorepositories, screening data banks, and wide association studies. Lastly, some of the potential limitations of these approaches are discussed, focusing on confounding factors and data quality. Methods: A series of literature searches in main bibliographic databases have been conducted in order to assess the extent to which existing patient data has been repurposed for research. This contribution from the IMIA working group on "Data mining and Big Data analytics" focuses on the literature published during the last two years, covering the timeframe since the working group's last survey. Results and Conclusions: Although most of the examples of secondary use of patient data lie in the arena of clinical and health services research, we have started to witness other important applications, particularly in the area of genomic research and the study of health effects of environmental factors. Further research is needed to characterize the economic impact of secondary use across the broad spectrum of translational research.
- Published
- 2017
5. Secondary Use and Analysis of Big Data Collected for Patient Care
- Author
-
Martin-Sanchez, F. J., additional, Aguiar-Pulido, V., additional, Lopez-Campos, G. H., additional, Peek, N., additional, and Sacchi, L., additional
- Published
- 2017
- Full Text
- View/download PDF
6. Data Integration in Genomic Medicine: Trends and Applications
- Author
-
Seoane, J. A., primary, Aguiar-Pulido, V., primary, Pazos, A., primary, and Dorado, J., additional
- Published
- 2012
- Full Text
- View/download PDF
7. Weighting the importance of variables with genetic programming: An application to Galician schizophrenia patients
- Author
-
Aguiar-Pulido, V., Rivero, D., Marcos Gestal, and Dorado, J.
8. Genetic algorithm based on differential evolution with variable length: Runoff prediction on an artificial basin
- Author
-
Aguiar-Pulido, V., Freire, A., Garrido, M., and Juan R. Rabuñal
9. RExPRT: a machine learning tool to predict pathogenicity of tandem repeat loci.
- Author
-
Fazal S, Danzi MC, Xu I, Kobren SN, Sunyaev S, Reuter C, Marwaha S, Wheeler M, Dolzhenko E, Lucas F, Wuchty S, Tekin M, Züchner S, and Aguiar-Pulido V
- Subjects
- Virulence, Tandem Repeat Sequences, Machine Learning
- Abstract
Expansions of tandem repeats (TRs) cause approximately 60 monogenic diseases. We expect that the discovery of additional pathogenic repeat expansions will narrow the diagnostic gap in many diseases. A growing number of TR expansions are being identified, and interpreting them is a challenge. We present RExPRT (Repeat EXpansion Pathogenicity pRediction Tool), a machine learning tool for distinguishing pathogenic from benign TR expansions. Our results demonstrate that an ensemble approach classifies TRs with an average precision of 93% and recall of 83%. RExPRT's high precision will be valuable in large-scale discovery studies, which require prioritization of candidate loci for follow-up studies., (© 2024. The Author(s).)
- Published
- 2024
- Full Text
- View/download PDF
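The RExPRT abstract above reports performance as precision and recall. As a reminder of how the two metrics are defined, here is a minimal sketch. The function name and the counts are hypothetical and are not taken from the paper, and the "average precision" in the abstract may have been computed differently (for example, averaged across folds).

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from confusion-matrix counts.

    precision = TP / (TP + FP): share of predicted-pathogenic loci that are truly pathogenic.
    recall    = TP / (TP + FN): share of truly pathogenic loci that were recovered.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical counts, chosen only for illustration (not from the paper).
precision, recall = precision_recall(tp=40, fp=10, fn=20)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A classifier tuned for high precision, as the abstract describes, accepts missing some true positives in exchange for fewer false leads among the loci it flags.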
10. Minor intron-containing genes as an ancient backbone for viral infection?
- Author
-
Wuchty S, White AK, Olthof AM, Drake K, Hume AJ, Olejnik J, Aguiar-Pulido V, Mühlberger E, and Kanadia RN
- Abstract
Minor intron-containing genes (MIGs) account for <2% of all human protein-coding genes and are uniquely dependent on the minor spliceosome for proper excision. Despite their low numbers, we surprisingly found a significant enrichment of MIG-encoded proteins (MIG-Ps) in protein-protein interactomes and host factors of positive-sense RNA viruses, including SARS-CoV-1, SARS-CoV-2, MERS coronavirus, and Zika virus. Similarly, we observed a significant enrichment of MIG-Ps in the interactomes and sets of host factors of negative-sense RNA viruses such as Ebola virus, influenza A virus, and the retrovirus HIV-1. We also found an enrichment of MIG-Ps in double-stranded DNA viruses such as Epstein-Barr virus, human papillomavirus, and herpes simplex viruses. In general, MIG-Ps were highly connected and placed in central positions in a network of human-host protein interactions. Moreover, MIG-Ps that interact with viral proteins were enriched with essential genes. We also provide evidence that viral proteins interact with ancestral MIGs that date back to unicellular organisms and are mainly involved in basic cellular functions such as cell cycle, cell division, and signal transduction. Our results suggest that MIG-Ps form a stable, evolutionarily conserved backbone that viruses putatively tap to invade and propagate in human host cells., (© The Author(s) 2024. Published by Oxford University Press on behalf of National Academy of Sciences.)
- Published
- 2024
- Full Text
- View/download PDF
11. Lower respiratory tract microbiome composition and community interactions in smokers.
- Author
-
Campos M, Cickovski T, Fernandez M, Jaric M, Wanner A, Holt G, Donna E, Mendes E, Silva-Herzog E, Schneper L, Segal J, Amador DM, Riveros JD, Aguiar-Pulido V, Banerjee S, Salathe M, Mathee K, and Narasimhan G
- Abstract
The lung microbiome impacts on lung function, making any smoking-induced changes in the lung microbiome potentially significant. The complex co-occurrence and co-avoidance patterns between the bacterial taxa in the lower respiratory tract (LRT) microbiome were explored for a cohort of active (AS), former (FS) and never (NS) smokers. Bronchoalveolar lavages (BALs) were collected from 55 volunteer subjects (9 NS, 24 FS and 22 AS). The LRT microbiome composition was assessed using 16S rRNA amplicon sequencing. Identification of differentially abundant taxa and co-occurrence patterns, discriminant analysis and biomarker inferences were performed. The data show that smoking results in a loss in the diversity of the LRT microbiome, change in the co-occurrence patterns and a weakening of the tight community structure present in healthy microbiomes. The increased abundance of the genus Ralstonia in the lung microbiomes of both former and active smokers is significant. Partial least square discriminant and DESeq2 analyses suggested a compositional difference between the cohorts in the LRT microbiome. The groups were sufficiently distinct from each other to suggest that cessation of smoking may not be sufficient for the lung microbiota to return to a similar composition to that of NS. The linear discriminant analysis effect size (LEfSe) analyses identified several bacterial taxa as potential biomarkers of smoking status. Network-based clustering analysis highlighted different co-occurring and co-avoiding microbial taxa in the three groups. The analysis found a cluster of bacterial taxa that co-occur in smokers and non-smokers alike. The clusters exhibited tighter and more significant associations in NS compared to FS and AS. Higher degree of rivalry between clusters was observed in the AS. 
Competing Interests: The authors declare that there are no conflicts of interest., (© 2023 The Authors.)
- Published
- 2023
- Full Text
- View/download PDF
12. CIC missense variants contribute to susceptibility for spina bifida.
- Author
-
Han X, Cao X, Aguiar-Pulido V, Yang W, Karki M, Ramirez PAP, Cabrera RM, Lin YL, Wlodarczyk BJ, Shaw GM, Ross ME, Zhang C, Finnell RH, and Lei Y
- Subjects
- Animals, Female, Humans, Mice, Pregnancy, Folate Receptor 1 genetics, Folic Acid, Mutation, Missense, NIH 3T3 Cells, HeLa Cells, Neural Tube Defects genetics, Spinal Dysraphism genetics, Repressor Proteins genetics
- Abstract
Neural tube defects (NTDs) are congenital malformations resulting from abnormal embryonic development of the brain, spine, or spinal column. The genetic etiology of human NTDs remains poorly understood despite intensive investigation. CIC, homolog of the Capicua transcription repressor, has been reported to interact with ataxin-1 (ATXN1) and participate in the pathogenesis of spinocerebellar ataxia type 1. Our previous study demonstrated that CIC loss of function (LoF) variants contributed to the cerebral folate deficiency syndrome by downregulating folate receptor 1 (FOLR1) expression. Given the importance of folate transport in neural tube formation, we hypothesized that CIC variants could contribute to increased risk for NTDs by depressing embryonic folate concentrations. In this study, we examined CIC variants from whole-genome sequencing (WGS) data of 140 isolated spina bifida cases and identified eight missense variants of CIC gene. We tested the pathogenicity of the observed variants through multiple in vitro experiments. We determined that CIC variants decreased the FOLR1 protein level and planar cell polarity (PCP) pathway signaling in a human cell line (HeLa). In a murine cell line (NIH3T3), CIC loss of function variants downregulated PCP signaling. Taken together, this study provides evidence supporting CIC as a risk gene for human NTD., (© 2022 Wiley Periodicals LLC.)
- Published
- 2022
- Full Text
- View/download PDF
13. Systems biology analysis of human genomes points to key pathways conferring spina bifida risk.
- Author
-
Aguiar-Pulido V, Wolujewicz P, Martinez-Fundichely A, Elhaik E, Thareja G, Abdel Aleem A, Chalhoub N, Cuykendall T, Al-Zamer J, Lei Y, El-Bashir H, Musser JM, Al-Kaabi A, Shaw GM, Khurana E, Suhre K, Mason CE, Elemento O, Finnell RH, and Ross ME
- Subjects
- Case-Control Studies, Genetic Predisposition to Disease, Genome-Wide Association Study, Humans, Systems Biology, Transcription Factors genetics, Genome, Human, Spinal Dysraphism genetics
- Abstract
Spina bifida (SB) is a debilitating birth defect caused by multiple gene and environment interactions. Though SB shows non-Mendelian inheritance, genetic factors contribute to an estimated 70% of cases. Nevertheless, identifying human mutations conferring SB risk is challenging due to its relative rarity, genetic heterogeneity, incomplete penetrance, and environmental influences that hamper genome-wide association studies approaches to untargeted discovery. Thus, SB genetic studies may suffer from population substructure and/or selection bias introduced by typical candidate gene searches. We report a population based, ancestry-matched whole-genome sequence analysis of SB genetic predisposition using a systems biology strategy to interrogate 298 case-control subject genomes (149 pairs). Genes that were enriched in likely gene disrupting (LGD), rare protein-coding variants were subjected to machine learning analysis to identify genes in which LGD variants occur with a different frequency in cases versus controls and so discriminate between these groups. Those genes with high discriminatory potential for SB significantly enriched pathways pertaining to carbon metabolism, inflammation, innate immunity, cytoskeletal regulation, and essential transcriptional regulation consistent with their having impact on the pathogenesis of human SB. Additionally, an interrogation of conserved noncoding sequences identified robust variant enrichment in regulatory regions of several transcription factors critical to embryonic development. This genome-wide perspective offers an effective approach to the interrogation of coding and noncoding sequence variant contributions to rare complex genetic disorders., Competing Interests: Competing interest statement. R.H.F. formerly held a leadership position with the now dissolved TeratOmic Consulting LLC. 
He also receives travel funds to attend editorial board meetings of the Journal of Reproductive and Developmental Medicine published out of the Red Hospital of Fudan University. E.E. consults for the DNA Diagnostics Center. P.S. and R.H.F. are coauthors on a 2020 paper resulting from an NIH workshop: Maruvada P et al., Knowledge gaps in understanding the metabolic and clinical effects of excess folates/folic acid: a summary, and perspectives, from an NIH workshop. Am J Clin Nutr. 2020 Nov 11;112(5):1390-1403. doi: 10.1093/ajcn/nqaa259. PMID: 33022704; PMCID: PMC7657327., (Copyright © 2021 the Author(s). Published by PNAS.)
- Published
- 2021
- Full Text
- View/download PDF
14. Genome-wide investigation identifies a rare copy-number variant burden associated with human spina bifida.
- Author
-
Wolujewicz P, Aguiar-Pulido V, AbdelAleem A, Nair V, Thareja G, Suhre K, Shaw GM, Finnell RH, Elemento O, and Ross ME
- Subjects
- Case-Control Studies, Genome, Genome-Wide Association Study, Humans, Polymorphism, Single Nucleotide genetics, DNA Copy Number Variations genetics, Spinal Dysraphism genetics
- Abstract
Purpose: Next-generation sequencing has implicated some risk variants for human spina bifida (SB), but the genome-wide contribution of structural variation to this complex genetic disorder remains largely unknown. We examined copy-number variant (CNV) participation in the genetic architecture underlying SB risk., Methods: A high-confidence ensemble approach to genome sequences (GS) was benchmarked and employed for systematic detection of common and rare CNVs in two separate ancestry-matched SB case-control cohorts., Results: SB cases were enriched with exon disruptive rare CNVs, 44% of which were under 10 kb, in both ancestral populations (P = 6.75 × 10⁻⁷; P = 7.59 × 10⁻⁴). Genes containing these disruptive CNVs fall into molecular pathways, supporting a role for these genes in SB. Our results expand the catalog of variants and genes with potential contribution to genetic and gene-environment interactions that interfere with neurulation, useful for further functional characterization., Conclusion: This study underscores the need for genome-wide investigation and extends our previous threshold model of exonic, single-nucleotide variation toward human SB risk to include structural variation. Since GS data afford detection of CNVs with greater resolution than microarray methods, our results have important implications toward a more comprehensive understanding of the genetic risk and mechanisms underlying neural tube defect pathogenesis.
- Published
- 2021
- Full Text
- View/download PDF
15. Genome-wide bioinformatic analyses predict key host and viral factors in SARS-CoV-2 pathogenesis.
- Author
-
Ferrarini MG, Lal A, Rebollo R, Gruber AJ, Guarracino A, Gonzalez IM, Floyd T, de Oliveira DS, Shanklin J, Beausoleil E, Pusa T, Pickett BE, and Aguiar-Pulido V
- Subjects
- Binding Sites, COVID-19 virology, Cytokines genetics, Databases, Genetic, Gene Expression Regulation, Genome, Viral, Humans, RNA, Viral genetics, RNA, Viral metabolism, RNA-Binding Proteins genetics, RNA-Binding Proteins metabolism, RNA-Seq, Serpins genetics, Signal Transduction genetics, Transcriptome, Virus Replication genetics, COVID-19 genetics, Computational Biology methods, Host-Pathogen Interactions genetics, Pandemics, SARS-CoV-2 genetics
- Abstract
The novel betacoronavirus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) caused a worldwide pandemic (COVID-19) after emerging in Wuhan, China. Here we analyzed public host and viral RNA sequencing data to better understand how SARS-CoV-2 interacts with human respiratory cells. We identified genes, isoforms and transposable element families that are specifically altered in SARS-CoV-2-infected respiratory cells. Well-known immunoregulatory genes including CSF2, IL32, IL-6 and SERPINA3 were differentially expressed, while immunoregulatory transposable element families were upregulated. We predicted conserved interactions between the SARS-CoV-2 genome and human RNA-binding proteins such as the heterogeneous nuclear ribonucleoprotein A1 (hnRNPA1) and eukaryotic initiation factor 4 (eIF4b). We also identified a viral sequence variant with a statistically significant skew associated with age of infection, that may contribute to intracellular host-pathogen interactions. These findings can help identify host mechanisms that can be targeted by prophylactics and/or therapeutics to reduce the severity of COVID-19.
- Published
- 2021
- Full Text
- View/download PDF
16. A data-driven, high-throughput methodology to determine tissue-specific differentially methylated regions able to discriminate body fluids.
- Author
-
Antunes J, Gauthier Q, Aguiar-Pulido V, Duncan G, and McCord B
- Subjects
- DNA Methylation, Female, Forensic Genetics, Humans, Mouth Mucosa, Real-Time Polymerase Chain Reaction, Body Fluids
- Abstract
Tissue-specific differentially methylated regions (tDMRs) are regions of the genome with methylation patterns that modulate gene expression in those tissue types. The detection of tDMRs in forensic evidence can permit the identification of body fluids at trace levels. In this report, we have performed a bioinformatic analysis of an existing array dataset to determine if new tDMRs could be identified for use in body fluid identification from forensic evidence. Once these sites were identified, primers were designed and bisulfite modification was performed. The relative methylation level for each body fluid at a given locus was then determined using qPCR with high-resolution melt analysis (HRM). After screening 127 tDMRs in multiple body fluids, we were able to identify four new markers able to discriminate blood (2 markers), vaginal epithelia (1 marker) and buccal cells (1 marker). One marker for each target body fluid was also tested with pyrosequencing showing results consistent with those obtained by HRM. This work successfully demonstrates the ability of in silico analysis to develop a novel set of tDMRs capable of being differentiated by real time PCR/HRM. The method can rapidly determine the body fluids left at crime scenes, assisting the triers of fact in forensic casework., (© 2021 Wiley-VCH GmbH.)
- Published
- 2021
- Full Text
- View/download PDF
17. Author Correction: Threshold for neural tube defect risk by accumulated singleton loss-of-function variants.
- Author
-
Chen Z, Lei Y, Zheng Y, Aguiar-Pulido V, Ross ME, Peng R, Jin L, Zhang T, Finnell RH, and Wang H
- Published
- 2021
- Full Text
- View/download PDF
18. Loss of RAD9B impairs early neural development and contributes to the risk for human spina bifida.
- Author
-
Cao X, Tian T, Steele JW, Cabrera RM, Aguiar-Pulido V, Wadhwa S, Bhavani N, Bi P, Gargurevich NH, Hoffman EN, Cai CQ, Marini NJ, Yang W, Shaw GM, Ross ME, Finnell RH, and Lei Y
- Subjects
- Case-Control Studies, Cell Line, DNA Damage, DNA Repair, Embryonic Stem Cells metabolism, Fluorescent Antibody Technique, Gene Expression, Humans, Loss of Function Mutation, Mutation, Neural Tube Defects diagnosis, Neurons metabolism, Risk Assessment, Risk Factors, Spinal Dysraphism diagnosis, Cell Cycle Proteins deficiency, Genetic Predisposition to Disease, Neural Tube Defects genetics, Spinal Dysraphism genetics
- Abstract
DNA damage response (DDR) genes orchestrating the network of DNA repair, cell cycle control, are essential for the rapid proliferation of neural progenitor cells. To date, the potential association between specific DDR genes and the risk of human neural tube defects (NTDs) has not been investigated. Using whole-genome sequencing and targeted sequencing, we identified significant enrichment of rare deleterious RAD9B variants in spina bifida cases compared to controls (8/409 vs. 0/298; p = .0241). Among the eight identified variants, the two frameshift mutants and p.Gln146Glu affected RAD9B nuclear localization. The two frameshift mutants also decreased the protein level of RAD9B. p.Ser354Gly, as well as the two frameshifts, affected the cell proliferation rate. Finally, p.Ser354Gly, p.Ser10Gly, p.Ile112Met, p.Gln146Glu, and the two frameshift variants showed a decreased ability for activating JNK phosphorylation. RAD9B knockdowns in human embryonic stem cells profoundly affected early differentiation through impairing PAX6 and OCT4 expression. RAD9B deficiency impeded in vitro formation of neural organoids, a 3D cell culture model for human neural development. Furthermore, the RNA-seq data revealed that loss of RAD9B dysregulates cell adhesion genes during organoid formation. These results represent the first demonstration of a DDR gene as an NTD risk factor in humans., (© 2020 Wiley Periodicals, Inc.)
- Published
- 2020
- Full Text
- View/download PDF
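The RAD9B abstract above reports an enrichment of 8/409 cases versus 0/298 controls with p = .0241 but does not name the statistical test. A two-sided Fisher's exact test is one common choice for a sparse 2×2 table like this. The sketch below is an assumption about the kind of test involved, not the authors' stated method, and it treats each of the eight variants as carried by a distinct case.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Probability that x of the col1 carriers fall in row 1.
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(col1, row1)
    # A small relative tolerance guards against floating-point ties.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Counts from the abstract: carriers vs. non-carriers in cases and controls.
p_value = fisher_exact_two_sided(8, 409 - 8, 0, 298 - 0)
print(f"p = {p_value:.4f}")
```

Under these assumptions the result falls in the same range as the reported value, although it need not match exactly if the authors used a different test or correction.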
19. Single-cell sperm transcriptomes and variants from fathers of children with and without autism spectrum disorder.
- Author
-
Tomoiaga D, Aguiar-Pulido V, Shrestha S, Feinstein P, Levy SE, Mason CE, and Rosenfeld JA
- Abstract
The human sperm is one of the smallest cells in the body, but also one of the most important, as it serves as the entire paternal genetic contribution to a child. Investigating RNA and mutations in sperm is especially relevant for diseases such as autism spectrum disorders (ASD), which have been correlated with advanced paternal age. Historically, studies have focused on the assessment of bulk sperm, wherein millions of individual sperm are present and only high-frequency variants can be detected. Using 10× Chromium single-cell sequencing technology, we assessed the transcriptome from >65,000 single spermatozoa across six sperm donors (scSperm-RNA-seq), including two who fathered multiple children with ASD and four fathers of neurotypical children. Using RNA-seq methods for differential expression and variant analysis, we found clusters of sperm mutations in each donor that are indicative of the sperm being produced by different stem cell pools. Finally, we have shown that genetic variations can be found in single sperm., Competing Interests: C.E.M. is a cofounder and board member for Biotia and Onegevity Health, as well as an advisor or compensated speaker for Abbvie, Acuamark Diagnostics, ArcBio, BioRad, DNA Genotek, Genialis, Genpro, Karius, Illumina, New England Biolabs, QIAGEN, Whole Biome and Zymo Research. D.T., J.A.R. and C.E.M. have a related U.S. Patent application 62/460,480. The remaining authors declare that there are no competing interests., (© The Author(s) 2020.)
- Published
- 2020
- Full Text
- View/download PDF
20. Author Correction: Threshold for neural tube defect risk by accumulated singleton loss-of-function variants.
- Author
-
Chen Z, Lei Y, Zheng Y, Aguiar-Pulido V, Ross ME, Peng R, Jin L, Zhang T, Finnell RH, and Wang H
- Abstract
An amendment to this paper has been published and can be accessed via a link at the top of the paper.
- Published
- 2019
- Full Text
- View/download PDF
21. MATria: a unified centrality algorithm.
- Author
-
Cickovski T, Aguiar-Pulido V, and Narasimhan G
- Subjects
- Animals, Bacteria genetics, Computer Graphics, Gene Regulatory Networks, Ostreidae genetics, Time Factors, Algorithms
- Abstract
Background: Computing centrality is a foundational concept in social networking that involves finding the most "central" or important nodes. In some biological networks defining importance is difficult, which then creates challenges in finding an appropriate centrality algorithm., Results: We instead generalize the results of any k centrality algorithms through our iterative algorithm MATRIA, producing a single ranked and unified set of central nodes. Through tests on three biological networks, we demonstrate evident and balanced correlations with the results of these k algorithms. We also improve its speed through GPU parallelism., Conclusions: Our results show iteration to be a powerful technique that can eliminate spatial bias among central nodes, increasing the level of agreement between algorithms with various importance definitions. GPU parallelism improves speed and makes iteration a tractable problem for larger networks.
- Published
- 2019
- Full Text
- View/download PDF
22. Threshold for neural tube defect risk by accumulated singleton loss-of-function variants.
- Author
-
Chen Z, Lei Y, Zheng Y, Aguiar-Pulido V, Ross ME, Peng R, Jin L, Zhang T, Finnell RH, and Wang H
- Subjects
- Asian People genetics, Case-Control Studies, Child, Child, Preschool, China epidemiology, Databases, Genetic, Female, Gestational Age, Human Genome Project, Humans, Loss of Function Mutation, Male, Mutation, Missense, Neural Tube Defects epidemiology, Neural Tube Defects pathology, Odds Ratio, Pregnancy, Risk, Exome Sequencing, Neural Tube Defects genetics
- Published
- 2018
- Full Text
- View/download PDF
23. ATria: a novel centrality algorithm applied to biological networks.
- Author
-
Cickovski T, Peake E, Aguiar-Pulido V, and Narasimhan G
- Subjects
- Bacterial Physiological Phenomena, Software, Algorithms, Computational Biology methods, Models, Biological
- Abstract
Background: The notion of centrality is used to identify "important" nodes in social networks. Importance of nodes is not well-defined, and many different notions exist in the literature. The challenge of defining centrality in meaningful ways when network edges can be positively or negatively weighted has not been adequately addressed in the literature. Existing centrality algorithms also have a second shortcoming, i.e., the list of the most central nodes are often clustered in a specific region of the network and are not well represented across the network., Methods: We address both by proposing Ablatio Triadum (ATria), an iterative centrality algorithm that uses the concept of "payoffs" from economic theory., Results: We compare our algorithm with other known centrality algorithms and demonstrate how ATria overcomes several of their shortcomings. We demonstrate the applicability of our algorithm to synthetic networks as well as biological networks including bacterial co-occurrence networks, sometimes referred to as microbial social networks., Conclusions: We show evidence that ATria identifies three different kinds of "important" nodes in microbial social networks with different potential roles in the community.
- Published
- 2017
- Full Text
- View/download PDF
24. Gene expression patterns in transgenic mouse models of hypertrophic cardiomyopathy caused by mutations in myosin regulatory light chain.
- Author
-
Huang W, Kazmierczak K, Zhou Z, Aguiar-Pulido V, Narasimhan G, and Szczesna-Cordary D
- Subjects
- Algorithms, Animals, Arginine chemistry, Computational Biology, Gene Expression Profiling, Glutamic Acid chemistry, Glutamine chemistry, Lysine chemistry, Mice, Mice, Transgenic, Multigene Family, Myocardium metabolism, Myosin Light Chains genetics, Oligonucleotide Array Sequence Analysis, Phenotype, Principal Component Analysis, Valine chemistry, Cardiomyopathy, Hypertrophic genetics, Cardiomyopathy, Hypertrophic metabolism, Gene Expression Regulation, Mutation, Myosin Light Chains metabolism
- Abstract
Using microarray and bioinformatics, we examined the gene expression profiles in transgenic mouse hearts expressing mutations in the myosin regulatory light chain shown to cause hypertrophic cardiomyopathy (HCM). We focused on two malignant RLC-mutations, Arginine 58→Glutamine (R58Q) and Aspartic Acid 166 → Valine (D166V), and one benign, Lysine 104 → Glutamic Acid (K104E)-mutation. Datasets of differentially expressed genes for each of three mutants were compared to those observed in wild-type (WT) hearts. The changes in the mutant vs. WT samples were shown as fold-change (FC), with stringency FC ≥ 2. Based on the gene profiles, we have identified the major signaling pathways that underlie the R58Q-, D166V- and K104E-HCM phenotypes. The correlations between different genotypes were also studied using network-based algorithms. Genes with strong correlations were clustered into one group and the central gene networks were identified for each HCM mutant. The overall gene expression patterns in all mutants were distinct from the WT profiles. Both malignant mutations shared certain classes of genes that were up or downregulated, but most similarities were noted between D166V and K104E mice, with R58Q hearts showing a distinct gene expression pattern. Our data suggest that all three HCM mice lead to cardiomyopathy in a mutation-specific manner and thus develop HCM through diverse mechanisms., (Copyright © 2016 Elsevier Inc. All rights reserved.)
- Published
- 2016
- Full Text
- View/download PDF
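The abstract above filters differentially expressed genes by fold-change (FC) with stringency FC ≥ 2. A minimal sketch of such a filter follows. The gene names and intensities are hypothetical, and treating a two-fold decrease (FC ≤ 0.5) as equivalent to a two-fold increase is an assumption, since the abstract states only the FC ≥ 2 cutoff.

```python
from statistics import mean


def fold_change(mutant: list[float], wild_type: list[float]) -> float:
    """Ratio of mean mutant expression to mean wild-type expression."""
    return mean(mutant) / mean(wild_type)


def passes_threshold(fc: float, cutoff: float = 2.0) -> bool:
    """True when expression changes at least `cutoff`-fold in either direction."""
    return fc >= cutoff or fc <= 1.0 / cutoff


# Hypothetical linear-scale intensities (mutant, wild type), for illustration only.
expression = {
    "geneA": ([8.0, 10.0, 12.0], [4.0, 5.0, 6.0]),  # mean 10 vs 5
    "geneB": ([5.0, 5.5, 4.5], [5.0, 5.0, 5.0]),    # mean 5 vs 5
    "geneC": ([1.0, 1.5, 0.5], [4.0, 4.0, 4.0]),    # mean 1 vs 4
}

selected = [
    gene
    for gene, (mutant, wild_type) in expression.items()
    if passes_threshold(fold_change(mutant, wild_type))
]
print(selected)
```

In practice, microarray pipelines often work on log2-transformed values, in which case the same cutoff becomes an absolute log2 fold-change of at least 1.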
25. Metagenomics, Metatranscriptomics, and Metabolomics Approaches for Microbiome Analysis.
- Author
-
Aguiar-Pulido V, Huang W, Suarez-Ulloa V, Cickovski T, Mathee K, and Narasimhan G
- Abstract
Microbiomes are ubiquitous and are found in the ocean, the soil, and in/on other living organisms. Changes in the microbiome can impact the health of the environmental niche in which they reside. In order to learn more about these communities, different approaches based on data from multiple omics have been pursued. Metagenomics produces a taxonomical profile of the sample, metatranscriptomics helps us to obtain a functional profile, and metabolomics completes the picture by determining which byproducts are being released into the environment. Although each approach provides valuable information separately, we show that, when combined, they paint a more comprehensive picture. We conclude with a review of network-based approaches as applied to integrative studies, which we believe holds the key to in-depth understanding of microbiomes.
- Published
- 2016
- Full Text
- View/download PDF
26. Unbiased high-throughput characterization of mussel transcriptomic responses to sublethal concentrations of the biotoxin okadaic acid.
- Author
-
Suarez-Ulloa V, Fernandez-Tajes J, Aguiar-Pulido V, Prego-Faraldo MV, Florez-Barros F, Sexto-Iglesias A, Mendez J, and Eirin-Lopez JM
- Abstract
Background. Harmful Algal Blooms (HABs) responsible for Diarrhetic Shellfish Poisoning (DSP) represent a major threat for human consumers of shellfish. The biotoxin Okadaic Acid (OA), a well-known phosphatase inhibitor and tumor promoter, is the primary cause of acute DSP intoxications. Although several studies have described the molecular effects of high OA concentrations on sentinel organisms (e.g., bivalve molluscs), the effect of prolonged exposures to low (sublethal) OA concentrations is still unknown. In order to fill this gap, this work combines Next-Generation sequencing and custom-made microarray technologies to develop an unbiased characterization of the transcriptomic response of mussels during early stages of a DSP bloom. Methods. Mussel specimens were exposed to a HAB episode simulating an early stage DSP bloom (200 cells/L of the dinoflagellate Prorocentrum lima for 24 h). The unbiased characterization of the transcriptomic responses triggered by OA was carried out using two complementary methods of cDNA library preparation: normalized and Suppression Subtractive Hybridization (SSH). Libraries were sequenced and read datasets were mapped to Gene Ontology and KEGG databases. A custom-made oligonucleotide microarray was developed based on these data, completing the expression analysis of digestive gland and gill tissues. Results. Our findings show that exposure to sublethal concentrations of OA is enough to induce gene expression modifications in the mussel Mytilus. Transcriptomic analyses revealed an increase in proteasomal activity, molecular transport, cell cycle regulation, energy production and immune activity in mussels. Oppositely, a number of transcripts hypothesized to be responsive to OA (notably the Serine/Threonine phosphatases PP1 and PP2A) failed to show substantial modifications. 
Both digestive gland and gill tissues responded similarly to OA, although expression modifications were more dramatic in the former, supporting the choice of this tissue for future biomonitoring studies. Discussion. Exposure to OA concentrations within legal limits for safe consumption of shellfish is enough to disrupt important cellular processes in mussels, eliciting sharp transcriptional changes as a result. By combining the study of cDNA libraries and a custom-made OA-specific microarray, our work provides a comprehensive characterization of the OA-specific transcriptome, improving the accuracy of the analysis of expression profiles compared to single-replicated RNA-seq methods. The combination of our data with related studies helps to understand the molecular mechanisms underlying molecular responses to DSP episodes in marine organisms, providing useful information to develop a new generation of tools for the monitoring of OA pollution.
- Published
- 2015
- Full Text
- View/download PDF
27. Evolutionary computation and QSAR research.
- Author
-
Aguiar-Pulido V, Gestal M, Cruz-Monteagudo M, Rabuñal JR, Dorado J, and Munteanu CR
- Subjects
- Algorithms, Drug Design, Artificial Intelligence, Quantitative Structure-Activity Relationship
- Abstract
The successful high throughput screening of molecule libraries for a specific biological property is one of the main improvements in drug discovery. The virtual molecular filtering and screening relies greatly on quantitative structure-activity relationship (QSAR) analysis, a mathematical model that correlates the activity of a molecule with molecular descriptors. QSAR models have the potential to reduce the costly failure of drug candidates in advanced (clinical) stages by filtering combinatorial libraries, eliminating candidates with a predicted toxic effect and poor pharmacokinetic profiles, and reducing the number of experiments. To obtain a predictive and reliable QSAR model, scientists use methods from various fields such as molecular modeling, pattern recognition, machine learning or artificial intelligence. QSAR modeling relies on three main steps: molecular structure codification into molecular descriptors, selection of relevant variables in the context of the analyzed activity, and search of the optimal mathematical model that correlates the molecular descriptors with a specific activity. Since a variety of techniques from statistics and artificial intelligence can aid variable selection and model building steps, this review focuses on the evolutionary computation methods supporting these tasks. Thus, this review explains the basics of genetic algorithms and genetic programming as evolutionary computation approaches, the selection methods for high-dimensional data in QSAR, the methods to build QSAR models, the current evolutionary feature selection methods and applications in QSAR, and future trends in joint or multi-task feature selection methods.
- Published
- 2013
- Full Text
- View/download PDF
28. The CHROMEVALOA database: a resource for the evaluation of Okadaic Acid contamination in the marine environment based on the chromatin-associated transcriptome of the mussel Mytilus galloprovincialis.
- Author
-
Suárez-Ulloa V, Fernández-Tajes J, Aguiar-Pulido V, Rivera-Casas C, González-Romero R, Ausio J, Méndez J, Dorado J, and Eirín-López JM
- Subjects
- Animals, Carcinogens analysis, Carcinogens isolation & purification, Carcinogens toxicity, Chromatin metabolism, Environmental Monitoring methods, Humans, Mutagenicity Tests methods, Mutagens isolation & purification, Mutagens toxicity, Okadaic Acid toxicity, Sequence Analysis, DNA, Transcriptome, Databases, Factual, Mutagens analysis, Mytilus genetics, Okadaic Acid analysis
- Abstract
Okadaic Acid (OA) constitutes the main active principle in Diarrhetic Shellfish Poisoning (DSP) toxins produced during Harmful Algal Blooms (HABs), representing a serious threat for human consumers of edible shellfish. Furthermore, OA conveys critical deleterious effects for marine organisms due to its genotoxic potential. Many efforts have been dedicated to OA biomonitoring during the last three decades. However, it is only now with the current availability of detailed molecular information on DNA organization and the mechanisms involved in the maintenance of genome integrity, that a new arena starts opening up for the study of OA contamination. In the present work we address the links between OA genotoxicity and chromatin by combining Next Generation Sequencing (NGS) technologies and bioinformatics. To this end, we introduce CHROMEVALOAdb, a public database containing the chromatin-associated transcriptome of the mussel Mytilus galloprovincialis (a sentinel model organism) in response to OA exposure. This resource constitutes a leap forward for the development of chromatin-based biomarkers, paving the road towards the generation of powerful and sensitive tests for the detection and evaluation of the genotoxic effects of OA in coastal areas.
- Published
- 2013
- Full Text
- View/download PDF
29. Biomedical data integration in computational drug design and bioinformatics.
- Author
-
Seoane JA, Aguiar-Pulido V, Munteanu CR, Rivero D, Rabunal JR, Dorado J, and Pazos A
- Subjects
- Animals, Humans, Internet, Computational Biology methods, Computer-Aided Design, Databases, Factual, Drug Design
- Abstract
In recent years, in the post-genomic era, more and more data is being generated by biological high throughput technologies, such as proteomics and transcriptomics. This omics data can be very useful, but the real challenge is to analyze all this data, as a whole, after integrating it. Biomedical data integration enables making queries to different, heterogeneous and distributed biomedical data sources. Data integration solutions can be very useful not only in the context of drug design, but also in biomedical information retrieval, clinical diagnosis, systems biology, etc. In this review, we analyze the most common approaches to biomedical data integration, such as federated databases, data warehousing, multi-agent systems and semantic technology, as well as the solutions developed using these approaches in the past few years.
- Published
- 2013
30. Random Forest classification based on star graph topological indices for antioxidant proteins.
- Author
-
Fernández-Blanco E, Aguiar-Pulido V, Munteanu CR, and Dorado J
- Subjects
- Amino Acid Sequence, Databases, Protein, Molecular Sequence Data, Proteins chemistry, Quantitative Structure-Activity Relationship, ROC Curve, Algorithms, Antioxidants metabolism, Proteins metabolism
- Abstract
Aging and life quality is an important research topic nowadays in areas such as life sciences, chemistry, pharmacology, etc. People live longer, and, thus, they want to spend that extra time with a better quality of life. In this regard, there exists a tiny subset of molecules in nature, named antioxidant proteins, that may influence the aging process. However, testing every single protein in order to identify its properties is quite expensive and inefficient. For this reason, this work proposes a model, in which the primary structure of the protein is represented using complex network graphs that can be used to reduce the number of proteins to be tested for antioxidant biological activity. The graph obtained as a representation will help us describe the complex system by using topological indices. More specifically, in this work, Randić's Star Networks have been used as well as the associated indices, calculated with the S2SNet tool. In order to simulate the existing proportion of antioxidant proteins in nature, a dataset containing 1999 proteins, of which 324 are antioxidant proteins, was created. Using this data as input, Star Graph Topological Indices were calculated with the S2SNet tool. These indices were then used as input to several classification techniques. Among the techniques utilised, the Random Forest has shown the best performance, achieving a score of 94% correctly classified instances. Although the target class (antioxidant proteins) represents a tiny subset inside the dataset, the proposed model is able to achieve a percentage of 81.8% correctly classified instances for this class, with a precision of 81.3%. (Copyright © 2012 Elsevier Ltd. All rights reserved.)
- Published
- 2013
- Full Text
- View/download PDF
31. Applied computational techniques on schizophrenia using genetic mutations.
- Author
-
Aguiar-Pulido V, Gestal M, Fernandez-Lozano C, Rivero D, and Munteanu CR
- Subjects
- Genotype, Humans, Mutation, Neural Networks, Computer, Polymorphism, Single Nucleotide genetics, Computational Biology, Schizophrenia genetics
- Abstract
Schizophrenia is a complex disease, with both genetic and environmental influence. Machine learning techniques can be used to associate different genetic variations at different genes with a (schizophrenic or non-schizophrenic) phenotype. Several machine learning techniques were applied to schizophrenia data to obtain the results presented in this study. Considering these data, Quantitative Genotype - Disease Relationships (QDGRs) can be used for disease prediction. One of the best machine learning-based models obtained after this exhaustive comparative study was implemented online; this model is an artificial neural network (ANN). Thus, the tool offers the possibility to introduce Single Nucleotide Polymorphism (SNP) sequences in order to classify a patient with schizophrenia. Besides this comparative study, a method for variable selection, based on ANNs and evolutionary computation (EC), is also presented. This method uses half the number of variables as the original ANN and the variables obtained are among those found in other publications. In the future, QDGR models based on nucleic acid information could be expanded to other diseases.
- Published
- 2013
- Full Text
- View/download PDF
32. Exploring patterns of epigenetic information with data mining techniques.
- Author
-
Aguiar-Pulido V, Seoane JA, Gestal M, and Dorado J
- Subjects
- Aging, Animals, Computational Biology methods, Databases, Factual, Gene Expression, Genome-Wide Association Study methods, High-Throughput Screening Assays methods, Humans, Mutation, Artificial Intelligence, Data Mining methods, Epigenesis, Genetic
- Abstract
Data mining, a part of the Knowledge Discovery in Databases process (KDD), is the process of extracting patterns from large data sets by combining methods from statistics and artificial intelligence with database management. Analyses of epigenetic data have evolved towards genome-wide and high-throughput approaches, thus generating great amounts of data for which data mining is essential. Part of these data may contain patterns of epigenetic information which are mitotically and/or meiotically heritable determining gene expression and cellular differentiation, as well as cellular fate. Epigenetic lesions and genetic mutations are acquired by individuals during their life and accumulate with ageing. Both defects, either together or individually, can result in losing control over cell growth and, thus, causing cancer development. Data mining techniques could be then used to extract the previous patterns. This work reviews some of the most important applications of data mining to epigenetics.
- Published
- 2013
33. Naïve Bayes QSDR classification based on spiral-graph Shannon entropies for protein biomarkers in human colon cancer.
- Author
-
Aguiar-Pulido V, Munteanu CR, Seoane JA, Fernández-Blanco E, Pérez-Montoto LG, González-Díaz H, and Dorado J
- Subjects
- Amino Acid Sequence, Area Under Curve, Bayes Theorem, Biomarkers, Tumor analysis, Colonic Neoplasms diagnosis, Entropy, Humans, Molecular Sequence Data, Proteins analysis, Quantitative Structure-Activity Relationship, ROC Curve, Sequence Analysis, Protein methods, Biomarkers, Tumor chemistry, Colonic Neoplasms chemistry, Computational Biology methods, Models, Biological, Proteins chemistry
- Abstract
Fast cancer diagnosis represents a real necessity in applied medicine due to the importance of this disease. Thus, theoretical models can help as prediction tools. Graph theory representation is one option because it permits us to numerically describe any real system such as the protein macromolecules by transforming real properties into molecular graph topological indices. This study proposes a new classification model for proteins linked with human colon cancer by using spiral graph topological indices of protein amino acid sequences. The best quantitative structure-disease relationship model is based on eleven Shannon entropy indices. It was obtained with the Naïve Bayes method and shows excellent predictive ability (90.92%) for new proteins linked with this type of cancer. The statistical analysis confirms that this model allows diagnosing the absence of human colon cancer obtaining an area under receiver operating characteristic of 0.91. The methodology presented can be used for any type of sequential information such as any protein and nucleic acid sequence.
- Published
- 2012
- Full Text
- View/download PDF
34. Machine learning techniques for single nucleotide polymorphism--disease classification models in schizophrenia.
- Author
-
Aguiar-Pulido V, Seoane JA, Rabuñal JR, Dorado J, Pazos A, and Munteanu CR
- Subjects
- Base Sequence, Genetic Predisposition to Disease, Humans, Receptors, Dopamine D3 genetics, Receptors, Serotonin, 5-HT3 genetics, Research Design, Spain, Artificial Intelligence, Polymorphism, Single Nucleotide, Schizophrenia diagnosis, Schizophrenia genetics
- Abstract
Single nucleotide polymorphisms (SNPs) can be used as inputs in disease computational studies such as pattern searching and classification models. Schizophrenia is an example of a complex disease with an important social impact. The multiple causes of this disease create the need for new genetic or proteomic patterns that can diagnose patients using biological information. This work presents a computational study of disease machine learning classification models using only single nucleotide polymorphisms at the HTR2A and DRD3 genes from Galician (Northwest Spain) schizophrenic patients. These classification models establish for the first time, to the best knowledge of the authors, a relationship between the sequence of the nucleic acid molecule and schizophrenia (Quantitative Genotype-Disease Relationships) that can automatically recognize schizophrenia DNA sequences and correctly classify between 78.3% and 93.8% of schizophrenia subjects when using datasets which include simulated negative subjects and a linear artificial neural network.
- Published
- 2010
- Full Text
- View/download PDF
35. Retrieval and management of medical information from heterogeneous sources, for its integration in a medical record visualisation tool.
- Author
-
Cabarcos A, Sanchez T, Seoane JA, Aguiar-Pulido V, Freire A, Dorado J, and Pazos A
- Subjects
- Decision Support Systems, Clinical instrumentation, Decision Support Systems, Clinical organization & administration, Humans, Information Management instrumentation, Information Management organization & administration, Medical Record Linkage instrumentation, Medical Records Systems, Computerized instrumentation, Remote Consultation instrumentation, Remote Consultation methods, Systems Integration, User-Computer Interface, Medical Record Linkage methods, Medical Records Systems, Computerized organization & administration, Point-of-Care Systems organization & administration
- Abstract
Nowadays, medical practice needs, at the patient Point-of-Care (POC), personalised knowledge adjustable at each moment to the clinical needs of each patient, in order to provide support to decision-making processes, taking into account personalised information. To achieve this, adapting the hospital information systems is necessary. Thus, there is a need for computational developments capable of retrieving and integrating the large amount of biomedical information available today, managing the complexity and diversity of these systems. Hence, this paper describes a prototype which retrieves biomedical information from different sources, manages it to improve the results obtained and to reduce response time and, finally, integrates it so that it is useful for the clinician, providing all the information available about the patient at the POC. Moreover, it also uses tools which allow medical staff to communicate and share knowledge.
- Published
- 2010
- Full Text
- View/download PDF