Author: "Jie-Huei Wang" / Search Limiters: Full Text - Searchworks@Jio Institute Digital Library Search Results

17 results on '"Jie-Huei Wang"'

1. Multicategory Survival Outcomes Classification via Overlapping Group Screening Process Based on Multinomial Logistic Regression Model With Application to TCGA Transcriptomic Data

Author: Jie-Huei Wang, Po-Lin Hou, and Yi-Hau Chen
Subjects: Neoplasms. Tumors. Oncology. Including cancer and carcinogens, RC254-282
Abstract: Objectives: Under the classification of multicategory survival outcomes of cancer patients, it is crucial to identify biomarkers that affect specific outcome categories. The classification of multicategory survival outcomes from transcriptomic data has been thoroughly investigated in computational biology. Nevertheless, several challenges must be addressed, including the ultra-high-dimensional feature space, feature contamination, and data imbalance, all of which contribute to the instability of the diagnostic model. Furthermore, although most methods achieve accurate predicted performance for binary classification with high-dimensional transcriptomic data, their extension to multi-class classification is not straightforward. Methods: We employ the One-versus-One strategy to transform multi-class classification into multiple binary classification, and utilize the overlapping group screening procedure with binary logistic regression to include pathway information for identifying important genes and gene-gene interactions for multicategory survival outcomes. Results: A series of simulation studies are conducted to compare the classification accuracy of our proposed approach with some existing machine learning methods. In practical data applications, we utilize the random oversampling procedure to tackle class imbalance issues. We then apply the proposed method to analyze transcriptomic data from various cancers in The Cancer Genome Atlas, such as kidney renal papillary cell carcinoma, lung adenocarcinoma, and head and neck squamous cell carcinoma. Our aim is to establish an accurate microarray-based multicategory cancer diagnosis model. The numerical results illustrate that the new proposal effectively enhances cancer diagnosis compared to approaches that neglect pathway information. 
Conclusions: We showcase the effectiveness of the proposed method in terms of class prediction accuracy through evaluations on simulated synthetic datasets as well as real dataset applications. We also identified the cancer-related gene-gene interaction biomarkers and reported the corresponding network structure. According to the identified major genes and gene-gene interactions, we can predict for each patient the probabilities that he/she belongs to each of the survival outcome classes.
Published: 2024
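
The One-versus-One strategy named in the Methods can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the class labels, the helper names `ovo_pairs` and `ovo_predict`, and the toy nearest-centre rule are all hypothetical stand-ins for the fitted binary logistic regressions described in the abstract.

```python
from collections import Counter
from itertools import combinations


def ovo_pairs(classes):
    """Return every unordered pair of classes; K classes give K*(K-1)/2 pairs."""
    return list(combinations(sorted(classes), 2))


def ovo_predict(x, classes, binary_rule):
    """Predict a class by majority vote over all pairwise binary decisions.

    binary_rule(x, a, b) must return either a or b. Here it is a
    hypothetical stand-in for a fitted binary classifier.
    """
    votes = Counter(binary_rule(x, a, b) for a, b in ovo_pairs(classes))
    # Ties are broken deterministically by class label order.
    return max(sorted(classes), key=lambda c: votes[c])


# Toy rule (assumption): each class has a centre on the real line and the
# pairwise "classifier" picks the class whose centre is closer to x.
CENTRES = {"short": 0.0, "medium": 5.0, "long": 10.0}


def nearest_centre_rule(x, a, b):
    return a if abs(x - CENTRES[a]) <= abs(x - CENTRES[b]) else b
```

With three survival classes the sketch builds three pairwise problems, and a sample is assigned to the class that wins the most pairwise comparisons.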

2. Effects of vitamin D in pregnancy on maternal and offspring health-related outcomes: An umbrella review of systematic review and meta-analyses

Author: Mei-Chun Chien, Chueh-Yi Huang, Jie-Huei Wang, Chia-Lung Shih, and Pensee Wu
Subjects: Nutritional diseases. Deficiency diseases, RC620-627
Abstract: Background: Vitamin D deficiency has been linked with several adverse maternal and fetal outcomes. Objective: To summarize systematic reviews and meta-analyses evaluating the effects of vitamin D deficiency and of vitamin D supplementation in pregnancy on maternal and offspring health-related outcomes. Methods: Prior to conducting this umbrella review, we registered the protocol in PROSPERO (CRD42022368003). We conducted searches in PubMed, Embase, and Cochrane Library for systematic reviews and meta-analyses on vitamin D in pregnancy, from database inception to October 2, 2023. All outcomes related to vitamin D in pregnancy obtained from the systematic reviews and meta-analyses were extracted. Data Extraction: Two reviewers independently chose studies and collected information on health outcomes. The quality of the included articles’ methodology was assessed using AMSTAR 2 (A Measurement Tool to Assess Systematic Reviews–2). Results: We identified 16 eligible systematic reviews and meta-analyses, which included 250,569 women. Our results demonstrated that vitamin D deficiency in pregnancy is associated with increased risk of preterm birth, small-for-gestational-age/low birth weight infants, recurrent miscarriage, bacterial vaginosis and gestational diabetes mellitus. Vitamin D supplementation in pregnancy increases birth weight, and reduces the risk of maternal pre-eclampsia, miscarriage, and vitamin D deficiency, fetal or neonatal mortality, as well as attention-deficit hyperactivity disorder, and autism spectrum disorder in childhood. In women with gestational diabetes mellitus, vitamin D supplementation in pregnancy can reduce the risk of maternal hyperbilirubinemia, polyhydramnios, macrosomia, fetal distress, and neonatal hospitalization.
Conclusion: Due to the association with adverse maternal and offspring health outcomes, we recommend that vitamin D status in pregnancy be monitored, particularly in women at high risk of vitamin D deficiency. It is suggested that pregnant women take a dose of >400 IU/day of vitamin D supplementation during pregnancy to prevent certain adverse outcomes.
Published: 2024

3. Cancer Diagnosis by Gene-Environment Interactions via Combination of SMOTE-Tomek and Overlapped Group Screening Approaches with Application to Imbalanced TCGA Clinical and Genomic Data

Author: Jie-Huei Wang, Cheng-Yu Liu, You-Ruei Min, Zih-Han Wu, and Po-Lin Hou
Subjects: binary logistic regression, cancer diagnostic, gene-environment interaction, joint modeling, overlapping group screening, SMOTE-Tomek, Mathematics, QA1-939
Abstract: The complexity of cancer development involves intricate interactions among multiple biomarkers, such as gene-environment interactions. Utilizing microarray gene expression profile data for cancer classification is anticipated to be effective, thus drawing considerable interest in the fields of bioinformatics and computational biology. Due to the characteristics of genomic data, problems of high-dimensional interactions and noise interference do exist during the analysis process. When building cancer diagnosis models, we often face the dilemma of model adaptation errors due to an imbalance of data types. To mitigate the issues, we apply the SMOTE-Tomek procedure to rectify the imbalance problem. Following this, we utilize the overlapping group screening method alongside a binary logistic regression model to integrate gene pathway information, facilitating the identification of significant biomarkers associated with clinically imbalanced cancer or normal outcomes. Simulation studies across different imbalanced rates and gene structures validate our proposed method’s effectiveness, surpassing common machine learning techniques in terms of classification prediction accuracy. We also demonstrate that prediction performance improves with SMOTE-Tomek treatment compared to no imbalance treatment and SMOTE treatment across various imbalance rates. In the real-world application, we integrate clinical and gene expression data with prior pathway information. We employ SMOTE-Tomek and our proposed methods to identify critical biomarkers and gene-environment interactions linked to the imbalanced binary outcomes (cancer or normal) in patients from the Cancer Genome Atlas datasets of lung adenocarcinoma and breast invasive carcinoma. Our proposed method consistently achieves satisfactory classification accuracy. 
Additionally, we have identified biomarkers indicative of gene-environment interactions relevant to cancer and have provided corresponding estimates of odds ratios. Moreover, in high-dimensional imbalanced data, for achieving good prediction results, we recommend considering the order of balancing processing and feature screening.
Published: 2024
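
The Tomek-link half of the SMOTE-Tomek procedure mentioned in the abstract can be illustrated with a short sketch. It is written under assumptions and is not the authors' code: the SMOTE oversampling step is not shown, the 1-D toy data and labels are invented, and dropping only the majority-class member of each link is one common cleaning choice rather than something the abstract specifies.

```python
from math import dist


def nearest_neighbour(points, i):
    """Index of the point closest to points[i], excluding i itself."""
    others = (j for j in range(len(points)) if j != i)
    return min(others, key=lambda j: dist(points[i], points[j]))


def tomek_links(points, labels):
    """Return pairs (i, j), i < j, that are mutual nearest neighbours
    and carry different class labels."""
    nn = [nearest_neighbour(points, i) for i in range(len(points))]
    return [
        (i, j)
        for i, j in enumerate(nn)
        if i < j and nn[j] == i and labels[i] != labels[j]
    ]


# Toy 1-D data (assumption): four "normal" samples and two "cancer" samples.
POINTS = [(0.0,), (1.0,), (2.5,), (2.9,), (3.0,), (6.0,)]
LABELS = ["normal", "normal", "normal", "normal", "cancer", "cancer"]

links = tomek_links(POINTS, LABELS)
# One common cleaning choice: drop the majority-class member of each link.
majority_in_links = {i for pair in links for i in pair if LABELS[i] == "normal"}
cleaned = [p for i, p in enumerate(POINTS) if i not in majority_in_links]
```

In the toy data the only Tomek link joins the "normal" sample at 2.9 and the "cancer" sample at 3.0, so the cleaning step removes the borderline majority sample.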

4. Analyzing Treatment Effect by Integrating Existing Propensity Score and Outcome Regressions with Heterogeneous Covariate Sets

Author: Yi-Hau Chen, Szu-Yuan Hsu, Jie-Huei Wang, and Chien-Chou Su
Subjects: data integration, multi-center study, missing covariate, treatment effect, Mathematics, QA1-939
Abstract: Analyzing treatment or exposure effect is a major research theme in scientific studies. In the current big-data era where multiple sources of data are available, it is of interest to perform a synthesized analysis of treatment effects by integrating information from different data sources or studies. However, studies may contain heterogeneous and incomplete covariate sets, and individual data therein may not be accessible. We apply and extend the generalized meta-analysis method to integrate summary results (e.g., regression coefficients) of outcome and treatment (propensity score, PS) regression analyses across different datasets that may contain heterogeneous covariate sets. The proposed integrated analysis utilizes a reference dataset, which contains data on the complete set of covariates. The asymptotic distribution for the proposed integrated estimator is established. Simulations reveal that the proposed estimator performs well. We apply the proposed method to obtain the causal effect of waist circumference on hypertension by integrating two existing outcomes and PS regression analyses with different sets of covariates.
Published: 2024

5. Overlapping group screening for detection of gene-environment interactions with application to TCGA high-dimensional survival genomic data

Author: Jie-Huei Wang, Kang-Hsin Wang, and Yi-Hau Chen
Subjects: Gene-environment interaction, Joint model, Lasso, Overlapping group screening, Survival prediction, TCGA, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
Abstract: Background: In the context of biomedical and epidemiological research, gene-environment (G-E) interaction is of great significance to the etiology and progression of many complex diseases. In high-dimensional genetic data, two general models, marginal and joint models, are proposed to identify important interaction factors. Most existing approaches for identifying G-E interactions are limited owing to the lack of robustness to outliers/contamination in response and predictor data. In particular, right-censored survival outcomes make the associated feature screening even more challenging. In this article, we utilize the overlapping group screening (OGS) approach to select important G-E interactions related to clinical survival outcomes by incorporating the gene pathway information under a joint modeling framework. Results: Simulation studies under various scenarios are carried out to compare the performances of our proposed method with some commonly used methods. In the real data applications, we use our proposed method to identify G-E interactions related to the clinical survival outcomes of patients with head and neck squamous cell carcinoma, and esophageal carcinoma in The Cancer Genome Atlas clinical survival genetic data, and further establish corresponding survival prediction models. Both simulation and real data studies show that our method performs well and outperforms existing methods in the G-E interaction selection, effect estimation, and survival prediction accuracy. Conclusions: The OGS approach is useful for selecting important environmental factors, genes and G-E interactions in the ultra-high dimensional feature space. The prediction ability of OGS with the Lasso penalty is better than existing methods.
The same idea of the OGS approach can apply to other outcome models, such as the proportional odds survival time model, the logistic regression model for binary outcomes, and the multinomial logistic regression model for multi-class outcomes.
Published: 2022

6. Feature screening for survival trait with application to TCGA high-dimensional genomic data

Author: Jie-Huei Wang, Cai-Rong Li, and Po-Lin Hou
Subjects: Survival feature screening, High-dimensional genomic data, Network, Survival prediction, TCGA, Esophageal cancer, Medicine, Biology (General), QH301-705.5
Abstract: Background: In high-dimensional survival genomic data, identifying cancer-related genes is a challenging and important subject in the field of bioinformatics. In recent years, many feature screening approaches for survival outcomes with high-dimensional survival genomic data have been developed; however, few studies have systematically compared these methods. The primary purpose of this article is to conduct a series of simulation studies for systematic comparison; the second purpose of this article is to use these feature screening methods to further establish a more accurate prediction model for patient survival based on the survival genomic datasets of The Cancer Genome Atlas (TCGA). Results: Simulation studies prove that network-adjusted feature screening measurement performs well and outperforms existing popular univariate independent feature screening methods. In the application of real data, we show that the proposed network-adjusted feature screening approach leads to more accurate survival prediction than alternative methods that do not account for gene-gene dependency information. We also use TCGA clinical survival genetic data to identify biomarkers associated with clinical survival outcomes in patients with various cancers including esophageal, pancreatic, head and neck squamous cell, lung, and breast invasive carcinomas. Conclusions: These applications reveal advantages of the new proposed network-adjusted feature selection method over alternative methods that do not consider gene-gene dependency information. We also identify cancer-related genes that are almost detected in the literature. As a result, the network-based screening method is reliable and credible.
Published: 2022

7. Identification of Gene-Environment Interactions by Non-Parametric Kendall’s Partial Correlation with Application to TCGA Ultrahigh-Dimensional Survival Genomic Data

Author: Jie-Huei Wang and Chun-Ting Yang
Subjects: gene-environment interaction, kendall's correlation, marginal modeling, partial correlation, survival prediction, tcga, Biochemistry, QD415-436, Biology (General), QH301-705.5
Abstract: Background: In biomedical and epidemiological studies, gene-environment (G-E) interactions play an important role in the etiology and progression of many complex diseases. In ultra-high-dimensional survival genomic data, two common approaches (marginal and joint models) are proposed to determine important interaction biomarkers. Most existing methods for detecting G-E interactions (marginal Cox model and marginal accelerated failure time model) are limited by a lack of robustness to contamination/outliers in response outcome and prediction biomarkers. In particular, right-censored survival outcomes and ultra-high-dimensional feature space make relevant feature screening even more challenging. Methods: In this paper, we utilize the non-parametric Kendall’s partial correlation method to obtain pure correlation to determine the importance of G-E interactions concerning clinical survival data under a marginal modeling framework. Results: A series of simulated scenarios are conducted to compare the performance of our proposed method (Kendall’s partial correlation) with some commonly used methods (marginal Cox’s model, marginal accelerated failure time model, and censoring quantile partial correlation approach). In real data applications, we utilize Kendall’s partial correlation method to identify G-E interactions related to the clinical survival results of patients with esophageal, pancreatic, and lung carcinomas using The Cancer Genome Atlas clinical survival genetic data, and further establish survival prediction models. Conclusions: Overall, both simulation with medium censoring level and real data studies show that our method performs well and outperforms existing methods in the selection, estimation, and prediction accuracy of main and interacting biomarkers. These applications reveal the advantages of the non-parametric Kendall’s partial correlation approach over alternative semi-parametric marginal modeling methods. 
We also identified the cancer-related G-E interactions biomarkers and reported the corresponding coefficients with p-values.
Published: 2022

8. Overlapping group screening for detection of gene-gene interactions: application to gene expression profiles with survival trait

Author: Jie-Huei Wang and Yi-Hau Chen
Subjects: Gene-gene interaction, Lasso, Overlapping group, Survival prediction, Computer applications to medicine. Medical informatics, R858-859.7, Biology (General), QH301-705.5
Abstract: Background: The development of a disease is a complex process that may result from joint effects of multiple genes. In this article, we propose the overlapping group screening (OGS) approach to determining active genes and gene-gene interactions incorporating prior pathway information. The OGS method is developed to overcome the challenges in genome-wide data analysis that the number of the genes and gene-gene interactions is far greater than the sample size, and the pathways generally overlap with one another. The OGS method is further proposed for patients’ survival prediction based on gene expression data. Results: Simulation studies demonstrate that the performance of the OGS approach in identifying the true main and interaction effects is good and the survival prediction accuracy of OGS with the Lasso penalty is better than the ordinary Lasso method. In real data analysis, we identify several significant genes and/or epistasis interactions that are associated with clinical survival outcomes of diffuse large B-cell lymphoma (DLBCL) and non-small-cell lung cancer (NSCLC) by utilizing prior pathway information from the KEGG pathway and the GO biological process databases, respectively. Conclusions: The OGS approach is useful for selecting important genes and epistasis interactions in the ultra-high dimensional feature space. The prediction ability of OGS with the Lasso penalty is better than existing methods. The OGS approach is generally applicable to various types of outcome data (quantitative, qualitative, censored event time data) and regression models (e.g. linear, logistic, and Cox’s regression models).
Published: 2018

9. Identification of Gene-Environment Interactions by Non-Parametric Kendall's Partial Correlation with Application to TCGA Ultrahigh-Dimensional Survival Genomic Data

Author: Chun-Ting Yang and Jie-Huei Wang
Subjects: Lung Neoplasms, General Immunology and Microbiology, Humans, Computer Simulation, Gene-Environment Interaction, General Medicine, Genomics, General Biochemistry, Genetics and Molecular Biology
Abstract: In biomedical and epidemiological studies, gene-environment (G-E) interactions play an important role in the etiology and progression of many complex diseases. In ultra-high-dimensional survival genomic data, two common approaches (marginal and joint models) are proposed to determine important interaction biomarkers. Most existing methods for detecting G-E interactions (marginal Cox model and marginal accelerated failure time model) are limited by a lack of robustness to contamination/outliers in response outcome and prediction biomarkers. In particular, right-censored survival outcomes and ultra-high-dimensional feature space make relevant feature screening even more challenging. In this paper, we utilize the non-parametric Kendall's partial correlation method to obtain pure correlation to determine the importance of G-E interactions concerning clinical survival data under a marginal modeling framework. A series of simulated scenarios are conducted to compare the performance of our proposed method (Kendall's partial correlation) with some commonly used methods (marginal Cox's model, marginal accelerated failure time model, and censoring quantile partial correlation approach). In real data applications, we utilize Kendall's partial correlation method to identify G-E interactions related to the clinical survival results of patients with esophageal, pancreatic, and lung carcinomas using The Cancer Genome Atlas clinical survival genetic data, and further establish survival prediction models. Overall, both simulation with medium censoring level and real data studies show that our method performs well and outperforms existing methods in the selection, estimation, and prediction accuracy of main and interacting biomarkers. These applications reveal the advantages of the non-parametric Kendall's partial correlation approach over alternative semi-parametric marginal modeling methods.
We also identified the cancer-related G-E interactions biomarkers and reported the corresponding coefficients with
Published: 2022

10. Interaction screening by Kendall’s partial correlation for ultrahigh-dimensional data with survival trait

Author: Jie-Huei Wang and Yi-Hau Chen
Subjects: Statistics and Probability, Lung Neoplasms, Computer science, Correlation and dependence, Genome-wide association study, Interaction, Machine learning, computer.software_genre, 01 natural sciences, Biochemistry, 010104 statistics & probability, 03 medical and health sciences, Carcinoma, Non-Small-Cell Lung, Humans, 0101 mathematics, Molecular Biology, Partial correlation, Statistic, 030304 developmental biology, 0303 health sciences, Measure (data warehouse), business.industry, Computer Science Applications, Computational Mathematics, Identification (information), Phenotype, Computational Theory and Mathematics, Trait, Artificial intelligence, business, computer, Genome-Wide Association Study
Abstract: Motivation: In gene expression and genome-wide association studies, the identification of interaction effects is an important and challenging issue owing to its ultrahigh-dimensional nature. In particular, contaminated data and right-censored survival outcome make the associated feature screening even more challenging. Results: In this article, we propose an inverse probability-of-censoring weighted Kendall’s tau statistic to measure association of a survival trait with biomarkers, as well as a Kendall’s partial correlation statistic to measure the relationship of a survival trait with an interaction variable conditional on the main effects. The Kendall’s partial correlation is then used to conduct interaction screening. Simulation studies under various scenarios are performed to compare the performance of our proposal with some commonly available methods. In the real data application, we utilize our proposed method to identify epistasis associated with the clinical survival outcomes of non-small-cell lung cancer, diffuse large B-cell lymphoma and lung adenocarcinoma patients. Both simulation and real data studies demonstrate that our method performs well and outperforms existing methods in identifying main and interaction biomarkers. Availability and implementation: R-package ‘IPCWK’ is available to implement this method, together with a reference manual describing how to use the ‘IPCWK’ package. Supplementary information: Supplementary data are available at Bioinformatics online.
Published: 2020
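
As background for the statistic named in the abstract, the plain (unweighted) Kendall's tau can be sketched as follows. This is only the textbook tau-a on fully observed data. It does not handle censoring, does not apply the inverse probability-of-censoring weights, and is not the partial-correlation statistic or the ‘IPCWK’ package. The function name `kendall_tau_a` is a hypothetical choice.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """Kendall's tau-a: (concordant - discordant) / number of pairs.

    Pairs tied in x or y count as neither concordant nor discordant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs
```

Because the statistic depends only on the ordering of the values, replacing one observation with an extreme value of the same rank leaves it unchanged, which is the sense in which rank-based measures are robust to outlying data.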

11. Feature screening for survival trait with application to TCGA high-dimensional genomic data

Author: Jie-Huei Wang, Cai-Rong Li, and Po-Lin Hou
Subjects: General Neuroscience, General Medicine, General Agricultural and Biological Sciences, General Biochemistry, Genetics and Molecular Biology
Abstract: Background: In high-dimensional survival genomic data, identifying cancer-related genes is a challenging and important subject in the field of bioinformatics. In recent years, many feature screening approaches for survival outcomes with high-dimensional survival genomic data have been developed; however, few studies have systematically compared these methods. The primary purpose of this article is to conduct a series of simulation studies for systematic comparison; the second purpose of this article is to use these feature screening methods to further establish a more accurate prediction model for patient survival based on the survival genomic datasets of The Cancer Genome Atlas (TCGA). Results: Simulation studies prove that network-adjusted feature screening measurement performs well and outperforms existing popular univariate independent feature screening methods. In the application of real data, we show that the proposed network-adjusted feature screening approach leads to more accurate survival prediction than alternative methods that do not account for gene-gene dependency information. We also use TCGA clinical survival genetic data to identify biomarkers associated with clinical survival outcomes in patients with various cancers including esophageal, pancreatic, head and neck squamous cell, lung, and breast invasive carcinomas. Conclusions: These applications reveal advantages of the new proposed network-adjusted feature selection method over alternative methods that do not consider gene-gene dependency information. We also identify cancer-related genes that are almost detected in the literature. As a result, the network-based screening method is reliable and credible.
Published: 2021

12. Network-adjusted Kendall's Tau Measure for Feature Screening with Application to High-dimensional Survival Genomic Data

Author: Yi-Hau Chen and Jie-Huei Wang
Subjects: Statistics and Probability, 0303 health sciences, Markov chain, Computer science, Kendall tau rank correlation coefficient, Feature selection, Multivariate normal distribution, computer.software_genre, 01 natural sciences, Biochemistry, Computer Science Applications, 010104 statistics & probability, 03 medical and health sciences, Computational Mathematics, Identification (information), Computational Theory and Mathematics, Lasso (statistics), Data mining, 0101 mathematics, Molecular Biology, computer, Statistic, 030304 developmental biology
Abstract: Motivation: In high-dimensional genetic/genomic data, the identification of genes related to clinical survival trait is a challenging and important issue. In particular, right-censored survival outcomes and contaminated biomarker data make the relevant feature screening difficult. Several independence screening methods have been developed, but they fail to account for gene–gene dependency information, and may be sensitive to outlying feature data. Results: We improve the inverse probability-of-censoring weighted (IPCW) Kendall’s tau statistic by using Google’s PageRank Markov matrix to incorporate feature dependency network information. Also, to tackle outlying feature data, the nonparanormal approach transforming the feature data to multivariate normal variates is utilized in the graphical lasso procedure to estimate the network structure in feature data. Simulation studies under various scenarios show that the proposed network-adjusted weighted Kendall’s tau approach leads to more accurate feature selection and survival prediction than the methods without accounting for feature dependency network information and outlying feature data. The applications on the clinical survival outcome data of diffuse large B-cell lymphoma and of The Cancer Genome Atlas lung adenocarcinoma patients demonstrate clearly the advantages of the new proposal over the alternative methods. Supplementary information: Supplementary data are available at Bioinformatics online.
Published: 2020
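
The PageRank Markov matrix mentioned in the abstract can be illustrated with a generic sketch. It makes several assumptions: the damping factor 0.85 is the conventional default rather than a value taken from the paper, the 4-gene network is invented, and the way the paper combines this matrix with the IPCW Kendall's tau statistic is not described in the abstract and is not shown here.

```python
def google_matrix(adjacency, damping=0.85):
    """Row-stochastic Google matrix built from a 0/1 adjacency matrix.

    Rows with no links are treated as linking to every node uniformly.
    """
    n = len(adjacency)
    matrix = []
    for row in adjacency:
        degree = sum(row)
        if degree == 0:
            transition = [1.0 / n] * n
        else:
            transition = [value / degree for value in row]
        matrix.append([damping * t + (1.0 - damping) / n for t in transition])
    return matrix


def stationary_distribution(matrix, tol=1e-12, max_iter=1000):
    """Power iteration for the row vector pi satisfying pi = pi @ matrix."""
    n = len(matrix)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        new_pi = [sum(pi[i] * matrix[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new_pi, pi)) < tol:
            return new_pi
        pi = new_pi
    return pi


# Hypothetical 4-gene network: gene 0 is connected to all other genes.
ADJACENCY = [
    [0, 1, 1, 1],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 0, 0, 0],
]
scores = stationary_distribution(google_matrix(ADJACENCY))
```

In this toy network the hub gene receives the largest stationary score, which shows how a dependency network can up-weight well-connected features.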

13. Overlapping group screening for detection of gene-gene interactions: application to gene expression profiles with survival trait

Author: Yi-Hau Chen and Jie-Huei Wang
Subjects: 0301 basic medicine, Lung Neoplasms, Databases, Factual, Feature vector, Computational biology, Biology, lcsh:Computer applications to medicine. Medical informatics, Biochemistry, 03 medical and health sciences, Lasso (statistics), Structural Biology, Predictive Value of Tests, Carcinoma, Non-Small-Cell Lung, Humans, Computer Simulation, Gene-gene interaction, Molecular Biology, Gene, lcsh:QH301-705.5, Event (probability theory), Survival prediction, Applied Mathematics, Gene Expression Profiling, Regression analysis, Epistasis, Genetic, Overlapping group, Computer Science Applications, Survival Rate, 030104 developmental biology, lcsh:Biology (General), Sample size determination, Genetic Loci, Epistasis, lcsh:R858-859.7, Lymphoma, Large B-Cell, Diffuse, DNA microarray, Lasso, Transcriptome, Algorithms
Abstract: Background: The development of a disease is a complex process that may result from joint effects of multiple genes. In this article, we propose the overlapping group screening (OGS) approach to determining active genes and gene-gene interactions incorporating prior pathway information. The OGS method is developed to overcome the challenges in genome-wide data analysis that the number of the genes and gene-gene interactions is far greater than the sample size, and the pathways generally overlap with one another. The OGS method is further proposed for patients’ survival prediction based on gene expression data. Results: Simulation studies demonstrate that the performance of the OGS approach in identifying the true main and interaction effects is good and the survival prediction accuracy of OGS with the Lasso penalty is better than the ordinary Lasso method. In real data analysis, we identify several significant genes and/or epistasis interactions that are associated with clinical survival outcomes of diffuse large B-cell lymphoma (DLBCL) and non-small-cell lung cancer (NSCLC) by utilizing prior pathway information from the KEGG pathway and the GO biological process databases, respectively. Conclusions: The OGS approach is useful for selecting important genes and epistasis interactions in the ultra-high dimensional feature space. The prediction ability of OGS with the Lasso penalty is better than existing methods. The OGS approach is generally applicable to various types of outcome data (quantitative, qualitative, censored event time data) and regression models (e.g. linear, logistic, and Cox’s regression models).
Published: 2018

14. Additional file 1: of Overlapping group screening for detection of gene-gene interactions: application to gene expression profiles with survival trait

Author: Jie-Huei Wang and Yi-Hau Chen
Abstract: The full detail and performances of the OGS approach for survival, continuous and binary outcomes, and settings where some of genes are shared by three groups (pathways). (DOC 317 kb)
Published: 2018

15. Additional file 3: of Overlapping group screening for detection of gene-gene interactions: application to gene expression profiles with survival trait

Author: Jie-Huei Wang and Yi-Hau Chen
Abstract: A reference manual for the ‘OGS’ package. (PDF 77 kb)
Published: 2018

16. TSGSIS: a high-dimensional grouped variable selection approach for detection of whole-genome SNP-SNP interactions

Author: Jie-Huei Wang, Chao A. Hsiung, and Yao-Hwei Fang
Subjects: 0301 basic medicine, Statistics and Probability, Computer science, Single-nucleotide polymorphism, Genome-wide association study, Feature selection, Computational biology, 030105 genetics & heredity, Interaction, Biochemistry, Polymorphism, Single Nucleotide, Arthritis, Rheumatoid, 03 medical and health sciences, SNP, Humans, Computer Simulation, Genetic Predisposition to Disease, Molecular Biology, Genetic association, Models, Genetic, Epistasis, Genetic, Computer Science Applications, Computational Mathematics, Identification (information), 030104 developmental biology, Phenotype, Computational Theory and Mathematics, Epistasis, Algorithms, Software, Genome-Wide Association Study
Abstract: Motivation: Identification of single nucleotide polymorphism (SNP) interactions is an important and challenging topic in genome-wide association studies (GWAS). Many approaches have been applied to detecting whole-genome interactions. However, these approaches to interaction analysis tend to miss causal interaction effects when the individual marginal effects are uncorrelated to trait, while their interaction effects are highly associated with the trait. Results: A grouped variable selection technique, called two-stage grouped sure independence screening (TS-GSIS), is developed to study interactions that may not have marginal effects. The proposed TS-GSIS is shown to be very helpful in identifying not only causal SNP effects that are uncorrelated to trait but also their corresponding SNP–SNP interaction effects. The benefits of TS-GSIS are gaining detection of interaction effects by taking the joint information among the SNPs and determining the size of candidate sets in the model. Simulation studies under various scenarios are performed to compare performance of TS-GSIS and current approaches. We also apply our approach to a real rheumatoid arthritis (RA) dataset. Both the simulation and real data studies show that the TS-GSIS performs very well in detecting SNP–SNP interactions. Availability and implementation: R-package is delivered through CRAN and is available at: https://cran.r-project.org/web/packages/TSGSIS/index.html. Supplementary information: Supplementary data are available at Bioinformatics online.
Published: 2016
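
Note: The abstract above turns on the point that two variables can each be uncorrelated with a trait while their interaction is strongly associated with it. The toy simulation below is only a sketch of that phenomenon. It is not the TS-GSIS algorithm and does not use the TSGSIS R package. The function names `pearson` and `simulate`, the -1/+1 coding of the two variables, and the noise level are all assumptions made for illustration.

```python
import random


def pearson(x, y):
    """Sample Pearson correlation of two equal-length numeric lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=5000, seed=0):
    """Trait driven purely by an interaction (hypothetical toy model)."""
    rng = random.Random(seed)
    # Two independent binary variables coded -1/+1 (an illustrative coding,
    # not a genotype coding taken from the paper).
    x1 = [rng.choice((-1, 1)) for _ in range(n)]
    x2 = [rng.choice((-1, 1)) for _ in range(n)]
    # The trait depends only on the product x1 * x2, plus Gaussian noise.
    y = [a * b + rng.gauss(0, 0.5) for a, b in zip(x1, x2)]
    product = [a * b for a, b in zip(x1, x2)]
    return {
        "x1": pearson(x1, y),          # marginal association of x1
        "x2": pearson(x2, y),          # marginal association of x2
        "x1*x2": pearson(product, y),  # association of the interaction term
    }


if __name__ == "__main__":
    for name, r in simulate().items():
        print(f"{name}: {r:+.3f}")
```

In this setup the two marginal correlations are close to zero while the interaction correlation is large, so a screen that ranks variables only by marginal correlation would tend to discard both x1 and x2.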

17. TSGSIS: a high-dimensional grouped variable selection approach for detection of whole-genome SNP-SNP interactions.

Author: Yao-Hwei Fang, Jie-Huei Wang, and Chao A. Hsiung
Subjects: *SINGLE nucleotide polymorphisms, *GENOTYPE-environment interaction, *GENETIC polymorphisms, *PROTEIN-protein interactions, *HUMAN genome
Abstract: Motivation: Identification of single nucleotide polymorphism (SNP) interactions is an important and challenging topic in genome-wide association studies (GWAS). Many approaches have been applied to detecting whole-genome interactions. However, these approaches to interaction analysis tend to miss causal interaction effects when the individual marginal effects are uncorrelated to trait, while their interaction effects are highly associated with the trait. Results: A grouped variable selection technique, called two-stage grouped sure independence screening (TS-GSIS), is developed to study interactions that may not have marginal effects. The proposed TS-GSIS is shown to be very helpful in identifying not only causal SNP effects that are uncorrelated to trait but also their corresponding SNP-SNP interaction effects. The benefits of TS-GSIS are gaining detection of interaction effects by taking the joint information among the SNPs and determining the size of candidate sets in the model. Simulation studies under various scenarios are performed to compare performance of TS-GSIS and current approaches. We also apply our approach to a real rheumatoid arthritis (RA) dataset. Both the simulation and real data studies show that the TS-GSIS performs very well in detecting SNP-SNP interactions. [ABSTRACT FROM AUTHOR]
Published: 2017
Full Text: View/download PDF
