27 results on "Bing, Pingping"
Search Results
2. MNNMDA: Predicting human microbe-disease association via a method to minimize matrix nuclear norm
- Author
-
Liu, Haiyan, Bing, Pingping, Zhang, Meijun, Tian, Geng, Ma, Jun, Li, Haigang, Bao, Meihua, He, Kunhui, He, Jianjun, He, Binsheng, and Yang, Jialiang
- Published
- 2023
- Full Text
- View/download PDF
3. Evaluating the performance of dropout imputation and clustering methods for single-cell RNA sequencing data
- Author
-
Xu, Junlin, Cui, Lingyu, Zhuang, Jujuan, Meng, Yajie, Bing, Pingping, He, Binsheng, Tian, Geng, Choi, Kwok Pui, Wu, Taoyang, Wang, Bing, and Yang, Jialiang
- Published
- 2022
- Full Text
- View/download PDF
4. A machine learning framework to trace tumor tissue-of-origin of 13 types of cancer based on DNA somatic mutation
- Author
-
He, Bingsheng, Dai, Chan, Lang, Jidong, Bing, Pingping, Tian, Geng, Wang, Bo, and Yang, Jialiang
- Published
- 2020
- Full Text
- View/download PDF
5. Glutamate and aspartate alleviate testicular/epididymal oxidative stress by supporting antioxidant enzymes and immune defense systems in boars
- Author
-
Tang, Wenjie, Wu, Jian, Jin, Shunshun, He, Liuqin, Lin, Qinlu, Luo, Feijun, He, Xingguo, Feng, Yanzhong, He, Binsheng, Bing, Pingping, Li, Tiejun, and Yin, Yulong
- Published
- 2020
- Full Text
- View/download PDF
6. Compensation effects of coated cysteamine on meat quality, amino acid composition, fatty acid composition, mineral content in dorsal muscle and serum biochemical indices in finishing pigs offered reduced trace minerals diet
- Author
-
Bai, Miaomiao, Liu, Hongnan, Xu, Kang, Zhang, Xiaofeng, Deng, Baichuan, Tan, Chengquan, Deng, Jinping, Bing, Pingping, and Yin, Yulong
- Published
- 2019
- Full Text
- View/download PDF
7. Synchrosqueezing Transform Based on Frequency-Domain Gaussian-Modulated Linear Chirp Model for Seismic Time–Frequency Analysis.
- Author
-
Bing, Pingping, Liu, Wei, Zhang, Haoqi, Zhu, Li, Zhu, Guiping, Zhou, Jun, and He, Binsheng
- Subjects
*TIME-frequency analysis, *FOURIER transforms
- Abstract
The synchrosqueezing transform (SST) has attracted much attention as a post-processing technique since it was proposed. In recent years, improvements to SST have been made. However, the existing methods are mainly based on the time-domain signal model, and the weak frequency modulation assumption for the components composing the signal is always taken into account. Thus, the signals characterized by a rapidly changing instantaneous frequency (IF) may fail to be adequately tackled. To address this problem, the paper presents a novel seismic time–frequency analysis method via synchrosqueezing transform where a frequency-domain Gaussian modulated linear chirp model is utilized to deduce the SST. The group delay (GD) rather than the IF estimator is implemented to compute an estimation of the ridge. Furthermore, a new synchrosqueezing operator is constructed to rearrange the energy around the ridge. A synthetic example verifies the efficiency and robustness of the proposed SST method, which generates better results than some classic time–frequency analysis (TFA) approaches, e.g., short-time Fourier transform (STFT) and STFT-based SST (FSST). A field dataset further demonstrates this method's potential in the delineation of subsurface geological structures. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
8. Dynamic bidding analysis in power market based on the supply function
- Author
-
Lai, Mingyong, Tong, Xiaojiao, Yang, Hongming, and Bing, Pingping
- Published
- 2009
- Full Text
- View/download PDF
9. DGHNE: network enhancement-based method in identifying disease-causing genes through a heterogeneous biomedical network.
- Author
-
He, Binsheng, Wang, Kun, Xiang, Ju, Bing, Pingping, Tang, Min, Tian, Geng, Guo, Cheng, Xu, Miao, and Yang, Jialiang
- Subjects
RECEIVER operating characteristic curves, PARKINSON'S disease, GENES, GENE regulatory networks
- Abstract
The identification of disease-causing genes is critical for mechanistic understanding of disease etiology and clinical manipulation in disease prevention and treatment. Yet the existing approaches in tackling this question are inadequate in accuracy and efficiency, demanding computational methods with higher identification power. Here, we proposed a new method called DGHNE to identify disease-causing genes through a heterogeneous biomedical network empowered by network enhancement. First, a disease–disease association network was constructed by the cosine similarity scores between phenotype annotation vectors of diseases, and a new heterogeneous biomedical network was constructed by using disease–gene associations to connect the disease–disease network and gene–gene network. Then, the heterogeneous biomedical network was further enhanced by using network embedding based on the Gaussian random projection. Finally, network propagation was used to identify candidate genes in the enhanced network. We applied DGHNE together with five other methods into the most updated disease–gene association database termed DisGeNet. Compared with all other methods, DGHNE displayed the highest area under the receiver operating characteristic curve and the precision-recall curve, as well as the highest precision and recall, in both the global 5-fold cross-validation and predicting new disease–gene associations. We further performed DGHNE in identifying the candidate causal genes of Parkinson's disease and diabetes mellitus, and the genes connecting hyperglycemia and diabetes mellitus. In all cases, the predicted causing genes were enriched in disease-associated gene ontology terms and Kyoto Encyclopedia of Genes and Genomes pathways, and the gene–disease associations were highly evidenced by independent experimental studies. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
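The DGHNE abstract in entry 9 mentions two computational steps: cosine similarity between phenotype annotation vectors, and network propagation to rank candidate genes. The following is a minimal sketch of both ideas, not the authors' implementation. The propagation is written here as a random walk with restart, which is an assumption (the abstract does not name the propagation scheme), and the function names, the `restart` parameter, and the toy data are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length numeric vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def random_walk_with_restart(adj, seeds, restart=0.5, tol=1e-10, max_iter=1000):
    """Propagate scores from seed nodes over a weighted adjacency matrix (list of lists).

    Assumption: random walk with restart stands in for the paper's propagation step.
    """
    n = len(adj)
    # Column-normalise the adjacency matrix so each non-empty column sums to 1.
    col_sums = [sum(adj[i][j] for i in range(n)) for j in range(n)]
    w = [[adj[i][j] / col_sums[j] if col_sums[j] else 0.0 for j in range(n)]
         for i in range(n)]
    # Start with all probability mass spread evenly over the seed nodes.
    p0 = [1.0 / len(seeds) if i in seeds else 0.0 for i in range(n)]
    p = p0[:]
    for _ in range(max_iter):
        nxt = [(1 - restart) * sum(w[i][j] * p[j] for j in range(n)) + restart * p0[i]
               for i in range(n)]
        # Stop once the L1 change between iterations is below the tolerance.
        if sum(abs(a - b) for a, b in zip(nxt, p)) < tol:
            return nxt
        p = nxt
    return p


# Toy example: a three-node path graph 0 - 1 - 2 with node 0 as the seed.
path_graph = [[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]]
scores = random_walk_with_restart(path_graph, seeds={0})
```

In this toy graph the seed keeps the highest score and scores fall off with distance from it, which is the behaviour a candidate-gene ranking relies on.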
10. Characteristic of homodrimane and maturity of organic matter in the low mature source rocks
- Author
-
Bing, Pingping, Cao, Siyuan, Lu, Jiaotong, and Wang, Zongjun
- Published
- 2012
- Full Text
- View/download PDF
11. Study on the Mechanism of Astragalus Polysaccharide in Treating Pulmonary Fibrosis Based on "Drug-Target-Pathway" Network.
- Author
-
Bing, Pingping, Zhou, Wenhu, and Tan, Songwen
- Subjects
PULMONARY fibrosis, ASTRAGALUS (Plants), POLYSACCHARIDES, MOLECULAR docking, PROTEIN-protein interactions, ANKLEBONE
- Abstract
Pulmonary fibrosis is a chronic, progressive and irreversible heterogeneous disease of pulmonary interstitial tissue. Its incidence is increasing year by year in the world, and it will be further increased due to the pandemic of COVID-19. However, at present, there is no safe and effective treatment for this disease, so it is very meaningful to find drugs with high efficiency and less adverse reactions. The natural astragalus polysaccharide has the pharmacological effect of anti-pulmonary fibrosis with little toxic and side effects. At present, the mechanism of anti-pulmonary fibrosis of astragalus polysaccharide is not clear. Based on the network pharmacology and molecular docking method, this study analyzes the mechanism of Astragalus polysaccharides in treating pulmonary fibrosis, which provides a theoretical basis for its further clinical application. The active components of Astragalus polysaccharides were screened out by Swisstarget database, and the related targets of pulmonary fibrosis were screened out by GeneCards database. Protein-protein interaction network analysis and molecular docking were carried out to verify the docking affinity of active ingredients. At present, through screening, we have obtained 92 potential targets of Astragalus polysaccharides for treating pulmonary fibrosis, including 11 core targets. Astragalus polysaccharides has the characteristics of multi-targets and multi-pathways, and its mechanism of action may be through regulating the expression of VCAM1, RELA, CDK2, JUN, CDK1, HSP90AA1, NOS2, SOD1, CASP3, AHSA1, PTGER3 and other genes during the development of pulmonary fibrosis. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
12. A Facile Route to Pyrazolo[1,2‐a]cinnoline via Rhodium(III)‐catalyzed Annulation of Pyrazolidinoes and Iodonium Ylides.
- Author
-
Yang, Zi, Zhou, Yi, Li, Haigang, Lei, Jieni, Bing, Pingping, He, Binsheng, and Li, Yaqian
- Subjects
YLIDES, RHODIUM, ANNULATION, FUNCTIONAL groups, RING formation (Chemistry)
- Abstract
A rhodium(III)‐catalysed cascade C−H bond activation/intramolecular cyclization of pyrazolidinones with iodonium ylide has been described, leading to the formation of various pyrazolo[1,2‐a]cinnolines. Herein, iodonium ylide serves as a carbene precursor. This protocol exhibits good regioselectivity and high functional group tolerability under mild reaction conditions. [ABSTRACT FROM AUTHOR]
- Published
- 2022
- Full Text
- View/download PDF
13. Evaluating Cancer-Related Biomarkers Based on Pathological Images: A Systematic Review.
- Author
-
Xie, Xiaoliang, Wang, Xulin, Liang, Yuebin, Yang, Jingya, Wu, Yan, Li, Li, Sun, Xin, Bing, Pingping, He, Binsheng, Tian, Geng, and Shi, Xiaoli
- Subjects
TUMOR markers, IMAGE segmentation, BIOMARKERS, COMPUTER-assisted image analysis (Medicine), IMAGE analysis
- Abstract
Many diseases are accompanied by changes in certain biochemical indicators called biomarkers in cells or tissues. A variety of biomarkers, including proteins, nucleic acids, antibodies, and peptides, have been identified. Tumor biomarkers have been widely used in cancer risk assessment, early screening, diagnosis, prognosis, treatment, and progression monitoring. For example, the number of circulating tumor cells (CTCs) is a prognostic indicator of breast cancer overall survival, and tumor mutation burden (TMB) can be used to predict the efficacy of immune checkpoint inhibitors. Currently, clinical methods such as polymerase chain reaction (PCR) and next generation sequencing (NGS) are mainly adopted to evaluate these biomarkers, which are time-consuming and expensive. Pathological image analysis is an essential tool in medical research, disease diagnosis and treatment, functioning by extracting important physiological and pathological information or knowledge from medical images. Recently, deep learning-based analysis on pathological images and morphology to predict tumor biomarkers has attracted great attention from both medical image and machine learning communities, as this combination not only reduces the burden on pathologists but also saves high costs and time. Therefore, it is necessary to summarize the current process of processing pathological images and key steps and methods used in each process, including: (1) pre-processing of pathological images, (2) image segmentation, (3) feature extraction, and (4) feature model construction. This will help people choose better and more appropriate medical image processing methods when predicting tumor biomarkers. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
14. Evaluation of the MGISEQ-2000 Sequencing Platform for Illumina Target Capture Sequencing Libraries.
- Author
-
Lang, Jidong, Zhu, Rongrong, Sun, Xue, Zhu, Siyu, Li, Tianbao, Shi, Xiaoli, Sun, Yanqi, Yang, Zhou, Wang, Weiwei, Bing, Pingping, He, Binsheng, and Tian, Geng
- Subjects
NUCLEOTIDE sequencing, SUPPLY & demand, SEQUENCE analysis, DATA analysis
- Abstract
Illumina is the leading sequencing platform in the next-generation sequencing (NGS) market globally. In recent years, MGI Tech has presented a series of new sequencers, including DNBSEQ-T7, MGISEQ-2000 and MGISEQ-200. As a complex application of NGS, cancer-detecting panels pose increasing demands for the high accuracy and sensitivity of sequencing and data analysis. In this study, we used the same capture DNA libraries constructed based on the Illumina protocol to evaluate the performance of the Illumina Nextseq500 and MGISEQ-2000 sequencing platforms. We found that the two platforms had high consistency in the results of hotspot mutation analysis; more importantly, we found that there was a significant loss of fragments in the 101–133 bp size range on the MGISEQ-2000 sequencing platform for Illumina libraries, but not for the capture DNA libraries prepared based on the MGISEQ protocol. This phenomenon may indicate fragment selection or low fragment ligation efficiency during the DNA circularization step, which is a unique step of the MGISEQ-2000 sequencing platform. In conclusion, these different sequencing libraries and corresponding sequencing platforms are compatible with each other, but protocol and platform selection need to be carefully evaluated in combination with research purpose. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
15. A Review of Current In Silico Methods for Repositioning Drugs and Chemical Compounds.
- Author
-
He, Binsheng, Hou, Fangxing, Ren, Changjing, Bing, Pingping, and Xiao, Xiangzuo
- Subjects
DOSAGE forms of drugs, NUCLEOTIDE sequencing, ANTINEOPLASTIC agents, TUMOR treatment, DRUG utilization
- Abstract
Drug repositioning is a new way of applying the existing therapeutics to new disease indications. Due to the exorbitant cost and high failure rate in developing new drugs, the continued use of existing drugs for treatment, especially anti-tumor drugs, has become a widespread practice. With the assistance of high-throughput sequencing techniques, many efficient methods have been proposed and applied in drug repositioning and individualized tumor treatment. Current computational methods for repositioning drugs and chemical compounds can be divided into four categories: (i) feature-based methods, (ii) matrix decomposition-based methods, (iii) network-based methods, and (iv) reverse transcriptome-based methods. In this article, we comprehensively review the widely used methods in the above four categories. Finally, we summarize the advantages and disadvantages of these methods and indicate future directions for more sensitive computational drug repositioning methods and individualized tumor treatment, which are critical for further experimental validation. [ABSTRACT FROM AUTHOR]
- Published
- 2021
- Full Text
- View/download PDF
16. Gene Coexpression Network and Module Analysis across 52 Human Tissues.
- Author
-
He, Binsheng, Xu, Junlin, Tian, Yingxiang, Liao, Bo, Lang, Jidong, Lin, Huixin, Mo, Xiaofei, Lu, Qingqing, Tian, Geng, and Bing, Pingping
- Subjects
TISSUE analysis, FALLOPIAN tube analysis, ADIPOSE tissues, CELL lines, CEREBELLUM, FISHER exact test, GENE expression, KIDNEYS, LIVER, LUNGS, METABOLISM, MOLECULAR structure, OMENTUM, SKIN physiology, UTERUS, BIOINFORMATICS, GENOTYPES
- Abstract
Gene coexpression analysis is widely used to infer gene modules associated with diseases and other clinical traits. However, a systematic view and comparison of gene coexpression networks and modules across a cohort of tissues are more or less ignored. In this study, we first construct gene coexpression networks and modules of 52 GTEx tissues and cell lines. The network modules are enriched in many tissue-common functions like organelle membrane and tissue-specific functions. We then study the correlation of tissues from the network point of view. As a result, the network modules of most tissues are significantly correlated, indicating a general similar network pattern across tissues. However, the level of similarity among the tissues is different. Tissues that are physically close to each other seem to be more similar in their coexpression networks. For example, the two adjacent tissues fallopian tube and bladder have the highest Fisher's exact test p value 8.54 E -291 among all tissue pairs. It is known that immune-associated modules are frequently identified in coexpression modules. In this study, we found immune modules in many tissues like liver, kidney cortex, lung, uterus, adipose subcutaneous, and adipose visceral omentum. However, not all tissues have immune-associated modules, for example, brain cerebellum. Finally, by the clique analysis, we identify the largest clique of modules, in which the genes in each module are significantly overlapped with those in other modules. As a result, we are able to find a clique of size 40 (out of 52 tissues), indicating a strong correlation of modules across tissues. It is not surprising that the 40 modules are most commonly enriched in immune-related functions. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
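The abstract of entry 16 scores the overlap between coexpression modules with Fisher's exact test. As a hedged illustration of how such an overlap p-value can be computed, the sketch below uses the one-sided hypergeometric tail. Treating the test as one-sided is an assumption, and the function name and the numbers in the example are hypothetical rather than taken from the paper.

```python
from math import comb


def overlap_p_value(total_genes, size_a, size_b, overlap):
    """One-sided Fisher's exact (hypergeometric) p-value for module overlap.

    Returns P(X >= overlap), where X is the number of genes shared by two
    modules of sizes size_a and size_b drawn from total_genes genes.
    """
    max_overlap = min(size_a, size_b)
    # Number of ways to choose module B from the gene universe.
    denom = comb(total_genes, size_b)
    # Sum the probabilities of every overlap at least as large as observed.
    tail = sum(comb(size_a, k) * comb(total_genes - size_a, size_b - k)
               for k in range(overlap, max_overlap + 1))
    return tail / denom


# Toy example: 10 genes in total, two modules of 5 genes that share all 5.
p = overlap_p_value(total_genes=10, size_a=5, size_b=5, overlap=5)
```

Exact integer arithmetic via `math.comb` keeps the tail sum accurate; very small p-values such as the one quoted in the abstract would still underflow a float, so a log-scale or library routine is the usual choice at that scale.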
17. Fragment Enrichment of Circulating Tumor DNA With Low-Frequency Mutations.
- Author
-
Liu, Xiaojun, Lang, Jidong, Li, Shijun, Wang, Yuehua, Peng, Lihong, Wang, Weitao, Han, Yingmin, Qi, Cuixiao, Song, Lei, Yang, Shuangshuang, Zhang, Kaixin, Zang, Guoliang, Pei, Hong, Lu, Qingqing, Peng, Yonggang, Xi, Shuxue, Wang, Weiwei, Yuan, Dawei, Bing, Pingping, and Zhou, Liqian
- Subjects
CELL-free DNA, GENE frequency, HEMATOLOGIC malignancies
- Abstract
Human blood contains cell-free DNA (cfDNA), with circulating tumor-derived DNAs (ctDNAs) widely used in cancer diagnosis and treatment. However, it is still difficult to efficiently and accurately identify and distinguish specific ctDNAs from normal cfDNA in cancer patient blood samples. In this study, ctDNA fragment length distribution analysis showed that ctDNA fragments are frequently shorter than the normal cfDNAs, which is consistent with previous findings. Interestingly, the ctDNA fragment length was found to be partially associated with the mutant allele frequency, with a low mutant allele frequency (< ~0.6%) associated with a longer ctDNA fragment length when compared to normal cfDNAs. The findings of this study contribute to improving the detection of low-frequency tumor mutations. [ABSTRACT FROM AUTHOR]
- Published
- 2020
- Full Text
- View/download PDF
18. Prediction of Protein Subcellular Localization Based on Fusion of Multi-view Features.
- Author
-
Li, Bo, Cai, Lijun, Liao, Bo, Fu, Xiangzheng, Bing, Pingping, Yang, Jialiang, and Zou, Quan
- Subjects
PROTEINS, PROTEIN-protein interactions, AMINO acids, AMINO acid sequence, DIPEPTIDES
- Abstract
The prediction of protein subcellular localization is critical for inferring protein functions, gene regulations and protein-protein interactions. With the advances of high-throughput sequencing technologies and proteomic methods, the protein sequences of numerous yeasts have become publicly available, which enables us to computationally predict yeast protein subcellular localization. However, widely-used protein sequence representation techniques, such as amino acid composition and the Chou's pseudo amino acid composition (PseAAC), are difficult in extracting adequate information about the interactions between residues and position distribution of each residue. Therefore, it is still urgent to develop novel sequence representations. In this study, we have presented two novel protein sequence representation techniques including Generalized Chaos Game Representation (GCGR) based on the frequency and distributions of the residues in the protein primary sequence, and novel statistics and information theory (NSI) reflecting local position information of the sequence. In the GCGR + NSI representation, a protein primary sequence is simply represented by a 5-dimensional feature vector, while other popular methods like PseAAC and dipeptide adopt features of more than hundreds of dimensions. In practice, the feature representation is highly efficient in predicting protein subcellular localization. Even without using machine learning-based classifiers, a simple model based on the feature vector can achieve prediction accuracies of 0.8825 and 0.7736 respectively for the CL317 and ZW225 datasets. To further evaluate the effectiveness of the proposed encoding schemes, we introduce a multi-view features-based method to combine the two above-mentioned features with other well-known features including PseAAC and dipeptide composition, and use support vector machine as the classifier to predict protein subcellular localization. 
This novel model achieves prediction accuracies of 0.927 and 0.871 respectively for the CL317 and ZW225 datasets, better than other existing methods in the jackknife tests. The results suggest that the GCGR and NSI features are useful complements to popular protein sequence representations in predicting yeast protein subcellular localization. Finally, we validate a few newly predicted protein subcellular localizations by evidences from some published articles in authority journals and books. [ABSTRACT FROM AUTHOR]
- Published
- 2019
- Full Text
- View/download PDF
19. A novel approach for denoising electrocardiogram signals to detect cardiovascular diseases using an efficient hybrid scheme.
- Author
-
Bing P, Liu W, Zhai Z, Li J, Guo Z, Xiang Y, He B, and Zhu L
- Abstract
Background: Electrocardiogram (ECG) signals are inevitably contaminated with various kinds of noises during acquisition and transmission. The presence of noises may produce the inappropriate information on cardiac health, thereby preventing specialists from making correct analysis. Methods: In this paper, an efficient strategy is proposed to denoise ECG signals, which employs a time-frequency framework based on S-transform (ST) and combines bi-dimensional empirical mode decomposition (BEMD) and non-local means (NLM). In the method, the ST maps an ECG signal into a subspace in the time frequency domain, then the BEMD decomposes the ST-based time-frequency representation (TFR) into a series of sub-TFRs at different scales, finally the NLM removes noise and restores ECG signal characteristics based on structural self-similarity. Results: The proposed method is validated using numerous ECG signals from the MIT-BIH arrhythmia database, and several different types of noises with varying signal-to-noise (SNR) are taken into account. The experimental results show that the proposed technique is superior to the existing wavelet based approach and NLM filtering, with the higher SNR and structure similarity index measure (SSIM), the lower root mean squared error (RMSE) and percent root mean square difference (PRD). Conclusions: The proposed method not only significantly suppresses the noise presented in ECG signals, but also preserves the characteristics of ECG signals better, thus, it is more suitable for ECG signals processing. Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. (© 2024 Bing, Liu, Zhai, Li, Guo, Xiang, He and Zhu.)
- Published
- 2024
- Full Text
- View/download PDF
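The Results of entry 19 compare denoising methods by SNR, RMSE, and PRD. The abstract does not give the formulas, so the sketch below assumes the commonly used definitions of these three metrics for a clean reference signal and its denoised estimate. The function names and the four-sample signals are hypothetical illustrations, not data from the paper.

```python
import math


def rmse(clean, denoised):
    """Root mean squared error between the reference and the denoised signal."""
    n = len(clean)
    return math.sqrt(sum((c - d) ** 2 for c, d in zip(clean, denoised)) / n)


def prd(clean, denoised):
    """Percent root mean square difference (residual energy relative to signal energy)."""
    err = sum((c - d) ** 2 for c, d in zip(clean, denoised))
    energy = sum(c * c for c in clean)
    return 100.0 * math.sqrt(err / energy)


def snr_db(clean, denoised):
    """Output signal-to-noise ratio in decibels (signal energy over residual energy)."""
    err = sum((c - d) ** 2 for c, d in zip(clean, denoised))
    energy = sum(c * c for c in clean)
    return 10.0 * math.log10(energy / err)


# Toy example: the estimate differs from the reference in the last sample only.
reference = [1.0, 2.0, 3.0, 4.0]
estimate = [1.0, 2.0, 3.0, 5.0]
metrics = (rmse(reference, estimate), prd(reference, estimate), snr_db(reference, estimate))
```

Under these definitions a better denoiser yields lower RMSE and PRD and higher SNR, which matches the direction of the comparison reported in the abstract.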
20. Minimal residual disease (MRD) detection in solid tumors using circulating tumor DNA: a systematic review.
- Author
-
Zhu L, Xu R, Yang L, Shi W, Zhang Y, Liu J, Li X, Zhou J, and Bing P
- Abstract
Minimal residual disease (MRD) refers to a very small number of residual tumor cells in the body during or after treatment, representing the persistence of the tumor and the possibility of clinical progress. Circulating tumor DNA (ctDNA) is a DNA fragment actively secreted by tumor cells or released into the circulatory system during the process of apoptosis or necrosis of tumor cells, which is emerging as a non-invasive biomarker to dynamically monitor the therapeutic effect and predict recurrence. The feasibility of ctDNA as MRD detection and the revolution in ctDNA-based liquid biopsies provides a potential method for cancer monitoring. In this review, we summarized the main methods of ctDNA detection (PCR-based Sequencing and Next-Generation Sequencing) and their advantages and disadvantages. Additionally, we reviewed the significance of ctDNA analysis to guide the adjuvant therapy and predict the relapse of lung, breast, and colon cancer, among others. Finally, there are still many challenges of MRD detection, such as lack of standardization, misleading false-negative or false-positive results, and the requirement of validation using large independent cohorts to improve clinical outcomes. Competing Interests: Authors LZ, YZ, JL were employed by Geneis Beijing Co. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. (Copyright © 2023 Zhu, Xu, Yang, Shi, Zhang, Liu, Li, Zhou and Bing.)
- Published
- 2023
- Full Text
- View/download PDF
21. Electrocardiogram classification using TSST-based spectrogram and ConViT.
- Author
-
Bing P, Liu Y, Liu W, Zhou J, and Zhu L
- Abstract
As an important auxiliary tool of arrhythmia diagnosis, Electrocardiogram (ECG) is frequently utilized to detect a variety of cardiovascular diseases caused by arrhythmia, such as cardiac mechanical infarction. In the past few years, the classification of ECG has always been a challenging problem. This paper presents a novel deep learning model called convolutional vision transformer (ConViT), which combines vision transformer (ViT) with convolutional neural network (CNN), for ECG arrhythmia classification, in which the unique soft convolutional inductive bias of gated positional self-attention (GPSA) layers integrates the superiorities of attention mechanism and convolutional architecture. Moreover, the time-reassigned synchrosqueezing transform (TSST), a newly developed time-frequency analysis (TFA) method where the time-frequency coefficients are reassigned in the time direction, is employed to sharpen pulse traits for feature extraction. Aiming at the class imbalance phenomena in the traditional ECG database, the SMOTE algorithm and focal loss (FL) are used for data augmentation and minority-class weighting, respectively. The experiment using MIT-BIH arrhythmia database indicates that the overall accuracy of the proposed model is as high as 99.5%. Furthermore, the specificity (Spe), F1-Score and positive Matthews Correlation Coefficient (MCC) of supra ventricular ectopic beat (S) and ventricular ectopic beat (V) are all more than 94%. These results demonstrate that the proposed method is superior to most of the existing methods. Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. (Copyright © 2022 Bing, Liu, Liu, Zhou and Zhu.)
- Published
- 2022
- Full Text
- View/download PDF
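Entry 21 uses focal loss to weight minority classes. For readers unfamiliar with it, the sketch below shows the standard per-sample focal loss, FL = -alpha * (1 - p)^gamma * log(p), where p is the predicted probability of the true class. The default values of `gamma` and `alpha` are assumptions chosen for illustration; the abstract does not report the settings the authors used.

```python
import math


def focal_loss(p_true_class, gamma=2.0, alpha=1.0):
    """Per-sample focal loss for the predicted probability of the true class.

    With gamma = 0 and alpha = 1 this reduces to ordinary cross-entropy.
    """
    # The (1 - p)^gamma factor shrinks the loss of well-classified samples.
    return -alpha * (1.0 - p_true_class) ** gamma * math.log(p_true_class)


# An easy sample (p = 0.9) is down-weighted far more than a hard one (p = 0.1).
easy = focal_loss(0.9)
hard = focal_loss(0.1)
```

Because confident, correct predictions contribute little, training focuses on hard examples, which in an imbalanced ECG dataset tend to come from the rarer beat classes.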
22. Upregulation of TIMM8A is correlated with prognosis and immune regulation in BC.
- Author
-
Zhang Y, Lin L, Wu Y, Bing P, Zhou J, and Yu W
- Abstract
Backgrounds: Breast cancer is a common malignant tumor in women. TIMM8A was up-regulated in different cancers. The aim of this work was to clarify the value of TIMM8A in the diagnosis, prognosis of Breast Cancer (BC), and its association with immune cells and immune detection points. Gene mutations. Methods: The transcription and expression profile of TIMM8A between BC and normal tissues was downloaded from The Cancer Genome atlas (TCGA). The expression of TIMM8A protein was evaluated by human protein map. The correlation between TIMM8A and clinical features was analyzed using the R package to establish a ROC diagnostic curve. cBioPortal and MethSurv were used to identify gene alterations and DNA methylation and their effects on prognosis. The tumor immune estimation resource (TIMER) database and tumor immune system interaction database (TISIDB) database were used to determine the relationship between TIMM8A gene expression levels and immune infiltration. The CTD database was used to predict related drugs that inhibit TIMM8A, and the PubChem database was used to determine the molecular structure of potentially effective drug small molecules. Results: The expression of TIMM8A in breast cancer tissues was significantly higher than that in normally adjacent tissues to cancer. ROC curve analysis showed that the AUC value of TIMM8A was 0.679. Kaplan-Meier method showed that patients with high TIMM8A had a lower prognosis (Overall Survival HR = 1.83 (1.31 - 2.54), P < 0.001) than patients with low TIMM8A expression of breast cancer (148.5 months vs. 115.4 months, P < 0.001). Methylation levels at seven CpG were associated with prognosis. Correlation analysis showed that TIMM8A expression was associated with tumor immune cell infiltration. There was a significant positive correlation of TIMM8A with PDL-1, and CTLA-4 in BC.
In addition, CTD database analysis identified 15 small molecular drugs that target TIMM8A, such as Cyclosporine, Leflunomide, and Tretinoin, which might be effective therapies for targeted inhibition of TIMM8A. Conclusion: In breast cancer, up-regulated TIMM8A was significantly related to lower survival rate and higher immune invasiveness. Our research showed that TIMM8A could be used as a biomarker for poor prognosis of breast cancer and a potential target of immunotherapy. Competing Interests: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. (Copyright © 2022 Zhang, Lin, Wu, Bing, Zhou and Yu.)
- Published
- 2022
- Full Text
- View/download PDF
23. Evaluating DNA Methylation, Gene Expression, Somatic Mutation, and Their Combinations in Inferring Tumor Tissue-of-Origin.
- Author
-
Liu H, Qiu C, Wang B, Bing P, Tian G, Zhang X, Ma J, He B, and Yang J
- Abstract
Carcinoma of unknown primary (CUP) is a type of metastatic cancer, the primary tumor site of which cannot be identified. CUP occupies approximately 5% of cancer incidences in the United States with usually unfavorable prognosis, making it a big threat to public health. Traditional methods to identify the tissue-of-origin (TOO) of CUP like immunohistochemistry can only deal with around 20% CUP patients. In recent years, more and more studies suggest that it is promising to solve the problem by integrating machine learning techniques with big biomedical data involving multiple types of biomarkers including epigenetic, genetic, and gene expression profiles, such as DNA methylation. Different biomarkers play different roles in cancer research; for example, genomic mutations in a patient's tumor could lead to specific anticancer drugs for treatment; DNA methylation and copy number variation could reveal tumor tissue of origin and molecular classification. However, there is no systematic comparison on which biomarker is better at identifying the cancer type and site of origin. In addition, it might also be possible to further improve the inference accuracy by integrating multiple types of biomarkers. In this study, we used primary tumor data rather than metastatic tumor data. Although the use of primary tumors may lead to some biases in our classification model, their tumor-of-origins are known. In addition, previous studies have suggested that the CUP prediction model built from primary tumors could efficiently predict TOO of metastatic cancers (Lal et al., 2013; Brachtel et al., 2016). We systematically compared the performances of three types of biomarkers including DNA methylation, gene expression profile, and somatic mutation as well as their combinations in inferring the TOO of CUP patients. 
First, we downloaded the gene expression profile, somatic mutation and DNA methylation data of 7,224 tumor samples across 21 common cancer types from the cancer genome atlas (TCGA) and generated seven different feature matrices through various combinations. Second, we performed feature selection by the Pearson correlation method. The selected features for each matrix were used to build up an XGBoost multi-label classification model to infer cancer TOO, an algorithm proven to be effective in a few previous studies. The performance of each biomarker and combination was compared by the 10-fold cross-validation process. Our results showed that the TOO tracing accuracy using gene expression profile was the highest, followed by DNA methylation, while somatic mutation performed the worst. Meanwhile, we found that simply combining multiple biomarkers does not have much effect in improving prediction accuracy. Competing Interests: BW, GT, and JY were employed by the company Geneis Beijing Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest. (Copyright © 2021 Liu, Qiu, Wang, Bing, Tian, Zhang, Ma, He and Yang.)
- Published
- 2021
- Full Text
- View/download PDF
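The abstract above names "feature selection by the Pearson correlation method" without giving details. As a minimal sketch only, assuming features are ranked by the absolute Pearson correlation between each feature column and a one-vs-rest indicator for a cancer type, the idea could look as follows. The function names `pearson` and `top_k_features` and the toy data are hypothetical and are not taken from the paper.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: treat as uninformative
    return cov / sqrt(var_x * var_y)


def top_k_features(rows, labels, target, k):
    """Return indices of the k columns most correlated (by |r|) with `target`.

    Assumption: correlation is measured against a 0/1 indicator that marks
    samples of the `target` class (one-vs-rest).
    """
    indicator = [1.0 if lab == target else 0.0 for lab in labels]
    scores = []
    for j in range(len(rows[0])):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, indicator)), j))
    # Highest |r| first; ties broken by column index for determinism.
    scores.sort(key=lambda s: (-s[0], s[1]))
    return [j for _, j in scores[:k]]


# Toy matrix: 4 samples x 3 features (hypothetical values).
rows = [
    [5.0, 1.0, 0.2],
    [6.0, 1.0, 0.9],
    [1.0, 1.0, 0.3],
    [0.5, 1.0, 0.8],
]
labels = ["lung", "lung", "breast", "breast"]
selected = top_k_features(rows, labels, target="lung", k=1)
```

In this toy matrix only the first column separates the two classes, so it is the one selected; the selected columns would then be passed to a classifier.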
24. A New Method for CTC Images Recognition Based on Machine Learning.
- Author
-
He B, Lu Q, Lang J, Yu H, Peng C, Bing P, Li S, Zhou Q, Liang Y, and Tian G
- Abstract
Circulating tumor cells (CTCs) derived from primary tumors and/or metastatic tumors are markers for tumor prognosis, and can also be used to monitor therapeutic efficacy and tumor recurrence. CTC enrichment and screening can be automated, but the final counting of CTCs currently requires manual intervention. This not only requires the participation of experienced pathologists, but is also prone to human misjudgment. Medical image recognition based on machine learning can effectively reduce the workload and improve the level of automation, so we used machine learning to identify CTCs. First, we collected the CTC test results of 600 patients. After immunofluorescence staining, each picture presented a positive CTC cell nucleus and several negative controls. The images of CTCs were then segmented by image denoising, image filtering, edge detection, and image expansion and contraction techniques using Python's OpenCV library. Subsequently, traditional image recognition methods and machine learning were used to identify CTCs. The machine learning algorithms were implemented by training a convolutional neural network. We took 2300 cells from 600 patients for training and testing. About 1300 cells were used for training and the others were used for testing. The sensitivity and specificity of recognition reached 90.3% and 91.3%, respectively. We will further revise our models, hoping to achieve higher sensitivity and specificity. (Copyright © 2020 He, Lu, Lang, Yu, Peng, Bing, Li, Zhou, Liang and Tian.)
- Published
- 2020
- Full Text
- View/download PDF
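The sensitivity and specificity reported in the abstract above are standard confusion-matrix ratios. The following is a minimal sketch of how they are computed for a binary CTC / non-CTC classifier; the function name and the toy labels are hypothetical and are not the study's data.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = CTC, 0 = non-CTC."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # missed CTCs
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    sensitivity = tp / (tp + fn)  # share of real CTCs that were detected
    specificity = tn / (tn + fp)  # share of non-CTCs correctly rejected
    return sensitivity, specificity


# Toy labels: 4 CTCs and 6 non-CTC cells (hypothetical values).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Here three of four CTCs are detected and five of six non-CTC cells are rejected, giving a sensitivity of 0.75 and a specificity of 5/6.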
25. A Neural Network Framework for Predicting the Tissue-of-Origin of 15 Common Cancer Types Based on RNA-Seq Data.
- Author
-
He B, Zhang Y, Zhou Z, Wang B, Liang Y, Lang J, Lin H, Bing P, Yu L, Sun D, Luo H, Yang J, and Tian G
- Abstract
Sequencing-based identification of tumor tissue-of-origin (TOO) is critical for patients with cancer of unknown primary lesions. Even if the TOO of a tumor can be diagnosed by clinicopathological observation, reevaluation by computational methods can help avoid misdiagnosis. In this study, we developed a neural network (NN) framework using the expression of a 150-gene panel to infer the tumor TOO for 15 common solid tumor cancer types, including lung, breast, liver, colorectal, gastroesophageal, ovarian, cervical, endometrial, pancreatic, bladder, head and neck, thyroid, prostate, kidney, and brain cancers. To begin with, we downloaded the RNA-Seq data of 7,460 primary tumor samples across the above-mentioned 15 cancer types, with each type of cancer having between 142 and 1,052 samples, from The Cancer Genome Atlas. Then, we performed feature selection by the Pearson correlation method and performed a 150-gene panel analysis; the genes were significantly enriched in the GO:2001242 Regulation of intrinsic apoptotic signaling pathway, the GO:0009755 Hormone-mediated signaling pathway, and other similar functions. Next, we developed a novel NN model using the 150 genes to predict tumor TOO for the 15 cancer types. The average prediction sensitivity and precision of the framework are 93.36% and 94.07%, respectively, for the 7,460 tumor samples based on 10-fold cross-validation; moreover, the prediction sensitivity and precision for a few specific cancers, such as prostate cancer, reached 100%. We also tested the trained model on an independent dataset of 20 metastatic tumor samples and achieved 80% accuracy. In summary, we present here a highly accurate method to infer tumor TOO, which has potential for clinical implementation. (Copyright © 2020 He, Zhang, Zhou, Wang, Liang, Lang, Lin, Bing, Yu, Sun, Luo, Yang and Tian.)
- Published
- 2020
- Full Text
- View/download PDF
26. Assessing the Impact of Data Preprocessing on Analyzing Next Generation Sequencing Data.
- Author
-
He B, Zhu R, Yang H, Lu Q, Wang W, Song L, Sun X, Zhang G, Li S, Yang J, Tian G, Bing P, and Lang J
- Abstract
Data quality control and preprocessing are often the first step in processing next-generation sequencing (NGS) data of tumors. They not only help us evaluate the quality of sequencing data, but also help us obtain high-quality data for downstream analysis. However, by comparing the analysis results obtained after preprocessing with Cutadapt, FastP, and Trimmomatic against those obtained from raw sequencing data, we found that the frequency of mutation detection showed some fluctuations and differences, and that human leukocyte antigen (HLA) typing directly produced erroneous results. We believe that our research has demonstrated the impact of data preprocessing steps on downstream data analysis results. We hope that it can promote the development or optimization of better data preprocessing methods, so that downstream information analysis can be more accurate. (Copyright © 2020 He, Zhu, Yang, Lu, Wang, Song, Sun, Zhang, Li, Yang, Tian, Bing and Lang.)
- Published
- 2020
- Full Text
- View/download PDF
27. TOOme: A Novel Computational Framework to Infer Cancer Tissue-of-Origin by Integrating Both Gene Mutation and Expression.
- Author
-
He B, Lang J, Wang B, Liu X, Lu Q, He J, Gao W, Bing P, Tian G, and Yang J
- Abstract
Metastatic cancers require further diagnosis to determine their primary tumor sites. However, according to statistics from the United States, the tissue-of-origin of around 5% of tumors cannot be identified by routine medical diagnosis. With the development of machine learning techniques and the accumulation of large cancer datasets from The Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO), it is now feasible to predict cancer tissue-of-origin with computational tools. A metastatic tumor inherits characteristics from its tissue-of-origin, and both gene expression profile and somatic mutation have tissue specificity. Thus, we developed a computational framework to infer tumor tissue-of-origin by integrating both gene mutation and expression (TOOme). Specifically, we first perform feature selection on both gene expressions and mutations by a random forest method. The selected features are then used to build a multi-label classification model to infer cancer tissue-of-origin. We adopt a few popular multi-label classification methods, which are compared by 10-fold cross-validation. We applied TOOme to the TCGA data containing 7,008 non-metastatic samples across 20 solid tumors. Seventy-four genes by gene expression profile and six genes by gene mutation are selected by the random forest process, which can be divided into two categories: (1) cancer type specific genes and (2) those expressed or mutated in several cancers with different levels of expression or mutation rates. Function analysis indicates that the selected genes are significantly enriched in gland development, urogenital system development, hormone metabolic process, thyroid hormone generation, prostate hormone generation, and so on. Among the multi-label classification methods, random forest performs the best, with a 10-fold cross-validation prediction accuracy of 96%.
We also use the 19 metastatic samples from TCGA and 256 cancer samples downloaded from GEO as independent testing data, for which TOOme achieves a prediction accuracy of 89%. The cross-validation accuracy is better than that obtained using gene expression (95%) or gene mutation (53%) alone. In conclusion, TOOme provides a quick yet accurate alternative to traditional medical methods in inferring cancer tissue-of-origin. In addition, the methods combining somatic mutation and gene expression outperform those using gene expression or mutation alone. (Copyright © 2020 He, Lang, Wang, Liu, Lu, He, Gao, Bing, Tian and Yang.)
- Published
- 2020
- Full Text
- View/download PDF
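Several of the abstracts above compare models by 10-fold cross-validation. As a minimal sketch of the splitting step only, assuming a plain shuffled split (not stratified by cancer type, which the abstracts do not specify), the folds could be generated as follows. The helper name `k_fold_indices` is hypothetical.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs so each sample is tested exactly once."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed for reproducible folds
    # Deal the shuffled indices round-robin into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, test_idx


# 25 toy samples split into 10 folds.
splits = list(k_fold_indices(n_samples=25, k=10))
```

A model would be trained on each `train_idx` and scored on the matching `test_idx`, and the ten scores averaged to give the cross-validation accuracy.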