Author: "Tang, Buzhou" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Tang, Buzhou"' showing total 314 results

301. Using Character-Level and Entity-Level Representations to Enhance Bidirectional Encoder Representation From Transformers-Based Clinical Semantic Textual Similarity Model: ClinicalSTS Modeling Study.

Author: Xiong Y, Chen S, Chen Q, Yan J, and Tang B
Abstract: Background: With the popularity of electronic health records (EHRs), the quality of health care has been improved. However, there are also some problems caused by EHRs, such as the growing use of copy-and-paste and templates, resulting in EHRs of low quality in content. In order to minimize data redundancy in different documents, Harvard Medical School and Mayo Clinic organized a national natural language processing (NLP) clinical challenge (n2c2) on clinical semantic textual similarity (ClinicalSTS) in 2019. The task of this challenge is to compute the semantic similarity among clinical text snippets. Objective: In this study, we aim to investigate novel methods to model ClinicalSTS and analyze the results. Methods: We propose a semantically enhanced text matching model for the 2019 n2c2/Open Health NLP (OHNLP) challenge on ClinicalSTS. The model includes 3 representation modules to encode clinical text snippet pairs at different levels: (1) character-level representation module based on convolutional neural network (CNN) to tackle the out-of-vocabulary problem in NLP; (2) sentence-level representation module that adopts a pretrained language model bidirectional encoder representation from transformers (BERT) to encode clinical text snippet pairs; and (3) entity-level representation module to model clinical entity information in clinical text snippets. In the case of entity-level representation, we compare 2 methods. One encodes entities by the entity-type label sequence corresponding to text snippet (called entity I), whereas the other encodes entities by their representation in MeSH, a knowledge graph in the medical domain (called entity II). Results: We conduct experiments on the ClinicalSTS corpus of the 2019 n2c2/OHNLP challenge for model performance evaluation. The model only using BERT for text snippet pair encoding achieved a Pearson correlation coefficient (PCC) of 0.848. When character-level representation and entity-level representation are individually added into our model, the PCC increased to 0.857 and 0.854 (entity I)/0.859 (entity II), respectively. When both character-level representation and entity-level representation are added into our model, the PCC further increased to 0.861 (entity I) and 0.868 (entity II). Conclusions: Experimental results show that both character-level information and entity-level information can effectively enhance the BERT-based STS model. (©Ying Xiong, Shuai Chen, Qingcai Chen, Jun Yan, Buzhou Tang. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 29.12.2020.)
Published: 2020
Full Text: View/download PDF

302. Depression Risk Prediction for Chinese Microblogs via Deep-Learning Methods: Content Analysis.

Author: Wang X, Chen S, Li T, Li W, Zhou Y, Zheng J, Chen Q, Yan J, and Tang B
Abstract: Background: Depression is a serious personal and public mental health problem. Self-reporting is the main method used to diagnose depression and to determine the severity of depression. However, it is not easy to discover patients with depression owing to feelings of shame in disclosing or discussing their mental health conditions with others. Moreover, self-reporting is time-consuming, and usually leads to missing a certain number of cases. Therefore, automatic discovery of patients with depression from other sources such as social media has been attracting increasing attention. Social media, as one of the most important daily communication systems, connects large quantities of people, including individuals with depression, and provides a channel to discover patients with depression. In this study, we investigated deep-learning methods for depression risk prediction using data from Chinese microblogs, which have potential to discover more patients with depression and to trace their mental health conditions. Objective: The aim of this study was to explore the potential of state-of-the-art deep-learning methods on depression risk prediction from Chinese microblogs. Methods: Deep-learning methods with pretrained language representation models, including bidirectional encoder representations from transformers (BERT), robustly optimized BERT pretraining approach (RoBERTa), and generalized autoregressive pretraining for language understanding (XLNET), were investigated for depression risk prediction, and were compared with previous methods on a manually annotated benchmark dataset. Depression risk was assessed at four levels from 0 to 3, where 0, 1, 2, and 3 denote no inclination, and mild, moderate, and severe depression risk, respectively. The dataset was collected from the Chinese microblog Weibo. We also compared different deep-learning methods with pretrained language representation models in two settings: (1) publicly released pretrained language representation models, and (2) language representation models further pretrained on a large-scale unlabeled dataset collected from Weibo. Precision, recall, and F1 scores were used as performance evaluation measures. Results: Among the three deep-learning methods, BERT achieved the best performance with a microaveraged F1 score of 0.856. RoBERTa achieved the best performance with a macroaveraged F1 score of 0.424 on depression risk at levels 1, 2, and 3, which represents a new benchmark result on the dataset. The further pretrained language representation models demonstrated improvement over publicly released prediction models. Conclusions: We applied deep-learning methods with pretrained language representation models to automatically predict depression risk using data from Chinese microblogs. The experimental results showed that the deep-learning methods performed better than previous methods, and have greater potential to discover patients with depression and to trace their mental health conditions. (©Xiaofeng Wang, Shuai Chen, Tao Li, Wanting Li, Yejie Zhou, Jie Zheng, Qingcai Chen, Jun Yan, Buzhou Tang. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 29.07.2020.)
Published: 2020
Full Text: View/download PDF

303. A Method to Learn Embedding of a Probabilistic Medical Knowledge Graph: Algorithm Development.

Author: Li L, Wang P, Wang Y, Wang S, Yan J, Jiang J, Tang B, Wang C, and Liu Y
Abstract: Background: Knowledge graph embedding is an effective semantic representation method for entities and relations in knowledge graphs. Several translation-based algorithms, including TransE, TransH, TransR, TransD, and TranSparse, have been proposed to learn effective embedding vectors from typical knowledge graphs in which the relations between head and tail entities are deterministic. However, in medical knowledge graphs, the relations between head and tail entities are inherently probabilistic. This difference introduces a challenge in embedding medical knowledge graphs. Objective: We aimed to address the challenge of how to learn the probability values of triplets into representation vectors by making enhancements to existing TransX (where X is E, H, R, D, or Sparse) algorithms, including the following: (1) constructing a mapping function between the score value and the probability, and (2) introducing probability-based loss of triplets into the original margin-based loss function. Methods: We performed the proposed PrTransX algorithm on a medical knowledge graph that we built from large-scale real-world electronic medical records data. We evaluated the embeddings using link prediction task. Results: Compared with the corresponding TransX algorithms, the proposed PrTransX performed better than the TransX model in all evaluation indicators, achieving a higher proportion of corrected entities ranked in the top 10 and normalized discounted cumulative gain of the top 10 predicted tail entities, and lower mean rank. Conclusions: The proposed PrTransX successfully incorporated the uncertainty of the knowledge triplets into the embedding vectors. (©Linfeng Li, Peng Wang, Yao Wang, Shenghui Wang, Jun Yan, Jinpeng Jiang, Buzhou Tang, Chengliang Wang, Yuting Liu. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 21.05.2020.)
Published: 2020
Full Text: View/download PDF

304. Re-examination of Rule-Based Methods in Deidentification of Electronic Health Records: Algorithm Development and Validation.

Author: Zhao Z, Yang M, Tang B, and Zhao T
Abstract: Background: Deidentification of clinical records is a critical step before their publication. This is usually treated as a type of sequence labeling task, and ensemble learning is one of the best performing solutions. Under the framework of multi-learner ensemble, the significance of a candidate rule-based learner remains an open issue. Objective: The aim of this study is to investigate whether a rule-based learner is useful in a hybrid deidentification system and offer suggestions on how to build and integrate a rule-based learner. Methods: We chose a data-driven rule-learner named transformation-based error-driven learning (TBED) and integrated it into the best performing hybrid system in this task. Results: On the popular Informatics for Integrating Biology and the Bedside (i2b2) deidentification data set, experiments showed that TBED can offer high performance with its generated rules, and integrating the rule-based model into an ensemble framework, which reached an F1 score of 96.76%, achieved the best performance reported in the community. Conclusions: We proved the rule-based method offers an effective contribution to the current ensemble learning approach for the deidentification of clinical records. Such a rule system could be automatically learned by TBED, avoiding the high cost and low reliability of manual rule composition. In particular, we boosted the ensemble model with rules to create the best performance of the deidentification of clinical records. (©Zhenyu Zhao, Muyun Yang, Buzhou Tang, Tiejun Zhao. Originally published in JMIR Medical Informatics (http://medinform.jmir.org), 30.04.2020.)
Published: 2020
Full Text: View/download PDF

305. De-identification of Clinical Text via Bi-LSTM-CRF with Neural Language Models.

Author: Tang B, Jiang D, Chen Q, Wang X, Yan J, and Shen Y
Subjects: Deep Learning, Language, Data Anonymization, Natural Language Processing, Neural Networks, Computer
Abstract: De-identification of clinical text, the prerequisite of electronic clinical data reuse, is a typical named entity recognition (NER) problem. A number of state-of-the-art deep learning methods for NER, such as Bi-LSTM-CRF (bidirectional long-short-term-memory conditional random fields), have been applied for de-identification. Neural language models used for language representation bring great improvement in lots of NLP tasks when they are integrated with other deep learning methods. In this paper, we introduce Bi-LSTM-CRF with neural language models for de-identification of clinical text, and evaluate it on the de-identification datasets of the i2b2 2014 and the CEGS N-GRID 2016 challenges. Four neural language models of three types individually integrated with Bi-LSTM-CRF are compared in this study. Bi-LSTM-CRF with neural language models achieves the highest "strict" micro-averaged F1-score of 95.50% on the i2b2 2014 dataset and 91.82% on the CEGS N-GRID 2016 dataset, becoming new benchmark results on these two datasets respectively. Keywords: De-identification, Named entity recognition, Bidirectional long-short-term-memory, Conditional random fields, Neural language models. (©2019 AMIA - All rights reserved.)
Published: 2020

306. A hybrid method of recurrent neural network and graph neural network for next-period prescription prediction.

Author: Liu S, Li T, Ding H, Tang B, Wang X, Chen Q, Yan J, and Zhou Y
Abstract: Electronic health records (EHRs) have been widely used to help physicians to make decisions by predicting medical events such as diseases, prescriptions, outcomes, and so on. How to represent patient longitudinal medical data is the key to making these predictions. Recurrent neural network (RNN) is a popular model for patient longitudinal medical data representation from the view of patient status sequences, but it cannot represent complex interactions among different types of medical information, i.e., temporal medical event graphs, which can be represented by graph neural network (GNN). In this paper, we propose a hybrid method of RNN and GNN, called RGNN, for next-period prescription prediction from two views, where RNN is used to represent patient status sequences, and GNN is used to represent temporal medical event graphs. Experiments conducted on the public MIMIC-III ICU data show that the proposed method is effective for next-period prescription prediction, and RNN and GNN are mutually complementary. (© Springer-Verlag GmbH Germany, part of Springer Nature 2020.)
Published: 2020
Full Text: View/download PDF

307. CBN: Constructing a clinical Bayesian network based on data from the electronic medical record.

Author: Shen Y, Zhang L, Zhang J, Yang M, Tang B, Li Y, and Lei K
Subjects: Algorithms, Data Collection, False Positive Reactions, Humans, Knowledge Bases, Odds Ratio, Probability, ROC Curve, Risk Factors, Software, Bayes Theorem, Electronic Health Records, Medical Informatics methods
Abstract: The process of learning candidate causal relationships involving diseases and symptoms from electronic medical records (EMRs) is the first step towards learning models that perform diagnostic inference directly from real healthcare data. However, the existing diagnostic inference systems rely on knowledge bases such as ontology that are manually compiled through a labour-intensive process or automatically derived using simple pairwise statistics. We explore CBN, a Clinical Bayesian Network construction for medical ontology probabilistic inference, to learn high-quality Bayesian topology and complete ontology directly from EMRs. Specifically, we first extract medical entity relationships from over 10,000 deidentified patient records and adopt the odds ratio (OR value) calculation and the K2 greedy algorithm to automatically construct a Bayesian topology. Then, Bayesian estimation is used for the probability distribution. Finally, we employ a Bayesian network to complete the causal relationship and probability distribution of ontology to enhance the ontology inference capability. By evaluating the learned topology versus the expert opinions of physicians and entropy calculations and by calculating the ontology-based diagnosis classification, our study demonstrates that the direct and automated construction of a high-quality health topology and ontology from medical records is feasible. Our results are reproducible, and we will release the source code and CN-Stroke knowledge graph of this work after publication. (Copyright © 2018 Elsevier Inc. All rights reserved.)
Published: 2018
Full Text: View/download PDF

308. Usability Study of Mainstream Wearable Fitness Devices: Feature Analysis and System Usability Scale Evaluation.

Author: Liang J, Xian D, Liu X, Fu J, Zhang X, Tang B, and Lei J
Abstract: Background: Wearable devices have the potential to promote a healthy lifestyle because of their real-time data monitoring capabilities. However, device usability is a critical factor that determines whether they will be adopted on a large scale. Usability studies on wearable devices are still scarce. Objective: This study aims to compare the functions and attributes of seven mainstream wearable devices and to evaluate their usability. Methods: The wearable devices selected were the Apple Watch, Samsung Gear S, Fitbit Surge, Jawbone Up3, Mi Band, Huawei Honor B2, and Misfit Shine. A mixed method of feature comparison and a System Usability Scale (SUS) evaluation based on 388 participants was applied; the higher the SUS score, the better the usability of the product. Results: For features, all devices had step counting, an activity timer, and distance recording functions. The Samsung Gear S had a unique sports track recording feature and the Huawei Honor B2 had a unique wireless earphone. The Apple Watch, Samsung Gear S, Jawbone Up3, and Fitbit Surge could measure heart rate. All the devices were able to monitor sleep, except the Apple Watch. For product characteristics, including attributes such as weight, battery life, price, and 22 functions such as step counting, activity time, activity type identification, sleep monitoring, and expandable new features, we found a very weak negative correlation between the SUS scores and price (r=-.10, P=.03) and devices that support expandable new features (r=-.11, P=.02), and a very weak positive correlation between the SUS scores and devices that support the activity type identification function (r=.11, P=.02). The Huawei Honor B2 received the highest score of mean 67.6 (SD 16.1); the lowest Apple Watch score was only 61.4 (SD 14.7). No significant difference was observed among brands. The SUS score had a moderate positive correlation with the user's experience (length of time the device was used) (r=.32, P<.001); participants in the medical and health care industries gave a significantly higher score (mean 61.1, SD 17.9 vs mean 68.7, SD 14.5, P=.03). Conclusions: The functions of wearable devices tend to be homogeneous and usability is similar across various brands. Overall, Mi Band had the lowest price and the lightest weight. Misfit Shine had the longest battery life and most functions, and participants in the medical and health care industries had the best evaluation of wearable devices. The perceived usability of mainstream wearable devices is unsatisfactory and customer loyalty is not high. A consumer's SUS rating for a wearable device is related to their personal situation instead of the device brand. Device manufacturers should put more effort into developing innovative functions and improving the usability of their products by integrating more cognitive behavior change techniques. (©Jun Liang, Deqiang Xian, Xingyu Liu, Jing Fu, Xingting Zhang, Buzhou Tang, Jianbo Lei. Originally published in JMIR Mhealth and Uhealth (http://mhealth.jmir.org), 08.11.2018.)
Published: 2018
Full Text: View/download PDF

309. CMedTEX: A Rule-based Temporal Expression Extraction and Normalization System for Chinese Clinical Notes.

Author: Liu Z, Tang B, Wang X, Chen Q, Li H, Bu J, Jiang J, Deng Q, and Zhu S
Subjects: China, Humans, Information Storage and Retrieval, Time, Electronic Health Records, Natural Language Processing
Abstract: Time is an important aspect of information and is very useful for information utilization. The goal of this study was to analyze the challenges of temporal expression (TE) extraction and normalization in Chinese clinical notes by assessing the performance of a rule-based system developed by us on a manually annotated corpus (including 1,778 clinical notes of 281 hospitalized patients). In order to develop system conveniently, we divided TEs into three categories: direct, indirect and uncertain TEs, and designed different rules for each category of them. Evaluation on the independent test set shows that our system achieves an F-score of 93.40% on TE extraction, and an accuracy of 92.58% on TE normalization under "exact-match" criterion. Compared with HeidelTime for Chinese newswire text, our system is much better, indicating that it is necessary to develop a specific TE extraction and normalization system for Chinese clinical notes because of domain difference.
Published: 2017

310. Recognizing Disjoint Clinical Concepts in Clinical Text Using Machine Learning-based Methods.

Author: Tang B, Chen Q, Wang X, Wu Y, Zhang Y, Jiang M, Wang J, and Xu H
Subjects: Algorithms, Humans, Semantics, Machine Learning, Natural Language Processing, Pattern Recognition, Automated
Abstract: Clinical concept recognition (CCR) is a fundamental task in clinical natural language processing (NLP) field. Almost all current machine learning-based CCR systems can only recognize clinical concepts of consecutive words (called consecutive clinical concepts, CCCs), but can do nothing about clinical concepts of disjoint words (called disjoint clinical concepts, DCCs), which widely exist in clinical text. In this paper, we proposed two novel types of representations for disjoint clinical concepts, and applied two state-of-the-art machine learning methods to recognizing consecutive and disjoint concepts. Experiments conducted on the 2013 ShARe/CLEF challenge corpus showed that our best system achieved a "strict" F-measure of 0.803 for CCCs, a "strict" F-measure of 0.477 for DCCs, and a "strict" F-measure of 0.783 for all clinical concepts, significantly higher than the baseline systems by 4.2% and 4.1% respectively.
Published: 2015

311. The CHEMDNER corpus of chemicals and drugs and its annotation principles.

Author: Krallinger M, Rabal O, Leitner F, Vazquez M, Salgado D, Lu Z, Leaman R, Lu Y, Ji D, Lowe DM, Sayle RA, Batista-Navarro RT, Rak R, Huber T, Rocktäschel T, Matos S, Campos D, Tang B, Xu H, Munkhdalai T, Ryu KH, Ramanan SV, Nathan S, Žitnik S, Bajec M, Weber L, Irmer M, Akhondi SA, Kors JA, Xu S, An X, Sikdar UK, Ekbal A, Yoshioka M, Dieb TM, Choi M, Verspoor K, Khabsa M, Giles CL, Liu H, Ravikumar KE, Lamurias A, Couto FM, Dai HJ, Tsai RT, Ata C, Can T, Usié A, Alves R, Segura-Bedmar I, Martínez P, Oyarzabal J, and Valencia A
Abstract: The automatic extraction of chemical information from text requires the recognition of chemical entity mentions as one of its key steps. When developing supervised named entity recognition (NER) systems, the availability of a large, manually annotated text corpus is desirable. Furthermore, large corpora permit the robust evaluation and comparison of different approaches that detect chemicals in documents. We present the CHEMDNER corpus, a collection of 10,000 PubMed abstracts that contain a total of 84,355 chemical entity mentions labeled manually by expert chemistry literature curators, following annotation guidelines specifically defined for this task. The abstracts of the CHEMDNER corpus were selected to be representative for all major chemical disciplines. Each of the chemical entity mentions was manually labeled according to its structure-associated chemical entity mention (SACEM) class: abbreviation, family, formula, identifier, multiple, systematic and trivial. The difficulty and consistency of tagging chemicals in text was measured using an agreement study between annotators, obtaining a percentage agreement of 91. For a subset of the CHEMDNER corpus (the test set of 3,000 abstracts) we provide not only the Gold Standard manual annotations, but also mentions automatically detected by the 26 teams that participated in the BioCreative IV CHEMDNER chemical mention recognition task. In addition, we release the CHEMDNER silver standard corpus of automatically extracted mentions from 17,000 randomly selected PubMed abstracts. A version of the CHEMDNER corpus in the BioC format has been generated as well. We propose a standard for required minimum information about entity annotations for the construction of domain specific corpora on chemical and drug entities. The CHEMDNER corpus and annotation guidelines are available at: http://www.biocreative.org/resources/biocreative-iv/chemdner-corpus/.
Published: 2015
Full Text: View/download PDF

312. A comparison of conditional random fields and structured support vector machines for chemical entity recognition in biomedical literature.

Author: Tang B, Feng Y, Wang X, Wu Y, Zhang Y, Jiang M, Wang J, and Xu H
Abstract: Background: Chemical compounds and drugs (together called chemical entities) embedded in scientific articles are crucial for many information extraction tasks in the biomedical domain. However, only a very limited number of chemical entity recognition systems are publicly available, probably due to the lack of large manually annotated corpora. To accelerate the development of chemical entity recognition systems, the Spanish National Cancer Research Center (CNIO) and The University of Navarra organized a challenge on Chemical and Drug Named Entity Recognition (CHEMDNER). The CHEMDNER challenge contains two individual subtasks: 1) Chemical Entity Mention recognition (CEM); and 2) Chemical Document Indexing (CDI). Our study proposes machine learning-based systems for the CEM task. Methods: The 2013 CHEMDNER challenge organizers provided a manually annotated 10,000 UTF8-encoded PubMed abstracts according to a predefined annotation guideline: a training set of 3,500 abstracts, a development set of 3,500 abstracts and a test set of 3,000 abstracts. We developed machine learning-based systems, based on conditional random fields (CRF) and structured support vector machines (SSVM) respectively, for the CEM task for this data set. The effects of three types of word representation (WR) features, generated by Brown clustering, random indexing and skip-gram, on both two machine learning-based systems were also investigated. The performance of our system was evaluated on the test set using scripts provided by the CHEMDNER challenge organizers. Primary evaluation measures were micro Precision, Recall, and F-measure. Results: Our best system was among the top ranked systems with an official micro F-measure of 85.05%. Fixing a bug caused by inconsistent features marginally improved the performance (micro F-measure of 85.20%) of the system. Conclusions: The SSVM-based CEM systems outperformed the CRF-based CEM systems when using the same features. Each type of the WR feature was beneficial to the CEM task. Both the CRF-based and SSVM-based systems using the all three types of WR features showed better performance than the systems using only one type of the WR feature.
Published: 2015
Full Text: View/download PDF

313. Role of text mining in early identification of potential drug safety issues.

Author: Liu M, Hu Y, and Tang B
Subjects: Animals, Clinical Trials as Topic, Humans, Cosmetics adverse effects, Data Mining methods, Databases, Bibliographic, Medical Records Systems, Computerized, Social Media
Abstract: Drugs are an important part of today's medicine, designed to treat, control, and prevent diseases; however, besides their therapeutic effects, drugs may also cause adverse effects that range from cosmetic to severe morbidity and mortality. To identify these potential drug safety issues early, surveillance must be conducted for each drug throughout its life cycle, from drug development to different phases of clinical trials, and continued after market approval. A major aim of pharmacovigilance is to identify the potential drug-event associations that may be novel in nature, severity, and/or frequency. Currently, the state-of-the-art approach for signal detection is through automated procedures by analyzing vast quantities of data for clinical knowledge. There exists a variety of resources for the task, and many of them are textual data that require text analytics and natural language processing to derive high-quality information. This chapter focuses on the utilization of text mining techniques in identifying potential safety issues of drugs from textual sources such as biomedical literature, consumer posts in social media, and narrative electronic medical records.
Published: 2014
Full Text: View/download PDF

314. Extracting semantic lexicons from discharge summaries using machine learning and the C-Value method.

Author: Jiang M, Denny JC, Tang B, Cao H, and Xu H
Subjects: Humans, Natural Language Processing, Patient Discharge, Semantics, Terminology as Topic, Algorithms, Artificial Intelligence, Vocabulary, Controlled
Abstract: Semantic lexicons that link words and phrases to specific semantic types such as diseases are valuable assets for clinical natural language processing (NLP) systems. Although terminological terms with predefined semantic types can be generated easily from existing knowledge bases such as the Unified Medical Language Systems (UMLS), they are often limited and do not have good coverage for narrative clinical text. In this study, we developed a method for building semantic lexicons from clinical corpus. It extracts candidate semantic terms using a conditional random field (CRF) classifier and then selects terms using the C-Value algorithm. We applied the method to a corpus containing 10 years of discharge summaries from Vanderbilt University Hospital (VUH) and extracted 44,957 new terms for three semantic groups: Problem, Treatment, and Test. A manual analysis of 200 randomly selected terms not found in the UMLS demonstrated that 59% of them were meaningful new clinical concepts and 25% were lexical variants of existing concepts in the UMLS. Furthermore, we compared the effectiveness of corpus-derived and UMLS-derived semantic lexicons in the concept extraction task of the 2010 i2b2 clinical NLP challenge. Our results showed that the classifier with corpus-derived semantic lexicons as features achieved a better performance (F-score 82.52%) than that with UMLS-derived semantic lexicons as features (F-score 82.04%). We conclude that such corpus-based methods are effective for generating semantic lexicons, which may improve named entity recognition tasks and may aid in augmenting synonymy within existing terminologies.
Published: 2012
