7 results for "Gaihong Yu"
Search Results
2. RCMR 280k: Refined Corpus for Move Recognition Based on PubMed
- Author
- Jie Li, Gaihong Yu, and Zhixiong Zhang
- Subjects
Artificial Intelligence, Library and Information Sciences, Computer Science Applications, Information Systems
- Abstract
Existing datasets for move recognition, such as PubMed 200k RCT, exhibit several problems that significantly impact recognition performance, especially for the Background and Objective labels. To improve move recognition performance, we introduce a method and construct a refined corpus based on PubMed, named RCMR 280k. This corpus comprises approximately 280,000 structured abstracts, totaling 3,386,008 sentences, each labeled with one of five categories: Background, Objective, Method, Result, or Conclusion. We also construct a subset of RCMR, named RCMR_RCT, corresponding to the medical subdomain of RCTs. We conduct comparison experiments using our RCMR and RCMR_RCT against PubMed 380k and PubMed 200k RCT, respectively. The best results, obtained using the MSMBERT model, show that: (1) our RCMR outperforms PubMed 380k by 0.82%, while our RCMR_RCT outperforms PubMed 200k RCT by 9.35%; (2) compared with PubMed 380k, our corpus achieves greater improvement on the Results and Conclusions categories, with average F1 improvements of 1% and 0.82%, respectively; (3) compared with PubMed 200k RCT, our corpus significantly improves performance in the Background and Objective categories, with average F1 improvements of 28.31% and 37.22%, respectively. To the best of our knowledge, RCMR is among the rare high-quality, resource-rich refined PubMed corpora available. Our work has been applied in the SciAIEngine, which is openly accessible for researchers to conduct move recognition tasks.
- Published
- 2023
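The move-recognition task described in the abstract above assigns each abstract sentence one of five rhetorical labels. As a minimal illustration of the task's input/output shape (not the MSMBERT model itself), the sketch below uses invented cue phrases; a real system would fine-tune a pretrained classifier instead.

```python
# Illustrative cue-phrase baseline for move recognition: each sentence
# receives one of the five RCMR labels. The cue lists are invented for
# illustration and are NOT from the paper.
MOVE_CUES = {
    "Background": ["previous studies", "is known", "remains unclear"],
    "Objective": ["we aim", "the purpose", "in order to"],
    "Method": ["we conduct", "was performed", "we use"],
    "Result": ["results show", "outperforms", "achieved"],
    "Conclusion": ["we conclude", "these findings suggest", "in summary"],
}

def recognize_move(sentence: str) -> str:
    """Return the first move label whose cue phrase appears in the sentence."""
    s = sentence.lower()
    for label, cues in MOVE_CUES.items():
        if any(cue in s for cue in cues):
            return label
    return "Method"  # arbitrary fallback when no cue matches
```

A learned model replaces the cue lookup with a per-sentence classification head, but the label set and sentence-level granularity are the same.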
3. Moves Recognition in Abstract of Research Paper Based on Deep Learning.
- Author
- Zhixiong Zhang, Huan Liu, Liangping Ding, Pengmin Wu, and Gaihong Yu
- Published
- 2019
4. Automatic Keyphrase Extraction from Scientific Chinese Medical Abstracts Based on Character-Level Sequence Labeling
- Author
- Liangping Ding, Zhixiong Zhang, Huan Liu, Gaihong Yu, and Jie Li
- Subjects
Computer science, Sequence labeling, Artificial intelligence, Natural language processing
- Abstract
Purpose Automatic keyphrase extraction (AKE) is an important task for grasping the main points of a text. In this paper, we aim to combine the benefits of the sequence labeling formulation and pretrained language models to propose an automatic keyphrase extraction model for Chinese scientific research. Design/methodology/approach We treat AKE from Chinese text as a character-level sequence labeling task to avoid segmentation errors from Chinese tokenizers, and initialize our model with the pretrained language model BERT, released by Google in 2018. We collect data from the Chinese Science Citation Database and construct a large-scale dataset from the medical domain, which contains 100,000 abstracts as the training set, 6,000 abstracts as the development set, and 3,094 abstracts as the test set. We use unsupervised keyphrase extraction methods, including term frequency (TF), TF-IDF, and TextRank, and supervised machine learning methods, including Conditional Random Fields (CRF), Bidirectional Long Short-Term Memory networks (BiLSTM), and BiLSTM-CRF, as baselines. Experiments are designed to compare word-level and character-level sequence labeling approaches on supervised machine learning models and BERT-based models. Findings Compared with character-level BiLSTM-CRF, the best baseline model with an F1 score of 50.16%, our character-level sequence labeling model based on BERT obtains an F1 score of 59.80%, a 9.64% absolute improvement. Research limitations We only consider the automatic keyphrase extraction task rather than keyphrase generation, so only keyphrases that occur in the given text can be extracted. In addition, our proposed dataset is not suitable for dealing with nested keyphrases.
Practical implications We make our character-level IOB-format dataset of Chinese Automatic Keyphrase Extraction from scientific Chinese medical abstracts (CAKE) publicly available for the benefit of the research community, at: https://github.com/possible1402/Dataset-For-Chinese-Medical-Keyphrase-Extraction. Originality/value By designing comparative experiments, our study demonstrates that the character-level formulation is more suitable for Chinese automatic keyphrase extraction under the general trend of pretrained language models. Our proposed dataset provides a unified method for model evaluation and can promote the development of Chinese automatic keyphrase extraction to some extent.
- Published
- 2021
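The character-level IOB formulation described in the abstract above tags each Chinese character as beginning (B), inside (I), or outside (O) a keyphrase, sidestepping word segmentation entirely. A minimal sketch of converting a text with known keyphrases into such tags (the example strings are invented, not from the CAKE dataset):

```python
# Sketch of character-level IOB tagging for Chinese keyphrase
# extraction: every character gets B/I/O depending on whether it
# begins, continues, or lies outside a keyphrase span.
def to_char_iob(text: str, keyphrases: list[str]) -> list[tuple[str, str]]:
    tags = ["O"] * len(text)
    for kp in keyphrases:
        start = text.find(kp)
        while start != -1:
            tags[start] = "B"
            for i in range(start + 1, start + len(kp)):
                tags[i] = "I"
            start = text.find(kp, start + len(kp))
    return list(zip(text, tags))
```

A sequence labeling model (CRF, BiLSTM-CRF, or BERT with a token classification head) is then trained to predict the tag column from the character column.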
5. Masked Sentence Model Based on BERT for Move Recognition in Medical Scientific Abstracts
- Author
- Gaihong Yu, Liangping Ding, Zhixiong Zhang, and Huan Liu
- Subjects
Computer science, Artificial neural network, Artificial intelligence, Language model, F1 score, Natural language processing, Sentence classification
- Abstract
Purpose Move recognition in scientific abstracts is an NLP task of classifying the sentences of an abstract into different types of language units. To improve the performance of move recognition in scientific abstracts, a novel model of move recognition is proposed that outperforms the BERT-based method. Design/methodology/approach Prevalent BERT-based models for sentence classification often classify sentences without considering their context. In this paper, inspired by the BERT masked language model (MLM), we propose a novel model called the masked sentence model that integrates the content and contextual information of the sentences for move recognition. Experiments are conducted on the benchmark dataset PubMed 20K RCT in three steps. We then compare our model with HSLN-RNN, BERT-based, and SciBERT models using the same dataset. Findings Compared with the BERT-based and SciBERT models, the F1 score of our model outperforms them by 4.96% and 4.34%, respectively, which shows the feasibility and effectiveness of the novel model; our results come closest to the current state-of-the-art results of HSLN-RNN. Research limitations The sequential features of move labels are not considered, which might be one reason why HSLN-RNN performs better. Our model is restricted to biomedical English literature because we fine-tune it on a dataset from PubMed, a typical biomedical database. Practical implications The proposed model is simpler and more effective at identifying move structures in scientific abstracts and is worth applying in text classification experiments for capturing contextual features of sentences. Originality/value The study proposes a masked sentence model based on BERT that considers the contextual features of the sentences in abstracts in a new way. The performance of this classification model is significantly improved by rebuilding the input layer without changing the structure of the neural networks.
- Published
- 2019
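The masked sentence model described above rebuilds the input layer so the classifier sees both the target sentence and its surrounding abstract. One plausible way to construct such an input, sketched here under the assumption of a BERT-style sentence-pair format (the exact input format of the paper may differ):

```python
# Hedged sketch of a masked-sentence input: the abstract supplies
# context while the sentence being classified is blanked out, then
# paired with the sentence itself. Follows the idea, not necessarily
# the exact format, of the masked sentence model.
MASK = "[MASK]"

def build_masked_input(sentences: list[str], target_idx: int) -> str:
    """Pair the target sentence with the abstract in which it is masked."""
    context = [MASK if i == target_idx else s
               for i, s in enumerate(sentences)]
    # target sentence first, masked abstract second, separated BERT-style
    return sentences[target_idx] + " [SEP] " + " ".join(context)
```

Because only the input layer changes, the same pretrained encoder and classification head can be reused unchanged, which matches the abstract's claim that the network structure is untouched.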
6. Vegfa signaling regulates diverse artery/vein formation in vertebrate vasculatures
- Author
- Yabo Fang, Dong Liu, Daqing Jin, Gaihong Yu, Tao P. Zhong, Fen Li, Yiwei Chen, Weijun Pan, and Diqi Zhu
- Subjects
Vascular Endothelial Growth Factor A, Embryonic Development, Neovascularization, Veins, Arteries, Dorsal aorta, Vasculogenesis, Sprouting angiogenesis, Common cardinal veins, Zebrafish, Endothelial stem cell, Genetics, Molecular Biology, Cell biology, Mutation, Signal Transduction
- Abstract
Vascular endothelial growth factor A (Vegfa) signaling regulates vascular development during embryogenesis and organ formation. However, the signaling mechanisms that govern the formation of various arteries and veins in different tissues are incompletely understood. In this study, we utilized transcription activator-like effector nucleases (TALENs) to generate zebrafish vegfaa mutants. vegfaa-/- embryos are embryonic lethal and display a complete loss of the dorsal aorta (DA) and expansion of the cardinal vein. Activation of Vegfa signaling expands the arterial cell population at the expense of venous cells during vasculogenesis of the axial vessels in the trunk. Vegfa signaling regulates endothelial cell (EC) proliferation after arterial-venous specification. Both Vegfa deficiency and overexpression inhibit the formation of tip cell filopodia and interfere with the pathfinding of intersegmental vessels (ISVs). In the head vasculature, vegfaa-/- causes loss of a pair of mesencephalic veins (MsVs) and central arteries (CtAs), both of which usually develop via sprouting angiogenesis. Our results indicate that Vegfa signaling induces the formation of the DA at the expense of the cardinal vein during trunk vasculogenesis, and that Vegfa is required for the angiogenic formation of MsVs and CtAs in the brain. These findings suggest that Vegfa signaling governs the formation of diverse arteries and veins by distinct cellular mechanisms in vertebrate vasculatures.
- Published
- 2017
7. Web-Based Multi-Dimensional Medical Image Collaborative Annotation System
- Author
- Hualei Shen, Dianfu Ma, Gaihong Yu, and Yonggang Huang
- Subjects
World Wide Web, Annotation, Information retrieval, Automatic image annotation, Thin client, Computer science, Web application, Text annotation, Image retrieval
- Abstract
Medical image annotation plays an increasingly important role in clinical diagnosis and medical research. Existing medical image annotation faces many demands and challenges: (1) the emergence and sharply increasing volume of multi-dimensional medical images; (2) image annotation includes not only text annotation but also graphical annotation, clinical diagnostic information, and image content features; (3) the uneven distribution of medical resources, which makes it difficult to aggregate group intelligence from a much larger scale of distributed experts. Most present studies are text-based, conducted within hospitals on single-image annotation, and it is difficult to organize and manage unstructured medical image annotations and collaboratively shared information. This paper is dedicated to research on collaborative web-based multi-dimensional medical image annotation and retrieval in order to address these problems, overcome the shortcomings of traditional thin clients, and facilitate medical experts in different locations exchanging views and comments. It proposes: (1) a system architecture that provides authoring, storing, querying, and exchanging of annotations, and supports web-based collaboration; (2) a collaborative annotation data model for 2D multi-frame and 3D medical images; (3) collaborative annotation mechanisms.
- Published
- 2012
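The abstract above lists the elements a multi-dimensional annotation record must carry: text notes, graphical marks, clinical information, and the frame/slice coordinates that anchor them in a 2D multi-frame series or 3D volume. A hypothetical sketch of such a record (all field names are invented for illustration; the paper's actual data model may differ):

```python
# Hypothetical annotation record combining the elements the paper
# enumerates. Field names and types are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class ImageAnnotation:
    image_id: str                 # identifier of the annotated study/series
    frame: int                    # frame index within a 2D multi-frame series
    slice_index: int              # slice index within a 3D volume
    text_note: str                # free-text annotation
    graphic: dict                 # e.g. {"type": "ellipse", "coords": [...]}
    clinical_info: dict = field(default_factory=dict)  # diagnostic metadata
```

Storing annotations as structured records like this, rather than free text, is what lets a server support the querying and exchanging of annotations the architecture calls for.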
Discovery Service for Jio Institute Digital Library