Extracting Protein-Protein Interactions (PPIs) from Biomedical Literature using Attention-based Relational Context Information

Authors :
Park, Gilchan
McCorkle, Sean
Soto, Carlos
Blaby, Ian
Yoo, Shinjae
Source :
In 2022 IEEE Big Data, pp. 2052-2061 (2022)
Publication Year :
2024

Abstract

Because protein-protein interactions (PPIs) are crucial to understanding living systems, harvesting these data is essential for probing disease development and discerning gene/protein functions and biological processes. Some curated datasets contain PPI data derived from the literature and other sources (e.g., IntAct, BioGrid, DIP, and HPRD). However, they are far from exhaustive, and their maintenance is labor-intensive. Meanwhile, machine learning methods to automate PPI knowledge extraction from the scientific literature have been limited by a shortage of appropriate annotated data. This work presents a unified, multi-source PPI corpus with vetted interaction definitions augmented by binary interaction type labels, and a Transformer-based deep learning method that exploits entities' relational context information for relation representation to improve relation classification performance. The model's performance is evaluated on four widely studied biomedical relation extraction datasets, as well as this work's target PPI datasets, to observe the effectiveness of the representation for relation extraction tasks across varied data. Results show the model outperforms prior state-of-the-art models. The code and data are available at: https://github.com/BNLNLP/PPI-Relation-Extraction

Comment: 10 pages, 3 figures, 7 tables, 2022 IEEE International Conference on Big Data (Big Data)

Details

Database :
arXiv
Journal :
In 2022 IEEE Big Data, pp. 2052-2061 (2022)
Publication Type :
Report
Accession number :
edsarx.2403.05602
Document Type :
Working Paper
Full Text :
https://doi.org/10.1109/BigData55660.2022.10021099