Toward Computer-Assisted Text Curation: Classification Is Easy (Choosing Training Data Can Be Hard...).
- Source :
- Linking Literature, Information & Knowledge for Biology; 2010, p33-42, 10p
- Publication Year :
- 2010
Abstract
- We aim to design a system for classifying scientific articles based on the presence of protein characterization experiments, intending to aid the curators populating JCVI's Characterized Protein (CHAR) Database of experimentally characterized proteins. We trained two classifiers using small datasets labeled by CHAR curators, and another classifier based on a much larger dataset using annotations from public databases. Performance varied greatly, in ways we did not anticipate. We describe the datasets, the classification method, and discuss the unexpected results. [ABSTRACT FROM AUTHOR]
Details
- Language :
- English
- ISBNs :
- 9783642131301
- Database :
- Complementary Index
- Journal :
- Linking Literature, Information & Knowledge for Biology
- Publication Type :
- Book
- Accession number :
- 76848329
- Full Text :
- https://doi.org/10.1007/978-3-642-13131-8_5