
Generating Pseudo-ground Truth for Predicting New Concepts in Social Streams

Authors :
Graus, D.
Tsagkias, M.
Buitinck, L.
de Rijke, M.
Kenter, T.
de Vries, A.P.
Zhai, C.X.
de Jong, F.
Radinsky, K.
Hofmann, K.
Information and Language Processing Syst (IVI, FNWI)
Source :
Advances in Information Retrieval: 36th European Conference on IR Research, ECIR 2014, Amsterdam, The Netherlands, April 13-16, 2014: Proceedings, pp. 286-298. Lecture Notes in Computer Science. ISBN: 9783319060279
Publication Year :
2014
Publisher :
Springer, 2014.

Abstract

The manual curation of knowledge bases is a bottleneck in fast-paced domains where new concepts constantly emerge. Identifying nascent concepts is important for improving early entity linking, content interpretation, and recommendation of new content in real-time applications. We present an unsupervised method for generating pseudo-ground truth for training a named entity recognizer to identify entities that will become concepts in a knowledge base, in the setting of social streams. We show that our method is able to deal with missing labels, justifying the use of pseudo-ground truth generation in this task. Finally, we show that our method significantly outperforms a lexical-matching baseline by leveraging strategies for sampling pseudo-ground truth based on entity confidence scores and the textual quality of input documents.
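
The sampling idea mentioned at the end of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Mention` and `Post` types, the `sample_pseudo_ground_truth` function, the score ranges, and both threshold values are hypothetical, and simple thresholding is only one assumed way of sampling by entity confidence and textual quality.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Mention:
    text: str
    confidence: float  # hypothetical entity confidence score in [0, 1]


@dataclass
class Post:
    text: str
    quality: float  # hypothetical textual quality score in [0, 1]
    mentions: List[Mention] = field(default_factory=list)


def sample_pseudo_ground_truth(
    posts: List[Post],
    min_confidence: float = 0.7,  # assumed threshold, not from the paper
    min_quality: float = 0.5,     # assumed threshold, not from the paper
) -> List[Tuple[str, List[str]]]:
    """Keep sufficiently well-written posts and, within them,
    only the mentions whose confidence reaches the threshold."""
    samples = []
    for post in posts:
        if post.quality < min_quality:
            continue  # skip low-quality input documents
        kept = [m.text for m in post.mentions if m.confidence >= min_confidence]
        if kept:  # posts without any confident mention yield no training example
            samples.append((post.text, kept))
    return samples


# Toy input, invented purely for illustration.
posts = [
    Post("Acme Widget launches today", 0.9,
         [Mention("Acme Widget", 0.85), Mention("today", 0.2)]),
    Post("lol omg !!!", 0.1, [Mention("omg", 0.95)]),
    Post("Nothing confident here", 0.8, [Mention("here", 0.3)]),
]
samples = sample_pseudo_ground_truth(posts)
```

In this sketch, the retained (text, mentions) pairs would serve as automatically generated training examples for a named entity recognizer; how the scores are computed is left open here.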

Details

Language :
English
ISBN :
978-3-319-06027-9
Database :
OpenAIRE
Accession number :
edsair.doi.dedup.....92fd9ceb2bb1a55c74cfd51547f03eac