QBSUM: A large-scale query-based document summarization dataset from real-world applications.

Authors :: Zhao, Mingjun
Yan, Shengli
Liu, Bang
Zhong, Xinwang
Hao, Qian
Chen, Haolan
Niu, Di
Long, Bowei
Guo, Weidong
Source :: Computer Speech & Language. Mar2021, Vol. 66, pN.PAG-N.PAG. 1p.
Publication Year :: 2021
Highlights:
• Propose a large-scale query-based document summarization dataset.
• Three models for solving the problem.
• Data and experiment analysis.
• Online large-scale A/B testing in real-world mobile applications.

Abstract: Query-based document summarization aims to extract or generate a summary of a document that directly answers or is relevant to the search query. It is an important technique that can benefit a variety of applications such as search engines, document-level machine reading comprehension, and chatbots. Datasets designed for query-based summarization are currently few in number, and existing ones are limited in both scale and quality. Moreover, to the best of our knowledge, there is no publicly available dataset for Chinese query-based document summarization. In this paper, we present QBSUM, a high-quality large-scale dataset consisting of 49,000+ data samples for the task of Chinese query-based document summarization. We also propose multiple unsupervised and supervised solutions to the task and demonstrate their high-speed inference and superior performance via both offline experiments and online A/B tests. The QBSUM dataset is released to facilitate future advancement of this research field. [ABSTRACT FROM AUTHOR]

Subjects :: *NATURAL language processing
*INFORMATION retrieval
*CHINESE language
*ACQUISITION of data
*SEARCH engines

Details

Language :: English
ISSN :: 0885-2308
Volume :: 66
Database :: Academic Search Index
Journal :: Computer Speech & Language
Publication Type :: Academic Journal
Accession number :: 147227512
Full Text :: https://doi.org/10.1016/j.csl.2020.101166

