Author: "Xu, Yumo" / Publication Year Range: Last 50 years - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Xu, Yumo"' showing total 24 results

Start Over Author "Xu, Yumo" Publication Year Range Last 50 years

24 results on '"Xu, Yumo"'

1. Beyond Relevant Documents: A Knowledge-Intensive Approach for Query-Focused Summarization using Large Language Models

Author: Zhang, Weijia, Huang, Jia-Hong, Vakulenko, Svitlana, Xu, Yumo, Rajapakse, Thilina, and Kanoulas, Evangelos
Subjects: Computer Science - Computation and Language, Computer Science - Information Retrieval
Abstract: Query-focused summarization (QFS) is a fundamental task in natural language processing with broad applications, including search engines and report generation. However, traditional approaches assume the availability of relevant documents, which may not always hold in practical scenarios, especially in highly specialized topics. To address this limitation, we propose a novel knowledge-intensive approach that reframes QFS as a knowledge-intensive task setup. This approach comprises two main components: a retrieval module and a summarization controller. The retrieval module efficiently retrieves potentially relevant documents from a large-scale knowledge corpus based on the given textual query, eliminating the dependence on pre-existing document sets. The summarization controller seamlessly integrates a powerful large language model (LLM)-based summarizer with a carefully tailored prompt, ensuring the generated summary is comprehensive and relevant to the query. To assess the effectiveness of our approach, we create a new dataset, along with human-annotated relevance labels, to facilitate comprehensive evaluation covering both retrieval and summarization performance. Extensive experiments demonstrate the superior performance of our approach, particularly its ability to generate accurate summaries without relying on the availability of relevant documents initially. This underscores our method's versatility and practical applicability across diverse query scenarios., Comment: Accepted by the 27th International Conference on Pattern Recognition (ICPR 2024)
Published: 2024

2. Dancing in Chains: Reconciling Instruction Following and Faithfulness in Language Models

Author: Wu, Zhengxuan, Zhang, Yuhao, Qi, Peng, Xu, Yumo, Han, Rujun, Zhang, Yian, Chen, Jifan, Min, Bonan, and Huang, Zhiheng
Subjects: Computer Science - Computation and Language
Abstract: Modern language models (LMs) need to follow human instructions while being faithful; yet, they often fail to achieve both. Here, we provide concrete evidence of a trade-off between instruction following (i.e., follow open-ended instructions) and faithfulness (i.e., ground responses in given context) when training LMs with these objectives. For instance, fine-tuning LLaMA-7B on instruction following datasets renders it less faithful. Conversely, instruction-tuned Vicuna-7B shows degraded performance at following instructions when further optimized on tasks that require contextual grounding. One common remedy is multi-task learning (MTL) with data mixing, yet it remains far from achieving a synergic outcome. We propose a simple yet effective method that relies on Rejection Sampling for Continued Self-instruction Tuning (ReSet), which significantly outperforms vanilla MTL. Surprisingly, we find that less is more, as training ReSet with high-quality, yet substantially smaller data (three-fold less) yields superior results. Our findings offer a better understanding of objective discrepancies in alignment training of LMs., Comment: preprint
Published: 2024

3. RAG-QA Arena: Evaluating Domain Robustness for Long-form Retrieval Augmented Question Answering

Author: Han, Rujun, Zhang, Yuhao, Qi, Peng, Xu, Yumo, Wang, Jenyuan, Liu, Lan, Wang, William Yang, Min, Bonan, and Castelli, Vittorio
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Question answering based on retrieval augmented generation (RAG-QA) is an important research topic in NLP and has a wide range of real-world applications. However, most existing datasets for this task are either constructed using a single source corpus or consist of short extractive answers, which fall short of evaluating large language model (LLM) based RAG-QA systems on cross-domain generalization. To address these limitations, we create Long-form RobustQA (LFRQA), a new dataset comprising human-written long-form answers that integrate short extractive answers from multiple documents into a single, coherent narrative, covering 26K queries and large corpora across seven different domains. We further propose RAG-QA Arena by directly comparing model-generated answers against LFRQA's answers using LLMs as evaluators. We show via extensive experiments that RAG-QA Arena and human judgments on answer quality are highly correlated. Moreover, only 41.3% of the most competitive LLM's answers are preferred to LFRQA's answers, demonstrating RAG-QA Arena as a challenging evaluation platform for future research.
Published: 2024

4. Weakly Supervised Domain Detection

Author: Xu, Yumo and Lapata, Mirella
Subjects: Computational linguistics. Natural language processing, P98-98.5
Abstract: In this paper we introduce domain detection as a new natural language processing task. We argue that the ability to detect textual segments that are domain-heavy (i.e., sentences or phrases that are representative of and provide evidence for a given domain) could enhance the robustness and portability of various text classification applications. We propose an encoder-detector framework for domain detection and bootstrap classifiers with multiple instance learning. The model is hierarchically organized and suited to multilabel classification. We demonstrate that despite learning with minimal supervision, our model can be applied to text spans of different granularities, languages, and genres. We also showcase the potential of domain detection for text summarization.
Published: 2019
Full Text: View/download PDF

5. Fine-Grained Natural Language Inference Based Faithfulness Evaluation for Diverse Summarisation Tasks

Author: Zhang, Huajian, Xu, Yumo, and Perez-Beltrachini, Laura
Subjects: Computer Science - Computation and Language
Abstract: We study existing approaches to leverage off-the-shelf Natural Language Inference (NLI) models for the evaluation of summary faithfulness and argue that these are sub-optimal due to the granularity level considered for premises and hypotheses. That is, the smaller content unit considered as hypothesis is a sentence and premises are made up of a fixed number of document sentences. We propose a novel approach, namely InFusE, that uses a variable premise size and simplifies summary sentences into shorter hypotheses. Departing from previous studies which focus on single short document summarisation, we analyse NLI based faithfulness evaluation for diverse summarisation tasks. We introduce DiverSumm, a new benchmark comprising long form summarisation (long documents and summaries) and diverse summarisation tasks (e.g., meeting and multi-document summarisation). In experiments, InFusE obtains superior performance across the different summarisation tasks. Our code and data are available at https://github.com/HJZnlp/infuse., Comment: EACL 2024
Published: 2024

6. QTSumm: Query-Focused Summarization over Tabular Data

Author: Zhao, Yilun, Qi, Zhenting, Nan, Linyong, Mi, Boyu, Liu, Yixin, Zou, Weijin, Han, Simeng, Chen, Ruizhe, Tang, Xiangru, Xu, Yumo, Radev, Dragomir, and Cohan, Arman
Subjects: Computer Science - Computation and Language
Abstract: People primarily consult tables to conduct data analysis or answer specific questions. Text generation systems that can provide accurate table summaries tailored to users' information needs can facilitate more efficient access to relevant data insights. Motivated by this, we define a new query-focused table summarization task, where text generation models have to perform human-like reasoning and analysis over the given table to generate a tailored summary. We introduce a new benchmark named QTSumm for this task, which contains 7,111 human-annotated query-summary pairs over 2,934 tables covering diverse topics. We investigate a set of strong baselines on QTSumm, including text generation, table-to-text generation, and large language models. Experimental results and manual analysis reveal that the new task presents significant challenges in table-to-text generation for future research. Moreover, we propose a new approach named ReFactor, to retrieve and reason over query-relevant information from tabular data to generate several natural language facts. Experimental results demonstrate that ReFactor can bring improvements to baselines by concatenating the generated facts to the model input. Our data and code are publicly available at https://github.com/yale-nlp/QTSumm., Comment: Accepted at EMNLP 2023
Published: 2023

7. Text Summarization with Oracle Expectation

Author: Xu, Yumo and Lapata, Mirella
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Extractive summarization produces summaries by identifying and concatenating the most important sentences in a document. Since most summarization datasets do not come with gold labels indicating whether document sentences are summary-worthy, different labeling algorithms have been proposed to extrapolate oracle extracts for model training. In this work, we identify two flaws with the widely used greedy labeling approach: it delivers suboptimal and deterministic oracles. To alleviate both issues, we propose a simple yet effective labeling algorithm that creates soft, expectation-based sentence labels. We define a new learning objective for extractive summarization which incorporates learning signals from multiple oracle summaries and prove it is equivalent to estimating the oracle expectation for each document sentence. Without any architectural modifications, the proposed labeling scheme achieves superior performance on a variety of summarization benchmarks across domains and languages, in both supervised and zero-shot settings., Comment: 18 pages, 5 figures
Published: 2022

8. Tackling Query-Focused Summarization as A Knowledge-Intensive Task: A Pilot Study

Author: Zhang, Weijia, Vakulenko, Svitlana, Rajapakse, Thilina, Xu, Yumo, and Kanoulas, Evangelos
Subjects: Computer Science - Computation and Language, Computer Science - Information Retrieval
Abstract: Query-focused summarization (QFS) requires generating a summary given a query using a set of relevant documents. However, such relevant documents should be annotated manually and thus are not readily available in realistic scenarios. To address this limitation, we tackle the QFS task as a knowledge-intensive (KI) task without access to any relevant documents. Instead, we assume that these documents are present in a large-scale knowledge corpus and should be retrieved first. To explore this new setting, we build a new dataset (KI-QFS) by adapting existing QFS datasets. In this dataset, answering the query requires document retrieval from a knowledge corpus. We construct three different knowledge corpora, and we further provide relevance annotations to enable retrieval evaluation. Finally, we benchmark the dataset with state-of-the-art QFS models and retrieval-enhanced models. The experimental results demonstrate that QFS models perform significantly worse on KI-QFS compared to the original QFS task, indicating that the knowledge-intensive setting is much more challenging and offers substantial room for improvement. We believe that our investigation will inspire further research into addressing QFS in more realistic scenarios., Comment: Accepted by Gen-IR@SIGIR 2023 workshop
Published: 2021

9. Document summarization with neural query modeling

Author: Xu, Yumo, Lapata, Mirella, and Cohen, Shay
Subjects: Neural Query Modeling, Document summarization, query focused summarization, QFS, generic summarization
Abstract: Document summarization is a natural language processing task that aims to produce a short summary that concisely delivers the most important information of a document or multiple documents. Over the last few decades, the task has drawn much attention from both academia and industry, as it provides effective tools to manage and access text information. For example, through a newswire summarization engine, users can quickly digest a cluster of news articles by reading a short summary of the topic. Such summaries can, meanwhile, be used by news recommendation and question answering engines. Depending on the users' role in the summarization process, document summarization falls into two broad categories: generic summarization and query focused summarization (QFS). The former focuses on information intrinsically salient in the input text, while the latter also caters to requests explicitly specified by users. Despite the difference between generic summarization and QFS in their task formulations, we argue that all summaries address queries, even if they are not formulated explicitly. In this thesis, we introduce query modeling in the document summarization context as a critical objective for incorporating observed or latent user intent. We investigate different approaches that explore this theme with deep neural networks. We develop novel systems with neural query modeling for both extractive summarization, where summaries are composed of salient segments (e.g., sentences) from the original document(s), and abstractive summarization, where summaries are made up of words or phrases that do not exist in the input. The recent availability of large-scale datasets has driven the development of neural models that create generic summaries. However, training data in the form of queries, documents, and summaries for QFS is scarce. As most existing research in QFS has employed an extractive approach, we first consider better modeling query-cluster interactions for low-resource extractive QFS. In contrast to previous work with retrieval-style methods for assembling query-relevant summaries, we propose a framework that progressively estimates whether text segments should be included in the summary. Notably, modules of this framework can be independently developed and can leverage training data if available. We present an instantiation of this framework with distant supervision from question answering where various resources exist to identify segments which are likely to answer the query. Experiments on benchmark datasets show that our framework achieves competitive results and is robust across domains. Ideally, summaries should be abstracts, and the hidden costs incurred by annotating QA pairs should be avoided in query modeling. The second part of this thesis focuses on the low-resource challenge in abstractive QFS, and builds an abstractive QFS system which is trained query-free. Concretely, we propose to decompose the task into query modeling and conditional language modeling. For query modeling, we first introduce a uniﬁed representation for summaries and queries to exploit training resources in generic summarization, on top of which a weakly supervised model is optimized for evidence estimation. The proposed framework achieves state-of-the-art performance in generating query focused abstracts across existing benchmarks. Finally, the third part of this thesis moves beyond QFS. We provide a uniﬁed modeling framework for any kind of summarization, under the assumption that all summaries are a response to a query, which is observed in the case of QFS and latent in the case of generic summarization. We model queries as discrete latent variables over document tokens, and learn representations compatible with observed and unobserved query verbalizations. Requiring no further optimization on downstream summarization tasks, experiments show that our approach outperforms strong comparison systems across benchmarks, query types, document settings, and target domains.
Published: 2022
Full Text: View/download PDF

10. Text Summarization with Latent Queries

Author: Xu, Yumo and Lapata, Mirella
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: The availability of large-scale datasets has driven the development of neural models that create summaries from single documents, for generic purposes. When using a summarization system, users often have specific intents with various language realizations, which, depending on the information need, can range from a single keyword to a long narrative composed of multiple questions. Existing summarization systems, however, often either fail to support or act robustly on this query focused summarization task. We introduce LaQSum, the first unified text summarization system that learns Latent Queries from documents for abstractive summarization with any existing query forms. Under a deep generative framework, our system jointly optimizes a latent query model and a conditional language model, allowing users to plug-and-play queries of any type at test time. Despite learning from only generic summarization data and requiring no further optimization for downstream summarization tasks, our system robustly outperforms strong comparison systems across summarization benchmarks with different query types, document settings, and target domains., Comment: 12 pages
Published: 2021

11. Generating Query Focused Summaries from Query-Free Resources

Author: Xu, Yumo and Lapata, Mirella
Subjects: Computer Science - Computation and Language, Computer Science - Information Retrieval, Computer Science - Machine Learning
Abstract: The availability of large-scale datasets has driven the development of neural models that create generic summaries from single or multiple documents. In this work we consider query focused summarization (QFS), a task for which training data in the form of queries, documents, and summaries is not readily available. We propose to decompose QFS into (1) query modeling (i.e., finding supportive evidence within a set of documents for a query) and (2) conditional language modeling (i.e., summary generation). We introduce MaRGE, a Masked ROUGE Regression framework for evidence estimation and ranking which relies on a unified representation for summaries and queries, so that summaries in generic data can be converted into proxy queries for learning a query model. Experiments across QFS benchmarks and query types show that our model achieves state-of-the-art performance despite learning from weak supervision., Comment: ACL 2021
Published: 2020

12. Meta Dialogue Policy Learning

Author: Xu, Yumo, Zhu, Chenguang, Peng, Baolin, and Zeng, Michael
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Dialog policy determines the next-step actions for agents and hence is central to a dialogue system. However, when migrated to novel domains with little data, a policy model can fail to adapt due to insufficient interactions with the new environment. We propose Deep Transferable Q-Network (DTQN) to utilize shareable low-level signals between domains, such as dialogue acts and slots. We decompose the state and action representation space into feature subspaces corresponding to these low-level components to facilitate cross-domain knowledge transfer. Furthermore, we embed DTQN in a meta-learning framework and introduce Meta-DTQN with a dual-replay mechanism to enable effective off-policy training and adaptation. In experiments, our model outperforms baseline models in terms of both success rate and dialogue efficiency on the multi-domain dialogue dataset MultiWOZ 2.0., Comment: 10 pages, 3 figures
Published: 2020

13. Query Focused Multi-Document Summarization with Distant Supervision

Author: Xu, Yumo and Lapata, Mirella
Subjects: Computer Science - Computation and Language, Computer Science - Information Retrieval, Computer Science - Machine Learning
Abstract: We consider the problem of better modeling query-cluster interactions to facilitate query focused multi-document summarization (QFS). Due to the lack of training data, existing work relies heavily on retrieval-style methods for estimating the relevance between queries and text segments. In this work, we leverage distant supervision from question answering where various resources are available to more explicitly capture the relationship between queries and documents. We propose a coarse-to-fine modeling framework which introduces separate modules for estimating whether segments are relevant to the query, likely to contain an answer, and central. Under this framework, a trained evidence estimator further discerns which retrieved segments might answer the query for final selection in the summary. We demonstrate that our framework outperforms strong comparison systems on standard QFS benchmarks., Comment: 11 pages, 3 figures
Published: 2020

14. Bootstrapping a Crosslingual Semantic Parser

Author: Sherborne, Tom, Xu, Yumo, and Lapata, Mirella
Subjects: Computer Science - Computation and Language
Abstract: Recent progress in semantic parsing scarcely considers languages other than English but professional translation can be prohibitively expensive. We adapt a semantic parser trained on a single language, such as English, to new languages and multiple domains with minimal annotation. We query if machine translation is an adequate substitute for training data, and extend this to investigate bootstrapping using joint training with English, paraphrasing, and multilingual pre-trained models. We develop a Transformer-based parser combining paraphrases by ensembling attention over multiple encoders and present new versions of ATIS and Overnight in German and Chinese for evaluation. Experimental results indicate that MT can approximate training data in a new language for accurate parsing when augmented with paraphrasing through multiple MT engines. Considering when MT is inadequate, we also find that using our approach achieves parsing accuracy within 2% of complete translation using only 50% of training data., Comment: Camera Ready for EMNLP2020 Findings
Published: 2020

15. Abstractive Summarizers are Excellent Extractive Summarizers

Author: Varab, Daniel, primary and Xu, Yumo, additional
Published: 2023
Full Text: View/download PDF

16. QTSumm: Query-Focused Summarization over Tabular Data

Author: Zhao, Yilun, primary, Qi, Zhenting, additional, Nan, Linyong, additional, Mi, Boyu, additional, Liu, Yixin, additional, Zou, Weijin, additional, Han, Simeng, additional, Chen, Ruizhe, additional, Tang, Xiangru, additional, Xu, Yumo, additional, Radev, Dragomir, additional, and Cohan, Arman, additional
Published: 2023
Full Text: View/download PDF

17. Document Summarization with Latent Queries

Author: Xu, Yumo and Lapata, Mirella
Subjects: Human-Computer Interaction, Linguistics and Language, Artificial Intelligence, Communication, Computer Science Applications
Abstract: The availability of large-scale datasets has driven the development of neural models that create generic summaries for single or multiple documents. For query-focused summarization (QFS), labeled training data in the form of queries, documents, and summaries is not readily available. We provide a unified modeling framework for any kind of summarization, under the assumption that all summaries are a response to a query, which is observed in the case of QFS and latent in the case of generic summarization. We model queries as discrete latent variables over document tokens, and learn representations compatible with observed and unobserved query verbalizations. Our framework formulates summarization as a generative process, and jointly optimizes a latent query model and a conditional language model. Despite learning from generic summarization data only, our approach outperforms strong comparison systems across benchmarks, query types, document settings, and target domains.1
Published: 2022

18. QTSumm: A New Benchmark for Query-Focused Table Summarization

Author: Zhao, Yilun, Qi, Zhenting, Nan, Linyong, Mi, Boyu, Liu, Yixin, Zou, Weijin, Han, Simeng, Tang, Xiangru, Xu, Yumo, Cohan, Arman, and Radev, Dragomir
Subjects: FOS: Computer and information sciences, Computer Science - Computation and Language, Computation and Language (cs.CL)
Abstract: People primarily consult tables to conduct data analysis or answer specific questions. Text generation systems that can provide accurate table summaries tailored to users' information needs can facilitate more efficient access to relevant data insights. However, existing table-to-text generation studies primarily focus on converting tabular data into coherent statements, rather than addressing information-seeking purposes. In this paper, we define a new query-focused table summarization task, where text generation models have to perform human-like reasoning and analysis over the given table to generate a tailored summary, and we introduce a new benchmark named QTSumm for this task. QTSumm consists of 5,625 human-annotated query-summary pairs over 2,437 tables on diverse topics. Moreover, we investigate state-of-the-art models (i.e., text generation, table-to-text generation, and large language models) on the QTSumm dataset. Experimental results and manual analysis reveal that our benchmark presents significant challenges in table-to-text generation for future research., Comment: work in progress
Published: 2023
Full Text: View/download PDF

19. Generating Query Focused Summaries from Query-Free Resources

Author: Xu, Yumo, primary and Lapata, Mirella, additional
Published: 2021
Full Text: View/download PDF

20. Coarse-to-Fine Query Focused Multi-Document Summarization

Author: Xu, Yumo, primary and Lapata, Mirella, additional
Published: 2020
Full Text: View/download PDF

21. Bootstrapping a Crosslingual Semantic Parser

Author: Sherborne, Tom, primary, Xu, Yumo, additional, and Lapata, Mirella, additional
Published: 2020
Full Text: View/download PDF

22. Stock Movement Prediction from Tweets and Historical Prices

Author: Xu, Yumo and Cohen, Shay
Abstract: Stock movement prediction is a challenging problem: the market is highly stochastic, and we make temporally-dependent predictions from chaotic data. We treat these three complexities and present a novel deep generative model jointly exploiting text and price signals for this task. Unlike the case with discriminative or topic modeling, our model introduces recurrent, continuous latent variables for a better treatment of stochasticity, and uses neural variational inference to address the intractable posterior inference. We also provide a hybrid objective with temporal auxiliary to flexibly capture predictive dependencies. We demonstrate the state-of- the-art performance of our proposed model on a new stock movement prediction dataset which we collected.11https://github.com/yumoxu/stocknet-dataset
Published: 2018

23. Trainable Dynamic Subsampling for End-to-End Speech Recognition

Author: Zhang, Shucong, primary, Loweimi, Erfan, additional, Xu, Yumo, additional, Bell, Peter, additional, and Renals, Steve, additional
Published: 2019
Full Text: View/download PDF

24. Stock Movement Prediction from Tweets and Historical Prices

Author: Xu, Yumo, primary and Cohen, Shay B., additional
Published: 2018
Full Text: View/download PDF

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

24 results on '"Xu, Yumo"'

1. Beyond Relevant Documents: A Knowledge-Intensive Approach for Query-Focused Summarization using Large Language Models

2. Dancing in Chains: Reconciling Instruction Following and Faithfulness in Language Models

3. RAG-QA Arena: Evaluating Domain Robustness for Long-form Retrieval Augmented Question Answering

4. Weakly Supervised Domain Detection

5. Fine-Grained Natural Language Inference Based Faithfulness Evaluation for Diverse Summarisation Tasks

6. QTSumm: Query-Focused Summarization over Tabular Data

7. Text Summarization with Oracle Expectation

8. Tackling Query-Focused Summarization as A Knowledge-Intensive Task: A Pilot Study

9. Document summarization with neural query modeling

10. Text Summarization with Latent Queries

11. Generating Query Focused Summaries from Query-Free Resources

12. Meta Dialogue Policy Learning

13. Query Focused Multi-Document Summarization with Distant Supervision

14. Bootstrapping a Crosslingual Semantic Parser

15. Abstractive Summarizers are Excellent Extractive Summarizers

16. QTSumm: Query-Focused Summarization over Tabular Data

17. Document Summarization with Latent Queries

18. QTSumm: A New Benchmark for Query-Focused Table Summarization

19. Generating Query Focused Summaries from Query-Free Resources

20. Coarse-to-Fine Query Focused Multi-Document Summarization

21. Bootstrapping a Crosslingual Semantic Parser

22. Stock Movement Prediction from Tweets and Historical Prices

23. Trainable Dynamic Subsampling for End-to-End Speech Recognition

24. Stock Movement Prediction from Tweets and Historical Prices

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Topic

Publication Year Range

Language

Publication Type

Journal

Database

Publisher

24 results on '"Xu, Yumo"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources