85,658 results for "Fact checking"
Search Results
2. 'The Data Says Otherwise'-Towards Automated Fact-checking and Communication of Data Claims
- Author
Fu, Yu, Guo, Shunan, Hoffswell, Jane, Bursztyn, Victor S., Rossi, Ryan, and Stasko, John
- Subjects
Computer Science - Human-Computer Interaction, H.5.2, I.7.2, I.2.7
- Abstract
Fact-checking data claims requires data evidence retrieval and analysis, which can become tedious and intractable when done manually. This work presents Aletheia, an automated fact-checking prototype designed to facilitate data claims verification and enhance data evidence communication. For verification, we utilize a pre-trained LLM to parse the semantics for evidence retrieval. To effectively communicate the data evidence, we design representations in two forms: data tables and visualizations, tailored to various data fact types. Additionally, we design interactions that showcase a real-world application of these techniques. We evaluate the performance of two core NLP tasks with a curated dataset comprising 400 data claims and compare the two representation forms regarding viewers' assessment time, confidence, and preference via a user study with 20 participants. The evaluation offers insights into the feasibility and bottlenecks of using LLMs for data fact-checking tasks, potential advantages and disadvantages of using visualizations over data tables, and design recommendations for presenting data evidence., Comment: 20 pages, 13 figures, UIST 2024
- Published
- 2024
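Entry 2 verifies data claims by having an LLM parse the claim and then checking it against tabular evidence presented as tables or charts. Below is a minimal pandas sketch of the verification half only, with a hand-written structured claim standing in for the LLM parsing step; the table, schema, and numbers are invented for illustration.

```python
import pandas as pd

# Toy data table standing in for retrieved data evidence (values are illustrative).
df = pd.DataFrame({"region": ["West", "East", "West", "East"],
                   "revenue": [120, 90, 80, 95]})

# Hand-written structured claim, standing in for the output of an LLM parsing step.
claim = {"metric": "revenue", "group_by": "region",
         "assertion": ("West", "max")}        # i.e. "West had the highest revenue"

totals = df.groupby(claim["group_by"])[claim["metric"]].sum()
entity, relation = claim["assertion"]
verdict = totals.idxmax() == entity if relation == "max" else None

print(totals.to_frame())                      # evidence as a data table
print("claim supported:", verdict)            # True: West totals 200 vs. East 185
```

The same totals could instead be rendered as a bar chart, which is the visualization-versus-table comparison the paper evaluates with users.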
3. Community-based fact-checking reduces the spread of misleading posts on social media
- Author
Chuai, Yuwei, Pilarski, Moritz, Renault, Thomas, Restrepo-Amariles, David, Troussel-Clément, Aurore, Lenzini, Gabriele, and Pröllochs, Nicolas
- Subjects
Computer Science - Social and Information Networks
- Abstract
Community-based fact-checking is a promising approach to verify social media content and correct misleading posts at scale. Yet, causal evidence regarding its effectiveness in reducing the spread of misinformation on social media is missing. Here, we performed a large-scale empirical study to analyze whether community notes reduce the spread of misleading posts on X. Using a Difference-in-Differences design and repost time series data for N=237,677 (community fact-checked) cascades that had been reposted more than 431 million times, we found that exposing users to community notes reduced the spread of misleading posts by, on average, 62.0%. Furthermore, community notes increased the odds that users delete their misleading posts by 103.4%. However, our findings also suggest that community notes might be too slow to intervene in the early (and most viral) stage of the diffusion. Our work offers important implications to enhance the effectiveness of community-based fact-checking approaches on social media.
- Published
- 2024
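Entry 3 estimates the causal effect of community notes with a Difference-in-Differences design over repost time series. The toy sketch below computes only the basic two-by-two DiD contrast on invented repost counts; the actual study models full cascades of N=237,677 posts.

```python
import pandas as pd

# Illustrative average reposts per hour: treated posts receive a community note at t=0.
data = pd.DataFrame({
    "treated": [1, 1, 1, 1, 0, 0, 0, 0],
    "post":    [0, 0, 1, 1, 0, 0, 1, 1],      # 1 = after the note was attached
    "reposts": [120, 115, 40, 35, 100, 98, 90, 88],
})

means = data.groupby(["treated", "post"])["reposts"].mean()

# DiD = (treated post - treated pre) - (control post - control pre)
did = (means.loc[(1, 1)] - means.loc[(1, 0)]) - (means.loc[(0, 1)] - means.loc[(0, 0)])
print(f"DiD estimate of the note effect on reposts: {did:.1f}")   # -70.0 on this toy data
```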
4. HybridFC: A Hybrid Fact-Checking Approach for Knowledge Graphs
- Author
Qudus, Umair, Roeder, Michael, Saleem, Muhammad, and Ngomo, Axel-Cyrille Ngonga
- Subjects
Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Computer Science - Databases
- Abstract
We consider fact-checking approaches that aim to predict the veracity of assertions in knowledge graphs. Five main categories of fact-checking approaches for knowledge graphs have been proposed in the recent literature, of which each is subject to partially overlapping limitations. In particular, current text-based approaches are limited by manual feature engineering. Path-based and rule-based approaches are limited by their exclusive use of knowledge graphs as background knowledge, and embedding-based approaches suffer from low accuracy scores on current fact-checking tasks. We propose a hybrid approach -- dubbed HybridFC -- that exploits the diversity of existing categories of fact-checking approaches within an ensemble learning setting to achieve a significantly better prediction performance. In particular, our approach outperforms the state of the art by 0.14 to 0.27 in terms of Area Under the Receiver Operating Characteristic curve on the FactBench dataset. Our code is open-source and can be found at https://github.com/dice-group/HybridFC.
- Published
- 2024
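Entry 4 fuses the scores of different categories of fact-checking approaches in an ensemble. Below is a minimal scikit-learn sketch of that fusion idea, assuming each component approach has already scored every assertion; the scores, labels, and learned weighting are synthetic stand-ins, not the HybridFC components.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

# Synthetic component scores for 6 assertions: columns = text-, path-, embedding-based.
X = np.array([[0.9, 0.7, 0.6],
              [0.2, 0.4, 0.3],
              [0.8, 0.9, 0.5],
              [0.1, 0.3, 0.4],
              [0.7, 0.6, 0.8],
              [0.3, 0.2, 0.1]])
y = np.array([1, 0, 1, 0, 1, 0])              # 1 = true assertion

ensemble = LogisticRegression().fit(X, y)     # learn how to weight the components
fused = ensemble.predict_proba(X)[:, 1]
print("AUROC of fused scores:", roc_auc_score(y, fused))
```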
5. Enhancing Natural Language Inference Performance with Knowledge Graph for COVID-19 Automated Fact-Checking in Indonesian Language
- Author
Muharram, Arief Purnama and Purwarianti, Ayu
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
Automated fact-checking is a key strategy to overcome the spread of COVID-19 misinformation on the internet. These systems typically leverage deep learning approaches through Natural Language Inference (NLI) to verify the truthfulness of information based on supporting evidence. However, one challenge that arises in deep learning is performance stagnation due to a lack of knowledge during training. This study proposes using a Knowledge Graph (KG) as external knowledge to enhance NLI performance for automated COVID-19 fact-checking in the Indonesian language. The proposed model architecture comprises three modules: a fact module, an NLI module, and a classifier module. The fact module processes information from the KG, while the NLI module handles semantic relationships between the given premise and hypothesis. The representation vectors from both modules are concatenated and fed into the classifier module to produce the final result. The model was trained using the generated Indonesian COVID-19 fact-checking dataset and the COVID-19 KG Bahasa Indonesia. Our study demonstrates that incorporating KGs can significantly improve NLI performance in fact-checking, achieving the best accuracy of 0.8616. This suggests that KGs are a valuable component for enhancing NLI performance in automated fact-checking.
- Published
- 2024
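Entry 5 concatenates a KG-derived fact representation with an NLI sentence-pair representation before classification. Below is a minimal PyTorch sketch of that fusion, with both encoders reduced to placeholder linear layers; all dimensions are chosen purely for illustration.

```python
import torch
import torch.nn as nn

class KGAugmentedNLI(nn.Module):
    def __init__(self, fact_dim=128, nli_dim=256, n_labels=3):
        super().__init__()
        self.fact_encoder = nn.Linear(64, fact_dim)    # stands in for the KG fact module
        self.nli_encoder = nn.Linear(768, nli_dim)     # stands in for a sentence-pair encoder
        self.classifier = nn.Linear(fact_dim + nli_dim, n_labels)

    def forward(self, fact_feats, nli_feats):
        # Concatenate both representation vectors and classify, as the abstract describes.
        fused = torch.cat([self.fact_encoder(fact_feats),
                           self.nli_encoder(nli_feats)], dim=-1)
        return self.classifier(fused)                  # logits over the NLI labels

model = KGAugmentedNLI()
logits = model(torch.randn(4, 64), torch.randn(4, 768))
print(logits.shape)                                    # torch.Size([4, 3])
```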
6. Evidence-backed Fact Checking using RAG and Few-Shot In-Context Learning with LLMs
- Author
Singhal, Ronit, Patwa, Pransh, Patwa, Parth, Chadha, Aman, and Das, Amitava
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
Given the widespread dissemination of misinformation on social media, implementing fact-checking mechanisms for online claims is essential. Manually verifying every claim is highly challenging, underscoring the need for an automated fact-checking system. This paper presents our system designed to address this issue. We utilize the Averitec dataset to assess the veracity of claims. In addition to veracity prediction, our system provides supporting evidence, which is extracted from the dataset. We develop a Retrieve and Generate (RAG) pipeline to extract relevant evidence sentences from a knowledge base, which are then inputted along with the claim into a large language model (LLM) for classification. We also evaluate the few-shot In-Context Learning (ICL) capabilities of multiple LLMs. Our system achieves an 'Averitec' score of 0.33, which is a 22% absolute improvement over the baseline. All code will be made available on https://github.com/ronit-singhal/evidence-backed-fact-checking-using-rag-and-few-shot-in-context-learning-with-llms.
- Published
- 2024
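Entry 6 retrieves evidence sentences from a knowledge base and passes them, together with the claim, to an LLM. Below is a minimal sketch of the retrieve-then-prompt step using sentence-transformers for dense retrieval; the model name, toy corpus, and prompt wording are illustrative assumptions, not the paper's exact setup.

```python
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

claim = "The city banned diesel cars in 2023."
knowledge_base = [
    "The council voted in 2023 to prohibit diesel vehicles in the centre.",
    "A new tram line opened in 2021.",
    "Cycling rates rose by 12% last year.",
]

# Dense retrieval: embed claim and evidence, keep the two most similar sentences.
scores = util.cos_sim(encoder.encode(claim, convert_to_tensor=True),
                      encoder.encode(knowledge_base, convert_to_tensor=True))[0]
top_evidence = [knowledge_base[i] for i in scores.topk(2).indices]

prompt = (
    "Classify the claim as Supported, Refuted, or Not Enough Evidence.\n"
    f"Claim: {claim}\nEvidence:\n- " + "\n- ".join(top_evidence) + "\nAnswer:"
)
print(prompt)   # this prompt would then be sent to an LLM for the final verdict
```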
7. CommunityKG-RAG: Leveraging Community Structures in Knowledge Graphs for Advanced Retrieval-Augmented Generation in Fact-Checking
- Author
Chang, Rong-Ching and Zhang, Jiawei
- Subjects
Computer Science - Computation and Language
- Abstract
Despite advancements in Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) systems, their effectiveness is often hindered by a lack of integration with entity relationships and community structures, limiting their ability to provide contextually rich and accurate information retrieval for fact-checking. We introduce CommunityKG-RAG (Community Knowledge Graph-Retrieval Augmented Generation), a novel zero-shot framework that integrates community structures within Knowledge Graphs (KGs) with RAG systems to enhance the fact-checking process. Capable of adapting to new domains and queries without additional training, CommunityKG-RAG utilizes the multi-hop nature of community structures within KGs to significantly improve the accuracy and relevance of information retrieval. Our experimental results demonstrate that CommunityKG-RAG outperforms traditional methods, representing a significant advancement in fact-checking by offering a robust, scalable, and efficient solution.
- Published
- 2024
8. Zero-Shot Learning and Key Points Are All You Need for Automated Fact-Checking
- Author
Mohammadkhani, Mohammad Ghiasvand, Mohammadkhani, Ali Ghiasvand, and Beigy, Hamid
- Subjects
Computer Science - Computation and Language
- Abstract
Automated fact-checking is an important task because determining the accurate status of a proposed claim within the vast amount of information available online is a critical challenge. This challenge requires robust evaluation to prevent the spread of false information. Modern large language models (LLMs) have demonstrated high capability in performing a diverse range of Natural Language Processing (NLP) tasks. By utilizing proper prompting strategies, their versatility due to their understanding of large context sizes and zero-shot learning ability enables them to simulate human problem-solving intuition and move towards being an alternative to humans for solving problems. In this work, we introduce a straightforward framework based on Zero-Shot Learning and Key Points (ZSL-KeP) for automated fact-checking, which despite its simplicity, performed well on the AVeriTeC shared task dataset by robustly improving the baseline and achieving 10th place.
- Published
- 2024
9. LiveFC: A System for Live Fact-Checking of Audio Streams
- Author
V, Venktesh and Setty, Vinay
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
The advances in the digital era have led to rapid dissemination of information. This has also aggravated the spread of misinformation and disinformation. This has potentially serious consequences, such as civil unrest. While fact-checking aims to combat this, manual fact-checking is cumbersome and not scalable. While automated fact-checking approaches exist, they do not operate in real-time and do not always account for spread of misinformation through different modalities. This is particularly important as proactive fact-checking on live streams in real-time can help people be informed of false narratives and prevent catastrophic consequences that may cause civil unrest. This is particularly relevant with the rapid dissemination of information through video on social media platforms or other streams like political rallies and debates. Hence, in this work we develop a platform named LiveFC, that can aid in fact-checking live audio streams in real-time. LiveFC has a user-friendly interface that displays the claims detected along with their veracity and evidence for live streams with associated speakers for claims from respective segments. The app can be accessed at http://livefc.factiverse.ai and a screen recording of the demo can be found at https://bit.ly/3WVAoIw., Comment: Under Review, 11 pages
- Published
- 2024
10. The Implications of Open Generative Models in Human-Centered Data Science Work: A Case Study with Fact-Checking Organizations
- Author
Wolfe, Robert and Mitra, Tanushree
- Subjects
Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computers and Society, Computer Science - Emerging Technologies
- Abstract
Calls to use open generative language models in academic research have highlighted the need for reproducibility and transparency in scientific research. However, the impact of generative AI extends well beyond academia, as corporations and public interest organizations have begun integrating these models into their data science pipelines. We expand this lens to include the impact of open models on organizations, focusing specifically on fact-checking organizations, which use AI to observe and analyze large volumes of circulating misinformation, yet must also ensure the reproducibility and impartiality of their work. We wanted to understand where fact-checking organizations use open models in their data science pipelines; what motivates their use of open models or proprietary models; and how their use of open or proprietary models can inform research on the societal impact of generative AI. To answer these questions, we conducted an interview study with N=24 professionals at 20 fact-checking organizations on six continents. Based on these interviews, we offer a five-component conceptual model of where fact-checking organizations employ generative AI to support or automate parts of their data science pipeline, including Data Ingestion, Data Analysis, Data Retrieval, Data Delivery, and Data Sharing. We then provide taxonomies of fact-checking organizations' motivations for using open models and the limitations that prevent them from further adopting open models, finding that they prefer open models for Organizational Autonomy, Data Privacy and Ownership, Application Specificity, and Capability Transparency. However, they nonetheless use proprietary models due to perceived advantages in Performance, Usability, and Safety, as well as Opportunity Costs related to participation in emerging generative AI ecosystems. Our work provides a novel perspective on open models in data-driven organizations., Comment: Accepted at Artificial Intelligence, Ethics, and Society 2024
- Published
- 2024
11. The doctor will polygraph you now: ethical concerns with AI for fact-checking patients
- Author
Anibal, James, Gunkel, Jasmine, Huth, Hannah, Nguyen, Hang, Awan, Shaheen, Bensoussan, Yael, and Wood, Bradford
- Subjects
Computer Science - Computers and Society
- Abstract
Clinical artificial intelligence (AI) methods have been proposed for predicting social behaviors which could be reasonably understood from patient-reported data. This raises ethical concerns about respect, privacy, and patient awareness/control over how their health data is used. Ethical concerns surrounding clinical AI systems for social behavior verification were divided into three main categories: (1) the use of patient data retrospectively without informed consent for the specific task of verification, (2) the potential for inaccuracies or biases within such systems, and (3) the impact on trust in patient-provider relationships with the introduction of automated AI systems for fact-checking. Additionally, this report showed the simulated misuse of a verification system and identified a potential LLM bias against patient-reported information in favor of multimodal data, published literature, and the outputs of other AI methods (i.e., AI self-trust). Finally, recommendations were presented for mitigating the risk that AI verification systems will cause harm to patients or undermine the purpose of the healthcare system., Comment: 9 pages, 1 figure, 2 tables
- Published
- 2024
12. Fact-Checking or Not? News Verification Behaviours of Young People in Hong Kong
- Author
Donna Chu and Frankie Ho Chun Wong
- Abstract
This paper discusses the factors affecting the behaviours for coping with fake news among young people. The data were collected from a survey conducted in late 2019, which sampled 2112 secondary school students from 21 partnering schools. This study aims to understand the opinions and behaviours of teenagers towards disinformation when fake news was prevalent during the anti-extradition bill protests in Hong Kong. It finds that awareness of the problem alone had limited influence in facilitating coping strategies. Civic awareness and interaction with social media were useful predictors of internal and external coping behaviours, respectively. Confidence about one's ability to detect fake news was a crucial factor, yet a concern for the value of truth stood out as the strongest predictor of fake news coping behaviours.
- Published
- 2024
13. Factuality challenges in the era of large language models and opportunities for fact-checking
- Author
Augenstein, Isabelle, Baldwin, Timothy, Cha, Meeyoung, Chakraborty, Tanmoy, Ciampaglia, Giovanni Luca, Corney, David, DiResta, Renee, Ferrara, Emilio, Hale, Scott, Halevy, Alon, Hovy, Eduard, Ji, Heng, Menczer, Filippo, Miguez, Ruben, Nakov, Preslav, Scheufele, Dietram, Sharma, Shivam, and Zagni, Giovanni
- Published
- 2024
14. QuestGen: Effectiveness of Question Generation Methods for Fact-Checking Applications
- Author
Setty, Ritvik and Setty, Vinay
- Subjects
Computer Science - Computation and Language, H.3.3
- Abstract
Verifying fact-checking claims poses a significant challenge, even for humans. Recent approaches have demonstrated that decomposing claims into relevant questions to gather evidence enhances the efficiency of the fact-checking process. In this paper, we provide empirical evidence showing that this question decomposition can be effectively automated. We demonstrate that smaller generative models, fine-tuned for the question generation task using data augmentation from various datasets, outperform large language models by up to 8%. Surprisingly, in some cases, the evidence retrieved using machine-generated questions proves to be significantly more effective for fact-checking than that obtained from human-written questions. We also perform manual evaluation of the decomposed questions to assess the quality of the questions generated., Comment: Accepted in CIKM 2024 as a short paper 4 pages and 1 page references. Fixed typo in author name
- Published
- 2024
15. Independent fact-checking organizations exhibit a departure from political neutrality
- Author
Singh, Sahajpreet, Masud, Sarah, and Chakraborty, Tanmoy
- Subjects
Computer Science - Social and Information Networks, Statistics - Applications
- Abstract
Independent fact-checking organizations have emerged as the crusaders to debunk fake news. However, they may not always remain neutral: they can deviate from neutrality by being selective in which false news they choose to debunk and in how the information is presented. Prompting the now popular large language model, GPT-3.5, with journalistic frameworks, we establish a longitudinal measure (2018-2023) for political neutrality that looks beyond the left-right spectrum. Specified on a range of -1 to 1 (with zero being absolute neutrality), we establish the extent of negative portrayal of political entities that makes a difference in the readers' perception in the USA and India. Here, we observe an average score of -0.17 and -0.24 in the USA and India, respectively. The findings indicate how seemingly objective fact-checking can still carry distorted political views, indirectly and subtly impacting the perception of consumers of the news., Comment: 11 pages, 2 figures
- Published
- 2024
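Entry 15 prompts GPT-3.5 to place fact-checks on a -1 to 1 portrayal scale. Below is a minimal sketch of one such scoring call with the OpenAI Python client; the prompt wording and output parsing are assumptions, not the authors' journalistic-framework prompts.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

article = "Example fact-check text mentioning a political entity..."
prompt = (
    "On a scale from -1 (entirely negative portrayal) to 1 (entirely positive), "
    "with 0 meaning neutral, rate how this fact-check portrays the political "
    f"entity it discusses. Reply with a single number.\n\n{article}"
)

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": prompt}],
    temperature=0,
)
score = float(response.choices[0].message.content.strip())
print("portrayal score:", score)   # the study averages such scores over many articles
```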
16. MetaSumPerceiver: Multimodal Multi-Document Evidence Summarization for Fact-Checking
- Author
Chen, Ting-Chih, Tang, Chia-Wei, and Thomas, Chris
- Subjects
Computer Science - Artificial Intelligence, Computer Science - Computation and Language
- Abstract
Fact-checking real-world claims often requires reviewing multiple multimodal documents to assess a claim's truthfulness, which is a highly laborious and time-consuming task. In this paper, we present a summarization model designed to generate claim-specific summaries useful for fact-checking from multimodal, multi-document datasets. The model takes inputs in the form of documents, images, and a claim, with the objective of assisting in fact-checking tasks. We introduce a dynamic perceiver-based model that can handle inputs from multiple modalities of arbitrary lengths. To train our model, we leverage a novel reinforcement learning-based entailment objective to generate summaries that provide evidence distinguishing between different truthfulness labels. To assess the efficacy of our approach, we conduct experiments on both an existing benchmark and a new dataset of multi-document claims that we contribute. Our approach outperforms the SOTA approach by 4.6% in the claim verification task on the MOCHEG dataset and demonstrates strong performance on our new Multi-News-Fact-Checking dataset., Comment: 16 pages, 7 figures, The 62nd Annual Meeting of the Association for Computational Linguistics
- Published
- 2024
17. Automated Justification Production for Claim Veracity in Fact Checking: A Survey on Architectures and Approaches
- Author
Eldifrawi, Islam, Wang, Shengrui, and Trabelsi, Amine
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Information Retrieval, Computer Science - Machine Learning
- Abstract
Automated Fact-Checking (AFC) is the automated verification of claim accuracy. AFC is crucial in discerning truth from misinformation, especially given the huge amount of content generated online daily. Current research focuses on predicting claim veracity through metadata analysis and language scrutiny, with an emphasis on justifying verdicts. This paper surveys recent methodologies, proposing a comprehensive taxonomy and presenting the evolution of research in that landscape. A comparative analysis of methodologies and future directions for improving fact-checking explainability are also discussed., Comment: Accepted to ACL 2024 Main Conference
- Published
- 2024
18. Generative Large Language Models in Automated Fact-Checking: A Survey
- Author
Vykopal, Ivan, Pikuliak, Matúš, Ostermann, Simon, and Šimko, Marián
- Subjects
Computer Science - Computation and Language
- Abstract
The dissemination of false information across online platforms poses a serious societal challenge, necessitating robust measures for information verification. While manual fact-checking efforts are still instrumental, the growing volume of false information requires automated methods. Large language models (LLMs) offer promising opportunities to assist fact-checkers, leveraging LLM's extensive knowledge and robust reasoning capabilities. In this survey paper, we investigate the utilization of generative LLMs in the realm of fact-checking, illustrating various approaches that have been employed and techniques for prompting or fine-tuning LLMs. By providing an overview of existing approaches, this survey aims to improve the understanding of utilizing LLMs in fact-checking and to facilitate further progress in LLMs' involvement in this process.
- Published
- 2024
19. Evaluating Transparency of Machine Generated Fact Checking Explanations
- Author
Xing, Rui, Baldwin, Timothy, and Lau, Jey Han
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
An important factor when it comes to generating fact-checking explanations is the selection of evidence: intuitively, high-quality explanations can only be generated given the right evidence. In this work, we investigate the impact of human-curated vs. machine-selected evidence for explanation generation using large language models. To assess the quality of explanations, we focus on transparency (whether an explanation cites sources properly) and utility (whether an explanation is helpful in clarifying a claim). Surprisingly, we found that large language models generate similar or higher quality explanations using machine-selected evidence, suggesting carefully curated evidence (by humans) may not be necessary. That said, even with the best model, the generated explanations are not always faithful to the sources, suggesting further room for improvement in explanation generation for fact-checking.
- Published
- 2024
20. MFC-Bench: Benchmarking Multimodal Fact-Checking with Large Vision-Language Models
- Author
Wang, Shengkang, Lin, Hongzhan, Luo, Ziyang, Ye, Zhen, Chen, Guang, and Ma, Jing
- Subjects
Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
- Abstract
Large vision-language models (LVLMs) have significantly improved multimodal reasoning tasks, such as visual question answering and image captioning. These models embed multimodal facts within their parameters, rather than relying on external knowledge bases to store factual information explicitly. However, the content discerned by LVLMs may deviate from actual facts due to inherent bias or incorrect inference. To address this issue, we introduce MFC-Bench, a rigorous and comprehensive benchmark designed to evaluate the factual accuracy of LVLMs across three tasks: Manipulation, Out-of-Context, and Veracity Classification. Through our evaluation on MFC-Bench, we benchmarked 12 diverse and representative LVLMs, uncovering that current models still fall short in multimodal fact-checking and demonstrate insensitivity to various forms of manipulated content. We hope that MFC-Bench could raise attention to the trustworthy artificial intelligence potentially assisted by LVLMs in the future. The MFC-Bench and accompanying resources are publicly accessible at https://github.com/wskbest/MFC-Bench, contributing to ongoing research in the multimodal fact-checking field., Comment: 22 pages, 8 figures
- Published
- 2024
21. Document-level Claim Extraction and Decontextualisation for Fact-Checking
- Author
Deng, Zhenyun, Schlichtkrull, Michael, and Vlachos, Andreas
- Subjects
Computer Science - Computation and Language
- Abstract
Selecting which claims to check is a time-consuming task for human fact-checkers, especially from documents consisting of multiple sentences and containing multiple claims. However, existing claim extraction approaches focus more on identifying and extracting claims from individual sentences, e.g., identifying whether a sentence contains a claim or the exact boundaries of the claim within a sentence. In this paper, we propose a method for document-level claim extraction for fact-checking, which aims to extract check-worthy claims from documents and decontextualise them so that they can be understood out of context. Specifically, we first recast claim extraction as extractive summarization in order to identify central sentences from documents, then rewrite them to include necessary context from the originating document through sentence decontextualisation. Evaluation with both automatic metrics and a fact-checking professional shows that our method is able to extract check-worthy claims from documents more accurately than previous work, while also improving evidence retrieval., Comment: Accepted to ACL 2024
- Published
- 2024
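Entry 21 recasts claim extraction as extractive summarization: rank sentences by how central they are to the document, then decontextualise the selected ones. Below is a minimal sketch of the centrality step with sentence-transformers; the decontextualisation rewrite, done with a generation model in the paper, is only indicated in a comment, and the document is invented.

```python
from sentence_transformers import SentenceTransformer, util

encoder = SentenceTransformer("all-MiniLM-L6-v2")

document = [
    "The mayor spoke at the opening of the new bridge.",
    "She claimed the project came in 20% under budget.",
    "Local media covered the ceremony extensively.",
]

embeddings = encoder.encode(document, convert_to_tensor=True)
similarity = util.cos_sim(embeddings, embeddings)      # sentence-to-sentence similarity
centrality = similarity.mean(dim=1)                    # central = similar to the rest

best = int(centrality.argmax())
print("check-worthy candidate:", document[best])
# A decontextualisation step would then rewrite the selected sentence so it stands
# alone, e.g. resolving "She" and "the project" from the surrounding document.
```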
22. The Impact and Opportunities of Generative AI in Fact-Checking
- Author
Wolfe, Robert and Mitra, Tanushree
- Subjects
Computer Science - Human-Computer Interaction, Computer Science - Artificial Intelligence, Computer Science - Computers and Society
- Abstract
Generative AI appears poised to transform white collar professions, with more than 90% of Fortune 500 companies using OpenAI's flagship GPT models, which have been characterized as "general purpose technologies" capable of effecting epochal changes in the economy. But how will such technologies impact organizations whose job is to verify and report factual information, and to ensure the health of the information ecosystem? To investigate this question, we conducted 30 interviews with N=38 participants working at 29 fact-checking organizations across six continents, asking about how they use generative AI and the opportunities and challenges they see in the technology. We found that uses of generative AI envisioned by fact-checkers differ based on organizational infrastructure, with applications for quality assurance in Editing, for trend analysis in Investigation, and for information literacy in Advocacy. We used the TOE framework to describe participant concerns ranging from the Technological (lack of transparency), to the Organizational (resource constraints), to the Environmental (uncertain and evolving policy). Building on the insights of our participants, we describe value tensions between fact-checking and generative AI, and propose a novel Verification dimension to the design space of generative models for information verification work. Finally, we outline an agenda for fairness, accountability, and transparency research to support the responsible use of generative AI in fact-checking. Throughout, we highlight the importance of human infrastructure and labor in producing verified information in collaboration with AI. We expect that this work will inform not only the scientific literature on fact-checking, but also contribute to understanding of organizational adaptation to a powerful but unreliable new technology., Comment: To be published at the ACM Conference on Fairness, Accountability, and Transparency (FAccT) 2024
- Published
- 2024
23. Automatic News Generation and Fact-Checking System Based on Language Processing
- Author
Peng, Xirui, Xu, Qiming, Feng, Zheng, Zhao, Haopeng, Tan, Lianghao, Zhou, Yan, Zhang, Zecheng, Gong, Chenwei, and Zheng, Yingqiao
- Subjects
Computer Science - Computation and Language, Computer Science - Machine Learning, I.5, H.4
- Abstract
This paper explores an automatic news generation and fact-checking system based on language processing, aimed at enhancing the efficiency and quality of news production while ensuring the authenticity and reliability of the news content. With the rapid development of Natural Language Processing (NLP) and deep learning technologies, automatic news generation systems are capable of extracting key information from massive data and generating well-structured, fluent news articles. Meanwhile, by integrating fact-checking technology, the system can effectively prevent the spread of false news and improve the accuracy and credibility of news. This study details the key technologies involved in automatic news generation and fact-checking, including text generation, information extraction, and the application of knowledge graphs, and validates the effectiveness of these technologies through experiments. Additionally, the paper discusses the future development directions of automatic news generation and fact-checking systems, emphasizing the importance of further integration and innovation of technologies. The results show that with continuous technological optimization and practical application, these systems will play an increasingly important role in the future news industry, providing more efficient and reliable news services.
- Published
- 2024
24. ViWikiFC: Fact-Checking for Vietnamese Wikipedia-Based Textual Knowledge Source
- Author
Le, Hung Tuan, To, Long Truong, Nguyen, Manh Trong, and Van Nguyen, Kiet
- Subjects
Computer Science - Computation and Language
- Abstract
Fact-checking is essential due to the explosion of misinformation in the media ecosystem. Although false information exists in every language and country, most research to solve the problem has mainly concentrated on huge communities like English and Chinese. Low-resource languages like Vietnamese still need corpora and models for fact verification to be explored. To bridge this gap, we construct ViWikiFC, the first manually annotated open-domain corpus for Vietnamese Wikipedia fact-checking, with more than 20K claims generated by converting evidence sentences extracted from Wikipedia articles. We analyze our corpus through many linguistic aspects, including the new dependency rate, the new n-gram rate, and the new word rate. We conducted various experiments for Vietnamese fact-checking, including evidence retrieval and verdict prediction. BM25 and InfoXLM (Large) achieved the best results in the two tasks: BM25 achieved an accuracy of 88.30% for SUPPORTS, 86.93% for REFUTES, and only 56.67% for the NEI label in the evidence retrieval task, while InfoXLM (Large) achieved an F1 score of 86.51%. Furthermore, we also conducted a pipeline approach, which only achieved a strict accuracy of 67.00% when using InfoXLM (Large) and BM25. These results demonstrate that our dataset is challenging for Vietnamese language models in fact-checking tasks.
- Published
- 2024
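Entry 24 uses BM25 for evidence retrieval over Wikipedia sentences before verdict prediction. Below is a minimal sketch with the rank_bm25 package; the toy English corpus and whitespace tokenisation are simplifications of the paper's Vietnamese setting.

```python
from rank_bm25 import BM25Okapi

corpus = [
    "Hanoi is the capital of Vietnam.",
    "The Mekong River flows through six countries.",
    "Vietnam's population passed 100 million in 2023.",
]
claim = "The capital of Vietnam is Hanoi."

# Whitespace tokenisation is a simplification; Vietnamese text needs a proper tokenizer.
bm25 = BM25Okapi([doc.lower().split() for doc in corpus])
top_evidence = bm25.get_top_n(claim.lower().split(), corpus, n=1)
print(top_evidence)   # the retrieved sentence would then feed a verdict model
```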
25. Tell Me Why: Explainable Public Health Fact-Checking with Large Language Models
- Author
Zarharan, Majid, Wullschleger, Pascal, Kia, Babak Behkam, Pilehvar, Mohammad Taher, and Foster, Jennifer
- Subjects
Computer Science - Computation and Language
- Abstract
This paper presents a comprehensive analysis of explainable fact-checking through a series of experiments, focusing on the ability of large language models to verify public health claims and provide explanations or justifications for their veracity assessments. We examine the effectiveness of zero/few-shot prompting and parameter-efficient fine-tuning across various open and closed-source models, examining their performance in both isolated and joint tasks of veracity prediction and explanation generation. Importantly, we employ a dual evaluation approach comprising previously established automatic metrics and a novel set of criteria through human evaluation. Our automatic evaluation indicates that, within the zero-shot scenario, GPT-4 emerges as the standout performer, but in few-shot and parameter-efficient fine-tuning contexts, open-source models demonstrate their capacity to not only bridge the performance gap but, in some instances, surpass GPT-4. Human evaluation reveals yet more nuance as well as indicating potential problems with the gold explanations.
- Published
- 2024
26. Can LLMs Produce Faithful Explanations For Fact-checking? Towards Faithful Explainable Fact-Checking via Multi-Agent Debate
- Author
Kim, Kyungha, Lee, Sangyun, Huang, Kung-Hsiang, Chan, Hou Pong, Li, Manling, and Ji, Heng
- Subjects
Computer Science - Computation and Language
- Abstract
Fact-checking research has extensively explored verification but less so the generation of natural-language explanations, crucial for user trust. While Large Language Models (LLMs) excel in text generation, their capability for producing faithful explanations in fact-checking remains underexamined. Our study investigates LLMs' ability to generate such explanations, finding that zero-shot prompts often result in unfaithfulness. To address these challenges, we propose the Multi-Agent Debate Refinement (MADR) framework, leveraging multiple LLMs as agents with diverse roles in an iterative refining process aimed at enhancing faithfulness in generated explanations. MADR ensures that the final explanation undergoes rigorous validation, significantly reducing the likelihood of unfaithful elements and aligning closely with the provided evidence. Experimental results demonstrate that MADR significantly improves the faithfulness of LLM-generated explanations to the evidence, advancing the credibility and trustworthiness of these explanations.
- Published
- 2024
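Entry 26 refines explanations through iterative multi-agent critique. Below is only a schematic sketch of such a refinement loop; call_llm is a hypothetical placeholder for any chat-completion call, and the prompts and stopping rule are illustrative rather than the MADR framework's own.

```python
def call_llm(prompt: str) -> str:
    """Hypothetical wrapper around a chat-completion API (placeholder)."""
    raise NotImplementedError

def refine_explanation(claim: str, evidence: str, rounds: int = 3) -> str:
    # Initial explanation, then repeated critique-and-revise passes by a second agent.
    explanation = call_llm(f"Explain the verdict for: {claim}\nEvidence: {evidence}")
    for _ in range(rounds):
        critique = call_llm(
            "List any statements in this explanation that are not supported "
            f"by the evidence.\nEvidence: {evidence}\nExplanation: {explanation}"
        )
        if "none" in critique.lower():
            break   # the critic found no unfaithful content; stop refining
        explanation = call_llm(
            f"Revise the explanation to fix these issues: {critique}\n"
            f"Evidence: {evidence}\nExplanation: {explanation}"
        )
    return explanation
```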
27. Fact-checking journalism and political argumentation : a British perspective.
- Author
Birks, Jen
- Subjects
Journalism -- Objectivity, Journalism -- Political aspects, Journalistic ethics, Journalism, Media and Communication, Political Communication
- Abstract
Summary: "Dr Birks eloquently guides the reader through a complex field of mediated post-truth politics in an engaging and accessible critique of fact checking journalism. In this book we are treated to a highly sophisticated analysis of the dynamics and epistemologies of fact checking journalism, with detailed examples from a UK perspective, where Birks demonstrates the importance of understanding fact-checking in relation to the quality of political debate. The book is essential reading for anyone wanting to understand the contested nature of fact and truth claims in journalism today." (Einar Thorsen, Associate Professor of Journalism and Communication, Bournemouth University, UK) "In an era of post-factual democracy, fact-checkers should act as an antidote to politician's 'bullshit', helping citizens to make informed choices. Birks' important research is a must read for anyone seeking to understand fact checkers or create a fact check system that plays a role in monitoring the health of democratic debate." (Darren G. Lilleker, Associate Professor in Political Communication, Bournemouth University, UK) This timely book examines the role of fact-checking journalism within political policy debates, and its potential contribution to public engagement. Understanding facts not to operate in a political vacuum, the book argues for a wide remit for fact-checking journalism beyond empirically-checkable facts, to include the causal relationships and predictions that form part of wider political arguments and are central to electoral pledges. Whilst these statements cannot be proven or disproven, fact-checking can, and sometimes does, ask pertinent critical questions about the premises of those claims and arguments. The analysis centres on the three dedicated national British fact-checkers during the UK's 2017 snap general election, including their activity and engagement on Twitter. The book also makes a close political discourse and argumentation analysis of three key issue debates in flagship reporting from Channel 4 News and the BBC.
- Published
- 2019
28. Examining the Potential of ChatGPT on Biomedical Information Retrieval: Fact-Checking Drug-Disease Associations
- Author
Gao, Zhenxiang, Li, Lingyao, Ma, Siyuan, Wang, Qinyong, Hemphill, Libby, and Xu, Rong
- Published
- 2024
29. Credible, Unreliable or Leaked?: Evidence Verification for Enhanced Automated Fact-checking
- Author
Chrysidis, Zacharias, Papadopoulos, Stefanos-Iordanis, Papadopoulos, Symeon, and Petrantonakis, Panagiotis C.
- Subjects
Computer Science - Computation and Language, Computer Science - Computers and Society, Computer Science - Information Retrieval, Computer Science - Social and Information Networks
- Abstract
Automated fact-checking (AFC) is garnering increasing attention by researchers aiming to help fact-checkers combat the increasing spread of misinformation online. While many existing AFC methods incorporate external information from the Web to help examine the veracity of claims, they often overlook the importance of verifying the source and quality of collected "evidence". One overlooked challenge involves the reliance on "leaked evidence", information gathered directly from fact-checking websites and used to train AFC systems, resulting in an unrealistic setting for early misinformation detection. Similarly, the inclusion of information from unreliable sources can undermine the effectiveness of AFC systems. To address these challenges, we present a comprehensive approach to evidence verification and filtering. We create the "CREDible, Unreliable or LEaked" (CREDULE) dataset, which consists of 91,632 articles classified as Credible, Unreliable and Fact checked (Leaked). Additionally, we introduce the EVidence VERification Network (EVVER-Net), trained on CREDULE to detect leaked and unreliable evidence in both short and long texts. EVVER-Net can be used to filter evidence collected from the Web, thus enhancing the robustness of end-to-end AFC systems. We experiment with various language models and show that EVVER-Net can demonstrate impressive performance of up to 91.5% and 94.4% accuracy, while leveraging domain credibility scores along with short or long texts, respectively. Finally, we assess the evidence provided by widely-used fact-checking datasets including LIAR-PLUS, MOCHEG, FACTIFY, NewsCLIPpings+ and VERITE, some of which exhibit concerning rates of leaked and unreliable evidence.
- Published
- 2024
30. ReproHum #0087-01: Human Evaluation Reproduction Report for Generating Fact Checking Explanations
- Author
Loakman, Tyler and Lin, Chenghua
- Subjects
Computer Science - Computation and Language
- Abstract
This paper presents a partial reproduction of Generating Fact Checking Explanations by Atanasova et al. (2020) as part of the ReproHum element of the ReproNLP shared task to reproduce the findings of NLP research regarding human evaluation. This shared task aims to investigate the extent to which NLP as a field is becoming more or less reproducible over time. Following the instructions provided by the task organisers and the original authors, we collect relative rankings of 3 fact-checking explanations (comprising a gold standard and the outputs of 2 models) for 40 inputs on the criteria of Coverage. The results of our reproduction and reanalysis of the original work's raw results lend support to the original findings, with similar patterns seen between the original work and our reproduction. Whilst we observe slight variation from the original results, our findings support the main conclusions drawn by the original authors pertaining to the efficacy of their proposed models., Comment: Accepted to HumEval at LREC-Coling 2024. Table 1 updated
- Published
- 2024
31. MiniCheck: Efficient Fact-Checking of LLMs on Grounding Documents
- Author
Tang, Liyan, Laban, Philippe, and Durrett, Greg
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
Recognizing if LLM output can be grounded in evidence is central to many tasks in NLP: retrieval-augmented generation, summarization, document-grounded dialogue, and more. Current approaches to this kind of "fact-checking" are based on verifying each piece of a model generation against potential evidence using an LLM. However, this process can be very computationally expensive, requiring many calls to LLMs to check a single response. In this work, we show how to build small models that have GPT-4-level performance but for 400x lower cost. We do this by constructing synthetic training data with GPT-4, which involves creating realistic yet challenging instances of factual errors via a structured generation procedure. Training on this data teaches models to check each fact in the claim and recognize synthesis of information across sentences. For evaluation, we unify pre-existing datasets into a benchmark LLM-AggreFact, collected from recent work on fact-checking and grounding LLM generations. Our best system MiniCheck-FT5 (770M parameters) outperforms all systems of comparable size and reaches GPT-4 accuracy. We release LLM-AggreFact, code for data synthesis, and models., Comment: LLM-AggreFact benchmark, MiniCheck models, data generation code at https://github.com/Liyan06/MiniCheck
- Published
- 2024
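Entry 31 checks each fact in an LLM response against a grounding document. Below is a minimal sketch of such a per-claim entailment check using an off-the-shelf MNLI model from Hugging Face; this is a generic baseline for illustration, not the released MiniCheck-FT5 model.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

name = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

document = "The report was published in March 2022 and covers 14 countries."
claims = ["The report covers 14 countries.", "The report was published in 2024."]

for claim in claims:
    # Premise = grounding document, hypothesis = one atomic claim from the LLM output.
    inputs = tokenizer(document, claim, return_tensors="pt", truncation=True)
    probs = torch.softmax(model(**inputs).logits, dim=-1)[0]
    label = model.config.id2label[int(probs.argmax())]
    print(f"{claim!r}: {label}")   # ENTAILMENT vs. NEUTRAL vs. CONTRADICTION
```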
32. Fact Checking Beyond Training Set
- Author
Karisani, Payam and Ji, Heng
- Subjects
Computer Science - Computation and Language, Computer Science - Machine Learning
- Abstract
Evaluating the veracity of everyday claims is time consuming and in some cases requires domain expertise. We empirically demonstrate that the commonly used fact checking pipeline, known as the retriever-reader, suffers from performance deterioration when it is trained on the labeled data from one domain and used in another domain. Afterwards, we delve into each component of the pipeline and propose novel algorithms to address this problem. We propose an adversarial algorithm to make the retriever component robust against distribution shift. Our core idea is to initially train a bi-encoder on the labeled source data, and then, to adversarially train two separate document and claim encoders using unlabeled target data. We then focus on the reader component and propose to train it such that it is insensitive towards the order of claims and evidence documents. Our empirical evaluations support the hypothesis that such a reader shows a higher robustness against distribution shift. To our knowledge, there is no publicly available multi-topic fact checking dataset. Thus, we propose a simple automatic method to re-purpose two well-known fact checking datasets. We then construct eight fact checking scenarios from these datasets, and compare our model to a set of strong baseline models, including recent domain adaptation models that use GPT4 for generating synthetic data., Comment: NAACL 2024
- Published
- 2024
33. RU22Fact: Optimizing Evidence for Multilingual Explainable Fact-Checking on Russia-Ukraine Conflict
- Author
Zeng, Yirong, Ding, Xiao, Zhao, Yi, Li, Xiangyu, Zhang, Jie, Yao, Chao, Liu, Ting, and Qin, Bing
- Subjects
Computer Science - Computation and Language
- Abstract
Fact-checking is the task of verifying the factuality of a given claim by examining the available evidence. High-quality evidence plays a vital role in enhancing fact-checking systems and facilitating the generation of explanations that are understandable to humans. However, the provision of both sufficient and relevant evidence for explainable fact-checking systems poses a challenge. To tackle this challenge, we propose a method based on a Large Language Model to automatically retrieve and summarize evidence from the Web. Furthermore, we construct RU22Fact, a novel multilingual explainable fact-checking dataset on the Russia-Ukraine conflict in 2022 of 16K samples, each containing real-world claims, optimized evidence, and referenced explanation. To establish a baseline for our dataset, we also develop an end-to-end explainable fact-checking system to verify claims and generate explanations. Experimental results demonstrate the prospect of optimized evidence in increasing fact-checking performance and also indicate the possibility of further progress in the end-to-end claim verification and explanation generation tasks., Comment: 12 pages, 3 figures, accepted by lrec-coling2024
- Published
- 2024
34. Fact Checking Chatbot: A Misinformation Intervention for Instant Messaging Apps and an Analysis of Trust in the Fact Checkers
- Author
Lim, Gionnieve and Perrault, Simon T.
- Subjects
Computer Science - Human-Computer Interaction
- Abstract
In Singapore, there has been a rise in misinformation on mobile instant messaging services (MIMS). MIMS support both small peer-to-peer networks and large groups. Misinformation in the former may spread due to recipients' trust in the sender while in the latter, misinformation can directly reach a wide audience. The encryption of MIMS makes it difficult to address misinformation directly. As such, chatbots have become an alternative solution where users can disclose their chat content directly to fact checking services. To understand how effective fact checking chatbots are as an intervention and how trust in three different fact checkers (i.e., Government, News Outlets, and Artificial Intelligence) may affect this effectiveness, we conducted a within-subjects experiment with 527 Singapore residents. We found mixed results for the fact checkers but support for the chatbot intervention overall. We also found a striking contradiction between participants' trust in the fact checkers and their behaviour towards them. Specifically, those who reported a high level of trust in the government performed worse and tended to follow the fact checking tool less when it was endorsed by the government.
- Published
- 2024
35. Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification
- Author
Fadeeva, Ekaterina, Rubashevskii, Aleksandr, Shelmanov, Artem, Petrakov, Sergey, Li, Haonan, Mubarak, Hamdy, Tsymbalov, Evgenii, Kuzmin, Gleb, Panchenko, Alexander, Baldwin, Timothy, Nakov, Preslav, and Panov, Maxim
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
- Abstract
Large language models (LLMs) are notorious for hallucinating, i.e., producing erroneous claims in their output. Such hallucinations can be dangerous, as occasional factual inaccuracies in the generated text might be obscured by the rest of the output being generally factually correct, making it extremely hard for the users to spot them. Current services that leverage LLMs usually do not provide any means for detecting unreliable generations. Here, we aim to bridge this gap. In particular, we propose a novel fact-checking and hallucination detection pipeline based on token-level uncertainty quantification. Uncertainty scores leverage information encapsulated in the output of a neural network or its layers to detect unreliable predictions, and we show that they can be used to fact-check the atomic claims in the LLM output. Moreover, we present a novel token-level uncertainty quantification method that removes the impact of uncertainty about what claim to generate on the current step and what surface form to use. Our method Claim Conditioned Probability (CCP) measures only the uncertainty of a particular claim value expressed by the model. Experiments on the task of biography generation demonstrate strong improvements for CCP compared to the baselines for seven LLMs and four languages. Human evaluation reveals that the fact-checking pipeline based on uncertainty quantification is competitive with a fact-checking tool that leverages external knowledge., Comment: Accepted to ACL-2024 (Findings). Ekaterina Fadeeva, Aleksandr Rubashevskii, and Artem Shelmanov contributed equally
- Published
- 2024
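Entry 35 flags unreliable claims in LLM output via token-level uncertainty. The sketch below extracts plain per-token probabilities from a causal LM with Hugging Face transformers as a simple uncertainty signal; the Claim Conditioned Probability method described in the paper goes further by isolating uncertainty about the claim value itself.

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

text = "Marie Curie was born in Warsaw in 1867."
ids = tokenizer(text, return_tensors="pt").input_ids

with torch.no_grad():
    logits = model(ids).logits

# Probability the model assigns to each actual next token in the sequence.
log_probs = torch.log_softmax(logits[0, :-1], dim=-1)
token_probs = log_probs.gather(1, ids[0, 1:].unsqueeze(1)).squeeze(1).exp()

for tok, p in zip(tokenizer.convert_ids_to_tokens(ids[0, 1:].tolist()), token_probs):
    print(f"{tok:>12s}  p={p:.3f}")   # low-probability spans mark claims worth checking
```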
36. Multimodal Large Language Models to Support Real-World Fact-Checking
- Author
Geng, Jiahui, Kementchedjhieva, Yova, Nakov, Preslav, and Gurevych, Iryna
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
Multimodal large language models (MLLMs) carry the potential to support humans in processing vast amounts of information. While MLLMs are already being used as a fact-checking tool, their abilities and limitations in this regard are understudied. Here we aim to bridge this gap. In particular, we propose a framework for systematically assessing the capacity of current multimodal models to facilitate real-world fact-checking. Our methodology is evidence-free, leveraging only these models' intrinsic knowledge and reasoning capabilities. By designing prompts that extract models' predictions, explanations, and confidence levels, we delve into research questions concerning model accuracy, robustness, and reasons for failure. We empirically find that (1) GPT-4V exhibits superior performance in identifying malicious and misleading multimodal claims, with the ability to explain the unreasonable aspects and underlying motives, and (2) existing open-source models exhibit strong biases and are highly sensitive to the prompt. Our study offers insights into combating false multimodal information and building secure, trustworthy multimodal models. To the best of our knowledge, we are the first to evaluate MLLMs for real-world fact-checking.
- Published
- 2024
37. Are Fact-Checking Tools Helpful? An Exploration of the Usability of Google Fact Check
- Author
Yang, Qiangeng, Christensen, Tess, Gilda, Shlok, Fernandes, Juliana, Oliveira, Daniela, Wilson, Ronald, and Woodard, Damon
- Subjects
Computer Science - Social and Information Networks
- Abstract
Fact-checking-specific search engines such as Google Fact Check are a promising way to combat misinformation on social media, especially during significant events such as the COVID-19 pandemic and the U.S. presidential elections, but the usability of such an approach has not been thoroughly studied. We evaluated the performance of Google Fact Check by analyzing the retrieved fact-checking results regarding 1,000 COVID-19-related false claims and found it able to retrieve the fact-checking results for 15.8% of the input claims, and the results are relatively reliable. We also found that the false claims receiving different fact-checking verdicts (i.e., "False," "Partly False," "True," and "Unratable") tend to reflect diverse emotional tones, and fact-checking sources tend to check the claims in different lengths and using dictionary words to various extents. Claim variations addressing the same issue yet described differently are likely to retrieve distinct fact-checking results. We suggested that the quantities of the retrieved fact-checking results could be optimized and that slightly adjusting input wording may be the best practice for users to retrieve more useful information. This study aims to contribute to the understanding of state-of-the-art fact-checking tools and information integrity.
- Published
- 2024
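Entry 37 studies what Google Fact Check retrieves for input claims. Below is a minimal sketch of querying the Fact Check Tools claims:search endpoint with requests; the endpoint and response fields are to the best of my knowledge, the API key is a placeholder, and the example claim is illustrative.

```python
import requests

API_KEY = "YOUR_API_KEY"   # placeholder; requires a Google Fact Check Tools API key
claim = "COVID-19 vaccines contain microchips"

resp = requests.get(
    "https://factchecktools.googleapis.com/v1alpha1/claims:search",
    params={"query": claim, "languageCode": "en", "key": API_KEY},
    timeout=10,
)
resp.raise_for_status()

# Print publisher, verdict, and URL for each returned fact-check review.
for item in resp.json().get("claims", []):
    for review in item.get("claimReview", []):
        print(review.get("publisher", {}).get("name"),
              "-", review.get("textualRating"),
              "-", review.get("url"))
```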
38. Heterogeneous Graph Reasoning for Fact Checking over Texts and Tables
- Author
Gong, Haisong, Xu, Weizhi, Wu, Shu, Liu, Qiang, and Wang, Liang
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
Fact checking aims to predict claim veracity by reasoning over multiple evidence pieces. It usually involves evidence retrieval and veracity reasoning. In this paper, we focus on the latter, reasoning over unstructured text and structured table information. Previous works have primarily relied on fine-tuning pretrained language models or training homogeneous-graph-based models. Despite their effectiveness, we argue that they fail to explore the rich semantic information underlying the evidence with different structures. To address this, we propose a novel word-level Heterogeneous-graph-based model for Fact Checking over unstructured and structured information, namely HeterFC. Our approach leverages a heterogeneous evidence graph, with words as nodes and thoughtfully designed edges representing different evidence properties. We perform information propagation via a relational graph neural network, facilitating interactions between claims and evidence. An attention-based method is utilized to integrate information, combined with a language model for generating predictions. We introduce a multitask loss function to account for potential inaccuracies in evidence retrieval. Comprehensive experiments on the large fact checking dataset FEVEROUS demonstrate the effectiveness of HeterFC. Code will be released at: https://github.com/Deno-V/HeterFC., Comment: Accepted by 38th Association for the Advancement of Artificial Intelligence, AAAI
- Published
- 2024
39. Surprising Efficacy of Fine-Tuned Transformers for Fact-Checking over Larger Language Models
- Author
Setty, Vinay
- Subjects
Computer Science - Computation and Language, Computer Science - Artificial Intelligence
- Abstract
In this paper, we explore the challenges associated with establishing an end-to-end fact-checking pipeline in a real-world context, covering over 90 languages. Our real-world experimental benchmarks demonstrate that fine-tuning Transformer models specifically for fact-checking tasks, such as claim detection and veracity prediction, provide superior performance over large language models (LLMs) like GPT-4, GPT-3.5-Turbo, and Mistral-7b. However, we illustrate that LLMs excel in generative tasks such as question decomposition for evidence retrieval. Through extensive evaluation, we show the efficacy of fine-tuned models for fact-checking in a multilingual setting and complex claims that include numerical quantities., Comment: Accepted in SIGIR 2024 (industry track)
- Published
- 2024
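Entry 39 reports that fine-tuned Transformers beat general-purpose LLMs on claim detection and veracity prediction. Below is a minimal fine-tuning sketch with the Hugging Face Trainer on a toy two-example claim-detection dataset; the model choice and data are illustrative only, not the paper's production setup.

```python
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSequenceClassification,
                          Trainer, TrainingArguments)

name = "distilbert-base-multilingual-cased"   # multilingual, in the spirit of 90+ languages
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name, num_labels=2)

# Toy dataset: label 1 = check-worthy claim, 0 = not a claim.
data = Dataset.from_dict({
    "text": ["Unemployment fell by 3% last year.", "I love this song."],
    "label": [1, 0],
}).map(lambda x: tokenizer(x["text"], truncation=True, padding="max_length",
                           max_length=64), batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="claim-detector", num_train_epochs=1,
                           per_device_train_batch_size=2, report_to="none"),
    train_dataset=data,
)
trainer.train()
```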
40. Entanglement: Balancing Punishment and Compensation, Repeated Dilemma Game-Theoretic Analysis of Maximum Compensation Problem for Bypass and Least Cost Paths in Fact-Checking, Case of Fake News with Weak Wallace's Law
- Author
Kawahata, Yasuko
- Subjects
Physics - Physics and Society, Computer Science - Artificial Intelligence, Economics - Theoretical Economics
- Abstract
This research note is organized with respect to a novel approach to solving problems related to the spread of fake news and effective fact-checking. Focusing on the least-cost routing problem, the discussion is organized with respect to the use of Metzler functions and Metzler matrices to model the dynamics of information propagation among news providers. With this approach, we designed a strategy to minimize the spread of fake news, which is detrimental to informational health, while at the same time maximizing the spread of credible information. In particular, through the punitive dominance problem and the maximum compensation problem, we developed and examined a path to reassess the incentives of news providers to act and to analyze their impact on the equilibrium of the information market. By applying the concept of entanglement to the context of information propagation, we shed light on the complexity of interactions among news providers and contribute to the formulation of more effective information management strategies. This study provides new theoretical and practical insights into issues related to fake news and fact-checking, and will be examined against improving informational health and public digital health. This paper is partially an attempt to utilize "Generative AI" and was written with educational intent. There are currently no plans for it to become a peer-reviewed paper., Comment: Recurring Dilemma, Wallace's Law, Entanglement, Detour Path, Least Cost Path, Metzler Function, Metzler Matrix, Fake News, Fact-Checking, Punitive Dominance Problem, Maximum Compensation Problem, Informational health
- Published
- 2024
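The note above relies on Metzler matrices to model information propagation among news providers. Since the abstract does not spell out its model, the block below only recalls the standard textbook definition and the positivity property that makes Metzler dynamics natural for nonnegative spread quantities; it is not the note's specific formulation.
```latex
% Standard facts about Metzler matrices, not the note's propagation model:
% off-diagonal entries are nonnegative, and the induced linear dynamics
% keep the nonnegative orthant invariant.
A \in \mathbb{R}^{n \times n} \text{ is Metzler} \iff a_{ij} \ge 0 \quad \text{for all } i \neq j,
\qquad
\dot{x}(t) = A\,x(t),\; x(0) \ge 0 \;\Rightarrow\; x(t) \ge 0 \ \text{for all } t \ge 0.
```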
41. Impact of a Mock-up Fact-checking Extension on HPV Vaccine Misinformation: A Survey Experiment
- Author
-
The University of Hong Kong and Zhiyuan Hou, Associate Professor
- Published
- 2024
42. FactCheck Editor: Multilingual Text Editor with End-to-End fact-checking
- Author
-
Setty, Vinay
- Subjects
Computer Science - Computation and Language - Abstract
We introduce 'FactCheck Editor', an advanced text editor designed to automate fact-checking and correct factual inaccuracies. Given the widespread issue of misinformation, often a result of unintentional mistakes by content creators, our tool aims to address this challenge. It supports over 90 languages and utilizes transformer models to assist humans in the labor-intensive process of fact verification. This demonstration showcases a complete workflow that detects text claims in need of verification, generates relevant search engine queries, and retrieves appropriate documents from the web. It employs Natural Language Inference (NLI) to predict the veracity of claims and uses LLMs to summarize the evidence and suggest textual revisions to correct any errors in the text. Additionally, the effectiveness of models used in claim detection and veracity assessment is evaluated across multiple languages., Comment: Accepted in SIGIR 2024 (demo track)
- Published
- 2024
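The FactCheck Editor demo above verifies detected claims with Natural Language Inference over retrieved documents. The sketch below illustrates only that NLI step with a public entailment model; the checkpoint and evidence strings are assumptions, and the claim detection, query generation, and LLM revision stages are omitted.
```python
# Illustrative sketch of an NLI-based veracity step: score each retrieved
# evidence passage against a detected claim. Model choice and the example
# texts are assumptions, not the FactCheck Editor implementation.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

NLI_MODEL = "facebook/bart-large-mnli"      # public NLI checkpoint (assumed choice)
tokenizer = AutoTokenizer.from_pretrained(NLI_MODEL)
model = AutoModelForSequenceClassification.from_pretrained(NLI_MODEL)

claim = "Drinking green tea cures influenza."
evidence = ["No clinical trial has shown green tea to cure influenza.",
            "Green tea contains antioxidants such as catechins."]

for premise in evidence:
    inputs = tokenizer(premise, claim, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = model(**inputs).logits.softmax(dim=-1)[0]
    label = model.config.id2label[int(probs.argmax())]   # contradiction / neutral / entailment
    print(f"{label:13s} p={probs.max().item():.2f}  <- {premise}")
```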
43. Reinforcement Retrieval Leveraging Fine-grained Feedback for Fact Checking News Claims with Black-Box LLM
- Author
-
Zhang, Xuan and Gao, Wei
- Subjects
Computer Science - Computation and Language - Abstract
Retrieval-augmented language models have exhibited promising performance across various areas of natural language processing (NLP), including fact-critical tasks. However, due to the black-box nature of advanced large language models (LLMs) and the non-retrieval-oriented supervision signals of specific tasks, training the retrieval model faces significant challenges in the black-box LLM setting. We propose an approach leveraging Fine-grained Feedback with Reinforcement Retrieval (FFRR) to enhance fact-checking of news claims with a black-box LLM. FFRR adopts a two-level strategy to gather fine-grained feedback from the LLM, which serves as a reward for optimizing the retrieval policy, by rating the retrieved documents against the non-retrieval ground truth of the task. We evaluate our model on two public datasets for real-world news claim verification, and the results demonstrate that FFRR achieves significant improvements over strong LLM-enabled and non-LLM baselines., Comment: Accepted by COLING 2024
- Published
- 2024
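FFRR, described above, optimizes a retrieval policy using LLM feedback as a reward. The toy sketch below shows a generic REINFORCE-style update over a softmax document-selection policy with a stand-in reward function; it conveys the general idea only and is not the paper's two-level feedback scheme.
```python
# Illustrative sketch of reward-driven retrieval tuning: sample a document
# from a softmax policy, obtain a scalar reward (standing in for the LLM's
# fine-grained rating), and apply a REINFORCE-style update. Scores and the
# rating function are placeholders, not the FFRR pipeline.
import torch

scores = torch.randn(8, requires_grad=True)      # policy logits over 8 candidate docs
optimizer = torch.optim.Adam([scores], lr=0.1)

def llm_rating(doc_id: int) -> float:
    # Placeholder for black-box LLM feedback; here docs 2 and 5 are "useful".
    return 1.0 if doc_id in (2, 5) else 0.0

for _ in range(100):
    probs = torch.softmax(scores, dim=0)
    doc = torch.multinomial(probs, 1).item()     # sample a document to retrieve
    reward = llm_rating(doc)
    loss = -torch.log(probs[doc]) * reward       # REINFORCE objective (no baseline)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print(torch.softmax(scores, dim=0))              # probability mass shifts to useful docs
```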
44. RAGAR, Your Falsehood Radar: RAG-Augmented Reasoning for Political Fact-Checking using Multimodal Large Language Models
- Author
-
Khaliq, M. Abdul, Chang, P., Ma, M., Pflugfelder, B., and Miletić, F.
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence ,Computer Science - Computers and Society ,Computer Science - Emerging Technologies ,Computer Science - Multiagent Systems - Abstract
The escalating challenge of misinformation, particularly in political discourse, requires advanced fact-checking solutions; this is even clearer in the more complex scenario of multimodal claims. We tackle this issue using a multimodal large language model in conjunction with retrieval-augmented generation (RAG), and introduce two novel reasoning techniques: Chain of RAG (CoRAG) and Tree of RAG (ToRAG). They fact-check multimodal claims by extracting both textual and image content, retrieving external information, and reasoning about the subsequent questions to be answered based on prior evidence. We achieve a weighted F1-score of 0.85, surpassing a baseline reasoning technique by 0.14 points. Human evaluation confirms that the vast majority of our generated fact-check explanations contain all information from the gold-standard data., Comment: 8 pages, submitted to ACL Rolling Review June 2024
- Published
- 2024
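The RAGAR entry above describes Chain of RAG: iteratively posing follow-up questions, retrieving evidence, and answering until a verdict can be formed. The skeleton below shows only that control flow, with hypothetical helper functions standing in for the LLM and retriever calls; it is not the RAGAR implementation.
```python
# Illustrative control flow for a chain-of-RAG loop. All three helpers are
# hypothetical stand-ins for LLM / retriever calls.
def next_question(claim, qa_history):
    # An LLM would decide what is still unknown; here we stop after 3 rounds.
    return None if len(qa_history) >= 3 else f"Follow-up {len(qa_history) + 1} about: {claim}"

def retrieve(question):
    # A retriever would fetch external evidence for the question.
    return [f"(web snippet relevant to '{question}')"]

def answer(question, evidence):
    # An LLM would answer the question grounded in the retrieved evidence.
    return f"answer to '{question}' grounded in {len(evidence)} snippet(s)"

def chain_of_rag(claim, max_steps=5):
    qa_history = []
    for _ in range(max_steps):
        q = next_question(claim, qa_history)
        if q is None:
            break
        ev = retrieve(q)                         # RAG step: gather external evidence
        qa_history.append((q, answer(q, ev)))
    verdict = "supported" if qa_history else "not enough info"   # placeholder verdict
    return verdict, qa_history

print(chain_of_rag("The pictured rally took place in 2021."))
```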
45. QuanTemp: A real-world open-domain benchmark for fact-checking numerical claims
- Author
-
V, Venktesh, Anand, Abhijit, Anand, Avishek, and Setty, Vinay
- Subjects
Computer Science - Computation and Language ,Computer Science - Artificial Intelligence - Abstract
Automated fact checking has gained immense interest as a way to tackle growing misinformation in the digital era. Existing systems primarily focus on synthetic claims based on Wikipedia, and noteworthy progress has also been made on real-world claims. In this work, we release QuanTemp, a diverse, multi-domain dataset focused exclusively on numerical claims, encompassing temporal, statistical, and other diverse aspects, with fine-grained metadata and an evidence collection without leakage. This addresses the challenge of verifying real-world numerical claims, which are complex and often lack precise information, a gap not covered by existing works that mainly focus on synthetic claims. We evaluate and quantify the limitations of existing solutions on the task of verifying numerical claims, including claim-decomposition-based methods and numerical-understanding-based models; our best baseline achieves a macro-F1 of 58.32. This demonstrates that QuanTemp serves as a challenging evaluation set for numerical claim verification., Comment: 11 pages, 1 figure, Accepted for publication at the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2024)
- Published
- 2024
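For readers unfamiliar with the macro-F1 figure quoted above, the snippet below shows how the metric is computed with scikit-learn on a toy three-way verdict set; the labels and predictions are illustrative and unrelated to QuanTemp's actual classes or results.
```python
# Minimal example of the macro-F1 metric: the unweighted mean of per-class F1.
# Labels and predictions are invented for illustration.
from sklearn.metrics import f1_score

gold = ["True", "False", "Conflicting", "False", "True", "False"]
pred = ["True", "False", "False",       "False", "False", "False"]

print(f1_score(gold, pred, average="macro"))
```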
46. Cross-Lingual Learning vs. Low-Resource Fine-Tuning: A Case Study with Fact-Checking in Turkish
- Author
-
Cekinel, Recep Firat, Karagoz, Pinar, and Coltekin, Cagri
- Subjects
Computer Science - Computation and Language - Abstract
The rapid spread of misinformation through social media platforms has raised concerns regarding its impact on public opinion. While misinformation is prevalent in many languages, the majority of research in this field has concentrated on English, and datasets for other languages, including Turkish, are scarce. To address this gap, we introduce the FCTR dataset, consisting of 3238 real-world claims. The dataset spans multiple domains and incorporates evidence collected from three Turkish fact-checking organizations. Additionally, we assess the effectiveness of cross-lingual transfer learning for low-resource languages, with a particular focus on Turkish, and report the in-context learning (zero-shot and few-shot) performance of large language models in this setting. The experimental results indicate that the dataset has the potential to advance research on the Turkish language., Comment: LREC-COLING 2024
- Published
- 2024
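The Turkish fact-checking entry above reports zero-shot and few-shot in-context learning results. The snippet below only illustrates how a few-shot veracity prompt can be assembled; the demonstration claims, labels, and English wording are invented for illustration and are not drawn from FCTR.
```python
# Illustrative few-shot prompt assembly for in-context veracity prediction.
# Demonstration examples and labels are invented, not taken from FCTR.
few_shot_examples = [
    ("The bridge opened to traffic in 2019.", "TRUE"),
    ("The ministry banned all electric scooters nationwide.", "FALSE"),
]

def build_prompt(claim: str) -> str:
    lines = ["Decide whether each claim is TRUE or FALSE.\n"]
    for demo_claim, label in few_shot_examples:
        lines.append(f"Claim: {demo_claim}\nLabel: {label}\n")
    lines.append(f"Claim: {claim}\nLabel:")      # the model completes the final label
    return "\n".join(lines)

print(build_prompt("The new dam supplies a third of the city's water."))
```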
47. FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs
- Author
-
Choi, Eun Cheol and Ferrara, Emilio
- Subjects
Computer Science - Computation and Language ,Computer Science - Computers and Society ,Computer Science - Human-Computer Interaction ,Computer Science - Social and Information Networks - Abstract
Our society is facing rampant misinformation harming public health and trust. To address the societal challenge, we introduce FACT-GPT, a system leveraging Large Language Models (LLMs) to automate the claim matching stage of fact-checking. FACT-GPT, trained on a synthetic dataset, identifies social media content that aligns with, contradicts, or is irrelevant to previously debunked claims. Our evaluation shows that our specialized LLMs can match the accuracy of larger models in identifying related claims, closely mirroring human judgment. This research provides an automated solution for efficient claim matching, demonstrates the potential of LLMs in supporting fact-checkers, and offers valuable resources for further research in the field.
- Published
- 2024
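FACT-GPT, above, automates claim matching against previously debunked claims. As a simple stand-in for that matching step, the snippet below ranks debunked claims against a new post by sentence-embedding similarity; the model choice and example claims are assumptions, and the paper itself fine-tunes LLMs on synthetic data rather than using embedding similarity.
```python
# Illustrative claim-matching sketch: rank previously debunked claims against
# a new post by cosine similarity of sentence embeddings. Model and examples
# are placeholders, not the FACT-GPT system.
from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("all-MiniLM-L6-v2")
post = "Heard that the new vaccine changes your DNA, scary stuff."
debunked = [
    "COVID-19 vaccines alter human DNA.",
    "5G towers spread the coronavirus.",
    "Drinking bleach cures COVID-19.",
]

scores = util.cos_sim(model.encode(post), model.encode(debunked))[0]
for claim, score in sorted(zip(debunked, scores.tolist()), key=lambda x: x[1], reverse=True):
    print(f"{score:.2f}  {claim}")
```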
48. The epistemic status of reproducibility in political fact-checking
- Author
-
Fernández-Roldan, Alejandro and Teira, David
- Published
- 2024
- Full Text
- View/download PDF
49. Can Fact-Checking Influence User Beliefs about Misinformation Claims: An Examination of Contingent Effects.
- Author
-
Bhattacherjee, Anol
- Abstract
Prior research has suggested that corrective fact-checking has inconsistent effects on beliefs about online misinformation claims. This study attempts to explain this inconsistency using three contingent factors--claim-source credibility, fact-checker credibility, and attitude strength--which respectively relate to three key parties in the fact-checking process: the source of a misleading claim, the fact-checker, and the user evaluating the fact-check. I hypothesize interactions among these factors and test them using two online experiments on COVID-19-related misinformation with over 900 participants. Multilevel analysis of pretest-posttest, repeated-measures data supports the hypothesized moderating effects and offers additional insights into how these effects vary between earlier and later phases of misinformation cycles. The paper concludes with a discussion of contributions to research and practice. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
50. Retrieval Augmented Verification: Unveiling Disinformation with Structured Representations for Zero-Shot Real-Time Evidence-guided Fact-Checking of Multi-modal Social media posts
- Author
-
Dey, Arka Ujjal, Llabrés, Artemis, Valveny, Ernest, and Karatzas, Dimosthenis
- Subjects
Computer Science - Multimedia - Abstract
Social media posts in which real images are unscrupulously reused alongside provocative text to promote a particular idea have been one of the major sources of disinformation. By design, these claims lack editorial oversight and are accessible to a vast population that may not otherwise have access to multiple information sources. This implies the need to fact-check these posts and clearly explain which parts of them are fake. In the supervised learning setup, this is often reduced to a binary classification problem, neglecting all intermediate stages. Further, these claims often involve recent events on which systems trained on historical data are prone to fail. In this work, we propose a zero-shot approach that retrieves real-time web-scraped evidence from multiple news websites and matches it with the claim text and image using pretrained language-vision systems. We propose a graph-structured representation, which a) allows us to gather evidence automatically and b) helps generate interpretable results by explicitly pointing out which parts of the claim cannot be verified. Our zero-shot method, with improved interpretability, generates competitive results against state-of-the-art methods.
- Published
- 2024
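The zero-shot approach above matches claim text and images with pretrained language-vision systems. The sketch below illustrates only a single CLIP-based text-image consistency check; the checkpoint, captions, and placeholder image are assumptions, and the paper's web-evidence retrieval and graph-structured representation are not shown.
```python
# Illustrative zero-shot text-image consistency check with a pretrained
# vision-language model (CLIP). The dummy image and captions are placeholders.
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.new("RGB", (224, 224))                      # placeholder for the post image
texts = ["a flooded city street after a storm",           # claim-derived caption
         "a peaceful daytime street market"]              # alternative caption

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    probs = model(**inputs).logits_per_image.softmax(dim=-1)
print(dict(zip(texts, probs[0].tolist())))                # relative text-image match scores
```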