1. Synergistic Detection of Multimodal Fake News Leveraging TextGCN and Vision Transformer.
- Authors
- Visweswaran, M., Mohan, Jayanth, Sachin Kumar, S., and Soman, K. P.
- Subjects
TRANSFORMER models, FAKE news, CONVOLUTIONAL neural networks, FEATURE extraction, MULTIMODAL user interfaces, DIGITAL technology, CELL fusion
- Abstract
In today's digital age, the rapid spread of fake news is a pressing concern. Fake news, whether intentional or inadvertent, manipulates public sentiment and threatens the integrity of online information, so effective detection and prevention methods are vital. Detecting multimodal fake news is an intricate challenge: unlike traditional news articles that rely predominantly on textual content, multimodal fake news leverages the persuasive power of visual elements, making its identification a formidable task. Manipulated images can significantly sway individuals' perceptions and beliefs, further complicating the detection of such deceptive content. Our research introduces an innovative approach to multimodal fake news identification: a fusion-based methodology that harnesses the capabilities of Text Graph Convolutional Networks (TextGCN) and Vision Transformers (ViT) to effectively utilise both text and image modalities. The proposed methodology starts by preprocessing textual content with TextGCN, capturing intricate structural dependencies among words and phrases. Simultaneously, visual features are extracted from associated images using ViT. A fusion mechanism then integrates the two modalities, yielding superior embeddings. The primary contribution is an in-depth exploration of multimodal fake news detection through this fusion-based approach. What sets it apart from existing techniques is its integration of graph-based feature extraction through TextGCN: while previous methods rely predominantly on text or image features alone, our approach harnesses the additional semantic information and intricate relationships encoded in a graph structure, in addition to image embeddings. This enables the method to capture a more comprehensive understanding of the data, resulting in increased accuracy and reliability.
Our experiments demonstrate the strong performance of the fusion-based approach, which leverages multiple modalities and incorporates graph-based representations and semantic relationships. The method outperformed single-modality text or image baselines, achieving an accuracy of 94.17% with a neural network classifier applied after fusion. By integrating graph-based representations and semantic relationships, the fusion-based technique represents a significant stride in addressing the challenges posed by multimodal fake news. [ABSTRACT FROM AUTHOR]
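The abstract describes a late-fusion pipeline: a TextGCN document embedding and a ViT image embedding are combined, and a neural network classifies the fused vector. The paper's code and dimensions are not given here, so the following is only a minimal NumPy sketch of that fusion-then-classify step; the embedding sizes, the concatenation fusion, and the two-layer head with random (untrained) weights are all illustrative assumptions, with the real TextGCN and ViT encoders replaced by dummy vectors.

```python
import numpy as np

# Illustrative dimensions only -- the paper does not specify these.
TEXT_DIM = 200    # stand-in for a TextGCN document embedding
IMAGE_DIM = 768   # stand-in for a ViT [CLS] embedding
HIDDEN_DIM = 128

rng = np.random.default_rng(0)

def fuse(text_emb: np.ndarray, image_emb: np.ndarray) -> np.ndarray:
    """Concatenation-based fusion of the two modality embeddings."""
    return np.concatenate([text_emb, image_emb], axis=-1)

# Untrained stand-in weights for the post-fusion neural network;
# in the actual method these would be learned from labelled data.
W1 = rng.standard_normal((TEXT_DIM + IMAGE_DIM, HIDDEN_DIM)) * 0.01
b1 = np.zeros(HIDDEN_DIM)
W2 = rng.standard_normal((HIDDEN_DIM, 2)) * 0.01
b2 = np.zeros(2)

def classify(fused: np.ndarray) -> np.ndarray:
    """Return softmax probabilities over {real, fake}."""
    h = np.maximum(fused @ W1 + b1, 0.0)   # ReLU hidden layer
    logits = h @ W2 + b2
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Dummy embeddings standing in for the TextGCN and ViT outputs.
text_emb = rng.standard_normal(TEXT_DIM)
image_emb = rng.standard_normal(IMAGE_DIM)
probs = classify(fuse(text_emb, image_emb))
print(probs.shape)  # one probability per class
```

The key design point the abstract emphasises is that fusion happens at the embedding level, so any graph-derived text encoder and any image encoder producing fixed-size vectors could be slotted in before this step.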
- Published
- 2024