Author: "Geng, Jiahui" / Database: OAIster - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Geng, Jiahui"' showing total 9 results

1. Multimodal Large Language Models to Support Real-World Fact-Checking

Author: Geng, Jiahui, Kementchedjhieva, Yova, Nakov, Preslav, and Gurevych, Iryna
Abstract: Multimodal large language models (MLLMs) carry the potential to support humans in processing vast amounts of information. While MLLMs are already being used as a fact-checking tool, their abilities and limitations in this regard are understudied. Here we aim to bridge this gap. In particular, we propose a framework for systematically assessing the capacity of current multimodal models to facilitate real-world fact-checking. Our methodology is evidence-free, leveraging only these models' intrinsic knowledge and reasoning capabilities. By designing prompts that extract models' predictions, explanations, and confidence levels, we delve into research questions concerning model accuracy, robustness, and reasons for failure. We empirically find that (1) GPT-4V exhibits superior performance in identifying malicious and misleading multimodal claims, with the ability to explain the unreasonable aspects and underlying motives, and (2) existing open-source models exhibit strong biases and are highly sensitive to the prompt. Our study offers insights into combating false multimodal information and building secure, trustworthy multimodal models. To the best of our knowledge, we are the first to evaluate MLLMs for real-world fact-checking.
Published: 2024

2. OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs

Author: Wang, Yuxia, Wang, Minghan, Iqbal, Hasan, Georgiev, Georgi, Geng, Jiahui, and Nakov, Preslav
Abstract: The increased use of large language models (LLMs) across a variety of real-world applications calls for mechanisms to verify the factual accuracy of their outputs. Difficulties lie in assessing the factuality of free-form responses in open domains. Also, different papers use disparate evaluation benchmarks and measurements, which renders them hard to compare and hampers future progress. To mitigate these issues, we propose OpenFactCheck, a unified factuality evaluation framework for LLMs. OpenFactCheck consists of three modules: (i) CUSTCHECKER allows users to easily customize an automatic fact-checker and verify the factual correctness of documents and claims, (ii) LLMEVAL, a unified evaluation framework, fairly assesses an LLM's factuality from various perspectives, and (iii) CHECKEREVAL is an extensible solution for gauging the reliability of automatic fact-checkers' verification results using human-annotated datasets. OpenFactCheck is publicly released at https://github.com/yuxiaw/OpenFactCheck.
Comment: 19 pages, 8 tables, 8 figures
Published: 2024

3. A Comprehensive Study on Dataset Distillation: Performance, Privacy, Robustness and Fairness

Author: Chen, Zongxiong, Geng, Jiahui, Zhu, Derui, Woisetschlaeger, Herbert, Li, Qing, Schimmler, Sonja, Mayer, Ruben, and Rong, Chunming
Abstract: The aim of dataset distillation is to encode the rich features of an original dataset into a tiny dataset. It is a promising approach to accelerate neural network training and related studies. Different approaches have been proposed to improve the informativeness and generalization performance of distilled images. However, no work has comprehensively analyzed this technique from a security perspective and there is a lack of systematic understanding of potential risks. In this work, we conduct extensive experiments to evaluate current state-of-the-art dataset distillation methods. We successfully use membership inference attacks to show that privacy risks still remain. Our work also demonstrates that dataset distillation can cause varying degrees of impact on model robustness and amplify model unfairness across classes when making predictions. This work offers a large-scale benchmarking framework for dataset distillation evaluation.
Published: 2023

4. A Survey on Dataset Distillation: Approaches, Applications and Future Directions

Author: Geng, Jiahui, Chen, Zongxiong, Wang, Yuandou, Woisetschlaeger, Herbert, Schimmler, Sonja, Mayer, Ruben, Zhao, Zhiming, and Rong, Chunming
Abstract: Dataset distillation is attracting more attention in machine learning as training sets continue to grow and the cost of training state-of-the-art models becomes increasingly high. By synthesizing datasets with high information density, dataset distillation offers a range of potential applications, including support for continual learning, neural architecture search, and privacy protection. Despite recent advances, we lack a holistic understanding of the approaches and applications. Our survey aims to bridge this gap by first proposing a taxonomy of dataset distillation, characterizing existing approaches, and then systematically reviewing the data modalities, and related applications. In addition, we summarize the challenges and discuss future directions for this field of research.
Published: 2023

5. Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers

Author: Wang, Yuxia, Reddy, Revanth Gangi, Mujahid, Zain Muhammad, Arora, Arnav, Rubashevskii, Aleksandr, Geng, Jiahui, Afzal, Osama Mohammed, Pan, Liangming, Borenstein, Nadav, Pillai, Aditya, Augenstein, Isabelle, Gurevych, Iryna, and Nakov, Preslav
Abstract: The increased use of large language models (LLMs) across a variety of real-world applications calls for mechanisms to verify the factual accuracy of their outputs. In this work, we present a holistic end-to-end solution for annotating the factuality of LLM-generated responses, which encompasses a multi-stage annotation scheme designed to yield detailed labels concerning the verifiability and factual inconsistencies found in LLM outputs. We further construct an open-domain document-level factuality benchmark in three-level granularity: claim, sentence and document, aiming to facilitate the evaluation of automatic fact-checking systems. Preliminary experiments show that FacTool, FactScore and Perplexity.ai are struggling to identify false claims, with the best F1=0.63 by this annotation solution based on GPT-4. Annotation tool, benchmark and code are available at https://github.com/yuxiaw/Factcheck-GPT.
Comment: 30 pages, 13 figures
Published: 2023

6. A Survey of Confidence Estimation and Calibration in Large Language Models

Author: Geng, Jiahui, Cai, Fengyu, Wang, Yuxia, Koeppl, Heinz, Nakov, Preslav, and Gurevych, Iryna
Abstract: Large language models (LLMs) have demonstrated remarkable capabilities across a wide range of tasks in various domains. Despite their impressive performance, they can be unreliable due to factual errors in their generations. Assessing their confidence and calibrating them across different tasks can help mitigate risks and enable LLMs to produce better generations. There has been a lot of recent research aiming to address this, but there has been no comprehensive overview to organize it and outline the main lessons learned. The present survey aims to bridge this gap. In particular, we outline the challenges and we summarize recent technical advancements for LLM confidence estimation and calibration. We further discuss their applications and suggest promising directions for future work.
Comment: 16 pages, 1 page, 1 table
Published: 2023

7. Towards General Deep Leakage in Federated Learning

Author: Geng, Jiahui, Mou, Yongli, Li, Feifei, Li, Qing, Beyan, Oya, Decker, Stefan, and Rong, Chunming
Abstract: Unlike traditional central training, federated learning (FL) improves the performance of the global model by sharing and aggregating local models rather than local data to protect the users' privacy. Although this training approach appears secure, some research has demonstrated that an attacker can still recover private data based on the shared gradient information. This on-the-fly reconstruction attack deserves to be studied in depth because it can occur at any stage of training, whether at the beginning or at the end of model training; no relevant dataset is required and no additional models need to be trained. We break through some unrealistic assumptions and limitations to apply this reconstruction attack in a broader range of scenarios. We propose methods that can reconstruct the training data from shared gradients or weights, corresponding to the FedSGD and FedAvg usage scenarios, respectively. We propose a zero-shot approach to restore labels even if there are duplicate labels in the batch. We study the relationship between the label and image restoration. We find that image restoration fails even if there is only one incorrectly inferred label in the batch; we also find that when batch images have the same label, the corresponding image is restored as a fusion of that class of images. Our approaches are evaluated on classic image benchmarks, including CIFAR-10 and ImageNet. The batch size, image quality, and the adaptability of the label distribution of our approach exceed those of GradInversion, the state-of-the-art.
Published: 2021

8. DID-eFed: Facilitating Federated Learning as a Service with Decentralized Identities

Author: Geng, Jiahui, Kanwal, Neel, Jaatun, Martin Gilje, and Rong, Chunming
Abstract: We have entered the era of big data, and it is considered to be the "fuel" for the flourishing of artificial intelligence applications. The enactment of the EU General Data Protection Regulation (GDPR) raises concerns about individuals' privacy in big data. Federated learning (FL) emerges as a functional solution that can help build high-performance models shared among multiple parties while still complying with user privacy and data confidentiality requirements. Although FL has been intensively studied and used in real applications, there is still limited research related to its prospects and applications as a FLaaS (Federated Learning as a Service) to interested 3rd parties. In this paper, we present a FLaaS system: DID-eFed, where FL is facilitated by decentralized identities (DID) and a smart contract. DID enables a more flexible and credible decentralized access management in our system, while the smart contract offers a frictionless and less error-prone process. We describe particularly the scenario where our DID-eFed enables the FLaaS among hospitals and research institutions.
Comment: Paper accepted in EASE2021
Published: 2021

9. Improving Unsupervised Word-by-Word Translation with Language Model and Denoising Autoencoder

Author: Kim, Yunsu, Geng, Jiahui, and Ney, Hermann
Abstract: Unsupervised learning of cross-lingual word embedding offers elegant matching of words across languages, but has fundamental limitations in translating sentences. In this paper, we propose simple yet effective methods to improve word-by-word translation of cross-lingual embeddings, using only monolingual corpora but without any back-translation. We integrate a language model for context-aware search, and use a novel denoising autoencoder to handle reordering. Our system surpasses state-of-the-art unsupervised neural translation systems without costly iterative training. We also analyze the effect of vocabulary size and denoising type on the translation performance, which provides better understanding of learning the cross-lingual word embedding and its usage in translation.
Comment: Published in EMNLP 2018, with links to the source code
Published: 2019
