Author: "Gkoumas, Dimitris" / Publication Type: Electronic Resources - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Gkoumas, Dimitris"' showing total 13 results

Start Over Author "Gkoumas, Dimitris" Publication Type Electronic Resources

13 results on '"Gkoumas, Dimitris"'

1. Quantum cognitively motivated context-aware multimodal representation learning for human language analysis

Author: Gkoumas, Dimitris
Abstract: A long-standing goal in the field of Artificial Intelligence (AI) is to develop systems that can perceive and understand human multimodal language. This requires both the consideration of context in the form of surrounding utterances in a conversation, i.e., 'context modelling', as well as the impact of different modalities (e.g., linguistic, visual acoustic), i.e., 'multimodal fusion'. In the last few years, significant strides have been made towards the interpretation of human language due to simultaneous advancement in deep learning, data gathering and computing infrastructure. AI models have been investigated to either model interactions across distinct modalities, i.e., linguistic, visual and acoustic, or model interactions across parties in a conversation, achieving unprecedented levels of performance. However, AI models are often designed with only performance as their design target, leaving aside other essential factors such as transparency, interpretability, and how humans understand and reason about cognitive states. In line with this observation, in this dissertation, we develop quantum probabilistic neural models and techniques that allow us to capture rational and irrational cognitive biases, without requiring 'a priori' understanding and identification of them. First, we present a comprehensive empirical comparison of state-of-the-art (SOTA) modality fusion strategies for video sentiment analysis. The findings provide us helpful insights into the development of more effective modality fusion models incorporating quantum-inspired components. Second, we introduce an end-to-end complex-valued neural model for video sentiment analysis, simulating quantum procedural steps, outside of physics, into the neural network modelling paradigm. Third, we investigate non-classical correlations across different modalities. In particular, we describe a methodology to model interactions between image and text for an information retrieval scenario. The results provide us with theoretical and empirical insights to develop a transparent end-to-end probabilistic neural model for video emotion detection in conversations, capturing non-classical correlations across distinct modalities. Fourth, we introduce a theoretical framework to model user's cognitive states underlying their multimodal decision perspectives, and propose a methodology to capture interference of modalities in decision making. Overall, we show that our models advance the SOTA on various affective analysis tasks, achieve high transparency due to the mapping to quantum physics meanings, and improve post-hoc interpretability, unearthing useful and explainable knowledge about cross-modal interactions.
Published: 2021
Full Text: View/download PDF

2. Feedback-aligned Mixed LLMs for Machine Language-Molecule Translation

Author: Gkoumas, Dimitris, Liakata, Maria, Gkoumas, Dimitris, and Liakata, Maria
Abstract: The intersection of chemistry and Artificial Intelligence (AI) is an active area of research focused on accelerating scientific discovery. While using large language models (LLMs) with scientific modalities has shown potential, there are significant challenges to address, such as improving training efficiency and dealing with the out-of-distribution problem. Focussing on the task of automated language-molecule translation, we are the first to use state-of-the art (SOTA) human-centric optimisation algorithms in the cross-modal setting, successfully aligning cross-language-molecule modals. We empirically show that we can augment the capabilities of scientific LLMs without the need for extensive data or large models. We conduct experiments using only 10% of the available data to mitigate memorisation effects associated with training large models on extensive datasets. We achieve significant performance gains, surpassing the best benchmark model trained on extensive in-distribution data by a large margin and reach new SOTA levels. Additionally we are the first to propose employing non-linear fusion for mixing cross-modal LLMs which further boosts performance gains without increasing training costs or data needs. Finally, we introduce a fine-grained, domain-agnostic evaluation method to assess hallucination in LLMs and promote responsible use.
Published: 2024

3. ALMol: Aligned Language-Molecule Translation LLMs through Offline Preference Contrastive Optimisation

Author: Gkoumas, Dimitris and Gkoumas, Dimitris
Abstract: The field of chemistry and Artificial Intelligence (AI) intersection is an area of active research that aims to accelerate scientific discovery. The integration of large language models (LLMs) with scientific modalities has shown significant promise in this endeavour. However, challenges persist in effectively addressing training efficacy and the out-of-distribution problem, particularly as existing approaches rely on larger models and datasets. In this context, we focus on machine language-molecule translation and deploy a novel training approach called contrastive preference optimisation, which avoids generating translations that are merely adequate but not perfect. To ensure generalisability and mitigate memorisation effects, we conduct experiments using only 10\% of the data. Our results demonstrate that our models achieve up to a 32\% improvement compared to counterpart models. We also introduce a scalable fine-grained evaluation methodology that accommodates responsibility.
Published: 2024

4. Reformulating NLP tasks to Capture Longitudinal Manifestation of Language Disorders in People with Dementia

Author: Gkoumas, Dimitris, Purver, Matthew, Liakata, Maria, Gkoumas, Dimitris, Purver, Matthew, and Liakata, Maria
Abstract: Dementia is associated with language disorders which impede communication. Here, we automatically learn linguistic disorder patterns by making use of a moderately-sized pre-trained language model and forcing it to focus on reformulated natural language processing (NLP) tasks and associated linguistic patterns. Our experiments show that NLP tasks that encapsulate contextual information and enhance the gradient signal with linguistic patterns benefit performance. We then use the probability estimates from the best model to construct digital linguistic markers measuring the overall quality in communication and the intensity of a variety of language disorders. We investigate how the digital markers characterize dementia speech from a longitudinal perspective. We find that our proposed communication marker is able to robustly and reliably characterize the language of people with dementia, outperforming existing linguistic approaches; and shows external validity via significant correlation with clinical markers of behaviour. Finally, our proposed linguistic disorder markers provide useful insights into gradual language impairment associated with disease progression., Comment: It has been accepted to appear at EMNLP23
Published: 2023

5. A Digital Language Coherence Marker for Monitoring Dementia

Author: Gkoumas, Dimitris, Tsakalidis, Adam, Liakata, Maria, Gkoumas, Dimitris, Tsakalidis, Adam, and Liakata, Maria
Abstract: The use of spontaneous language to derive appropriate digital markers has become an emergent, promising and non-intrusive method to diagnose and monitor dementia. Here we propose methods to capture language coherence as a cost-effective, human-interpretable digital marker for monitoring cognitive changes in people with dementia. We introduce a novel task to learn the temporal logical consistency of utterances in short transcribed narratives and investigate a range of neural approaches. We compare such language coherence patterns between people with dementia and healthy controls and conduct a longitudinal evaluation against three clinical bio-markers to investigate the reliability of our proposed digital coherence marker. The coherence marker shows a significant difference between people with mild cognitive impairment, those with Alzheimer's Disease and healthy controls. Moreover our analysis shows high association between the coherence marker and the clinical bio-markers as well as generalisability potential to other related conditions., Comment: It has been accepted to appear at EMNLP23
Published: 2023

6. What makes the difference?:An empirical comparison of fusion strategies for multimodal language analysis

Author: Gkoumas, Dimitris, Li, Qiuchi, Lioma, Christina, Yu, Yijun, Song, Dawei, Gkoumas, Dimitris, Li, Qiuchi, Lioma, Christina, Yu, Yijun, and Song, Dawei
Abstract: Multimodal video sentiment analysis is a rapidly growing area. It combines verbal (i.e., linguistic) and non-verbal modalities (i.e., visual, acoustic) to predict the sentiment of utterances. A recent trend has been geared towards different modality fusion models utilizing various attention, memory and recurrent components. However, there lacks a systematic investigation on how these different components contribute to solving the problem as well as their limitations. This paper aims to fill the gap, marking the following key innovations. We present the first large-scale and comprehensive empirical comparison of eleven state-of-the-art (SOTA) modality fusion approaches in two video sentiment analysis tasks, with three SOTA benchmark corpora. An in-depth analysis of the results shows that the attention mechanisms are the most effective for modelling crossmodal interactions, yet they are computationally expensive. Second, additional levels of crossmodal interaction decrease performance. Third, positive sentiment utterances are the most challenging cases for all approaches. Finally, integrating context and utilizing the linguistic modality as a pivot for non-verbal modalities improve performance. We expect that the findings would provide helpful insights and guidance to the development of more effective modality fusion models.
Published: 2021

7. Quantum-inspired multimodal fusion for video sentiment analysis

Author: Li, Qiuchi, Gkoumas, Dimitris, Lioma, Christina, Melucci, Massimo, Li, Qiuchi, Gkoumas, Dimitris, Lioma, Christina, and Melucci, Massimo
Abstract: We tackle the crucial challenge of fusing different modalities of features for multimodal sentiment analysis. Mainly based on neural networks, existing approaches largely model multimodal interactions in an implicit and hard-to-understand manner. We address this limitation with inspirations from quantum theory, which contains principled methods for modeling complicated interactions and correlations. In our quantum-inspired framework, the word interaction within a single modality and the interaction across modalities are formulated with superposition and entanglement respectively at different stages. The complex-valued neural network implementation of the framework achieves comparable results to state-of-the-art systems on two benchmarking video sentiment analysis datasets. In the meantime, we produce the unimodal and bimodal sentiment directly from the model to interpret the entangled decision.
Published: 2021

8. A Longitudinal Multi-modal Dataset for Dementia Monitoring and Diagnosis

Author: Gkoumas, Dimitris, Wang, Bo, Tsakalidis, Adam, Wolters, Maria, Zubiaga, Arkaitz, Purver, Matthew, Liakata, Maria, Gkoumas, Dimitris, Wang, Bo, Tsakalidis, Adam, Wolters, Maria, Zubiaga, Arkaitz, Purver, Matthew, and Liakata, Maria
Abstract: Dementia affects cognitive functions of adults, including memory, language, and behaviour. Standard diagnostic biomarkers such as MRI are costly, whilst neuropsychological tests suffer from sensitivity issues in detecting dementia onset. The analysis of speech and language has emerged as a promising and non-intrusive technology to diagnose and monitor dementia. Currently, most work in this direction ignores the multi-modal nature of human communication and interactive aspects of everyday conversational interaction. Moreover, most studies ignore changes in cognitive status over time due to the lack of consistent longitudinal data. Here we introduce a novel fine-grained longitudinal multi-modal corpus collected in a natural setting from healthy controls and people with dementia over two phases, each spanning 28 sessions. The corpus consists of spoken conversations, a subset of which are transcribed, as well as typed and written thoughts and associated extra-linguistic information such as pen strokes and keystrokes. We present the data collection process and describe the corpus in detail. Furthermore, we establish baselines for capturing longitudinal changes in language across different modalities for two cohorts, healthy controls and people with dementia, outlining future research directions enabled by the corpus.
Published: 2021

9. Quantum Cognitively Motivated Decision Fusion for Video Sentiment Analysis

Author: Gkoumas, Dimitris, Li, Qiuchi, Dehdashti, Shahram, Melucci, Massimo, Yu, Yijun, Song, Dawei, Gkoumas, Dimitris, Li, Qiuchi, Dehdashti, Shahram, Melucci, Massimo, Yu, Yijun, and Song, Dawei
Abstract: Video sentiment analysis as a decision-making process is inherently complex, involving the fusion of decisions from multiple modalities and the so-caused cognitive biases. Inspired by recent advances in quantum cognition, we show that the sentiment judgment from one modality could be incompatible with the judgment from another, i.e., the order matters and they cannot be jointly measured to produce a final decision. Thus the cognitive process exhibits "quantum-like" biases that cannot be captured by classical probability theories. Accordingly, we propose a fundamentally new, quantum cognitively motivated fusion strategy for predicting sentiment judgments. In particular, we formulate utterances as quantum superposition states of positive and negative sentiment judgments, and uni-modal classifiers as mutually incompatible observables, on a complex-valued Hilbert space with positive-operator valued measures. Experiments on two benchmarking datasets illustrate that our model significantly outperforms various existing decision level and a range of state-of-the-art content-level fusion approaches. The results also show that the concept of incompatibility allows effective handling of all combination patterns, including those extreme cases that are wrongly predicted by all uni-modal classifiers., Comment: The uploaded version is a preprint of the accepted AAAI-21 paper
Published: 2021

10. A Survey of Quantum Theory Inspired Approaches to Information Retrieval

Author: Uprety, Sagar, Gkoumas, Dimitris, Song, Dawei, Uprety, Sagar, Gkoumas, Dimitris, and Song, Dawei
Abstract: Since 2004, researchers have been using the mathematical framework of Quantum Theory (QT) in Information Retrieval (IR). QT offers a generalized probability and logic framework. Such a framework has been shown capable of unifying the representation, ranking and user cognitive aspects of IR, and helpful in developing more dynamic, adaptive and context-aware IR systems. Although Quantum-inspired IR is still a growing area, a wide array of work in different aspects of IR has been done and produced promising results. This paper presents a survey of the research done in this area, aiming to show the landscape of the field and draw a road-map of future directions., Comment: Accepted for publication at ACM Computing Surveys on May 20, 2020
Published: 2020

11. Exploiting 'Quantum-like Interference' in Decision Fusion for Ranking Multimodal Documents

Author: Gkoumas, Dimitris, Sogn, Dawei, Gkoumas, Dimitris, and Sogn, Dawei
Abstract: Fusing and ranking multimodal information remains always a challenging task. A robust decision-level fusion method should not only be dynamically adaptive for assigning weights to each representation but also incorporate inter-relationships among different modalities. In this paper, we propose a quantum-inspired model for fusing and ranking visual and textual information accounting for the dependency between the aforementioned modalities. At first, we calculate the text-based and image-based similarity individually. Two different approaches have been applied for computing each unimodal similarity. The first one makes use of the bag-of-words model. For the second one, a pre-trained VGG19 model on ImageNet has been used for calculating the image similarity, while a query expansion approach has been applied to the text-based query for improving the retrieval performance. Afterward, the local similarity scores fit the proposed quantum-inspired model. The inter-dependency between the two modalities is captured implicitly through "quantum interference". Finally, the documents are ranked based on the proposed similarity measurement. We test our approach on ImageCLEF2007photo data collection and show the effectiveness of the proposed approach. A series of interesting findings are discussed, which would provide theoretical and empirical foundations for future development of this direction.
Published: 2018

12. Investigating Bell Inequalities for Multidimensional Relevance Judgments in Information Retrieval

Author: Uprety, Sagar, Gkoumas, Dimitris, Song, Dawei, Uprety, Sagar, Gkoumas, Dimitris, and Song, Dawei
Abstract: Relevance judgment in Information Retrieval is influenced by multiple factors. These include not only the topicality of the documents but also other user oriented factors like trust, user interest, etc. Recent works have identified these various factors into seven dimensions of relevance. In a previous work, these relevance dimensions were quantified and user's cognitive state with respect to a document was represented as a state vector in a Hilbert Space, with each relevance dimension representing a basis. It was observed that relevance dimensions are incompatible in some documents, when making a judgment. Incompatibility being a fundamental feature of Quantum Theory, this motivated us to test the Quantum nature of relevance judgments using Bell type inequalities. However, none of the Bell-type inequalities tested have shown any violation. We discuss our methodology to construct incompatible basis for documents from real world query log data, the experiments to test Bell inequalities on this dataset and possible reasons for the lack of violation., Comment: 11th Quantum Interaction Conference, Nice, France
Published: 2018

13. Investigating non-classical correlations between decision fused multi-modal documents

Author: Gkoumas, Dimitris, Uprety, Sagar, Song, Dawei, Gkoumas, Dimitris, Uprety, Sagar, and Song, Dawei
Abstract: Correlation has been widely used to facilitate various information retrieval methods such as query expansion, relevance feedback, document clustering, and multi-modal fusion. Especially, correlation and independence are important issues when fusing different modalities that influence a multi-modal information retrieval process. The basic idea of correlation is that an observable can help predict or enhance another observable. In quantum mechanics, quantum correlation, called entanglement, is a sort of correlation between the observables measured in atomic-size particles when these particles are not necessarily collected in ensembles. In this paper, we examine a multimodal fusion scenario that might be similar to that encountered in physics by firstly measuring two observables (i.e., text-based relevance and image-based relevance) of a multi-modal document without counting on an ensemble of multi-modal documents already labeled in terms of these two variables. Then, we investigate the existence of non-classical correlations between pairs of multi-modal documents. Despite there are some basic differences between entanglement and classical correlation encountered in the macroscopic world, we investigate the existence of this kind of non-classical correlation through the Bell inequality violation. Here, we experimentally test several novel association methods in a small-scale experiment. However, in the current experiment we did not find any violation of the Bell inequality. Finally, we present a series of interesting discussions, which may provide theoretical and empirical insights and inspirations for future development of this direction.
Published: 2018

Catalog

Books, media, physical & digital resources

See catalog results

Searchworks

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources

Refine your results

13 results on '"Gkoumas, Dimitris"'

1. Quantum cognitively motivated context-aware multimodal representation learning for human language analysis

2. Feedback-aligned Mixed LLMs for Machine Language-Molecule Translation

3. ALMol: Aligned Language-Molecule Translation LLMs through Offline Preference Contrastive Optimisation

4. Reformulating NLP tasks to Capture Longitudinal Manifestation of Language Disorders in People with Dementia

5. A Digital Language Coherence Marker for Monitoring Dementia

6. What makes the difference?:An empirical comparison of fusion strategies for multimodal language analysis

7. Quantum-inspired multimodal fusion for video sentiment analysis

8. A Longitudinal Multi-modal Dataset for Dementia Monitoring and Diagnosis

9. Quantum Cognitively Motivated Decision Fusion for Video Sentiment Analysis

10. A Survey of Quantum Theory Inspired Approaches to Information Retrieval

11. Exploiting 'Quantum-like Interference' in Decision Fusion for Ranking Multimodal Documents

12. Investigating Bell Inequalities for Multidimensional Relevance Judgments in Information Retrieval

13. Investigating non-classical correlations between decision fused multi-modal documents

Catalog

Searchworks

Select search scope, currently: Articles Catalog books, media & more in Jio Institute collections Articles journal articles & other e-resources

Search

Search Constraints

Refine your results

Search Limiters

Publication Year Range

Language

Publication Type

Database

13 results on '"Gkoumas, Dimitris"'

Search Results

Catalog

Select search scope, currently: Articles

Catalog

books, media & more in Jio Institute collections

Articles

journal articles & other e-resources