Author: "Bianchi, Federico" / Database: arXiv - Searchworks@Jio Institute Digital Library Search Results

1. Belief in the Machine: Investigating Epistemological Blind Spots of Language Models

Author: Suzgun, Mirac, Gur, Tayfun, Bianchi, Federico, Ho, Daniel E., Icard, Thomas, Jurafsky, Dan, and Zou, James
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Computers and Society
Abstract: As language models (LMs) become integral to fields like healthcare, law, and journalism, their ability to differentiate between fact, belief, and knowledge is essential for reliable decision-making. Failure to grasp these distinctions can lead to significant consequences in areas such as medical diagnosis, legal judgments, and dissemination of fake news. Despite this, current literature has largely focused on more complex issues such as theory of mind, overlooking more fundamental epistemic challenges. This study systematically evaluates the epistemic reasoning capabilities of modern LMs, including GPT-4, Claude-3, and Llama-3, using a new dataset, KaBLE, consisting of 13,000 questions across 13 tasks. Our results reveal key limitations. First, while LMs achieve 86% accuracy on factual scenarios, their performance drops significantly with false scenarios, particularly in belief-related tasks. Second, LMs struggle with recognizing and affirming personal beliefs, especially when those beliefs contradict factual data, which raises concerns for applications in healthcare and counseling, where engaging with a person's beliefs is critical. Third, we identify a salient bias in how LMs process first-person versus third-person beliefs, performing better on third-person tasks (80.7%) compared to first-person tasks (54.4%). Fourth, LMs lack a robust understanding of the factive nature of knowledge, namely, that knowledge inherently requires truth. Fifth, LMs rely on linguistic cues for fact-checking and sometimes bypass the deeper reasoning. These findings highlight significant concerns about current LMs' ability to reason about truth, belief, and knowledge while emphasizing the need for advancements in these areas before broad deployment in critical sectors.
Comment: https://github.com/suzgunmirac/belief-in-the-machine
Published: 2024

2. h4rm3l: A Dynamic Benchmark of Composable Jailbreak Attacks for LLM Safety Assessment

Author: Doumbouya, Moussa Koulako Bala, Nandi, Ananjan, Poesia, Gabriel, Ghilardi, Davide, Goldie, Anna, Bianchi, Federico, Jurafsky, Dan, and Manning, Christopher D.
Subjects: Computer Science - Cryptography and Security, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computers and Society, Computer Science - Machine Learning, 68, I.2, I.2.0, I.2.1, I.2.5, I.2.7, K.6.5, K.4.2
Abstract: The safety of Large Language Models (LLMs) remains a critical concern due to a lack of adequate benchmarks for systematically evaluating their ability to resist generating harmful content. Previous efforts towards automated red teaming involve static or templated sets of illicit requests and adversarial prompts which have limited utility given jailbreak attacks' evolving and composable nature. We propose a novel dynamic benchmark of composable jailbreak attacks to move beyond static datasets and taxonomies of attacks and harms. Our approach consists of three components collectively called h4rm3l: (1) a domain-specific language that formally expresses jailbreak attacks as compositions of parameterized prompt transformation primitives, (2) bandit-based few-shot program synthesis algorithms that generate novel attacks optimized to penetrate the safety filters of a target black box LLM, and (3) open-source automated red-teaming software employing the previous two components. We use h4rm3l to generate a dataset of 2656 successful novel jailbreak attacks targeting 6 state-of-the-art (SOTA) open-source and proprietary LLMs. Several of our synthesized attacks are more effective than previously reported ones, with Attack Success Rates exceeding 90% on SOTA closed language models such as claude-3-haiku and GPT4-o. By generating datasets of jailbreak attacks in a unified formal representation, h4rm3l enables reproducible benchmarking and automated red-teaming, contributes to understanding LLM safety limitations, and supports the development of robust defenses in an increasingly LLM-integrated world. Warning: This paper and related research artifacts contain offensive and potentially disturbing prompts and model-generated content.
Published: 2024

3. TextGrad: Automatic 'Differentiation' via Text

Author: Yuksekgonul, Mert, Bianchi, Federico, Boen, Joseph, Liu, Sheng, Huang, Zhi, Guestrin, Carlos, and Zou, James
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: AI is undergoing a paradigm shift, with breakthroughs achieved by systems orchestrating multiple large language models (LLMs) and other complex components. As a result, developing principled and automated optimization methods for compound AI systems is one of the most important new challenges. Neural networks faced a similar challenge in their early days until backpropagation and automatic differentiation transformed the field by making optimization turn-key. Inspired by this, we introduce TextGrad, a powerful framework performing automatic "differentiation" via text. TextGrad backpropagates textual feedback provided by LLMs to improve individual components of a compound AI system. In our framework, LLMs provide rich, general, natural language suggestions to optimize variables in computation graphs, ranging from code snippets to molecular structures. TextGrad follows PyTorch's syntax and abstraction and is flexible and easy-to-use. It works out-of-the-box for a variety of tasks, where the users only provide the objective function without tuning components or prompts of the framework. We showcase TextGrad's effectiveness and generality across a diverse range of applications, from question answering and molecule optimization to radiotherapy treatment planning. Without modifying the framework, TextGrad improves the zero-shot accuracy of GPT-4o in Google-Proof Question Answering from 51% to 55%, yields 20% relative performance gain in optimizing LeetCode-Hard coding problem solutions, improves prompts for reasoning, designs new druglike small molecules with desirable in silico binding, and designs radiation oncology treatment plans with high specificity. TextGrad lays a foundation to accelerate the development of the next generation of AI systems.
Comment: 41 pages, 6 figures
Published: 2024

4. Large Language Models are Vulnerable to Bait-and-Switch Attacks for Generating Harmful Content

Author: Bianchi, Federico and Zou, James
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: The risks derived from large language models (LLMs) generating deceptive and damaging content have been the subject of considerable research, but even safe generations can lead to problematic downstream impacts. In our study, we shift the focus to how even safe text coming from LLMs can be easily turned into potentially dangerous content through Bait-and-Switch attacks. In such attacks, the user first prompts LLMs with safe questions and then employs a simple find-and-replace post-hoc technique to manipulate the outputs into harmful narratives. The alarming efficacy of this approach in generating toxic content highlights a significant challenge in developing reliable safety guardrails for LLMs. In particular, we stress that focusing on the safety of the verbatim LLM outputs is insufficient and that we also need to consider post-hoc transformations.
Published: 2024

5. How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis

Author: Bianchi, Federico, Chia, Patrick John, Yuksekgonul, Mert, Tagliabue, Jacopo, Jurafsky, Dan, and Zou, James
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Computer Science and Game Theory
Abstract: Negotiation is the basis of social interactions; humans negotiate everything from the price of cars to how to share common resources. With rapidly growing interest in using large language models (LLMs) to act as agents on behalf of human users, such LLM agents would also need to be able to negotiate. In this paper, we study how well LLMs can negotiate with each other. We develop NegotiationArena: a flexible framework for evaluating and probing the negotiation abilities of LLM agents. We implemented three types of scenarios in NegotiationArena to assess LLMs' behaviors in allocating shared resources (ultimatum games), aggregate resources (trading games) and buy/sell goods (price negotiations). Each scenario allows for multiple turns of flexible dialogues between LLM agents to allow for more complex negotiations. Interestingly, LLM agents can significantly boost their negotiation outcomes by employing certain behavioral tactics. For example, by pretending to be desolate and desperate, LLMs can improve their payoffs by 20% when negotiating against the standard GPT-4. We also quantify irrational negotiation behaviors exhibited by the LLM agents, many of which also appear in humans. Together, NegotiationArena offers a new environment to investigate LLM interactions, enabling new insights into LLMs' theory of mind, irrationality, and reasoning abilities.
Published: 2024

6. Vehicle-to-Grid and ancillary services: a profitability analysis under uncertainty

Author: Bianchi, Federico, Falsone, Alessandro, and Vignali, Riccardo
Subjects: Computer Science - Robotics, Electrical Engineering and Systems Science - Systems and Control
Abstract: The rapid and massive diffusion of electric vehicles poses new challenges to the electric system, which must be able to supply these new loads, but at the same time opens up new opportunities thanks to the possible provision of ancillary services. Indeed, in the so-called Vehicle-to-Grid (V2G) set-up, the charging power can be modulated throughout the day so that a fleet of vehicles can absorb an excess of power from the grid or provide extra power during a shortage. To this end, many works in the literature focus on the optimization of each vehicle's daily charging profile to offer the requested ancillary services while guaranteeing a charged battery for each vehicle at the end of the day. However, the size of the economic benefits related to the provision of ancillary services varies significantly with the modeling approaches, different assumptions, and considered scenarios. In this paper we propose a profitability analysis with reference to a recently proposed framework for V2G optimal operation in the presence of uncertainty. We provide necessary and sufficient conditions for profitability in a simplified case and we show via simulation that they also hold for the general case.
Comment: Accepted by IFAC for publication under a Creative Commons Licence CC-BY-NC-ND
Published: 2023

7. Safety-Tuned LLaMAs: Lessons From Improving the Safety of Large Language Models that Follow Instructions

Author: Bianchi, Federico, Suzgun, Mirac, Attanasio, Giuseppe, Röttger, Paul, Jurafsky, Dan, Hashimoto, Tatsunori, and Zou, James
Subjects: Computer Science - Computation and Language
Abstract: Training large language models to follow instructions makes them perform better on a wide range of tasks and generally become more helpful. However, a perfectly helpful model will follow even the most malicious instructions and readily generate harmful content. In this paper, we raise concerns over the safety of models that only emphasize helpfulness, not harmlessness, in their instruction-tuning. We show that several popular instruction-tuned models are highly unsafe. Moreover, we show that adding just 3% safety examples (a few hundred demonstrations) when fine-tuning a model like LLaMA can substantially improve its safety. Our safety-tuning does not make models significantly less capable or helpful as measured by standard benchmarks. However, we do find exaggerated safety behaviours, where too much safety-tuning makes models refuse perfectly safe prompts if they superficially resemble unsafe ones. As a whole, our results illustrate trade-offs in training LLMs to be helpful and training them to be safe.
Published: 2023

8. XSTest: A Test Suite for Identifying Exaggerated Safety Behaviours in Large Language Models

Author: Röttger, Paul, Kirk, Hannah Rose, Vidgen, Bertie, Attanasio, Giuseppe, Bianchi, Federico, and Hovy, Dirk
Subjects: Computer Science - Computation and Language, Computer Science - Artificial Intelligence
Abstract: Without proper safeguards, large language models will readily follow malicious instructions and generate toxic content. This risk motivates safety efforts such as red-teaming and large-scale feedback learning, which aim to make models both helpful and harmless. However, there is a tension between these two objectives, since harmlessness requires models to refuse to comply with unsafe prompts, and thus not be helpful. Recent anecdotal evidence suggests that some models may have struck a poor balance, so that even clearly safe prompts are refused if they use similar language to unsafe prompts or mention sensitive topics. In this paper, we introduce a new test suite called XSTest to identify such eXaggerated Safety behaviours in a systematic way. XSTest comprises 250 safe prompts across ten prompt types that well-calibrated models should not refuse to comply with, and 200 unsafe prompts as contrasts that models, for most applications, should refuse. We describe XSTest's creation and composition, and then use the test suite to highlight systematic failure modes in state-of-the-art language models as well as more general challenges in building safer language models.
Comment: Accepted at NAACL 2024 (Main Conference)
Published: 2023

9. E Pluribus Unum: Guidelines on Multi-Objective Evaluation of Recommender Systems

Author: Chia, Patrick John, Attanasio, Giuseppe, Tagliabue, Jacopo, Bianchi, Federico, Greco, Ciro, Moreira, Gabriel de Souza P., Eynard, Davide, and Husain, Fahd
Subjects: Computer Science - Information Retrieval
Abstract: Recommender Systems today are still mostly evaluated in terms of accuracy, with other aspects beyond the immediate relevance of recommendations, such as diversity, long-term user retention and fairness, often taking a back seat. Moreover, reconciling multiple performance perspectives is by definition indeterminate, presenting a stumbling block to those in the pursuit of rounded evaluation of Recommender Systems. EvalRS 2022 -- a data challenge designed around Multi-Objective Evaluation -- was a first practical endeavour, providing many insights into the requirements and challenges of balancing multiple objectives in evaluation. In this work, we reflect on EvalRS 2022 and expound upon crucial learnings to formulate a first-principles approach toward Multi-Objective model selection, and outline a set of guidelines for carrying out a Multi-Objective Evaluation challenge, with potential applicability to the problem of rounded evaluation of competing models in real-world deployments.
Comment: 15 pages, under submission
Published: 2023

10. EvalRS 2023. Well-Rounded Recommender Systems For Real-World Deployments

Author: Bianchi, Federico, Chia, Patrick John, Greco, Ciro, Pomo, Claudio, Moreira, Gabriel, Eynard, Davide, Husain, Fahd, and Tagliabue, Jacopo
Subjects: Computer Science - Information Retrieval, Computer Science - Computers and Society
Abstract: EvalRS aims to bring together practitioners from industry and academia to foster a debate on rounded evaluation of recommender systems, with a focus on real-world impact across a multitude of deployment scenarios. Recommender systems are often evaluated only through accuracy metrics, which fall short of fully characterizing their generalization capabilities and miss important aspects, such as fairness, bias, usefulness, informativeness. This workshop builds on the success of last year's workshop at CIKM, but with a broader scope and an interactive format.
Comment: EvalRS 2023 is a workshop at KDD23. Code and hackathon materials: https://github.com/RecList/evalRS-KDD-2023
Published: 2023

11. Oxidized organic molecules in the tropical free troposphere over Amazonia

Author: Zha, Qiaozhi, Aliaga, Diego, Krejci, Radovan, Sinclair, Victoria, Wu, Cheng, Scholz, Wiebke, Heikkinen, Liine, Partoll, Eva, Gramlich, Yvette, Huang, Wei, Leiminger, Markus, Enroth, Joonas, Peräkylä, Otso, Cai, Runlong, Chen, Xuemeng, Koenig, Alkuin Maximilian, Velarde, Fernando, Moreno, Isabel, Petäjä, Tuukka, Artaxo, Paulo, Laj, Paolo, Hansel, Armin, Carbone, Samara, Kulmala, Markku, Andrade, Marcos, Worsnop, Douglas, Mohr, Claudia, and Bianchi, Federico
Subjects: Physics - Atmospheric and Oceanic Physics, Astrophysics - Earth and Planetary Astrophysics, Physics - Geophysics
Abstract: New particle formation (NPF) in the tropical free troposphere (FT) is a globally important source of cloud condensation nuclei, affecting cloud properties and climate. Oxidized organic molecules (OOMs) produced from biogenic volatile organic compounds are believed to contribute to aerosol formation in the tropical FT, but without direct chemical observations. We performed in-situ molecular-level OOMs measurements at the Bolivian station Chacaltaya at 5240 meters above sea level, on the western edge of Amazonia. For the first time, we demonstrate the presence of OOMs, mainly with 4-5 carbon atoms, simultaneously in both gas and particulate phases in tropical FT air from Amazonia. These observations, combined with air mass history analyses, indicate that the observed OOMs are linked to isoprene emitted from the rainforests hundreds of kilometers away. Based on particle-phase measurements, we find that these compounds can contribute to the growth of newly formed particles, and are potentially crucial for new particle formation in the tropical free troposphere on a continental scale. Our study will thus improve the understanding of aerosol formation process in the tropics.
Published: 2023

12. Beyond Digital 'Echo Chambers': The Role of Viewpoint Diversity in Political Discussion

Author: Hada, Rishav, Fard, Amir Ebrahimi, Shugars, Sarah, Bianchi, Federico, Rossini, Patricia, Hovy, Dirk, Tromble, Rebekah, and Tintarev, Nava
Subjects: Computer Science - Computation and Language
Abstract: Increasingly taking place in online spaces, modern political conversations are typically perceived to be unproductively affirming -- siloed in so-called "echo chambers" of exclusively like-minded discussants. Yet, to date we lack sufficient means to measure viewpoint diversity in conversations. To this end, in this paper, we operationalize two viewpoint metrics proposed for recommender systems and adapt them to the context of social media conversations. This is the first study to apply these two metrics (Representation and Fragmentation) to real-world data and to consider the implications for online conversations specifically. We apply these measures to two topics -- daylight savings time (DST), which serves as a control, and the more politically polarized topic of immigration. We find that the diversity scores for both Fragmentation and Representation are lower for immigration than for DST. Further, we find that while pro-immigrant views receive consistent pushback on the platform, anti-immigrant views largely operate within echo chambers. We observe less severe yet similar patterns for DST. Taken together, Representation and Fragmentation paint a meaningful and important new picture of viewpoint diversity.
Comment: Camera-ready version in WSDM 2023
Published: 2022

13. SocioProbe: What, When, and Where Language Models Learn about Sociodemographics

Author: Lauscher, Anne, Bianchi, Federico, Bowman, Samuel, and Hovy, Dirk
Subjects: Computer Science - Computation and Language
Abstract: Pre-trained language models (PLMs) have outperformed other NLP models on a wide range of tasks. Opting for a more thorough understanding of their capabilities and inner workings, researchers have established the extent to which they capture lower-level knowledge like grammaticality, and mid-level semantic knowledge like factual understanding. However, there is still little understanding of their knowledge of higher-level aspects of language. In particular, despite the importance of sociodemographic aspects in shaping our language, the questions of whether, where, and how PLMs encode these aspects, e.g., gender or age, are still unexplored. We address this research gap by probing the sociodemographic knowledge of different single-GPU PLMs on multiple English data sets via traditional classifier probing and information-theoretic minimum description length probing. Our results show that PLMs do encode these sociodemographics, and that this knowledge is sometimes spread across the layers of some of the tested PLMs. We further conduct a multilingual analysis and investigate the effect of supplementary training to further explore to what extent, where, and with what amount of pre-training data the knowledge is encoded. Our overall results indicate that sociodemographic knowledge is still a major challenge for NLP. PLMs require large amounts of pre-training data to acquire the knowledge and models that excel in general language understanding do not seem to own more knowledge about these aspects.
Comment: Accepted for publication at EMNLP 2022
Published: 2022

14. Easily Accessible Text-to-Image Generation Amplifies Demographic Stereotypes at Large Scale

Author: Bianchi, Federico, Kalluri, Pratyusha, Durmus, Esin, Ladhak, Faisal, Cheng, Myra, Nozza, Debora, Hashimoto, Tatsunori, Jurafsky, Dan, Zou, James, and Caliskan, Aylin
Subjects: Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
Abstract: Machine learning models that convert user-written text descriptions into images are now widely available online and used by millions of users to generate millions of images a day. We investigate the potential for these models to amplify dangerous and complex stereotypes. We find a broad range of ordinary prompts produce stereotypes, including prompts simply mentioning traits, descriptors, occupations, or objects. For example, we find cases of prompting for basic traits or social roles resulting in images reinforcing whiteness as ideal, prompting for occupations resulting in amplification of racial and gender disparities, and prompting for objects resulting in reification of American norms. Stereotypes are present regardless of whether prompts explicitly mention identity and demographic language or avoid such language. Moreover, stereotypes persist despite mitigation strategies; neither user attempts to counter stereotypes by requesting images with specific counter-stereotypes nor institutional attempts to add system "guardrails" have prevented the perpetuation of stereotypes. Our analysis justifies concerns regarding the impacts of today's models, presenting striking exemplars, and connecting these findings with deep insights into harms drawn from social scientific and humanist disciplines. This work contributes to the effort to shed light on the uniquely complex biases in language-vision models and demonstrates the ways that the mass deployment of text-to-image generation models results in mass dissemination of stereotypes and resulting harms.
Comment: FAccT 2023 paper. The published version is available at 10.1145/3593013.3594095
Published: 2022

15. 'It's Not Just Hate': A Multi-Dimensional Perspective on Detecting Harmful Speech Online

Author: Bianchi, Federico, Hills, Stefanie Anja, Rossini, Patricia, Hovy, Dirk, Tromble, Rebekah, and Tintarev, Nava
Subjects: Computer Science - Computation and Language
Abstract: Well-annotated data is a prerequisite for good Natural Language Processing models. Too often, though, annotation decisions are governed by optimizing time or annotator agreement. We make a case for nuanced efforts in an interdisciplinary setting for annotating offensive online speech. Detecting offensive content is rapidly becoming one of the most important real-world NLP tasks. However, most datasets use a single binary label, e.g., for hate or incivility, even though each concept is multi-faceted. This modeling choice severely limits nuanced insights, but also performance. We show that a more fine-grained multi-label approach to predicting incivility and hateful or intolerant content addresses both conceptual and performance issues. We release a novel dataset of over 40,000 tweets about immigration from the US and UK, annotated with six labels for different aspects of incivility and intolerance. Our dataset not only allows for a more nuanced understanding of harmful speech online, models trained on it also outperform or match performance on benchmark datasets.
Comment: EMNLP 2022
Published: 2022

16. ProSiT! Latent Variable Discovery with PROgressive SImilarity Thresholds

Author: Fornaciari, Tommaso, Hovy, Dirk, and Bianchi, Federico
Subjects: Computer Science - Computation and Language
Abstract: The most common ways to explore latent document dimensions are topic models and clustering methods. However, topic models have several drawbacks: e.g., they require us to choose the number of latent dimensions a priori, and the results are stochastic. Most clustering methods have the same issues and lack flexibility in various ways, such as not accounting for the influence of different topics on single documents, forcing word-descriptors to belong to a single topic (hard-clustering) or necessarily relying on word representations. We propose PROgressive SImilarity Thresholds - ProSiT, a deterministic and interpretable method, agnostic to the input format, that finds the optimal number of latent dimensions and only has two hyper-parameters, which can be set efficiently via grid search. We compare this method with a wide range of topic models and clustering methods on four benchmark data sets. In most settings, ProSiT matches or outperforms the other methods in terms of six metrics of topic coherence and distinctiveness, producing replicable, deterministic results.
Published: 2022

17. Data-Efficient Strategies for Expanding Hate Speech Detection into Under-Resourced Languages

Author: Röttger, Paul, Nozza, Debora, Bianchi, Federico, and Hovy, Dirk
Subjects: Computer Science - Computation and Language
Abstract: Hate speech is a global phenomenon, but most hate speech datasets so far focus on English-language content. This hinders the development of more effective hate speech detection models in hundreds of languages spoken by billions across the world. More data is needed, but annotating hateful content is expensive, time-consuming and potentially harmful to annotators. To mitigate these issues, we explore data-efficient strategies for expanding hate speech detection into under-resourced languages. In a series of experiments with mono- and multilingual models across five non-English languages, we find that 1) a small amount of target-language fine-tuning data is needed to achieve strong performance, 2) the benefits of using more such data decrease exponentially, and 3) initial fine-tuning on readily-available English data can partially substitute target-language data and improve model generalisability. Based on these findings, we formulate actionable recommendations for hate speech detection in low-resource language settings.
Comment: Accepted at EMNLP 2022 (Main Conference)
Published: 2022

18. Is It Worth the (Environmental) Cost? Limited Evidence for Temporal Adaptation via Continuous Training

Author: Attanasio, Giuseppe, Nozza, Debora, Bianchi, Federico, and Hovy, Dirk
Subjects: Computer Science - Computation and Language
Abstract: Language is constantly changing and evolving, leaving language models to become quickly outdated. Consequently, we should continuously update our models with new data to expose them to new events and facts. However, that requires additional computing, which means new carbon emissions. Do any measurable benefits justify this cost? This paper looks for empirical evidence to support continuous training. We reproduce existing benchmarks and extend them to include additional time periods, models, and tasks. Our results show that the downstream task performance of temporally adapted English models for social media data does not improve over time. Pretrained models without temporal adaptation are actually significantly more effective and efficient. However, we also note a lack of suitable temporal benchmarks. Our findings invite a critical reflection on when and how to temporally adapt language models, accounting for sustainability.
Comment: 8 pages
Published: 2022

19. When and why vision-language models behave like bags-of-words, and what to do about it?

Author: Yuksekgonul, Mert, Bianchi, Federico, Kalluri, Pratyusha, Jurafsky, Dan, and Zou, James
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence, Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Despite the success of large vision and language models (VLMs) in many downstream applications, it is unclear how well they encode compositional information. Here, we create the Attribution, Relation, and Order (ARO) benchmark to systematically evaluate the ability of VLMs to understand different types of relationships, attributes, and order. ARO consists of Visual Genome Attribution, to test the understanding of objects' properties; Visual Genome Relation, to test for relational understanding; and COCO & Flickr30k-Order, to test for order sensitivity. ARO is orders of magnitude larger than previous benchmarks of compositionality, with more than 50,000 test cases. We show where state-of-the-art VLMs have poor relational understanding, can blunder when linking objects to their attributes, and demonstrate a severe lack of order sensitivity. VLMs are predominantly trained and evaluated on large datasets with rich compositional structure in the images and captions. Yet, training on these datasets has not been enough to address the lack of compositional understanding, and evaluating on these datasets has failed to surface this deficiency. To understand why these limitations emerge and are not represented in the standard tests, we zoom into the evaluation and training procedures. We demonstrate that it is possible to perform well on retrieval over existing datasets without using the composition and order information. Given that contrastive pretraining optimizes for retrieval on datasets with similar shortcuts, we hypothesize that this can explain why the models do not need to learn to represent compositional information. This finding suggests a natural solution: composition-aware hard negative mining. We show that a simple-to-implement modification of contrastive learning significantly improves the performance on tasks requiring understanding of order and compositionality.
Comment: ICLR 2023 Oral (notable-top-5%)
Published: 2022

20. Real-Time Oil Leakage Detection on Aftermarket Motorcycle Damping System with Convolutional Neural Networks

Author: Bianchi, Federico, Speziali, Stefano, Marini, Andrea, Proietti, Massimiliano, Menculini, Lorenzo, Garinei, Alberto, Bellani, Gabriele, and Marconi, Marcello
Subjects: Computer Science - Computer Vision and Pattern Recognition
Abstract: In this work, we describe in detail how Deep Learning and Computer Vision can help to detect fault events of the AirTender system, an aftermarket motorcycle damping system component. One of the most effective ways to monitor the AirTender functioning is to look for oil stains on its surface. Starting from real-time images, AirTender is first detected in the motorbike suspension system, simulated indoors, and then a binary classifier determines whether AirTender is spilling oil or not. The detection is made with the help of the Yolo5 architecture, whereas the classification is carried out with the help of a suitably designed Convolutional Neural Network, OilNet40. In order to detect oil leaks more clearly, we dilute the oil in AirTender with a fluorescent dye with an excitation wavelength peak of approximately 390 nm. AirTender is then illuminated with suitable UV LEDs. The whole system is an attempt to design a low-cost detection setup. An on-board device, such as a mini-computer, is placed near the suspension system and connected to a Full HD camera framing AirTender. The on-board device, through our Neural Network algorithm, is then able to localize and classify AirTender as normally functioning (non-leak image) or anomalous (leak image).
Comment: analysis of literature reviewed, n.2 figures added, minor corrections
Published: 2022

21. EvalRS: a Rounded Evaluation of Recommender Systems

Author: Tagliabue, Jacopo, Bianchi, Federico, Schnabel, Tobias, Attanasio, Giuseppe, Greco, Ciro, Moreira, Gabriel de Souza P., and Chia, Patrick John
Subjects: Computer Science - Information Retrieval
Abstract: Much of the complexity of Recommender Systems (RSs) comes from the fact that they are used as part of more complex applications and affect user experience through a varied range of user interfaces. However, research focused almost exclusively on the ability of RSs to produce accurate item rankings while giving little attention to the evaluation of RS behavior in real-world scenarios. Such narrow focus has limited the capacity of RSs to have a lasting impact in the real world and makes them vulnerable to undesired behavior, such as reinforcing data biases. We propose EvalRS as a new type of challenge, in order to foster this discussion among practitioners and build in the open new methodologies for testing RSs "in the wild"., Comment: CIKM 2022 Data Challenge Paper
Published: 2022

22. Contrastive language and vision learning of general fashion concepts

Author: Chia, Patrick John, Attanasio, Giuseppe, Bianchi, Federico, Terragni, Silvia, Magalhães, Ana Rita, Goncalves, Diogo, Greco, Ciro, and Tagliabue, Jacopo
Subjects: Computer Science - Information Retrieval, Computer Science - Computation and Language
Abstract: The steady rise of online shopping goes hand in hand with the development of increasingly complex ML and NLP models. While most use cases are cast as specialized supervised learning problems, we argue that practitioners would greatly benefit from more transferable representations of products. In this work, we build on recent developments in contrastive learning to train FashionCLIP, a CLIP-like model for the fashion industry. We showcase its capabilities for retrieval, classification and grounding, and release our model and code to the community., Comment: Latest version available at https://www.nature.com/articles/s41598-022-23052-9; model available at https://huggingface.co/patrickjohncyh/fashion-clip
Published: 2022

23. 'Does it come in black?' CLIP-like models are zero-shot recommenders

Author: Chia, Patrick John, Tagliabue, Jacopo, Bianchi, Federico, Greco, Ciro, and Goncalves, Diogo
Subjects: Computer Science - Information Retrieval, Computer Science - Artificial Intelligence
Abstract: Product discovery is a crucial component for online shopping. However, item-to-item recommendations today do not allow users to explore changes along selected dimensions: given a query item, can a model suggest something similar but in a different color? We consider item recommendations of the comparative nature (e.g. "something darker") and show how CLIP-based models can support this use case in a zero-shot manner. Leveraging a large model built for fashion, we introduce GradREC and its industry potential, and offer a first rounded assessment of its strengths and weaknesses., Comment: Accepted at ACL 2022 (ECNLP)
Published: 2022

24. Twitter-Demographer: A Flow-based Tool to Enrich Twitter Data

Author: Bianchi, Federico, Cutrona, Vincenzo, and Hovy, Dirk
Subjects: Computer Science - Computation and Language
Abstract: Twitter data have become essential to Natural Language Processing (NLP) and social science research, driving various scientific discoveries in recent years. However, the textual data alone are often not enough to conduct studies: especially social scientists need more variables to perform their analysis and control for various factors. How we augment this information, such as users' location, age, or tweet sentiment, has ramifications for anonymity and reproducibility, and requires dedicated effort. This paper describes Twitter-Demographer, a simple, flow-based tool to enrich Twitter data with additional information about tweets and users. Twitter-Demographer is aimed at NLP practitioners and (computational) social scientists who want to enrich their datasets with aggregated information, facilitating reproducibility, and providing algorithmic privacy-by-design measures for pseudo-anonymity. We discuss our design choices, inspired by the flow-based programming paradigm, to use black-box components that can easily be chained together and extended. We also analyze the ethical issues related to the use of this tool, and the built-in measures to facilitate pseudo-anonymity.
Published: 2022

25. Beyond NDCG: behavioral testing of recommender systems with RecList

Author: Chia, Patrick John, Tagliabue, Jacopo, Bianchi, Federico, He, Chloe, and Ko, Brian
Subjects: Computer Science - Information Retrieval, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: As with most Machine Learning systems, recommender systems are typically evaluated through performance metrics computed over held-out data points. However, real-world behavior is undoubtedly nuanced: ad hoc error analysis and deployment-specific tests must be employed to ensure the desired quality in actual deployments. In this paper, we propose RecList, a behavioral-based testing methodology. RecList organizes recommender systems by use case and introduces a general plug-and-play procedure to scale up behavioral testing. We demonstrate its capabilities by analyzing known algorithms and black-box commercial systems, and we release RecList as an open source, extensible package for the community., Comment: Paper accepted to the WebConf 2022
Published: 2021

26. Language Invariant Properties in Natural Language Processing

Author: Bianchi, Federico, Nozza, Debora, and Hovy, Dirk
Subjects: Computer Science - Computation and Language
Abstract: Meaning is context-dependent, but many properties of language (should) remain the same even if we transform the context. For example, sentiment, entailment, or speaker properties should be the same in a translation and original of a text. We introduce language invariant properties: i.e., properties that should not change when we transform text, and how they can be used to quantitatively evaluate the robustness of transformation algorithms. We use translation and paraphrasing as transformation examples, but our findings apply more broadly to any transformation. Our results indicate that many NLP transformations change properties like author characteristics, i.e., make them sound more male. We believe that studying these properties will allow NLP to address both social factors and pragmatic aspects of language. We also release an application suite that can be used to evaluate the invariance of transformation applications.
Published: 2021

27. SWEAT: Scoring Polarization of Topics across Different Corpora

Author: Bianchi, Federico, Marelli, Marco, Nicoli, Paolo, and Palmonari, Matteo
Subjects: Computer Science - Computation and Language
Abstract: Understanding differences of viewpoints across corpora is a fundamental task for computational social sciences. In this paper, we propose the Sliced Word Embedding Association Test (SWEAT), a novel statistical measure to compute the relative polarization of a topical wordset across two distributional representations. To this end, SWEAT uses two additional wordsets, deemed to have opposite valence, to represent two different poles. We validate our approach and illustrate a case study to show the usefulness of the introduced measure., Comment: Published as a conference paper at EMNLP2021
Published: 2021

28. Contrastive Language-Image Pre-training for the Italian Language

Author: Bianchi, Federico, Attanasio, Giuseppe, Pisoni, Raphael, Terragni, Silvia, Sarti, Gabriele, and Lakshmi, Sri
Subjects: Computer Science - Computation and Language, Computer Science - Computer Vision and Pattern Recognition
Abstract: CLIP (Contrastive Language-Image Pre-training) is a very recent multi-modal model that jointly learns representations of images and texts. The model is trained on a massive amount of English data and shows impressive performance on zero-shot classification tasks. Training the same model on a different language is not trivial, since data in other languages might not be enough and the model needs high-quality translations of the texts to guarantee a good performance. In this paper, we present the first CLIP model for the Italian Language (CLIP-Italian), trained on more than 1.4 million image-text pairs. Results show that CLIP-Italian outperforms the multilingual CLIP model on the tasks of image retrieval and zero-shot classification.
Published: 2021

29. Solving Sensor Placement Problems In Real Water Distribution Networks Using Adiabatic Quantum Computation

Author: Speziali, Stefano, Bianchi, Federico, Marini, Andrea, Menculini, Lorenzo, Proietti, Massimiliano, Termite, Loris F., Garinei, Alberto, Marconi, Marcello, and Delogu, Andrea
Subjects: Mathematics - Optimization and Control, Quantum Physics
Abstract: Quantum annealing has emerged in the last few years as a promising quantum computing approach to solving large-scale combinatorial optimization problems. In this paper, we formulate the problem of correctly placing pressure sensors on a Water Distribution Network (WDN) as a combinatorial optimization problem in the form of a Quadratic Unconstrained Binary Optimization (QUBO) or Ising model. Optimal sensor placement is indeed key to detect and isolate fault events. We outline the QUBO and Ising formulations for the sensor placement problem starting from the network topology and a few other features. We present a detailed procedure to solve the problem by minimizing its Hamiltonian using PyQUBO, an open-source Python Library. We then apply our methods to the case of a real Water Distribution Network. Both simulated annealing and a hybrid quantum-classical approach on a D-Wave machine are employed., Comment: 10 pages, 3 figures. v2: minor corrections
Published: 2021

30. Multiclass classification of dephasing channels

Author: Palmieri, Adriano M., Bianchi, Federico, Paris, Matteo G. A., and Benedetti, Claudia
Subjects: Quantum Physics
Abstract: We address the use of neural networks (NNs) in classifying the environmental parameters of single-qubit dephasing channels. In particular, we investigate the performance of linear perceptrons and of two non-linear NN architectures. At variance with time-series-based approaches, our goal is to learn a discretized probability distribution over the parameters using tomographic data at just two random instants of time. We consider dephasing channels originating either from classical 1/f^α noise or from the interaction with a bath of quantum oscillators. The parameters to be classified are the color α of the classical noise or the Ohmicity parameter s of the quantum environment. In both cases, we found that NNs are able to exactly classify parameters into 16 classes using noiseless data (a linear NN is enough for the color, whereas a single-layer NN is needed for the Ohmicity). In the presence of noisy data (e.g. coming from noisy tomographic measurements), the network is able to classify the color of the 1/f^α noise into 16 classes with about 70% accuracy, whereas classification of Ohmicity turns out to be challenging. We also consider a more coarse-grained task, and train the network to discriminate between two macro-classes corresponding to α ≶ 1 and s ≶ 1, obtaining up to 96% and 79% accuracy using single-layer NNs.
Published: 2021

31. SIGIR 2021 E-Commerce Workshop Data Challenge

Author: Tagliabue, Jacopo, Greco, Ciro, Roy, Jean-Francis, Yu, Bingqing, Chia, Patrick John, Bianchi, Federico, and Cassani, Giovanni
Subjects: Computer Science - Information Retrieval
Abstract: The 2021 SIGIR workshop on eCommerce is hosting the Coveo Data Challenge for "In-session prediction for purchase intent and recommendations". The challenge addresses the growing need for reliable predictions within the boundaries of a shopping session, as customer intentions can be different depending on the occasion. The need for efficient procedures for personalization is even clearer if we consider the e-commerce landscape more broadly: outside of giant digital retailers, the constraints of the problem are stricter, due to smaller user bases and the realization that most users are not frequently returning customers. We release a new session-based dataset including more than 30M fine-grained browsing events (product detail, add, purchase), enriched by linguistic behavior (queries made by shoppers, with items clicked and items not clicked after the query) and catalog meta-data (images, text, pricing information). On this dataset, we ask participants to showcase innovative solutions for two open problems: a recommendation task (where a model is shown some events at the start of a session, and it is asked to predict future product interactions); an intent prediction task, where a model is shown a session containing an add-to-cart event, and it is asked to predict whether the item will be bought before the end of the session., Comment: SIGIR eCOM 2021 Data Challenge
Published: 2021

32. Language in a (Search) Box: Grounding Language Learning in Real-World Human-Machine Interaction

Author: Bianchi, Federico, Greco, Ciro, and Tagliabue, Jacopo
Subjects: Computer Science - Computation and Language
Abstract: We investigate grounded language learning through real-world data, by modelling a teacher-learner dynamics through the natural interactions occurring between users and search engines; in particular, we explore the emergence of semantic generalization from unsupervised dense representations outside of synthetic environments. A grounding domain, a denotation function and a composition function are learned from user data only. We show how the resulting semantics for noun phrases exhibits compositional properties while being fully learnable without any explicit labelling. We benchmark our grounded semantics on compositionality and zero-shot inference tasks, and we show that it provides better results and better generalizations than SOTA non-grounded models, such as word2vec and BERT., Comment: Published as a conference paper at NAACL2021
Published: 2021

33. Query2Prod2Vec Grounded Word Embeddings for eCommerce

Author: Bianchi, Federico, Tagliabue, Jacopo, and Yu, Bingqing
Subjects: Computer Science - Information Retrieval, Computer Science - Machine Learning
Abstract: We present Query2Prod2Vec, a model that grounds lexical representations for product search in product embeddings: in our model, meaning is a mapping between words and a latent space of products in a digital shop. We leverage shopping sessions to learn the underlying space and use merchandising annotations to build lexical analogies for evaluation: our experiments show that our model is more accurate than known techniques from the NLP and IR literature. Finally, we stress the importance of data efficiency for product search outside of retail giants, and highlight how Query2Prod2Vec fits with practical constraints faced by most practitioners., Comment: Published as a conference paper at NAACL2021 - Industry Track
Published: 2021

34. BERT Goes Shopping: Comparing Distributional Models for Product Representations

Author: Bianchi, Federico, Yu, Bingqing, and Tagliabue, Jacopo
Subjects: Computer Science - Computation and Language, Computer Science - Information Retrieval
Abstract: Word embeddings (e.g., word2vec) have been applied successfully to eCommerce products through prod2vec. Inspired by the recent performance improvements on several NLP tasks brought by contextualized embeddings, we propose to transfer BERT-like architectures to eCommerce: our model -- Prod2BERT -- is trained to generate representations of products through masked session modeling. Through extensive experiments over multiple shops, different tasks, and a range of design choices, we systematically compare the accuracy of Prod2BERT and prod2vec embeddings: while Prod2BERT is found to be superior in several scenarios, we highlight the importance of resources and hyperparameters in the best performing models. Finally, we provide guidelines to practitioners for training embeddings under a variety of computational and data constraints., Comment: Updated version. Published as a workshop paper at ECNLP 4 at ACL-IJCNLP 2021
Published: 2020

35. Fantastic Embeddings and How to Align Them: Zero-Shot Inference in a Multi-Shop Scenario

Author: Bianchi, Federico, Tagliabue, Jacopo, Yu, Bingqing, Bigon, Luca, and Greco, Ciro
Subjects: Computer Science - Information Retrieval, Computer Science - Machine Learning
Abstract: This paper addresses the challenge of leveraging multiple embedding spaces for multi-shop personalization, proving that zero-shot inference is possible by transferring shopping intent from one website to another without manual intervention. We detail a machine learning pipeline to train and optimize embeddings within shops first, and support the quantitative findings with additional qualitative insights. We then turn to the harder task of using learned embeddings across shops: if products from different shops live in the same vector space, user intent - as represented by regions in this space - can then be transferred in a zero-shot fashion across websites. We propose and benchmark unsupervised and supervised methods to "travel" between embedding spaces, each with its own assumptions on data quantity and quality. We show that zero-shot personalization is indeed possible at scale by testing the shared embedding space with two downstream tasks, event prediction and type-ahead suggestions. Finally, we curate a cross-shop anonymized embeddings dataset to foster an inclusive discussion of this important business scenario., Comment: accepted at 2020 SIGIR Workshop On eCommerce
Published: 2020

36. Knowledge Graph Embeddings and Explainable AI

Author: Bianchi, Federico, Rossiello, Gaetano, Costabello, Luca, Palmonari, Matteo, and Minervini, Pasquale
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Knowledge graph embeddings are now a widely adopted approach to knowledge representation in which entities and relationships are embedded in vector spaces. In this chapter, we introduce the reader to the concept of knowledge graph embeddings by explaining what they are, how they can be generated and how they can be evaluated. We summarize the state-of-the-art in this field by describing the approaches that have been introduced to represent knowledge in the vector space. In relation to knowledge representation, we consider the problem of explainability, and discuss models and methods for explaining predictions obtained via knowledge graph embeddings., Comment: Federico Bianchi, Gaetano Rossiello, Luca Costabello, Matteo Palmonari, Pasquale Minervini, Knowledge Graph Embeddings and Explainable AI. In: Ilaria Tiddi, Freddy Lecue, Pascal Hitzler (eds.), Knowledge Graphs for eXplainable AI -- Foundations, Applications and Challenges. Studies on the Semantic Web, IOS Press, Amsterdam, 2020
Published: 2020

37. Cross-lingual Contextualized Topic Models with Zero-shot Learning

Author: Bianchi, Federico, Terragni, Silvia, Hovy, Dirk, Nozza, Debora, and Fersini, Elisabetta
Subjects: Computer Science - Computation and Language
Abstract: Many data sets (e.g., reviews, forums, news, etc.) exist parallelly in multiple languages. They all cover the same content, but the linguistic differences make it impossible to use traditional, bag-of-word-based topic models. Models have to be either single-language or suffer from a huge, but extremely sparse vocabulary. Both issues can be addressed by transfer learning. In this paper, we introduce a zero-shot cross-lingual topic model. Our model learns topics on one language (here, English), and predicts them for unseen documents in different languages (here, Italian, French, German, and Portuguese). We evaluate the quality of the topic predictions for the same document in different languages. Our results show that the transferred topics are coherent and stable across languages, which suggests exciting future research directions., Comment: Updated version. Published as a conference paper at EACL2021
Published: 2020

38. Compass-aligned Distributional Embeddings for Studying Semantic Differences across Corpora

Author: Bianchi, Federico, Di Carlo, Valerio, Nicoli, Paolo, and Palmonari, Matteo
Subjects: Computer Science - Artificial Intelligence, Computer Science - Computation and Language
Abstract: Word2vec is one of the most used algorithms to generate word embeddings because of a good mix of efficiency, quality of the generated representations and cognitive grounding. However, word meaning is not static and depends on the context in which words are used. Differences in word meaning that depend on time, location, topic, and other factors can be studied by analyzing embeddings generated from different corpora in collections that are representative of these factors. For example, language evolution can be studied using a collection of news articles published in different time periods. In this paper, we present a general framework to support cross-corpora language studies with word embeddings, where embeddings generated from different corpora can be compared to find correspondences and differences in meaning across the corpora. CADE is the core component of our framework and solves the key problem of aligning the embeddings generated from different corpora. In particular, we focus on providing solid evidence about the effectiveness, generality, and robustness of CADE. To this end, we conduct quantitative and qualitative experiments in different domains, from temporal word embeddings to language localization and topical analysis. The results of our experiments suggest that CADE achieves state-of-the-art or superior performance on tasks where several competing approaches are available, yet providing a general method that can be used in a variety of domains. Finally, our experiments shed light on the conditions under which the alignment is reliable, which substantially depends on the degree of cross-corpora vocabulary overlap., Comment: arXiv admin note: text overlap with arXiv:1906.02376
Published: 2020

39. Pre-training is a Hot Topic: Contextualized Document Embeddings Improve Topic Coherence

Author: Bianchi, Federico, Terragni, Silvia, and Hovy, Dirk
Subjects: Computer Science - Computation and Language
Abstract: Topic models extract groups of words from documents, whose interpretation as a topic hopefully allows for a better understanding of the data. However, the resulting word groups are often not coherent, making them harder to interpret. Recently, neural topic models have shown improvements in overall coherence. Concurrently, contextual embeddings have advanced the state of the art of neural models in general. In this paper, we combine contextualized representations with neural topic models. We find that our approach produces more meaningful and coherent topics than traditional bag-of-words topic models and recent neural models. Our results indicate that future improvements in language models will translate into better topic models., Comment: Updated version. Published as a conference paper at ACL-IJCNLP 2021
Published: 2020

40. 'An Image is Worth a Thousand Features': Scalable Product Representations for In-Session Type-Ahead Personalization

Author: Yu, Bingqing, Tagliabue, Jacopo, Greco, Ciro, and Bianchi, Federico
Subjects: Computer Science - Information Retrieval, Computer Science - Machine Learning, Statistics - Machine Learning, I.2.6, I.2.7
Abstract: We address the problem of personalizing query completion in a digital commerce setting, in which the bounce rate is typically high and recurring users are rare. We focus on in-session personalization and improve a standard noisy channel model by injecting dense vectors computed from product images at query time. We argue that image-based personalization displays several advantages over alternative proposals (from data availability to business scalability), and provide quantitative evidence and qualitative support on the effectiveness of the proposed methods. Finally, we show how a shared vector space between similar shops can be used to improve the experience of users browsing across sites, opening up the possibility of applying zero-shot unsupervised personalization to increase conversions. This will prove to be particularly relevant to retail groups that manage multiple brands and/or websites and to multi-tenant SaaS providers that serve multiple clients in the same space.
Published: 2020

41. What the [MASK]? Making Sense of Language-Specific BERT Models

Author: Nozza, Debora, Bianchi, Federico, and Hovy, Dirk
Subjects: Computer Science - Computation and Language
Abstract: Recently, Natural Language Processing (NLP) has witnessed impressive progress in many areas, due to the advent of novel, pretrained contextual representation models. In particular, Devlin et al. (2019) proposed a model, called BERT (Bidirectional Encoder Representations from Transformers), which enables researchers to obtain state-of-the-art performance on numerous NLP tasks by fine-tuning the representations on their data set and task, without the need for developing and training highly-specific architectures. The authors also released multilingual BERT (mBERT), a model trained on a corpus of 104 languages, which can serve as a universal language model. This model obtained impressive results on a zero-shot cross-lingual natural inference task. Driven by the potential of BERT models, the NLP community has started to investigate and generate an abundant number of BERT models that are trained on a particular language, and tested on a specific data domain and task. This allows us to evaluate the true potential of mBERT as a universal language model, by comparing it to the performance of these more specific models. This paper presents the current state of the art in language-specific BERT models, providing an overall picture with respect to different dimensions (i.e. architectures, data domains, and tasks). Our aim is to provide an immediate and straightforward overview of the commonalities and differences between language-specific BERT models and mBERT. We also provide an interactive and constantly updated website that can be used to explore the information we have collected, at https://bertlang.unibocconi.it.
Published: 2020

42. Training Temporal Word Embeddings with a Compass

Author: Di Carlo, Valerio, Bianchi, Federico, and Palmonari, Matteo
Subjects: Computer Science - Computation and Language, Computer Science - Machine Learning
Abstract: Temporal word embeddings have been proposed to support the analysis of word meaning shifts during time and to study the evolution of languages. Different approaches have been proposed to generate vector representations of words that embed their meaning during a specific time interval. However, the training process used in these approaches is complex, may be inefficient or it may require large text corpora. As a consequence, these approaches may be difficult to apply in resource-scarce domains or by scientists with limited in-depth knowledge of embedding models. In this paper, we propose a new heuristic to train temporal word embeddings based on the Word2vec model. The heuristic consists in using atemporal vectors as a reference, i.e., as a compass, when training the representations specific to a given time interval. The use of the compass simplifies the training process and makes it more efficient. Experiments conducted using state-of-the-art datasets and methodologies suggest that our approach outperforms or equals comparable approaches while being more robust in terms of the required corpus size., Comment: Accepted at AAAI2019
Published: 2019

43. Experimental neural network enhanced quantum tomography

Author: Palmieri, Adriano Macarone, Kovlakov, Egor, Bianchi, Federico, Yudin, Dmitry, Straupe, Stanislav, Biamonte, Jacob, and Kulik, Sergei
Subjects: Quantum Physics, Condensed Matter - Disordered Systems and Neural Networks, Computer Science - Artificial Intelligence, Computer Science - Machine Learning
Abstract: Quantum tomography is currently ubiquitous for testing any implementation of a quantum information processing device. Various sophisticated procedures for state and process reconstruction from measured data are well developed and benefit from precise knowledge of the model describing state preparation and the measurement apparatus. However, physical models suffer from intrinsic limitations as actual measurement operators and trial states cannot be known precisely. This scenario inevitably leads to state-preparation-and-measurement (SPAM) errors degrading reconstruction performance. Here we develop and experimentally implement a machine learning based protocol reducing SPAM errors. We trained a supervised neural network to filter the experimental data and hence uncovered salient patterns that characterize the measurement probabilities for the original state and the ideal experimental apparatus free from SPAM errors. We compared the neural network state reconstruction protocol with a protocol treating SPAM errors by process tomography, as well as to a SPAM-agnostic protocol with idealized measurements. The average reconstruction fidelity is shown to be enhanced by 10% and 27%, respectively. The presented methods apply to the vast range of quantum experiments which rely on tomography., Comment: 11 pages, 3+6 figures; All data and source code are available online; RevTeX
Published: 2019

44. Reasoning over RDF Knowledge Bases using Deep Learning

Author: Ebrahimi, Monireh, Sarker, Md Kamruzzaman, Bianchi, Federico, Xie, Ning, Doran, Derek, and Hitzler, Pascal
Subjects: Computer Science - Machine Learning, Computer Science - Artificial Intelligence, Statistics - Machine Learning
Abstract: Semantic Web knowledge representation standards, and in particular RDF and OWL, often come endowed with a formal semantics which is considered to be of fundamental importance for the field. Reasoning, i.e., the drawing of logical inferences from knowledge expressed in such standards, is traditionally based on logical deductive methods and algorithms which can be proven to be sound and complete and terminating, i.e. correct in a very strong sense. For various reasons, though, in particular, the scalability issues arising from the ever-increasing amounts of Semantic Web data available and the inability of deductive algorithms to deal with noise in the data, it has been argued that alternative means of reasoning should be investigated which bear high promise for high scalability and better robustness. From this perspective, deductive algorithms can be considered the gold standard regarding correctness against which alternative methods need to be tested. In this paper, we show that it is possible to train a Deep Learning system on RDF knowledge graphs, such that it is able to perform reasoning over new RDF knowledge graphs, with high precision and recall compared to the deductive gold standard.
Published: 2018

45. Neutral molecular cluster formation of sulfuric acid dimethylamine observed in real time under atmospheric conditions

Author: Kürten, Andreas, Jokinen, Tuija, Simon, Mario, Sipilä, Mikko, Sarnela, Nina, Junninen, Heikki, Adamov, Alexey, Almeida, João, Amorim, Antonio, Bianchi, Federico, Breitenlechner, Martin, Dommen, Josef, Donahue, Neil M., Duplissy, Jonathan, Ehrhart, Sebastian, Flagan, Richard C., Franchin, Alessandro, Hakala, Jani, Hansel, Armin, Heinritzi, Martin, Hutterli, Manuel, Kangasluoma, Juha, Kirkby, Jasper, Laaksonen, Ari, Lehtipalo, Katrianne, Leiminger, Markus, Makhmutov, Vladimir, Mathot, Serge, Onnela, Antti, Petäjä, Tuukka, Praplan, Arnaud P., Riccobono, Francesco, Rissanen, Matti P., Rondo, Linda, Schobesberger, Siegfried, Seinfeld, John H., Steiner, Gerhard, Tomé, António, Tröstl, Jasmin, Winkler, Paul M., Williamson, Christina, Wimmer, Daniela, Ye, Penglin, Baltensperger, Urs, Carslaw, Kenneth S., Kulmala, Markku, Worsnop, Douglas R., and Curtius, Joachim
Subjects: Physics - Atmospheric and Oceanic Physics, Physics - Chemical Physics
Abstract: For atmospheric sulfuric acid (SA) concentrations the presence of dimethylamine (DMA) at mixing ratios of several parts per trillion by volume can explain observed boundary layer new particle formation rates. However, the concentration and molecular composition of the neutral (uncharged) clusters have not been reported so far due to the lack of suitable instrumentation. Here we report on experiments from the Cosmics Leaving Outdoor Droplets chamber at the European Organization for Nuclear Research revealing the formation of neutral particles containing up to 14 SA and 16 DMA molecules, corresponding to a mobility diameter of about 2 nm, under atmospherically relevant conditions. These measurements bridge the gap between the molecular and particle perspectives of nucleation, revealing the fundamental processes involved in particle formation and growth. The neutral clusters are found to form at or close to the kinetic limit where particle formation is limited only by the collision rate of SA molecules. Even though the neutral particles are stable against evaporation from the SA dimer onward, the formation rates of particles at 1.7-nm size, which contain about 10 SA molecules, are up to 4 orders of magnitude smaller compared with those of the dimer due to coagulation and wall loss of particles before they reach 1.7 nm in diameter. This demonstrates that neither the atmospheric particle formation rate nor its dependence on SA can simply be interpreted in terms of cluster evaporation or the molecular composition of a critical nucleus., Comment: Main text plus SI
Published: 2015
Full Text: View/download PDF

45 results on '"Bianchi, Federico"'
