Author: "Raganato, A" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Raganato, A"' showing total 414 results


1. Synthetic Data Generation with Large Language Models for Personalized Community Question Answering

Author: Braga, Marco, Kasela, Pranav, Raganato, Alessandro, and Pasi, Gabriella
Subjects: Computer Science - Information Retrieval
Abstract: Personalization in Information Retrieval (IR) has long been studied by the research community. However, there is still a lack of datasets to conduct large-scale evaluations of personalized IR; this is mainly due to the fact that collecting and curating high-quality user-related information requires significant costs and time investment. Furthermore, the creation of datasets for Personalized IR (PIR) tasks is affected by both privacy concerns and the need for accurate user-related data, which are often not publicly available. Recently, researchers have started to explore the use of Large Language Models (LLMs) to generate synthetic datasets, which is a possible solution to generate data for low-resource tasks. In this paper, we investigate the potential of Large Language Models (LLMs) for generating synthetic documents to train an IR system for a Personalized Community Question Answering task. To study the effectiveness of IR models fine-tuned on LLM-generated data, we introduce a new dataset, named Sy-SE-PQA. We build Sy-SE-PQA based on an existing dataset, SE-PQA, which consists of questions and answers posted on the popular StackExchange communities. Starting from questions in SE-PQA, we generate synthetic answers using different prompt techniques and LLMs. Our findings suggest that LLMs have high potential in generating data tailored to users' needs. The synthetic data can replace human-written training data, even if the generated data may contain incorrect information. Comment: Accepted in WI-IAT '24
Published: 2024

2. Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence

Author: Riva, Alessandro, Raganato, Alessandro, and Melzi, Simone
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Graphics
Abstract: Current data-driven methodologies for point cloud matching demand extensive training time and computational resources, presenting significant challenges for model deployment and application. In the point cloud matching task, recent advancements with an encoder-only Transformer architecture have revealed the emergence of semantically meaningful patterns in the attention heads, particularly resembling Gaussian functions centered on each point of the input shape. In this work, we further investigate this phenomenon by integrating these patterns as fixed attention weights within the attention heads of the Transformer architecture. We evaluate two variants: one utilizing predetermined variance values for the Gaussians, and another where the variance values are treated as learnable parameters. Additionally we analyze the performances on noisy data and explore a possible way to improve robustness to noise. Our findings demonstrate that fixing the attention weights not only accelerates the training process but also enhances the stability of the optimization. Furthermore, we conducted an ablation study to identify the specific layers where the infused information is most impactful and to understand the reliance of the network on this information.
Published: 2024

3. How to Blend Concepts in Diffusion Models

Author: Olearo, Lorenzo, Longari, Giorgio, Melzi, Simone, Raganato, Alessandro, and Peñaloza, Rafael
Subjects: Computer Science - Computer Vision and Pattern Recognition, Computer Science - Artificial Intelligence
Abstract: For the last decade, there has been a push to use multi-dimensional (latent) spaces to represent concepts; and yet how to manipulate these concepts or reason with them remains largely unclear. Some recent methods exploit multiple latent representations and their connection, making this research question even more entangled. Our goal is to understand how operations in the latent space affect the underlying concepts. To that end, we explore the task of concept blending through diffusion models. Diffusion models are based on a connection between a latent representation of textual prompts and a latent space that enables image reconstruction and generation. This task allows us to try different text-based combination strategies, and evaluate easily through a visual analysis. Our conclusion is that concept blending through space manipulation is possible, although the best strategy depends on the context of the blend.
Published: 2024

4. A Systematic Study of Inner-Attention-Based Sentence Representations in Multilingual Neural Machine Translation

Author: Vázquez, Raúl, Raganato, Alessandro, Creutz, Mathias, and Tiedemann, Jörg
Subjects: Computational linguistics. Natural language processing, P98-98.5
Abstract: Neural machine translation has considerably improved the quality of automatic translations by learning good representations of input sentences. In this article, we explore a multilingual translation model capable of producing fixed-size sentence representations by incorporating an intermediate crosslingual shared layer, which we refer to as attention bridge. This layer exploits the semantics from each language and develops into a language-agnostic meaning representation that can be efficiently used for transfer learning. We systematically study the impact of the size of the attention bridge and the effect of including additional languages in the model. In contrast to related previous work, we demonstrate that there is no conflict between translation performance and the use of sentence representations in downstream tasks. In particular, we show that larger intermediate layers not only improve translation quality, especially for long sentences, but also push the accuracy of trainable classification tasks. Nevertheless, shorter representations lead to increased compression that is beneficial in non-trainable similarity tasks. Similarly, we show that trainable downstream tasks benefit from multilingual models, whereas additional language signals do not improve performance in non-trainable benchmarks. This is an important insight that helps to properly design models for specific applications. Finally, we also include an in-depth analysis of the proposed attention bridge and its ability to encode linguistic properties. We carefully analyze the information that is captured by individual attention heads and identify interesting patterns that explain the performance of specific settings in linguistic probing tasks.
Published: 2020
Full Text: View/download PDF

5. SemEval-2024 Shared Task 6: SHROOM, a Shared-task on Hallucinations and Related Observable Overgeneration Mistakes

Author: Mickus, Timothee, Zosa, Elaine, Vázquez, Raúl, Vahtola, Teemu, Tiedemann, Jörg, Segonne, Vincent, Raganato, Alessandro, and Apidianaki, Marianna
Subjects: Computer Science - Computation and Language
Abstract: This paper presents the results of the SHROOM, a shared task focused on detecting hallucinations: outputs from natural language generation (NLG) systems that are fluent, yet inaccurate. Such cases of overgeneration put in jeopardy many NLG applications, where correctness is often mission-critical. The shared task was conducted with a newly constructed dataset of 4000 model outputs labeled by 5 annotators each, spanning 3 NLP tasks: machine translation, paraphrase generation and definition modeling. The shared task was tackled by a total of 58 different users grouped in 42 teams, out of which 27 elected to write a system description paper; collectively, they submitted over 300 prediction sets on both tracks of the shared task. We observe a number of key trends in how this approach was tackled -- many participants rely on a handful of models, and often rely either on synthetic data for fine-tuning or zero-shot prompting strategies. While a majority of the teams did outperform our proposed baseline system, the performances of top-scoring systems are still consistent with a random handling of the more challenging items. Comment: SemEval 2024 shared task. Pre-review version
Published: 2024

6. MAMMOTH: Massively Multilingual Modular Open Translation @ Helsinki

Author: Mickus, Timothee, Grönroos, Stig-Arne, Attieh, Joseph, Boggia, Michele, De Gibert, Ona, Ji, Shaoxiong, Loppi, Niki Andreas, Raganato, Alessandro, Vázquez, Raúl, and Tiedemann, Jörg
Subjects: Computer Science - Computation and Language
Abstract: NLP in the age of monolithic large language models is approaching its limits in terms of size and information that can be handled. The trend goes to modularization, a necessary step into the direction of designing smaller sub-networks and components with specialized functionality. In this paper, we present the MAMMOTH toolkit: a framework designed for training massively multilingual modular machine translation systems at scale, initially derived from OpenNMT-py and then adapted to ensure efficient training across computation clusters. We showcase its efficiency across clusters of A100 and V100 NVIDIA GPUs, and discuss our design philosophy and plans for future information. The toolkit is publicly available online. Comment: Presented as a demo at EACL 2024
Published: 2024

7. Influence of implant density on mechanical complications in adult spinal deformity surgery

Author: Charles, Yann Philippe, Severac, François, Núñez-Pereira, Susana, Haddad, Sleiman, Vila, Lluis, Pellisé, Ferran, Obeid, Ibrahim, Boissière, Louis, Yilgor, Caglar, Yucekul, Altug, Alanay, Ahmet, Kleinstück, Frank, Loibl, Markus, Gómez-Rice, Alejandro, Raganato, Riccardo, Perez-Grueso, Francisco Javier Sánchez, and Pizones, Javier
Published: 2024
Full Text: View/download PDF

8. What factors are associated with a better restoration of pelvic version after adult spinal deformity surgery?

Author: Raganato, Riccardo, Gómez-Rice, Alejandro, Moreno-Manzanaro, Lucía, Escámez, Fernando, Talavera, Gloria, Aguilar, Antonio, Sánchez-Márquez, José Miguel, Fernández-Baíllo, Nicomedes, Perez-Grueso, Francisco Javier Sánchez, Kleinstück, Frank, Alanay, Ahmet, Obeid, Ibrahim, Pellisé, Ferran, and Pizones, Javier
Published: 2024
Full Text: View/download PDF

9. Democratizing neural machine translation with OPUS-MT

Author: Tiedemann, Jörg, Aulamo, Mikko, Bakshandaeva, Daria, Boggia, Michele, Grönroos, Stig-Arne, Nieminen, Tommi, Raganato, Alessandro, Scherrer, Yves, Vázquez, Raúl, and Virpioja, Sami
Published: 2024
Full Text: View/download PDF

10. Does morbid obesity negatively impact perioperative outcomes following elective reverse shoulder arthroplasty?: a propensity-matched comparative study

Author: Suhirad Khokhar, MD, Cameron Smith, BA, MPH, Riccardo Raganato, MD, Robert Ades, BA, Yungtai Lo, PhD, and Konrad I. Gruson, MD
Subjects: Reverse total shoulder replacement, Return to emergency department, Readmission, Postoperative complications, Morbid obesity, Length of stay, Orthopedic surgery, RD701-811, Diseases of the musculoskeletal system, RC925-935
Abstract: Background: The incidence of primary reverse total shoulder arthroplasty (rTSA) and the prevalence of obesity have increased in the United States. Despite this, the literature assessing the effect of morbid obesity (body mass index≥40 kg/m2) on perioperative surgical outcomes remains inconsistent. Methods: A retrospective review of consecutive elective primary rTSA cases from January 2016 through September 2023 at a single tertiary referral center was performed. All cases involved a short-stem humeral component and screw-in glenoid baseplate from the same implant manufacturer. Surgical and patient demographic data were collected. Morbidly obese patients were propensity matched at least 1:1 with non-morbidly obese patients based on age, gender, modified 5-item frailty index score, adjusted Charlson comorbidity index score, and 12-month preoperative emergency department (ED) visit. Regression analysis was utilized to assess the relationship between morbid obesity and operative time, length of stay, intraoperative total blood volume loss, surgical postoperative complications, in-hospital medical complications, disposition, and 90-day ED return and readmission. Results: There were a total of 175 short-stem rTSA cases performed with a median age of 71 years (interquartile range: 66, 76), of which 19 (10.9%) had a body mass index ≥40 kg/m2. These 19 patients were propensity score matched to 41 non-morbidly obese patients (9 at 1:3, 4 at 1:2, and 6 at 1:1). There were no significant differences between the groups with regard to intraoperative total blood volume loss, operative time, need for transfusion, hospital length of stay, discharge disposition, prevalence for 90-day return to ED, or unplanned 90-day readmission. Conclusion: Morbid obesity should not be considered an absolute contraindication for elective rTSA, particularly in patients who have undergone appropriate preoperative medical clearance.
Published: 2024
Full Text: View/download PDF

11. Future Data Points to Implement in Adult Spinal Deformity Assessment for Artificial Intelligence Modeling Prediction: The Importance of the Biological Dimension.

Author: Haddad, Sleiman, Pizones, Javier, Raganato, Riccardo, Safaee, Michael, Scheer, Justin, Pellisé, Ferran, and Ames, Christopher
Subjects: artificial intelligence, biomarkers, frailty, metabolomics, osteoporosis, sarcopenia, senescence, spinal deformities, tissue sample
Abstract: Adult spinal deformity (ASD) surgery is still associated with high surgical risks. Machine learning algorithms applied to multicenter databases have been created to predict outcomes and complications, optimize patient selection, and improve overall results. However, the multiple data points currently used to create these models allow for 70% accuracy in prediction. We need to find new variables that can capture the spectrum of probability that is escaping from our control. These proposed variables are based on patients' biological dimensions, such as frailty, sarcopenia, muscle and bone (tissue) sampling, serological assessment of cellular senescence, and circulating biomarkers that can measure epigenetics, inflammaging, and -omics. Many of these variables are proven to be modifiable and could be improved with proper nutrition, toxin avoidance, endurance exercise, and even surgery. The purpose of this manuscript is to describe the different future data points that can be implemented in ASD assessment to improve modeling prediction, allow monitoring their response to prerehabilitation programs, and improve patient counseling.
Published: 2023

12. Democratizing Neural Machine Translation with OPUS-MT

Author: Tiedemann, Jörg, Aulamo, Mikko, Bakshandaeva, Daria, Boggia, Michele, Grönroos, Stig-Arne, Nieminen, Tommi, Raganato, Alessandro, Scherrer, Yves, Vazquez, Raul, and Virpioja, Sami
Subjects: Computer Science - Computation and Language
Abstract: This paper presents the OPUS ecosystem with a focus on the development of open machine translation models and tools, and their integration into end-user applications, development platforms and professional workflows. We discuss our on-going mission of increasing language coverage and translation quality, and also describe on-going work on the development of modular translation models and speed-optimized compact solutions for real-time translation on regular desktops and small devices.
Published: 2022

13. Comparatively Assessing Large Language Models for Query Expansion in Information Retrieval via Zero-Shot and Chain-of-Thought Prompting.

Author: Daniele Rizzo, Alessandro Raganato, and Marco Viviani
Published: 2024

14. Incorporating Cognitive Complexity of Text in Dense Retrieval for Personalized Search.

Author: Effrosyni Sokli, Alessandro Raganato, and Gabriella Pasi
Published: 2024

15. Leveraging Prompt Engineering and Large Language Models for Automating MADRS Score Computation for Depression Severity Assessment.

Author: Alessandro Raganato, Francesco Bartoli, Cristina Crocamo, Daniele Cavaleri, Giuseppe Carrà, Gabriella Pasi, and Marco Viviani
Published: 2024

16. SemEval-2024 Task 6: SHROOM, a Shared-task on Hallucinations and Related Observable Overgeneration Mistakes.

Author: Timothee Mickus, Elaine Zosa, Raúl Vázquez, Teemu Vahtola, Jörg Tiedemann, Vincent Segonne, Alessandro Raganato, and Marianna Apidianaki
Published: 2024
Full Text: View/download PDF

17. AdaKron: An Adapter-based Parameter Efficient Model Tuning with Kronecker Product.

Author: Marco Braga, Alessandro Raganato, and Gabriella Pasi
Published: 2024

18. MAMMOTH: Massively Multilingual Modular Open Translation @ Helsinki.

Author: Timothee Mickus, Stig-Arne Grönroos, Joseph Attieh, Michele Boggia, Ona de Gibert Bonet, Shaoxiong Ji, Niki A. Loppi, Alessandro Raganato, Raúl Vázquez, and Jörg Tiedemann
Published: 2024

19. Democratizing neural machine translation with OPUS-MT.

Author: Jörg Tiedemann, Mikko Aulamo, Daria Bakshandaeva, Michele Boggia, Stig-Arne Grönroos, Tommi Nieminen, Alessandro Raganato, Yves Scherrer, Raúl Vázquez, and Sami Virpioja
Published: 2024
Full Text: View/download PDF

20. Does morbid obesity negatively impact perioperative outcomes following elective reverse shoulder arthroplasty?: a propensity-matched comparative study

Author: Khokhar, Suhirad, Smith, Cameron, Raganato, Riccardo, Ades, Robert, Lo, Yungtai, and Gruson, Konrad I.
Published: 2024
Full Text: View/download PDF

21. Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence.

Author: Alessandro Riva, Alessandro Raganato, and Simone Melzi
Published: 2024
Full Text: View/download PDF

22. How to Blend Concepts in Diffusion Models.

Author: Giorgio Longari, Lorenzo Olearo, Simone Melzi, Rafael Peñaloza, and Alessandro Raganato
Published: 2024
Full Text: View/download PDF

23. SemEval-2024 Shared Task 6: SHROOM, a Shared-task on Hallucinations and Related Observable Overgeneration Mistakes.

Author: Timothee Mickus, Elaine Zosa, Raúl Vázquez, Teemu Vahtola, Jörg Tiedemann, Vincent Segonne, Alessandro Raganato, and Marianna Apidianaki
Published: 2024
Full Text: View/download PDF

24. Dozens of Translation Directions or Millions of Shared Parameters? Comparing Two Types of Multilinguality in Modular Machine Translation.

Author: Michele Boggia, Stig-Arne Grönroos, Niki A. Loppi, Timothee Mickus, Alessandro Raganato, Jörg Tiedemann, and Raúl Vázquez
Published: 2023

25. Personalization in BERT with Adapter Modules and Topic Modelling.

Author: Marco Braga, Alessandro Raganato, and Gabriella Pasi
Published: 2023

26. SemEval-2023 Task 1: Visual Word Sense Disambiguation.

Author: Alessandro Raganato, Iacer Calixto, Asahi Ushio, José Camacho-Collados, and Mohammad Taher Pilehvar
Published: 2023
Full Text: View/download PDF

27. Application of Recycled Carbon Fibers in Aircraft Windows Frame

Author: Minosi, S., Buccoliero, G., Araganese, M., Raganato, U., Tarzia, A., Corvaglia, S., and Gallo, N.
Editor: Lopresto, Valentina, Papa, Ilaria, and Langella, Antonio
Series Editor: Chaari, Fakher, Gherardini, Francesco, and Ivanov, Vitalii
Editorial Board Member: Cavas-Martínez, Francisco, di Mare, Francesca, Haddar, Mohamed, Kwon, Young W., Trojanowska, Justyna, and Xu, Jinyang
Published: 2023
Full Text: View/download PDF

28. Analyzing practice pattern in treating partial-thickness rotator cuff tears: a dual perspective from national database and American Shoulder and Elbow Surgeons PARCIAL research group

Author: Nguyen, Michael, DiFiori, Monica, Masters, Gabriel, Arthur Chou, Te Feng, Raganato, Riccardo, Bowen, Lucas P., Harmon, Jordan J., Griffin, Tessa C., Winzenried, Alec E., Polce, Evan M., Call, Cory J., Nwadike, Benjamin, Ouseph, Alvin, Nazemi, Monia, McCall, Kyle, Kim, H. Mike, Leary, Emily, Baker, Champ L., 3rd, Barnes, Leslie A., Creighton, R. Alexander, Cuomo, Frances, DiPaola, Matthew J., Foad, Abdullah, Gregory, James M., Grogan, Brian F., Kaar, Scott G., Kohan, Eitan M., Krishnan, Sumant G., Lo, Eddie Y., and Moor, John T.
Published: 2024
Full Text: View/download PDF

29. Sagittal realignment: surgical restoration of the global alignment and proportion score parameters: a subgroup analysis. What are the consequences of failing to realign?

Author: Raganato, Riccardo, Pizones, Javier, Yilgor, Caglar, Moreno-Manzanaro, Lucía, Vila-Casademunt, Alba, Sánchez-Márquez, José Miguel, Fernández-Baíllo, Nicomedes, Sánchez Pérez-Grueso, Francisco Javier, Kleinstück, Frank, Alanay, Ahmet, Obeid, Ibrahim, and Pellisé, Ferran
Published: 2023
Full Text: View/download PDF

30. Iliac screws (is) vs. S2 alar iliac screws (S2AI) for pelvic fixation in adult spinal deformity patients. A propensity score-matched study

Author: A. Gomez-Rice, S. Núñez-Pereira, S. Haddad, R. Raganato, F. Sanchez Perez-Grueso, F. Kleinstück, I. Obeid, A. Alanay, F. Pellise, J. Pizones, and E.S.S.G. Essg
Subjects: Neurology. Diseases of the nervous system, RC346-429
Published: 2024
Full Text: View/download PDF

31. Influence of implant density on mechanical complications in adult spinal deformity surgery

Author: Y.P. Charles, F. Séverac, S. Núñez-Pereira, S. Haddad, F. Pellise, I. Obeid, L. Boissiere, C. Yilgor, A. Yucekul, A. Alanay, F. Kleinstück, M. Loibl, R. Raganato, F. Sanchez Perez-Grueso, J. Pizones, and E.S.S.G. Essg
Subjects: Neurology. Diseases of the nervous system, RC346-429
Published: 2024
Full Text: View/download PDF

32. XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization

Author: Raganato, Alessandro, Pasini, Tommaso, Camacho-Collados, Jose, and Pilehvar, Mohammad Taher
Subjects: Computer Science - Computation and Language
Abstract: The ability to correctly model distinct meanings of a word is crucial for the effectiveness of semantic representation techniques. However, most existing evaluation benchmarks for assessing this criterion are tied to sense inventories (usually WordNet), restricting their usage to a small subset of knowledge-based representation techniques. The Word-in-Context dataset (WiC) addresses the dependence on sense inventories by reformulating the standard disambiguation task as a binary classification problem, but it is limited to the English language. We put forward a large multilingual benchmark, XL-WiC, featuring gold standards in 12 new languages from varied language families and with different degrees of resource availability, opening room for evaluation scenarios such as zero-shot cross-lingual transfer. We perform a series of experiments to determine the reliability of the datasets and to set performance baselines for several recent contextualized multilingual models. Experimental results show that even when no tagged instances are available for a target language, models trained solely on the English data can attain competitive performance in the task of distinguishing different meanings of a word, even for distant languages. XL-WiC is available at https://pilehvar.github.io/xlwic/. Comment: EMNLP 2020
Published: 2020

33. Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation

Author: Raganato, Alessandro, Scherrer, Yves, and Tiedemann, Jörg
Subjects: Computer Science - Computation and Language
Abstract: Transformer-based models have brought a radical change to neural machine translation. A key feature of the Transformer architecture is the so-called multi-head attention mechanism, which allows the model to focus simultaneously on different parts of the input. However, recent works have shown that most attention heads learn simple, and often redundant, positional patterns. In this paper, we propose to replace all but one attention head of each encoder layer with simple fixed -- non-learnable -- attentive patterns that are solely based on position and do not require any external knowledge. Our experiments with different data sizes and multiple language pairs show that fixing the attention heads on the encoder side of the Transformer at training time does not impact the translation quality and even increases BLEU scores by up to 3 points in low-resource scenarios. Comment: Accepted to Findings of EMNLP 2020
Published: 2020

34. Influence of Coronoid fixation on the functional outcome and rate of complications in surgically treated acute complex elbow instability

Author: Antuña, Samuel A., Raganato, Riccardo, Dopico, Lucia Ros, and Barco, Raúl
Published: 2023
Full Text: View/download PDF

35. Attention And Positional Encoding Are (Almost) All You Need For Shape Matching.

Author: Alessandro Raganato, Gabriella Pasi, and Simone Melzi
Published: 2023
Full Text: View/download PDF

36. The University of Helsinki submissions to the WMT19 news translation task

Author: Talman, Aarne, Sulubacak, Umut, Vázquez, Raúl, Scherrer, Yves, Virpioja, Sami, Raganato, Alessandro, Hurskainen, Arvi, and Tiedemann, Jörg
Subjects: Computer Science - Computation and Language
Abstract: In this paper, we present the University of Helsinki submissions to the WMT 2019 shared task on news translation in three language pairs: English-German, English-Finnish and Finnish-English. This year, we focused first on cleaning and filtering the training data using multiple data-filtering approaches, resulting in much smaller and cleaner training sets. For English-German, we trained both sentence-level transformer models and compared different document-level translation approaches. For Finnish-English and English-Finnish we focused on different segmentation approaches, and we also included a rule-based system for English-Finnish. Comment: To appear in WMT19
Published: 2019

37. Application of Recycled Carbon Fibers in Aircraft Windows Frame

Author: Minosi, S., Buccoliero, G., Araganese, M., Raganato, U., Tarzia, A., Corvaglia, S., and Gallo, N.
Published: 2023
Full Text: View/download PDF

38. MAMMOTH Massively Multilingual Modular Open Translation @ Helsinki

Author: Mickus, T, Gronroos, S, Attieh, J, Boggia, M, De Gibert, O, Ji, S, Loppi, N, Raganato, A, Vazquez, R, and Tiedemann, J
Abstract: NLP in the age of monolithic large language models is approaching its limits in terms of size and information that can be handled. The trend goes to modularization, a necessary step into the direction of designing smaller sub-networks and components with specialized functionality. In this paper, we present the MAMMOTH toolkit: a framework designed for training massively multilingual modular machine translation systems at scale, initially derived from OpenNMT-py and then adapted to ensure efficient training across computation clusters. We showcase its efficiency across clusters of A100 and V100 NVIDIA GPUs, and discuss our design philosophy and plans for future information. The toolkit is publicly available online.
Published: 2024

39. AdaKron: an Adapter-based Parameter Efficient Model Tuning with Kronecker Product

Author: Braga, M, Raganato, A, and Pasi, G
Abstract: The fine-tuning paradigm has been widely adopted to train neural models tailored for specific tasks. However, the recent upsurge of Large Language Models (LLMs), characterized by billions of parameters, has introduced profound computational challenges to the fine-tuning process. This has fueled intensive research on Parameter-Efficient Fine-Tuning (PEFT) techniques, usually involving the training of a selective subset of the original model parameters. One of the most used approaches is Adapters, which add trainable lightweight layers to the existing pretrained weights. Within this context, we propose AdaKron, an Adapter-based fine-tuning with the Kronecker product. In particular, we leverage the Kronecker product to combine the output of two small networks, resulting in a final vector whose dimension is the product of the dimensions of the individual outputs, allowing us to train only 0.55% of the model's original parameters. We evaluate AdaKron performing a series of experiments on the General Language Understanding Evaluation (GLUE) benchmark, achieving results in the same ballpark as recent state-of-the-art PEFT methods, despite training fewer parameters.
Published: 2024

40. A Multi-Domain Benchmark for Personalized Search Evaluation.

Author: Elias Bassani, Pranav Kasela, Alessandro Raganato, and Gabriella Pasi
Published: 2022
Full Text: View/download PDF

41. Does Stable Diffusion Dream of Electric Sheep.

Author: Simone Melzi, Rafael Peñaloza, and Alessandro Raganato
Published: 2023

42. Multilingual NMT with a language-independent attention bridge

Author: Vázquez, Raúl, Raganato, Alessandro, Tiedemann, Jörg, and Creutz, Mathias
Subjects: Computer Science - Computation and Language
Abstract: In this paper, we propose a multilingual encoder-decoder architecture capable of obtaining multilingual sentence representations by means of incorporating an intermediate attention bridge that is shared across all languages. That is, we train the model with language-specific encoders and decoders that are connected via self-attention with a shared layer that we call attention bridge. This layer exploits the semantics from each language for performing translation and develops into a language-independent meaning representation that can efficiently be used for transfer learning. We present a new framework for the efficient development of multilingual NMT using this model and scheduled training. We have tested the approach in a systematic way with a multi-parallel data set. We show that the model achieves substantial improvements over strong bilingual models and that it also works well for zero-shot translation, which demonstrates its ability of abstraction and transfer learning.
Published: 2018
Full Text: View/download PDF

43. Self-assembling nanowires from a linear l,d-peptide conjugated to the dextran end group

Author: Raganato, Luca, Del Giudice, Alessandra, Ceccucci, Anita, Sciubba, Fabio, Casciardi, Stefano, Sennato, Simona, Scipioni, Anita, and Masci, Giancarlo
Published: 2022
Full Text: View/download PDF

44. An Empirical Investigation of Word Alignment Supervision for Zero-Shot Multilingual Neural Machine Translation.

Author: Alessandro Raganato, Raúl Vázquez, Mathias Creutz, and Jörg Tiedemann
Published: 2021
Full Text: View/download PDF

45. Wikipedia Entities as Rendezvous across Languages: Grounding Multilingual Language Models by Predicting Wikipedia Hyperlinks.

Author: Iacer Calixto, Alessandro Raganato, and Tommaso Pasini
Published: 2021
Full Text: View/download PDF

46. Recent Trends in Word Sense Disambiguation: A Survey.

Author: Michele Bevilacqua, Tommaso Pasini, Alessandro Raganato, and Roberto Navigli
Published: 2021
Full Text: View/download PDF

47. XL-WSD: An Extra-Large and Cross-Lingual Evaluation Framework for Word Sense Disambiguation.

Author: Tommaso Pasini, Alessandro Raganato, and Roberto Navigli
Published: 2021
Full Text: View/download PDF

48. XL-WiC: A Multilingual Benchmark for Evaluating Semantic Contextualization.

Author: Alessandro Raganato, Tommaso Pasini, José Camacho-Collados, and Mohammad Taher Pilehvar
Published: 2020
Full Text: View/download PDF

49. Fixed Encoder Self-Attention Patterns in Transformer-Based Machine Translation.

Author: Alessandro Raganato, Yves Scherrer, and Jörg Tiedemann
Published: 2020
Full Text: View/download PDF

50. An Evaluation Benchmark for Testing the Word Sense Disambiguation Capabilities of Machine Translation Systems.

Author: Alessandro Raganato, Yves Scherrer, and Jörg Tiedemann
Published: 2020
