8 results for "Cross-lingual learning"
Search Results
2. Ad astra or astray: Exploring linguistic knowledge of multilingual BERT through NLI task.
- Author
-
Tikhonova, Maria, Mikhailov, Vladislav, Pisarevskaya, Dina, Malykh, Valentin, and Shavrina, Tatiana
- Subjects
LANGUAGE models, KNOWLEDGE transfer, NATURAL languages, MULTILINGUALISM, INFERENCE (Logic), NUMERACY
- Abstract
Recent research has reported that standard fine-tuning approaches can be unstable because they are prone to various sources of randomness, including but not limited to weight initialization, training data order, and hardware. Such brittleness can lead to different evaluation results, prediction confidences, and inconsistent generalization across instances of the same model independently fine-tuned under the same experimental setup. Our paper explores this problem in natural language inference, a common task in benchmarking practices, and extends the ongoing research to the multilingual setting. We propose six novel textual entailment and broad-coverage diagnostic datasets for French, German, and Swedish. Our key findings are that the mBERT model demonstrates fine-tuning instability for categories that involve lexical semantics, logic, and predicate-argument structure, and struggles to learn monotonicity, negation, numeracy, and symmetry. We also observe that using extra training data only in English can enhance generalization performance and fine-tuning stability, which we attribute to cross-lingual transfer capabilities. However, the proportion of particular linguistic features in the additional training data can hurt the performance of individual model instances. We are publicly releasing the datasets, hoping to foster the diagnostic investigation of language models (LMs) in a cross-lingual scenario, particularly in terms of benchmarking, which might promote a more holistic understanding of multilingualism in LMs and cross-lingual knowledge transfer. (A sketch of the seed-variance protocol follows this entry.)
- Published
- 2023
- Full Text
- View/download PDF
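The instability protocol in this abstract boils down to repeating the same fine-tuning run under different random seeds and measuring the per-category accuracy spread on a diagnostic set. Below is a minimal sketch of that idea, not the authors' released code; the model name follows the abstract (mBERT), while the dataset arguments and hyperparameters are illustrative assumptions.

```python
# Hedged sketch: quantify fine-tuning instability by repeating the same mBERT
# fine-tuning run under several seeds and measuring per-category accuracy
# spread on a diagnostic NLI set. train_ds / eval_ds are assumed to be
# pre-tokenized Hugging Face datasets; hyperparameters are placeholders.
import numpy as np
from transformers import (AutoModelForSequenceClassification, Trainer,
                          TrainingArguments, set_seed)

def finetune_and_predict(seed, train_ds, eval_ds):
    set_seed(seed)  # seeds weight init, dropout, and data shuffling
    model = AutoModelForSequenceClassification.from_pretrained(
        "bert-base-multilingual-cased", num_labels=3)  # entail/neutral/contradict
    args = TrainingArguments(output_dir=f"runs/seed-{seed}",
                             num_train_epochs=3, seed=seed)
    trainer = Trainer(model=model, args=args, train_dataset=train_ds)
    trainer.train()
    return trainer.predict(eval_ds).predictions.argmax(-1)

def per_category_spread(preds_by_seed, labels, categories):
    """Mean and std of accuracy per diagnostic category across seeds."""
    labels, categories = np.asarray(labels), np.asarray(categories)
    spread = {}
    for cat in np.unique(categories):
        mask = categories == cat
        accs = [float((p[mask] == labels[mask]).mean()) for p in preds_by_seed]
        spread[cat] = (np.mean(accs), np.std(accs))  # high std = unstable
    return spread
```

A large standard deviation for, say, the negation category would reproduce the kind of category-level brittleness the abstract reports.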
3. From Tokens to Trees: Mapping Syntactic Structures in the Deserts of Data-Scarce Languages
- Author
-
Vilares, David and Muñoz Ortiz, Alberto
- Abstract
Low-resource learning in natural language processing focuses on developing effective resources, tools, and technologies for languages that receive less attention from industry and academia. This effort is crucial for several reasons, including ensuring that as many languages as possible are represented digitally and improving access to language technologies for native speakers of minority languages. In this context, this paper outlines the motivation, research lines, and results of a Leonardo Grant (funded by FBBVA) on low-resource languages and parsing as sequence labeling. The project's primary aim was to devise fast and accurate methods for low-resource syntactic parsing and to examine evaluation strategies, as well as strengths and weaknesses in comparison to alternative parsing strategies. (A sketch of the label encoding behind parsing as sequence labeling follows this entry.)
- Published
- 2024
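Parsing as sequence labeling reduces a dependency tree to one discrete label per token so that a standard tagger can predict trees. The sketch below is a simplification of the idea rather than the project's code: it uses a naive relative head-position encoding, whereas the encodings studied in this line of work (e.g. PoS-relative offsets or bracketing schemes) are richer.

```python
# Hedged sketch: encode a dependency tree as per-token labels and back.
# heads[i] is the 1-based index of the head of token i+1 (0 = root).
def encode_tree(heads, deprels):
    labels = []
    for i, (head, rel) in enumerate(zip(heads, deprels), start=1):
        offset = head - i                  # head position relative to the token
        labels.append(f"{offset:+d}@{rel}")
    return labels

def decode_tree(labels):
    heads, rels = [], []
    for i, label in enumerate(labels, start=1):
        offset, rel = label.split("@")
        heads.append(i + int(offset))      # a real decoder must also repair
        rels.append(rel)                   # outputs that violate treeness
    return heads, rels

# "The cat sleeps": det(cat, The), nsubj(sleeps, cat), root(sleeps)
print(encode_tree([2, 3, 0], ["det", "nsubj", "root"]))
# -> ['+1@det', '+1@nsubj', '-3@root']
```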
4. Source-Free Transductive Transfer Learning for Structured Prediction
- Author
-
Kurniawan, Kemal Maulana
- Abstract
Current transfer learning approaches rest on two strong assumptions: the source domain data is available, and the target domain has labelled data. These assumptions are problematic when the source domain data is private and the target domain has no labelled data. Thus, we consider the source-free unsupervised transfer setup, in which both assumptions are violated, across languages and domains (genres). To transfer structured prediction models in the source-free setting, we propose two methods: Parsimonious Parser Transfer (PPT), designed for single-source transfer of dependency parsers across languages, and PPTX, the multi-source version of PPT. Both methods outperform baselines. We then propose to improve PPTX with logarithmic opinion pooling (PPTX-LOP), and find that it is an effective multi-source transfer method for structured prediction in general. Next, we study whether our proposed source-free transfer methods provide improvements when pretrained language models (PTLMs) are employed. We first propose Parsimonious Transfer for Sequence Tagging (PTST), a variation of PPT designed for sequence tagging. Then, we evaluate PTST and PPTX-LOP on domain adaptation of semantic tasks using PTLMs. We show that for globally normalised models, PTST and PPTX-LOP improve precision and recall respectively. Besides unlabelled data, the target domain may have models trained on various tasks (but not the task of interest). To investigate whether these models can be used successfully to improve performance in source-free transfer, we propose two methods, and find that leveraging these models with one of them can improve recall over direct transfer. Finally, we critically discuss and conclude the findings of this thesis, cover relevant subsequent work, and close with a discussion of limitations and future work. (A sketch of the logarithmic opinion pooling step follows this entry.)
- Published
- 2023
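The multi-source combination step named in this abstract, logarithmic opinion pooling, is a weighted geometric mean of the source models' label distributions. A minimal sketch, assuming equal weights by default and illustrative probability values:

```python
# Hedged sketch of logarithmic opinion pooling (LOP): combine per-token label
# distributions from several source models via a weighted geometric mean.
import numpy as np

def log_opinion_pool(probs, weights=None):
    """probs: (n_models, n_labels) distributions; weights sum to 1."""
    probs = np.asarray(probs, dtype=float)
    if weights is None:
        weights = np.full(len(probs), 1.0 / len(probs))
    pooled = np.exp(np.sum(np.asarray(weights)[:, None]
                           * np.log(probs + 1e-12), axis=0))
    return pooled / pooled.sum()  # renormalize

# Two source parsers disagree on a label; LOP favors options that no model
# considers unlikely, so the middle label keeps substantial mass.
print(log_opinion_pool([[0.7, 0.2, 0.1],
                        [0.1, 0.2, 0.7]]))  # ~[0.363, 0.274, 0.363]
```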
5. Cross-Lingual and Genre-Supervised Parsing and Tagging for Low-Resource Spoken Data
- Author
-
Fosteri, Iliana
- Abstract
Dealing with low-resource languages is challenging because there is not enough data to train machine-learning models to make predictions for them. One way around this problem is to use data from higher-resource languages, transferring what is learned from those languages to the low-resource targets. The present study focuses on dependency parsing and part-of-speech tagging of low-resource languages belonging to the spoken genre, i.e., languages whose treebank data is transcribed speech: Beja, Chukchi, Komi-Zyrian, Frisian-Dutch, and Cantonese. Our approach investigates different types of transfer languages, employing MaChAmp, a state-of-the-art parser and tagger that uses contextualized word embeddings, mBERT and XLM-R in particular. The main idea is to explore how genre, language similarity, neither, or the combination of the two affects model performance on the aforementioned downstream tasks for our selected target treebanks. Our findings suggest that capturing speech-specific dependency relations requires at least some genre-matching source data, whereas source data matched by language similarity is the better candidate when the task at hand is part-of-speech tagging. We also explore the impact of multi-task learning in one of our proposed methods, but observe only minor differences in model performance. (A sketch of the source-treebank selection step follows this entry.)
- Published
- 2023
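The core experimental manipulation here is which source treebanks feed the trainer: genre-matching, language-similar, neither, or both. A minimal sketch of that selection step, with treebank paths and the similarity table as illustrative assumptions; the resulting file would then be handed to a tagger/parser such as MaChAmp.

```python
# Hedged sketch: assemble a CoNLL-U training file from genre-matching and/or
# language-similar source treebanks. Paths and the similarity map are
# illustrative placeholders, not the study's actual source choices.
from pathlib import Path

GENRE_MATCHING = ["UD_Norwegian-NynorskLIA/train.conllu"]  # transcribed speech
LANGUAGE_SIMILAR = {
    "Frisian-Dutch": ["UD_Dutch-Alpino/train.conllu"],
    "Komi-Zyrian": ["UD_Finnish-TDT/train.conllu"],
}

def build_training_file(target, out_path, use_genre, use_similar):
    """Concatenate the selected source treebanks into one training file."""
    sources = []
    if use_genre:
        sources += GENRE_MATCHING
    if use_similar:
        sources += LANGUAGE_SIMILAR.get(target, [])
    with open(out_path, "w", encoding="utf-8") as out:
        for src in sources:  # CoNLL-U sentences are separated by blank lines
            out.write(Path(src).read_text(encoding="utf-8").rstrip() + "\n\n")
    return sources
```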
6. Improving hate speech detection using Cross-Lingual Learning.
- Author
-
Firmino, Anderson Almeida, de Souza Baptista, Cláudio, and de Paiva, Anselmo Cardoso
- Subjects
HATE speech, AUTOMATIC speech recognition, LANGUAGE models, NATURAL language processing, PORTUGUESE language, ITALIAN language
- Abstract
The growth of social media worldwide has brought social benefits as well as challenges. One problem we highlight is the proliferation of hate speech on social media. We propose a novel method for detecting hate speech in texts using Cross-Lingual Learning. Our approach uses transfer learning from Pre-Trained Language Models (PTLMs) with large available corpora to solve problems in languages with fewer resources for the specific task. The proposed methodology comprises four stages: corpora acquisition, PTLM definition, training strategies, and evaluation. We carried out experiments using Pre-Trained Language Models in English, Italian, and Portuguese (BERT and XLM-R) to verify which best suited the proposed method. We used corpora in English (WH) and Italian (Evalita 2018) as the source languages and the OffComBr-2 corpus in Portuguese as the target language. The results showed that the proposed methodology is promising: on the OffComBr-2 corpus, it obtained the best state-of-the-art result (F1-measure = 92%). (A sketch of the two-stage transfer recipe follows this entry.)
• The development of a new methodology for hate speech detection.
• Portuguese hate speech detection using Cross-Lingual Learning.
• Up to 20% performance improvement over other models using the OffComBr-2 corpus.
- Published
- 2024
- Full Text
- View/download PDF
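The methodology reads as a two-stage fine-tuning recipe: train a multilingual PTLM on a high-resource source corpus, then continue on the low-resource target. A minimal sketch under those assumptions, with dataset arguments standing in for pre-tokenized corpora and hyperparameters chosen for illustration only:

```python
# Hedged sketch of the cross-lingual transfer recipe: source-language
# fine-tuning followed by target-language fine-tuning of the same model.
from transformers import (AutoModelForSequenceClassification, Trainer,
                          TrainingArguments)

def finetune(model, dataset, out_dir, epochs=3):
    args = TrainingArguments(output_dir=out_dir, num_train_epochs=epochs)
    Trainer(model=model, args=args, train_dataset=dataset).train()
    return model

def cross_lingual_pipeline(source_ds, target_ds):
    """source_ds: e.g. English (WH) or Italian (Evalita 2018), tokenized;
    target_ds: Portuguese OffComBr-2, tokenized with the same tokenizer."""
    model = AutoModelForSequenceClassification.from_pretrained(
        "xlm-roberta-base", num_labels=2)        # offensive / not offensive
    model = finetune(model, source_ds, "runs/source")  # high-resource stage
    return finetune(model, target_ds, "runs/target")   # low-resource stage
```

Skipping the second stage gives the zero-shot variant, the usual baseline for judging how much the source-language stage transfers on its own.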
7. Radiology report classification for low-resource languages using machine learning: A case study on brain hemorrhage detection
- Author
-
Bayrak, Gıyaseddin and Ganiz, Murat Can (Marmara Üniversitesi, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği)
- Subjects
BERT, Machine Learning, Deep Learning, Cross-lingual Learning, Transfer Learning, Domain Adaptation, Radiology
Radiology reports play a vital role in the disease diagnosis and management process, and on certain occasions they may contain critical findings that require immediate action by the treating physician. Automating the detection of such cases using Artificial Intelligence, specifically Natural Language Processing models, can significantly speed up decision-making and potentially save lives. This thesis presents a novel study on detecting critical findings related to brain hemorrhage in Turkish radiology reports. We used approximately 30,000 labeled Brain Hemorrhage Computed Tomography (CT) reports to train supervised models, and around 190,000 reports for pre-training and fine-tuning word embeddings and language models in mono-lingual and cross-lingual settings. To the best of our knowledge, this is the first study to utilize Turkish radiology reports at this scale. Additionally, we demonstrate the impact on performance of adapting pre-trained language models and static embeddings to the domain, finding that fine-tuning on domain-specific data improves classification accuracy. (A sketch of the domain-adaptation step follows this entry.)
- Published
- 2023
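The adaptation step the abstract credits for the accuracy gain is continued masked-language-model pretraining on the unlabeled in-domain reports before the supervised stage. A minimal sketch of that step, not the thesis code: the Turkish BERT checkpoint named below is an assumption, and the dataset argument stands in for the ~190,000 tokenized unlabeled reports.

```python
# Hedged sketch: domain-adapt a pretrained encoder with masked-language-model
# training on unlabeled radiology reports, then reuse it for classification.
from transformers import (AutoModelForMaskedLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

def domain_adapt(unlabeled_ds, base="dbmdz/bert-base-turkish-cased"):
    tok = AutoTokenizer.from_pretrained(base)      # base checkpoint is assumed
    model = AutoModelForMaskedLM.from_pretrained(base)
    collator = DataCollatorForLanguageModeling(tok, mlm_probability=0.15)
    Trainer(model=model,
            args=TrainingArguments(output_dir="runs/domain-mlm",
                                   num_train_epochs=1),
            train_dataset=unlabeled_ds,            # unlabeled in-domain reports
            data_collator=collator).train()
    model.save_pretrained("adapted-encoder")       # stage 2 loads this
    return tok

# Stage 2 (not shown): load "adapted-encoder" with
# AutoModelForSequenceClassification and fine-tune on the labeled CT reports.
```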
8. Translation-Based Implicit Annotation Projection for Zero-Shot Cross-Lingual Event Argument Extraction
- Author
-
Lou, Chenwei, Gao, Jun, Yu, Changlong, Wang, Wei, Zhao, Huan, Tu, Weiwei, and Xu, Ruifeng
- Abstract
Zero-shot cross-lingual event argument extraction (EAE) is a challenging yet practical problem in Information Extraction. Most previous works rely heavily on external structured linguistic features, which are not easily accessible in real-world scenarios. This paper investigates a translation-based method to implicitly project annotations from the source language to the target language. With translation-based parallel corpora, no additional linguistic features are required during training and inference. As a result, the proposed approach is more cost-effective than previous work on zero-shot cross-lingual EAE. Moreover, our implicit annotation projection approach introduces less noise and is hence more effective and robust than explicit projection. Experimental results show that our model achieves the best performance, outperforming a number of competitive baselines. A thorough analysis further demonstrates the effectiveness of our model compared to explicit annotation projection approaches. (A sketch of the translation step follows this entry.)
- Published
- 2022
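The translation step is the heart of the method: source-language training sentences are machine-translated so that supervision travels with the parallel pair, with no word alignment or external linguistic features. A minimal, simplified sketch of that step; the Marian model name and language pair are illustrative, and the paper's full method also handles argument spans, which this sketch does not reconstruct.

```python
# Hedged sketch: machine-translate annotated source-language sentences into
# the target language; each translation inherits its source sentence's event
# annotation, yielding a pseudo-labelled target-language training set.
from transformers import MarianMTModel, MarianTokenizer

NAME = "Helsinki-NLP/opus-mt-en-zh"   # illustrative language pair
tok = MarianTokenizer.from_pretrained(NAME)
mt = MarianMTModel.from_pretrained(NAME)

def translate(sentences):
    batch = tok(sentences, return_tensors="pt", padding=True, truncation=True)
    generated = mt.generate(**batch)
    return [tok.decode(g, skip_special_tokens=True) for g in generated]

# The event annotation of the English sentence carries over implicitly.
print(translate(["The company hired three new engineers last week."]))
```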