Author: "Izabella Thomas" - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"Izabella Thomas"' showing total 11 results


1. Constitution d’un lexique terminologique trans-biomédical : expérimentations à partir de corpus, listes de vocabulaire et ressources spécialisées

Author: Izabella Thomas, Anastasia Galmiche, Centre de recherche en linguistique et traitement automatique des langues, Lucien Tesnière - UFC (EA 2283) (TESNIERE), Université de Franche-Comté (UFC), Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC), Centre de recherches interdisciplinaires et transculturelles - UFC (UR 3224) (CRIT), and Centre de recherches interdisciplinaires et transculturelles - UFC (EA 3224) (CRIT)
Subjects: [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing, 060201 languages & linguistics, 0602 languages and literature, 05 social sciences, 050301 education, 06 humanities and the arts, Sociology, [SHS.LANGUE]Humanities and Social Sciences/Linguistics, 16. Peace & justice, 0503 education, ComputingMilieux_MISCELLANEOUS
Abstract: International audience
Published: 2021

2. Modélisation du contexte des lexies spécialisées en vue de l’élaboration d’un système d’aide à la rédaction scientifique dans le domaine biomédical

Author: Anastasia Galmiche and Izabella Thomas
Subjects: Computer science, Argument, Scientific writing, Field (Bourdieu), Unified Medical Language System, General Earth and Planetary Sciences, Library science, Context (language use), Ontology (information science), Redaction, General Environmental Science, Focus (linguistics)
Abstract: In this paper we propose a modeling of contextual information around a terminological unit, for the needs of a scientific writing aid tool in the biomedical field. We focus more specifically on the modeling of significant sentence-level contexts, which we formalize as semantically characterized argument patterns. This modeling is based on a large corpus of biomedical scientific articles and relies on semantic types specified in a domain ontology, the Unified Medical Language System.
Published: 2018

3. Formalizing Natural Languages: Applications to Natural Language Processing and Digital Humanities : 15th International Conference, NooJ 2021, Besançon, France, June 9–11, 2021, Revised Selected Papers

Author: Magali Bigey, Annabel Richeton, Max Silberztein, and Izabella Thomas
Subjects: Natural language processing (Computer science), Computer engineering, Computer networks, Computer science, Data structures (Computer science), Information theory, Application software
Abstract: This book constitutes selected revised papers of the 15th International Conference, NooJ 2021, held in Besançon, France, in June 2021. Due to the COVID-19 pandemic the conference was held online. NooJ is a linguistic development environment that allows linguists to formalize several levels of linguistic phenomena. NooJ provides linguists with tools to develop dictionaries, regular grammars, context-free grammars, context-sensitive grammars and unrestricted grammars as well as their graphical equivalent to formalize each linguistic phenomenon. The 20 full papers presented were carefully reviewed and selected from 62 submissions. The papers are organized in the following topics: linguistic formalization and analysis, digital humanities and teaching, natural language processing applications.
Published: 2022

4. The image of the monolingual dictionary across Europe. Results of the European Survey of Dictionary use and Culture

Author: João Paulo Silvestre, Kristina Koppel, Lars Trap-Jensen, Maria Ribeiro Silveira, Henrik Lorentzen, Knut E. Karlsen, Elena Tamba, Tomislav Stojanov, Voula Giouli, Biljana Nikovska, Ildikó Pilán, Lidija Tanturovska, Stella Markantonatou, Iztok Kosem, Barbora Štěpánková, Kristina Štrkalj Despot, Špela Arhar Holdt, Dirk Geeraerts, Antton Gurrutxaga, Marius-Radu Clim, Oddrun Grønvik, Veronika Vodrážková, Martina Nied Curcio, Carole Tiberius, Sturla Berg-Olsen, Emma Sköldberg, Toma Tasovac, Chris Mulhall, Tanara Zingano Kuhn, Elena Volodina, Jelena Kallas, Amelie Dorn, Louise Holmer, Maria Tuulik, Monika Biesaga, Michal Škrabal, Robert Lew, Andrea Abel, Margit Langemets, Izabella Thomas, Yifat Ben-Moshe, Nikola Ljubešić, Tarja Heinonen, Carlos Valcárcel Riveiro, Ilan Kernerman, Madalin-Ionel Patrascu, Tsvi Sadan, Sascha Wolfer, María José Domínguez Vázquez, Elixabete Etxeberria, Marit Hovdenak, Klara Ceberio, Gabriela Haja, Marie-Aude Lefer, Snežana Petrović, Hilary Nesi, Christian-Emil Ore, Tinatin Margilitadze, Carolin Müller-Spitzer, Nied, MARTINA LUCIA, University of Ljubljana, Centre de recherche en linguistique et traitement automatique des langues, Lucien Tesnière - UFC (EA 2283) (TESNIERE), Université de Franche-Comté (UFC), Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC), Centre de recherches interdisciplinaires et transculturelles - UFC (UR 3224) (CRIT), Centre de recherches interdisciplinaires et transculturelles - UFC (EA 3224) (CRIT), UCL - SSH/ILC/PLIN - Pôle de recherche en linguistique, and USL-B - Séminaire des Sciences du Langage (SeSLa)
Subjects: dictionary use, e-lexicography, monolingual dictionary, 050101 languages & linguistics, 4. Education, 05 social sciences, Section (typography), 050301 education, Library science, Language and Linguistics, Image (mathematics), [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing, User group, 0501 psychology and cognitive sciences, National level, Sociology, [SHS.LANGUE]Humanities and Social Sciences/Linguistics, 0503 education, survey, monolingual dictionaries, ComputingMilieux_MISCELLANEOUS
Abstract: The article presents the results of a survey on dictionary use in Europe, focusing on general monolingual dictionaries. The survey is the broadest survey of dictionary use to date, covering close to 10,000 dictionary users (and non-users) in nearly thirty countries. Our survey covers varied user groups, going beyond the students and translators who have tended to dominate such studies thus far. The survey was delivered via an online survey platform, in language versions specific to each target country. It was completed by 9,562 respondents, over 300 respondents per country on average. The survey consisted of the general section, which was translated and presented to all participants, as well as country-specific sections for a subset of 11 countries, which were drafted by collaborators at the national level. The present report covers the general section.
Published: 2019

5. Controlled Language and Information on Vaccines: Application to Package Inserts

Author: Izabella Thomas, Dominique A. Vuitton, Valerie de Grivel, J. Renahy, Sylviane Cardey, and Barbara Rath
Subjects: Vocabulary, Package insert, Attitude of Health Personnel, Computer science, Writing, Toxicology, Lexicon, Risk Assessment, World Wide Web, Consistency (database systems), Patient Education as Topic, Risk Factors, Passive voice, Humans, Pharmacology (medical), Drug Labeling, Pharmacology, Vaccines, Syntax (programming languages), Grammar, Vaccination, Ambiguity, Protective Factors, Health Literacy, Health Communication, Controlled Vocabulary, Pamphlets, Patient Safety, Data mining, Comprehension
Abstract: Any ambiguity in texts used in the communication about vaccines can not only interfere with comprehension, but also generate safety and liability issues. Within a survey on the quality of written protocols for at-risk interventional procedures and sanitary crises, we analyzed documents relating to vaccination, and among them, the "package-leaflet" of an anti-H1N1 influenza vaccine, widely disseminated to the public in 2009-2010. Among the most common mistakes, we observed that 1) language was not always adjusted to the non-specialist's level of knowledge; 2) chronology, logic, consistency, and homogeneity were often missing; 3) crucial pieces of information were scattered throughout the text; 4) use of the passive voice did not distinguish between instructions and information; 5) use of synonyms could be misleading and impair translation. We propose the use of "Controlled language" (CL) to improve the situation. By constraining lexicon, grammar and syntax, CL is a way to write documents that are clear, accurate and devoid of ambiguity. However, the set of rules necessary to write in CL is difficult to memorize. We thus developed authoring software (Rédacticiel Prolipsia) to make the creation of a CL by linguists and its use by health professionals easy and adaptable to any domain. It may considerably improve the writing of vaccine package inserts/leaflets. It could be used to write information documents about vaccines and their safety, and operating procedures for professionals to prepare, store, and administer vaccines, decide upon proper indication of vaccines, and follow patients after vaccine injection.
Published: 2015

6. English-Vietnamese Machine Translation of Proper Names

Author: Thi Thanh Thao Phan and Izabella Thomas
Subjects: Machine translation, Computer science, Vietnamese, Limiting, Order (business), Proper noun, Quality (business), Artificial intelligence, Natural language processing
Abstract: This paper presents some problems involved in the machine translation of proper names (PNs) from English into Vietnamese. Based on the building of an English-Vietnamese parallel corpus of texts with numerous PNs extracted from online BBC News and translated by four machine translation (MT) systems, we implement the PN error classification and analysis. Some pre-processing solutions for reducing and limiting errors are also proposed and tested with a manually annotated corpus in order to significantly improve the MT quality.
Published: 2012

7. La «langue contrôlée» et l’informatisation de son utilisation au service de la qualité des textes médicaux et de la sécurité dans le domaine de la santé

Author: J. Renahy, Barbara Rath, Sylviane Cardey, Izabella Thomas, Dominique A. Vuitton, Xavier Petiaux, Grégory Chippeaux, Bérenger Germain, and Valerie de Grivel
Subjects: Health professionals, Health care, Library science, Medicine, Humanities
Abstract: Increasing attention is being paid to safety considerations as well as to access to and reliability of information. However, given the complexity and ambiguity of language, it is unfortunate that the quality of the information itself still finds little echo among the current concerns. This article presents the research carried out in the “LiSe” project: linguistic research that relied on the analysis of healthcare protocols and a collaboration between health professionals and linguists, resulting in the development of a Controlled Language (CL) and a computerized writing assistant. Feedback on both the CL and the software is very encouraging: healthcare professionals readily admit that texts written in CL are easier to understand and that the CL, together with the computerized writing assistant, should greatly reduce the risk of misunderstandings, both by professionals and by the general public.
Published: 2011

8. Computerization of a ‘Controlled Language’ to Write Medical Standard Operating Procedures (SOPs)

Author: Estelle Seilles, J. Renahy, Izabella Thomas, Dominique A. Vuitton, Marie-Laure Betbeder, Blandine Plaisantin-Alecu, Lucie Laroche, and Oleg Blagosklonov
Subjects: Quality management, Computer science, Cohesion (computer science), Domain (software engineering), Patient safety, Software, Quality (business), computer-aided writing, General Environmental Science, Accreditation, standard operating procedures, automated translation, Database, Health professionals, linguistics, Test (assessment), General Earth and Planetary Sciences, authoring software, Software engineering, controlled language
Abstract: Accreditation of hospitals includes items regarding the existence of Standard Operating Procedures (SOPs), but these documents can be sources of misunderstanding, and patient safety may be jeopardized. We proposed a solution based on the Controlled Language (CL) concept and developed software services to make CL user-friendly to writers. We carried out: 1) deep linguistic analysis of SOP corpora in two medical domains; 2) language modelling to establish two adapted CLs; 3) improvement of home-made CL Authoring Software by developing software modules and a collaborative corpus-based web-accessible platform for the building of terminological and non-terminological resources; 4) evaluation through focus groups and computer-aided CL-writing test sessions. Health professionals and linguists cooperated closely in a field that is quite new to the health domain. The optimized Prolipsia CL Authoring Software appeared to be a good compromise between users’ needs and CL requirements. All actors agreed that benefits would be gained by using the proposed tools, in terms of patient safety and of work organization, institutional cohesion, and decreased liabilities. They also suggested that software solutions able to analyse the quality of existing texts and help correct them would better fit the situation of institutions that already have a large corpus of (unsatisfactory) texts at their disposal. Such software is currently at an advanced stage of development, with a first version available.

9. Langues et cultures, systèmes et traduction

Author: Sombat Khruathong, Eun Soon Yu, Rosita Chan, Sylviane Cardey, Gina Melian, Helena Morgadinho, Farouk Bouhadiba, A. Dziadkiewicz, Yves Gentilhomme, Igor Skouratov, Hsiang-I Lin, Izabella Thomas, Duygu Can, Gabriel Sekunda, Kyoko Kuroda, Naga Anuradha Chintalapudi, Xiaohong Wu, and Valentine Grosjean
Subjects: Linguistics and Language, Social Sciences and Humanities, traduction multilingue, dictionnaires, Sciences Humaines et Sociales, segments, parties du discours, Language and Linguistics, phraséologie
Abstract: In this paper we compare languages having the same origin and others with different roots so as to demonstrate what they have in common and how they differ for the purpose of machine translation. In doing so, we revisit the notions of ‘word’ and ‘part of speech’. These comparisons demonstrate how views of the world differ across civilisations, and how they affect the structure of languages (compounds, idioms, and proverbs serve our demonstration). Arabic, Chinese, French, Italian, Japanese, Korean, Polish, Portuguese, Romanian, Russian, Sanskrit, Spanish, Thai and Turkish form the basis of our studies. These comparisons aim to show through examples how dictionaries should be organised and how to obtain acceptable machine translations.

10. Formalizing Natural Languages: Applications to Natural Language Processing and Digital Humanities - 15th International Conference, NooJ 2021, Besançon, France, June 9-11, 2021, Revised Selected Papers

Author: Magali Bigey, Annabel Richeton, Max Silberztein, and Izabella Thomas
Published: 2021

11. Machine translation of proper names from english and french into vietnamese : an error analysis and some proposed solutions

Author: Phan Thi Thanh, Thao, Centre de recherche en linguistique et traitement automatique des langues, Lucien Tesnière - UFC (EA 2283) (TESNIERE), Université de Franche-Comté (UFC), Université Bourgogne Franche-Comté [COMUE] (UBFC)-Université Bourgogne Franche-Comté [COMUE] (UBFC), Université de Franche-Comté, Sylviane Cardey-Greenfield, Lê An Hà, Izabella Thomas, and STAR, ABES
Subjects: Machine translation quality, English-Vietnamese, French-Vietnamese, Parallel corpus, Pre-processing, Proper name, Translation error, [SHS.LANGUE]Humanities and Social Sciences/Linguistics
Abstract: Machine translation (MT) has increasingly become an indispensable tool for decoding the meaning of a text from a source language into a target language in our current information and knowledge era. In particular, MT of proper names (PN) plays a crucial role in providing the specific and precise identification of persons, places, organizations, and artefacts across languages. Despite a large number of studies and significant achievements in named entity recognition in the NLP community around the world, there has been almost no research on PNMT for the Vietnamese language. Due to the different features of PN writing, transliteration or transcription, and translation from a variety of languages including English, French, Russian, Chinese, etc. into Vietnamese, PNMT from those languages into Vietnamese is still a challenging and problematic issue. This study focuses on the problems of English-Vietnamese and French-Vietnamese PNMT arising from current MT engines. First, it proposes a corpus-based PN classification, then a detailed PNMT error analysis, and concludes with some pre-processing solutions to improve MT quality. Through the analysis and classification of PNMT errors from the two English-Vietnamese and French-Vietnamese parallel corpora of texts with PNs, we propose solutions concerning two major issues: (1) corpus annotation for preparing the pre-processing databases, and (2) design of the pre-processing program to be used on annotated corpora to reduce PNMT errors and enhance the quality of MT systems, including Google, Vietgle, Bing and EVTran. The efficacy of different annotation methods for English and French corpora of PNs and the results of PNMT errors before and after using the pre-processing program on the two annotated corpora are compared and discussed in this study. They show that the pre-processing solution significantly reduces PNMT errors and contributes to the improvement of MT systems for the Vietnamese language.
Published: 2014
