36 results on '"Martínez Garcia, Eva"'
Search Results
2. Influence of inertia and aspect ratio on the torsional galloping of single-axis solar trackers
- Author
-
Martínez-García, Eva, Blanco-Marigorta, Eduardo, Parrondo Gayo, Jorge, and Navarro-Manso, Antonio
- Published
- 2021
- Full Text
- View/download PDF
3. The Proto-oncometabolite Fumarate Binds Glutathione to Amplify ROS-Dependent Signaling
- Author
-
Sullivan, Lucas B., Martinez-Garcia, Eva, Nguyen, Hien, Mullen, Andrew R., Dufour, Eric, Sudarshan, Sunil, Licht, Jonathan D., Deberardinis, Ralph J., and Chandel, Navdeep S.
- Published
- 2013
- Full Text
- View/download PDF
4. EZH2 Is Required for Germinal Center Formation and Somatic EZH2 Mutations Promote Lymphoid Transformation
- Author
-
Béguelin, Wendy, Popovic, Relja, Teater, Matt, Jiang, Yanwen, Bunting, Karen L., Rosen, Monica, Shen, Hao, Yang, Shao Ning, Wang, Ling, Ezponda, Teresa, Martinez-Garcia, Eva, Zhang, Haikuo, Zheng, Yupeng, Verma, Sharad K., McCabe, Michael T., Ott, Heidi M., Van Aller, Glenn S., Kruger, Ryan G., Liu, Yan, McHugh, Charles F., Scott, David W., Chung, Young Rock, Kelleher, Neil, Shaknovich, Rita, Creasy, Caretha L., Gascoyne, Randy D., Wong, Kwok-Kin, Cerchietti, Leandro, Levine, Ross L., Abdel-Wahab, Omar, Licht, Jonathan D., Elemento, Olivier, and Melnick, Ari M.
- Published
- 2013
- Full Text
- View/download PDF
5. Total kinetic analysis reveals how combinatorial methylation patterns are established on lysines 27 and 36 of histone H3
- Author
-
Zheng, Yupeng, Sweet, Steve M. M., Popovic, Relja, Martinez-Garcia, Eva, Tipton, Jeremiah D., Thomas, Paul M., Licht, Jonathan D., and Kelleher, Neil L.
- Published
- 2012
6. The MMSET histone methyl transferase switches global histone methylation and alters gene expression in t(4;14) multiple myeloma cells
- Author
-
Martinez-Garcia, Eva, Popovic, Relja, Min, Dong-Joon, Sweet, Steve M.M., Thomas, Paul M., Zamdborg, Leonid, Heffner, Aaron, Will, Christine, Lamy, Laurence, Staudt, Louis M., Levens, David L., Kelleher, Neil L., and Licht, Jonathan D.
- Published
- 2011
- Full Text
- View/download PDF
7. Document-Level Machine Translation – Ensuring Translational Consistency of Non-Local Phenomena
- Author
-
Martínez Garcia, Eva
- Subjects
Traducción Automática, Machine Translation, Document-level, Context-aware translation, Lenguajes y Sistemas Informáticos, Traducción con contexto, Nivel de documento
- Abstract
PhD thesis written by Eva Martínez Garcia under the supervision of Dr. Cristina España-Bonet and Dr. Lluís Màrquez. The thesis was defended at the Universitat Politècnica de Catalunya in Barcelona on 19 December 2019. The doctoral committee comprised Dr. Kepa Sarasola (President, University of the Basque Country (UPV/EHU)), Dr. Marta Ruiz Costa-Jussà (Universitat Politècnica de Catalunya (UPC)) and Dr. Sara Stymne (Uppsala Universitet). The thesis was awarded an excellent grade and international mention. The thesis work was partially supported by an FPI 2010 grant from the Spanish Ministry of Science and Innovation (MICINN) within the OpenMT-2 project (ref. TIN2009-14675-C03-03) of MICINN, a mobility EEBB 2013 grant from the Spanish Ministry of Economy and Competitiveness (MINECO) for a stay at the Department of Linguistics and Philology at Uppsala University, and by the TACARDI project (ref. TIN2012-38523-C02-02) of MINECO.
- Published
- 2021
8. Recurrent exposure to nicotine differentiates human bronchial epithelial cells via epidermal growth factor receptor activation
- Author
-
Martínez-García, Eva, Irigoyen, Marta, Ansó, Elena, Martínez-Irujo, Juan José, and Rouzaut, Ana
- Published
- 2008
- Full Text
- View/download PDF
9. A light method for data generation: a combination of Markov Chains and Word Embeddings
- Author
-
Martínez Garcia, Eva, Nogales, Alberto, Morales Escudero, Javier, and Garcia-Tejedor, Álvaro J.
- Abstract
Most of the current state-of-the-art Natural Language Processing (NLP) techniques are highly data-dependent. A significant amount of data is required for their training, and in some scenarios data is scarce. We present a hybrid method to generate new sentences for augmenting the training data. Our approach takes advantage of the combination of Markov Chains and word embeddings to produce high-quality data similar to an initial dataset. In contrast to other neural-based generative methods, it does not need a large amount of training data. Results show how our approach can generate useful data for NLP tools. In particular, we validate our approach by building Transformer-based Language Models using data from three different domains in the context of enriching general purpose chatbots.
- Published
- 2020
10. Document-level machine translation : ensuring translational consistency of non-local phenomena
- Author
-
Martínez Garcia, Eva, Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, España i Bonet, Cristina, Màrquez, Lluís, and Universitat Politècnica de Catalunya. Departament de Llenguatges i Sistemes Informàtics
- Subjects
Informàtica [Àrees temàtiques de la UPC]
- Abstract
In this thesis, we study the automatic translation of documents by taking into account cross-sentence phenomena. This document-level information is typically ignored by most of the standard state-of-the-art Machine Translation (MT) systems, which focus on translating texts processing each of their sentences in isolation. Translating each sentence without looking at its surrounding context can lead to certain types of translation errors, such as inconsistent translations for the same word or for elements in a coreference chain. We introduce methods to attend to document-level phenomena in order to avoid those errors, and thus, reach translations that properly convey the original meaning. Our research starts by identifying the translation errors related to such document-level phenomena that commonly appear in the output of state-of-the-art Statistical Machine Translation (SMT) systems. For two of those errors, namely inconsistent word translations as well as gender and number disagreements among words, we design simple and yet effective post-processing techniques to tackle and correct them. Since these techniques are applied a posteriori, they can access the whole source and target documents, and hence, they are able to perform a global analysis and improve the coherence and consistency of the translation. Nevertheless, since following such a two-pass decoding strategy is not optimal in terms of efficiency, we also focus on introducing the context-awareness during the decoding process itself. To this end, we enhance a document-oriented SMT system with distributional semantic information in the form of bilingual and monolingual word embeddings. In particular, these embeddings are used as Semantic Space Language Models (SSLMs) and as a novel feature function. 
The goal of the former is to promote word translations that are semantically close to their preceding context, whereas the latter promotes the lexical choice that is closest to its surrounding context, for those words that have varying translations throughout the document. In both cases, the context extends beyond sentence boundaries. Recently, the MT community has transitioned to the neural paradigm. The final step of our research proposes an extension of the decoding process for a Neural Machine Translation (NMT) framework, independent of the model architecture, by shallow fusing the information from a neural translation model and the context semantics enclosed in the previously studied SSLMs. The aim of this modification is to introduce the benefits of context information also into the decoding process of NMT systems, as well as to obtain an additional validation for the techniques we explored. The automatic evaluation of our approaches does not reflect significant variations. This is expected since most automatic metrics are neither context- nor semantic-aware and because the phenomena we tackle are rare, leading to few modifications with respect to the baseline translations. On the other hand, manual evaluations demonstrate the positive impact of our approaches since human evaluators tend to prefer the translations produced by our document-aware systems. Therefore, the changes introduced by our enhanced systems are important since they are related to how humans perceive translation quality for long texts.
- Published
- 2019
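The abstract above mentions shallow fusion of a neural translation model with context semantics, but it does not give the scoring formula. The following is a minimal sketch, assuming the common log-linear form of shallow fusion, in which each candidate's translation-model log-probability is added to a weighted context-model log-probability. The function names, the `weight` and `floor` parameters, and the toy probabilities are all hypothetical and are not taken from the thesis.

```python
import math


def fuse_scores(nmt_log_probs, context_log_probs, weight=0.3, floor=1e-9):
    """Combine per-token log-probabilities from two models (illustrative only).

    nmt_log_probs: dict mapping candidate token -> log-probability from the
        translation model.
    context_log_probs: dict mapping candidate token -> log-probability from a
        context-aware model (a stand-in for an SSLM score).
    weight: hypothetical interpolation weight for the context model.
    floor: probability assigned to tokens the context model has not scored.
    """
    fused = {}
    for token, nmt_lp in nmt_log_probs.items():
        ctx_lp = context_log_probs.get(token, math.log(floor))
        fused[token] = nmt_lp + weight * ctx_lp
    return fused


def pick_best(fused):
    """Return the candidate with the highest fused score."""
    return max(fused, key=fused.get)


# Toy candidates: the translation model slightly prefers "bank",
# while the document context favours "shore".
nmt = {"bank": math.log(0.5), "shore": math.log(0.4), "bench": math.log(0.1)}
context = {"bank": math.log(0.1), "shore": math.log(0.8), "bench": math.log(0.1)}

baseline_choice = pick_best(fuse_scores(nmt, context, weight=0.0))
fused_choice = pick_best(fuse_scores(nmt, context, weight=0.3))
```

With `weight=0.0` the translation model's own preference is kept; with a positive weight the context scores can change the selected candidate. In a real decoder this combination would be applied at every beam-search step rather than to a single token.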
11. Document-level machine translation : ensuring translational consistency of non-local phenomena
- Author
-
Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, España i Bonet, Cristina, Màrquez, Lluís, and Martínez Garcia, Eva
- Abstract
In this thesis, we study the automatic translation of documents by taking into account cross-sentence phenomena. This document-level information is typically ignored by most of the standard state-of-the-art Machine Translation (MT) systems, which focus on translating texts processing each of their sentences in isolation. Translating each sentence without looking at its surrounding context can lead to certain types of translation errors, such as inconsistent translations for the same word or for elements in a coreference chain. We introduce methods to attend to document-level phenomena in order to avoid those errors, and thus, reach translations that properly convey the original meaning. Our research starts by identifying the translation errors related to such document-level phenomena that commonly appear in the output of state-of-the-art Statistical Machine Translation (SMT) systems. For two of those errors, namely inconsistent word translations as well as gender and number disagreements among words, we design simple and yet effective post-processing techniques to tackle and correct them. Since these techniques are applied a posteriori, they can access the whole source and target documents, and hence, they are able to perform a global analysis and improve the coherence and consistency of the translation. Nevertheless, since following such a two-pass decoding strategy is not optimal in terms of efficiency, we also focus on introducing the context-awareness during the decoding process itself. To this end, we enhance a document-oriented SMT system with distributional semantic information in the form of bilingual and monolingual word embeddings. In particular, these embeddings are used as Semantic Space Language Models (SSLMs) and as a novel feature function. 
The goal of the former is to promote word translations that are semantically close to their preceding context, whereas the latter promotes the lexical choice that is closest to its surrounding context, for those wo, Postprint (published version)
- Published
- 2019
12. Neural Machine Translation of Basque
- Author
-
Etchegoyhen, Thierry, Martínez Garcia, Eva, Azpeitia, Andoni, Labaka Intxauspe, Gorka, Alegría Loinaz, Iñaki, Cortés Etxabe, Itziar, Jauregi Carrera, Amaia, Ellakuria, Igor, Martin, Maite, and Calonge, Eusebi
- Subjects
Machine Translation, Lenguajes y Sistemas Informáticos
- Abstract
We describe the first experimental results in neural machine translation for Basque. As a synthetic language featuring agglutinative morphology, an extended case system, complex verbal morphology and relatively free word order, Basque presents a large number of challenging characteristics for machine translation in general, and for data-driven approaches such as attention-based encoder-decoder models in particular. We present our results on a large range of experiments in Basque-Spanish translation, comparing several neural machine translation system variants with both rule-based and statistical machine translation systems. We demonstrate that significant gains can be obtained with a neural network approach for this challenging language pair, and describe optimal configurations in terms of word segmentation and decoding parameters, measured against test sets that feature multiple references to account for word order variability. This work was supported by the Department of Economic Development and Competitiveness of the Basque Government via the MODELA project.
- Published
- 2018
13. QUALES: Machine Translation Quality Estimation via Supervised and Unsupervised Machine Learning
- Author
-
Etchegoyhen, Thierry, Martínez Garcia, Eva, Azpeitia, Andoni, Alegría Loinaz, Iñaki, Labaka Intxauspe, Gorka, Otegi, Arantza, Sarasola Gabiola, Kepa, Cortés Etxabe, Itziar, Jauregi Carrera, Amaia, Ellakuria, Igor, Calonge, Eusebi, and Martin, Maite
- Subjects
Machine Learning, Machine Translation, Estimación de calidad, Lenguajes y Sistemas Informáticos, Traducción automática, Quality Estimation, Aprendizaje automático
- Abstract
The automatic quality estimation (QE) of machine translation consists in measuring the quality of translations without access to human references, usually via machine learning approaches. A good QE system can help in three aspects of translation processes involving machine translation and post-editing: increasing productivity (by ruling out poor quality machine translation), estimating costs (by helping to forecast the cost of post-editing) and selecting a provider (if several machine translation systems are available). Interest in this research area has grown significantly in recent years, leading to regular shared tasks in the main machine translation conferences and intense scientific activity. In this article we review the state of the art in this research area and present the QUALES project, which is under development.
- Published
- 2018
14. Context-Aware Neural Machine Translation Decoding
- Author
-
Martínez Garcia, Eva, Creus, Carles, and España-Bonet, Cristina
- Published
- 2019
- Full Text
- View/download PDF
15. ELRI. European Language Resource Infrastructure
- Author
-
Etchegoyhen, Thierry, Anza Porras, Borja, Azpeitia, Andoni, Martínez Garcia, Eva, Vale, Paulo, Fonseca, José Luis, Lynn, Teresa, Dunne, Jane, Gaspari, Federico, Way, Andy, Arranz, Victoria, Choukri, Khalid, Popescu, Vladimir, Neiva, Pedro, Neto, Rui, Melero, Maite, Perez, David, Branco, António, Branco, Ruben, and Gomes, Luís
- Abstract
We describe the European Language Resources Infrastructure project, whose main aim is the provision of an infrastructure to help collect, prepare and share language resources that can in turn improve translation services in Europe.
- Published
- 2018
16. QUALES: Estimación Automática de Calidad de Traducción Mediante Aprendizaje Automático Supervisado y No-Supervisado
- Author
-
Etchegoyhen, Thierry, Martínez Garcia, Eva, Azpeitia, Andoni, Alegría Loinaz, Iñaki, Labaka Intxauspe, Gorka, Otegi, Arantza, Sarasola Gabiola, Kepa, Cortés Etxabe, Itziar, Jauregi Carrera, Amaia, Ellakuria, Igor, Calonge, Eusebi, and Martin, Maite
- Abstract
The automatic quality estimation (QE) of machine translation consists in measuring the quality of translations without access to human references, usually via machine learning approaches. A good QE system can help in three aspects of translation processes involving machine translation and post-editing: increasing productivity (by ruling out poor quality machine translation), estimating costs (by helping to forecast the cost of post-editing) and selecting a provider (if several machine translation systems are available). Interest in this research area has grown significantly in recent years, leading to regular shared tasks in the main machine translation conferences and intense scientific activity. In this article we review the state of the art in this research area and present the QUALES project, which is under development.
- Published
- 2018
17. Cross-lingual semantic annotation of biomedical literature: experiments in Spanish and English.
- Author
-
Perez, Naiara, Accuosto, Pablo, Bravo, Àlex, Cuadros, Montse, Martínez-Garcia, Eva, Saggion, Horacio, and Rigau, German
- Subjects
NATURAL language processing, SPANISH literature, ANNOTATIONS, LINGUISTIC analysis, SPANISH language
- Abstract
Motivation: Biomedical literature is one of the most relevant sources of information for knowledge mining in the field of Bioinformatics. Although English is the most widely addressed language in the field, in recent years there has been a growing interest from the natural language processing community in dealing with languages other than English. However, the availability of language resources and tools for appropriate treatment of non-English texts is lagging behind. Our research is concerned with the semantic annotation of biomedical texts in the Spanish language, which can be considered an under-resourced language where biomedical text processing is concerned. Results: We have carried out experiments to assess the effectiveness of several methods for the automatic annotation of biomedical texts in Spanish. One approach is based on the linguistic analysis of Spanish texts and their annotation using an information retrieval and concept disambiguation approach. A second method takes advantage of a Spanish–English machine translation process to annotate English documents and transfer annotations back to Spanish. A third method takes advantage of the combination of both procedures. Our evaluation shows that a combined system has competitive advantages over the two individual procedures. Availability and implementation: UMLS Mapper (https://snlt.vicomtech.org/umlsmapper) and the annotation transfer tool (http://scientmin.taln.upf.edu/anntransfer/) are freely available for research purposes as web services and/or demos. Supplementary information: Supplementary data are available at Bioinformatics online.
- Published
- 2020
- Full Text
- View/download PDF
18. STACC, OOV Density and N-gram Saturation: Vicomtech’s Participation in the WMT 2018 Shared Task on Parallel Corpus Filtering
- Author
-
Azpeitia, Andoni, Etchegoyhen, Thierry, and Martínez Garcia, Eva
- Published
- 2018
- Full Text
- View/download PDF
19. Supervised and Unsupervised Minimalist Quality Estimators: Vicomtech’s Participation in the WMT 2018 Quality Estimation Task
- Author
-
Etchegoyhen, Thierry, Martínez Garcia, Eva, and Azpeitia, Andoni
- Published
- 2018
- Full Text
- View/download PDF
20. Weighted Set-Theoretic Alignment of Comparable Sentences
- Author
-
Azpeitia, Andoni, Etchegoyhen, Thierry, and Martínez Garcia, Eva
- Published
- 2017
- Full Text
- View/download PDF
21. The UPC TweetMT participation : translating formal tweets using context information
- Author
-
Martínez Garcia, Eva, España Bonet, Cristina, Márquez Villodre, Luís, Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, and Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural
- Subjects
Context aware translation, Twitter, Traducció automàtica, Machine translation, Informàtica::Intel·ligència artificial::Llenguatge natural [Àrees temàtiques de la UPC], Machine translating
- Abstract
In this paper, we describe the UPC systems that participated in the TweetMT shared task. We developed two main systems that were applied to the Spanish-Catalan language pair: a state-of-the-art phrase-based statistical machine translation system and a context-aware system. In the second approach, we define the
- Published
- 2015
22. Document-level machine translation with word vector models
- Author
-
Martínez Garcia, Eva, España Bonet, Cristina, Márquez Villodre, Luís, Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, and Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural
- Subjects
Traducció automàtica, ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, Machine translation, Informàtica::Intel·ligència artificial::Llenguatge natural [Àrees temàtiques de la UPC]
- Abstract
In this paper we apply distributional semantic information to document-level machine translation. We train monolingual and bilingual word vector models on large corpora and we evaluate them first in a cross-lingual lexical substitution task and then on the final translation task. For translation, we incorporate the semantic information in a statistical document-level decoder (Docent), by enforcing translation choices that are semantically similar to the context. As expected, the bilingual word vector models are more appropriate for the purpose of translation. The final document-level translator incorporating the semantic model outperforms the basic Docent (without semantics) and also performs slightly better than a standard sentence-level SMT system in terms of ULC (the average of a set of standard automatic evaluation metrics for MT). Finally, we also present some manual analysis of the translations of some concrete documents.
- Published
- 2015
23. Experiments on document level machine translation
- Author
-
Martínez Garcia, Eva, España Bonet, Cristina, Márquez Villodre, Luís, Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, and Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural
- Subjects
Traducció automàtica, Informàtica::Intel·ligència artificial [Àrees temàtiques de la UPC], Machine translating
- Abstract
Most current SMT systems work at the sentence level. They translate a text assuming that sentences are independent, but in a well-formed document there are clearly many inter-sentence relations. Much contextual information is therefore lost when sentences are translated independently. We want to improve translation coherence and cohesion using document-level information, so we are interested in developing new strategies that take advantage of context information. For example, we approach this challenge by developing post-processes that try to fix a first translation obtained by an SMT system. We are also interested in taking advantage of the document-level translation framework provided by the Docent decoder to implement and test some of our ideas. An analogous problem arises with automatic MT evaluation metrics: most of them are designed at the sentence level, so they do not capture improvements in lexical cohesion and coherence or in discourse structure. However, we leave this topic for future work.
- Published
- 2014
24. Overview of TweetMT : a shared task on machine translation of tweets at SEPLN 2015
- Author
-
Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural, Alegria, Iñaki, Aranberri, Nora, España Bonet, Cristina, Gamallo, Pablo, Gonçalo Oliveira, Hugo, Martínez Garcia, Eva, San Vicente Roncal, Iñaki, Toral, Antonio, and Zubiaga, Arkaitz
- Abstract
This article presents an overview of the shared task that took place as part of the TweetMT workshop held at SEPLN 2015. The task consisted of translating collections of tweets from and to several languages. The article outlines the data collection and annotation process, the development and evaluation of the shared task, and the results achieved by the participants. Peer Reviewed, Postprint (author’s final draft)
- Published
- 2015
25. The UPC TweetMT participation : translating formal tweets using context information
- Author
-
Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural, Martínez Garcia, Eva, España Bonet, Cristina, and Màrquez Villodre, Lluís
- Abstract
In this paper, we describe the UPC systems that participated in the TweetMT shared task. We developed two main systems that were applied to the Spanish-Catalan language pair: a state-of-the-art phrase-based statistical machine translation system and a context-aware system. In the second approach, we define the, Peer Reviewed, Postprint (author’s final draft)
- Published
- 2015
26. Document-level machine translation with word vector models
- Author
-
Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural, Martínez Garcia, Eva, España Bonet, Cristina, and Màrquez Villodre, Lluís
- Abstract
In this paper we apply distributional semantic information to document-level machine translation. We train monolingual and bilingual word vector models on large corpora and we evaluate them first in a cross-lingual lexical substitution task and then on the final translation task. For translation, we incorporate the semantic information in a statistical document-level decoder (Docent), by enforcing translation choices that are semantically similar to the context. As expected, the bilingual word vector models are more appropriate for the purpose of translation. The final document-level translator incorporating the semantic model outperforms the basic Docent (without semantics) and also slightly outperforms a standard sentence-level SMT system in terms of ULC (the average of a set of standard automatic evaluation metrics for MT). Finally, we also present a manual analysis of the translations of some specific documents. Peer Reviewed, Postprint (published version)
- Published
- 2015
27. Document-Level Machine Translation as a Re-translation Process
- Author
-
Martínez Garcia, Eva, España Bonet, Cristina, and Màrquez Villodre, Lluís
- Abstract
Most current Machine Translation systems are designed to translate a document sentence by sentence, ignoring discourse information and producing incoherences in the final translations. In this paper we present some document-level post-processes to improve the coherence and consistency of translations. Incoherences are detected and new partial translations are proposed. The work focuses on two phenomena: words with inconsistent translations throughout a text, and gender and number agreement among words. Since we deal with specific phenomena, an automatic evaluation does not reflect significant variations in the translations. However, improvements are observed through a manual evaluation.
- Published
- 2014
28. Word's vector representations meet machine translation
- Author
-
Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural, Martínez Garcia, Eva, España Bonet, Cristina, Tiedemann, Jörg, and Màrquez Villodre, Lluís
- Abstract
Distributed vector representations of words are useful in various NLP tasks. We briefly review the CBOW approach and propose a bilingual application of this architecture with the aim of improving the consistency and coherence of Machine Translation. The primary goal of the bilingual extension is to handle ambiguous words for which the different senses are conflated in the monolingual setup. Peer Reviewed, Postprint (published version)
- Published
- 2014
29. Document-level machine translation as a re-translation process
- Author
-
Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural, Martínez Garcia, Eva, España Bonet, Cristina, and Màrquez Villodre, Lluís
- Abstract
Most current Machine Translation systems are designed to translate a document sentence by sentence, ignoring discourse information and producing incoherences in the final translations. In this paper we present some document-level post-processes to improve the coherence and consistency of translations. Incoherences are detected and new partial translations are proposed. The work focuses on two phenomena: words with inconsistent translations throughout a text, and gender and number agreement among words. Since we deal with specific phenomena, an automatic evaluation does not reflect significant variations in the translations. However, improvements are observed through a manual evaluation. Peer Reviewed, Postprint (published version)
- Published
- 2014
30. Experiments on document level machine translation
- Author
-
Universitat Politècnica de Catalunya. Departament de Ciències de la Computació, Universitat Politècnica de Catalunya. GPLN - Grup de Processament del Llenguatge Natural, Martínez Garcia, Eva, España Bonet, Cristina, and Màrquez Villodre, Lluís
- Abstract
Most current SMT systems work at sentence level. They translate a text assuming that sentences are independent, but when one looks at a well-formed document, it is clear that there are many inter-sentence relations. There is much contextual information that, unfortunately, is lost when translating sentences independently. We want to improve translation coherence and cohesion using document-level information, so we are interested in developing new strategies that take advantage of context information to achieve this goal. For example, we want to approach this challenge by developing post-processes that try to fix a first translation obtained by an SMT system. We are also interested in taking advantage of the document-level translation framework provided by the Docent decoder to implement and test some of our ideas. An analogous problem arises with automatic MT evaluation metrics: most of them are designed at sentence level, so they do not capture improvements in lexical cohesion and coherence or discourse structure. However, we leave this topic for future work. Preprint
- Published
- 2014
31. Robust Part of Speech Tagging
- Author
-
Màrquez Villodre, Lluís, and Martínez Garcia, Eva
- Abstract
Generally, NLP tools use well-formed and annotated data to learn patterns by using machine learning techniques. However, in this work we will focus on the language used in an on-line platform for machine translation. In this area it is usual to have a framework such as the following: a web page which offers a translation service between pairs of languages. The problem is that casual users use the service to translate any type of text (cut and paste, single words, bad formatting, snippets, informal language, pre-translations, etc.). Hence, in this situation we will very often find words with mistakes that make the system provide a bad translation because it is not able to understand the input. The main goal of our work, once we have identified the problem of dealing with non-standard input, is to develop a robust PoS tagger from the SVMTagger.
- Published
- 2013
32. MMSET Contributes to Multiple Myeloma Oncogenesis Through Induction of Global Epigenetic Changes and Alteration of the DNA Damage Response
- Author
-
Popovic, Relja, Martinez-Garcia, Eva, Sweet, Steve M.M., Zheng, Yupeng, Kelleher, Neil L., and Licht, Jonathan D.
- Published
- 2011
33. MMSET Stimulates Myeloma Cell Growth Through MicroRNA-Mediated Modulation of c-MYC
- Author
-
Min, Dong-Joon, Kim, Marianne, Ezponda, Teresa, Will, Christine, Martinez-Garcia, Eva, Popovic, Relja, Elenitoba-Johnson, Kojo S.J., Venkatesha, Basrur, and Licht, Jonathan D.
- Published
- 2011
34. The MMSET Histone Methyl Transferase Alters Chromatin Structure and Gene Expression in t(4;14) Multiple Myeloma Cells.
- Author
-
Martinez-Garcia, Eva, Popovic, Relja, Min, Dong-Joon, Will, Christine, Meyer, Julia, Staudt, Louis M., Lamy, Laurence, Lauring, Josh, Cheng, Zhongjun, Patel, Dinshaw, and Licht, Jonathan D.
- Published
- 2009
35. Genome-Wide Chromatin Immunoprecipitation and Gene Expression Analysis Indicates That MMSET Is a Transcriptional Repressor in Vivo
- Author
-
Min, Dong-Joon, Meyer, Julia, Martinez-Garcia, Eva, Lauring, Josh, and Licht, Jonathan D.
- Published
- 2008
36. Robust Part of Speech Tagging
- Author
-
Martínez Garcia, Eva and Màrquez Villodre, Lluís
- Subjects
Natural language processing (Computer science), Traducció automàtica, Tractament del llenguatge natural (Informàtica), Informàtica::Intel·ligència artificial::Llenguatge natural [Àrees temàtiques de la UPC], Machine translating
- Abstract
Generally, NLP tools use well-formed and annotated data to learn patterns by using machine learning techniques. However, in this work we will focus on the language used in an on-line platform for machine translation. In this area it is usual to have a framework such as the following: a web page which offers a translation service between pairs of languages. The problem is that casual users use the service to translate any type of text (cut and paste, single words, bad formatting, snippets, informal language, pre-translations, etc.). Hence, in this situation we will very often find words with mistakes that make the system provide a bad translation because it is not able to understand the input. The main goal of our work, once we have identified the problem of dealing with non-standard input, is to develop a robust PoS tagger from the SVMTagger.