14 results on '"Sandford Pedersen, Bolette"'
Search Results
2. Evaluering af sprogforståelsen i danske sprogmodeller - med udgangspunkt i semantiske ordbøger
- Author
-
Sandford Pedersen, Bolette, Hau Sørensen, Nathalie C., Olsen, Sussi, and Nimb, Sanni
- Abstract
The article describes how we developed a series of datasets, a so-called benchmark, for evaluating different aspects of language understanding in Danish language models. Our assumption is that the knowledge already described in a number of existing Danish dictionaries can be regarded as 'ground truth' for the semantics of the Danish vocabulary. Our method therefore consists of 'turning' the semantic dictionaries around and using them to generate a benchmark that tests the models' ability to understand Danish. More specifically, we investigate how well the models i) understand synonymy, near-synonymy, and when something is semantically associated, ii) draw inferences in relation to conceptual knowledge and the inheritance of properties from hypernym to hyponym, iii) make correct inferences in connection with specific actions and events, iv) distinguish between central senses of a word in context, and v) handle positive and negative connotation or 'sentiment' in running text. We test our datasets on ChatGPT 3.5 turbo and on ChatGPT 4.0 and find that the datasets have an appropriate level of difficulty relative to what the models are able to handle, although ChatGPT 4.0 achieves very good results on several of the datasets.
- Published
- 2024
3. Ontological Extraction of Content for Text Querying
- Author
-
Andreasen, Troels, Anker Jensen, Per, Fischer Nilsson, Jørgen, Paggio, Patrizia, Sandford Pedersen, Bolette, Erdman Thomsen, Hanne, Goos, Gerhard, editor, Hartmanis, Juris, editor, van Leeuwen, Jan, editor, Andersson, Birger, editor, Bergholtz, Maria, editor, and Johannesson, Paul, editor
- Published
- 2002
- Full Text
- View/download PDF
4. Ontological Extraction of Content for Text Querying
- Author
-
Andreasen, Troels, primary, Anker Jensen, Per, additional, Fischer Nilsson, Jørgen, additional, Paggio, Patrizia, additional, Sandford Pedersen, Bolette, additional, and Erdman Thomsen, Hanne, additional
- Published
- 2002
- Full Text
- View/download PDF
5. Lexical ambiguity in machine translation
- Author
-
Sandford Pedersen, Bolette, primary
- Published
- 2000
- Full Text
- View/download PDF
6. Content-based text querying with ontological descriptors
- Author
-
Andreasen, Troels, Anker Jensen, Per, Fischer Nilsson, Jørgen, Paggio, Patrizia, Sandford Pedersen, Bolette, and Erdman Thomsen, Hanne
- Published
- 2004
- Full Text
- View/download PDF
7. The strategic impact of META-NET on the regional, national and international level
- Author
-
Rehm, Georg, Uszkoreit, Hans, Ananiadou, Sophia, Bel, Núria, Bielevičienė, Audronė, Borin, Lars, Branco, António, Budin, Gerhard, Calzolari, Nicoletta, Daelemans, Walter, Garabík, Radovan, Grobelnik, Marko, García-Mateo, Carmen, van Genabith, Josef, Hajič, Jan, Hernáez, Inma, Judge, John, Koeva, Svetla, Krek, Simon, Krstev, Cvetana, Lindén, Krister, Magnini, Bernardo, Mariani, Joseph, McNaught, John, Melero, Maite, Monachini, Monica, Moreno, Asunción, Odijk, J.E.J.M., Ogrodniczuk, Maciej, Pęzik, Piotr, Piperidis, Stelios, Przepiórkowski, Adam, Rögnvaldsson, Eiríkur, Rosner, Mike, Sandford Pedersen, Bolette, Skadiņa, Inguna, De Smedt, Koenraad, Tadić, Marko, Thompson, Paul, Tufiş, Dan, Váradi, Tamás, Vasiļjevs, Andrejs, Vider, Kadri, and Zabarskaitė, Jolanta
- Abstract
This article provides an overview of the dissemination work carried out in META-NET from 2010 until 2015; we describe its impact on the regional, national and international level, mainly with regard to politics and the funding situation for LT topics. The article documents the initiative’s work throughout Europe in order to boost progress and innovation in our field.
- Published
- 2016
8. The strategic impact of META-NET on the regional, national and international level
- Author
-
LS OZ Taal en spraaktechnologie, ILS LLI, Rehm, Georg, Uszkoreit, Hans, Ananiadou, Sophia, Bel, Núria, Bielevičienė, Audronė, Borin, Lars, Branco, António, Budin, Gerhard, Calzolari, Nicoletta, Daelemans, Walter, Garabík, Radovan, Grobelnik, Marko, García-Mateo, Carmen, van Genabith, Josef, Hajič, Jan, Hernáez, Inma, Judge, John, Koeva, Svetla, Krek, Simon, Krstev, Cvetana, Lindén, Krister, Magnini, Bernardo, Mariani, Joseph, McNaught, John, Melero, Maite, Monachini, Monica, Moreno, Asunción, Odijk, J.E.J.M., Ogrodniczuk, Maciej, Pęzik, Piotr, Piperidis, Stelios, Przepiórkowski, Adam, Rögnvaldsson, Eiríkur, Rosner, Mike, Sandford Pedersen, Bolette, Skadiņa, Inguna, De Smedt, Koenraad, Tadić, Marko, Thompson, Paul, Tufiş, Dan, Váradi, Tamás, Vasiļjevs, Andrejs, Vider, Kadri, and Zabarskaitė, Jolanta
- Published
- 2016
9. The Strategic Impact of META-NET on the Regional, National and International Level
- Author
-
Rehm, Georg, Uszkoreit, Hans, Ananiadou, Sophia, Bel, Núria, Bielevičienė, Audronė, Borin, Lars, Branco, António, Budin, Gerhard, Calzolari, Nicoletta, Daelemans, Walter, Garabík, Radovan, Grobelnik, Marko, García-Mateo, Carmen, van Genabith, Josef, Hajič, Jan, Hernáez, Inma, Judge, John, Koeva, Svetla, Krek, Simon, Krstev, Cvetana, Lindén, Krister, Magnini, Bernardo, Mariani, Joseph, McNaught, John, Melero, Maite, Monachini, Monica, Moreno, Asunción, Odijk, J.E.J.M., Ogrodniczuk, Maciej, Pęzik, Piotr, Piperidis, Stelios, Przepiórkowski, Adam, Rögnvaldsson, Eiríkur, Rosner, Mike, Sandford Pedersen, Bolette, Skadiņa, Inguna, De Smedt, Koenraad, Tadić, Marko, Thompson, Paul, Tufiş, Dan, Váradi, Tamás, Vasiļjevs, Andrejs, Vider, Kadri, Zabarskaitė, Jolanta, LS OZ Taal en spraaktechnologie, ILS LLI, Calzolari, Nicoletta, Choukri, Khalid, Declerck, Thierry, Loftsson, Hrafn, Maegaard, Bente, Mariani, Joseph, Moreno, Asunción, LS OZ Taal en spraaktechnologie, ILS LLI, University of Helsinki, Department of Modern Languages 2010-2017, Language Technology, Universitat Politècnica de Catalunya. Departament de Teoria del Senyal i Comunicacions, Universitat Politècnica de Catalunya. VEU - Grup de Tractament de la Parla, Moreno, Asuncion, Odijk, Jan, and Piperidis, Stelios
- Subjects
Linguistics and Language ,Strategic impact ,Operations research ,META-NET ,LR National/International Projects ,Infrastructural/Policy Issues ,Multilinguality ,Machine Translation ,English language -- Machine translating ,02 engineering and technology ,Library and Information Sciences ,Language and Linguistics ,Education ,machine translation ,Politics ,Order (exchange) ,Traducció automàtica ,0202 electrical engineering, electronic engineering, information engineering ,Regional science ,Computer Science (miscellaneous) ,6121 Languages ,META-SHARE ,Sociology ,Multilingual technologies ,Anglès -- Traducció automàtica ,natural language processing ,060201 languages & linguistics ,International level ,Computer. Automation ,multilingual technologies ,Linguistics ,06 humanities and the arts ,113 Computer and information sciences ,language resources ,Work (electrical) ,0602 languages and literature ,Enginyeria de la telecomunicació::Processament del senyal::Processament de la parla i del senyal acústic [Àrees temàtiques de la UPC] ,020201 artificial intelligence & image processing ,Informàtica::Intel·ligència artificial [Àrees temàtiques de la UPC] ,multilinguality ,language technology ,Language technology ,Machine translation ,Language resources ,Machine translating
- Abstract
This article provides an overview of the dissemination work carried out in META-NET from 2010 until early 2014; we describe its impact on the regional, national and international level, mainly with regard to politics and the funding situation for language technology topics. This paper documents the initiative’s work throughout Europe in order to boost progress and innovation in our field.
- Published
- 2014
- Full Text
- View/download PDF
10. Lexicography and Language Technology in the Nordic Countries
- Author
-
Sandford Pedersen, Bolette, primary
- Published
- 2010
- Full Text
- View/download PDF
11. Annotation of regular polysemy and underspecification
- Author
-
Martínez Alonso, Héctor, Sandford Pedersen, Bolette, and Bel Rafecas, Núria
- Subjects
Generative lexicon ,Corpus annotation ,Polysemy
- Abstract
Paper presented at: 51st Annual Meeting of the Association for Computational Linguistics, held in Sofia, Bulgaria, 4 to 9 August 2013. We present the result of an annotation task on regular polysemy for a series of semantic classes or dot types in English, Danish and Spanish. This article describes the annotation process, the results in terms of inter-encoder agreement, and the sense distributions obtained with two methods: majority voting with a theory-compliant backoff strategy, and MACE, an unsupervised system to choose the most likely sense from all the annotations. The research leading to these results has been funded by the European Commission’s 7th Framework Program under grant agreement 238405 (CLARA).
12. Identification of sense selection in regular polysemy using shallow features
- Author
-
Martínez Alonso, Héctor, Sandford Pedersen, Bolette, and Bel Rafecas, Núria
- Abstract
Paper presented at: 18th Nordic Conference of Computational Linguistics NODALIDA 2011, held in Riga, Latvia, 11 to 13 May 2011. The following work describes a method to automatically classify the sense selection of the complex type Location/Organization (which depends on regular polysemy) using shallow features, as well as a way to increase the volume of sense-selection gold standards by using monosemous data as filler. The classifier results show that grammatical features are the most relevant cues for the identification of sense selection in this instance of regular polysemy. The research leading to these results has received funding from the European Commission’s 7th Framework Program under grant agreement n°238405 (CLARA).
13. Annotation of regular polysemy: an empirical assessment of the underspecified sense
- Author
-
Martínez Alonso, Héctor, Sandford Pedersen, Bolette, Bel Rafecas, Núria, and Universitat Pompeu Fabra. Departament de Traducció i Ciències del llenguatge
- Subjects
Polysemy ,Automatic speech processing
- Abstract
Words that belong to a semantic type, like location, can metonymically behave as a member of another semantic type, like organization. This phenomenon is known as regular polysemy. In Pustejovsky's (1995) Generative Lexicon, some cases of regular polysemy are grouped in a complex semantic class called a dot type. For instance, the sense alternation mentioned above is the location•organization dot type. Other dot types are, for instance, animal•meat or container•content. We refer to the usages of dot-type words that are potentially both metonymic and literal as underspecified. Regular polysemy has received a lot of attention from the theory of lexical semantics and from computational linguistics. However, there is no consensus on how to represent the sense of underspecified examples at the token level, namely when annotating or disambiguating senses of dot types. This leads us to the main research question of the dissertation: Does sense underspecification justify incorporating a third sense into our sense inventories when dealing with dot types at the token level, thereby treating the underspecified sense as independent from the literal and metonymic? We have conducted an analysis in English, Danish and Spanish on the possibility of having humans annotate underspecified senses. If humans cannot consistently annotate the underspecified sense, its applicability to NLP tasks is to be called into question. Later on, we have tried to replicate the human judgments by means of unsupervised and semisupervised sense prediction. Achieving an NLP method that can reproduce the human judgments for the underspecified sense would be sufficient to postulate the inclusion of the underspecified sense in our sense inventories. The human annotation task has yielded results that indicate that the kind of annotator (volunteer vs. crowdsourced from Amazon Mechanical Turk) is a decisive factor in the recognizability of the underspecified sense. This sense distinction is too nuanced to be recognized using crowdsourced annotations. The automatic sense-prediction systems have been unable to find empirical evidence for the underspecified sense, even though the semisupervised system recognizes the literal and metonymic senses with good performance. In this light, we propose an alternative representation for the sense alternation of dot-type words where literal and metonymic are poles in a continuum, instead of discrete categories.
- Published
- 2013
14. Explorations on Positionwise Flag Diacritics in Finite-State Morphology
- Author
-
Yli-Jyrä, Anssi Mikael, Sandford Pedersen, Bolette, Nešpore, Gunta, Skadiņa, Inguna, Department of Modern Languages 2010-2017, Anssi Mikael Yli-Jyrä / Principal Investigator, and Krister Linden / Research Group
- Subjects
representation ,finite-state morfologisk analysis ,sträng algoritmer ,äärellistilaiset menetelmät ,kontekstuaalisuus ,grammatiker ,phonemes ,representaatio ,kontextuella effekter ,äärellistilaiset transduktorit ,morphology ,finite-state metoder ,fonem represention ,mönster bearbetning ,fonologisk bearbetning ,contextual effects ,äärelliset transduktorit ,finite automata ,string algorithms ,lexikon ,pattern processing ,lexicon ,morpho-phonemes ,grammars ,finite-state transducers ,två-nivå morfologi ,phonological processing ,two-level rules ,morphologiska fonem ,morfologi ,finite-automata ,äärelliset automaatit ,leksikot ,finite-state methods ,morfologia ,foneemin esitysmuodot ,6121 Languages ,ändliga automater ,två-nivå regler ,merkkijonoalgoritmit ,hahmontunnistus ,kieliopit ,compilers ,two-level morphology ,kaksitasoinen morfologia ,113 Computer and information sciences ,ändliga transduktor ,äärellistilainen morfologinen analyysi ,finite-state morphological analysis ,kaksitasosäännöt ,kääntäjät ,fonologinen prosessointi ,PHONEME REPRESENTATIONS ,fonem ,kompilatorer
- Abstract
A novel technique of adding positionwise flags to one-level finite-state lexicons is presented. The proposed flags are a kind of morphophonemic marker, and they constitute a flexible method for describing morphophonological processes with a formalism that is tightly coupled with lexical entries and rule-like regular expressions. The formalism is inspired by the techniques used in two-level rule compilation, and it compiles practically all the rules in parallel, but in an efficient way. The technique handles morphophonological processes without a separate morphophonemic representation. The occurrences of the allomorphophonemes in latent phonological strings are tracked through a dynamic data structure into which the most prominent (i.e. the best-ranked) flags are collected. The technique is expected to be advantageous when describing the morphology of Bantu languages and dialects.
- Published
- 2011