Author: "VISEO" / Search Limiters: Full Text - Searchworks@Jio Institute Digital Library Search Results

Your search keyword '"VISEO"' showing total 17 results


1. Non-standard texts: from theoretical positions to Natural Language Processing normalisation

Author: Lopez, Cédric, Roche, Mathieu, Panckhurst, Rachel, VISEO - Objet Direct, VISEO, Territoires, Environnement, Télédétection et Information Spatiale (UMR TETIS), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-AgroParisTech-Institut national de recherche en sciences et technologies pour l'environnement et l'agriculture (IRSTEA)-Centre National de la Recherche Scientifique (CNRS), ADVanced Analytics for data SciencE (ADVANSE), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS), Praxiling UMR 5267 (Praxiling), Université Paul-Valéry - Montpellier 3 (UM3)-Centre National de la Recherche Scientifique (CNRS), CENTAL, UCL, Louvain-la-Neuve, Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), Département Environnements et Sociétés (Cirad-ES), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad), Praxiling (Praxiling), Université Paul-Valéry - Montpellier 3 (UPVM)-Centre National de la Recherche Scientifique (CNRS), Centre National de la Recherche Scientifique (CNRS)-Université Paul-Valéry - Montpellier 3 (UPVM), and Panckhurst, Rachel
Subjects: C30 - Documentation et information, [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL], SMS, Normalisation, [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL], Natural Language Processing
Abstract: A finalised digital resource of 88,000 anonymised French text messages, the 88milSMS corpus, two extracts (1,000 SMS transcoded into standardised French and 100 linguistically annotated SMS) and sociolinguistic questionnaire data were released in June 2014 for all to download, under a free-of-charge user licence agreement, from the Huma-Num web service (http://88milsms.huma-num.fr, Panckhurst et al., 2014). The sud4science project (http://sud4science.org, Panckhurst et al. 2013), enabling authentic text message collection from the general public by a group of academics, is part of a vast international initiative (http://www.sms4science.org/, Fairon et al. 2006, Cougnon and Fairon, 2014, Cougnon 2015) to build a worldwide database and analyse authentic text messages in different languages. We decided to exclude full transcoding and annotation tagging in the final corpus. This is a theoretical position, since annotation is far from neutral, and is invariably linked to an interpretative framework. Owing to varying theoretical, disciplinary and scientific stances, it seems that a true consensus on how to standardise the transcoding and linguistic annotation tagging does not exist (Panckhurst, 2015). Other researchers may disagree and prefer to provide both 'raw' and fully tagged corpora (Chanier et al. 2014). This theoretical position does not exclude exploring Natural Language Processing (NLP) investigation techniques, which could then be implemented in real-life applications. Examples of investigation techniques are as follows: 1) Our corpus can be used to analyse current mediated electronic discourse, and help build knowledge on different SMS writing forms (Roche et al. 2015). 2) Algorithms may be used to learn from this: alignment methods for facilitating automatic transcoding have been explored (Aw et al. 2006, Beaufort et al., 2008, Guimier de Neef and Fessard, 2007, Kobus et al., 2008, Lopez et al., 2014). 3) We have devised a method for classifying 'unknown' items within text messages, which may help to automatically identify lexical 'creativity' within 88milSMS and improve electronic dictionary approaches (Lopez et al. 2015). In order to refine automatic normalisation techniques for initially non-standard texts in French, the next logical step is to compare our resource with different types of instant media (i.e. SMS, forums, tweets). Firstly, a new typology of the detected 'mistakes', based on existing typologies, will be elaborated. Secondly, automatic normalisation techniques, focusing on the most frequent errors, will be proposed. These will then be compared against traditional automatic translation (Vilariño et al., 2012), speech recognition (Kobus et al., 2008) and spelling/grammatical checker principles (Beaufort et al., 2010). Finally, the approach should enable comparison between different types of instant media.
Published: 2016

2. De la collecte à l'analyse d'un corpus de SMS authentiques : une démarche pluridisciplinaire

Author: Mathieu Roche, Claudine Moïse, Catherine Détrie, Cédric Lopez, Bertrand Verine, Rachel Panckhurst, Praxiling (Praxiling), Université Paul-Valéry - Montpellier 3 (UPVM)-Centre National de la Recherche Scientifique (CNRS), Territoires, Environnement, Télédétection et Information Spatiale (UMR TETIS), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-AgroParisTech-Institut national de recherche en sciences et technologies pour l'environnement et l'agriculture (IRSTEA)-Centre National de la Recherche Scientifique (CNRS), Département Environnements et Sociétés (Cirad-ES), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad), VISEO - Objet Direct, VISEO, LInguistique et DIdactique des Langues Étrangères et Maternelles (LIDILEM ), Université Grenoble Alpes [2016-2019] (UGA [2016-2019]), Praxiling UMR 5267 (Praxiling), Université Paul-Valéry - Montpellier 3 (UM3)-Centre National de la Recherche Scientifique (CNRS), ADVanced Analytics for data SciencE (ADVANSE), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS), LInguistique et DIdactique des Langues Étrangères et Maternelles (LIDILEM), Université Stendhal - Grenoble 3-Université Grenoble Alpes (UGA), and Centre National de la Recherche Scientifique (CNRS)-Université Paul-Valéry - Montpellier 3 (UPVM)
Subjects: Linguistics and Language, SMS, Corpus, alignement, dictionnaires électroniques, logiciel d’anonymisation, discours électronique médié, traitement automatique du langage naturel (TALN), données authentiques, pluridisciplinarité, media_common.quotation_subject, 02 engineering and technology, Language and Linguistics, [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL], Dictionnaires électroniques, Pluridisciplinarity, Authentic data, 0202 electrical engineering, electronic engineering, information engineering, alignment, electronic dictionaries, anonymisation software, mediated electronic discourse, natural language processing (NLP), authentic data, pluridisciplinarity, media_common, Alignment, 060201 languages & linguistics, Données authentiques, Pluridisciplinarité, Electronic dictionary, Discours électronique médié, U10 - Informatique, mathématiques et statistiques, Natural language processing, Anonymisation software, 000 - Autres thèmes, 06 humanities and the arts, Art, Mediated electronic discourse, Linguistics, Traitement automatique du langage naturel, Philosophy, Chose, Alignement, C30 - Documentation et information, 0602 languages and literature, 020201 artificial intelligence & image processing, Logiciel d’anonymisation, U30 - Méthodes de recherche
Abstract: This article highlights an approach based on authentic data, focusing on recent research on the collection, processing and analysis of a large French text-message corpus, entitled 88milSMS (http://88milsms.huma-num.fr/, Panckhurst, Détrie, Lopez, Moïse, Roche, Verine, 2014), including a sociolinguistic questionnaire submitted to donors at the time of collection, together with their answers. The authors, using a pluridisciplinary approach (linguistics/language sciences, computer science, Natural Language Processing), explain why they chose to give the scientific community and the general public access to the SMS corpus. Citation: Panckhurst Rachel, Roche Mathieu, Lopez Cédric, Verine Bertrand, Détrie Catherine, Moïse Claudine. De la collecte à l’analyse d’un corpus de SMS authentiques : une démarche pluridisciplinaire. In: Histoire Épistémologie Langage, tome 38, fascicule 2, 2016. Constitution de corpus linguistiques et pérennisation des données. pp. 73-85.
Published: 2016
Full Text: View/download PDF

3. Classification des items inconnus de 88milSMS : aide à l'identification automatique de la créativité scripturale

Author: Lopez, Cédric, Roche, Mathieu, Panckhurst, Rachel, Praxiling UMR 5267 (Praxiling), Université Paul-Valéry - Montpellier 3 (UPVM)-Centre National de la Recherche Scientifique (CNRS), VISEO - Objet Direct, VISEO, ADVanced Analytics for data SciencE (ADVANSE), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), Territoires, Environnement, Télédétection et Information Spatiale (UMR TETIS), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-AgroParisTech-Institut national de recherche en sciences et technologies pour l'environnement et l'agriculture (IRSTEA), Praxiling (Praxiling), Centre National de la Recherche Scientifique (CNRS)-Université Paul-Valéry - Montpellier 3 (UPVM), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS), and Université Paul-Valéry - Montpellier 3 (UM3)-Centre National de la Recherche Scientifique (CNRS)
Subjects: Items inconnus, Méthode statistique, Identification automatique, Analyse de données, Communication, 000 - Autres thèmes, Logiciel, Classification, [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL], information, Créativité scripturale, C30 - Documentation et information, SMS, Classification (information), [SHS.LANGUE]Humanities and Social Sciences/Linguistics, linguistique
Abstract: International audience; The sud4science LR project (http://www.sud4science.org/) aimed at studying a fairly recent form of written communication: SMS (Short Message Service). The first step of the project was to collect a large number of text messages from the general public. We initially gathered 93,085 SMS and our final corpus, entitled 88milSMS, contains over 88,000 SMS. In this article, we propose a novel approach (which is also applicable to other textual data) for classifying unknown items in 88milSMS, based on two steps: 1) Classification of SMS in relation to 5 European languages (French, Spanish, English, German, Italian), 2) Classification of unknown items according to predefined classes (schedules, items containing special character(s), number(s), words without accents, or with repeated characters, etc.). We are then able to distinguish the truly "original" items that are widely used from those that are rarely used in the corpus. Based on examples mined in the different classes, we present a preliminary analysis of the obtained resource.
Published: 2015
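
Note on record 3: the second step in the abstract (assigning unknown items to predefined classes) can be pictured with a small rule-based sketch. This is not the authors' method. The class names, the regular expressions and the rule order below are assumptions chosen for illustration only, and the "words without accents" class is left out because it would need a lexicon.

```python
import re

# Hypothetical rules, checked in order; the first matching rule wins.
# These patterns are illustrative assumptions, not the classes used in the paper.
RULES = [
    ("schedule", re.compile(r"^\d{1,2}h(\d{2})?$", re.IGNORECASE)),  # e.g. 18h30
    ("contains_number", re.compile(r"\d")),                          # e.g. 2m1
    ("repeated_characters", re.compile(r"(.)\1{2,}")),               # e.g. coooool
    ("special_character", re.compile(r"[^\w\s]")),                   # e.g. c'est
]


def classify_item(item: str) -> str:
    """Return the label of the first rule that matches, or 'other'."""
    for label, pattern in RULES:
        if pattern.search(item):
            return label
    return "other"


if __name__ == "__main__":
    for token in ["18h30", "2m1", "coooool", "c'est", "bjr"]:
        print(token, "->", classify_item(token))
```

Counting how often each item occurs per class would then give a rough way to separate widely used items from rare ones, in the spirit of the distinction the abstract mentions.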

4. Approaches of anonymisation of an SMS corpus

Author: Mathieu Roche, Pierre Accorsi, Cédric Lopez, Namrata Patel, Diana Inkpen, Graphs for Inferences on Knowledge (GRAPHIK), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Inria Sophia Antipolis - Méditerranée (CRISAM), Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria), Université Montpellier 2 - Sciences et Techniques (UM2), University of Ottawa [Ottawa], VISEO - Objet Direct, VISEO, Exploration et exploitation de données textuelles (TEXTE), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Inria Sophia Antipolis - Méditerranée (CRISAM), and Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)
Subjects: Short Message Service, 020205 medical informatics, business.industry, Computer science, Process (engineering), 02 engineering and technology, computer.software_genre, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing, [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR], 0202 electrical engineering, electronic engineering, information engineering, 020201 artificial intelligence & image processing, Artificial intelligence, Data mining, business, computer, Natural language processing
Abstract: International audience; This paper presents two anonymisation methods to process an SMS corpus. The first one is based on an unsupervised approach called Seek&Hide. The implemented system uses several dictionaries and rules in order to predict whether an SMS needs to be anonymised. The second method is based on a supervised approach using machine learning techniques. We evaluate the two approaches and we propose a way to use them together. The SMS is checked by a human expert only when the two methods disagree on their prediction. This greatly reduces the cost of anonymising the corpus.
Published: 2013
Full Text: View/download PDF
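
Note on record 4: the combination rule in the abstract (send an SMS to a human expert only when the two methods disagree) can be sketched as below. Both predictors are hypothetical stand-ins written for this illustration. Neither reproduces Seek&Hide or the supervised classifier from the paper, and the name dictionary is a toy assumption.

```python
from typing import Callable, List, Tuple

# Toy dictionary of first names (an assumption for illustration only).
FIRST_NAMES = {"marie", "paul", "julie"}


def dictionary_predictor(sms: str) -> bool:
    """Stand-in for a dictionary/rule method: flag if any token is a known name."""
    return any(tok.strip(".,!?").lower() in FIRST_NAMES for tok in sms.split())


def learned_predictor(sms: str) -> bool:
    """Stand-in for a trained classifier: flag capitalised non-initial tokens."""
    return any(tok[:1].isupper() for tok in sms.split()[1:])


def triage(
    messages: List[str],
    method_a: Callable[[str], bool],
    method_b: Callable[[str], bool],
) -> Tuple[List[Tuple[str, bool]], List[str]]:
    """Split messages into automatic decisions and a human-review queue."""
    automatic: List[Tuple[str, bool]] = []
    needs_review: List[str] = []
    for sms in messages:
        a, b = method_a(sms), method_b(sms)
        if a == b:
            automatic.append((sms, a))   # both methods agree: keep their decision
        else:
            needs_review.append(sms)     # disagreement: defer to a human expert
    return automatic, needs_review


if __name__ == "__main__":
    corpus = ["on se voit demain", "Salut Marie ca va", "rdv a Paris", "salut paul"]
    auto, review = triage(corpus, dictionary_predictor, learned_predictor)
    print("automatic:", auto)
    print("human review:", review)
```

The cost saving described in the abstract comes from the size of the review queue: only the disagreements are read by a person.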

5. The PEW Framework for Worth Mapping

Author: Rachel Demumieux, Fatoumata Camara, Gaëlle Calvary, VISEO - Objet Direct, VISEO, Ingénierie de l’Interaction Homme-Machine (IIHM), Laboratoire d'Informatique de Grenoble (LIG), Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP)-Institut National Polytechnique de Grenoble (INPG)-Centre National de la Recherche Scientifique (CNRS)-Université Pierre Mendès France - Grenoble 2 (UPMF)-Université Joseph Fourier - Grenoble 1 (UJF), Orange Communications SA, Paula Kotzé, Gary Marsden, Gitte Lindgaard, Janet Wesson, and Marco Winckler
Subjects: Operationalization, Process (engineering), Computer science, business.industry, 05 social sciences, Usability, Data science, Set (abstract data type), Order (business), 0502 economics and business, 050211 marketing, 0501 psychology and cognitive sciences, Artificial intelligence, [INFO.INFO-HC]Computer Science [cs]/Human-Computer Interaction [cs.HC], business, 050107 human factors
Abstract: September 2-6, 2013; International audience; In Human Computer Interaction, it is more and more clear that usability is not enough. In order to take into account the other criteria that may be relevant for design, G. Cockton introduced the notion of "worth" and the Worth Centered Design (WCD) framework for its operationalization. The WCD framework structures the development process and provides designers with a set of tools, including Worth Maps (WMs). Worth maps connect systems attributes to human ones, and as such represent a promising tool. However, they remain understudied and under-experimented. This paper presents the results of our experience with WMs. More precisely, it proposes the PEW (Perceived and Expected Worth) framework for worth mapping, reports findings from a study conducted with 5 experts regarding many aspects of WMs, and discusses future directions for research. Keywords: Interactive systems design, worth, Worth Maps (WMs).
Published: 2013

6. CAD modelling based on knowledge synthesis for design rational

Author: Anthony Geromin, Lionel Roucoules, François Malburet, Cédric Lopez, Laboratoire des Sciences de l'Information et des Systèmes (LSIS), Centre National de la Recherche Scientifique (CNRS)-Arts et Métiers Paristech ENSAM Aix-en-Provence-Université de Toulon (UTLN)-Aix Marseille Université (AMU), Arts et Métiers Paristech ENSAM Aix-en-Provence, VISEO, Aix Marseille Université (AMU)-Université de Toulon (UTLN)-Arts et Métiers Paristech ENSAM Aix-en-Provence-Centre National de la Recherche Scientifique (CNRS), and Administrateur Ensam, Compte De Service
Subjects: 0209 industrial biotechnology, [SPI] Engineering Sciences [physics], Computer science, 020209 energy, Knowledge synthesis, CAD modelling, design maturity visualisation, color-coding, Color-coding, CAD, 02 engineering and technology, Space (commercial competition), computer.software_genre, Sciences de l'ingénieur, color-coding, design maturity visualisation, [SPI]Engineering Sciences [physics], 020901 industrial engineering & automation, Product lifecycle, 0202 electrical engineering, electronic engineering, information engineering, Computer Aided Design, General Environmental Science, Point (typography), business.industry, Knowledge synthesis, General Earth and Planetary Sciences, CAD modelling, Engineering design process, Software engineering, business, computer
Abstract: International audience; Although many new methodological and modelling concepts have been proposed by the scientific community, industry still focuses its engineering design process on the CAD model, since it is assumed to be the starting point of many analyses with respect to the product life cycle (CAM, FEA, LCA…). The paper presents the application of modelling concepts that lead to the progressive justification of the CAD model with respect to knowledge synthesis by least commitment. Design experts first formalise their knowledge, which is then translated into form features and parameters (topology, position, orientation, dimensions…). The results show that this new design approach and its models support design intent and rationale, but the generated CAD model is not fully justified. This leads to several conclusions: the CAD model is often not 100% justified by designers’ knowledge, so the design solution space is larger than the one modelled in CAD software and could be used to foster innovation.
Published: 2018

7. Detecting Influencial Users in Social Networks: Analysing Graph-Based and Linguistic Perspectives

Author: Damien Nouvel, Namrata Patel, Frédérique Segond, Cédric Lopez, Pierre-Alain Avouac, Kévin Deturck, Ioannis Partalas, VISEO, Institut National des Langues et Civilisations Orientales (Inalco), Université Paul-Valéry - Montpellier 3 (UPVM), Emvista, Expedia [Lausanne], Eunika Mercier-Laurent, Danielle Boulanger, TC 12, WG 12.6, and Université Paul-Valéry - Montpellier 3 (UM3)
Subjects: Social network, business.industry, Computer science, 05 social sciences, Graph based, Linguistics, 02 engineering and technology, Social media, Influence, 0202 electrical engineering, electronic engineering, information engineering, Graph (abstract data type), Centrality, 020201 artificial intelligence & image processing, [INFO]Computer Science [cs], 0509 other social sciences, 050904 information & library sciences, business
Abstract: International audience; There has been increasing interest in the artificial intelligence community for influencer detection in recent years for its utility in singling out pertinent users within a large network of social media users. This could be useful, for example in commercial campaigns, to promote a product or a brand to a relevant target set of users. This task is performed either by analysing the graph-based representation of user interactions in a social network or by measuring the impact of the linguistic content of user messages in online discussions. We performed independent studies for each of these methods in the present paper with a hybridisation perspective. In the first study, we extract structural information to highlight influence among interaction networks. In the second, we identify linguistic features of influential behaviours. We then compute a score of user influence using centrality measures with the structural information for the former and a machine learning approach based on the relevant linguistic features for the latter.
Published: 2017
Full Text: View/download PDF
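
Note on record 7: the abstract mentions scoring user influence with centrality measures over an interaction network. One of the simplest such measures is in-degree centrality, sketched below in plain Python. The choice of in-degree, the edge direction (source replies to or mentions target) and the toy user names are assumptions; the abstract does not say which centrality measures the authors used.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def in_degree_centrality(edges: Iterable[Tuple[str, str]]) -> Dict[str, float]:
    """Score each user by the share of other users who interact with them.

    edges: (source, target) pairs, read here as "source replies to target".
    Repeated pairs and self-loops do not add to the score.
    """
    users = set()
    incoming = defaultdict(set)
    for source, target in edges:
        users.update((source, target))
        if source != target:
            incoming[target].add(source)
    n = len(users)
    if n <= 1:
        return {user: 0.0 for user in users}
    # Normalise by the maximum possible number of distinct other users.
    return {user: len(incoming[user]) / (n - 1) for user in users}


if __name__ == "__main__":
    interactions = [("a", "c"), ("b", "c"), ("c", "d"), ("a", "c")]
    scores = in_degree_centrality(interactions)
    for user in sorted(scores, key=scores.get, reverse=True):
        print(user, round(scores[user], 3))
```

A score of this kind uses only the structure of the network; the linguistic study described in the abstract would supply a separate, content-based score.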

8. Extraction de relations pour le peuplement d'une base de connaissance à partir de tweets

Author: Lopez, Cédric, Cabrio, Elena, Segond, Frédérique, Laboratoire de Neurosciences intégratives et adaptatives (LNIA), Aix Marseille Université (AMU)-Centre National de la Recherche Scientifique (CNRS), Laboratoire d'Informatique, Signaux, et Systèmes de Sophia Antipolis (I3S), Université Nice Sophia Antipolis (... - 2019) (UNS), COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UCA), Web-Instrumented Man-Machine Interactions, Communities and Semantics (WIMMICS), Inria Sophia Antipolis - Méditerranée (CRISAM), Institut National de Recherche en Informatique et en Automatique (Inria)-Institut National de Recherche en Informatique et en Automatique (Inria)-Scalable and Pervasive softwARe and Knowledge Systems (Laboratoire I3S - SPARKS), COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UCA)-Université Nice Sophia Antipolis (... - 2019) (UNS), COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UCA)-Laboratoire d'Informatique, Signaux, et Systèmes de Sophia Antipolis (I3S), Équipe de Recherche en Textes, Informatique, Multilinguisme (ERTIM), Institut National des Langues et Civilisations Orientales (Inalco), Viseo, R&D, Grenoble, ANR-13-LAB2-0001,SMILK,Social Media Intelligence and Linked Knowledge(2013), Segond, Frédérique, Laboratoires communs organismes de recherche publics – PME/ETI - Social Media Intelligence and Linked Knowledge - - SMILK2013 - ANR-13-LAB2-0001 - LabCom - VALID, Université Nice Sophia Antipolis (1965 - 2019) (UNS), and COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-COMUE Université Côte d'Azur (2015-2019) (COMUE UCA)-Centre National de la Recherche Scientifique (CNRS)-Université Côte d'Azur (UCA)-Université Nice Sophia Antipolis (1965 - 2019) (UNS)
Subjects: [INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI], [INFO.INFO-WB] Computer Science [cs]/Web, knowledge representation, [INFO.INFO-WB]Computer Science [cs]/Web, [INFO.INFO-TT] Computer Science [cs]/Document and Text Processing, représentation des connaissances, [SCCO.LING]Cognitive science/Linguistics, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing, semantic web, ProVoc, [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR], ontologies, [INFO.INFO-IR] Computer Science [cs]/Information Retrieval [cs.IR], [SCCO.LING] Cognitive science/Linguistics, web sémantique, ontologie
Abstract: International audience; In a knowledge base, entities are meant to be persistent, but certain events make the relations between these entities unstable. This is notably the case for relations between organisations, products, or brands, which are entities that can be acquired. In this article, we propose an approach for extracting ownership ("belongs to") relations between two entities in order to populate a knowledge base. Extracting relations from a dynamic information source such as Twitter makes it possible to achieve this objective in real time. The approach consists in modelling events by relying on a lexico-semantic resource. Once the entities have been linked to the Web of open data (in particular DBpedia), linguistic rules are applied to finally generate the RDF triples that represent the events.
Published: 2017

9. Efficient Model Selection for Regularized Classification by Exploiting Unlabeled Data

Author: Eric Gaussier, Ioannis Partalas, Rohit Babbar, Georgios Balikas, Massih-Reza Amini, Analyse de données, Modélisation et Apprentissage automatique [Grenoble] (AMA), Laboratoire d'Informatique de Grenoble (LIG), Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Institut National Polytechnique de Grenoble (INPG)-Centre National de la Recherche Scientifique (CNRS)-Université Pierre Mendès France - Grenoble 2 (UPMF)-Université Joseph Fourier - Grenoble 1 (UJF)-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Institut National Polytechnique de Grenoble (INPG)-Centre National de la Recherche Scientifique (CNRS)-Université Pierre Mendès France - Grenoble 2 (UPMF)-Université Joseph Fourier - Grenoble 1 (UJF), VISEO, Max Planck Institute for Intelligent Systems, Max-Planck-Gesellschaft, and ANR-11-LABX-0025,PERSYVAL-lab,Systemes et Algorithmes Pervasifs au confluent des mondes physique et numérique(2011)
Subjects: Computer science, business.industry, Model selection, computer.software_genre, Machine learning, Cross-validation, [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], Multiclass classification, [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG], Quantification, [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR], Multi-class classification, [INFO]Computer Science [cs], Data mining, Artificial intelligence, Macro, business, computer, Classifier (UML)
Abstract: International audience; Hyper-parameter tuning is a resource-intensive task when optimizing classification models. The commonly used k-fold cross validation can become intractable in large scale settings when a classifier has to learn billions of parameters. At the same time, in real-world settings, one often encounters multi-class classification scenarios with only a few labeled examples; model selection approaches often offer little improvement in such cases and the default values of learners are used. We propose bounds for classification on accuracy and macro measures (precision, recall, F1) that motivate efficient schemes for model selection and can benefit from the existence of unlabeled data. We demonstrate the advantages of those schemes by comparing them with k-fold cross validation and hold-out estimation in the setting of large scale classification.
Published: 2015
Full Text: View/download PDF
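
Note on record 9: the abstract uses k-fold cross validation as the costly baseline. The sketch below shows only the index bookkeeping of a plain, unshuffled k-fold split, to make the cost visible: each candidate hyper-parameter value requires k separate training runs. It does not implement the bounds or the unlabeled-data schemes proposed in the paper, and the function name is hypothetical.

```python
from typing import List, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Return k (train_indices, test_indices) pairs covering range(n_samples).

    Folds are contiguous and differ in size by at most one element.
    """
    if not 1 < k <= n_samples:
        raise ValueError("k must satisfy 1 < k <= n_samples")
    # The first (n_samples % k) folds receive one extra sample.
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    indices = list(range(n_samples))
    folds = []
    start = 0
    for size in sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds


if __name__ == "__main__":
    for train, test in k_fold_indices(5, 2):
        print("train:", train, "test:", test)
```

With a grid of m hyper-parameter values, this scheme trains k times m models, which is the expense the abstract describes as intractable at large scale.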

10. Données authentiques : un grand corpus de SMS en français

Author: Panckhurst, Rachel, Roche, Mathieu, Lopez, Cédric, Praxiling UMR 5267 (Praxiling), Université Paul-Valéry - Montpellier 3 (UM3)-Centre National de la Recherche Scientifique (CNRS), ADVanced Analytics for data SciencE (ADVANSE), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS), Territoires, Environnement, Télédétection et Information Spatiale (UMR TETIS), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-AgroParisTech-Institut national de recherche en sciences et technologies pour l'environnement et l'agriculture (IRSTEA), VISEO, Praxiling (Praxiling), Centre National de la Recherche Scientifique (CNRS)-Université Paul-Valéry - Montpellier 3 (UPVM), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), Université Paul-Valéry - Montpellier 3 (UPVM)-Centre National de la Recherche Scientifique (CNRS), and Roche, Mathieu
Subjects: 000 - Autres thèmes, [INFO.INFO-TT] Computer Science [cs]/Document and Text Processing, Corpus, Données authentiques, [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing, TAL, C30 - Documentation et information, Alignement, SMS, [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR], Dictionnaires électroniques, Discours électronique medie, Logiciel d’anonymisation, [INFO.INFO-IR] Computer Science [cs]/Information Retrieval [cs.IR]
Abstract: National audience; What counts as written data in the language sciences? Three types can be distinguished: 1) lexical data, which essentially takes the form of a lexical entry grouping together a set of properties; 2) "the specific name of observable data in linguistics is the example", which refers to "an utterance that could actually be pronounced, even if in fact it is not" (Milner 1989, pp. 51-52); 3) data as raw text, i.e. the corpus. In corpus linguistics, the aim is to analyse the authentic productions contained in the corpus. In certain schools of linguistics, by contrast, the study of unselected, as-found corpora has no place. The debate thus continues over the opposition (or at least the differentiation) between linguistic examples (possibly "constructed") and authentic productions gathered from corpora (see, among others, for French, Bilger et al. 2000, Cori et al. 2008, Habert et al. 1997, Péry-Woodley 1995). Over twenty years, our own approach has evolved: from an example-based linguistic and computational analysis (Panckhurst 1994, p. 39), we have moved to an analysis of authentic data found in corpora (Panckhurst 2013, p. 97, Panckhurst et al. 2014). For us, this shift is explained, on the one hand, by the evolution of access to data and, on the other, by mediated electronic discourse (Panckhurst 1997, 2006), which circulates between individuals using electronic tools (computers, tablets, mobile phones, etc.) and gives rise to emerging practices and uses. In two decades, building digitised or natively digital corpora has become commonplace, and this massive accessibility is in itself a novelty. Authentic data in the form of emails, forums, chats, blogs, social networks and, more recently, SMS, which researchers can easily exploit, make it possible to observe, mine and analyse writers' practices and uses (innovative or not). In this paper, we explain this progression, focusing on recent research on the collection, processing and analysis of a large corpus of French SMS, entitled "88milSMS" (available on the Huma-Num service grid: http://88milsms.huma-num.fr/).
Published: 2015

11. Towards Electronic SMS Dictionary Construction: An Alignment-based Approach

Author: Lopez, Cédric, Bestandji, Reda, Roche, Mathieu, Panckhurst, Rachel, VISEO, Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS), ADVanced Analytics for data SciencE (ADVANSE), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS), Territoires, Environnement, Télédétection et Information Spatiale (UMR TETIS), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-AgroParisTech-Institut national de recherche en sciences et technologies pour l'environnement et l'agriculture (IRSTEA), Praxiling UMR 5267 (Praxiling), Université Paul-Valéry - Montpellier 3 (UM3)-Centre National de la Recherche Scientifique (CNRS), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), Praxiling (Praxiling), Université Paul-Valéry - Montpellier 3 (UPVM)-Centre National de la Recherche Scientifique (CNRS), Roche, Mathieu, and Centre National de la Recherche Scientifique (CNRS)-Université Paul-Valéry - Montpellier 3 (UPVM)
Subjects: electronic dictionaries, [SPI.OTHER]Engineering Sciences [physics]/Other, [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing, C30 - Documentation et information, [SPI.OTHER] Engineering Sciences [physics]/Other, SMS, [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR], [INFO.INFO-TT] Computer Science [cs]/Document and Text Processing, alignment, [INFO.INFO-IR] Computer Science [cs]/Information Retrieval [cs.IR]
Abstract: International audience; In this paper, we propose a method for aligning text messages (entitled AlignSMS) in order to automatically build an SMS dictionary. An extract of 100 text messages from the 88milSMS corpus (Panckhurst et al., 2013, 2014) was used as an initial test. More than 90,000 authentic text messages in French were collected from the general public by a group of academics in the south of France in the context of the sud4science project (http://www.sud4science.org). This project is itself part of a vast international SMS data collection project, entitled sms4science (http://www.sms4science.org, Fairon et al. 2006, Cougnon, 2014). After corpus collation, pre-processing and anonymisation (Accorsi et al., 2012, Patel et al., 2013), we discuss how "raw" anonymised text messages can be transcoded into normalised text messages, using a statistical alignment method. The future objective is to set up a hybrid (symbolic/statistical) approach based on both grammar rules and our statistical AlignSMS method.
Published: 2014

12. Looking for Opinion in Land-Use Planning Corpora

Author: Mathieu Roche, Cédric Lopez, Maguelonne Teisseire, Eric Kergosien, ADVanced Analytics for data SciencE (ADVANSE), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), Territoires, Environnement, Télédétection et Information Spatiale (UMR TETIS), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-AgroParisTech-Institut national de recherche en sciences et technologies pour l'environnement et l'agriculture (IRSTEA), VISEO, Numev (Labex), Geosud (Equipex), Institut national de recherche en sciences et technologies pour l'environnement et l'agriculture (IRSTEA)-AgroParisTech-Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad), and Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)
Subjects: [SPI.OTHER]Engineering Sciences [physics]/Other, Computer science, land use planning, 02 engineering and technology, Fouille de données, computer.software_genre, Corpus, Aménagement du territoire, 050105 experimental psychology, Base de connaissances, 0202 electrical engineering, electronic engineering, information engineering, 0501 psychology and cognitive sciences, Relevance (information retrieval), Land-use planning, Opinion-mining, Lexique, 05 social sciences, Sentiment analysis, 000 - Autres thèmes, Text-Mining, Méthode, Data science, [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing, C30 - Documentation et information, [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR], lexicon, 020201 artificial intelligence & image processing, knowledge base, Data mining, P01 - Conservation de la nature et ressources foncières, U30 - Méthodes de recherche, computer
Abstract: International audience; A great deal of research on opinion mining and sentiment analysis has been done in specific contexts such as movie reviews, commercial evaluations, campaign speeches, etc. In this paper, we raise the issue of how appropriate these methods are for documents related to land-use planning. After highlighting limitations of existing proposals and discussing issues related to textual data, we present the method called Opiland (OPinion mIning from LAND-use planning documents), designed to semi-automatically mine opinions in specialized contexts. Experiments are conducted on a land-use planning dataset and on three datasets related to other areas, highlighting the relevance of our proposal.
Published: 2014
Full Text: View/download PDF

13. Le résumé et le titrage automatique partagent-ils les mêmes objectifs ?

Author: Cédric Lopez, Mathieu Roche, Violaine Prince, VISEO, Exploration et exploitation de données textuelles (TEXTE), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), Territoires, Environnement, Télédétection et Information Spatiale (UMR TETIS), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-AgroParisTech-Institut national de recherche en sciences et technologies pour l'environnement et l'agriculture (IRSTEA), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS), and Roche, Mathieu
Subjects: [INFO.INFO-AI] Computer Science [cs]/Artificial Intelligence [cs.AI], lcsh:Language and Literature, Computer science, lcsh:Anthropology, [INFO.INFO-TT] Computer Science [cs]/Document and Text Processing, 02 engineering and technology, computer.software_genre, Task (project management), [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], Set (abstract data type), traitement automatique du langage naturel, 020204 information systems, 0202 electrical engineering, electronic engineering, information engineering, Relevance (information retrieval), résumé automatique, natural language processing, GeneralLiterature_REFERENCE(e.g.,dictionaries,encyclopedias,glossaries), 060201 languages & linguistics, Information retrieval, business.industry, lcsh:GN1-890, 06 humanities and the arts, [SCCO.LING]Cognitive science/Linguistics, Automatic summarization, [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing, classification, [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR], 0602 languages and literature, lcsh:B, titrage automatique, lcsh:P, [INFO.INFO-IR] Computer Science [cs]/Information Retrieval [cs.IR], Artificial intelligence, [SCCO.LING] Cognitive science/Linguistics, business, lcsh:Philosophy. Psychology. Religion, computer, Natural language processing, clustering, automatic summarization
Abstract: In the literature, the tasks of automatic summarization and automatic titling are often conflated. At first sight, a very short summary might seem to serve as a perfectly relevant title. But can a title and a summary be compared without first studying their criteria? This study aims to position the emerging task of automatic titling relative to automatic summarization. After defining the criteria attached to the summary and to the title, we analyze the results obtained with our automatic classification method. This analysis makes it possible to identify the real objectives of both tasks and to validate their relevance.
Published: 2014

14. How can catchy titles be generated without loss of informativeness?

Author: Violaine Prince, Cédric Lopez, Mathieu Roche, VISEO, Exploration et exploitation de données textuelles (TEXTE), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS), ADVanced Analytics for data SciencE (ADVANSE), Territoires, Environnement, Télédétection et Information Spatiale (UMR TETIS), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-AgroParisTech-Institut national de recherche en sciences et technologies pour l'environnement et l'agriculture (IRSTEA), and Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)
Subjects: Information retrieval, business.industry, Computer science, [INFO.INFO-WB]Computer Science [cs]/Web, General Engineering, 000 - Autres thèmes, computer.software_genre, Automatic summarization, Noun phrase, Nominalization, Computer Science Applications, Task (project management), [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI], Set (abstract data type), Text mining, Web mining, C30 - Documentation et information, Artificial Intelligence, [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR], ComputingMethodologies_DOCUMENTANDTEXTPROCESSING, Artificial intelligence, business, computer, Natural language processing
Abstract: International audience; Automatic titling of text documents is an essential task for several applications (automatic heading of e-mails, summarization, and so forth). This paper describes a system facilitating information retrieval in a set of textual documents by tackling the automatic titling and subtitling issue. Automatic titling here involves providing both informative and catchy titles. We thus propose two different approaches based on NLP, text mining, and Web Mining techniques. The first one (POSTIT) consists of extracting relevant noun phrases from texts as candidate titles. An original approach combining statistical criteria and noun phrase positions in the text helps in collecting informative titles and subtitles. The second approach (NOMIT) is based on various assumptions made on POSTIT and aims to generate both informative and catchy titles. Both approaches are applied to a corpus of news articles, then evaluated according to two criteria, i.e. informativeness and catchiness.
Published: 2014
Full Text: View/download PDF

15. Sud4science, de l'acquisition d'un grand corpus de SMS en français à l'analyse de l'écriture SMS

Author: Panckhurst, Rachel, Détrie, Catherine, Lopez, Cédric, Moïse, Claudine, Roche, Mathieu, Verine, Bertrand, Praxiling UMR 5267 (Praxiling), Université Paul-Valéry - Montpellier 3 (UPVM)-Centre National de la Recherche Scientifique (CNRS), Université Paul-Valéry - Montpellier 3 (UM3)-Centre National de la Recherche Scientifique (CNRS), VISEO, LInguistique et DIdactique des Langues Étrangères et Maternelles (LIDILEM), Université Stendhal - Grenoble 3-Université Grenoble Alpes (UGA), Exploration et exploitation de données textuelles (TEXTE), Laboratoire d'Informatique de Robotique et de Microélectronique de Montpellier (LIRMM), Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS), Territoires, Environnement, Télédétection et Information Spatiale (UMR TETIS), Centre de Coopération Internationale en Recherche Agronomique pour le Développement (Cirad)-AgroParisTech-Institut national de recherche en sciences et technologies pour l'environnement et l'agriculture (IRSTEA), MSH-M, DGLFLF, sud4science.org, Praxiling (Praxiling), Université Stendhal - Grenoble 3, Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM)-Centre National de la Recherche Scientifique (CNRS)-Université de Montpellier (UM), Panckhurst, Rachel, and Centre National de la Recherche Scientifique (CNRS)-Université Paul-Valéry - Montpellier 3 (UPVM)
Subjects: Anonymisation, Transcoding, Annotation, [INFO.INFO-TT] Computer Science [cs]/Document and Text Processing, SMS en français, Transcodage, Mediated electronic discourse, [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL], [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing, Discours électronique médié, [INFO.INFO-CL] Computer Science [cs]/Computation and Language [cs.CL], (socio-)linguistic analyses, French SMS, [SHS.LANGUE]Humanities and Social Sciences/Linguistics, Analyses (socio-) linguistiques, ComputingMilieux_MISCELLANEOUS
Abstract: This article describes the sud4science project (www.sud4science.org). Firstly, the authors present the acquisition phase of both SMS data and questionnaire data. Secondly, they explain the anonymisation, transcoding and optional annotation phases. Finally, they propose preliminary (socio-)linguistic analyses of scriptural usage in SMS writing (eSMS), and they indicate those planned for the short and medium term.
Published: 2013

16. Accuracy of using natural language processing methods for identifying healthcare-associated infections.

Author: Tvardik N, Kergourlay I, Bittar A, Segond F, Darmoni S, and Metzger MH
Subjects: Adult, Algorithms, Hospitals, University, Humans, Intensive Care Units, Sensitivity and Specificity, Cross Infection diagnosis, Electronic Health Records, Natural Language Processing
Abstract: Objective: There is a growing interest in using natural language processing (NLP) for monitoring healthcare-associated infections (HAIs). A French project consortium, SYNODOS, developed an NLP solution for detecting medical events in electronic medical records for epidemiological purposes. The objective of this study was to evaluate the performance of the SYNODOS data processing chain for detecting HAIs in clinical documents. Materials and Methods: Textual records were collected between October 2009 and December 2010 in three French university hospitals (Lyon, Rouen and Nice). The following medical specialties were included in the study: digestive surgery, neurosurgery, orthopedic surgery and adult intensive-care units. Reference standard surveillance was compared with the results of automatic detection using NLP. Sensitivity on 56 HAI cases and specificity on 57 non-HAI cases were calculated. Results: The accuracy rate was 84% (n = 95/113). The overall sensitivity of automatic detection of HAIs was 83.9% (CI 95%: 71.7-92.4) and the specificity was 84.2% (CI 95%: 72.1-92.5). The sensitivity varied from one specialty to another, from 69.2% (CI 95%: 38.6-90.9) for intensive care to 93.3% (CI 95%: 68.1-99.8) for orthopedic surgery. The manual review of classification errors showed that the most frequent cause was inaccurate temporal labeling of medical events, which is an important factor for HAI detection. Conclusion: This study confirmed the feasibility of using NLP for HAI detection in hospital facilities. Automatic HAI detection algorithms could offer better surveillance standardization for hospital comparisons. (Copyright © 2018 Elsevier B.V. All rights reserved.)
Published: 2018
Full Text: View/download PDF
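
Note on the figures in record 16: the reported rates can be cross-checked with a short calculation. The sketch below is illustrative only. The abstract gives the case totals (56 HAI, 57 non-HAI) and 95 correct classifications out of 113, but not the individual cell counts; the values 47 and 48 are inferred here as the only whole numbers consistent with the reported 83.9% and 84.2%, and the class and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Cell counts for a binary detector (HAI vs. non-HAI)."""
    true_pos: int   # HAI cases flagged as HAI
    false_neg: int  # HAI cases missed
    true_neg: int   # non-HAI cases correctly left unflagged
    false_pos: int  # non-HAI cases wrongly flagged

    @property
    def sensitivity(self) -> float:
        # Share of real HAI cases that were detected.
        return self.true_pos / (self.true_pos + self.false_neg)

    @property
    def specificity(self) -> float:
        # Share of non-HAI cases that were correctly rejected.
        return self.true_neg / (self.true_neg + self.false_pos)

    @property
    def accuracy(self) -> float:
        # Share of all cases that were classified correctly.
        total = self.true_pos + self.false_neg + self.true_neg + self.false_pos
        return (self.true_pos + self.true_neg) / total


# Assumption: cell counts inferred from the abstract's percentages,
# not stated in the abstract itself.
synodos = ConfusionCounts(true_pos=47, false_neg=9, true_neg=48, false_pos=9)

if __name__ == "__main__":
    print(f"sensitivity: {synodos.sensitivity:.1%}")
    print(f"specificity: {synodos.specificity:.1%}")
    print(f"accuracy:    {synodos.accuracy:.1%}")
```

Under these inferred counts, the correct classifications sum to 47 + 48 = 95, which matches the n = 95/113 given in the abstract.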

17. Semantic distance-based creation of clusters of pharmacovigilance terms and their evaluation.

Author: Dupuch M and Grabar N
Subjects: Algorithms, Cluster Analysis, Databases, Factual, Humans, Adverse Drug Reaction Reporting Systems, Drug-Related Side Effects and Adverse Reactions classification, Pharmacovigilance, Semantics, Terminology as Topic
Abstract: Background: Pharmacovigilance is the activity related to the collection, analysis and prevention of adverse drug reactions (ADRs) induced by drugs or biologics. The detection of adverse drug reactions is performed using statistical algorithms and groupings of ADR terms from the MedDRA (Medical Dictionary for Drug Regulatory Activities) terminology. Standardized MedDRA Queries (SMQs) are groupings that have become a standard for assisting the retrieval and evaluation of MedDRA-coded ADR reports worldwide. Currently 84 SMQs have been created, while several important safety topics are not yet covered. Creation of SMQs is a long and tedious process performed by experts. It relies on manual analysis of MedDRA in order to find all the relevant terms to be included in an SMQ. Our objective is to propose an automatic method for assisting the creation of SMQs using the clustering of semantically similar terms. Methods: The experimental method relies on a specific semantic resource, and also on semantic distance algorithms and clustering approaches. We perform several experiments in order to define the optimal parameters. Results: Our results show that the proposed method can assist the creation of SMQs and make this process faster and more systematic. The average performance of the method is 59% precision and 26% recall. The correlation of the results obtained is 0.72 against the medical doctors' judgments and 0.78 against the medical coders' judgments. Conclusions: These results and additional evaluation indicate that the generated clusters can be efficiently used for the detection of pharmacovigilance signals, as they provide better signal detection than the existing SMQs. (Copyright © 2014. Published by Elsevier Inc.)
Published: 2015
Full Text: View/download PDF
