5 results on '"Baptista, Cláudio"'
Search Results
2. Similarity Search on Semantic Trajectories Using Text Processing.
- Author
- Ribeiro de Almeida, Damião, de Souza Baptista, Cláudio, and de Andrade, Fabio Gomes
- Subjects
- *DATA management, *BEHAVIORAL assessment, *SOCIAL media
- Abstract
The use of location-based sensors has increased exponentially. Tracking moving objects has become increasingly common, consolidating a new field of research that focuses on trajectory data management. Such trajectories may be semantically enriched using sensors and social media. This enables a detailed analysis of trajectory behavior patterns. One of the problems in this field is searching a semantic trajectory database in a way that is flexible and adaptable. Flexibility means retrieving the trajectories that are closest to the user's query rather than only exact matches. Adaptability refers to adjusting to different types of semantic trajectories. This article proposes a new approach for representing and querying semantic trajectories based on text-processing techniques. Furthermore, we describe a framework, called SETHE (SEmantic Trajectory HuntEr), that performs similarity queries on semantically enriched trajectory databases. SETHE can be adapted according to the aspect types posed in user queries. We also present an evaluation of the proposed framework using a real dataset, and compare our results with those of state-of-the-art approaches. [ABSTRACT FROM AUTHOR]
- Published
- 2022
3. Improving hate speech detection using Cross-Lingual Learning.
- Author
- Firmino, Anderson Almeida, de Souza Baptista, Cláudio, and de Paiva, Anselmo Cardoso
- Subjects
- *HATE speech, *AUTOMATIC speech recognition, *LANGUAGE models, *NATURAL language processing, *PORTUGUESE language, *ITALIAN language
- Abstract
The growth of social media worldwide has brought social benefits and challenges. One problem we highlight is the proliferation of hate speech on social media. We propose a novel method for detecting hate speech in texts using Cross-Lingual Learning. Our approach uses transfer learning from Pre-Trained Language Models (PTLM) with large corpora available to solve problems in languages with fewer resources for the specific task. The proposed methodology comprises four stages: corpora acquisition, the PTLM definition, training strategies, and evaluation. We carried out experiments using Pre-Trained Language Models in English, Italian, and Portuguese (BERT and XLM-R) to verify which best suited the proposed method. We used corpora in English (WH) and Italian (Evalita 2018) as the source language and the OffComBr-2 corpus in Portuguese (the target language). The results of the experiments showed that the proposed methodology is promising: for the OffComBr-2 corpus, the best state-of-the-art result was obtained (F1-measure = 92%).
• The development of a new methodology for hate speech detection.
• Portuguese hate speech detection using Cross-Lingual Learning.
• Up to 20% performance improvement over other models using the OffComBr-2 corpus.
[ABSTRACT FROM AUTHOR]
- Published
- 2024
4. Gazetteer enrichment for addressing urban areas: a case study.
- Author
- Oliveira, Maxwell Guimarães de, Campelo, Cláudio E. C., Baptista, Cláudio de Souza, and Bertolotto, Michela
- Subjects
- GEOGRAPHIC information systems, USER-generated content, METROPOLITAN areas, CITIES & towns, SOCIAL media
- Abstract
The advent of volunteered geographical information (VGI) has contributed to the growth of user-contributed spatial data around the world. Spatial data acquired from crowdsourcing environments may contain valuable information that can be useful in other research fields, such as digital gazetteers, which are commonly used in Geographic Information Retrieval. Digital gazetteers play a powerful role in the geoparsing process. They need to be kept up to date and as comprehensive as possible so that geoparsers can perform lookups and resolve toponyms precisely in digital documents. Detecting toponyms in digital texts such as social media posts is essential for discovering useful spatially related information, such as complaints regarding urban areas. In this context, this article proposes a method for gazetteer enrichment that leverages VGI data sources. Although VGI environments were not originally developed to work as gazetteers, they often contain more detailed and up-to-date information than gazetteers. Our method is applied within a geoparser environment by adapting its set of heuristics in addition to enriching the corresponding gazetteer. A case study was performed by geoparsing Twitter posts, focusing solely on the messages, to evaluate the performance of the enriched system. The obtained results were encouraging and have provided a good basis for discussion. [ABSTRACT FROM AUTHOR]
- Published
- 2016
5. Ontology-driven urban issues identification from social media
- Author
- OLIVEIRA, Maxwell Guimarães de, BAPTISTA, Cláudio de Souza, and CAMPELO, Cláudio Elízio Calazans
- Subjects
- Geoparsing, Computer Science, Sciences, Ontology, Urban Issues, Crowdsourcing, Social Media
- Abstract
Cities worldwide face many issues directly related to the urban space, especially in infrastructure. Most of these urban issues affect the lives of both residents and visitors. For example, people may report a car parked on a footpath that is forcing pedestrians to walk on the road, or a huge pothole that is causing traffic congestion. Besides being related to the urban space, urban issues generally demand action from city authorities. There are many Location-Based Social Networks (LBSN) in the smart cities domain worldwide where people report urban issues in a structured way and local authorities become aware of them and then fix them. With the advent of social networks such as Facebook and Twitter, people tend to complain in an unstructured, sparse, and unpredictable way, which makes it difficult to identify any urban issues that are reported. Social media data, especially Twitter messages, photos, and check-ins, have played an important role in smart cities. A key problem is the challenge of identifying specific and relevant conversations when processing noisy crowdsourced data. In this context, this research investigates computational methods to provide automated identification of urban issues shared in social media streams. Most related work relies on classifiers based on machine learning techniques such as Support Vector Machines (SVM), Naïve Bayes, and Decision Trees, and faces problems concerning semantic knowledge representation, human readability, and inference capability. Aiming to overcome this semantic gap, this research investigates ontology-driven Information Extraction (IE) from the perspective of urban issues, since such issues can be semantically linked in LBSN platforms. Therefore, this work proposes an Urban Issues Domain Ontology (UIDO) to enable the identification and classification of urban issues in an automated approach that focuses mainly on the thematic and geographical facets. An experimental evaluation demonstrates that the performance of the proposed approach is competitive with the most commonly used machine learning algorithms when applied to this particular domain.
Funding: CNPq.
- Published
- 2016
Discovery Service for Jio Institute Digital Library