852 results on '"Wikidata"'
Search Results
2. Navigating change: an exploration of socio-epistemic process of extending Wikidata ontology with new properties.
- Author
- Roszkowski, Marcin
- Subjects
- *DIVISION of labor, *SOCIAL role, *SOCIAL responsibility, *SOCIAL processes
- Abstract
Purpose: The paper addresses the issue of change in Wikidata ontology by exposing the role of the socio-epistemic processes that take place inside the infrastructure. The subject of the study was the process of extending the Wikidata ontology with a new property as an example of the interplay between the social and technical components of the Wikidata infrastructure. Design/methodology/approach: In this study, an interpretative approach to the evolution of the Wikidata ontology was used. The interpretation framework was a process-centric approach to changes in the Wikidata ontology. The extension of the Wikidata ontology with a new property was considered a socio-epistemic process where multiple agents interact for epistemic purposes. The decomposition of this process into three stages (initiation, knowledge work and closure) allowed us to reveal the role of the institutional structure of Wikidata in the evolution of its ontology. Findings: This study has shown that the modification of the Wikidata ontology is an institutionalized process where community-accepted regulations and practices must be applied. These regulations come from the institutional structure of the Wikidata community, which sets the normative patterns for both the process and social roles and responsibilities of the involved agents. Originality/value: The results of this study enhance our understanding of the evolution of the collaboratively developed Wikidata ontology by exposing the role of socio-epistemic processes, division of labor and normative patterns. [ABSTRACT FROM AUTHOR]
- Published
- 2024
3. Doc‐KG: Unstructured documents to knowledge graph construction, identification and validation with Wikidata.
- Author
- Salman, Muhammad, Haller, Armin, Méndez, Sergio J. Rodríguez, and Naseem, Usman
- Subjects
- *KNOWLEDGE graphs, *DIGITAL technology, *STRUCTURAL frames, *KNOWLEDGE management, *NATURAL languages
- Abstract
The exponential growth of textual data in the digital era underlines the pivotal role of Knowledge Graphs (KGs) in effectively storing, managing, and utilizing this vast reservoir of information. Despite the copious amounts of text available on the web, a significant portion remains unstructured, presenting a substantial barrier to the automatic construction and enrichment of KGs. To address this issue, we introduce an enhanced Doc‐KG model, a sophisticated approach designed to transform unstructured documents into structured knowledge by generating local KGs and mapping these to a target KG, such as Wikidata. Our model innovatively leverages syntactic information to extract entities and predicates efficiently, integrating them into triples with improved accuracy. Furthermore, the Doc‐KG model's performance surpasses existing methodologies by utilizing advanced algorithms for both the extraction of triples and their subsequent identification within Wikidata, employing Wikidata's Unified Resource Identifiers for precise mapping. This dual capability not only facilitates the construction of KGs directly from unstructured texts but also enhances the process of identifying triple mentions within Wikidata, marking a significant advancement in the domain. Our comprehensive evaluation, conducted using the renowned WebNLG benchmark dataset, reveals the Doc‐KG model's superior performance in triple extraction tasks, achieving an unprecedented accuracy rate of 86.64%. In the domain of triple identification, the model demonstrated exceptional efficacy by mapping 61.35% of the local KG to Wikidata, thereby contributing 38.65% of novel information for KG enrichment. A qualitative analysis based on a manually annotated dataset further confirms the model's excellence, outshining baseline methods in extracting high‐fidelity triples. 
This research embodies a novel contribution to the field of knowledge extraction and management, offering a robust framework for the semantic structuring of unstructured data and paving the way for the next generation of KGs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
4. qEndpoint: A novel triple store architecture for large RDF graphs.
- Author
- Willerval, Antoine, Diefenbach, Dennis, and Bonifati, Angela
- Subjects
- KNOWLEDGE graphs, RELATIONAL databases, DATABASES, ONLINE data processing, STIMULUS & response (Psychology), RDF (Document markup language)
- Abstract
In the relational database realm, there has been a shift towards novel hybrid database architectures combining the properties of transaction processing (OLTP) and analytical processing (OLAP). OLTP workloads are made up of read and write operations on a small number of rows and are typically addressed by indexes such as B+trees. On the other hand, OLAP workloads consist of large read operations that scan larger parts of the dataset. To address both workloads some databases introduced an architecture using a buffer or delta partition. Precisely, changes are accumulated in a write-optimized delta partition while the rest of the data is compressed in the read-optimized main partition. Periodically, the delta storage is merged into the main partition. In this paper we investigate for the first time how this architecture can be implemented and behaves for RDF graphs. We describe in detail the indexing structures one can use for each partition, the merge process as well as the transactional management. We study the performance of our triple store, which we call qEndpoint, over two popular benchmarks, the Berlin SPARQL Benchmark (BSBM) and the recent Wikidata Benchmark (WDBench). We also study how it compares against other public Wikidata endpoints. This allows us to study the behavior of the triple store for different workloads, as well as the scalability over large RDF graphs. The results show that, compared to the baselines, our triple store allows for improved indexing times, better response time for some queries, higher insert and delete rates, and low disk and memory footprints, making it ideal to store and serve large Knowledge Graphs. [ABSTRACT FROM AUTHOR]
- Published
- 2024
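Note on entry 4: the abstract describes a write-optimized delta partition that is periodically merged into a read-optimized main partition. The following is only a minimal in-memory sketch of that general idea, not the qEndpoint implementation; the class name, method names, and the triples used are hypothetical and chosen purely for illustration.

```python
class DeltaTripleStore:
    """Toy main/delta store: writes go to a small delta, reads consult both."""

    def __init__(self, triples=()):
        # Stands in for the compressed, read-optimized main partition (assumption).
        self.main = frozenset(triples)
        # Delta: triples inserted since the last merge.
        self.added = set()
        # Delta: main-partition triples marked as deleted since the last merge.
        self.deleted = set()

    def insert(self, triple):
        self.deleted.discard(triple)
        if triple not in self.main:
            self.added.add(triple)

    def delete(self, triple):
        self.added.discard(triple)
        if triple in self.main:
            self.deleted.add(triple)

    def match(self, s=None, p=None, o=None):
        """Return all live triples matching the pattern; None acts as a wildcard."""
        live = (self.main - self.deleted) | self.added
        return {
            t for t in live
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        }

    def merge(self):
        """Fold the delta into a new main partition and reset the delta."""
        self.main = frozenset((self.main - self.deleted) | self.added)
        self.added.clear()
        self.deleted.clear()


# Illustrative usage with made-up subject/predicate/object strings.
store = DeltaTripleStore([("Q42", "P31", "Q5")])
store.insert(("Q42", "P106", "Q36180"))
store.delete(("Q42", "P31", "Q5"))
before_merge = store.match(s="Q42")
store.merge()
after_merge = store.match(s="Q42")
```

Queries return the same answer before and after `merge()`; the merge only changes where the data lives, which is the property the two-partition design relies on.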
5. What does it mean to be queer in Wikidata? Practices of gender representation within a transnational online community.
- Author
- Melis, Beatrice, Paolini, Chiara, Fioravanti, Marta, and Metilli, Daniele
- Subjects
- GENDER nonconformity, BINARY gender system, LGBTQ+ studies, LGBTQ+ communities, KNOWLEDGE representation (Information theory)
- Abstract
The continuing digitization and datafication that our society is undergoing are having a significant impact on our daily lives, giving rise to new possibilities but also entailing significant risks for people who are discriminated against or marginalized. Queer communities are particularly affected by these processes; therefore, it is crucially relevant to research transnational digital projects that involve them. In the Wikidata Gender Diversity (WiGeDi) project, we are looking at practices of gender representation in the Wikidata knowledge base, a collaborative online project managed by a worldwide community. Working from the idea that gender is a complex social construct, we investigate how the Wikidata community has approached the complex issue of modeling and populating gender data, progressing from a very narrow interpretation of gender as a binary to a representation that is more inclusive of a multiplicity of gender identities. [ABSTRACT FROM AUTHOR]
- Published
- 2024
6. Teaching Linked Data Principles through Virtual Wikidata Edit‐a‐thons.
- Author
- Swierenga, Marianne
- Subjects
- *LINKED data (Semantic Web), *LIBRARY catalogs, *METADATA, *VIDEOCONFERENCING, *PARTICIPANT observation
- Abstract
Cataloging and metadata professionals in libraries have an interest in learning about the Semantic Web and Linked Data technologies. However, with limited release time and funding for continuing education, professional development opportunities need to be designed with learner needs in mind: free, flexible, and fun. Our regional Linked Data interest group approached this challenge by developing an annual program in the form of a five‐day edit‐a‐thon, providing training and hands‐on experience in creating and editing Wikidata. The flexible format makes use of synchronous video conferencing, detailed online documentation, independent editing days for hands‐on learning, and opt‐in community discussion via Slack. Individual and project progress was tracked with the Wikimedia Event Dashboard. This poster proposes the Wikidata edit‐a‐thon as an ideal method to fulfill a professional development need of library staff by introducing participants to the principles of Linked Data through this flexible and interactive event. [ABSTRACT FROM AUTHOR]
- Published
- 2024
7. Reflections on the PCC Wikidata Pilot at UCLA Library: Undertaking the PCC Learning Objectives
- Author
- Zhang, Erica, Biswas, Paromita, and Dagher, Iman
- Subjects
- Information and Computing Sciences, Library and Information Studies, Wikidata, identity management, linked data, Program for Cooperative Cataloging, metadata, Information & Library Sciences, Library and information studies
- Abstract
In 2020, the Program for Cooperative Cataloging (PCC) Task Group on Identity Management in NACO sponsored a 14-month PCC Wikidata Pilot, complete with learning objectives, for participants to experiment with Wikidata, an open linked data platform. UCLA Library joined the Pilot to create and edit Wikidata items related to UCLA Library’s collections and UCLA Library entities. With the Pilot’s conclusion, the UCLA Library Pilot team reflected on lessons learned. By assessing UCLA Library’s experience against the Pilot’s learning objectives, the authors hope to contribute on-the-ground insights that may be relevant to PCC’s progress toward identity management, and the role Wikidata may play in this transition.
- Published
- 2023
8. Enhancing knowledge graphs with microdata and LLMs: the case of Schema.org and Wikidata in touristic information
- Author
- Gonzalez-Garcia, Lino, González-Carreño, Gema, Rivas Machota, Ana María, and Padilla Fernández-Vega, Juan
- Published
- 2024
9. Assessing knowledge organization systems from a gender perspective: Wikipedia taxonomy and Wikidata ontologies.
- Author
- Centelles, Miquel and Ferran-Ferrer, Núria
- Subjects
- *INFORMATION retrieval, *NONBINARY people, *TAXONOMY, *GENDER identity, *STANDARDS, *HEURISTIC
- Abstract
Purpose: Develop a comprehensive framework for assessing the knowledge organization systems (KOSs), including the taxonomy of Wikipedia and the ontologies of Wikidata, with a specific focus on enhancing management and retrieval with a gender nonbinary perspective. Design/methodology/approach: This study employs heuristic and inspection methods to assess Wikipedia's KOS, ensuring compliance with international standards. It evaluates the efficiency of retrieving non-masculine gender-related articles using the Catalan Wikipedian category scheme, identifying limitations. Additionally, a novel assessment of Wikidata ontologies examines their structure and coverage of gender-related properties, comparing them to Wikipedia's taxonomy for advantages and enhancements. Findings: This study evaluates Wikipedia's taxonomy and Wikidata's ontologies, establishing evaluation criteria for gender-based categorization and exploring their structural effectiveness. The evaluation process suggests that Wikidata ontologies may offer a viable solution to address Wikipedia's categorization challenges. Originality/value: The assessment of Wikipedia categories (taxonomy) based on KOS standards leads to the conclusion that there is ample room for improvement, not only in matters concerning gender identity but also in the overall KOS to enhance search and retrieval for users. These findings bear relevance for the design of tools to support information retrieval on knowledge-rich websites, as they assist users in exploring topics and concepts. [ABSTRACT FROM AUTHOR]
- Published
- 2024
10. Quantifying Americanization: Coverage of American Topics in Different Wikipedias.
- Author
- Konieczny, Piotr and Lewoniewski, Włodzimierz
- Subjects
- *ELECTRONIC encyclopedias, *AMERICANIZATION, *INFORMATION resources, *DOMINANT language, *DATA mining
- Abstract
As one of the most popular sources of information in the world, Wikipedia is edited by a large, global community of contributors. The user-generated nature of this online encyclopedia ensures that the information reflects a wide range of topics. However, Wikipedia articles are created and edited independently in each language version. Therefore, some topics may be presented with varying degrees of completeness depending on their importance in a particular language community. In this paper, we quantified the concept of Americanization on a global scale through comparative analysis of the coverage of American topics in different language versions of Wikipedia. For this purpose, we analyzed over 90 million Wikidata items and 40 million Wikipedia articles in 58 languages. We discussed whether Americanization is more or less dominant in different languages, regions, and cultures. We showed that the interest in American topics is not universal. Western, developed countries are more Americanized (more interested in topics related to America) than the rest of the world. This is the first global, quantitative confirmation of issues often hypothesized, or assumed, in the literature on Americanization and related phenomena. This study shows that Wikipedia and Wikidata can allow quantification of social science concepts that previously were considered not realistically measurable. Finally, the presented research is also relevant to the discourses on the biases of Wikipedia. [ABSTRACT FROM AUTHOR]
- Published
- 2024
11. Combining language models for knowledge extraction from Italian TEI editions
- Author
- Cristian Santini
- Subjects
- large language models (LLMs), knowledge extraction, Semantic Web, Wikidata, TEI/XML, Giacomo Leopardi, Electronic computers. Computer science, QA75.5-76.95
- Abstract
This study investigates the integration of language models for knowledge extraction (KE) from Italian TEI/XML encoded texts, focusing on Giacomo Leopardi's works. The objective is to create structured, machine-readable knowledge graphs (KGs) from unstructured texts for better exploration and linkage to external resources. The research introduces a methodology that combines large language models (LLMs) with traditional relation extraction (RE) algorithms to overcome the limitations of current models with Italian literary documents. The process adopts a multilingual LLM, that is, ChatGPT, to extract natural language triples from the text. These are then converted into RDF/XML format using the REBEL model, which maps natural language relations to Wikidata properties. A similarity-based filtering mechanism using SBERT is applied to keep semantic consistency. The final RDF graph integrates these filtered triples with document metadata, utilizing established ontologies and controlled vocabularies. The research uses a dataset of 41 TEI/XML files from a semi-diplomatic edition of Leopardi's letters as case study. The proposed KE pipeline significantly outperformed the baseline model, that is, mREBEL, with remarkable improvements in semantic accuracy and consistency. An ablation study demonstrated that combining LLMs with traditional RE models enhances the quality of KGs extracted from complex texts. The resulting KG had fewer, but semantically richer, relations, predominantly related to Leopardi's literary activities and health, highlighting the extracted knowledge's relevance to understanding his life and work.
- Published
- 2024
12. A framework for integrating biomedical knowledge in Wikidata with open biological and biomedical ontologies and MeSH keywords
- Author
- Houcemeddine Turki, Khalil Chebil, Bonaventure F.P. Dossou, Chris Chinenye Emezue, Abraham Toluwase Owodunni, Mohamed Ali Hadj Taieb, and Mohamed Ben Aouicha
- Subjects
- Wikidata, Open biological and biomedical ontologies, MeSH keywords, Biomedical relation identification, Crowdsourcing, PubMed, Science (General), Q1-390, Social sciences (General), H1-99
- Abstract
This study presents a comprehensive framework to enhance Wikidata as an open and collaborative knowledge graph by integrating Open Biological and Biomedical Ontologies (OBO) and Medical Subject Headings (MeSH) keywords from PubMed publications. The primary data sources include OBO ontologies and MeSH keywords, which were collected and classified using SPARQL queries for RDF knowledge graphs. The semantic alignment between OBO ontologies and Wikidata was evaluated, revealing significant gaps and distorted representations that necessitate both automated and manual interventions for improvement. We employed pointwise mutual information to extract biomedical relations among the 5000 most common MeSH keywords in PubMed, achieving an accuracy of 89.40 % for superclass-based classification and 75.32 % for relation type-based classification. Additionally, Integrated Gradients were utilized to refine the classification by removing irrelevant MeSH qualifiers, enhancing overall efficiency. The framework also explored the use of MeSH keywords to identify PubMed reviews supporting unsupported Wikidata relations, finding that 45.8 % of these relations were not present in PubMed, indicating potential inconsistencies in Wikidata. The contributions of this study include improved methodologies for enriching Wikidata with biomedical information, validated semantic alignments, and efficient classification processes. This work enhances the interoperability and multilingual capabilities of biomedical ontologies and demonstrates the critical role of MeSH keywords in verifying semantic relations, thereby contributing to the robustness and accuracy of collaborative biomedical knowledge graphs.
- Published
- 2024
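Note on entry 12: the abstract names pointwise mutual information (PMI) as the measure used to relate MeSH keywords. As a rough illustration of the standard definition only, PMI(x, y) = log2(p(x, y) / (p(x) p(y))), the sketch below estimates the probabilities from document frequencies. It is not the authors' pipeline; the function name and the toy keyword sets are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def pmi_scores(docs):
    """Return PMI for every keyword pair that co-occurs in at least one document.

    docs: list of keyword sets, one per publication (toy data, an assumption).
    Probabilities are estimated as document frequencies over len(docs).
    """
    n = len(docs)
    single = Counter()
    pair = Counter()
    for keywords in docs:
        unique = sorted(set(keywords))
        single.update(unique)
        # Sorted input makes each pair key (x, y) satisfy x < y.
        pair.update(combinations(unique, 2))
    return {
        (x, y): math.log2((count / n) / ((single[x] / n) * (single[y] / n)))
        for (x, y), count in pair.items()
    }


# Made-up keyword sets standing in for per-article keyword lists.
docs = [
    {"Aspirin", "Headache"},
    {"Aspirin", "Headache"},
    {"Aspirin", "Fever"},
    {"Insulin", "Diabetes Mellitus"},
]
scores = pmi_scores(docs)
```

Pairs that only ever appear together score highest, while a keyword that co-occurs with many others is discounted by its own frequency. Pairs that never co-occur are simply absent from the result rather than assigned negative infinity.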
13. A Systematic Review of Wikidata in GLAM Institutions: a Labs Approach
- Author
- Candela, Gustavo, Cuper, Mirjam, Holownia, Olga, Gabriëls, Nele, Dobreva, Milena, Mahey, Mahendra, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Antonacopoulos, Apostolos, editor, Hinze, Annika, editor, Piwowarski, Benjamin, editor, Coustaty, Mickaël, editor, Di Nunzio, Giorgio Maria, editor, Gelati, Francesco, editor, and Vanderschantz, Nicholas, editor
- Published
- 2024
14. Enriching Archival Linked Data Descriptions with Information from Wikidata and DBpedia
- Author
- Koch, Inês, Ribeiro, Cristina, Poveda-Villalón, María, Rico, Mariano, Teixeira Lopes, Carla, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Antonacopoulos, Apostolos, editor, Hinze, Annika, editor, Piwowarski, Benjamin, editor, Coustaty, Mickaël, editor, Di Nunzio, Giorgio Maria, editor, Gelati, Francesco, editor, and Vanderschantz, Nicholas, editor
- Published
- 2024
15. Discovering Relationships Among Properties in Wikidata Knowledge Graph
- Author
- Niazmand, Emetis, Vidal, Maria-Esther, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Wrembel, Robert, editor, Chiusano, Silvia, editor, Kotsis, Gabriele, editor, Tjoa, A Min, editor, and Khalil, Ismail, editor
- Published
- 2024
16. Analysis of the Successful and Bankrupt Digital Currency Exchanges Based on Open Data
- Author
- Stolarski, Piotr, Lewoniewski, Włodzimierz, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Hernes, Marcin, editor, and Wątróbski, Jarosław, editor
- Published
- 2024
17. Closer Reading of RDF Generated by NLP on Wikipedia Biography: Comparative Analysis
- Author
- Sugimoto, Go, Daza, Angel, Boer, Victor de, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, Garoufallou, Emmanouel, editor, and Sartori, Fabio, editor
- Published
- 2024
18. Method for Linking Named Entities to Wikidata Concepts for Russian Texts
- Author
- Teslya, Nikolay, Shutiuk, Vsevolod, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Abraham, Ajith, editor, Bajaj, Anu, editor, Hanne, Thomas, editor, and Siarry, Patrick, editor
- Published
- 2024
19. Using WikiData for Handling Legal Rule Exceptions: Proof of Concept
- Author
- Fungwacharakorn, Wachara, Takeda, Hideaki, Satoh, Ken, Goos, Gerhard, Series Editor, Hartmanis, Juris, Founding Editor, van Leeuwen, Jan, Series Editor, Hutchison, David, Editorial Board Member, Kanade, Takeo, Editorial Board Member, Kittler, Josef, Editorial Board Member, Kleinberg, Jon M., Editorial Board Member, Kobsa, Alfred, Series Editor, Mattern, Friedemann, Editorial Board Member, Mitchell, John C., Editorial Board Member, Naor, Moni, Editorial Board Member, Nierstrasz, Oscar, Series Editor, Pandu Rangan, C., Editorial Board Member, Sudan, Madhu, Series Editor, Terzopoulos, Demetri, Editorial Board Member, Tygar, Doug, Editorial Board Member, Weikum, Gerhard, Series Editor, Vardi, Moshe Y, Series Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Woeginger, Gerhard, Editorial Board Member, Bono, Mayumi, editor, Takama, Yasufumi, editor, Satoh, Ken, editor, Nguyen, Le-Minh, editor, and Kurahashi, Setsuya, editor
- Published
- 2024
20. Generate and Update Large HDT RDF Knowledge Graphs on Commodity Hardware
- Author
- Willerval, Antoine, Diefenbach, Dennis, Bonifati, Angela, Hartmanis, Juris, Founding Editor, Goos, Gerhard, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Meroño Peñuela, Albert, editor, Dimou, Anastasia, editor, Troncy, Raphaël, editor, Hartig, Olaf, editor, Acosta, Maribel, editor, Alam, Mehwish, editor, Paulheim, Heiko, editor, and Lisena, Pasquale, editor
- Published
- 2024
21. ULKB Logic: A HOL-Based Framework for Reasoning over Knowledge Graphs
- Author
- Lima, Guilherme, Rademaker, Alexandre, Uceda-Sosa, Rosario, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Barbosa, Haniel, editor, and Zohar, Yoni, editor
- Published
- 2024
22. The Formation of the Idea of the Library as an Institution in 18th-Century Europe. A Qualitative and Quantitative Approach.
- Author
- Bianchini, Carlo, Mancini, Lorenzo, and Sabba, Fiammetta
- Subjects
- *RESEARCH libraries, *LINKED data (Semantic Web), *NATURAL language processing, *HISTORY of libraries, *TEXT recognition, *SEMANTIC Web
- Abstract
The paper illustrates the LIBMOVIT project – Libraries on the Move: Scholars, Books, Ideas Traveling in Italy in the 18th Century – whose main research focus is the European Eighteenth century socio-cultural framework in which the library as an institution acquired an historical, social, public and dynamic dimension. This context will be analysed through a study of the Eighteenth century sources connected to the learned journey experience of the Grand Tour, in particular those contained in the Angiolo Tursi collection – one of the largest travel literature collections in Italy – held at the Marciana national library in Venice. The paper presents the planned approach of the research: first, a classification and an organization of a corpus of relevant documents for the knowledge of travel literature in connection to the libraries world will be created; in particular, the sources will be identified, further bibliographical information will be added, and new sources will be integrated to the corpus and selected documents will be digitized. After that, the research will proceed through a double analysis – traditional and computational – of the texts collected in the corpus is to be developed. First, all the library and bibliographical aspects described by travellers will be studied according to the traditional approach in humanities research to collect important information about the history of libraries (location, decoration, catalogues, opening hours, access, collections, cited books and documents), the travellers and their companions (professions, nationality, reason to travel), the people met (scholars, librarians, superintendents) and the subjects and ideas discussed during the visits in the libraries. Second, the texts will be computationally analysed through several Natural Language Processing (NLP) techniques, starting from the automatic text recognition until arriving to more complex lexical and terminological analysis and Named Entity Recognition (NER). 
This work is meant to support the previously described qualitative study and will also allow to produce Linked open data about the domain entities (e.g. libraries, people, books) in view of their publication in the semantic web in order to ease and promote their exploration, visualisation and reuse. [ABSTRACT FROM AUTHOR]
- Published
- 2024
23. A study of concept similarity in Wikidata.
- Author
- Ilievski, Filip, Shenoy, Kartik, Chalupsky, Hans, Klein, Nicholas, and Szekely, Pedro
- Subjects
- LANGUAGE models, KNOWLEDGE graphs, RETROFITTING, CROWDSOURCING, ARTIFICIAL intelligence, MULTICASTING (Computer networks)
- Abstract
Robust estimation of concept similarity is crucial for applications of AI in the commercial, biomedical, and publishing domains, among others. While the related task of word similarity has been extensively studied, resulting in a wide range of methods, estimating concept similarity between nodes in Wikidata has not been considered so far. In light of the adoption of Wikidata for increasingly complex tasks that rely on similarity, and its unique size, breadth, and crowdsourcing nature, we propose that conceptual similarity should be revisited for the case of Wikidata. In this paper, we study a wide range of representative similarity methods for Wikidata, organized into three categories, and leverage background information for knowledge injection via retrofitting. We measure the impact of retrofitting with different weighted subsets from Wikidata and ProBase. Experiments on three benchmarks show that the best performance is achieved by pairing language models with rich information, whereas the impact of injecting knowledge is most positive on methods that originally do not consider comprehensive information. The performance of retrofitting is conditioned on the selection of high-quality similarity knowledge. A key limitation of this study, similar to prior work lies in the limited size and scope of the similarity benchmarks. While Wikidata provides an unprecedented possibility for a representative evaluation of concept similarity, effectively doing so remains a key challenge. [ABSTRACT FROM AUTHOR]
- Published
- 2024
24. LIS Journals' Lack of Participation in Wikidata Item Creation
- Author
- Eric Willey and Susan Radovsky
- Subjects
- wikidata, metadata, scholarly publishing, journal article metadata, linked data, linked open data, Bibliography. Library science. Information resources
- Abstract
There are many items in Wikidata representing scholarly articles. However, these items have been created mostly by volunteer Wikidata editors and not systematically by journal publishers or editors, which can lead to gaps and inconsistencies in the datasets. This article presents findings from a survey investigating practices of library and information studies (LIS) journals in Wikidata item creation. Believing that a significant number of LIS journal editors would be aware of Wikidata and some would be creating Wikidata items for their publications, the authors sent a survey asking 138 English-language LIS journal editors if they created Wikidata items for materials published in their journal and follow-up questions. With a response rate of 41 percent, respondents overwhelmingly indicated that they did not create Wikidata items for materials published in their journal and were completely unaware of or only somewhat familiar with Wikidata. Respondents indicated that more familiarity with Wikidata and its benefits for scholarly journals as well as institutional support for the creation of Wikidata items could lead to greater participation; however, a campaign of education about Wikidata, documentation of benefits, and support for creation would be a necessary first step. The article presents and discusses the results of the survey, but the conclusions that can be drawn are minimal; therefore, the authors also discuss the benefits of creating Wikidata items for LIS journals as a first step in this educational campaign for editors and publishers.
- Published
- 2024
25. Transforming higher education: a decade of integrating wikipedia and wikidata for literacy enhancement and social impact
- Author
- Evenstein Sigalov, Shani, Cohen, Anat, and Nachmias, Rafi
- Published
- 2024
26. Wikipedia y los centros educativos de enseñanza superior de Portugal.
- Author
- Obregón, Angel and Rodrigues Costa, Pedro
- Subjects
- *ENCYCLOPEDIAS & dictionaries, *PORTUGUESE language, *CLASSIFICATION, *DISCOURSE, *ELECTRONIC encyclopedias, *LANGUAGE & languages
- Abstract
Since its founding, Wikipedia has been one of the most visited resources in the world, and has been analyzed both as a communicative and educational tool. This research focuses on studying its content in reference to higher educational centers in Portugal. It also seeks to analyze the languages that best represent these educational centers on Wikipedia and check whether the discourse of their articles has been created thanks to collaborative and free work. To do this, a classification was created with all the higher institutions, and Wikidata was searched to see whether each had an item and in which Wikipedia languages it was present. This data was saved and a bot was programmed to collect the visits they received, their size, the number of sources on which they are based, the edits that have been recorded in them and by whom they have been made. 102 higher institutions were found, of which 81 were present on Wikidata. Of these, 60 had an article on Wikipedia in any of the active languages of the encyclopedia, with the Portuguese version having the highest number of articles created with 56. The universities of Coimbra, Lisbon and Porto were the most important, showing that the discourse of their articles is the result of the collaborative work of many editors. The rest of the higher institutions showed very diverse data, sometimes with content from accounts for particular purposes, especially in certain private institutions. Therefore, it can be concluded that the content of Wikipedia articles is very varied, depending on the importance of the institutions and the particular interest of the community editors. In general, the most prestigious institutions are more closely monitored, have more editors in charge, and their discourse is more varied than that of smaller centers. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
27. Wiki3DRank: un modelo para medir la relevancia de objetos de conocimiento mediante datos cuantitativos de Wikidata y Wikipedia.
- Author
-
Antonio PASTOR-SÁNCHEZ, Juan, SAORÍN, Tomás, and BAÑOS MORENO, María-José
- Abstract
This research introduces the Wiki3DRank, a model combining real-time extracted quantitative data from Wikidata and Wikipedia to obtain a ranking of knowledge objects through a quantitative value that measures the relevance of one object compared to others in a specific domain. The model is based on the distribution of knowledge objects in a vector space, whose components are based on three main variables: the number of statements on Wikidata about an item, the number of articles in different Wikipedia editions, and the length in number of words of these articles. These variables are associated with the level of description of the Wikidata items, the dissemination of the referred knowledge objects in Wikipedia editions in different languages, and the degree of editorial elaboration of the corresponding Wikipedia articles. To demonstrate the viability of the model, a series of use cases across various domains are analysed: books, movies, cathedrals, earthquakes, rivers, and chemical elements. From the results obtained, it is possible to conclude that Wiki3DRank is a tool that makes it possible to measure the relevance of knowledge objects in the context of a knowledge domain. The operation of an open-source tool that enables the online calculation of Wiki3DRank is presented. The results suggest that the proposed model can be applied to different contexts and domains and that it is easy to expand by adding weighting elements and extending the model with new components based on other characteristics of the encyclopaedic data of the knowledge objects, while the base vector calculation system is maintained. [ABSTRACT FROM AUTHOR]
- Published
- 2024
28. Wiki3DRank: A model for measuring the relevance of knowledge objects using quantitative data from Wikidata and Wikipedia.
- Author
-
Antonio PASTOR-SÁNCHEZ, Juan, SAORÍN, Tomás, and BAÑOS MORENO, María-José
- Subjects
- *
MEASUREMENT - Abstract
This research introduces the Wiki3DRank, a model combining real-time extracted quantitative data from Wikidata and Wikipedia to obtain a ranking of knowledge objects through a quantitative value that measures the relevance of one object compared to others in a specific domain. The model is based on the distribution of knowledge objects in a vector space, whose components are based on three main variables: the number of statements on Wikidata about an item, the number of articles in different Wikipedia editions, and the length in number of words of these articles. These variables are associated with the level of description of the Wikidata items, the dissemination of the referred knowledge objects in Wikipedia editions in different languages, and the degree of editorial elaboration of the corresponding Wikipedia articles. To demonstrate the viability of the model, a series of use cases across various domains are analysed: books, movies, cathedrals, earthquakes, rivers, and chemical elements. From the results obtained, it is possible to conclude that Wiki3DRank is a tool that makes it possible to measure the relevance of knowledge objects in the context of a knowledge domain. The operation of an open-source tool that enables the online calculation of Wiki3DRank is presented. The results suggest that the proposed model can be applied to different contexts and domains and that it is easy to expand by adding weighting elements and extending the model with new components based on other characteristics of the encyclopaedic data of the knowledge objects, while the base vector calculation system is maintained. [ABSTRACT FROM AUTHOR]
- Published
- 2024
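The abstract above describes placing each knowledge object in a three-component vector space (Wikidata statements, number of Wikipedia editions, article length in words) but does not give the formula. The following is only an illustrative sketch of that general idea, assuming each component is scaled by the domain maximum and relevance is taken as the Euclidean length of the resulting vector; the scaling, the function name `relevance_ranking`, and the sample figures are all hypothetical and are not taken from the Wiki3DRank paper or its tool.

```python
import math

# Hypothetical sample data: (Wikidata statements, Wikipedia editions, total words).
# These numbers are invented for illustration only.
SAMPLE_ITEMS = {
    "Object A": (100, 50, 20000),
    "Object B": (50, 25, 10000),
}

def relevance_ranking(items):
    """Rank items by the length of their scaled three-component vector.

    Assumption (not from the source): each component is divided by the
    largest value of that component within the domain before the
    Euclidean norm is taken.
    """
    # Largest value per component; fall back to 1 to avoid dividing by zero.
    maxima = [max(vec[i] for vec in items.values()) or 1 for i in range(3)]
    scores = {
        name: math.sqrt(sum((vec[i] / maxima[i]) ** 2 for i in range(3)))
        for name, vec in items.items()
    }
    # Highest score first.
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)

ranking = relevance_ranking(SAMPLE_ITEMS)
```

Scaling by the domain maximum keeps the word count from dominating the other two components; a weighted variant of the kind the abstract mentions could multiply each scaled component by a weight before taking the norm.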
29. Riconciliare le voci di autorità in SBN con Wikidata. Progressi e prospettive dopo un decennio di lavoro (2013-2023).
- Author
-
Pellizzari di San Girolamo, Camillo Carlo
- Abstract
The article retraces diachronically the history of the reconciliation of Opac SBN's authority records with Wikidata, from the perspective of the editing activity on Wikidata; the article deals with the creation of the property used in Wikidata to connect to Opac SBN's name authority file (P396), the progressive growth of the number of values of P396 in Wikidata (compared with the identifiers of other European authority files), the reconciliation strategies that have been adopted and the main difficulties which have slowed down this process, and finally the relevant novelties brought firstly by the adoption of the new Opac SBN and secondly by the recent decisions taken by ICCU in order to promote a greater data openness. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
30. From Authority Work to Wikidata (Via Wikipedia): Implementing a Workflow for Original Catalogers.
- Author
-
Tashlitskyy, Roman and Soglasnova, Lana
- Subjects
- *
WORKFLOW , *SLAVIC languages , *ACADEMIC libraries - Abstract
This is a practical reflection on how a library cataloger can learn to create Wikidata items based on Slavic language publications, and to maintain the practice as part of the workflow. We describe our experience at the University of Toronto Robarts Library, and the path to adopting a working model for regularly contributing Wikidata items based on the cataloger's original authority work for NACO. [ABSTRACT FROM AUTHOR]
- Published
- 2024
- Full Text
- View/download PDF
31. Monogràfic a 'BID' sobre webs de dades i grafs de coneixements.
- Author
-
Centelles Velilla, Miquel
- Published
- 2023
- Full Text
- View/download PDF
32. Creating a multi-linked dynamic dataset: a case study of plant genera named for women.
- Author
-
von Mering, Sabine, Gardiner, Lauren Maria, Knapp, Sandra, Lindon, Heather, Leachman, Siobhan, Ulloa, Carmen Ulloa, Vincent, Sarah, and Vorontsova, Maria S.
- Subjects
SCIENCE databases ,SOCIAL media ,BOTANICAL nomenclature ,WOMEN botanists ,PLANT classification - Abstract
Background: A discussion on social media led to the formation of a multidisciplinary group working on this project to highlight women's contributions to science. The role of marginalised groups in science has been a topic of much discussion, but data on these contributions are largely lacking. Our motivation for the development of this dataset was not only to highlight names of plant genera that honour women, but to enrich this information with data that would allow the names, roles and lives of these women to be shared more widely with others, both researchers and data sources like Wikidata. Amplification of the contributions of women to botany through multiple means will enable the community to better recognise and celebrate the role of this particular marginalised group in the history and development of science. New information: The innovative approach of our study resulted in a dataset that is dynamic, expansive and widely shared. We have published a static dataset with this paper and have also created a dynamic dataset by linking flowering plant genera and the women in whose honour those genera were named in Wikidata. This concurrent addition of the data to Wikidata, a linked open data repository, enabled it to be enriched, queried and proactively shared during the whole process of dataset creation and into the future. This innovative workflow allowed wide, open participation throughout the research process. The methodology and workflows applied can be used to create future datasets celebrating and amplifying the contributions of marginalised groups in science. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
33. The Formation of the Idea of the Library as an Institution in 18th-Century Europe. A Qualitative and Quantitative Approach
- Author
-
Carlo Bianchini, Lorenzo Mancini, and Fiammetta Sabba
- Subjects
Library history ,Travel literature ,Natural Language Processing ,Wikidata ,Qualitative research ,Quantitative research. ,Bibliography. Library science. Information resources - Abstract
The paper illustrates the LIBMOVIT project (Libraries on the Move: Scholars, Books, Ideas Traveling in Italy in the 18th Century), whose main research focus is the eighteenth-century European socio-cultural framework in which the library as an institution acquired a historical, social, public and dynamic dimension. This context will be analysed through a study of the eighteenth-century sources connected to the learned journey experience of the Grand Tour, in particular those contained in the Angiolo Tursi collection, one of the largest travel literature collections in Italy, held at the Marciana national library in Venice. The paper presents the planned approach of the research: first, a corpus of documents relevant to the knowledge of travel literature in connection with the world of libraries will be classified and organized; in particular, the sources will be identified, further bibliographical information will be added, new sources will be integrated into the corpus and selected documents will be digitized. After that, the research will proceed through a double analysis, traditional and computational, of the texts collected in the corpus. First, all the library and bibliographical aspects described by travellers will be studied according to the traditional approach in humanities research to collect important information about the history of libraries (location, decoration, catalogues, opening hours, access, collections, cited books and documents), the travellers and their companions (professions, nationality, reason to travel), the people met (scholars, librarians, superintendents) and the subjects and ideas discussed during the visits to the libraries. Second, the texts will be computationally analysed through several Natural Language Processing (NLP) techniques, from automatic text recognition to more complex lexical and terminological analysis and Named Entity Recognition (NER).
This work is meant to support the previously described qualitative study and will also make it possible to produce Linked Open Data about the domain entities (e.g. libraries, people, books) with a view to publishing them on the semantic web in order to ease and promote their exploration, visualisation and reuse.
- Published
- 2024
- Full Text
- View/download PDF
34. PhiloBiblon y el mundo wiki
- Author
-
Faulhaber, Charles
- Subjects
Medieval Spanish literature ,digital humanities ,wikidata ,triples ,triplestores ,linked open data ,technological changes ,databases ,information technology - Abstract
After a homage to Gemma Avenoza and a brief review of the technological changes in PhiloBiblon since 1987, there follows a description of the current project, “PhiloBiblon: From Siloed Databases to Linked Open Data via Wikibase: Proof of Concept”. The current rigid structure, with its ten relational tables and almost 1,300 data fields, is abandoned in favor of a much more flexible structure of “triples,” records based on a series of statements of the type Entity + Property + Entity, using the system of Wikidata, in which two entities Q are linked by a property P, e.g., Santillana (Q2877) writes (P50) the “Comedieta de Ponza” (Q390408).
- Published
- 2022
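The Entity + Property + Entity structure described in the abstract above can be sketched in a few lines. This is a minimal illustration, not PhiloBiblon's or Wikibase's actual data model: the `Triple` type, the `LABELS` table and both helper functions are hypothetical, and the identifiers and their reading are copied from the abstract's example as given, without checking them against Wikidata.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    """One statement: two entities (Q...) linked by a property (P...)."""
    subject: str
    prop: str
    obj: str

# Labels as worded in the abstract's example (assumed, not verified).
LABELS = {
    "Q2877": "Santillana",
    "P50": "writes",
    "Q390408": "Comedieta de Ponza",
}

# A tiny in-memory "triplestore": just a list of statements.
TRIPLES = [Triple("Q2877", "P50", "Q390408")]

def describe(triple, labels):
    """Render a triple as 'label (id) label (id) label (id)'."""
    return " ".join(f"{labels.get(part, part)} ({part})" for part in triple)

def objects_of(triples, subject, prop):
    """Return every object linked to `subject` by `prop`."""
    return [t.obj for t in triples if t.subject == subject and t.prop == prop]

sentence = describe(TRIPLES[0], LABELS)
works = objects_of(TRIPLES, "Q2877", "P50")
```

Because every record has the same three-part shape, adding a new kind of fact means adding a new property identifier rather than a new column or table, which is the flexibility the abstract contrasts with the fixed relational layout.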
35. Exploring the Opportunities and Challenges in Contributing to Tamil Wikimedia
- Author
-
Navaneethakrishnan, Subalalitha Chinnaudayar, Thangasamy, Sathiyaraj, R, Nithya, Info-farmer, Neechalkaran, Filipe, Joaquim, Editorial Board Member, Ghosh, Ashish, Editorial Board Member, Prates, Raquel Oliveira, Editorial Board Member, Zhou, Lizhu, Editorial Board Member, M, Anand Kumar, editor, Chakravarthi, Bharathi Raja, editor, B, Bharathi, editor, O’Riordan, Colm, editor, Murthy, Hema, editor, Durairaj, Thenmozhi, editor, and Mandl, Thomas, editor
- Published
- 2023
- Full Text
- View/download PDF
36. Evaluation of a Representative Selection of SPARQL Query Engines Using Wikidata
- Author
-
Lam, An Ngoc, Elvesæter, Brian, Martin-Recuerda, Francisco, Goos, Gerhard, Founding Editor, Hartmanis, Juris, Founding Editor, Bertino, Elisa, Editorial Board Member, Gao, Wen, Editorial Board Member, Steffen, Bernhard, Editorial Board Member, Yung, Moti, Editorial Board Member, Pesquita, Catia, editor, Jimenez-Ruiz, Ernesto, editor, McCusker, Jamie, editor, Faria, Daniel, editor, Dragoni, Mauro, editor, Dimou, Anastasia, editor, Troncy, Raphael, editor, and Hertling, Sven, editor
- Published
- 2023
- Full Text
- View/download PDF
37. Zirkulation und Wertschöpfung am Beispiel literarischer Figuren
- Author
-
Picard, Sophie, Wojcik, Paula, Zarrieß, Sina, Gamper, Michael, Series Editor, Nebrig, Alexander, Series Editor, Müller-Tamm, Jutta, editor, Wachter, David, editor, and Wrobel, Jasmin, editor
- Published
- 2023
- Full Text
- View/download PDF
38. Companies in Multilingual Wikipedia: Articles Quality and Important Sources of Information
- Author
-
Lewoniewski, Włodzimierz, Wȩcel, Krzysztof, Abramowicz, Witold, van der Aalst, Wil, Series Editor, Ram, Sudha, Series Editor, Rosemann, Michael, Series Editor, Szyperski, Clemens, Series Editor, Guizzardi, Giancarlo, Series Editor, Ziemba, Ewa, editor, Chmielarz, Witold, editor, and Wątróbski, Jarosław, editor
- Published
- 2023
- Full Text
- View/download PDF
39. Multilingual Text Generation for Abstract Wikipedia in Grammatical Framework: Prospects and Challenges
- Author
-
Ranta, Aarne, Kacprzyk, Janusz, Series Editor, Loukanova, Roussanka, editor, Lumsdaine, Peter LeFanu, editor, and Muskens, Reinhard, editor
- Published
- 2023
- Full Text
- View/download PDF
40. Multilingual Complementation of Causality Property on Wikidata Based on GPT-3
- Author
-
Jin, Yuxi, Shiramatsu, Shun, Kacprzyk, Janusz, Series Editor, Gomide, Fernando, Advisory Editor, Kaynak, Okyay, Advisory Editor, Liu, Derong, Advisory Editor, Pedrycz, Witold, Advisory Editor, Polycarpou, Marios M., Advisory Editor, Rudas, Imre J., Advisory Editor, Wang, Jun, Advisory Editor, Yang, Xin-She, editor, Sherratt, Simon, editor, Dey, Nilanjan, editor, and Joshi, Amit, editor
- Published
- 2023
- Full Text
- View/download PDF
41. The reconciliation of SBN authority records with Wikidata. Progresses and perspectives after a decade of work (2013-2023)
- Author
-
Camillo Carlo Pellizzari di San Girolamo
- Subjects
Authority control ,Entity management ,Data reconciliation ,Wikidata ,Opac SBN. ,Bibliography. Library science. Information resources - Abstract
The article retraces diachronically the history of the reconciliation of Opac SBN’s authority records with Wikidata, from the perspective of the editing activity on Wikidata; the article deals with the creation of the property used in Wikidata to connect to Opac SBN’s name authority file (P396), the progressive growth of the number of values of P396 in Wikidata (compared with the identifiers of other European authority files), the reconciliation strategies that have been adopted and the main difficulties which have slowed down this process, and finally the relevant novelties brought firstly by the adoption of the new Opac SBN and secondly by the recent decisions taken by ICCU in order to promote a greater data openness.
- Published
- 2024
- Full Text
- View/download PDF
42. Kamptaler Sakrallandschaften im Wikiversum : Edits mit Versionsgeschichte: Elementarteilchen offener Wissensproduktion am Beispiel eines Citizen Science-Projektes
- Author
-
Christian Erlinger and Jens Bemme
- Subjects
regionalgeschichte ,heimatkunde ,wikiversum ,linked open data ,wikidata ,wikimedia commons ,open public humanities ,Bibliography. Library science. Information resources - Abstract
The authors outline how local and regional knowledge in particular emerges and endures through wikis, as regionalia within global, open networks of links. “Grass Root Open Access” does not only mean publishing works freely and under an open license in a do-it-yourself fashion ("I have published my pdf under a cc license on my personal website”). “Grass Root Open Science” also means publishing the content, the data and the images, that is, the knowledge of a published work itself, freely, openly and reproducibly. Using the “wikification” of a printed local-history book as an example, the article shows how open science emerges through grassroots strategies in the Wikiverse. We outline such a process as ‘linked open’: methods and effects of regional data curation as a democratizing practice by means of citizen science, with a view to technologies and communities. With open, wiki-based and therefore decentralized knowledge systems, we potentially influence the calculation and profitability of public and quasi-public investments in educational resources, information infrastructures, research and development.
- Published
- 2023
- Full Text
- View/download PDF
43. Utilización de Wikidata para identificar la brecha de género en el arte público iberoamericano
- Author
-
Ángel Obregón-Sierra and Silvia Cecilia Anselmi
- Subjects
Arte público ,Brecha de género ,Escultura ,Monumento ,Wikidata ,Museums. Collectors and collecting ,AM1-501 ,Bibliography. Library science. Information resources - Abstract
Abstract: This paper presents the results of research into the possible gender gap in public art, based on an analysis of the monuments found in the public spaces of the capitals of Argentina and Spain. After the information on the monuments of both cities was entered into the free knowledge base Wikidata, the typologies, the creators and the subjects represented by these works of art were reviewed. In total, 1,851 monuments in Madrid and 2,016 in Buenos Aires were studied, revealing substantial differences regarding the creators of the works and what they represent. Women artists created 3.48% of the works of art in Madrid and 4.66% in Buenos Aires, while 14.48% of the works in Madrid depict female figures, compared with 10.59% in Buenos Aires. A notable gender gap was confirmed: public monuments made by women sculptors are uncommon, even though in recent years up to twice that proportion of women have worked professionally in the field. The study recommends increasing the number of works created by women artists and the number of times women are represented, as well as undertaking other initiatives to raise their visibility.
- Published
- 2023
- Full Text
- View/download PDF
44. 1Lib1Nearby mit Wikidata
- Author
-
Jens Bemme
- Subjects
Wikidata ,1lib1ref ,1lib1nearby ,Information Literacy ,Digital Literacy ,Data Literacy ,Bibliography. Library science. Information resources - Published
- 2023
- Full Text
- View/download PDF
45. Naked data: curating Wikidata as an artistic medium to interpret prehistoric figurines.
- Author
-
Sant, Toni and Tabone, Enrique
- Subjects
PREHISTORIC figurines ,ART materials ,FIGURINES ,DIGITAL preservation ,ARTISTIC creation ,INSTALLATION art - Abstract
In 2019, Digital Curation Lab Director Toni Sant and the artist Enrique Tabone started collaborating on a research project exploring the visualization of specific data sets through Wikidata for artistic practice. An art installation called Naked Data was developed from this collaboration and exhibited at the Stanley Picker Gallery in Kingston, London, during the DRHA 2022 conference. Through data analysis, employing Wikidata tools, this creative work employs a data set depicting prehistoric female figurines held by Heritage Malta. The artistic research aims to develop a creative workflow model for processing essential information about art collections, museum policies, and ways to engage with cultural heritage through data. This article outlines the key elements involved in this practice-based research work and shares the artistic process involving the visualizing of the scientific data with special attention to the aesthetic qualities afforded by this technological engagement. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
46. Hypotheses in urban ecology: building a common knowledge base.
- Author
-
Lokatis, Sophie, Jeschke, Jonathan M., Bernard‐Verdier, Maud, Buchholz, Sascha, Grossart, Hans‐Peter, Havemann, Frank, Hölker, Franz, Itescu, Yuval, Kowarik, Ingo, Kramer‐Schadt, Stephanie, Mietchen, Daniel, Musseau, Camille L., Planillo, Aimara, Schittko, Conrad, Straka, Tanja M., and Heger, Tina
- Subjects
- *
URBAN ecology , *KNOWLEDGE base , *HYPOTHESIS , *KNOWLEDGE transfer , *BIOTIC communities , *NETWORK analysis (Planning) - Abstract
Urban ecology is a rapidly growing research field that has to keep pace with the pressing need to tackle the sustainability crisis. As an inherently multi‐disciplinary field with close ties to practitioners and administrators, research synthesis and knowledge transfer between those different stakeholders is crucial. Knowledge maps can enhance knowledge transfer and provide orientation to researchers as well as practitioners. A promising option for developing such knowledge maps is to create hypothesis networks, which structure existing hypotheses and aggregate them according to topics and research aims. Combining expert knowledge with information from the literature, we here identify 62 research hypotheses used in urban ecology and link them in such a network. Our network clusters hypotheses into four distinct themes: (i) Urban species traits & evolution, (ii) Urban biotic communities, (iii) Urban habitats and (iv) Urban ecosystems. We discuss the potentials and limitations of this approach. All information is openly provided as part of an extendable Wikidata project, and we invite researchers, practitioners and others interested in urban ecology to contribute additional hypotheses, as well as comment and add to the existing ones. The hypothesis network and Wikidata project form a first step towards a knowledge base for urban ecology, which can be expanded and curated to benefit both practitioners and researchers. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
47. Investigating the potential of the semantic web for education: Exploring Wikidata as a learning platform.
- Author
-
Evenstein Sigalov, Shani and Nachmias, Rafi
- Subjects
SEMANTIC Web ,TECHNOLOGICAL innovations ,EDUCATIONAL technology ,CAREER development ,COLLABORATIVE learning - Abstract
Wikidata is a free, multilingual, open knowledge-base that stores structured, linked data. It has grown rapidly and as of December 2022 contains over 100 million items and millions of statements, making it the largest semantic knowledge-base in existence. Changing the interaction between people and knowledge, Wikidata offers various learning opportunities, leading to new applications in sciences, technology and cultures. These learning opportunities stem in part from the ability to query this data and ask questions that were difficult to answer in the past. They also stem from the ability to visualize query results, for example on a timeline or a map, which, in turn, helps users make sense of the data and draw additional insights from it. Research on the semantic web as learning platform and on Wikidata in the context of education is almost non-existent, and we are just beginning to understand how to utilize it for educational purposes. This research investigates the Semantic Web as a learning platform, focusing on Wikidata as a prime example. To that end, a methodology of multiple case studies was adopted, demonstrating Wikidata uses by early adopters. Seven semi-structured, in-depth interviews were conducted, out of which 10 distinct projects were extracted. A thematic analysis approach was deployed, revealing eight main uses, as well as benefits and challenges to engaging with the platform. The results shed light on Wikidata's potential as a lifelong learning process, enabling opportunities for improved Data Literacy and a worldwide social impact. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
48. Apertura radical y conocimiento libre: repositorio de revistas académicas mexicanas de acceso abierto a través de Wikidata.
- Author
-
Álvarez-Azcárraga, Luis
- Subjects
- *
INSTITUTIONAL repositories , *SCHOLARLY periodicals , *OPEN scholarship , *WOMEN authors , *DATABASES , *REFERENCE sources , *ELECTRONIC encyclopedias - Abstract
This article proposes the possibility of creating an open and free repository of open-access academic journals using Wikidata, Wikipedia, and Zotero. The project starts with only Mexican social sciences and humanities journals. To this effect, some open indices and datasets are taken as a foundation: DOAJ, Latindex, Dialnet, and Latinrev. Even though there are open datasets for academic journals, this work proposes the creation of a database based on radical openness from Wikimedia projects, following the principles of free knowledge and the participation of anyone in the construction of open science. Storing academic journals and papers in Wikipedia also allows performing analyses and cross-referencing information, which may be a complex task in other repositories. This article shows the preliminary results of analysis regarding the publishers, topics, and licenses of Mexican academic journals in the field of social sciences and humanities which have thus far been updated in Wikidata. Another aspect of this study is that it suggests that this open database for academic publications also allows indexing papers and men and women authors, as well as other topics derived from the items added, along with the use of these articles as sources or references for Wikipedia articles. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
49. Discovery and recognition of formula concepts using machine learning.
- Author
-
Scharpf, Philipp, Schubotz, Moritz, Cohl, Howard S., Breitinger, Corinna, and Gipp, Bela
- Abstract
Citation-based Information Retrieval (IR) methods for scientific documents have proven effective for IR applications, such as Plagiarism Detection or Literature Recommender Systems in academic disciplines that use many references. In science, technology, engineering, and mathematics, researchers often employ mathematical concepts through formula notation to refer to prior knowledge. Our long-term goal is to generalize citation-based IR methods and apply this generalized method to both classical references and mathematical concepts. In this paper, we suggest how mathematical formulas could be cited and define a Formula Concept Retrieval task with two subtasks: Formula Concept Discovery (FCD) and Formula Concept Recognition (FCR). While FCD aims at the definition and exploration of a 'Formula Concept' that names bundled equivalent representations of a formula, FCR is designed to match a given formula to a prior assigned unique mathematical concept identifier. We present machine learning-based approaches to address the FCD and FCR tasks. We then evaluate these approaches on a standardized test collection (NTCIR arXiv dataset). Our FCD approach yields a precision of 68% for retrieving equivalent representations of frequent formulas and a recall of 72% for extracting the formula name from the surrounding text. FCD and FCR enable the citation of formulas within mathematical documents and facilitate semantic search and question answering, as well as document similarity assessments for plagiarism detection or recommender systems. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF
50. Empowering machine learning models with contextual knowledge for enhancing the detection of eating disorders in social media posts.
- Author
-
Benítez-Andrades, José Alberto, García-Ordás, María Teresa, Russo, Mayra, Sakor, Ahmad, Fernandes Rotger, Luis Daniel, and Vidal, Maria-Esther
- Subjects
MACHINE learning ,DEEP learning ,EATING disorders ,KNOWLEDGE graphs ,CONTEXTUAL learning ,SOCIAL media - Abstract
Social networks have become information dissemination channels, where announcements are posted frequently; they also serve as frameworks for debates in various areas (e.g., scientific, political, and social). In particular, in the health area, social networks represent a channel to communicate and disseminate novel treatments' success; they also allow ordinary people to express their concerns about a disease or disorder. The Artificial Intelligence (AI) community has developed analytical methods to uncover and predict patterns from posts that enable it to explain news about a particular topic, e.g., mental disorders expressed as eating disorders or depression. Albeit potentially rich while expressing an idea or concern, posts are presented as short texts, preventing, thus, AI models from accurately encoding these posts' contextual knowledge. We propose a hybrid approach where knowledge encoded in community-maintained knowledge graphs (e.g., Wikidata) is combined with deep learning to categorize social media posts using existing classification models. The proposed approach resorts to state-of-the-art named entity recognizers and linkers (e.g., Falcon 2.0) to extract entities in short posts and link them to concepts in knowledge graphs. Then, knowledge graph embeddings (KGEs) are utilized to compute latent representations of the extracted entities, which result in vector representations of the posts that encode these entities' contextual knowledge extracted from the knowledge graphs. These KGEs are combined with contextualized word embeddings (e.g., BERT) to generate a context-based representation of the posts that empower prediction models. We apply our proposed approach in the health domain to detect whether a publication is related to an eating disorder (e.g., anorexia or bulimia) and uncover concepts within the discourse that could help healthcare providers diagnose this type of mental disorder. 
We evaluate our approach on a dataset of 2,000 tweets about eating disorders. Our experimental results suggest that combining contextual knowledge encoded in word embeddings with the one built from knowledge graphs increases the reliability of the predictive models. The ambition is that the proposed method can support health domain experts in discovering patterns that may forecast a mental disorder, enhancing early detection and more precise diagnosis towards personalized medicine. [ABSTRACT FROM AUTHOR]
- Published
- 2023
- Full Text
- View/download PDF