25 results for "Elena Simperl"
Search Results
2. What we talk about when we talk about wikidata quality
- Author
-
Alessandro Piscopo and Elena Simperl
- Subjects
Quality dimensions, Trustworthiness, Computer science, Collaborative knowledge, Data quality, Literature survey, Data science - Abstract
Launched in 2012, Wikidata has already become a success story. It is a collaborative knowledge graph whose large community has so far produced data about more than 55 million entities. Understanding the quality of the data in Wikidata is key to its widespread adoption and future development. No study has so far investigated to what extent and which aspects of this topic have been addressed. To fill this gap, we surveyed prior literature about data quality in Wikidata. Our analysis covers 28 papers, which we categorise by the quality dimensions they address. We show that a number of quality dimensions have not yet been adequately covered, e.g. accuracy and trustworthiness. Future work should focus on these.
- Published
- 2019
3. Hybrid Human Machine workflows for mobility management
- Author
-
Richard Gomer, Fausto Giunchiglia, Donglei Song, Luis-Daniel Ibáñez, Elena Simperl, Mattia Zeni, and Eddy Maddalena
- Subjects
Computer science, Crowdsourcing, Data science, Field (computer science), Workflow, Sustainability, Scalability, Reinforcement learning, Human–machine system, Mobility management - Abstract
Sustainable mobility is one of the main goals of both European and United Nations plans for 2030. The concept of Smart Cities has arisen as a way to achieve this goal by leveraging interconnected IoT devices to collect and analyse large quantities of data. However, several works have pointed out the importance of including the human factor, in particular citizens, to make sense of the collected data and ensure their engagement along the data value chain. This paper presents the design and implementation of two end-to-end hybrid human-machine workflows for solving two mobility problems: modal split estimation and mapping mobility infrastructure. For modal split, we combine the use of i-Log, an app to collect data and interact with citizens, with reinforcement learning classifiers to continuously improve the accuracy of the classification, aiming to reduce the required interactions with citizens. For mobility infrastructure, we developed a system that uses remote crowdworkers to explore the city looking for Points of Interest, which is more scalable than sending agents into the field. Crowdsourced maps are then fused with existing maps (if available) to create a final map that is validated in the field by citizens engaged through the i-Log app.
- Published
- 2019
4. Analysis of Editors' Languages in Wikidata
- Author
-
Lucie-Aimée Kaffee and Elena Simperl
- Subjects
World Wide Web, Knowledge base, Computer science, Variety (linguistics) - Abstract
Wikidata is unique both as a knowledge base and as a community, given that its users contribute together to a single cross-lingual project. To create a truly multilingual knowledge base, a variety of contributor languages is needed. In this paper, we investigate the language distribution among Wikidata's editors and how it relates to Wikidata's content and the users' label editing. This gives us an insight into the community that can help support users working on multilingual projects.
- Published
- 2018
5. The Trials and Tribulations of Working with Structured Data
- Author
-
Laura Koesten, Emilia Kacprzak, Jenifer F. A. Tennison, and Elena Simperl
- Subjects
Information seeking, Computer science, Context (language use), Data science, Domain (software engineering), Task (project management), World Wide Web, Identification (information), Resource (project management), Data model, Set (psychology) - Abstract
Structured data such as databases, spreadsheets and web tables is becoming critical in every domain and professional role. Yet we still do not know much about how people interact with it. Our research focuses on the information seeking behaviour of people looking for new sources of structured data online, including the task context in which the data will be used, data search, and the identification of relevant datasets from a set of possible candidates. We present a mixed-methods study covering in-depth interviews with 20 participants with various professional backgrounds, supported by the analysis of search logs of a large data portal. Based on this study, we propose a framework for human structured-data interaction and discuss challenges people encounter when trying to find and assess data that helps their daily work. We provide design recommendations for data publishers and developers of online data platforms such as data catalogs and marketplaces. These recommendations highlight important questions for HCI research to improve how people engage and make use of this incredibly useful online resource.
- Published
- 2017
6. HARE
- Author
-
Elena Simperl, Maribel Acosta, Maria-Esther Vidal, and Fabian Flöck
- Subjects
Information retrieval, Computer science, RDF Schema, Missing data, Crowdsourcing, Hybrid system, SPARQL, RDF, Completeness (statistics), RDF query language - Abstract
Due to the semi-structured nature of RDF data, missing values affect the answer completeness of queries posed against RDF. To overcome this limitation, we present HARE, a novel hybrid query processing engine that brings together machine and human computation to execute SPARQL queries. We propose a model that exploits the characteristics of RDF in order to estimate the completeness of portions of a data set. The completeness model, complemented by crowd knowledge, is used by the HARE query engine to decide on the fly which parts of a query should be executed against the data set or via crowd computing. To evaluate HARE, we created and executed a collection of 50 SPARQL queries against the DBpedia data set. Experimental results clearly show that our solution accurately enhances answer completeness.
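The completeness-based routing idea can be sketched in a few lines. This is an illustrative toy, not the authors' implementation: the dataset, the threshold, and the scoring function are all assumptions made for the example.

```python
# Toy RDF-style dataset: (subject, predicate, object) triples.
TRIPLES = [
    ("dbr:Berlin", "dbo:country", "dbr:Germany"),
    ("dbr:Berlin", "dbo:populationTotal", "3644826"),
    ("dbr:Hamburg", "dbo:country", "dbr:Germany"),
    # Hamburg's population is missing from the dataset.
]

def completeness(predicate, subjects):
    """Fraction of the given subjects with at least one value for
    `predicate` -- a crude stand-in for HARE's completeness model."""
    covered = sum(
        any(s == subj and p == predicate for s, p, _ in TRIPLES)
        for subj in subjects
    )
    return covered / len(subjects)

def route(predicate, subjects, threshold=0.9):
    """Decide whether a lookup runs against the data set or is
    delegated to crowd workers."""
    return "machine" if completeness(predicate, subjects) >= threshold else "crowd"

cities = ["dbr:Berlin", "dbr:Hamburg"]
print(route("dbo:country", cities))          # fully covered -> "machine"
print(route("dbo:populationTotal", cities))  # gap detected  -> "crowd"
```

The real engine makes this decision per triple pattern inside a SPARQL query; the sketch only shows the routing criterion.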
- Published
- 2015
7. Thematically Analysing Social Network Content During Disasters Through the Lens of the Disaster Management Lifecycle
- Author
-
Mark J. Weal, Peter M. Atkinson, Sophie Parsons, and Elena Simperl
- Subjects
Situation awareness, Social network, Emergency management, Event (computing), Computer science, Timeline, Context (language use), Data science, World Wide Web, Credibility, Thematic analysis, Natural disaster - Abstract
Social networks such as Twitter are often used for disseminating and collecting information during natural disasters, and their potential for use in disaster management has been acknowledged. However, a more nuanced understanding of the communications that take place on social networks is required to integrate this information more effectively into disaster management processes. The type and value of information shared should be assessed, determining the benefits and issues, with credibility and reliability as known concerns. Mapping tweets onto the modelled stages of a disaster can be a useful way to evaluate the benefits and drawbacks of using data from social networks, such as Twitter, in disaster management. A thematic analysis of tweets' content, language and tone during the UK storms and floods of 2013/14 was conducted. Manual scripting was used to determine the official sequence of events and to classify the stages of the disaster into the phases of the Disaster Management Lifecycle, producing a timeline. Twenty-five topics discussed on Twitter emerged, and three key types of tweets, based on language and tone, were identified. The timeline represents the events of the disaster, according to the Met Office reports, classified into B. Faulkner's Disaster Management Lifecycle framework. Context is provided when observing the analysed tweets against the timeline, which illustrates a potential basis and benefit for mapping tweets into the Disaster Management Lifecycle phases. Comparing the number of tweets submitted each month with the timeline suggests that users tweet more as an event heightens and persists. Furthermore, users generally express greater emotion and urgency in their tweets. This paper concludes that the thematic analysis of content on social networks, such as Twitter, can be useful in gaining additional perspectives for disaster management. It demonstrates that mapping tweets into the phases of a Disaster Management Lifecycle model can have benefits in the recovery phase, not just the response phase, to potentially improve future policies and activities.
- Published
- 2015
8. Session details: SOCM 2015
- Author
-
Matthew S. Weber, Nigel Shadbolt, Elena Simperl, Thanassis Tiropanis, Wendy Hall, and David De Roure
- Subjects
Multimedia, Computer science, Session (computer science) - Published
- 2015
9. Quick-and-clean extraction of linked data entities from microblogs
- Author
-
Nigel Shadbolt, Markus Luczak-Roesch, Oluwaseyi Feyisetan, Elena Simperl, and Ramine Tinati
- Subjects
Information retrieval, Process (engineering), Microblogging, Computer science, Semantic analysis (machine learning), Core ontology, Sample (statistics), Linked data, Metadata, Set (abstract data type), Social media, Data mining - Abstract
In this paper, we address the problem of finding named entities in very large micropost datasets. We propose methods to generate a sample of representative microposts by discovering tweets that are likely to refer to new entities. Our approach is able to significantly speed up the semantic analysis process by discarding retweets, tweets without pre-identifiable entities, as well as similar and redundant tweets, while retaining information content. We apply the approach to a corpus of 1.4 billion microposts, using the IE services of AlchemyAPI, Calais, and Zemanta to identify more than 700,000 unique entities. For the evaluation we compare runtime and the number of entities extracted based on the full and the downscaled versions of a micropost set. We are able to demonstrate that for datasets of more than 10 million tweets we can achieve a reduction in size of more than 80% while maintaining up to 60% coverage of the unique entities cumulatively discovered by the three IE tools. We publish the resulting Twitter metadata as Linked Data using SIOC and an extension of the NERD core ontology.
- Published
- 2014
10. Motivations of citizen scientists
- Author
-
Nigel Shadbolt, Elena Simperl, Ramine Tinati, and Markus Luczak-Roesch
- Subjects
Knowledge management, Citizen science, Sociology, Public relations - Published
- 2014
11. Session details: Theory and practice of social machines 2014 workshop
- Author
-
Nigel Shadbolt, Noshir Contractor, James A. Hendler, and Elena Simperl
- Subjects
Multimedia, Computer science, Social machines, Session (computer science) - Published
- 2014
12. Revisiting reverts
- Author
-
Fabian Flöck, Denny Vrandecic, and Elena Simperl
- Subjects
MD5, Information retrieval, Computer science, User modeling, Hash function, Collaboration, State (computer science), Data mining, Word (computer architecture) - Abstract
Wikipedia is commonly used as a proving ground for research in collaborative systems. This is likely due to its popularity and scale, but also to the fact that large amounts of data about its formation and evolution are freely available to inform and validate theories and models of online collaboration. As part of the development of such approaches, revert detection is often performed as an important pre-processing step in tasks as diverse as the extraction of implicit networks of editors, the analysis of edit or editor features, and the removal of noise when analyzing the emergence of the content of an article. The current state of the art in revert detection is based on a rather naive approach, which identifies revision duplicates based on MD5 hash values. This is an efficient but not very precise technique that forms the basis for the majority of research based on revert relations in Wikipedia. In this paper we prove that this method has a number of important drawbacks: it only detects a limited number of reverts, while simultaneously misclassifying too many edits as reverts, and it does not distinguish between complete and partial reverts. This is very likely to hamper the accurate interpretation of the findings of revert-related research. We introduce an improved algorithm for the detection of reverts, based on word tokens added or deleted, to address these drawbacks. We report on the results of a user study and other tests demonstrating the considerable gains in accuracy and coverage of our method, and argue for a positive trade-off, in certain research scenarios, between these improvements and our algorithm's increased runtime.
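The contrast between the two detection ideas can be sketched as follows. This is a simplified illustration with made-up revision texts, not the paper's actual algorithm: hash identity catches only exact restorations of an earlier revision, while a word-token view exposes what each edit added or removed, including partial reverts.

```python
import hashlib
from collections import Counter

revisions = [
    "the quick brown fox",        # r0
    "the quick brown fox jumps",  # r1: adds a word
    "the quick brown fox",        # r2: exact revert back to r0
    "the quick fox jumps",        # r3: partial overlap, not an identity revert
]

def md5_reverts(revs):
    """Classic identity-revert detection: a revision counts as a revert
    iff its MD5 hash matches that of an earlier revision."""
    seen, reverts = {}, []
    for i, text in enumerate(revs):
        h = hashlib.md5(text.encode()).hexdigest()
        if h in seen:
            reverts.append((i, seen[h]))  # (reverting rev, restored rev)
        else:
            seen[h] = i
    return reverts

def token_change(old, new):
    """Word-token view of an edit: multisets of tokens added and removed,
    the unit of analysis in the token-based detection method."""
    added = Counter(new.split()) - Counter(old.split())
    removed = Counter(old.split()) - Counter(new.split())
    return added, removed

print(md5_reverts(revisions))  # [(2, 0)]: r2 restores r0 exactly
print(token_change(revisions[1], revisions[2]))  # r2 removes the token "jumps"
```

Note that r3 partially undoes earlier edits but produces no hash match, which is exactly the kind of case the hash-based method misses.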
- Published
- 2012
13. Crowdsourcing semantic data management
- Author
-
Elena Simperl
- Subjects
Information retrieval, Named graph, Computer science, RDF Schema, SPARQL, Linked data, RDF, RDF/XML, Blank node, RDF query language - Abstract
Linked Data refers to a set of guidelines and best practices for publishing and accessing structured data on the Web. It builds upon established Web technologies, in particular HTTP and URIs, extended with Semantic Web representation formats and protocols such as RDF, RDFS, OWL and SPARQL, by which data from different sources can be shared, interconnected and used beyond the application scenarios for which it was originally created. RDF is a central building block of the Linked Data technology stack. It is a graph-based data model based on the idea of making statements about (information and non-information) resources on the Web in terms of triples of the form subject predicate object. The object of any RDF triple may be used in the subject position in other triples, leading to a directed, labeled graph typically referred to as an 'RDF graph'. Both nodes and edges in such graphs are identified via URIs; nodes represent Web resources, while edges stand for attributes of such resources or properties connecting them. Schema information can be expressed using languages such as RDFS and OWL, by which resources can be typed as classes described in terms of domain-specific attributes, properties and constraints. RDF graphs can be natively queried using the query language SPARQL. A SPARQL query is composed of graph patterns and can be stored as RDF triples together with any RDF domain model using SPIN to facilitate the definition of constraints and inference rules in ontologies.
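The triple and graph-pattern model described above can be illustrated in a few lines of Python; the data and the "?variable" pattern convention mimic SPARQL's basic graph patterns, but the graph and names are invented for the example.

```python
# Statements as (subject, predicate, object) triples, forming a
# directed, labeled graph as described above.
GRAPH = [
    ("ex:Alice", "ex:knows", "ex:Bob"),
    ("ex:Bob",   "ex:knows", "ex:Carol"),
    ("ex:Alice", "rdf:type", "ex:Person"),
]

def match(pattern, graph=GRAPH):
    """Match one SPARQL-style triple pattern: positions starting with
    '?' are variables and get bound; other positions must be equal."""
    results = []
    for triple in graph:
        binding = {}
        for pat, val in zip(pattern, triple):
            if pat.startswith("?"):
                binding[pat] = val
            elif pat != val:
                break
        else:
            results.append(binding)
    return results

# Analogue of: SELECT ?who WHERE { ex:Alice ex:knows ?who }
print(match(("ex:Alice", "ex:knows", "?who")))  # [{'?who': 'ex:Bob'}]
```

A real SPARQL engine evaluates conjunctions of such patterns and joins the bindings; the sketch shows only the single-pattern case.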
- Published
- 2012
14. Comparison of wiki-based process modeling systems
- Author
-
Frank Dengler, Denny Vrandecic, and Elena Simperl
- Subjects
Business process discovery, Semantic wiki, Process elicitation, Process modeling, Knowledge management, Web 2.0, Computer science, Artifact-centric business process model, Business process modeling, Software engineering - Abstract
As traditional process elicitation methods are expensive and time-consuming, a trend toward collaborative, user-centric, online business process modeling can be observed. A common proposal in this area is the use of a semantic wiki-based, light-weight knowledge capturing tool for collaborative process development. Although different frameworks have been proposed, no one has compared these systems against existing requirements for the collaborative maturing of processes. To address this issue we provide a comparison framework on the basis of these requirements, which we used to compare existing approaches.
- Published
- 2011
15. Deriving human-readable labels from SPARQL queries
- Author
-
Basil Ell, Denny Vrandecic, and Elena Simperl
- Subjects
Structure (mathematical logic), Set (abstract data type), Variable (computer science), Information retrieval, Named graph, Computer science, Interface (Java), SPARQL, Linked data, Semantic Web - Abstract
Over 80% of entities on the Semantic Web lack a human-readable label. This hampers the ability of any tool that uses linked data to offer a meaningful interface to human users. We argue that methods for deriving human-readable labels are essential in order to enable the usage of the Web of Data. In this paper we explore, implement, and evaluate a method for deriving human-readable labels based on the variable names used in a large corpus of SPARQL queries that we built from a set of log files. We analyze the structure of the SPARQL graph patterns and offer a classification scheme for graph patterns. Based on this classification, we identify graph patterns that allow us to derive useful labels. We also provide an overview of the current usage of SPARQL in the newly built corpus.
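The core idea of turning SPARQL variable names into labels can be sketched in a few lines; this is a simplified illustration, not the paper's full method, which additionally classifies the graph patterns in which the variables occur.

```python
import re

def label_from_variable(var):
    """Turn a SPARQL variable name such as ?birthPlace or ?birth_place
    into a human-readable label (simplified illustration)."""
    name = var.lstrip("?")
    name = name.replace("_", " ")
    # Split camelCase: insert a space before each interior capital.
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    return name.lower()

print(label_from_variable("?birthPlace"))   # "birth place"
print(label_from_variable("?dateOfBirth"))  # "date of birth"
```

In the paper's setting such a label would be attached to the entity the variable binds to, provided the surrounding graph pattern indicates the variable name is descriptive of that entity.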
- Published
- 2011
16. Wikiing pro
- Author
-
Elena Simperl, Denny Vrandecic, and Frank Dengler
- Subjects
Focus (computing), Process modeling, Web 2.0, Computer science, Process (engineering), Social software, Representation (arts), Reuse, World Wide Web, Artificial intelligence, Natural language, Natural language processing - Abstract
Recently, a trend toward collaborative, user-centric, online process modeling can be observed. Unfortunately, current social software approaches mostly focus on the graphical development of processes and do not consider existing textual process descriptions such as HowTos or guidelines. We address this issue by combining graphical process modeling techniques with a wiki-based light-weight knowledge capturing approach and a background semantic knowledge base. Our approach enables the collaborative maturing of process descriptions with a graphical representation, formal semantic annotations, and natural language. By translating existing textual process descriptions into graphical descriptions and formal semantic annotations, we provide a holistic approach for collaborative process development that is designed to foster knowledge reuse and maturing within the system.
- Published
- 2011
17. Towards a diversity-minded Wikipedia
- Author
-
Fabian Flöck, Denny Vrandecic, and Elena Simperl
- Subjects
Sentiment analysis, Social dynamics, Sustainability, Encyclopedia, Semantic technology, Quality (business), Sociology, Content management, Diversity (politics) - Abstract
Wikipedia is a top-ten Web site providing a free encyclopedia created by an open community of volunteer contributors. As investigated in various studies over the past years, contributors have different backgrounds, mindsets and biases; however, the effects, positive and negative, of this diversity on the quality of the Wikipedia content, and on the sustainability of the overall project, are so far only partially understood. In this paper we discuss these effects through an analysis of the existing scholarly literature in the area and identify directions for future research and development. We also present an approach for diversity-minded content management within Wikipedia that combines techniques from semantic technologies, data and text mining, and quantitative social dynamics analysis. The approach aims to create greater awareness of diversity-related issues within the Wikipedia community, give readers access to indicators and metrics to understand biases and their impact on the quality of Wikipedia articles, and support editors in achieving balanced versions of these articles that leverage the wealth of knowledge and perspectives inherent to large-scale collaboration.
- Published
- 2011
18. PlayIT 2011
- Author
-
Katharina Siorpaes, Elena Simperl, Arpita Ghosh, and Michael Fink
- Subjects
Competition (economics), Intervention (law), Knowledge management, Incentive, Order (exchange), Computer science, Context (language use), Knowledge acquisition - Abstract
Many problems in knowledge acquisition, such as image labeling, still rely on extensive human input and intervention. In order to attract people to invest the necessary time in such tasks, rewarding incentives and motivation mechanisms have been employed. While recruiting "human cycles" for such tasks is difficult, online games manage to attract plenty of attention, due to the fact that they provide inherent incentives such as fun and competition. This workshop focuses on games that embed various knowledge acquisition tasks into the context of online games, with the end goal of attracting sufficient manual labor.
- Published
- 2011
19. DiversiWeb 2011
- Author
-
Elena Simperl, Devika P. Madalli, Denny Vrandečić, and Enrique Alfonseca
- Published
- 2011
20. SpotTheLink
- Author
-
Stefan Thaler, Katharina Siorpaes, and Elena Simperl
- Subjects
World Wide Web, Human intelligence, Computer science, Process ontology, Ontology, Upper ontology, Web service, Ontology (information science), OntoGame, Ontology engineering, Ontology alignment - Abstract
A large share of tasks in semantic-content authoring crucially relies on human intelligence [4]. This holds for many aspects of ontology engineering, but also for ontology-based annotation, be that of data-oriented resources, such as images, audio and video content, or of functionality, such as Web services and APIs. In previous work we have extensively discussed the importance of motivators and incentive mechanisms to encourage a critical mass of Internet users, in particular users beyond the boundaries of the semantic-technologies community, to contribute to such inherently human-driven tasks. Through OntoGame we have provided a framework for casual games which capitalizes on fun and competition as two key motivators for people to willingly invest their valuable time and effort in semantic-technologies-related tasks, whose technical details hide behind an entertaining collaborative game experience [2]. This paper presents the newest release of the OntoGame series, called SpotTheLink, which addresses this challenge in the area of ontology alignment.
- Published
- 2011
21. Semantic web service engineering for semantic business process management
- Author
-
Barry Norton, Mick Kerrigan, Dieter Fensel, and Elena Simperl
- Subjects
World Wide Web, Semantic grid, Computer science, Semantic computing, Semantic analytics, Semantic technology, Semantic Web Stack, Semantic Web, Social Semantic Web, Data Web - Abstract
The Semantic Business Process Management (SBPM) approach from the SUPER project utilizes a Semantic Execution Environment (SEE) for the automatic discovery, composition, mediation, and invocation of Web services. In order to enable the Semantic Execution Environment, an engineer must create semantic descriptions of the functional, non-functional, and behavioural aspects of Web services and of end-user requirements. In this paper we take a first step into the emerging field of Semantic Web Service engineering by identifying a number of application scenarios within which Semantic Web Services can be used, and by identifying the different engineering activities that the engineer must perform in order to enable these scenarios. Information was elicited by means of a survey directly from those who are actively developing Semantic Web Services; thus the scenarios are built on direct input from a cross-section of the Semantic Web Service community. The results in this paper act as a starting point for the Semantic Web Service engineering methodology that we are currently developing.
- Published
- 2009
22. Towards Wikis as semantic hypermedia
- Author
-
Robert Tolksdorf and Elena Simperl
- Subjects
Computer science, Hypermedia, Field (computer science), Social Semantic Web, World Wide Web, Web navigation, Semantic Web Stack, Architecture, Personal wiki, Semantic Web - Abstract
Similarly to the Web, wikis have advanced from initially simple ad-hoc solutions to highly popular systems in widespread use. This evolution is reflected by the impressive number of wiki engines available and by the numerous settings and disciplines in which they have found application over the last decade. In conjunction with these rapid advances, the question of the fundamental principles underlying the design and architecture of wiki technologies becomes inevitable for their systematic further development and long-lasting success at the public, private and corporate levels. This paper aims to be part of this endeavor: building upon the natural relationship between wikis and hypermedia, we examine to what extent the current state of the art in the field (complemented by results achieved in adjacent communities such as the World Wide Web and the Semantic Web) fulfills the requirements of modern hypermedia systems. As a conclusion of the study we outline further directions of research and development which are expected to contribute to the realization of this vision.
- Published
- 2006
23. Ranking Knowledge Graphs By Capturing Knowledge about Languages and Labels
- Author
-
Kemele M. Endris, Lucie-Aimée Kaffee, Elena Simperl, and Maria-Esther Vidal
- Subjects
Information retrieval, Degree (graph theory), Knowledge graph, Computer science, Benchmark (computing), Question answering, Granularity, Representation (mathematics), Ranking (information retrieval) - Abstract
Capturing knowledge about the multilinguality of a knowledge graph is of supreme importance to understand its applicability across multiple languages. Several metrics have been proposed for describing multilinguality at the level of a whole knowledge graph. Albeit enabling an understanding of the ecosystem of knowledge graphs in terms of the languages utilized, they are unable to capture a fine-grained description of the languages in which the different entities and properties of a knowledge graph are represented. This lack of representation prevents the comparison of existing knowledge graphs in order to decide which are the most appropriate for a multilingual application. In this work, we approach the problem of ranking knowledge graphs based on their language features and propose LINGVO, a framework able to capture multilinguality at different levels of granularity. Grounded in knowledge graph descriptions, LINGVO is, additionally, able to solve the problem of ranking knowledge graphs according to the degree of multilinguality of the represented entities. We have empirically studied the effectiveness of LINGVO on a benchmark of queries executed against existing knowledge graphs. The observed results provide evidence that LINGVO captures the multilinguality of the studied knowledge graphs similarly to a crowd-sourced gold standard.
24. Companion of The Web Conference 2022, Virtual Event / Lyon, France, April 25 - 29, 2022
- Author
-
Frédérique Laforest, Raphaël Troncy, Elena Simperl, Deepak Agarwal, Aristides Gionis, Ivan Herman, and Lionel Médini
- Published
- 2022
25. WWW '22: The ACM Web Conference 2022, Virtual Event, Lyon, France, April 25 - 29, 2022
- Author
-
Frédérique Laforest, Raphaël Troncy, Elena Simperl, Deepak Agarwal, Aristides Gionis, Ivan Herman, and Lionel Médini
- Published
- 2022