PeCoQ: A Dataset for Persian Complex Question Answering over Knowledge Graph
- Publication Year :
- 2021
- Publisher :
- arXiv, 2021.
Abstract
- Question answering systems may find the answers to users' questions in either unstructured texts or structured data such as knowledge graphs. Answering questions with supervised learning approaches, including deep learning models, requires large training datasets. In recent years, several datasets have been presented for the task of question answering over knowledge graphs, which is the focus of this paper. Although many datasets have been proposed in English, there are only a few question-answering datasets in Persian. This paper introduces PeCoQ, a dataset for Persian question answering. The dataset contains 10,000 complex questions and answers extracted from the Persian knowledge graph FarsBase. For each question, the SPARQL query and two paraphrases written by linguists are provided as well. The dataset covers different types of complexity, such as multi-relation, multi-entity, ordinal, and temporal constraints. In this paper, we discuss the dataset's characteristics and describe our methodology for building it.
- Comment: 5 pages, 4 figures
- Subjects :
- FOS: Computer and information sciences
Computer Science - Artificial Intelligence
Computer science
Complex question
02 engineering and technology
computer.software_genre
01 natural sciences
Task (project management)
020204 information systems
0202 electrical engineering, electronic engineering, information engineering
Question answering
SPARQL
Persian
Computer Science - Computation and Language
business.industry
Deep learning
010401 analytical chemistry
Supervised learning
computer.file_format
language.human_language
0104 chemical sciences
Focus (linguistics)
Artificial Intelligence (cs.AI)
language
Artificial intelligence
business
computer
Computation and Language (cs.CL)
Natural language processing
Details
- Database :
- OpenAIRE
- Accession number :
- edsair.doi.dedup.....6617cd5dd09ebb5344b42fcdd4d58717
- Full Text :
- https://doi.org/10.48550/arxiv.2106.14167