Paraphrase type identification for plagiarism detection using contexts and word embeddings
- Source :
- International Journal of Educational Technology in Higher Education, Vol 18, Iss 1, Pp 1-25 (2021)
- Publication Year :
- 2021
- Publisher :
- Springer Science and Business Media LLC, 2021.
- Abstract :
- Paraphrase types have been proposed by researchers as the paraphrasing mechanisms underlying acts of plagiarism. Synonymous substitution, word reordering and insertion/deletion have been identified as some of the common paraphrasing strategies used by plagiarists. However, the similarity reports generated by most plagiarism detection systems only provide a similarity score and show matching sections of text with their possible sources. In this research we propose methods to identify two important paraphrase types, synonymous substitution and word reordering, in paraphrased, plagiarised sentence pairs. We propose a three-stage approach that uses context matching and pretrained word embeddings to identify synonymous substitution and word reordering. Our results indicate that the combination of the Smith-Waterman Algorithm for Plagiarism Detection and ConceptNet Numberbatch pretrained word embeddings produces the best performance in terms of $F_1$ scores. This research can be used to complement the similarity reports generated by currently available plagiarism detection systems by incorporating methods to identify paraphrase types.
- Subjects :
- Computer science
Context (language use)
Information technology
02 engineering and technology
computer.software_genre
Plagiarism
Paraphrase
Education
Word reordering
Similarity (network science)
Context matching
0202 electrical engineering, electronic engineering, information engineering
Plagiarism detection
LC8-6691
business.industry
05 social sciences
050301 education
T58.5-58.64
Special aspects of education
Computer Science Applications
Paraphrase types
020201 artificial intelligence & image processing
Artificial intelligence
Synonymous substitution
Complement (linguistics)
business
0503 education
computer
Word (computer architecture)
Sentence
Natural language processing
Word order
Details
- ISSN :
- 23659440
- Volume :
- 18
- Database :
- OpenAIRE
- Journal :
- International Journal of Educational Technology in Higher Education
- Accession number :
- edsair.doi.dedup.....1ad568a983f7d4cf78cdba1123217fc4
- Full Text :
- https://doi.org/10.1186/s41239-021-00277-8
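
The abstract describes combining context matching with word embeddings to flag synonymous substitution and word reordering in a sentence pair. The Python sketch below is only an illustration of that general idea and is not the authors' three-stage approach: it does not implement the Smith-Waterman alignment, and the function names, the 0.7 similarity threshold, and the two-dimensional toy vectors are all assumptions made here. In practice the vectors would come from a pretrained model such as ConceptNet Numberbatch.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def substitution_candidates(source, suspicious, vectors, threshold=0.7):
    """Return (source_word, suspicious_word) pairs that differ but share the
    same left and right neighbour and have similar embeddings.

    Hypothetical helper: the threshold is an illustrative value, not one
    reported in the paper.
    """
    pairs = []
    for i in range(1, len(source) - 1):
        for j in range(1, len(suspicious) - 1):
            same_context = (source[i - 1] == suspicious[j - 1]
                            and source[i + 1] == suspicious[j + 1])
            if source[i] != suspicious[j] and same_context:
                u = vectors.get(source[i])
                v = vectors.get(suspicious[j])
                if u is not None and v is not None and cosine(u, v) >= threshold:
                    pairs.append((source[i], suspicious[j]))
    return pairs


def has_word_reordering(source, suspicious):
    """True if the words shared by both sentences appear in a different order."""
    in_source = [w for w in source if w in set(suspicious)]
    in_suspicious = [w for w in suspicious if w in set(source)]
    return in_source != in_suspicious


# Toy embeddings (assumed values, for illustration only).
toy_vectors = {"quick": [0.9, 0.1], "fast": [0.8, 0.2]}

src = "the quick fox jumped over the dog".split()
sus = "the fast fox jumped over the dog".split()
print(substitution_candidates(src, sus, toy_vectors))
print(has_word_reordering(src, sus))

src2 = "she ate the apple quickly".split()
sus2 = "quickly she ate the apple".split()
print(has_word_reordering(src2, sus2))
```

Requiring both neighbours to match exactly keeps the sketch short but misses substitutions at sentence boundaries or next to other edits; a sequence alignment step, as named in the abstract, would be the more robust way to pair up words.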