Correction orthographique de requêtes: L’apport des distances de Levenshtein et Stoilos [Spelling correction of queries: the contribution of the Levenshtein and Stoilos distances]

Authors :
Élise Prieur-Gaston
Stéfan Jacques Darmoni
Zied Moalla
Lina Fatima Soualmia
Source :
Informatique et Santé ISBN: 9782817802848
Publication Year :
2011
Publisher :
Springer Paris, 2011.

Abstract

Background: Medical text repositories not only hold a significant amount of data but also provide an interesting scientific test bed for applying natural language processing to information retrieval. To improve the retrieval performance of the Catalogue and Index of Health Resources in French (CISMeF) and its search tool Doc’CISMeF, we tested a new method for correcting misspellings in user queries. Methods: In addition to exact phonetic term matching, we tested two approximate string comparators: the string distance metric of Stoilos and the Levenshtein edit distance. We also evaluated the combination of the two algorithms to examine whether it improves the correction of misspelled queries. Results: At a comparator score threshold of 0.2, the normalized Levenshtein algorithm achieved the highest recall (76%), whereas the highest precision (94%) was achieved by combining the Levenshtein and Stoilos distances. Conclusion: Despite the well-known good performance of the normalized Levenshtein edit distance, we have demonstrated in this paper that combining it with the Stoilos algorithm improves the results for misspelling correction.
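
To make the Levenshtein comparator from the Methods concrete, here is a minimal sketch. It is not the authors' implementation. It rests on three assumptions the abstract does not confirm: the edit distance is normalized by the length of the longer string, the 0.2 threshold is a maximum allowed distance, and candidates come from a small in-memory vocabulary. The function names and sample terms are hypothetical. The Stoilos metric and the combined comparator are left out because the abstract does not define them.

```python
from typing import Iterable, List


def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions, substitutions cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def normalized_levenshtein(a: str, b: str) -> float:
    """Distance scaled to [0, 1]. ASSUMPTION: divide by the longer length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 0.0
    return levenshtein(a, b) / longest


def suggest(query: str, vocabulary: Iterable[str],
            threshold: float = 0.2) -> List[str]:
    """Return vocabulary terms within the threshold, closest first.

    ASSUMPTION: the threshold is a maximum normalized distance.
    """
    scored = [(normalized_levenshtein(query, term), term) for term in vocabulary]
    return [term for score, term in sorted(scored) if score <= threshold]


if __name__ == "__main__":
    # Hypothetical vocabulary of French medical terms.
    print(suggest("astme", ["asthme", "anemie", "arthrose"]))
```

Under these assumptions, the misspelled query "astme" is one insertion away from "asthme" (normalized distance 1/6), so only that term passes the 0.2 threshold.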

Details

ISBN :
978-2-8178-0284-8
Database :
OpenAIRE
Journal :
Informatique et Santé ISBN: 9782817802848
Accession number :
edsair.doi...........52b34f9a564afa2e0a13fd99f6aae5af
Full Text :
https://doi.org/10.1007/978-2-8178-0285-5_1