1. Refinement of an Automatic Method for Indexing Medical Literature – a Preliminary Study.
- Author
- Joubert, Michel; Peretti, Anne-Laure; Gouvernet, Joanny; Fieschi, Marius
- Abstract
Objectives: To rank, according to their significance, MeSH terms automatically extracted from Internet sites within the framework of a French project, VUMeF, a contribution to the NLM's UMLS project. Material and methods: Scores are assigned to the keywords of a given document on the basis of the UMLS Semantic Network and the frequencies of co-occurring major terms in the Medline literature. If N is the number of major terms of a document, and n is the number of major terms retrieved among the first N terms ranked in descending order of score, the measure of the method's achievement is n/N. Results: A set of 1444 randomly selected documents was extracted from Medline. For each document we computed the major terms retrieved among the first N terms with two methods: a statistical method using only co-occurrence frequencies, and our method, which additionally uses the UMLS Semantic Network. In 34% of cases, corresponding to documents indexed by about 16 keywords, about 3 of them major terms, our method yields better precision (7%) than the statistical method. Discussion: The rough calculation of the proportion of retrieved major terms should be enhanced by the use of a probability law that allows the list of selected terms to be enlarged, taking into account both the number of major terms and the total number of keywords used to index each document. [ABSTRACT FROM AUTHOR]
- Published
- 2005
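
The evaluation measure defined in the abstract (n/N, the share of a document's N major terms found among its N highest-scoring keywords) can be illustrated with a minimal Python sketch. The function name, the tie-breaking rule, and the sample scores are assumptions made for illustration only; they are not taken from the paper, which does not describe how scores are computed at this level of detail.

```python
def major_term_ratio(scores, major_terms):
    """Return n/N for one document.

    scores      -- dict mapping each keyword to its score (hypothetical values)
    major_terms -- set of keywords flagged as major terms for the document
    """
    N = len(major_terms)
    if N == 0:
        raise ValueError("document has no major terms; n/N is undefined")
    # Rank keywords by descending score; ties are broken alphabetically
    # (an assumption, the abstract does not say how ties are handled).
    ranked = sorted(scores, key=lambda term: (-scores[term], term))
    # n = number of major terms retrieved among the first N ranked terms.
    n = sum(1 for term in ranked[:N] if term in major_terms)
    return n / N


# Hypothetical document: five keywords, three of them major terms.
example_scores = {
    "Asthma": 0.9,
    "Medical Informatics": 0.7,
    "Abstracting and Indexing": 0.6,
    "MEDLINE": 0.5,
    "Humans": 0.2,
}
example_major = {"Asthma", "MEDLINE", "Abstracting and Indexing"}

ratio = major_term_ratio(example_scores, example_major)
print(f"n/N = {ratio:.2f}")
```

In this made-up example two of the three major terms fall within the top three ranked keywords, so n/N is 2/3. Because the cut-off is fixed at N, the ratio is at once the precision and the recall of the truncated list, which may be why the abstract calls it a rough calculation.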