1. Advanced Methods for Web Information Mining
- Author
- Lasić-Lazić, Jadranka; Seljan, Sanja; Stančić, Hrvoje; Lasić-Lazić, Jadranka; and Tkalec, Slavko
- Subjects
document retrieval, relevance, classification, multimedia documents
- Abstract
The Web currently holds a huge amount of data but almost no classification information. The key problem is how to embed knowledge into information mining algorithms. The authors analyse information retrieval techniques and identify their strengths and weaknesses. Although most Web documents are text-oriented, many contain multimedia elements that are not easily accessible through common search methods. Web information is dynamic, semi-structured, and interwoven with hyperlinks. Several advanced methods for Web information mining are analysed: 1) syntax analysis, 2) metadata-based searching using RDF, 3) knowledge annotation using conceptual graphs (CGs), 4) KPS (Keyword, Pattern, Sample) search techniques, and 5) techniques for obtaining descriptions by fuzzification and back-propagation. The problem of choosing proper keywords is also stressed. The authors suggest using already accepted standards for classification hierarchies, such as the Dewey Decimal Classification (DDC).
- Published
- 2002