1. SVM based extraction of spatial relations in text
- Author
- Chaoli Du, Xueying Zhang, Chunju Zhang, and Shaonan Zhu
- Subjects
- Geospatial analysis, Computer science, Feature vector, Feature extraction, Spatial intelligence, Spatial query, Information extraction, Spatial relation, Artificial intelligence, Spatial analysis, Natural language processing
- Abstract
- Natural language text reflects people's internal representation of space. It has been estimated that 80% of unstructured text contains location expressions such as place names and spatial relations. In the past few years, text has become an important geospatial data resource alongside surveys, maps, satellite images and GPS. Most previous research has focused on recognizing place names in text and integrating them with map services. Spatial relations play an important role in spatial data modelling, spatial query, spatial analysis, spatial reasoning and map generalization. In text, spatial relations are described in natural language through qualitative spatial expressions, including place names, spatial terms, prepositions and verbs, which combine in certain syntactic patterns to represent their semantic functions. An instance of a spatial relation in text can be formalized as (C1, P1, C2, P2, C3), where P1 and P2 are place names and C1, C2 and C3 are the surrounding context. Support Vector Machine (SVM) is a pattern recognition method widely used in information extraction from text. This paper investigates the extraction of spatial relations with an SVM model that recognizes spatial expressions and classifies them simultaneously. For the SVM model, a set of features is specified, such as lexical tokens, spatial terms, syntactic structures and the geographical feature types of place names, and a multi-label classifier is presented to solve the multi-class classification problem. Finally, an experimental evaluation is carried out on an annotated Chinese corpus. The study shows that spatial terms are important indicators for identifying spatial relations in text, but that their classification is seriously ambiguous. Integrating more context information could therefore improve the extraction of spatial relations in text.
- Published
- 2011
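
The abstract's formalization of a spatial relation instance as (C1, P1, C2, P2, C3) can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the two place names have already been recognized, uses an invented English sentence (the paper's corpus is Chinese), and the names `RelationCandidate`, `split_candidate`, `context_features` and the small spatial-term lexicon are hypothetical. The resulting features are the kind of input an SVM classifier might consume; no classifier is trained here.

```python
from typing import Dict, List, NamedTuple, Optional, Set


class RelationCandidate(NamedTuple):
    """One candidate instance (C1, P1, C2, P2, C3)."""
    c1: str  # context before the first place name
    p1: str  # first place name
    c2: str  # context between the two place names
    p2: str  # second place name
    c3: str  # context after the second place name


def split_candidate(sentence: str, p1: str, p2: str) -> Optional[RelationCandidate]:
    """Split a sentence around two known place names, with P1 preceding P2.

    Returns None if either name is missing or P2 does not follow P1.
    """
    i = sentence.find(p1)
    if i < 0:
        return None
    j = sentence.find(p2, i + len(p1))
    if j < 0:
        return None
    return RelationCandidate(
        c1=sentence[:i].strip(),
        p1=p1,
        c2=sentence[i + len(p1):j].strip(),
        p2=p2,
        c3=sentence[j + len(p2):].strip(),
    )


def context_features(cand: RelationCandidate, spatial_terms: Set[str]) -> Dict[str, List[str]]:
    """Collect lexical tokens and spatial terms from the middle context C2."""
    tokens = cand.c2.lower().split()
    return {
        "c2_tokens": tokens,
        "c2_spatial_terms": [t for t in tokens if t in spatial_terms],
    }


# Hypothetical lexicon and invented example sentence, for illustration only.
SPATIAL_TERMS = {"north", "northwest", "south", "east", "west", "near"}
candidate = split_candidate(
    "The city of Nanjing lies northwest of Shanghai on the Yangtze.",
    "Nanjing",
    "Shanghai",
)
features = context_features(candidate, SPATIAL_TERMS)
```

Here the middle context C2 carries the spatial term, which matches the abstract's observation that spatial terms are important indicators. A fuller feature set would also draw on C1, C3, syntactic structure and the feature types of the place names.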