Postal address extraction from the web: a comprehensive survey.

Authors :
Kayed, Mohammed
Dakrory, Sara
Ali, A. A.
Source :
Artificial Intelligence Review; Feb 2022, Vol. 55 Issue 2, p1085-1120, 36p
Publication Year :
2022

Abstract

The Web is a source of information for Location-Based Service (LBS) applications. These applications lack postal addresses for the user's Points of Interest (POIs), such as schools, hospitals, and restaurants, because these locations are annotated manually, either from the yellow pages or by the location owners (users/companies). Our study in this paper confirms that Google Maps, a common LBS application, contains only about 32.5% of the public schools that are officially registered in the documents provided by the Directorate of Education in Egypt. However, the remaining missing school addresses could be extracted from the Web (e.g., social media). To the best of our knowledge, no prior survey has compared the previous Web postal address extraction approaches. Additionally, all proposed approaches for address extraction are local (they work in specific countries/locations with particular languages) and cannot be used or even adapted to work in other countries/locations with other languages. Furthermore, the problem of Web postal address extraction has not been addressed in many countries, such as Arab countries (e.g., Egypt). This paper discusses the issue of address extraction, and highlights and compares the techniques recently used to extract addresses from Web pages. In addition, it investigates the discrepancy of knowledge among existing systems. Moreover, it provides a comprehensive review of the geographical gazetteers used in Web postal address extraction approaches and compares their data quality dimensions. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
0269-2821
Volume :
55
Issue :
2
Database :
Complementary Index
Journal :
Artificial Intelligence Review
Publication Type :
Academic Journal
Accession number :
155185698
Full Text :
https://doi.org/10.1007/s10462-021-09983-1