SG-WSTD: A framework for scalable geographic web search topic discovery.

Authors :
Jiang, Di
Vosecky, Jan
Leung, Kenneth Wai-Ting
Yang, Lingxiao
Ng, Wilfred
Source :
Knowledge-Based Systems. Aug 2015, Vol. 84, p18-33. 16p.
Publication Year :
2015

Abstract

Search engine query logs are recognized as an important information source that contains millions of users’ web search needs. Discovering Geographic Web Search Topics (G-WSTs) from a query log can support a variety of downstream web applications such as finding commonality between locations and profiling search engine users. However, the task of discovering G-WSTs is nontrivial, not only because of the diversity of the information in web search but also due to the sheer size of the query log. In this paper, we propose a new framework, Scalable Geographic Web Search Topic Discovery (SG-WSTD), which provides highly scalable functionalities, such as search session derivation, geographic information extraction, and geographic web search topic discovery, for discovering G-WSTs from a query log. Within SG-WSTD, two probabilistic topic models are proposed to discover G-WSTs from two complementary perspectives. The first is the Discrete Search Topic Model (DSTM), which discovers G-WSTs that capture the commonalities between discrete locations. The second is the Regional Search Topic Model (RSTM), which focuses on a specific geographic region on the map and discovers G-WSTs that demonstrate geographic locality. Since query logs are typically voluminous, we implement the functionalities in SG-WSTD based on the MapReduce paradigm to solve the efficiency bottleneck. We evaluate SG-WSTD against several strong baselines on a real-life query log from AOL. In the experiments, the proposed framework demonstrates significantly improved data interpretability, better prediction performance, higher topic distinctiveness, and superior scalability. [ABSTRACT FROM AUTHOR]

Details

Language :
English
ISSN :
09507051
Volume :
84
Database :
Academic Search Index
Journal :
Knowledge-Based Systems
Publication Type :
Academic Journal
Accession number :
102658704
Full Text :
https://doi.org/10.1016/j.knosys.2015.03.020