Composite leading search index: a preprocessing method of internet search data for stock trends prediction.

Authors :
Liu, Ying
Chen, Yibing
Wu, Sheng
Peng, Geng
Lv, Benfu
Source :
Annals of Operations Research. Nov 2015, Vol. 234 Issue 1, p77-94. 18p.
Publication Year :
2015

Abstract

Previous studies have revealed that Internet search data is a new source of data that can be used to predict the stock market. In this new, data-driven research field, choosing a method for preprocessing data is crucial to achieving accurate prediction performance. This paper proposes a preprocessing method for Internet search data, the composite leading search index (CLSI), which consists of three steps: (a) keyword selection, (b) time difference measurement, and (c) leading index composition. We demonstrate the validity of CLSI by comparing its results with those from the search volume index (SVI), which is the most commonly used in the previous literature. We build a time series model (TS) with error correction and a support vector regression model (SVR) for stock trend prediction, and combine them with the two preprocessing methods into four models for comparison: SVI-TS, CLSI-TS, SVI-SVR, and CLSI-SVR. We test these four models on the Chinese stock market, which is attracting a growing number of investors, and analyze results on nine datasets: stable, peak, and trough periods of the Shanghai Composite Index, the Shenzhen Composite Index, and the Hushen 300 Index. The results show that, with either TS or SVR as the forecasting model, CLSI performs better than SVI on the majority of the test datasets and performs almost the same as SVI on the remaining ones. This suggests, to some extent, that CLSI is a more efficient preprocessing method of Internet search data for stock trend prediction. [ABSTRACT FROM AUTHOR]
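
The three CLSI steps named in the abstract can be illustrated with a rough sketch. The abstract does not give the paper's formulas, so everything below is an assumption for illustration only: the lead of each keyword is estimated with a lagged Pearson correlation, keywords are kept when they lead the target by at least one period and pass a hypothetical `min_corr` threshold, and the index is a correlation-weighted average of the shifted, standardized series. The function names `pearson`, `best_lead`, and `compose_index` are hypothetical and are not taken from the paper.

```python
from statistics import mean, pstdev


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    sx, sy = pstdev(x), pstdev(y)
    if sx == 0 or sy == 0:
        return 0.0
    mx, my = mean(x), mean(y)
    cov = mean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return cov / (sx * sy)


def best_lead(search, target, max_lag):
    """Assumed time difference measurement: find the lag (in periods) at which
    search[t - lag] correlates most strongly with target[t]."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        shifted = search[:len(search) - lag] if lag else search
        r = pearson(shifted, target[lag:])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


def compose_index(series_by_keyword, target, max_lag=5, min_corr=0.3):
    """Assumed keyword selection and index composition: keep keywords that lead
    the target, then average their shifted z-scores weighted by correlation."""
    leads = {}
    for keyword, series in series_by_keyword.items():
        lag, r = best_lead(series, target, max_lag)
        if lag > 0 and abs(r) >= min_corr:  # keep only leading keywords
            leads[keyword] = (lag, r)

    index = []
    for t in range(max_lag, len(target)):
        num = den = 0.0
        for keyword, (lag, r) in leads.items():
            series = series_by_keyword[keyword]
            z = (series[t - lag] - mean(series)) / pstdev(series)
            num += r * z  # the sign of r flips negatively correlated keywords
            den += abs(r)
        index.append(num / den if den else 0.0)
    return leads, index


# Toy data (made up): the search series runs two periods ahead of the target.
raw = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3, 2, 3, 8, 4]
target = raw[:18]
searches = {"kw_leading": raw[2:], "kw_flat": [5] * 18}
leads, clsi = compose_index(searches, target)
```

In this toy run only `kw_leading` is retained, with an estimated lead of two periods. The resulting `clsi` series would then be fed to a forecasting model such as TS or SVR in place of the raw search volume.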

Details

Language :
English
ISSN :
0254-5330
Volume :
234
Issue :
1
Database :
Academic Search Index
Journal :
Annals of Operations Research
Publication Type :
Academic Journal
Accession number :
110203524
Full Text :
https://doi.org/10.1007/s10479-014-1779-z