A quantitative measure of the information leaked from queries to search engines and a scheme to reduce it
- Source :
- The Journal of Supercomputing. 73:2494-2505
- Publication Year :
- 2016
- Publisher :
- Springer Science and Business Media LLC, 2016.
- Abstract :
- In recent years, the opportunity to use search engines has increased due to the greater variety and number of Internet-capable devices. Search engines have become indispensable for many users, who provide vast amounts of information as input. However, there has been recent recognition of the risk entailed by search engine providers storing and analyzing information related to user privacy. Existing research evaluates protection of user privacy from search engines that try to extract related information from users' query strings (Jones et al. I know what you did last summer: query logs and user privacy. In: Proceedings of the sixteenth ACM conference on conference on information and knowledge management, pp. 909-914, 2007). The searchable encryption technique is effective when searching encrypted queries and is useful when users search their own data stored in external cloud storage. However, search engine providers make a profit based on the query strings of their many users, so they are not expected to adopt this approach. Private information retrieval (PIR) is an established technique which ensures that no information is leaked to the search engine. However, PIR is based on a model with strict limitations on retrieval and is impractical. In this paper, we define a measure to quantify the amount of information that is leaked during a search. The measure is defined based on the entropy of query strings. We propose a practical search scheme that reduces the amount of leaked information. The proposed scheme is simple and can be implemented using a typical personal computer. We evaluate the system by experiment and confirm that the proposed scheme works as intended and is of acceptable usability.
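The abstract states only that the leakage measure is based on the entropy of query strings and does not give the formula. As a rough illustration of what an entropy over query strings can look like, the sketch below computes the ordinary Shannon entropy, in bits, of the empirical distribution of strings in a query log. The function name `shannon_entropy` and the sample log are hypothetical, and this should not be read as the paper's actual measure.

```python
import math
from collections import Counter


def shannon_entropy(queries):
    """Shannon entropy (bits) of the empirical distribution of query strings.

    Assumption: each distinct string is one outcome, weighted by how often
    it appears in the log. The paper's own definition may differ.
    """
    counts = Counter(queries)
    total = sum(counts.values())
    # H = sum over outcomes of p * log2(1 / p), with p = count / total
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# Hypothetical query log, for illustration only
log = ["weather", "weather", "news", "flu symptoms"]
print(shannon_entropy(log))  # → 1.5
```

Under this generic definition, a log consisting of one repeated query has entropy 0, and a log of n equally frequent distinct queries has entropy log2(n).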
- Subjects :
- 020203 distributed computing
Information retrieval
Concept search
Query string
Computer science
business.industry
Search analytics
02 engineering and technology
Encryption
Theoretical Computer Science
Search-oriented architecture
Search engine
Query expansion
Hardware and Architecture
0202 electrical engineering, electronic engineering, information engineering
business
Metasearch engine
Private information retrieval
Software
Information Systems
Details
- ISSN :
- 1573-0484 and 0920-8542
- Volume :
- 73
- Database :
- OpenAIRE
- Journal :
- The Journal of Supercomputing
- Accession number :
- edsair.doi...........1a8d1a564321360e11c4815d516f0c7b