Measuring Privacy Disclosures in URL Query Strings

Authors :
Adam J. Aviv
Andrew G. West
Source :
IEEE Internet Computing. 18:52-59
Publication Year :
2014
Publisher :
Institute of Electrical and Electronics Engineers (IEEE), 2014.

Abstract

Publicly posted URLs sometimes contain a wealth of information about the identities and activities of the users who share them. URLs often utilize query strings -- that is, key-value pairs appended to the URL path -- to pass session parameters and form data. Although often benign and necessary to render the Web page, query strings sometimes contain tracking mechanisms, usernames, email addresses, and other information that users might not wish to publicly reveal. In isolation, this isn't particularly problematic, but the growth of Web 2.0 platforms such as social networks and microblogging means URLs, which are often copied and pasted from Web browsers, are increasingly publicly broadcast. To study URL sharing's privacy ramifications, the authors ran a measurement study that looked at 892 million user-submitted URLs, many disseminated in semipublic forums. That corpus contained a trove of personal information, including 1.7 million email addresses. In the most egregious examples, query strings contain plaintext usernames and passwords for administrative and sensitive accounts. The authors identify data leakage via both key-driven and value-driven analysis using manual inspections and automatic detection logic. Additionally, they analyze the click-through rates of sensitive URLs, examine geographical and mobile behavior patterns, and measure the broader statistical properties of key-value pairs. Finally, they propose a CleanURL service that can "scrub" URLs of privacy-violating content.
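
As a rough illustration of the key-driven and value-driven detection the abstract mentions, the following Python sketch flags query-string pairs either by key name or by the shape of the value, and rebuilds a URL without them. It is not the authors' detection logic or their CleanURL service: the key list, the email pattern, and the function names are all assumptions made for this example.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical key list; the abstract does not say which keys the study used.
SENSITIVE_KEYS = {"user", "username", "email", "password", "passwd", "pwd"}

# Deliberately simple email pattern, for illustration only.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_sensitive(key, value):
    """Flag a pair by key name (key-driven) or by value shape (value-driven)."""
    return key.lower() in SENSITIVE_KEYS or bool(EMAIL_RE.match(value))


def find_sensitive_pairs(url):
    """Return the (key, value) pairs in the query string that look sensitive."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    return [(k, v) for k, v in pairs if is_sensitive(k, v)]


def scrub_url(url):
    """Rebuild the URL with the flagged pairs removed."""
    parts = urlsplit(url)
    pairs = parse_qsl(parts.query, keep_blank_values=True)
    kept = [(k, v) for k, v in pairs if not is_sensitive(k, v)]
    return urlunsplit(parts._replace(query=urlencode(kept)))
```

For a URL such as http://example.com/view?id=42&email=alice%40example.com&user=alice, find_sensitive_pairs reports the email and user pairs, and scrub_url keeps only id=42. A real scrubber would also have to decide which parameters a page needs in order to render, which this sketch does not attempt.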

Details

ISSN :
1941-0131 and 1089-7801
Volume :
18
Database :
OpenAIRE
Journal :
IEEE Internet Computing
Accession number :
edsair.doi...........addc1ea5133eb11cb1f1ced9394b4c41