1. Synthetic Browsing Histories for 50 Countries Worldwide: Datasets for Research, Development, and Education.
- Author
- Komosny, Dan, Rehman, Saeed Ur, and Ayub, Muhammad Sohaib
- Subjects
- INTERNET content, PERSONALLY identifiable information, ANONYMITY, INTERNET security, POPULARITY
- Abstract
Browsing histories can be a valuable resource for cybersecurity, research, and testing. Individuals are often reluctant to share their browsing histories online, and the use of personal data requires obtaining signed informed consent. Research shows that anonymized histories can lead to re-identification, nullifying the anonymity promised by informed consent. In this work, we present 500 synthetic browsing histories valid for 50 countries worldwide. The synthetic histories are compiled based on real browsing data using a series of transformation criteria, including website content, popularity, locality, and language, ensuring their validity for the respective countries. Each history maintains the order of webpage accesses and covers a one-month period. The motivation for publishing this dataset arises from the community's call for browsing histories from different countries for research, development, and education. The published synthetic browsing histories can be used for any purpose without legal restrictions. [ABSTRACT FROM AUTHOR]
- Published
- 2025