Full-Privacy Secured Search Engine Empowered by Efficient Genome-Mapping Algorithms.

Authors :
Chang YY
Wong ST
Salawu EO
Liao MH
Hung JH
Yang LW
Source :
IEEE journal of biomedical and health informatics [IEEE J Biomed Health Inform] 2023 Oct; Vol. 27 (10), pp. 5155-5164. Date of Electronic Publication: 2023 Oct 05.
Publication Year :
2023

Abstract

Since the 1990s, keyword-based search engines have been the only option for locating relevant web content, using a simple query of one to a few keywords. These engines, whether free or paid, retain users' search queries and preferences, often to deliver targeted ads. In addition, articles that users upload for plagiarism detection can be stored as part of service providers' expanding databases for profit. In effect, users cannot search without exposing their queries to these providers. We present a new solution: a method for searching the internet using a full article as a query without disclosing its content. Our Sapiens Aperio Veritas Engine (S.A.V.E.) uses an encoding scheme and an FM-index search, both borrowed from next-generation human genome sequencing. On the user's device, each word in the query is transformed into one of 12 "amino acids" to create a pseudo-biological sequence (PBS). For a plagiarism check, users submit their locally created PBSs to our cloud service, which detects identical content in our database; the database includes all English and Chinese Wikipedia articles and Open Access journals up to April 2021. PBSs longer than 12 "amino acids" give accurate results with less than 0.8% false positives. In terms of performance, S.A.V.E. runs at a genome-mapping speed similar to that of Bowtie and is more than 5 orders of magnitude faster than BLAST. With both standard and private modes, S.A.V.E. offers a privacy-first search and plagiarism-check system. We believe this sets a precedent for future search engines that prioritize user confidentiality. S.A.V.E. can be accessed at https://dyn.life.nthu.edu.tw/SAVE/.
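
The two steps named in the abstract, local word-to-symbol encoding and FM-index matching, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the encoding scheme, so the 12-letter `ALPHABET`, the SHA-256-based `encode_word`, and the toy `FMIndex` class (built with a naive suffix sort) are all assumptions made for illustration.

```python
import hashlib

# Hypothetical 12-symbol alphabet; the symbols actually used by S.A.V.E.
# are not given in the abstract.
ALPHABET = "ACDEFGHIKLMN"


def encode_word(word: str) -> str:
    """Map a word to one of 12 symbols with a stable hash (an assumption)."""
    digest = hashlib.sha256(word.lower().encode("utf-8")).digest()
    return ALPHABET[int.from_bytes(digest[:8], "big") % len(ALPHABET)]


def encode_text(text: str) -> str:
    """Turn whitespace-separated words into a pseudo-biological sequence."""
    return "".join(encode_word(w) for w in text.split())


class FMIndex:
    """Toy FM-index supporting exact-match counting via backward search."""

    def __init__(self, text: str):
        text += "$"  # sentinel; sorts before every uppercase letter
        # Naive suffix array: fine for a demo, too slow for real corpora.
        sa = sorted(range(len(text)), key=lambda i: text[i:])
        # Burrows-Wheeler transform: the character preceding each suffix.
        self.bwt = "".join(text[i - 1] for i in sa)
        symbols = sorted(set(text))
        # C[c] = number of characters in the text strictly smaller than c.
        self.C = {}
        total = 0
        for c in symbols:
            self.C[c] = total
            total += text.count(c)
        # occ[c][i] = occurrences of c in bwt[:i].
        self.occ = {c: [0] * (len(text) + 1) for c in symbols}
        for i, ch in enumerate(self.bwt):
            for c in symbols:
                self.occ[c][i + 1] = self.occ[c][i] + (ch == c)

    def count(self, pattern: str) -> int:
        """Return how many times pattern occurs in the indexed text."""
        lo, hi = 0, len(self.bwt)
        for c in reversed(pattern):
            if c not in self.C:
                return 0
            lo = self.C[c] + self.occ[c][lo]
            hi = self.C[c] + self.occ[c][hi]
            if lo >= hi:
                return 0
        return hi - lo


if __name__ == "__main__":
    # "Server" side: index the encoded form of a document.
    document = "the quick brown fox jumps over the lazy dog"
    index = FMIndex(encode_text(document))
    # "Client" side: only the encoded query leaves the device.
    query = encode_text("quick brown fox jumps")
    print(query, index.count(query))
```

Because the mapping is many-to-one, different word sequences can share the same encoded sequence, so short queries match by chance more often; this is consistent with the abstract's restriction to PBSs longer than 12 symbols, although the sketch makes no attempt to reproduce the reported false-positive rate.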

Details

Language :
English
ISSN :
2168-2208
Volume :
27
Issue :
10
Database :
MEDLINE
Journal :
IEEE journal of biomedical and health informatics
Publication Type :
Academic Journal
Accession number :
37527302
Full Text :
https://doi.org/10.1109/JBHI.2023.3300885