Using Dynamic Pruned N-Gram Model for Identifying the Gender of the User.
- Source :
- Applied Sciences (2076-3417); Jul2022, Vol. 12 Issue 13, pN.PAG-N.PAG, 16p
- Publication Year :
- 2022
Abstract
- Organizations analyze customers' personal data to understand and model their behavior. A customer's gender is a significant factor in market analysis: it helps plan promotional campaigns, determine target customers, and provide relevant offers. Several techniques have been developed to identify a user's gender from different types of data, including text, images, speech, and biometrics. The way a profile name is composed differs from one customer to another. Numerical substitutions for specific letters, known as Leet language, impede the gender identification task, and acronyms, misspellings, and adjacent names pose additional challenges. To address these challenges, this work uses the profile names associated with submitted reviews to recognize customers' gender. First, we create datasets of profile names extracted from customers' reviews. Second, we introduce a dynamic pruned n-gram model for identifying the gender of the user. The model starts with data segmentation to handle adjacent parts, followed by data conversion and cleaning to undo the use of Leet language. The next step is feature selection through the dynamic pruned n-gram model, with recurrent misspelling correction using fuzzy matching. We evaluate the proposed approach on real data collected from active web resources. The obtained results demonstrate its validity and reliability. [ABSTRACT FROM AUTHOR]
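The pipeline named in the abstract (Leet conversion and cleaning, n-gram feature extraction, fuzzy matching against known names) might be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the substitution table `LEET_MAP`, the toy lookup `NAME_TABLE`, the helper names, the minimum n-gram length, and the 0.8 similarity cutoff are all assumptions, and "pruning" is reduced here to a minimum n-gram length plus stopping at the first match. The abstract's separate segmentation step is not modeled.

```python
import difflib
import re
from typing import Optional

# Hypothetical Leet substitutions; the abstract does not give the paper's mapping.
LEET_MAP = str.maketrans(
    {"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t", "@": "a", "$": "s"}
)

# Toy name-to-gender lookup, for illustration only (not data from the paper).
NAME_TABLE = {"sarah": "female", "james": "male"}


def normalize(profile_name: str) -> str:
    """Lowercase, undo the assumed Leet substitutions, and drop non-letters."""
    text = profile_name.lower().translate(LEET_MAP)
    return re.sub(r"[^a-z]", "", text)


def char_ngrams(text: str, n_min: int = 3) -> list:
    """Return character n-grams from the full string down to length n_min.

    Longer n-grams come first, so the caller can stop at the first match
    (a crude stand-in for pruning).
    """
    grams = []
    for n in range(len(text), n_min - 1, -1):
        grams.extend(text[i:i + n] for i in range(len(text) - n + 1))
    return grams


def guess_gender(profile_name: str, name_table: dict, cutoff: float = 0.8) -> Optional[str]:
    """Fuzzy-match n-grams of the cleaned profile name against known names."""
    known_names = list(name_table)
    for gram in char_ngrams(normalize(profile_name)):
        # difflib tolerates small misspellings via a similarity ratio.
        match = difflib.get_close_matches(gram, known_names, n=1, cutoff=cutoff)
        if match:
            return name_table[match[0]]
    return None  # no n-gram was close enough to any known name


print(guess_gender("s4r4h_k", NAME_TABLE))
print(guess_gender("j4m3s", NAME_TABLE))
```

A fixed substitution table is ambiguous in practice: "1" may stand for "i" or "l", and digits that belong to a birth year would also be converted, so a real system would need context to decide which digits to translate.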
Details
- Language :
- English
- ISSN :
- 20763417
- Volume :
- 12
- Issue :
- 13
- Database :
- Complementary Index
- Journal :
- Applied Sciences (2076-3417)
- Publication Type :
- Academic Journal
- Accession number :
- 157914741
- Full Text :
- https://doi.org/10.3390/app12136378