1. Renaissance of Fuzzy and Fast Matching Entity with DSHS Algorithm
- Author
Kari, Venkatram and Amalanathan, Geetha Mary
- Abstract
Entity matching is a crucial aspect of data management systems, requiring the identification of real-world entities from diverse expressions. Although humans readily recognize equivalent entities, machines struggle because the same entity can be expressed in many different ways. This challenge underscores the need for a robust framework capable of matching entities based on their attributes. Previous studies have explored transformation techniques, NLP, and deep learning, but they often fall short in accuracy or scalability, or overlook internal constraints. Such shortcomings result in costly rework and compromised business decisions. To address these issues, this work proposes the “Deep Semantic Homophonic Synonymy” (DSHS) algorithm, a hybrid approach integrating rule-based, NLP, and deep learning techniques. By employing semantic matching, rule-based preprocessing, NLP enrichment, phonetic analysis, handling of abbreviation variations, and spelling correction, the DSHS algorithm effectively overcomes entity matching challenges. The solution comprises six modules (Preprocessing, Transformation, Translation, Word Embedding, Attribute Similarity, and Classification) and outperforms existing entity matching (EM) models in terms of accuracy and scalability. It is a comprehensive entity matching solution, offering significant advancements in the field and providing a reliable framework for resolving complex matching requirements.
- Published
- 2024