An Incremental Knowledge Acquisition Method for Improving Duplicate Invoices Detection

Authors :
Julien J. P. Vayssiere
Paul Compton
Lucio Menzel
Boualem Benatallah
Hartmut Vogler
Van Hai Ho
Source :
ICDE
Publication Year :
2009
Publisher :
IEEE, 2009.

Abstract

Duplicate records are a major problem and duplicate invoices are a specific example of this. The detection of duplicate invoices is a critical issue for business since duplicate invoices can result in a company paying more than once for goods or services ordered. Past experience has shown that generic duplicate record detection techniques are not very useful when applied to invoices: the rate of false positives can be so high that invoice clerks are discouraged from using the system. This is because such approaches do not take the business context into account, e.g. what types of goods were ordered as well as the past relationship with that vendor. In this paper, we discuss applying Ripple Down Rules (RDR), an approach for incremental and end-user-centred knowledge acquisition, to the problem of classifying pairs of potential duplicate invoices. We describe how we built a prototype on top of the SAP ERP product and evaluated it on a real data set that had been previously independently audited for duplicates. The preliminary results have highlighted the significant potential of this approach for assisting invoicing clerks in processing potential duplicate invoices. We have observed a drop in the rate of false positives from 92% down to 18.66% when compared to traditional approaches that do not take the business context into account. We suggest that incremental development of domain specific knowledge may have more general application to the problem of handling duplicate records.
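
The abstract names Ripple Down Rules but does not describe the authors' rule base, so the following is only a minimal sketch of the general single-classification RDR idea: a rule that fires can later be corrected by attaching an exception rule beneath it, without editing the original rule. The feature names (same_vendor, same_amount, recurring_service), the labels, and the helper add_exception are all hypothetical and are not taken from the paper or from SAP ERP. The attachment logic is also simplified compared with a full RDR system, which would attach the new rule at the end of the path taken by the misclassified case.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

# Hypothetical representation: boolean features describing a pair of invoices.
Pair = Dict[str, bool]


@dataclass
class Rule:
    condition: Callable[[Pair], bool]
    conclusion: str
    except_rule: Optional["Rule"] = None  # tried when this rule fires
    else_rule: Optional["Rule"] = None    # tried when this rule does not fire


def classify(rule: Optional[Rule], pair: Pair, default: str) -> str:
    """Walk the rule tree; the last rule that fired gives the conclusion."""
    result = default
    while rule is not None:
        if rule.condition(pair):
            result = rule.conclusion
            rule = rule.except_rule
        else:
            rule = rule.else_rule
    return result


def add_exception(parent: Rule, new_rule: Rule) -> None:
    """Simplified correction step: append new_rule to parent's exception branch."""
    if parent.except_rule is None:
        parent.except_rule = new_rule
        return
    node = parent.except_rule
    while node.else_rule is not None:
        node = node.else_rule
    node.else_rule = new_rule


# Illustrative rule base (assumed, not from the paper): a generic matching rule.
root = Rule(lambda p: p["same_vendor"] and p["same_amount"], "duplicate")

rent = {"same_vendor": True, "same_amount": True, "recurring_service": True}
goods = {"same_vendor": True, "same_amount": True, "recurring_service": False}

before = classify(root, rent, "not duplicate")

# A clerk flags the rent pair as a false positive and adds business context.
add_exception(root, Rule(lambda p: p["recurring_service"], "not duplicate"))

after = classify(root, rent, "not duplicate")
unchanged = classify(root, goods, "not duplicate")
```

In this sketch the recurring-service pair is first labelled "duplicate" by the generic rule and "not duplicate" once the exception is added, while the goods pair keeps its original label. This illustrates how end-user corrections of this kind could reduce false positives incrementally.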

Details

ISSN :
1084-4627
Database :
OpenAIRE
Journal :
2009 IEEE 25th International Conference on Data Engineering
Accession number :
edsair.doi...........13c2f512e1e245d83f7692f2f9dd24f3
Full Text :
https://doi.org/10.1109/icde.2009.38