A Study of Progressive Techniques to Efficiently Duplicate

T. CHANDRASEKHAR, D. VENKATA SIVA REDDY

Abstract


Duplicate detection is the process of identifying multiple representations of the same real-world entities. Today, duplicate detection methods need to process ever larger datasets in ever shorter time, which makes maintaining the quality of a dataset increasingly difficult. We present two novel, progressive duplicate detection algorithms that significantly increase the efficiency of finding duplicates when execution time is limited: they maximize the gain of the overall process within the time available by reporting most results much earlier than traditional approaches. Comprehensive experiments show that our progressive algorithms can double the efficiency over time of traditional duplicate detection and significantly improve upon related work.
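The abstract does not name or describe the two algorithms, so the following is only a minimal sketch of the general progressive idea under stated assumptions: records are sorted by a key, and pairs that lie closest together in the sort order are compared first, so that likely duplicates surface early if the run is cut short. The function names (`progressive_pairs`, `find_duplicates`, `name_similarity`), the comparison `budget` used as a stand-in for limited execution time, and the sample records are all hypothetical and not taken from the paper.

```python
from difflib import SequenceMatcher


def progressive_pairs(records, key, max_distance):
    """Yield candidate index pairs, nearest sort-order neighbors first."""
    # Sort record indices by the (assumed) sorting key.
    order = sorted(range(len(records)), key=lambda i: key(records[i]))
    # Distance 1 pairs direct neighbors, distance 2 the next-nearest, and so on.
    for distance in range(1, max_distance + 1):
        for pos in range(len(order) - distance):
            yield order[pos], order[pos + distance]


def find_duplicates(records, key, similarity, threshold, max_distance, budget):
    """Return the duplicate pairs found within `budget` comparisons."""
    found = []
    pairs = progressive_pairs(records, key, max_distance)
    for comparisons, (i, j) in enumerate(pairs):
        if comparisons >= budget:
            break  # the comparison budget models the limited execution time
        if similarity(records[i], records[j]) >= threshold:
            found.append((min(i, j), max(i, j)))
    return found


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (an illustrative choice)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Hypothetical sample data with two near-duplicate pairs.
records = ["Jon Smith", "Maria Garcia", "John Smith", "Maria Garcla", "Peter Chen"]
early = find_duplicates(
    records, str.lower, name_similarity, threshold=0.85, max_distance=2, budget=4
)
```

With a budget of only four comparisons, this sketch spends them all on direct neighbors in the sort order, which is where near-identical names tend to land. A non-progressive scan over all pairs in input order would give no such guarantee about which pairs are checked before time runs out.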




Copyright (c) 2017 Edupedia Publications Pvt Ltd

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Publisher

EduPedia Publications Pvt Ltd, D/351, Prem Nanar-3, Suleman Nagar, Kirari, Nagloi, New Delhi 110086, India. Phone: +919958037887 or +919557022047

All published Articles are Open Access at https://edupediapublications.org/journals/



Pen2Print and IJR are registered trademarks of Edupedia Publications Pvt Ltd.