Automated Solution for Normalization of Duplicate Records from Multiple Data Sources
Abstract