Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/39834
Title: A study on de duplication
Researcher: Venkatesh kumar A
Guide(s): Kuppuswami S
Keywords: Information Gain
Multi Level Clustering
Upload Date: 28-Apr-2015
University: Anna University
Completed Date: 01/04/2014
Abstract: We present two algorithms for calculating the string dissimilarity percentage in a de-duplication system. Our algorithms use multiple levels of clustering, which incorporate constraints to reduce the volume of data, and Information Gain (IG) to calculate dissimilarity. In the proposed system, we first separate the records into block-sized subsets using a clustering algorithm and then apply IG to each subset. Most existing systems depend on generic or manually tuned distance metrics for estimating similarity. We ran extensive experiments on large data sets and compared our algorithms with various versions of existing algorithms; the new system reduces the time spent on string comparison and achieves higher average accuracy than the existing systems. None of the existing systems produces the dissimilarity percentage between a pair of strings in a given data set. Here we present an efficient solution for calculating the string dissimilarity percentage in a de-duplication system using Multi Level Clustering (MLC) and Information Gain. Our algorithms work in two phases: Multi Level Clustering construction and text dissimilarity calculation. Our methods reduce the time needed to find a duplicate record and use less memory than the existing method.
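The two phases named in the abstract (blocking by multi-level clustering, then a dissimilarity percentage computed inside each block) can be illustrated with a minimal sketch. Everything specific here is an assumption, since the abstract does not give the formulas: the two blocking levels (first character, then first character of the last token), the function names, and the entropy-based information gain over character counts are hypothetical stand-ins, not the thesis's method.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import log2


def entropy(counts):
    """Shannon entropy in bits of a Counter of symbol frequencies."""
    total = sum(counts.values())
    return -sum((c / total) * log2(c / total) for c in counts.values()) if total else 0.0


def dissimilarity_percent(a, b):
    """Information gain from splitting the pooled characters back into a and b,
    scaled to 0..100. Assumed stand-in measure, not the thesis's formula."""
    ca, cb = Counter(a), Counter(b)
    n = len(a) + len(b)
    if n == 0:
        return 0.0
    gain = entropy(ca + cb) - (len(a) / n) * entropy(ca) - (len(b) / n) * entropy(cb)
    # The gain is bounded by 1 bit for a two-way split; clamp float noise.
    return 100.0 * max(0.0, min(1.0, gain))


def multi_level_blocks(records):
    """Hypothetical two-level blocking key: first character of the record,
    then first character of its last token."""
    blocks = defaultdict(list)
    for record in records:
        key = record.strip().lower()
        tokens = key.split() or [""]
        blocks[(key[:1], tokens[-1][:1])].append(record)
    return blocks


def compare_within_blocks(records):
    """Compare only pairs that share a block, avoiding all-pairs comparison."""
    results = []
    for block in multi_level_blocks(records).values():
        for a, b in combinations(block, 2):
            results.append((a, b, dissimilarity_percent(a.lower(), b.lower())))
    return results
```

Calling compare_within_blocks(["John Smith", "Jon Smith", "Mary Jones"]) compares only the two "J... S..." records, because "Mary Jones" lands in a different block. Note that a character-count measure like this one ignores character order, so it is suitable only as an illustration of the blocking-then-scoring structure.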
Pagination: xvi, 160p.
URI: http://hdl.handle.net/10603/39834
Appears in Departments:Faculty of Science and Humanities

Files in This Item:
File                    Description     Size        Format
01_title.pdf            Attached File   28.82 kB    Adobe PDF
02_certificate.pdf                      182.56 kB   Adobe PDF
03_abstract.pdf                         18.98 kB    Adobe PDF
04_acknowledgement.pdf                  22.03 kB    Adobe PDF
05_content.pdf                          44.58 kB    Adobe PDF
06_chapter1.pdf                         179.69 kB   Adobe PDF
07_chapter2.pdf                         193.92 kB   Adobe PDF
08_chapter3.pdf                         32.66 kB    Adobe PDF
09_chapter4.pdf                         368.8 kB    Adobe PDF
10_chapter5.pdf                         359.07 kB   Adobe PDF
11_chapter6.pdf                         227.44 kB   Adobe PDF
12_chapter7.pdf                         27.32 kB    Adobe PDF
13_appendix.pdf                         852.74 kB   Adobe PDF
14_reference.pdf                        44.59 kB    Adobe PDF
15_publication.pdf                      22.88 kB    Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
