Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/39834
Title: | A study on de duplication |
Researcher: | Venkatesh kumar A |
Guide(s): | Kuppuswami S |
Keywords: | Information Gain, Multi Level Clustering |
Upload Date: | 28-Apr-2015 |
University: | Anna University |
Completed Date: | 01/04/2014 |
Abstract: | We present two algorithms for calculating the string dissimilarity percentage in a de-duplication system. The algorithms are multi-level clustering, which incorporates constraints to reduce the volume of data, and Information Gain (IG), which is used to calculate dissimilarity. The proposed system first separates the records into block-sized subsets using a clustering algorithm and then applies the subset values to IG. Most existing systems depend on generic or manually tuned distance metrics for estimating similarity. We ran extensive experiments on large data sets and compared the results with various versions of existing algorithms; the new system reduces the time spent on string comparison and achieves higher average accuracy than the existing systems. None of the existing systems produces a dissimilarity percentage between pairs of strings in a given data set. We present an efficient solution for calculating the string dissimilarity percentage in a de-duplication system using Multi Level Clustering (MLC) and Information Gain. Our algorithms work in two phases: Multi Level Clustering construction and text dissimilarity calculation. Our methods reduce the time needed to find a duplicate record and use less memory than the existing method. |
Pagination: | xvi, 160p. |
URI: | http://hdl.handle.net/10603/39834 |
Appears in Departments: | Faculty of Science and Humanities |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
01_title.pdf | Attached File | 28.82 kB | Adobe PDF | View/Open |
02_certificate.pdf | | 182.56 kB | Adobe PDF | View/Open |
03_abstract.pdf | | 18.98 kB | Adobe PDF | View/Open |
04_acknowledgement.pdf | | 22.03 kB | Adobe PDF | View/Open |
05_content.pdf | | 44.58 kB | Adobe PDF | View/Open |
06_chapter1.pdf | | 179.69 kB | Adobe PDF | View/Open |
07_chapter2.pdf | | 193.92 kB | Adobe PDF | View/Open |
08_chapter3.pdf | | 32.66 kB | Adobe PDF | View/Open |
09_chapter4.pdf | | 368.8 kB | Adobe PDF | View/Open |
10_chapter5.pdf | | 359.07 kB | Adobe PDF | View/Open |
11_chapter6.pdf | | 227.44 kB | Adobe PDF | View/Open |
12_chapter7.pdf | | 27.32 kB | Adobe PDF | View/Open |
13_appendix.pdf | | 852.74 kB | Adobe PDF | View/Open |
14_reference.pdf | | 44.59 kB | Adobe PDF | View/Open |
15_publication.pdf | | 22.88 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).