Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/39834
Full metadata record
DC Field | Value | Language
dc.coverage.spatial | A study on de duplication | en_US
dc.date.accessioned | 2015-04-28T07:13:45Z | -
dc.date.available | 2015-04-28T07:13:45Z | -
dc.date.issued | 2015-04-28 | -
dc.identifier.uri | http://hdl.handle.net/10603/39834 | -
dc.description.abstract | We present two algorithms for calculating the string dissimilarity percentage in a de-duplication system. Our algorithms use multiple levels of clustering to incorporate constraints for reducing the volume of data, and Information Gain (IG) for calculating dissimilarity. In our proposed system, we first separate the records into block-sized subsets using a clustering algorithm and then apply IG to the subset values. Most existing systems depend on generic or manually tuned distance metrics for estimating similarity. We ran extensive experiments on large data sets, compared the results with various versions of existing algorithms, and report that the new system reduces the time taken for string comparison and achieves higher average accuracy than the existing systems. None of the existing systems produces the dissimilarity percentage between a pair of strings in a given data set. Here we present an efficient solution for calculating the string dissimilarity percentage in a de-duplication system using Multi Level Clustering (MLC) and Information Gain. Our algorithms work in two phases: Multi Level Clustering construction and text dissimilarity calculation. Our methods reduce the time taken to find a duplicate record and use less memory than the existing method. | en_US
dc.format.extent | xvi, 160p. | en_US
dc.language | English | en_US
dc.relation | p152-159. | en_US
dc.rights | university | en_US
dc.title | A study on de duplication | en_US
dc.title.alternative |  | en_US
dc.creator.researcher | Venkatesh kumar A | en_US
dc.subject.keyword | Information Gain | en_US
dc.subject.keyword | Multi Level Clustering | en_US
dc.description.note | appendix p137-151, reference p152-159. | en_US
dc.contributor.guide | Kuppuswami S | en_US
dc.publisher.place | Chennai | en_US
dc.publisher.university | Anna University | en_US
dc.publisher.institution | Faculty of Science and Humanities | en_US
dc.date.registered | n.d. | en_US
dc.date.completed | 01/04/2014 | en_US
dc.date.awarded | 30/04/2014 | en_US
dc.format.dimensions | 23cm. | en_US
dc.format.accompanyingmaterial | None | en_US
dc.source.university | University | en_US
dc.type.degree | Ph.D. | en_US
Appears in Departments: Faculty of Science and Humanities

Files in This Item:
File | Description | Size | Format
01_title.pdf | Attached File | 28.82 kB | Adobe PDF
02_certificate.pdf |  | 182.56 kB | Adobe PDF
03_abstract.pdf |  | 18.98 kB | Adobe PDF
04_acknowledgement.pdf |  | 22.03 kB | Adobe PDF
05_content.pdf |  | 44.58 kB | Adobe PDF
06_chapter1.pdf |  | 179.69 kB | Adobe PDF
07_chapter2.pdf |  | 193.92 kB | Adobe PDF
08_chapter3.pdf |  | 32.66 kB | Adobe PDF
09_chapter4.pdf |  | 368.8 kB | Adobe PDF
10_chapter5.pdf |  | 359.07 kB | Adobe PDF
11_chapter6.pdf |  | 227.44 kB | Adobe PDF
12_chapter7.pdf |  | 27.32 kB | Adobe PDF
13_appendix.pdf |  | 852.74 kB | Adobe PDF
14_reference.pdf |  | 44.59 kB | Adobe PDF
15_publication.pdf |  | 22.88 kB | Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
