Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/334801
Title: A secure hash based deduplication system in hadoop architecture
Researcher: Ramya, P
Guide(s): Sundar, C and Babu, P
Keywords: Hadoop architecture; Deduplication; Big data
University: Anna University
Completed Date: 2020
Abstract: The digital world has shifted towards distributed environments in which autonomous devices work co-operatively. As demand for distributed computing has grown, many devices have been connected and can share their resources among themselves in a disciplined manner. In such an environment data has emerged as a valuable asset, since a growing number of data sources produce information. The volume of data increases along with the velocity of its generation, and this data comes from a variety of sources. The handling, processing, and storing of such volumes of data introduced the concept of Big Data. Big Data is a methodology that employs different techniques to extract more accurate information from a huge volume of data. Because data provides valuable information, it has to be stored and managed properly. The data is stored in an environment such as Hadoop, which can process and handle huge amounts of data scattered among clusters of computers. The exponential growth in data volume threatens the availability of storage space. To tackle this, many storage optimization techniques have been proposed. One such technique is data deduplication, which eliminates redundant or duplicate data and stores a single unique copy. Deduplication considerably reduces the storage space requirement. The big data ecosystem is an open environment in which deduplication is carried out in a decentralized manner, so security issues arise. Deduplication is a space-saving technique that divides a given file into fixed-size or variable-size blocks and calculates a fingerprint for each block using a hash algorithm. Because the fingerprints are unique, they can be compared with those of other blocks to eliminate redundant blocks. The fingerprints are stored in an index-like structure that facilitates the search operation. However, each operation has its own
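The block-and-fingerprint scheme described in the abstract can be illustrated with a minimal sketch. It is not the thesis's implementation: the abstract names neither the hash algorithm, the block size, nor the index structure, so SHA-256, a 4096-byte fixed block size, an in-memory dictionary, and the names split_fixed, fingerprint, and DedupStore are all assumptions chosen for illustration. Variable-size blocks and the security measures the thesis addresses are left out.

```python
import hashlib

BLOCK_SIZE = 4096  # assumed fixed block size; the abstract does not give one


def split_fixed(data: bytes, block_size: int = BLOCK_SIZE):
    """Yield consecutive fixed-size blocks of data (the last may be shorter)."""
    for offset in range(0, len(data), block_size):
        yield data[offset:offset + block_size]


def fingerprint(block: bytes) -> str:
    """Return the block's fingerprint. SHA-256 is an assumption, not the thesis's choice."""
    return hashlib.sha256(block).hexdigest()


class DedupStore:
    """Hypothetical in-memory store: one copy per distinct block, keyed by fingerprint."""

    def __init__(self):
        self.index = {}  # fingerprint -> block bytes (stands in for the index structure)

    def put(self, data: bytes, block_size: int = BLOCK_SIZE):
        """Store data and return its recipe: the ordered list of block fingerprints."""
        recipe = []
        for block in split_fixed(data, block_size):
            fp = fingerprint(block)
            if fp not in self.index:  # index lookup decides whether the block is new
                self.index[fp] = block
            recipe.append(fp)
        return recipe

    def get(self, recipe):
        """Rebuild the original data from a recipe."""
        return b"".join(self.index[fp] for fp in recipe)


store = DedupStore()
data = b"A" * 8192 + b"B" * 4096   # three blocks, two of them identical
recipe = store.put(data)
restored = store.get(recipe)
```

Here the file is split into three blocks, but only two distinct blocks are kept in the index, and the recipe of fingerprints is enough to rebuild the original bytes. In a Hadoop deployment the index and block storage would be distributed rather than held in one process.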
Pagination: xvii,153 p.
URI: http://hdl.handle.net/10603/334801
Appears in Departments: Faculty of Information and Communication Engineering

Files in This Item:
File                          Description    Size        Format
01_title.pdf                  Attached File  71.18 kB    Adobe PDF
02_certificates.pdf                          223.3 kB    Adobe PDF
03_vivaproceedings.pdf                       435.88 kB   Adobe PDF
04_bonafidecertificate.pdf                   278.07 kB   Adobe PDF
05_abstracts.pdf                             57.4 kB     Adobe PDF
06_acknowledgements.pdf                      325.56 kB   Adobe PDF
07_contents.pdf                              57.08 kB    Adobe PDF
08_listoftables.pdf                          50.52 kB    Adobe PDF
09_listoffigures.pdf                         51.02 kB    Adobe PDF
10_listofabbreviations.pdf                   63.49 kB    Adobe PDF
11_chapter1.pdf                              283.1 kB    Adobe PDF
12_chapter2.pdf                              241.6 kB    Adobe PDF
13_chapter3.pdf                              118.84 kB   Adobe PDF
14_chapter4.pdf                              395.98 kB   Adobe PDF
15_chapter5.pdf                              785.99 kB   Adobe PDF
16_chapter6.pdf                              271.72 kB   Adobe PDF
17_conclusion.pdf                            86.79 kB    Adobe PDF
18_reference.pdf                             121.03 kB   Adobe PDF
19_listofpublications.pdf                    78.1 kB     Adobe PDF
80_recommendation.pdf                        90.67 kB    Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial 4.0 International (CC BY-NC 4.0).