A secure hash based deduplication system in hadoop architecture

Ramya, P

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/334801

Title:	A secure hash based deduplication system in hadoop architecture
Researcher:	Ramya, P
Guide(s):	Sundar, C and Babu, P
Keywords:	Hadoop architecture; Deduplication; Big data
University:	Anna University
Completed Date:	2020
Abstract:	The digital world has shifted towards distributed environments in which autonomous devices work cooperatively. As demand for distributed computing has grown, many devices have been connected and share their resources in a disciplined manner. In such an environment data has emerged as a valuable asset, because a growing number of sources produce information. The volume of data increases along with the velocity of its generation, and the data comes from a variety of sources. Handling, processing, and storing this volume of data gave rise to the concept of Big Data, a methodology that employs different techniques to extract more accurate information from huge volumes of data. Because data provides valuable information, it has to be stored and managed properly. The data is stored in an environment such as Hadoop, which can process and handle huge amounts of data scattered across clusters of computers. The exponential growth in data volume threatens the availability of storage space. To tackle this, many storage optimization techniques have been proposed; one of them is data deduplication, which eliminates redundant or duplicate data and stores a single unique copy. Deduplication reduces the storage space requirement considerably. The big data ecosystem is an open environment in which deduplication is carried out in a decentralized manner, and this raises security issues. Deduplication is a space-saving technique that divides a given file into fixed- or variable-size blocks and calculates a fingerprint for each block using a hash algorithm. The fingerprints are, for practical purposes, unique, so they can be compared against those of other blocks to eliminate redundant blocks. The fingerprints are stored in an index-like structure that facilitates the search operation. However, each operation has its own
Pagination:	xvii,153 p.
URI:	http://hdl.handle.net/10603/334801
Appears in Departments:	Faculty of Information and Communication Engineering
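
The block-level deduplication described in the abstract (split a file into blocks, fingerprint each block with a hash, look the fingerprint up in an index) can be illustrated with a minimal sketch. This is not the thesis's implementation: the choice of SHA-256, the fixed 4096-byte block size, the in-memory dictionary standing in for the fingerprint index, and the function names `dedup_store` and `restore` are all assumptions made for illustration.

```python
import hashlib


def dedup_store(data, block_size=4096, index=None):
    """Split data into fixed-size blocks and keep one copy of each unique block.

    Returns (recipe, index): recipe is the ordered list of fingerprints needed
    to rebuild the data; index maps fingerprint -> block bytes.
    SHA-256 and the dict-based index are illustrative assumptions.
    """
    if index is None:
        index = {}
    recipe = []
    for offset in range(0, len(data), block_size):
        block = data[offset:offset + block_size]
        fingerprint = hashlib.sha256(block).hexdigest()
        # Store the block only if its fingerprint has not been seen before.
        if fingerprint not in index:
            index[fingerprint] = block
        recipe.append(fingerprint)
    return recipe, index


def restore(recipe, index):
    """Rebuild the original data from its fingerprint recipe."""
    return b"".join(index[fingerprint] for fingerprint in recipe)


# Four blocks are written, but only two distinct blocks are stored.
sample = b"A" * 8192 + b"B" * 4096 + b"A" * 4096
recipe, index = dedup_store(sample)
```

In this sketch the dictionary lookup plays the role of the index search mentioned in the abstract; a variable-size (content-defined) chunker could replace the fixed-size loop without changing the fingerprint and index steps.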

Files in This Item:

File	Description	Size	Format
01_title.pdf	Attached File	71.18 kB	Adobe PDF
02_certificates.pdf		223.3 kB	Adobe PDF
03_vivaproceedings.pdf		435.88 kB	Adobe PDF
04_bonafidecertificate.pdf		278.07 kB	Adobe PDF
05_abstracts.pdf		57.4 kB	Adobe PDF
06_acknowledgements.pdf		325.56 kB	Adobe PDF
07_contents.pdf		57.08 kB	Adobe PDF
08_listoftables.pdf		50.52 kB	Adobe PDF
09_listoffigures.pdf		51.02 kB	Adobe PDF
10_listofabbreviations.pdf		63.49 kB	Adobe PDF
11_chapter1.pdf		283.1 kB	Adobe PDF
12_chapter2.pdf		241.6 kB	Adobe PDF
13_chapter3.pdf		118.84 kB	Adobe PDF
14_chapter4.pdf		395.98 kB	Adobe PDF
15_chapter5.pdf		785.99 kB	Adobe PDF
16_chapter6.pdf		271.72 kB	Adobe PDF
17_conclusion.pdf		86.79 kB	Adobe PDF
18_reference.pdf		121.03 kB	Adobe PDF
19_listofpublications.pdf		78.1 kB	Adobe PDF
80_recommendation.pdf		90.67 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET