Framework to improve performance of Hadoop

Balraj Singh

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/442421

Title:	Framework to improve performance of Hadoop
Researcher:	Balraj Singh
Guide(s):	Harsh K Verma and H S Johal
Keywords:	Computer Science; Computer Science Hardware and Architecture; Engineering and Technology
University:	Dr B R Ambedkar National Institute of Technology Jalandhar
Completed Date:	2022
Abstract:	The present era demands continuous improvement in processing large-scale data and in working beyond traditional systems. The need to process diverse data types and to provide solutions for different industry domains is rising. These needs increase the requirement for sophisticated techniques and methods that further enhance existing platforms and mechanisms. This gives the research community an opportunity to investigate present systems, identify potential issues, and propose new ways to improve them. Hadoop is a popular choice for managing and processing Big Data. It is an open-source platform and a front runner in the batch processing of large-scale jobs. The cost of scaling a cluster is low compared to other platforms. However, this popularity by no means guarantees high performance in all scenarios. With the continuous evolution of data and industrial requirements, it is imperative to investigate new methods and techniques that advance the existing system. The performance of a cluster depends largely on the job processing mechanisms and the policies associated with them. Although extensive studies and solutions have been proposed, performance bottlenecks in scheduling, load balancing, and content management still prevail. These performance challenges stem from the complex nature of existing systems and their limited ability to understand the diverse and changing needs of jobs. The key issues to be addressed are scheduling, skew mitigation through load balancing, and efficient data splitting and merging. Few solutions address scheduling with respect to the trade-off between the different parameters. The process of content splitting and merging has not been explored to a large extent. Skew mitigation solutions focus mostly on the Reduce side of MapReduce, while the Map side is not utilized much for load balancing. This thesis, a
URI:	http://hdl.handle.net/10603/442421
Appears in Departments:	Department of Computer Science and Engineering

Files in This Item:

File	Description	Size	Format
80_recommendation.pdf	Attached File	79.11 kB	Adobe PDF
abstract.pdf		85.72 kB	Adobe PDF
bibliography.pdf		221.07 kB	Adobe PDF
chapter 1.pdf		558.16 kB	Adobe PDF
chapter 2.pdf		321.67 kB	Adobe PDF
chapter 3.pdf		697.55 kB	Adobe PDF
chapter 4.pdf		925.67 kB	Adobe PDF
chapter 5.pdf		406.52 kB	Adobe PDF
prelim.pdf		909.75 kB	Adobe PDF
table of contents.pdf		86.03 kB	Adobe PDF
title.pdf		85.79 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET