Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/442421
Title: | Framework to improve performance of hadoop |
Researcher: | Balraj Singh |
Guide(s): | Harsh K Verma and H S Johal |
Keywords: | Computer Science; Computer Science Hardware and Architecture; Engineering and Technology |
University: | Dr B R Ambedkar National Institute of Technology Jalandhar |
Completed Date: | 2022 |
Abstract: | The present era demands continuous improvement in processing large-scale data and in working beyond traditional systems. The need to process diverse data types and to provide solutions for different industry domains is rising. Such needs increase the requirement for sophisticated techniques and methods that further enhance the existing platforms and mechanisms. This gives the research community an opportunity to investigate present systems more deeply, identify potential issues, and propose new ways to improve them. Hadoop is a popular choice for managing and processing Big Data. It is an open-source platform and a front runner in the batch processing of large-scale jobs. The cost of scaling a cluster is low compared to other platforms. However, this popularity by no means guarantees high performance in all scenarios. With the continuous evolution of data and industrial requirements, it is imperative to investigate new methods and techniques that advance the existing system. The performance of a cluster depends largely on its job processing mechanisms and the policies associated with them. Although extensive studies and solutions have been proposed, performance bottlenecks in scheduling, load balancing, and content management still prevail. These performance challenges stem from the complex nature of the existing systems and their limited ability to understand the diverse and changing needs of jobs. The key issues to be addressed are scheduling, skew mitigation through load balancing, and efficient data splitting and merging. Few scheduling solutions address the trade-off between the different parameters. The process of content splitting and merging has not been explored to a large extent. Skew mitigation solutions focus more on the Reduce side of MapReduce, while the Map side is little used for load balancing. This thesis, a |
Pagination: | |
URI: | http://hdl.handle.net/10603/442421 |
Appears in Departments: | Department of Computer Science and Engineering |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
80_recommendation.pdf | Attached File | 79.11 kB | Adobe PDF | View/Open |
abstract.pdf | | 85.72 kB | Adobe PDF | View/Open |
bibliography.pdf | | 221.07 kB | Adobe PDF | View/Open |
chapter 1.pdf | | 558.16 kB | Adobe PDF | View/Open |
chapter 2.pdf | | 321.67 kB | Adobe PDF | View/Open |
chapter 3.pdf | | 697.55 kB | Adobe PDF | View/Open |
chapter 4.pdf | | 925.67 kB | Adobe PDF | View/Open |
chapter 5.pdf | | 406.52 kB | Adobe PDF | View/Open |
prelim.pdf | | 909.75 kB | Adobe PDF | View/Open |
table of contents.pdf | | 86.03 kB | Adobe PDF | View/Open |
title.pdf | | 85.79 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).