Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/427502
Full metadata record
DC Field | Value | Language
dc.coverage.spatial | |
dc.date.accessioned | 2022-12-18T09:31:06Z | -
dc.date.available | 2022-12-18T09:31:06Z | -
dc.identifier.uri | http://hdl.handle.net/10603/427502 | -
dc.description.abstract | The term Big Data was first used in 1970 in connection with atmospheric and deep-sea soundings. It refers to a collection of giant and complex data sets that are difficult to process with traditional tools, which were not good enough to handle data at this scale. The MapReduce framework was developed by Google, and Hadoop provides an open-source implementation of it, for processing vast amounts of data in a parallel and distributed environment. The default Hadoop implementation assumes that the executing nodes are homogeneous. The simplicity of the model and the fault tolerance of the framework make it very popular for processing Big Data. As this programming model gains popularity, the scheduling and locality of jobs and data become very significant. Data locality is an important feature that Hadoop introduced to improve the performance of the model. The key idea is to move the map task to, or close to, the node where the data actually resides rather than transferring the vast data set to the computation. Data locality helps lower network congestion and improve performance. However, this practice fails when the data is processed in a heterogeneous Hadoop cluster. In a heterogeneous setup, nodes with different computational capabilities pose a crucial challenge: nodes with higher processing capacity finish their tasks sooner than nodes with lower processing capacity. The objective of this dissertation is to propose a scheduling approach based on KNN clustering and prefetching. The process starts with speculative prefetching and then performs KNN clustering on the intermediate map output before directing it to the reducer for final processing. Scheduler performance is evaluated by executing different workloads such as word count, random text, random num, and sort. The results show that the proposed idea improves the performance of job execution. |
dc.format.extent | XII, 136 |
dc.language | English |
dc.relation | 115 |
dc.rights | university |
dc.title | Performance Improvement in Hadoop Mapreduce |
dc.title.alternative | |
dc.creator.researcher | Kalia, Khushboo |
dc.subject.keyword | Computer Science |
dc.subject.keyword | Computer Science Artificial Intelligence |
dc.subject.keyword | Engineering and Technology |
dc.description.note | |
dc.contributor.guide | Nagpal, Pooja and Neeraj Gupta |
dc.publisher.place | Gurgaon |
dc.publisher.university | K.R. Mangalam University, Gurgaon |
dc.publisher.institution | Department of Computer Science and Engineering |
dc.date.registered | 2015 |
dc.date.completed | 2022 |
dc.date.awarded | 2022 |
dc.format.dimensions | 21X29.7 |
dc.format.accompanyingmaterial | DVD |
dc.source.university | University |
dc.type.degree | Ph.D. |
Appears in Departments: Department of Computer Science

Files in This Item:
File | Description | Size | Format
01_title page.pdf | Attached File | 43.08 kB | Adobe PDF
02_prelim.pdf | | 252.55 kB | Adobe PDF
03_content.pdf | | 126.27 kB | Adobe PDF
04_abstract.pdf | | 120.09 kB | Adobe PDF
05_chapter1- introduction.pdf | | 751.96 kB | Adobe PDF
06_chapter 2 literature survey.pdf | | 786.5 kB | Adobe PDF
07_chapter 3 methodology.pdf | | 554.62 kB | Adobe PDF
08_chapter 4 result and discussion.pdf | | 421.99 kB | Adobe PDF
09_chapter 5 conclusion and future work.pdf | | 216.71 kB | Adobe PDF
10_annexures.pdf | | 1.85 MB | Adobe PDF
80_recommendation.pdf | | 242.28 kB | Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
