Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/547923
Title: Assessment of computational cost in big data by implementing mapreduce strategy
Researcher: Mini Prince
Guide(s): Joe Prathap R P M
Keywords: Big Data; Mapreduce Strategy; Neural Network
University: Anna University
Completed Date: 2023
Abstract: Big data is a key component of most contemporary technologies, including social media, smart cities, and the Internet of Things (IoT). Class overlap and class imbalance are two data issues that arise when big data is used in practical applications. Most conventional classifiers become trapped in local optima when working with huge datasets. As a result, research into novel approaches to handling massive data volumes is required. A number of methods have been proposed to address the issue, but the rapid expansion of data sources challenges the continued usefulness of many established techniques. Methods such as oversampling and undersampling have shown considerable promise for the class imbalance problem.
The Synthetic Minority Oversampling Technique (SMOTE), which generates synthetic samples for the minority class to construct a balanced dataset, has produced the best results of these strategies. However, the practical application of these strategies is limited to datasets with tens of thousands of samples or fewer. This study proposes a parallel method that combines SMOTE with the MapReduce strategy to address this limitation. The method distributes the algorithm's work across a number of processing nodes. First, a mapping function divides the data into blocks. Each map block then undergoes a pre-processing step that uses a hybrid SMOTE method to resolve the class imbalance problem. A decision tree model is built for each map block. The per-block decision tree components are then integrated to produce a classification model.
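The pipeline described in the abstract (split into blocks, balance each block with SMOTE, train one tree per block, integrate the trees) can be illustrated with a small standard-library sketch. This is not the thesis implementation; several parts are assumptions made for illustration. Plain SMOTE stands in for the unspecified "hybrid SMOTE" method, a depth-1 decision stump stands in for a full decision tree, majority voting is assumed as the integration step, and the map tasks run sequentially in one process where a real deployment would run them on separate nodes. All function names are hypothetical.

```python
import random
from collections import Counter

def smote(minority, n_new, k=3):
    """Plain SMOTE (assumed stand-in for the thesis's hybrid variant):
    interpolate between a minority point and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        return []  # nothing to interpolate with
    synthetic = []
    for _ in range(n_new):
        base = random.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = random.choice(neighbours)
        gap = random.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, other)))
    return synthetic

def balance(block):
    """Oversample every smaller class in a block up to the size of the largest class."""
    by_class = {}
    for x, y in block:
        by_class.setdefault(y, []).append(x)
    target = max(len(xs) for xs in by_class.values())
    balanced = list(block)
    for y, xs in by_class.items():
        balanced += [(s, y) for s in smote(xs, target - len(xs))]
    return balanced

def train_stump(block):
    """Depth-1 decision tree (assumed simplification): choose the
    (feature, threshold) split with the fewest training errors."""
    best = None
    for f in range(len(block[0][0])):
        for t in sorted({x[f] for x, _ in block}):
            left = Counter(y for x, y in block if x[f] <= t)
            right = Counter(y for x, y in block if x[f] > t)
            l_lab = left.most_common(1)[0][0]
            r_lab = right.most_common(1)[0][0] if right else l_lab
            errors = sum(1 for x, y in block if (l_lab if x[f] <= t else r_lab) != y)
            if best is None or errors < best[0]:
                best = (errors, f, t, l_lab, r_lab)
    _, f, t, l_lab, r_lab = best
    return lambda x: l_lab if x[f] <= t else r_lab

def map_blocks(data, n_blocks):
    """Mapping function: divide the data into n_blocks interleaved blocks."""
    return [data[i::n_blocks] for i in range(n_blocks)]

def map_task(block):
    """Work done per map block: balance with SMOTE, then fit a tree."""
    return train_stump(balance(block))

def reduce_models(models):
    """Integration step (assumed): majority vote over the per-block models."""
    def predict(x):
        return Counter(m(x) for m in models).most_common(1)[0][0]
    return predict

# Toy imbalanced dataset: 300 majority points near (0, 0), 30 minority points near (4, 4).
random.seed(0)
data = [((random.gauss(0, 1), random.gauss(0, 1)), 0) for _ in range(300)]
data += [((random.gauss(4, 1), random.gauss(4, 1)), 1) for _ in range(30)]
random.shuffle(data)

model = reduce_models([map_task(b) for b in map_blocks(data, 3)])
print(model((4.0, 4.0)), model((0.0, 0.0)))
```

Because each block is balanced independently, no node needs to see the full dataset, which is the property that lets the SMOTE step scale beyond a single machine. The trade-off is that synthetic samples are interpolated only from the minority points that happen to fall in the same block.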
Pagination: xiv,123p.
URI: http://hdl.handle.net/10603/547923
Appears in Departments: Faculty of Information and Communication Engineering

Files in This Item:
File                   Description     Size        Format
01_title.pdf           Attached File   118.5 kB    Adobe PDF
02_prelim pages.pdf                    2.47 MB     Adobe PDF
03_contents.pdf                        223.54 kB   Adobe PDF
04_abstracts.pdf                       220.77 kB   Adobe PDF
05_chapter1.pdf                        502.53 kB   Adobe PDF
06_chapter2.pdf                        411.3 kB    Adobe PDF
07_chapter3.pdf                        1.01 MB     Adobe PDF
08_chapter4.pdf                        1.45 MB     Adobe PDF
09_chapter5.pdf                        1.04 MB     Adobe PDF
10_chapter6.pdf                        226.97 kB   Adobe PDF
11_annexures.pdf                       251.06 kB   Adobe PDF
80_recommendation.pdf                  336.78 kB   Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).