Assessment of computational cost in big data by implementing mapreduce strategy

Mini Prince

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/547923

Title:	Assessment of computational cost in big data by implementing mapreduce strategy
Researcher:	Mini Prince
Guide(s):	Joe Prathap R P M
Keywords:	Big Data; Mapreduce Strategy; Neural Network
University:	Anna University
Completed Date:	2023
Abstract:	Big data is a key component of most contemporary technologies, including social media, smart cities, and the Internet of Things (IoT). Class overlap and class imbalance are two data issues that arise when big data is used in practical applications. Most conventional classifiers become trapped in local optima when working with huge datasets. As a result, research into novel approaches to handling massive data volumes is required. A number of methods have been proposed to address the issue, but the fast expansion of data sources challenges the continued usefulness of many established techniques. Methods such as oversampling and undersampling have shown considerable promise for class imbalance. Of these, the Synthetic Minority Oversampling Technique (SMOTE), which generates synthetic samples for the minority class to construct a balanced dataset, has produced the best results. The problem is that its practical application is limited to situations with tens of thousands of instances or fewer. To address this limitation, this study proposes a parallel method combining SMOTE with the MapReduce strategy. The method divides the algorithm's operation among a number of processing nodes. The first step is to divide the data into blocks using a mapping function. Each map block then undergoes a pre-processing step that uses a hybrid SMOTE method to address the class imbalance problem. A decision tree model is built for each map block. The decision tree components are then integrated to produce a classification model.
Pagination:	xiv,123p.
URI:	http://hdl.handle.net/10603/547923
Appears in Departments:	Faculty of Information and Communication Engineering
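
The pipeline outlined in the abstract (map the data into blocks, rebalance each block with SMOTE, fit one tree per block, integrate the trees) can be sketched roughly as follows. This is an illustrative, single-process sketch and not the thesis implementation: it assumes binary labels, uses plain SMOTE in place of the hybrid variant, uses depth-one decision stumps in place of full decision trees, and assumes majority voting as the integration step. All function names are hypothetical.

```python
import random
from collections import Counter

def split_blocks(rows, n_blocks):
    """Map step (assumed round-robin): deal rows into n_blocks partitions."""
    return [rows[i::n_blocks] for i in range(n_blocks)]

def smote(rows, k=3, rng=None):
    """Plain SMOTE for binary labels: interpolate new minority points
    between a minority sample and one of its k nearest minority neighbours."""
    rng = rng or random.Random(0)
    counts = Counter(label for _, label in rows)
    if len(counts) < 2:
        return list(rows)
    minority = min(counts, key=counts.get)
    need = max(counts.values()) - counts[minority]
    points = [x for x, label in rows if label == minority]
    if len(points) < 2:
        return list(rows)  # no neighbour to interpolate towards
    synthetic = []
    for _ in range(need):
        i = rng.randrange(len(points))
        base = points[i]
        others = [p for j, p in enumerate(points) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(p, base)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbour)))
    return list(rows) + [(x, minority) for x in synthetic]

def train_stump(rows):
    """Depth-one tree (stand-in for a full decision tree): choose the
    (feature, threshold) split with the fewest training errors."""
    default = Counter(label for _, label in rows).most_common(1)[0][0]
    best_errors, best_split = len(rows) + 1, None
    for f in range(len(rows[0][0])):
        for threshold in sorted({x[f] for x, _ in rows}):
            left = [label for x, label in rows if x[f] <= threshold]
            right = [label for x, label in rows if x[f] > threshold]
            if not left or not right:
                continue
            l_lab = Counter(left).most_common(1)[0][0]
            r_lab = Counter(right).most_common(1)[0][0]
            errors = sum(v != l_lab for v in left) + sum(v != r_lab for v in right)
            if errors < best_errors:
                best_errors, best_split = errors, (f, threshold, l_lab, r_lab)
    return best_split or (0, float("inf"), default, default)

def predict_stump(stump, x):
    f, threshold, l_lab, r_lab = stump
    return l_lab if x[f] <= threshold else r_lab

def fit(rows, n_blocks=4):
    """Rebalance each block independently, then fit one stump per block."""
    blocks = split_blocks(rows, n_blocks)
    return [train_stump(smote(b, rng=random.Random(i))) for i, b in enumerate(blocks)]

def predict_ensemble(stumps, x):
    """Integration step (assumed): majority vote over the per-block models."""
    return Counter(predict_stump(s, x) for s in stumps).most_common(1)[0][0]

# Toy imbalanced data, made up for illustration: 40 majority rows, 8 minority rows.
rows = [((i * 0.1, i % 3), 0) for i in range(40)]
rows += [((10 + j * 0.1, j % 2), 1) for j in range(8)]
model = fit(rows, n_blocks=4)
print(predict_ensemble(model, (1.0, 0)), predict_ensemble(model, (10.3, 1)))
```

Because each block is rebalanced and modelled independently, the per-block work in `fit` is the part that a MapReduce framework could distribute across processing nodes; here it runs sequentially for simplicity.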

Files in This Item:

File	Description	Size	Format
01_title.pdf	Attached File	118.5 kB	Adobe PDF
02_prelim pages.pdf		2.47 MB	Adobe PDF
03_contents.pdf		223.54 kB	Adobe PDF
04_abstracts.pdf		220.77 kB	Adobe PDF
05_chapter1.pdf		502.53 kB	Adobe PDF
06_chapter2.pdf		411.3 kB	Adobe PDF
07_chapter3.pdf		1.01 MB	Adobe PDF
08_chapter4.pdf		1.45 MB	Adobe PDF
09_chapter5.pdf		1.04 MB	Adobe PDF
10_chapter6.pdf		226.97 kB	Adobe PDF
11_annexures.pdf		251.06 kB	Adobe PDF
80_recommendation.pdf		336.78 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET