Designing and implementation of Big Data Analytics platform using Map reduce architecture with Massive Parallel Processing MPP

Vikas, S

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/462725

Title:	Designing and implementation of Big Data Analytics platform using Map reduce architecture with Massive Parallel Processing MPP
Researcher:	Vikas, S
Guide(s):	Thimmaraju, S N
Keywords:	Computer Science; Computer Science Interdisciplinary Applications; Engineering and Technology
University:	Visvesvaraya Technological University, Belagavi
Completed Date:	2020
Abstract:	Big data arose as an area of experimentation in the digital data age to tackle the enormous quantities of information being produced. The term "big data" is used to describe information in terms of its amount, distance and speed. This information typically includes huge quantities of semi-structured and unstructured data that are very difficult to store, process and analyze using earlier information technology. It is very important to extract meaningful and valuable information from Brobdingnagian (huge) datasets in order to provide new services and raise the higher principle standards. Massive information goes hand in hand with highly distributed processing methods, so that new information and innovation can be deeply mined within an affordable interval of time. Apache Hadoop and Apache Spark are presented as parallel systems that provide the basis for implementing the MapReduce programming model. Because Hadoop MapReduce runs on disk, its higher input/output operations sometimes make it slow and expensive. In the present work an efficient Flink-based algorithm has been implemented for mining sequences from large breast cancer repositories. To test the effectiveness of this method of analyzing big data, the Flink algorithm has been applied to breast cancer datasets.
In the present research work, a technique to delete unwanted redundant data pairs in multiple frames has been prepared using a Natural Language Processing technique, Text Similarity. The prepared data is then fed to a machine learning method called Random Forest. After this data is collected, the target file is compared with the predicted source file from the predictors in which the input and output files are prepared. This process is carried out in order to obtain exact efficiency and to enrich the product details.
Different machine learning algorithms are available for classification analysis of the WBCD BC dataset. In this study three key algorithms are used for the analysis SVM, N
Pagination:	126
URI:	http://hdl.handle.net/10603/462725
Appears in Departments:	Department of Computer Science and Engineering

Files in This Item:

File	Description	Size	Format
01_tirtle.pdf	Attached File	91.54 kB	Adobe PDF
02_prelim pages.pdf		243.64 kB	Adobe PDF
03_content.pdf		151.85 kB	Adobe PDF
04_abstract.pdf		91.96 kB	Adobe PDF
05_chapter 1.pdf		346.12 kB	Adobe PDF
06_chapter 2.pdf		238.75 kB	Adobe PDF
07_chapter 3.pdf		862.45 kB	Adobe PDF
08_chapter 4.pdf		854.86 kB	Adobe PDF
09_chapter 5.pdf		1.28 MB	Adobe PDF
10_annexures.pdf		228.92 kB	Adobe PDF
11_chapter 6.pdf		2.18 MB	Adobe PDF
80_recommendation.pdf		40.84 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET