Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/462725
Title: Designing and implementation of Big Data Analytics platform using Map reduce architecture with Massive Parallel Processing MPP
Researcher: Vikas, S
Guide(s): Thimmaraju, S N
Keywords: Computer Science
Computer Science Interdisciplinary Applications
Engineering and Technology
University: Visvesvaraya Technological University, Belagavi
Completed Date: 2020
Abstract: Big data emerged in the digital data age as a field devoted to handling the enormous quantities of information now being produced. The term "big data" describes information in terms of its volume, variety and velocity. Such information typically comprises huge amounts of semi-structured and unstructured data that are very difficult to store, process and analyse with earlier Information Technology. Extracting meaningful and valuable information from these huge (Brobdingnagian) datasets is essential for providing new services and raising quality standards. Massive data goes hand in hand with highly distributed processing methods, so that new information and innovation can be deeply mined within an affordable interval of time. Apache Hadoop and Apache Spark are parallel systems that implement the MapReduce programming model. Because Hadoop MapReduce operates on disk and incurs heavy input/output, it can be slow and expensive. In the present work an efficient Flink-based algorithm has been implemented for mining sequences from large repositories of breast cancer data. To test the effectiveness of this method of analysing big data, the Flink algorithm has been applied to breast cancer datasets.
In the present research work, a technique for deleting unwanted redundant data pairs across multiple frames has been developed using a Natural Language Processing technique, text similarity. The prepared data is then fed to a machine learning method, Random Forest. Once the data is collected, the target file is compared with the source file predicted from the predictors, for which the input and output files are prepared. This process is carried out to measure efficiency precisely and to enrich the product details.
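The deduplication step described above can be sketched as follows. This is a minimal illustration, not the thesis's actual implementation: the TF-IDF representation, cosine-similarity measure, 0.9 threshold, and the `deduplicate` helper are all assumptions standing in for the unspecified text-similarity technique.

```python
# Hypothetical sketch of redundancy removal via text similarity:
# records whose TF-IDF cosine similarity to an earlier record exceeds
# a threshold are dropped before any downstream modelling.
# Threshold and example records are illustrative, not from the thesis.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def deduplicate(texts, threshold=0.9):
    """Keep each record only if it is not too similar to one already kept."""
    vectors = TfidfVectorizer().fit_transform(texts)
    sims = cosine_similarity(vectors)
    keep = []
    for i in range(len(texts)):
        if all(sims[i, j] < threshold for j in keep):
            keep.append(i)
    return [texts[i] for i in keep]

records = [
    "patient shows benign tumour in left breast",
    "patient shows benign tumour in left breast",  # exact duplicate
    "malignant mass detected on mammogram",
]
print(deduplicate(records))  # the duplicate record is dropped
```

The cleaned list could then be vectorised and passed to a `RandomForestClassifier`, matching the pipeline order the abstract describes.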
Several machine learning algorithms are available for classifying the WBCD (Wisconsin Breast Cancer Dataset) for breast cancer analysis. In this study three key algorithms are used for the analysis: SVM, N
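An SVM classification of the WBCD, one of the algorithms named above, might look like the sketch below. This is illustrative only: scikit-learn's bundled copy of the Wisconsin dataset stands in for the thesis data, and the train/test split and RBF kernel are assumptions.

```python
# Illustrative sketch: SVM classification of the Wisconsin Breast
# Cancer dataset (WBCD). Split ratio, random seed, and kernel choice
# are assumptions, not parameters taken from the thesis.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)

model = SVC(kernel="rbf", gamma="scale")  # default RBF kernel
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
print(f"SVM test accuracy: {accuracy:.3f}")
```

In practice one would compare this accuracy against the other classifiers the study evaluates on the same split.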
Pagination: 126
URI: http://hdl.handle.net/10603/462725
Appears in Departments:Department of Computer Science and Engineering

Files in This Item:
File                    Description     Size        Format
01_tirtle.pdf           Attached File   91.54 kB    Adobe PDF
02_prelim pages.pdf                     243.64 kB   Adobe PDF
03_content.pdf                          151.85 kB   Adobe PDF
04_abstract.pdf                         91.96 kB    Adobe PDF
05_chapter 1.pdf                        346.12 kB   Adobe PDF
06_chapter 2.pdf                        238.75 kB   Adobe PDF
07_chapter 3.pdf                        862.45 kB   Adobe PDF
08_chapter 4.pdf                        854.86 kB   Adobe PDF
09_chapter 5.pdf                        1.28 MB     Adobe PDF
10_annexures.pdf                        228.92 kB   Adobe PDF
11_chapter 6.pdf                        2.18 MB     Adobe PDF
80_recommendation.pdf                   40.84 kB    Adobe PDF


Items in Shodhganga are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International licence (CC BY-NC-SA 4.0).
