Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/448126
Title: Analyzing Big Data Originated from Social Networks and Data Communication Networks
Researcher: Shyamasundar, L B
Guide(s): Jhansi Rani, P
Keywords: Computer Science
Computer Science Software Engineering
Engineering and Technology
University: Visvesvaraya Technological University, Belagavi
Completed Date: 2022
Abstract: The dissertation is divided into two parts. The first part addresses sentiment analysis of tweets generated in Big Data form, using machine learning algorithms. The second part addresses fine-tuning the resource allocation mechanism of a distributed, multinode Apache Spark cluster. This results in faster processing and analysis of Big Data originating from data communication networks, using the K-means machine learning algorithm.
In the first part of the dissertation, a multi-tier architecture is proposed for sentiment classification. It includes modules for preprocessing, data cleaning, tokenization, and stemming, together with an updated set of stopwords, lexicon and emoticon dictionaries, and mechanisms for selecting the best features.
In sentiment analysis, before training a Machine Learning (ML) model, the user of the software tool must manually select an ML algorithm and tune the model parameters, since the algorithm and its parameter values greatly affect the model's performance. Selecting and fine-tuning them, however, requires considerable expertise and labor-intensive iteration. Automating this process is therefore needed to make ML accessible to lay users with limited computing and programming expertise.
In particular, no single model achieves the highest accuracy for all varieties of dataset in a specific application domain. Trying out several ML algorithms with varying parameter configurations is tedious, time-consuming, and inefficient, so automating the ML modelling process is important. In the proposed method, the algorithm automatically selects the best-performing ML algorithm for a particular dataset by optimizing the parameter settings for the selected algorithm. This yields much better performance than selecting an algorithm with its default settings.
The proposed model is a multi-layered ML architecture, with accuracy as the evaluation criterion when analyzing the tweets. Tuning
Pagination: xvii, 183
URI: http://hdl.handle.net/10603/448126
Appears in Departments: CMR Institute of Technology

Files in This Item:
File                    Description    Size        Format
01_title.pdf            Attached File  197.12 kB   Adobe PDF
02_prelim pages.pdf                    404.4 kB    Adobe PDF
03_content.pdf                         130.6 kB    Adobe PDF
04_abstract.pdf                        58.31 kB    Adobe PDF
05_chapter 1.pdf                       488.79 kB   Adobe PDF
06_chapter 2.pdf                       127.44 kB   Adobe PDF
07_chapter 3.pdf                       359.06 kB   Adobe PDF
08_chapter 4.pdf                       1.54 MB     Adobe PDF
09_chapter 5.pdf                       467.28 kB   Adobe PDF
10_annexures.pdf                       194.64 kB   Adobe PDF
11_chapter 6.pdf                       190.79 kB   Adobe PDF
12_chapter 7.pdf                       1.31 MB     Adobe PDF
80_recommendation.pdf                  560.98 kB   Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
