Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/448126
Title: | Analyzing Big Data Originated from Social Networks and Data Communication Networks |
Researcher: | Shyamasundar, L B |
Guide(s): | Jhansi Rani, P |
Keywords: | Computer Science; Computer Science Software Engineering; Engineering and Technology |
University: | Visvesvaraya Technological University, Belagavi |
Completed Date: | 2022 |
Abstract: | The dissertation is divided into two parts. The first part covers sentiment analysis of tweets, generated at Big Data scale, using machine learning algorithms. The second part covers fine-tuning the resource allocation mechanism of a distributed, multinode Apache Spark cluster. This results in faster processing and analysis of Big Data originating from data communication networks, using the K-means machine learning algorithm. In the first part, a multi-tier architecture is proposed for sentiment classification. It includes modules for preprocessing, data cleaning, tokenization, and stemming, an updated set of stopwords, lexicon and emoticon dictionaries, and mechanisms for selecting the best features. In sentiment analysis, before training a Machine Learning (ML) model, the user of the software tool must manually select an ML algorithm and tune the model parameters, because the algorithm and its parameter values greatly affect the model's performance. Selecting and fine-tuning them, however, requires considerable expertise and labor-intensive iteration. Automating this process is therefore needed to make ML accessible to lay users with limited computing and programming expertise. In particular, no single model achieves the highest accuracy for every kind of dataset in a given application domain, and trying out several ML algorithms with varying parameter configurations is tedious, time-consuming, and inefficient. Hence, automating the ML modelling process is important. In the proposed method, the algorithm automatically selects the best-performing ML algorithm for a particular dataset by optimizing the parameter settings of the selected algorithm. This yields much better performance than selecting an algorithm with its default settings. The proposed model is a multi-layered ML architecture, with accuracy as the evaluation criterion for analyzing the tweets. Tuning |
Pagination: | xvii, 183 |
URI: | http://hdl.handle.net/10603/448126 |
Appears in Departments: | CMR Institute of Technology |
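The automated selection described in the abstract (try several ML algorithms, tune each one's parameters, keep the most accurate) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the toy tweets, the lexicon, the two simple candidate classifiers, and the parameter grids are hypothetical and are not the dissertation's algorithms, data, or results. Only the overall idea, an exhaustive search over algorithm and parameter combinations with accuracy as the criterion, follows the abstract.

```python
from itertools import product

# Hypothetical toy data: (tweet text, label) with 1 = positive, 0 = negative.
TRAIN = [
    ("i love this phone", 1),
    ("great service and great food", 1),
    ("what a good day", 1),
    ("i hate waiting in line", 0),
    ("bad service and awful food", 0),
    ("this is a terrible phone", 0),
]
VALID = [
    ("love the good food", 1),
    ("great phone", 1),
    ("awful day and bad line", 0),
    ("i hate this terrible service", 0),
]

# Hypothetical sentiment lexicon (word -> polarity score).
LEXICON = {"love": 1, "great": 1, "good": 1,
           "hate": -1, "bad": -1, "awful": -1, "terrible": -1}


def lexicon_classifier(train, threshold):
    """Rule-based candidate: sum lexicon scores and compare to a threshold.

    `train` is unused; it is accepted so both candidates share one signature.
    """
    def predict(text):
        score = sum(LEXICON.get(token, 0) for token in text.split())
        return 1 if score > threshold else 0
    return predict


def knn_classifier(train, k):
    """Instance-based candidate: majority vote of the k most similar tweets."""
    def jaccard(a, b):
        a, b = set(a.split()), set(b.split())
        return len(a & b) / len(a | b)

    def predict(text):
        ranked = sorted(train, key=lambda ex: jaccard(text, ex[0]), reverse=True)
        votes = [label for _, label in ranked[:k]]
        return 1 if sum(votes) * 2 > len(votes) else 0
    return predict


def accuracy(predict, data):
    """Fraction of examples in `data` that `predict` labels correctly."""
    return sum(predict(text) == label for text, label in data) / len(data)


# Candidate algorithms and their parameter grids (all values are assumptions).
SEARCH_SPACE = {
    "lexicon": (lexicon_classifier, {"threshold": [-1, 0, 1]}),
    "knn": (knn_classifier, {"k": [1, 3, 5]}),
}


def select_best_model(train, valid):
    """Try every algorithm/parameter combination; keep the most accurate one."""
    best = (None, None, -1.0)
    for name, (factory, grid) in SEARCH_SPACE.items():
        keys = list(grid)
        for values in product(*(grid[key] for key in keys)):
            params = dict(zip(keys, values))
            score = accuracy(factory(train, **params), valid)
            if score > best[2]:  # strict: the first best combination wins ties
                best = (name, params, score)
    return best


best_name, best_params, best_accuracy = select_best_model(TRAIN, VALID)
print(best_name, best_params, best_accuracy)
```

On realistic data the validation split would be replaced by cross-validation and the exhaustive grid by a smarter search, but the structure stays the same: one loop over algorithms, one over parameter settings, and a single evaluation criterion.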
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
01_title.pdf | Attached File | 197.12 kB | Adobe PDF | View/Open |
02_prelim pages.pdf | | 404.4 kB | Adobe PDF | View/Open |
03_content.pdf | | 130.6 kB | Adobe PDF | View/Open |
04_abstract.pdf | | 58.31 kB | Adobe PDF | View/Open |
05_chapter 1.pdf | | 488.79 kB | Adobe PDF | View/Open |
06_chapter 2.pdf | | 127.44 kB | Adobe PDF | View/Open |
07_chapter 3.pdf | | 359.06 kB | Adobe PDF | View/Open |
08_chapter 4.pdf | | 1.54 MB | Adobe PDF | View/Open |
09_chapter 5.pdf | | 467.28 kB | Adobe PDF | View/Open |
10_annexures.pdf | | 194.64 kB | Adobe PDF | View/Open |
11_chapter 6.pdf | | 190.79 kB | Adobe PDF | View/Open |
12_chapter 7.pdf | | 1.31 MB | Adobe PDF | View/Open |
80_recommendation.pdf | | 560.98 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).