Mining Opinions about Traffic Status in Tweets using Sentiment Analysis

BOOPALAN, K

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/314282

Title:	Mining Opinions about Traffic Status in Tweets using Sentiment Analysis
Researcher:	BOOPALAN, K
Guide(s):	NALINI, C
Keywords:	Computer Science; Computer Science Theory and Methods; Engineering and Technology
University:	Bharath University
Completed Date:	2018
Abstract:	Text analytics has been one of the most promising fields in information technology over the past decade. Most organizations use text analytics to uncover meaningful information from unstructured text, because natural language processing techniques are highly challenging to apply and often cause problems due to inconsistencies in syntax and semantics. This research work focuses on the importance of text analytics in the field of traffic analysis and evaluates the performance of various text classification algorithms. This thesis proposes, demonstrates and evaluates the concept of mining opinions about traffic in tweet messages using sentiment analysis. The experimentation involves discussion and comparison of ensemble classifiers over labeled tweets. A maximum of 1500 tweets per day were requested on four different days, and almost 5000 tweets were collected in all. In this research work, the researcher has taken up the task of extracting public opinion on traffic conditions from the tweets people post while on the road. Sentiment analysis has developed rapidly in recent years as internet usage has grown among people. Only a few studies have focused on the field of transportation, and these have failed to meet the strict requirements of safety, efficiency and information exchange of ITS (Intelligent Transportation System). A TSA (Traffic Sentiment Analysis) system is used to overcome this problem. The objective of this work is to provide a system for mining opinions about traffic status from tweets. The TSA system is also proposed to treat traffic problems from a new angle. A technique called keyword extraction is used in this work. The researcher has collected a corpus of around 5000 traffic-related tweets using the Twitter API. 
An ensemble model has been built by applying various classifier algorithms to the labeled tweets of the training set. The tweets were labeled manually as positive or negative. Finally, the ensemble is used to classify the test set. The results obtained from the system are quite competitive, considering that no complex NLP processing systems have been used. The methodology has been demonstrated by implementing the following text classification algorithms: MAXENT, SVM, GLMNET, SLDA, BOOSTING, BAGGING, RF, NNET and TREE. In the MAXENT (maximum entropy) algorithm, word count is used as the main feature. The SVM (support vector machine) algorithm is used for classification and regression analysis. GLMNET, an R library for regularized generalized linear models, builds regression models and makes predictions. The supervised LDA (SLDA) method pairs each document with a response. BAGGING aggregates several classifiers to obtain a smaller variance than any single classifier. BOOSTING converts a weak learning algorithm into a strong one. RF (random forest) is a kind of feature bagging in which the features that are strong predictors of the response are selected. NNET is an easy-to-use neural network library for R. TREE classifiers are so called because the resulting model resembles a tree. To evaluate and compare the proposed schemes with each other and to optimize the results, this thesis considers the main performance measures of precision, recall, F-score and mean accuracy.
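The ensemble step and the evaluation measures named in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the abstract names R libraries, whereas the sketch uses plain Python, and the majority-vote rule, the function names and the toy labels are all assumptions made for illustration, since the abstract does not state how the individual classifiers are combined.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine several classifiers' label lists into one label per tweet.

    `predictions` is a list of equal-length label lists, one per classifier.
    Majority voting is an assumed combination rule, not the thesis's own.
    """
    combined = []
    for votes in zip(*predictions):
        # Pick the most frequent label among the classifiers for this tweet.
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def evaluate(gold, predicted, positive="positive"):
    """Return precision, recall, F-score and accuracy for the positive class."""
    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    correct = sum(1 for g, p in pairs if g == p)

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    fscore = 2 * precision * recall / denom if denom else 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return {"precision": precision, "recall": recall,
            "fscore": fscore, "accuracy": accuracy}


# Hypothetical manual labels for five test tweets.
gold = ["positive", "negative", "negative", "positive", "negative"]

# Hypothetical outputs of three individual classifiers on the same tweets.
clf_a = ["positive", "negative", "positive", "positive", "negative"]
clf_b = ["positive", "negative", "negative", "negative", "negative"]
clf_c = ["negative", "negative", "negative", "positive", "positive"]

ensemble = majority_vote([clf_a, clf_b, clf_c])
single_scores = evaluate(gold, clf_a)
ensemble_scores = evaluate(gold, ensemble)
```

In this toy data each classifier makes at least one mistake, but the mistakes fall on different tweets, so the vote recovers the manual label in every case. Real tweets would not generally behave this neatly.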
Pagination:
URI:	http://hdl.handle.net/10603/314282
Appears in Departments:	Department of Computer Science and Engineering

Files in This Item:

File	Description	Size	Format
80_recommendation.pdf	Attached File	295.2 kB	Adobe PDF
certificate.pdf		64.33 kB	Adobe PDF
chapter 1.pdf		352.17 kB	Adobe PDF
chapter 2.pdf		133.64 kB	Adobe PDF
chapter 3.pdf		330.27 kB	Adobe PDF
chapter 4.pdf		560.92 kB	Adobe PDF
chapter 5.pdf		1.08 MB	Adobe PDF
chapter 6.pdf		82.88 kB	Adobe PDF
preiminary pages.pdf		176.53 kB	Adobe PDF
references.pdf		93.43 kB	Adobe PDF
title page.pdf		215.3 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET