An efficient sentiment classification technique in Hadoop framework using optimized tree

Sridharan, K

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/336265

Title:	An efficient sentiment classification technique in Hadoop framework using optimized tree
Researcher:	Sridharan, K
Guide(s):	Komarasamy, G and Daniel Madan Raja
Keywords:	Engineering and Technology; Computer Science; Computer Science Hardware and Architecture
University:	Anna University
Completed Date:	2020
Abstract:	Big data analysis requires fast mining of large-scale data sets, i.e., an immense amount of data must be processed in a limited time to reveal useful information. As computing power improves, a larger volume of data can be processed. The more data are retrieved and processed, the better the understanding of the problems that can be obtained. Feature extraction is the process whereby subsets of features obtained from the data are extracted for use by a learning algorithm. Classification is the problem of identifying to which of a set of categories a new observation belongs, on the basis of a training set of data containing observations whose category membership is known. Feature selection can address the curse of dimensionality by selecting only the features relevant for classification. By eliminating irrelevant and redundant features, feature selection can reduce the number of features, cut down the training time, simplify the learned classifiers, and improve classification performance. When handling big data, Hadoop provides a platform on which users can develop their own sentiment analysis with the help of a lexicon dictionary, available application programming interfaces (APIs), or external programs. The aim of classifying data is to analyze large data and develop an appropriate description or model for every organized class from the features present in the data. This work uses the Hadoop framework to obtain an effective classification with the help of Random Forest (RF) techniques. Feature extraction uses the Term Frequency-Inverse Document Frequency (TF-IDF) technique. Term frequency (TF) is the number of times a term appears in a document. Document frequency (DF) is the number of documents that include a term. Inverse Document Frequency (IDF) measures the amount of information a term carries. The TF-IDF score is the product of TF and IDF. 
RF is a parametric supervised classification method which can be considered as a Classification and Regr
Pagination:	xiv,126 p.
URI:	http://hdl.handle.net/10603/336265
Appears in Departments:	Faculty of Information and Communication Engineering
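
The TF, DF and IDF definitions given in the abstract can be illustrated with a small sketch. This is not the thesis implementation: the abstract does not state the exact IDF formula, so the common variant IDF = log(N / DF) is assumed here, and the function name `tf_idf` and the three toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    TF  = number of times the term appears in the document
    DF  = number of documents that include the term
    IDF = log(N / DF)  (assumed variant; the abstract gives no formula)
    """
    n_docs = len(docs)

    # Document frequency: count each term once per document.
    df = Counter()
    for doc in docs:
        df.update(set(doc))

    scores = []
    for doc in docs:
        tf = Counter(doc)  # raw term counts for this document
        scores.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return scores


# Hypothetical toy corpus of three short reviews, already tokenized.
docs = [
    "good movie good plot".split(),
    "bad movie".split(),
    "good acting".split(),
]
scores = tf_idf(docs)
```

Under this variant, a term that occurs in every document gets IDF = log(1) = 0, so words shared by all reviews contribute nothing to the feature vector, while rare terms such as "bad" receive the highest weight.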

Files in This Item:

File	Description	Size	Format
01_title.pdf	Attached File	132.21 kB	Adobe PDF
02_certificates.pdf		682.8 kB	Adobe PDF
03_vivaproceedings.pdf		202.87 kB	Adobe PDF
04_bonafidecertificate.pdf		396.04 kB	Adobe PDF
05_abstracts.pdf		111.24 kB	Adobe PDF
06_acknowledgements.pdf		221.03 kB	Adobe PDF
07_contents.pdf		232.57 kB	Adobe PDF
08_listoftables.pdf		83.77 kB	Adobe PDF
09_listoffigures.pdf		140.34 kB	Adobe PDF
10_listofabbreviations.pdf		121.13 kB	Adobe PDF
11_chapter1.pdf		467.09 kB	Adobe PDF
12_chapter2.pdf		352.93 kB	Adobe PDF
13_chapter3.pdf		352.62 kB	Adobe PDF
14_chapter4.pdf		421.22 kB	Adobe PDF
15_chapter5.pdf		293.15 kB	Adobe PDF
16_conclusion.pdf		167.89 kB	Adobe PDF
17_references.pdf		290.24 kB	Adobe PDF
18_listofpublications.pdf		152.64 kB	Adobe PDF
80_recommendation.pdf		189.61 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET