Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/336265
Title: | An efficient sentiment classification technique in Hadoop framework using optimized tree |
Researcher: | Sridharan, K |
Guide(s): | Komarasamy, G and Daniel Madan Raja |
Keywords: | Engineering and Technology; Computer Science; Computer Science Hardware and Architecture |
University: | Anna University |
Completed Date: | 2020 |
Abstract: | Big data analysis requires fast mining of large-scale data sets, i.e., an immense amount of data must be processed in a limited time to yield useful information. As computing power improves, a larger volume of data can be processed, and the more data are retrieved and processed, the better the problem can be understood. Feature extraction is the process by which subsets of features are extracted from the data for use by a learning algorithm. Classification is the problem of identifying to which of a set of categories a new observation belongs, on the basis of a training set of observations whose category membership is known. Feature selection can address the curse of dimensionality by selecting only the features relevant for classification. By eliminating irrelevant and redundant features, feature selection can reduce the number of features, cut training time, simplify the learned classifiers and improve classification performance. When handling big data, Hadoop provides a platform on which users can develop their own sentiment analysis with the help of a lexicon dictionary, available application programming interfaces (APIs) or external programs. The aim of classification is to analyze large data and develop an appropriate description or model for each class from the features present in the data. This work uses the Hadoop framework to obtain an effective classification with the help of Random Forest (RF) techniques. Features are extracted using the Term Frequency-Inverse Document Frequency (TF-IDF) technique. Term frequency (TF) is the number of times a term appears in a document. Document frequency (DF) is the number of documents that include a term. Inverse Document Frequency (IDF) measures the amount of information a term carries. The TF-IDF weight is the product of TF and IDF. 
RF is the parametric supervised classification method which can be considered as a Classification and Regr |
Pagination: | xiv,126 p. |
URI: | http://hdl.handle.net/10603/336265 |
Appears in Departments: | Faculty of Information and Communication Engineering |
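The TF-IDF weighting described in the abstract can be illustrated with a minimal sketch. The abstract defines TF, DF and the TF-IDF product but does not state the exact IDF formula, so the common form log(N / DF) is assumed here; the function name `tf_idf` and the toy documents are hypothetical, and this is not the thesis's Hadoop implementation.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumption: IDF = log(N / DF), a common choice that the
    abstract itself does not specify.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)  # raw count of each term in this document
        weights.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return weights

# Hypothetical toy corpus, tokenized on whitespace.
docs = [
    "good movie good plot".split(),
    "bad movie".split(),
    "good acting".split(),
]
scores = tf_idf(docs)
```

With this form of IDF, a term that occurs in every document receives a weight of zero, while a term confined to a single document receives the largest IDF factor.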
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
01_title.pdf | Attached File | 132.21 kB | Adobe PDF |
02_certificates.pdf | | 682.8 kB | Adobe PDF |
03_vivaproceedings.pdf | | 202.87 kB | Adobe PDF |
04_bonafidecertificate.pdf | | 396.04 kB | Adobe PDF |
05_abstracts.pdf | | 111.24 kB | Adobe PDF |
06_acknowledgements.pdf | | 221.03 kB | Adobe PDF |
07_contents.pdf | | 232.57 kB | Adobe PDF |
08_listoftables.pdf | | 83.77 kB | Adobe PDF |
09_listoffigures.pdf | | 140.34 kB | Adobe PDF |
10_listofabbreviations.pdf | | 121.13 kB | Adobe PDF |
11_chapter1.pdf | | 467.09 kB | Adobe PDF |
12_chapter2.pdf | | 352.93 kB | Adobe PDF |
13_chapter3.pdf | | 352.62 kB | Adobe PDF |
14_chapter4.pdf | | 421.22 kB | Adobe PDF |
15_chapter5.pdf | | 293.15 kB | Adobe PDF |
16_conclusion.pdf | | 167.89 kB | Adobe PDF |
17_references.pdf | | 290.24 kB | Adobe PDF |
18_listofpublications.pdf | | 152.64 kB | Adobe PDF |
80_recommendation.pdf | | 189.61 kB | Adobe PDF |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).