Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/366162
Title: | Study on Exploitation of Optimization and Topic Modeling Approaches for Web News Mining |
Researcher: | Nithya D |
Guide(s): | Sivakumari S |
Keywords: | Engineering and Technology; Computer Science; Computer Science Information Systems |
University: | Avinashilingam Institute for Home Science and Higher Education for Women |
Completed Date: | 2022 |
Abstract: | Online news is news that is posted and disseminated online and is publicly available through web services. Various techniques are used to analyze and extract information from online news sources. However, because the information in online news is continuously changing and growing, an efficient technique is needed to extract the most valuable information from web news.
In the first phase of this research work, an EFS (Evolving Fuzzy System) is used to categorize news articles. PeSOA (Penguin Search Optimization Algorithm) is used to select the optimal threshold for pruning irrelevant terms. A bell-shaped fuzzy GMF (Gaussian Membership Function) is used to describe the closeness between terms and a news category. In the second phase, WNM (Web News Mining) based on EFS-PeSOA is enhanced with related TwD (Twitter Data). The LDA (Latent Dirichlet Allocation) model is used to find the main topics of the TwD. The pruned TF-IDF (Term Frequency-Inverse Document Frequency) values of the TwD terms are analyzed by STGM (Spatio-Temporal Generalized Additive Model), which considers the place and time at which users posted the tweets. RLAR (Randomized Least Angle Regression) is used to estimate the more relevant TF-IDF values for generating fuzzy rules.
LDA-based ToM (topic modeling) takes a collection of training articles and requires approximate inference techniques, which have high computational complexity. To solve this problem, Sparse-LDA with distributed learning is proposed: multiple processors separately execute Sparse-LDA on their local TwD and then share their results. After the ToM, FR (fuzzy rules) are formed from the TF-IDF values of important terms from the WNA (web news articles) and tweets in order to classify the WNA. Separable topic discovery is introduced in the last phase of the research. ToS separates topics so that each distinct topic has its own unique term that is not found in any other topic.
The key intention of this research work is to categorize the WNA efficiently by successively enhancing the WNM techniques.
The dataset is built from NYT news reports that have been organized into seven distinct categories. |
Pagination: | 120 p. |
URI: | http://hdl.handle.net/10603/366162 |
Appears in Departments: | Department of Computer Science and Engineering |
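The first phase described in the abstract (pruning low-weight terms by a threshold, then scoring closeness with a bell-shaped Gaussian membership function) can be illustrated with a minimal sketch. This is not the thesis implementation: the function names, the toy documents, and the fixed threshold of 0.05 are assumptions made for illustration only. In the thesis the threshold is selected by PeSOA, which is not reproduced here.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: tf-idf} dict per tokenized document."""
    n_docs = len(docs)
    # Number of documents in which each term occurs at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def prune(weights, threshold):
    """Keep only terms whose weight is at least the threshold."""
    return {t: w for t, w in weights.items() if w >= threshold}


def gaussian_membership(x, center, sigma):
    """Bell-shaped membership degree in (0, 1], equal to 1 at x == center."""
    return math.exp(-((x - center) ** 2) / (2 * sigma ** 2))


# Toy corpus (hypothetical); each document is a list of tokens.
docs = [
    ["election", "vote", "senate", "the"],
    ["match", "goal", "league", "the"],
    ["vote", "poll", "election", "the"],
]
weights = tf_idf(docs)
# Fixed threshold chosen by hand here; an optimizer would tune it.
pruned = [prune(w, threshold=0.05) for w in weights]
```

A term that occurs in every document, such as "the", gets a TF-IDF of zero and is pruned, while category-specific terms survive. The membership function then maps a surviving term's weight to a degree of closeness around a chosen center; both `center` and `sigma` would be fitted per category in a real fuzzy system.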
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
01_title.pdf | Attached File | 112.88 kB | Adobe PDF | View/Open |
02_certificate.pdf | | 250.67 kB | Adobe PDF | View/Open |
03_acknowledgement.pdf | | 9.76 kB | Adobe PDF | View/Open |
04_contents.pdf | | 20.11 kB | Adobe PDF | View/Open |
05_list of tables, figures and abbreivations.pdf | | 13.89 kB | Adobe PDF | View/Open |
06_chapter 1.pdf | | 879.36 kB | Adobe PDF | View/Open |
07_chapter 2.pdf | | 375.95 kB | Adobe PDF | View/Open |
08_chapter 3.pdf | | 998.65 kB | Adobe PDF | View/Open |
09_chapter 4.pdf | | 733.46 kB | Adobe PDF | View/Open |
10_chapter 5.pdf | | 728.52 kB | Adobe PDF | View/Open |
11_chapter 6.pdf | | 810.41 kB | Adobe PDF | View/Open |
12_chapter 7.pdf | | 505.4 kB | Adobe PDF | View/Open |
13_chapter 8.pdf | | 223.53 kB | Adobe PDF | View/Open |
14_references.pdf | | 358.63 kB | Adobe PDF | View/Open |
80_recommendation.pdf | | 115.53 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).