Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/366162
Title: Study on Exploitation of Optimization and Topic Modeling Approaches for Web News Mining
Researcher: Nithya D
Guide(s): Sivakumari S
Keywords: Engineering and Technology
Computer Science
Computer Science Information Systems
University: Avinashilingam Institute for Home Science and Higher Education for Women
Completed Date: 2022
Abstract: Online news is news that is posted and disseminated online and is publicly available through web services. Various techniques are used to analyze and extract information from online news sources. However, because the information in online news is continuously changing and growing, an efficient technique is needed to extract the most valuable information from web news.
In the first phase of this research work, an EFS (Evolving Fuzzy System) is used to categorize the news articles. PeSOA (Penguin Search Optimization Algorithm) is used to select the optimal threshold for pruning irrelevant terms. A bell-shaped fuzzy GMF (Gaussian Membership Function) is used to describe the closeness between terms and a news category. In the second phase, WNM (Web News Mining) based on EFS-PeSOA is enhanced with related TwD (Twitter Data). The LDA model is used to find the main topics of the TwD. The pruned TF-IDF (Term Frequency-Inverse Document Frequency) values of the TwD terms are analyzed by STGM (Spatio-Temporal Generalized Additive Model), which considers the place and time at which users posted the tweets. RLAR (Randomized Least Angle Regression) is used to estimate the more relevant TF-IDF values for generating fuzzy rules.
LDA-based ToM operates on a collection of training articles and requires approximate inference techniques, which have high computational complexity. To solve this problem, Sparse-LDA with distributed learning is proposed: multiple processors each execute Sparse-LDA separately on their local TwD and share the results. After the ToM, FR is formed from the TF-IDF values of important terms from WNA and tweets to classify the WNA. Separable topic discovery is introduced in the last phase of the research. ToS separates the topics so that each distinct topic has its own unique term that is not found in any other topic.
The key intention of this research work is to categorize the WNA efficiently by successively enhancing the WNM techniques.
The dataset is built from the NYT news reports that have been organized into seven distinct categories.
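As a rough illustration of the first phase described in the abstract, the sketch below computes TF-IDF weights for tokenized articles, prunes terms whose weight falls below a threshold, and scores a remaining weight with a Gaussian membership function. It is a minimal sketch under stated assumptions, not the thesis implementation: the function names, the toy articles, the hand-picked threshold (which the thesis selects with PeSOA), and the `center` and `sigma` values are all hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized, non-empty document.

    Uses the plain tf * log(N / df) form; the thesis may use another variant.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


def prune(term_weights, threshold):
    """Keep only terms whose TF-IDF weight reaches the threshold."""
    return {t: w for t, w in term_weights.items() if w >= threshold}


def gaussian_membership(x, center, sigma):
    """Bell-shaped membership degree in (0, 1], equal to 1 at the center."""
    return math.exp(-((x - center) ** 2) / (2 * sigma ** 2))


# Toy data (assumption): two tiny "articles" that share the term "result".
docs = [["election", "vote", "result"], ["match", "goal", "result"]]
weights = tf_idf(docs)

# Hand-picked threshold (assumption); the thesis optimizes it with PeSOA.
kept = prune(weights[0], threshold=0.1)

# Hypothetical category parameters used only for this illustration.
degree = gaussian_membership(kept["election"], center=0.25, sigma=0.1)
print(sorted(kept), round(degree, 3))
```

In this toy run the shared term "result" gets zero weight and is pruned, while the terms unique to the first article survive and receive a membership degree close to 1 because their weight lies near the assumed category center.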
Pagination: 120 p.
URI: http://hdl.handle.net/10603/366162
Appears in Departments: Department of Computer Science and Engineering

Files in This Item:
File                                              Description    Size       Format
01_title.pdf                                      Attached File  112.88 kB  Adobe PDF
02_certificate.pdf                                               250.67 kB  Adobe PDF
03_acknowledgement.pdf                                           9.76 kB    Adobe PDF
04_contents.pdf                                                  20.11 kB   Adobe PDF
05_list of tables, figures and abbreivations.pdf                 13.89 kB   Adobe PDF
06_chapter 1.pdf                                                 879.36 kB  Adobe PDF
07_chapter 2.pdf                                                 375.95 kB  Adobe PDF
08_chapter 3.pdf                                                 998.65 kB  Adobe PDF
09_chapter 4.pdf                                                 733.46 kB  Adobe PDF
10_chapter 5.pdf                                                 728.52 kB  Adobe PDF
11_chapter 6.pdf                                                 810.41 kB  Adobe PDF
12_chapter 7.pdf                                                 505.4 kB   Adobe PDF
13_chapter 8.pdf                                                 223.53 kB  Adobe PDF
14_references.pdf                                                358.63 kB  Adobe PDF
80_recommendation.pdf                                            115.53 kB  Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
