Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/537731
Title: Automatic Identification of Email Document using Text Mining
Researcher: Shroff, Namrata
Guide(s): Shingala, Amisha
Keywords: Engineering
Engineering and Technology
Engineering Industrial
University: Gujarat Technological University
Completed Date: 2022
Abstract: The research work presents a mechanism for the automatic classification of email documents. The work combines Natural Language Processing and Computational Linguistics, as the research problem demands. The primary focus is on email (document) topic distribution and keyword topic distribution.

The hidden knowledge in the email corpus is collected and used to retrieve the topics/labels for the emails. As the number of emails in an inbox grows, proper management and organization become difficult, and important emails can remain unattended. This research work tries to generate a label in front of the subject of the email, create a folder in the Gmail inbox, and store the email in that folder. Labels and folders are created for important emails. Emails from different domains (e.g. doctors' emails, advocates' emails, medical representatives' emails, teaching professionals' emails, etc.) were studied. After the rules found from the different sources were collected, identified, and manually validated through manual calculations for each rule, the knowledge corpus was created to make it usable for research purposes.

Further, the construction rules for the knowledge corpus are modeled in a rule-based manner, through which the detection and identification of labels for emails take place. Stop-word filtering is also incorporated. In addition, nouns and verbs are detected from the subject and body of each email through NV-LDA (Noun Verb Latent Dirichlet Allocation) to understand the keywords better. The automatically generated computational-linguistics metadata, which includes noun and verb details from the subject and body of the email corpus among other items, is matched with the knowledge corpus, and the labels for the emails are predicted.
This research work also contributes to the knowledge corpus creation of label prediction for different users from a different doma
Pagination: 4297 KB
URI: http://hdl.handle.net/10603/537731
Appears in Departments: Computer/IT Engineering

Files in This Item:
File                    Description    Size       Format
01_title.pdf            Attached File  181.42 kB  Adobe PDF
03_abstract.pdf                        132.68 kB  Adobe PDF
06_contents.pdf                        584.77 kB  Adobe PDF
10_chapter1.pdf                        431.9 kB   Adobe PDF
11_chapter2.pdf                        446.01 kB  Adobe PDF
12_chapter3.pdf                        813.35 kB  Adobe PDF
13_chapter4.pdf                        1.08 MB    Adobe PDF
14_chapter5.pdf                        186.42 kB  Adobe PDF
15_conclusion.pdf                      137.72 kB  Adobe PDF
17_biblography.pdf                     176.8 kB   Adobe PDF
80_recommendation.pdf                  756.88 kB  Adobe PDF
prelim pages.pdf                       316.29 kB  Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
