Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/368806
Title: | Efficient Unsupervised Learning Technique Based Automatic Text Categorization |
Researcher: | Jain, Deepti |
Guide(s): | Jain, R.C. and Verma, Bhupendra |
Keywords: | Computer Science; Computer Science Information Systems; Engineering and Technology |
University: | Rajiv Gandhi Proudyogiki Vishwavidyalaya |
Completed Date: | 2013 |
Abstract: | Automatic text categorization can play an important role in a wide variety of more flexible, dynamic and personalized information management tasks, such as real-time sorting of email or files into folder hierarchies; topic identification to support topic-specific processing operations; structured search and/or browsing; or finding documents that match long-term standing interests or more dynamic task-based interests. In many contexts, textual information is an important medium of communication on the World Wide Web, and trained professionals are employed to categorize new knowledge. This process is very time consuming and costly, which limits its applicability. Consequently, there is increased interest in developing technologies for automatic text categorization. The main focus of this research work is to study the problem of automatic text categorization and to develop an efficient text categorization mechanism based on unsupervised learning. In this thesis, an attempt is made to overcome the challenges of the various classifiers in terms of learning speed, real-time classification speed, and accuracy. Three new algorithms are implemented, and the results are analyzed to assess the performance of these algorithms using two different types of datasets, DS0 and DS1 (20-Newsgroups, and Reuters-21578 WebPages). The proposed algorithms are evaluated on different combinations of classifiers (Naïve Bayes and J48) and datasets (DS0 and DS1). The first algorithm describes a novel unsupervised learning based approach which uses frequent item (term) sets for text clustering to drastically reduce the dimensionality of the data. Throughout the performance analysis, it provides better classification accuracy than direct classification. |
Pagination: | 11.6MB |
URI: | http://hdl.handle.net/10603/368806 |
Appears in Departments: | Computer Science Engineering |
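The abstract mentions using frequent item (term) sets to reduce the dimensionality of text data before classification. The record does not give the algorithm itself, so the following is only a minimal sketch of the general idea, using a plain Apriori-style search rather than the thesis's method. The function names (`frequent_term_sets`, `project`), the toy documents, and the `min_support` and `max_size` parameters are all assumptions made for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_term_sets(docs, min_support, max_size=2):
    """Map each frequent term set to the number of documents containing it.

    Illustrative Apriori-style search; not the algorithm from the thesis.
    """
    doc_sets = [set(d.lower().split()) for d in docs]

    # Level 1: count the documents in which each single term occurs.
    counts = Counter(term for words in doc_sets for term in words)
    frequent = {frozenset([t]): c for t, c in counts.items() if c >= min_support}
    current_terms = {t for fs in frequent for t in fs}

    # Higher levels: only terms that survived the previous level can
    # appear in a larger frequent set (the Apriori pruning property).
    for size in range(2, max_size + 1):
        level = Counter()
        for words in doc_sets:
            kept = sorted(words & current_terms)
            for combo in combinations(kept, size):
                level[frozenset(combo)] += 1
        new = {fs: c for fs, c in level.items() if c >= min_support}
        if not new:
            break
        frequent.update(new)
        current_terms = {t for fs in new for t in fs}
    return frequent


def project(docs, term_sets):
    """Represent each document as a binary vector over the frequent term sets."""
    ordered = sorted(term_sets, key=lambda fs: (len(fs), sorted(fs)))
    vectors = []
    for d in docs:
        words = set(d.lower().split())
        vectors.append([1 if fs <= words else 0 for fs in ordered])
    return ordered, vectors


# Hypothetical toy corpus, not data from the thesis.
docs = [
    "stock market prices fall",
    "stock market rally continues",
    "team wins football match",
    "football team loses match",
]
term_sets = frequent_term_sets(docs, min_support=2)
features, vectors = project(docs, term_sets)
```

In this sketch, terms that occur in fewer than `min_support` documents are dropped, and the remaining frequent term sets become the features. The resulting vectors could then be passed to a clusterer or to a classifier such as Naïve Bayes; how the thesis actually combines these steps is not stated in this record.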
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
01_title.pdf | Attached File | 261.71 kB | Adobe PDF | View/Open |
02_declarations.pdf | | 277.3 kB | Adobe PDF | View/Open |
03_certificate.pdf | | 190.52 kB | Adobe PDF | View/Open |
04 _acknowledgement.pdf | | 1.05 MB | Adobe PDF | View/Open |
05 _ content.pdf | | 1.1 MB | Adobe PDF | View/Open |
06 _list of tables.pdf | | 258.88 kB | Adobe PDF | View/Open |
07 _ chapter 1.pdf | | 1.07 MB | Adobe PDF | View/Open |
08 _chapter 2.pdf | | 877.82 kB | Adobe PDF | View/Open |
09 _ chapter 3.pdf | | 382.76 kB | Adobe PDF | View/Open |
10 _ a chapter 5.pdf | | 1.2 MB | Adobe PDF | View/Open |
10 _ b chapter 6.pdf | | 1.26 MB | Adobe PDF | View/Open |
10 _ c chapter 7.pdf | | 163.03 kB | Adobe PDF | View/Open |
10 _ chapter 4.pdf | | 1.29 MB | Adobe PDF | View/Open |
11_ bibliography.pdf | | 851.95 kB | Adobe PDF | View/Open |
11_ list of publicatins.pdf | | 279.85 kB | Adobe PDF | View/Open |
80_recommendation.pdf | | 468.43 kB | Adobe PDF | View/Open |
_abstract.pdf | | 468.43 kB | Adobe PDF | View/Open |
_ list of abbreviations.pdf | | 254.71 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).