Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/340821
Title: An improved framework for authorship identification in online messages using clustering techniques and metaheuristic algorithms
Researcher: Srinivasan, l
Guide(s): Nalini, C
Keywords: Engineering and Technology
Computer Science
Computer Science Information Systems
Authorship identification
Clustering
University: Anna University
Completed Date: 2020
Abstract: Online messages, as a major channel of web communication, are important sources for identity tracing in cyberspace. Although authorship identification methods have succeeded in many literary and forensic applications, very few investigations have focused specifically on online messages. This work develops a framework for identifying the authorship of online messages in order to address this identity-tracing problem. In this framework, four types of writing-style features (lexical, syntactic, structural and n-gram features) are extracted, and inductive learning algorithms are used to build feature-based classification models that identify the authors of online messages. C4.5, Random Tree, Fuzzy and AdaBoost classifiers are used, and experiments within the framework evaluate the effect of these classification techniques on online messages. Feature selection selects relevant features according to a specific measure; its purpose is to simplify training and reduce training time. In this work, a feature selection algorithm based on Artificial Bee Colony (ABC) is proposed. In the proposed approach, the ABC algorithm optimizes the feature selection process and yields a feature subset that increases the predictive accuracy of the classifier. ABC acts as the feature selector and generates candidate feature subsets, and a classifier evaluates each feature subset produced by the onlookers; hence the proposed system is a wrapper-based system. Finding an optimal feature subset is usually intractable, and many problems related to feature selection have been shown to be NP-hard (non-deterministic polynomial-time hard). Good search algorithms are therefore required to cope with the intractability of the feature selection problem.
In this work, an optimized AdaBoost classifier using the ABC algorithm is also proposed. Because the AdaBoost approach produces a large number of weak classifiers, ABC has the potential to automatically select a good set of weak classifiers for AdaBoost and improve the algorithm's performance.
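The wrapper idea described in the abstract (a search procedure proposes feature subsets and a classifier scores each one) can be illustrated with a minimal sketch. It is not the thesis implementation: the synthetic data, the nearest-centroid classifier standing in for C4.5 or AdaBoost, the bit-flip neighbourhood, and the names `make_data`, `fitness` and `abc_select` are all assumptions made here for illustration.

```python
import random

def make_data(rng, n_per_class=20, n_informative=2, n_noise=6):
    """Hypothetical toy data: the first columns separate two classes, the rest are noise."""
    rows, labels = [], []
    for label in (0, 1):
        for _ in range(n_per_class):
            informative = [rng.gauss(2.0 * label, 1.0) for _ in range(n_informative)]
            noise = [rng.gauss(0.0, 3.0) for _ in range(n_noise)]
            rows.append(informative + noise)
            labels.append(label)
    return rows, labels

def fitness(mask, rows, labels):
    """Wrapper evaluation: holdout accuracy of a nearest-centroid classifier
    (a stand-in classifier) restricted to the columns selected by `mask`."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    train = range(0, len(rows), 2)   # even rows build the centroids
    test = range(1, len(rows), 2)    # odd rows are held out for scoring
    centroids = {}
    for c in sorted(set(labels)):
        members = [rows[i] for i in train if labels[i] == c]
        centroids[c] = [sum(m[j] for m in members) / len(members) for j in cols]
    correct = 0
    for i in test:
        dist = {c: sum((rows[i][j] - mu) ** 2 for j, mu in zip(cols, cen))
                for c, cen in centroids.items()}
        correct += min(dist, key=dist.get) == labels[i]
    return correct / len(test)

def abc_select(rows, labels, n_sources=6, cycles=20, limit=5, seed=0):
    """Simplified binary ABC: each food source is a 0/1 feature mask."""
    rng = random.Random(seed)
    n = len(rows[0])
    # The first source keeps every feature, so the result is never worse than that baseline.
    sources = [[1] * n] + [[rng.randint(0, 1) for _ in range(n)]
                           for _ in range(n_sources - 1)]
    scores = [fitness(m, rows, labels) for m in sources]
    trials = [0] * n_sources
    top = max(range(n_sources), key=lambda i: scores[i])
    best_score, best_mask = scores[top], list(sources[top])

    def update_best(i):
        nonlocal best_score, best_mask
        if scores[i] > best_score:
            best_score, best_mask = scores[i], list(sources[i])

    def try_neighbour(i):
        # Flip one randomly chosen bit and keep the candidate only if it scores higher.
        cand = list(sources[i])
        j = rng.randrange(n)
        cand[j] = 1 - cand[j]
        s = fitness(cand, rows, labels)
        if s > scores[i]:
            sources[i], scores[i], trials[i] = cand, s, 0
        else:
            trials[i] += 1
        update_best(i)

    for _ in range(cycles):
        for i in range(n_sources):            # employed bees: one neighbour per source
            try_neighbour(i)
        for _ in range(n_sources):            # onlookers: favour higher-scoring sources
            weights = [s + 1e-9 for s in scores]
            i = rng.choices(range(n_sources), weights=weights)[0]
            try_neighbour(i)
        for i in range(n_sources):            # scouts: replace sources that stopped improving
            if trials[i] > limit:
                sources[i] = [rng.randint(0, 1) for _ in range(n)]
                scores[i] = fitness(sources[i], rows, labels)
                trials[i] = 0
                update_best(i)
    return best_mask, best_score

rows, labels = make_data(random.Random(1))
mask, score = abc_select(rows, labels)
print("selected feature mask:", mask)
print("holdout accuracy:", score)
```

Because every candidate subset is scored by training and testing a classifier, the cost of the search is dominated by the number of fitness evaluations, which is why the population size, cycle count and abandonment limit are kept small in this sketch.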
Pagination: xii,122 p.
URI: http://hdl.handle.net/10603/340821
Appears in Departments: Faculty of Information and Communication Engineering

Files in This Item:
File                         Description    Size       Format
01_title.pdf                 Attached File  25.09 kB   Adobe PDF
02_certificates.pdf                         131.03 kB  Adobe PDF
03_vivaproceedings.pdf                      862.49 kB  Adobe PDF
04_bonafidecertificate.pdf                  104 kB     Adobe PDF
05_abstracts.pdf                            7.09 kB    Adobe PDF
06_acknowledgements.pdf                     170.45 kB  Adobe PDF
07_contents.pdf                             97.3 kB    Adobe PDF
08_listoftables.pdf                         6.2 kB     Adobe PDF
09_listoffigures.pdf                        6.54 kB    Adobe PDF
10_listofabbreviations.pdf                  6.33 kB    Adobe PDF
11_chapter1.pdf                             160.54 kB  Adobe PDF
12_chapter2.pdf                             142.08 kB  Adobe PDF
13_chapter3.pdf                             219.46 kB  Adobe PDF
14_chapter4.pdf                             161.48 kB  Adobe PDF
15_chapter5.pdf                             106.01 kB  Adobe PDF
16_conclusion.pdf                           72.02 kB   Adobe PDF
17_references.pdf                           116.58 kB  Adobe PDF
18_listofpublications.pdf                   68.19 kB   Adobe PDF
80_recommendation.pdf                       80.85 kB   Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
