Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/254348
Full metadata record
DC Field: Value
dc.coverage.spatial: Improved Weightage Approaches for Data Dependency and Classification in Web Documents
dc.date.accessioned: 2019-08-22T11:36:10Z
dc.date.available: 2019-08-22T11:36:10Z
dc.identifier.uri: http://hdl.handle.net/10603/254348
dc.description.abstract: In recent years, the World Wide Web (WWW) has become the largest and most important source of all kinds of information. Web mining is an application of data mining that is widely used in fields such as science, business, medicine, engineering, and banking. Hence, many researchers are developing algorithms and techniques to address different issues in web mining. However, extracting optimal and valuable information from the web remains a challenging task. Term weighting plays an important role in retrieving documents from the web, so several term weighting methods have been proposed that assign weights to documents based on the occurrences of terms. Term Frequency-Inverse Document Frequency (TF-IDF) is the most frequently used term weighting method in the field of Information Retrieval. The main objective of this thesis is to implement an efficient and effective term weighting method, based on the classical TF-IDF method, for text classification. To improve the retrieval accuracy of web documents, this thesis investigates two term weighting methods: Improved TF-IDF (ImpTF-IDF) and Co-Term Frequency (CTF). ImpTF-IDF is a single-term-based term weighting method that extends the classical TF formula. The idea behind it is that the average frequency of a term in a collection of documents is considered when assigning a weight to that term. The ratio of the length of the document to the total number of distinct terms in the corpus is used for normalization.
dc.format.extent: xxi, 187p.
dc.language: English
dc.relation: p.176-186
dc.rights: university
dc.title: Improved weightage approaches for data dependency and classification in web documents
dc.title.alternative:
dc.creator.researcher: Santhanakumar M
dc.subject.keyword: Data Dependency
dc.subject.keyword: Engineering and Technology, Computer Science, Computer Science Interdisciplinary Applications
dc.subject.keyword: Web Documents
dc.subject.keyword: Weightage Approaches
dc.description.note:
dc.contributor.guide: Christopher Columbus C
dc.publisher.place: Chennai
dc.publisher.university: Anna University
dc.publisher.institution: Faculty of Information and Communication Engineering
dc.date.registered: n.d.
dc.date.completed: 2018
dc.date.awarded: 31/11/2018
dc.format.dimensions: 21 cm
dc.format.accompanyingmaterial: None
dc.source.university: University
dc.type.degree: Ph.D.
Appears in Departments: Faculty of Information and Communication Engineering
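
The abstract names classical TF-IDF as the baseline that the thesis builds on. As a rough illustration of that baseline only, the sketch below computes TF-IDF weights with the common textbook definition, tf(t, d) = count of t in d divided by the length of d, and idf(t) = log(N / df(t)). The function name `tf_idf` and the toy corpus are assumptions made for this example. The abstract does not give the ImpTF-IDF or CTF formulas, so neither is reproduced here.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Classical TF-IDF weights for a list of tokenized documents.

    tf(t, d) = occurrences of t in d / number of tokens in d
    idf(t)   = log(N / df(t)), where N is the number of documents and
               df(t) is the number of documents that contain t
    """
    n_docs = len(docs)
    # Document frequency: count each term once per document it appears in.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return weights


# Toy corpus, illustrative only (not taken from the thesis).
corpus = [
    "web mining extracts information from web documents".split(),
    "term weighting helps text classification".split(),
    "web documents need term weighting".split(),
]
weights = tf_idf(corpus)
```

With this definition, a term that occurs in every document receives a weight of zero, because log(N / df) is zero when df equals N. Other TF-IDF variants (smoothed IDF, raw counts, log-scaled TF) are also in common use and would give different numbers.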

Files in This Item:
File (Description), Size, Format
01_title.pdf (Attached File), 24.75 kB, Adobe PDF
02_certificates.pdf, 476.27 kB, Adobe PDF
03_abstract.pdf, 6.81 kB, Adobe PDF
04_acknowledgement.pdf, 5.34 kB, Adobe PDF
05_table of contents.pdf, 107.72 kB, Adobe PDF
06_list_of_symbols and abbreviations.pdf, 257.59 kB, Adobe PDF
07_chapter1.pdf, 278.93 kB, Adobe PDF
08_chapter2.pdf, 199.64 kB, Adobe PDF
09_chapter3.pdf, 415.44 kB, Adobe PDF
10_chapter4.pdf, 516.76 kB, Adobe PDF
11_chapter5.pdf, 549.3 kB, Adobe PDF
12_chapter6.pdf, 659 kB, Adobe PDF
13_chapter7.pdf, 325.68 kB, Adobe PDF
14_conclusion.pdf, 438.13 kB, Adobe PDF
15_references.pdf, 249.06 kB, Adobe PDF
16_list_of_publications.pdf, 127.32 kB, Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
