Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/2385
Title: A hybrid approach to web page categorization
Researcher: Swathi, B V
Guide(s): Govardhan, A
Keywords: Web Page Retrieval, Document Clustering, Meta Search Engines, Soft Computing Methods, Genetic Algorithms, Hyperlink Structure, Jaccard Index
Upload Date: 25-Aug-2011
University: Jawaharlal Nehru Technological University
Completed Date: August 2010
Abstract: In the recent past, the World Wide Web has witnessed explosive growth, and search engines are the most popular way of finding information on it. In most cases, the user is flooded with thousands of web pages in response to a search query, and many users hardly go past the first few. It is debatable how useful or meaningful it is for a search engine to return so many web pages in response to a user query. In spite of the sophisticated page ranking algorithms employed by search engines, the pages the user actually needs may get lost in the huge amount of information returned. Since most users of the web are not experts, grouping web pages into meaningful categories helps them navigate quickly by reducing the search space. Web page classification and clustering are two tasks that have traditionally been carried out by human domain experts. But in this electronic age, with the explosion in the amount of information available on the net, it is becoming increasingly difficult for human experts to classify or cluster all the documents available on the World Wide Web. Hence, it is increasingly evident that automatic techniques should be used instead of human experts to carry out web document classification and clustering as part of categorizing them. Web page categorization is the main focus of this thesis. It is strongly believed that the experience of a person using a web search engine is greatly enhanced when the search results are organized into categories rather than displayed as a flat list.
Pagination: xi, 153p.
URI: http://hdl.handle.net/10603/2385
Appears in Departments: Faculty of Computer Science & Engineering

Files in This Item:
File                      Description    Size       Format
01_title.pdf              Attached File  1.06 MB    Adobe PDF
02_certificate.pdf                       214.73 kB  Adobe PDF
03_acknowledgements.pdf                  217.18 kB  Adobe PDF
04_abstract.pdf                          687.04 kB  Adobe PDF
05_table of contents.pdf                 686.04 kB  Adobe PDF
06_list of figures.pdf                   683.88 kB  Adobe PDF
07_list of tables.pdf                    681.93 kB  Adobe PDF
08_abbreviations.pdf                     680.7 kB   Adobe PDF
09_chapter 1.pdf                         951.07 kB  Adobe PDF
10_chapter 2.pdf                         827.13 kB  Adobe PDF
11_chapter 3.pdf                         983.58 kB  Adobe PDF
12_chapter 4.pdf                         868.68 kB  Adobe PDF
13_chapter 5.pdf                         2.09 MB    Adobe PDF
14_chapter 6.pdf                         509.37 kB  Adobe PDF
15_chapter 7.pdf                         795.55 kB  Adobe PDF
16_chapter 8.pdf                         785.69 kB  Adobe PDF
17_references.pdf                        1.38 MB    Adobe PDF
Items in Shodhganga are protected by copyright, with all rights reserved, unless otherwise indicated.