Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/133097
Title: | ENHANCING THE PERFORMANCE OF AN ONTOLOGY BASED INFORMATION RETRIEVAL SYSTEM USING A HYBRID GENETIC ALGORITHM |
Researcher: | Vanjulavalli, S |
Guide(s): | Kovalan, Dr. A |
Keywords: | REPtree, BFtree, J48, CART |
University: | Periyar Maniammai University |
Completed Date: | 28/07/2014 |
Abstract: | Information Retrieval (IR) has attracted increasing attention due to the growing availability of documents. IR determines the relevant documents in a collection based on a query from the user. Retrieving web pages is more challenging because of the ambiguous nature of the unstructured information found in these pages. Ontologies help to overcome the ambiguity of natural language through the use of standard terms that relate to specific concepts. An ontology is a hierarchy of concepts with attributes and relations that defines an agreed terminology to describe semantic networks of interrelated information units. It provides a vocabulary of classes and properties to describe a domain, emphasizing the sharing of knowledge and the consensus about its representation.
One of the key challenges in Web-based information retrieval is the ambiguity of words: the same word can convey different meanings. This challenge can be addressed using semantic interpretation and ontology-based systems. However, the Web corpus is extremely large, and many of its attributes do not contribute to the information retrieval process. Poor features decrease precision and recall. Features can be selected statistically using techniques such as Information Gain (IG), Mutual Information (MI), or Singular Value Decomposition (SVD). However, feature selection is NP-hard. This work investigates techniques for feature selection and soft-computing-based classifiers for classification.
The work proposed and carried out in this thesis can be broadly classified into: (1) investigation of various Bagging-based classifiers for web page classification; (2) ontology-based feature extraction; (3) a novel feature selection algorithm using a Genetic Algorithm (GA); and (4) an improved Neural Network.
Keywords and ontology-based features are classified through bagging with various decision trees: REPtree, BFtree, J48, and CART. Experiments show that the new feature extraction improves precision and recall satisfactorily.
For effective classification, the extracted features should give valuable information about the categories and should be inexpensive to compute. In the proposed feature extraction, the features are extracted based on the ontology, and feature selection is achieved by GA. A concept-based tree structure is built on a generalization/specialization relationship to conceptualize the domain. Browsing knowledge is easier if the conceptual architecture of the knowledge base is identified as a whole and information is accessible through intra-conceptual hierarchical links during browsing. The experimental results demonstrate that the proposed feature extraction improves precision and recall satisfactorily. The Hybrid GA-based feature selection improves classification accuracy by 0.27% to 1.7% over GA-based feature selection.
Using the features selected by GA, a Multilayer Perceptron Neural Network (MLPNN) is trained to classify web pages. It is proposed to use GA for training the MLPNN. The GA tends to get trapped in local minima. To overcome this problem, Hill Climbing is used as a local search in the hybrid algorithm.
Numerical results show that the hybrid classifier, a multilayer Neural Network (NN) combined with GA-based selection of IDF and ontology-based features, gave 94% accuracy, high precision and recall, and the lowest Root Mean Square Error (RMSE) among the methods compared. |
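The hybrid idea described in the abstract (a GA whose candidates are refined by Hill Climbing as a local search) can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the feature count, the `INFORMATIVE` index set, the toy `fitness` function (a stand-in for classifier accuracy), and all parameter values are assumptions made purely for illustration.

```python
import random

# Hypothetical toy problem: 20 candidate features, 5 of them "informative".
N_FEATURES = 20
INFORMATIVE = {1, 4, 7, 12, 15}


def fitness(mask):
    """Toy stand-in for classifier accuracy (an assumption, not the thesis's
    measure): +1.0 per informative feature kept, -0.2 per other feature kept."""
    score = 0.0
    for i, bit in enumerate(mask):
        if bit:
            score += 1.0 if i in INFORMATIVE else -0.2
    return score


def hill_climb(mask):
    """Local search: keep flipping single bits while a flip improves fitness."""
    best, best_fit = list(mask), fitness(mask)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            cand = best[:]
            cand[i] = 1 - cand[i]
            cand_fit = fitness(cand)
            if cand_fit > best_fit:
                best, best_fit = cand, cand_fit
                improved = True
    return best


def hybrid_ga(pop_size=20, generations=30, mutation_rate=0.05, seed=0):
    """GA over feature bitmasks; the best individual of each generation is
    refined by hill climbing before it is used as a parent."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(N_FEATURES)]
           for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        pop[0] = hill_climb(pop[0])          # local search on the elite
        parents = pop[: pop_size // 2]       # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, N_FEATURES)   # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]             # bit-flip mutation
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)


best = hybrid_ga()
selected = [i for i, bit in enumerate(best) if bit]
print(selected, fitness(best))
```

In a realistic setting the toy `fitness` would be replaced by a cross-validated accuracy of the chosen classifier on the selected features, which makes each evaluation far more expensive, so the local search would typically be applied more sparingly than in this sketch.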
Pagination: | |
URI: | http://hdl.handle.net/10603/133097 |
Appears in Departments: | Department of Computer Science and Applications |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
01_title.pdf | Attached File | 124.48 kB | Adobe PDF | View/Open |
02_certificate.pdf | | 953.67 kB | Adobe PDF | View/Open |
03_declaration.pdf | | 953.67 kB | Adobe PDF | View/Open |
04_acknowledgement.pdf | | 60.18 kB | Adobe PDF | View/Open |
05_abstract.pdf | | 44.77 kB | Adobe PDF | View/Open |
06_table.pdf | | 35 kB | Adobe PDF | View/Open |
07_figure.pdf | | 37.13 kB | Adobe PDF | View/Open |
08_abbreviation.pdf | | 60.33 kB | Adobe PDF | View/Open |
09_contents.pdf | | 39.78 kB | Adobe PDF | View/Open |
10_chapter1.pdf | | 166.58 kB | Adobe PDF | View/Open |
11_chapter2.pdf | | 231.14 kB | Adobe PDF | View/Open |
12_chapter3.pdf | | 532.74 kB | Adobe PDF | View/Open |
13_chapter4.pdf | | 450.74 kB | Adobe PDF | View/Open |
14_chapter5.pdf | | 410.67 kB | Adobe PDF | View/Open |
15_chapter6.pdf | | 89.6 kB | Adobe PDF | View/Open |
16_reference.pdf | | 170.56 kB | Adobe PDF | View/Open |
17_appendix.pdf | | 97.44 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).