Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/355111
Title: Named Entity Recognition for Kannada Using Conditional Random Fields
Researcher: PALLAVI K, P
Guide(s): Ramya, M M
Keywords: Computer Science
Computer Science Information Systems
Engineering and Technology
University: Hindustan University
Completed Date: 2019
Abstract: Natural Language Processing (NLP) has attracted researchers in recent years because of the emergence of automatic text processing technology. NLP is a challenging task in Indian languages due to their morphological richness, ambiguities and free word order. Very little research has been done on Indian regional languages, but it is growing with the increasing use of mobile applications. These applications are used mainly by ordinary people, in their own native official language, for tasks ranging from booking a cab to drafting an official letter. Data is processed instantly, based on real-time scenarios. A generic corpus often looks like a good choice, especially for language translation, sentence suggestion, opinion mining, sentiment analysis and product analysis. Generic data collected from Wikipedia, newswires and Twitter is always noisy and contains additional data such as images, URLs and special symbols. This makes the entity recognition task more difficult. Tweets consist of at most 140 characters, which is shorter than an article abstract or summary. The meaning of a sentence is not clear due to the lack of grammatical syntax and the small number of words used. In some cases, the sentences are so shortened that entity recognition itself becomes a challenging task, even though entities are the fundamental elements used in a sentence to understand its subject. The appearance of entities in Wikipedia and newswire articles also causes some syntax variations in the sentence, resulting in limited information on language grammar. Hence, the corpus must undergo pre-processing in the first stage to build efficient entity recognition.
This pre-processing must include data cleaning, separation of words where one or more words are morphologically and orthographically combined, tokenization and annotation. Although pre-processors are available for languages like English, German and Chinese, they fail to perform on Indian languages due to the differences in syntax and semantics. Since, language independe
Pagination: 
URI: http://hdl.handle.net/10603/355111
Appears in Departments: Department of Computer Application

Files in This Item:
File                                           Description    Size       Format
10_chapter 3.pdf                               Attached File  5.42 MB    Adobe PDF
11_chapter 4.pdf                                              6.19 MB    Adobe PDF
12_chapter 5.pdf                                              8.1 MB     Adobe PDF
13_chapter 6.pdf                                              4.37 MB    Adobe PDF
14_chapter 7.pdf                                              966.14 kB  Adobe PDF
15_chapter 8.pdf                                              115.9 kB   Adobe PDF
16-reference.pdf                                              4.41 MB    Adobe PDF
1_title.pdf                                                   89.53 kB   Adobe PDF
2_certificates.pdf                                            243.84 kB  Adobe PDF
3_declaration.pdf                                             134.15 kB  Adobe PDF
4_acknowledgement.pdf                                         390.31 kB  Adobe PDF
5_table of contents.pdf                                       415.28 kB  Adobe PDF
6_abstract.pdf                                                921.05 kB  Adobe PDF
7_list of tables, figures & abbreviations.pdf                 693.42 kB  Adobe PDF
80_recommendation.pdf                                         1.12 MB    Adobe PDF
8_chapter 1.pdf                                               3.26 MB    Adobe PDF
9_chapter 2.pdf                                               5.09 MB    Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
