Named entity recognition for Kannada using conditional random fields

PALLAVI K, P

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/355111

Title:	Named entity recognition for Kannada using conditional random fields
Researcher:	PALLAVI K, P
Guide(s):	Ramya, M M
Keywords:	Computer Science; Computer Science Information Systems; Engineering and Technology
University:	Hindustan University
Completed Date:	2019
Abstract:	Natural Language Processing (NLP) has attracted researchers in recent years because of the emergence of automatic text processing technology. NLP is a challenging task in Indian languages due to their morphological richness, ambiguities and free word order. Very little research has been done in Indian regional languages, though it is growing with the increasing use of mobile applications. These applications are mainly used by common people, for tasks ranging from booking a cab to creating an official letter in their own native official language. Data is processed instantly in real-time scenarios. A generic corpus often looks like a good choice, especially for language translation, sentence suggestion, opinion mining, sentiment analysis and product analysis. Generic data collected from Wikipedia, newswires and Twitter is always noisy and contains additional data such as images, URLs and special symbols. This makes the entity recognition task more difficult. Tweets consist of at most 140 characters, which is shorter than an article abstract or summary. The meaning of a sentence is often unclear due to the lack of grammatical syntax and the small number of words. In some cases, sentences are so attenuated that entity recognition itself becomes a challenging task, even though entities are the fundamental elements used in a sentence to understand its subject. The appearance of entities in Wikipedia and newswire articles also causes some syntax variations in the sentence, resulting in limited information on language grammar. Hence, the corpus must undergo pre-processing in the first stage to build efficient entity recognition.
This pre-processing must include data cleaning, separation of words (as one or more words are morphologically and orthographically combined together), tokenization and annotation. Although pre-processors are available for languages like English, German and Chinese, they fail to perform in Indian languages due to the differences in syntax and semantics. Since, language independe
URI:	http://hdl.handle.net/10603/355111
Appears in Departments:	Department of Computer Application

Files in This Item:

File	Description	Size	Format
10_chapter 3.pdf	Attached File	5.42 MB	Adobe PDF
11_chapter 4.pdf		6.19 MB	Adobe PDF
12_chapter 5.pdf		8.1 MB	Adobe PDF
13_chapter 6.pdf		4.37 MB	Adobe PDF
14_chapter 7.pdf		966.14 kB	Adobe PDF
15_chapter 8.pdf		115.9 kB	Adobe PDF
16-reference.pdf		4.41 MB	Adobe PDF
1_title.pdf		89.53 kB	Adobe PDF
2_certificates.pdf		243.84 kB	Adobe PDF
3_declaration.pdf		134.15 kB	Adobe PDF
4_acknowledgement.pdf		390.31 kB	Adobe PDF
5_table of contents.pdf		415.28 kB	Adobe PDF
6_abstract.pdf		921.05 kB	Adobe PDF
7_list of tables, figures & abbreviations.pdf		693.42 kB	Adobe PDF
80_recommendation.pdf		1.12 MB	Adobe PDF
8_chapter 1.pdf		3.26 MB	Adobe PDF
9_chapter 2.pdf		5.09 MB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET