Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/355111
Title: | Named entity recognition for Kannada using conditional random fields |
Researcher: | PALLAVI K, P |
Guide(s): | Ramya, M M |
Keywords: | Computer Science; Computer Science Information Systems; Engineering and Technology |
University: | Hindustan University |
Completed Date: | 2019 |
Abstract: | Natural Language Processing (NLP) has attracted researchers in recent years because of the emergence of automatic text processing technology. NLP is a challenging task in Indian languages due to their morphological richness, ambiguities and free word order. Very little research has been done in Indian regional languages, but it is growing due to the increasing use of mobile applications. These applications are generally used by ordinary people for tasks ranging from booking a cab to creating an official letter in their own native official language. Data is processed instantly in real-time scenarios. A generic corpus often looks like a good choice, especially for language translation, sentence suggestion, opinion mining, sentiment analysis and product analysis. Generic data collected from Wikipedia, newswires and Twitter is always noisy and contains additional data such as images, URLs and special symbols. This makes the entity recognition task more difficult. A tweet contains at most 140 characters, which is shorter than an article abstract or summary. The meaning of the sentence is unclear due to the lack of grammatical syntax and the small number of words used. In some cases, sentences are so abbreviated that entity recognition itself becomes a challenging task, even though entities are the fundamental elements used to understand a sentence's subject. The appearance of entities in Wikipedia and newswire articles also causes some syntax variations in the sentence, resulting in limited information on language grammar. Hence, the corpus must undergo pre-processing in the first stage to build efficient entity recognition. This pre-processing must include data cleaning, separation of words (since one or more words are morphologically and orthographically combined together), tokenization and annotation. Although pre-processors are available for languages like English, German and Chinese, they fail to perform in Indian languages due to the difference in syntax and semantics. Since, language independe |
Pagination: | |
URI: | http://hdl.handle.net/10603/355111 |
Appears in Departments: | Department of Computer Application |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
10_chapter 3.pdf | Attached File | 5.42 MB | Adobe PDF | View/Open |
11_chapter 4.pdf | | 6.19 MB | Adobe PDF | View/Open |
12_chapter 5.pdf | | 8.1 MB | Adobe PDF | View/Open |
13_chapter 6.pdf | | 4.37 MB | Adobe PDF | View/Open |
14_chapter 7.pdf | | 966.14 kB | Adobe PDF | View/Open |
15_chapter 8.pdf | | 115.9 kB | Adobe PDF | View/Open |
16-reference.pdf | | 4.41 MB | Adobe PDF | View/Open |
1_title.pdf | | 89.53 kB | Adobe PDF | View/Open |
2_certificates.pdf | | 243.84 kB | Adobe PDF | View/Open |
3_declaration.pdf | | 134.15 kB | Adobe PDF | View/Open |
4_acknowledgement.pdf | | 390.31 kB | Adobe PDF | View/Open |
5_table of contents.pdf | | 415.28 kB | Adobe PDF | View/Open |
6_abstract.pdf | | 921.05 kB | Adobe PDF | View/Open |
7_list of tables, figures & abbreviations.pdf | | 693.42 kB | Adobe PDF | View/Open |
80_recommendation.pdf | | 1.12 MB | Adobe PDF | View/Open |
8_chapter 1.pdf | | 3.26 MB | Adobe PDF | View/Open |
9_chapter 2.pdf | | 5.09 MB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).