Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/360622
Full metadata record
DC Field | Value | Language |
---|---|---|
dc.coverage.spatial | ||
dc.date.accessioned | 2022-02-08T07:13:22Z | - |
dc.date.available | 2022-02-08T07:13:22Z | - |
dc.identifier.uri | http://hdl.handle.net/10603/360622 | - |
dc.description.abstract | The prime objective of the investigation presented in this thesis was to explore the semantic space in word vectors using neural word embedding. The non-existence of a clean, sentence-aligned parallel corpus for the English-Tamil language pair calls for a sufficiently large bilingual corpus for the implementation of various Natural Language Processing (NLP) applications such as machine translation, cross-lingual information retrieval and semantic comparison. Although word embedding has been in vogue in recent years, an adequate method for evaluating word embeddings still needs attention. Besides an in-depth discussion of the intrinsic and extrinsic evaluation of bilingual word embedding models, a data set was developed for the evaluation of English-Tamil bilingual word embedding algorithms. The data set was evaluated on a bilingual model; analysis of the experimental results offered insights into the semantics captured by word vectors and human cognition. However, bilingual embeddings typically capture common semantics and reject variations. Hence, transfer function-based generated embedding (TFGE), a deeply learned transfer function, was developed, where vectors from the embedding space of one language are projected onto that of the other language. Three well-regarded off-the-shelf embedding algorithms, Word2Vec, GloVe, and FastText, were used to train the TFGE model, from English, a resource-rich source language, to Tamil, a resource-deficient target language, in a data-efficient way. The efficacy of the proposed TFGE model was confirmed by a better synthesis of new vectors for unknown source language words. Pre-trained Word2Vec Hindi and Chinese embeddings were marshalled to appraise the deployable capability of the TFGE model across other target languages.
The versatility of the developed model was substantively demonstrated in selected NLP use cases: Text Summarization, Part-of-Speech (POS) Tagging, and Bilingual Dictionary Induction (BDI). In a nutshell, the following developments are the major ... | |
dc.format.extent | xxi, 162 | |
dc.language | English | |
dc.relation | ||
dc.rights | university | |
dc.title | Exploration of Semantic Space of Word Vectors Using Word Embedding | |
dc.title.alternative | ||
dc.creator.researcher | Sanjanasri J P | |
dc.subject.keyword | Center for Computational Engineering and Networking; Natural Language Processing; NLP; Neural Networks; semantic space; bilingual word; Word Embedding; Deep Learning; machine learning; Pruning; Indian languages | |
dc.subject.keyword | Computer Science; Interdisciplinary Applications; | |
dc.description.note | ||
dc.contributor.guide | Soman K P | |
dc.publisher.place | Coimbatore | |
dc.publisher.university | Amrita Vishwa Vidyapeetham University | |
dc.publisher.institution | Center for Computational Engineering and Networking (CEN) | |
dc.date.registered | 2014 | |
dc.date.completed | 2021 | |
dc.date.awarded | 2021 | |
dc.format.dimensions | ||
dc.format.accompanyingmaterial | None | |
dc.source.university | University | |
dc.type.degree | Ph.D. | |
Appears in Departments: | Center for Computational Engineering and Networking (CEN) |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
01_title.pdf | Attached File | 145.17 kB | Adobe PDF | View/Open |
02_certificate.pdf | | 194.19 kB | Adobe PDF | View/Open |
03_ preliminary pages.pdf | | 421.73 kB | Adobe PDF | View/Open |
04_chapter 1.pdf | | 157.8 kB | Adobe PDF | View/Open |
05_chapter 2.pdf | | 440.2 kB | Adobe PDF | View/Open |
06_chapter 3.pdf | | 381.74 kB | Adobe PDF | View/Open |
07_chapter 4.pdf | | 423.78 kB | Adobe PDF | View/Open |
08_chapter 5.pdf | | 1.13 MB | Adobe PDF | View/Open |
09_chapter 6.pdf | | 114.09 kB | Adobe PDF | View/Open |
10_bibliography.pdf | | 156.11 kB | Adobe PDF | View/Open |
11_publications.pdf | | 74.95 kB | Adobe PDF | View/Open |
80_recommendation.pdf | | 258.83 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).