Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/486732
Title: Machine learning based automatic text summarization system
Researcher: Gambhir, Mahak
Guide(s): Gupta, Vishal
Keywords: Attention mechanism
Bi-LSTM (Bidirectional Long Short-Term Memory)
CNN (Convolutional Neural Networks)
Contextualized Word Embeddings
Deep Learning
Extractive Text Summarization
Natural Language Processing
University: Panjab University
Completed Date: 2022
Abstract: In this research work, two deep-learning-based text summarization techniques, named WL-AttenSumm and AttSum-Hybrid, have been proposed. WL-AttenSumm, the first deep-learning-based summarization model proposed in this study, performs extractive summarization of single documents by learning the syntactic and semantic relationships in the text. It employs a word-level attention mechanism that focuses on the important parts of the input sequence so that relevant semantic features are captured at the word level. The model combines a CNN with a Bi-GRU, and experiments have been conducted with three different pre-trained word embedding models: GloVe, word2vec, and fastText.
The second deep-learning-based technique proposed as part of this study is a novel hybrid summarization system, AttSum-Hybrid, which takes into account the language context and relationships within the text while also capturing the structural information of sentences. In this hybrid framework, a contextual model based on a deep learning approach is combined with a statistical feature-based model. BERT serves as the feature extractor in the contextualized representation model, generating a vector representation for each word that depends on the context in which the word appears; a Convolutional Bi-LSTM network is then applied on top of these contextual representations. In parallel, a statistical feature representation framework incorporating several better-performing sentence-scoring features has been developed so that the structural aspects of the text are also captured while creating an extractive summary of the document. Experimental results demonstrate that AttSum-Hybrid performs better than WL-AttenSumm.
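For illustration, below is a minimal sketch (in PyTorch) of the kind of word-level-attention extractive scorer the abstract describes: pre-trained word embeddings feed a CNN, whose output feeds a Bi-GRU, and a word-level attention layer pools the hidden states into a sentence vector that is scored for inclusion in the summary. The class name, layer sizes, attention formulation, and sigmoid scoring head are illustrative assumptions, not the exact WL-AttenSumm configuration from the thesis.

# Minimal sketch of a word-level-attention sentence scorer in the spirit of
# WL-AttenSumm: embeddings -> CNN -> Bi-GRU -> word-level attention -> score.
# Hyperparameters and the scoring head are assumptions, not the thesis's setup.
import torch
import torch.nn as nn
import torch.nn.functional as F

class WordAttentionSummarizer(nn.Module):
    def __init__(self, vocab_size, emb_dim=300, conv_channels=128,
                 gru_hidden=128, kernel_size=3):
        super().__init__()
        # Embedding layer; in practice initialised from GloVe / word2vec / fastText.
        self.embedding = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        # 1-D convolution captures local n-gram (syntactic) patterns.
        self.conv = nn.Conv1d(emb_dim, conv_channels, kernel_size,
                              padding=kernel_size // 2)
        # Bi-GRU models longer-range semantic dependencies in both directions.
        self.bigru = nn.GRU(conv_channels, gru_hidden, batch_first=True,
                            bidirectional=True)
        # Word-level attention: weights each word's contribution to the sentence.
        self.attn = nn.Linear(2 * gru_hidden, 1)
        # Scoring head: sentence vector -> probability of inclusion in the summary.
        self.score = nn.Linear(2 * gru_hidden, 1)

    def forward(self, token_ids):                      # (batch, seq_len)
        x = self.embedding(token_ids)                  # (batch, seq_len, emb_dim)
        x = F.relu(self.conv(x.transpose(1, 2)))       # (batch, channels, seq_len)
        h, _ = self.bigru(x.transpose(1, 2))           # (batch, seq_len, 2*hidden)
        weights = torch.softmax(self.attn(h), dim=1)   # word-level attention weights
        sent_vec = (weights * h).sum(dim=1)            # attention-pooled sentence vector
        return torch.sigmoid(self.score(sent_vec)).squeeze(-1)

# Usage: score a batch of padded sentences; the highest-scoring sentences
# would be selected to form the extractive summary.
model = WordAttentionSummarizer(vocab_size=30000)
scores = model(torch.randint(1, 30000, (4, 40)))       # 4 sentences, 40 tokens each
print(scores.shape)                                     # torch.Size([4])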
Pagination: xx, 144p.
URI: http://hdl.handle.net/10603/486732
Appears in Departments: University Institute of Engineering and Technology

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
