A Study on Text Detection and Recognition in Video Frames

Eshwarappa, Laxmikant

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/592904

Title:	A Study on Text Detection and Recognition in Video Frames
Researcher:	Eshwarappa, Laxmikant
Guide(s):	Rajput, G G
Keywords:	Computer Science; Computer Science Software Engineering; Engineering and Technology
University:	Visvesvaraya Technological University, Belagavi
Completed Date:	2023
Abstract:	Text detection and recognition in video frames is one of the most interesting and difficult fields of engineering research, and it is still expanding. To increase detection accuracy, this study proposes a text extraction technique based on a Type-2 Fuzzy Sets Canny edge detection algorithm, which allows considerable information to be extracted from complicated video scenes. The system uses type-2 fuzzy sets to resolve uncertainty through automated threshold value selection. An enhanced Canny edge detection technique is also incorporated to make gradient images easier to segment and to avoid vagueness at image borders, which is most frequently caused by inadequate lighting.
A text extraction strategy utilizing a deep Convolutional Neural Network (CNN/ConvNet) and Long Short-Term Memory (LSTM) is proposed as part of the future scope, to achieve the highest possible classification accuracy across the video frames under consideration. This research presents a novel five-stage approach for text recognition from video. Initially, pre-processing was used to convert the video to frames. A CNN was then used to identify key frames and perform text region verification. Next, enhanced candidate text blocks were retrieved using Maximally Stable Extremal Regions (MSER). Then, Discrete Cosine Transform (DCT) features, enhanced distance map features, and features based on continuous gradients were extracted. These features were passed to the LSTM for detection. Finally, Optical Character Recognition (OCR) was used to identify the text in the image. The LSTM weights were tuned with the Self-Improved Bald Eagle Search (SI-BESO) method. The SI-BESO + LSTM scheme showed improved Matthews Correlation Coefficient (MCC) and F1-score compared with the chosen scheme without optimization, SI-BESO + LSTM with traditional MSER, and SI-BESO + LSTM with a conventional distance map.
Pagination:	123
URI:	http://hdl.handle.net/10603/592904
Appears in Departments:	Department of Computer Application
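
The abstract reports its comparison in terms of F1-score and Matthews Correlation Coefficient (MCC). As a minimal sketch of how these two metrics are commonly computed for a binary text/non-text decision, the following uses the standard confusion-matrix definitions. The function names and the example counts are illustrative assumptions and are not taken from the thesis.

```python
import math


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN). Returns 0.0 when the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews Correlation Coefficient from confusion-matrix counts.

    MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).
    Returns 0.0 when the denominator is zero (a common convention).
    """
    numerator = tp * tn - fp * fn
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return numerator / denom if denom else 0.0


# Hypothetical counts for text (positive) vs. non-text (negative) regions.
tp, tn, fp, fn = 80, 90, 10, 20
print(f"F1  = {f1_score(tp, fp, fn):.3f}")
print(f"MCC = {mcc(tp, tn, fp, fn):.3f}")
```

Unlike F1, MCC also accounts for true negatives, so it can be more informative when text and non-text regions are unevenly represented in the frames.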

Files in This Item:

File	Description	Size	Format
01_title.pdf	Attached File	127.73 kB	Adobe PDF
02_prelim pages.pdf		273.66 kB	Adobe PDF
03_content.pdf		145.46 kB	Adobe PDF
04_abstract.pdf		24.86 kB	Adobe PDF
05_chapter 1.pdf		424.26 kB	Adobe PDF
06_chapter 2.pdf		437.03 kB	Adobe PDF
07_chapter 3.pdf		539.24 kB	Adobe PDF
08_chapter 4.pdf		801.4 kB	Adobe PDF
09_chapter 5.pdf		1.99 MB	Adobe PDF
10_annexures.pdf		165.83 kB	Adobe PDF
80_recommendation.pdf		4.8 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET