Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/592904
Title: A Study on Text Detection and Recognition in Video Frames
Researcher: Eshwarappa, Laxmikant
Guide(s): Rajput, G G
Keywords: Computer Science
Computer Science Software Engineering
Engineering and Technology
University: Visvesvaraya Technological University, Belagavi
Completed Date: 2023
Abstract: Text detection and recognition in video frames is one of the most interesting and difficult areas of engineering research, and it is still expanding. To increase detection accuracy, this study proposes a text extraction technique based on a Type-2 Fuzzy Sets-Canny edge detection algorithm, which allows considerable information to be extracted from complicated video scenes. The text extraction system draws on the ability of type-2 fuzzy sets to resolve uncertainty through automated threshold value selection. In addition, an enhanced Canny edge detection technique is incorporated to make gradient images easier to segment and to avoid vagueness at image borders, which is most frequently caused by inadequate lighting.
The creation of a text extraction strategy using a deep Convolutional Neural Network (CNN/ConvNet) and Long Short-Term Memory (LSTM) is proposed as part of the future scope, to achieve the highest possible classification accuracy across the video frames under consideration. This research also presents a unique five-stage approach for text recognition from video. Initially, pre-processing converts the video to frames. A CNN then identifies important frames and performs text region verification. Next, enhanced candidate text blocks are retrieved using Maximally Stable Extremal Regions (MSER). Discrete Cosine Transform (DCT) features, enhanced distance map features, and features based on continuous gradients are then extracted and passed to the LSTM for detection. Finally, Optical Character Recognition (OCR) identifies the text in the image. The LSTM weights are tuned using the Self-Improved Bald Eagle Search (SI-BESO) method. Compared with the chosen scheme without optimization, the SI-BESO + LSTM with traditional MSER, and the SI-BESO + LSTM with conventional distance map, the proposed SI-BESO + LSTM shows improved Matthews Correlation Coefficient (MCC) and F1-score.
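Note on the reported metrics: the abstract compares methods by MCC and F1-score but gives no formulas or counts. The sketch below shows the standard definitions of both metrics for a binary text / non-text decision. The function name and the example counts are illustrative assumptions and are not taken from the thesis.

```python
import math

def f1_and_mcc(tp, fp, fn, tn):
    """Return (F1, MCC) from binary confusion-matrix counts.

    tp, fp, fn, tn are true positives, false positives, false negatives
    and true negatives. The name and signature are hypothetical; only the
    formulas are the standard textbook definitions.
    """
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is 0.
    f1_denom = 2 * tp + fp + fn
    f1 = 2 * tp / f1_denom if f1_denom else 0.0

    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN));
    # by the usual convention it is 0.0 when any factor is 0.
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0
    return f1, mcc

# Made-up counts for illustration only (not results from the thesis).
f1, mcc = f1_and_mcc(tp=40, fp=10, fn=10, tn=40)
print(f"F1 = {f1:.2f}, MCC = {mcc:.2f}")
```

Unlike F1, MCC also takes true negatives into account, which may be why the two are reported together when text and non-text regions are unevenly represented.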
Pagination: 123
URI: http://hdl.handle.net/10603/592904
Appears in Departments: Department of Computer Application

Files in This Item:
File                    Description     Size        Format
01_title.pdf            Attached File   127.73 kB   Adobe PDF
02_prelim pages.pdf                     273.66 kB   Adobe PDF
03_content.pdf                          145.46 kB   Adobe PDF
04_abstract.pdf                         24.86 kB    Adobe PDF
05_chapter 1.pdf                        424.26 kB   Adobe PDF
06_chapter 2.pdf                        437.03 kB   Adobe PDF
07_chapter 3.pdf                        539.24 kB   Adobe PDF
08_chapter 4.pdf                        801.4 kB    Adobe PDF
09_chapter 5.pdf                        1.99 MB     Adobe PDF
10_annexures.pdf                        165.83 kB   Adobe PDF
80_recommendation.pdf                   4.8 kB      Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
