Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/592904
Title: | A Study on Text Detection and Recognition in Video Frames |
Researcher: | Eshwarappa, Laxmikant |
Guide(s): | Rajput, G G |
Keywords: | Computer Science; Computer Science Software Engineering; Engineering and Technology |
University: | Visvesvaraya Technological University, Belagavi |
Completed Date: | 2023 |
Abstract: | Text detection and recognition in video frames is one of the most interesting and difficult fields of engineering research, and it is still expanding. To increase detection accuracy, this study proposes a text extraction technique based on a Type-2 Fuzzy Sets-Canny edge detection algorithm, which allows considerable information to be extracted from complicated video scenes. The system uses type-2 fuzzy sets to resolve uncertainty through automated threshold value selection. It also incorporates an enhanced Canny edge detection technique, which makes gradient images easier to segment and avoids vagueness at image borders, a problem most frequently caused by inadequate lighting.
The creation of a text extraction strategy using a deep Convolutional Neural Network (CNN/ConvNet) and Long Short-Term Memory (LSTM) is proposed as part of the future scope, to achieve the highest possible classification accuracy across the video frames under consideration. In this research, a novel five-stage approach for text recognition from video is presented. Initially, pre-processing converts the video to frames. A CNN then identifies key frames and performs text region verification. Next, enhanced candidate text blocks are extracted using Maximally Stable Extremal Regions (MSER). Discrete Cosine Transform (DCT) features, enhanced distance map features, and features based on continuous gradients are then extracted and passed to the LSTM for detection. Finally, Optical Character Recognition (OCR) identifies the text in the image. The LSTM weights are tuned with the Self-Improved Bald Eagle Search (SI-BESO) method.
The SI-BESO + LSTM scheme achieved a higher Matthews Correlation Coefficient (MCC) and F1-score than the chosen scheme without optimization, SI-BESO + LSTM with traditional MSER, and SI-BESO + LSTM with a conventional distance map. |
Pagination: | 123 |
URI: | http://hdl.handle.net/10603/592904 |
Appears in Departments: | Department of Computer Application |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
01_title.pdf | Attached File | 127.73 kB | Adobe PDF | View/Open |
02_prelim pages.pdf | | 273.66 kB | Adobe PDF | View/Open |
03_content.pdf | | 145.46 kB | Adobe PDF | View/Open |
04_abstract.pdf | | 24.86 kB | Adobe PDF | View/Open |
05_chapter 1.pdf | | 424.26 kB | Adobe PDF | View/Open |
06_chapter 2.pdf | | 437.03 kB | Adobe PDF | View/Open |
07_chapter 3.pdf | | 539.24 kB | Adobe PDF | View/Open |
08_chapter 4.pdf | | 801.4 kB | Adobe PDF | View/Open |
09_chapter 5.pdf | | 1.99 MB | Adobe PDF | View/Open |
10_annexures.pdf | | 165.83 kB | Adobe PDF | View/Open |
80_recommendation.pdf | | 4.8 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).