Please use this identifier to cite or link to this item:
http://hdl.handle.net/10603/481188
Title: | Image and Video Text Recognition System |
Researcher: | Chaitra, Y L |
Guide(s): | Dinesh, R |
Keywords: | Computer Science; Computer Science Artificial Intelligence; IVTR; Engineering and Technology |
University: | Jain University |
Completed Date: | 2022 |
Abstract: | Text conveys a great deal of information through tags, signs, logos, labels, billboards, and markers, and has long been an integral part of human life. Because text embedded in natural scene images and videos carries information, it has received increasing research attention in computer vision. Furthermore, with the development of digital technology, Text Detection and Recognition (TDR) in images and video has become more popular for real-time applications such as robot navigation, assisting blind people travelling on roads, monitoring vehicle license plates, and security. Text properties such as arbitrary orientations and varied font sizes and aspect ratios are challenging to address. Video text scenes present further difficulties: lighting changes and effects, motion blur, and occlusion. Because of the importance of TDR in images and video frames, several researchers are working towards effective text recognition systems for videos and images. This work therefore introduces several efficient text detection and recognition methods. First, different pre-processing techniques are studied; the Radon Transform (R.T.) gives good results compared to other filtering techniques and improves the F-Score by 4.68%. However, this traditional method is inefficient for vertical and distant text and is sensitive to blurred text. Hence the second method uses a neural network approach; deep learning is better for text localization than the previous method. However, localizing text in natural scene images is still challenging. The proposed research provides a comprehensive solution for text localization using Transfer Learning (T.L.) with a Deep Convolution Neural Network (DCNN), an improved version of the first objective that achieves a reasonably good F-Score of 82.79%. As part of the third objective, a model is designed to identify and recognize text in video data: the DEFUSE (Deep Fused) model. DEFUSE fuses the DEASTD (Deep Efficient and Accurate Scene Text Detector) and KOCR (Keras Optical Character Recognition) models to locate and recognize text in image and video frames, and is powered by a neural network. The model handles the high complexity of challenging text and the difficulties of data dynamicity, where the video screen changes from one location to another. It improved accuracy by 2.85% and 10.55% on different datasets. To work with images and video frames together, the YOLOv5x model is used for text detection and Tesseract OCR for text recognition. The proposed work also concentrates on capturing text in challenging real-time videos and obtains good results on different datasets. |
Pagination: | 131 p. |
URI: | http://hdl.handle.net/10603/481188 |
Appears in Departments: | Department of Computer Science Engineering |
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
80_recommendation.pdf | Attached File | 1.19 MB | Adobe PDF | View/Open |
abstract.pdf | | 432.91 kB | Adobe PDF | View/Open |
chapter 1.pdf | | 1.3 MB | Adobe PDF | View/Open |
chapter 2.pdf | | 724.76 kB | Adobe PDF | View/Open |
chapter 3.pdf | | 1.39 MB | Adobe PDF | View/Open |
chapter 4.pdf | | 1.89 MB | Adobe PDF | View/Open |
chapter 5.pdf | | 1.43 MB | Adobe PDF | View/Open |
chapter 6.pdf | | 1.48 MB | Adobe PDF | View/Open |
cover page.pdf | | 655.25 kB | Adobe PDF | View/Open |
prelim pages.pdf | | 325.19 kB | Adobe PDF | View/Open |
references with publications.pdf | | 244.46 kB | Adobe PDF | View/Open |
table of contents.pdf | | 408.45 kB | Adobe PDF | View/Open |
Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).