Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/222789
Title: Degraded text recognition of Gurmukhi script
Researcher: Kumar, Manish
Guide(s): Sharma, R.K. and Lehal, G.S.
Keywords: Character Recognition; Engineering and Technology, Computer Science, Computer Science Software Engineering; Gurmukhi Script
University: Thapar Institute of Engineering and Technology
Completed Date: 2008
Abstract: Character recognition is one of the important subjects in the field of Document Analysis and Recognition (DAR). Character recognition can be performed on printed text or handwritten text. Printed text can come from good-quality documents or degraded documents. Several kinds of degradation occur in almost every script of the world. The degradations commonly found in printed script include touching characters, broken characters, heavily printed (self-touching) characters, faxed documents, typewritten documents, and documents in which backside text is visible. The problem of touching characters is common to all degraded documents containing these kinds of degradation. Hence, coping with touching characters is essential to building an Optical Character Recognition (OCR) system for degraded text. Researchers working on recognition of good-quality printed text in different scripts around the world have reported a drastic decrease in recognition accuracy due to the presence of touching characters in the text. Research and experiments have shown that the performance breakdown of commercial document recognition systems under real application conditions is caused mainly by the difficulty of dealing with touching characters, which are abundant in documents as a result of document degradation. Touching characters make it difficult to correctly segment character images for individual classification and therefore pose severe difficulty for conventional document recognition systems that depend critically on character segmentation. Heavily printed characters also decrease recognition accuracy. A document containing touching characters generally contains heavily printed characters as well. The objective of this work is to seek new approaches to degraded document recognition of printed Gurmukhi script containing touching characters and heavily printed characters.
OCR algorithms can achieve good recognition rates (near 99%) on images with little degradation.
Pagination: xix,205p.
URI: http://hdl.handle.net/10603/222789
Appears in Departments: Department of Computer Science and Engineering

Files in This Item:
File                          Description    Size       Format
file10(chapter 7).pdf         Attached File  92.7 kB    Adobe PDF
file11(bibliography).pdf                     114.97 kB  Adobe PDF
file12(publications).pdf                     33.34 kB   Adobe PDF
file1(title).pdf                             39.35 kB   Adobe PDF
file2(certificate).pdf                       131.64 kB  Adobe PDF
file3(preliminary pages).pdf                 566.41 kB  Adobe PDF
file4(chapter 1).pdf                         184.82 kB  Adobe PDF
file5(chapter 2).pdf                         213.02 kB  Adobe PDF
file6(chapter 3).pdf                         301.13 kB  Adobe PDF
file7(chapter 4).pdf                         1.02 MB    Adobe PDF
file8(chapter 5).pdf                         1.1 MB     Adobe PDF
file9(chapter 6).pdf                         1.23 MB    Adobe PDF


Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).
