Degraded text recognition of Gurmukhi script

Kumar, Manish

Please use this identifier to cite or link to this item: http://hdl.handle.net/10603/222789

Title:	Degraded text recognition of Gurmukhi script
Researcher:	Kumar, Manish
Guide(s):	Sharma, R.K. and Lehal, G.S.
Keywords:	Character Recognition, Engineering and Technology, Computer Science, Computer Science Software Engineering, Gurmukhi Script
University:	Thapar Institute of Engineering and Technology
Completed Date:	2008
Abstract:	Character recognition is an important subject in the field of Document Analysis and Recognition (DAR). It can be performed on printed or handwritten text, and printed text can come from good-quality or degraded documents. Degradations of several kinds occur in almost every script of the world. Those commonly found in printed text include touching characters, broken characters, heavily printed (self-touching) characters, faxed documents, typewritten documents, and documents in which text from the reverse side shows through. Touching characters commonly occur in degraded documents of all these kinds. Handling touching characters is therefore essential to building an Optical Character Recognition (OCR) system for degraded text. Researchers working on recognition of good-quality printed text in different scripts around the world have reported a drastic decrease in recognition accuracy when touching characters are present. Research and experiments have shown that the performance breakdown of commercial document recognition systems in real applications is caused mainly by the difficulty of dealing with touching characters, which are abundant in degraded documents. Touching characters make it difficult to segment character images correctly for individual classification and therefore pose a severe difficulty for conventional document recognition systems, which depend critically on character segmentation. Heavily printed characters also decrease recognition accuracy, and a document containing touching characters generally contains heavily printed characters as well. The objective of this work is to seek new approaches to the recognition of degraded printed Gurmukhi documents containing touching characters and heavily printed characters.
OCR algorithms can achieve good recognition rates (near 99%) on images with little degradation.
Pagination:	xix,205p.
URI:	http://hdl.handle.net/10603/222789
Appears in Departments:	Department of Computer Science and Engineering

Files in This Item:

File	Description	Size	Format
file1(title).pdf	Attached File	39.35 kB	Adobe PDF
file2(certificate).pdf		131.64 kB	Adobe PDF
file3(preliminary pages).pdf		566.41 kB	Adobe PDF
file4(chapter 1).pdf		184.82 kB	Adobe PDF
file5(chapter 2).pdf		213.02 kB	Adobe PDF
file6(chapter 3).pdf		301.13 kB	Adobe PDF
file7(chapter 4).pdf		1.02 MB	Adobe PDF
file8(chapter 5).pdf		1.1 MB	Adobe PDF
file9(chapter 6).pdf		1.23 MB	Adobe PDF
file10(chapter 7).pdf		92.7 kB	Adobe PDF
file11(bibliography).pdf		114.97 kB	Adobe PDF
file12(publications).pdf		33.34 kB	Adobe PDF

Items in Shodhganga are licensed under Creative Commons Licence Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0).

Shodhganga : a reservoir of Indian theses @ INFLIBNET