A robust video text extraction and recognition approach using OCR feedback information

Guangyu Gao; He Zhang; Hongting Chen

doi:10.1007/978-3-319-24075-6_49

Guangyu Gao^*, He Zhang, Hongting Chen

^*Corresponding author for this work

School of Computer Science and Technology

Research output: Contribution to journal › Conference article › peer-review

2 Citations (Scopus)

Abstract

Video text carries important semantic information and provides precise, meaningful clues for video indexing and retrieval. However, most previous approaches performed video text extraction and recognition separately, and the main difficulty, extraction and recognition against complex backgrounds, was not handled well. In this paper, this difficulty is addressed by combining text extraction and recognition and by using OCR feedback information. The approach has three main features: (i) An efficient character image segmentation method is proposed that takes most prior knowledge into account. (ii) Text extraction is performed on both text-row images and segmented single-character images, since text-row-based extraction preserves the color consistency of characters and backgrounds, while a single character has a simpler background. The best binary image is then chosen for recognition using OCR feedback. (iii) The K-means algorithm is used for extraction, which ensures that the best extraction result, namely the binary image with a clear separation of text strokes and background, is among the candidates. Finally, extensive experiments and empirical evaluations on several video text images demonstrate the satisfactory performance of the proposed approach.
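
The abstract gives no implementation details, so the following is only a minimal sketch of the general idea in features (ii) and (iii): cluster pixel intensities with K-means, turn each cluster into a candidate binary image, and let a feedback score pick the best candidate. The function names, the intensity-only clustering, and the `ocr_score` callback (a stand-in for whatever confidence a real OCR engine would return) are all assumptions made for illustration, not the authors' method.

```python
from typing import Callable, List, Sequence

Image = List[List[int]]   # grayscale rows, values 0..255
Binary = List[List[int]]  # 1 = candidate text stroke, 0 = background


def kmeans_1d(values: Sequence[float], k: int = 2, iters: int = 20) -> List[float]:
    """Plain Lloyd's K-means on scalar intensities; returns sorted centroids."""
    lo, hi = min(values), max(values)
    # Deterministic initialisation: spread centroids evenly over the range.
    centroids = [lo + (hi - lo) * (i + 0.5) / k for i in range(k)]
    for _ in range(iters):
        buckets: List[List[float]] = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centroids[i]))
            buckets[nearest].append(v)
        # An empty cluster keeps its previous centroid.
        updated = [sum(b) / len(b) if b else c for b, c in zip(buckets, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return sorted(centroids)


def candidate_binaries(img: Image, k: int = 2) -> List[Binary]:
    """One candidate per cluster: pixels nearest that centroid are marked 1."""
    centroids = kmeans_1d([p for row in img for p in row], k)

    def label(p: int) -> int:
        return min(range(k), key=lambda i: abs(p - centroids[i]))

    return [[[1 if label(p) == c else 0 for p in row] for row in img]
            for c in range(k)]


def pick_best(candidates: List[Binary],
              ocr_score: Callable[[Binary], float]) -> Binary:
    """Feedback step: keep the candidate the (hypothetical) scorer rates highest."""
    return max(candidates, key=ocr_score)
```

In this sketch the same `candidate_binaries` call could be applied to a whole text-row image and to each segmented character image, with all resulting candidates passed to `pick_best`. Any scoring function can be plugged in for experimentation; a real system would replace it with the recognition confidence reported by its OCR engine.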

Original language	English
Pages (from-to)	507-517
Number of pages	11
Journal	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	9314
DOIs	https://doi.org/10.1007/978-3-319-24075-6_49
Publication status	Published - 2015
Event	16th Pacific-Rim Conference on Multimedia, PCM 2015 - Gwangju, Republic of Korea
Duration	16 Sept 2015 → 18 Sept 2015

Keywords

Character segmentation
K-means
OCR feedback
Text extraction
Text recognition

Access to Document

10.1007/978-3-319-24075-6_49

Cite this

@article{15dd91b6dc3d49979132f4529daa8f6e,

title = "A robust video text extraction and recognition approach using OCR feedback information",

abstract = "Video text carries important semantic information and provides precise, meaningful clues for video indexing and retrieval. However, most previous approaches performed video text extraction and recognition separately, and the main difficulty, extraction and recognition against complex backgrounds, was not handled well. In this paper, this difficulty is addressed by combining text extraction and recognition and by using OCR feedback information. The approach has three main features: (i) An efficient character image segmentation method is proposed that takes most prior knowledge into account. (ii) Text extraction is performed on both text-row images and segmented single-character images, since text-row-based extraction preserves the color consistency of characters and backgrounds, while a single character has a simpler background. The best binary image is then chosen for recognition using OCR feedback. (iii) The K-means algorithm is used for extraction, which ensures that the best extraction result, namely the binary image with a clear separation of text strokes and background, is among the candidates. Finally, extensive experiments and empirical evaluations on several video text images demonstrate the satisfactory performance of the proposed approach.",

keywords = "Character segmentation, K-means, OCR feedback, Text extraction, Text recognition",

author = "Guangyu Gao and He Zhang and Hongting Chen",

note = "Publisher Copyright: {\textcopyright} Springer International Publishing Switzerland 2015.; 16th Pacific-Rim Conference on Multimedia, PCM 2015 ; Conference date: 16-09-2015 Through 18-09-2015",

year = "2015",

doi = "10.1007/978-3-319-24075-6_49",

language = "English",

volume = "9314",

pages = "507--517",

journal = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

issn = "0302-9743",

publisher = "Springer Science and Business Media Deutschland GmbH",

}

A robust video text extraction and recognition approach using OCR feedback information. / Gao, Guangyu; Zhang, He; Chen, Hongting.
In: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 9314, 2015, p. 507-517.

TY - JOUR

T1 - A robust video text extraction and recognition approach using OCR feedback information

AU - Gao, Guangyu

AU - Zhang, He

AU - Chen, Hongting

N1 - Publisher Copyright: © Springer International Publishing Switzerland 2015.

PY - 2015

Y1 - 2015

N2 - Video text carries important semantic information and provides precise, meaningful clues for video indexing and retrieval. However, most previous approaches performed video text extraction and recognition separately, and the main difficulty, extraction and recognition against complex backgrounds, was not handled well. In this paper, this difficulty is addressed by combining text extraction and recognition and by using OCR feedback information. The approach has three main features: (i) An efficient character image segmentation method is proposed that takes most prior knowledge into account. (ii) Text extraction is performed on both text-row images and segmented single-character images, since text-row-based extraction preserves the color consistency of characters and backgrounds, while a single character has a simpler background. The best binary image is then chosen for recognition using OCR feedback. (iii) The K-means algorithm is used for extraction, which ensures that the best extraction result, namely the binary image with a clear separation of text strokes and background, is among the candidates. Finally, extensive experiments and empirical evaluations on several video text images demonstrate the satisfactory performance of the proposed approach.

AB - Video text carries important semantic information and provides precise, meaningful clues for video indexing and retrieval. However, most previous approaches performed video text extraction and recognition separately, and the main difficulty, extraction and recognition against complex backgrounds, was not handled well. In this paper, this difficulty is addressed by combining text extraction and recognition and by using OCR feedback information. The approach has three main features: (i) An efficient character image segmentation method is proposed that takes most prior knowledge into account. (ii) Text extraction is performed on both text-row images and segmented single-character images, since text-row-based extraction preserves the color consistency of characters and backgrounds, while a single character has a simpler background. The best binary image is then chosen for recognition using OCR feedback. (iii) The K-means algorithm is used for extraction, which ensures that the best extraction result, namely the binary image with a clear separation of text strokes and background, is among the candidates. Finally, extensive experiments and empirical evaluations on several video text images demonstrate the satisfactory performance of the proposed approach.

KW - Character segmentation

KW - K-means

KW - OCR feedback

KW - Text extraction

KW - Text recognition

UR - http://www.scopus.com/inward/record.url?scp=84984612024&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-24075-6_49

DO - 10.1007/978-3-319-24075-6_49

M3 - Conference article

AN - SCOPUS:84984612024

SN - 0302-9743

VL - 9314

SP - 507

EP - 517

JO - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

JF - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

T2 - 16th Pacific-Rim Conference on Multimedia, PCM 2015

Y2 - 16 September 2015 through 18 September 2015

ER -
