A robust video text extraction and recognition approach using OCR feedback information

Guangyu Gao; He Zhang; Hongting Chen

doi:10.1007/978-3-319-24075-6_49

Guangyu Gao^*, He Zhang, Hongting Chen

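^*Corresponding author for this work

School of Computer Science

Research output: Contribution to journal › Conference article › Peer-reviewed

2 citations (Scopus)

Abstract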

Video text carries important semantic information that provides precise and meaningful clues for video indexing and retrieval. However, most previous approaches performed video text extraction and recognition separately, and the main difficulty, extraction and recognition against complex backgrounds, was not handled well. In this paper, this difficulty is addressed by combining text extraction and recognition and by using OCR feedback information. The approach has the following features: (i) An efficient character image segmentation method is proposed that takes most prior knowledge into account. (ii) Text extraction is performed on both text-row images and segmented single-character images, since text-row-based extraction maintains the color consistency of characters and backgrounds, while a single character has a simpler background. The best binary image is then chosen for recognition using OCR feedback. (iii) The K-means algorithm is used for extraction, which ensures that the best extraction result, namely the binary image with a clear separation of text strokes from background, is among the candidates. Finally, extensive experiments and empirical evaluations on several video text images demonstrate the satisfactory performance of the proposed approach.
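
The idea in features (ii) and (iii) can be illustrated with a minimal sketch: cluster pixel intensities with K-means, treat each cluster in turn as the candidate text layer, and keep the binary image that a scoring callback rates highest. This is an illustration under stated assumptions, not the authors' implementation. The function names (`kmeans_1d`, `candidate_binaries`, `pick_best`) are hypothetical, clustering is done on grayscale intensity only, and `toy_ocr_score` is a placeholder for the confidence a real OCR engine would return.

```python
import numpy as np

def kmeans_1d(values, k=2, iters=20):
    """Plain Lloyd's K-means on a 1-D array of pixel intensities."""
    values = values.astype(float)
    # Deterministic init: spread the initial centers across the value range.
    centers = np.linspace(values.min(), values.max(), k)
    for _ in range(iters):
        # Assign every pixel to its nearest center.
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        # Move each non-empty cluster's center to the mean of its pixels.
        for j in range(k):
            if np.any(labels == j):
                centers[j] = values[labels == j].mean()
    return labels, centers

def candidate_binaries(gray, k=2):
    """Return one binary image per cluster, treating that cluster as text."""
    labels, _ = kmeans_1d(gray.ravel(), k)
    labels = labels.reshape(gray.shape)
    return [(labels == j).astype(np.uint8) for j in range(k)]

def pick_best(candidates, ocr_score):
    """Return the candidate that the OCR scoring callback rates highest."""
    return max(candidates, key=ocr_score)

def toy_ocr_score(binary):
    # ASSUMPTION: stand-in for real OCR confidence. It simply favors
    # candidates whose foreground covers roughly 20% of the image.
    return -abs(binary.mean() - 0.2)

# Synthetic 8x8 frame: dark background with one bright vertical "stroke".
gray = np.full((8, 8), 30, dtype=np.uint8)
gray[2:6, 3:5] = 220

best = pick_best(candidate_binaries(gray), toy_ocr_score)
print(best)
```

In a real pipeline the candidates would come from both the text-row image and the segmented single-character images, and `ocr_score` would wrap an actual recognizer; the selection step itself would stay the same.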

Original language	English
Pages (from-to)	507-517
Number of pages	11
Journal	Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume	9314
DOI	https://doi.org/10.1007/978-3-319-24075-6_49
Publication status	Published - 2015
Event	16th Pacific-Rim Conference on Multimedia, PCM 2015 - Gwangju, South Korea. Duration: 16 Sep 2015 → 18 Sep 2015

Cite this

Gao, G., Zhang, H., & Chen, H. (2015). A robust video text extraction and recognition approach using OCR feedback information. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), 9314, 507-517. https://doi.org/10.1007/978-3-319-24075-6_49

@article{15dd91b6dc3d49979132f4529daa8f6e,

title = "A robust video text extraction and recognition approach using OCR feedback information",

abstract = "Video text carries important semantic information that provides precise and meaningful clues for video indexing and retrieval. However, most previous approaches performed video text extraction and recognition separately, and the main difficulty, extraction and recognition against complex backgrounds, was not handled well. In this paper, this difficulty is addressed by combining text extraction and recognition and by using OCR feedback information. The approach has the following features: (i) An efficient character image segmentation method is proposed that takes most prior knowledge into account. (ii) Text extraction is performed on both text-row images and segmented single-character images, since text-row-based extraction maintains the color consistency of characters and backgrounds, while a single character has a simpler background. The best binary image is then chosen for recognition using OCR feedback. (iii) The K-means algorithm is used for extraction, which ensures that the best extraction result, namely the binary image with a clear separation of text strokes from background, is among the candidates. Finally, extensive experiments and empirical evaluations on several video text images demonstrate the satisfactory performance of the proposed approach.",

keywords = "Character segmentation, K-means, OCR feedback, Text extraction, Text recognition",

author = "Guangyu Gao and He Zhang and Hongting Chen",

note = "Publisher Copyright: {\textcopyright} Springer International Publishing Switzerland 2015.; 16th Pacific-Rim Conference on Multimedia, PCM 2015 ; Conference date: 16-09-2015 Through 18-09-2015",

year = "2015",

doi = "10.1007/978-3-319-24075-6_49",

language = "English",

volume = "9314",

pages = "507--517",

journal = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",

issn = "0302-9743",

publisher = "Springer Science and Business Media Deutschland GmbH",

}

TY - JOUR

T1 - A robust video text extraction and recognition approach using OCR feedback information

AU - Gao, Guangyu

AU - Zhang, He

AU - Chen, Hongting

N1 - Publisher Copyright: © Springer International Publishing Switzerland 2015.

PY - 2015

Y1 - 2015

AB - Video text carries important semantic information that provides precise and meaningful clues for video indexing and retrieval. However, most previous approaches performed video text extraction and recognition separately, and the main difficulty, extraction and recognition against complex backgrounds, was not handled well. In this paper, this difficulty is addressed by combining text extraction and recognition and by using OCR feedback information. The approach has the following features: (i) An efficient character image segmentation method is proposed that takes most prior knowledge into account. (ii) Text extraction is performed on both text-row images and segmented single-character images, since text-row-based extraction maintains the color consistency of characters and backgrounds, while a single character has a simpler background. The best binary image is then chosen for recognition using OCR feedback. (iii) The K-means algorithm is used for extraction, which ensures that the best extraction result, namely the binary image with a clear separation of text strokes from background, is among the candidates. Finally, extensive experiments and empirical evaluations on several video text images demonstrate the satisfactory performance of the proposed approach.

KW - Character segmentation

KW - K-means

KW - OCR feedback

KW - Text extraction

KW - Text recognition

UR - http://www.scopus.com/inward/record.url?scp=84984612024&partnerID=8YFLogxK

U2 - 10.1007/978-3-319-24075-6_49

DO - 10.1007/978-3-319-24075-6_49

M3 - Conference article

AN - SCOPUS:84984612024

SN - 0302-9743

VL - 9314

SP - 507

EP - 517

JO - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

JF - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

T2 - 16th Pacific-Rim Conference on Multimedia, PCM 2015

Y2 - 16 September 2015 through 18 September 2015

ER -
