TY - JOUR
T1 - Deep Contextual Stroke Pooling for Scene Character Recognition
AU - Zhang, Zhong
AU - Wang, Hong
AU - Liu, Shuang
AU - Xiao, Baihua
N1 - Publisher Copyright: © 2018 IEEE.
PY - 2018/3/17
Y1 - 2018/3/17
AB - Characters, as a kind of symbol carrying rich semantic information, are composed of strokes arranged in a certain structure and are of great significance in daily life. In this paper, we address the problem of scene character recognition from the perspective of feature representation. We propose a novel pooling method termed deep contextual stroke pooling (DCSP). DCSP discovers the most prominent stroke information by using stroke detectors and captures the spatial context of discriminative strokes by learning a contextual factor. Specifically, we first use the convolutional summing map of one convolutional layer to select discriminative strokes and use the convolutional activation features of those strokes to train stroke detectors. Then, we propose the contextual factor to represent the co-occurrence probability of a stroke and its location. Finally, in the response regions, we incorporate the contextual factor into the detector scores to obtain the deep contextual confidence vectors of scene characters. Extensive experiments on three databases, i.e., ICDAR2003, Chars74k, and SVHN, demonstrate that our method achieves higher accuracy than state-of-the-art methods.
KW - Scene character recognition
KW - contextual factor
KW - deep contextual stroke pooling
UR - http://www.scopus.com/inward/record.url?scp=85044031286&partnerID=8YFLogxK
U2 - 10.1109/ACCESS.2018.2817342
DO - 10.1109/ACCESS.2018.2817342
M3 - Article
AN - SCOPUS:85044031286
SN - 2169-3536
VL - 6
SP - 16454
EP - 16463
JO - IEEE Access
JF - IEEE Access
ER -