Scene text recognition using residual convolutional recurrent neural network

Zhengchao Lei; Sanyuan Zhao; Hongmei Song; Jianbing Shen

doi:10.1007/s00138-018-0942-y

Zhengchao Lei, Sanyuan Zhao*, Hongmei Song, Jianbing Shen

* Corresponding author for this work

School of Computer Science and Technology

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

33 Citations (Scopus)

Abstract

Text is a significant tool for human communication, and text recognition in scene images becomes more and more important. In this paper, we propose a residual convolutional recurrent neural network for solving the task of scene text recognition. The general convolutional recurrent neural network (CRNN) is realized by combining convolutional neural network (CNN) with recurrent neural network (RNN). The CNN part extracts features and the RNN part encodes and decodes feature sequences. In order to improve the accuracy rate of scene text recognition based on CRNN, we explore different deeper CNN architectures to get feature descriptors and analyze the corresponding text recognition results. Specifically, VGG and ResNet are introduced to train these different deep models and obtain the encoding information of images. The experimental results on public datasets demonstrate the effectiveness of our method.
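
The two structural ideas named in the abstract, a residual (shortcut) connection in the CNN and a feature sequence handed from the CNN to the RNN, can be sketched in a few lines of dependency-free Python. This is only an illustration under stated assumptions, not the authors' implementation: the function names `residual_block` and `map_to_sequence`, the nested-list feature-map layout, and the column-by-column slicing are all hypothetical choices made here, since the abstract does not specify how the feature sequence is formed.

```python
from typing import Callable, List

Vector = List[float]
FeatureMap = List[List[Vector]]  # assumed layout: [height][width][channels]


def residual_block(x: Vector, transform: Callable[[Vector], Vector]) -> Vector:
    """Return transform(x) + x, the identity-shortcut form popularized by ResNet."""
    fx = transform(x)
    if len(fx) != len(x):
        raise ValueError("identity shortcut requires matching dimensions")
    return [a + b for a, b in zip(fx, x)]


def map_to_sequence(feature_map: FeatureMap) -> List[Vector]:
    """Turn an H x W x C feature map into W frames of length H * C.

    Assumption: each image column becomes one time step for the recurrent
    part, read left to right.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    frames: List[Vector] = []
    for col in range(width):
        frame: Vector = []
        for row in range(height):
            frame.extend(feature_map[row][col])
        frames.append(frame)
    return frames


if __name__ == "__main__":
    # Toy residual step: the "learned" transform here just halves its input.
    print(residual_block([1.0, 2.0], lambda v: [0.5 * t for t in v]))

    # Toy 2 x 3 feature map with one channel per position.
    toy_map = [[[1.0], [2.0], [3.0]],
               [[4.0], [5.0], [6.0]]]
    print(map_to_sequence(toy_map))
```

In a real model the `transform` argument would be a stack of convolution layers and the frames would feed a recurrent layer; both are replaced by toy stand-ins here to keep the data flow visible.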

Original language	English
Pages (from-to)	861-871
Number of pages	11
Journal	Machine Vision and Applications
Volume	29
Issue number	5
DOIs	https://doi.org/10.1007/s00138-018-0942-y
Publication status	Published - 1 Jul 2018

Keywords

Convolutional neural network
Recurrent neural network
Residual convolutional recurrent neural network
Residual network
Scene text recognition

Cite this

@article{4784b88b94704f2581b6348fe4f63e42,
title = "Scene text recognition using residual convolutional recurrent neural network",
abstract = "Text is a significant tool for human communication, and text recognition in scene images becomes more and more important. In this paper, we propose a residual convolutional recurrent neural network for solving the task of scene text recognition. The general convolutional recurrent neural network (CRNN) is realized by combining convolutional neural network (CNN) with recurrent neural network (RNN). The CNN part extracts features and the RNN part encodes and decodes feature sequences. In order to improve the accuracy rate of scene text recognition based on CRNN, we explore different deeper CNN architectures to get feature descriptors and analyze the corresponding text recognition results. Specifically, VGG and ResNet are introduced to train these different deep models and obtain the encoding information of images. The experimental results on public datasets demonstrate the effectiveness of our method.",
keywords = "Convolutional neural network, Recurrent neural network, Residual convolutional recurrent neural network, Residual network, Scene text recognition",
author = "Zhengchao Lei and Sanyuan Zhao and Hongmei Song and Jianbing Shen",
note = "Publisher Copyright: {\textcopyright} 2018, Springer-Verlag GmbH Germany, part of Springer Nature.",
year = "2018",
month = jul,
day = "1",
doi = "10.1007/s00138-018-0942-y",
language = "English",
volume = "29",
pages = "861--871",
journal = "Machine Vision and Applications",
issn = "0932-8092",
publisher = "Springer Verlag",
number = "5",
}

TY  - JOUR
T1  - Scene text recognition using residual convolutional recurrent neural network
AU  - Lei, Zhengchao
AU  - Zhao, Sanyuan
AU  - Song, Hongmei
AU  - Shen, Jianbing
PY  - 2018/7/1
Y1  - 2018/7/1
N2  - Text is a significant tool for human communication, and text recognition in scene images becomes more and more important. In this paper, we propose a residual convolutional recurrent neural network for solving the task of scene text recognition. The general convolutional recurrent neural network (CRNN) is realized by combining convolutional neural network (CNN) with recurrent neural network (RNN). The CNN part extracts features and the RNN part encodes and decodes feature sequences. In order to improve the accuracy rate of scene text recognition based on CRNN, we explore different deeper CNN architectures to get feature descriptors and analyze the corresponding text recognition results. Specifically, VGG and ResNet are introduced to train these different deep models and obtain the encoding information of images. The experimental results on public datasets demonstrate the effectiveness of our method.
AB  - Text is a significant tool for human communication, and text recognition in scene images becomes more and more important. In this paper, we propose a residual convolutional recurrent neural network for solving the task of scene text recognition. The general convolutional recurrent neural network (CRNN) is realized by combining convolutional neural network (CNN) with recurrent neural network (RNN). The CNN part extracts features and the RNN part encodes and decodes feature sequences. In order to improve the accuracy rate of scene text recognition based on CRNN, we explore different deeper CNN architectures to get feature descriptors and analyze the corresponding text recognition results. Specifically, VGG and ResNet are introduced to train these different deep models and obtain the encoding information of images. The experimental results on public datasets demonstrate the effectiveness of our method.
KW  - Convolutional neural network
KW  - Recurrent neural network
KW  - Residual convolutional recurrent neural network
KW  - Residual network
KW  - Scene text recognition
UR  - http://www.scopus.com/inward/record.url?scp=85048570009&partnerID=8YFLogxK
U2  - 10.1007/s00138-018-0942-y
DO  - 10.1007/s00138-018-0942-y
M3  - Article
AN  - SCOPUS:85048570009
SN  - 0932-8092
VL  - 29
SP  - 861
EP  - 871
JO  - Machine Vision and Applications
JF  - Machine Vision and Applications
IS  - 5
ER  -
