Detecting both superimposed and scene text with multiple languages and multiple alignments in video

Xiaodong Huang; Huadong Ma; Charles X. Ling; Guangyu Gao

doi:10.1007/s11042-012-1201-2

Corresponding author: Huadong Ma

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

Video text often contains highly useful semantic information that can contribute significantly to video retrieval and understanding. Video text can be classified into scene text and superimposed text. Most previous methods detect superimposed or scene text separately because of their different text alignments. Moreover, because characters of different languages have different edge and texture features, it is very difficult to detect multilingual text. In this paper, we first perform a detailed analysis of the motion patterns of video text and show that superimposed and scene text exhibit different motion patterns on consecutive frames, and that these patterns are insensitive to multiple language characters and multiple text alignments. Based on our analysis, we define the Motion Perception Field (MPF) to represent the text motion patterns. Finally, we propose text detection algorithms using MPF for both superimposed and scene text with multiple languages and multiple alignments. Experimental results on diverse videos demonstrate that our algorithms are robust and outperform previous methods for detecting both superimposed and scene text with multiple languages and multiple alignments.

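Original language	English
Pages (from-to)	1703-1727
Number of pages	25
Journal	Multimedia Tools and Applications
Volume	70
Issue number	3
DOI	https://doi.org/10.1007/s11042-012-1201-2
Publication status	Published - Jun 2014
Externally published	Yes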

Cite this

Huang, X., Ma, H., Ling, C. X., & Gao, G. (2014). Detecting both superimposed and scene text with multiple languages and multiple alignments in video. Multimedia Tools and Applications, 70(3), 1703-1727. https://doi.org/10.1007/s11042-012-1201-2

@article{a54fd23aa385452ba371c1906d6a22d3,

title = "Detecting both superimposed and scene text with multiple languages and multiple alignments in video",

abstract = "Video text often contains highly useful semantic information that can contribute significantly to video retrieval and understanding. Video text can be classified into scene text and superimposed text. Most previous methods detect superimposed or scene text separately because of their different text alignments. Moreover, because characters of different languages have different edge and texture features, it is very difficult to detect multilingual text. In this paper, we first perform a detailed analysis of the motion patterns of video text and show that superimposed and scene text exhibit different motion patterns on consecutive frames, and that these patterns are insensitive to multiple language characters and multiple text alignments. Based on our analysis, we define the Motion Perception Field (MPF) to represent the text motion patterns. Finally, we propose text detection algorithms using MPF for both superimposed and scene text with multiple languages and multiple alignments. Experimental results on diverse videos demonstrate that our algorithms are robust and outperform previous methods for detecting both superimposed and scene text with multiple languages and multiple alignments.",

keywords = "Motion field, Scene text, Superimposed text, Text detection",

author = "Xiaodong Huang and Huadong Ma and Ling, {Charles X.} and Guangyu Gao",

year = "2014",

month = jun,

doi = "10.1007/s11042-012-1201-2",

language = "English",

volume = "70",

pages = "1703--1727",

journal = "Multimedia Tools and Applications",

issn = "1380-7501",

publisher = "Springer",

number = "3",

}

TY - JOUR

T1 - Detecting both superimposed and scene text with multiple languages and multiple alignments in video

AU - Huang, Xiaodong

AU - Ma, Huadong

AU - Ling, Charles X.

AU - Gao, Guangyu

PY - 2014/6

Y1 - 2014/6

N2 - Video text often contains highly useful semantic information that can contribute significantly to video retrieval and understanding. Video text can be classified into scene text and superimposed text. Most previous methods detect superimposed or scene text separately because of their different text alignments. Moreover, because characters of different languages have different edge and texture features, it is very difficult to detect multilingual text. In this paper, we first perform a detailed analysis of the motion patterns of video text and show that superimposed and scene text exhibit different motion patterns on consecutive frames, and that these patterns are insensitive to multiple language characters and multiple text alignments. Based on our analysis, we define the Motion Perception Field (MPF) to represent the text motion patterns. Finally, we propose text detection algorithms using MPF for both superimposed and scene text with multiple languages and multiple alignments. Experimental results on diverse videos demonstrate that our algorithms are robust and outperform previous methods for detecting both superimposed and scene text with multiple languages and multiple alignments.

AB - Video text often contains highly useful semantic information that can contribute significantly to video retrieval and understanding. Video text can be classified into scene text and superimposed text. Most previous methods detect superimposed or scene text separately because of their different text alignments. Moreover, because characters of different languages have different edge and texture features, it is very difficult to detect multilingual text. In this paper, we first perform a detailed analysis of the motion patterns of video text and show that superimposed and scene text exhibit different motion patterns on consecutive frames, and that these patterns are insensitive to multiple language characters and multiple text alignments. Based on our analysis, we define the Motion Perception Field (MPF) to represent the text motion patterns. Finally, we propose text detection algorithms using MPF for both superimposed and scene text with multiple languages and multiple alignments. Experimental results on diverse videos demonstrate that our algorithms are robust and outperform previous methods for detecting both superimposed and scene text with multiple languages and multiple alignments.

KW - Motion field

KW - Scene text

KW - Superimposed text

KW - Text detection

UR - http://www.scopus.com/inward/record.url?scp=84905566970&partnerID=8YFLogxK

U2 - 10.1007/s11042-012-1201-2

DO - 10.1007/s11042-012-1201-2

M3 - Article

AN - SCOPUS:84905566970

SN - 1380-7501

VL - 70

SP - 1703

EP - 1727

JO - Multimedia Tools and Applications

JF - Multimedia Tools and Applications

IS - 3

ER -
