Edge-pixels clustering for text area extraction

Hui Fu; Xiabi Liu; Yunde Jia

Hui Fu*, Xiabi Liu, Yunde Jia

*Corresponding author for this work

School of Computer Science and Technology

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

3 Citations (Scopus)

Abstract

An approach based on edge-pixel clustering is proposed for extracting Chinese and English text areas from an image. The image is first segmented into pixel subclasses according to the colors and positions of its edge pixels. Initial text areas are then extracted according to the edge features characteristic of text areas, and their boundaries are expanded to cover the entire text areas. In addition, a text-area binarization algorithm is presented that improves post-processing efficiency by reducing the number of binary images when the text color polarity is unknown. Experimental results show that the proposed approach is effective, with integrality up to 99%.
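The abstract does not give the details of the clustering step, so the following is only a minimal sketch of the general idea of grouping edge pixels by color and position. It is not the authors' algorithm. The greedy "leader" clustering strategy, the function names `cluster_edge_pixels` and `bounding_box`, and the thresholds `color_tol` and `dist_tol` are all assumptions made for illustration.

```python
from math import sqrt


def cluster_edge_pixels(edge_pixels, color_tol=40.0, dist_tol=20.0):
    """Greedily group edge pixels by color and position (illustrative only).

    edge_pixels: list of (x, y, (r, g, b)) tuples for pixels already
    marked as edges by some edge detector (not shown here).
    A pixel joins the first cluster whose running mean color lies within
    color_tol and whose running mean position lies within dist_tol;
    otherwise it starts a new cluster. Both thresholds are hypothetical.
    """
    clusters = []
    for x, y, rgb in edge_pixels:
        for c in clusters:
            color_d = sqrt(sum((a - b) ** 2 for a, b in zip(rgb, c["rgb"])))
            pos_d = sqrt((x - c["x"]) ** 2 + (y - c["y"]) ** 2)
            if color_d <= color_tol and pos_d <= dist_tol:
                n = c["n"]
                # Update the running means of position and color.
                c["x"] = (c["x"] * n + x) / (n + 1)
                c["y"] = (c["y"] * n + y) / (n + 1)
                c["rgb"] = tuple(
                    (m * n + v) / (n + 1) for m, v in zip(c["rgb"], rgb)
                )
                c["n"] = n + 1
                c["members"].append((x, y))
                break
        else:
            # No existing cluster is close enough: start a new one.
            clusters.append({
                "n": 1,
                "x": float(x),
                "y": float(y),
                "rgb": tuple(float(v) for v in rgb),
                "members": [(x, y)],
            })
    return clusters


def bounding_box(cluster):
    """Return (x_min, y_min, x_max, y_max) of a cluster's member pixels."""
    xs = [p[0] for p in cluster["members"]]
    ys = [p[1] for p in cluster["members"]]
    return min(xs), min(ys), max(xs), max(ys)


if __name__ == "__main__":
    white, red = (255, 255, 255), (255, 0, 0)
    pixels = [
        (10, 10, white), (12, 10, white), (14, 11, white),
        (100, 50, red), (102, 51, red),
        (200, 200, white),
    ]
    for c in cluster_edge_pixels(pixels):
        print(c["n"], bounding_box(c))
```

In this sketch, each cluster's bounding box could serve as a candidate region for the later steps the abstract mentions (initial text area extraction and boundary expansion), which are not shown here.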

Original language	English
Pages (from-to)	729-734
Number of pages	6
Journal	Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics
Volume	18
Issue number	5
Publication status	Published - May 2006

Keywords

Clustering
Image binary
Image retrieval
Optical character recognition (OCR)
Text area extraction

Cite this

@article{bdc9230317574873bef0c84ebd424638,

title = "Edge-pixels clustering for text area extraction",

abstract = "An approach based on edge-pixels clustering to extract Chinese and English text areas from an image is proposed. The image is segmented into pixel-subclasses based on the colors and positions of edge-pixels. And then the initial text areas are extracted according to the features of edges in text area. The boundaries of the initial text areas are expanded for the entire text areas. Furthermore, an algorithm of text area binarization is presented to improve the efficiency of post-processing by reducing the number of binary images when the text color polarity is unknown. The experimental results show that the proposed approach is effective with integrality up to 99%.",

keywords = "Clustering, Image binary, Image retrieval, Optical character recognition (OCR), Text area extraction",

author = "Hui Fu and Xiabi Liu and Yunde Jia",

year = "2006",

month = may,

language = "English",

volume = "18",

pages = "729--734",

journal = "Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics",

issn = "1003-9775",

publisher = "Institute of Computing Technology",

number = "5",

}

TY - JOUR

T1 - Edge-pixels clustering for text area extraction

AU - Fu, Hui

AU - Liu, Xiabi

AU - Jia, Yunde

PY - 2006/5

Y1 - 2006/5

N2 - An approach based on edge-pixels clustering to extract Chinese and English text areas from an image is proposed. The image is segmented into pixel-subclasses based on the colors and positions of edge-pixels. And then the initial text areas are extracted according to the features of edges in text area. The boundaries of the initial text areas are expanded for the entire text areas. Furthermore, an algorithm of text area binarization is presented to improve the efficiency of post-processing by reducing the number of binary images when the text color polarity is unknown. The experimental results show that the proposed approach is effective with integrality up to 99%.

AB - An approach based on edge-pixels clustering to extract Chinese and English text areas from an image is proposed. The image is segmented into pixel-subclasses based on the colors and positions of edge-pixels. And then the initial text areas are extracted according to the features of edges in text area. The boundaries of the initial text areas are expanded for the entire text areas. Furthermore, an algorithm of text area binarization is presented to improve the efficiency of post-processing by reducing the number of binary images when the text color polarity is unknown. The experimental results show that the proposed approach is effective with integrality up to 99%.

KW - Clustering

KW - Image binary

KW - Image retrieval

KW - Optical character recognition (OCR)

KW - Text area extraction

UR - http://www.scopus.com/inward/record.url?scp=33744777150&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:33744777150

SN - 1003-9775

VL - 18

SP - 729

EP - 734

JO - Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics

JF - Jisuanji Fuzhu Sheji Yu Tuxingxue Xuebao/Journal of Computer-Aided Design and Computer Graphics

IS - 5

ER -
