Text extraction based on maximum-minimum similarity training method

Hui Fu; Xia Bi Liu; Yun De Jia

doi:10.3724/SP.J.1001.2008.00621

Hui Fu^*, Xia Bi Liu, Yun De Jia

^*Corresponding author for this work

School of Computer Science and Technology

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

This paper proposes a maximum-minimum similarity (MMS) training algorithm to optimize the parameters of an effective text extraction method based on Gaussian mixture modeling of neighbor characters. MMS training optimizes recognizer performance by maximizing the similarities of positive samples and minimizing the similarities of negative samples. Based on this approach to discriminative training, the paper defines an objective function for text extraction and uses gradient descent to search for the minimum of the objective function and thus the optimum parameters of the text extraction method. Experimental results show the effectiveness of MMS training in text extraction. Compared with maximum likelihood estimation of the parameters by the expectation maximization (EM) algorithm, MMS training greatly improves text extraction performance, achieving a recall rate of 98.55% and a precision rate of 93.56%. The experimental results also show that MMS training performs better than the commonly used minimum classification error (MCE) discriminative training.
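
The training idea in the abstract (push the similarity of positive samples up, push the similarity of negative samples down, and minimize the resulting objective by gradient descent) can be illustrated with a toy sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the one-dimensional Gaussian-shaped `similarity`, the squared-error form of `objective`, the fixed `sigma`, the learning rate, and the sample values. The paper's actual similarity measure is built on Gaussian mixture models of neighbor characters and may use a different objective.

```python
import math


def similarity(x, mu, sigma):
    """Gaussian-shaped similarity in (0, 1]; a hypothetical stand-in
    for the paper's GMM-based similarity measure."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2))


def objective(mu, positives, negatives, sigma):
    """Assumed objective: small when positives score near 1 and negatives near 0."""
    pos_term = sum((1.0 - similarity(x, mu, sigma)) ** 2 for x in positives) / len(positives)
    neg_term = sum(similarity(x, mu, sigma) ** 2 for x in negatives) / len(negatives)
    return pos_term + neg_term


def gradient(mu, positives, negatives, sigma):
    """Analytic derivative of the assumed objective with respect to mu."""
    def ds_dmu(x):
        # d/dmu exp(-(x - mu)^2 / (2 sigma^2)) = s * (x - mu) / sigma^2
        return similarity(x, mu, sigma) * (x - mu) / sigma ** 2

    g_pos = sum(-2.0 * (1.0 - similarity(x, mu, sigma)) * ds_dmu(x) for x in positives) / len(positives)
    g_neg = sum(2.0 * similarity(x, mu, sigma) * ds_dmu(x) for x in negatives) / len(negatives)
    return g_pos + g_neg


def train(mu, positives, negatives, sigma=1.0, lr=0.1, steps=200):
    """Plain gradient descent on mu, starting from an initial estimate."""
    for _ in range(steps):
        mu -= lr * gradient(mu, positives, negatives, sigma)
    return mu


# Made-up one-dimensional features: positives stand for text samples,
# negatives for non-text samples.
pos = [1.8, 2.0, 2.2, 2.5]
neg = [-1.0, -0.5, 0.0, 0.3]
mu_init = 1.0  # stands in for an initial (e.g. EM-style) estimate
mu_trained = train(mu_init, pos, neg)
print(objective(mu_init, pos, neg, 1.0), objective(mu_trained, pos, neg, 1.0))
```

In this sketch the parameter is moved away from the negative samples as well as toward the positive ones, which is the property that distinguishes discriminative training of this kind from a pure maximum likelihood fit to the positive samples.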

Original language	English
Pages (from-to)	621-629
Number of pages	9
Journal	Ruan Jian Xue Bao/Journal of Software
Volume	19
Issue number	3
DOIs	https://doi.org/10.3724/SP.J.1001.2008.00621
Publication status	Published - Mar 2008

Keywords

Discriminative training
Gaussian mixture modeling
Maximum-minimum similarity training
Minimum classification error training
Text extraction

Access to Document

10.3724/SP.J.1001.2008.00621

Cite this

Fu, H., Liu, X. B., & Jia, Y. D. (2008). Text extraction based on maximum-minimum similarity training method. Ruan Jian Xue Bao/Journal of Software, 19(3), 621-629. https://doi.org/10.3724/SP.J.1001.2008.00621

@article{9421b00565d74339ab38f7431e309d2e,

title = "Text extraction based on maximum-minimum similarity training method",

abstract = "This paper proposes a maximum-minimum similarity (MMS) training algorithm to optimize the parameters of an effective text extraction method based on Gaussian mixture modeling of neighbor characters. MMS training optimizes recognizer performance by maximizing the similarities of positive samples and minimizing the similarities of negative samples. Based on this approach to discriminative training, the paper defines an objective function for text extraction and uses gradient descent to search for the minimum of the objective function and thus the optimum parameters of the text extraction method. Experimental results show the effectiveness of MMS training in text extraction. Compared with maximum likelihood estimation of the parameters by the expectation maximization (EM) algorithm, MMS training greatly improves text extraction performance, achieving a recall rate of 98.55% and a precision rate of 93.56%. The experimental results also show that MMS training performs better than the commonly used minimum classification error (MCE) discriminative training.",

keywords = "Discriminative training, Gaussian mixture modeling, Maximum-minimum similarity training, Minimum classification error training, Text extraction",

author = "Fu, Hui and Liu, {Xia Bi} and Jia, {Yun De}",

year = "2008",

month = mar,

doi = "10.3724/SP.J.1001.2008.00621",

language = "English",

volume = "19",

pages = "621--629",

journal = "Ruan Jian Xue Bao/Journal of Software",

issn = "1000-9825",

publisher = "Chinese Academy of Sciences",

number = "3",

}

TY - JOUR

T1 - Text extraction based on maximum-minimum similarity training method

AU - Fu, Hui

AU - Liu, Xia Bi

AU - Jia, Yun De

PY - 2008/3

Y1 - 2008/3

N2 - This paper proposes a maximum-minimum similarity (MMS) training algorithm to optimize the parameters of an effective text extraction method based on Gaussian mixture modeling of neighbor characters. MMS training optimizes recognizer performance by maximizing the similarities of positive samples and minimizing the similarities of negative samples. Based on this approach to discriminative training, the paper defines an objective function for text extraction and uses gradient descent to search for the minimum of the objective function and thus the optimum parameters of the text extraction method. Experimental results show the effectiveness of MMS training in text extraction. Compared with maximum likelihood estimation of the parameters by the expectation maximization (EM) algorithm, MMS training greatly improves text extraction performance, achieving a recall rate of 98.55% and a precision rate of 93.56%. The experimental results also show that MMS training performs better than the commonly used minimum classification error (MCE) discriminative training.

AB - This paper proposes a maximum-minimum similarity (MMS) training algorithm to optimize the parameters of an effective text extraction method based on Gaussian mixture modeling of neighbor characters. MMS training optimizes recognizer performance by maximizing the similarities of positive samples and minimizing the similarities of negative samples. Based on this approach to discriminative training, the paper defines an objective function for text extraction and uses gradient descent to search for the minimum of the objective function and thus the optimum parameters of the text extraction method. Experimental results show the effectiveness of MMS training in text extraction. Compared with maximum likelihood estimation of the parameters by the expectation maximization (EM) algorithm, MMS training greatly improves text extraction performance, achieving a recall rate of 98.55% and a precision rate of 93.56%. The experimental results also show that MMS training performs better than the commonly used minimum classification error (MCE) discriminative training.

KW - Discriminative training

KW - Gaussian mixture modeling

KW - Maximum-minimum similarity training

KW - Minimum classification error training

KW - Text extraction

UR - http://www.scopus.com/inward/record.url?scp=41949115231&partnerID=8YFLogxK

U2 - 10.3724/SP.J.1001.2008.00621

DO - 10.3724/SP.J.1001.2008.00621

M3 - Article

AN - SCOPUS:41949115231

SN - 1000-9825

VL - 19

SP - 621

EP - 629

JO - Ruan Jian Xue Bao/Journal of Software

JF - Ruan Jian Xue Bao/Journal of Software

IS - 3

ER -
