Text extraction based on maximum-minimum similarity training method

Hui Fu*, Xia Bi Liu, Yun De Jia

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

This paper proposes a maximum-minimum similarity (MMS) training algorithm to optimize the parameters of an effective text extraction method based on Gaussian mixture modeling of neighboring characters. MMS training optimizes recognizer performance by maximizing the similarities of positive samples and minimizing the similarities of negative samples. Based on this approach to discriminative training, the paper defines an objective function for text extraction and uses gradient descent to search for the minimum of that function, and thus for the optimum parameters of the text extraction method. Experimental results demonstrate the effectiveness of MMS training for text extraction. Compared with maximum likelihood estimation of the parameters by the expectation-maximization (EM) algorithm, MMS training greatly improves text extraction performance, achieving a recall rate of 98.55% and a precision rate of 93.56%. The results also show that MMS training performs better than the commonly used minimum classification error (MCE) discriminative training.
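
The abstract does not give the objective function itself, so the following is only a minimal sketch of the general idea (push similarities of positive samples toward 1 and those of negative samples toward 0, then minimize by gradient descent). Everything specific here is an assumption made for illustration: the one-dimensional single-Gaussian similarity, the squared-error form of the objective, the toy data, and the names `similarity`, `objective`, `gradient`, and `train`. It is not the paper's Gaussian mixture model or its actual objective.

```python
import math

def similarity(x, mu, log_sigma):
    """Gaussian-shaped similarity in (0, 1]; equals 1 when x == mu (assumed form)."""
    sigma = math.exp(log_sigma)
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2)

def objective(params, positives, negatives):
    """Assumed objective: sum of (1 - s)^2 over positives plus s^2 over negatives."""
    mu, log_sigma = params
    pos = sum((1.0 - similarity(x, mu, log_sigma)) ** 2 for x in positives)
    neg = sum(similarity(x, mu, log_sigma) ** 2 for x in negatives)
    return pos + neg

def gradient(params, positives, negatives):
    """Analytic gradient of the assumed objective w.r.t. mu and log_sigma."""
    mu, log_sigma = params
    sigma = math.exp(log_sigma)
    g_mu, g_ls = 0.0, 0.0
    labelled = [(x, True) for x in positives] + [(x, False) for x in negatives]
    for x, is_positive in labelled:
        s = similarity(x, mu, log_sigma)
        z = (x - mu) / sigma
        ds_dmu = s * z / sigma   # derivative of s with respect to mu
        ds_dls = s * z * z       # derivative of s with respect to log_sigma
        # Positives contribute (1 - s)^2, negatives contribute s^2.
        df_ds = -2.0 * (1.0 - s) if is_positive else 2.0 * s
        g_mu += df_ds * ds_dmu
        g_ls += df_ds * ds_dls
    return g_mu, g_ls

def train(positives, negatives, mu=0.0, log_sigma=0.0, lr=0.05, steps=2000):
    """Plain fixed-step gradient descent on the assumed objective."""
    params = (mu, log_sigma)
    for _ in range(steps):
        g_mu, g_ls = gradient(params, positives, negatives)
        params = (params[0] - lr * g_mu, params[1] - lr * g_ls)
    return params

# Toy one-dimensional features (invented for illustration only).
POSITIVES = [1.8, 2.0, 2.2, 2.1]
NEGATIVES = [-1.0, 5.0, 4.5]

initial = (1.0, 0.0)
trained = train(POSITIVES, NEGATIVES, mu=initial[0], log_sigma=initial[1])
print("objective before:", objective(initial, POSITIVES, NEGATIVES))
print("objective after: ", objective(trained, POSITIVES, NEGATIVES))
```

Optimizing `log_sigma` rather than `sigma` keeps the width positive without a constraint. In a setting like the one the abstract describes, the starting parameters would plausibly come from an EM fit, with discriminative training refining them afterwards; that ordering is also an assumption here.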

Original language: English
Pages (from-to): 621-629
Number of pages: 9
Journal: Ruan Jian Xue Bao/Journal of Software
Volume: 19
Issue number: 3
Publication status: Published - Mar 2008

Keywords

  • Discriminative training
  • Gaussian mixture modeling
  • Maximum-minimum similarity training
  • Minimum classification error training
  • Text extraction
