
Study on Chinese error checking

  • Beijing Institute of Technology

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

This paper discusses word-level error checking in Chinese. Word segmentation proceeds in two steps. First, a longest-match algorithm with forward heuristics and reverse backtracking, together with a recursive segmentation algorithm over left and right sub-segments, divides the text into many small loose strings. Second, a forward longest-matching algorithm merges the loose strings backward as far as possible; the resulting loose strings are the basis for the later error-checking operation. For error detection, an algorithm based on a similar-pronunciation strategy is introduced, which uses a large-scale lexicon (340 million) as the basis for data analysis. An error-checking algorithm based on similar shape, which draws on a similar-character table, a Wubi repeat-code table, and a Zhengma repeat-code table, is then introduced to check character errors. Experiments show satisfactory results.
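The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses plain forward longest matching only (no reverse backtracking or recursive left/right sub-segmentation), and the lexicon `LEXICON`, the homophone table `HOMOPHONES`, the `max_len` limit, and all function names are hypothetical stand-ins chosen for illustration. The idea shown is that characters left unmatched by segmentation form loose strings, and a loose string that becomes a lexicon word after a same-pronunciation substitution yields a correction candidate.

```python
# Toy data, assumed for illustration only (not from the paper).
LEXICON = {"我们", "今天", "天气", "很好"}
HOMOPHONES = {"汽": ["气", "器"]}  # characters sharing the reading "qi"


def forward_max_match(text, lexicon, max_len=4):
    """Greedy forward longest matching over a word lexicon."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest window first and shrink down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in lexicon:
                tokens.append(piece)
                i += size
                break
    return tokens


def loose_strings(tokens, lexicon):
    """Join runs of two or more adjacent unmatched single characters."""
    spans, run = [], ""
    for tok in tokens:
        if len(tok) == 1 and tok not in lexicon:
            run += tok
            continue
        if len(run) > 1:
            spans.append(run)
        run = ""
    if len(run) > 1:
        spans.append(run)
    return spans


def homophone_suggestions(span, lexicon, homophones):
    """Substitute one character at a time with a homophone; keep lexicon hits."""
    suggestions = []
    for i, ch in enumerate(span):
        for alt in homophones.get(ch, []):
            candidate = span[:i] + alt + span[i + 1:]
            if candidate in lexicon and candidate not in suggestions:
                suggestions.append(candidate)
    return suggestions


tokens = forward_max_match("我们今天天汽很好", LEXICON)
for span in loose_strings(tokens, LEXICON):
    print(span, homophone_suggestions(span, LEXICON, HOMOPHONES))
```

A similar-shape check could reuse `homophone_suggestions` unchanged by passing a table of visually similar characters, or of characters that share a Wubi or Zhengma code, in place of `HOMOPHONES`.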

Original language: English
Title of host publication: Advances in Computer Science and Education
Pages: 147-154
Number of pages: 8
DOI
Publication status: Published - 2012
Event: 2011 International Conference on Computer Science and Education, CSE 2011 - Wuhan, China
Duration: 26 Nov 2011 to 27 Nov 2011

Publication series

Name: Advances in Intelligent and Soft Computing
Volume: 140 AISC
ISSN (Print): 1867-5662

Conference

Conference: 2011 International Conference on Computer Science and Education, CSE 2011
Country/Territory: China
City: Wuhan
Period: 26/11/11 to 27/11/11
