Abstract
This paper proposes a bi-gram model based on dynamic programming for Chinese person named entity recognition (NER). From a study of previous work, we concluded that the precision of NER can be improved by first raising the recall rate and then narrowing the gap between the recall rate and the precision rate. The algorithm defines five recognition rules that ensure candidate names are recognized and returned first, which raises the recall rate. The paper's innovation is a filtering stage that removes invalid names by combining inverse-maximum-matching with the bi-gram model. When segmenting a sentence, the bi-gram model takes four pairs of transition probabilities into consideration, which effectively narrows the gap between the precision rate and the recall rate. In open tests on different corpora and on material extracted directly from the Internet, the method achieves a precision rate of 83.53 %, a recall rate of 91.43 %, and an F-value of 87.3 %.
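The reported figures can be cross-checked with a short calculation. The sketch below assumes the F-value is the balanced F-measure (the harmonic mean of precision and recall, i.e. beta = 1), which the abstract does not state explicitly; the function name `f_value` and its `beta` parameter are illustrative and do not come from the paper.

```python
def f_value(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-measure for a precision/recall pair.

    Assumption: beta = 1 gives the balanced F-measure (harmonic mean),
    which is what the abstract's "F-value" is taken to mean here.
    """
    if precision <= 0.0 or recall <= 0.0:
        # The harmonic mean is undefined at zero; report 0 by convention.
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)


# Figures reported in the abstract.
precision = 0.8353
recall = 0.9143

f1 = f_value(precision, recall)
print(f"F-value: {f1:.1%}")  # prints "F-value: 87.3%"
```

Under this assumption the computed value agrees with the reported 87.3 %, so the three figures are mutually consistent.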
| Original language | English |
|---|---|
| Pages (from-to) | 441-451 |
| Number of pages | 11 |
| Journal | Advances in Intelligent Systems and Computing |
| Volume | 214 |
| Publication status | Published - 2014 |
| Event | 7th International Conference on Intelligent Systems and Knowledge Engineering, ISKE 2012, Beijing, China. Duration: 15 Dec 2012 → 17 Dec 2012 |
Keywords
- Bi-gram
- Chinese person named recognition
- Dynamic programming
- Inverse-maximum-matching
- Named entity recognition