Abstract
Video text carries important semantic information and provides precise, meaningful clues for video indexing and retrieval. However, most previous approaches perform video text extraction and recognition separately, and the main difficulty, extracting and recognizing text against complex backgrounds, has not been handled well. In this paper, this difficulty is addressed by combining text extraction and recognition and by using OCR feedback information. The approach has the following features: (i) An efficient character image segmentation method is proposed that exploits the available prior knowledge. (ii) Text extraction is performed on both text-row images and segmented single-character images, since text-row based extraction preserves the color consistency of characters and backgrounds, while a single character has a simpler background. The best binary image is then chosen for recognition using OCR feedback. (iii) The K-means algorithm is used for extraction, which ensures that the best extraction result, namely the binary image that most clearly separates text strokes from background, is among the candidates. Finally, extensive experiments and empirical evaluations on several video text images demonstrate the satisfactory performance of the proposed approach.
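
The extraction and selection steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the feature space, the number of clusters, or how OCR feedback is scored, so the sketch assumes K-means with two clusters on grey-level intensities, and `ocr_confidence` is a hypothetical callback standing in for whatever score an OCR engine returns.

```python
import numpy as np

def kmeans_binarize(gray, k=2, iters=20):
    """Cluster pixel intensities with 1-D K-means; return one boolean mask per cluster."""
    pixels = gray.astype(np.float64).ravel()
    # Assumption: centres start evenly spaced over the intensity range.
    centers = np.linspace(pixels.min(), pixels.max(), k)
    for _ in range(iters):
        # Assign each pixel to its nearest centre.
        labels = np.argmin(np.abs(pixels[:, None] - centers[None, :]), axis=1)
        # Recompute centres; keep the old centre if a cluster is empty.
        new_centers = np.array([
            pixels[labels == j].mean() if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    labels = labels.reshape(gray.shape)
    return [labels == j for j in range(k)]

def pick_best(candidates, ocr_confidence):
    """Return the candidate binary image with the highest (hypothetical) OCR score."""
    scores = [ocr_confidence(c) for c in candidates]
    return candidates[int(np.argmax(scores))]
```

Because text may be brighter or darker than its background, each cluster mask is kept as a candidate and the choice is left to the feedback score. Candidates from text-row and single-character inputs could be pooled into the same list before calling `pick_best`.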
Original language | English |
---|---|
Pages (from-to) | 507-517 |
Number of pages | 11 |
Journal | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Volume | 9314 |
Publication status | Published - 2015 |
Event | 16th Pacific-Rim Conference on Multimedia, PCM 2015 - Gwangju, Republic of Korea (16 Sept 2015 → 18 Sept 2015) |
Keywords
- Character segmentation
- K-means
- OCR feedback
- Text extraction
- Text recognition