Abstract
Audio signal features are analyzed to address the confusion that arises when distinguishing pure music from mixed speech-music segments. A recognition method based on two features, the average short-time energy and the standard deviation of the zero-crossing rate, is proposed. The method accurately distinguishes pure music segments from mixed speech-music segments and can serve as a pre-processing step that removes the unnecessary (meaningless) parts of an audio signal, thereby improving the efficiency and performance of audio feature extraction. In experiments on a large number of recordings covering different styles, singers, and languages, the average correct recognition rate reached 92.30% for pure music segments and 96.36% for mixed speech-music segments.
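The two features named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the frame length, hop size, the mean-squared definition of energy, and the use of a population standard deviation are all assumptions made for illustration, and the function names are hypothetical. The abstract gives no thresholds or decision rule, so the sketch stops at feature extraction.

```python
import math
import statistics

def frames(signal, frame_len, hop):
    """Split a sequence of samples into fixed-length frames (incomplete tail dropped)."""
    return [signal[i:i + frame_len]
            for i in range(0, len(signal) - frame_len + 1, hop)]

def short_time_energy(frame):
    """Mean squared amplitude of one frame (assumed definition of energy)."""
    return sum(x * x for x in frame) / len(frame)

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def segment_features(signal, frame_len=256, hop=128):
    """Return (average short-time energy, standard deviation of ZCR) for a segment.

    frame_len and hop are illustrative values, not taken from the paper.
    """
    fs = frames(signal, frame_len, hop)
    energies = [short_time_energy(f) for f in fs]
    zcrs = [zero_crossing_rate(f) for f in fs]
    return statistics.fmean(energies), statistics.pstdev(zcrs)

# Synthetic demo signal: a steady 440 Hz tone sampled at 8 kHz (assumed values).
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]
avg_energy, zcr_std = segment_features(tone)
print(avg_energy, zcr_std)
```

A steady tone yields a nearly constant zero-crossing rate across frames, so its standard deviation is small; a segment whose spectral content changes from frame to frame would be expected to give a larger value, which is presumably why the feature helps separate the two classes.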
Original language | English |
---|---|
Pages (from-to) | 63-67 |
Number of pages | 5 |
Journal | Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology |
Volume | 29 |
Issue number | 1 |
Publication status | Published - Jan 2009 |
Keywords
- Audio recognition
- Average short time energy
- Inertia smooth processing
- Standard deviation of zero-crossing rate