Recognition of pure music from speech sound-music mixed part of audio signal

Ling Zhi Kong^*, Sen Lin Luo, Bing Zhang, Yao Wei Wang

^*Corresponding author for this work

School of Information and Electronics

Research output: Contribution to journal › Article › peer-review

Abstract

Pure music and mixed speech-music parts of an audio signal are easily confused during recognition. To address this problem, the features of the audio signal are analyzed and a method is proposed that distinguishes the two kinds of part using the average short-time energy and the standard deviation of the zero-crossing rate. The method recognizes both pure music and mixed speech-music parts accurately, so it can serve as a pre-processing step that removes the unnecessary (meaningless) parts of the audio signal and thereby improves the efficiency and performance of audio feature extraction. In experiments on a large amount of audio covering different styles, singers, and languages, the average correct recognition rate reached 92.30% for pure music parts and 96.36% for mixed speech-music parts.
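The two features named in the abstract can be illustrated with a short sketch. This is not the authors' implementation: the frame length, hop size, thresholds, and the decision rule in `classify_segment` are all assumptions chosen for illustration, and the function names are hypothetical. The sketch only shows how the average short-time energy and the standard deviation of the zero-crossing rate could be computed for one audio segment with NumPy.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping frames (the ragged tail is dropped)."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def segment_features(x, frame_len=512, hop=256):
    """Return (average short-time energy, std of zero-crossing rate) of a segment.

    frame_len and hop are illustrative defaults, not values from the paper.
    """
    frames = frame_signal(np.asarray(x, dtype=float), frame_len, hop)
    energy = np.mean(frames ** 2, axis=1)  # short-time energy of each frame
    signs = np.signbit(frames)
    # Fraction of adjacent sample pairs whose sign differs, per frame.
    zcr = np.mean(signs[:, 1:] != signs[:, :-1], axis=1)
    return float(np.mean(energy)), float(np.std(zcr))

def classify_segment(x, energy_thresh, zcr_std_thresh):
    """Hypothetical threshold rule; thresholds and directions are assumptions."""
    avg_energy, zcr_std = segment_features(x)
    # Assumption: a voice over the music makes the zero-crossing rate fluctuate
    # more from frame to frame and raises the average energy.
    is_mixed = zcr_std > zcr_std_thresh and avg_energy > energy_thresh
    return "mixed" if is_mixed else "pure_music"
```

In practice the thresholds would have to be tuned on labeled audio; the abstract does not state how the authors combine the two features.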

Original language	English
Pages (from-to)	63-67
Number of pages	5
Journal	Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology
Volume	29
Issue number	1
Publication status	Published - Jan 2009

Keywords

Audio recognition
Average short time energy
Inertia smooth processing
Standard deviation of zero-crossing rate
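
The record lists "inertia smooth processing" as a keyword, but the abstract does not describe that step. One plausible reading, offered purely as an assumption, is a smoothing pass over per-segment labels that suppresses isolated short runs, so that a brief misclassification does not split a long stretch of one class. The sketch below uses a hypothetical function name and a hypothetical `min_run` parameter to illustrate that idea; it should not be taken as the authors' procedure.

```python
def smooth_labels(labels, min_run=3):
    """Absorb runs shorter than min_run into the preceding run.

    Hypothetical smoothing rule; min_run is an illustrative parameter.
    """
    smoothed = list(labels)
    i = 0
    while i < len(smoothed):
        # Find the end of the run of identical labels that starts at i.
        j = i
        while j < len(smoothed) and smoothed[j] == smoothed[i]:
            j += 1
        # A short run that has a predecessor inherits the predecessor's label.
        if j - i < min_run and i > 0:
            smoothed[i:j] = [smoothed[i - 1]] * (j - i)
        i = j
    return smoothed
```

For example, with labels `M` for pure music and `X` for mixed, a single `X` in the middle of a long run of `M` would be relabeled `M`, while a run of four `X` would be kept.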

Cite this

@article{e033876d771b40ea9eb115d47afa1426,

title = "Recognition of pure music from speech sound-music mixed part of audio signal",

abstract = "Pure music and mixed speech-music parts of an audio signal are easily confused during recognition. To address this problem, the features of the audio signal are analyzed and a method is proposed that distinguishes the two kinds of part using the average short-time energy and the standard deviation of the zero-crossing rate. The method recognizes both pure music and mixed speech-music parts accurately, so it can serve as a pre-processing step that removes the unnecessary (meaningless) parts of the audio signal and thereby improves the efficiency and performance of audio feature extraction. In experiments on a large amount of audio covering different styles, singers, and languages, the average correct recognition rate reached 92.30\% for pure music parts and 96.36\% for mixed speech-music parts.",

keywords = "Audio recognition, Average short time energy, Inertia smooth processing, Standard deviation of zero-crossing rate",

author = "Kong, {Ling Zhi} and Luo, {Sen Lin} and Zhang, Bing and Wang, {Yao Wei}",

year = "2009",

month = jan,

language = "English",

volume = "29",

pages = "63--67",

journal = "Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology",

issn = "1001-0645",

publisher = "Beijing Institute of Technology",

number = "1",

}

TY - JOUR

T1 - Recognition of pure music from speech sound-music mixed part of audio signal

AU - Kong, Ling Zhi

AU - Luo, Sen Lin

AU - Zhang, Bing

AU - Wang, Yao Wei

PY - 2009/1

Y1 - 2009/1

N2 - Pure music and mixed speech-music parts of an audio signal are easily confused during recognition. To address this problem, the features of the audio signal are analyzed and a method is proposed that distinguishes the two kinds of part using the average short-time energy and the standard deviation of the zero-crossing rate. The method recognizes both pure music and mixed speech-music parts accurately, so it can serve as a pre-processing step that removes the unnecessary (meaningless) parts of the audio signal and thereby improves the efficiency and performance of audio feature extraction. In experiments on a large amount of audio covering different styles, singers, and languages, the average correct recognition rate reached 92.30% for pure music parts and 96.36% for mixed speech-music parts.

KW - Audio recognition

KW - Average short time energy

KW - Inertia smooth processing

KW - Standard deviation of zero-crossing rate

UR - http://www.scopus.com/inward/record.url?scp=61649116950&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:61649116950

SN - 1001-0645

VL - 29

SP - 63

EP - 67

JO - Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology

JF - Beijing Ligong Daxue Xuebao/Transaction of Beijing Institute of Technology

IS - 1

ER -
