Novel full reference perceptual quality metric for audio-visual asynchrony

Yao Du Wei^*, Xiang Xie, Jing Ming Kuang, Xin Lu Han

^*Corresponding author for this work

School of Information and Electronics

Beijing Institute of Technology

Research output: Contribution to journal › Article › peer-review

2 Citations (Scopus)

Abstract

A full-reference model was proposed to evaluate the perceptual quality of audiovisual asynchrony. A standard synchronization process was used to determine the time difference between audio and video, and the mapping from this time difference to perceptual quality was derived by co-inertia analysis. The co-inertia analysis extracted the most closely related component from the audio and video features and then formed a mapping for each audiovisual sequence. Audiovisual content was divided into three categories: clean speech, non-speech, and mixed speech. The clean speech category was further split into two subcategories. Audio and video features were chosen separately for each category. Subjective tests showed that the model's predictions agree well with the subjective ratings.
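The abstract does not give the features, weighting, or final quality mapping, so the following is only a minimal sketch of generic co-inertia analysis, assuming per-frame audio and video feature matrices with uniform frame weights. The function names `coinertia_first_axes` and `coinertia_by_lag` are hypothetical, and scanning over candidate lags is an illustration of how co-inertia varies with the audio-video offset, not the paper's synchronization process.

```python
import numpy as np


def coinertia_first_axes(audio, video):
    """Return the first pair of co-inertia axes and their co-inertia.

    audio: (n_frames, p) array of audio features (assumed input layout)
    video: (n_frames, q) array of video features (assumed input layout)
    The axes are the unit vectors u, v that maximise cov(audio @ u, video @ v).
    """
    a = audio - audio.mean(axis=0)          # centre each audio feature
    v = video - video.mean(axis=0)          # centre each video feature
    cross_cov = a.T @ v / a.shape[0]        # (p, q) cross-covariance matrix
    left, sing, right_t = np.linalg.svd(cross_cov, full_matrices=False)
    return left[:, 0], right_t[0], sing[0]  # leading singular triplet


def coinertia_by_lag(audio, video, max_lag):
    """Leading co-inertia for each candidate audio delay, in frames.

    A positive lag pairs audio frame t + lag with video frame t.
    """
    n = audio.shape[0]
    scores = {}
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            a_seg, v_seg = audio[lag:], video[:n - lag]
        else:
            a_seg, v_seg = audio[:n + lag], video[-lag:]
        scores[lag] = coinertia_first_axes(a_seg, v_seg)[2]
    return scores
```

Under these assumptions, the lag with the largest score is the offset at which the two feature streams co-vary most strongly. A perceptual quality score would still need a separate mapping fitted to subjective ratings, which this sketch does not attempt.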

Original language	English
Pages (from-to)	182-190
Number of pages	9
Journal	Tongxin Xuebao/Journal on Communications
Volume	33
Issue number	2
Publication status	Published - Feb 2012

Keywords

Audiovisual quality assessment
Co-inertia analysis
Signal processing technique
Synchrony

Cite this

Wei, Y. D., Xie, X., Kuang, J. M., & Han, X. L. (2012). Novel full reference perceptual quality metric for audio-visual asynchrony. Tongxin Xuebao/Journal on Communications, 33(2), 182-190.

@article{10d7351f1e554363bcbaf681c8a100d7,

title = "Novel full reference perceptual quality metric for audio-visual asynchrony",

abstract = "A full-reference model was proposed to evaluate the perceptual quality of audiovisual asynchrony. A standard synchronization process was used to determine the time difference between audio and video, and the mapping from this time difference to perceptual quality was derived by co-inertia analysis. The co-inertia analysis extracted the most closely related component from the audio and video features and then formed a mapping for each audiovisual sequence. Audiovisual content was divided into three categories: clean speech, non-speech, and mixed speech. The clean speech category was further split into two subcategories. Audio and video features were chosen separately for each category. Subjective tests showed that the model's predictions agree well with the subjective ratings.",

keywords = "Audiovisual quality assessment, Co-inertia analysis, Signal processing technique, Synchrony",

author = "Wei, {Yao Du} and Xiang Xie and Kuang, {Jing Ming} and Han, {Xin Lu}",

year = "2012",

month = feb,

language = "English",

volume = "33",

pages = "182--190",

journal = "Tongxin Xuebao/Journal on Communications",

issn = "1000-436X",

publisher = "Posts \& Telecom Press",

number = "2",

}

TY - JOUR

T1 - Novel full reference perceptual quality metric for audio-visual asynchrony

AU - Wei, Yao Du

AU - Xie, Xiang

AU - Kuang, Jing Ming

AU - Han, Xin Lu

PY - 2012/2

Y1 - 2012/2

AB - A full-reference model was proposed to evaluate the perceptual quality of audiovisual asynchrony. A standard synchronization process was used to determine the time difference between audio and video, and the mapping from this time difference to perceptual quality was derived by co-inertia analysis. The co-inertia analysis extracted the most closely related component from the audio and video features and then formed a mapping for each audiovisual sequence. Audiovisual content was divided into three categories: clean speech, non-speech, and mixed speech. The clean speech category was further split into two subcategories. Audio and video features were chosen separately for each category. Subjective tests showed that the model's predictions agree well with the subjective ratings.

KW - Audiovisual quality assessment

KW - Co-inertia analysis

KW - Signal processing technique

KW - Synchrony

UR - http://www.scopus.com/inward/record.url?scp=84858434566&partnerID=8YFLogxK

M3 - Article

AN - SCOPUS:84858434566

SN - 1000-436X

VL - 33

SP - 182

EP - 190

JO - Tongxin Xuebao/Journal on Communications

JF - Tongxin Xuebao/Journal on Communications

IS - 2

ER -
