Room impulse response calculation model for virtual reality scenarios (面向虚拟现实场景的房间脉冲响应计算模型)

Zhiyu Li; Jing Wang; Xinwen Yue; Lidong Yang; Shenghui Zhao; Xiang Xie

doi:10.12395/0371-0025.2024150

Zhiyu Li, Jing Wang^*, Xinwen Yue, Lidong Yang, Shenghui Zhao, Xiang Xie

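^*Corresponding author for this work

School of Information and Electronics

Research output: Contribution to journal › Article › peer-review

Abstract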

This study proposes a room impulse response (RIR) computation model tailored for virtual reality applications, integrating deep learning neural network techniques with psychoacoustic perception parameters. This model can efficiently predict perceptually meaningful RIRs from virtual reality scene data while ensuring high-quality predictions. It meets the requirements for real-time generation, high sampling rate, unrestricted length, and lightweight implementation in virtual reality audio scenarios. The model first encodes the acoustic information from the scene using a graph convolutional neural network, then decodes this information through a neural sound field and transposed convolution model to obtain the RIR perception parameters. Finally, the RIR signal is reconstructed from these parameters. Experimental results demonstrate that the proposed model offers significant advantages in RIR generation quality, computational efficiency, and functionality, making it well-suited to meet the real-time RIR generation needs of virtual reality audio.
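The final step described above, rebuilding an RIR signal from a small set of perceptual parameters, can be illustrated with a minimal sketch. The abstract does not list the parameters the model predicts or how it reconstructs the signal, so everything here is an assumption: the sketch uses a single reverberation time (RT60) and the classic decaying-noise model, and the names `synthesize_rir` and `envelope_db` are hypothetical. It shows only why a parametric description can be rendered at any sampling rate and any length.

```python
import math
import random


def envelope_db(rt60_s, t_s):
    """Level of the decay envelope in dB at time t_s (reaches -60 dB at t_s == rt60_s)."""
    return -60.0 * t_s / rt60_s


def synthesize_rir(rt60_s, sample_rate=48000, length_s=None, seed=0):
    """Illustrative RIR: Gaussian noise shaped by an exponential decay.

    Assumption: RT60 stands in for the model's perceptual parameters,
    which the abstract does not enumerate.
    """
    if rt60_s <= 0:
        raise ValueError("rt60_s must be positive")
    if length_s is None:
        length_s = rt60_s  # default: render until the envelope hits -60 dB
    n = int(round(length_s * sample_rate))
    rng = random.Random(seed)  # seeded for reproducible output
    # Amplitude envelope 10^(-3 t / RT60), i.e. -60 dB at t = RT60.
    decay = 3.0 * math.log(10.0) / rt60_s
    return [rng.gauss(0.0, 1.0) * math.exp(-decay * i / sample_rate)
            for i in range(n)]


if __name__ == "__main__":
    rir = synthesize_rir(rt60_s=0.5, sample_rate=8000)
    print(len(rir), envelope_db(0.5, 0.25))
```

Because the sample rate and length are arguments of the renderer rather than properties of a fixed-size network output, the same parameter set can be rendered at whatever resolution the audio engine needs.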

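Translated title of the contribution	Room impulse response calculation model for virtual reality scenarios
Original language	Traditional Chinese
Pages (from-to)	1186-1196
Number of pages	11
Journal	Shengxue Xuebao/Acta Acustica
Volume	49
Issue number	6
DOI	https://doi.org/10.12395/0371-0025.2024150
Publication status	Published - Nov 2024

Keywords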

Deep learning
Perceptual parameter
Room impulse response
Virtual reality

Cite this

@article{e680ce99bd024e36aba0b62c828f63b4,

title = "面向虚拟现实场景的房间脉冲响应计算模型",

keywords = "Deep learning, Perceptual parameter, Room impulse response, Virtual reality",

author = "Zhiyu Li and Jing Wang and Xinwen Yue and Lidong Yang and Shenghui Zhao and Xiang Xie",

year = "2024",

month = nov,

doi = "10.12395/0371-0025.2024150",

language = "繁体中文",

volume = "49",

pages = "1186--1196",

journal = "Shengxue Xuebao/Acta Acustica",

issn = "0371-0025",

publisher = "Science China Press",

number = "6",

}

TY - JOUR

T1 - 面向虚拟现实场景的房间脉冲响应计算模型

AU - Li, Zhiyu

AU - Wang, Jing

AU - Yue, Xinwen

AU - Yang, Lidong

AU - Zhao, Shenghui

AU - Xie, Xiang

PY - 2024/11

Y1 - 2024/11

KW - Deep learning

KW - Perceptual parameter

KW - Room impulse response

KW - Virtual reality

UR - http://www.scopus.com/inward/record.url?scp=85211001424&partnerID=8YFLogxK

U2 - 10.12395/0371-0025.2024150

DO - 10.12395/0371-0025.2024150

M3 - 文章

AN - SCOPUS:85211001424

SN - 0371-0025

VL - 49

SP - 1186

EP - 1196

JO - Shengxue Xuebao/Acta Acustica

JF - Shengxue Xuebao/Acta Acustica

IS - 6

ER -