TY - JOUR
T1 - Prompt-guided Precise Audio Editing with Diffusion Models
AU - Xu, Manjie
AU - Li, Chenxing
AU - Zhang, Duzhen
AU - Su, Dan
AU - Liang, Wei
AU - Yu, Dong
N1 - Publisher Copyright: Copyright 2024 by the author(s)
PY - 2024
Y1 - 2024
AB - Audio editing involves the arbitrary manipulation of audio content through precise control. Although text-guided diffusion models have made significant advancements in text-to-audio generation, they still face challenges in finding a flexible and precise way to modify target events within an audio track. We present a novel approach, referred to as Prompt-guided Precise Audio Editing (PPAE), which serves as a general module for diffusion models and enables precise audio editing. The editing is based on the input textual prompt only and is entirely training-free. We exploit the cross-attention maps of diffusion models to facilitate accurate local editing and employ a hierarchical local-global pipeline to ensure a smoother editing process. Experimental results highlight the effectiveness of our method in various editing tasks.
UR - http://www.scopus.com/inward/record.url?scp=85203794114&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85203794114
SN - 2640-3498
VL - 235
SP - 55126
EP - 55143
JO - Proceedings of Machine Learning Research
JF - Proceedings of Machine Learning Research
T2 - 41st International Conference on Machine Learning, ICML 2024
Y2 - 21 July 2024 through 27 July 2024
ER -