Multi-Track Music Generation with WGAN-GP and Attention Mechanisms

Luyu Chen, Lin Shen, Dan Yu, Zhihua Wang, Kun Qian*, Bin Hu*, Björn W. Schuller, Yoshiharu Yamamoto

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Music generation with artificial intelligence is a complex and captivating task. The utilisation of generative adversarial networks (GANs) has exhibited promising outcomes in producing realistic and diverse music compositions. In this paper, we propose a model based on Wasserstein GAN with gradient penalty (WGAN-GP) for multi-track music generation. This model incorporates self-attention and introduces a novel cross-attention mechanism in the generator to enhance its expressive capability. Additionally, we transpose all music to C major in training to ensure data consistency and quality. Experimental results demonstrate that our model can produce multi-track music with enhanced rhythm and sound characteristics, accelerate convergence, and improve generation quality.
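The key-normalisation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each piece is a list of MIDI pitch numbers and that the tonic of its major key is already known. The abstract does not describe how keys are detected, how minor keys are treated, or how the preprocessing is implemented, so the names `PITCH_CLASSES`, `shift_to_c`, and `transpose_to_c` are hypothetical and not taken from the paper.

```python
# Illustrative sketch only: the helper names are hypothetical and the
# paper's actual preprocessing code is not shown in the abstract.
PITCH_CLASSES = {"C": 0, "C#": 1, "D": 2, "D#": 3, "E": 4, "F": 5,
                 "F#": 6, "G": 7, "G#": 8, "A": 9, "A#": 10, "B": 11}


def shift_to_c(tonic: str) -> int:
    """Return the smallest signed semitone shift that moves `tonic` onto C."""
    pc = PITCH_CLASSES[tonic]
    # Shift down by pc semitones, or up by 12 - pc, whichever is shorter.
    return -pc if pc <= 6 else 12 - pc


def transpose_to_c(pitches, tonic):
    """Transpose MIDI pitches from the major key on `tonic` to C major."""
    shift = shift_to_c(tonic)
    result = []
    for p in pitches:
        q = p + shift
        # Assumption: notes pushed outside the MIDI range 0..127 are
        # moved by whole octaves so their pitch class is preserved.
        while q < 0:
            q += 12
        while q > 127:
            q -= 12
        result.append(q)
    return result


# A G major scale starting on G4 (MIDI 67).
g_major_scale = [67, 69, 71, 72, 74, 76, 78, 79]
c_major_scale = transpose_to_c(g_major_scale, "G")
```

Choosing the shortest shift keeps each track close to its original register. Applying the same shift to every pitched track of a piece preserves the harmonic relationships between tracks.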

Original language: English
Title of host publication: GCCE 2023 - 2023 IEEE 12th Global Conference on Consumer Electronics
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 606-607
Number of pages: 2
ISBN (Electronic): 9798350340181
Publication status: Published - 2023
Event: 12th IEEE Global Conference on Consumer Electronics, GCCE 2023 - Nara, Japan
Duration: 10 Oct 2023 to 13 Oct 2023

Publication series

Name: GCCE 2023 - 2023 IEEE 12th Global Conference on Consumer Electronics

Conference

Conference: 12th IEEE Global Conference on Consumer Electronics, GCCE 2023
Country/Territory: Japan
City: Nara
Period: 10/10/23 to 13/10/23
