Musical Transposition Directly from Audio with Deep Recurrent Neural Networks

Parker Carlson, Dr. Patrick Donnelly, Department of Electrical Engineering and Computer Science, Oregon State University, Corvallis, OR, 97331

Music transposition is a relatively unexplored area of audio deep learning with potential applications ranging from music production to medicine. Whereas recent research in the field has focused primarily on speech synthesis and music composition, we explore using deep learning to transpose audio automatically. Our system raises the pitch of a given sound, together with its harmonics, by one octave. Modelling audio with neural networks remains challenging because of both the information density of audio signals, which exceeds 16,000 samples per second, and the need to retain local and global structure (micro- and macro-timing). Current systems that model raw audio are autoregressive and limited to generating audio one sample at a time, which is prohibitively slow. We therefore investigate the use of deep recurrent neural networks to develop a system that runs quickly while retaining the original timbre (tone color) and introducing minimal noise. We take batches of raw audio as input to a neural network designed to learn encodings of pitch and timbre. These encodings are then used to generate audio that has been shifted up in pitch while maintaining the perception of the music’s timbre. Our model remaps monophonic and polyphonic sounds to a different pitch significantly faster than current autoregressive methods. This is a first step in research aimed at improving music perception in cochlear implant listeners by creating personalized sound profiles that better approximate the frequency mappings of their surgical implants. The present research provides a foundation for future clinical trials on the use of deep learning to improve music perception for cochlear implant users.
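To make the transposition target concrete, the sketch below illustrates what "up one octave" means for a sound and its harmonics: every partial's frequency doubles while, in this idealized case, its relative amplitude stays the same. This is not the authors' neural model. The tone, the helper names `octave_up` and `synthesize`, and the use of additive synthesis are assumptions made only for illustration; the 16,000 samples-per-second rate is taken from the abstract.

```python
import math

SAMPLE_RATE = 16000  # samples per second, the rate mentioned in the abstract


def octave_up(partials):
    """Double every partial's frequency; keep its amplitude unchanged."""
    return [(2.0 * freq, amp) for freq, amp in partials]


def synthesize(partials, duration_s=1.0, sample_rate=SAMPLE_RATE):
    """Render a sum of sinusoids, skipping partials at or above Nyquist."""
    nyquist = sample_rate / 2.0
    representable = [(f, a) for f, a in partials if f < nyquist]
    n_samples = int(round(duration_s * sample_rate))
    return [
        sum(a * math.sin(2.0 * math.pi * f * n / sample_rate)
            for f, a in representable)
        for n in range(n_samples)
    ]


# Hypothetical tone: A3 (220 Hz) with two weaker harmonics.
source_partials = [(220.0, 1.0), (440.0, 0.5), (660.0, 0.25)]
target_partials = octave_up(source_partials)

source_audio = synthesize(source_partials)  # what a model would receive
target_audio = synthesize(target_partials)  # the idealized shifted output
```

Even this one-second toy example yields 16,000 samples per signal, which hints at why sample-by-sample autoregressive generation is slow. Real recordings do not come with a list of partials, which is the gap a learned encoding of pitch and timbre is meant to fill.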

Additional Abstract Information

Presenter: Parker Carlson

Institution: Oregon State University

Type: Poster

Subject: Computer Science

Status: Approved

Time and Location

Session: Poster 5
Date/Time: Tue 12:30pm-1:30pm
Session Number: 4046