Towards end-to-end polyphonic music transcription: Transforming music audio directly to a score

Ralf Gunter Correa Carvalho, Paris Smaragdis

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

Abstract

We present a neural network model that learns to produce music scores directly from audio signals. Instead of employing commonplace processing steps, such as frequency transform front-ends, harmonicity and scale priors, or temporal pitch smoothing, we show that a neural network can learn such steps on its own when presented with the appropriate training data. We show how such a network can perform monophonic transcription with very high accuracy, and how it also generalizes well to transcribing polyphonic music.

Original language: English (US)
Title of host publication: 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2017
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 151-155
Number of pages: 5
ISBN (Electronic): 9781538616321
State: Published - Dec 7 2017
Event: 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2017 - New Paltz, United States
Duration: Oct 15 2017 - Oct 18 2017

Publication series

Name: IEEE Workshop on Applications of Signal Processing to Audio and Acoustics
Volume: 2017-October

Other

Other: 2017 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics, WASPAA 2017
Country: United States
City: New Paltz
Period: 10/15/17 - 10/18/17

Keywords

  • Music transcription
  • deep learning
  • end-to-end systems
  • seq2seq

ASJC Scopus subject areas

  • Electrical and Electronic Engineering
  • Computer Science Applications
