A PARALLEL FUSION APPROACH TO PIANO MUSIC TRANSCRIPTION BASED ON CONVOLUTIONAL NEURAL NETWORK
391 - 395
In this paper, a supervised approach based on Convolutional Neural Networks (CNN) for polyphonic piano transcription is presented. The system consists of a pitch detection model, an onset/offset detection model, and a note search model. The pitch detection model is a single-channel CNN that predicts the probabilities of the pitches contained in one frame of the audio. The onset/offset model, based on a dual-channel CNN, estimates the probabilities of each pitch's onset or offset in a frame. The note search model is rule-based; it integrates the outputs of the pitch model and the onset/offset model to determine the final onset, offset and pitch of each note in the audio. Two experiments under different dataset conditions are conducted to compare the system with state-of-the-art approaches on the same datasets. Experimental results reveal that the proposed approach performs better in both frame- and note-based metrics.
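The abstract does not state the rules used by the note search model, so the following is only an illustrative sketch of how frame-level pitch probabilities and onset/offset probabilities might be fused into note events. The function name `search_notes`, the single shared `threshold`, and the specific open/close rules are assumptions made for illustration, not the authors' method.

```python
from typing import List, Tuple

# A note is (pitch_index, onset_frame, offset_frame); offset_frame is exclusive.
Note = Tuple[int, int, int]


def search_notes(
    pitch_prob: List[List[float]],
    onset_prob: List[List[float]],
    offset_prob: List[List[float]],
    threshold: float = 0.5,  # hypothetical value, not taken from the paper
) -> List[Note]:
    """Fuse per-frame probabilities, indexed as [frame][pitch], into notes.

    Assumed rules (illustrative only):
      * a note opens when the onset and pitch probabilities both reach
        the threshold in the same frame;
      * an open note closes when the offset probability reaches the
        threshold or the pitch probability falls below it.
    """
    notes: List[Note] = []
    n_frames = len(pitch_prob)
    n_pitches = len(pitch_prob[0]) if n_frames else 0

    for p in range(n_pitches):
        start = None  # frame at which the currently open note began
        for t in range(n_frames):
            active = pitch_prob[t][p] >= threshold
            if start is None:
                if active and onset_prob[t][p] >= threshold:
                    start = t
            elif offset_prob[t][p] >= threshold or not active:
                notes.append((p, start, t))
                start = None
        if start is not None:
            # The note is still sounding at the end of the excerpt.
            notes.append((p, start, n_frames))
    return notes
```

In this sketch the onset model gates note creation, so a pitch that the pitch model reports as active but that has no detected onset produces no note. Whether the original system makes the same trade-off is not stated in the abstract.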