Can generative adversarial networks be used to measure the perceptual quality of music?
Training data was taken from Epidemic Sound, and human ratings were collected with an annotation system built on top of Amazon Mechanical Turk.
We propose training a generative adversarial network on a music library, and using its discriminator as a measure of the perceived quality of music. This method is unsupervised, needs no access to degraded material and can be tuned for various domains of music.
Data: audio from Epidemic Sound, and annotations collected through Amazon Mechanical Turk.
Model: WGAN (Wasserstein GAN)
Code: custom annotation frontend
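As a rough illustration of the proposed use of the discriminator, the sketch below builds a small 1-D convolutional WGAN critic in PyTorch and reads its scalar output as a quality score for fixed-length audio excerpts. The framework, architecture, layer sizes, and input length are illustrative assumptions, not the configuration used in this work.

```python
# Minimal sketch (not this work's implementation): a small 1-D convolutional
# WGAN critic that maps a fixed-length audio excerpt to a scalar score.
import torch
import torch.nn as nn

class Critic(nn.Module):
    def __init__(self, n_samples: int = 16384):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, 16, kernel_size=25, stride=4, padding=12),
            nn.LeakyReLU(0.2),
            nn.Conv1d(16, 32, kernel_size=25, stride=4, padding=12),
            nn.LeakyReLU(0.2),
            nn.Conv1d(32, 64, kernel_size=25, stride=4, padding=12),
            nn.LeakyReLU(0.2),
        )
        # Infer the flattened feature size for the final linear layer.
        with torch.no_grad():
            feat = self.net(torch.zeros(1, 1, n_samples))
        self.head = nn.Linear(feat.numel(), 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, 1, n_samples) waveform; returns (batch,) scalar scores.
        h = self.net(x)
        return self.head(h.flatten(start_dim=1)).squeeze(1)

critic = Critic()
excerpts = torch.randn(8, 1, 16384)  # placeholder for real audio waveforms
scores = critic(excerpts)            # higher score = closer to the training library
print(scores.shape)                  # torch.Size([8])
```

Only the scoring step is shown; in the proposed method the critic would first be trained adversarially on the music library, after which its output ranks unseen excerpts by how closely they resemble that library.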
Figure 2: The distribution of discriminator scores for the median human rating of each excerpt.
Finally, the discriminator scores are shown to have a statistically significant correlation with human ratings of the music.
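As an illustration of how such a correlation could be checked, the sketch below computes a Spearman rank correlation between critic scores and human ratings. The choice of test and the synthetic placeholder data are assumptions, not details taken from this work.

```python
# Illustrative only: testing whether critic scores track human ratings.
# The arrays below are synthetic placeholders, not the collected data.
import numpy as np
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
human_ratings = rng.integers(1, 6, size=200)            # e.g. 1-5 ratings per excerpt
critic_scores = human_ratings + rng.normal(0, 1, 200)   # stand-in for discriminator scores

rho, p_value = spearmanr(critic_scores, human_ratings)
print(f"Spearman rho = {rho:.3f}, p = {p_value:.3g}")
```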