dc.contributor.author: Steinmetz, C
dc.contributor.author: Singh, S
dc.contributor.author: Comunità, M
dc.contributor.author: Ibnyahya, I
dc.contributor.author: Yuan, S
dc.contributor.author: Benetos, E
dc.contributor.author: Reiss, J
dc.contributor.author: 25th International Society for Music Information Retrieval Conference (ISMIR)
dc.date.accessioned: 2024-08-02T10:04:03Z
dc.date.available: 2024-06-28
dc.date.available: 2024-08-02T10:04:03Z
dc.date.issued: 2024-11-10
dc.identifier.uri: https://qmro.qmul.ac.uk/xmlui/handle/123456789/98593
dc.description.abstract: Audio production style transfer is the task of processing an input to impart stylistic elements from a reference recording. Existing approaches often train a neural network to estimate control parameters for a set of audio effects. However, these approaches are limited in that they can only control a fixed set of effects, where the effects must be differentiable or otherwise employ specialized training techniques. In this work, we introduce ST-ITO, Style Transfer with Inference-Time Optimization, an approach that instead searches the parameter space of an audio effect chain at inference. This method enables control of arbitrary audio effect chains, including unseen and non-differentiable effects. Our approach employs a learned metric of audio production style, which we train through a simple and scalable self-supervised pretraining strategy, along with a gradient-free optimizer. Due to the limited existing evaluation methods for audio production style transfer, we introduce a multi-part benchmark to evaluate audio production style metrics and style transfer systems. This evaluation demonstrates that our audio representation better captures attributes related to audio production and enables expressive style transfer via control of arbitrary audio effects. [en_US]
dc.publisher: ISMIR [en_US]
dc.title: ST-ITO: Controlling audio effects for style transfer with inference-time optimization [en_US]
dc.type: Conference Proceeding [en_US]
pubs.notes: Not known [en_US]
pubs.publication-status: Accepted [en_US]
dcterms.dateAccepted: 2024-06-28
rioxxterms.funder: Default funder [en_US]
rioxxterms.identifier.project: Default project [en_US]
qmul.funder: Resource-efficient machine listening::Royal Academy of Engineering [en_US]
rioxxterms.funder.project: b215eee3-195d-4c4f-a85d-169a4331c138 [en_US]
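
The abstract describes searching the parameter space of an effect chain at inference time, guided by a style metric and a gradient-free optimizer. The sketch below is a toy illustration of that general idea only, not the authors' implementation. Everything in it is an assumption made for illustration: the two-parameter chain (`apply_chain`, gain plus hard clipper), the hand-written `style_features` that stands in for the learned style metric, and plain random search standing in for whichever gradient-free optimizer the paper uses.

```python
import math
import random


def apply_chain(signal, params):
    """Hypothetical effect chain: gain followed by a hard clipper.

    The clipper is non-differentiable at the threshold, which is the kind
    of effect a gradient-free search can still control.
    """
    gain, threshold = params
    return [max(-threshold, min(threshold, gain * x)) for x in signal]


def style_features(signal):
    """Stand-in for a learned style embedding (assumption): (RMS, peak)."""
    rms = math.sqrt(sum(x * x for x in signal) / len(signal))
    peak = max(abs(x) for x in signal)
    return (rms, peak)


def style_distance(a, b):
    """Euclidean distance between the toy style features of two signals."""
    fa, fb = style_features(a), style_features(b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(fa, fb)))


def random_search(input_signal, reference, bounds, iterations=2000, seed=0):
    """Gradient-free search: sample parameters uniformly, keep the best."""
    rng = random.Random(seed)
    best_params, best_loss = None, float("inf")
    for _ in range(iterations):
        candidate = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        processed = apply_chain(input_signal, candidate)
        loss = style_distance(processed, reference)
        if loss < best_loss:
            best_params, best_loss = candidate, loss
    return best_params, best_loss


def sine(n=512, freq=5.0, amp=0.5):
    """Test tone used as both the input and the source of the reference."""
    return [amp * math.sin(2 * math.pi * freq * i / n) for i in range(n)]


input_signal = sine()
# The reference is produced with settings the search does not see.
reference = apply_chain(sine(), (3.0, 0.8))

initial_loss = style_distance(input_signal, reference)
params, loss = random_search(
    input_signal, reference, bounds=[(0.1, 5.0), (0.1, 1.0)]
)
print("estimated (gain, threshold):", params)
print("style distance before/after:", initial_loss, loss)
```

Because the search only needs to evaluate the chain and the metric, swapping in a different effect requires no gradients and no retraining in this sketch; only `apply_chain` and the parameter bounds change.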

