dc.contributor.author | Williams, A | |
dc.contributor.author | Lattner, S | |
dc.contributor.author | Barthet, M | |
dc.contributor.author | Music Recommender Systems Workshop at the 17th ACM Conference on Recommender Systems | |
dc.contributor.editor | Ferraro, A | |
dc.contributor.editor | Knees, P | |
dc.contributor.editor | Quadrana, M | |
dc.contributor.editor | Ye, T | |
dc.contributor.editor | Gouyon, F | |
dc.date.accessioned | 2023-10-24T08:37:03Z | |
dc.date.available | 2023-08-28 | |
dc.date.available | 2023-10-24T08:37:03Z | |
dc.date.issued | 2023-09-19 | |
dc.identifier.uri | https://qmro.qmul.ac.uk/xmlui/handle/123456789/91539 | |
dc.description.abstract | While some artists are involved in both domains, the creation of music and the creation of artwork require different skill sets. The development of deep generative models for music and image generation has the potential to democratise these mediums and make multi-modal creation more accessible for casual creators and other stakeholders. In this work, we propose a co-creative pipeline for the generation of images to accompany a musical piece. This pipeline utilises state-of-the-art models for music-to-text, image-to-text, and subsequently text-to-image generation to recommend, via generation, visuals for a piece of music that are informed not only by the audio of the musical piece, but also by a user-recommended corpus of artworks and prompts, giving the generated material a meaningful grounding. We demonstrate the potential of our pipeline using a corpus of material from artists with strongly connected visual and musical identities, and make it available in the form of a Python notebook for users to easily generate their own musical and visual compositions using their chosen corpus - available here: https://github.com/alexjameswilliams/Music-Text-To-Image-Generation | en_US
dc.publisher | ACM | en_US |
dc.subject | Computational Creativity | en_US |
dc.subject | Generative AI | en_US |
dc.subject | Image Generation | en_US |
dc.subject | Music Tagging | en_US |
dc.subject | Prompt Engineering | en_US |
dc.subject | Visual Recommendation | en_US |
dc.title | Sound-and-Image-informed Music Artwork Generation Using Text-to-Image Models | en_US |
dc.type | Conference Proceeding | en_US |
pubs.author-url | https://orcid.org/0000-0003-2387-6876 | en_US |
pubs.notes | Not known | en_US |
pubs.place-of-publication | New York, NY, USA | en_US |
pubs.publication-status | Published online | en_US |
dcterms.dateAccepted | 2023-08-28 | |
qmul.funder | UKRI Centre for Doctoral Training in Artificial Intelligence and Music::Engineering and Physical Sciences Research Council | en_US |