
dc.contributor.author: Deng, Z
dc.contributor.author: Ma, Y
dc.contributor.author: Liu, Y
dc.contributor.author: Guo, R
dc.contributor.author: Zhang, G
dc.contributor.author: Chen, W
dc.contributor.author: Huang, W
dc.contributor.author: Benetos, E
dc.contributor.author: 2024 Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024)
dc.date.accessioned: 2024-04-19T09:16:45Z
dc.date.available: 2024-03-13
dc.date.available: 2024-04-19T09:16:45Z
dc.date.issued: 2024-06-16
dc.identifier.uri: https://qmro.qmul.ac.uk/xmlui/handle/123456789/96229
dc.description.abstract: Large Language Models (LLMs) have shown immense potential in multimodal applications, yet the convergence of textual and musical domains remains underexplored. To address this gap, we present MusiLingo, a novel system for music caption generation and music-related query responses. MusiLingo employs a single projection layer to align music representations from the pre-trained frozen music audio model MERT with a frozen LLM, bridging the gap between music audio and textual contexts. We train it on an extensive music caption dataset and fine-tune it with instructional data. Due to the scarcity of high-quality music Q&A datasets, we created the MusicInstruct (MI) dataset from captions in the MusicCaps dataset, tailored for open-ended music inquiries. Empirical evaluations demonstrate MusiLingo's competitive performance in generating music captions and composing music-related Q&A pairs. [en_US]
dc.format.extent: ? - ? (13)
dc.title: MusiLingo: bridging music and text with pre-trained language models for music captioning and query response [en_US]
dc.type: Conference Proceeding [en_US]
dc.rights.holder: © 2024 ACL
pubs.notes: Not known [en_US]
pubs.publication-status: Accepted [en_US]
dcterms.dateAccepted: 2024-03-13
rioxxterms.funder: Default funder [en_US]
rioxxterms.identifier.project: Default project [en_US]
qmul.funder: Resource-efficient machine listening::Royal Academy of Engineering [en_US]

