Can LLMs "Reason" in Music? An Evaluation of LLMs' Capability of Music Understanding and Generation

Zhou, Z; Wu, Y; Wu, Z; Zhang, X; Yuan, R; Ma, Y; Wang, L; Benetos, E; Xue, W; Guo, Y; 25th International Society for Music Information Retrieval Conference (ISMIR)

dc.contributor.author	Zhou, Z
dc.contributor.author	Wu, Y
dc.contributor.author	Wu, Z
dc.contributor.author	Zhang, X
dc.contributor.author	Yuan, R
dc.contributor.author	Ma, Y
dc.contributor.author	Wang, L
dc.contributor.author	Benetos, E
dc.contributor.author	Xue, W
dc.contributor.author	Guo, Y
dc.contributor.author	25th International Society for Music Information Retrieval Conference (ISMIR)
dc.date.accessioned	2024-08-05T14:24:40Z
dc.date.available	2024-06-28
dc.date.available	2024-08-05T14:24:40Z
dc.date.issued	2024-11-10
dc.identifier.uri	https://qmro.qmul.ac.uk/xmlui/handle/123456789/98625
dc.description.abstract	Symbolic music, akin to language, can be encoded in discrete symbols. Recent research has extended the application of large language models (LLMs) such as GPT-4 and Llama2 to the symbolic music domain, including understanding and generation. Yet little research explores in detail how these LLMs perform on advanced music understanding and conditioned generation, especially from the multi-step reasoning perspective, which is a critical aspect of the conditioned, editable, and interactive human-computer co-creation process. This study conducts a thorough investigation of LLMs’ capabilities and limitations in symbolic music processing. We find that current LLMs exhibit poor performance in song-level multi-step music reasoning, and typically fail to leverage learned music knowledge when addressing complex musical tasks. An analysis of LLMs’ responses clearly highlights their strengths and weaknesses. Our findings suggest that advanced musical capability is not intrinsically obtained by LLMs, and that future research should focus more on bridging the gap between music knowledge and reasoning to improve the co-creation experience for musicians.	en_US
dc.rights	CC BY
dc.title	Can LLMs "Reason" in Music? An Evaluation of LLMs' Capability of Music Understanding and Generation	en_US
dc.type	Conference Proceeding	en_US
pubs.notes	Not known	en_US
pubs.publication-status	Accepted	en_US
dcterms.dateAccepted	2024-06-28
rioxxterms.funder	Default funder	en_US
rioxxterms.identifier.project	Default project	en_US
qmul.funder	Self-supervision in machine listening::Engineering and Physical Sciences Research Council	en_US
rioxxterms.funder.project	b215eee3-195d-4c4f-a85d-169a4331c138	en_US

Files in this item

Name: Benetos Can LLMs "Reason" in ...
Size: 1.543Mb
Format: application/
Description: Accepted version

This item appears in the following Collection(s)

Electronic Engineering and Computer Science [3490]
