## The Weisfeiler-Leman Dimension of Conjunctive Queries

##### Abstract

A graph parameter is a function f on graphs with the property that, for any pair of isomorphic graphs G1 and G2, f(G1)=f(G2). The Weisfeiler-Leman (WL) dimension of f is the minimum k such that, if G1 and G2 are indistinguishable by the k-dimensional WL-algorithm, then f(G1)=f(G2). The WL-dimension of f is ∞ if no such k exists. We study the WL-dimension of graph parameters defined by the number of answers of a fixed conjunctive query in the graph. Given a conjunctive query φ, we quantify the WL-dimension of the function that maps every graph G to the number of answers of φ in G.
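To make the counted quantity concrete, the following is a minimal brute-force sketch, not taken from the paper. It assumes that an answer is an assignment of the free variables to vertices of G that can be extended to the existentially quantified variables so that every atom E(u,v) maps to an edge. The function name `count_answers` and its parameters are hypothetical, chosen only for illustration.

```python
from itertools import product


def count_answers(num_vertices, edges, free_vars, quantified_vars, atoms):
    """Count answers of a conjunctive query in an undirected graph.

    Hypothetical helper for illustration: `atoms` is a list of pairs of
    variable names, each standing for an atom E(u, v).
    """
    # Store both orientations so that edges are treated as undirected.
    adjacent = {(u, v) for u, v in edges} | {(v, u) for u, v in edges}
    vertices = range(num_vertices)
    count = 0
    for free_values in product(vertices, repeat=len(free_vars)):
        assignment = dict(zip(free_vars, free_values))
        # The assignment is an answer if SOME extension to the quantified
        # variables satisfies all atoms; extensions are not counted separately.
        for quantified_values in product(vertices, repeat=len(quantified_vars)):
            full = {**assignment, **dict(zip(quantified_vars, quantified_values))}
            if all((full[a], full[b]) in adjacent for a, b in atoms):
                count += 1
                break
    return count


# Path on three vertices: 0 - 1 - 2.
path_edges = [(0, 1), (1, 2)]
atoms = [("x1", "y"), ("x2", "y")]

# phi(x1, x2) = exists y: E(x1, y) and E(x2, y)
with_quantifier = count_answers(3, path_edges, ["x1", "x2"], ["y"], atoms)

# Full query phi'(x1, x2, y) = E(x1, y) and E(x2, y), with y free.
full_query = count_answers(3, path_edges, ["x1", "x2", "y"], [], atoms)

print(with_quantifier, full_query)  # prints "5 6"
```

On this path the quantified query has 5 answers while the full query has 6, since the pair (1,1) has two witnesses for y but is counted only once when y is quantified.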
The works of Dvořák (J. Graph Theory 2010), Dell, Grohe, and Rattan (ICALP 2018), and Neuen (arXiv 2023) have answered this question for full conjunctive queries, which are conjunctive queries without existentially quantified variables. For such queries φ, the WL-dimension is equal to the treewidth of the Gaifman graph of φ.
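The treewidth characterisation for full queries can be checked on a standard small example, sketched below under stated assumptions: the answers of a full conjunctive query are taken to be the homomorphisms from its Gaifman graph, and the 1-dimensional WL-algorithm is implemented as colour refinement on the disjoint union of the two graphs. The helper names `wl1_indistinguishable` and `count_homomorphisms` are hypothetical. The 6-cycle and the disjoint union of two triangles are indistinguishable by colour refinement; they agree on the path query (treewidth 1) but disagree on the triangle query (treewidth 2).

```python
from collections import Counter
from itertools import product


def wl1_indistinguishable(adj_g, adj_h):
    """Run colour refinement on the disjoint union and compare histograms."""
    union = {("g", v): [("g", u) for u in nbrs] for v, nbrs in adj_g.items()}
    union.update({("h", v): [("h", u) for u in nbrs] for v, nbrs in adj_h.items()})
    colour = {v: 0 for v in union}
    for _ in range(len(union)):
        # New colour = old colour plus the sorted multiset of neighbour colours.
        signature = {
            v: (colour[v], tuple(sorted(colour[u] for u in union[v])))
            for v in union
        }
        palette = {sig: i for i, sig in enumerate(sorted(set(signature.values())))}
        refined = {v: palette[signature[v]] for v in union}
        # Refinement only splits classes, so an equal class count means stable.
        if len(set(refined.values())) == len(set(colour.values())):
            break
        colour = refined
    hist_g = Counter(c for (side, _), c in colour.items() if side == "g")
    hist_h = Counter(c for (side, _), c in colour.items() if side == "h")
    return hist_g == hist_h


def count_homomorphisms(num_pattern_vertices, pattern_edges, adj):
    """Brute-force count of homomorphisms from a pattern on 0..k-1 into a graph."""
    edge_set = {(u, v) for u in adj for v in adj[u]}
    return sum(
        all((image[a], image[b]) in edge_set for a, b in pattern_edges)
        for image in product(list(adj), repeat=num_pattern_vertices)
    )


cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {
    i: [j for j in range(3 * (i // 3), 3 * (i // 3) + 3) if j != i]
    for i in range(6)
}

path = (3, [(0, 1), (1, 2)])              # Gaifman graph of treewidth 1
triangle = (3, [(0, 1), (1, 2), (0, 2)])  # Gaifman graph of treewidth 2

print(wl1_indistinguishable(cycle6, two_triangles))  # True
print(count_homomorphisms(*path, cycle6), count_homomorphisms(*path, two_triangles))  # 24 24
print(count_homomorphisms(*triangle, cycle6), count_homomorphisms(*triangle, two_triangles))  # 0 12
```

The brute-force homomorphism count is exponential in the number of query variables and is meant only for tiny examples.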
In this work, we give a characterisation that applies to all conjunctive queries. Given any conjunctive query φ, we prove that its WL-dimension is equal to the semantic extension width sew(φ), a novel width measure that can be thought of as a combination of the treewidth of φ and its quantified star size, an invariant introduced by Durand and Mengel (ICDT 2013) describing how the existentially quantified variables of φ are connected with the free variables. Using the recently established equivalence between the WL-algorithm and higher-order Graph Neural Networks (GNNs) due to Morris et al. (AAAI 2019), we obtain as a consequence that the function counting answers to a conjunctive query φ cannot be computed by GNNs of order smaller than sew(φ).
The majority of the paper is concerned with establishing a lower bound on the WL-dimension of a query. Given any conjunctive query φ with semantic extension width k, we consider a graph F of treewidth k obtained from the Gaifman graph of φ by repeatedly cloning the vertices corresponding to existentially quantified variables. Using a modification due to Fürer (ICALP 2001) of the Cai-Fürer-Immerman construction (Combinatorica 1992), we then obtain a pair of graphs χ(F) and χ̂(F) that are indistinguishable by the (k-1)-dimensional WL-algorithm since F has treewidth k. Finally, in the technical heart of the paper, we show that φ has a different number of answers in χ(F) and χ̂(F). Thus, φ can distinguish two graphs that cannot be distinguished by the (k-1)-dimensional WL-algorithm, so the WL-dimension of φ is at least k.