Biodiversity assessment across a dynamic riverine system: A comparison of eDNA metabarcoding versus traditional fish surveying methods

While many studies have considered the ability of eDNA to assess animal communities in lacustrine settings, fewer have considered riverine systems, particularly those spanning the environmental gradients present in large river basins. Such dynamic systems are challenging for eDNA biomonitoring due to differing eDNA transport distances in rivers and the effects of river chemistry. To address this challenge


| INTRODUCTION
Environmental DNA (eDNA) metabarcoding has rapidly become a robust biomonitoring tool to accurately assess the diversity of animal, specifically fish, communities in lacustrine systems (Doble et al., 2020; Lawson Handley et al., 2019; Valdez-Moreno et al., 2019). In contrast, the dynamic nature of riverine systems presents a suite of conditions that are thought to influence eDNA transport and persistence, and while studies have attempted to understand the abiotic factors affecting eDNA detection in lotic systems (e.g., Barnes et al., 2020; Jerde et al., 2016; Shogren et al., 2018), there remains a higher degree of uncertainty regarding inferred species detections using eDNA (Evans & Lamberti, 2018; Thalinger et al., 2020). There are relatively few studies focusing explicitly on estuaries (e.g., Ahn et al., 2020; Stoeckle et al., 2017), which face a suite of challenges including water turbidity during eDNA collection, which clogs filters and impedes DNA extraction (Sanches & Schreier, 2020; K. E. Williams et al., 2017), as well as PCR inhibitors such as humic acid from vegetation decomposition and sewage by-products (Schrader et al., 2012). Due to the variable abiotic conditions of riverine habitats, there are few lotic eDNA studies spanning environmental gradients of salinity (but see Sales et al. (2021) and García-Machado et al. (2021)). Most studies focus solely on freshwater stretches (Cilleros et al., 2019; Lecaudey et al., 2019; Pont et al., 2018) or purely marine habitats (Holman et al., 2019; Oka et al., 2021; Yamamoto et al., 2017). The validation of eDNA metabarcoding across such dynamic lotic environments is crucial for its acceptance as a feasible tool for monitoring riverine biodiversity.
Our objective is to test the accuracy of eDNA metabarcoding in recovering the diversity and community structure of fishes across a large dynamic river basin encompassing both fresh and tidal zones.
We focused on the Thames River Basin, UK, since this system has exceptional historical survey data, providing a robust biotic reference baseline. The Environment Agency (EA) routinely carries out fish surveys across the UK as part of the national monitoring campaign in accordance with the European Union Water Framework Directive (WFD, Directive 2000/60/EC) and previous legislation, with data from the Thames catchment beginning in 1978, across >1239 sites. Species richness in the Thames comprises 120 recorded fish species from 35 families (Kirk et al., 2002). Tidal waters contain two-thirds of the diversity of native species compared with freshwater zones (47 tidal vs. 28 freshwater species; Swaby & Potts, 1990; Wheeler & Maitland, 1973), with freshwaters containing a significant number of non-native introduced species (14 species; Gozlan et al., 2010). Due to the salinity gradient in the tidal Thames, a wide diversity of marine, estuarine, and freshwater fish species can be found (Mann, 1988), as well as marine vagrants that have entered the estuary (Swaby & Potts, 1990). Although traditional fish survey methods such as netting or trawling have been used successfully to capture this diversity, these methods suffer from low capture rates and are only reliable indicators of occurrence when species are present at moderate or high abundance (Magnuson et al., 1994).
The prevalence of introduced species in this system, and globally in freshwaters (Gozlan et al., 2010; Strayer & Dudgeon, 2010), warrants more sensitive surveillance. The ability to detect and deal with small populations of newly introduced non-native species is a crucial tool in mitigating freshwater biodiversity loss (Britton et al., 2011; Tickner et al., 2020). Such sensitive biomonitoring also extends to the detection of rare species, including endangered native species found in the Thames, such as European eel (Anguilla anguilla (L.)), sea lamprey (Petromyzon marinus (L.)), river lamprey (Lampetra fluviatilis (L.)), short-snouted seahorse (Hippocampus hippocampus (L.)), and smelt (Osmerus eperlanus L.).
The aim of this study was to compare eDNA metabarcoding with traditional fishing surveys of the fresh and tidal Thames. We directly compared eDNA with simultaneous ('paired') surveys employing traditional capture methods, and also compared eDNA with EA survey data over the previous 5 years. In addition, we investigated the seasonal variability of eDNA detections in a subset of sites that were sampled in both summer and winter. To investigate the impact of variations in detections by different survey methods, we compared the results of a simple ecological model (elements of metacommunity structure) across datasets. By using two genetic markers, biological replicates, and validation against both simultaneous and historic datasets, we were able to assess the viability of eDNA methods to detect fish species in both freshwater and tidal lotic conditions.

| Focal system
The River Thames is the second longest UK river (346 km) and drains a catchment in southern England of 12,930 km², comprising more than 50 inflows (Francis et al., 2008) running through rural and urban areas (Figure 1). Much of the Thames basin is freshwater, encompassing the Upper and Middle Thames, while Teddington Lock marks the start of the Lower Thames, which is tidal and is further split into Upper, Middle and Lower zones, with salinity and tidal influence increasing downstream.

| eDNA sampling
Using QGIS 3.10.4 (QGIS.org, 2018), a map of the Thames catchment was overlaid with the catchment polygons used by the EA and linked to data describing the locations, dates and recorded catches of historic EA fish surveys. Sites for eDNA sampling were selected from throughout the catchment, from source to outflow, reflecting catchment polygons with the highest frequency of recent surveys, the greatest fish diversity, and records of introduced non-native species.
For paired surveys (n = 14), eDNA samples were collected immediately prior to the EA survey (see below), avoiding potential contamination from eDNA released by fishes during the traditional surveys or resuspended by sediment disturbance caused by fishing activities. All freshwater samples were collected between 9:00 and 17:00, and samples in the tidal Thames were taken during slack tide (a period of no tidal movement in either direction). At each site, three 2 L water samples were collected from the water surface using sterilized Nalgene HDPE bottles. From each sample, 1 L of water was filtered on site using a 0.45 μm pore size Sterivex filter (Millipore Corp, USA) and a Geotech peristaltic pump (series II Geotech, USA) with sterilized tubing (Masterflex, Cole-Parmer, USA). As water samples contained suspended sediment that prevented the water from being filtered in a timely manner, water was filtered for a maximum of 20 minutes and the volume of water passed was recorded. All water was expelled from the filter units, which were then sealed in individual sterile bags and transported on ice before being frozen at −20°C. Between sites, sampling equipment was sterilized by washing in a 30% commercial bleach solution (containing <3% sodium hypochlorite) and then rinsed with distilled water. The sampler was rinsed again in river water at the next sampling location. A total of 15 filter controls were also collected (one in every 10 samples/3 sites), using 1 L of distilled water brought into the field, and filtered and stored following the same protocol.

| Traditional sampling methods and historical data
The vast majority of the sampling conducted by the EA and used as a comparison to our eDNA surveys employed electrofishing, either as depletion electrofishing (<1 m water depth), in which three passes of a 100 m stretch (with stop nets set upstream and downstream) are carried out using backpack electrofishing units, or using a boat-mounted electrofishing rig (>2 m depth) deploying a single pass of a reach (catches and time spent fishing are used to calculate the number of catches per minute).
For five tidal sites, EA fish surveys used a multi-gear method consisting of: shore seining, deployed twice per site (50 × 2.5 m net with a 10 mm knotless mesh), to sample larger and more active fish species, applied at slack water from a 17 ft open dory; timed one-minute kick sampling with a standard hand net (0.25 × 0.3 m aperture with 1 mm polyester mesh) in the shallows to sample early fry and post-larvae; and beam trawling for 200 m to sample demersal species (2 m trawl rigged with a 5 m polyester trawl net with 40 mm knotless outer mesh and 10 mm knotless cod seine net). Individual fish are all identified to species level or, failing that, genus. All data collected by EA fish surveys are publicly available at https://data.gov.uk/. In this study, we extracted historic survey data for sites within the Thames catchment for the past 5 years (2014-2018), collected with the aforementioned methods, to compare catches with our eDNA detections.

FIGURE 1
Map of the study site: the Thames catchment. Points mark the origin of the Thames (black star) and the numbered eDNA survey sites in the upper Thames (blue), middle Thames (orange), upper estuary (light green), middle estuary (dark green), and lower estuary (black). Teddington Lock, the start of the Thames estuary, is indicated with a dashed black line.

| Primer selection
Since a consensus on the optimal genetic marker for fish eDNA metabarcoding studies has yet to be reached (Collins et al., 2019;Morey et al., 2020;Shu et al., 2020), ideal study designs include a combination of primers or regions to assess biases and feasibility.
Cytochrome c oxidase subunit 1 (CO1) markers have been proposed as a standardized animal identification marker, and this genetic region can be found in taxonomically verified databases with reference sequences covering a huge variety of taxa (Hebert et al., 2003) limiting the necessity of generating specific reference databases.
Over 300,000 species are currently represented in these public databases (see www.boldsystems.org; Ratnasingham & Hebert, 2007), with particular emphasis on the curation of a specific fish CO1 database (Ward et al., 2009). In contrast, the 12S rRNA region has most frequently been used as a genetic marker in fish eDNA studies, particularly due to the demonstrated specificity to fish combined with short amplification length, making it suitable for degraded DNA (Miya et al., 2015; Riaz et al., 2011; S. Zhang et al., 2020). Here, we targeted two regions for eDNA amplification: the mitochondrial 12S rRNA gene and the mitochondrial cytochrome c oxidase subunit 1 (CO1) (Table S2). We used the MiFish 12S primer set (Miya et al., 2015), which has been successfully used in eDNA studies in both freshwater and marine systems (Berger et al., 2020; Doble et al., 2020; Oka et al., 2021), including for a subset of UK fish species (Antognazza et al., 2021; McDevitt et al., 2019), and the Fish_MiniE CO1 primer set designed for fragmented fish DNA (Shokralla et al., 2015), modified to remove the M13 tails.

| eDNA extraction and PCR amplification
DNA extractions from the water samples were undertaken using a modified protocol based on Doble et al. (2020) and Cruaud et al. (2017), and performed in a laminar flow cabinet in a sterile (pre-PCR) room. The casing of the Sterivex filter unit was cut under sterile conditions and the filter removed, cut into small pieces, and DNA extracted using a DNeasy Blood and Tissue Kit (Qiagen). After the initial lysing stage, filters were passed through a QIAshredder column, and the resulting lysate was pooled with the original lysate before continuing with the DNeasy protocol. An extraction negative control was included in every sample batch. All extractions and negative controls were quantified on a Qubit v2 using the Qubit dsDNA HS Assay Kit (Invitrogen).
To optimize the PCR reactions, the amplification conditions for both primer sets were tested using temperature gradient experiments. MiFish primers were tested with annealing temperatures between 58°C and 66°C, and CO1 primers between 46°C and 57°C.
Resulting PCR products were visualized on a 1.5% agarose gel, and the optimal annealing temperatures of 63°C for MiFish and 46°C for CO1 were selected, which removed non-specific amplification in most cases. Some samples from the middle freshwater Thames and the tidal Thames failed to amplify, most likely due to the presence of PCR inhibitors. After testing a series of DNA dilutions, a dilution of 1:5 was selected, which counteracted these inhibitors and was applied to all samples. For further details of the PCR mix and conditions, see Supporting text and Table S3.
A mock community was constructed in a designated post-PCR laboratory after PCRs of the eDNA samples had been completed, using DNA from 10 fish species with a diverse phylogenetic history (see Table S4). The presence of any of the African freshwater fish found outside of the mock community was used to assess potential contamination, and British fish species were used to assess sequencing depth and investigate potential amplification and sequencing biases. The mock community samples comprised equal quantities (12.5 ng per species; total 125 ng) of tissue-derived DNA (measured with Qubit). These samples were processed following the same methodology as used for eDNA samples, from PCR to sequencing. PCR reaction conditions were the same as used for the field samples, with the exception that only 30 and 35 replication cycles for the CO1 and 12S primers, respectively, were undertaken due to the higher DNA concentration compared with the field samples.

| Reference database for UK fishes
A UK fish reference database was curated for use in this study. A list of fish species from freshwater, and transitional and coastal (TRAC) areas of the UK was generated from Fishbase (Froese & Pauly, 2017) and EA databases, with 531 species identified as part of the UK fauna (Collins et al., 2019). All fish species recorded in the UK and additional non-native species (Andrews & Wheeler, 1985; Gozlan et al., 2010) that could potentially be present but have not yet been confirmed were included in order to confidently identify species from their DNA barcodes and facilitate bioinformatic processing. Reference databases may suffer from a lack of population sampling, thereby causing incipient species or genetic lineages to be missed in eDNA surveys (Doble et al., 2020). To avoid such issues, the reference databases used in this study included samples collected from across the UK (multiple samples where possible) in order to represent more regional genetic diversity, since many of the species found in the Thames are cosmopolitan, as well as individuals sourced from the Thames. The same samples included in the mock community analysis were also included in the reference database. Sequences were downloaded from National Centre for Biotechnology Information (NCBI) Genbank and the Barcode of Life Database (BOLD; Ratnasingham & Hebert, 2007) and curated for the mitochondrial gene regions targeted for 12S rRNA and CO1.

| Library and sequencing
Across the Thames, 39 eDNA sampling events took place, producing 117 eDNA samples (three biological samples per event), and an additional 15 field controls were collected. Four PCR replicates were produced for each of these samples, for both the 12S and CO1 primers. To counteract PCR stochasticity, the PCR replicates of each biological sample were not pooled. In total, 958 samples were sequenced, including eight replicates of the mock community, 30 DNA extraction negative controls, and 6 PCR negative controls (Table S5).
PCR products from eDNA samples, controls, and eight replicates of the mock community were submitted to the Barts and the London Genome Centre at Queen Mary University of London (QMUL) for library building and sequencing. The library was size- and quality-checked using Qubit and Tapestation and pooled for sequencing. A 10% PhiX spike-in was included to increase the sequence complexity. The amplicon libraries were sequenced using an Illumina MiSeq with the 2 × 150 bp V2 chemistry. The raw reads were demultiplexed by tag at the sequencing facility, then filtered to remove low-quality sequences, followed by the removal of MID tags and conversion to FASTQ files using Illumina software.

| Bioinformatics
The CO1 read libraries were paired using AdapterRemoval (Lindgreen, 2012) with a minimum overlap of 11 bp (default setting). The overall quality of the reads for each sample was analyzed with FastQC (Andrews et al., 2015), and the reads were subsequently uploaded to mBRAVE (Multiplex Barcode Research and Visualization Environment; Ratnasingham, 2019) for downstream analysis. The reads were trimmed by 23 bp at the 5' end to remove the forward primer and by 26 bp at the 3' end to remove the reverse primer. The quality value (QV) of each sequence was evaluated, and screening values were set to maximize data retention: records were discarded if >25% of bp had QV <20 or if >25% of bp had QV <10. Reads shorter than 200 bp or longer than 500 bp were discarded, and the data were screened for chimeric sequences. A custom mBRAVE reference library of fishes was created using a subset of CO1 sequences from the UK fishes database.
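The length and quality screens described above can be illustrated with a minimal Python sketch. This is not the mBRAVE implementation; the function name `keep_read` and its parameter names are hypothetical, and the sketch assumes that the two QV criteria are applied as upper limits on the fraction of low-quality bases in a read.

```python
def keep_read(quals, min_len=200, max_len=500,
              low_qv=20, ultra_low_qv=10, max_fraction=0.25):
    """Return True if a read passes the length and quality screens.

    quals: list of per-base Phred quality values (QV) for one read.
    The thresholds mirror the values quoted in the text; how mBRAVE
    applies them internally is an assumption of this sketch.
    """
    n = len(quals)
    # Length screen: discard reads shorter than 200 bp or longer than 500 bp.
    if n < min_len or n > max_len:
        return False
    # Fraction of bases below each quality threshold.
    frac_low = sum(q < low_qv for q in quals) / n
    frac_ultra_low = sum(q < ultra_low_qv for q in quals) / n
    # Discard if more than 25% of bases fall below either threshold.
    return frac_low <= max_fraction and frac_ultra_low <= max_fraction
```

For example, a 250 bp read with 50 bases at QV 15 (20% low-quality bases) would be kept, whereas the same read with 100 such bases (40%) would be discarded.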
Retained sequence identifications were made at >98% match to a reference.Any reads not matching the CO1 UK fish library were subsequently queried against all available mBRAVE system libraries for other taxonomic groups (chordata, bacteria, protista, nonarthropoda, insecta, and fungi).Detections were retained to genus level in the three cases where more than one potential species ID was given.
The 12S read libraries were processed using the DADA2 pipeline (Callahan et al., 2016). The quality profile of each read was visualized and inspected, which led to the forward and reverse reads being truncated at 220 and 200 nucleotides, respectively, to remove the primers along with poor-quality ends. The DADA2 algorithm uses a parametric error model to infer true sequence variants. Reads with more than two expected errors were filtered out, the unpaired sequences were denoised and aligned with an overlap of at least 12 bp, and a table of amplicon sequence variants (ASVs) was created. The resulting ASV table was queried against the custom UK fish library, taxonomy was assigned with at least 50% bootstrap confidence, and the identified ASVs were consolidated to species level.
After identification, any unassigned reads were removed, as were any assignments based on one sequence, that is, singletons (Alberdi et al., 2017). Sequence data from the mock communities were used to inform decisions regarding a suitable threshold for sequence removal in the analysis of the data (Alberdi et al., 2017; Doble et al., 2020; Evans et al., 2017). The mock communities contained fish species not native to the UK to check for contamination during sequencing and to set a threshold to minimize false positives (incorrect positive detections). If a false positive was detected in the mock community, we used the number of reads corresponding to this false positive as the limit and dismissed any match in any sample with that number of reads or fewer. As three reads assigned to sea bass (Dicentrarchus labrax (L.)) were found in a mock community sample, we accepted no identification with three or fewer reads assigned. Field controls and negative controls were checked for contamination, and the maximum read count for any ASV present was subtracted from the read counts of the respective ASVs in the data.
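The control subtraction and read threshold described in this paragraph can be sketched in Python as follows. The function name `clean_counts`, the dictionary layout, and the order of the two steps (subtraction first, then the threshold) are assumptions made for illustration; only the three-read limit and the use of the maximum control count come from the text.

```python
def clean_counts(sample_counts, control_counts_list, min_reads=4):
    """Subtract control contamination, then drop low-read assignments.

    sample_counts: dict mapping a taxon (or ASV) to its read count in a sample.
    control_counts_list: list of dicts with the same layout, one per control.
    min_reads: smallest accepted read count (three or fewer reads are rejected).
    """
    # Maximum read count observed for each taxon in any control.
    max_control = {}
    for control in control_counts_list:
        for taxon, reads in control.items():
            max_control[taxon] = max(max_control.get(taxon, 0), reads)

    cleaned = {}
    for taxon, reads in sample_counts.items():
        adjusted = reads - max_control.get(taxon, 0)
        # Keep only assignments that still have four or more reads.
        if adjusted >= min_reads:
            cleaned[taxon] = adjusted
    return cleaned


# Hypothetical read counts, for illustration only.
sample = {"Anguilla anguilla": 120, "Dicentrarchus labrax": 3, "Rutilus rutilus": 10}
controls = [{"Rutilus rutilus": 8}, {"Rutilus rutilus": 2, "Anguilla anguilla": 5}]
result = clean_counts(sample, controls)
```

With these made-up counts, only the eel assignment survives: sea bass falls at the three-read limit, and roach drops to two reads after the control maximum of eight is subtracted.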
Two approaches were taken to determine the validity of species detections. Since the PCR replicates of each biological sample were not pooled, an additive PCR detections approach was utilized. A species had to contain four or more reads and be present in at least one of four PCR replicates, across two of three biological replicates, to be considered a true detection, a level of stringency recommended in many eDNA studies (e.g., Alberdi et al., 2017; Evans et al., 2017; Mächler et al., 2019).
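As a minimal sketch of this additive detection rule, the Python function below takes the read counts of one species at one site, arranged as biological replicates each holding their PCR replicate counts. The name `is_true_detection` and the data layout are hypothetical, and the sketch assumes the four-read threshold applies to each PCR replicate individually.

```python
def is_true_detection(reads, min_reads=4, min_pcr=1, min_bio=2):
    """Apply the replicate-based detection rule to one species at one site.

    reads: list of biological replicates; each is a list of read counts,
           one per PCR replicate (e.g. 3 biological x 4 PCR replicates).
    A biological replicate is positive when at least `min_pcr` of its PCR
    replicates carry `min_reads` or more reads; the species is accepted
    when at least `min_bio` biological replicates are positive.
    """
    positive_bio = 0
    for pcr_counts in reads:
        positive_pcr = sum(1 for count in pcr_counts if count >= min_reads)
        if positive_pcr >= min_pcr:
            positive_bio += 1
    return positive_bio >= min_bio
```

Under these assumptions, a species with counts `[[0, 5, 0, 0], [4, 0, 0, 0], [0, 0, 0, 0]]` is accepted (two positive biological replicates), while one with `[[50, 40, 30, 20], [3, 3, 3, 3], [0, 0, 0, 0]]` is rejected, because only one biological replicate reaches four reads in any PCR replicate.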

| Statistical analyses, diversity metrics, and elements of metacommunity structure
The eDNA data were converted to presence/absence matrices for comparisons with the Environment Agency 2018 fish survey data.
ANOVA was used to assess differences in species richness detected by the different survey methods (i.e., multi-gear vs. eDNA for the tidal Thames; electrofishing vs. eDNA for the freshwater Thames).
Jaccard similarity was calculated and ordinated using non-metric multidimensional scaling (nMDS), and an analysis of similarities (ANOSIM) was performed to test if there was a statistically significant difference in: 1) the paired sites, between a) freshwater and tidal, and b) EA and eDNA; 2) the seasonal subset of sites, which had EA and eDNA surveys twice during the year, in summer and winter; and 3) all eDNA data from 2018, between the different zones of the Thames: upper and middle freshwater; freshwater tributaries; upper and middle tidal (Figure 1). Similarity percentage (SIMPER) was then applied to identify which species contributed most to the differences between Thames zones. We used read counts to compare the relative abundance of the different communities detected by eDNA with the 12S MiFish primers and by the traditional EA surveys at the paired sites. The read count and abundance data were Hellinger transformed, as suggested by Legendre and Legendre (2012) and Laporte et al. (2021), before being visualized using nMDS. Permutational multivariate analysis of variance (PERMANOVA) based on Bray-Curtis dissimilarity was performed to statistically assess the strength of the associations. These analyses were conducted in Vegan version 2.5-6 (Oksanen et al., 2019).
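The three quantities underlying these analyses (Jaccard dissimilarity on presence/absence data, the Hellinger transformation, and Bray-Curtis dissimilarity on abundances) can be written out in a few lines of Python. This is a sketch of the standard textbook definitions, not the Vegan code used in the study, and the function names are hypothetical.

```python
import math


def jaccard_dissimilarity(site_a, site_b):
    """1 - |A intersect B| / |A union B| for two sets of detected species."""
    union = site_a | site_b
    if not union:
        return 0.0
    return 1 - len(site_a & site_b) / len(union)


def hellinger(counts):
    """Hellinger transformation of one site: sqrt(count / site total)."""
    total = sum(counts)
    if total == 0:
        return [0.0 for _ in counts]
    return [math.sqrt(c / total) for c in counts]


def bray_curtis(x, y):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    numerator = sum(abs(xi - yi) for xi, yi in zip(x, y))
    denominator = sum(xi + yi for xi, yi in zip(x, y))
    return numerator / denominator if denominator else 0.0
```

As a usage illustration with made-up data, `jaccard_dissimilarity({"eel", "roach", "perch"}, {"roach", "perch", "smelt"})` gives 0.5, since two of the four species in the union are shared.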
Mean estimated richness (Chao II estimator) and species accumulation curves were created using iNEXT version 2.0.19 (Hsieh et al., 2016). Sørensen dissimilarity values were calculated between the richness detected by the eDNA (i.e., combined 12S and CO1 genetic markers) and that detected by the EA at the paired surveys, and also for the overall detections by the individual genetic markers.
Beta diversity for the four regions of the Thames (upper freshwater, middle freshwater, upper tidal, and middle tidal) was calculated and consisted of total Sørensen dissimilarity, where a value of 0 indicates that communities share exactly the same species composition, and a value of 1 indicates total dissimilarity. Two further measures to partition the dissimilarity were calculated: Simpson dissimilarity or spatial turnover, which accounts for dissimilarity due to species replacement, and the nestedness-resultant fraction of Sørensen dissimilarity, a measure of the fraction of total dissimilarity that is not caused by species replacement, but instead by nestedness (Baselga, 2012).
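The partitioning can be illustrated for a single pair of communities with the pairwise formulas from Baselga's framework, where a is the number of shared species and b and c are the species unique to each community: beta.sor = (b + c) / (2a + b + c), beta.sim = min(b, c) / (a + min(b, c)), and beta.sne = beta.sor - beta.sim. The Python sketch below is a hedged illustration with a hypothetical function name; it covers only the pairwise case and is not the multi-site calculation that may have been used for the zone-level estimates.

```python
def sorensen_partition(site_a, site_b):
    """Pairwise Sorensen dissimilarity and its turnover/nestedness parts.

    site_a, site_b: sets of species detected in each community.
    Returns (beta_sor, beta_sim, beta_sne).
    """
    a = len(site_a & site_b)   # shared species
    b = len(site_a - site_b)   # unique to the first community
    c = len(site_b - site_a)   # unique to the second community
    if a + b + c == 0:
        return 0.0, 0.0, 0.0
    beta_sor = (b + c) / (2 * a + b + c)
    smaller = min(b, c)
    # Simpson dissimilarity (turnover); set to 0 when undefined.
    beta_sim = smaller / (a + smaller) if (a + smaller) else 0.0
    # Nestedness-resultant fraction is the remainder.
    beta_sne = beta_sor - beta_sim
    return beta_sor, beta_sim, beta_sne
```

When one community is a strict subset of the other (for example, 16 species all contained within a list of 22), the turnover component is 0 and the whole dissimilarity (6/38, about 0.16) is attributed to nestedness.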
To test whether the stochasticity between technical methods is biologically meaningful, we performed an analysis of community structure on the three datasets (CO1, 12S, traditional surveys), to test the hypothesis that subtle differences in recovery would not alter a community ecological analysis. We performed an Elements of Metacommunity Structure (EMS) analysis following Leibold et al. (2004) using the function Metacommunity from the package metacom (Dallas, 2014). This analysis takes a presence/absence matrix of species from multiple sites and uses a hierarchical analysis to compare actual data with a set of idealized patterns and their quasi-structures (see Leibold et al. (2004) and de la Sancha et al. (2014) for analytical details). The analysis considers three aspects of species spatial distribution: coherence (non-random structuring, a response to one or more latent environmental gradients), turnover (a measure of the replacement of one species along the gradient/ordination axis), and boundary clumping (the degree to which boundaries of species ranges occur at the same site), to examine the consistency of species distributions between communities (sample points).

| Sequence data
Following bioinformatic steps, a total of 7.4 million reads remained from the 2 × 150 bp MiSeq run across the two primer sets: 3.5 million reads were assigned to the 12S MiFish-U primer set, and 3.5 million reads to the CO1 Fish_MiniE set. Following filtering, the mean sequencing depth per PCR replicate was 4575 reads for 12S and 8216 reads for CO1: an average of 6294 reads per sample across the two markers. The data for PCR replicates for each marker were combined (four PCR replicates per biological sample, and three biological samples per site) to give a mean sequencing depth of 75,528 reads per site. Where contamination was detected (in five of the field controls: five 12S and two of the CO1 field controls), the reads for the contaminating ASVs found in the controls were removed from the eDNA samples.
The filtered data, considering both primer sets, were assigned to taxa from 15 fish families encompassing 33 species within the reference database. The CO1 data could not be assigned to species within the genera Lampetra, Leuciscus, and Pomatoschistus due to a lack of resolution in the CO1 reference data. Any detections by 12S of species belonging to these genera were thus retained at genus level for comparison of markers. Of the total species detected from the two pooled eDNA primer sets, 23 species were detected by both primer sets. The 12S primers detected 31 fish species (32 species prior to the two Pomatoschistus species being grouped), whereas the CO1 primers detected 25 species. Two tidal Thames sites failed to amplify and produced samples that did not generate sequences at either locus (Thurrock, site 34, and Gravesend, site 35, which are both lower tidal sites; Figure 1), while two upper tidal sites (Putney Bridge, site 26; Greenwich, site 30) generated sequences for 12S but not CO1. The CO1 data also recovered sequences that were matched to birds, mammals, aquatic invertebrates, bacteria, and fungi.

| Mock community detections
The mock community samples had an average read count of 9,759 for the 12S samples and 10,407 for the CO1 samples. Despite similar original DNA concentrations, recovered read abundance was not even across species in the mock communities. All 10 mock community species were identified in the 12S mock communities; however, the proportion of reads per species deviated from the expected 10%, ranging between 0.28% and 27.3% (M = 10, SD = 7.29). In the 12S mock communities, Agonus cataphractus (L.) represented on average 25% of the total reads, whereas Callionymus reticulatus (Valenciennes, 1837) was represented by only 0.36% of the total reads. The CO1 primers failed to recover C. reticulatus, Chiloglanis pretoriae (van der Horst, 1931), and Solea solea (L.) from the mock communities (Figure S2), and the proportion of reads per species ranged between 0 and 28% against the expected 10% (M = 6.19%, SD = 7.97). In the CO1 mock communities, Osmerus eperlanus represented on average 27.24% of the total reads. On average, 38.12% of the reads in the CO1 mock communities were unassigned, compared with 0.02% of the 12S reads. Due to the highly variable proportional read count recovery, the difference between the recovered and expected proportions was not statistically significant for the 12S primers (paired Wilcoxon test: p = 0.4) and was marginally non-significant for the CO1 primers (paired Wilcoxon test: p = 0.06). Due to the wide range of amplification exhibited in the mock communities, analyses conducted using presence/absence were considered more robust, and therefore we focus on these results. However, we cautiously discuss abundance using read counts in the light of our 12S data, whereas CO1 data were excluded from abundance analyses using read counts, due to the proportion of unassigned reads and the lack of recovery of three of the 10 mock community species.

| Comparison of paired eDNA and traditional fish surveys
At the 14 paired sites (fresh and tidal) where eDNA sampling was simultaneous with an EA fishing survey, a total of 40 species were detected, with 22 species detected by both methods. There was a significant difference between the total species detected per site for the EA and combined marker eDNA surveys (paired t-test: p = 0.031, df = 13, t-value = −2.421). nMDS ordination of the fish assemblages detected by the two methods indicated a significant difference between EA fishing methods and eDNA using ANOSIM (R = 0.1254, p = 0.011). Since detections in the freshwater and tidal Thames sites were significantly different, irrespective of survey method (ANOSIM: R = 0.358, p < 0.001), we consider the comparisons separately.
At the nine paired freshwater sites, a total of 22 species were detected, with a 72.7% overlap in the species detected by eDNA and traditional surveying methods (Figure 2a). The EA surveys detected 16 species, while eDNA metabarcoding detected an additional six species: stone loach (Barbatula barbatula (L.)), crucian carp (Carassius carassius (L.)), 3-spined stickleback (Gasterosteus aculeatus (L.)), ruffe (Gymnocephalus cernua (L.)), lamprey (Lampetra spp.), and 10-spined stickleback (Pungitius pungitius (L.)). These species have been recorded by the EA previously at other sites in the Thames catchment. At all of the paired sites, eDNA detected some species that were not caught by the EA. Examination of historic EA fish survey data for these paired sites from the last 5 years prior to our eDNA sampling (i.e., 2014-2018) revealed that 75% of the unique eDNA detections were supported by historic EA data.
In contrast to the freshwater surveys, the five paired tidal sites had only a 52% overlap in the species detected by the two methods (Figure 2b). Here, the EA recorded 18 species, including five species not detected by eDNA (considered eDNA false negatives, as they are incorrectly indicated as being absent). These species were bleak (Alburnus alburnus (L.)), herring (Clupea harengus L.), zander (Sander lucioperca (L.)), rudd (Scardinius erythrophthalmus (L.)), and sole (Solea solea). eDNA detected seven species that were not recorded by the EA during the paired tidal surveys: crucian carp, bullhead (Cottus gobio L.), common carp (Cyprinus carpio L.), thin-lipped gray mullet (Chelon ramada (L.)), sea lamprey, minnow (Phoxinus phoxinus (L.)), and brown trout (Salmo trutta (L.)). These species identified by eDNA have all been previously recorded in the tidal Thames by the EA, with the exceptions of crucian carp, which has previously only been recorded in the freshwater Thames, and sea lamprey, which has never been detected during fish surveys.
Historic EA data (2014-2018) for the paired tidal sites supported 80% of the unique eDNA detections.
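The overlap percentages quoted for the paired surveys follow directly from the species counts reported for each method. A minimal Python sketch of that arithmetic is given below; the function name `overlap_percent` is hypothetical, and the sketch assumes that overlap is defined as the shared species divided by all species detected by either method.

```python
def overlap_percent(shared, only_first, only_second):
    """Percentage of all detected species that were found by both methods."""
    total = shared + only_first + only_second
    if total == 0:
        raise ValueError("no species detected by either method")
    return 100 * shared / total


# Freshwater paired sites: 16 shared, 0 EA-only, 6 eDNA-only species.
freshwater = round(overlap_percent(16, 0, 6), 1)   # 72.7
# Tidal paired sites: 13 shared, 5 EA-only, 7 eDNA-only species.
tidal = round(overlap_percent(13, 5, 7), 1)        # 52.0
```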
The combined marker eDNA detected significantly greater richness than the EA surveys at all 14 paired sites (paired t-test, p < 0.001, t(13) = −6.01). Sørensen dissimilarity values of the richness detected by EA or eDNA showed relatively low dissimilarity between the methods, ranging from 0.24 to 0.54 (M = 0.42, SD = 0.09) (Table 1).

FIGURE 2
Venn diagrams of the fish species detected during the paired surveys. (a) In the freshwater Thames, eDNA detected all 16 species caught by the EA (see Table S8 for species) and an additional six species; (b) in the tidal Thames, a total of 13 species were detected by both methods, five by the EA only, and seven by eDNA only.
The nestedness component (beta.sne) contributed more to the variance than turnover at nine of the 14 paired sites. At four of the paired freshwater sites (sites 2, 14, 16, and 17), the turnover component was 0, as eDNA detected every species caught by the EA. The total species richness as detected by the combined eDNA and EA methods was calculated for each of the paired sites. The proportion of species richness detected by eDNA was significantly higher than that detected by EA surveys (Welch's t-test, p < 0.001, t(19.2) = −8.8). At the paired freshwater sites, eDNA recovered a significantly higher proportion of the EA detections than at the tidal sites (Welch's t-test, p < 0.01, t(11.4) = 3.3). At the nine freshwater sites, this ranged from 66.6% to 100% (M = 89.5%, SD = 11.8) of the EA detections, compared with the five tidal sites, where eDNA detected from 62.5% to 80% of the EA detections (M = 72.1%, SD = 7.86).
Although we are cautious about using read counts for this study, a comparison of the relative abundance across the 14 paired sites (incorporating both fresh and tidal waters) using traditional methods with the 12S read counts, also revealed a significant difference between the communities detected by the two different methods (PERMANOVA, R² = 0.07, p = 0.04) (Figure S4).

| Fish diversity detected by eDNA across the Thames
During 2018, the EA conducted a total of 172 surveys in the Thames catchment, with 153 in the freshwater Thames, and 19 in the tidal Thames (Table 2). This detected a total of 42 fish species: 23 in freshwater, and 30 in the tidal Thames, and includes 11 species which were found in both environments. We conducted 39 eDNA surveys and de-
The ordination of eDNA sampling sites and detected species was visualized by an nMDS of Jaccard dissimilarity matrices (Figure 5). Despite some overlap, ANOSIM analysis detected a significant difference between the river zones (freshwater upper, middle Thames, and the tidal upper, middle Thames) (ANOSIM on Jaccard dissimilarity matrix, R = 0.528, p < 0.001). Similarity percentage (SIMPER) analysis was applied to identify the discriminating taxa between the four zones. Between the upper freshwater Thames and middle freshwater Thames, there was a difference of 42.5%, with the presence of lamprey, grayling (Thymallus thymallus L.), and brown trout in the upper Thames, and eel and roach (Rutilus rutilus (L.)) in the middle Thames contributing to 34.37% of the differences. Between the middle freshwater Thames and upper estuary there was a difference of 45.2%, with the presence of bleak, ruffe, and gudgeon (Gobio gobio (L.)) in the middle Thames, and flounder (Pleuronectes platessa (L.)) in the upper estuary contributing to 25.14% of the differences.
Between the upper estuary and the middle estuary there was a difference of 56.9%, with the presence of chub, 3-spined stickleback, minnow, pike (Esox lucius L.) and perch (Perca fluviatilis (L.)) in the upper estuary, and smelt, sea bass, Atlantic salmon (Salmo salar L.), thin-lipped gray mullet, whiting (Merlangius merlangus (L.)), red mullet (Mullus surmuletus (L.)), and Pomatoschistus goby species in the middle estuary contributing to 62.06% of the differences (Table S9). The largest dissimilarity was between the two geographically most distant zones, the upper freshwater Thames and the middle estuary, with a dissimilarity of 80.7%. The beta diversity estimates for the middle freshwater, and upper and middle estuary were high with a mean Sørensen dissimilarity of 0.69, with the turnover component dominating over nestedness (Table S10). The upper freshwater zone (comprising five sites) had lower beta diversity, with nestedness and turnover contributing similarly (beta.sor = 0.46, beta.sim = 0.21, beta.sne = 0.25).
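The ANOSIM R statistic reported for the river zones compares ranked dissimilarities between groups with those within groups: R = (mean between-group rank − mean within-group rank) / (n(n − 1)/4), where n is the number of sites. Values near 1 indicate well-separated groups and values near 0 indicate no grouping. The following is a minimal Python sketch of that calculation on Jaccard dissimilarities. The function names are hypothetical, the four toy sites and their species lists are invented for illustration and are not data from this study, and the permutation step needed to obtain a p-value is omitted.

```python
from itertools import combinations


def jaccard_dissimilarity(a, b):
    """1 minus the share of species common to both presence lists."""
    union = a | b
    if not union:
        raise ValueError("cannot compare two empty species lists")
    return 1 - len(a & b) / len(union)


def average_ranks(values):
    """1-based ranks, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def anosim_r(sites, groups):
    """ANOSIM R from species sets per site and one group label per site."""
    n = len(sites)
    pairs = list(combinations(range(n), 2))
    ranks = average_ranks([jaccard_dissimilarity(sites[i], sites[j]) for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    if not within or not between:
        raise ValueError("need both within-group and between-group site pairs")
    return (sum(between) / len(between) - sum(within) / len(within)) / (n * (n - 1) / 4)


# Invented toy data: two freshwater-like and two tidal-like sites.
toy_sites = [
    {"roach", "perch", "pike"},
    {"roach", "perch", "chub"},
    {"sea bass", "smelt", "goby"},
    {"sea bass", "smelt", "flounder"},
]
toy_zones = ["freshwater", "freshwater", "tidal", "tidal"]
print(anosim_r(toy_sites, toy_zones))  # 1.0, the zones share no species
```

In the toy data every between-zone pair is more dissimilar than every within-zone pair, so R takes its maximum of 1; a value such as the 0.528 reported for the Thames zones indicates clear but incomplete separation.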
Considering the two genetic markers separately, the 12S marker detected greater fish richness per site than the CO1 marker (Wilcoxon signed-rank test, p < 0.01) (Figure 3, and Table S6). Across all the 22 eDNA sampling sites in the freshwater Thames, the Sørensen dissimilarity between richness detected by the two genetic markers ranged from 0.16 to 0.62 (M = 0.42, SD = 0.13), with the nestedness component dominating in 21 of the 22 sites (89% of the CO1 detections nested within the 12S detections). Across the 15 (out of 17) tidal eDNA sites which produced sequences, the Sørensen dissimilarity ranged from 0.14 to 1 (M = 0.66, SD = 0.23) due to the CO1 primers being unable to recover sequences at two of the sites. Dissimilarity was attributed to nestedness in eight (61.5%) sites, due to the overlap in species detected by both primer sets. Dissimilarity was attributed to species turnover in four (31%) sites where species were detected by one or the other primer, but not both. Water turbidity (clarity) varied across the sites but overall became increasingly turbid and proved difficult to filter at the furthermost seaward sites (Figure S1), with on average 447.9 ml of water filtered at tidal sites, compared to 802.5 ml at freshwater sites.

| Seasonal comparisons
At the three tidal sites selected to investigate seasonal variability (sites 23, 24, and 25), nMDS ordination of the detections by the EA and eDNA showed that the detected fish communities differed significantly between survey methods (ANOSIM, R = 0.60, p < 0.01).
However, there were no significant differences between these communities in summer and winter (ANOSIM, R = 0.11, p = 0.17), or at each site (ANOSIM, R = −0.15, p = 0.84) (Figure S3). While the species richness was not significantly different between seasons, Sørensen dissimilarity comparisons of the summer and winter collections showed moderate dissimilarity ranging from 0.31 to 0.56 (M = 0.44, SD = 0.18, Table S7). Kew exhibited the lowest dissimilarity across surveys (beta.sor = 0.31). Beta diversity between the two seasons at Richmond and Kew was attributed to species turnover, rather than nestedness due to the difference in detected community composition across the two time points, compared with Chiswick, where the between seasons beta diversity was attributed to nestedness rather than species turnover. Of the shared species, three were common to all sites in both seasons: bream (Abramis brama), perch, and roach. The use of two separate sampling events increased overall species richness detected by eDNA that year for each site, compared with a single sampling event.

FIGURE 4
Rarefaction curves for the total EA and eDNA surveys undertaken in the freshwater (left) and tidal (right) Thames during 2018. Dashed line represents the extrapolated richness estimate calculated with R package iNEXT (Hsieh et al., 2016).

| Ecological analysis using EMS
Despite differences in the species detected by the different methods, metacommunity structure from initial elements of metacommunity structure (EMS) analyses of the Thames EA data, and the 12S eDNA data, both revealed a Clementsian structure (i.e., groups of species responding to environmental gradients (Tonkin et al., 2016)) with significant positive coherence (p < 0.001), significant positive turnover (p < 0.001) and significant boundary clumping (p < 0.001).
A visual inspection of the structures of both these analyses suggests a distinct division in the data that corresponded to the transitional and coastal (TRAC) surveys conducted in the tidal stretches (Figure S5). As such, this tidal subset was considered separately, and EMS

| DISCUSSION
To address the challenge of accurately assessing the diversity of fish communities in a large river basin that encompasses differing chemical and physical gradients, we focused on the Thames River, UK. eDNA metabarcoding consistently outperformed traditional survey methods at detecting freshwater species richness, despite extensive sampling efforts using the latter methods. In comparison, metabarcoding did not perform as well as traditional approaches in estuarine waters, likely due to high turbidity limiting the volume of water that was filtered. Despite this, we were able to make reliable detections, including the novel detection of the rare sea lamprey not encountered in EA surveys. We further demonstrated that minor variations in the data from all survey methods would not impact on the assessment of simple ecological models of community structure.
Rather, our findings support a growing consensus that eDNA can reliably detect fish communities across dynamic freshwater habitats (Fujii et al., 2019; García-Machado et al., 2021; Lecaudey et al., 2019; Sales et al., 2021), and that in many cases, traditional and eDNA approaches should be viewed as complementary. However, we caution that methods need to be optimized to account for differing chemistries among river systems. The turbidity encountered in the tidal Thames led to an undersampling of eDNA at these sites, and a more complete picture of the diversity in this zone may have been achieved had a greater volume of water been filtered. While it is possible that our eDNA detections may be false positives produced by wastewater, or the resuspension of DNA in sediments, the exceptional historic fishing records for the Thames form a robust reference baseline from which we were able to validate detections.

FIGURE 5
Non-metric multidimensional scaling (nMDS) plot of the eDNA detections across the Thames in 2018, with polygons grouping river zones. There is an overlap between the upper Thames (blue) and middle Thames (orange) sites, as well as an overlap of middle Thames sites with upper tidal sites (light green). The middle tidal (dark green) sites are separated from the other three zones (see Table S8 for abbreviations used).

| Performance of eDNA metabarcoding across lotic fresh and tidal waters
Overall, eDNA metabarcoding outperformed traditional survey methods at detecting species richness (Figure 3) and recovered the same community structure despite a lower sampling effort (39 eDNA surveys vs. 172 traditional fish surveys). This smaller effort in sampling eDNA detected 10 fewer species across fresh and tidal waters but detected species that are consistently underrepresented by traditional methods, particularly in freshwater. By extrapolation of the detection data for freshwater fishes (Figure 4), we demonstrated the superior ability of eDNA metabarcoding for species richness estimation in this habitat. Our findings add to a growing body of literature supporting the use of eDNA for the analysis of freshwater fish communities in riverine systems (e.g., Antognazza et al., 2021; Berger et al., 2020; Cilleros et al., 2019). However, we observed a distinct drop in species detection in the tidal stretches of the river Thames compared with that obtained by traditional survey methods, in which several lower tidal sites failed to produce any sequences.
This likely reflects the collection methods employed for our study being optimized for freshwaters, which do not take into account the heavy turbidity encountered in estuaries and so led to clogged filters and under-sampled diversity.Despite these issues, we detected the presence of the rare and protected sea lamprey, which has not previously been reported from >20 years of traditional EA catch surveys.
All survey methods have limitations and biases such that selected methods represent a compromise of accuracy, efficiency, and cost (Coté & Perrow, 2006; Oliveira et al., 2012; Portt et al., 2006). Traditional fish surveying often employs a range of methods reflecting the constraints of the environment and the taxa/life stage being targeted (Perrow et al., 2017; Pope et al., 2010). As such, many studies employ a multigear approach (Colclough et al., 2002; Oliveira et al., 2012). In the freshwater zones, the EA most frequently deploys a single approach, electrofishing, as recommended by the Water Framework Directive (CEN, 2003). Electrofishing is known to be biased toward larger individuals and more buoyant species, while small, benthic, and cryptic species are all underrepresented in electrofishing surveys (Portt et al., 2006). Despite this, electrofishing is generally assumed to provide reliable estimates of relative fish abundance combined with low equipment and labor costs (Jordan et al., 2008; Oliveira et al., 2012). Based on our findings for the paired freshwater sites, all species caught by electrofishing were also detected by eDNA metabarcoding; however, we detected a further six species by the latter method. It is of note for future surveying strategies that these species all display ecological traits biased against detection by electrofishing, for example, stone loach and lamprey are benthic species; 3- and 10-spined stickleback are small (average adult size >50 mm); ruffe and crucian carp are cryptic species in coloration and habit, and may be overlooked (Wheeler, 1978).
In contrast, at the paired tidal Thames sites, five species captured using traditional methods were not detected by eDNA: bleak, herring, zander, rudd, and Dover sole (Solea solea). These species all occupy midwater habitats, apart from Dover sole, a purely benthic species. Water turbidity varied across the Thames sites but overall became increasingly turbid and proved difficult to filter in the estuary (Figure S1) with on average 447.9 ml of water filtered at tidal sites, compared to 802.5 ml at freshwater sites. The multigear approach (kick sampling, seine net, beam trawl) deployed by the EA in the tidal Thames targets three habitats per site (shoreline, midwater, benthic) in an attempt to capture the most representative species composition (Colclough et al., 2002). By deploying a multigear approach in the tidal zone, the EA surveys detect a greater percentage of total richness per site than the EA methods employed in the freshwater zones, and by targeting different tidal habitats the EA multigear approach is also superior to our tidal eDNA sampling of a single habitat (shoreline) with a volume of water filtered which had been compromised by turbidity. As the tidal zone is considerably deeper and wider than the freshwater zones, there is the potential that the shoreline-based surface sampling that we deployed did not capture the eDNA of deeper water fish species. However, the tidal Thames is considered a well-mixed estuary, and studies on water chemistry do not show significant differences between the surface and bottom layers due to intense vertical mixing (Premier et al., 2019).
While the Thames estuary may be an exception, there is growing evidence that eDNA is not homogeneously distributed. Habitat preference and thermal stratification have implications for eDNA detectability, with eDNA shown to undergo little vertical mixing in marine environments (Jeunen et al., 2020) and in lacustrine environments, it remains relatively localized to its origins depending on the extent of water mixing (Lawson Handley et al., 2019; Littlefair et al., 2020). Recent studies on lotic environments have shown eDNA to not only be highly mobile and detectable many kilometers from its origin (Pont et al., 2018), but to also exhibit weak widthwise diffusion, allowing the recovery of fish community compositions at smaller geographic scales (Berger et al., 2020; Laporte et al., 2020).
Estuaries are potentially more complicated where the input of mixed water from lotic habitats combines with tidal action and stratification due to differences in water density. However, García-Machado et al. (2021) demonstrated the excellent recovery of distinct fish communities from a large temperate riverine system incorporating both fresh and estuarine waters.
There have been fewer explicitly estuarine eDNA studies (although see Zhang et al., 2019), and estuarine sites are frequently included as either part of freshwater (Sales et al., 2021; Yamanaka & Minamoto, 2016) or marine studies (Afzali et al., 2020; Kelly et al., 2018). Estuaries are considered particularly challenging for eDNA analyses due to high turbidity, which clogs filters, and elevated levels of PCR inhibitors (Sanches & Schreier, 2020), compounded by the transport of eDNA from upstream and tidal movements, making interpretation of results more complicated. In addition to not detecting the total richness per site of traditional survey methods in the tidal zone, we also found every tidal survey resulted in eDNA false negatives, compared with 56% of the freshwater sites. One very likely explanation for false negatives is that less water was filtered due to filter clogging: the volume of water that we were able to filter steadily decreased further seaward (Figure S1). This finding corresponds with an accumulation of sediment from Woolwich onwards (Baugh et al., 2013). Turbidity in the Thames estuary is greater than either the freshwater stretches or the coastal and offshore waters (Devlin et al., 2008) and studies have only recently investigated optimal field and laboratory protocols for estuarine eDNA (Sanches & Schreier, 2020). In future work, replacing clogging filters to increase the overall volume of filtered water could increase the probability of detecting less abundant species. Our study highlights the need to adapt sampling methodology, as well as sampling strategy, to effectively survey estuarine waters, such as those of the lower reaches of the tidal Thames.

| Using eDNA metabarcoding for ecological analyses
The suitability of eDNA-based identification as a replacement for morphological identification is particularly high for fishes, especially as a replacement for costly and destructive methods (Hering et al., 2018; Pont et al., 2019). The traditional methods employed by the EA in the Thames catchment produced gear-specific richness estimates, which were subsequently inflated by our eDNA detections highlighting the presence of rare species not detected by the traditional surveys. As shown in our study and others, eDNA metabarcoding can be valuable in capturing unseen diversity in fish communities (Kiszka et al., 2018) as well as detecting diversity in more challenging systems (Cilleros et al., 2019). In the freshwater Thames, the estimated Chao richness derived from 22 eDNA surveys was comparable with the cumulated number of species collected during 40 years of traditional surveys (36.5 and 40, respectively). Chao richness derived from the 2018 traditional surveys (n = 153) produced an estimate close to the detected richness (23.9 and 23). In the tidal Thames, the two methods detected similar species richness, although the species identities only matched partially. The multigear approach used by the EA in the tidal Thames provides a more comprehensive picture of the estuary than our eDNA method was able to capture in this study. However, any future studies should increase the volume of water filtered, as this is likely to increase the probability of detecting less abundant species. The detection of estuarine fish species richness using traditional methods is known to be problematic (Waugh et al., 2018) and although it has great potential, eDNA needs to be optimized for estuarine environments. Data derived from eDNA have only recently begun to be used in existing bioassessment programs with encouraging results (Bagley et al., 2019; Pont et al., 2019) although it is also proposed that current assessments of ecological quality could be adapted to eDNA frameworks (Hering et al., 2018; Ruppert et al., 2019).
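The Chao richness values discussed in this section are extrapolations from survey incidence data. As a rough guide to how a Chao-type estimate responds to rarely detected species, the sketch below implements the classic Chao2 incidence estimator, S_obs + ((T − 1)/T) · Q1² / (2 · Q2), where T is the number of surveys and Q1 and Q2 are the numbers of species detected in exactly one and exactly two surveys. This is an illustration under stated assumptions rather than the study's calculation: the estimates reported here were obtained with the R package iNEXT, whose rarefaction and extrapolation curves are more involved, and the function name and survey lists below are invented toy data.

```python
from collections import Counter


def chao2(surveys):
    """Chao2 richness estimate from a list of per-survey species sets."""
    t = len(surveys)
    if t == 0:
        raise ValueError("at least one survey is required")
    # Number of surveys in which each species was detected
    incidence = Counter(species for survey in surveys for species in set(survey))
    s_obs = len(incidence)
    q1 = sum(1 for n in incidence.values() if n == 1)  # detected in exactly one survey
    q2 = sum(1 for n in incidence.values() if n == 2)  # detected in exactly two surveys
    if q2 > 0:
        unseen = q1 ** 2 / (2 * q2)
    else:
        unseen = q1 * (q1 - 1) / 2  # bias-corrected form used when Q2 is zero
    return s_obs + (t - 1) / t * unseen


# Invented toy surveys, not data from this study
toy_surveys = [
    {"roach", "perch", "pike"},
    {"roach", "perch", "eel"},
    {"roach", "bream", "eel"},
    {"roach", "perch", "chub"},
]
print(chao2(toy_surveys))  # 6 observed species, estimate 9.375
```

The estimate exceeds the observed richness in proportion to the number of species seen only once, which is why a survey method that keeps adding singletons yields an extrapolated richness well above its observed count.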
The estimation of fish abundance from eDNA metabarcoding read counts remains an area of active research and contention (Afzali et al., 2020; Boivin-Delisle et al., 2021). Read count recovery in our mock communities was highly variable. For the CO1 primers, missing taxa precluded the use of these data as a measure of abundance, and a comparison of the fish communities detected with the 12S region (read counts) and traditional methods (abundance) showed a significant difference (Figure S4). These results suggest two potential interpretations of read counts. We created our mock community samples (a mix of U.K. and African freshwater fish species) using equal quantities of DNA from each species and we thus expected approximately equal DNA read representation. For the 12S data, the proportions of observed read counts for different species varied from 0.28% to 27.3% of the total assigned reads, compared with the expected 10% based on input DNA, with pogge, sand smelt, and Mastacembelus tanganicae over-represented in the data and Dover sole, pike, and dragonet underrepresented. For CO1, the proportion of observed read counts varied from 0% to 28.02%. Paired Wilcoxon tests suggest these values did not differ significantly from expected (CO1 marginally so) and, thus, it could be argued that abundance-based analyses can be performed from these data. However, with a threefold difference in read counts being non-significant it becomes equally difficult to interpret read abundances. While a non-significant difference in expected reads suggests some relative quantification is possible, the inconsistency in read recovery makes even relative ranking extremely difficult to interpret in a biological context. Associations between the relative abundance of eDNA metabarcoding read counts, and the relative abundance of fishes detected by traditional survey methods (Boivin-Delisle et al., 2021; Di Muri et al., 2020; Kelly et al., 2014) suggest that eDNA metabarcoding has the ability to reflect a quantitative estimate of aquatic diversity. However, given the variability
Mock communities constructed using pooled genomic DNA, which are then PCR amplified, represent a much more informative analog to biases which may occur in eDNA samples during the amplification process. Studies which have used this approach reiterate that relative abundances estimated from metabarcoding reads should be interpreted with caution (Lamb et al., 2019; Leray & Knowlton, 2017; Ratcliffe et al., 2021).
The EA surveys the tidal Thames twice a year in an attempt to monitor seasonal changes, including the presence of larval stages.
eDNA-based methods are at present unable to provide information on age, developmental stage, or size class (Evans & Lamberti, 2018;Pont et al., 2019) and when using maternally inherited mitochondrial markers it remains impossible to distinguish hybrids, such as the frequent cyprinid hybrids, which are recorded in the Thames system.
A careful analysis considering the uncertainty associated with the methods, as well as the costs, time, and logistics, would be useful in comparing eDNA metabarcoding with the traditional survey methods employed in this system.

| Novel species recovery based on eDNA
Our eDNA approach identified several species which were not detected by the traditional EA surveys. These included the rare and protected sea lamprey, which has never been detected by EA fish surveys (Kirk et al., 2002) and which, due to its benthic habits and slender, eel-like morphology, is likely to evade capture when seine netting is deployed (Portt et al., 2006). In a study of sea lamprey distribution around the Great Lakes of North America (Gingera et al., 2016) the eDNA detection frequency remained high (81%-97%) during the spawning season, but decreased to 6% when spawning finished. We detected this species at two tidal sites (Kew, site 24, and Chiswick, site 25; Figure 1) during June, which corresponds with the section of river where dead, post-spawning individuals were reported in 1999 (Kirk et al., 2002): thus, it is likely that our detection is a true presence. Further eDNA sampling on a more detailed scale using a more sensitive approach such as qPCR or CRISPR-Cas (e.g., Williams et al., 2021) at and around these sites and across the year, would be valuable in determining the presence and extent of occupancy of these fish in the Thames.
Several other notable species were detected with eDNA but were not documented during the 2018 EA surveys, including the two freshwater species: 10-spined stickleback and crucian carp. Crucian carp was detected using the 12S marker from lower freshwater and the upper tidal sites. Although crucian carp have previously been recorded by the EA in the freshwater Thames, there are no tidal records, such that our upper tidal detection could be due to the downstream transport of eDNA from freshwater reaches. There is also a possibility that these detections are false-positive assignments to the reference database as identification errors with goldfish (C. auratus (L.)) are common and the 12S reference data may be derived from a misidentified specimen or a hybrid of C. carassius (Knytl et al., 2018). If that is the case, these may represent actual detections of introduced feral goldfish in the tidal Thames, which is plausible due to this species' tolerance of saltwater (Tweedley et al., 2017).
The detection of Atlantic salmon is also intriguing. Historically, the Thames had a significant run of salmon, but pollution led to the local extinction of the species (Wheeler, 1979). Attempts to restock the Thames with salmon stopped in 1994 (Griffiths et al., 2011), but this species is still occasionally found, with the most recent record by the EA in 2014 (Marlow, Buckinghamshire). As such, our detections of this species (at Fulham and Billingsgate) may represent true positives, although we cannot rule out detections of human food waste at these locations.

| Spatial and temporal influences of eDNA
The transport of eDNA from upstream sites has previously been cited as a source of error in lotic eDNA studies (Roussel et al., 2015) with transport distances for eDNA recorded to vary from meters (Pilliod et al., 2014) to kilometers (Deiner & Altermatt, 2014; Pont et al., 2018). In addition, factors such as discharge, temperature, pH, and substrate all affect eDNA transport and detectability (Jo & Minamoto, 2021; Seymour et al., 2018; Shogren et al., 2018). Although in high discharge river environments, eDNA transport may inflate downstream richness and reduce beta diversity between sites (Deiner et al., 2016), recent studies have shown it is possible for eDNA to remain relatively spatially exclusive with low lateral dispersion even in lotic conditions (Berger et al., 2020; Laporte et al., 2020; Thalinger et al., 2020). In our study, species assemblages showed absences, additions, and turnovers along the river gradient consistent with known river fish community assemblages and expected structure (Li et al., 2018). The SIMPER analysis highlighted the dissimilarity between communities detected by eDNA in the different Thames zones, rather than the homogeneous community that would be predicted under extensive eDNA transport. The analysis of beta diversity showed the turnover component dominating over nestedness along the majority of the river (with the exception of the five sites within the upper Thames zone), also implying that the transportation of eDNA was not enough to blur community compositions. The small sample size and distinct rheophilic fish community of the upper Thames (including lamprey, grayling, and brown trout) may explain the low beta diversity of this zone.
At three sites in the tidal Thames (Richmond, Kew, Chiswick), winter and summer samples were taken to investigate seasonal changes in diversity. We detected a winter increase in richness, and summer detections of species corresponding with known spawning events (sea lamprey and flounder). The detection of the predominantly freshwater species bullhead, lamprey, rainbow trout (Oncorhynchus mykiss (Walbaum, 1792)), and stone loach during winter surveys may potentially be false positives due to transport of eDNA from communities upstream by seasonally increased water flow and the higher persistence of eDNA at colder temperatures (Collins et al., 2018; Jo et al., 2021). While it has been recommended that eDNA sampling takes place during low water flow (Milhau et al., 2019), this approach may miss seasonal changes in community structure, and there is the potential that the other sites in this study may also exhibit interseasonal variations in diversity which were missed due to the single sampling which took place. Ultimately, a more detailed investigation of seasonal variation in eDNA is needed.

| Comparative performance of molecular markers
It is widely acknowledged that the use of more than one DNA target or primer set provides superior coverage by limiting the impact of biases associated with any one method (Hänfling et al., 2016; Morey et al., 2020; Shaw et al., 2016), and the use of more than one marker has been recommended (Loeza-Quintana et al., 2020; Shu et al., 2020). Multiple genetic markers can compensate for inadequacies associated with specific gene regions, such as binding biases, incomplete databases or the inability to differentiate sister-taxa (Doble et al., 2020; Lecaudey et al., 2019; Morey et al., 2020). Despite both primer sets being designed specifically for fish detection, we recorded a clear difference in the taxonomic diversity detected by them. The 12S MiFish-U primers (Miya et al., 2015) have been widely used with success in a variety of habitats (Afzali et al., 2020; Doble et al., 2020; Littlefair et al., 2020; Sales et al., 2021) and in this study 12S consistently recovered a greater number of fish species. Fish represented a smaller portion of CO1 detections (Figure 3), with the remainder representing birds, mammals, aquatic invertebrates, bacteria, and fungi. The low fish specificity and wide amplification of other taxa may be a result of suboptimal amplification conditions; although the optimum annealing temperature of the primers was 46°C, low primer annealing temperature during PCR is also known to lead to low specificity (Collins et al., 2019; Siddall et al., 2009). The CO1 gene region has been used widely in fish barcoding studies (Collins et al., 2012; Hubert et al., 2008; Lowenstein et al., 2009; Vandamme et al., 2016); however, it has been argued that CO1 does not contain suitably conserved regions for targeted eDNA applications (Deagle et al., 2014). The Fish_MiniE primers (Shokralla et al., 2015) used in this study were designed for use with degraded DNA in food, and the short length of the target fragment (226 bp) and well-curated reference databases for CO1 fish sequences suggested these primers may also have a practical application for use with degraded eDNA samples.
Notably, in mock communities, 38% of CO1 reads were unassigned compared with only 0.02% of 12S reads. All species were detected with 12S in the mock communities, whereas three were not recovered by CO1. Despite exploration of general databases, it remains unclear what the unassigned reads in CO1 data represented in both mock communities and real samples, suggesting many of these may be amplification or sequencing errors. The CO1 region is useful when a broader taxonomic view is desirable or where the 12S reference databases are limited. The BOLD database (Ratnasingham & Hebert, 2007) and mBRAVE platform (Ratnasingham, 2019) provide the largest reference collection of barcodes, and have curated reference databases with excellent coverage of fish species (Collins et al., 2019). The CO1 data on BOLD are validated to a very high standard, with the inclusion of voucher specimen and location data, unlike much of the ribosomal data found on GenBank (Ward et al., 2009). Our observations demonstrate the clear difference in the application of these two regions and we would advocate 12S for fish recovery until more taxon-specific fish primers for CO1 are designed. In contrast, for a broader aquatic community, CO1 data are likely to be more applicable.

| CONCLUSION
To our knowledge, this study is one of the first to measure the efficiency of eDNA-based biomonitoring in a large dynamic river catchment from upper reaches through to estuary. We have shown eDNA metabarcoding in freshwater river systems detects greater species richness than traditional catch survey methods. The eDNA metabarcoding signal we detected was representative of the pattern of fish communities known to exist within the catchment, from the upper freshwater reaches to the tidal Thames. Our study thus adds to a growing number of examples of eDNA metabarcoding being comparable with, or even outperforming, traditional fish survey methods (i.e., electrofishing, visual surveys, acoustic telemetry, poisoning), across diverse aquatic environments including temperate rivers and canals (Pont et al., 2018; McDevitt et al., 2019; Antognazza et al., 2021), temperate and tropical lakes (Doble et al., 2020; Hänfling et al., 2016), and marine settings (Afzali et al., 2020). Although our results show the power and potential for the detection of fish diversity in freshwaters, they also illustrate the need for further refinement of eDNA collection in turbid estuaries. Given the ecosystem functions and services of estuaries, and how heavily they are affected by anthropogenic pressures (Sheaves et al., 2015), the development of accurate eDNA sampling is an important next step for the long-term management of this habitat.

ACKNOWLEDGMENTS
tected 33 species (27 in freshwater, 29 tidal) in total. Of the 33 species detected by eDNA, five were not recorded by the EA during 2018: crucian carp, thin-lipped gray mullet (Chelon ramada (L.)), sea lamprey, 10-spined stickleback, and Atlantic salmon. All of these species have previously been recorded in the Thames catchment by the EA, with the exception of the sea lamprey, which has never been detected during fish surveys. A total of 14 species detected by the EA were not detected by eDNA (see Table S8 for species); with the exception of zander and rudd, these are predominantly marine species caught in the tidal Thames. Within the freshwater surveys, the EA data had reached an asymptote in species richness with 172 surveys (Figure 4), demonstrating sampling completeness, with a final richness estimate of 23.9 species. Extrapolating eDNA detections to 172 surveys gave a species richness estimate of 36.5, exceeding the estimate based on traditional survey data and comparable with the richness obtained from the historic EA data. In comparison with the 53% difference in freshwater richness estimates between the eDNA and traditional survey data, there was a 22% difference in the richness estimates for the tidal Thames. When extrapolated to an endpoint (EA = 38, eDNA = 37 surveys), the richness estimate derived from the traditional survey was 35.8, compared with 30.2 derived from eDNA.

FIGURE 3 Species richness derived from the different sampling methods: 12S, CO1, both markers combined, and by traditional EA fish surveys
revealed a Gleasonian structure (i.e., indicating clear but individualistic turnover between sites) that was significantly positively coherent (p < 0.001), with significant positive turnover (p < 0.001) and non-significant boundary clumping, for both the EA and 12S data. The remaining data for the EA, which consisted of freshwater surveys, were also considered in isolation, with EMS suggesting a Clementsian structure again, but with no obvious ecological hypothesis for how this might be further separated. The remaining data for 12S, consisting of upper and middle Thames sites, exhibited a quasi-Clementsian structure (i.e., significantly positive coherence and significant boundary clumping, as seen in a Clementsian structure, but with non-significant positive turnover). EMS of the CO1 eDNA data also revealed a quasi-Clementsian structure: significantly positively coherent (p < 0.001), with non-significant positive turnover and significant boundary clumping (p < 0.001). There was no ecological rationale for further subsetting, and EMS analysis was stopped at that point.
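The way the three EMS elements combine into a named structure can be sketched as a small decision function. This is only an illustration of the logic described above, not the analysis pipeline used in the study: the `Element` class, the `classify_ems` function, the 0.05 threshold, and the example p-values are all hypothetical, and the branches for random, checkerboard, nested, and evenly spaced structures are assumptions drawn from the general EMS framework rather than from this text.

```python
from dataclasses import dataclass


@dataclass
class Element:
    """One EMS element (hypothetical container for illustration)."""
    positive: bool   # direction of the effect relative to the null model
    p_value: float   # significance of the departure from the null model


def classify_ems(coherence, turnover, clumping, alpha=0.05):
    """Map coherence, turnover and boundary clumping to a structure name."""
    def significant(element):
        return element.p_value < alpha

    # Without significant coherence there is no gradient to interpret
    # (assumed branch from the general framework).
    if not significant(coherence):
        return "random"
    if not coherence.positive:
        return "checkerboard"

    # Non-significant turnover gives the "quasi" version of a structure.
    prefix = "" if significant(turnover) else "quasi-"
    if not turnover.positive:
        return prefix + "nested"

    # Positive turnover: boundary clumping separates the remaining structures.
    if not significant(clumping):
        return prefix + "Gleasonian"
    return prefix + ("Clementsian" if clumping.positive else "evenly spaced")


# Illustrative inputs only: 0.0005 stands in for "p < 0.001",
# 0.4 for a non-significant result.
whole_catchment = classify_ems(
    Element(True, 0.0005), Element(True, 0.0005), Element(True, 0.4)
)
co1_pattern = classify_ems(
    Element(True, 0.0005), Element(True, 0.4), Element(True, 0.0005)
)
print(whole_catchment, co1_pattern)
```

With these placeholder inputs, the first call follows the Gleasonian definition given in the text and the second follows the quasi-Clementsian definition.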
in our mock community recovery, it is not clear from our data what magnitude of difference in read count would indicate a true difference in abundance (or biomass). As a consequence, we have relied primarily on presence-absence data in our interpretation and comparisons. While a recent study by Boivin-Delisle et al. (2021) showed no amplification bias in a mock community that was similarly amplified with the MiFish primers used here, their use of pre-amplified DNA extracts precludes the investigation of amplification bias during PCR, and rather illustrates a lack of error post-PCR and during sequencing.
This work was supported by the Natural Environment Research Council (award NE/L002485/1) and The Fishmongers' Company's Fisheries Charitable Trust. We thank the staff of the Environment Agency for their assistance and advice, in particular Darryl Clifton-Dey, Tom Cousins, and Jon Baxter. We are also grateful to Dr Christopher Doble for laboratory support, Dr Rosie Drinkwater for bioinformatic support, Dr Rupert Collins for technical advice, Joseph Trafford and Wendy Hart for help with DNA extractions, and the field assistants involved in collecting eDNA samples. We thank two anonymous reviewers for their insightful comments on an earlier version of this work.
TABLE 1 Comparisons of the species richness detected at the paired EA and eDNA surveys, with total richness combined from both survey methods. Sites 1-17 are freshwater Thames, sites 23.1-30 tidal Thames. Sites numbered X.1 were surveyed on more than one occasion. Beta.sor = total Sørensen dissimilarity, beta.sim = Simpson pair-wise dissimilarity measuring species turnover, beta.sne = dissimilarity accounting for species nestedness

TABLE 2 Species richness estimates derived from EA fish surveys and eDNA in the freshwater and tidal Thames in 2018, both separately and combined, and compared to the total richness derived from the historic EA data beginning in 1978 (note N surveys)
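
The three pair-wise dissimilarity components named in the caption comparing paired EA and eDNA surveys (beta.sor, beta.sim, beta.sne) can be illustrated with a short sketch. It assumes the commonly used partition in which total Sørensen dissimilarity is split into a turnover part (Simpson) and a nestedness part, computed from the number of shared species (a) and the species unique to each survey (b, c). The function name `beta_partition` and the two species lists are hypothetical and are not data from this study.

```python
def beta_partition(survey_1, survey_2):
    """Return (beta_sor, beta_sim, beta_sne) for two species sets."""
    survey_1, survey_2 = set(survey_1), set(survey_2)
    if not survey_1 or not survey_2:
        raise ValueError("both surveys must contain at least one species")

    a = len(survey_1 & survey_2)   # species shared by both surveys
    b = len(survey_1 - survey_2)   # species found only in the first survey
    c = len(survey_2 - survey_1)   # species found only in the second survey

    beta_sor = (b + c) / (2 * a + b + c)       # total Sørensen dissimilarity
    beta_sim = min(b, c) / (a + min(b, c))     # turnover (Simpson) component
    beta_sne = beta_sor - beta_sim             # nestedness-resultant component
    return beta_sor, beta_sim, beta_sne


# Illustrative species lists only (not results from the paired surveys).
catch_survey = {"roach", "perch", "pike", "chub"}
edna_survey = {"roach", "perch", "pike", "eel", "bream", "dace"}

beta_sor, beta_sim, beta_sne = beta_partition(catch_survey, edna_survey)
print(f"beta.sor={beta_sor:.2f} beta.sim={beta_sim:.2f} beta.sne={beta_sne:.2f}")
```

In this made-up pair, three species are shared, one is unique to the catch survey, and three are unique to the eDNA survey, so part of the dissimilarity reflects species replacement and the rest reflects one list being largely a subset of the other.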