Abstract
Large-scale biodiversity monitoring remains a challenge in science and policy. ‘Biodiversity Observation Networks’ provide an integrated infrastructure for monitoring biodiversity through timely discovery, access, and re-use of data, but their establishment relies on an in-depth understanding of existing monitoring effort. We performed a scoping review and network analysis to assess the scope of available data on amphibians and reptiles in the UK and catalogue the mobilisation of information across the data landscape, thereby highlighting existing gaps. The monitoring portfolio has grown rapidly in recent decades, with over three times as many data sources now available as there are amphibian and reptile species in the UK. We identified 45 active sources of ‘FAIR’ (‘Findable’, ‘Accessible’, ‘Interoperable’ and ‘Reusable’) data. The taxonomic, geographic and temporal coverage of datasets appears largely uneven and no single source is currently suitable for producing robust multispecies assessments on large scales. A dynamic and patchy exchange of data occurs between different recording projects, recording communities and digital data platforms. The National Biodiversity Network Atlas is a highly connected source but the scope of its data (re-)use is potentially limited by insufficient accompanying metadata. The emerging complexity and fragmented nature of this dynamic data landscape is likely to grow without a concerted effort to integrate existing activities. The factors driving this complexity extend beyond the UK and to other facets of biodiversity. We recommend integration and greater stakeholder collaboration behind a coordinated infrastructure for data collection, storage and analysis, capable of delivering comprehensive assessments for large-scale biodiversity monitoring.
Introduction
Human activity is influencing biodiversity turnover across the globe (Dornelas et al. 2014; Keil et al. 2015; Kaarlejärvi et al. 2021). Monitoring biodiversity at large spatial and temporal scales is central to understanding the magnitude of change, and is important in conservation planning and resource allocation by decision-makers (Parr et al. 2002; Petersen et al. 2021; Thornhill et al. 2021). Understanding species status and monitoring change requires high-quality data. Data must provide sufficient taxonomic, temporal and geographic coverage to reliably inform evidence-based conservation (Wetzel et al. 2018). Given the economic and logistical constraints involved with monitoring, our understanding of biodiversity turnover would likely benefit from combining data originating from different sources. Modern statistical advancements can facilitate the integration of data to produce accurate assessments on the state of biodiversity (Isaac et al. 2020). However, effective monitoring over the long term requires streamlining disparate efforts in the collection, storage and analysis of biodiversity data.
Historically, biological recording has been coordinated by institutions and carried out by volunteer recorders (Roy et al. 2012; Pocock et al. 2015). Nowadays, the popular practice of engaging volunteers in a scientific project is commonly regarded as ‘citizen science’ (Cohn 2008) which can generate ‘community contributed data’. Citizen (or ‘community’) scientists can assist with the collection of biodiversity data on large spatial and temporal scales that would otherwise not be feasible using small-scale studies (Cohn 2008; Pocock et al. 2017; Dobson et al. 2020; Thornhill et al. 2021). Citizen science projects can vary in their objectives and methodological approaches, and, as with any dataset, can be subject to errors and biases imposed by observational processes (Oliveira et al. 2016; Dobson et al. 2020). Once the characteristics of datasets are understood, constituent data can be handled for bias mitigation (Roy et al. 2012; Isaac et al. 2014; Dobson et al. 2020). For instance, sophisticated analytical tools have emerged in recent years to assist data users to assess (e.g., Boyd et al. 2021) and address (e.g., Bird et al. 2014; Geldmann et al. 2016) sampling errors and biases within observation datasets. Datasets derived from citizen science projects can therefore complement small-scale systematic monitoring and are important sources of observation data, usable by scientists and resource-managers for monitoring biodiversity change (Roy et al. 2012; Bonney et al. 2014; Burgess et al. 2017; McKinley et al. 2017; Tredick et al. 2017; Thornhill et al. 2021).
The value of citizen science, combined with a surge in technological innovation, has increased the diversity of projects that generate data in recent years (Kosmala et al. 2016; McKinley et al. 2017; Pocock et al. 2017; Thornhill et al. 2021). Data management has also evolved with the increased wealth of biodiversity data. Data flows from individuals (e.g. recorders, project organisers, stakeholder groups, consultants) to digital data platforms (e.g. web-based apps, e-infrastructure data portals, multi-dataset repositories) via an array of quality assurance techniques (e.g. automated data verification tools, validated datasets) applied at various levels before reaching data users (James 2011; Roy et al. 2012; Pocock et al. 2015). Collectively, such diversification has resulted in a rich data-gathering landscape. Despite these advances, significant challenges remain for collating and analysing data from different sources. These include issues associated with data confidentiality and fears that open data could endanger sensitive species and their habitats (e.g. persecution, or accidental damage to sites by naturalists wanting to see species), as well as a reluctance to share data that could be used for commercial purposes (Griffiths et al. 2015; Fox et al. 2019). Whilst a rich array of biodiversity data now exists, there remain significant barriers to ensuring that it informs decision-making efficiently. Nonetheless, at a time when there has never been a greater wealth of data available, recent advances in computing mean that there is now also an accompanying suite of statistical tools available to researchers to maximise the re-use of multiple datasets. Hence, the potential to widen the reach of existing biodiversity monitoring efforts and increase the re-use of data through integration is rapidly building momentum.
Effective integration requires a framework that unifies disparate monitoring efforts (König et al. 2019). By its very nature, data integration is dependent upon data sharing. The sheer magnitude of all biodiversity on Earth means that no single institution can know more than a tiny proportion of it at any given time (Walters and Scholes 2017). Stakeholder collaboration in the collection, sharing and analysis of biodiversity data therefore has clear benefits for biodiversity monitoring. Whilst innovative technology can enhance data collection, database design, and data sharing (Roy et al. 2012), large-scale monitoring is difficult without a coordinated infrastructure (Walters and Scholes 2017). ‘Biodiversity Observation Networks’ (e.g., see Wetzel et al. 2018) are collaborative organisational structures for monitoring biodiversity through sharing data, resource and expertise by stakeholders (Walters and Scholes 2017). A network can thus increase the mobilisation of data for research and resource-management. Accordingly, more precise tracking of biodiversity across space and time is possible, thereby enriching the contribution of disparate monitoring efforts (Constable et al. 2010; Jetz et al. 2019). However, the success of a network relies on uptake of common practices across the sector (König et al. 2019). Examples of relevant practices include data standards such as Darwin Core (Wieczorek et al. 2012) for data storage and guiding principles such as the FAIR data principles (Wilkinson et al. 2016) to ensure that data may be ‘Findable’, ‘Accessible’, ‘Interoperable’ and ‘Reusable’, thus enabling the mobilisation of usable biodiversity data.
While advances in technology will facilitate sharing of ‘best practice’ for handling and mobilising biodiversity data, there will still be challenges associated with maximising the quality of data and overcoming biases. Understanding the differences amongst data sources; their taxonomic, temporal and geographic scope; and the flow of data between sources can identify gaps in existing monitoring portfolios. In turn, this can illuminate ways in which monitoring and data may be integrated (Petersen et al. 2021). For instance, the monitoring of amphibians and reptiles in the United Kingdom (UK) is carried out in a number of ways by a diverse recording community. There are thirteen species of amphibians and reptiles native to the UK, many of which have experienced recent declines (Humphreys et al. 2011; Wilkinson and Arnell 2013; Beebee and Ratcliffe 2018; Gardner et al. 2019). Whilst climate change and habitat degradation threaten the UK populations (Dunford and Berry 2012; Turner and Maclean 2022) and may be implicated in declines, formal assessments of status and national trends have largely relied on anecdotal evidence (Hayhow et al. 2019). The lack of empirical evidence is surprising given that such a limited range of species should be relatively simple to identify, and that several are also subject to legal protection with mandatory reporting requirements. Adopting an integrated, network approach is therefore likely to enhance the monitoring and conservation effort for these species.
As an essential first step towards understanding the value of existing data and opportunities for integration, we surveyed the UK amphibian and reptile data management landscape. We used a scoping review and network analysis to characterise and track the mobilisation of information across the data landscape. To our knowledge, this study is the first to use this approach in the context of biodiversity data exploration. We identified an array of existing sources of amphibian and reptile data, characterised the scope of data sources using the available metadata associated with each source, and highlighted limitations in the existing monitoring portfolio for tracking species’ national status and population trends. The network analysis illuminated the dynamics at play within an unrealised Biodiversity Observation Network. To this end, the aims of this review were to:
(1) Identify existing sources of UK amphibian and reptile observation data.
(2) Characterise the taxonomic, geographic and temporal scope of UK amphibian and reptile data sources and their corresponding sampling and dataset quality assurance procedures.
(3) Catalogue the mobilisation of data between sources of UK amphibian and reptile observation data.
(4) Identify gaps in the existing UK amphibian and reptile data management landscape.
(5) Provide recommendations for achieving an integrated Biodiversity Observation Network.
Methodology
Search strategy
We used a scoping review framework (see Arksey and O’Malley 2005; Levac et al. 2010) to survey sources of amphibian and reptile observation data. This enabled us to identify sources that were not strictly locked within academic literature, thus reflecting the type of data suitable for integration across a network. We performed searches between 7 November 2020 and 15 January 2021. First, data sources were identified through consultations with three stakeholder organisations: the Amphibian and Reptile Group (ARG UK) network, the Amphibian and Reptile Conservation (ARC) Trust, and the British Trust for Ornithology (BTO). The ARG UK network and the ARC Trust are leading amphibian and reptile conservation charities that support various projects monitoring and conserving native amphibian and reptile populations across the UK. The BTO is a mainstream UK conservation charity leading on the conservation and research of birds and other British wildlife. The BTO Garden BirdWatch scheme is one of the largest biodiversity monitoring citizen science projects in the UK and generates thousands of amphibian and reptile observations annually. Following initial consultations, we performed a series of desk-based searches of electronic databases, internet search engines, registries of biological records data, and the grey literature. We searched the Google search engine (www.google.co.uk) using the key terms “UK reptile amphibian data” and “UK biodiversity recording database”, respectively. The first 100 results of each search were reviewed as potential sources of reptile and/or amphibian observation data. Next, we searched the National Biodiversity Network (NBN) Atlas (www.nbnatlas.org) using the ‘advanced search’ function to filter for relevant data partners and datasets using the [Species/Taxon (any)] field and searching the key term “reptile OR amphibian”. 
We then searched the UK Environmental Observation Framework Catalogue (www.ukeof.org.uk) for relevant datasets using the key term “reptile OR amphibian”. Relevant platforms were also identified from manual interrogation of those listed on the National Forum for Biological Recording (www.nfbr.org.uk) and the Biological Records Centre (BRC) (www.brc.ac.uk). Lastly, it was important to manually search key organisations’ websites to identify data sources that may have been missed in other searches (Arksey and O’Malley 2005). Accordingly, we searched the websites of six biological recording organisations: the ARC Trust (www.arc-trust.org), ARG UK (www.arguk.org), the British Herpetological Society (BHS) (www.thebhs.org), BTO (www.bto.org), Froglife (www.froglife.org) and Natural England (NE) (www.naturalengland-defra.opendata.arcgis.com). All desk-based searches were repeated between 22 and 24 May 2021 to identify additional sources missed during the initial search. The results of our search strategy are provided in full in the electronic supplementary information (see S1).
Data source selection
Data source selection was an iterative process which involved searching for sources of data, refining the search strategy, and reviewing sources for inclusion (Levac et al. 2010) (see Fig. 1). Where appropriate, we grouped sources according to their overarching ‘umbrella’ organisations or collectives as these were analogous in their purpose and operations. The inclusion criteria used in this review focused on capturing FAIR datasets (see Wilkinson et al. 2016) which included records of UK-native live, wild amphibians (Common frog Rana temporaria, Common toad Bufo bufo, Great crested newt Triturus cristatus, Natterjack toad Epidalea calamita, Northern pool frog Pelophylax lessonae, Palmate newt Lissotriton helveticus, Smooth newt Lissotriton vulgaris) and reptiles (European adder Vipera berus, Grass snake Natrix helvetica, Sand lizard Lacerta agilis, Slow-worm Anguis fragilis, Smooth snake Coronella austriaca, Viviparous lizard Zootoca vivipara). Data sources were included if they were discoverable using the search strategy (‘Findable’), contained available datasets (‘Accessible’) with broadly applicable content and language (‘Interoperable’), and had descriptive information available about the sampling methods used for generating data (‘Reusable’).
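The four inclusion criteria can be read as a simple screening rule: a candidate source is retained only if every criterion holds. The following Python sketch illustrates that logic under stated assumptions; the class, field names and example sources are hypothetical and are not taken from the review's charting form or supplementary material.

```python
from dataclasses import dataclass


@dataclass
class CandidateSource:
    """Hypothetical screening record for one candidate data source."""
    name: str
    native_wild_records: bool         # records of UK-native live, wild species
    found_by_search: bool             # 'Findable'
    datasets_available: bool          # 'Accessible'
    broadly_applicable: bool          # 'Interoperable'
    sampling_methods_described: bool  # 'Reusable'


def meets_inclusion_criteria(source: CandidateSource) -> bool:
    """A source is included only when all criteria are satisfied."""
    return all([
        source.native_wild_records,
        source.found_by_search,
        source.datasets_available,
        source.broadly_applicable,
        source.sampling_methods_described,
    ])


# Illustrative candidates only; neither is a real source from the review.
candidates = [
    CandidateSource("Example pond survey", True, True, True, True, True),
    CandidateSource("Example private archive", True, True, False, True, False),
]

included = [c.name for c in candidates if meets_inclusion_criteria(c)]
```

Treating the criteria as a strict conjunction is an assumption made for illustration; in practice, screening was iterative and involved judgement about each source.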
Data charting
We collated information on the characteristics of datasets from data source websites and through consultations with data publishers, where this could be arranged (incl. ARC Trust, ARG UK, BTO, Royal Society for the Protection of Birds, Froglife, BRC, and The Woodland Trust). We abstracted the metadata and sampling event information for each data source using a data charting technique to synthesise and interpret the information. Where available, we captured the following information using a standardised form: source name; publisher/organiser; background information and purpose; recorder characteristics; type of available data; temporal coverage (i.e., year of establishment and/or year that source became involved in amphibian and/or reptile data collection, storage and/or management); geographic coverage; taxonomic coverage; data quality assurance procedures; data transfer activity.
Methodological appraisal
We summarised data sources according to their characteristics, data generation procedures and dataset attributes. This included an assessment of the taxonomic, geographic and temporal scope of the dataset, recorder characteristics and any data quality assurance techniques used, as derived from the available metadata associated with each data source. We categorised data sources based on the structural traits of their equivalent datasets using a five-point Likert scale ranging from highly structured (A) to unstructured (E) data (see electronic supplementary information, S2).
Data analysis
We performed a network analysis to visualise the mobilisation of data between sources and identify prominent sources in the network. Data sources were represented in a network as nodes and data transfers were mapped as links, plotted using the GGally (Schloerke et al. 2021), ggplot2 (Wickham 2016), network (Butts 2015), and igraph (Csardi and Nepusz 2016) R packages. Network metrics were computed using the ‘networkD3’ (Allaire et al. 2017) R package. The ‘degree’ of nodes reflected the number of directional links with other nodes. The average number of links to pass through a node was calculated as ‘betweenness’. ‘Betweenness centrality’ was computed as the number of instances in which a node fell on the shortest path between two other nodes, thus facilitating data transfer between sources. To identify influential sources in the network, eigenvector values were computed which took account of the ‘degree’ of nodes and their connectedness to other well-connected nodes. Nodes with high eigenvector values were centralised in the network and were, therefore, largely influential in the mobilisation of data across the data landscape. All data analysis procedures were carried out using R studio v4.0.2 (R Core Team 2021).
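The three metrics can be illustrated on a toy network. The sketch below is written in Python using only the standard library rather than the R packages named above, so it is not the code used in this study. The node names and links are hypothetical, links are treated as unweighted, and eigenvector centrality is computed on the undirected version of the network and scaled so that the largest value is 1 (both are assumptions).

```python
from collections import deque

# Hypothetical data transfers as (exporter, receiver) pairs.
EDGES = [
    ("Project A", "Platform X"),
    ("Project B", "Platform X"),
    ("Community C", "Platform X"),
    ("Platform X", "Community C"),
    ("Platform X", "Platform Y"),
    ("Platform Y", "Project D"),
]


def degree(nodes, edges):
    """Number of directional links (incoming plus outgoing) per node."""
    deg = dict.fromkeys(nodes, 0)
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg


def betweenness(nodes, edges):
    """Shortest directed paths passing through each node (Brandes' algorithm)."""
    succ = {n: [] for n in nodes}
    for u, v in edges:
        succ[u].append(v)
    bc = dict.fromkeys(nodes, 0.0)
    for s in nodes:
        stack, preds = [], {n: [] for n in nodes}
        sigma = dict.fromkeys(nodes, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:  # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in succ[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(nodes, 0.0)
        while stack:  # accumulate dependencies, farthest nodes first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc


def eigenvector_centrality(nodes, edges, iterations=200):
    """Shifted power iteration on the undirected network, scaled to max = 1."""
    nbrs = {n: set() for n in nodes}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)
    x = dict.fromkeys(nodes, 1.0)
    for _ in range(iterations):
        nxt = {n: x[n] + sum(x[m] for m in nbrs[n]) for n in nodes}
        top = max(nxt.values())
        x = {n: value / top for n, value in nxt.items()}
    return x


nodes = sorted({n for edge in EDGES for n in edge})
deg = degree(nodes, EDGES)
bc = betweenness(nodes, EDGES)
ec = eigenvector_centrality(nodes, EDGES)
```

In this toy network the hub ("Platform X") ranks highest on all three metrics, which mirrors the qualitative pattern of a well-connected platform bridging transfers between otherwise unconnected sources.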
Results
Data source attributes, contributors and quality assurance
We identified 45 sources of UK amphibian and reptile observation data from the scoping review (see Table 1). These sources clustered into three typologies: ‘recording projects’ (n = 26), ‘recording communities’ (n = 4) and ‘digital data platforms’ (n = 15). Recording projects reflected a coordinated data collection activity that followed a defined methodology (e.g. systematic or semi-structured monitoring, see Table 1) with a discrete taxonomic, geographic or temporal focus. Recording communities were organised groups of individuals that carried out the collection of data and coordinated the storage and sharing of data. Recording communities typically organised and participated in sampling events, though were not defined by methodological constraints, and hence we identified recording communities that were associated with several datasets. Digital data platforms represented online tools for the direct capture, storage, or export of records. The structure of datasets varied across data sources. Heterogeneous datasets (Group C) were the most widely available across sources (42%), particularly for digital data platforms, with component records an aggregation of verified and unverified opportunistic sightings and systematic survey data collected by a variety of recorders. There was comparable abundance of highly structured (Group A, 24%) and semi-structured datasets (Group B, 22%). Generally, these datasets consisted of validated and verified records that had been collected using pre-defined (semi-)systematic methodologies.
The recorders contributing to the various data sources ranged from novice citizen scientists to experienced species surveyors and a combination thereof. Nine recording projects solely recruited citizen scientists (of any ability) to collect data. Seven recording projects only recruited experienced (often licenced) species surveyors to collect data, particularly when European-protected species were the taxonomic focus of monitoring. When these species were the taxonomic focus of citizen science-based recording projects, citizen scientists accompanied licenced species surveyors to collect data on systematic surveys and received training in species survey methodologies and identification. Across all citizen science-based recording projects, six of the project organisers provided only identification guides, whereas five of the projects did not issue identification guides or any formal training to citizen scientists. Most digital data platforms included verification and validation procedures for quality control purposes. Typically, this encompassed verification by species experts (e.g. ‘County Recorders’). Record verification for some platforms also relied on community knowledge, whereby online communities of wildlife recorders provided identification suggestions to each other’s observations, and automated computer checks to flag (likely) errors to recorders entering data or to verifiers after records had been entered. Further information on the characteristics of each data source is provided in the electronic supplementary information (S3).
Taxonomic coverage
Sources of data for all native species of UK amphibians and reptiles were identified. As illustrated in Fig. 2, multispecies datasets, particularly for widespread species, featured extensively across sources. Recording communities only generated multispecies data. Digital data platforms also typically captured multispecies data, whilst a targeted taxonomic focus was more common amongst recording projects. Amphibians were the taxonomic focus of data sources more frequently than reptiles. Common frog had the best coverage across all data sources, as approximately three-quarters of all sources captured observation data for this species. Sources of data for great crested newts were also widely available and seven sources included eDNA records. European adder, sand lizard and grass snake were the focal species most frequently targeted amongst the reptile recording projects. No source specifically targeted the sole collection of data for palmate newt, slow-worm, smooth newt or viviparous lizard, though data for these species were captured by multispecies sources and can be obtained from sources targeting species with legislative reporting requirements.
Geographic coverage
There was division in the geographic availability of data sources included in this review (see electronic supplementary information, S5). Data sources pertained mostly to England (n = 41), followed by Wales (n = 34) and Scotland (n = 30). Northern Ireland had the fewest sources of data (n = 24). Data available through digital data platforms generally had the largest (national) geographic coverage. We note, however, that while many data sources had the potential for national coverage, their actual spatial extents were often more restricted to a number of targeted sites (see electronic supplementary information, S3). This was particularly evident for recording projects and for recording communities that were bound by a local (e.g. Vice-county) perimeter of operation (see electronic supplementary information, S3).
Temporal coverage
The number of data sources has fluctuated widely over the last century (see Fig. 3). Records of human observations were available from 1900 whilst eDNA records emerged from 2013 onwards. One digital data platform was active since the start of the 1900s. Recording communities emerged in the 1910s and the first (still active) recording projects emerged in, or shortly prior to, the 1950s. The largest increase in the total number of active sources was observed between the 2000s and the 2010s; rising from 12 sources at the end of the 1990s to 44 active sources by 2019. Despite the majority of sources emerging later towards the 2000s, digital data platforms usually included historical records prior to the platform’s establishment. Recording projects generally ranged between 3 (IQ1) and 31 (IQ3) years, with a mode of 3 years. Historical records were also available through the ARC Reserves Surveys, reflecting some of the earliest records available through a recording project. The Natterjack Toad Monitoring Programme and the Sand Lizard Monitoring Programme had the longest periods of continuous monitoring of any recording project, spanning 40 and 37 years respectively. Recording communities also typically had extensive periods of activity, on average 66 years.
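The interquartile range and mode reported for recording project durations are standard summary statistics. As a minimal sketch, they could be computed in Python as follows; the durations are invented for illustration, and the quartile method ("inclusive") is an assumption, since the method used in the review is not stated.

```python
import statistics

# Hypothetical recording project durations in years (not the review's data).
durations = [3, 3, 3, 5, 8, 12, 20, 31, 40]

# quantiles(..., n=4) returns the three cut points Q1, Q2 (median) and Q3.
q1, median, q3 = statistics.quantiles(durations, n=4, method="inclusive")
most_common = statistics.mode(durations)
```

Different quartile conventions can give slightly different values for small samples, so the method should be reported alongside the summary.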
Network analysis
The network analysis illustrated the flow of data between sources (see Fig. 4). The analysis indicated that the UK amphibian and reptile monitoring portfolio is a dynamic and fragmented data landscape. Two isolated nodes, with no links to any other source, were identified in the network. All other sources had at least one link, but some appeared to only receive data and did not export data to other sources. The degree (‘g’) of nodes averaged 4.6 links per data source, though 53% of sources had two or fewer links. Digital data platforms generally had the highest number of connections. Overall, the NBN Atlas had the highest number of connections (g = 21), followed by the LERCs (g = 19) and the Living ARCive (g = 17). The ARGs/RAGs (g = 13) and Great Crested Newt Level 1 Licence Returns (g = 6) sources were the most connected recording community and recording project in the network, respectively. On average, data mobilised across 2.4 links between sources within the network. Digital data platforms often fell on the shortest path between other nodes in the network (betweenness centrality, ‘bc’) and were highly influential over the mobilisation of data across the network (eigenvector centrality, ‘ec’). The NBN Atlas (bc = 309), Living ARCive (bc = 281), and LERCs (bc = 172) had the highest bc across all sources, indicating that these sources, particularly the NBN Atlas, most frequently bridged the transfer of data between two other sources in the network. The LERCs had the highest ec overall, indicating that these were the most centralised sources of data within the network, with high connectedness to other centralised sources. Other centralised sources (ec > 0.60) in the network included the NBN Atlas (ec = 0.90), iRecord (ec = 0.67), ARGs/RAGs (ec = 0.65), and Record Pool (ec = 0.64).
Discussion
Integrated biodiversity monitoring may enhance the (re)usability of available data and enable more precise tracking of biodiversity over large spatial and temporal extents. In this review, we explored the scope of existing sources of FAIR (see Wilkinson et al. 2016) amphibian and reptile data for assessing species status and national trends in the UK. Recognising that individual datasets were collated with specific purposes in mind, we did not seek to ascertain which were “the best” sources of data. Rather, we sought to illustrate the heterogeneity of the data landscape and to identify taxonomic, temporal and geographic gaps in the existing monitoring portfolio. Whilst diversity can enhance monitoring capabilities, we observed an emerging problem of complexity and fragmentation that is likely to amplify under ongoing technological innovation (e.g., see August et al. 2015). Collectively, datasets may provide comprehensive information for all species and regions, but without integrating disparate monitoring efforts, the ongoing complexity and fragmentation of the evidence base is only likely to increase. Many of the factors driving this situation are pertinent to biodiversity monitoring more widely, so the problems and solutions are likely to be general. The integration of data in a unifying network infrastructure that streamlines fragmented monitoring may offer more precise, up-to-date biodiversity assessments over multiple scales.
The UK amphibian and reptile monitoring portfolio is a diverse data landscape comprising recording projects, recording communities and digital data platforms that collect, curate, and share data for all native species. The large number of sources is testament to a growing conservation community and should be celebrated. However, this diversity presents challenges for synthesis in research and decision-making processes at national scales. Digital data platforms are key for the mobilisation of data, particularly the LERCs and the NBN Atlas, which are highly connected and centralised sources in the data landscape. Collectively, the LERCs interacted with other important sources more frequently than the NBN Atlas alone, which led to their aggregated position as the most centralised sources in the data landscape. The mobilisation of data from some LERCs can, however, be restricted by paywalls, formatting incompatibility or constraints on data sensitivity and confidentiality. At its inception, the NBN Atlas sought to become “the best wildlife information management structure” by capturing, enhancing and mobilising wildlife data, making information widely available and engaging people about wildlife (NBN Trust 2014). We found that the NBN Atlas is highly connected to other data sources and is a central distributor of information, frequently bridging the transfer of data between other sources. This suggests that the NBN Atlas has been reasonably successful towards achieving its aims in collating and making data widely available. However, the full vision of the NBN Atlas may not yet be realised as we found that it lacked detailed metadata on the sampling protocols used to generate datasets. This information is essential for reusing data in other contexts. 
Taken together, our findings suggest that whilst the NBN Atlas is the most publicly accessible source and has the potential to reach its objective of becoming “the best wildlife information management structure”, it currently falls short due to insufficient metadata and lower rates of data sharing with other important data distributors than could be achieved.
It is important to stress that the high centrality metric used in our analysis does not directly relate to “the best” data source. Instead, we used this metric to highlight which sources are influential in the mobilisation of biodiversity data (Zhao and Zhang 2020). There are many advantages to diversity in species recording and data management. Multiple organisations working together can address more facets of biodiversity monitoring beyond the capacity of any standalone organisation. A variety of stakeholders also fulfil different roles within a nature conservation network, from bottom-up primary data generators, with detailed regional or taxonomic expertise, to top-down statutory monitoring and governance. It is encouraging that we observed a high reciprocity of data transfer between sources as this suggests that many organisations are promoting a FAIR and open data landscape. However, high mobilisation of data may affect the quality of available data as there are multiple levels at which information may be lost through data manipulation and interpretation by data users. We identified isolated sources and one-way links which may pose significant weaknesses in the network. Catastrophic data loss could occur for some species and regions if an organisation collapses or ceases to collect data into the future. Poorly connected sources may also be less likely to contribute to wider biodiversity conservation efforts than well-connected sources. Hence, sources of this nature may limit the mobilisation of data across a network, hampering future integration efforts and restricting the information available for research and for informing national policy.
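The structural weaknesses described here (isolated sources, receive-only sources and one-way links) and the reciprocity of data transfer can all be read directly from a list of directed links. The Python sketch below shows one way to do so; the node names and links are hypothetical, and reciprocity is taken to be the proportion of directed links that are returned in the opposite direction, which is one common definition and an assumption here.

```python
def summarise_links(nodes, links):
    """Summarise weak points in a directed data-transfer network."""
    links = set(links)
    exporters = {u for u, _ in links}
    receivers = {v for _, v in links}
    # Links that are matched by a link in the opposite direction.
    mutual = {(u, v) for u, v in links if (v, u) in links}
    return {
        "isolated": sorted(set(nodes) - exporters - receivers),
        "receive_only": sorted(receivers - exporters),
        "one_way": sorted(links - mutual),
        "reciprocity": len(mutual) / len(links) if links else 0.0,
    }


# Hypothetical sources and transfers as (exporter, receiver) pairs.
NODES = ["Community", "Platform", "Project 1", "Project 2", "Project 3"]
LINKS = [
    ("Community", "Platform"),
    ("Platform", "Community"),
    ("Project 1", "Platform"),
    ("Platform", "Project 2"),
]

summary = summarise_links(NODES, LINKS)
```

Here "Project 3" has no links at all, "Project 2" only receives data, and half of the directed links are reciprocated.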
Currently, none of the existing sources of UK amphibian and reptile data appear to provide sufficient baseline information for national monitoring of all species, though some sources may have adequate foundations to build on for specific species, regions and time periods. Digital data platforms and recording communities generally have wide taxonomic scope, acting as “catch-all buckets” for any available data across large temporal and spatial extents. However, data made available through digital data platforms tend to lack sufficient quality to make reliable inferences of biodiversity dynamics (Bayraktarov et al. 2019) as they typically contain presence-only records. National platforms with information-rich abundance data are lacking, but incorporation of standards such as the Darwin Core ‘Event’ category (Wieczorek et al. 2012), which formalise the capture and presentation of sampling information across heterogeneous datasets, would make this possible. Nonetheless, the large datasets of presence-only records available through digital data platforms can complement systematic surveys, filling some of the spatial and temporal gaps often associated with small-scale studies (Isaac et al. 2020). In isolation, however, these datasets usually contain a variety of data biases (Petrovan et al. 2020), which can lead to misleading conclusions if not recognised and accounted for (Isaac and Pocock 2015).
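To illustrate why event-level sampling information matters, the Python sketch below links occurrence records to sampling events through a shared identifier. The keys are Darwin Core term names (eventID, eventDate, samplingProtocol, samplingEffort, occurrenceID, scientificName, individualCount); all values, and the helper function itself, are hypothetical. The sketch also assumes that every visit surveyed for the target species, so a visit without a matching record counts as a non-detection.

```python
# Hypothetical sampling events described with Darwin Core 'Event' terms.
events = [
    {"eventID": "visit-001", "eventDate": "2021-04-10",
     "samplingProtocol": "torchlight survey", "samplingEffort": "2 observer-hours"},
    {"eventID": "visit-002", "eventDate": "2021-04-24",
     "samplingProtocol": "torchlight survey", "samplingEffort": "2 observer-hours"},
]

# Hypothetical occurrence records, each tied to the event that produced it.
occurrences = [
    {"occurrenceID": "occ-1", "eventID": "visit-001",
     "scientificName": "Triturus cristatus", "individualCount": 4},
    {"occurrenceID": "occ-2", "eventID": "visit-001",
     "scientificName": "Lissotriton vulgaris", "individualCount": 7},
]


def counts_per_event(events, occurrences, species):
    """Count of the target species per visit; 0 marks a non-detection."""
    counts = {event["eventID"]: 0 for event in events}
    for occ in occurrences:
        if occ["scientificName"] == species and occ["eventID"] in counts:
            counts[occ["eventID"]] += occ.get("individualCount", 1)
    return counts


newt_counts = counts_per_event(events, occurrences, "Triturus cristatus")
```

With presence-only records alone, the second visit would simply be missing; the event table is what allows it to be interpreted as a survey in which the species was not recorded.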
Structured datasets arising through systematic monitoring of multiple species can enable standalone assessments of biodiversity. We observed that systematic monitoring is often restricted to a selection of sites within regions (i.e., via convenience sampling as sites are managed by project coordinators). Systematic monitoring currently favours amphibians over reptiles and most structured datasets are limited to species with legislative reporting requirements. For instance, most of the existing suite of structured amphibian and reptile datasets are single-species and arise from the systematic monitoring of European-protected species coordinated solely by conservation organisations. We did find that citizen science-based recording projects frequently generated multispecies datasets, usually for species that are widespread in their occurrence. However, such data may contain sampling biases of varying degrees, particularly as large heterogeneous collections of records (Isaac and Pocock 2015), and there have been limited empirical analyses of these datasets with regard to amphibians and reptiles (though see Humphreys et al. 2011; Wilkinson and Arnell 2013). Where recording projects have focussed on single-species monitoring, however, there have been relatively more empirical outcomes. For instance, common frog (Scott et al. 2008), great crested newt (Beebee 1997; Denoël 2012), common toad (Petrovan and Schmidt 2016), and adder (Gardner et al. 2019) have all featured in empirical studies and we observed that these species are a popular focus in recording projects. In contrast, quantitative assessments for palmate newt, smooth newt, slow-worm and viviparous lizard are largely lacking and we found that these species had the lowest rates of monitoring of all widespread species. In the case of palmate newt, this could in part be due to difficulties with identification or lower rates of occupancy nationally.
Likewise, we observed a clear geographic bias towards England and lower rates for Northern Ireland, though differences in human population densities and regional taxonomic prevalence could explain these findings.
Advances in computing are likely to have led to a variety of means for collecting, validating, and verifying data, and therefore have likely contributed to an increase in the uptake of citizen science approaches in biodiversity monitoring in recent years (August et al. 2015; Pocock et al. 2017). For instance, eDNA has emerged as a viable tool for amphibian monitoring (Biggs et al. 2015), and we found seven sources of eDNA in our search, all initiated since 2013. Amphibian and reptile surveillance is also a primary frontier in several emerging ecological remote-sensing techniques such as camera trapping (Welbourne et al. 2017). In line with other accounts (e.g. James 2011; Roy et al. 2012; Pocock et al. 2015), we observed that an array of data quality assurance techniques can be imposed on datasets before being made available to data users. We caution, however, that excessive manipulation of data by publishers may reduce the quality of available metadata depending on the format in which it is published. As is typical for sources of biodiversity data (Roy et al. 2012; Dobson et al. 2020; Thornhill et al. 2021), many UK amphibian and reptile datasets reflect heterogeneous collections of records originating from opportunistic and systematic surveys. However, we found that extracting specific data collection procedures from these sources was either challenging or impossible. By restricting the availability of sampling event information associated with datasets, the potential for reuse of existing biodiversity data may be constrained for several large data sources.
Biodiversity and conservation science is in the midst of adopting more formal and systematic approaches to evidence synthesis. Historically, evidence reviews in biodiversity science have had lower standards of reproducibility (Grames and Elphick 2020). We adapted the traditional scoping review framework (Arksey and O’Malley 2005) to suit the needs of this review and to provide a rigorous and transparent approach to mapping a baseline account of FAIR UK amphibian and reptile observation data, permitting gaps in the current monitoring portfolio to come to light. We hope that this study may serve as a template for summarising sources of biodiversity data, enabling comparable assessments and appraisals of existing data for other taxa and environments. We grouped and evaluated some sources as collective units as the evaluation of their separate entities was not feasible. These represented branched organisations that operated as independent groups. Therefore, it is important to note that not every independent branch of grouped sources (i.e. the ‘LERCs’, ‘ARGs/RAGs’, ‘Local Nature Partnerships’, and ‘Wildlife Trusts’) will necessarily have links to all of the data sources identified in the network analysis. Nonetheless, we expect that mapping them in this way provides a typical depiction of the characteristics and wider mobilisation of data across a biodiversity data management landscape.
Effective large-scale biodiversity monitoring requires integration of localised and fragmented monitoring efforts, thereby extending the capacity of any stand-alone programme, to address pressing science and conservation issues (Kühl et al. 2020). We recommend integration of datasets and coordinated monitoring for more comprehensive status and trends assessments. A discussion on the complementarities amongst data sources in this review is provided in the electronic supplementary information (see S4). To achieve an integrated monitoring portfolio, stakeholder collaboration within a unified infrastructure, such as a network, is paramount. Aligning pathways in shared, interoperable formats, combined with core monitoring, allows for robust analyses on the patterns of large-scale biodiversity change (Kühl et al. 2020). We conclude this review by providing recommendations to improve on current practice and achieve an integrated biodiversity monitoring portfolio.
First, to improve transparency and allow data to be used more widely, data publishers should seek to improve the ‘interoperability’ and ‘reusability’ of datasets by providing data in clear, interoperable formats [e.g., ‘Darwin Core’ (Wieczorek et al. 2012)] to align with data standards and ensure that important sampling event metadata accompany records in datasets. Second, we urge data publishers to provide clarity on how information is disseminated and shared between recorders, scheme organisers, scientists and decision-makers. As a minimum, this would provide information about the level of data duplication when combining datasets in a single analytical framework. Data should be presented in a way that enables them to be traced back to their origin and allows data users to ascertain how the data were collected. Examining all facets of network communication was not within the scope of this review, but clear channels of communication will be essential to enable an integrated network to generate and share information more effectively. Future work should explore current practice for sharing information and evidence between data publishers and government bodies so that clear channels for sharing data and information can permeate across the landscape. Third, the development of a validated tool to assess the ‘structure’ of datasets would likely enable more timely identification of fit-for-purpose datasets. Finally, we advocate for the establishment of an effective realised Biodiversity Observation Network, co-developed by stakeholders, and the enhancement of existing centralised data infrastructures that take account of these recommendations for collating, characterising, and sharing biodiversity data.
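The level of data duplication referred to in the second recommendation could be estimated along the following lines. This is a minimal sketch rather than a method used in this review. The field names (`species`, `date`, `grid_ref`), the records and the choice of matching key are all hypothetical. Two records that share a species, date and grid reference are assumed to be the same observation, although they could equally be independent sightings, which is why provenance metadata are needed to settle such cases.

```python
def record_key(record):
    """Normalise the fields assumed (hypothetically) to identify one observation."""
    return (
        record["species"].strip().lower(),
        record["date"],
        record["grid_ref"].replace(" ", "").upper(),
    )


def duplication_level(dataset_a, dataset_b):
    """Proportion of distinct records in dataset_b that also occur in dataset_a."""
    keys_a = {record_key(r) for r in dataset_a}
    keys_b = {record_key(r) for r in dataset_b}
    if not keys_b:
        return 0.0
    return len(keys_a & keys_b) / len(keys_b)


# Hypothetical records held by two sources that exchange data.
dataset_a = [
    {"species": "Bufo bufo", "date": "2020-03-14", "grid_ref": "TQ 123 456"},
    {"species": "Rana temporaria", "date": "2020-03-14", "grid_ref": "TQ 123 456"},
    {"species": "Vipera berus", "date": "2020-05-02", "grid_ref": "SU 987 654"},
]
dataset_b = [
    {"species": "bufo bufo ", "date": "2020-03-14", "grid_ref": "tq123456"},
    {"species": "Vipera berus", "date": "2020-05-02", "grid_ref": "SU987654"},
    {"species": "Anguis fragilis", "date": "2020-06-01", "grid_ref": "SU987654"},
    {"species": "Natrix helvetica", "date": "2020-06-01", "grid_ref": "SU987654"},
]

print(duplication_level(dataset_a, dataset_b))
```

Here two of the four records in the second dataset match records in the first once formatting differences are removed. Without a shared record identifier or a documented data pathway, such matches can only be inferred.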
References
Allaire JJ, Gandrud C, Russell K, Yetman CJ (2017) networkD3: D3 JavaScript Network Graphs from R
Arksey H, O’Malley L (2005) Scoping studies: towards a methodological framework. Int J Soc Res Methodol 8:19–32. https://doi.org/10.1080/1364557032000119616
August T, Harvey M, Lightfoot P et al (2015) Emerging technologies for biological recording. Biol J Linn Soc 115:731–749. https://doi.org/10.1111/bij.12534
Bayraktarov E, Ehmke G, O’Connor J et al (2019) Do big unstructured biodiversity data mean more knowledge? Front Ecol Evol. https://doi.org/10.3389/fevo.2018.00239
Beebee TJC (1997) Changes in dewpond numbers and amphibian diversity over 20 years on chalk downland in Sussex, England. Biol Conserv 81:215–219. https://doi.org/10.1016/S0006-3207(97)00002-5
Beebee TJC, Ratcliffe S (2018) Inferring status changes of three widespread British reptiles from NBN Atlas records. Herpetol Bull 143:18–22
Biggs J, Ewald N, Valentini A et al (2015) Using eDNA to develop a national citizen science-based monitoring programme for the great crested newt (Triturus cristatus). Biol Conserv 183:19–28. https://doi.org/10.1016/j.biocon.2014.11.029
Bird TJ, Bates AE, Lefcheck JS et al (2014) Statistical solutions for error and bias in global citizen science datasets. Biol Conserv 173:144–154. https://doi.org/10.1016/j.biocon.2013.07.037
Bonney R, Shirk JL, Phillips TB et al (2014) Next steps for citizen science. Science 343:1436–1437. https://doi.org/10.1126/science.1251554
Boyd RJ, Powney GD, Carvell C, Pescott OL (2021) occAssess: an R package for assessing potential biases in species occurrence data. Ecol Evol 11:16177–16187. https://doi.org/10.1002/ece3.8299
Burgess HK, DeBey LB, Froehlich HE et al (2017) The science of citizen science: exploring barriers to use as a primary research tool. Biol Conserv 208:113–120. https://doi.org/10.1016/j.biocon.2016.05.014
Butts C (2015) network: Classes for Relational Data. The Statnet Project (http://www.statnet.org). R package version 1.13.0.1. https://CRAN.R-project.org/package=network
Cohn JP (2008) Citizen science: can volunteers do real research? Bioscience 58:192–197. https://doi.org/10.1641/B580303
Constable H, Guralnick R, Wieczorek J et al (2010) VertNet: a new model for biodiversity data sharing. PLOS Biol 8:e1000309. https://doi.org/10.1371/journal.pbio.1000309
Csardi G, Nepusz T (2016) The igraph software package for complex network research
Denoël M (2012) Newt decline in Western Europe: highlights from relative distribution changes within guilds. Biodivers Conserv 21:2887–2898. https://doi.org/10.1007/s10531-012-0343-x
Dobson ADM, Milner-Gulland EJ, Aebischer NJ et al (2020) Making messy data work for conservation. One Earth 2:455–465. https://doi.org/10.1016/j.oneear.2020.04.012
Dornelas M, Gotelli NJ, McGill B et al (2014) Assemblage time series reveal biodiversity change but not systematic loss. Science 344:296–299. https://doi.org/10.1126/science.1248484
Dunford R, Berry P (2012) Climate change modelling of English amphibians and reptiles: report to Amphibian and Reptile Conservation Trust (ARC-Trust)
Fox R, Bourn NAD, Dennis EB et al (2019) Opinions of citizen scientists on open access to UK butterfly and moth occurrence data. Biodivers Conserv 28:3321–3341. https://doi.org/10.1007/s10531-019-01824-6
Gardner E, Julian A, Monk C, Baker J (2019) Make the adder count: population trends from a citizen science survey of UK adders. Herpetol J 29:57–70. https://doi.org/10.33256/hj29.1.5770
Geldmann J, Heilmann-Clausen J, Holm TE et al (2016) What determines spatial bias in citizen science? Exploring four recording schemes with different proficiency requirements. Divers Distrib 22:1139–1149. https://doi.org/10.1111/ddi.12477
Grames EM, Elphick CS (2020) Use of study design principles would increase the reproducibility of reviews in conservation biology. Biol Conserv 241:108385. https://doi.org/10.1016/j.biocon.2019.108385
Griffiths RA, Foster J, Wilkinson JW, Sewell D (2015) Science, statistics and surveys: a herpetological perspective. J Appl Ecol 52:1413–1417. https://doi.org/10.1111/1365-2664.12463
Hayhow DB, Eaton MA, Stanbury AJ et al (2019) The State of Nature 2019. The State of Nature partnership
Humphreys E, Toms M, Newson S et al (2011) An examination of reptile and amphibian populations in gardens, the factors influencing garden use and the role of a “Citizen Science” approach for monitoring their populations within this habitat, BTO Research Report No. 572
Isaac NJB, Pocock MJO (2015) Bias and information in biological records. Biol J Linn Soc 115:522–531. https://doi.org/10.1111/bij.12532
Isaac NJB, van Strien AJ, August TA et al (2014) Statistics for citizen science: extracting signals of change from noisy ecological data. Methods Ecol Evol 5:1052–1060. https://doi.org/10.1111/2041-210X.12254
Isaac NJB, Jarzyna MA, Keil P et al (2020) Data integration for large-scale models of species distributions. Trends Ecol Evol. https://doi.org/10.1016/j.tree.2019.08.006
James T (2011) Improving Wildlife Data Quality. NBN Trust, Nottingham
Jetz W, McGeoch MA, Guralnick R et al (2019) Essential biodiversity variables for mapping and monitoring species populations. Nat Ecol Evol 3:539–551. https://doi.org/10.1038/s41559-019-0826-1
Kaarlejärvi E, Salemaa M, Tonteri T et al (2021) Temporal biodiversity change following disturbance varies along an environmental gradient. Glob Ecol Biogeogr 30:476–489. https://doi.org/10.1111/geb.13233
Keil P, Storch D, Jetz W (2015) On the decline of biodiversity due to area loss. Nat Commun 6:8837. https://doi.org/10.1038/ncomms9837
König C, Weigelt P, Schrader J et al (2019) Biodiversity data integration—the significance of data resolution and domain. PLOS Biol 17:e3000183. https://doi.org/10.1371/journal.pbio.3000183
Kosmala M, Wiggins A, Swanson A, Simmons B (2016) Assessing data quality in citizen science. Front Ecol Environ 14:551–560. https://doi.org/10.1002/fee.1436
Kühl HS, Bowler DE, Bösch L et al (2020) Effective biodiversity monitoring needs a culture of integration. One Earth. https://doi.org/10.1016/j.oneear.2020.09.010
Levac D, Colquhoun H, O’Brien KK (2010) Scoping studies: advancing the methodology. Implement Sci 5:69. https://doi.org/10.1186/1748-5908-5-69
McKinley DC, Miller-Rushing AJ, Ballard HL et al (2017) Citizen science can improve conservation science, natural resource management, and environmental protection. Biol Conserv 208:15–28. https://doi.org/10.1016/j.biocon.2016.05.015
NBN Trust (2014) NBN strategy 2015–2020 first draft for consultation 2015
Oliveira U, Paglia AP, Brescovit AD et al (2016) The strong influence of collection bias on biodiversity knowledge shortfalls of Brazilian terrestrial biodiversity. Divers Distrib 22:1232–1244. https://doi.org/10.1111/ddi.12489
Parr TW, Ferretti M, Simpson IC et al (2002) Towards a long-term integrated monitoring programme in Europe: network design in theory and practice. Environ Monit Assess 78:253–290. https://doi.org/10.1023/A:1019934919140
Petersen TK, Speed JDM, Grøtan V, Austrheim G (2021) Species data for understanding biodiversity dynamics: the what, where and when of species occurrence data collection. Ecol Solut Evid 2:e12048. https://doi.org/10.1002/2688-8319.12048
Petrovan SO, Schmidt BR (2016) Volunteer conservation action data reveals large-scale and long-term negative population trends of a widespread amphibian, the common toad (Bufo bufo). PLoS ONE 11:e0161943. https://doi.org/10.1371/journal.pone.0161943
Petrovan SO, Vale CG, Sillero N (2020) Using citizen science in road surveys for large-scale amphibian monitoring: are biased data representative for species distribution? Biodivers Conserv 29:1767–1781. https://doi.org/10.1007/s10531-020-01956-0
Pocock MJO, Roy HE, Preston CD, Roy DB (2015) The Biological Records Centre: a pioneer of citizen science. Biol J Linn Soc 115:475–493. https://doi.org/10.1111/bij.12548
Pocock MJO, Tweddle JC, Savage J et al (2017) The diversity and evolution of ecological and environmental citizen science. PLoS ONE 12:e0172579. https://doi.org/10.1371/journal.pone.0172579
R Core Team (2021) R: a language and environment for statistical computing
Roy HE, Pocock MJO, Preston CD et al (2012) Understanding Citizen Science & Environmental Monitoring. NERC Centre for Ecology & Hydrology and Natural History Museum, Final Report on behalf of UK-EOF
Schloerke B, Cooke D, Larmarange J et al (2021) GGally: extension to “ggplot2”
Scott WA, Pithart D, Adamson JK (2008) Long-Term United Kingdom Trends in the Breeding Phenology of the Common Frog, Rana temporaria. J Herpetol 42:89–96. http://www.jstor.org/stable/40060486
Thornhill I, Cornelissen JHC, McPherson JM et al (2021) Towards ecological science for all by all. J Appl Ecol 58:206–213. https://doi.org/10.1111/1365-2664.13841
Tredick CA, Lewison RL, Deutschman DH et al (2017) A rubric to evaluate citizen-science programs for long-term ecological monitoring. Bioscience 67:834–844. https://doi.org/10.1093/biosci/bix090
Turner RK, Maclean IMD (2022) Microclimate-driven trends in spring-emergence phenology in a temperate reptile (Vipera berus): evidence for a potential “climate trap”? Ecol Evol 12:e8623. https://doi.org/10.1002/ece3.8623
Walters M, Scholes RJ (eds) (2017) The GEO Handbook on Biodiversity Observation Networks. Springer International Publishing, Cham
Welbourne DJ, Paull DJ, Claridge AW, Ford F (2017) A frontier in the use of camera traps: surveying terrestrial squamate assemblages. Remote Sens Ecol Conserv 3:133–145. https://doi.org/10.1002/rse2.57
Wetzel FT, Bingham HC, Groom Q et al (2018) Unlocking biodiversity data: prioritization and filling the gaps in biodiversity observation data in Europe. Biol Conserv 221:78–85. https://doi.org/10.1016/j.biocon.2017.12.024
Wickham H (2016) ggplot2: elegant graphics for data analysis
Wieczorek J, Bloom D, Guralnick R et al (2012) Darwin core: an evolving community-developed biodiversity data standard. PLoS ONE 7:e29715. https://doi.org/10.1371/journal.pone.0029715
Wilkinson JW, Arnell AP (2013) NARRS Report 2012: Establishing the Baseline (HWM Edition). ARC Research Report 13/01
Wilkinson MD, Dumontier M, Aalbersberg IJ et al (2016) The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3:160018. https://doi.org/10.1038/sdata.2016.18
Zhao Y, Zhang H (2020) Eigenvalues make the difference—a network analysis of the Chinese Super League. Int J Sports Sci Coach 15:184–194. https://doi.org/10.1177/1747954120908822
Acknowledgements
This work was supported by the Natural Environment Research Council and ARIES Doctoral Training Partnership [Grant Number NE/S007334/1]. We are grateful to the Amphibian and Reptile Conservation (ARC) Trust, Amphibian and Reptile Groups (ARG) UK, the British Trust for Ornithology (BTO), Royal Society for the Protection of Birds (RSPB), Froglife, Biological Records Centre (BRC), and the Woodland Trust for sharing information on their data and their invaluable expertise on the dynamics of the UK amphibian and reptile data management landscape. In particular, we would like to thank Rob Ward (ARC Trust), Yvette Martin (ARC Trust), Emma Gardner (UK Centre for Ecology & Hydrology), Kate Risley (formerly BTO), Rebecca McHugh (RSPB), Silviu Petrovan (University of Cambridge), Jenny Tse-Leon (Froglife), Steve Langham (Surrey ARG), David Roy (BRC), Martin Harvey (BRC) and Lorienne Whittle (Woodland Trust). We thank our anonymous reviewers for their constructive feedback on previous manuscript drafts. We extend thanks to the thousands of dedicated volunteers who collect data on amphibians and reptiles in the UK and hope that this review showcases the value of their efforts for UK amphibian and reptile conservation.
Funding
This work was supported by the Natural Environment Research Council and the ARIES DTP [Grant Number NE/S007334/1].
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data searches, synthesis and analysis were performed by RKT. The first draft of the manuscript was written by RKT and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Competing interests
The authors have no conflicts of interest to declare.
Additional information
Communicated by Pedro Eisenlohr.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Turner, R.K., Griffiths, R.A., Wilkinson, J.W. et al. Diversity, fragmentation, and connectivity across the UK amphibian and reptile data management landscape. Biodivers Conserv 32, 37–64 (2023). https://doi.org/10.1007/s10531-022-02502-w
DOI: https://doi.org/10.1007/s10531-022-02502-w