Species data for understanding biodiversity dynamics: the what, where and when of species occurrence data collection.

Published online
08 Apr 2021
Content type
Journal article
Journal title
Ecological Solutions and Evidence

Petersen, T. K. & Speed, J. D. M. & Grøtan, V. & Austrheim, G.
Contact email(s)

Publication language
Norway & Nordic Countries


1. The availability and quantity of observational species occurrence records have greatly increased due to technological advancements and the rise of online portals, such as the Global Biodiversity Information Facility (GBIF), which bring together occurrence records from multiple datasets. It is well established that such records are biased in time, space and taxonomy, but whether these biases differ with dataset origin has not been assessed. If biases are specific to different types of datasets, and the relative contributions of these datasets have changed over time, these shifting biases will have implications for the interpretation of results and, consequently, for management and conservation measures.
2. We examined observational GBIF records from Norway to test potential differences in taxonomic, time and land-cover biases between 10 different datasets, with a focus on red-listed and non-native species.
3. The datasets differ in their taxonomic coverage, with datasets dominated by citizen scientist recorders focusing heavily on birds. The number of records has increased over time; in particular, citizen science datasets have grown sharply in recent years.
4. The datasets, including when divided by conservation status, differed in geographical coverage. Anthropogenic land covers have more records than would be expected by chance in the majority of cases. Remote areas have fewer records than would be expected, underlining the prevalence of a roadside bias.
5. Accounting for biases in opportunistic species occurrence records needs to be a dynamic rather than a static process, as the taxonomic and geographical biases have changed over time and differ between datasets, depending on origin and inherent characteristics. Data-collection programmes should be designed to counteract the biases of the specific datasets, and methods to account for the biases in existing data should be developed. When utilizing compiled, open-source data, care must be taken to ensure complementarity between the datasets, regarding both time and space. Incorporating strengths and accounting for biases between datasets can strengthen the integration of species occurrence records of different origins for science-policy impact and management.
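The comparison in point 4, records per land cover against what "would be expected by chance", can be illustrated with a minimal sketch. This is not the authors' method: it assumes a simple null model in which records fall on each land-cover class in proportion to its share of the total area, and every class name, area share and record count below is hypothetical.

```python
import random
from collections import Counter


def expected_counts(area_fraction, n_records):
    """Expected records per land cover if records were spread in proportion to area."""
    return {cover: frac * n_records for cover, frac in area_fraction.items()}


def randomisation_p_value(observed, area_fraction, cover, n_sim=500, seed=0):
    """One-sided p-value: share of simulated datasets in which `cover`
    receives at least as many records as were actually observed."""
    rng = random.Random(seed)
    covers = list(area_fraction)
    weights = [area_fraction[c] for c in covers]
    n_records = sum(observed.values())
    hits = 0
    for _ in range(n_sim):
        # Place every record on a land cover at random, weighted by area share.
        simulated = Counter(rng.choices(covers, weights=weights, k=n_records))
        if simulated[cover] >= observed[cover]:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_sim + 1)


# Hypothetical area shares and record counts for a single dataset.
area_fraction = {"urban": 0.02, "agricultural": 0.08, "forest": 0.40, "alpine": 0.50}
observed = {"urban": 300, "agricultural": 250, "forest": 350, "alpine": 100}

expected = expected_counts(area_fraction, sum(observed.values()))
# Ratio > 1: more records than the area-based null predicts; ratio < 1: fewer.
ratio = {cover: observed[cover] / expected[cover] for cover in observed}

p_urban = randomisation_p_value(observed, area_fraction, "urban")
p_alpine = randomisation_p_value(observed, area_fraction, "alpine")
```

With these made-up numbers the urban class is strongly over-represented and the alpine class under-represented relative to the null. Repeating the calculation per dataset and per time period would be one way to treat the bias as dynamic rather than static, in the sense of point 5.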

Key words