Using subsets of species in biodiversity surveys.
In many biodiversity surveys, a small proportion of species require a disproportionate amount of a researcher's time and effort to detect or identify. If we are interested in predicting species diversity or composition, what are the consequences for statistical power of ignoring difficult species, that is, of surveying only a subset of the full suite of species? We analysed 10 data sets on a variety of taxa, at different spatial scales, to assess correlations for species richness and species composition between a full data set and subsets of the data with different numbers of species deleted, either at random or according to the time investment required for inclusion. Power analyses characterized the trade-off between the number of sites surveyed and the completeness of the survey at each site. For species richness, the majority of information regarding among-site patterns was retained even with large numbers of species removed. With only half the full species pool, the lower 95th percentile of correlations between the full and randomly generated reduced data sets was >0.75 in all 10 cases. With 10% of the full species pool removed, correlations were ≥0.95. Subsets of species were not as good at capturing among-site patterns of species composition (ordination scores and pairwise site dissimilarities). With half the full species pool, lower 95th percentile correlations between the full and randomly generated reduced data sets were as low as 0.1. Nonetheless, in most cases the lower 95th percentile correlation with half the number of species was >0.7, and removing 10% of species gave correlations >0.8 across all data sets. For the three data sets in which species were also removed according to the time investment required for inclusion, correlations fell within the range of variability observed for random species removals. Synthesis and applications. In biological surveys, ignoring a relatively small proportion of species (e.g. <10%), and often a much larger proportion, results in very little loss of information on patterns of biodiversity. As such, statistical power in many biodiversity studies may be maximized by eliminating difficult species from a survey in order to increase the number of sites surveyed.
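The random-removal comparison for species richness described above can be sketched roughly as follows. This is a minimal illustration, not the authors' code. The function names, the site-by-species list-of-lists layout, the number of random draws, and the reading of the "lower 95th percentile" as the 5th percentile of the simulated correlations are all assumptions made for the example.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (nan if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / sqrt(sxx * syy)


def richness(matrix, species_idx):
    """Number of the listed species present (value > 0) at each site (row)."""
    return [sum(1 for j in species_idx if row[j] > 0) for row in matrix]


def subset_richness_correlations(matrix, keep_fraction, n_draws=200, seed=0):
    """Correlate full-data richness with richness from random species subsets.

    matrix: hypothetical site-by-species table (rows = sites, columns = species).
    keep_fraction: proportion of the species pool retained in each random draw.
    """
    rng = random.Random(seed)
    n_species = len(matrix[0])
    n_keep = max(1, round(keep_fraction * n_species))
    full = richness(matrix, range(n_species))
    correlations = []
    for _ in range(n_draws):
        kept = rng.sample(range(n_species), n_keep)  # species kept, chosen at random
        correlations.append(pearson(full, richness(matrix, kept)))
    return correlations


def lower_bound(correlations, tail=0.05):
    """Assumed reading of the lower bound: the `tail` quantile, ignoring nan draws."""
    values = sorted(c for c in correlations if c == c)  # nan != nan, so nan is dropped
    return values[int(tail * (len(values) - 1))]
```

Calling `lower_bound(subset_richness_correlations(matrix, 0.5))` on a real site-by-species table would give a figure comparable in spirit to the half-pool values reported above. Species composition (ordination scores, pairwise dissimilarities) and removal by time investment are not covered by this sketch.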