Using spatial Bayesian methods to determine the genetic structure of a continuously distributed population: clusters or isolation by distance?
Spatially explicit Bayesian clustering techniques offer a powerful tool for ecology and wildlife management, as genetic divisions can be correlated with landscape features. We used these methods to analyse the genetic structure of a population of European wild boar Sus scrofa with the aim of identifying effective barriers for disease management units. However, it has been suggested that the methods could produce biased results when faced with deviations from random mating not caused by genetic discontinuities, such as isolation by distance (IBD). We analysed a data set consisting of 697 wild boar multilocus genotypes using spatially explicit (BAPS, GENELAND) and non-explicit (STRUCTURE) Bayesian methods. We also simulated and analysed data sets characterized by different degrees of IBD, with and without genetic discontinuities. When analysing the empirical data set, different programs did not converge on the same clustering solution and some clusters were difficult to explain biologically. Results from the simulated data showed that IBD, also present in the empirical data set, could cause the Bayesian methods to overestimate genetic structure. Simulated barriers were identified correctly, but the programs superimposed further clusters at higher IBD levels. It was not possible to ascertain with confidence whether the clustering solutions offered by the various programs were an accurate reflection of population genetic structure in our empirical data set or were artefacts created by the underlying IBD pattern. Synthesis and applications: We show that Bayesian clustering methods can overestimate genetic structure when analysing an individual-based data set characterized by isolation by distance. This bias could lead to the erroneous delimitation of management or conservation units. Investigators should be critical and suspicious of clusters that cannot be explained biologically. Data sets should be tested for isolation by distance and conclusions should not be based on the output from just one method.
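
A test for isolation by distance can be illustrated with a Mantel test, which correlates pairwise genetic distances between individuals with their pairwise geographic distances and assesses significance by permutation. The following is a minimal sketch of one common approach and is not necessarily the procedure used in this study; the function name `mantel_test`, its parameters and the synthetic data are illustrative assumptions.

```python
import numpy as np


def mantel_test(genetic, geographic, permutations=999, seed=0):
    """One-sided Mantel test for a positive genetic/geographic correlation.

    Both inputs are square, symmetric matrices of pairwise distances
    between the same individuals, in the same order.
    Returns (observed Pearson r, permutation p-value).
    """
    genetic = np.asarray(genetic, dtype=float)
    geographic = np.asarray(geographic, dtype=float)
    if genetic.ndim != 2 or genetic.shape[0] != genetic.shape[1]:
        raise ValueError("genetic must be a square matrix")
    if genetic.shape != geographic.shape:
        raise ValueError("matrices must have the same shape")

    n = genetic.shape[0]
    upper = np.triu_indices(n, k=1)  # each pair of individuals once
    geo_pairs = geographic[upper]
    observed = np.corrcoef(genetic[upper], geo_pairs)[0, 1]

    rng = np.random.default_rng(seed)
    at_least_as_large = 0
    for _ in range(permutations):
        # Shuffle individuals: rows and columns are permuted together.
        order = rng.permutation(n)
        shuffled = genetic[np.ix_(order, order)]
        r = np.corrcoef(shuffled[upper], geo_pairs)[0, 1]
        if r >= observed:
            at_least_as_large += 1

    # The observed arrangement counts as one of the permutations.
    p_value = (at_least_as_large + 1) / (permutations + 1)
    return observed, p_value


# Synthetic illustration only: genetic distance rises with geographic
# distance plus noise, mimicking an isolation-by-distance pattern.
rng = np.random.default_rng(1)
coords = rng.uniform(0, 100, size=(30, 2))
geo = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
noise = rng.normal(0, 5, size=geo.shape)
gen = geo + (noise + noise.T) / 2  # keep the matrix symmetric
np.fill_diagonal(gen, 0.0)

r, p = mantel_test(gen, geo)
print(f"Mantel r = {r:.3f}, p = {p:.3f}")
```

A significant positive correlation would indicate that isolation by distance is present, in which case clusters inferred by Bayesian programs warrant the cautious interpretation described above. The permutation shuffles whole individuals rather than single matrix entries because pairwise distances involving the same individual are not independent.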