Predicting soil properties from floristic composition in western Amazonian rain forests: performance of k-nearest neighbour estimation and weighted averaging calibration.

Published online
04 Dec 2013
Content type
Journal article
Journal title
Journal of Applied Ecology

Author(s)
Suominen, L. & Ruokolainen, K. & Tuomisto, H. & Llerena, N. & Higgins, M. A.

Soil quality is an important determinant of primary productivity and species occurrence patterns. Therefore, plant species composition can be used as an indicator of soil quality when direct sampling of soils is impractical. We test how well the species composition of the plant family Melastomataceae can predict soil properties in western Amazonian rain forests.

We examine nine soil variables: pH; loss-on-ignition; the concentrations of Al and P; and the concentrations of Ca, K, Mg, Na and the sum of these base cations. We compare two commonly used prediction techniques, k-nearest neighbour (k-NN) estimation and weighted averaging calibration via species indicator values. The Melastomataceae and soil data come from 311 localities widely distributed in western Amazonia. We use two different sets of Melastomataceae: a full set including all 283 observed species and an easy set containing 58 species that are both abundant in the data set and relatively easy to identify in the field.

Weighted averaging calibration and k-NN performed approximately equally well. Both were found to be useful techniques to convert information on Melastomataceae species composition to estimates of soil cation concentration, especially magnesium and calcium, and to a lesser degree potassium. In nearly all cases, the full set of Melastomataceae species gave more accurate predictions than the easy set, but the differences were relatively small.

Synthesis and applications. Our results show that Melastomataceae can be used as an indicator group that facilitates the field estimation of soil cation concentration and hence the assessment and mapping of soil variation over large areas. This provides important background information for all types of land-use planning, including systematic conservation planning that aims at representativeness of conservation area networks.
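
The two prediction techniques compared in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the Bray-Curtis dissimilarity, the unweighted neighbour mean, the choice k = 2, the omission of any deshrinking step in the weighted averaging, and all function names and toy numbers below are assumptions made purely for illustration.

```python
# Illustrative sketch only; toy data and all names are hypothetical.

def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two species-abundance vectors."""
    num = sum(abs(x - y) for x, y in zip(a, b))
    den = sum(x + y for x, y in zip(a, b))
    return num / den if den else 0.0

def knn_predict(train_abund, train_soil, target, k=2):
    """k-NN estimate: mean soil value of the k floristically closest sites."""
    ranked = sorted(range(len(train_abund)),
                    key=lambda i: bray_curtis(train_abund[i], target))
    nearest = ranked[:k]
    return sum(train_soil[i] for i in nearest) / len(nearest)

def wa_indicator_values(train_abund, train_soil):
    """Species indicator values: abundance-weighted mean soil value per species."""
    optima = []
    for s in range(len(train_abund[0])):
        total = sum(site[s] for site in train_abund)
        if total == 0:
            optima.append(None)  # species absent from the training sites
        else:
            weighted = sum(site[s] * x for site, x in zip(train_abund, train_soil))
            optima.append(weighted / total)
    return optima

def wa_predict(optima, target):
    """Weighted averaging calibration: abundance-weighted mean of indicator values."""
    pairs = [(a, u) for a, u in zip(target, optima) if u is not None and a > 0]
    total = sum(a for a, _ in pairs)
    if total == 0:
        return None  # no species with a known indicator value at this site
    return sum(a * u for a, u in pairs) / total

# Toy training data: 4 sites x 3 species, with one soil variable per site.
train_abund = [[5, 0, 1], [3, 1, 0], [0, 4, 2], [0, 2, 5]]
train_soil = [1.0, 2.0, 6.0, 8.0]
target = [4, 1, 0]  # species abundances at a site with unknown soil

knn_estimate = knn_predict(train_abund, train_soil, target, k=2)
optima = wa_indicator_values(train_abund, train_soil)
wa_estimate = wa_predict(optima, target)
print(knn_estimate, wa_estimate)
```

In this sketch, k-NN borrows soil values directly from the most similar reference sites, whereas weighted averaging first summarises each species as a single indicator value and then averages those values over the species present at the target site. The accuracy of either estimate on real data would need to be checked by cross-validation.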

Key words