Science Enabled by Specimen Data

Pang, S. E. H., Y. Zeng, J. D. T. Alban, and E. L. Webb. 2022. Occurrence–habitat mismatching and niche truncation when modelling distributions affected by anthropogenic range contractions. B. Leroy [ed.]. Diversity and Distributions 28: 1327–1343. https://doi.org/10.1111/ddi.13544

Aims: Human-induced pressures such as deforestation cause anthropogenic range contractions (ARCs). Such contractions present dynamic distributions that may engender data misrepresentations within species distribution models. The temporal bias of occurrence data, where occurrences represent distributions before (past bias) or after (recent bias) ARCs, underpins these data misrepresentations. Occurrence–habitat mismatching results when occurrences sampled before contractions are modelled with contemporary anthropogenic variables; niche truncation results when occurrences sampled after contractions are modelled without anthropogenic variables. Our understanding of their independent and interactive effects on model performance remains incomplete but is vital for developing good modelling protocols. Through a virtual ecologist approach, we demonstrate how these data misrepresentations manifest and investigate their effects on model performance.

Location: Virtual Southeast Asia.

Methods: Using 100 virtual species, we simulated ARCs with 100-year land-use data and generated temporally biased (past and recent) occurrence datasets. We modelled datasets with and without a contemporary land-use variable (conventional modelling protocols) and with a temporally dynamic land-use variable. We evaluated each model's ability to predict historical and contemporary distributions.

Results: Greater ARC resulted in greater occurrence–habitat mismatching for datasets with past bias and greater niche truncation for datasets with recent bias. Occurrence–habitat mismatching prevented models with the contemporary land-use variable from predicting anthropogenic-related absences, causing overpredictions of contemporary distributions. Although niche truncation caused underpredictions of historical distributions (environmentally suitable habitats), incorporating the contemporary land-use variable resolved these underpredictions, even when mismatching occurred. Models with the temporally dynamic land-use variable consistently outperformed models without.

Main conclusions: We showed how these data misrepresentations can degrade model performance, undermining their use for empirical research and conservation science. Given the ubiquity of ARCs, these data misrepresentations are likely inherent to most datasets. Therefore, we present a three-step strategy for handling data misrepresentations: maximize the temporal range of anthropogenic predictors, exclude mismatched occurrences and test for residual data misrepresentations.
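
The two data misrepresentations described in the abstract can be illustrated with a toy simulation. This is a minimal sketch and not the authors' workflow: it assumes a one-dimensional climate gradient, an arbitrary tolerance range of 0.2 to 0.8, and a rule that habitat loss removes the warmest suitable cells first. The function `simulate`, its parameters, and the two summary metrics are hypothetical names introduced here for illustration only.

```python
import random


def simulate(arc_fraction, n_cells=1000, n_samples=200, seed=0):
    """Toy virtual species on a climate gradient (all settings are assumptions).

    arc_fraction must be in [0, 1): the share of the historical range lost.
    Returns (mismatch, truncation), each a value between 0 and 1.
    """
    rng = random.Random(seed)
    # Each cell has one climate value; the species tolerates 0.2 to 0.8 (assumed).
    climate = [rng.random() for _ in range(n_cells)]
    historical = [i for i, c in enumerate(climate) if 0.2 <= c <= 0.8]

    # Assumed contraction rule: the warmest suitable cells are converted first.
    n_lost = int(round(arc_fraction * len(historical)))
    warmest_first = sorted(historical, key=lambda i: climate[i], reverse=True)
    lost = set(warmest_first[:n_lost])
    contemporary = [i for i in historical if i not in lost]

    # Past-biased records come from the historical range,
    # recent-biased records from what remains after the contraction.
    past_occ = rng.sample(historical, min(n_samples, len(historical)))
    recent_occ = rng.sample(contemporary, min(n_samples, len(contemporary)))

    # Occurrence-habitat mismatching: past records located on cells that are
    # no longer habitat under contemporary land use.
    mismatch = sum(i in lost for i in past_occ) / len(past_occ)

    # Niche truncation: share of the historical climate breadth that the
    # recent records fail to cover.
    hist_vals = [climate[i] for i in historical]
    recent_vals = [climate[i] for i in recent_occ]
    full_breadth = max(hist_vals) - min(hist_vals)
    seen_breadth = max(recent_vals) - min(recent_vals)
    truncation = 1.0 - seen_breadth / full_breadth

    return mismatch, truncation


if __name__ == "__main__":
    for arc in (0.0, 0.2, 0.4, 0.6):
        mismatch, truncation = simulate(arc)
        print(f"ARC={arc:.1f}  mismatch={mismatch:.2f}  truncation={truncation:.2f}")
```

Under these assumptions, both metrics should grow as the contraction grows, which mirrors the qualitative pattern reported in the Results. The sketch does not fit any distribution model, so it says nothing about the over- and underpredictions themselves.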