Science Enabled by Specimen Data

Heo, N., M. V. Lomolino, J. E. Watkins, S. Yun, J. Weber-Townsend, and D. D. Fernando. 2022. Evolutionary history of the Asplenium scolopendrium complex (Aspleniaceae), a relictual fern with a northern pan-temperate disjunct distribution. Biological Journal of the Linnean Society.

Asplenium scolopendrium is distributed in northern temperate forests with many global biogeographic disjunctions. The species complex of A. scolopendrium has been generated by spatial segregation coupled with divergent evolution. We elucidated the biogeographic history of the A. scolopendrium complex by exploring its origin, dispersal and evolution, thus providing insights into the evolutionary history of the Tertiary floras with northern pan-temperate disjunct distributions. The results revealed that all infraspecific taxa descended from a widely distributed common ancestor in the Northern Hemisphere. This pan-temperate ancestral population formed by unidirectional westward dispersal from European origins primarily during the Early Eocene when the Earth’s climate was much warmer than today. The splitting of European, American and East Asian lineages occurred during the Early Miocene due to geo-climatic vicariances. Polyploidy events in the American ancestral populations created additional reproductive barriers. The star-shaped haplotypes in each continent indicated that local disjunctions also led to derived genotypes with potential to diverge into different taxa. This intracontinental lineage splitting is likely related to latitudinal range shift and habitat fragmentation caused by glacial cycles and climate change during the Pleistocene. The evolutionary history of the A. scolopendrium complex supported the Boreotropical hypothesis exhibiting range expansion during the Early Eocene Climatic Optimum.

Führding‐Potschkat, P., H. Kreft, and S. M. Ickert‐Bond. 2022. Influence of different data cleaning solutions of point‐occurrence records on downstream macroecological diversity models. Ecology and Evolution 12.

Digital point‐occurrence records from the Global Biodiversity Information Facility (GBIF) and other data providers enable a wide range of research in macroecology and biogeography. However, data errors may hamper immediate use. Manual data cleaning is time‐consuming and often unfeasible, given that the databases may contain thousands or millions of records. Automated data cleaning pipelines are therefore of high importance. Taking North American Ephedra as a model, we examined how different data cleaning pipelines (using, e.g., the GBIF web application, and four different R packages) affect downstream species distribution models (SDMs). We also assessed how data differed from expert data. From 13,889 North American Ephedra observations in GBIF, the pipelines removed 31.7% to 62.7% false positives, invalid coordinates, and duplicates, leading to datasets between 9484 (GBIF application) and 5196 records (manual‐guided filtering). The expert data consisted of 704 records, comparable to data from field studies. Although differences in the absolute numbers of records were relatively large, species richness models based on stacked SDMs (S‐SDM) from pipeline and expert data were strongly correlated (mean Pearson's r across the pipelines: .9986, vs. the expert data: .9173). Our results suggest that all R package‐based pipelines reliably identified invalid coordinates. In contrast, the GBIF‐filtered data still contained both spatial and taxonomic errors. Major drawbacks emerge from the fact that no pipeline fully discovered misidentified specimens without the assistance of taxonomic expert knowledge. We conclude that application‐filtered GBIF data will still need additional review to achieve higher spatial data quality. Achieving high‐quality taxonomic data will require extra effort, probably by thoroughly analyzing the data for misidentified taxa, supported by experts.
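As a rough illustration of the kind of automated filtering such pipelines perform, the sketch below drops records with invalid coordinates and exact duplicates. It is not any of the pipelines compared in the study: the `Record` type, the `clean` function, the four-decimal rounding used to detect duplicates, and the sample records are all assumptions made for this example. It also does nothing about misidentified specimens, which the abstract notes no pipeline resolved without expert input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    """Hypothetical minimal occurrence record (not a GBIF schema)."""
    species: str
    lat: float
    lon: float


def clean(records):
    """Drop records with invalid coordinates, then drop duplicates."""
    seen = set()
    kept = []
    for r in records:
        # Out-of-range (or NaN) coordinates fail this comparison.
        if not (-90 <= r.lat <= 90 and -180 <= r.lon <= 180):
            continue
        # (0, 0) is a common placeholder for missing coordinates.
        if r.lat == 0 and r.lon == 0:
            continue
        # Assumption: same species at the same rounded point is a duplicate.
        key = (r.species, round(r.lat, 4), round(r.lon, 4))
        if key in seen:
            continue
        seen.add(key)
        kept.append(r)
    return kept


# Made-up sample records, for illustration only.
records = [
    Record("Ephedra viridis", 37.1, -112.5),
    Record("Ephedra viridis", 37.1, -112.5),     # duplicate
    Record("Ephedra viridis", 0.0, 0.0),         # placeholder coordinates
    Record("Ephedra nevadensis", 95.0, -115.0),  # latitude out of range
    Record("Ephedra nevadensis", 36.2, -115.9),
]
cleaned = clean(records)
removed_share = 1 - len(cleaned) / len(records)
```

Here `cleaned` holds the two valid, distinct records, and `removed_share` is the fraction of input records filtered out, the same kind of figure the abstract reports for each pipeline.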

Chevalier, M. 2022. crestr: an R package to perform probabilistic climate reconstructions from palaeoecological datasets. Climate of the Past 18: 821–844.

Statistical climate reconstruction techniques are fundamental tools to study past climate variability from fossil proxy data. In particular, the methods based on probability density functions (or PDFs) can be used in various environments and with different climate proxies because they rely on elementary calibration data (i.e. modern geolocalised presence data). However, the difficulty of accessing and curating these calibration data and the complexity of interpreting probabilistic results have often limited their use in palaeoclimatological studies. Here, I introduce a new R package (crestr) to apply the PDF-based method CREST (Climate REconstruction SofTware) on diverse palaeoecological datasets and address these problems. crestr includes a globally curated calibration dataset for six common climate proxies (i.e. plants, beetles, chironomids, rodents, foraminifera, and dinoflagellate cysts) associated with an extensive range of climate variables (20 terrestrial and 19 marine variables) that enables its use in most terrestrial and marine environments. Private data collections can also be used instead of, or in combination with, the provided calibration dataset. The package includes a suite of graphical diagnostic tools to represent the data at each step of the reconstruction process and provide insights into the effect of the different modelling assumptions and external factors that underlie a reconstruction. With this R package, the CREST method can now be used in a scriptable environment and thus be more easily integrated with existing workflows. It is hoped that crestr will be used to produce the much-needed quantified climate reconstructions from the many regions where they are currently lacking, despite the availability of suitable fossil records. To support this development, the use of the package is illustrated with a step-by-step replication of a 790 000-year-long mean annual temperature reconstruction based on a pollen record from southeastern Africa.

Joshi, M. D., and C. Joshi. 2022. Areas of species diversity and endemicity of Nepal. Ecosphere 13.

In this study, we analyzed the distribution and the spatial pattern of species diversity of vascular plants in Nepal. The aim was to identify and evaluate the occurrence and status of species‐rich areas in Nepal using ecological and environmental drivers. We used 52,973 georeferenced herbarium specimen records, representing 2650 species collected from Nepal. Altogether, 41 environmental variables were used for model development and validation. We used MaxEnt to predict the distribution pattern. All the significant species distribution predictions were then used to develop a species richness and endemism pattern in Nepal. The High Mountain and Himalaya, particularly east and central Nepal, were found to be species diverse and endemically rich areas, whereas western Nepal had lower species richness. We observed that isothermality, slope, rugosity, potential evapotranspiration, precipitation of humid months, temperature annual range, mean diurnal range, and normalized difference in vegetation index of humid months were the most influential environmental and climatic variables. We observed that about 60% of the areas, which had highest richness and endemism values, are still not included in protected areas in Nepal. We quantitatively analyzed the species richness and endemicity patterns of Nepal and were able to identify 19 areas of high species diversity and endemicity, six of which are newly identified.

Yousefi, M., A. Mahmoudi, A. Kafash, A. Khani, and B. Kryštufek. 2022. Biogeography of rodents in Iran: species richness, elevational distribution and their environmental correlates. Mammalia 86: 309–320.

Rodent biogeographic studies are disproportionately scarce in Iran; however, they are an ideal system to understand drivers of biodiversity distributions in the country. The aims of the present research are to determine (i) the pattern of rodent richness across the country, (ii) quantify th…

Odorico, D., E. Nicosia, C. Datizua, C. Langa, R. Raiva, J. Souane, S. Nhalungo, et al. 2022. An updated checklist of Mozambique’s vascular plants. PhytoKeys 189: 61–80.

An updated checklist of Mozambique’s vascular plants is presented. It was compiled referring to several information sources such as existing literature, relevant online databases and herbaria collections. The checklist includes 7,099 taxa (5,957 species, 605 subspecies, 537 varieties), belonging to …

Meller, P., M. Stellmes, A. Fidelis, and M. Finckh. 2022. Correlates of geoxyle diversity in Afrotropical grasslands. Journal of Biogeography 49: 339–352.

Aim: Tropical old-growth grasslands are increasingly acknowledged as biodiverse ecosystems, but they are understudied in many aspects. Geoxyle species are a key component in many of these ecosystems; their belowground storage organs and bud banks are functionally diverse and contribute to the grassl…

Vasconcelos, T., J. D. Boyko, and J. M. Beaulieu. 2021. Linking mode of seed dispersal and climatic niche evolution in flowering plants. Journal of Biogeography.

Aim: Due to the sessile nature of flowering plants, movements to new geographical areas occur mainly during seed dispersal. Frugivores tend to be efficient dispersers because animals move within the boundaries of their preferable niches, so seeds are more likely to be transported to environments tha…

Xue, T., S. R. Gadagkar, T. P. Albright, X. Yang, J. Li, C. Xia, J. Wu, and S. Yu. 2021. Prioritizing conservation of biodiversity in an alpine region: Distribution pattern and conservation status of seed plants in the Qinghai-Tibetan Plateau. Global Ecology and Conservation 32: e01885.

The Qinghai-Tibetan Plateau (QTP) harbors abundant and diverse plant life owing to its high habitat heterogeneity. However, the distribution pattern of biodiversity hotspots and their conservation status remain unclear. Based on 148,283 high-resolution occurrence coordinates of 13,450 seed plants, w…

Qu, J., Y. Xu, Y. Cui, S. Wu, L. Wang, X. Liu, Z. Xing, et al. 2021. MODB: a comprehensive mitochondrial genome database for Mollusca. Database 2021.

Mollusca is the largest marine phylum, comprising about 23% of all named marine organisms. Mollusca systematics are still in flux, and an increase in human activities has affected Molluscan reproduction and development, strongly impacting diversity and classification. Therefore, it is necessary to e…