Science Enabled by Specimen Data

Lannuzel, G., L. Pouget, D. Bruy, V. Hequet, S. Meyer, J. Munzinger, and G. Gâteblé. 2022. Mining rare Earth elements: Identifying the plant species most threatened by ore extraction in an insular hotspot. Frontiers in Ecology and Evolution 10. https://doi.org/10.3389/fevo.2022.952439

Conservation efforts in global biodiversity hotspots often face a common predicament: an urgent need for conservation action hampered by a significant lack of knowledge about that biodiversity. In recent decades, the computerisation of primary biodiversity data worldwide has provided the scientific community with raw material to increase our understanding of the shared natural heritage. These datasets, however, suffer from many geographical and taxonomic inaccuracies. Automated tools developed to enhance their reliability have shown that detailed expert examination remains the best way to achieve robust and exhaustive datasets. In New Caledonia, one of the most important biodiversity hotspots worldwide, the plant diversity inventory is still underway, and most taxa awaiting formal description are narrow endemics, hence by definition hard to discern in the datasets. In the meantime, anthropogenic pressures, such as nickel-ore mining, are threatening the unique ultramafic ecosystems at an increasing rate. The conservation challenge is therefore a race against time, as the rarest species must be identified and protected before they vanish. In this study, based on all available datasets and resources, we applied a workflow capable of highlighting the lesser known taxa. The main challenges addressed were to aggregate all data available worldwide and to tackle the geographical and taxonomic biases while avoiding the data loss that results from automated filtering. Every doubtful specimen went through a careful taxonomic analysis by a panel of local and international taxonomists. Geolocation of the whole dataset was achieved through dataset cross-checking, local botanists’ field knowledge, and historical material examination. Field studies were also conducted to clarify the most unresolved taxa. With the help of this method and by analysing over 85,000 data records, we were able to double the number of known narrow endemic taxa, elucidate 68 putative new species, and update our knowledge of the rarest species’ distributions so as to promote conservation measures.

Führding‐Potschkat, P., H. Kreft, and S. M. Ickert‐Bond. 2022. Influence of different data cleaning solutions of point‐occurrence records on downstream macroecological diversity models. Ecology and Evolution 12. https://doi.org/10.1002/ece3.9168

Digital point‐occurrence records from the Global Biodiversity Information Facility (GBIF) and other data providers enable a wide range of research in macroecology and biogeography. However, data errors may hamper immediate use. Manual data cleaning is time‐consuming and often unfeasible, given that the databases may contain thousands or millions of records. Automated data cleaning pipelines are therefore of high importance. Taking North American Ephedra as a model, we examined how different data cleaning pipelines (using, e.g., the GBIF web application, and four different R packages) affect downstream species distribution models (SDMs). We also assessed how the pipeline data differed from expert data. From 13,889 North American Ephedra observations in GBIF, the pipelines removed 31.7% to 62.7% of records as false positives, invalid coordinates, and duplicates, leading to datasets between 9484 (GBIF application) and 5196 records (manual‐guided filtering). The expert data consisted of 704 records, comparable to data from field studies. Although differences in the absolute numbers of records were relatively large, species richness models based on stacked SDMs (S‐SDM) from pipeline and expert data were strongly correlated (mean Pearson's r across the pipelines: .9986, vs. the expert data: .9173). Our results suggest that all R package‐based pipelines reliably identified invalid coordinates. In contrast, the GBIF‐filtered data still contained both spatial and taxonomic errors. A major drawback is that no pipeline fully discovered misidentified specimens without the assistance of taxonomic expert knowledge. We conclude that application‐filtered GBIF data will still need additional review to achieve higher spatial data quality. Achieving high‐quality taxonomic data will require extra effort, probably by thoroughly analyzing the data for misidentified taxa, supported by experts.
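The kind of automated filtering described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline (the study used the GBIF web application and R packages); the function name `clean_occurrences`, the record fields `species`, `lat`, and `lon`, and the rounding precision are all assumptions made for illustration. It drops records with missing or out-of-range coordinates, records at exactly (0, 0), and duplicate species/coordinate combinations.

```python
def clean_occurrences(records, precision=4):
    """Return records with usable coordinates, with duplicates removed.

    Hypothetical record format: dicts with 'species', 'lat', 'lon' keys.
    """
    seen = set()
    cleaned = []
    for rec in records:
        lat, lon = rec.get("lat"), rec.get("lon")
        # Missing coordinates cannot be mapped.
        if lat is None or lon is None:
            continue
        # Coordinates outside the valid latitude/longitude ranges are invalid.
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            continue
        # Exactly (0, 0) is treated here as a placeholder value (assumption).
        if lat == 0 and lon == 0:
            continue
        # Same species at the same rounded position counts as a duplicate.
        key = (rec["species"], round(lat, precision), round(lon, precision))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


def percent_removed(n_before, n_after):
    """Share of records removed by a cleaning step, in percent."""
    return 100.0 * (n_before - n_after) / n_before
```

As the abstract stresses, a filter of this kind only catches spatial and duplication problems; it cannot detect a misidentified specimen, which still requires taxonomic expertise.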

Colli-Silva, M., J. R. Pirani, and A. Zizka. 2022. Ecological niche models and point distribution data reveal a differential coverage of the cacao relatives (Malvaceae) in South American protected areas. Ecological Informatics 69: 101668. https://doi.org/10.1016/j.ecoinf.2022.101668

For many regions, such as South America, it is unclear how well the existing protected areas (PAs) network covers different taxonomic groups and whether PA coverage is biased towards certain biomes or species. Publicly available occurrence data along with ecological niche models might help to overcome this gap and to quantify the coverage of taxa by PAs, ensuring an unbiased distribution of conservation effort. Here, we use an occurrence database of 271 species from the cacao family (Malvaceae) to address how South American PAs cover species with different distribution, abundance, and threat status. Furthermore, we compared the performance of online databases, expert knowledge, and modelled species distributions in estimating species coverage in PAs. We found that 79 species from our survey (29% of the total) lack any record inside South American PAs and that 20 out of 23 species potentially threatened with extinction are not covered by PAs. The area covered by South American PAs was low across biomes, except for Amazonia, which had a relatively high PA coverage but little information available on species distribution within PAs. Also, raw geo-referenced occurrence data underestimated the number of species in PAs, and projections from ecological niche models were more prone to overestimating the number of species represented within PAs. We argue that the protection of South American flora in heterogeneous environments demands specific strategies tailored to particular biomes, including making new collections inside PAs in less collected areas and delimiting more areas for protection in better known areas. Also, by presenting scenarios of biased collection effort in a representative plant group, our results can benefit policy makers in conserving different parts of highly biodiverse tropical environments.
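Counting species that lack any record inside a protected area, as reported in this abstract, comes down to a point-in-polygon test per occurrence record. The sketch below shows one way to do that in plain Python with a ray-casting test. It is an illustration only, not the authors' workflow: the function names, the record fields, and the representation of a PA as a simple list of (lon, lat) vertices are assumptions, and real PA boundaries would normally be handled with a GIS library.

```python
def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) lies inside the polygon.

    polygon is a list of (lon, lat) vertices (hypothetical PA outline).
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the point's latitude can be crossed.
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def species_without_pa_records(records, protected_areas):
    """Return the set of species with no occurrence record inside any PA."""
    all_species = {rec["species"] for rec in records}
    covered = {
        rec["species"]
        for rec in records
        if any(point_in_polygon(rec["lon"], rec["lat"], pa)
               for pa in protected_areas)
    }
    return all_species - covered
```

Because this counts only raw occurrence points, it inherits the bias the abstract describes: a species can be present in a PA without a record there, so the result is likely to underestimate coverage.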

Xue, T., S. R. Gadagkar, T. P. Albright, X. Yang, J. Li, C. Xia, J. Wu, and S. Yu. 2021. Prioritizing conservation of biodiversity in an alpine region: Distribution pattern and conservation status of seed plants in the Qinghai-Tibetan Plateau. Global Ecology and Conservation 32: e01885. https://doi.org/10.1016/j.gecco.2021.e01885

The Qinghai-Tibetan Plateau (QTP) harbors abundant and diverse plant life owing to its high habitat heterogeneity. However, the distribution pattern of biodiversity hotspots and their conservation status remain unclear. Based on 148,283 high-resolution occurrence coordinates of 13,450 seed plants, w…

Lopes, A., L. O. Demarchi, A. C. Franco, A. B. Ferreira, C. S. Ferreira, F. Wittmann, I. N. Santiago, et al. 2021. Predicting the potential distribution of aquatic herbaceous plants in oligotrophic Central Amazonian wetland ecosystems. Acta Botanica Brasilica 35: 22–36. https://doi.org/10.1590/0102-33062020abb0188

Aquatic herbaceous plants are especially suitable for mapping environmental variability in wetlands, as they respond quickly to environmental gradients and are good indicators of habitat preference. We describe the composition of herbaceous species in two oligotrophic wetland ecosystems, floodplains…

de Oliveira, M. H. V., B. M. Torke, and T. E. Almeida. 2021. An inventory of the ferns and lycophytes of the Lower Tapajós River Basin in the Brazilian Amazon reveals collecting biases, sampling gaps, and previously undocumented diversity. Brittonia 73: 459–480. https://doi.org/10.1007/s12228-021-09668-7

Ferns and lycophytes are an excellent group for conservation and species distribution studies because they are closely related to environmental changes. In this study, we analyzed collection gaps, sampling biases, richness distribution, and the species conservation effectiveness of protected areas i…

Rincón‐Barrado, M., S. Olsson, T. Villaverde, B. Moncalvillo, L. Pokorny, A. Forrest, R. Riina, and I. Sanmartín. 2021. Ecological and geological processes impacting speciation modes drive the formation of wide‐range disjunctions within tribe Putorieae (Rubiaceae). Journal of Systematics and Evolution 59: 915–934. https://doi.org/10.1111/jse.12747

Wide‐range geographically discontinuous distributions have long intrigued scientists. We explore the role of ecology, geology, and dispersal in the formation of these large‐scale disjunctions, using the angiosperm tribe Putorieae (Rubiaceae) as a case study. From DNA sequences of nuclear ITS and six…

Tribble, C. M., J. Martínez‐Gómez, C. C. Howard, J. Males, V. Sosa, E. B. Sessa, N. Cellinese, and C. D. Specht. 2021. Get the shovel: morphological and evolutionary complexities of belowground organs in geophytes. American Journal of Botany 108: 372–387. https://doi.org/10.1002/ajb2.1623

Herbaceous plants collectively known as geophytes, which regrow from belowground buds, are distributed around the globe and throughout the land plant tree of life. The geophytic habit is an evolutionarily and ecologically important growth form in plants, permitting novel life history strategies, ena…

Goodwin, Z. A., P. Muñoz-Rodríguez, D. J. Harris, T. Wells, J. R. I. Wood, D. Filer, and R. W. Scotland. 2020. How long does it take to discover a species? Systematics and Biodiversity 18: 784–793. https://doi.org/10.1080/14772000.2020.1751339

The description of a new species is a key step in cataloguing the World’s flora. However, this is only a preliminary stage in a long process of understanding what that species represents. We investigated how long the species discovery process takes by focusing on three key stages: 1, the collection …

Levy, R., M. Paces, and R. Hufft. 2020. Sampling event dataset for ecological monitoring of riparian restoration effort in Colorado foothills. Biodiversity Data Journal 8. https://doi.org/10.3897/BDJ.8.e51817

The foothills and shortgrass prairie ecosystems of Colorado, United States, have undergone substantial and sustained anthropogenic habitat change over the past two centuries. Riparian systems have been dramatically altered by agriculture, hydrological engineering, urbanisation and the introduction o…