Science Enabled by Specimen Data

Lizardo, V., V. Moctezuma, and F. Escobar. 2022. Distribution, Regionalization, and Diversity of the dung beetle genus Phanaeus MacLeay (Coleoptera: Scarabaeidae) using Species Distribution Models. Zootaxa 5213: 546–568. https://doi.org/10.11646/zootaxa.5213.5.4

The genus Phanaeus is a well-known group whose taxonomy has been described multiple times. Its distribution was previously classified into 11 ecogeographic groups that are equivalent to areas of endemism. Here we use Species Distribution Models to describe species richness patterns. We measured beta-diversity and regionalized the distribution of the genus into one transition zone and one region, each with three dominions: the Mexican Transition Zone (North American, Mexican, and Mesoamerican dominions) and the Neotropical region (Pacific, Brazilian, and Atlantic Forest dominions). We also present a species checklist and updated distribution maps for 73 of the 81 species described so far, reflecting all the taxonomic updates. We include a list of all the recorded locations (by country, state, and province), list the recorded habitats and biomes, and describe the modelled environmental conditions for each species.

Liu, S., S. Xia, D. Wu, J. E. Behm, Y. Meng, H. Yuan, P. Wen, et al. 2022. Understanding global and regional patterns of termite diversity and regional functional traits. iScience: 105538. https://doi.org/10.1016/j.isci.2022.105538

Our understanding of broad-scale biodiversity and functional trait patterns is largely based on plants, and relatively little information is available on soil arthropods. Here, we investigated the distribution of termite diversity globally, and of termite morphological traits and diversity across China. Our analyses showed increasing termite species richness with decreasing latitude both globally and within China. Additionally, we detected obvious latitudinal trends in the community mean values of termite morphological traits, with body size and leg length decreasing with increasing latitude. Furthermore, temperature, NDVI and water variables were the most important drivers controlling the variation in termite richness, and temperature and soil properties were key drivers of the geographic distribution of termite morphological traits. Our global termite richness map is one of the first high-resolution maps for any arthropod group and, especially given the functional importance of termites, our work provides a useful baseline for further ecological analysis.

Lu, L.-L., B.-H. Jiao, F. Qin, G. Xie, K.-Q. Lu, J.-F. Li, B. Sun, et al. 2022. Artemisia pollen dataset for exploring the potential ecological indicators in deep time. Earth System Science Data 14: 3961–3995. https://doi.org/10.5194/essd-14-3961-2022

Artemisia, along with Chenopodiaceae, is the dominant component growing in the desert and dry grassland of the Northern Hemisphere. Artemisia pollen with its high productivity, wide distribution, and easy identification is usually regarded as an eco-indicator for assessing aridity and distinguishing grassland from desert vegetation in terms of the pollen relative abundance ratio of Chenopodiaceae/Artemisia (C/A). Nevertheless, divergent opinions on the degree of aridity evaluated by Artemisia pollen have been circulating in the palynological community for a long time. To solve the confusion, we first selected 36 species from nine clades and three outgroups of Artemisia based on the phylogenetic framework, which attempts to cover the maximum range of pollen morphological variation. Then, sampling, experiments, photography, and measurements were taken using standard methods. Here, we present pollen datasets containing 4018 original pollen photographs, 9360 pollen morphological trait measurements, information on 30 858 source plant occurrences, and corresponding environmental factors. Hierarchical cluster analysis on pollen morphological traits was carried out to subdivide Artemisia pollen into three types. When plotting the three pollen types of Artemisia onto the global terrestrial biomes, different pollen types of Artemisia were found to have different habitat ranges. These findings change the traditional concept of Artemisia being restricted to arid and semi-arid environments. The data framework that we designed is open and expandable for new pollen data of Artemisia worldwide. In the future, linking pollen morphology with habitat via these pollen datasets will create additional knowledge that will increase the resolution of the ecological environment in the geological past. The Artemisia pollen datasets are freely available at Zenodo (https://doi.org/10.5281/zenodo.6900308; Lu et al., 2022).

Xu, X.-T., J. Szwedo, D.-Y. Huang, W.-Y.-D. Deng, M. Obroślak, F.-X. Wu, and T. Su. 2022. A New Genus of Spittlebugs (Hemiptera, Cercopidae) from the Eocene of Central Tibetan Plateau. Insects 13: 770. https://doi.org/10.3390/insects13090770

The superfamily Cercopoidea is commonly known as “spittlebugs”, as its nymphs produce a spittle mass to protect themselves. Cosmoscartini (Cercopoidea: Cercopidae) is a large and brightly colored Old World tropical tribe, including 11 genera. A new genus Nangamostethos gen. nov. (type species: Nangamostethos tibetense sp. nov.) of Cosmoscartini is described from the Niubao Formation, late Eocene of the central Tibetan Plateau (TP), China. Its placement is ensured by comparison with all the extant genera of the tribe Cosmoscartini. The new fossil represents one of the few fossil Cercopidae species described from Asia. It is likely that Nangamostethos became extinct on the TP due to regional aridification and an overturn of plant taxa in the late Paleogene.

Führding‐Potschkat, P., H. Kreft, and S. M. Ickert‐Bond. 2022. Influence of different data cleaning solutions of point‐occurrence records on downstream macroecological diversity models. Ecology and Evolution 12. https://doi.org/10.1002/ece3.9168

Digital point‐occurrence records from the Global Biodiversity Information Facility (GBIF) and other data providers enable a wide range of research in macroecology and biogeography. However, data errors may hamper immediate use. Manual data cleaning is time‐consuming and often unfeasible, given that the databases may contain thousands or millions of records. Automated data cleaning pipelines are therefore of high importance. Taking North American Ephedra as a model, we examined how different data cleaning pipelines (using, e.g., the GBIF web application, and four different R packages) affect downstream species distribution models (SDMs). We also assessed how data differed from expert data. From 13,889 North American Ephedra observations in GBIF, the pipelines removed 31.7% to 62.7% false positives, invalid coordinates, and duplicates, leading to datasets between 9484 (GBIF application) and 5196 records (manual‐guided filtering). The expert data consisted of 704 records, comparable to data from field studies. Although differences in the absolute numbers of records were relatively large, species richness models based on stacked SDMs (S‐SDM) from pipeline and expert data were strongly correlated (mean Pearson's r across the pipelines: .9986, vs. the expert data: .9173). Our results suggest that all R package‐based pipelines reliably identified invalid coordinates. In contrast, the GBIF‐filtered data still contained both spatial and taxonomic errors. Major drawbacks emerge from the fact that no pipeline fully discovered misidentified specimens without the assistance of taxonomic expert knowledge. We conclude that application‐filtered GBIF data will still need additional review to achieve higher spatial data quality. Achieving high‐quality taxonomic data will require extra effort, probably by thoroughly analyzing the data for misidentified taxa, supported by experts.
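The kind of automated filtering described above (dropping invalid coordinates and duplicates, then reporting the share of records removed) can be illustrated with a minimal Python sketch. This is not any of the pipelines evaluated in the study: the function name `clean_occurrences`, the record fields (`species`, `lat`, `lon`), the rounding precision, and the sample coordinates are all assumptions made for illustration, and the sketch does not attempt to detect misidentified specimens.

```python
def clean_occurrences(records, decimals=4):
    """Drop records with unusable coordinates, then drop duplicates.

    Each record is assumed to be a dict with hypothetical keys
    'species', 'lat', and 'lon' (decimal degrees).
    """
    seen = set()
    cleaned = []
    for rec in records:
        lat, lon = rec.get("lat"), rec.get("lon")
        # Missing coordinates cannot be used in a distribution model.
        if lat is None or lon is None:
            continue
        # Coordinates outside the valid latitude/longitude range are invalid.
        if not (-90 <= lat <= 90 and -180 <= lon <= 180):
            continue
        # (0, 0) is treated here as a placeholder rather than a real locality.
        if lat == 0 and lon == 0:
            continue
        # A duplicate is the same species at the same rounded coordinates.
        key = (rec["species"], round(lat, decimals), round(lon, decimals))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


# Made-up example records; the coordinates are illustrative only.
records = [
    {"species": "Ephedra viridis", "lat": 37.1, "lon": -112.9},
    {"species": "Ephedra viridis", "lat": 37.1, "lon": -112.9},     # duplicate
    {"species": "Ephedra viridis", "lat": None, "lon": -112.9},     # missing latitude
    {"species": "Ephedra nevadensis", "lat": 137.0, "lon": -115.0}, # latitude out of range
    {"species": "Ephedra nevadensis", "lat": 0.0, "lon": 0.0},      # placeholder (0, 0)
    {"species": "Ephedra nevadensis", "lat": 36.2, "lon": -115.3},
]

cleaned = clean_occurrences(records)
removed_pct = 100 * (len(records) - len(cleaned)) / len(records)
print(f"kept {len(cleaned)} of {len(records)} records ({removed_pct:.1f}% removed)")
```

A filter of this kind addresses only spatial and duplicate errors; as the abstract notes, taxonomic misidentifications still require expert review.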

Boyd, R. J., M. A. Aizen, R. M. Barahona‐Segovia, L. Flores‐Prado, F. E. Fontúrbel, T. M. Francoy, M. Lopez‐Aliste, et al. 2022. Inferring trends in pollinator distributions across the Neotropics from publicly available data remains challenging despite mobilization efforts. Y. Fourcade [ed.]. Diversity and Distributions 28: 1404–1415. https://doi.org/10.1111/ddi.13551

Aim: Aggregated species occurrence data are increasingly accessible through public databases for the analysis of temporal trends in the geographic distributions of species. However, biases in these data present challenges for statistical inference. We assessed potential biases in data available through GBIF on the occurrences of four flower-visiting taxa: bees (Anthophila), hoverflies (Syrphidae), leaf-nosed bats (Phyllostomidae) and hummingbirds (Trochilidae). We also assessed whether and to what extent data mobilization efforts improved our ability to estimate trends in species' distributions. Location: The Neotropics. Methods: We used five data-driven heuristics to screen the data for potential geographic, temporal and taxonomic biases. We began with a continental-scale assessment of the data for all four taxa. We then identified two recent data mobilization efforts (2021) that drastically increased the quantity of records of bees collected in Chile available through GBIF. We compared the dataset before and after the addition of these new records in terms of their biases and estimated trends in species' distributions. Results: We found evidence of potential sampling biases for all taxa. The addition of newly-mobilized records of bees in Chile decreased some biases but introduced others. Despite increasing the quantity of data for bees in Chile sixfold, estimates of trends in species' distributions derived using the postmobilization dataset were broadly similar to what would have been estimated before their introduction, albeit more precise. Main conclusions: Our results highlight the challenges associated with drawing robust inferences about trends in species' distributions using publicly available data. Mobilizing historic records will not always enable trend estimation because more data do not necessarily equal less bias. Analysts should carefully assess their data before conducting analyses: this might enable the estimation of more robust trends and help to identify strategies for effective data mobilization. Our study also reinforces the need for targeted monitoring of pollinators worldwide.

Marshall, B. M., C. T. Strine, C. S. Fukushima, P. Cardoso, M. C. Orr, and A. C. Hughes. 2022. Searching the web builds fuller picture of arachnid trade. Communications Biology 5. https://doi.org/10.1038/s42003-022-03374-0

Wildlife trade is a major driver of biodiversity loss, yet whilst the impacts of trade in some species are relatively well-known, some taxa, such as many invertebrates, are often overlooked. Here we explore global patterns of trade in arachnids and detect 1,264 species from 66 families and 371 genera in trade. Trade in these groups exceeds millions of individuals, with 67% coming directly from the wild, and up to 99% of individuals in some genera. For popular taxa, such as tarantulas, up to 50% are in trade, including 25% of species described since 2000. CITES only covers 30 (2%) of the species potentially traded. We mapped the percentage and number of species native to each country in trade. To enable sustainable trade, better data on species distributions and better conservation status assessments are needed. The disparity between trade data sources highlights the need to expand monitoring if impacts on wild populations are to be accurately gauged and the impacts of trade minimised.

Williams, C. J. R., D. J. Lunt, U. Salzmann, T. Reichgelt, G. N. Inglis, D. R. Greenwood, W. Chan, et al. 2022. African Hydroclimate During the Early Eocene From the DeepMIP Simulations. Paleoceanography and Paleoclimatology 37. https://doi.org/10.1029/2022pa004419

The early Eocene (∼56‐48 million years ago) is characterised by high CO2 estimates (1200‐2500 ppmv) and elevated global temperatures (∼10 to 16°C higher than modern). However, the response of the hydrological cycle during the early Eocene is poorly constrained, especially in regions with sparse data coverage (e.g. Africa). Here we present a study of African hydroclimate during the early Eocene, as simulated by an ensemble of state‐of‐the‐art climate models in the Deep‐time Model Intercomparison Project (DeepMIP). A comparison between the DeepMIP pre‐industrial simulations and modern observations suggests that model biases are model‐ and geographically dependent; however, these biases are reduced in the model ensemble mean. A comparison between the Eocene simulations and the pre‐industrial suggests that there is no obvious wetting or drying trend as CO2 increases. The results suggest that changes to the land sea mask (relative to modern) in the models may be responsible for the simulated increases in precipitation to the north of Eocene Africa. There is an increase in precipitation over equatorial and West Africa and associated drying over northern Africa as CO2 rises. There are also important dynamical changes, with evidence that anticyclonic low‐level circulation is replaced by increased south‐westerly flow at high CO2 levels. Lastly, a model‐data comparison using newly‐compiled quantitative climate estimates from palaeobotanical proxy data suggests a marginally better fit with the reconstructions at lower levels of CO2.

Bywater‐Reyes, S., R. M. Diehl, A. C. Wilcox, J. C. Stella, and L. Kui. 2022. A Green New Balance: Interactions among riparian vegetation plant traits and morphodynamics in alluvial rivers. Earth Surface Processes and Landforms 47: 2410–2436. https://doi.org/10.1002/esp.5385

The strength of interactions between plants and river processes is mediated by plant traits and fluvial conditions, including above‐ground biomass, stem density and flexibility, channel and bed material properties, and flow and sediment regimes. In many rivers, concurrent changes in 1) the composition of riparian vegetation communities as a result of exotic species invasion and 2) shifts in hydrology have altered physical and ecological conditions in a manner that has been mediated by feedbacks between vegetation and morphodynamic processes. We review how Tamarix, which has invaded many U.S. Southwest waterways, and Populus species, woody pioneer trees that are native to the region, differentially affect hydraulics, sediment transport, and river morphology. We draw on flume, field, and modeling approaches spanning the individual seedling to river‐corridor scales. In a flume study, we found that differences in the crown morphology, stem density, and flexibility of Tamarix compared to Populus influenced near‐bed flow velocities in a manner that favored aggradation associated with Tamarix. Similarly, at the patch and corridor scales, observations confirmed increased aggradation with increased vegetation density. Furthermore, long‐term channel adjustments were different for Tamarix‐ versus Populus‐dominated reaches, with faster and greater geomorphic adjustments for Tamarix. Collectively, our studies show how plant‐trait differences between Tamarix and Populus, from individual seedlings to larger spatial and temporal scales, influence the co‐adjustment of rivers and riparian plant communities. These findings provide a basis for predicting changes in alluvial riverine systems, which we conceptualize as a Green New Balance model that considers how channels may adjust to changes in plant traits and community structure in addition to alterations in flow and sediment supply. We offer suggestions regarding how the Green New Balance can be used in management and invasive species management.

Chevalier, M. 2022. crestr: an R package to perform probabilistic climate reconstructions from palaeoecological datasets. Climate of the Past 18: 821–844. https://doi.org/10.5194/cp-18-821-2022

Statistical climate reconstruction techniques are fundamental tools to study past climate variability from fossil proxy data. In particular, the methods based on probability density functions (or PDFs) can be used in various environments and with different climate proxies because they rely on elementary calibration data (i.e. modern geolocalised presence data). However, the difficulty of accessing and curating these calibration data and the complexity of interpreting probabilistic results have often limited their use in palaeoclimatological studies. Here, I introduce a new R package (crestr) to apply the PDF-based method CREST (Climate REconstruction SofTware) on diverse palaeoecological datasets and address these problems. crestr includes a globally curated calibration dataset for six common climate proxies (i.e. plants, beetles, chironomids, rodents, foraminifera, and dinoflagellate cysts) associated with an extensive range of climate variables (20 terrestrial and 19 marine variables) that enables its use in most terrestrial and marine environments. Private data collections can also be used instead of, or in combination with, the provided calibration dataset. The package includes a suite of graphical diagnostic tools to represent the data at each step of the reconstruction process and provide insights into the effect of the different modelling assumptions and external factors that underlie a reconstruction. With this R package, the CREST method can now be used in a scriptable environment and thus be more easily integrated with existing workflows. It is hoped that crestr will be used to produce the much-needed quantified climate reconstructions from the many regions where they are currently lacking, despite the availability of suitable fossil records. To support this development, the use of the package is illustrated with a step-by-step replication of a 790 000-year-long mean annual temperature reconstruction based on a pollen record from southeastern Africa.