Science Enabled by Specimen Data

Tackett, M., C. Berg, T. Simmonds, O. Lopez, J. Brown, R. Ruggiero, and J. Weber. 2022. Breeding system and geospatial variation shape the population genetics of Triodanis perfoliata. Ecology and Evolution 12.

Both intrinsic and extrinsic forces work together to shape connectivity and genetic variation in populations across the landscape. Here we explored how geography, breeding system traits, and environmental factors influence the population genetic patterns of Triodanis perfoliata, a widespread mix‐mating annual plant in the contiguous US. By integrating population genomic data with spatial analyses and modeling the relationship between a breeding system and genetic diversity, we illustrate the complex ways in which these forces shape genetic variation. Specifically, we used 4705 single nucleotide polymorphisms to assess genetic diversity, structure, and evolutionary history among 18 populations. Populations with more obligately selfing flowers harbored less genetic diversity (π: R² = .63, p = .01, n = 9 populations), and we found significant population structuring (FST = 0.48). Both geographic isolation and environmental factors played significant roles in predicting the observed genetic diversity: we found that corridors of suitable environments appear to facilitate gene flow between populations, and that environmental resistance is correlated with increased genetic distance between populations. Last, we integrated our genetic results with species distribution modeling to assess likely patterns of connectivity among our study populations. Our landscape and evolutionary genetic results suggest that T. perfoliata experienced a complex demographic and evolutionary history, particularly in the center of its distribution. As such, there is no singular mechanism driving this species' evolution. Together, our analyses support the hypothesis that the breeding system, geography, and environmental variables shape the patterns of diversity and connectivity of T. perfoliata in the US.
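
For readers unfamiliar with the FST statistic reported above, the following is a minimal sketch of Wright's textbook definition, FST = (HT - HS) / HT, at a single biallelic SNP with equally weighted populations. The abstract does not say which estimator the authors used, so this is an illustration of the concept only; the function names are hypothetical.

```python
from statistics import mean


def expected_heterozygosity(p: float) -> float:
    """Expected heterozygosity 2p(1-p) at a biallelic locus with allele frequency p."""
    return 2.0 * p * (1.0 - p)


def fst_single_locus(allele_freqs: list[float]) -> float:
    """Wright's F_ST = (H_T - H_S) / H_T for one biallelic SNP.

    allele_freqs holds the frequency of the same allele in each population.
    Populations are weighted equally (an assumption; real estimators usually
    correct for sample size).
    """
    if len(allele_freqs) < 2:
        raise ValueError("need at least two populations")
    # H_S: mean within-population expected heterozygosity
    h_s = mean(expected_heterozygosity(p) for p in allele_freqs)
    # H_T: expected heterozygosity of the pooled (mean) allele frequency
    h_t = expected_heterozygosity(mean(allele_freqs))
    if h_t == 0.0:
        raise ValueError("locus is monomorphic across all populations")
    return (h_t - h_s) / h_t
```

Values near 0 indicate populations with similar allele frequencies, and values near 1 indicate populations fixed for different alleles. A genome-wide figure would combine many loci rather than a single SNP.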

Testo, W. L., A. L. de Gasper, S. Molino, J. M. G. y Galán, A. Salino, V. A. de O. Dittrich, and E. B. Sessa. 2022. Deep vicariance and frequent transoceanic dispersal shape the evolutionary history of a globally distributed fern family. American Journal of Botany.

Premise: Historical biogeography of ferns is typically expected to be dominated by long-distance dispersal, due to their minuscule spores. However, few studies have inferred the historical biogeography of a large and widely distributed group of ferns to test this hypothesis. Our aims are to determine the extent to which long-distance dispersal vs. vicariance have shaped the history of the fern family Blechnaceae, to explore ecological correlates of dispersal and diversification, and to determine whether these patterns differ between the northern and southern hemispheres. Methods: We used sequence data for three chloroplast loci to infer a time-calibrated phylogeny for 154 out of 265 species of Blechnaceae, including representatives of all genera in the family. This tree was used to conduct ancestral range reconstruction and stochastic character mapping, estimate diversification rates, and identify ecological correlates of diversification. Key results: Blechnaceae originated in Eurasia and began diversifying in the late Cretaceous. A lineage comprising most extant diversity diversified principally in the austral Pacific region around the Paleocene-Eocene Thermal Maximum. Land connections that existed near the poles during periods of warm climates likely facilitated migration of several lineages, with subsequent climate-mediated vicariance shaping current distributions. Long-distance dispersal is frequent and asymmetrical, with New Zealand/Pacific Islands, Australia, and tropical America being major source areas. Conclusions: Ancient vicariance and extensive long-distance dispersal have shaped the history of Blechnaceae in both the northern and southern hemispheres. The exceptional diversity in austral regions appears to reflect rapid speciation in these areas; mechanisms underlying this evolutionary success remain uncertain.

Lu, L.-L., B.-H. Jiao, F. Qin, G. Xie, K.-Q. Lu, J.-F. Li, B. Sun, et al. 2022. Artemisia pollen dataset for exploring the potential ecological indicators in deep time. Earth System Science Data 14: 3961–3995.

Artemisia, along with Chenopodiaceae, is the dominant component growing in the desert and dry grassland of the Northern Hemisphere. Artemisia pollen, with its high productivity, wide distribution, and easy identification, is usually regarded as an eco-indicator for assessing aridity and distinguishing grassland from desert vegetation in terms of the pollen relative abundance ratio of Chenopodiaceae/Artemisia (C/A). Nevertheless, divergent opinions on the degree of aridity evaluated by Artemisia pollen have been circulating in the palynological community for a long time. To resolve this confusion, we first selected 36 species from nine clades and three outgroups of Artemisia based on the phylogenetic framework, which attempts to cover the maximum range of pollen morphological variation. Then, sampling, experiments, photography, and measurements were carried out using standard methods. Here, we present pollen datasets containing 4018 original pollen photographs, 9360 pollen morphological trait measurements, information on 30 858 source plant occurrences, and corresponding environmental factors. Hierarchical cluster analysis on pollen morphological traits was carried out to subdivide Artemisia pollen into three types. When plotting the three pollen types of Artemisia onto the global terrestrial biomes, different pollen types of Artemisia were found to have different habitat ranges. These findings change the traditional concept of Artemisia being restricted to arid and semi-arid environments. The data framework that we designed is open and expandable for new pollen data of Artemisia worldwide. In the future, linking pollen morphology with habitat via these pollen datasets will create additional knowledge that will increase the resolution of the ecological environment in the geological past. The Artemisia pollen datasets are freely available at Zenodo (Lu et al., 2022).

Coca‐de‐la‐Iglesia, M., N. G. Medina, J. Wen, and V. Valcárcel. 2022. Evaluation of the tropical‐temperate transitions: An example of climatic characterization in the Asian Palmate group of Araliaceae. American Journal of Botany.

(no abstract available)

Lannuzel, G., L. Pouget, D. Bruy, V. Hequet, S. Meyer, J. Munzinger, and G. Gâteblé. 2022. Mining rare Earth elements: Identifying the plant species most threatened by ore extraction in an insular hotspot. Frontiers in Ecology and Evolution 10.

Conservation efforts in global biodiversity hotspots often face a common predicament: an urgent need for conservation action hampered by a significant lack of knowledge about that biodiversity. In recent decades, the computerisation of primary biodiversity data worldwide has provided the scientific community with raw material to increase our understanding of the shared natural heritage. These datasets, however, suffer from many geographical and taxonomic inaccuracies. Automated tools developed to enhance their reliability have shown that detailed expert examination remains the best way to achieve robust and exhaustive datasets. In New Caledonia, one of the most important biodiversity hotspots worldwide, the plant diversity inventory is still underway, and most taxa awaiting formal description are narrow endemics, hence by definition hard to discern in the datasets. In the meantime, anthropogenic pressures, such as nickel-ore mining, are threatening the unique ultramafic ecosystems at an increasing rate. The conservation challenge is therefore a race against time, as the rarest species must be identified and protected before they vanish. In this study, based on all available datasets and resources, we applied a workflow capable of highlighting the lesser known taxa. The main challenges addressed were to aggregate all data available worldwide, and tackle the geographical and taxonomic biases, avoiding the data loss resulting from automated filtering. Every doubtful specimen went through a careful taxonomic analysis by a local and international taxonomist panel. Geolocation of the whole dataset was achieved through dataset cross-checking, local botanists’ field knowledge, and historical material examination. Field studies were also conducted to clarify the most unresolved taxa. With the help of this method and by analysing over 85,000 data records, we were able to double the number of known narrow endemic taxa, elucidate 68 putative new species, and update our knowledge of the rarest species’ distributions so as to promote conservation measures.

Kopperud, B. T., S. Lidgard, and L. H. Liow. 2022. Enhancing georeferenced biodiversity inventories: automated information extraction from literature records reveal the gaps. PeerJ 10: e13921.

We use natural language processing (NLP) to retrieve location data for cheilostome bryozoan species (text-mined occurrences (TMO)) in an automated procedure. We compare these results with data combined from two major public databases (DB): the Ocean Biodiversity Information System (OBIS), and the Global Biodiversity Information Facility (GBIF). Using DB and TMO data separately and in combination, we present latitudinal species richness curves using standard estimators (Chao2 and the Jackknife) and range-through approaches. Our combined DB and TMO species richness curves quantitatively document a bimodal global latitudinal diversity gradient for extant cheilostomes for the first time, with peaks in the temperate zones. A total of 79% of the georeferenced species we retrieved from TMO (N = 1,408) and DB (N = 4,549) are non-overlapping. Despite clear indications that global location data compiled for cheilostomes should be improved with concerted effort, our study supports the view that many marine latitudinal species richness patterns deviate from the canonical latitudinal diversity gradient (LDG). Moreover, combining online biodiversity databases with automated information retrieval from the published literature is a promising avenue for expanding taxon-location datasets.
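
The Chao2 and first-order jackknife estimators named above have standard closed forms for incidence (presence/absence) data. The sketch below shows those textbook formulas, using the bias-corrected form of Chao2; it is not taken from the authors' code, and the function names and the representation of sampling units as sets of species names are assumptions made for illustration.

```python
def incidence_counts(samples: list[set[str]]) -> dict[str, int]:
    """Count in how many sampling units each species occurs."""
    counts: dict[str, int] = {}
    for unit in samples:
        for species in unit:
            counts[species] = counts.get(species, 0) + 1
    return counts


def _summary(samples: list[set[str]]) -> tuple[int, int, int, int]:
    """Return (m, S_obs, Q1, Q2): units, observed species, uniques, duplicates."""
    m = len(samples)
    if m == 0:
        raise ValueError("need at least one sampling unit")
    counts = incidence_counts(samples)
    q1 = sum(1 for c in counts.values() if c == 1)  # species found in exactly one unit
    q2 = sum(1 for c in counts.values() if c == 2)  # species found in exactly two units
    return m, len(counts), q1, q2


def chao2(samples: list[set[str]]) -> float:
    """Bias-corrected Chao2: S_obs + ((m-1)/m) * Q1(Q1-1) / (2(Q2+1))."""
    m, s_obs, q1, q2 = _summary(samples)
    return s_obs + ((m - 1) / m) * q1 * (q1 - 1) / (2 * (q2 + 1))


def jackknife1(samples: list[set[str]]) -> float:
    """First-order jackknife: S_obs + Q1 * (m-1)/m."""
    m, s_obs, q1, _ = _summary(samples)
    return s_obs + q1 * (m - 1) / m
```

Both estimators add a correction to the observed richness that grows with the number of species seen in only one sampling unit, which is why sparse location data tend to produce estimates well above the observed count.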

Führding‐Potschkat, P., H. Kreft, and S. M. Ickert‐Bond. 2022. Influence of different data cleaning solutions of point‐occurrence records on downstream macroecological diversity models. Ecology and Evolution 12.

Digital point‐occurrence records from the Global Biodiversity Information Facility (GBIF) and other data providers enable a wide range of research in macroecology and biogeography. However, data errors may hamper immediate use. Manual data cleaning is time‐consuming and often unfeasible, given that the databases may contain thousands or millions of records. Automated data cleaning pipelines are therefore of high importance. Taking North American Ephedra as a model, we examined how different data cleaning pipelines (using, e.g., the GBIF web application, and four different R packages) affect downstream species distribution models (SDMs). We also assessed how data differed from expert data. From 13,889 North American Ephedra observations in GBIF, the pipelines removed 31.7% to 62.7% false positives, invalid coordinates, and duplicates, leading to datasets between 9484 (GBIF application) and 5196 records (manual‐guided filtering). The expert data consisted of 704 records, comparable to data from field studies. Although differences in the absolute numbers of records were relatively large, species richness models based on stacked SDMs (S‐SDM) from pipeline and expert data were strongly correlated (mean Pearson's r across the pipelines: .9986, vs. the expert data: .9173). Our results suggest that all R package‐based pipelines reliably identified invalid coordinates. In contrast, the GBIF‐filtered data still contained both spatial and taxonomic errors. Major drawbacks emerge from the fact that no pipeline fully discovered misidentified specimens without the assistance of taxonomic expert knowledge. We conclude that application‐filtered GBIF data will still need additional review to achieve higher spatial data quality. Achieving high‐quality taxonomic data will require extra effort, probably by thoroughly analyzing the data for misidentified taxa, supported by experts.
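
As a rough illustration of two of the cleaning steps mentioned above (dropping invalid coordinates and duplicates), the following minimal sketch filters a list of point-occurrence records. It does not reproduce any of the pipelines or R packages the authors compared; the `Occurrence` type, the function names, the treatment of 0/0 coordinates as invalid, and the four-decimal rounding used to detect duplicates are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    """Hypothetical minimal occurrence record: species name plus decimal degrees."""
    species: str
    lat: float
    lon: float


def has_valid_coordinates(rec: Occurrence) -> bool:
    """Reject out-of-range coordinates (including NaN) and the 0/0 placeholder."""
    if not (-90.0 <= rec.lat <= 90.0 and -180.0 <= rec.lon <= 180.0):
        return False
    if rec.lat == 0.0 and rec.lon == 0.0:
        # Assumption: exact 0/0 is treated as a missing-value placeholder.
        return False
    return True


def clean(records: list[Occurrence], decimals: int = 4) -> list[Occurrence]:
    """Keep the first record per (species, rounded lat, rounded lon) with valid coordinates."""
    seen: set[tuple[str, float, float]] = set()
    kept: list[Occurrence] = []
    for rec in records:
        if not has_valid_coordinates(rec):
            continue
        key = (rec.species, round(rec.lat, decimals), round(rec.lon, decimals))
        if key in seen:
            continue  # spatial duplicate of a record already kept
        seen.add(key)
        kept.append(rec)
    return kept
```

A filter of this kind addresses only spatial errors. As the abstract notes, misidentified specimens pass through such checks unchanged and still require expert taxonomic review.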

Hirabayashi, K., S. J. Murch, and L. A. E. Erland. 2022. Predicted impacts of climate change on wild and commercial berry habitats will have food security, conservation and agricultural implications. Science of The Total Environment 845: 157341.

Climate change is now a reality and is altering ecosystems, with Canada experiencing 2–4 times the global average rate of warming. This will have a critical impact on berry cultivation and horticulture. Enhancing our understanding of how wild and cultivated berries will perform under changing climates will be essential to mitigating impacts on ecosystems, culture and food security. Our objective was to predict the impact of climate change on habitat suitability of four berry-producing Vaccinium species: two species with primarily northern distributions (V. uliginosum, V. vitis-idaea), one species with a primarily southern distribution (V. oxycoccos), and the commercially cultivated V. macrocarpon. We used the maximum entropy (Maxent) model and the CMIP6 shared socioeconomic pathways (SSPs) 126 and 585 projected to 2041–2060 and 2061–2080. Wild species showed a uniform northward progression and expansion of suitable habitat. Our modeling predicts that suitable growing regions for commercial cranberries are also likely to shift, with some farms becoming unsuitable for the current varieties and other regions becoming more suitable for cranberry farms. Both V. macrocarpon and V. oxycoccos showed a high dependence on precipitation-associated variables. Vaccinium vitis-idaea and V. uliginosum had a greater number of variables with smaller contributions, which may improve their resilience to individual climatic events. Future competition between commercial cranberry farms and wild berries in protected areas could lead to conflicts between agriculture and conservation priorities. New varieties of commercial berries are required to maintain current commercial berry farms.

Ramirez-Villegas, J., C. K. Khoury, H. A. Achicanoy, M. V. Diaz, A. C. Mendez, C. C. Sosa, Z. Kehel, et al. 2022. State of ex situ conservation of landrace groups of 25 major crops. Nature Plants 8: 491–499.

Crop landraces have unique local agroecological and societal functions and offer important genetic resources for plant breeding. Recognition of the value of landrace diversity and concern about its erosion on farms have led to sustained efforts to establish ex situ collections worldwide. The degree to which these efforts have succeeded in conserving landraces has not been comprehensively assessed. Here we modelled the potential distributions of eco-geographically distinguishable groups of landraces of 25 cereal, pulse and starchy root/tuber/fruit crops within their geographic regions of diversity. We then analysed the extent to which these landrace groups are represented in genebank collections, using geographic and ecological coverage metrics as a proxy for genetic diversity. We find that ex situ conservation of landrace groups is currently moderately comprehensive on average, with substantial variation among crops; a mean of 63% ± 12.6% of distributions is currently represented in genebanks. Breadfruit, bananas and plantains, lentils, common beans, chickpeas, barley and bread wheat landrace groups are among the most fully represented, whereas the largest conservation gaps persist for pearl millet, yams, finger millet, groundnut, potatoes and peas. Geographic regions prioritized for further collection of landrace groups for ex situ conservation include South Asia, the Mediterranean and West Asia, Mesoamerica, sub-Saharan Africa, the Andean mountains of South America and Central to East Asia. With further progress to fill these gaps, a high degree of representation of landrace group diversity in genebanks is feasible globally, thus fulfilling international targets for their ex situ conservation. By analysing the state of representation of traditional varieties of 25 major crops in ex situ repositories, this study demonstrates conservation progress made over more than a half-century and identifies the gaps remaining to be filled.

Cano, Á., F. W. Stauffer, T. Andermann, I. M. Liberal, A. Zizka, C. D. Bacon, H. Lorenzi, et al. 2022. Recent and local diversification of Central American understorey palms. Global Ecology and Biogeography 31: 1513–1525.

Aim: Central America is largely covered by hyperdiverse, yet poorly understood, rain forests. Understorey palms are diverse components of these forests, but little is known about their historical assembly. It is not clear when palms in Central America reached present diversity levels and whether most species arrived from neighbouring regions or evolved locally. We addressed these questions using the most species-rich American palm clades indicative of rain forests. We reconstructed and compared their phylogenomic and biogeographical history with the diversification of 54 other plant lineages, to gain a better understanding of the processes that shaped the assembly of Central American rain forests. Location: Central America. Time period: Cretaceous to present. Major taxa studied: Arecaceae: Arecoideae: Bactridinae, Chamaedoreeae, Geonomateae. Methods: We sampled 218 species through fieldwork and living collections. We sequenced their genomic DNA using target sequence-capture procedures. Using 12 calibration points, we reconstructed dated phylogenies under three approaches (multispecies coalescent, maximum likelihood and Bayesian inference), conducted biogeographical analyses (dispersal–extinction–cladogenesis) and estimated phylogenetic diversity metrics. Results: Dated phylogenies revealed intense diversification in Central America from 12 Ma. Local diversification events were four times more frequent than dispersal events, and we found strong phylogenetic clustering in relationship to Central America. Main conclusions: Our results suggest that most understorey palm species that characterize the Central American rain forests today evolved locally after repeated dispersal events, mostly from South America. Understorey palms in Central American rain forests diversified primarily after closure of the Central American Seaway at c. 13 Ma, suggesting that the Great American Biotic Interchange was a major trigger for plant diversification in Central American rain forests. This recent diversification contrasts with the much earlier existence of rain forest palms in neighbouring South America since c. 58 Ma. We found similar timings of diversification in 54 other seed plant lineages, suggesting an unexpectedly recent assembly of the hyperdiverse Central American flora.