Science Enabled by Specimen Data

Zhang, H., W. Guo, and W. Wang. 2023. The dimensionality reductions of environmental variables have a significant effect on the performance of species distribution models. Ecology and Evolution 13. https://doi.org/10.1002/ece3.10747

How to effectively obtain species‐related low‐dimensional data from massive environmental variables has become an urgent problem for species distribution models (SDMs). In this study, we explore whether dimensionality reduction on environmental variables can improve the predictive performance of SDMs. We first used two linear (i.e., principal component analysis (PCA) and independent components analysis) and two nonlinear (i.e., kernel principal component analysis (KPCA) and uniform manifold approximation and projection) dimensionality reduction techniques (DRTs) to reduce the dimensionality of high‐dimensional environmental data. Then, we established five SDMs based on the dimensionality‐reduced environmental variables for 23 real plant species and nine virtual species, and compared their predictive performance with that of SDMs based on environmental variables selected through Pearson's correlation coefficient (PCC). In addition, we studied the effects of DRTs, model complexity, and sample size on the predictive performance of SDMs. The predictive performance of SDMs under DRTs other than KPCA is better than under PCC‐based selection, and SDMs using linear DRTs perform better than those using nonlinear DRTs. In addition, using DRTs to deal with environmental variables has no less impact on the predictive performance of SDMs than model complexity and sample size. When the model complexity is at the complex level, PCA can improve the predictive performance of SDMs the most, by 2.55% compared with PCC. At the middle level of sample size, PCA improved the predictive performance of SDMs by 2.68% compared with PCC. Our study demonstrates that DRTs have a significant effect on the predictive performance of SDMs. Specifically, linear DRTs, especially PCA, are more effective at improving model predictive performance under relatively high model complexity or large sample sizes.

Richard-Bollans, A., C. Aitken, A. Antonelli, C. Bitencourt, D. Goyder, E. Lucas, I. Ondo, et al. 2023. Machine learning enhances prediction of plants as potential sources of antimalarials. Frontiers in Plant Science 14. https://doi.org/10.3389/fpls.2023.1173328

Plants are a rich source of bioactive compounds and a number of plant-derived antiplasmodial compounds have been developed into pharmaceutical drugs for the prevention and treatment of malaria, a major public health challenge. However, identifying plants with antiplasmodial potential can be time-consuming and costly. One approach for selecting plants to investigate is based on ethnobotanical knowledge which, though having provided some major successes, is restricted to a relatively small group of plant species. Machine learning, incorporating ethnobotanical and plant trait data, provides a promising approach to improve the identification of antiplasmodial plants and accelerate the search for new plant-derived antiplasmodial compounds. In this paper we present a novel dataset on antiplasmodial activity for three flowering plant families – Apocynaceae, Loganiaceae and Rubiaceae (together comprising c. 21,100 species) – and demonstrate the ability of machine learning algorithms to predict the antiplasmodial potential of plant species. We evaluate the predictive capability of a variety of algorithms – Support Vector Machines, Logistic Regression, Gradient Boosted Trees and Bayesian Neural Networks – and compare these to two ethnobotanical selection approaches – based on usage as an antimalarial and general usage as a medicine. We evaluate the approaches using the given data and when the given samples are reweighted to correct for sampling biases. In both evaluation settings each of the machine learning models has a higher precision than the ethnobotanical approaches. In the bias-corrected scenario, the Support Vector classifier performs best – attaining a mean precision of 0.67 compared to the best performing ethnobotanical approach with a mean precision of 0.46. We also use the bias correction method and the Support Vector classifier to estimate the potential of plants to provide novel antiplasmodial compounds. We estimate that 7677 species in Apocynaceae, Loganiaceae and Rubiaceae warrant further investigation and that at least 1300 active antiplasmodial species are highly unlikely to be investigated by conventional approaches. While traditional and Indigenous knowledge remains vital to our understanding of people-plant relationships and an invaluable source of information, these results indicate a vast and relatively untapped source in the search for new plant-derived antiplasmodial compounds.

Robin-Champigneul, F., J. Gravendyck, H. Huang, A. Woutersen, D. Pocknall, N. Meijer, G. Dupont-Nivet, et al. 2023. Northward expansion of the southern-temperate podocarp forest during the Early Eocene Climatic Optimum: Palynological evidence from the NE Tibetan Plateau (China). Review of Palaeobotany and Palynology: 104914. https://doi.org/10.1016/j.revpalbo.2023.104914

The debated vegetation response to climate change can be investigated through palynological fossil records from past extreme climate conditions. In this context, the early Eocene (53.3 to 41.2 million years ago (Ma)) is often referred to as a model for a greenhouse Earth. In the Xining Basin, situated on the North-eastern Tibetan Plateau (NETP), this time interval is represented by an extensive and well-dated sedimentary sequence of evaporites and red mudstones. Here we focus on the palynological record of the Early Eocene Climatic Optimum (EECO; 53.3 to 49.1 Ma) and study the fossil gymnosperm pollen composition in these sediments. In addition, we investigate the nearest living relatives (NLR) or botanical affinity of these genera and the paleobiogeographic implications of their occurrence in the Eocene of the NETP. To reach our objective, we complemented transmitted light microscopy with laser scanning- and electron microscopy techniques, to produce high-resolution images, and illustrate the morphological variation within fossil and extant gymnosperm pollen. Furthermore, a morphometric analysis was carried out to investigate the infra- and intrageneric variation of these and related taxa. To place the data in context we produced paleobiogeographic maps for Phyllocladidites and for other Podocarpaceae, based on data from a global fossil pollen database, and compared these with modern records from GBIF. We also assessed the climatic envelope of the NLR. Our analyses confirm the presence of Phyllocladidites (NLR Phyllocladus, Podocarpaceae) and Podocarpidites (NLR Podocarpus, Podocarpaceae) in the EECO deposits in the Xining Basin. In addition, a comparative study based on literature suggests that Parcisporites is likely a younger synonym of Phyllocladidites. Our findings further suggest that the Phyllocladidites specimens are derived from a lineage that was much more diverse than previously thought, and which had a much larger biogeographical distribution during the EECO than at present. Based on the climatic envelope of the NLR, we suggest that the paleoclimatic conditions in the Xining Basin were warmer and more humid during the EECO. We conclude that phylloclade-type conifers typical of the southern-temperate podocarp forests had a northward geographical expansion during the EECO, followed by extirpation.

Huang, T., J. Chen, K. E. Hummer, L. A. Alice, W. Wang, Y. He, S. Yu, et al. 2023. Phylogeny of Rubus (Rosaceae): Integrating molecular and morphological evidence into an infrageneric revision. TAXON. https://doi.org/10.1002/tax.12885

Rubus (Rosaceae), one of the most complicated angiosperm genera, contains about 863 species, and is notorious for its taxonomic difficulty. The most recent (1910–1914) global taxonomic treatment of the genus was conducted by Focke, who defined 12 subgenera. Phylogenetic results over the past 25 years suggest that Focke's subdivisions of Rubus are not monophyletic, and large‐scale taxonomic revisions are necessary. Our objective was to provide a comprehensive phylogenetic analysis of the genus based on an integrative evidence approach. Morphological characters, obtained from our own investigation of living plants and examination of herbarium specimens, are combined with chloroplast genomic data. Our dataset comprised 196 accessions representing 145 Rubus species (including cultivars and hybrids) and all of Focke's subgenera, including 60 endemic Chinese species. Maximum likelihood analyses inferred phylogenetic relationships. Our analyses concur with previous molecular studies, but with modifications. Our data strongly support the reclassification of several subgenera within Rubus. Our molecular analyses agree with others that only R. subg. Anoplobatus forms a monophyletic group. Other subgenera are para‐ or polyphyletic. We suggest a revised subgeneric framework to accommodate monophyletic groups. Character evolution is reconstructed, and diagnostic morphological characters for different clades are identified and discussed. Based on morphological and molecular evidence, we propose a new classification system with 10 subgenera: R. subg. Anoplobatus, R. subg. Batothamnus, R. subg. Chamaerubus, R. subg. Cylactis, R. subg. Dalibarda, R. subg. Idaeobatus, R. subg. Lineati, R. subg. Malachobatus, R. subg. Melanobatus, and R. subg. Rubus. The revised infrageneric nomenclature inferred from our analyses is provided along with synonymy and type citations. Our new taxonomic backbone is the first systematic and complete global revision of Rubus since Focke's treatment. It offers new insights into deep phylogenetic relationships of Rubus and has important theoretical and practical significance for the development and utilization of these important agronomic crops.

Reichgelt, T., A. Baumgartner, R. Feng, and D. A. Willard. 2023. Poleward amplification, seasonal rainfall and forest heterogeneity in the Miocene of the eastern USA. Global and Planetary Change 222: 104073. https://doi.org/10.1016/j.gloplacha.2023.104073

Paleoclimate reconstructions can provide a window into the environmental conditions in Earth history when atmospheric carbon dioxide concentrations were higher than today. In the eastern USA, paleoclimate reconstructions are sparse, because terrestrial sedimentary deposits are rare. Despite this, the eastern USA has the largest population and population density in North America, and understanding the effects of current and future climate change is of vital importance. Here, we provide terrestrial paleoclimate reconstructions of the eastern USA from Miocene fossil floras. Additionally, we compare proxy paleoclimate reconstructions from the warmest period in the Miocene, the Miocene Climatic Optimum (MCO), to those of an MCO Earth System Model. Reconstructed Miocene temperatures and precipitation north of 35°N are higher than modern. In contrast, south of 35°N, temperatures and precipitation are similar to today, suggesting a poleward amplification effect in eastern North America. Reconstructed Miocene rainfall seasonality was predominantly higher than modern, regardless of latitude, indicating greater variability in intra-annual moisture transport. Reconstructed climates are almost uniformly in the temperate seasonal forest biome, but heterogeneity of specific forest types is evident. Reconstructed Miocene terrestrial temperatures from the eastern USA are lower than modeled temperatures and coeval Atlantic sea surface temperatures. However, reconstructed rainfall is consistent with modeled rainfall. Our results show that during the Miocene, climate was most different from modern in the northeastern states, and may suggest a drastic reduction in the meridional temperature gradient along the North American east coast compared to today.

Watts, J. L., and J. E. Watkins. 2022. New Zealand Fern Distributions from the Last Glacial Maximum to 2070: A Dynamic Tale of Migration and Community Turnover. American Fern Journal 112. https://doi.org/10.1640/0002-8444-112.4.354

The coming decades are predicted to bring widespread shifts in local, regional, and global climatic patterns. Currently there is limited understanding of how ferns will respond to these changes and few studies have attempted to model shifts in fern distribution in response to climate change. In this paper, we present a series of these models using the country of New Zealand as our study system. Ferns are notably abundant in New Zealand and play important ecological roles in early succession, canopy biology, and understory dynamics. Here we describe how fern distributions have changed since the Last Glacial Maximum to the present and predict how they will change with anthropogenic climate change – assuming no measures are taken to reduce carbon emissions. To do this, we used MaxEnt species distribution modelling with publicly available data from gbif.org and worldclim.org to predict the past, present, and future distributions of 107 New Zealand fern species. The present study demonstrates that ferns in New Zealand have expanded and will continue to expand their ranges and migrate southward and upslope. Despite the predicted general increased range size as a result of climate change, our models predict that the majority (52%) of many species' current suitable habitats may be climatically unsuitable in 50 years, including the ecologically important group: tree ferns. Additionally, fern communities are predicted to undergo drastic shifts in composition, which may be detrimental to overall ecosystem functioning in New Zealand.

Testo, W. L., A. L. de Gasper, S. Molino, J. M. G. y Galán, A. Salino, V. A. de O. Dittrich, and E. B. Sessa. 2022. Deep vicariance and frequent transoceanic dispersal shape the evolutionary history of a globally distributed fern family. American Journal of Botany. https://doi.org/10.1002/ajb2.16062

Premise: Historical biogeography of ferns is typically expected to be dominated by long-distance dispersal, due to their minuscule spores. However, few studies have inferred the historical biogeography of a large and widely distributed group of ferns to test this hypothesis. Our aims are to determine the extent to which long-distance dispersal vs. vicariance have shaped the history of the fern family Blechnaceae, to explore ecological correlates of dispersal and diversification, and to determine whether these patterns differ between the northern and southern hemispheres. Methods: We used sequence data for three chloroplast loci to infer a time-calibrated phylogeny for 154 out of 265 species of Blechnaceae, including representatives of all genera in the family. This tree was used to conduct ancestral range reconstruction and stochastic character mapping, estimate diversification rates, and identify ecological correlates of diversification. Key results: Blechnaceae originated in Eurasia and began diversifying in the late Cretaceous. A lineage comprising most extant diversity diversified principally in the austral Pacific region around the Paleocene-Eocene Thermal Maximum. Land connections that existed near the poles during periods of warm climates likely facilitated migration of several lineages, with subsequent climate-mediated vicariance shaping current distributions. Long-distance dispersal is frequent and asymmetrical, with New Zealand/Pacific Islands, Australia, and tropical America being major source areas. Conclusions: Ancient vicariance and extensive long-distance dispersal have shaped the history of Blechnaceae in both the northern and southern hemispheres. The exceptional diversity in austral regions appears to reflect rapid speciation in these areas; mechanisms underlying this evolutionary success remain uncertain.

Chevalier, M. 2022. crestr: an R package to perform probabilistic climate reconstructions from palaeoecological datasets. Climate of the Past 18: 821–844. https://doi.org/10.5194/cp-18-821-2022

Statistical climate reconstruction techniques are fundamental tools to study past climate variability from fossil proxy data. In particular, the methods based on probability density functions (or PDFs) can be used in various environments and with different climate proxies because they rely on elementary calibration data (i.e. modern geolocalised presence data). However, the difficulty of accessing and curating these calibration data and the complexity of interpreting probabilistic results have often limited their use in palaeoclimatological studies. Here, I introduce a new R package (crestr) to apply the PDF-based method CREST (Climate REconstruction SofTware) on diverse palaeoecological datasets and address these problems. crestr includes a globally curated calibration dataset for six common climate proxies (i.e. plants, beetles, chironomids, rodents, foraminifera, and dinoflagellate cysts) associated with an extensive range of climate variables (20 terrestrial and 19 marine variables) that enables its use in most terrestrial and marine environments. Private data collections can also be used instead of, or in combination with, the provided calibration dataset. The package includes a suite of graphical diagnostic tools to represent the data at each step of the reconstruction process and provide insights into the effect of the different modelling assumptions and external factors that underlie a reconstruction. With this R package, the CREST method can now be used in a scriptable environment and thus be more easily integrated with existing workflows. It is hoped that crestr will be used to produce the much-needed quantified climate reconstructions from the many regions where they are currently lacking, despite the availability of suitable fossil records. To support this development, the use of the package is illustrated with a step-by-step replication of a 790 000-year-long mean annual temperature reconstruction based on a pollen record from southeastern Africa.

Beaulieu, W. T., D. G. Panaccione, Q. N. Quach, K. L. Smoot, and K. Clay. 2021. Diversification of ergot alkaloids and heritable fungal symbionts in morning glories. Communications Biology 4. https://doi.org/10.1038/s42003-021-02870-z

Heritable microorganisms play critical roles in life cycles of many macro-organisms but their prevalence and functional roles are unknown for most plants. Bioactive ergot alkaloids produced by heritable Periglandula fungi occur in some morning glories (Convolvulaceae), similar to ergot alkaloids in …

Vasconcelos, T., J. D. Boyko, and J. M. Beaulieu. 2021. Linking mode of seed dispersal and climatic niche evolution in flowering plants. Journal of Biogeography. https://doi.org/10.1111/jbi.14292

Aim: Due to the sessile nature of flowering plants, movements to new geographical areas occur mainly during seed dispersal. Frugivores tend to be efficient dispersers because animals move within the boundaries of their preferable niches, so seeds are more likely to be transported to environments tha…