Science Enabled by Specimen Data

Anest, A., Y. Bouchenak-Khelladi, T. Charles-Dominique, F. Forest, Y. Caraglio, G. P. Hempson, O. Maurin, and K. W. Tomlinson. 2024. Blocking then stinging as a case of two-step evolution of defensive cage architectures in herbivore-driven ecosystems. Nature Plants.

Dense branching and spines are common features of plant species in ecosystems with high mammalian herbivory pressure. While dense branching and spines can inhibit herbivory independently, when combined, they form a powerful defensive cage architecture. However, how cage architecture evolved under mammalian pressure has remained unexplored. Here we show how dense branching and spines emerged during the age of mammalian radiation in the Combretaceae family and diversified in herbivore-driven ecosystems in the tropics. Phylogenetic comparative methods revealed that modern plant architectural strategies defending against large mammals evolved via a stepwise process. First, dense branching emerged under intermediate herbivory pressure, followed by the acquisition of spines that supported higher speciation rates under high herbivory pressure. Our study highlights the adaptive value of dense branching as part of a herbivore defence strategy and identifies large mammal herbivory as a major selective force shaping the whole plant architecture of woody plants. This study explores the evolution of two traits, branching density and spine presence, in the globally distributed plant family Combretaceae. These traits were found to have appeared in a two-step process in response to mammalian herbivory pressure, revealing the importance of large mammals in the evolution of plant architecture diversity.

Weiss, R. M., F. Zanetti, B. Alberghini, D. Puttick, M. A. Vankosky, A. Monti, and C. Eynck. 2024. Bioclimatic analysis of potential worldwide production of spring‐type camelina [Camelina sativa (L.) Crantz] seeded in the spring. GCB Bioenergy 16.

Camelina [Camelina sativa (L.) Crantz] is a Brassicaceae oilseed that is gaining interest worldwide as a low‐maintenance crop for diverse biobased applications. One of the most important factors determining its productivity is climate. We conducted a bioclimate analysis in order to analyze the relationship between climatic factors and the productivity of spring‐type camelina seeded in the spring, and to identify regions of the world with potential for camelina in this scenario. Using the modelling tool CLIMEX, a bioclimatic model was developed for spring‐seeded spring‐type camelina to match distribution, reported seed yields and phenology records in North America. Distribution, yield, and phenology data from outside of North America were used as independent datasets for model validation and demonstrated that model projections agreed with published distribution records, reported spring‐seeded camelina yields, and closely predicted crop phenology in Europe, South America, and Asia. Sensitivity analysis, used to quantify the response of camelina to changes in precipitation and temperature, indicated that crop performance was more sensitive to moisture than temperature index parameters, suggesting that the yield potential of spring‐seeded camelina may be more strongly impacted by water‐limited conditions than by high temperatures. Incremental climate scenarios also revealed that spring‐seeded camelina production will exhibit yield shifts at the continental scale as temperature and precipitation deviate from current conditions. Yield data were compared with indices of climatic suitability to provide estimates of potential worldwide camelina productivity. This information was used to identify new areas where spring‐seeded camelina could be grown and areas that may permit expanded production, including eastern Europe, China, eastern Russia, Australia and New Zealand. Our model is the first to have taken a systematic approach to determine suitable regions for potential worldwide production of spring‐seeded camelina.

Gachambi Mwangi, J., J. Haggar, S. Mohammed, T. Santika, and K. Mustapha Umar. 2023. The ecology, distribution, and anthropogenic threats of multipurpose hemi-parasitic plant Osyris lanceolata. Journal for Nature Conservation 76: 126478.

Osyris lanceolata Hochst. & Steud. ex A. DC. is a multipurpose plant with high socioeconomic and cultural values. It is endangered in the biogeographical region of eastern Africa, but of less concern in other regions where it occurs. The few natural populations remaining in the endangered sites continue to encounter many threats, and this has raised concerns about its long-term sustainability. Yet existing knowledge about the ecology and distribution of the plant is too scarce to inform strategies for the conservation and sustainable management of the species. In this study, we conducted a scoping review of the available literature on current knowledge about the plant. We recapitulated existing knowledge about the abiotic and biotic factors influencing the contemporary distribution of the plant, the anthropogenic threats, and existing conservation efforts. Based on the limited studies we reviewed, we identified that the plant prefers specific habitats (hilly areas and rocky outcrops), frequently parasitizes Fabaceae but can parasitize plants from a wide range of countries, and has inadequate ex-situ propagation protocols, which present issues for the survival of the species. Overharvesting from the wild, driven by demand from regional and global markets, poses further threats to the existing natural populations, especially in eastern Africa. A combination of ecological, social, and trade-related conservation measures can be envisioned to help improve the plant’s persistence. These include, but are not limited to, a better understanding of the species’ ecology to inform conservation planning, monitoring of trade flows, and improved transnational environmental laws and cooperation among countries to prevent species smuggling.

Maurin, O., A. Anest, F. Forest, I. Turner, R. L. Barrett, R. C. Cowan, L. Wang, et al. 2023. Drift in the tropics: Phylogenetics and biogeographical patterns in Combretaceae. Global Ecology and Biogeography.

Aim: The aim of this study was to further advance our understanding of the species-rich, and ecologically important angiosperm family Combretaceae to provide new insights into their evolutionary history. We assessed phylogenetic relationships in the family using target capture data and produced a dated phylogenetic tree to assess fruit dispersal modes and patterns of distribution. Location: Tropical and subtropical regions. Time Period: Cretaceous to present. Major Taxa Studied: Family Combretaceae is a member of the rosid clade and comprises 10 genera and more than 500 species, predominantly assigned to genera Combretum and Terminalia, and occurring on all continents and in a wide range of ecosystems. Methods: We use a target capture approach and the Angiosperms353 universal probes to reconstruct a robust dated phylogenetic tree for the family. This phylogenetic framework, combined with seed dispersal traits, biome data and biogeographic ranges, allows the reconstruction of the biogeographical history of the group. Results: Ancestral range reconstructions suggest a Gondwanan origin (Africa/South America), with several intercontinental dispersals within the family and few transitions between biomes. Relative abundance of fruit dispersal types differed by both continent and biome. However, intercontinental colonizations were only significantly enhanced by water dispersal (drift fruit), and there was no evidence that seed dispersal modes influenced biome shifts. Main Conclusions: Our analysis reveals a paradox as drift fruit greatly enhanced dispersal distances at intercontinental scale but did not affect the strong biome conservatism observed.

Richard-Bollans, A., C. Aitken, A. Antonelli, C. Bitencourt, D. Goyder, E. Lucas, I. Ondo, et al. 2023. Machine learning enhances prediction of plants as potential sources of antimalarials. Frontiers in Plant Science 14.

Plants are a rich source of bioactive compounds and a number of plant-derived antiplasmodial compounds have been developed into pharmaceutical drugs for the prevention and treatment of malaria, a major public health challenge. However, identifying plants with antiplasmodial potential can be time-consuming and costly. One approach for selecting plants to investigate is based on ethnobotanical knowledge which, though having provided some major successes, is restricted to a relatively small group of plant species. Machine learning, incorporating ethnobotanical and plant trait data, provides a promising approach to improve the identification of antiplasmodial plants and accelerate the search for new plant-derived antiplasmodial compounds. In this paper we present a novel dataset on antiplasmodial activity for three flowering plant families – Apocynaceae, Loganiaceae and Rubiaceae (together comprising c. 21,100 species) – and demonstrate the ability of machine learning algorithms to predict the antiplasmodial potential of plant species. We evaluate the predictive capability of a variety of algorithms – Support Vector Machines, Logistic Regression, Gradient Boosted Trees and Bayesian Neural Networks – and compare these to two ethnobotanical selection approaches – based on usage as an antimalarial and general usage as a medicine. We evaluate the approaches using the given data and when the given samples are reweighted to correct for sampling biases. In both evaluation settings each of the machine learning models has a higher precision than the ethnobotanical approaches. In the bias-corrected scenario, the Support Vector classifier performs best – attaining a mean precision of 0.67 compared to the best performing ethnobotanical approach with a mean precision of 0.46. We also use the bias correction method and the Support Vector classifier to estimate the potential of plants to provide novel antiplasmodial compounds. We estimate that 7677 species in Apocynaceae, Loganiaceae and Rubiaceae warrant further investigation and that at least 1300 active antiplasmodial species are highly unlikely to be investigated by conventional approaches. While traditional and Indigenous knowledge remains vital to our understanding of people-plant relationships and an invaluable source of information, these results indicate a vast and relatively untapped source in the search for new plant-derived antiplasmodial compounds.
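The precision figures quoted in the abstract above (0.67 versus 0.46) measure the share of selected species that turn out to be active. The following is a minimal sketch of that metric, with an optional per-sample weight to illustrate how reweighting for sampling bias changes the score. The function name `precision` and the example weights are hypothetical, and the weighting shown is only an assumption: the abstract does not describe the authors' actual bias-correction scheme.

```python
from typing import Optional, Sequence


def precision(
    y_true: Sequence[int],
    y_pred: Sequence[int],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Weighted precision: TP / (TP + FP) over the samples predicted positive.

    y_true and y_pred hold 1 (active / selected) or 0 (inactive / not selected).
    weights is an assumed per-sample reweighting; it defaults to 1.0 for every sample.
    """
    if weights is None:
        weights = [1.0] * len(y_true)
    # Sum the weights of correctly and incorrectly selected samples.
    true_pos = sum(w for t, p, w in zip(y_true, y_pred, weights) if p and t)
    false_pos = sum(w for t, p, w in zip(y_true, y_pred, weights) if p and not t)
    if true_pos + false_pos == 0:
        return 0.0  # nothing was selected, so precision is reported as 0 here
    return true_pos / (true_pos + false_pos)


# Toy labels: three species selected, two of them truly active.
labels = [1, 0, 1, 1]
selected = [1, 1, 1, 0]
unweighted = precision(labels, selected)
# Upweighting the inactive, selected species lowers the score.
reweighted = precision(labels, selected, weights=[1.0, 3.0, 1.0, 1.0])
```

A mean precision, as reported in the abstract, would then be the average of this value over repeated evaluation splits.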

Lannuzel, G., L. Pouget, D. Bruy, V. Hequet, S. Meyer, J. Munzinger, and G. Gâteblé. 2022. Mining rare Earth elements: Identifying the plant species most threatened by ore extraction in an insular hotspot. Frontiers in Ecology and Evolution 10.

Conservation efforts in global biodiversity hotspots often face a common predicament: an urgent need for conservation action hampered by a significant lack of knowledge about that biodiversity. In recent decades, the computerisation of primary biodiversity data worldwide has provided the scientific community with raw material to increase our understanding of the shared natural heritage. These datasets, however, suffer from many geographical and taxonomic inaccuracies. Automated tools developed to enhance their reliability have shown that detailed expert examination remains the best way to achieve robust and exhaustive datasets. In New Caledonia, one of the most important biodiversity hotspots worldwide, the plant diversity inventory is still underway, and most taxa awaiting formal description are narrow endemics, hence by definition hard to discern in the datasets. In the meantime, anthropogenic pressures, such as nickel-ore mining, are threatening the unique ultramafic ecosystems at an increasing rate. The conservation challenge is therefore a race against time, as the rarest species must be identified and protected before they vanish. In this study, based on all available datasets and resources, we applied a workflow capable of highlighting the lesser known taxa. The main challenges addressed were to aggregate all data available worldwide, and tackle the geographical and taxonomic biases, avoiding the data loss resulting from automated filtering. Every doubtful specimen went through a careful taxonomic analysis by a local and international taxonomist panel. Geolocation of the whole dataset was achieved through dataset cross-checking, local botanists’ field knowledge, and historical material examination. Field studies were also conducted to clarify the most unresolved taxa. With the help of this method and by analysing over 85,000 data records, we were able to double the number of known narrow endemic taxa, elucidate 68 putative new species, and update our knowledge of the rarest species’ distributions so as to promote conservation measures.

Führding‐Potschkat, P., H. Kreft, and S. M. Ickert‐Bond. 2022. Influence of different data cleaning solutions of point‐occurrence records on downstream macroecological diversity models. Ecology and Evolution 12.

Digital point‐occurrence records from the Global Biodiversity Information Facility (GBIF) and other data providers enable a wide range of research in macroecology and biogeography. However, data errors may hamper immediate use. Manual data cleaning is time‐consuming and often unfeasible, given that the databases may contain thousands or millions of records. Automated data cleaning pipelines are therefore of high importance. Taking North American Ephedra as a model, we examined how different data cleaning pipelines (using, e.g., the GBIF web application, and four different R packages) affect downstream species distribution models (SDMs). We also assessed how data differed from expert data. From 13,889 North American Ephedra observations in GBIF, the pipelines removed 31.7% to 62.7% false positives, invalid coordinates, and duplicates, leading to datasets between 9484 (GBIF application) and 5196 records (manual‐guided filtering). The expert data consisted of 704 records, comparable to data from field studies. Although differences in the absolute numbers of records were relatively large, species richness models based on stacked SDMs (S‐SDM) from pipeline and expert data were strongly correlated (mean Pearson's r across the pipelines: .9986, vs. the expert data: .9173). Our results suggest that all R package‐based pipelines reliably identified invalid coordinates. In contrast, the GBIF‐filtered data still contained both spatial and taxonomic errors. Major drawbacks emerge from the fact that no pipeline fully discovered misidentified specimens without the assistance of taxonomic expert knowledge. We conclude that application‐filtered GBIF data will still need additional review to achieve higher spatial data quality. Achieving high‐quality taxonomic data will require extra effort, probably by thoroughly analyzing the data for misidentified taxa, supported by experts.
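The kind of automated filtering described in the abstract above (removing invalid coordinates and duplicates) can be sketched as follows. This is an illustrative example only, not one of the pipelines evaluated in the study, which used the GBIF web application and R packages. The `Occurrence` type, the function names, the treatment of (0, 0) as a placeholder coordinate, and the four-decimal rounding used to detect duplicates are all assumptions made for the sketch.

```python
from typing import List, NamedTuple, Optional


class Occurrence(NamedTuple):
    """Hypothetical minimal point-occurrence record."""
    species: str
    lat: Optional[float]
    lon: Optional[float]


def has_valid_coordinates(rec: Occurrence) -> bool:
    """Reject missing, out-of-range, and (0, 0) placeholder coordinates."""
    if rec.lat is None or rec.lon is None:
        return False
    if not (-90.0 <= rec.lat <= 90.0 and -180.0 <= rec.lon <= 180.0):
        return False
    # Assumption: an exact (0, 0) point is treated as a data-entry placeholder.
    return not (rec.lat == 0.0 and rec.lon == 0.0)


def clean_occurrences(records: List[Occurrence]) -> List[Occurrence]:
    """Drop records with invalid coordinates, then drop duplicates."""
    seen = set()
    cleaned = []
    for rec in records:
        if not has_valid_coordinates(rec):
            continue
        # Assumption: same species at the same rounded location is a duplicate.
        key = (rec.species, round(rec.lat, 4), round(rec.lon, 4))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


def fraction_removed(n_before: int, n_after: int) -> float:
    """Share of records removed by a cleaning step."""
    return (n_before - n_after) / n_before


raw = [
    Occurrence("Ephedra viridis", 37.10, -112.50),
    Occurrence("Ephedra viridis", 37.10, -112.50),   # duplicate
    Occurrence("Ephedra viridis", 0.0, 0.0),         # placeholder coordinate
    Occurrence("Ephedra nevadensis", 95.0, -115.0),  # latitude out of range
    Occurrence("Ephedra nevadensis", None, -115.0),  # missing latitude
    Occurrence("Ephedra nevadensis", 36.20, -115.10),
]
kept = clean_occurrences(raw)
```

Note that a filter of this kind cannot detect misidentified specimens, which is the limitation the abstract reports for every automated pipeline.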

Colli-Silva, M., J. R. Pirani, and A. Zizka. 2022. Ecological niche models and point distribution data reveal a differential coverage of the cacao relatives (Malvaceae) in South American protected areas. Ecological Informatics 69: 101668.

For many regions, such as South America, it is unclear how well the existing network of protected areas (PAs) covers different taxonomic groups and whether there is a coverage bias of PAs towards certain biomes or species. Publicly available occurrence data, along with ecological niche models, might help to overcome this gap and to quantify the coverage of taxa by PAs, ensuring an unbiased distribution of conservation effort. Here, we use an occurrence database of 271 species from the cacao family (Malvaceae) to address how South American PAs cover species with different distribution, abundance, and threat status. Furthermore, we compared the performance of online databases, expert knowledge, and modelled species distributions in estimating species coverage in PAs. We found that 79 species from our survey (29% of the total) lack any record inside South American PAs and that 20 out of 23 species potentially threatened with extinction are not covered by PAs. The area covered by South American PAs was low across biomes, except for Amazonia, which had a relatively high PA coverage but little available information on species distribution within PAs. Also, raw geo-referenced occurrence data underestimated the number of species in PAs, and projections from ecological niche models were more prone to overestimating the number of species represented within PAs. We argue that the protection of South American flora in heterogeneous environments demands specific strategies tailored to particular biomes, including making new collections inside PAs in less collected areas, and the delimitation of more areas for protection in better known areas. Also, by presenting biasing scenarios of collection effort in a representative plant group, our results can benefit policy makers in conserving different spots of highly biodiverse tropical environments.

Xue, T., S. R. Gadagkar, T. P. Albright, X. Yang, J. Li, C. Xia, J. Wu, and S. Yu. 2021. Prioritizing conservation of biodiversity in an alpine region: Distribution pattern and conservation status of seed plants in the Qinghai-Tibetan Plateau. Global Ecology and Conservation 32: e01885.

The Qinghai-Tibetan Plateau (QTP) harbors abundant and diverse plant life owing to its high habitat heterogeneity. However, the distribution pattern of biodiversity hotspots and their conservation status remain unclear. Based on 148,283 high-resolution occurrence coordinates of 13,450 seed plants, w…

Lopes, A., L. O. Demarchi, A. C. Franco, A. B. Ferreira, C. S. Ferreira, F. Wittmann, I. N. Santiago, et al. 2021. Predicting the potential distribution of aquatic herbaceous plants in oligotrophic Central Amazonian wetland ecosystems. Acta Botanica Brasilica 35: 22–36.

Aquatic herbaceous plants are especially suitable for mapping environmental variability in wetlands, as they respond quickly to environmental gradients and are good indicators of habitat preference. We describe the composition of herbaceous species in two oligotrophic wetland ecosystems, floodplains…