Science Enabled by Specimen Data

Serra‐Diaz, J. M., J. Borderieux, B. Maitner, C. C. F. Boonman, D. Park, W. Guo, A. Callebaut, et al. 2024. occTest: An integrated approach for quality control of species occurrence data. Global Ecology and Biogeography.

Aim: Species occurrence data are valuable information that enables one to estimate geographical distributions, characterize niches and their evolution, and guide spatial conservation planning. Rapid increases in species occurrence data stem from increasing digitization and aggregation efforts, and citizen science initiatives. However, persistent quality issues in occurrence data can impact the accuracy of scientific findings, underscoring the importance of filtering erroneous occurrence records in biodiversity analyses.
Innovation: We introduce an R package, occTest, that synthesizes a growing open‐source ecosystem of biodiversity cleaning workflows to prepare occurrence data for different modelling applications. It offers a structured set of algorithms to identify potential problems with species occurrence records by employing a hierarchical organization of multiple tests. The workflow has a hierarchical structure organized in testPhases (i.e. cleaning vs. testing) that encompass different testBlocks grouping different testTypes (e.g. environmental outlier detection), which may use different testMethods (e.g. Rosner test, jackknife, etc.). Four different testBlocks characterize potential problems in geographic, environmental, human influence and temporal dimensions. Filtering and plotting functions are incorporated to facilitate the interpretation of tests. We provide examples with different data sources, with default and user‐defined parameters. Compared to other available tools and workflows, occTest offers a comprehensive suite of integrated tests, and allows multiple methods associated with each test to explore consensus among data cleaning methods. It uniquely incorporates both coordinate accuracy analysis and environmental analysis of occurrence records. Furthermore, it provides a hierarchical structure to incorporate future tests yet to be developed.
Main conclusions: occTest will help users understand the quality and quantity of data available before the start of data analysis, while also enabling users to filter data using either predefined rules or custom‐built rules. As a result, occTest can better assess each record's appropriateness for its intended application.
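The idea of running several testMethods for one testType and then looking at their consensus can be illustrated with a small sketch. This is not occTest's API (occTest is an R package); the function names, the two outlier rules (a z-score cutoff and Tukey's IQR fences, standing in for methods such as the Rosner test or jackknife), and the thresholds are all assumptions chosen for illustration.

```python
from statistics import mean, quantiles, stdev


def zscore_flags(values, threshold=2.0):
    """Flag values whose absolute z-score exceeds the (assumed) threshold."""
    m, s = mean(values), stdev(values)
    return [abs(v - m) / s > threshold for v in values]


def iqr_flags(values, k=1.5):
    """Flag values outside Tukey's fences: [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


def consensus(flag_lists):
    """Per record, the fraction of methods that flagged it (0.0 to 1.0)."""
    n_methods = len(flag_lists)
    return [sum(flags) / n_methods for flags in zip(*flag_lists)]


# Hypothetical environmental values (e.g. temperature) at occurrence records.
temperatures = [10.1, 10.3, 9.8, 10.0, 10.2, 9.9, 10.1, 25.0]
scores = consensus([zscore_flags(temperatures), iqr_flags(temperatures)])
for value, score in zip(temperatures, scores):
    print(value, score)
```

A record flagged by every method is a stronger candidate for removal than one flagged by a single method, which is the kind of rule a user-defined filter could encode.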

Anest, A., Y. Bouchenak-Khelladi, T. Charles-Dominique, F. Forest, Y. Caraglio, G. P. Hempson, O. Maurin, and K. W. Tomlinson. 2024. Blocking then stinging as a case of two-step evolution of defensive cage architectures in herbivore-driven ecosystems. Nature Plants.

Dense branching and spines are common features of plant species in ecosystems with high mammalian herbivory pressure. While dense branching and spines can inhibit herbivory independently, when combined, they form a powerful defensive cage architecture. However, how cage architecture evolved under mammalian pressure has remained unexplored. Here we show how dense branching and spines emerged during the age of mammalian radiation in the Combretaceae family and diversified in herbivore-driven ecosystems in the tropics. Phylogenetic comparative methods revealed that modern plant architectural strategies defending against large mammals evolved via a stepwise process. First, dense branching emerged under intermediate herbivory pressure, followed by the acquisition of spines that supported higher speciation rates under high herbivory pressure. Our study highlights the adaptive value of dense branching as part of a herbivore defence strategy and identifies large mammal herbivory as a major selective force shaping the whole plant architecture of woody plants. This study explores the evolution of two traits, branching density and spine presence, in the globally distributed plant family Combretaceae. These traits were found to have appeared in a two-step process in response to mammalian herbivory pressure, revealing the importance of large mammals in the evolution of plant architecture diversity.

Prochazka, L. S., S. Alcantara, J. G. Rando, T. Vasconcelos, R. C. Pizzardo, and A. Nogueira. 2024. Resource availability and disturbance frequency shape evolution of plant life forms in Neotropical habitats. New Phytologist.

Organisms use diverse strategies to thrive in varying habitats. While life history theory partly explains these relationships, the combined impact of resource availability and disturbance frequency on life form strategy evolution has received limited attention. We use Chamaecrista species, a legume plant lineage with a high diversity of plant life forms in the Neotropics, and employ ecological niche modeling and comparative phylogenetic methods to examine the correlated evolution of plant life forms and environmental niches. Chamaephytes and phanerophytes have optima in environments characterized by moderate water and nutrient availability coupled with infrequent fire disturbances. By contrast, annual plants thrive in environments with scarce water and nutrients, alongside frequent fire disturbances. Similarly, geophyte species also show increased resistance to frequent fire disturbances, although they thrive in resource‐rich environments. Our findings shed light on the evolution of plant strategies along environmental gradients, highlighting that annuals and geophytes respond differently to high incidences of fire disturbances, with one enduring it as seeds in a resource‐limited habitat and the other relying on reserves and root resprouting systems in resource‐abundant habitats. Furthermore, it deepens our understanding of how organisms evolve associated with their habitats, emphasizing a constraint posed by low‐resource and high‐disturbance environments.

Anon. 2023. Ecological Niche Modelling of an Industrially Important Mushroom - Ganoderma lucidum (Leys.) Karsten: A Machine Learning Global Appraisal. Journal of Scientific & Industrial Research 82.

Species Distribution Modelling (SDM) involves utilizing observations of a given species and its surrounding environment to produce a sound approximation of the species' potential distribution. The intricate relationships between organisms and their surroundings, coupled with the profusion of data, have captured the attention of ecologists and statisticians alike. Consequently, they have directed their efforts towards exploring the potential of machine learning techniques. Our study employs an ensemble machine learning approach to model the global ecological niche of the fungus Ganoderma lucidum. This involves the utilization of various environmental predictors and the averaging of multiple algorithms to achieve a comprehensive analysis. A total of 563 spatially thinned presence points of G. lucidum were projected with three bio-climatic time frames, namely current, 2050, and 2070, and four Representative Concentration Pathways (RCPs), namely 2.6, 4.5, 6.0, and 8.5, as well as non-climatic variables (surface soil features, land use, rooting depth and water storage capacity at rooting zone). We observed excellent model qualities as the Area Under the receiver operating Curve (AUC) approached 0.90. Random Forest was identified as the best individual algorithm, while the Maxent entropy was identified as the least effective for Ecological Niche Modelling (ENM) of G. lucidum. Globally, under the current bio-climatic and non-bioclimatic projection, optimum habitat for this fungus covers 12,510,876.3 km², while the maximum area under this habitat class in future projections (13,248,546.9 km²) was recorded under RCP 8.5 in 2070. The primary determinants of its current global distribution were ecosystem rooting depth, water storage capacity, and precipitation seasonality. Under the two future bioclimatic time frames and RCPs, however, isothermality was identified as the most influential predictor.
Based on our assessment, it has been determined that this particular fungus is exhibiting a persistent pattern of proliferation across the regions of Europe, America, and certain areas of India. The present investigation sought to underscore the importance of discerning the native habitats of this species, taking into account both current and anticipated climatic shifts. This knowledge is essential for effectively coordinating the artificial cultivation and natural harvesting of G. lucidum, which is necessary to meet the ever-increasing industrial demands.

Putra, A. R., K. A. Hodgins, and A. Fournier‐Level. 2023. Assessing the invasive potential of different source populations of ragweed (Ambrosia artemisiifolia L.) through genomically informed species distribution modelling. Evolutionary Applications.

The genetic composition of founding populations is likely to play a key role in determining invasion success. Individual genotypes may differ in habitat preference and environmental tolerance, so their ability to colonize novel environments can be highly variable. Despite the importance of genetic variation on invasion success, its influence on the potential distribution of invaders is rarely investigated. Here, we integrate population genomics and ecological niche models (ENMs) into a single framework to predict the distribution of globally invasive common ragweed (Ambrosia artemisiifolia) in Australia. We identified three genetic clusters for ragweed and used these to construct cluster‐specific ENMs and characterize within‐species niche differentiation. The potential range of ragweed in Australia depended on the genetic composition and continent of origin of the introduced population. Invaders originating from warmer, wetter climates had a broader potential distribution than those from cooler, drier ones. By quantifying this change, we identified source populations most likely to expand the ragweed distribution. As prevention remains the most effective method of invasive species management, our work provides a valuable way of ranking the threat posed by different populations to better inform management decisions.

Silva-Valderrama, I., J.-R. Úrbez-Torres, and T. J. Davies. 2024. From host to host: The taxonomic and geographic expansion of Botryosphaeriaceae. Fungal Biology Reviews 48: 100352.

Fungal pathogens are responsible for 30% of emerging infectious diseases (EIDs) in plants. The risk of a pathogen emerging on a new host is strongly tied to its host breadth; however, the determinants of host range are still poorly understood. Here, we explore the factors that shape host breadth of plant pathogens within Botryosphaeriaceae, a fungal family associated with several devastating diseases in economically important crops. While most host plants are associated with just one or a few fungal species, some hosts appear to be susceptible to infection by multiple fungi. However, the variation in the number of fungal taxa recorded across hosts is not easily explained by heritable plant traits. Nevertheless, we reveal strong evolutionary conservatism in host breadth, with most fungi infecting closely related host plants, but with some notable exceptions that seem to have escaped phylogenetic constraints on host range. Recent anthropogenic movement of plants, including widespread planting of crops, has provided new opportunities for pathogen spillover. We suggest that constraints to pathogen distributions will likely be further disrupted by climate change, and we may see future emergence events in regions where hosts are present but current climate is unfavorable.

Finegan, B., D. Delgado, A. L. Hernández Gordillo, N. Zamora Villalobos, R. Núñez Florez, F. Díaz Santos, and S. Vílchez Mendoza. 2024. Multi-dimensional temperature sensitivity of protected tropical mountain rain forests. Frontiers in Forests and Global Change 6.

Introduction: Tropical mountain rain forests (TMRF, natural forests at > 300 m asl) are globally important for biodiversity and ecosystem services and are believed to be highly vulnerable to climate change. But there are no specific approaches for rigorous assessment of their vulnerability at the landscape and local scales necessary for management for adaptation. We address the challenge of evaluating the ecological sensitivity to temperature of TMRF, applying a multidimensional approach in protected areas over a 440–2,950 m asl altitudinal gradient in Costa Rica, synthesizing results of a long-term research programme (2012-present). We evaluate the sensitivity to the current spatial temperature gradient of eleven ecosystem properties in three categories: forest composition and diversity, thermal characteristics of forest stands and forest structure and dynamics.
Methods: Data are from 29 to 32 plots of 50 m × 50 m (0.25 ha) distributed over the gradient, in which all trees, palms and tree ferns ≥ 10 cm dbh are identified to species and measured for recruitment, growth and mortality. An experimental study of leaf litter decomposition rates was carried out in twelve plots. Current and future (SSP 585, 2070) values of mean annual temperature (MAT) were obtained from online climate surfaces. Thermal characteristics of forest stands were determined using MATs of species occurrences in GBIF and include a new index, the Community Thermal Capital Index (CTCI), calculated as CTI − MAT.
Results: We classified degrees of sensitivity to temperature as very weak, weak, moderate or substantial. All eleven ecosystem properties are substantially sensitive, so changes in their values are expected under rising temperatures. Species density, the community temperature index (CTI), tree recruitment and mortality rates and leaf litter decomposition rates are positively related to temperature, while the community weighted mean thermal niche breadth, the CTCI, net basal area increments, stand basal area and carbon in aboveground biomass are negatively related. Results point to zones of vulnerability in the protected areas.
Discussion: In montane forests, positive values of the CTCI (climate credit), robust basal area growth and very low mortality and leaf litter decomposition rates suggest healthy ecosystems and no risk of mountaintop extinction. Lowland forests may be vulnerable to degradation and biotic attrition, showing current basal area loss, high mortality and climate debts. National and local actors are participating in a process of adoption of the sensitivity analysis and recommendations regarding zones of vulnerability.
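The CTCI defined in the abstract above (CTI minus MAT) can be worked through in a few lines. The sketch below assumes that CTI is the abundance-weighted mean of each species' thermal optimum (here, the mean MAT across its occurrence records), which is a common definition but is not spelled out in the abstract. Function names, species names and all numbers are hypothetical.

```python
def community_temperature_index(abundances, species_mat):
    """Abundance-weighted mean of species thermal optima (assumed CTI definition).

    abundances:  {species: number of stems in the plot}
    species_mat: {species: mean MAT (deg C) across that species' occurrences}
    """
    total = sum(abundances.values())
    return sum(n * species_mat[sp] for sp, n in abundances.items()) / total


def community_thermal_capital_index(cti, plot_mat):
    """CTCI = CTI - MAT, as stated in the abstract."""
    return cti - plot_mat


# Hypothetical plot: two species with made-up thermal optima.
abundances = {"species_a": 3, "species_b": 1}
species_mat = {"species_a": 20.0, "species_b": 24.0}
plot_mat = 19.5  # hypothetical mean annual temperature at the plot

cti = community_temperature_index(abundances, species_mat)
ctci = community_thermal_capital_index(cti, plot_mat)
print(cti, ctci)  # 21.0 1.5
```

In the abstract's terms, a positive CTCI such as this one corresponds to a climate credit: the community's species are, on average, associated with warmer conditions than the plot currently experiences.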

Schertler, A., B. Lenzner, S. Dullinger, D. Moser, J. L. Bufford, L. Ghelardini, A. Santini, et al. 2023. Biogeography and global flows of 100 major alien fungal and fungus‐like oomycete pathogens. Journal of Biogeography.

Aim: Spreading infectious diseases associated with introduced pathogens can have devastating effects on native biota and human livelihoods. We analyse the global distribution of 100 major alien fungal and oomycete pathogens with substantial socio‐economic and environmental impacts and examine their taxonomy, ecological characteristics, temporal accumulation trajectories, regional hot‐ and coldspots of taxon richness and taxon flows between continents.
Location: Global.
Taxon: Alien/cryptogenic fungi and fungus‐like oomycetes, pathogenic to plants or animals.
Methods: To identify over/underrepresented classes and phyla, we performed chi‐squared tests of independence. To describe spatial patterns, we calculated the region‐wise richness and identified hot‐ and coldspots, defined as residuals after correcting taxon richness for region area and sampling effort via a quasi‐Poisson regression. We examined the relationship with environmental and socio‐economic drivers with a multiple linear regression and evaluated a potential island effect. Regional first records were pooled over 20‐year periods, and for global flows the links between the native range to the alien regions were mapped.
Results: Peronosporomycetes (Oomycota) were overrepresented among taxa and regional taxon richness was positively correlated with area and sampling effort. While no island effect was found, likely due to host limitations, hotspots were correlated with human modification of terrestrial land, per capita gross domestic product, temperate and tropical forest biomes, and orobiomes. Regional first records have increased steeply in recent decades. While Europe and Northern America were major recipients, about half of the taxa originate from Asia.
Main Conclusions: We highlight the putative importance of anthropogenic drivers, such as land use providing a conducive environment, contact opportunities and susceptible hosts, as well as economic wealth likely increasing colonisation pressure. While most taxa were associated with socio‐economic impacts, possibly partly due to a bias in research focus, about a third show substantial impacts to both socio‐economy and the environment, underscoring the importance of maintaining a wholescale perspective across natural and managed systems.

Qin, F., T. Xue, X. Zhang, X. Yang, J. Yu, S. R. Gadagkar, and S. Yu. 2023. Past climate cooling and orogenesis of the Hengduan Mountains have influenced the evolution of Impatiens sect. Impatiens (Balsaminaceae) in the Northern Hemisphere. BMC Plant Biology 23.

Background: Impatiens sect. Impatiens is distributed across the Northern Hemisphere and has diversified considerably, particularly within the Hengduan Mountains (HDM) in southwest China. Yet, the infra-sectional phylogenetic relationships are not well resolved, largely due to limited taxon sampling and an insufficient number of molecular markers. The evolutionary history of its diversification is also poorly understood. In this study, plastome data and the most complete sampling to date were used to reconstruct a robust phylogenetic framework for this section. The phylogeny was then used to investigate its biogeographical history and diversification patterns, specifically with the aim of understanding the role played by the HDM and past climatic changes in its diversification.
Results: A stable phylogeny was reconstructed that strongly supported both the monophyly of the section and its division into seven major clades (Clades I-VII). Molecular dating and ancestral area reconstruction suggest that sect. Impatiens originated in the HDM and Southeast China around 11.76 Ma, after which different lineages dispersed to Northwest China, temperate Eurasia, and North America, mainly during the Pliocene and Pleistocene. An intercontinental dispersal event from East Asia to western North America may have occurred via the Bering Land Bridge or Aleutian Islands. The diversification rate was high during its early history, especially within the HDM, but gradually decreased over time both within and outside the HDM. Multiple linear regression analysis showed that the distribution pattern of species richness was strongly associated with elevation range, elevation, and mean annual temperature. Finally, ancestral niche analysis indicated that sect. Impatiens originated in a relatively cool, middle-elevation area.
Conclusions: We inferred the evolutionary history of sect. Impatiens based on a solid phylogenetic framework. The HDM was the primary source or pump of its diversity in the Northern Hemisphere. Orogeny and climate change may have also shaped its diversification rates, as a steady decrease in the diversification rate coincided with the uplift of the HDM and climate cooling. These findings provide insights into the distribution pattern of sect. Impatiens and other plants in the Northern Hemisphere.

Zhang, H., W. Guo, and W. Wang. 2023. The dimensionality reductions of environmental variables have a significant effect on the performance of species distribution models. Ecology and Evolution 13.

How to effectively obtain species‐related low‐dimensional data from massive environmental variables has become an urgent problem for species distribution models (SDMs). In this study, we explore whether dimensionality reduction on environmental variables can improve the predictive performance of SDMs. We first used two linear (i.e., principal component analysis (PCA) and independent components analysis) and two nonlinear (i.e., kernel principal component analysis (KPCA) and uniform manifold approximation and projection) dimensionality reduction techniques (DRTs) to reduce the dimensionality of high‐dimensional environmental data. Then, we established five SDMs based on the dimensionality‐reduced environmental variables for 23 real plant species and nine virtual species, and compared their predictive performance with that of SDMs based on environmental variables selected using Pearson's correlation coefficient (PCC). In addition, we studied the effects of DRTs, model complexity, and sample size on the predictive performance of SDMs. The predictive performance of SDMs under DRTs other than KPCA is better than that of SDMs using PCC, and the predictive performance of SDMs using linear DRTs is better than that of SDMs using nonlinear DRTs. In addition, using DRTs to deal with environmental variables has no less impact on the predictive performance of SDMs than model complexity and sample size. When the model complexity is at the complex level, PCA improves the predictive performance of SDMs the most, by 2.55% compared with PCC. At the middle level of sample size, PCA improved the predictive performance of SDMs by 2.68% compared with PCC. Our study demonstrates that DRTs have a significant effect on the predictive performance of SDMs. Specifically, linear DRTs, especially PCA, are more effective at improving model predictive performance at relatively high model complexity or large sample sizes.
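The PCC baseline that the DRTs are compared against is a variable-selection step. One common way to do it is sketched below under loud assumptions: the abstract does not describe the selection procedure, so the greedy rule, the 0.7 cutoff, and the variable names and values are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def select_uncorrelated(variables, threshold=0.7):
    """Greedy filter: keep a variable only if |r| with every kept one <= threshold.

    The result depends on the order of `variables`; the threshold is an
    assumed value, not one taken from the paper.
    """
    kept = []
    for name, values in variables.items():
        if all(abs(pearson(values, variables[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical environmental layers sampled at five sites.
env = {
    "temperature": [1, 2, 3, 4, 5],
    "temperature_scaled": [2, 4, 6, 8, 10],  # perfectly correlated duplicate
    "precipitation": [5, 1, 4, 2, 3],
}
print(select_uncorrelated(env))  # ['temperature', 'precipitation']
```

Unlike PCA, this approach discards whole variables rather than recombining them into new axes, which is the contrast the study evaluates.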