Header

UZH-Logo

Maintenance Infos

Testing the ability of species distribution models to infer variable importance


Smith, Adam B; Santos, Maria J (2020). Testing the ability of species distribution models to infer variable importance. Ecography, 43(12):1801-1813.

Abstract

Models of species’ distributions and niches are frequently used to infer the importance of range‐ and niche‐defining variables. However, the degree to which these models can reliably identify important variables and quantify their influence remains unknown. Here we use a series of simulations to explore how well models can 1) discriminate between variables with different influence and 2) calibrate the magnitude of influence relative to an ‘omniscient’ model. To quantify variable importance, we trained generalized additive models (GAMs), Maxent and boosted regression trees (BRTs) on simulated data and tested their sensitivity to permutations in each predictor. Importance was inferred by calculating the correlation between permuted and unpermuted predictions, and by comparing predictive accuracy of permuted and unpermuted predictions using AUC and the continuous Boyce index. In scenarios with one influential and one uninfluential variable, models failed to discriminate reliably between variables when training occurrences were < 8–64, prevalence was > 0.5, spatial extent was small, environmental data had coarse resolution and spatial autocorrelation was low, or when pairwise correlation between environmental variables was |r| > 0.7. When two variables influenced the distribution equally, importance was underestimated when species had narrow or intermediate niche breadth. Interactions between variables in how they shaped the niche did not affect inferences about their importance. When variables acted unequally, the effect of the stronger variable was overestimated. GAMs and Maxent discriminated between variables more reliably than BRTs, but no algorithm was consistently well‐calibrated vis‐à‐vis the omniscient model. Algorithm‐specific measures of importance like Maxent's change‐in‐gain metric were less robust than the permutation test. Overall, high predictive accuracy did not connote robust inferential capacity. As a result, requirements for reliably measuring variable importance are likely more stringent than for creating models with high predictive accuracy.

Abstract

Models of species’ distributions and niches are frequently used to infer the importance of range‐ and niche‐defining variables. However, the degree to which these models can reliably identify important variables and quantify their influence remains unknown. Here we use a series of simulations to explore how well models can 1) discriminate between variables with different influence and 2) calibrate the magnitude of influence relative to an ‘omniscient’ model. To quantify variable importance, we trained generalized additive models (GAMs), Maxent and boosted regression trees (BRTs) on simulated data and tested their sensitivity to permutations in each predictor. Importance was inferred by calculating the correlation between permuted and unpermuted predictions, and by comparing predictive accuracy of permuted and unpermuted predictions using AUC and the continuous Boyce index. In scenarios with one influential and one uninfluential variable, models failed to discriminate reliably between variables when training occurrences were < 8–64, prevalence was > 0.5, spatial extent was small, environmental data had coarse resolution and spatial autocorrelation was low, or when pairwise correlation between environmental variables was |r| > 0.7. When two variables influenced the distribution equally, importance was underestimated when species had narrow or intermediate niche breadth. Interactions between variables in how they shaped the niche did not affect inferences about their importance. When variables acted unequally, the effect of the stronger variable was overestimated. GAMs and Maxent discriminated between variables more reliably than BRTs, but no algorithm was consistently well‐calibrated vis‐à‐vis the omniscient model. Algorithm‐specific measures of importance like Maxent's change‐in‐gain metric were less robust than the permutation test. Overall, high predictive accuracy did not connote robust inferential capacity. As a result, requirements for reliably measuring variable importance are likely more stringent than for creating models with high predictive accuracy.

Statistics

Citations

Dimensions.ai Metrics
1 citation in Web of Science®
1 citation in Scopus®
Google Scholar™

Altmetrics

Downloads

8 downloads since deposited on 10 Dec 2020
8 downloads since 12 months
Detailed statistics

Additional indexing

Item Type:Journal Article, refereed, original work
Communities & Collections:07 Faculty of Science > Institute of Geography
Dewey Decimal Classification:910 Geography & travel
Scopus Subject Areas:Life Sciences > Ecology, Evolution, Behavior and Systematics
Uncontrolled Keywords:Ecology, Evolution, Behavior and Systematics
Language:English
Date:1 December 2020
Deposited On:10 Dec 2020 15:43
Last Modified:11 Dec 2020 21:01
Publisher:Wiley-Blackwell Publishing, Inc.
ISSN:0906-7590
OA Status:Gold
Free access at:Publisher DOI. An embargo period may apply.
Publisher DOI:https://doi.org/10.1111/ecog.05317

Download

Gold Open Access

Download PDF  'Testing the ability of species distribution models to infer variable importance'.
Preview
Content: Published Version
Language: English
Filetype: PDF
Size: 2MB
View at publisher
Licence: Creative Commons: Attribution 3.0 Unported (CC BY 3.0)