Selecting a distributional assumption for modelling relative densities of benthic macroinvertebrates

Upper Midwest Environmental Sciences Center

Gray, B. R., 2005, Selecting a distributional assumption for modelling relative densities of benthic macroinvertebrates: Ecological Modelling, v. 185, p. 1-12.

Abstract

The selection of a distributional assumption suitable for modelling macroinvertebrate density data is typically challenging. Macroinvertebrate data often exhibit substantially larger variances than expected under a standard count assumption, that of the Poisson distribution. Such overdispersion may derive from multiple sources, including heterogeneity of habitat (historically and spatially), differing life histories for organisms collected within a single collection in space and time, and autocorrelation. Taken to extreme, heterogeneity of habitat may be argued to explain the frequent large proportions of zero observations in macroinvertebrate data. Sampling locations may consist of habitats defined qualitatively as either suitable or unsuitable. The former category may yield random or stochastic zeroes and the latter structural zeroes. Heterogeneity among counts may be accommodated by treating the count mean itself as a random variable, while extra zeroes may be accommodated using zero-modified count assumptions, including zero-inflated and two-stage (or hurdle) approaches. These and linear assumptions (following log- and square root-transformations) were evaluated using 9 years of mayfly density data from a 52 km, ninth-order reach of the Upper Mississippi River (n = 959). The data exhibited substantial overdispersion relative to that expected under a Poisson assumption (i.e. variance:mean ratio = 23 >> 1), and 43% of the sampling locations yielded zero mayflies. Based on the Akaike Information Criterion (AIC), count models were improved most by treating the count mean as a random variable (via a Poisson-gamma distributional assumption) and secondarily by zero modification (i.e. improvements in AIC values = 9184 units and 47.48 units, respectively). Zeroes were underestimated by the Poisson, log-transform and square root-transform models, slightly by the standard negative binomial model but not by the zero-modified models (61%, 24%, 32%, 7%, and 0%, respectively).
However, the zero-modified Poisson models underestimated small counts (1≤y≤4) and overestimated intermediate counts (7≤y≤23). Counts greater than zero were estimated well by zero-modified negative binomial models, while counts greater than one were also estimated well by the standard negative binomial model. Based on AIC and percent zero estimation criteria, the two-stage and zero-inflated models performed similarly. The above inferences were largely confirmed when the models were used to predict values from a separate, evaluation data set (n = 110). An exception was that, using the evaluation data set, the standard negative binomial model appeared superior to its zero-modified counterparts using the AIC (but not percent zero criteria). This and other evidence suggest that a negative binomial distributional assumption should be routinely considered when modelling benthic macroinvertebrate data from low flow environments. Whether negative binomial models should themselves be routinely examined for extra zeroes requires, from a statistical perspective, more investigation. However, this question may best be answered by ecological arguments that may be specific to the sampled species and locations.
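
The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the code used in the paper: the counts are hypothetical toy values, the models have no covariates, and the negative binomial dispersion parameter `k` is found by a crude grid search rather than a proper optimizer. All function names are illustrative assumptions. The sketch computes the variance:mean ratio, the AIC under Poisson and negative binomial (Poisson-gamma) assumptions, and the proportion of zeroes each model predicts. It also defines zero-inflated and hurdle probability functions to show how the two zero-modified forms differ.

```python
import math
import statistics

def poisson_logpmf(y, mu):
    """Log-probability of count y under a Poisson distribution with mean mu."""
    return y * math.log(mu) - mu - math.lgamma(y + 1)

def negbin_logpmf(y, mu, k):
    """Log-probability under a negative binomial (Poisson-gamma mixture)
    with mean mu and dispersion k; the variance is mu + mu**2 / k."""
    return (math.lgamma(y + k) - math.lgamma(k) - math.lgamma(y + 1)
            + k * math.log(k / (k + mu)) + y * math.log(mu / (k + mu)))

def zero_inflated_logpmf(y, pi, base_logpmf):
    """Zero-inflated form: structural zeroes occur with probability pi;
    otherwise the base count model applies (and may itself yield zeroes)."""
    if y == 0:
        return math.log(pi + (1 - pi) * math.exp(base_logpmf(0)))
    return math.log(1 - pi) + base_logpmf(y)

def hurdle_logpmf(y, p0, base_logpmf):
    """Two-stage (hurdle) form: all zeroes occur with probability p0;
    positive counts follow the base model truncated at zero."""
    if y == 0:
        return math.log(p0)
    truncation = math.log(1 - math.exp(base_logpmf(0)))
    return math.log(1 - p0) + base_logpmf(y) - truncation

def aic(loglik, n_params):
    """Akaike Information Criterion; smaller values indicate a better model."""
    return 2 * n_params - 2 * loglik

# Hypothetical toy counts (NOT data from the paper): many zeroes, long right tail.
counts = [0] * 8 + [1, 1, 2, 3, 4, 6, 9, 15, 27, 52]

mu_hat = statistics.mean(counts)  # ML estimate of the mean for both models
var_mean_ratio = statistics.variance(counts) / mu_hat

def negbin_loglik(k):
    return sum(negbin_logpmf(y, mu_hat, k) for y in counts)

# Crude grid search for the dispersion parameter (an approximation to the MLE).
k_grid = [0.05 * 1.1 ** i for i in range(80)]
k_hat = max(k_grid, key=negbin_loglik)

aic_poisson = aic(sum(poisson_logpmf(y, mu_hat) for y in counts), n_params=1)
aic_negbin = aic(negbin_loglik(k_hat), n_params=2)

observed_zero = counts.count(0) / len(counts)
poisson_zero = math.exp(poisson_logpmf(0, mu_hat))
negbin_zero = math.exp(negbin_logpmf(0, mu_hat, k_hat))

print(f"variance:mean ratio = {var_mean_ratio:.1f}")
print(f"AIC Poisson = {aic_poisson:.1f}, AIC negative binomial = {aic_negbin:.1f}")
print(f"zero proportion: observed = {observed_zero:.2f}, "
      f"Poisson = {poisson_zero:.4f}, negative binomial = {negbin_zero:.2f}")
```

With strongly overdispersed counts such as these, the Poisson model predicts far fewer zeroes than observed and has a much larger AIC than the negative binomial model, which is the qualitative pattern the abstract reports. Fitting the zero-modified variants would additionally require estimating `pi` or `p0`, which is omitted here for brevity.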

Keywords

Hexagenia; Hurdle models; LTRMP; Mayflies; Negative binomial distribution; Two-stage models; Zero-inflated count models