Upper Mississippi River Restoration Program

Long Term Resource Monitoring

 

LTRM Statistics
     Estimating means and temporal trends using LTRMP data: details with examples

Estimating population means - Estimating a mean from multiple years

Estimating a mean and standard error from LTRM data over multiple years can provide an estimate of “background” conditions for comparisons among locations or over time. However, annual means vary among years, so precision estimates for multi-year means should acknowledge this variation. For example, fish counts vary not only among sampling locations (within years) but also from year to year at the scale of a population (stratum, pool). Not acknowledging this annual variation typically yields standard errors (SEs) that are too small and confidence intervals that are too narrow.

Another concern associated with estimating means across multiple years is that annual means may be temporally correlated (e.g., each year's mean is similar to the mean from the previous year). Such correlation should be presumed typical for annual means from the LTRM’s fish, vegetation and possibly macroinvertebrate components, and may be observed with water data (when, for example, annual means exhibit multi-year trends). Failure to address such correlation, when it is present, will yield variance estimates that are too small. We are aware of no easy-to-use, reliable methods for addressing temporal correlation in LTRM means. This is because the LTRM series are short (< 20 years), because the correlation is present not in LTRM data per se but in population means (which are not measured), and because the sample means also reflect design factors. For these reasons, we recommend that where visual inspection or analytical methods suggest temporal correlation among annual means, variance estimates associated with multi-year means be presumed biased low. If a temporal trend is present over a given period, then it may not be sensible to calculate a single mean over that period.
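One way to inspect annual means for temporal correlation is sketched below. It assumes the water quality dataset, variable and filters used in the single-stratum example on this page (WQall, chlf, fs, episode, strat); the output dataset and variable names (annual, annual_lag, chlf_mean, chlf_prev) are hypothetical. With only ten or so annual means a lag-1 correlation is very imprecise, so the result is an aid to judgment rather than a test.

```sas
/* Sketch only: annual means for one stratum.
   Unweighted means are used because a single stratum is involved.
   annual, annual_lag, chlf_mean and chlf_prev are hypothetical names. */
proc means data=WQall nway noprint;
class year;
var chlf;
where fs=2 and episode=2 and year lt 2003 and strat=1;
output out=annual mean=chlf_mean;
run;

/* Plot annual means against year to look for runs or trends */
proc sgplot data=annual;
series x=year y=chlf_mean / markers;
run;

/* Pair each annual mean with the mean from the preceding row (year) */
data annual_lag;
set annual;
chlf_prev = lag(chlf_mean);
run;

/* Lag-1 correlation among annual means; the first year has no predecessor */
proc corr data=annual_lag;
var chlf_mean chlf_prev;
run;
```

The data step pairs each row with the row before it, so if a year is missing from the series the pairing will span more than one year and the correlation should be interpreted accordingly.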

Estimating a single mean from multiple years and one stratum

Estimating means across multiple years is simplest when working with a single stratum because stratification and sampling probability effects may then be ignored.

Here we estimate mean chlorophyll a concentration in summer in the main channel of Navigation Pool 8 for the first decade of the LTRM (1993 – 2002). Inspection of sample means from the LTRM water graphical browser suggests no clear evidence of temporal correlation among means. The decadal mean may be estimated using the following code:

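/* Years are treated as clusters; alpha=0.1 requests 90% confidence limits */
proc surveymeans data=WQall alpha=0.1;
cluster year;
var chlf;
/* Pool 8, summer, main channel, 1993 - 2002 (per the description above) */
where fs=2 and episode=2 and year lt 2003 and strat=1;
run;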

The code above addresses variation among annual means by including the cluster statement. The estimated mean (30.97 µg / L, Table 1) is the same regardless of whether annual variation in means is addressed. However, addressing annual variation increases the SE associated with that mean from 1.38 to 6.65. The larger SE estimate is more realistic and should be used. Confidence intervals widen when we adjust for clustering not only because the SE increases, but also because the presumed number of independent observations decreases from the number of observations to the number of years (in this case from 239 observations to 10 years).
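For comparison, the unadjusted estimates can presumably be reproduced by running the same step without the cluster statement. The sketch below assumes the same dataset and filters as above; the statistic keywords on the proc statement (mean, stderr, clm, df) simply request that the degrees of freedom be printed alongside the mean, SE and confidence limits so the two runs can be compared directly.

```sas
/* Comparison run (sketch): same data and filters, no cluster statement.
   Observations are treated as independent across years,
   so the SE is expected to be too small. */
proc surveymeans data=WQall alpha=0.1 mean stderr clm df;
var chlf;
where fs=2 and episode=2 and year lt 2003 and strat=1;
run;
```

Adding the same keywords to the clustered run should show the degrees of freedom dropping to reflect the number of years rather than the number of observations.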

Table 1. Mean (µ) and standard error (SE) estimates from multi-year (1993 – 2002) datasets from Navigation Pool 8.

Adjusted for clustering within years?      µ       SE      90% CI

Chlorophyll a, summer, main channel (n = 239)
  N                                        30.97   1.38    28.69, 33.25
  Y                                        30.97   6.65    18.77, 43.16

Bluegill counts, backwater contiguous-shoreline, day electrofishing (n = 157)
  N                                         2.50   0.43     1.79, 3.20
  Y                                         2.50   0.46     1.66, 3.33

The effects of among-year variation on precision estimates were more modest for mean bluegill CPUE in the backwater contiguous stratum of Pool 8. Inspection of sample means from all three sampling periods suggests only modest levels of temporal correlation. Here, adjusting for among-year variation in means increased the SE estimate by 7% and the width of the confidence interval estimate by 18% (Table 1). These bluegill statistics were calculated using the following:

proc surveymeans data=BLGLP8 alpha=0.1;
cluster year;
var catch;
ratio catch / effmin15;
where year lt 2003 and stratum="BWC-S" and period ge 2;
run;
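The bluegill figures illustrate why the interval can widen more than the SE does. As a rough check, assume a t-based interval with degrees of freedom equal to the number of observations minus one when clustering is ignored (157 - 1 = 156) and the number of years minus one when it is addressed (assumed here to be 10 - 1 = 9). Under those assumptions the ratio of interval widths is approximately:

```latex
\[
\mathrm{CI}_{90\%} = \hat{\mu} \pm t_{0.95,\,\mathrm{df}} \cdot \mathrm{SE}
\]
\[
\frac{\text{width}_{\text{clustered}}}{\text{width}_{\text{unclustered}}}
= \frac{t_{0.95,\,9}}{t_{0.95,\,156}} \cdot
  \frac{\mathrm{SE}_{\text{clustered}}}{\mathrm{SE}_{\text{unclustered}}}
\approx \frac{1.833}{1.655} \cdot \frac{0.46}{0.43}
\approx 1.108 \times 1.070
\approx 1.19
\]
```

Roughly 7% of the widening comes from the larger SE and roughly 11% from the larger critical value, which is close to the 18% reported above given that the SEs in Table 1 are rounded.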

Cluster statements may be added to any of the code for single-stratum analyses provided under Estimating stratum-specific means and Estimating means from portions of one or more strata for a single year.

Estimating a single mean from multiple years and multiple strata

As noted above, estimating a single mean from multiple years requires that variation among annual means be addressed.  Doing so, however, becomes more challenging when means derive from multiple strata.  The problem is that, by definition, clusters (years in this setting) occur within strata while, for the LTRM, strata are sampled within years.  One solution treats LTRM strata as secondary strata, with years (clusters) arising from within a single, primary stratum. This method, which requires specialized software, may be undertaken by adapting the SUDAAN LOGLINK code for fish for use with SUDAAN’s procedures for descriptive statistics from categorical data (CROSSTAB), continuous data (DESCRIPT) and CPUE and other ratio data (RATIO).

A workaround is to treat the possibly minor effects of stratification on precision estimates as ignorable.  In this case, estimating a mean from multiple years and multiple strata becomes substantially easier. In this simpler approach, stratification is ignored but variable sampling probabilities are still addressed. We provide code for this simpler approach below.

Fish data

SAS code for estimating means from multiple years and strata may be adapted from code supplied at Estimating a mean from multiple years. Here we estimate mean bluegill CPUE for the LTRM's first decade (1993 - 2002) in the backwater and impounded shoreline strata of Pool 8. The ratio and weight statements were addressed under Estimating a mean from multiple years and Estimating sampling weights.

proc surveymeans data=BLGLP8wt;
cluster year;
var catch;
ratio catch / effmin15;
weight sweightp_23;
where year lt 2003 and stratum2 in ("BWC-S","IMP-S") and period ge 2;
run;

Macroinvertebrate data

Means of macroinvertebrate data from multiple years and strata are derived by adding a cluster statement to the macroinvertebrate code supplied under Estimating a stratum-specific mean for a single year. The following estimates mean mayfly density for all sampled strata in Pool 26 (all available years). The weight statement was addressed under Estimating sampling weights.

proc surveymeans data= INVERTwt;
cluster year;
var mayflym2;
weight sweight;
where fieldsta=4;
run;

Vegetation data

Means of vegetation data from multiple years and strata are derived by adapting the vegetation code supplied under Estimating a stratum-specific mean for a single year. The following estimates mean site detection of wild celery for all sampled strata except the isolated backwater stratum in Pool 13 (for all years in the dataset). The weight statement was addressed under Estimating sampling weights.

proc surveymeans data=VAAM3FS3wt;
cluster year;
var sitedetect;
weight sweight;
where pool="13" and mstratum ne "BWI";
run;

Water quality data

SAS code for estimating water quality means from multiple years and strata may be adapted from code supplied at Estimating a mean from multiple years. Here we estimate mean chlorophyll a for the first decade of water sampling (1993 - 2002) in the backwater and impounded shoreline strata of Pool 8. The weight statement was addressed under Estimating sampling weights.

proc surveymeans data=WQallwt;
cluster year;
var chlf;
weight sweightstd;
where fs=2 and episode=2 and year lt 2003 and strat in (1,2);
run;


Contact: Questions or comments may be directed to Brian Gray, LTRM statistician, Upper Midwest Environmental Sciences Center, La Crosse, Wisconsin, at brgray@usgs.gov.

U.S. Department of the Interior | U.S. Geological Survey

Page Last Modified: January 7, 2016