## Upper Mississippi River Restoration Program
## Long Term Resource Monitoring

LTRM Statistics

**Calculating sampling weights**

Sampling weights are needed when estimating means or trends from multiple LTRM strata. As an example, consider sites drawn from two different areas. In the first area, 5% of potential sites are sampled while, in the second area, 10% are sampled. A simple average will overemphasize contributions from the second area. To avoid this bias, samples from the second area need to be downweighted by a factor of two (relative to those from the first area). The sampling weights in this example are the reciprocals of the sampling probabilities: 1/0.05 = 20 for area 1 and 1/0.10 = 10 for area 2. Hence, the sampling weights for the first area are double those for the second. Note that sampling weights may differ even when the numbers of samples from different areas or strata are the same, because the weights also depend on the number of potential sites in each area or stratum.
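The two-area example can be checked numerically. The sketch below is illustrative Python, not the LTRM SAS code; the site counts (100 potential sites per area) and the measured values are hypothetical and were chosen only to reproduce the 5% and 10% sampling fractions.

```python
def sampling_weight(population_size, sample_size):
    """Reciprocal of the sampling probability n/N, i.e. N/n."""
    return population_size / sample_size


def weighted_mean(values, weights):
    """Mean of values, each counted in proportion to its sampling weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical data: each area has 100 potential sites.
# Area 1: 5 sites sampled (5%), every measurement equals 2.0.
# Area 2: 10 sites sampled (10%), every measurement equals 8.0.
area1_values = [2.0] * 5
area2_values = [8.0] * 10

w1 = sampling_weight(100, 5)    # weight for each area 1 sample
w2 = sampling_weight(100, 10)   # weight for each area 2 sample

values = area1_values + area2_values
weights = [w1] * len(area1_values) + [w2] * len(area2_values)

simple_mean = sum(values) / len(values)          # pulled toward area 2
design_mean = weighted_mean(values, weights)     # areas contribute equally

print(w1, w2, simple_mean, design_mean)
```

Because both hypothetical areas contain the same number of potential sites, the weighted mean falls midway between the two area means, whereas the simple mean is drawn toward the more heavily sampled area.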

Sampling weights for LTRM variables must be calculated on a variable-by-variable basis because variables may differ in levels of missing data and hence in sample size. Calculating sampling weights requires population sizes—the total number of possible samples—on a per-stratum basis. Totals are provided at http://www.umesc.usgs.gov/ltrmp/stats/population_sizes.xls.
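The variable-by-variable calculation can be sketched as follows. This is illustrative Python rather than the LTRM SAS code; the stratum labels (`A`, `B`), variable names (`temp`, `chl`), population sizes and records are all hypothetical, and missing values are assumed to be stored as `None`.

```python
from collections import Counter


def stratum_weights(records, variable, population_sizes):
    """Return {stratum: N_h / n_h}, where n_h counts only the records
    in stratum h with a non-missing value of `variable`."""
    counts = Counter(
        r["stratum"] for r in records if r.get(variable) is not None
    )
    return {s: population_sizes[s] / n for s, n in counts.items()}


# Hypothetical per-stratum population sizes (total possible samples).
population_sizes = {"A": 40, "B": 60}

# Hypothetical records; one 'chl' value is missing in stratum A.
records = [
    {"stratum": "A", "temp": 21.0, "chl": 12.5},
    {"stratum": "A", "temp": 22.5, "chl": None},
    {"stratum": "A", "temp": 20.1, "chl": 9.8},
    {"stratum": "A", "temp": 23.0, "chl": 15.2},
    {"stratum": "B", "temp": 19.4, "chl": 30.1},
    {"stratum": "B", "temp": 18.8, "chl": 27.6},
    {"stratum": "B", "temp": 19.9, "chl": 33.0},
]

weights_temp = stratum_weights(records, "temp", population_sizes)
weights_chl = stratum_weights(records, "chl", population_sizes)

print(weights_temp)
print(weights_chl)
```

The missing value reduces the stratum A sample size for `chl` from four to three, so that variable's stratum A weight is larger than the one for `temp`, even though both come from the same sampling events.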

__Calculating sampling weights for fisheries data__

The LTRM fisheries sampling design is divided into three sampling periods: June 15 - July 31, August 1 - September 14, and September 15 - October 31. While these periods represent a temporal stratification, the LTRM has historically treated data from multiple periods as arising from a single stratum; this assumption is made in the code supplied below and in subsequent pages. Users may modify the code to treat periods as temporal strata by adding a ‘period’ variable to the capN_h file and by then adding ‘period’ to all ‘by’ statements. Contact the authors at the bottom of this page with questions.

The fisheries component has not consistently sampled within the first period. As we expect means and/or sampling variances to vary by period, comparisons among years in which the first period was not consistently sampled should be made using data from periods 2 and 3 only. This guidance also applies to the estimation of trends across multiple years.
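As a minimal sketch of this guidance, the Python below assigns each collection date to one of the three sampling periods and keeps only samples from periods 2 and 3. It is not the LTRM SAS code; the function name, record layout and sample data are hypothetical.

```python
from datetime import date


def sampling_period(d):
    """Return 1, 2 or 3 for the fisheries sampling period containing
    date d, or None if d falls outside June 15 - October 31."""
    month_day = (d.month, d.day)
    if (6, 15) <= month_day <= (7, 31):
        return 1
    if (8, 1) <= month_day <= (9, 14):
        return 2
    if (9, 15) <= month_day <= (10, 31):
        return 3
    return None


# Hypothetical samples with collection dates and catches.
samples = [
    {"date": date(2003, 6, 20), "catch": 4},
    {"date": date(2003, 8, 10), "catch": 7},
    {"date": date(2003, 10, 2), "catch": 1},
]

# Keep periods 2 and 3 only, for comparisons among years in which
# period 1 was not consistently sampled.
comparable = [s for s in samples if sampling_period(s["date"]) in (2, 3)]

print([s["catch"] for s in comparable])
```

If samples are dropped in this way, the per-stratum sample sizes change, so sampling weights would presumably need to be computed from the retained samples rather than from the full set.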

The SAS code for calculating sample weights for bluegill data may be adapted for use with multiple gears, multiple species and length data. The code contains embedded comments.

__Calculating sampling weights for macroinvertebrate data__

This SAS code calculates sample weights for mayfly data for all Navigation Pools, years and strata.

__Calculating sampling weights for submersed aquatic vegetation data__

This SAS code calculates sample weights for wild celery in Pool 13 (all years and strata).

__Calculating sampling weights for water data__

This SAS code calculates sample weights for chlorophyll *a* data for all Navigation Pools, seasonal sampling episodes (“episodes”), years and strata. Population totals for the water quality component varied substantially between the winter of 1994-95 and spring 1995 (totals are provided at http://www.umesc.usgs.gov/ltrmp/stats/population_sizes.xls). To ensure that multi-year analyses weight years equally, we generate two sets of weights for the water quality component: weights that sum to population totals (to be used only for within-year analyses) and weights that yield equal sums for all years (which may be used for all calculations).
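One way to obtain weights with equal sums across years is to rescale each year's population-total weights to a common total. The Python sketch below illustrates that idea only; the rescaling rule, the choice of common total, the function name and the example weights are all assumptions, and the LTRM SAS code may use a different approach.

```python
def equal_sum_weights(weights_by_year, common_total):
    """Rescale each year's weights so that they sum to common_total.
    Ratios among weights within a year are unchanged."""
    rescaled = {}
    for year, weights in weights_by_year.items():
        scale = common_total / sum(weights)
        rescaled[year] = [w * scale for w in weights]
    return rescaled


# Hypothetical weights that sum to each year's population total.
# The population total doubles between the two years.
population_weights = {
    1994: [10.0, 10.0, 40.0, 40.0],   # sums to 100
    1996: [20.0, 20.0, 80.0, 80.0],   # sums to 200
}

# Assumed common total; any fixed positive value gives equal sums.
multi_year_weights = equal_sum_weights(population_weights, common_total=200.0)

for year, weights in multi_year_weights.items():
    print(year, weights, sum(weights))
```

With the original weights, the later year would carry twice the total weight of the earlier year in a pooled analysis; after rescaling, each year carries the same total weight while strata keep their relative weights within a year.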

**Contact:** Questions or comments may be directed
to Brian Gray, LTRM statistician, Upper Midwest Environmental Sciences
Center, La Crosse, Wisconsin, at brgray@usgs.gov.
