Upper Mississippi River Restoration Program, Long Term Resource Monitoring
Calculating sampling weights
Sampling weights are needed when estimating means or trends from multiple LTRM strata. As an example, consider sites drawn from two different areas. In the first area, 5% of potential sites are sampled, while in the second area 10% are sampled. A simple average will overemphasize contributions from the second area. To avoid this bias, samples from the second area need to be downweighted by a factor of two relative to those from the first area. The sampling weights in this example are the reciprocals of the sampling probabilities: 1/0.05 = 20 for area 1 and 1/0.10 = 10 for area 2. Hence, the sampling weights for the first area are double those for the second. Note that sampling weights may differ even when the numbers of samples from different areas or strata are the same, because the areas or strata may differ in their numbers of potential sites.
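The two-area example above can be sketched in a few lines of Python. This is an illustration only, not the LTRM SAS code: the area labels, the observed values, and the assumption that each area has 40 potential sites are all made up for the example.

```python
# Illustrative sketch of the two-area example. All values are hypothetical.
# Assume each area has 40 potential sites: 5% sampled -> 2 sites, 10% -> 4 sites.
sampling_fraction = {"area1": 0.05, "area2": 0.10}

# A sampling weight is the reciprocal of the sampling probability.
weights = {area: 1.0 / p for area, p in sampling_fraction.items()}

# Hypothetical observations as (area, measured value) pairs.
samples = [
    ("area1", 4.0), ("area1", 6.0),
    ("area2", 1.0), ("area2", 2.0), ("area2", 3.0), ("area2", 2.0),
]

# Unweighted average: area 2 contributes twice as many samples as area 1.
simple_mean = sum(y for _, y in samples) / len(samples)

# Weighted average: each sample counts in proportion to the sites it represents.
weighted_mean = (
    sum(weights[area] * y for area, y in samples)
    / sum(weights[area] for area, _ in samples)
)

print("weights:", weights)
print("simple mean:", simple_mean)
print("weighted mean:", weighted_mean)
```

With these made-up values, the simple mean is pulled toward the more heavily sampled area 2, while the weighted mean equals the average of the two area means, as expected when both areas contain the same number of potential sites.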
Sampling weights for LTRM variables must be calculated on a variable-by-variable basis because variables may differ in levels of missing data and hence in sample size. Calculating sampling weights requires population sizes (the total number of possible samples) on a per-stratum basis. Totals are provided at http://www.umesc.usgs.gov/ltrmp/stats/population_sizes.xls.
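A minimal Python sketch of why weights are variable-specific follows. It is not the LTRM SAS code; the stratum labels, variable names, population sizes, and the `stratum_weights` helper are hypothetical. The weight for a stratum is taken as its population size divided by the number of non-missing samples for the variable in question, which is the reciprocal of the realized sampling probability.

```python
from collections import Counter

def stratum_weights(records, population_sizes, variable):
    """Return {stratum: N_h / n_h}, where n_h counts only the records
    in stratum h that have a non-missing value for `variable`."""
    n_h = Counter(r["stratum"] for r in records if r.get(variable) is not None)
    return {h: population_sizes[h] / n for h, n in n_h.items()}

# Hypothetical per-stratum population sizes (total possible samples).
population_sizes = {"stratum_A": 200, "stratum_B": 100}

# Hypothetical records; None marks a missing measurement.
records = [
    {"stratum": "stratum_A", "chl_a": 12.1, "turbidity": 21.0},
    {"stratum": "stratum_A", "chl_a": None, "turbidity": 22.5},
    {"stratum": "stratum_B", "chl_a": 8.4, "turbidity": 20.1},
    {"stratum": "stratum_B", "chl_a": 9.0, "turbidity": None},
]

chl_weights = stratum_weights(records, population_sizes, "chl_a")
turb_weights = stratum_weights(records, population_sizes, "turbidity")

print("chl_a weights:", chl_weights)
print("turbidity weights:", turb_weights)
```

Because one value is missing in a different stratum for each variable, the same set of sites yields different weights for the two variables.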
Calculating sampling weights for fisheries data
The LTRM fisheries sampling design is divided into three sampling periods: June 15 - July 31, August 1 - September 15, and September 16 - October 31. While these periods represent a temporal stratification, the LTRM has historically treated data from multiple periods as arising from a single stratum; this assumption is made in the code supplied below and in subsequent pages. Users may modify the code to treat periods as temporal strata by adding a ‘period’ variable to the capN_h file and by then adding ‘period’ to all ‘by’ statements. Contact the authors at the bottom of this page with questions.
The fisheries component has not consistently sampled within the first period. As we expect means and/or sampling variances to vary by period, comparisons among years in which the first period was not consistently sampled should be made using data from periods 2 and 3 only. This guidance also applies to the estimation of trends across multiple years.
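The restriction to periods 2 and 3 might be applied as in the Python sketch below. It is an illustration rather than the LTRM SAS code: the period boundaries are those given above, while the `sampling_period` helper, the record layout, and the catch values are hypothetical.

```python
from datetime import date

def sampling_period(d):
    """Return the fisheries sampling period (1, 2, or 3) for date `d`,
    or None if the date falls outside June 15 - October 31."""
    month_day = (d.month, d.day)
    if (6, 15) <= month_day <= (7, 31):
        return 1
    if (8, 1) <= month_day <= (9, 15):
        return 2
    if (9, 16) <= month_day <= (10, 31):
        return 3
    return None

# Hypothetical samples; only the date matters for this illustration.
samples = [
    {"date": date(2005, 6, 20), "catch": 3},
    {"date": date(2005, 8, 10), "catch": 7},
    {"date": date(2005, 10, 2), "catch": 5},
]

# Keep only periods 2 and 3 for among-year comparisons and trend estimates.
comparable = [s for s in samples if sampling_period(s["date"]) in (2, 3)]

print(len(comparable), "of", len(samples), "samples retained")
```

Sample sizes, and therefore sampling weights, would then be recomputed from the retained samples only.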
The SAS code for calculating sample weights for bluegill data may be adapted for use with multiple gears, multiple species and length data. The code contains embedded comments.
Calculating sampling weights for macroinvertebrate data
This SAS code calculates sample weights for mayfly data for all Navigation Pools, years and strata.
Calculating sampling weights for submersed aquatic vegetation data
This SAS code calculates sample weights for wild celery in Pool 13 (all years and strata).
Calculating sampling weights for water data
This SAS code calculates sample weights for chlorophyll a data for all Navigation Pools, seasonal sampling episodes (“episodes”), years and strata. Population totals for the water quality component varied substantially between the winter of 1994-95 and spring 1995 (totals are provided at http://www.umesc.usgs.gov/ltrmp/stats/population_sizes.xls). To ensure that multi-year analyses weight years equally, we generate two sets of weights for the water quality component: those that sum to population totals (to be used only for within-year analyses) and those that yield equal sums for all years (which may be used for all calculations).
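The distinction between the two sets of weights can be sketched in Python as follows. This is not the LTRM SAS code, and the normalization used there may differ: here the equal-sum weights are obtained by rescaling each year's weights to sum to a common constant, which is an assumption made for illustration. Episodes are ignored for brevity, and the strata, population sizes, and function names are hypothetical.

```python
from collections import defaultdict

def population_weights(samples, population_sizes):
    """Per-sample weights N_hy / n_hy. Within a year these sum to the
    year's population total (summed over its sampled strata)."""
    counts = defaultdict(int)
    for s in samples:
        counts[(s["year"], s["stratum"])] += 1
    return [
        population_sizes[(s["year"], s["stratum"])] / counts[(s["year"], s["stratum"])]
        for s in samples
    ]

def equal_sum_weights(samples, weights, target=1.0):
    """Rescale weights so that every year's weights sum to `target`
    (assumed normalization; chosen so that years are weighted equally)."""
    totals = defaultdict(float)
    for s, w in zip(samples, weights):
        totals[s["year"]] += w
    return [target * w / totals[s["year"]] for s, w in zip(samples, weights)]

# Hypothetical population sizes that change between years.
population_sizes = {
    (1994, "A"): 100, (1994, "B"): 300,
    (1996, "A"): 50, (1996, "B"): 150,
}

# Hypothetical samples: two per stratum in 1994; one in A and three in B in 1996.
samples = (
    [{"year": 1994, "stratum": "A"}] * 2 + [{"year": 1994, "stratum": "B"}] * 2
    + [{"year": 1996, "stratum": "A"}] * 1 + [{"year": 1996, "stratum": "B"}] * 3
)

w_pop = population_weights(samples, population_sizes)
w_eq = equal_sum_weights(samples, w_pop)

print("population-total weights:", w_pop)
print("equal-sum weights:", w_eq)
```

In this sketch the population-total weights sum to 400 in 1994 and 200 in 1996, so a pooled multi-year analysis would give 1994 twice the influence of 1996. After rescaling, each year's weights sum to the same value, while the relative weighting of strata within a year is unchanged.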
Contact: Questions or comments may be directed to Brian Gray, LTRM statistician, Upper Midwest Environmental Sciences Center, La Crosse, Wisconsin, at brgray@usgs.gov.