* SAS code for generating multi-strata means - WQ example (pool-wide)set for total nitrogen with SAS program variable); * Steps: 1) generate sampling weights, 2) add to database, with dataset with weights named "wqallwt"; * created 9/23/09 by BG, modified 4/4/13 by JTR; * revisions: no standardized weights are produced, as those are not needed because N has remained constant across years; * Please note that we adjust the number of sampling elements in strata (capn_h) for weighting purposes; * If applying a finite population correction (which we don't below), then the unadjusted capn_h would be used with "totals="; * Users will want to replace the directory for the library name "wq" to their own directory name; * Users will want to replace the SAS dataset name with the name of the dateset to be used; * Output: &var._data; *************set SAS variables******************************************; %let dataset=wqall; *change name of SAS dataset if not wqall; %let var=tn; run; * set WQ parameter of interest; **************generate sample weights****************************** * create population size dataset (uses data from http://www.umesc.usgs.gov/ltrmp/stats/population_sizes.xls); data capN_h; input fs strat capn_h cellarea; output; datalines; 1 1 310 4.00 1 2 2887 0.25 1 3 9203 0.25 1 4 2441 4.00 2 1 314 4.00 2 2 5506 0.25 2 3 7048 0.25 2 5 869 4.00 3 1 675 4.00 3 2 3219 0.25 3 3 11106 0.25 3 5 890 4.00 4 1 1215 4.00 4 2 5983 0.25 4 3 1658 0.25 4 5 44 4.00 5 1 1298 4.00 5 2 1689 0.25 6 1 623 4.00 6 2 558 0.25 6 3 4197 0.25 run; *Because cell area represented by elements differs among strata, an adjustment to the capn_h is made (capn_h * 4 / .25); *The adjusted capn_h produces a variable (a_capn_h) that represents the true proportion of the strata size; data capN_h; set capN_h; a_capn_h=capn_h; if cellarea = 4 then a_capn_h = capn_h * 16; * sort both datasets to allow for analysis and merging; proc sort data=capN_h; by fs strat; proc sort data=&dataset out=wqall; by fs strat year episode; run; * calculate sampling weights; proc means data=wqall noprint; var &var; output out=n_h n=n_h; by fs strat year episode; run; data weights; merge n_h capN_h; by fs strat; if n_h then &var._weight = a_capn_h / n_h; else weight = .; run; ****************add weights to database************************************* * sort both datasets to allow for merging; proc sort data=weights; by fs year episode strat; proc sort data=wqall; by fs year episode strat; run; * merge with WQ datafile, and retain only the variable of interest; data wqallwt; merge wqall weights; by fs year episode strat; keep fs year episode strat &var year &var._weight; if &var='.' then delete; if &var._weight='.' then delete; run;