and classifying the watersheds to be sampled. People drink water from a particular watershed; they do not drink average regional water. If the watershed is not part of the initial definition of the sampling area, then the data lose a considerable amount of utility. The use of the GIS tools discussed should allow these logistics to be addressed. Other domains are probably less useful as primary domains. The levels of pesticides in drinking water can be estimated for all or most other domains by aggregating the data based on pesticide use and watersheds.
This survey design seems to be about characterizing a distribution of distributions – or at least about characterizing the distribution of mean values of local site concentration distributions. The design seems to focus strongly on the larger distribution, and on a concern for characterizing the upper percentiles of this distribution. Choosing as a data quality criterion the goal to include the 95th percentile of the parent distribution with 95% confidence (and indeed, in each domain) places strong demands on the statistical design, but the choice of this specific criterion is not discussed. What is so magical about the 95th percentile? Why not be satisfied with the 95th percentile with lesser confidence? What should the relationship be between this data quality standard and risk-based standards such as MCLs?
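To give a sense of the demand this criterion places on the design, the following is a minimal sketch under one possible reading of it, which the background documents do not confirm: that the largest of n independent, randomly selected site values should exceed the 95th percentile of the parent distribution with the stated confidence. Under that assumption the probability of success is 1 - p^n, whatever the shape of the distribution. The function name is hypothetical.

```python
import math


def min_samples_to_cover_percentile(percentile: float, confidence: float) -> int:
    """Smallest n for which the maximum of n independent random draws
    exceeds the given percentile with at least the given confidence.

    Assumed reading of the criterion: P(max > percentile) = 1 - percentile**n.
    """
    if not 0.0 < percentile < 1.0:
        raise ValueError("percentile must be strictly between 0 and 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")

    # Solve 1 - percentile**n >= confidence for n.
    n = max(1, math.ceil(math.log(1.0 - confidence) / math.log(percentile)))

    # Guard against floating-point rounding in the logarithms.
    while 1.0 - percentile ** n < confidence:
        n += 1
    return n


if __name__ == "__main__":
    # Compare the stated criterion with lesser confidence levels.
    for conf in (0.95, 0.90, 0.80):
        n = min_samples_to_cover_percentile(0.95, conf)
        print(f"95th percentile, {conf:.0%} confidence: n = {n}")
```

Under this reading, 95% confidence requires 59 sites in each domain, while 90% requires 45 and 80% requires 32, which indicates roughly what relaxing the confidence level would save.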
One important issue lost in all of this is the sampling needed to adequately characterize the mean value of each local distribution, that is, the individual data values that make up the overall distribution. Variability determines the number of samples needed to characterize the mean with a given level of confidence. How confident do we need to be about the individual mean values? There are statements in the Agency background document that indicate awareness that sampling needs will be greater for flowing-water systems than for reservoirs and other static-water systems, but no detailed investigation of this issue seems to have been made yet. One SAP member noted it would be good to see more attention paid to the data quality of the individual local mean concentrations and less focus on pinning down the upper tail of the larger distribution.
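The link between variability and sample number can be illustrated with a rough sketch. It assumes independent samples, a known standard deviation, and a normal approximation for the sample mean, so that n = (z * sd / margin)^2; none of these assumptions comes from the background documents, and samples taken over time at one site may well be correlated. The function name and the standard deviations used below are hypothetical, chosen only to contrast a static-water site with a more variable flowing-water site.

```python
import math
from statistics import NormalDist


def samples_for_mean(std_dev: float, margin: float, confidence: float = 0.95) -> int:
    """Approximate number of independent samples needed so that the
    two-sided confidence interval for the mean has half-width <= margin.

    Uses the normal approximation n = (z * std_dev / margin)**2.
    """
    if std_dev <= 0 or margin <= 0:
        raise ValueError("std_dev and margin must be positive")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")

    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)
    return math.ceil((z * std_dev / margin) ** 2)


if __name__ == "__main__":
    # Hypothetical values in the same concentration units as the margin.
    reservoir_sd = 1.0   # assumed: static-water site, lower variability
    flowing_sd = 2.0     # assumed: flowing-water site, higher variability
    margin = 0.5

    print("reservoir:", samples_for_mean(reservoir_sd, margin))
    print("flowing water:", samples_for_mean(flowing_sd, margin))
```

With these illustrative inputs the reservoir site needs 16 samples and the flowing-water site 62: doubling the standard deviation roughly quadruples the sampling effort for the same precision.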
Some concern was expressed that question 2 seems contrary to the starting point of the survey. Should the amount of money to be spent be specified first, and the survey design and data quality be determined from that? Or should the design and data quality be determined from the needs of the study in all necessary domains, the survey designed accordingly, and the cost then established? The latter approach will lead to a more expensive program, but will provide the data that are really needed to address the questions. In reality, design, cost, and data quality should be worked out iteratively until the optimum survey is reached. Starting with a predetermined cost is not a scientifically sound way of approaching this survey.
Several of the supporting documents describe stratification as a technique for forming relatively homogeneous groups in order to reduce variance and increase confidence in the results of a sampling program for a given number of samples. However, the stratification schemes discussed in the background documents do not appear to fill this role. It is difficult to see how ACPA's suggestion to create five domains, one national and four regional, can identify or benefit from