As shown in columns 2, 3, 6 and 7, achieving a confidence interval of + or – 5% would require every state to complete at least about 100 inspections unless the observed compliance rate was 90% (as shown in column 4). Even if the confidence level dropped to 85% (not shown), states would have to complete between 69 (NH) and 106 (NY) inspections, assuming an observed compliance rate of 50%.
Decision about sample size for benchmarking performance
The project states initially agreed to a minimum sample size of 34 to 41 inspections for each state, depending on the state’s universe size. Inspecting this number would allow the project to meet its minimum goal of benchmarking the performance of each state with a minimum level of precision (+ or – 10%) and a reasonable level of confidence (90%), assuming a 50% observed compliance rate on each indicator. If observed performance levels were greater than 50%, the precision of the estimate would increase. As demonstrated in the chart above, states with smaller universes, such as NH, would need to complete fewer inspections than states with larger universes, such as NY.
Note: As discussed below, additional inspections are required to identify statistical differences in SQG performance between any two states.
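To illustrate how precision, confidence level, observed compliance rate and universe size interact, the following is a minimal sketch using the standard normal-approximation formula for a single proportion with a finite population correction. It is an assumption that this textbook formula is a reasonable stand-in; the report does not state the exact method behind its chart, so the sketch is not expected to reproduce the chart's figures. The function name, its parameters and the universe sizes in the usage lines are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def benchmark_sample_size(universe, margin, confidence, p=0.5):
    """Inspections needed to estimate one compliance rate within +/- margin.

    Textbook normal approximation with a finite population correction.
    This is an illustrative assumption, not the project's own calculator.
    """
    # Two-sided critical value for the chosen confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size if the universe were effectively infinite.
    n0 = z ** 2 * p * (1 - p) / margin ** 2
    # The finite population correction reduces n for small universes.
    n = n0 / (1 + (n0 - 1) / universe)
    return ceil(n)


# Hypothetical universe sizes, 90% confidence, +/- 10% precision.
small_state = benchmark_sample_size(universe=100, margin=0.10, confidence=0.90)
large_state = benchmark_sample_size(universe=1000, margin=0.10, confidence=0.90)
```

Under these assumptions, a smaller universe yields a smaller required sample, and an assumed compliance rate further from 50% (for example, p=0.9) lowers the requirement, which matches the qualitative pattern described above.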
3.1.2 Sample Sizes Needed to Compare Performance Levels Between States
In addition to benchmarking an individual state’s performance, the States Common Measures Project also compared SQG performance results between states.
The issues that affect sample size are different when comparing performance levels between states. They are as follows:
The confidence level (as described above – the likelihood that the observed difference actually exists).
The observed performance rates of the two states (as described above).
The power. This is a new concept: the likelihood that the results do not miss a statistically significant difference that is in fact there.
The magnitude of the statistically significant differences that can be detected.
Note: Unlike with benchmarking an individual state’s performance, the number of inspections needed for comparing performance across states does not depend on universe size. See Phase 4 for SQG performance results.
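As a rough illustration of how the four factors listed above combine, the sketch below uses a common normal-approximation formula for comparing two independent proportions. It is not the Massachusetts ERP sample-size calculator, whose internal method is not described here; the function name, parameter names and example rates are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def comparison_sample_size(p1, p2, confidence=0.90, power=0.80):
    """Inspections needed in EACH state to detect a difference p1 vs p2.

    Unpooled normal approximation for two independent proportions.
    An illustrative assumption, not the project's own calculator.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ to define a detectable difference")
    norm = NormalDist()
    # Critical value for a two-sided test at the chosen confidence level.
    z_alpha = norm.inv_cdf(1 - (1 - confidence) / 2)
    # Critical value for the chosen power (chance of not missing a real difference).
    z_beta = norm.inv_cdf(power)
    # Combined binomial variance of the two observed rates.
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2
    return ceil(n)


# Hypothetical example: detect a 20-point difference (50% vs 70% compliance).
per_state = comparison_sample_size(0.50, 0.70, confidence=0.90, power=0.80)
```

Note that the universe size does not appear in this formula. Detecting smaller differences, or demanding higher confidence or power, increases the number of inspections each state needs.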
The Massachusetts ERP sample-size calculator was used to calculate sample sizes needed for various assumptions about confidence level, compliance rates, power and the magnitude of the differences that the project states wanted to detect. The results are shown in the chart below.
The States Common Measures Project Final Report