question is likely to be a manifestation of clinical depression or of HIV disease progression. Criterion 3 was adopted to ensure that the data being aggregated were estimates of current mood disorders and were not overly influenced by past depression. Inclusion of rates for periods of 12 months or longer would lead to an overestimation of current mood disorder for both groups. Criterion 4 was adopted to protect against counting individuals more than once and against possible influences of attrition. Criterion 5 was adopted to avoid the obvious overestimation of pathology that would occur if participants had been recruited through consultations for psychiatric evaluation.

Using these criteria, we identified 10 studies for review; the 10 studies provided information on a total of 2,596 participants (Table 1). As Table 1 shows, every study provided information on major depressive disorder. However, only half of the studies provided rates of dysthymic disorder. Eight studies used DSM-III-R criteria (4, 8–14), one used DSM-III (7), and one used DSM-IV (6). In addition, six studies exclusively recruited gay men, allowing us to examine this subpopulation separately. Unfortunately, no other subpopulation was specifically represented in multiple studies. As for the remaining studies, three involved mixed groups and only one involved intravenous drug users.

## Statistical Analyses

Three separate meta-analytic techniques were used to reexamine the 10 studies. The first technique was selected because of its ease of interpretation, straightforward methods, and statistical simplicity. Frequently called the vote-counting technique, it involves the simple aggregation of caseness at the participant level. All of the identified studies provided enough information to determine the data of interest. The question at hand involved a two-by-two contingency: the presence or absence of depressive disorder and the presence or absence of HIV infection. Using the vote-counting technique, we added the numbers of participants in each of the four contingency cells. These data were then reanalyzed as if we had one very large investigation. Although this type of analysis has great intuitive appeal and is quite easily carried out and interpreted, it is not without limitations. In particular, it is vulnerable to a type of bias known as “Simpson’s paradox,” which can result when the studies that are aggregated differ greatly in the relative number of subjects across study groups and in rates of disorder within groups (see reference 24, pp. 93–98). A technique for the correction of this bias exists (see reference 25, p. 69), but the resulting analysis becomes cumbersome and the straightforward nature of this method is lost.
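The bias described above can be made concrete with a small sketch. The counts below are hypothetical, invented only to show how pooling cells across studies can reverse the direction of an association that holds within every study; they are not taken from any of the 10 reviewed studies, and the helper names `rate` and `pool` are illustrative.

```python
# Hypothetical counts, invented purely to illustrate Simpson's paradox.
# Each tuple: (depressed HIV+, total HIV+, depressed HIV-, total HIV-)
studies = [
    (20, 100, 150, 1000),  # study A: small HIV+ group, high overall rate
    (50, 1000, 2, 100),    # study B: large HIV+ group, low overall rate
]


def rate(cases, total):
    """Proportion of participants who meet criteria for the disorder."""
    return cases / total


def pool(studies):
    """Pool at the participant level by summing each count across studies."""
    pos_cases = sum(s[0] for s in studies)
    pos_total = sum(s[1] for s in studies)
    neg_cases = sum(s[2] for s in studies)
    neg_total = sum(s[3] for s in studies)
    return pos_cases, pos_total, neg_cases, neg_total


# Rates within each study: (HIV+ rate, HIV- rate)
within = [(rate(s[0], s[1]), rate(s[2], s[3])) for s in studies]

# Rates after pooling all participants into one large table
pooled = pool(studies)
pooled_rates = (rate(pooled[0], pooled[1]), rate(pooled[2], pooled[3]))

for label, (pos, neg) in zip("AB", within):
    print(f"study {label}: HIV+ {pos:.1%}, HIV- {neg:.1%}")
print(f"pooled:  HIV+ {pooled_rates[0]:.1%}, HIV- {pooled_rates[1]:.1%}")
```

In this constructed example the HIV-positive rate is higher within each study, yet lower after pooling, because the two studies differ sharply in both group sizes and base rates. That is the combination of conditions the text identifies as the source of the bias.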

The second and third meta-analytic methods we used involve the aggregation of study effect sizes and probability levels. First, we will describe the method of calculating the effect size. Effect sizes for each of the identified studies were determined with the approach described by Schafer (26). This effect size is computed by taking the natural log of the odds ratio for co-occurrence of two variables observed in each study. Before computing these values, 0.5 was added to each cell so that undefined values were not possible. A zero value indicates complete independence of the two variables. Negative or positive values indicate the direction of association. These effect sizes are weighted by the number of subjects in the respective studies, and the average effect size and standard error are computed. Finally, a confidence interval is used to statistically test the average effect size.
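The effect-size computation described above can be sketched as follows. The log odds ratio with 0.5 added to each cell and the weighting by number of subjects follow the text. The standard-error formula is an assumption: the text does not state which formula was used, so this sketch uses the usual large-sample variance of a log odds ratio (the sum of reciprocal cell counts) propagated through the weighted mean. The study tables are hypothetical.

```python
import math


def log_odds_ratio(a, b, c, d, correction=0.5):
    """Natural log of the odds ratio for a 2x2 table.

    a: HIV+ with disorder, b: HIV+ without disorder,
    c: HIV- with disorder, d: HIV- without disorder.
    `correction` is added to every cell so empty cells stay defined.
    """
    a, b, c, d = (x + correction for x in (a, b, c, d))
    return math.log((a * d) / (b * c))


def log_or_variance(a, b, c, d, correction=0.5):
    """Large-sample variance of the log odds ratio (assumed, not from the text)."""
    a, b, c, d = (x + correction for x in (a, b, c, d))
    return 1 / a + 1 / b + 1 / c + 1 / d


def weighted_mean_effect(tables):
    """Sample-size-weighted mean log odds ratio and its standard error."""
    weights = [sum(t) for t in tables]  # weight = number of subjects
    effects = [log_odds_ratio(*t) for t in tables]
    variances = [log_or_variance(*t) for t in tables]
    total = sum(weights)
    mean = sum(w * e for w, e in zip(weights, effects)) / total
    # Variance of a weighted mean of independent estimates (assumption)
    se = math.sqrt(sum(w ** 2 * v for w, v in zip(weights, variances))) / total
    return mean, se


# Hypothetical study tables (a, b, c, d); the second has an empty cell
tables = [(12, 88, 5, 95), (0, 50, 2, 48), (30, 270, 10, 190)]
mean, se = weighted_mean_effect(tables)
low, high = mean - 1.96 * se, mean + 1.96 * se
print(f"mean log OR = {mean:.3f}, SE = {se:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

A confidence interval that excludes zero corresponds to a statistically significant average effect, which is the test the text describes.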

We used another technique as a check of the effect-size method and to estimate the probability that a relationship was observed merely by chance. Many methods have been developed to combine the probability levels from multiple studies (see reference 24). Of these, the inverse normal method (25, pp. 39–40) is routinely applicable and has the advantage of being able to incorporate weights based on the number of subjects. This method converts the probability levels from each investigation into z scores.

### Am J Psychiatry 158:5, May 2001

JEFFREY A. CIESLA AND JOHN E. ROBERTS

These scores are weighted, summed, and divided by the square root of the number of studies. The probability associated in a normal distribution with the obtained value of z becomes the overall probability level of the observed relationship. In contrast to the technique that uses effect sizes, the inverse normal method provides a precise index of probability, although it does not provide an index of the strength of the observed relationship.
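A minimal sketch of the inverse normal method follows. It makes two assumptions the text does not confirm: the study weights are the raw sample sizes, and the weighted sum is normalized by the square root of the sum of squared weights, which is the common weighted form. With equal weights this reduces to dividing the summed z scores by the square root of the number of studies, as described above. The p values and sample sizes are hypothetical.

```python
import math
from statistics import NormalDist


def inverse_normal_combine(p_values, weights=None):
    """Combine one-sided p values with the weighted inverse normal method.

    Each p value must lie strictly between 0 and 1; otherwise
    NormalDist.inv_cdf raises statistics.StatisticsError.
    Returns the combined z score and its one-sided probability.
    """
    if weights is None:
        weights = [1.0] * len(p_values)  # unweighted case
    if len(weights) != len(p_values):
        raise ValueError("need exactly one weight per p value")

    std_normal = NormalDist()
    # Convert each one-sided p value into a z score
    z_scores = [std_normal.inv_cdf(1.0 - p) for p in p_values]

    # Weighted sum, normalized so the result is standard normal under the null
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    combined_z = numerator / denominator

    combined_p = 1.0 - std_normal.cdf(combined_z)
    return combined_z, combined_p


# Hypothetical one-sided p values and sample sizes for three studies
p_values = [0.04, 0.20, 0.01]
sample_sizes = [120, 60, 300]
z, p = inverse_normal_combine(p_values, sample_sizes)
print(f"combined z = {z:.2f}, one-sided p = {p:.4f}")
```

As the text notes, the result is a probability level only; it says nothing about the size of the association, which is why it serves as a check on the effect-size method and not as a replacement for it.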

# Results

## HIV and Presence of Depressive Disorders

Our first question was whether there was a relationship between HIV status and the presence of major depressive disorder. Stated another way, are HIV-positive individuals at a higher relative risk for developing major depressive disorder than HIV-negative individuals? Using the vote-counting method, we found a highly significant relationship (χ²=14.04, df=1, N=2,596, p<0.001). Whereas 9.4% of HIV-positive participants (N=160 of 1,700) met criteria for current major depressive disorder, only 5.2% of the comparison participants (N=47 of 896) did. The effect-size method showed that the average weighted effect size was approximately 0.69 (a moderate to large effect size [17]), with a standard error of 0.21. By transforming this statistic into an odds ratio, we found that HIV-positive individuals were 1.99 times more likely to be diagnosed with major depressive disorder than HIV-negative individuals. The associated 95% confidence interval for the effect size was 0.28–1.10 (significant at p<0.05), which translates into a confidence interval of 1.32–3.00 for the odds ratio. The associated 99% confidence interval for the effect size was 0.15–1.23. Average weighted rates of major depressive disorder were 8.1% for the HIV-positive group and 5.2% for the HIV-negative group. Finally, to estimate the probability of this finding given a true null relationship, we employed the inverse normal method. The result was a highly significant relationship (p[z≥3.79]<0.0001, N=10). All three meta-analytic methods converged on the conclusion that there is a statistically significant relationship between the risk for major depressive disorder and HIV status.
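The arithmetic linking these figures can be checked with a short sketch. It uses only numbers quoted in the paragraph above (the pooled counts, the effect size of 0.69, and the standard error of 0.21) and assumes normal-theory confidence intervals and an uncorrected Pearson chi-square; neither choice is stated in the text. The function names are illustrative.

```python
import math
from statistics import NormalDist


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square for a 2x2 table, no continuity correction (assumed)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


def confidence_interval(estimate, se, level):
    """Normal-theory confidence interval at the given level (e.g. 0.95)."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return estimate - z * se, estimate + z * se


# Pooled counts quoted in the text
pos_cases, pos_total = 160, 1700  # HIV-positive participants
neg_cases, neg_total = 47, 896    # HIV-negative comparison participants
chi2 = chi_square_2x2(pos_cases, pos_total - pos_cases,
                      neg_cases, neg_total - neg_cases)

# Average weighted effect size (log odds ratio) and standard error from the text
effect, se = 0.69, 0.21
ci95 = confidence_interval(effect, se, 0.95)
ci99 = confidence_interval(effect, se, 0.99)

# Back-transform from the log scale to the odds-ratio scale
odds_ratio = math.exp(effect)
or_ci95 = tuple(math.exp(bound) for bound in ci95)

print(f"chi-square = {chi2:.2f}")
print(f"95% CI (log OR) = {ci95[0]:.2f} to {ci95[1]:.2f}")
print(f"99% CI (log OR) = {ci99[0]:.2f} to {ci99[1]:.2f}")
print(f"odds ratio = {odds_ratio:.2f}, 95% CI = {or_ci95[0]:.2f} to {or_ci95[1]:.2f}")
```

Under these assumptions the intervals and the odds ratio agree with the reported values to within rounding. The chi-square computed from the quoted counts comes out near 13.9, close to but not identical with the reported 14.04; the sketch cannot determine whether that gap reflects rounding or the exact counts the authors used.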

An important limitation of any literature review is what has become known as the file-drawer problem (27), in which studies with significant results may be more likely to find their way into academic journals, whereas studies with null results may be more likely to remain in the file drawers of the investigators. A procedure has been proposed by Orwin (28) to calculate the number of null results that are necessary to reduce the average effect size to a negligible level. Using this method, we found that the fail-safe N for the relationship between major depressive disorder and HIV status was 17. Thus, 17 studies with null results would be needed to overturn the previously significant effect (to raise the probability level above 0.05). Although this number is not impressively large, a few things must be remembered. First, these 17 studies would need to have an average weight equal to the existing average weight for the 10 studies included in the analysis. Put
