An effect size statistic d (Hedges & Olkin, 1985) was calculated for each relevant outcome by subtracting the mean score for comparison participants from the mean score for siblings of children with a chronic illness and dividing that difference by a pooled standard deviation. Normative data provided by the primary authors in the published studies were substituted for data from comparison participants when the latter were not provided. If means and standard deviations were not reported, effect sizes were calculated from summary statistics (e.g., t statistics, p values) by employing the meta-analysis software package D-Stat (Johnson, 1989).
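As an illustration of the computation just described, the following is a minimal sketch rather than the procedure implemented in D-Stat. The function names are hypothetical, the small-sample correction factor is an assumption based on the usual Hedges and Olkin approximation, and the conversion from t assumes an independent-groups t statistic. Scores are assumed to be keyed so that higher values indicate more positive functioning.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))


def effect_size_d(mean_sib, sd_sib, n_sib, mean_cmp, sd_cmp, n_cmp, correct=True):
    """Standardized mean difference: (sibling mean - comparison mean) / pooled SD.

    Negative values mean the sibling group scored lower than the comparison group.
    """
    d = (mean_sib - mean_cmp) / pooled_sd(sd_sib, n_sib, sd_cmp, n_cmp)
    if correct:
        # Assumed small-sample correction: J = 1 - 3 / (4 * df - 1), df = n1 + n2 - 2.
        d *= 1 - 3 / (4 * (n_sib + n_cmp - 2) - 1)
    return d


def d_from_t(t, n1, n2):
    """Uncorrected d recovered from an independent-groups t statistic."""
    return t * math.sqrt(1 / n1 + 1 / n2)
```

For example, with hypothetical sibling scores of M = 10, SD = 2, n = 20 and comparison scores of M = 11, SD = 2, n = 20, the uncorrected value is d = -0.5.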
Effect sizes were weighted by the reciprocal of their variance as recommended by Hedges and Olkin (1985). When no data were reported in a primary study but the difference between the sibling and comparison groups was said to be nonsignificant, an effect size of zero was recorded. For all analyses, negative effect sizes reflect less positive functioning for siblings of children with a chronic illness relative to comparison children or normative data. Effect sizes from the same study, chronic illness, dependent measure category, and method of data collection were combined and averaged.
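The weighting scheme can be sketched as follows. This is a hedged illustration, not the authors' code: the function names are hypothetical, the variance formula is the standard large-sample approximation for d, and the 1.96 multiplier assumes a normal-theory 95% confidence interval.

```python
import math


def d_variance(d, n1, n2):
    """Large-sample approximation to the sampling variance of d."""
    return (n1 + n2) / (n1 * n2) + d**2 / (2 * (n1 + n2))


def weighted_mean_effect(ds, variances, z=1.96):
    """Inverse-variance weighted mean effect size and its confidence interval."""
    weights = [1 / v for v in variances]  # each weight is the reciprocal of the variance
    total = sum(weights)
    mean = sum(w * d for w, d in zip(weights, ds)) / total
    se = math.sqrt(1 / total)  # standard error of the weighted mean
    return mean, (mean - z * se, mean + z * se)
```

Under this scheme an effect size of zero recorded for an unreported nonsignificant result still needs a variance, which here would be computed from the group sizes alone because the d-dependent term vanishes.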
The resulting set of 103 outcome-level effect sizes was evaluated for statistical significance (i.e., whether the 95% confidence interval excluded zero) and for homogeneity (Hedges & Olkin, 1985). Effect sizes aggregated across the 51 studies were also examined where appropriate. The overall test for homogeneity (QT) assesses whether a set of effect sizes is internally consistent. For most meta-analyses, homogeneity of the set of effect sizes is not achieved without some combination of outlier analysis and partitioning of effect sizes into smaller clusters on the basis of moderator variables.
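A minimal sketch of the overall homogeneity statistic is given below, assuming the usual definition of QT as the weighted sum of squared deviations of the effect sizes from their inverse-variance weighted mean. The function name is hypothetical. Under the hypothesis of homogeneity the statistic is referred to a chi-square distribution with k - 1 degrees of freedom, where k is the number of effect sizes; the lookup of the critical value is left out of the sketch.

```python
def homogeneity_q(ds, variances):
    """Return (Q, df) for a set of effect sizes and their sampling variances."""
    weights = [1 / v for v in variances]
    mean = sum(w * d for w, d in zip(weights, ds)) / sum(weights)
    # Weighted sum of squared deviations from the weighted mean effect size.
    q = sum(w * (d - mean) ** 2 for w, d in zip(weights, ds))
    return q, len(ds) - 1
```

A Q value exceeding the chi-square critical value for its degrees of freedom would indicate that the effect sizes vary more than sampling error alone would predict.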
The identification and removal of outliers are appropriate if homogeneity can be achieved by deleting no more than 20% of the effect sizes (Hedges & Olkin, 1985). Regardless of the outcome of the overall test of homogeneity, however, tests of moderator variables are justified when based on theoretical considerations (see Hall & Rosenthal, 1991). After the overall test for homogeneity, effect size clusters were created on the basis of moderator variables (e.g., method of data collection).
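Clustering by a categorical moderator partitions the total homogeneity statistic into a between-cluster component and a within-cluster component. The sketch below assumes the standard fixed-effects partition, in which the total equals the sum of the two components; the function names and the cluster labels in the usage note are hypothetical.

```python
def _weighted_summary(pairs):
    """Return (weighted mean, total weight, Q) for a list of (d, variance) pairs."""
    weights = [1 / v for _, v in pairs]
    total = sum(weights)
    mean = sum(w * d for w, (d, _) in zip(weights, pairs)) / total
    q = sum(w * (d - mean) ** 2 for w, (d, _) in zip(weights, pairs))
    return mean, total, q


def partition_q(clusters):
    """Split total Q into between-cluster and within-cluster parts.

    clusters maps each moderator level to a list of (d, variance) pairs.
    Returns (q_between, q_within_by_cluster, q_total).
    """
    all_pairs = [pair for pairs in clusters.values() for pair in pairs]
    grand_mean, _, q_total = _weighted_summary(all_pairs)
    q_within = {}
    q_between = 0.0
    for level, pairs in clusters.items():
        mean, total, q = _weighted_summary(pairs)
        q_within[level] = q  # homogeneity inside this cluster
        q_between += total * (mean - grand_mean) ** 2  # distance of cluster mean from grand mean
    return q_between, q_within, q_total
```

With two hypothetical clusters, `{"parent": [(-0.4, 0.1), (-0.2, 0.1)], "self": [(0.0, 0.1), (0.2, 0.1)]}`, the between-cluster value plus the two within-cluster values reproduces the total for the pooled set of four effect sizes.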
The homogeneity of effect sizes within clusters (QW) and differences between mean effect sizes across clusters (QB) were calculated. A significant QB value implies differences in the mean effect sizes associated with the effect size clusters. Interpretation of such an outcome is less clear if there are significant differences in effect sizes within one or more clusters (the QW statistic for each cluster). When moderator variables were continuous (e.g., sample size), correlations between effect sizes and the moderator variables were calculated.
Results
The results are divided into three sections. The first section reports on tests of effect sizes: tests of the magnitude of mean effect sizes, tests for publication bias, and tests of homogeneity of effect sizes. The second section examines the role of methodological moderator variables, specifically, year of publication, method of data collection, and comparison group versus normative data. The third section considers substantive moderator variables, specifically, categories of dependent measures, differences by chronic illness, and effects of gender, birth order, and age of sibling.