The NIMH Multisite HIV Prevention Trial: Reducing HIV Sexual Risk Behavior

The National Institute of Mental Health (NIMH) Multisite HIV Prevention Trial Group

Imputation of 12 Month Outcomes

**INTRODUCTION**

Several methods were used to impute values for missing responses for the 12 month outcomes in the NIMH Multisite HIV Prevention Trial. The analysis of the 12 month study outcomes was then repeated using the data containing imputed values for those missing, in order to compare the intervention effect to that obtained using the raw data. Overall, approximately 10% to 20% of the 12 month values were imputed.

**LAST OBSERVATION FORWARD**

First, a last observation forward (LOF) approach was tried. If the 12 month outcome was missing, the most recent non-missing outcome (6 or 3 months) was used as the imputed value for the 12 month outcome. To be included in this analysis, eligible participants must have completed the baseline interview and at least one of the follow-up interviews. Therefore, participants with only the baseline interview were excluded. Models were fit to the data with imputed values for the main study outcomes at 12 months: unprotected acts, proportion condom use, and consistent condom use. All models included effects for study population, ERG, baseline level of the outcome, and intervention assignment. Both the overall and the by-population results were very similar to those from the models based on the data with no imputed values (Table A.1). A statistically significant intervention effect for unprotected acts, proportion condom use, and consistent condom use remained after this imputation.

**BASELINE VALUE FORWARD**

Next, the baseline (before intervention) value of the outcome was used as the imputed value when the 12 month outcome was missing. All eligible, randomized participants with a baseline value for the outcome were included in this analysis. Models as above were again fit to the data with imputed values for the main study outcomes at 12 months. Statistically significant intervention effects continued to be seen for unprotected acts, proportion condom use, and consistent condom use both overall and for each population (Table A.2).

**GREEN'S METHOD**

Finally, a strategy outlined by Green, et al. was tried in order to impute missing values for some of the endpoints at 12 months. This method is described in the paper "Community Intervention Trial for Smoking Cessation (COMMIT): I. Cohort Results from a Four-Year Community Intervention" by The COMMIT Research Group, *American Journal of Public Health*, February 1995, Vol. 85, No. 2, pp. 183-192. Eligible participants who completed a baseline interview and participated in the randomized trial were included in this analysis. It was not necessary for these participants to have completed any of the follow-up interviews in order to be included.

First, a value was imputed for missing values of consistent condom use by creating strata within the treatment groups (control and intervention) based on consistent condom use or missing data at 3 and 6 months. Specifically, 9 strata were defined among control and 9 among intervention, one for every possible response pattern to consistent condom use at 3 and 6 months (y/y, y/n, y/missing, n/y, n/n, n/missing, missing/y, missing/n, missing/missing). For the first 8 strata, a probability of consistent condom use was calculated as the mean of consistent condom use among those with a non-missing 12 month outcome in each stratum. This probability was used to impute a value where the outcome was missing. A random number between 0 and 1 was generated from the uniform distribution; if the number was less than or equal to the probability, consistent condom use was assigned a 1 (yes), and if the number was greater than the probability, the missing value was assigned a 0 (no).

For people missing the outcome at 3, 6 and 12 months (part of the missing/missing stratum), new strata were defined based on two baseline variables selected as the most important for predicting consistent condom use at 12 months in a stepwise logistic regression procedure. The logistic model was fit to consistent condom use at 12 months and included only these baseline variables (no intervention effect was included): female (yes/no), age category (<35 / 35+), HS degree (y/n), employed (y/n), black (y/n), Hispanic (y/n), never married (y/n), CAGE score (0 or 1 / 2+), drug use in the last 90 days (y/n), injected drugs in the last 90 days (y/n), number of partners (0 or 1 / > 1), frequency of risky acts in the last 90 days (1-10 / 11+), and proportion condom use (0-49% / 50-99%). The categorical variables baseline frequency of unprotected acts and baseline proportion condom use were the most significant and were used to define 4 new strata. Consistent condom use probabilities were then calculated for the 4 strata, separately for intervention and control, as above, using the data for everyone with a non-missing 12 month outcome (including those with some or all of the intermediate data, which increased the sample size beyond those missing both the 3 and 6 month values and gave more stable estimates). These probabilities were then used in the manner described above to impute 12 month values for consistent condom use for the people with no 3, 6 or 12 month outcomes.

Models were fit to consistent condom use at 12 months using the data with imputed values for those originally missing, both overall and separately for each population. Effects for study population, ERG, and intervention assignment were included. A statistically significant intervention effect continued to be seen for consistent condom use (Table A.3).

Next, values for missing proportion condom use at 12 months were imputed in a manner similar to that described above. In order to define imputation strata for this outcome, proportion condom use was categorized into 2 categories: <.50 and .50+. The mean proportion condom use in each stratum was used directly as the imputed value. Models were fit to proportion condom use at 12 months using the data with imputed values, and again significant intervention effects were seen both overall and for each population (Table A.3).
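The imputation of consistent condom use described in this section can be illustrated with a minimal Python sketch. It is not the trial's actual code: the record layout and the field names (`arm`, `m3`, `m6`, `m12`, `acts_cat`, `prop_cat`) are hypothetical, `None` stands for a missing response, and the two baseline categories are assumed to be already computed.

```python
import random
from collections import defaultdict


def observed_rate(records, key_fn):
    """Mean of the observed 12-month outcome (0/1) within each stratum."""
    sums = defaultdict(lambda: [0, 0])  # stratum -> [number of 1s, count]
    for r in records:
        if r["m12"] is not None:
            cell = sums[key_fn(r)]
            cell[0] += r["m12"]
            cell[1] += 1
    return {k: ones / n for k, (ones, n) in sums.items()}


def followup_key(r):
    # Treatment arm plus the 3- and 6-month response pattern (9 patterns per arm).
    return (r["arm"], r["m3"], r["m6"])


def baseline_key(r):
    # Treatment arm plus the two baseline categories (4 strata per arm).
    return (r["arm"], r["acts_cat"], r["prop_cat"])


def impute_consistent_use(records, rng):
    """Return the 12-month outcome for each record, imputing where missing."""
    p_follow = observed_rate(records, followup_key)
    p_base = observed_rate(records, baseline_key)  # uses everyone with a 12-month value
    result = []
    for r in records:
        y = r["m12"]
        if y is None:
            if r["m3"] is None and r["m6"] is None:
                p = p_base.get(baseline_key(r))    # no follow-up data at all
            else:
                p = p_follow.get(followup_key(r))  # one of the first 8 patterns
            if p is not None:
                # Uniform draw on [0, 1): assign 1 if draw <= probability, else 0.
                y = 1 if rng.random() <= p else 0
        result.append(y)
    return result


# Hypothetical toy data; probabilities are 0 or 1 so the result is deterministic.
records = [
    {"arm": "I", "m3": 1, "m6": 1, "m12": 1, "acts_cat": "1-10", "prop_cat": "50+"},
    {"arm": "I", "m3": 1, "m6": 1, "m12": 1, "acts_cat": "1-10", "prop_cat": "50+"},
    {"arm": "I", "m3": 1, "m6": 1, "m12": None, "acts_cat": "1-10", "prop_cat": "50+"},
    {"arm": "C", "m3": 0, "m6": 0, "m12": 0, "acts_cat": "11+", "prop_cat": "<50"},
    {"arm": "C", "m3": 0, "m6": 0, "m12": None, "acts_cat": "11+", "prop_cat": "<50"},
    {"arm": "C", "m3": None, "m6": None, "m12": None, "acts_cat": "11+", "prop_cat": "<50"},
    {"arm": "I", "m3": None, "m6": None, "m12": None, "acts_cat": "1-10", "prop_cat": "50+"},
]
imputed = impute_consistent_use(records, random.Random(42))
```

Because each missing value is filled by a random draw, repeated runs with different seeds give different imputed data sets; the fixed seed here only makes the toy example reproducible.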

**CONCLUSION**

Thus, missing 12-month outcomes for the trial were imputed by three different methods. Intervention effects were then compared between analyses using only the observed 12-month outcomes and analyses using the observed plus imputed 12-month outcomes. The significant intervention effects seen with the observed data and reported in the manuscript were still present, overall and for the three study populations, when the imputed data were used.

**Table A.1**

| | | Participants with Baseline and 12 Months | | | Participants with Baseline and at Least 1 Follow-up (12 Month Outcome Imputed if Missing)^{3} | | |
|---|---|---|---|---|---|---|---|
| Population | Group | N | Adjusted Mean^{1} | P-value^{2} | N^{4} | Adjusted Mean | P-value |
| **Number of Unprotected Acts** | | | | | | | |
| Total | Control | 1,438 | 17.2 | 0.0001 | 1,672 | 17.2 | 0.0001 |
| | Intervention | 1,453 | 12.3 | | 1,679 | 12.5 | |
| STD-Male | Control | 550 | 16.0 | 0.0003 | 657 | 16.5 | 0.0009 |
| | Intervention | 575 | 11.6 | | 684 | 12.5 | |
| STD-Female | Control | 333 | 20.4 | 0.008 | 401 | 20.1 | 0.0004 |
| | Intervention | 334 | 13.8 | | 388 | 13.0 | |
| Women | Control | 555 | 16.0 | 0.0001 | 614 | 15.8 | 0.0002 |
| | Intervention | 544 | 11.5 | | 607 | 11.7 | |
| **Proportion Condom Use** | | | | | | | |
| Total | Control | 1,438 | 0.47 | 0.0001 | 1,672 | 0.48 | 0.0001 |
| | Intervention | 1,453 | 0.60 | | 1,679 | 0.60 | |
| STD-Male | Control | 550 | 0.52 | 0.0001 | 657 | 0.53 | 0.0002 |
| | Intervention | 575 | 0.62 | | 684 | 0.62 | |
| STD-Female | Control | 333 | 0.44 | 0.0001 | 401 | 0.44 | 0.0001 |
| | Intervention | 334 | 0.59 | | 388 | 0.59 | |
| Women | Control | 555 | 0.44 | 0.0001 | 614 | 0.45 | 0.0001 |
| | Intervention | 544 | 0.57 | | 607 | 0.58 | |
| **Consistent Condom Use** | | | | | | | |
| Total | Control | 1,438 | 33.5 | 0.0001 | 1,672 | 33.7 | 0.0001 |
| | Intervention | 1,453 | 42.6 | | 1,679 | 42.1 | |
| STD-Male | Control | 550 | 36.1 | 0.004 | 657 | 36.7 | 0.006 |
| | Intervention | 575 | 44.5 | | 684 | 44.1 | |
| STD-Female | Control | 333 | 32.9 | 0.04 | 401 | 31.0 | 0.007 |
| | Intervention | 334 | 40.8 | | 388 | 40.2 | |
| Women | Control | 555 | 31.5 | 0.0004 | 614 | 32.5 | 0.0008 |
| | Intervention | 544 | 41.8 | | 607 | 41.8 | |

^{1}Mean adjusted for study population (Total only), ERG, and baseline level of the endpoint in a linear or logistic model.

^{2}P-value for a test of intervention versus control based on square-root transformed outcome for Number of Unprotected Acts.

^{3}If the 12 month outcome was missing, the most recent non-missing follow-up value was used (6 or 3 months). Overall 14% of
the 12 month values were imputed.

^{4}Of those participants eligible for the 12 month follow-up, 305 did not have a 3, 6 or 12 month visit and could not be included.
50 participants are not included due to missing baseline values of the outcomes.
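The last observation forward rule in footnote 3 can be sketched as follows. This is only an illustration under assumed names: the function `last_observation_forward`, the dictionary keys `m3`, `m6`, `m12`, and the toy values are hypothetical, and `None` stands for a missing interview response.

```python
def last_observation_forward(m3, m6, m12):
    """Return (value, was_imputed) for the 12-month outcome.

    If the 12-month value is missing, fall back to the most recent
    non-missing follow-up value: 6 months first, then 3 months.
    """
    if m12 is not None:
        return m12, False
    for earlier in (m6, m3):  # most recent follow-up first
        if earlier is not None:
            return earlier, True
    # No follow-up value at all: the participant cannot be included.
    return None, False


# Hypothetical counts of unprotected acts at each follow-up.
participants = [
    {"id": "a", "m3": 10, "m6": 8, "m12": 5},
    {"id": "b", "m3": 12, "m6": 9, "m12": None},
    {"id": "c", "m3": 7, "m6": None, "m12": None},
    {"id": "d", "m3": None, "m6": None, "m12": None},
]
lof = {
    p["id"]: last_observation_forward(p["m3"], p["m6"], p["m12"])
    for p in participants
}
```

Participant `d` has no follow-up interview, which corresponds to the participants in footnote 4 who could not be included in this analysis.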

**Table A.2**

| | | Participants with Baseline and 12 Months | | | Participants with Baseline (12 Month Outcome Imputed if Missing)^{3} | | |
|---|---|---|---|---|---|---|---|
| Population | Group | N | Adjusted Mean^{1} | P-value^{2} | N^{4} | Adjusted Mean | P-value |
| **Number of Unprotected Acts** | | | | | | | |
| Total | Control | 1,438 | 17.2 | 0.0001 | 1,835 | 19.9 | 0.0001 |
| | Intervention | 1,453 | 12.3 | | 1,821 | 15.4 | |
| STD-Male | Control | 550 | 16.0 | 0.0003 | 767 | 20.6 | 0.0007 |
| | Intervention | 575 | 11.6 | | 766 | 17.0 | |
| STD-Female | Control | 333 | 20.4 | 0.008 | 427 | 21.2 | 0.01 |
| | Intervention | 334 | 13.8 | | 423 | 16.1 | |
| Women | Control | 555 | 16.0 | 0.0001 | 641 | 18.4 | 0.0001 |
| | Intervention | 544 | 11.5 | | 632 | 12.9 | |
| **Proportion Condom Use** | | | | | | | |
| Total | Control | 1,438 | 0.47 | 0.0001 | 1,835 | 0.42 | 0.0001 |
| | Intervention | 1,453 | 0.60 | | 1,821 | 0.52 | |
| STD-Male | Control | 550 | 0.52 | 0.0001 | 767 | 0.44 | 0.0001 |
| | Intervention | 575 | 0.62 | | 766 | 0.53 | |
| STD-Female | Control | 333 | 0.44 | 0.0001 | 427 | 0.39 | 0.0001 |
| | Intervention | 334 | 0.59 | | 423 | 0.51 | |
| Women | Control | 555 | 0.44 | 0.0001 | 641 | 0.41 | 0.0001 |
| | Intervention | 544 | 0.57 | | 632 | 0.52 | |
| **Consistent Condom Use** | | | | | | | |
| Total | Control | 1,438 | 33.5 | 0.0001 | 1,850 | 26.2 | 0.0001 |
| | Intervention | 1,453 | 42.6 | | 1,841 | 33.9 | |
| STD-Male | Control | 550 | 36.1 | 0.004 | 776 | 25.1 | 0.001 |
| | Intervention | 575 | 44.5 | | 777 | 32.5 | |
| STD-Female | Control | 333 | 32.9 | 0.04 | 431 | 25.3 | 0.02 |
| | Intervention | 334 | 40.8 | | 429 | 32.3 | |
| Women | Control | 555 | 31.5 | 0.0004 | 643 | 27.2 | 0.001 |
| | Intervention | 544 | 41.8 | | 635 | 35.8 | |

^{1}Mean adjusted for study population (Total only), ERG, and baseline level of the endpoint (except Consistent Condom Use) in
a linear or logistic model.

^{2}P-value for a test of intervention versus control.

^{3}If the 12 month outcome was missing, it was imputed using the baseline (before intervention) value of the outcome.

^{4}50 eligible participants are not included for number of unprotected acts and proportion condom use due to missing baseline
values of the outcome.
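The baseline value forward rule in footnote 3 amounts to assuming no change from baseline for anyone without a 12 month value. A minimal sketch, with a hypothetical function name and toy `(baseline, month12)` pairs in which `None` marks a missing value:

```python
def baseline_value_forward(baseline, month12):
    """Use the observed 12-month value, else carry the baseline value forward."""
    if month12 is not None:
        return month12
    return baseline


# Hypothetical (baseline, 12-month) pairs for one outcome.
rows = [(20, 12), (15, None), (None, None)]

# Only participants with a baseline value for the outcome enter the analysis.
analysis_values = [
    baseline_value_forward(baseline, month12)
    for baseline, month12 in rows
    if baseline is not None
]
```

The third pair is dropped because it has no baseline value, mirroring the exclusion described in footnote 4.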

**Table A.3**

| | | Participants with Baseline and 12 Months | | | Participants with Baseline (12 Month Outcome Imputed if Missing)^{3} | | |
|---|---|---|---|---|---|---|---|
| Population | Group | N | Adjusted Mean^{1} | P-value^{2} | N^{4} | Adjusted Mean | P-value |
| **Proportion Condom Use** | | | | | | | |
| Total | Control | 1,438 | 0.47 | 0.0001 | 1,835 | 0.47 | 0.0001 |
| | Intervention | 1,453 | 0.60 | | 1,821 | 0.60 | |
| STD-Male | Control | 550 | 0.52 | 0.0001 | 767 | 0.52 | 0.0001 |
| | Intervention | 575 | 0.62 | | 766 | 0.62 | |
| STD-Female | Control | 333 | 0.44 | 0.0001 | 427 | 0.44 | 0.0001 |
| | Intervention | 334 | 0.59 | | 423 | 0.60 | |
| Women | Control | 555 | 0.44 | 0.0001 | 641 | 0.45 | 0.0001 |
| | Intervention | 544 | 0.57 | | 632 | 0.58 | |
| **Consistent Condom Use** | | | | | | | |
| Total | Control | 1,438 | 33.5 | 0.0001 | 1,852 | 33.8 | 0.0001 |
| | Intervention | 1,453 | 42.6 | | 1,847 | 42.6 | |
| STD-Male | Control | 550 | 36.1 | 0.004 | 778 | 35.4 | 0.0006 |
| | Intervention | 575 | 44.5 | | 780 | 44.0 | |
| STD-Female | Control | 333 | 32.9 | 0.04 | 431 | 33.3 | 0.01 |
| | Intervention | 334 | 40.8 | | 431 | 41.4 | |
| Women | Control | 555 | 31.5 | 0.0004 | 643 | 31.9 | 0.0002 |
| | Intervention | 544 | 41.8 | | 636 | 41.8 | |

^{1}Mean adjusted for study population (Total only), ERG, and baseline level of the endpoint (except Consistent Condom Use) in
a linear or logistic model.

^{2}P-value for a test of intervention versus control.

^{3}If the 12 month outcome was missing, it was imputed using responses within strata defined by responses at 3 and 6 months.
This method is outlined in "Community Intervention Trial for Smoking Cessation (COMMIT): I. Cohort Results from a Four-Year
Community Intervention," *American Journal of Public Health*, February 1995, Vol. 85, No. 2.

^{4}50 eligible participants are not included for proportion condom use due to missing baseline values of the outcome.
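For proportion condom use, the stratum-based imputation referenced in footnote 3 uses the stratum mean directly rather than a random draw. The sketch below illustrates that step only, under assumed names (`arm`, `m3`, `m6`, `m12` are hypothetical fields, `None` marks a missing value); the separate baseline-variable strata used for participants with no follow-up data are not shown.

```python
from collections import defaultdict
from statistics import mean


def category(p):
    """Two-level category of proportion condom use, plus a missing level."""
    if p is None:
        return "missing"
    return "<.50" if p < 0.50 else ".50+"


def stratum(r):
    # Treatment arm plus the categorized 3- and 6-month responses.
    return (r["arm"], category(r["m3"]), category(r["m6"]))


def impute_proportion(records):
    """Fill a missing 12-month proportion with the observed mean of its stratum."""
    observed = defaultdict(list)
    for r in records:
        if r["m12"] is not None:
            observed[stratum(r)].append(r["m12"])
    stratum_mean = {k: mean(v) for k, v in observed.items()}

    filled = []
    for r in records:
        value = r["m12"]
        if value is None:
            # Stays None when the stratum has no observed 12-month values.
            value = stratum_mean.get(stratum(r))
        filled.append(value)
    return filled


# Hypothetical toy data.
records = [
    {"arm": "intervention", "m3": 0.7, "m6": 0.9, "m12": 0.8},
    {"arm": "intervention", "m3": 0.6, "m6": 0.5, "m12": 0.6},
    {"arm": "intervention", "m3": 0.9, "m6": 0.8, "m12": None},
    {"arm": "control", "m3": 0.2, "m6": None, "m12": 0.3},
    {"arm": "control", "m3": 0.1, "m6": None, "m12": None},
]
filled = impute_proportion(records)
```

Using the mean directly gives every imputed participant in a stratum the same value, so unlike the random-draw rule for consistent condom use, the result does not vary between runs.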