Science 24 May 2002:
Vol. 296. no. 5572, pp. 1439 - 1443
DOI: 10.1126/science.1069660


Evidence of HIV-1 Adaptation to HLA-Restricted Immune Responses at a Population Level
Corey B. Moore, Mina John, Ian R. James, Frank T. Christiansen, Campbell S. Witt, and Simon A. Mallal

Supplementary Material


Polymorphism rate and functional constraint in HIV-1 RT

The relationship between polymorphism rate at single residues in HIV-1 RT and the known functional characteristics of the residues was examined (1). The polymorphism rates at the critical catalytic residues in HIV-1 RT (n=3, 0.53%), stability residues (n=37, 1.06%) and functional residues (n=11, 3.05%) were lower than at external residues (n=10, 5.95%) (P=0.0009, Wilcoxon).

Statistical methods

Epipop was designed with a direct link to the WA HIV Cohort Study electronic database to facilitate analyses based on Fisher's exact tests and logistic regression models. Power calculations, covariate selection procedures and randomisation procedures are described in detail below.


Steps in Epipop analysis at a single amino acid: an example using position 135 of HIV-1 RT

Set the outcome/response variable as any substitution of the population sequence consensus amino acid (isoleucine) at position 135 of HIV-1 RT, i.e. I135x. Set the starting covariates/explanatory variables as all HLA-A and -B alleles present in all individuals (n=473): A1, A2, A3, A9, A10, A11, A19, A28, A31, A36, B5, B7, B8, B12, B13, B14, B15, B16, B17, B18, B21, B22, B27, B35, B37, B40, B41, B42, B55, B56, B58, B60, B61. Serologically defined broad alleles were considered, rather than subtypes defined by high resolution DNA sequence based typing, so that data on all individuals in the cohort could be included. Furthermore, for several published CTL epitopes in HIV-1 RT, the HLA restriction of the epitope to the level of high resolution typing is not known.

Step 1-Power calculations

Formal power calculations effectively exclude at the outset any HLA allele/position combinations for which there is insufficient statistical power (because of rarity of polymorphism, rarity of HLA allele or both) to be realistically examined for association. This considerably restricts the number of covariates and therefore the number of comparisons made within models. Power calculations also formally identify which HLA associations cannot be excluded by our analysis and would need examination in a larger dataset. Standard formulae are used for power calculations (2). The numbers of patients with each HLA allele and with I135x are used to calculate the power to detect an association with an odds ratio (OR) of 2 (positive association) or 0.5 (negative association). HLA alleles with less than 30% power are removed. The removed alleles at position 135 are A31, A36, B42, B55, B56, B58 and B61. It is important to note that we had less power to detect negative associations than positive associations. For example, at the mean HLA frequency of 10.9% and mean polymorphism rate of 4.0%, we had 30% power to detect an OR of 2.0 (i.e. a positive association) but only 5.6% power to detect an equivalent negative OR of 0.5.
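The asymmetry between positive and negative associations can be illustrated with a rough calculation. The sketch below is not the formula from reference 2; it assumes a normal-approximation test of two proportions, splits the overall polymorphism rate into carrier and non-carrier rates consistent with a given OR, and considers power only in the direction of the true effect. The function names are hypothetical, and the results need not reproduce the 30% and 5.6% figures quoted above exactly.

```python
from statistics import NormalDist


def carrier_rates(allele_freq, polymorphism_rate, odds_ratio):
    """Split an overall polymorphism rate into (non-carrier, carrier) rates.

    Finds p0 and p1 whose odds ratio is `odds_ratio` and whose
    frequency-weighted mean equals `polymorphism_rate`.
    """
    def p1_from(p0):
        return odds_ratio * p0 / (1.0 - p0 + odds_ratio * p0)

    lo, hi = 0.0, 1.0
    for _ in range(100):  # bisection: the weighted mean rises with p0
        mid = (lo + hi) / 2.0
        mean = allele_freq * p1_from(mid) + (1.0 - allele_freq) * mid
        if mean < polymorphism_rate:
            lo = mid
        else:
            hi = mid
    p0 = (lo + hi) / 2.0
    return p0, p1_from(p0)


def approx_power(n, allele_freq, polymorphism_rate, odds_ratio, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test (assumed test)."""
    p0, p1 = carrier_rates(allele_freq, polymorphism_rate, odds_ratio)
    n1 = n * allele_freq  # expected number of allele carriers
    n0 = n - n1           # expected number of non-carriers
    pooled = polymorphism_rate * (1.0 - polymorphism_rate)
    se_null = (pooled * (1.0 / n1 + 1.0 / n0)) ** 0.5
    se_alt = (p1 * (1.0 - p1) / n1 + p0 * (1.0 - p0) / n0) ** 0.5
    z_crit = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    return NormalDist().cdf((abs(p1 - p0) - z_crit * se_null) / se_alt)


# Mean allele frequency and polymorphism rate quoted in the text, n=473.
power_positive = approx_power(473, 0.109, 0.04, 2.0)
power_negative = approx_power(473, 0.109, 0.04, 0.5)
```

With a rare polymorphism, halving the odds moves the carrier rate only a short distance towards zero, whereas doubling them moves it further from the non-carrier rate, which is why the negative association is harder to detect under this approximation.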

Step 2

The numbers of individuals with and without each HLA allele, and with and without I135x are calculated. In order to remove covariates that may lead to an unstable logistic regression model, HLA alleles are eliminated if there are fewer than five individuals in any of the comparison groups. The removed alleles at position 135 are HLA-B37, B41 and B60.

Step 3

Covariates are then assessed separately for association with I135x using Fisher's exact test, and only those with univariate P-values ≤ 0.1 are included in further analyses. The removed alleles are A1, A2, A3, A9, A11, A19, A28, B7, B8, B13, B14, B15, B16, B21, B22, B27 and B35.
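For reference, a two-sided Fisher's exact test on a 2x2 table can be sketched as below. This is a generic textbook implementation rather than the Epipop code, it assumes the common convention of summing all tables that are no more probable than the one observed, and the example counts are invented.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over every table that is no more probable than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


# Invented counts: rows are allele present/absent, columns are I135x yes/no.
p_value = fisher_exact_two_sided(8, 2, 1, 9)
passes_screen = p_value <= 0.1
```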

Step 4-Forward Selection

If the number of covariates remaining exceeds 10% of the number of individuals, forward selection using logistic regression is used to choose the covariates that are to remain in the analysis. Covariates are selected sequentially based on the smallest P-value for an added covariate until the number equals 10% of the number of patients. At position 135, the number of covariates was less than 10% of the number of patients so no selection was needed.
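The selection rule can be sketched independently of the model-fitting code. In the sketch below, `pvalue_if_added` is a hypothetical callable standing in for a logistic regression fit that returns the P-value of a candidate covariate given those already selected; the toy P-values are invented.

```python
def forward_select(candidates, n_individuals, pvalue_if_added, max_fraction=0.10):
    """Forward selection capped at `max_fraction` of the number of individuals."""
    limit = int(max_fraction * n_individuals)
    if len(candidates) <= limit:
        return list(candidates)  # few enough covariates: no selection needed
    selected, remaining = [], list(candidates)
    while remaining and len(selected) < limit:
        # Add the candidate with the smallest P-value given the current model.
        best = min(remaining, key=lambda c: pvalue_if_added(selected, c))
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy stand-in for the model fit: fixed, invented P-values.
toy_p = {"A": 0.20, "B": 0.01, "C": 0.05, "D": 0.50, "E": 0.03}
chosen = forward_select(list(toy_p), 30, lambda selected, c: toy_p[c])
```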

Step 5-Backwards Elimination

A standard backwards elimination procedure is then carried out. Logistic regression models are fitted for the remaining covariates. If any of the P-values for the covariates is greater than 0.1, after accounting for the other included covariates, then the covariate with the largest P-value is removed and the logistic model refitted. This is repeated until all covariates have a P-value less than 0.1. At position 135, this removes HLA alleles B12, B17 and B40.
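The elimination loop itself is simple and can be sketched with the model fit abstracted away. Here `fit_pvalues` is a hypothetical callable that refits the logistic model on the current covariates and returns a P-value for each; the toy P-values are invented and held fixed, whereas in a real fit they would change after each removal.

```python
def backward_eliminate(covariates, fit_pvalues, threshold=0.1):
    """Drop the least significant covariate until none exceeds `threshold`."""
    current = list(covariates)
    while current:
        pvalues = fit_pvalues(current)  # refit on the remaining covariates
        worst = max(current, key=lambda c: pvalues[c])
        if pvalues[worst] <= threshold:
            break  # every remaining covariate is at or below the cut-off
        current.remove(worst)
    return current


# Invented P-values, arranged so the alleles removed match the example above.
toy_p = {"A10": 0.04, "B5": 0.001, "B12": 0.30, "B17": 0.50, "B18": 0.06, "B40": 0.20}
kept = backward_eliminate(list(toy_p), lambda current: {c: toy_p[c] for c in current})
```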

Step 6-Exact P-values

To accommodate relatively small samples, "exact" P-values are based on randomisation tests rather than the usual large sample approximations (3). In this procedure, the final covariate sets are randomly permuted amongst individuals and the standard test statistics for association with I135x calculated for each permutation. 1000 random permutations are generated for each model and the P-value is based on the appropriate percentage of test values more extreme than that pertaining to the actual data. The proportion of times that a covariate has a test statistic in the random datasets exceeding that from the actual data is calculated for each covariate. This proportion gives a randomisation (exact) P-value. Covariates with exact P-values greater than 0.05 are removed sequentially and those with P-values less than 0.05 are considered significant. At position 135, this removes the alleles HLA-A10 and -B18, leaving HLA-B5 as the significant association with I135x.
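A minimal sketch of a randomisation P-value for a single covariate is shown below. It assumes, for simplicity, that the test statistic is the absolute difference in polymorphism rate between allele carriers and non-carriers, whereas the analysis described above uses test statistics from the fitted models. Ties are counted as extreme, which is a conservative choice. All names and data are hypothetical.

```python
import random


def rate_difference(has_allele, has_polymorphism):
    """Absolute difference in polymorphism rate, carriers versus non-carriers."""
    carriers = [p for a, p in zip(has_allele, has_polymorphism) if a]
    others = [p for a, p in zip(has_allele, has_polymorphism) if not a]
    return abs(sum(carriers) / len(carriers) - sum(others) / len(others))


def randomisation_p(has_allele, has_polymorphism, n_perm=1000, seed=0):
    """Proportion of permutations with a statistic at least as large as observed."""
    rng = random.Random(seed)
    observed = rate_difference(has_allele, has_polymorphism)
    shuffled = list(has_allele)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # reassign allele labels among individuals
        if rate_difference(shuffled, has_polymorphism) >= observed:
            extreme += 1
    return extreme / n_perm


# Invented data: 20 carriers (15 with the polymorphism), 80 non-carriers (8 with).
allele = [1] * 20 + [0] * 80
poly = [1] * 15 + [0] * 5 + [1] * 8 + [0] * 72
p_exact = randomisation_p(allele, poly)
```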

Correction for multiple comparisons

In order to highlight the significant HLA associations whose P-values withstand correction for the number of comparisons made across the whole analysis (i.e. a very low P-value cut-off for higher specificity but lower sensitivity), correction factors were generated for each HLA allele. Positive and negative associations were considered separately. 1000 randomised datasets were created from the original dataset as described above. The entire selection process including the preliminary model reduction procedures was then carried out for each amino acid residue and the total number of significant associations for each HLA allele across all positions was calculated. For example, for HLA-A2 there were, on average, 1.827 positive HLA-A2 associations across all residues per random dataset. This number was divided by 0.05 to give a multiple comparisons correction factor (x) for HLA-A2. This correction factor is the estimated equivalent number of "independent" tests carried out. The correction factor was applied to the P-values calculated in the actual data using Bonferroni adjustment [i.e. p* = 1-(1-p)^x, where p is the P-value from the model using the actual data, x is the correction factor and p* is the corrected P-value].
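The arithmetic for the HLA-A2 example can be written out as follows. The average of 1.827 associations comes from the text; the uncorrected P-value of 0.001 is an invented input used only to show the effect of the correction.

```python
def correction_factor(mean_random_hits, alpha=0.05):
    """Estimated equivalent number of independent tests for one allele."""
    return mean_random_hits / alpha


def corrected_p(p, x):
    """Apply the adjustment p* = 1 - (1 - p)^x."""
    return 1.0 - (1.0 - p) ** x


x_a2 = correction_factor(1.827)      # 1.827 / 0.05 = 36.54
p_star = corrected_p(0.001, x_a2)    # illustrative uncorrected P-value of 0.001
```

For small p the corrected value is close to x times p, so an uncorrected P-value of 0.001 for HLA-A2 becomes roughly 0.036.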

Overall P-value for actual vs randomised data

The overall P-value for all associations at all positions was obtained by considering the extremeness of the sum of the individual tests at each position relative to the values of this sum obtained from the randomisation data sets. The sum of all test statistics for all models for all alleles using the actual data was calculated. The same was done for the randomised datasets. For none of the 1000 random datasets was this number greater than the actual data, giving an overall P-value of <1/1000 or <0.001.

Significance of associations within 'known' CTL epitopes

We conducted analyses to determine the probability of finding by chance at least 15 significant positive associations within 'corresponding' known CTL epitopes (ie restricted to the same HLA allele). If significant HLA associations were occurring randomly across residues, the probability that an HLA association would occur within the known CTL epitope restricted to that allele equates to the relative proportion of all residues falling within the epitope. The total number of significant associations within known epitopes is then a sum of non-identical binomial variables, whose distribution can be evaluated via simulation, for example. Only 4.27 significant positive associations within known epitopes were expected based on the random hypothesis compared with the 15 observed. The approximate P-value for this is <0.001.
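The simulation of a sum of non-identical binomial variables can be sketched as below. Each association is treated as an independent Bernoulli trial whose success probability is the proportion of residues lying inside the corresponding epitope. The actual per-association probabilities are not given here, so the sketch assumes, purely for illustration, 64 associations with equal probabilities chosen so that the expected count is 4.27.

```python
import random


def simulate_tail(probabilities, observed, n_sim=10000, seed=0):
    """Estimate P(total successes >= observed) for independent Bernoulli trials."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        total = sum(1 for p in probabilities if rng.random() < p)
        if total >= observed:
            hits += 1
    return hits / n_sim


# Placeholder probabilities (assumption): equal values summing to 4.27.
probs = [4.27 / 64] * 64
expected = sum(probs)
p_tail = simulate_tail(probs, 15)
```

Replacing `probs` with the true epitope-length proportions for each association would give the distribution described above; the function itself does not require the probabilities to be equal.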


Supplemental Figure 1. Full version of map of HIV-1 RT showing all positions between 20 and 227 analysed in this study. Residues at which there was little power to detect any HLA associations are shaded. (A) Published HLA-A and HLA-B restricted CTL epitopes are marked as red lines. (B) The HLA alleles that are significantly positively associated with polymorphism along with the odds ratio (OR) for the association. The HLA-specific polymorphisms within known CTL epitopes restricted to the same broad HLA allele are in red text and the six at flanking residues are in blue text. The boxed associations are those that remain significant after correction for total number of residues examined. HLA-B*5101 is a subtype of HLA-B5, HLA-B44 is a subtype of HLA-B12, HLA-A24 is a subtype of HLA-A9 and HLA-A*3002 is a subtype of HLA-A19. (C) Negative HLA associations are marked with odds ratios of not being different to consensus. These are also in red or blue text if within or flanking known CTL epitopes. (D) The percentages of patients with a different amino acid to that of consensus sequence (%). Red bars indicate non-conservative amino acid changes and blue bars are conservative changes.




Viral load analysis

Comparisons of viral loads were based on a Cox analysis to accommodate censoring of values above the limit of assay detection and were restricted to situations with at least four individuals representing HLA allele versus non-HLA allele, with polymorphisms and without. The viral load measured closest to first pre-treatment HIV-1 RT sequencing was used. When HLA alleles and polymorphisms were included as interaction terms (i.e. a polymorphism and its positively associated HLA allele, or consensus amino acid and the negatively associated HLA allele) the overall significance value improved. The former model had a log likelihood of -15.4 with 25 degrees of freedom and the latter model had a log likelihood of -32.1 with 40 degrees of freedom. The change in log-likelihood has a P-value of 0.004.
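The quoted P-value is consistent with a likelihood ratio test, assuming that twice the change in log likelihood is referred to a chi-square distribution with degrees of freedom equal to the difference between the two models (40 - 25 = 15). The sketch below checks this arithmetic using a series expansion for the chi-square tail; the function name is hypothetical.

```python
from math import exp, lgamma, log


def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable via a series expansion."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    term = total = 1.0
    n = 0
    # Series for the regularised lower incomplete gamma function P(a, x/2).
    while term > 1e-16 * total and n < 10000:
        n += 1
        term *= half_x / (a + n)
        total += term
    lower = exp(a * log(half_x) - half_x - lgamma(a + 1.0)) * total
    return max(0.0, 1.0 - lower)


statistic = 2.0 * (-15.4 - (-32.1))   # twice the change in log likelihood
df_change = 40 - 25                   # assumed degrees of freedom for the test
p_change = chi2_sf(statistic, df_change)
```

Under these assumptions the statistic is 33.4 on 15 degrees of freedom, which gives a P-value of about 0.004.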


Supplemental Figure 2. Distribution of I135x in HIV-1 RT sequence in all HLA-B5 individuals. The most recent amino acid sequences of HIV-1 RT in all 52 patients in the cohort with serologically defined HLA-B5 (patients 1-52) were compared with the population consensus sequence. HIV-1 RT sequences are grouped according to the HLA-B subtype of the patient. In all sequences, a dot ( . ) indicates no difference from consensus. Amino acids different from consensus are shown. Where quasispecies with different amino acids were detected, the most common non-consensus amino acid is shown, except at position 135 where all detected amino acids in a mixed viral population are shown. All but one of the forty patients (98%) with the HLA-B*5101 subtype have a substitution of the consensus amino acid isoleucine (I) at position 135, most commonly with threonine (T). Footnote 1: The sequence without I135x is that of the single HLA-B*5101 patient who had HAART during acute HIV infection. Footnote 2: This patient did not have molecular genotyping. Footnote 3: This patient was an HLA-B*5101/B*5201 heterozygote but was counted only once in the HLA-B*5101 group. The one patient with the HLA-B*5108 subtype, and four of eight patients with the HLA-B*5201 subtype did not have I135x, suggesting that these subtypes may not bind the RT(128-135 IIIB) epitope. Both subtypes differ from HLA-B*5101 by only two amino acids (HLA-B*5108 at positions 152 and 156, HLA-B*5201 at positions 63 and 67, of HLA amino acid sequence) (IMGT/HLA sequence database; http://www.ebi.ac.uk/imgt/hla). The remaining two patients were shown to be HLA-B*5301 by sequencing.



Secondary polymorphisms

Primary CTL escape mutation in an HIV-1 p24 epitope has been shown to induce possible compensatory mutations in the virus (4). We sought to determine whether the secondary or compensatory changes accompanying primary (putative) CTL escape mutation were evident at a population level. We therefore included polymorphism at all 'other' positions in HIV-1 RT, along with HLA alleles, as covariates in all multivariate logistic regression models. All but two of the 64 positive HLA-specific polymorphisms were also associated with one or more polymorphisms at other positions.

Selection of HLA-specific polymorphism over time

To determine whether selection of HLA-specific polymorphisms over time was demonstrable in our study, we compared the amount of HLA-specific variation present in the most recent HIV-1 RT sequence with the first sequence for all individuals. For 61 of 64 HLA-specific polymorphisms, the number of individuals with an amino acid polymorphism increased over time and under observation. In 52 of these cases, the increase was significantly greater in those with the HLA allele associated with the polymorphism, compared with all others without the allele (P=0.0008, sign test, Supplemental Table 1).


Supplemental Table 1.
HLA-specific polymorphisms | n=64 | P-value (sign test)
HLA-specific polymorphisms that increase from first to last HIV-1 RT sequences | n=61 | P<0.0001
HLA-specific polymorphisms that increase from first to last HIV-1 RT sequences in those with the corresponding HLA allele compared with all others | n=52 | P<0.0001
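The sign test P-values in the table can be checked with an exact binomial calculation. The sketch below assumes a two-sided test against a probability of 0.5, with 64 polymorphisms as the number of trials; the exact form of the test used for the table is not stated, so this is an illustration only.

```python
from math import comb


def sign_test_p(k, n):
    """Two-sided exact sign test: k of n changes in one direction, null p=0.5."""
    extreme = max(k, n - k)
    tail = sum(comb(n, i) for i in range(extreme, n + 1)) / 2 ** n
    return min(1.0, 2.0 * tail)


p_increase = sign_test_p(61, 64)      # 61 of 64 polymorphisms increased
p_with_allele = sign_test_p(52, 64)   # 52 of 64, taking n=64 as an assumption
```

Both values fall well below 0.0001 under these assumptions.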


References

1. J. A. Wrobel et al., Proc. Natl. Acad. Sci. U.S.A. 95, 638-645 (1998).

2. J. H. Zar, Biostatistical Analysis, 4th Ed. (Prentice-Hall International, New Jersey, 1999), Chap. 24.12.

3. F. L. Ramsey and D. W. Schafer, The Statistical Sleuth: A Course in Methods of Data Analysis (Duxbury Press, 1997), Chap. 2.

4. A. D. Kelleher et al., J. Exp. Med. 193, 375-386 (2001).







Science. ISSN 0036-8075 (print), 1095-9203 (online)