Extension of HLA allele names

an extract from:
Nomenclature for factors of the HLA system, 2002.

Steven G. E. Marsh, Ekkehard D. Albert, Walter F. Bodmer, Ronald E. Bontrop, Bo Dupont, Henry A. Erlich, Daniel E. Geraghty, John A. Hansen, Bernard Mach, Wolfgang R. Mayr, Peter Parham, Effie W. Petersdorf, Takehiko Sasazuki, Geziena M. Th. Schreuder, Jack L. Strominger, Arne Svejgaard and Paul I. Terasaki

Tissue Antigens (2002) 60 407-464
European Journal of Immunogenetics (2002) 29 463-515
Human Immunology (2002) 63 1213-1268

See also:
Nomenclature of HLA alleles for examples,
The 2002 Nomenclature Report for the complete text.

The convention of using a four-digit code to distinguish HLA alleles that differ in the proteins they encode was first implemented in the 1987 Nomenclature Report. In 1990 a fifth digit was added to permit the distinction of sequences differing only by synonymous (non-coding) nucleotide substitutions within the exons. When these conventions were adopted it was anticipated that the nomenclature system would accommodate all the HLA alleles likely to be sequenced. Unfortunately that is not proving to be the case, as the number of alleles for certain genes is fast approaching the maximum possible with the current naming convention.

In particular, there are three problem areas. Firstly, the fifth digit, used for synonymous substitutions, can distinguish only nine variants of an allele. Already there are six named variants of the A*0201 allele, A*02011 to A*02016, and eight variants of the G*0101 allele, G*01011 to G*01018. The second problem area concerns the third and fourth digits, used to distinguish up to 99 variants within the allele families defined by the first and second digits. The first allele family to exceed 99 named alleles is likely to be the B*15 family, for which 73 variants have been named to date, soon followed by the A*02 and DRB1*13 families, for which over 50 allele variants have already been named. The most immediate problem concerns the DP genes, for which the decision was taken in 1989 to name all alleles which differ by non-synonymous (coding) substitutions with different combinations of the first two digits, a system that can only accommodate 99 alleles. The most recently assigned name was DPB1*9201, so that once an additional eight coding sequences have been reported there will be no capacity left in this system for naming newly discovered DPB1 alleles.

There was much discussion of this topic. Several different options were considered, including splitting the allele names into discrete fields separated by colons or semi-colons. This option, while it would place no limit on the number of names available, was in the end considered by the committee to be too radical and disruptive a solution for the problems at hand. It was therefore decided to seek solutions with minimal change to the existing format of the allele names, so as to limit the changes that would have to be made to existing database structures. The following decisions were taken to solve the three major problems.

  1. To introduce an extra digit between the current fourth and fifth digits, to allow for up to 99 synonymous variants of each allele. This expands the full allele name to eight digits: the first two digits define the allele family and, where possible, correspond to the serological family; the third and fourth digits describe coding variation; the fifth and sixth digits describe synonymous variation; and the seventh and eighth digits describe variation in introns or in the 5´ or 3´ regions of the gene.
  2. In cases where the total number of coding variants exceeds 99, a second number series will be used to extend the first one. For example, for the B*15 family of alleles, the B*95 series will be reserved and used to code for additional B*15 alleles. Consequently, the next B*15 allele to be named following B*1599 will be B*9501. Likewise, the A*92 series will be reserved as a second series for the A*02 allele family.
  3. For HLA-DPB1 alleles, it was decided to assign new alleles within the existing system; hence, once DPB1*9901 has been assigned, the next allele would be DPB1*0102, followed by DPB1*0203, DPB1*0302, etc.
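
The field layout in decision 1 and the second number series in decision 2 can be illustrated with a short Python sketch. It is not part of the report: the names `AlleleFields`, `split_allele_name`, `next_coding_name` and `SECOND_SERIES` are hypothetical, the accepted name pattern (gene symbol, asterisk, then four, six or eight digits, with no letter suffix) is an assumption, and the DPB1 rule in decision 3 is not covered.

```python
import re
from typing import NamedTuple, Optional


class AlleleFields(NamedTuple):
    gene: str                  # e.g. "A", "B", "DRB1"
    family: str                # digits 1-2: allele family
    coding: str                # digits 3-4: coding variation
    synonymous: Optional[str]  # digits 5-6: synonymous variation, if present
    noncoding: Optional[str]   # digits 7-8: intron or 5'/3' variation, if present


# Assumed pattern: gene symbol, "*", then exactly 4, 6 or 8 digits.
_NAME = re.compile(r"^([A-Z0-9]+)\*(\d{4}|\d{6}|\d{8})$")

# Only the two second series named in the text; other families are not listed.
SECOND_SERIES = {("B", "15"): "95", ("A", "02"): "92"}


def split_allele_name(name: str) -> AlleleFields:
    """Split a name in the extended format into its two-digit fields."""
    match = _NAME.match(name)
    if match is None:
        raise ValueError(f"not a 4-, 6- or 8-digit allele name: {name!r}")
    gene, digits = match.groups()
    # Slices past the end of the digit string are empty and become None.
    return AlleleFields(gene, digits[0:2], digits[2:4],
                        digits[4:6] or None, digits[6:8] or None)


def next_coding_name(gene: str, family: str, last_number: int) -> str:
    """Return the next four-digit name after <gene>*<family><last_number>."""
    if not 1 <= last_number <= 99:
        raise ValueError("last_number must be between 1 and 99")
    if last_number < 99:
        return f"{gene}*{family}{last_number + 1:02d}"
    # The first series is full: continue in the reserved second series.
    extension = SECOND_SERIES.get((gene, family))
    if extension is None:
        raise ValueError(f"no second series reserved for {gene}*{family}")
    return f"{gene}*{extension}01"
```

Under these assumptions, `next_coding_name("B", "15", 99)` gives `"B*9501"`, which matches the example in decision 2.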

The introduction of the additional digit for synonymous variation will take place immediately and all allele names which are currently five digits or above will be renamed accordingly, as shown in tables 2 - 10. The other changes will only be implemented when necessary, as dictated by submission of novel allele sequences.
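
The renaming of existing names amounts to inserting one digit after the fourth digit. The sketch below shows that step in Python; it is an illustration and not the procedure used to produce the tables. The function name `rename_old_allele` is hypothetical, and the sketch assumes that the inserted digit is a zero, that old names carry either five digits or seven digits (four coding digits, one synonymous digit, two non-coding digits), and that no letter suffix is present.

```python
import re

# Assumed old format: gene symbol, "*", four coding digits, one synonymous
# digit, and optionally two digits for intron or 5'/3' variation.
_OLD_NAME = re.compile(r"^([A-Z0-9]+)\*(\d{4})(\d)(\d{2})?$")


def rename_old_allele(name: str) -> str:
    """Convert an old five- or seven-digit name to the extended format.

    Names that do not match the old format (for example four-digit names)
    are returned unchanged.
    """
    match = _OLD_NAME.match(name)
    if match is None:
        return name
    gene, coding, synonymous, noncoding = match.groups()
    # Pad the single synonymous digit to two digits with a leading zero.
    return f"{gene}*{coding}0{synonymous}{noncoding or ''}"
```

With these assumptions, `rename_old_allele("A*02011")` returns `"A*020101"` and `rename_old_allele("G*01018")` returns `"G*010108"`, while a four-digit name such as `"A*0201"` is left as it is.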