Antibodies

Abysis - new database

The Abysis database integrates sequence data from Kabat, IMGT and the PDB with structural data from the PDB. It provides a comprehensive point-and-click interface which allows one to search the sequence data on various criteria and display results in different formats. For data from the PDB, sequence searches can be combined with structural constraints. For example, one can ask for all the antibodies with a 10-residue Kabat CDR-L1 with a serine at H23 and an arginine within 10 Å of H36.

General information

Visit the page of information and links to accompany my book chapter Protein Sequence and Structure Analysis of Antibody Variable Domains. In: Antibody Engineering Lab Manual (Ed.: Duebel, S. and Kontermann, R., Springer-Verlag, Heidelberg). [Information] [Purchase]



Availability

25.02.14 APOLOGIES! We are still finding problems resulting from our web server failing over the Christmas break. The SACS database and ImPACT are now fully available again!

18 October 2014

Two new pieces of software to help programmers: JSAV, a JavaScript Sequence Alignment Viewer component for web sites; BiopLib, 25 years' worth of C code development for handling (mostly) protein structure.

The Kabat Numbering Scheme

The Kabat numbering scheme is a widely adopted standard for numbering the residues in an antibody in a consistent manner. However, the scheme has problems!

First, since the numbering scheme was developed from sequence data (a fairly limited set), the position at which insertions occur in CDR-L1 and CDR-H1 does not match the structural insertion position. Thus topologically equivalent residues in these loops do not get the same number.

Second, the numbering adopts a rigid specification. For example, in the potentially very long CDR-H3, insertions are numbered between residues H100 and H101 with letters up to K (i.e. H100, H100A ... H100K, H101). If there are more residues than that, there is no standard way of numbering them. Such situations occur at other positions too.
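As an illustration of the label format described above, the following Python sketch parses labels such as H100A into chain, number and insertion letter, orders them, and generates the CDR-H3 labels up to the H100K limit. The function names are hypothetical and chosen for this example only; they do not come from any Kabat software.

```python
import re

# Chain (L or H), residue number, optional single insertion letter.
LABEL_RE = re.compile(r"^([LH])(\d+)([A-Z]?)$")


def parse_label(label):
    """Split a label such as 'H100A' into ('H', 100, 'A')."""
    m = LABEL_RE.match(label)
    if m is None:
        raise ValueError(f"not a residue label: {label!r}")
    chain, number, insert = m.groups()
    return chain, int(number), insert


def sort_key(label):
    """Order labels by chain, then number, then insertion letter.

    An empty insertion letter sorts before 'A', so H100 precedes H100A.
    """
    return parse_label(label)


def h3_insertion_labels(n_insertions):
    """Labels from H100 to H101 with the given number of insertions.

    Only the letters A to K are defined, so more than 11 insertions
    cannot be labelled in the standard way.
    """
    letters = "ABCDEFGHIJK"
    if not 0 <= n_insertions <= len(letters):
        raise ValueError("no standard labels beyond H100K")
    inserted = ["H100" + c for c in letters[:n_insertions]]
    return ["H100"] + inserted + ["H101"]
```

For example, h3_insertion_labels(2) gives H100, H100A, H100B, H101, while a request for 12 insertions raises an error, which mirrors the limitation described above.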

The numbering throughout the chains is as follows:

Light chain

       0     1     2     3     4     5     6     7     8     9
      10    11    12    13    14    15    16    17    18    19
      20    21    22    23    24    25    26    27
      27A   27B   27C   27D   27E   27F               28    29
      30    31    32    33    34    35    36    37    38    39
      40    41    42    43    44    45    46    47    48    49
      50    51    52    53    54    55    56    57    58    59
      60    61    62    63    64    65    66    67    68    69
      70    71    72    73    74    75    76    77    78    79
      80    81    82    83    84    85    86    87    88    89
      90    91    92    93    94    95
      95A   95B   95C   95D   95E   95F   96    97    98    99
     100   101   102   103   104   105   106
     106A                                      107   108   109

Heavy chain

       0     1     2     3     4     5     6     7     8     9
      10    11    12    13    14    15    16    17    18    19
      20    21    22    23    24    25    26    27    28    29
      30    31    32    33    34    35
      35A  35B                            36    37    38    39
      40    41    42    43    44    45    46    47    48    49
      50    51    52
      52A   52B   52C   53    54    55    56    57    58    59
      60    61    62    63    64    65    66    67    68    69
      70    71    72    73    74    75    76    77    78    79
      80    81    82
      82A   82B   82C   83    84    85    86    87    88    89
      90    91    92    93    94    95    96    97    98    99
     100
     100A  100B  100C  100D  100E  100F  100G  100H  100I  100J
     100K  101   102   103   104   105   106   107   108   109
     110   111   112   113

The Chothia Numbering Scheme

The Chothia numbering scheme is identical to the Kabat scheme, but places the insertions in CDR-L1 and CDR-H1 at the structurally correct positions. This means that topologically equivalent residues in these loops do get the same label (unlike the Kabat scheme).

There are two disadvantages: first, the Kabat scheme is so widely used that some confusion can arise; second, Chothia et al. changed their numbering scheme as of their 1989 Nature paper such that insertions in CDR-L1 are placed after residue L31 rather than L30. Examining the conformations of the loops shows that L30 is the correct position.

Note that in their latest paper (Al-Lazikani et al., (1997) JMB 273, 927-948), Chothia's group returns to using residue L30 as the insertion site in CDR-L1!

The pre-1989/post-1997 Chothia numbering (the structurally correct version) throughout the chains follows.

Light chain

       0     1     2     3     4     5     6     7     8     9
      10    11    12    13    14    15    16    17    18    19
      20    21    22    23    24    25    26    27    28    29
      30    
      30A   30B   30C   30D   30E   30F
            31    32    33    34    35    36    37    38    39
      40    41    42    43    44    45    46    47    48    49
      50    51    52    53    54    55    56    57    58    59
      60    61    62    63    64    65    66    67    68    69
      70    71    72    73    74    75    76    77    78    79
      80    81    82    83    84    85    86    87    88    89
      90    91    92    93    94    95
      95A   95B   95C   95D   95E   95F   96    97    98    99
     100   101   102   103   104   105   106
     106A                                      107   108   109

Heavy chain

       0     1     2     3     4     5     6     7     8     9
      10    11    12    13    14    15    16    17    18    19
      20    21    22    23    24    25    26    27    28    29
      30    31    
      31A   31B
                  32    33    34    35    36    37    38    39
      40    41    42    43    44    45    46    47    48    49
      50    51    52
      52A   52B   52C   53    54    55    56    57    58    59
      60    61    62    63    64    65    66    67    68    69
      70    71    72    73    74    75    76    77    78    79
      80    81    82
      82A   82B   82C   83    84    85    86    87    88    89
      90    91    92    93    94    95    96    97    98    99
     100
     100A  100B  100C  100D  100E  100F  100G  100H  100I  100J
     100K  101   102   103   104   105   106   107   108   109
     110   111   112   113
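The effect of the different insertion sites in CDR-L1 can be sketched in a few lines of Python. This is a minimal illustration, assuming that insertion letters are assigned in alphabetical order immediately after the insertion site (L27 for Kabat, L30 for Chothia) and that the loop has between 11 and 17 residues; shorter loops, which involve deletions, are not handled. The function name is hypothetical.

```python
def cdr_l1_labels(length, scheme):
    """Labels for a CDR-L1 (L24 to L34) with the given number of residues.

    scheme is 'kabat' (insertions after L27) or 'chothia' (after L30).
    """
    insert_after = {"kabat": 27, "chothia": 30}[scheme]
    n_insert = length - 11  # L24..L34 without insertions is 11 residues
    if not 0 <= n_insert <= 6:
        raise ValueError("this sketch only handles 11 to 17 residues")
    labels = []
    for num in range(24, 35):
        labels.append(f"L{num}")
        if num == insert_after:
            # Insertions are lettered A..F straight after the site.
            labels.extend(f"L{num}{c}" for c in "ABCDEF"[:n_insert])
    return labels


if __name__ == "__main__":
    # Print the two labellings of a 13-residue loop side by side.
    for k, c in zip(cdr_l1_labels(13, "kabat"), cdr_l1_labels(13, "chothia")):
        print(f"{k:6s}{c}")
```

For a 13-residue loop the fifth residue is labelled L27A under Kabat but L28 under Chothia, so the same position in the sequence carries different labels in the two schemes.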

Martin (Enhanced Chothia) Numbering Scheme

The only differences between the Chothia and Kabat numbering schemes are in the sites of indels in CDR-L1 and CDR-H1.

Our 'Martin' (Enhanced Chothia) scheme also considers the structurally correct locations for indels in the framework regions. Thus the numbering scheme is identical to the Chothia scheme in most regards, but the positions of framework indels have been refined.

The most important of these changes is that the insertion at H82a,b,c, which is present in the majority of antibodies, has been moved to the structurally correct location of H72a,b,c.

We have also introduced an indel site at L52 in CDR-L2. All structures have the standard length of 7 residues and the conformations are relatively conserved. However, sequences of varying length are seen, and analysis of the structure suggests this is the correct location. It also corresponds with the AHo numbering scheme.

In this scheme, the locations of deleted residues are indicated with ().

A manuscript describing this new scheme is in preparation.

Light chain

       0     1     2     3     4     5     6     7     8     9 
     (10)   11    12    13    14    15    16    17    18    19 
      20    21    22    23    24    25    26    27    28    29
     (30)
      30A   30B   30C   30D   30E   30F
            31    32    33    34    35    36    37    38    39
      40
      40A  (41)   42    43    44    45    46    47    48    49 
      50    51    (52)    
      52A   52B   52C   52D   52E   
                        53    54    55    56    57    58    59
      60    61    62    63    64    65    66    67   (68)
      68A   68B   68C   68D   68E   68F   68G   68H         69
      70    71    72    73    74    75    76    77    78    79
      80    81    82    83    84    85    86    87    88    89
      90    91    92    93    94   (95)
      95A   95B   95C   95D   95E   95F   96    97    98    99
     100   101   102   103   104   105   106   107
     107A                                            108   109
     110

Heavy chain

       0     1     2     3     4     5     6     7    (8)
       8A    8B    8C    8D                                  9
      10    11    12    13    14    15    16    17    18    19
      20    21    22    23    24    25    26    27    28    29
      30   (31)
      31A   31B
                  32    33    34    35    36    37    38    39
      40    41   (42)   43    44    45    46    47    48    49 
      50    51   (52)
      52A   52B   52C   53    54    55    56    57    58    59
      60    61    62    63    64    65    66    67    68    69
      70    71    72
      72A   72B   72C   73    74    75    76    77    78    79
      80    81    82    83    84    85    86    87    88    89
      90    91    92    93    94    95    96    97    98    99
     (100)
     100A  100B  100C  100D  100E  100F  100G  100H  100I  100J
     100K  101   102   103   104   105   106   107   108   109
     110   111   112   113
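To see what moving the heavy-chain framework insertion from H82 to H72 does to the labels, the sketch below lists the residues H70 to H83 under both schemes and pairs them up position by position. It assumes the same number of inserted residues in both schemes, lettered A to C in order, and it ignores every other difference between the schemes. The function names are hypothetical.

```python
def fr3_labels(n_insert, scheme):
    """Labels for H70 to H83 with n_insert inserted residues (0 to 3).

    scheme is 'chothia' (insertions after H82) or 'martin' (after H72).
    """
    site = {"chothia": 82, "martin": 72}[scheme]
    if not 0 <= n_insert <= 3:
        raise ValueError("only insertion letters A, B and C are defined here")
    labels = []
    for num in range(70, 84):
        labels.append(f"H{num}")
        if num == site:
            labels.extend(f"H{num}{c}" for c in "ABC"[:n_insert])
    return labels


def chothia_to_martin(n_insert=3):
    """Map each Chothia label in H70..H83 to the Martin label at the same position."""
    return dict(zip(fr3_labels(n_insert, "chothia"), fr3_labels(n_insert, "martin")))
```

With all three insertions present, the residue labelled H73 in the Chothia scheme becomes H72A and H82C becomes H82, while H72 and H83 keep their labels.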

Table of CDR Definitions

A number of definitions of the CDRs are commonly in use:

Loop                       Kabat        AbM          Chothia (1)     Contact (2)
L1                         L24--L34     L24--L34     L24--L34        L30--L36
L2                         L50--L56     L50--L56     L50--L56        L46--L55
L3                         L89--L97     L89--L97     L89--L97        L89--L96
H1 (Kabat numbering) (3)   H31--H35B    H26--H35B    H26--H32..34    H30--H35B
H1 (Chothia numbering)     H31--H35     H26--H35     H26--H32        H30--H35
H2                         H50--H65     H50--H58     H52--H56        H47--H58
H3                         H95--H102    H95--H102    H95--H102       H93--H101

Note (1) some of these definitions (particularly for Chothia loops) vary depending on the individual publication examined.

Note (2) any of the numbering schemes can be used for these CDR definitions, except the contact definition, which uses the Chothia or Martin (Enhanced Chothia) numbering.

Note (3) the end of the Chothia CDR-H1 loop when numbered using the Kabat numbering convention varies between H32 and H34 depending on the length of the loop. (This is because the Kabat numbering scheme places the insertions at H35A and H35B.)
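The table can be turned into a small lookup. The Python sketch below encodes the rows for Chothia numbering and reports which CDR, if any, a residue label falls in under a chosen definition. The dictionary layout and function names are assumptions made for this example, and, as Note (1) says, the Chothia boundaries vary between publications.

```python
import re

_SAME = {"L1": ("L24", "L34"), "L2": ("L50", "L56"), "L3": ("L89", "L97"),
         "H3": ("H95", "H102")}

# Inclusive (start, end) labels, Chothia numbering.
CDR_DEFINITIONS = {
    "Kabat": {**_SAME, "H1": ("H31", "H35"), "H2": ("H50", "H65")},
    "AbM": {**_SAME, "H1": ("H26", "H35"), "H2": ("H50", "H58")},
    "Chothia": {**_SAME, "H1": ("H26", "H32"), "H2": ("H52", "H56")},
    "Contact": {"L1": ("L30", "L36"), "L2": ("L46", "L55"), "L3": ("L89", "L96"),
                "H1": ("H30", "H35"), "H2": ("H47", "H58"), "H3": ("H93", "H101")},
}


def _parse(label):
    """Split 'H52A' into ('H', (52, 'A')) so positions compare in order."""
    m = re.match(r"^([LH])(\d+)([A-Z]?)$", label)
    if m is None:
        raise ValueError(f"not a residue label: {label!r}")
    return m.group(1), (int(m.group(2)), m.group(3))


def cdr_of(label, definition):
    """Return the CDR name containing the residue, or None if it is framework."""
    chain, pos = _parse(label)
    for name, (start, end) in CDR_DEFINITIONS[definition].items():
        start_chain, start_pos = _parse(start)
        _, end_pos = _parse(end)
        if chain == start_chain and start_pos <= pos <= end_pos:
            return name
    return None
```

For instance, cdr_of("H28", "Chothia") returns "H1", whereas cdr_of("H28", "Kabat") returns None because the Kabat CDR-H1 only starts at H31.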

H1 Definition

This diagram illustrates the alternative definitions for CDR-H1. The Kabat and Chothia numbering schemes are shown horizontally and the Kabat, Chothia, AbM and Contact definitions of the CDRs are shown with arrows above and below the two numbering schemes.


How to identify the CDRs by looking at a sequence

The following set of rules will allow you to find the CDRs in an antibody sequence. Note that the word 'always' should always be treated with care! There are rare examples where these virtually constant features do not occur (for example the human heavy chain sequence EU does not have Trp-Gly after CDR-H3). The Cys residues are the best conserved feature.

CDR-L1

Start:            approx. residue 24
Residue before:   always a Cys
Residue after:    always a Trp; typically Trp-Tyr-Gln, but also Trp-Leu-Gln, Trp-Phe-Gln, Trp-Tyr-Leu
Length:           10 to 17 residues

CDR-L2

Start:            always 16 residues after the end of L1
Residues before:  generally Ile-Tyr, but also Val-Tyr, Ile-Lys, Ile-Phe
Length:           always 7 residues (except NEW (7FAB), which has a deletion in this region)

CDR-L3

Start:            always 33 residues after the end of L2 (except NEW (7FAB), which has the deletion at the end of CDR-L2)
Residue before:   always Cys
Residues after:   always Phe-Gly-XXX-Gly
Length:           7 to 11 residues

CDR-H1

Start:            approx. residue 26 (always 4 after a Cys) [Chothia / AbM definition]; the Kabat definition starts 5 residues later
Residues before:  always Cys-XXX-XXX-XXX
Residues after:   always a Trp; typically Trp-Val, but also Trp-Ile, Trp-Ala
Length:           10 to 12 residues [AbM definition]; the Chothia definition excludes the last 4 residues

CDR-H2

Start:            always 15 residues after the end of the Kabat / AbM definition of CDR-H1
Residues before:  typically Leu-Glu-Trp-Ile-Gly, but a number of variations
Residues after:   Lys/Arg-Leu/Ile/Val/Phe/Thr/Ala-Thr/Ser/Ile/Ala
Length:           Kabat definition 16 to 19 residues; the AbM (and recent Chothia) definition ends 7 residues earlier

CDR-H3

Start:            always 33 residues after the end of CDR-H2 (always 2 after a Cys)
Residues before:  always Cys-XXX-XXX (typically Cys-Ala-Arg)
Residues after:   always Trp-Gly-XXX-Gly
Length:           3 to 25(!) residues
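The light-chain rules above can be sketched in Python as follows. This is an illustration only: the function name is hypothetical, the offsets of 16 and 33 are counted from the last residue of the preceding CDR (consistent with L34 to L50 and L56 to L89), and the rare exceptions mentioned above, such as NEW (7FAB), are not handled.

```python
import re

# CDR-L1: preceded by Cys, 10 to 17 residues, followed by Trp-Tyr-Gln,
# Trp-Leu-Gln, Trp-Phe-Gln or Trp-Tyr-Leu (written loosely as W[YLF][QL]).
L1_RE = re.compile(r"C(?P<cdr>[A-Z]{10,17}?)W[YLF][QL]")
# CDR-L3 is followed by Phe-Gly-XXX-Gly.
L3_END_RE = re.compile(r"FG[A-Z]G")


def find_light_cdrs(seq):
    """Return {'L1': ..., 'L2': ..., 'L3': ...} or None if a rule fails.

    seq is a one-letter-code light-chain variable domain sequence.
    """
    m = L1_RE.search(seq)
    if m is None:
        return None
    l1_start, l1_end = m.span("cdr")      # l1_end is exclusive

    # L2 starts 16 residues after the last residue of L1 and has 7 residues.
    l2_start = (l1_end - 1) + 16
    l2_end = l2_start + 7

    # L3 starts 33 residues after the last residue of L2, just after a Cys.
    l3_start = (l2_end - 1) + 33
    if seq[l3_start - 1:l3_start] != "C":
        return None

    # L3 has 7 to 11 residues, so Phe-Gly-XXX-Gly must begin 7 to 11
    # positions after its start (and end within 15 positions).
    end = L3_END_RE.search(seq, l3_start + 7, l3_start + 15)
    if end is None:
        return None

    return {
        "L1": seq[l1_start:l1_end],
        "L2": seq[l2_start:l2_end],
        "L3": seq[l3_start:end.start()],
    }
```

Returning None when a conserved feature is missing, rather than guessing, reflects the warning that 'always' should be treated with care. The heavy chain could be treated the same way using the Cys-XXX-XXX-XXX and Trp-Gly-XXX-Gly markers.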

Table of mean contact data

Following an analysis of the contacts between antibody and antigen in the complex structures available in the Protein Data Bank, we have generated a set of mean contact data. The full method by which these results were obtained is described in the following paper: MacCallum, R. M., Martin, A. C. R. and Thornton, J. M. (1996) Antibody-antigen interactions: contact analysis and binding site topography. J. Mol. Biol. 262, 732-745.

Briefly, we have analysed the number of contacts made at each position, defining contact as burial by > 1 square Angstrom change in solvent accessibility. These data give a simple measure of how likely a residue is to be involved in antigen contact.

Second, we have calculated the mean percentage burial over the accessible residues.
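The two measures can be sketched as follows. This is a simplified illustration rather than the published method: it assumes that, for one residue position, the solvent accessibility in the free antibody and in the complex is already known for each structure, that 'accessible' means a non-zero accessibility in the free antibody, and that percentage burial is the buried area as a percentage of the free accessibility. The function names are hypothetical.

```python
def is_contact(free_asa, bound_asa, threshold=1.0):
    """A residue is in contact if it loses more than `threshold` square
    Angstroms of solvent accessibility on complex formation."""
    return (free_asa - bound_asa) > threshold


def summarise_position(observations):
    """Summarise one residue position over several complexes.

    observations is a list of (free_asa, bound_asa) pairs in square
    Angstroms. Returns (number of contacts, mean percentage burial over
    the accessible residues); the mean is None if no residue is accessible.
    """
    n_contacts = sum(1 for free, bound in observations if is_contact(free, bound))

    # Only residues that are accessible in the free antibody are averaged.
    percentages = [100.0 * (free - bound) / free
                   for free, bound in observations if free > 0.0]
    mean_burial = sum(percentages) / len(percentages) if percentages else None
    return n_contacts, mean_burial
```

A position that loses 40 of 50 square Angstroms in one complex counts as one contact with 80% burial, whereas a loss of 0.5 square Angstroms falls below the threshold and is not counted as a contact.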

Click here for an image showing a composite combining site containing all CDR conformations coloured by contact propensity.

The table presents the chain name, residue number (N.B. This is pre-1989 Chothia Numbering), the number of contacts and the mean percent burial. The data are available by clicking here.

An alternative simplified view is presented as a list of CDR residues making contact in each antibody with summary data for each CDR.