Mutations Practical

In this practical we will examine a mutation in glucose-6-phosphate dehydrogenase (G6PD) using several of the most popular and best-performing prediction methods.

The G6PD sequence (UniProtKB accession: P11413; name: G6PD_HUMAN, RefSeq: NP_001035810.1) is:

>sp|P11413|G6PD_HUMAN Glucose-6-phosphate 1-dehydrogenase OS=Homo sapiens GN=G6PD PE=1 SV=4
MAEQVALSRTQVCGILREELFQGDAFHQSDTHIFIIMGASGDLAKKKIYPTIWWLFRDGL
LPENTFIVGYARSRLTVADIRKQSEPFFKATPEEKLKLEDFFARNSYVAGQYDDAASYQR
LNSHMNALHLGSQANRLFYLALPPTVYEAVTKNIHESCMSQIGWNRIIVEKPFGRDLQSS
DRLSNHISSLFREDQIYRIDHYLGKEMVQNLMVLRFANRIFGPIWNRDNIACVILTFKEP
FGTEGRGGYFDEFGIIRDVMQNHLLQMLCLVAMEKPASTNSDDVRDEKVKVLKCISEVQA
NNVVLGQYVGNPDGEGEATKGYLDDPTVPRGSTTATFAAVVLYVENERWDGVPFILRCGK
ALNERKAEVRLQFHDVAGDIFHQQCKRNELVIRVQPNEAVYTKMMTKKPGMFFNPEESEL
DLTYGNRYKNVKLPDAYERLILDVFCGSQMHFVRSDELREAWRIFTPLLHQIELEKPKPI
PYIYGSRGPTEADELMKRVGFQYEGTYKWVNPHKL

Download FASTA file

You will analyze the mutation S106->C (serine 106 to cysteine). This is a known pathogenic mutation that causes favism and is classified by the WHO as a Class I mutation: severe deficiency (<10% activity) with chronic (nonspherocytic) hemolytic anemia.
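Before submitting anything, it can be worth confirming that residue 106 of the sequence really is a serine, since the servers reject (or silently mis-score) a mutation whose wild-type residue does not match. The following is a minimal sketch in Python; the helper names parse_fasta and apply_substitution are made up for this illustration, and only the first 120 residues of the sequence above are included to keep the example short.

```python
# Minimal sketch: check the wild-type residue and build the mutant sequence.
# parse_fasta and apply_substitution are hypothetical helper names.

def parse_fasta(text):
    """Return (header, sequence) for a single-record FASTA string."""
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    if not lines or not lines[0].startswith(">"):
        raise ValueError("expected a FASTA header line starting with '>'")
    return lines[0][1:], "".join(lines[1:])


def apply_substitution(sequence, wild_type, position, mutant):
    """Replace the residue at a 1-based position after checking the wild type."""
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} is outside the sequence")
    found = sequence[position - 1]
    if found != wild_type:
        raise ValueError(f"expected {wild_type} at {position}, found {found}")
    return sequence[:position - 1] + mutant + sequence[position:]


# Residues 1-120 of G6PD_HUMAN, copied from the sequence above.
fasta = """>sp|P11413|G6PD_HUMAN (first 120 residues only)
MAEQVALSRTQVCGILREELFQGDAFHQSDTHIFIIMGASGDLAKKKIYPTIWWLFRDGL
LPENTFIVGYARSRLTVADIRKQSEPFFKATPEEKLKLEDFFARNSYVAGQYDDAASYQR
"""

header, seq = parse_fasta(fasta)
mutant_seq = apply_substitution(seq, "S", 106, "C")
print(seq[105], "->", mutant_seq[105])  # Python indexes from 0, so 106 is [105]
```

The same check works on the full sequence if you paste the downloaded FASTA file into the fasta string instead.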

Explore this mutation using the following servers:

FATHMM Link

As the name suggests, FATHMM makes use of Hidden Markov Models.

  • Click the 'Inherited Disease' box
  • Enter the mutation as:
    P11413 S106C
  • Click the 'Submit' button

SIFT Link

SIFT works purely on the basis of sequence and is one of the oldest and most popular methods. Its performance is far lower than that of modern methods.

  • Under 'Single Protein Tools', click 'SIFT Sequence'
  • Under 'Paste in your protein query sequence', paste the sequence including the FASTA header (the line that starts with a > )
  • Under 'Enter the substitutions of interest [format]:', enter 'S106C'
  • Click 'Submit' at the bottom of the page
  • Once the result is returned, click 'Predictions of substitutions entered'

Note that there are dire warnings about having to wait 20 minutes - in practice for this query you are likely to get a result back in a couple of minutes!

PolyPhen2 Link

  • Enter the G6PD sequence into the 'Protein sequence in FASTA format' box.
    Note that you must include the FASTA header (the line that starts with a > )
  • Enter '106' into the 'Position' box
  • Under 'Substitution AA1', click 'S'
  • Under 'Substitution AA2', click 'C'
  • Click 'Submit Query'
  • In the resulting page, click 'Refresh' as necessary until, under 'Jobs', your job is indicated as 'Completed'.
  • Click 'View' under 'Results'

ProSPect/SuSPect Link

  • Under 'Mutations (human only)', enter the mutation as
    P11413 S106C
  • Click 'Submit'

Note that this is using pre-calculated results; we could also have entered the sequence and mutation in the right-hand panel and obtained a prediction from scratch.

SDM Link

  • Click 'Predict' at the top of the page
  • Under 'Single Mutation' and 'Provide a 4-letter PDB code', enter the PDB code for the protein (1QKI). You would normally obtain this by examining the UniProtKB/SwissProt entry for P11413 or by using PDBSWS.
  • Under 'Mutation', enter 'S106C'
  • Under 'Mutation chain', enter 'A'
  • Click 'Run SDM' and wait patiently (don't click it again!)

Condel Link

  • Log in to Condel using your Google account or by creating an account
  • Under 'Condel', click 'Query'
  • Under 'Mutations', enter the mutation as
    P11413 S106C
  • Click 'Query'
  • When the results appear, click 'Download'
  • Save the results file (a ZIP file) and open it. It contains a tab-separated file that can be viewed with Excel.
  • The 'CONDEL_LABEL' (the last item on a line) is 'D' if the mutation is predicted to be Deleterious, or 'N' for Neutral.
  • The score under 'CONDEL' can be compared with the 'condel-cutoff' at the top of the page to gauge the confidence of the prediction.
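If you would rather not open the results in Excel, the steps above can be sketched in Python. This is only an illustration under several assumptions: that the ZIP holds a single tab-separated file, that its first row is a header containing the 'CONDEL' and 'CONDEL_LABEL' column names, and that you pass in the 'condel-cutoff' value you read from the page. The function names, the other column names, the file name results.tsv, and the score and cutoff in the demonstration are all made-up placeholders, not real Condel output.

```python
import csv
import io
import zipfile


def read_condel_rows(zip_bytes):
    """Return the rows of the first file in the ZIP as a list of dicts."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        member = archive.namelist()[0]  # assumption: one results file per ZIP
        with archive.open(member) as handle:
            text = io.TextIOWrapper(handle, encoding="utf-8", newline="")
            return list(csv.DictReader(text, delimiter="\t"))


def summarise(row, cutoff):
    """Return (readable label, score minus cutoff) for one result row."""
    labels = {"D": "Deleterious", "N": "Neutral"}
    label = labels.get(row["CONDEL_LABEL"], "Unknown")
    margin = float(row["CONDEL"]) - cutoff
    return label, margin


# Build a placeholder ZIP in memory so the sketch runs without a download.
# The score (0.75) and cutoff (0.5) are invented for illustration only.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "results.tsv",
        "PROTEIN\tMUTATION\tCONDEL\tCONDEL_LABEL\n"
        "P11413\tS106C\t0.75\tD\n",
    )

rows = read_condel_rows(buffer.getvalue())
label, margin = summarise(rows[0], cutoff=0.5)
print(label, margin)
```

For a real download, read the saved ZIP with open(path, "rb").read() and pass the bytes to read_condel_rows. The further the score lies from the cutoff, the more confident the call.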

SAAPpred Link

  • Under 'UniProt Accession', enter the accession ('P11413')
  • Under 'Native Residue', enter 'ser'
  • Under 'Residue Number', enter '106'
  • Under 'Mutant Residue', enter 'cys'
  • Press 'Submit'
  • The analysis progress will appear on the same page, which will then change to the results page.
  • Click 'ser106->cys'
  • You will now see a summary of the likely local structural effects of the mutation in different structures of the same protein
  • Click 'Predict Pathogenicity'
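Unlike the other servers, the SAAPpred form takes the residues as three-letter names ('ser', 'cys') and the position as a separate field. A minimal sketch of converting the one-letter notation used elsewhere in this practical (S106C) into those field values is shown below; the function name saappred_fields is hypothetical, and the dictionary keys simply mirror the form labels listed above.

```python
import re

# Standard one-letter to three-letter amino acid codes (lower case, as
# entered in the form above).
THREE_LETTER = {
    "A": "ala", "R": "arg", "N": "asn", "D": "asp", "C": "cys",
    "Q": "gln", "E": "glu", "G": "gly", "H": "his", "I": "ile",
    "L": "leu", "K": "lys", "M": "met", "F": "phe", "P": "pro",
    "S": "ser", "T": "thr", "W": "trp", "Y": "tyr", "V": "val",
}


def saappred_fields(accession, mutation):
    """Split a mutation such as 'S106C' into the values for each form field."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"cannot parse mutation {mutation!r}")
    native, number, mutant = match.groups()
    if native not in THREE_LETTER or mutant not in THREE_LETTER:
        raise ValueError(f"non-standard residue in {mutation!r}")
    return {
        "UniProt Accession": accession,
        "Native Residue": THREE_LETTER[native],
        "Residue Number": number,
        "Mutant Residue": THREE_LETTER[mutant],
    }


fields = saappred_fields("P11413", "S106C")
for name, value in fields.items():
    print(f"{name}: {value}")
```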