bioinf.org.uk - Prof. Andrew C.R. Martin's group at UCL

Missense Prediction Methods and Approaches


Mutations fall into different types:

Most mutations are 'Loss of Function'
Some are 'Gain of Function' (generally through loss of regulation)
A small number are actually 'change of function' e.g. of specificity (estimated at 5% of cancer mutations [9]).

...as do prediction methods:

Many methods are purely sequence based (e.g. SIFT).
Some methods incorporate protein structure information through rule-based approaches [70] or machine learning.
More sophisticated alignment information has been exploited through hidden Markov models (e.g. subPSEC and PANTHER).
When a structure is not available, comparative modelling may be exploited (e.g. LS-SNP) and application of ab initio structural models has also been explored [65].
Various servers exploit a combination of sequence, structural and evolutionary features (e.g. SNAP, PMUT and CanPredict).

Sequence and evolutionary conservation-based methods

e.g. SIFT, Align-GVGD, MutationAssessor, PANTHER, MAPP

Empirical rules

e.g. PolyPhen

Protein Structure

e.g. SNP@Domain, BONGO, SNPs3D

Protein sequence and structure-based methods

e.g. PolyPhen, PolyPhen-2, LS-SNP/PDB, SNPeffect, BONGO

Direct methods

These compute a score directly from a theoretical model of what happens when a mutation occurs (e.g. SIFT, PANTHER).
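
As a rough illustration of what a direct, conservation-based score can look like, the sketch below scores one column of a multiple sequence alignment using Shannon entropy and then down-weights substitutions to residues already seen at that position. It is not the algorithm used by SIFT or PANTHER; the function names and the scoring formula are assumptions made purely for illustration.

```python
import math
from collections import Counter


def column_conservation(column: str) -> float:
    """Return 1 - normalised Shannon entropy of an alignment column.

    Gaps ('-') are ignored. 1.0 means fully conserved; values near 0.0
    mean all 20 amino acids are equally represented.
    """
    residues = [r for r in column.upper() if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 1.0 - entropy / math.log2(20)


def substitution_score(column: str, mutant: str) -> float:
    """Hypothetical score: column conservation weighted by how rarely
    the mutant residue is observed at this position (higher = worse)."""
    residues = [r for r in column.upper() if r != "-"]
    if not residues:
        return 0.0
    mutant_freq = residues.count(mutant.upper()) / len(residues)
    return column_conservation(column) * (1.0 - mutant_freq)
```

For a fully conserved leucine column, a substitution to proline gets the maximum score, while "substituting" leucine for itself scores zero.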

Machine-learning methods

These use machine learning (such as neural nets, SVMs and random forests) and can combine different properties of the native and mutant residues, such as size and polarity, with other information such as structural environment (e.g. accessibility, H-bonding) and evolutionary conservation.

e.g. PMut, SNAP, PhD-SNP, SNPs&GO, Parepro, CanPredict, nsSNPAnalyzer, MutPred, Hansa, MutationTaster
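
To make the idea of combining residue properties with structural and evolutionary information concrete, here is a minimal sketch of how a feature vector for one substitution might be assembled before being handed to a classifier. The choice of features, the function name and the treatment of histidine as neutral are assumptions; none of the tools listed above is claimed to use exactly this encoding. The hydropathy values are the Kyte-Doolittle scale.

```python
# Kyte-Doolittle hydropathy scale
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Formal side-chain charge; histidine is treated as neutral (an assumption)
CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def mutation_features(native: str, mutant: str,
                      accessibility: float, conservation: float) -> list:
    """Build an illustrative feature vector for one substitution.

    accessibility and conservation are assumed to be pre-computed values
    in the range 0-1 (e.g. relative solvent accessibility and a column
    conservation score); how they are obtained is outside this sketch.
    """
    native, mutant = native.upper(), mutant.upper()
    return [
        KD_HYDROPATHY[mutant] - KD_HYDROPATHY[native],    # change in hydropathy
        CHARGE.get(mutant, 0) - CHARGE.get(native, 0),    # change in charge
        accessibility,                                    # structural environment
        conservation,                                     # evolutionary information
    ]
```

A list of such vectors, paired with disease/neutral labels, is the kind of input a neural network, SVM or random forest would be trained on.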

Missense Prediction Tool Catalogue

www.ngrl.org.uk/Manchester/page/missense-prediction-tool-catalogue.html

A Summary of Prediction Methods and Databases

Ref - reference (see below)
St - use of structural data: Y = required; (Y) = used if available (predicts structural information otherwise)
M - generates models: Y = yes; N = no; P = precomputed only; H = highlights where the mutation is but doesn't model it
Pre - are data pre-calculated: Y = yes (novel mutations cannot be uploaded); N = no; NS = no web server

Program	Ref	St	M	Pre	Notes
SNPs3D	1, 70, 72, 73	Y	P	Y	snps3d.org Uses structures, sequence profiles, pathways together with conservation scores from MutDB to train SVMs to make destabilization predictions
StSNP	2	Y	Y	Y	ilyinlab.org/StSNP/ Pre-calculated analysis uses pathways
ModSNP	3	Y		Y	Simply provides models and SIFT results (No longer available?)
MutDB	4	Y	H	Y	www.mutdb.org Precalculated set of mutant models
LS-SNP	5, 57, 58	Y	P	Y	www.salilab.org/LS-SNP Uses an SVM trained with rule-based annotation of structure, sequence and evolution to look for destabilization, proximity to ligands and interfaces and exploits information from OMIM on similar known PDs
TopoSNP	6	Y	P	Y	gila.bioe.uic.edu/snp/toposnp/ Classifies residues based on location (surface pocket or interior void; convex or depressed surface; internal) and combines this with a conservation score derived from Pfam.
SNPeffect	7	Y	N	N	snpeffect.vib.be/ Assesses stability (FoldX), aggregation, amyloidosis, proximity to functional sites and cellular processing
nsSNPAnalyzer	8	Y		N	http://snpanalyzer.uthsc.edu/ Exploits SIFT and structural features to train a random forest
FIS	9			NS	'Functional Impact Score' - exploits evolutionary information from multiple sequence alignments.
MutationTaster	10			N	www.mutationtaster.org Uses conservation, effects on splicing, protein features and mRNA production/stability
SNAP	11, 12	(Y)	N	N	www.rostlab.org/services/SNAP Uses neural networks with data from the sequence and PolyPhen and SIFT predictions. In addition it uses predicted structural features (solvent accessibility, secondary structure and flexibility), but can exploit actual structural data if available.
Condel	13				bg.upf.edu/condel Uses a weighted average score from a number of predictors. The original paper uses LogRE, MAPP, MutationAssessor, PolyPhen-2 and SIFT, but the latest version uses just MutationAssessor and FATHMM.
FATHMM	14				fathmm.biocompute.org.uk Uses HMMs to represent a protein family and exploits species-specific weights.
SAAPdb	28	Y	N	Y	www.bioinf.org.uk/saap/db/ [NO LONGER MAINTAINED] A pre-calculated database of the structural effects of mutations. Used a number of rule-based analyses of structural effects together with a conservation score.
SAAPdap	15	Y	N		www.bioinf.org.uk/saap/dap/ A pipeline for calculating the structural effects of mutations (replaces SAAPdb). Uses a number of rule-based analyses of structural effects together with a conservation score.
SAAPpred	15	Y	N		www.bioinf.org.uk/saap/dap/ A random-forest predictor based on the structural analyses from SAAPdap
MutPred	16				mutpred.mutdb.org Uses a Random Forest predictor with data based on predicted protein structure and dynamics, predicted functional properties and sequence and evolutionary information.
CADD	17				cadd.gs.washington.edu A meta-predictor that uses support vector machines with results from SIFT, PolyPhen, conservation, predicted effects on regulation and the 'Grantham' score for amino acid differences. Designed to be expandable.
SNPs&GO	18, 36				snps.biofold.org/snps-and-go Uses results from PANTHER together with functional information from GO and sequence information - both from the local environment and from profiles from multiple sequence alignments.
SNPs&GO3D	N/A	Y			snps.biofold.org/snps-and-go As SNPs&GO, but also uses structural data
SIFT	19, 64				sift.bii.a-star.edu.sg/ An evolutionary method which calculates a sophisticated residue conservation score from multiple alignment
PolyPhen/PolyPhen-2	20, 21, 22, 67	(Y)	N		genetics.bwh.harvard.edu/pph/ genetics.bwh.harvard.edu/pph2/ Uses machine learning on a set of eight sequence- and three structure-based features. If no structure is available, the structural features are predicted.
Panther/subPSEC	23, 24, 25				www.pantherdb.org PSEC is a position-specific evolutionary conservation score and subPSEC is a difference in PSEC scores for a substitution. Panther exploits these scores derived from HMMs (PANTHER/lib) together with an ontology of protein function (PANTHER/X - a simplified form of GO) to make predictions.
PhD-SNP	26				gpcr.biocomp.unibo.it/cgi/predictors/PhD-SNP/PhD-SNP.cgi Uses a support vector machine with local sequence environment and a profile derived from a multiple sequence alignment
PMut	27, 45	(Y)			mmb.pcb.ub.es/PMut/ Uses PHD secondary structure and accessibility prediction (or observed if a structure is available), together with statistical potentials from Prosa-II to evaluate stability, mutation matrix scores, changes in amino acid properties, a sequence potential, PSSM, a conservation score and SwissProt annotations to train a neural network.
SDM	29, 71	Y			www-cryst.bioc.cam.ac.uk/~sdm/sdm.php Assesses stability using environment-specific substitution tables and local structural environment (secondary structure, solvent accessibility, H-bonds), together with functional information from the Catalytic Site Atlas and UniProt.
MutationAssessor	30				mutationassessor.org Uses 'combinatorial entropy optimization' (CEO) to look at sets of evolutionarily related proteins and find key functional residues to which it applies a conservation score.
LogRE / CanPredict	31, 39, 56				lpgws.nci.nih.gov/cgi-bin/GeneViewer.cgi [SEEMS NOT TO BE AVAILABLE] LogRE is a score calculated from a Hidden Markov Model for a substitution that is exploited by CanPredict
MAPP / ProPhylER	32, 33				mendel.stanford.edu/sidowlab/downloads/MAPP/ [DOWNLOADABLE SOFTWARE] www.prophyler.org [SEEMS NOT TO BE AVAILABLE] ProPhylER uses the MAPP score, which takes data from a multiple alignment and converts a position in the alignment to a vector describing the importance of 6 physicochemical properties (hydropathy, polarity, charge, volume, and free energy in alpha-helices and in beta-strands)
SuSPect	34, 35, 77				www.sbg.bio.ic.ac.uk/servers/suspect/ Concentrates on stability, interfaces and protein network information
SNP@Domain	37		H		http://snpnavigator.net/ [SEEMS NOT TO BE AVAILABLE]
FOLD-X	38	Y			foldxsuite.crg.eu/ FOLD-X is an online force-field for calculating energy - it has been widely used for calculating stability changes on mutation.
PoPMuSiC	40, 41, 42	(Y)			dezyme.com/en/Software
VEP	49				www.ensembl.org/info/docs/tools/vep/ Links ENSEMBL to variant effect predictors (currently SIFT and PolyPhen-2)
BONGO	46				www.bongo.cl.cam.ac.uk/Bongo2/Bongo.htm [NOT AVAILABLE] A protein structure is converted to a graph, based on its amino acid interactions. Those residues of key importance for structural stability are determined by these interactions. The substituted amino acids are modelled and the impact of the change determined based on the changes in the network.
HANSA	47				hansa.cdfd.org.in:8080/ Combines 10 different properties of substitutions to partition disease and neutral mutations: 6 features related to the specific position of the mutation and probabilities of the amino acids; 2 features of protein structural environment; 2 features based on likelihood of the amino acid substitutions.
Parepro	48				www.mobioinfor.cn/parepro/ Three attributes are characterised from homologues collected using PSI-BLAST: (i) property differences between the ‘new’ amino acid and those in the alignment; (ii) the distribution of amino acids at the position; (iii) the sequence environment (upstream and downstream amino acids)
transFIC	N/A				bg.upf.edu/transfic/home Exploits Functional Impact Scores with SIFT, PolyPhen-2 and MutationAssessor to score cancer mutations
[Westhead]	59			NS	Evaluates two machine learning methods in prediction from sequence
[Cui]	50	Y		NS	compbio.utmem.edu/snp/dataset/ [NO LONGER AVAILABLE] Evaluates two machine learning methods and uses structural information from homologues and sequence profiles from multiple alignment
[Kohane]	51			NS	Uses Bayesian methods using frequency data and hydrophobicity on some specific datasets
CHASM	52			NS	Cancer-specific High-throughput Annotation of Somatic Mutations. Uses a random forest to identify driver mutations in cancer.
B-SIFT	62			NS	A modified version of SIFT which is able to identify both deleterious and a subset of activating mutations given a protein sequence and a query mutation within that sequence
[Baker]	65	Y		NS	Uses classification-tree and logistic-regression machine learning methods with solvent accessibility, Cβ density and SIFT scores.
SNPdryad	75				snps.ccbr.utoronto.ca:8080/SNPdryad/ Uses only protein orthologs in building a multiple sequence alignment to derive a novel conservation scoring scheme with a Random Forest classifier.
STRUM	78		Y		http://zhanglab.ccmb.med.umich.edu/STRUM/ Predicts stability changes caused by single-point mutations. Starting from wild-type sequences, 3D models are constructed using I-TASSER and physics- and knowledge-based energy functions derived from the I-TASSER models are used for machine learning.
Provean	79, 80, 81				http://provean.jcvi.org/index.php A fast approach to predict the effects of both amino acid substitutions and indels.
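
Several entries in the table (e.g. Condel) are meta-predictors that combine the outputs of other tools as a weighted average. The sketch below shows that general idea only; it is not Condel's actual weighting scheme. The function name, the example weights and the assumption that every input score has already been rescaled to 0-1 (higher = more likely deleterious) are all hypothetical.

```python
from typing import Mapping


def weighted_consensus(scores: Mapping[str, float],
                       weights: Mapping[str, float]) -> float:
    """Weighted average of per-tool scores.

    Only tools that have both a score and a weight contribute, so a
    missing prediction from one tool does not bias the result toward 0.
    """
    used = [name for name in scores if name in weights]
    total_weight = sum(weights[name] for name in used)
    if total_weight == 0:
        raise ValueError("no weighted predictor scores supplied")
    return sum(scores[name] * weights[name] for name in used) / total_weight


# Illustrative usage with made-up scores and weights
example = weighted_consensus(
    {"tool_a": 1.0, "tool_b": 0.0},
    {"tool_a": 3.0, "tool_b": 1.0},
)
```

Renormalising by the weights actually used is one simple way to cope with tools that return no prediction for a given variant; real meta-predictors may handle this differently.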

Reviews etc

Ref 2 has a useful comparison of some of the resources in Table 1

Refs 43, 44, 63, 66, 68 are extensive reviews

Refs 55 and 61 are reviews of methods used for cancer mutations

References

[1] Yue, P., Melamud, E. and Moult, J. (2006) SNPs3D: candidate gene and SNP selection for association studies. BMC Bioinformatics, 7:166-166.

[2] Uzun, A., Leslin, C.M., Abyzov, A. and Ilyin, V. (2007) Structure SNP (StSNP): a web server for mapping and modeling nsSNPs on protein structures with linkage to metabolic pathways. Nucleic Acids Res, 35:W384-W392.

[3] Yip, Y.L., Scheib, H., Diemand, A.V., Gattiker, A., Famiglietti, L.M., Gasteiger, E. and Bairoch, A. (2004) The SwissProt variant page and the ModSNP database: a resource for sequence and structure information on human protein variants. Hum Mutat, 23:464-470.

[4] Dantzer, J., Moad, C., Heiland, R. and Mooney, S. (2005) MutDB services: Interactive structural analysis of mutation data. Nucleic Acids Res, 33:W311-W314.

[5] Karchin, R., Diekhans, M., Kelly, L., Thomas, D.J., Pieper, U., Eswar, N., Haussler, D. and Sali, A. (2005) LS-SNP: large-scale annotation of coding non-synonymous SNPs based on multiple information sources. Bioinformatics, 21:2814-2820.

[6] Stitziel, N.O., Binkowski, T.A., Tseng, Y.Y., Kasif, S. and Liang, J. (2004) topoSNP: a topographic database of non-synonymous single nucleotide polymorphisms with and without known disease association. Nucleic Acids Res, 32:D520-D522.

[7] Reumers, J., Schymkowitz, J., Ferkinghoff-Borg, J., Stricher, F., Serrano, L. and Rousseau, F. (2005) SNPeffect: a database mapping molecular phenotypic effects of human non-synonymous coding SNPs. Nucleic Acids Res, 33:D527-D532.

[8] Lei Bao, Mi Zhou, and Yan Cui. (2005) nsSNPAnalyzer: identifying disease-associated nonsynonymous single nucleotide polymorphisms. Nucleic Acids Res, 33:W480-W482.

[9] Boris Reva, Yevgeniy Antipin, and Chris Sander. (2011) Predicting the functional impact of protein mutations: Application to cancer genomics. Nucleic Acids Res, 39:e118-e118.

[10] Jana Marie Schwarz, Christian Rödelsperger, Markus Schuelke, and Dominik Seelow. (2010) MutationTaster evaluates disease-causing potential of sequence alterations. Nature Methods, 7:575-576.

[11] Bromberg,Y. and Rost,B. (2007) SNAP: predict effect of non-synonymous polymorphisms on function. Nucleic Acids Res., 35, 3823-3835.

[12] Yana Bromberg, Guy Yachdav, and Burkhard Rost. (2008) SNAP predicts effect of mutations on protein function. Bioinformatics, 24:2397-2398.

[13] Abel González-Pérez and Nuria López-Bigas. (2011) Improving the assessment of the outcome of nonsynonymous SNVs with a consensus deleteriousness score, Condel. Am J Hum Genet, 88:440-449.

[14] Hashem A. Shihab, Julian Gough, David N. Cooper, Peter D. Stenson, Gary L. A. Barker, Keith J. Edwards, Ian N. M. Day, and Tom R. Gaunt. (2013) Predicting the functional, molecular, and phenotypic consequences of amino acid substitutions using hidden Markov models. Hum Mutat, 34:57-65.

[15] Nouf S Al-Numair and Andrew C R Martin. (2013) The SAAP pipeline and database: tools to analyze the impact and predict the pathogenicity of mutations. BMC Genomics, 14(3):1-11.

[16] Biao Li, Vidhya G. Krishnan, Matthew E. Mort, Fuxiao Xin, Kishore K. Kamati, David N. Cooper, Sean D. Mooney, and Predrag Radivojac. (2009) Automated inference of molecular mechanisms of disease from amino acid substitutions. Bioinformatics, 25:2744-2750.

[17] Martin Kircher, Daniela M. Witten, Preti Jain, Brian J. O’Roak, Gregory M. Cooper, and Jay Shendure. (2014) A general framework for estimating the relative pathogenicity of human genetic variants. Nature Genetics, 46:310-315.

[18] Remo Calabrese, Emidio Capriotti, Piero Fariselli, Pier Luigi Martelli, and Rita Casadio. (2009) Functional annotations improve the predictive score of human disease-related mutations in proteins. Human Mutation, 30:1237-1244.

[19] Pauline C. Ng and Steven Henikoff. (2003) SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res, 31:3812-3814.

[20] Ramensky V, Bork P, Sunyaev S (2002) Human non-synonymous SNPs: server and survey. Nucleic Acids Res 30:3894-3900

[21] Ivan A. Adzhubei, Steffen Schmidt, Leonid Peshkin, Vasily E. Ramensky, Anna Gerasimova, Peer Bork, Alexey S. Kondrashov, and Shamil R. Sunyaev. (2010) A method and server for predicting damaging missense mutations. Nature Methods, 7:248-249.

[22] Ivan A. Adzhubei, Daniel M. Jordan, and Shamil R. Sunyaev. (2013) Predicting functional effect of human missense mutations using PolyPhen-2. Curr Protoc Hum Genet, 76:7.20.

[23] Paul D. Thomas, Michael J. Campbell, Anish Kejariwal, Huaiyu Mi, Brian Karlak, Robin Daverman, Karen Diemer, Anushya Muruganujan, Apurva Narechania (2003) PANTHER: A Library of Protein Families and Subfamilies Indexed by Function. Genome Res. 13(9):2129-2141.

[24] Paul D. Thomas and Anish Kejariwal (2004) Coding single-nucleotide polymorphisms associated with complex vs. Mendelian disease: Evolutionary evidence for differences in molecular effects PNAS 101:15398-15403

[25] Liam R Brunham, Roshni R Singaraja, Terry D Pape, Anish Kejariwal, Paul D Thomas, Michael R Hayden (2005) Accurate Prediction of the Functional Significance of Single Nucleotide Polymorphisms and Mutations in the ABCA1 Gene. PLoS Genet 1(6): e83. doi: 10.1371/journal.pgen.0010083

[26] Capriotti, E., Calabrese, R. & Casadio, R. (2006) Predicting the insurgence of human genetic diseases associated to single point protein mutations with support vector machines and evolutionary information. Bioinformatics, 22:2729-2734.

[27] Ferrer-Costa C, Gelpí JL, Zamakola L, Parraga I, de la Cruz X, Orozco M. (2005) PMUT: a web-based tool for the annotation of pathological mutations on proteins. Bioinformatics. 21:3176-8.

[28] Hurst, J.M., McMillan, L.E.M., Porter, C.T., Allen, J., Fakorede, A. and Martin, A.C.R. (2009) The SAAPdb web resource: a large scale structural analysis of mutant proteins, Human Mutation, 30:616-624.

[29] Worth CL, Preissner R, & Blundell TL. (2011) SDM - a server for predicting effects of mutations on protein stability and malfunction. Nucleic Acids Res. 39:W215-22.

[30] Reva, B., Antipin, Y., and Sander, C. (2007). Determinants of protein function revealed by combinatorial entropy optimization. Genome Biol. 8, R232.

[31] Clifford, R.J., Edmonson, M.N., Nguyen, C., & Buetow, K.H. (2004). Large-scale analysis of non-synonymous coding region single nucleotide polymorphisms. Bioinformatics 20:1006-1014.

[32] Stone, E.A., and Sidow, A. (2005). Physicochemical constraint violation by missense substitutions mediates impairment of protein function and disease severity. Genome Res. 15, 978-986.

[33] Binkley, J., Karra, K., Kirby, A., Hosobuchi, M., Stone, E.A., & Sidow, A. (2010). ProPhylER: A curated online resource for protein function and structure based on evolutionary constraint analyses. Genome Res. 20:142-154.

[34] Yates, C.M. & Sternberg, M.J. (2013) Proteins and Domains Vary in Their Tolerance of Non-Synonymous Single Nucleotide Polymorphisms (nsSNPs) J. Mol. Biol. 425, 1274-1286

[35] Yates, C.M. & Sternberg, M.J. (2013) The Effect of Non-Synonymous Single Nucleotide Polymorphisms (nsSNPs) on Protein-Protein Interactions. J. Mol. Biol. 425:3949-63

[36] Emidio Capriotti, Remo Calabrese, Piero Fariselli, Pier Luigi Martelli, Russ B Altman, Rita Casadio (2013) WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation BMC Genomics 14(Suppl 3):S6

[37] Chasman,D. and Adams,R.M. (2001) Predicting the functional consequences of non-synonymous single nucleotide polymorphisms: structure-based assessment of amino acid variation. J. Mol. Biol., 307:683-706.

[38] Schymkowitz, J., Borg, J., Stricher, F., Nys, R., Rousseau, F., Serrano, L. (2005) The FoldX web server: an online force field. Nucleic Acids Res. 33:W382-8.

[39] Kaminker J.S., Zhang Y., Waugh A., Haverty P.M., Peters B., Sebisanovic D., Stinson J., Forrest W.F., Bazan J.F., Seshagiri S., Zhang Z. (2007). Distinguishing cancer-associated missense mutations from common polymorphisms. Cancer Research 67, 465-73.

[40] Dehouck Y, Grosfils A, Folch B, Gilis D, Bogaerts Ph, Rooman M. (2009) Prediction of protein stability changes upon mutations using statistical potentials and neural networks: PoPMuSiC 2.0. Bioinformatics 25:2537-2543

[41] Dehouck Y, Kwasigroch JM, Gilis D, Rooman M. (2011) PoPMuSiC 2.1: a web server for the estimation of protein stability changes upon mutation and sequence optimality. BMC Bioinformatics 12:151

[42] Gonnelli G, Rooman M, Dehouck Y. (2012) Structure-based mutant stability predictions on proteins of unknown structure. Journal of Biotechnology 161:287-293

[43] Casandra Riera, Sergio Lois and Xavier de la Cruz (2014) 'Prediction of pathological mutations in proteins: the challenge of integrating sequence conservation and structure stability principles', WIREs Computational Molecular Science, 4(3):249-268.

[44] Tavtigian, S.V., Greenblatt, M.S., Lesueur, F. and Byrnes, G.B. for the IARC Unclassified Genetic Variants Working Group (2008) In silico analysis of missense substitutions using sequence-alignment based methods. Hum Mutat. 29: 1327-1336

[45] Ferrer-Costa, C., Orozco, M., de la Cruz, X. (2004) Sequence-Based Prediction of Pathological Mutations. PROTEINS: Structure, Function, and Bioinformatics 57:811-819

[46] Cheng T.M.K., Lu Y-E, Vendruscolo M., Lio P., Blundell T.L. (2008) Prediction by graph theoretic measures of structural effects in proteins arising from non-synonymous single nucleotide polymorphisms. PLoS Comp. Biology. 4 (7) e1000135.

[47] Acharya V. and Nagarajaram H.A. (2012) Hansa: An automated method for discriminating disease and neutral human nsSNPs. Human Mutation 33(2):332-337.

[48] Tian J., Wu N., Guo X., Guo J., Zhang J., Fan Y. (2007) Predicting the phenotypic effects of non-synonymous single nucleotide polymorphisms based on support vector machines. BMC Bioinformatics 8:450-464.

[49] McLaren W, Pritchard B, Rios D, Chen Y, Flicek P and Cunningham F. (2010) Deriving the consequences of genomic variants with the Ensembl API and SNP Effect Predictor. Bioinformatics 26:2069-70

[50] Bao,L. and Cui,Y. (2005) Prediction of the phenotypic effects of non-synonymous single nucleotide polymorphisms using structural and evolutionary information. Bioinformatics, 21, 2185-2190.

[51] Cai,Z., Tsung,E.F., Marinescu,V.D., Ramoni,M.F., Riva,A. and Kohane, I.S. (2004) Bayesian approach to discovering pathogenic SNPs in conserved protein domains. Hum. Mutat., 24, 178-184.

[52] Carter,H., Chen,S., Isik,L., Tyekucheva,S., Velculescu,V.E., Kinzler,K.W., Vogelstein,B. and Karchin,R. (2009) Cancer-specific high-throughput annotation of somatic mutations: computational prediction of driver missense mutations. Cancer Res., 69, 6660-6667.

*[53] Chan,P.A. et al. (2007) Interpreting missense variants: comparing computational methods in human disease genes CDKN2A, MLH1, MSH2, MECP2, and tyrosinase (TYR). Hum. Mutat., 28, 683-693.

*[54] Cooper,G.M. et al. (2005) Distribution and intensity of constraint in mammalian genomic sequence. Genome Res., 15, 901-913.

[55] Hon,L.S. et al. (2008) Computational approaches for predicting causal missense mutations in cancer genome projects. Curr. Bioinformatics, 3, 46-55.

[56] Kaminker,J.S. et al. (2007a) CanPredict: a computational tool for predicting cancer-associated missense mutations. Nucleic Acids Res., 35, W595-W598. [WITH 39]

[57] Karchin,R. (2009) Next generation tools for the annotation of human SNPs. Brief Bioinformatics, 10, 35-52.

[58] Karchin,R., Kelly,L. and Sali,A. (2005) Improving functional annotation of non-synonymous SNPs with information theory. In Klein,T.E., Hunter,L., Dunker,A.K., Jung,T. and Altman,R.B. (eds), Proceedings of the Pacific Symposium on Biocomputing 2005 (PSB 2005), January 4-8, Hawaii, USA.

[59] Krishnan,V.G. and Westhead,D.R. (2003) A comparative study of machine-learning methods to predict the effects of single nucleotide polymorphisms on protein function. Bioinformatics, 19, 2199-2209.

*[60] Kulkarni,V. et al. (2008) Exhaustive prediction of disease susceptibility to coding base changes in the human genome. BMC Bioinformatics, 9(Suppl. 9), S3.

[61] Lee,W., Yue,P. and Zhang,Z. (2009) Analytical methods for inferring functional effects of single base pair substitutions in human cancers. Hum.Genet., 126, 481-498.

[62] Lee,W., Zhang,Y., Mukhyala,K., Lazarus,R.A. and Zhang,Z. (2009) Bi-directional SIFT predicts a subset of activating mutations. PLoS One, 4, e8311.

[63] Mooney,S. (2005) Bioinformatics approaches and resources for single nucleotide polymorphism functional analysis. Brief. Bioinform., 6, 44-56.

[64] Ng,P.C. and Henikoff,S. (2001) Predicting deleterious amino acid substitutions. Genome Res., 11, 863-874.

[65] Saunders,C.T. and Baker,D. (2002) Evaluation of structural and evolutionary contributions to deleterious mutation prediction. J. Mol. Biol., 322, 891-901

[66] Steward,R.E. et al. (2003) Molecular basis of inherited diseases: a structural perspective. Trends Genet., 19, 505-513.

[67] Sunyaev,S.R., Ramensky,V., Koch,I., Lathe,W.,3rd, Kondrashov,A.S. and Bork,P. (2001) Prediction of deleterious human alleles. Hum. Mol. Genet., 10, 591-597.

[68] Teng,S., Michonova-Alexova,E. and Alexov,E. (2008) Approaches and resources for prediction of the effects of non-synonymous single nucleotide polymorphism on protein function and interactions. Curr. Pharm. Biotechnol., 9, 123-133.

*[69] Thomas,P.D. and Kejariwal,A. (2004) Coding single-nucleotide polymorphisms associated with complex vs. Mendelian disease: evolutionary evidence for differences in molecular effects. Proc. Natl Acad. Sci. USA, 101, 15398-15403.

[70] Wang,Z. and Moult,J. (2001) SNPs, protein structure, and disease. Hum. Mutat., 17, 263-270.

[71] Worth CL, Bickerton GR, Schreyer A, Forman JR, Cheng TM, Lee S, Gong S, Burke DF, Blundell TL. 2007. A structural bioinformatics approach to the analysis of nonsynonymous single nucleotide polymorphisms (nsSNPs) and their relation to disease. J Bioinform Comput Biol 5:1297-1318.

[72] Yue P, Li Z, Moult J. 2005. Loss of protein structure stability as a major causative factor in monogenic disease. J Mol Biol 353:459-473.

[73] Yue,P. and Moult,J. (2006) Identification and analysis of deleterious human SNPs. J. Mol. Biol., 356, 1263-1274.

[75] Wong, K-C. and Zhang, Z. (2014) 'SNPdryad: Predicting Deleterious Non-synonymous Human SNPs Using Only Orthologous Protein Sequences' Bioinformatics 30: 1112-1119.

*[76] Juritz, E., Fornasari, M.S., Martelli, P.L., Fariselli, P., Casadio, R. and Parisi, G. (2012) 'On the effect of protein conformation diversity in discriminating among neutral and disease related single amino acid substitutions' BMC Genomics 13(Suppl 4):S5

[77] Yates, C.M., Filippis, I., Kelley, L.A. and Sternberg, M.J.E. (2014) 'SuSPect: Enhanced Prediction of Single Amino Acid Variant (SAV) Phenotype Using Network Features' J. Mol. Biol., 426, 2692-2701.

[78] Quan, L., Lv, Q., Zhang, Y. (2016) 'STRUM: structure-based prediction of protein stability changes upon single-point mutation', Bioinformatics 32:2936-2946

[79] Choi Y, Sims GE, Murphy S, Miller JR, Chan AP (2012) 'Predicting the Functional Effect of Amino Acid Substitutions and Indels', PLoS ONE 7(10): e46688.

[80] Choi Y (2012) 'A Fast Computation of Pairwise Sequence Alignment Scores Between a Protein and a Set of Single-Locus Variants of Another Protein', In Proceedings of the ACM Conference on Bioinformatics, Computational Biology and Biomedicine (BCB '12). ACM, New York, NY, USA, 414-417.

[81] Choi Y, Chan AP (2015) 'PROVEAN web server: a tool to predict the functional effect of amino acid substitutions and indels', Bioinformatics 31(16): 2745-2747.

* Reference does not appear in the table above