The MULTI command allows multiple structures to be
read in for fitting. The filename specified for MULTI is a
`file of files', i.e. it contains a list of the filenames to be
read.
MULTI is used in place of REFERENCE and MOBILE to
read in a set of structure files. The first structure file is used as
a reference set for the first fitting stage, but the coordinates are
averaged after each fitting stage to derive an averaged template used
for subsequent fitting.
For example, given a set of
files to fit, file 2 is fitted to file 1 and an
averaged structure is calculated; file 3 is then fitted to
that average and a new average is calculated. This continues until all
structures have been fitted. The whole procedure iterates until
convergence (typically 3 or 4 cycles).
ProFit V3.0 changes the default method of calculating the average template. As each new mobile structure is added, the degree of change in the averaged structure is inversely proportional to the total number of mobile structures. Consequently, outlying structures should have less effect on the averaged reference structure.
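The averaging scheme described above can be sketched as a running mean. The text does not give the formula ProFit uses, so the update below is an assumption: it is the standard incremental mean, in which the shift applied to the average is the difference from the new structure divided by the number of structures included so far. The function name update_average and the list-of-tuples coordinate layout are illustrative only.

```python
def update_average(average, new_coords, n_structures):
    """Fold one more fitted structure into a running average.

    average, new_coords: lists of (x, y, z) tuples of equal length.
    n_structures: number of structures in the average once
                  new_coords has been included.
    Assumption: a plain incremental mean; ProFit's actual update
    rule is not stated in the text.
    """
    if len(average) != len(new_coords):
        raise ValueError("coordinate sets must have the same length")
    updated = []
    for (ax, ay, az), (nx, ny, nz) in zip(average, new_coords):
        # The shift shrinks as 1/n, so later structures move the average less.
        updated.append((
            ax + (nx - ax) / n_structures,
            ay + (ny - ay) / n_structures,
            az + (nz - az) / n_structures,
        ))
    return updated
```

With this rule the tenth structure moves the template by one tenth of its deviation from the current average, which is one way an outlier added late would have limited influence.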
Normally, the coordinates of the first structure in the MULTI list
are taken as the starting point for the averaged reference structure. It is
possible, however, to select another mobile structure as the initial reference
structure using the SETREF command.
For example, SETREF 3 will use the third mobile structure as the
reference structure.
If no structure number is
specified, then the SETREF command carries out an all vs. all
comparison and the coordinates
of the mobile structure with the least overall RMSD to all the other
mobile structures are selected as the initial reference structure.
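The selection rule used when SETREF is given no structure number can be illustrated with a small sketch. It assumes the all vs. all RMSDs are already available as a square matrix; the function name pick_reference, the matrix layout and the tie-breaking behaviour (first lowest total wins) are assumptions made for illustration, not a description of ProFit's internals.

```python
def pick_reference(rmsd):
    """Return the 1-based number of the structure with the lowest
    summed RMSD to all the other structures.

    rmsd: square list of lists, rmsd[i][j] = RMSD between
          structures i and j (diagonal entries are ignored).
    """
    totals = [
        sum(value for j, value in enumerate(row) if j != i)
        for i, row in enumerate(rmsd)
    ]
    # list.index returns the first minimum, so ties go to the earlier structure.
    return totals.index(min(totals)) + 1
```

For three structures with pairwise RMSDs of 1.0 (1 vs. 2), 2.0 (1 vs. 3) and 1.5 (2 vs. 3), the totals are 3.0, 2.5 and 3.5, so structure 2 would be chosen.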
Multiple structures can be fitted with either the FIT or
ORDERFIT command. The ORDERFIT command (new in V3.0) will
perform the multiple structure fit in a similar manner to the FIT
command but fitting the most similar structures first. As the averaged
template is updated with each new structure fitted, the order of fitting
has a (small) influence on the template. The ORDERFIT command (possibly
along with the SETREF command) can provide a standardized fitting scheme.
Progress and RMSDs are reported at each iteration unless the QUIET
command is used.
By default, RMSDs, pairwise distances and transformation matrices are given in
relation to the first mobile structure. The MULTREF command will set
ProFit to give results in relation to the averaged reference structure rather
than the first mobile structure (MULTREF OFF restores the
default behaviour).
The resulting fitted files are written with the MWRITE command.
Note that there is no ``reference'' set in the sense used for normal
2-structure fitting; fitted versions of all
the files will be written
since the reference set is actually an averaged template used purely as a
guide for fitting.
The averaged template can be written to a file using the WRITE
REF command. However, as it is a simple numerical average of the Cartesian
coordinates, the reference structure generated by ProFit should be treated
with caution as a representation of an actual geometry/conformation
accessible to the structure.
When the MWRITE command is used, the output filenames are the
same as the input files, but with the extension replaced by that
specified in the MWRITE command. If no extension is specified,
then `.fit' will be used. If the input structure files contained no
extension, then the extension specified will be appended to the
filenames.
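The naming rule for MWRITE output can be sketched as follows. This is an illustration of the rule as described, not ProFit's own code: the function name fitted_name is hypothetical, and it assumes that the extension is everything after the last dot in the final path component and that a leading dot on the requested extension is optional.

```python
import os


def fitted_name(input_path, extension="fit"):
    """Build the output filename for a fitted structure.

    The existing extension (if any) is replaced; if the input has
    no extension, the new one is appended.
    Assumption: a leading '.' on `extension` is accepted and ignored.
    """
    ext = extension.lstrip(".")
    root, _old_ext = os.path.splitext(input_path)
    return root + "." + ext
```

Because only the extension changes, the result always lies in the same directory as the input file.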
Note that since only the extension is changed when writing back the fitted files, you must have permission to write to the directory from which the original files were read.
Multiple-structure fitting is particularly effective in combination
with the ITERATE command (see Section 8.4)
which refines the fitting zones iteratively. This can lead to
extremely good multiple-structure fits.
Note that multiple structure fitting and zone iteration can be very slow as these have been added to the earlier pair-wise fitting engine. An increase in speed needs a complete re-design of the code.
Currently, the ZONE command may only be used with multiple structure fitting when the same zone specification can be applied to every structure, i.e. you cannot specify a separate zone for each structure by separating the zones with a colon (:).
Thus, the following are legal zones:
    ZONE 20-30
    ZONE C,3
while the following are not:
    ZONE 24-34:25-35
    ZONE CAR:VNS
    ZONE 24-34:EIR,11
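The restriction above amounts to a simple test on the zone specification: anything containing a colon gives separate zones for reference and mobile and so cannot be used here. A minimal sketch of that test follows; the function name usable_with_multi is hypothetical, and the check looks only for the colon, not at whether the rest of the specification is valid.

```python
def usable_with_multi(zone_spec):
    """Return True if a ZONE argument applies one zone to every structure.

    Only the colon separator is checked; the specification is not
    otherwise validated.
    """
    return ":" not in zone_spec
```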
For normal use, it is recommended that the ALIGN, TRIMZONES and
READALIGNMENT commands (possibly in conjunction with the
LIMIT command) are used for specifying zones when fitting
multiple structures.
As of ProFit V3.0, the TRIMZONES command can be used in conjunction with
the ALIGN command. The ALIGN command performs a pairwise
alignment for each of the mobile structures with the reference.
Although fitting each mobile
using an individualized set of zones offers the best fitting for each mobile
to the reference, there may be times when a like vs. like comparison is
required. If the number of residues used for fitting varies, RMS
deviation cannot be directly compared between structures.
To allow for a like vs. like comparison, the TRIMZONES command resets
the fitting zones for each mobile structure to include only fitting residues
that are common to all the mobile structures. Thus, by ensuring that the
fitting zones are the same for each mobile, the TRIMZONES command
allows for a like vs. like comparison.
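The effect of TRIMZONES can be pictured as a set intersection. The sketch below assumes that each mobile's fitting zones have been reduced to the list of reference positions it is fitted against; that representation and the function name common_positions are assumptions made for illustration and say nothing about how ProFit stores zones internally.

```python
def common_positions(per_mobile_positions):
    """Return the sorted positions that appear in every mobile's zones.

    per_mobile_positions: one iterable of reference positions
                          per mobile structure.
    """
    sets = [set(positions) for positions in per_mobile_positions]
    if not sets:
        return []
    return sorted(set.intersection(*sets))
```

If three mobiles are fitted on positions 1-4, 2-5 and 3, 4, 6 respectively, only positions 3 and 4 are common to all, so every RMSD would then be calculated over the same two positions.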
When using READALIGNMENT with
multiple structures, the first sequence must appear twice in
the alignment file. This is because it is used as both the first
reference and mobile set.
Note that a bug in using READALIGNMENT with multiple
structure fitting was fixed in V2.3. (The bug caused the program
to crash if a deletion appeared in the same place in two or
more of the sequences.)
As of ProFit V3.0, it is possible to perform an all versus all comparison of
the mobile structures when fitting multiple structures. The ALLVSALL
command requires that the fitting zones set are identical for all mobile
structures and automatically resets the fitting zones using the
TRIMZONES command.
Results are presented as tab-delimited text suitable for loading into a
spreadsheet. If the optional filename parameter is given, output is
directed to the specified file. If a filename is not specified, or the file
cannot be opened, output appears on the screen. If the filename begins with
a pipe character (|), the results are piped into the specified program.
This is particularly useful with the more (or less) Unix
command.
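The rules for where the results go can be summarised in a short sketch. The function name classify_destination and its return values are hypothetical; the sketch covers only the decision made from the filename itself, and the fallback to the screen when a file cannot be opened would have to be handled at the point where the file is actually opened.

```python
def classify_destination(name):
    """Decide where results should be sent, based on the filename.

    Returns a (kind, target) pair:
      ("screen", None)    no filename given
      ("pipe", program)   filename begins with '|'
      ("file", filename)  anything else
    """
    if not name:
        return ("screen", None)
    if name.startswith("|"):
        # Everything after the pipe character is the program to run.
        return ("pipe", name[1:].strip())
    return ("file", name)
```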