The ZONE command is used to specify zones in the two structures
which are considered equivalent. The complete syntax for the command
is:
ZONE CLEAR|((*|(X...[,n][/m])|(j-k))[:(*|(X...[,n][/m])|(j-k))])
where
X... is an amino acid sequence, n is a number of
residues, m is the occurrence number, j and k are
residue specifications of the form [chain][.]resnum[insert]. Items
in square brackets are optional and alternatives are marked by a
| and grouped in parentheses.
ZONE commands are cumulative. Thus each zone you specify is
added to those currently active. To clear all zones (i.e. fit all
residues), the ZONE CLEAR or ZONE * command may be
given. To clear a single zone, the DELZONE command can be used.
When a new zone is added, a warning message is displayed if the new zone
overlaps an existing zone. Overlapping zones will be flagged with *
when using the STATUS command.
Although it appears complex, the syntax is actually very simple and consists of two identical sections separated by a colon (:). The left half is applied to the reference structure and the right half to the mobile structure. In its simplest form, the right hand half of the expression is absent and the specification is applied to both reference and mobile structures. For example:
ZONE 24-34
will set the zone to include residues 24-34 in both structures. If you wanted to fit 24-34 in the reference structure with 25-35 in the mobile structure, this simply becomes:
ZONE 24-34:25-35
You may also specify chain names and insertion codes. The chain name is placed before the residue number and the insertion code afterwards. For example:
ZONE L25A-L30
fits residues 25A-30 in the L chain of both structures. Optionally, the chain name may be separated from the residue number using a full stop. For example:
ZONE L.25A-L.30
Using the full stop also makes the statement case-sensitive. In practice, the full stop separator is used with numeric chain names, to separate the chain name from the residue number, and with lowercase chain names. For example:
ZONE 1.25-1.30
ZONE b.1-b.60:A.1-A.60
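The residue specification form [chain][.]resnum[insert] can be illustrated with a short Python sketch. This is not ProFit's own parser: the function name parse_residue is hypothetical, chain names and insertion codes are assumed to be single characters, and folding an undotted chain name to upper case is an assumption drawn from the remark that the full stop makes the statement case-sensitive.

```python
import re

# Assumed grammar for one residue specification: [chain][.]resnum[insert].
# Chain names and insertion codes are taken to be single characters here.
_RESIDUE = re.compile(
    r"^(?:(?P<dotted>[A-Za-z0-9])\.|(?P<plain>[A-Za-z]))?"
    r"(?P<num>\d+)(?P<insert>[A-Za-z])?$"
)


def parse_residue(spec):
    """Split a residue specification into (chain, resnum, insert)."""
    match = _RESIDUE.match(spec)
    if match is None:
        raise ValueError(f"not a residue specification: {spec!r}")
    if match.group("dotted") is not None:
        # Full stop present: keep the chain name exactly as typed.
        chain = match.group("dotted")
    elif match.group("plain") is not None:
        # Assumption: without the full stop the chain name is not
        # case-sensitive, modelled here by folding to upper case.
        chain = match.group("plain").upper()
    else:
        chain = ""
    return chain, int(match.group("num")), match.group("insert") or ""


for spec in ("L25A", "L.30", "1.25", "b.60", "34"):
    print(spec, parse_residue(spec))
```

The full stop is what lets a numeric chain name be told apart from the residue number: 1.25 is read as chain 1, residue 25, whereas 125 is read as residue 125 with no chain.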
Simple wildcards may also be used. For example
ZONE H*:B*
fits the reference H chain with the mobile B chain,
ZONE -10:50-59
fits from the first residue to residue 10 in the reference structure with 50-59 in the mobile structure.
ZONE *:1-100
fits all residues in the reference structure with 1-100 in the mobile structure.
If the structure file contains negatively numbered residues and you are using residue numbering, you can escape the minus sign in the residue number using a backslash:
ZONE \-4-10:\-1-13
will fit residues -4 to 10 in the reference structure with residues -1 to 13 in the mobile structure.
Alternatively, you may specify the zones to be fitted by giving a sequence fragment. Together with that fragment, you may specify the number of residues to consider starting at that point. If the fragment occurs more than once in the sequence, you may specify which occurrence you wish to consider. For example:
ZONE CAR:VNS
fits the first occurrence of CAR in the reference set with the first occurrence of VNS in the mobile set;
ZONE CAR,10:VNS,10
fits 10 residues starting at the first occurrence of CAR in the reference set with 10 residues from the first occurrence of VNS in the mobile set;
ZONE CAR,5/2
fits 5 residues from the second occurrence of CAR in both structures;
ZONE 24-34:EIR,11
fits 24-34 in the reference set with 11 residues starting at the first occurrence of EIR in the mobile set.
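How a sequence-fragment specification X...[,n][/m] might be resolved to a range of residues is sketched below in Python. The function name fragment_zone and the example sequence are hypothetical. Two behaviours are assumptions rather than documented ProFit behaviour: when n is omitted the zone is taken to span just the fragment, and overlapping occurrences of the fragment are counted separately.

```python
def fragment_zone(sequence, fragment, length=None, occurrence=1):
    """Return (first, last) 1-based sequential positions for X...[,n][/m]."""
    if length is None:
        # Assumption: with no residue count, the zone covers the fragment only.
        length = len(fragment)
    start = -1
    for _ in range(occurrence):
        # Search from just past the previous hit, so overlapping hits count.
        start = sequence.find(fragment, start + 1)
        if start < 0:
            raise ValueError(
                f"occurrence {occurrence} of {fragment!r} not found")
    first = start + 1                 # convert 0-based index to 1-based
    last = first + length - 1
    if last > len(sequence):
        raise ValueError("zone runs past the end of the sequence")
    return first, last


seq = "AACARGGCARTTVNSKLMNPQ"          # made-up sequence for illustration
print(fragment_zone(seq, "CAR"))        # like CAR
print(fragment_zone(seq, "CAR", 5, 2))  # like CAR,5/2
print(fragment_zone(seq, "VNS", 9))     # like VNS,9
```

The positions returned are sequential positions in the sequence string, not PDB residue numbers.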
By default, ProFit works in `Residue Number' mode, i.e. the numbers used in zone commands are the numbers seen in the PDB file. The alternative mode is `Sequential' mode where residues are numbered sequentially throughout the structure (including throughout multiple chains). Any chain names appearing in zone specifications will be ignored in Sequential mode. To switch mode, you use the NUMBER SEQUENTIAL or NUMBER RESIDUE commands.
The DELZONE command specifies zones to be deleted from the user-defined
list of fit zones. DELZONE uses the same syntax as the ZONE
command. The command matches the specified zone with a zone in the user-defined
list of fitting zones and deletes the matching zone from the list. As with the
ZONE command, entering either DELZONE CLEAR or DELZONE *
will delete all user-defined zones.
Zones may also be assigned automatically from a sequence alignment performed by the ALIGN
command. The sequence alignment is displayed, any currently active
fitting zones are cleared and replaced by zones derived from the
alignment.
Currently the ALIGN command may only be used if the structures
contain only one chain.
Additional zones may also be specified in the usual way.
Clearly, it will normally be necessary to use the ATOMS command
to specify that only backbone or Cα atoms are included in the
fitting.
The GAPPEN command allows you to specify an integer gap penalty
for the sequence alignment performed by the ALIGN command. The
default value is 5.
If you have an alignment performed outside ProFit you may use this to
specify the equivalent zones. Any previously defined fitting zones are
automatically cleared first. As with the ALIGN command, this
can currently only be used with structures having a single chain.
The alignment should be a file in PIR format using - characters to align the sequences. The two sequences are represented by separate entries, i.e. each must have a header of the form:
>P1;xxxxxx title text .......
If the PIR file contains multiple chains, it will be rejected. The first sequence is assumed to be that of the reference structure and the second that of the mobile structure. Any other sequences in the file are ignored.
The READALIGNMENT command is used to read in the PIR file.
When performing a multiple structure fit, the first sequence must appear twice in the sequence alignment file. This is because it is used as both the first reference and mobile set.
Note that a bug in using READALIGNMENT with multiple
structure fitting was fixed in V2.3. (The bug caused the program
to crash if a deletion appeared in the same place in two or
more of the sequences.)
When obtaining fit zones from a sequence alignment, either from
ALIGN or from READALIGNMENT, it can be useful to limit
the zones of residues used. Normally all aligned residue pairs will be
used.
For example, if the alignment were:
         1         2         3
123456789012345678901234567890123
ASAHSTGEHNM--PLELLGHISLAM---NPRTY
---HSTADHNLRTPLEVLG--SLAMEDRQPRTY
the zones would normally be taken from the following positions
in the alignment: 4-11, 14-19, 22-25, 29-33
By using the command:
LIMIT 20 28
only the zone from 22-25 would be included.
This is particularly useful in conjunction with the ITERATE command (Section 8.4) and when fitting multiple structures (Section 9).
The LIMIT OFF command restores the default behaviour of
deriving the zones from the whole alignment.
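The worked example above can be reproduced with a short Python sketch that collects runs of alignment positions where both sequences have a residue. The function name alignment_zones is hypothetical, and treating LIMIT as a simple window on alignment positions is an assumption that happens to match the example; it is not taken from the ProFit source.

```python
def alignment_zones(ref, mob, limit=None):
    """Return (first, last) runs of alignment positions with no gap in either
    sequence, optionally restricted to the window limit=(start, stop)."""
    if len(ref) != len(mob):
        raise ValueError("aligned sequences must have equal length")
    low, high = limit if limit is not None else (1, len(ref))
    zones = []
    start = None
    for col in range(1, len(ref) + 1):          # 1-based alignment positions
        paired = (ref[col - 1] != "-" and mob[col - 1] != "-"
                  and low <= col <= high)
        if paired and start is None:
            start = col                          # a new run begins
        elif not paired and start is not None:
            zones.append((start, col - 1))       # the run ended one column back
            start = None
    if start is not None:
        zones.append((start, len(ref)))          # run reaches the last column
    return zones


ref = "ASAHSTGEHNM--PLELLGHISLAM---NPRTY"
mob = "---HSTADHNLRTPLEVLG--SLAMEDRQPRTY"
print(alignment_zones(ref, mob))
print(alignment_zones(ref, mob, limit=(20, 28)))
```

The ranges are positions in the alignment, as in the example, rather than residue numbers in either structure.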
The ITERATE command switches on the iterative updating of
fitted zones during subsequent FIT commands. The ITERATE
command may be followed by an optional parameter to specify the cutoff
used to include or exclude pairs from the zones. (ITERATE OFF
is used to switch it off again.)
Currently the ITERATE command may only be used if the structures
contain only one chain.
Note that this immediately does an ATOMS CA since iteration of
zones is only performed on Cα atoms. The program gives an
informational message to this effect. See notes below if you want to
calculate an RMSd over other atoms.
After the initial fit on the specified zones, the zones are updated
such that residue pairs with Cα atoms within a specified cutoff
(default 3.0Å) are included and those more distant are excluded. The
optimum set of equivalences is obtained using a dynamic programming
method.
After updating the zones, the structures are refitted and the
procedure iterates to convergence of
Å, (typically 3 or 4
cycles). The RMSd on Cα atoms is shown after each cycle unless
the QUIET command is given.
You may specify a minimal initial zone of say 3 amino acids on which
to fit first. The zone iteration will expand the zones until as many
residues as possible can be equivalenced. Alternatively, this option
is particularly useful in conjunction with the ALIGN
command. Using ALIGN followed by ITERATE gives a
particularly convenient method of fitting two arbitrary structures.
As stated above, the ITERATE command implies
ATOMS CA. Having fitted on Cα atoms, you can of course
display the RMSd over other atom sets in the usual way using the
RATOMS command (e.g. RATOMS N,CA,C,O will display the
backbone RMSd).
Should you wish to refit on another atom set using the iterated zones,
simply use ITERATE OFF to switch off iteration, select the atom
set required using the ATOMS command and use FIT to
refit the structures in the usual way. For example, to fit on backbone
atoms:
ITERATE OFF
ATOMS N,CA,C,O
FIT
It is possible to define zones by flagging residues in the temperature
factor column of the PDB file using the BZONE command. Zones are marked
using positive whole numbers, while zeros are ignored. Multiple zones can be
marked using additional numbers.
Assignment of zones is carried out in two ways:
If only the reference structure is marked then the marked section will be added as a fitting zone in both the reference and mobile structure.
If both the reference and the mobile structure are marked then fitting zones are assigned by scanning through and setting zones for corresponding continuous stretches of flagged residues in either the reference or mobile structures.
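The first case, where only the reference structure is marked, amounts to grouping consecutive flagged residues into zones. The Python sketch below illustrates this under stated assumptions: the function name flagged_zones is hypothetical, the flags are supplied as one integer per residue in file order (reading them from the temperature factor column is not shown), and a change from one non-zero number to another is assumed to start a new zone.

```python
from itertools import groupby


def flagged_zones(flags):
    """Group consecutive residues that share a non-zero flag.

    Returns (first, last, flag) tuples using 1-based sequential positions.
    """
    zones = []
    position = 1
    for flag, run in groupby(flags):
        length = len(list(run))
        if flag > 0:
            # Zeros are ignored; positive whole numbers mark a zone.
            zones.append((position, position + length - 1, flag))
        position += length
    return zones


# One flag per residue: zone 1 at residues 2-4 and 9, zone 2 at residues 6-7.
print(flagged_zones([0, 1, 1, 1, 0, 2, 2, 0, 1]))
```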
The default method for fitting is to centre the fit around the centre of
geometry of the fit atoms. Fitting can be performed using a residue,
specified by the SETCENTRE (or SETCENTER) command,
as the centre of fitting rather than the centre of geometry of the fit atoms.
SETCENTRE CLEAR|(*|j[:j])
where j is a residue specification of the form [chain][.]resnum[insert]. Items in square brackets are optional and alternatives are marked by a
| and grouped in parentheses.
Entering SETCENTRE CLEAR or SETCENTRE * will clear the centre
residue.
The DISTCUTOFF command specifies a distance cutoff for ignoring atom
pairs outside a specified distance when calculating RMSd.
DISTCUTOFF [cutoff|ON|OFF]
Entering
DISTCUTOFF ON or DISTCUTOFF OFF will turn the distance cutoff on
or off. Entering DISTCUTOFF 2.5 will set the value of the distance
cutoff to 2.5 Angstroms and turn the distance cutoff on. A warning is displayed
if the distance cutoff is set to zero and turned on.
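The effect of a distance cutoff on the RMSd can be sketched in Python as follows. This illustrates the idea only, not ProFit's implementation: the function name rmsd_with_cutoff is hypothetical, the coordinates are assumed to be already superimposed and supplied as equivalent atom pairs, and treating a pair exactly at the cutoff as included is an assumption.

```python
import math


def rmsd_with_cutoff(ref_atoms, mob_atoms, cutoff=None):
    """RMSd over equivalent atom pairs, ignoring pairs beyond the cutoff.

    ref_atoms and mob_atoms are equal-length lists of (x, y, z) tuples.
    cutoff=None corresponds to the cutoff being switched off.
    """
    squared = []
    for a, b in zip(ref_atoms, mob_atoms):
        d2 = sum((p - q) ** 2 for p, q in zip(a, b))
        # Assumption: a pair lying exactly at the cutoff is still counted.
        if cutoff is None or d2 <= cutoff ** 2:
            squared.append(d2)
    if not squared:
        raise ValueError("no atom pairs lie within the cutoff")
    return math.sqrt(sum(squared) / len(squared))


ref = [(0.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 0.0)]
mob = [(1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 10.0)]
print(rmsd_with_cutoff(ref, mob))              # all three pairs
print(rmsd_with_cutoff(ref, mob, cutoff=2.5))  # the 10 Angstrom pair is ignored
```

With a cutoff of zero almost no pairs can qualify, which is consistent with the warning described above.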