measure command

The measure command is used for measuring geometries in molecules. All of the options and preprocessors available from the process command are also available.

$  ml measure --help
usage: mollib measure [-h] -i id/filename [id/filename ...] [-c filename] [-l]
                      [-s] [-m [MODELS [MODELS ...]]]
                      [-d atom atom | -a atom atom atom | -dih atom atom atom atom | -w atom atom]
                      [--stats] [--only-intra] [--exclude-intra]
                      [--only-intra-chain] [--exclude-intra-chain]
                      [--only-delta DELTA] [--only-bonded] [--hydrogenate]
                      [--rama]

arguments:
  -h, --help            show this help message and exit
  -i id/filename [id/filename ...], --in id/filename [id/filename ...]
                        (required) The filename(s) or PDB identifier(s) of the
                        structure(s)
  -c filename, --config filename
                        The configuration filename
  -l                    List details on the molecule(s)
  -s, --save            Save fetched files to the local directory.
  -m [MODELS [MODELS ...]], --models [MODELS [MODELS ...]]
                        The models numbers to analyze.
  --hydrogenate         Strip hydrogens and re-add them before analysis

measurement options:
  -d atom atom, --dist atom atom
                        Measure distances between 2 atom selections. ex: 31.N
                        32.CA
  -a atom atom atom, --angle atom atom atom
                        Measure angles between 3 atom selections. ex: 31.N
                        31.CA 31.C
  -dih atom atom atom atom, --dihedral atom atom atom atom
                        Measure dihedral angles between 4 atom selections. ex:
                        31.N 31.CA 31.C 32.N
  -w atom atom, --within atom atom
                        Measure all distances from atom selection to within
                        the specified distance. ex: 31:33.N 5
  --stats               Report statistics on the reported measurements.
  --rama                Report the Ramachandran angles. Filters and options
                        are ignored.

filters:
  --only-intra          Only report measurements within a residue
  --exclude-intra       Exclude measurements within a residue
  --only-intra-chain    Only report measurements within a chain
  --exclude-intra-chain
                        Exclude measurements within a chain
  --only-delta DELTA    Only report residues separated by DELTA residue
                        numbers
  --only-bonded         Only report measurements from bonded atoms

Arguments

-d / --dist atom atom

Measure the distance (in Angstroms) between two atoms.

Multiple atom pairs can be used. ex: -d 31.N 31.CA -d 32.N 33.CA

Atoms must follow the standard naming conventions. See Abbreviated Selectors and Filters.

-a / --angle

Measure the angle (in degrees) between three atoms.

Multiple atom triplets can be used. ex: -a 31.N 31.CA 31.CB -a 32.N 32.CA 32.CB

Atoms must follow the standard naming conventions. See Abbreviated Selectors and Filters.
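
For reference, the angle between three atoms can be sketched from Cartesian coordinates as follows. This is a minimal illustration of the geometry, not mollib's implementation; the function name and the coordinates are hypothetical.

```python
import math

def angle_deg(a, b, c):
    """Angle a-b-c in degrees, with atom b at the vertex."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.hypot(*v1) * math.hypot(*v2)
    # Clamp to [-1, 1] to guard against rounding error before acos.
    cos_theta = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cos_theta))

# Hypothetical coordinates in Angstroms, not taken from a real structure.
n, ca, cb = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 1.0, 0.0)
print(round(angle_deg(n, ca, cb), 1))  # 90.0
```

The middle atom of the triplet is the vertex, which matches the order used in the -a examples (ex: 31.N 31.CA 31.CB measures the angle at CA).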

-dih / --dihedral

Measure the dihedral angle (in degrees) between four atoms.

Multiple atom quartets can be used. ex: -dih 30.C 31.N 31.CA 31.C -dih 31.N 31.CA 31.C 32.N

Atoms must follow the standard naming conventions. See Abbreviated Selectors and Filters.

Note

If simple Ramachandran dihedrals are needed, check out --rama.
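
The dihedral angle between four atoms can be sketched with the usual atan2 formulation. This is an illustration of the geometry under the IUPAC sign convention, not mollib's own code; the helper names and the coordinates are hypothetical.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p1, p2, p3, p4):
    """Signed dihedral p1-p2-p3-p4 in degrees, in the range [-180, 180]."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)  # normals of the two planes
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Hypothetical coordinates in Angstroms, not taken from a real structure.
p1, p2, p3, p4 = (1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (0.0, 0.0, 1.0), (0.0, 1.0, 1.0)
print(round(dihedral_deg(p1, p2, p3, p4), 1))  # 90.0
```

Using atan2 rather than acos keeps the sign of the angle, which is what distinguishes, for example, a \(\phi\) of -60 from +60 degrees.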

--stats
Report the average and standard deviation of all measured values. This option only applies to the distance, angle and dihedral measurements.
--rama

Measure Ramachandran angles (in degrees) for a protein. Filters and options are ignored. Heteroatom chains are skipped.

The --rama option classifies Ramachandran angles based on backbone-backbone amide hydrogen bonds.

Atom Selectors and Filters

Abbreviated Selectors

The measure methods find atoms using atom locators. Atom locators must follow one of these conventions:

  1. (residue number).(atom name). ex: 31.CB for the CB atom of residue number 31.
  2. (chain id).(residue number).(atom name). ex: A.31.CB for the CB atom of residue number 31 in chain ‘A’.

Additionally, the chain id, residue number or both can be expressed as a range using the ‘:’ character:

  1. (residue range).(atom name). ex: 31:34.CB for the CB atom of residue numbers 31, 32, 33 and 34.
  2. (chain range).(residue number).(atom name). ex: A:C.34.CB for the CB atom of residue number 34 for chains ‘A’, ‘B’ and ‘C’.

Finally, heteroatom chains have an asterisk appended to them. ex: ‘C*’
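
The locator conventions above can be illustrated with a small parser that expands a locator into explicit (chain, residue number, atom name) entries. This is a sketch only: the function name is hypothetical, the default chain ‘A’ is an assumption based on the example output, and chain ranges are not expanded for heteroatom chains such as ‘C*’.

```python
from itertools import product

def expand_locator(locator, default_chain="A"):
    """Expand a locator such as 'A:C.31:34.CB' into
    (chain, residue_number, atom_name) tuples."""
    parts = locator.split(".")
    if len(parts) == 2:      # (residue).(atom)
        chain_part, res_part, atom = default_chain, parts[0], parts[1]
    elif len(parts) == 3:    # (chain).(residue).(atom)
        chain_part, res_part, atom = parts
    else:
        raise ValueError(f"unrecognized locator: {locator!r}")

    if ":" in chain_part:    # chain range, e.g. 'A:C' -> A, B, C
        first, last = chain_part.split(":")
        chains = [chr(c) for c in range(ord(first), ord(last) + 1)]
    else:
        chains = [chain_part]

    if ":" in res_part:      # inclusive residue range, e.g. '31:34'
        start, stop = (int(n) for n in res_part.split(":"))
        residues = list(range(start, stop + 1))
    else:
        residues = [int(res_part)]

    return [(c, r, atom) for c, r in product(chains, residues)]

print(expand_locator("A:C.34.CB"))
# [('A', 34, 'CB'), ('B', 34, 'CB'), ('C', 34, 'CB')]
```

Both ends of a range are inclusive, so 31:34 covers four residues and A:C covers three chains.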

Note

Atom selections may encompass hundreds of atoms, which when used in combination, could lead to searches over millions of combinations. To help improve their performance, you can either narrow their scope by reducing the range of chains or residue numbers, combine multiple Filters or use one of the shortcut selectors, like --rama for Ramachandran dihedral angles.

Filters

--only-intra
Exclude atom selections that are not within the same residue.
--exclude-intra
Exclude atom selections that are within the same residue.
--only-intra-chain
Exclude atom selections that are not within the same chain.
--exclude-intra-chain
Exclude atom selections that are within the same chain.
--only-delta DELTA
Exclude atom selections that don’t have at least one set of atoms with residues separated by DELTA residue numbers. This filter ignores the chain identifier and may need to be combined with --only-intra-chain or --exclude-intra-chain.
--only-bonded

Exclude atom selections that are not bonded. The bonded filter tests linear bonding relationships. For example, a dihedral with four atoms (atom1, atom2, atom3 and atom4) must have bonds between atom1–atom2, atom2–atom3 and atom3–atom4.
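
The linear bonding test can be sketched as a check that every consecutive pair of atoms shares a bond. The function name and the bond set below are hypothetical; mollib derives bonding from the molecule's topology, which is not reproduced here.

```python
def is_linearly_bonded(atoms, bonds):
    """True if every consecutive pair in `atoms` shares a bond."""
    return all(frozenset(pair) in bonds for pair in zip(atoms, atoms[1:]))

# Hypothetical topology for a backbone fragment; labels follow the output tables.
bonds = {frozenset(b) for b in [
    ("A.L2.C", "A.F3.N"), ("A.F3.N", "A.F3.CA"), ("A.F3.CA", "A.F3.C"),
]}

# C(i-1)-N-CA-C is a bonded chain, so this quartet passes the test.
print(is_linearly_bonded(["A.L2.C", "A.F3.N", "A.F3.CA", "A.F3.C"], bonds))  # True
# C and N of the same residue are not directly bonded, so this one is rejected.
print(is_linearly_bonded(["A.F3.C", "A.F3.N", "A.F3.CA", "A.F3.C"], bonds))  # False
```

This is why a selection such as 2:6.C 2:6.N 2:6.CA 2:6.C combined with --only-bonded reports only the quartets that start at the C atom of the preceding residue.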

Note

Bonded searches have to investigate the topology of each atom selection, which can be slower than the above filters. Combining the --only-bonded filter with other filters, like --only-delta 1, can significantly speed up searches.

Examples

Measuring distances

Measure \(\alpha\)-helical HA-H distances in chain ‘A’ for residues 23-49 of 2MUV, the homotetrameric influenza M2 channel. Include statistics on the measured distances.

$  ml measure -i 2MUV -d 23:49.HA 23:49.H --only-delta 3 --stats
Table: Distances for 2MUV-1

Num  Atom 1    Atom 2   Dist. (A)  
---- --------- -------- -----------
1    A.S23.HA  A.L26.H  7.01       
2    A.D24.HA  A.V27.H  5.08       
3    A.P25.HA  A.V28.H  3.38       
4    A.L26.HA  A.A29.H  3.58       
5    A.V27.HA  A.A30.H  3.64       
6    A.V28.HA  A.N31.H  3.58       
7    A.A29.HA  A.I32.H  3.42       
8    A.A30.HA  A.I33.H  3.59       
9    A.N31.HA  A.G34.H  3.53       
10   A.I32.HA  A.I35.H  3.75       
11   A.I33.HA  A.L36.H  3.42       
12   A.I35.HA  A.L38.H  3.36       
13   A.L36.HA  A.I39.H  3.54       
14   A.H37.HA  A.L40.H  3.49       
15   A.L38.HA  A.W41.H  3.42       
16   A.I39.HA  A.I42.H  3.77       
17   A.L40.HA  A.L43.H  3.63       
18   A.W41.HA  A.D44.H  3.46       
19   A.I42.HA  A.R45.H  3.61       
20   A.L43.HA  A.L46.H  3.36       
21   A.D44.HA  A.F47.H  3.58       
22   A.R45.HA  A.F48.H  3.67       
23   A.L46.HA  A.K49.H  6.94       
                        ---------  
                        4.0 ± 1.0  

Measure CA-CA distances between residues 20 and 21 for chains ‘A’, ‘B’, ‘C’ and ‘D’ of 2MUV, excluding same-residue and same-chain distances.

$  ml measure -i 2MUV -d A:D.20:21.CA A:D.20:21.CA --exclude-intra --exclude-intra-chain
Table: Distances for 2MUV-1

Num  Atom 1    Atom 2    Dist. (A)  
---- --------- --------- -----------
1    A.N20.CA  B.N20.CA  18.05      
2    A.N20.CA  B.D21.CA  17.50      
3    A.D21.CA  B.N20.CA  14.39      
4    A.D21.CA  B.D21.CA  13.70      
5    A.N20.CA  C.N20.CA  25.10      
6    A.N20.CA  C.D21.CA  21.94      
7    A.D21.CA  C.N20.CA  21.95      
8    A.D21.CA  C.D21.CA  19.02      
9    A.N20.CA  D.N20.CA  17.86      
10   A.N20.CA  D.D21.CA  14.21      
11   A.D21.CA  D.N20.CA  17.32      
12   A.D21.CA  D.D21.CA  13.52      
13   B.N20.CA  C.N20.CA  17.89      
14   B.N20.CA  C.D21.CA  17.44      
15   B.D21.CA  C.N20.CA  14.21      
16   B.D21.CA  C.D21.CA  13.64      
17   B.N20.CA  D.N20.CA  25.32      
18   B.N20.CA  D.D21.CA  22.15      
19   B.D21.CA  D.N20.CA  22.14      
20   B.D21.CA  D.D21.CA  19.21      
21   C.N20.CA  D.N20.CA  17.51      
22   C.N20.CA  D.D21.CA  17.00      
23   C.D21.CA  D.N20.CA  13.86      
24   C.D21.CA  D.D21.CA  13.20      
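
The number of rows in the table above can be reproduced with a short sketch of the two filters. It assumes that each unordered pair of distinct atoms is reported once, which is consistent with the output but is not confirmed from the mollib source; the helper names are hypothetical.

```python
from itertools import combinations, product

# The eight CA atoms selected by A:D.20:21.CA, as (chain, residue number, atom).
atoms = [(chain, res, "CA") for chain, res in product("ABCD", (20, 21))]

def same_residue(a, b):
    return a[0] == b[0] and a[1] == b[1]

def same_chain(a, b):
    return a[0] == b[0]

pairs = [(a, b) for a, b in combinations(atoms, 2)
         if not same_residue(a, b)    # --exclude-intra
         and not same_chain(a, b)]    # --exclude-intra-chain
print(len(pairs))  # 24
```

Of the 28 possible pairs among eight atoms, the four same-chain pairs are removed, leaving the 24 inter-chain distances. The order of the pairs in this sketch is not meant to match the order of the table.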

Compare the distance between the HA of residue 5 and the H of residue 21 for two different structures, 2KXA and 2LWA. The 2KXA structure represents the wildtype hemagglutinin fusion peptide (HAfp) in the closed helical-hairpin structure, placing these two atoms in close proximity. The 2LWA structure represents the conformational ensemble of the HAfp-G8A mutant with a closed structure (chain ‘A’), a semi-closed structure (chain ‘B’) and an open structure (chain ‘C’).

$  ml measure -i 2KXA 2LWA -d A:C.5.HA A:C.21.H --only-intra-chain
Table: Distances for 2KXA-1

Num  Atom 1   Atom 2   Dist. (A)  
---- -------- -------- -----------
1    A.A5.HA  A.W21.H  3.30       

Table: Distances for 2LWA-1

Num  Atom 1   Atom 2   Dist. (A)  
---- -------- -------- -----------
1    A.A5.HA  A.W21.H  3.22       
2    B.A5.HA  B.W21.H  11.77      
3    C.A5.HA  C.W21.H  18.47      

Measuring Angles

Measure the angle of the bonded ‘C-1’–‘N’–‘H’ atoms for residues 20-30 from the ubiquitin structure 2MJB.

$  ml measure -i 2MJB -a 20:30.C 20:30.N 20:30.H --only-bonded
Table: Angles for 2MJB-1

Num  Atom 1   Atom 2   Atom 3   Angle (deg)  
---- -------- -------- -------- -------------
1    A.S20.C  A.D21.N  A.D21.H  120.2        
2    A.D21.C  A.T22.N  A.T22.H  118.4        
3    A.T22.C  A.I23.N  A.I23.H  119.0        
4    A.I23.C  A.E24.N  A.E24.H  118.7        
5    A.E24.C  A.N25.N  A.N25.H  119.2        
6    A.N25.C  A.V26.N  A.V26.H  118.9        
7    A.V26.C  A.K27.N  A.K27.H  119.2        
8    A.K27.C  A.A28.N  A.A28.H  118.5        
9    A.A28.C  A.K29.N  A.K29.H  118.4        
10   A.K29.C  A.I30.N  A.I30.H  118.9        

Measuring Dihedrals

The following example measures the \(\phi\) angle for residues 2-6 of the hemagglutinin fusion peptide domain (2KXA).

$  ml measure -i 2KXA -dih 2:6.C 2:6.N 2:6.CA 2:6.C --only-bonded --stats
Table: Dihedrals for 2KXA-1

Num  Atom 1  Atom 2  Atom 3   Atom 4  Dihedral (deg)  
---- ------- ------- -------- ------- ----------------
1    A.L2.C  A.F3.N  A.F3.CA  A.F3.C  -65.2           
2    A.F3.C  A.G4.N  A.G4.CA  A.G4.C  -57.2           
3    A.G4.C  A.A5.N  A.A5.CA  A.A5.C  -68.4           
4    A.A5.C  A.I6.N  A.I6.CA  A.I6.C  -61.7           
                                      -----------     
                                      -63.0 ± 4.0     
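
The statistics line can be checked by hand from the four dihedrals in the table. The sketch below uses a plain arithmetic mean and the population standard deviation; which estimator mollib uses, and how it rounds the result, are assumptions here.

```python
import statistics

# Dihedral values (degrees) copied from the table above.
phi = [-65.2, -57.2, -68.4, -61.7]

mean = statistics.fmean(phi)
std = statistics.pstdev(phi)  # population standard deviation (assumed)
print(f"{mean:.1f} ± {std:.1f}")
```

The unrounded values are close to -63.1 and 4.2, which agrees with the reported -63.0 ± 4.0 if the output is rounded to the precision of the deviation. An arithmetic mean is only meaningful for angles that do not straddle the ±180 degree boundary.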

Ramachandran Angles

Measure the Ramachandran \(\phi\) and \(\psi\) angles for the hemagglutinin fusion peptide structure 2KXA.

$  ml measure -i 2KXA --rama
Table: Ramachandran angles for 2KXA-1

Residue  Phi (deg)  Psi (deg)  Classification  Minor   E (kT) / Prob.  
-------- ---------- ---------- --------------- ------- ----------------
A.G1        0.0     -179.9                             0.0 / 100.0%    
A.L2      -61.0      -45.6     alpha-helix     N-term  1.1 / 31.9%     
A.F3      -65.2      -48.3     alpha-helix             1.7 / 18.4%     
A.G4      -57.2      -32.3     alpha-helix             1.9 / 15.4%     
A.A5      -68.4      -45.0     alpha-helix             0.7 / 51.0%     
A.I6      -61.7      -51.1     alpha-helix             2.5 /  8.1%     
A.A7      -66.7      -42.8     alpha-helix             0.7 / 51.0%     
A.G8      -63.4      -34.5     alpha-helix             1.4 / 23.5%     
A.F9      -65.4      -44.3     alpha-helix             0.7 / 51.0%     
A.I10     -65.1      -28.7     alpha-helix             3.4 /  3.4%     
A.E11     -92.2      -47.2     alpha-helix             7.5 /  0.1%     
A.G12    -111.9       28.0     alpha-helix     C-term  5.1 /  0.6%     
A.G13      44.9     -146.5                             7.2 /  0.1%     
A.W14     -49.7      -61.9     alpha-helix     N-term  5.7 /  0.3%     
A.T15     -49.6      -32.2     alpha-helix             8.4 /  0.0%     
A.G16     -69.5      -37.4     alpha-helix             0.9 / 38.9%     
A.M17     -58.3      -46.5     alpha-helix             0.8 / 45.5%     
A.I18     -62.6      -49.1     alpha-helix             0.6 / 54.7%     
A.D19     -55.5      -45.6     alpha-helix             0.8 / 45.5%     
A.G20     -65.7      -34.8     alpha-helix             1.7 / 17.7%     
A.W21     -62.9      -48.0     alpha-helix             0.6 / 54.7%     
A.Y22     -77.6      -33.0     alpha-helix     C-term  2.9 /  5.3%     
A.G23      74.3      -87.9                             6.7 /  0.1%     
A.S24     -56.7        0.0                             0.0 / 100.0%    

Approach to Secondary Structure Assignments

\(\beta\) Turns

Turns are defined by a hydrogen bond between residues ‘i’ and ‘i+4’ as well as the backbone torsion angles for residues ‘i+1’ and ‘i+2’. The turn type is based on the torsion angles of the ‘i+1’ and ‘i+2’ residues.

Type  \(\phi_{i+1}\)  \(\psi_{i+1}\)  \(\phi_{i+2}\)  \(\psi_{i+2}\)
----- --------------- --------------- --------------- ---------------
I     -60°            -30°            -90°            0°
I’    60°             30°             90°             0°
II    -60°            120°            80°             0°
II’   60°             -120°           -80°            0°

Assignments of the turn residues ‘i+1’ and ‘i+2’ are made. However, the torsion angles of the terminal residues (specifically \(\phi\) of residue ‘i’ and \(\psi\) of residue ‘i+4’) are flexible, so these residues are not included in the assignment.
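
A turn type can be picked by comparing the measured torsion angles of residues ‘i+1’ and ‘i+2’ against the ideal values for each type. The sketch below is illustrative only: the \(\psi_{i+2}\) values of 0 degrees are the commonly cited ideal values, the 30 degree tolerance is hypothetical, and mollib's actual cutoffs are set in its configuration and are not reproduced here.

```python
# Ideal (phi_i+1, psi_i+1, phi_i+2, psi_i+2) angles in degrees for each turn type.
IDEAL_TURNS = {
    "I":   (-60.0,  -30.0, -90.0, 0.0),
    "I'":  ( 60.0,   30.0,  90.0, 0.0),
    "II":  (-60.0,  120.0,  80.0, 0.0),
    "II'": ( 60.0, -120.0, -80.0, 0.0),
}

def angle_diff(a, b):
    """Smallest absolute difference between two angles, in degrees."""
    return abs((a - b + 180.0) % 360.0 - 180.0)

def classify_turn(phi1, psi1, phi2, psi2, tolerance=30.0):
    """Return the best-matching turn type, or None if no type is within
    `tolerance` degrees on all four angles (tolerance is an assumed value)."""
    observed = (phi1, psi1, phi2, psi2)
    best = None
    for name, ideal in IDEAL_TURNS.items():
        worst = max(angle_diff(o, i) for o, i in zip(observed, ideal))
        if worst <= tolerance and (best is None or worst < best[1]):
            best = (name, worst)
    return best[0] if best else None

print(classify_turn(-65.0, -25.0, -95.0, 5.0))  # I
```

The periodic difference in angle_diff matters for angles near ±180 degrees, where a naive subtraction would overstate the deviation.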

Helices

Helices consist of stretches of hydrogen-bonded residues with helical dihedrals. \(3_{10}\)-helices are typically short, with one or more ‘i’-‘i+3’ hydrogen bonds, and these can be mischaracterized as turns (type I turns). In this case, mollib checks that all residues in the helix have helical dihedral angles.

Sheets

Sheets are first identified by finding hydrogen bonds between residues with sheet torsion angles. This process identifies most sheet residues. However, for strands on the edges of sheets, every second amino acid may not form an internal hydrogen bond.

To accurately identify sheet strands, mollib will find groups of sheet hydrogen bonds, then it will evaluate whether the residues are in a checkered pattern and whether the previous or subsequent residues have sheet backbone torsion angles. Thereafter, it will assign all residues in the group to a sheet classification, if no other classification has already been made. See assign blocks for details.

Assign Blocks

Secondary structure assignments are made based on hydrogen bonds. In some cases, such as the edge strands of sheets or short \(3_{10}\)-helices, residues within a contiguous block are not assigned because they do not form an internal hydrogen bond. Mollib assigns contiguous blocks of residues with the same secondary structure by testing the dihedrals of residues within that block and filling gaps in the assignment.

For example, a checkered sheet assignment (‘E E E E E’) will be assigned as a single contiguous \(\beta\)-strand (‘EEEEEEEEE’) if all residues in the block have \(\beta\)-strand backbone dihedral angles. \(3_{10}\)-helices are another example in which the ‘i’ and ‘i+3’ residues are hydrogen bonded, yet the ‘i+1’ and ‘i+2’ residues may not be. In this case, the gap will be filled by assigning residues ‘i’ through ‘i+3’ as \(3_{10}\)-helix, if all four residues have helical dihedral angles.
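
The gap-filling step can be sketched on a one-letter assignment string. This is a simplified illustration, assuming one character per residue and a precomputed flag for whether each residue has matching backbone dihedrals; the function name and inputs are hypothetical and do not reflect mollib's data structures.

```python
def fill_gaps(assignment, has_matching_dihedrals, code="E"):
    """Fill unassigned positions (' ') between the first and last `code`
    when every residue in that span has matching backbone dihedrals."""
    first, last = assignment.find(code), assignment.rfind(code)
    if first == -1:
        return assignment  # nothing assigned with this code
    span = range(first, last + 1)
    if any(assignment[i] not in (code, " ") for i in span):
        return assignment  # another classification is already present
    if not all(has_matching_dihedrals[i] for i in span):
        return assignment  # at least one residue breaks the block
    return assignment[:first] + code * (last - first + 1) + assignment[last + 1:]

print(fill_gaps("E E E E E", [True] * 9))  # EEEEEEEEE
```

If any residue in the span lacks the expected dihedral angles, or already carries a different classification, the assignment is returned unchanged.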

Additionally, assigning blocks will label the minor classification of N- and C-terminal residues for certain secondary structure blocks, depending on the settings.

Note

The identification of the ‘N-term’ and ‘C-term’ minor classifications is done separately for residues and hydrogen bonds. These assignments may differ between residues and hydrogen bonds.

Energy Maps

The backbone dihedral probabilities and energies are calculated from potential of mean force plots for each type of secondary structure. The probabilities are calculated from the frequency of finding a particular set of dihedral angles in a group of high-resolution structures. A high probability indicates that the measured dihedral angles are observed frequently in high-resolution structures. Conversely, a low probability indicates that a particular set of dihedral angles is rarely seen in high-resolution structures. Low-probability values are typically colored in yellow (relatively rare) or red (very rare).

The energies represent a potential of mean force (PMF) calculated from a Boltzmann inversion.

\[E(\Omega) = -kT \ln[P(\Omega)]\]
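
As a worked check, the Boltzmann inversion can be applied to a few of the probabilities reported in the Ramachandran table. The sketch assumes that the reported probability is relative to the most favorable set of dihedrals (so that 100% maps to 0 kT); this reading is inferred from the table values and the function name is hypothetical.

```python
import math

def energy_kt(probability):
    """Energy in units of kT for a probability given as a fraction (0 to 1)."""
    if probability <= 0.0:
        return math.inf  # unobserved dihedrals have no finite energy
    return 0.0 - math.log(probability)

# Probabilities taken from the --rama example output.
for p in (1.0, 0.51, 0.319, 0.081):
    print(f"{energy_kt(p):.1f} / {p:.1%}")
# 0.0 / 100.0%
# 0.7 / 51.0%
# 1.1 / 31.9%
# 2.5 / 8.1%
```

These values match the ‘E (kT) / Prob.’ column for residues such as A.A5 (0.7 / 51.0%) and A.L2 (1.1 / 31.9%).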

The energy is zero when a set of dihedral angles is optimal for a given type of secondary structure classification. The following are the energy plots for each secondary structure classification.

[Figures: four Ramachandran energy contour plots]