Molecular Graphics: Using Inverse Kinematics to Solve the Loop Closure Problem during Protein Structure Modeling



Background


Protein Architecture

  • Proteins are linear chains made up of amino acids.
  • Each of those amino acids has a specific 3D conformation.
  • The result is a compact molecule such as the one shown in the next column.

A Sample Native Protein Structure: 1crn





Objective

Suppose we want to use molecular graphics to model the native structure of a protein molecule whose amino acid chain contains some gaps. Can we create new segments, or reuse existing segments from other protein molecules, to fill such gaps?

Yes: we can choose a suitable protein segment as the insert and use inverse kinematics to place it into a given gap.

In this project, I used the Full Cyclic Coordinate Descent (FCCD) method [1] to move one end of a native protein segment to a given location.

Method

Full Cyclic Coordinate Descent (FCCD) is an optimization-based inverse kinematics (IK) technique. In this project, the optimization minimizes the distances between the last three c-alpha atoms of the moving chain and the last three c-alpha atoms of the fixed chain. The method is summarized below:


(a)

(b)

(c)

Pictures (b) and (c) taken from [1]; picture (a) taken from Wiki

    Preliminaries:


  • The protein structure is reduced to a c-alpha trace
    i.e. from (a) to (b)

  • Now we only have the bond angles (theta) and dihedral
    angles (tau) to consider. See (b)

  • The bond lengths between two consecutive c-alpha atoms
    remain fixed at 3.8 angstroms
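The quantities listed in the preliminaries can be computed directly from c-alpha coordinates. The block below is a minimal sketch, not the project's own code: the function names (`bond_angle`, `dihedral`, `trace_angles`, `spacing_ok`) and the 0.1 angstrom tolerance are assumptions made for illustration, and the coordinates are expected as plain (x, y, z) tuples.

```python
import math

CA_CA = 3.8  # angstroms; the fixed c-alpha spacing quoted in the text

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def bond_angle(p0, p1, p2):
    """Angle theta in degrees at p1, between p1->p0 and p1->p2."""
    u, v = _sub(p0, p1), _sub(p2, p1)
    cos_t = _dot(u, v) / math.sqrt(_dot(u, u) * _dot(v, v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_t))))

def dihedral(p0, p1, p2, p3):
    """Signed dihedral tau in degrees about the p1->p2 axis."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def trace_angles(ca):
    """All bond angles and dihedrals along a c-alpha trace."""
    thetas = [bond_angle(*ca[i:i + 3]) for i in range(len(ca) - 2)]
    taus = [dihedral(*ca[i:i + 4]) for i in range(len(ca) - 3)]
    return thetas, taus

def spacing_ok(ca, tol=0.1):
    """True if every consecutive pair is CA_CA apart, within tol (assumed)."""
    return all(abs(math.dist(a, b) - CA_CA) <= tol
               for a, b in zip(ca, ca[1:]))
```

A trace of N atoms yields N-2 bond angles and N-3 dihedrals, so a 13-atom loop would be described by 11 thetas and 10 taus under this representation.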


    FCCD:

  • Seeks to move the c-alpha atoms m(N-3), m(N-2) and m(N-1)
    in "moving" so as to align them with the c-alpha atoms
    f(N-3), f(N-2) and f(N-1) in "fixed"

  • The positions of the first three c-alpha atoms in
    "moving" are the same as those of the first three c-alpha
    atoms in "fixed"

  • In order to move the last three atoms of "moving"
    onto the last three atoms of "fixed", FCCD iteratively
    picks a bond (or vector) along "moving" and calculates
    the optimal rotation matrix that will

    --rotate all other vectors downstream of it in
    the desired direction
    --and, in doing so, reduce the interatomic distance between
    the last three atoms of "moving" and those of "fixed"

  • For a detailed mathematical treatment of FCCD please
    consult [1]
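The iterate-over-pivots idea above can be illustrated with a deliberately simplified cyclic coordinate descent loop. This is not FCCD as defined in [1]: instead of finding the optimal rotation that aligns three end atoms, the sketch steers only the final atom toward a single target point, using at each pivot the rotation that turns the pivot-to-end vector onto the pivot-to-target vector. The names `ccd_close` and `first_pivot`, the tolerance and the sweep limit are all assumptions made for illustration.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _add(a, b):
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _norm(a):
    return math.sqrt(_dot(a, a))

def _rotate(v, axis, angle):
    """Rodrigues rotation of v about a unit-length axis."""
    c, s = math.cos(angle), math.sin(angle)
    kxv = _cross(axis, v)
    kv = _dot(axis, v) * (1.0 - c)
    return tuple(v[i] * c + kxv[i] * s + axis[i] * kv for i in range(3))

def ccd_close(moving, target, first_pivot=2, tol=0.01, max_sweeps=200):
    """Single-target CCD (simplified; see lead-in). Atoms up to and
    including first_pivot never move, and bond lengths are preserved
    because every update is a rigid rotation about a chain atom."""
    chain = [tuple(map(float, p)) for p in moving]
    for _ in range(max_sweeps):
        if _norm(_sub(chain[-1], target)) < tol:
            break
        for i in range(first_pivot, len(chain) - 1):
            pivot = chain[i]
            to_end = _sub(chain[-1], pivot)
            to_target = _sub(target, pivot)
            axis = _cross(to_end, to_target)
            length = _norm(axis)
            if length < 1e-12:
                continue  # vectors (anti)parallel: no unique axis, skip
            angle = math.atan2(length, _dot(to_end, to_target))
            unit = tuple(x / length for x in axis)
            # Rotate every atom downstream of the pivot about the pivot.
            for j in range(i + 1, len(chain)):
                rel = _rotate(_sub(chain[j], pivot), unit, angle)
                chain[j] = _add(pivot, rel)
    return chain, _norm(_sub(chain[-1], target))
```

Each pivot step picks the rotation that brings the end atom as close to the target as that pivot allows, so the end-to-target distance never increases from one step to the next. A chain that starts perfectly straight and collinear with the target can stall in this simplified form, since no rotation axis is defined there.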


Results


In general, FCCD was capable of moving given loop ends to desired positions.
FCCD with randomly generated loops:
Because I did not incorporate any angle constraints into this method, none of the randomly generated loops looked native,
i.e. some bonds intersected each other, some bonds were too close to each other, etc.
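To make the unconstrained case concrete, the block below shows one way a random c-alpha loop could be generated with the fixed 3.8 angstrom spacing and no angle constraints at all. How the loops in this project were actually generated is not stated, so `random_loop`, `random_unit_vector` and the seeding scheme are assumptions for illustration only.

```python
import math
import random

CA_CA = 3.8  # angstroms; the fixed c-alpha spacing quoted in the text

def random_unit_vector(rng):
    """Direction drawn uniformly from the unit sphere."""
    z = rng.uniform(-1.0, 1.0)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    r = math.sqrt(1.0 - z * z)
    return (r * math.cos(phi), r * math.sin(phi), z)

def random_loop(n_atoms, start=(0.0, 0.0, 0.0), seed=None):
    """Random chain of n_atoms with fixed spacing and free angles.

    Nothing here prevents sharp turns or self-intersection, which is
    the kind of non-native geometry described above.
    """
    rng = random.Random(seed)
    chain = [tuple(float(c) for c in start)]
    for _ in range(n_atoms - 1):
        d = random_unit_vector(rng)
        prev = chain[-1]
        chain.append(tuple(prev[k] + CA_CA * d[k] for k in range(3)))
    return chain
```

Because each new direction is independent of the previous one, consecutive bonds can fold back almost onto themselves, which no native backbone would do.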

FCCD with loops from native protein segments:
This worked better. However, I still saw the same problems as with the randomly generated loops.

Artifact: FCCD in Action I

A loop from 1v77 (c-alpha atoms 102 to 114). The goal was to close the gap between the two ends of this loop using FCCD.

RMSD after joining = 0.7


Artifact: FCCD in Action II

Another loop from 1v77 (c-alpha atoms 160 to 172). Again, the goal was to close the gap between the two ends of this loop using FCCD.

RMSD after joining = 0.4
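An RMSD figure of this kind can be computed as the root-mean-square distance between corresponding atoms. The sketch below assumes the value is taken over the last three c-alpha atoms of the moving and fixed chains, which matches the quantity the method minimizes but is not confirmed by the reported numbers. The function names are hypothetical.

```python
import math

def rmsd(a, b):
    """Root-mean-square distance between paired (x, y, z) points."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty point lists of equal length")
    total = sum((p[k] - q[k]) ** 2 for p, q in zip(a, b) for k in range(3))
    return math.sqrt(total / len(a))

def closure_rmsd(moving, fixed, n_end=3):
    """RMSD over the last n_end atoms of each chain (assumed definition)."""
    return rmsd(moving[-n_end:], fixed[-n_end:])
```

No superposition is performed here: the two sets of end atoms are compared in place, since closing the loop means the moving atoms should land on the fixed ones directly.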



Future Work

For a realistic model of a loop connected to a native structure, the following ideas still need to be incorporated into this project:
  1. Incorporate different types of constraints
    • During the random loop folding process
    • While evaluating a given conformation for acceptability after attempting to align the moving chain to the fixed chain
  2. These constraints may include:
    • Acceptable dihedral angles according to the Ramachandran plot (a map that depicts empirically determined acceptable phi & psi angles for the amino acids while in native protein structures)
    • Steric collisions with other atoms in the molecule
    • Satisfying a given energy function
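As an illustration of the steric-collision constraint, the sketch below flags pairs of non-adjacent c-alpha atoms that sit closer than a cutoff. The function name `find_clashes`, the 3.0 angstrom cutoff and the minimum sequence separation of 2 are placeholder assumptions, not values taken from this project or from [1]. A realistic check would also need to consider atoms outside the loop.

```python
import math

def find_clashes(chain, min_dist=3.0, min_separation=2):
    """Return index pairs (i, j) of atoms closer than min_dist.

    Pairs fewer than min_separation positions apart along the chain are
    skipped, because bonded neighbours are 3.8 angstroms apart by
    construction. Both thresholds are illustrative placeholders.
    """
    clashes = []
    for i in range(len(chain)):
        for j in range(i + min_separation, len(chain)):
            if math.dist(chain[i], chain[j]) < min_dist:
                clashes.append((i, j))
    return clashes
```

A conformation could then be rejected, or penalized in an energy function, whenever the returned list is non-empty.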

References

  1. Boomsma, W. and Hamelryck, T. (2005). Full Cyclic Coordinate Descent: solving the protein loop closure problem in C-alpha space. BMC Bioinformatics 6:159
  2. Canutescu, A. and Dunbrack, R. (2003). Cyclic Coordinate Descent: a robotic algorithm for protein loop closure. Protein Science 12:963
  3. Jmol