Preparing PDB FilesΒΆ
AESOP requires protein structures that comply with the PDB format. Given a structure from the Protein Data Bank, the user needs to consider that coordinates for some atoms may not be resolved in the deposited structure. Thus, residues may be missing from a protein even though the sequence may be known. To fix such issues, the user must perform homology modelling to model and refine gaps. A number of computational tools exist to add these missing residues including Modeller, PDBFixer, UCSF Chimera, Pymol, and SWISS-MODEL. If you are willing to install OpenMM and all other libraries within the OMNIA channel, then PDBFixer may be installed through anaconda quite easily and offers a simple graphical user interface for PDB preparation.
If the PDB contains all residues but is missing a few atoms in one or more residues, AESOP has a function that will call complete_pdb from Modeller to fill in missing atoms. You may use the function as follows:
from aesop import complete_structure
pdbfile = 'input.pdb'
outfile = 'output.pdb'
complete_structure(pdb=pdbfile, dest=outfile, disu=False)
The above snipped of code reads input.pdb
, fills in missing atoms (not missing residues), and
writes the completed structure to output.pdb
. If input.pdb
has disulfide bridges, simply
set disu=True
to predict and patch disulfides.
While AESOP will handle protein structures where residue numbering overlaps between protein chains, we advise users to make sure only a single model is present in the PDB file to prevent unforseen complications. Additionally, each chain should be represented by a unique identifier that complies with the PDB format.