Murcko fragments with python and pybel
Friday, April 25th, 2008
There are actually a few bugs in this version. I’ll put up a new version here shortly.
I couldn’t find a program that pulls all ring systems out of an SDF file along with how often they occur, so I wrote a Python script that does exactly that job. It gets the smallest set of smallest rings (SSSR) from pybel: once a molecule is read in (e.g., mol = pybel.readstring("smi", smiles)), the SSSR is available as mol.sssr, which is a vector of OBRing objects (see the OpenBabel documentation for more info about that).
You can iterate over this vector in the usual Pythonic fashion, e.g., for ring in mol.sssr: pass. The ring size is available from ring.PathSize(), and the atoms in the ring are stored in the member variable _path, i.e., ring._path gives you the indices of the atoms in the ring.
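A minimal sketch of that loop, not the code from the script itself. The helper names sssr_atom_sets and demo are made up for illustration, and the import is guarded because the module location depends on the Open Babel version (from openbabel import pybel in 3.x, plain import pybel in 2.x).

```python
try:
    from openbabel import pybel      # Open Babel 3.x layout
except ImportError:
    try:
        import pybel                 # Open Babel 2.x layout
    except ImportError:
        pybel = None                 # bindings not installed


def sssr_atom_sets(mol):
    """Return one set of atom indices per ring in mol.sssr.

    Hypothetical helper: it only relies on mol.sssr being iterable and on
    each ring exposing the _path member described in the post.
    """
    atom_sets = []
    for ring in mol.sssr:
        # ring.PathSize() reports the ring size; _path holds the atom indices.
        atom_sets.append(set(ring._path))
    return atom_sets


def demo(smiles="c1ccc2ccccc2c1"):
    """Print size and atom indices of every SSSR ring (needs Open Babel)."""
    if pybel is None:
        raise RuntimeError("Open Babel Python bindings are not installed")
    mol = pybel.readstring("smi", smiles)
    for atoms in sssr_atom_sets(mol):
        print(len(atoms), sorted(atoms))
```

Calling demo() with the default SMILES (naphthalene) should list two six-membered rings. Working with plain Python sets from here on keeps the later steps independent of the OBRing objects.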
The script checks for fused ring systems by identifying atoms shared between members of the SSSR. This is done by intersecting the sets of member atoms of every pair of rings. Two rings are considered to belong to one ring system if they share at least one atom, which means that spiro systems (exactly one shared atom) are included even though, strictly speaking, they are not fused. To require at least two shared atoms instead, change if len(intsec) in the function GetFusedRingsMatrix(mol) to if len(intsec) > 1.
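The pairwise intersection can be sketched as follows. This is an illustration under assumptions rather than the script's GetFusedRingsMatrix(mol): the function name fused_rings_matrix and the min_shared parameter are hypothetical, and the rings are passed in as plain sets of atom indices instead of a molecule.

```python
def fused_rings_matrix(rings, min_shared=1):
    """Build a symmetric 0/1 matrix marking which rings share atoms.

    rings      -- list of sets of atom indices, one set per SSSR ring
    min_shared -- 1 links spiro rings too; 2 requires at least two shared atoms
    """
    n = len(rings)
    matrix = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            intsec = rings[i] & rings[j]      # atoms common to both rings
            if len(intsec) >= min_shared:
                matrix[i][j] = matrix[j][i] = 1
    return matrix


# Two five-membered rings joined through the single atom 5 (a spiro centre).
spiro = [{1, 2, 3, 4, 5}, {5, 6, 7, 8, 9}]
print(fused_rings_matrix(spiro))                 # rings are linked
print(fused_rings_matrix(spiro, min_shared=2))   # rings stay separate
```

With min_shared=2 the test corresponds to the len(intsec) > 1 variant mentioned above.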
Should you want all individual elements of the SSSR instead of the fused/linked rings merged into one ring system, it should suffice to supply a manually crafted matrix in place of the one returned by GetFusedRingsMatrix(). That would be something like l = len(mol.sssr) and FusedRingsMatrix = [[0 for x in range(l)] for y in range(l)]. All rings are then treated as unconnected.
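To see why an all-zero matrix keeps every ring separate, here is a sketch of how such a matrix could be turned into ring systems. How the script actually merges rings is not shown in this post, so the function ring_systems and its depth-first grouping are an assumption made for illustration.

```python
def ring_systems(rings, matrix):
    """Merge rings into ring systems according to a 0/1 connection matrix.

    rings  -- list of sets of atom indices, one per SSSR ring
    matrix -- matrix[i][j] is truthy if ring i and ring j belong together
    Returns a list of atom-index sets, one per ring system.
    """
    seen = set()
    systems = []
    for start in range(len(rings)):
        if start in seen:
            continue
        seen.add(start)
        stack = [start]
        atoms = set()
        while stack:                      # walk all rings reachable from start
            i = stack.pop()
            atoms |= rings[i]
            for j, linked in enumerate(matrix[i]):
                if linked and j not in seen:
                    seen.add(j)
                    stack.append(j)
        systems.append(atoms)
    return systems


rings = [{1, 2, 3, 4, 5, 6}, {5, 6, 7, 8, 9, 10}, {20, 21, 22}]
linked = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
l = len(rings)
unlinked = [[0 for x in range(l)] for y in range(l)]

print(ring_systems(rings, linked))     # two systems: the fused pair and the lone ring
print(ring_systems(rings, unlinked))   # three systems: every ring on its own
```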
The script also treats exocyclic double bonds as part of the rings they are attached to.
All fragments are written out to a file fragments.sdf, and the number of times each fragment was encountered in the supplied structure file is written into the SDF field COUNT. If you view the output with something like mview fragments.sdf (MarvinView from ChemAxon), it will look similar to the picture displayed.
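The counting step might look roughly like this. It assumes each fragment has already been reduced to a canonical string such as a canonical SMILES, which may not be how the script identifies identical fragments; the function name count_fragments is hypothetical.

```python
from collections import Counter


def count_fragments(fragment_keys):
    """Count how often each fragment key occurs.

    fragment_keys -- iterable of strings, one per fragment occurrence
                     (assumed here to be canonical SMILES)
    Returns (key, count) pairs, most frequent first.
    """
    return Counter(fragment_keys).most_common()


# Fragment keys collected from three hypothetical input molecules.
keys = ["c1ccccc1", "c1ccncc1", "c1ccccc1"]
for key, count in count_fragments(keys):
    print(key, count)
```

With pybel, each count could then be stored on the fragment molecule through its data dictionary (mol.data["COUNT"] = count) before writing it with pybel.Outputfile("sdf", "fragments.sdf").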
The script can be found here: Python script for extraction of Murcko fragments