Feat: Add ligand_prep primitive #75
Adds the `ligand_prep` primitive to the `proteome_scan` subpackage. The primitive converts SMILES strings to 3D SDF files via RDKit. This PR also wires the primitive into the program map and exposes it as `POST /primitive/ligand-prep`.
riya-singh28
left a comment
I have left some comments after the first review pass.
| "Binimetinib": "[H]OC([H])([H])C([H])([H])ON([H])C(=O)C1=C([H])C2=C(N=C([H])N2C([H])([H])[H])C(F)=C1N(C)C1=C(F)C([H])=C(Br)C([H])=C1[H]", | ||
| "Dabrafenib": "[H]C([H])C1=NC([H])=C([H])C(=N1)C1=C(N=C(S1)C(C([H])([H])[H])(C([H])([H])[H])C([H])([H])[H])C1=C(F)C(N([H])S(=O)(=O)C2=C(F)C([H])=C([H])C([H])=C2F)=C([H])C([H])=C1[H]", | ||
| "Vemurafenib": "[H]N(C1=C(F)C(C(=O)C2=C([H])N(C)C3=NC([H])=C(C([H])=C23)C2=C([H])C([H])=C(Cl)C([H])=C2[H])=C(F)C([H])=C1[H])S(=O)(=O)C([H])([H])C([H])([H])C([H])([H])[H]", | ||
| "TAK-632": "", |
Why is this value an empty string here?
I copied this file from the ProteomeScan repository, where Aryan had already added it. We only use a sample of the SMILES in this file for tests.
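Since some entries in the copied asset (for example `TAK-632` and `Regorafenib`) map to empty strings, tests that sample from it may want to skip them. A minimal sketch, assuming the asset is loaded as a name-to-SMILES dict; the helper name `non_empty_smiles` is hypothetical and not part of this PR:

```python
from typing import Dict


def non_empty_smiles(entries: Dict[str, str]) -> Dict[str, str]:
    """Return only the entries whose SMILES value is not empty or whitespace.

    Hypothetical helper for illustration; not part of the PR under review.
    """
    return {name: smiles for name, smiles in entries.items() if smiles.strip()}


# Illustrative data only: "Ethanol" is a stand-in, the empty values mirror
# the placeholder entries seen in the test asset.
sample = {"Ethanol": "CCO", "TAK-632": "", "Regorafenib": ""}
usable = non_empty_smiles(sample)
```

Filtering at load time keeps the asset file identical to the upstream copy while making sure no test feeds an empty string to the primitive by accident.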
| "Vemurafenib": "[H]N(C1=C(F)C(C(=O)C2=C([H])N(C)C3=NC([H])=C(C([H])=C23)C2=C([H])C([H])=C(Cl)C([H])=C2[H])=C(F)C([H])=C1[H])S(=O)(=O)C([H])([H])C([H])([H])C([H])([H])[H]", | ||
| "TAK-632": "", | ||
| "Tucatinib": "[H]N(C1=NC(C([H])([H])[H])(C([H])([H])[H])C([H])([H])O1)C1=C([H])C2=C(N=C([H])N=C2C([H])=C1[H])N(C)C1=C([H])C([H])=C(OC2=C([H])C3=NC([H])=NN3C([H])=C2[H])C(=C1[H])C([H])([H])[H]", | ||
| "Regorafenib": "", |
| "Navitoclax": "[H]C1=C([H])C([H])=C(SC([H])([H])[C@]([H])(N([H])C2=C(S(=O)(=O)C(F)(F)F)C([H])=C(S(=O)(=O)N([H])C(=O)C3=C([H])C([H])=C(N4C([H])([H])C([H])([H])N(C([H])([H])C5=C(C6=C([H])C([H])=C(Cl)C([H])=C6[H])C([H])([H])C([H])([H])C(C([H])([H])[H])(C([H])([H])[H])C5([H])[H])C([H])([H])C4([H])[H])C([H])=C3[H])C([H])=C2[H])C([H])([H])C([H])([H])N2C([H])([H])C([H])([H])OC([H])([H])C2([H])[H])C([H])=C1[H]", | ||
| "Paclitaxel": "[H]O[C@]12C(C([H])([H])[H])(C([H])([H])[H])C(=C(C([H])([H])[H])[C@@]([H])(OC(=O)[C@]([H])(O[H])[C@]([H])(C3=C([H])C([H])=C([H])C([H])=C3[H])N([H])C(=O)C3=C([H])C([H])=C([H])C([H])=C3[H])C1([H])[H])[C@@]([H])(OC(=O)C([H])([H])[H])C(=O)[C@@]1(C([H])([H])[H])[C@]([H])([C@]2([H])OC(=O)C2=C([H])C([H])=C([H])C([H])=C2[H])[C@]2(OC(=O)C([H])([H])[H])C([H])([H])O[C@]2([H])C([H])([H])[C@]1([H])O[H]" | ||
| }, | ||
| "methylated_ligand_smiles": { |
What's the difference between ligand_smiles and methylated_ligand_smiles?
I'm not sure, but I think it has something to do with an additional hydrogen atom. Same as above: I copied this file only as a test asset.
|
|
||
| Examples | ||
| -------- | ||
| >>> addr = ligand_prep("CCO", output="ethanol", ligand_name="EtOH") |
Import statements are required.
Will address this in a separate doctest PR.
| random_seed: Optional[int] = None, | ||
| ) -> str: | ||
| """ | ||
| Convert a SMILES string to a 3D SDF file using RDKit. |
@vsaravind01 Could you please include the steps involved in ligand preparation in the documentation?
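For reference, a common RDKit route from SMILES to a 3D SDF file is: parse the SMILES, add explicit hydrogens, embed 3D coordinates, run a force-field optimization, and write the result. The sketch below illustrates those steps under stated assumptions. It is not taken from this PR: the function name `smiles_to_sdf_sketch`, the choice of MMFF, and the `.sdf` suffix handling are all hypothetical, and the actual `ligand_prep` implementation may differ.

```python
from typing import Optional


def smiles_to_sdf_sketch(
    smiles: str,
    output: str,
    ligand_name: Optional[str] = None,
    random_seed: Optional[int] = None,
) -> str:
    """Illustrative SMILES -> 3D SDF conversion; NOT the PR's implementation."""
    # Step 0: reject empty input early (the test asset contains "" entries).
    if not smiles or not smiles.strip():
        raise ValueError("SMILES string is empty")

    # RDKit is imported lazily so the validation above works without it.
    from rdkit import Chem
    from rdkit.Chem import AllChem

    # Step 1: parse the SMILES; RDKit returns None when parsing fails.
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        raise ValueError(f"Could not parse SMILES: {smiles!r}")

    # Step 2: add explicit hydrogens so the 3D geometry is chemically sensible.
    mol = Chem.AddHs(mol)

    # Step 3: embed 3D coordinates; -1 asks RDKit for a random seed.
    seed = -1 if random_seed is None else random_seed
    if AllChem.EmbedMolecule(mol, randomSeed=seed) == -1:
        raise RuntimeError("3D embedding failed")

    # Step 4: relax the geometry with MMFF (assumed choice of force field).
    AllChem.MMFFOptimizeMolecule(mol)

    # Step 5: write the SDF; "_Name" sets the title line of the record.
    if ligand_name:
        mol.SetProp("_Name", ligand_name)
    sdf_path = output if output.endswith(".sdf") else output + ".sdf"
    writer = Chem.SDWriter(sdf_path)
    writer.write(mol)
    writer.close()
    return sdf_path
```

With RDKit installed, a call such as `smiles_to_sdf_sketch("CCO", "ethanol", ligand_name="EtOH", random_seed=42)` would write `ethanol.sdf` and return that path. Passing a fixed seed makes the embedding step reproducible, which is helpful for tests.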
|
LGTM
Type of change
Please check the option that is related to your PR.
Checklist
`lint.sh` and ensure that it passes (Use `./lint.sh --fix` to fix formatting issues automatically)