Drug-Dev-A uses DeepPurpose, an open-source toolkit for Drug-Target Interaction (DTI) prediction, which estimates the binding affinity of drug molecules to protein targets. Our team replicates some of the DeepPurpose tutorials in our own Google Colaboratory notebooks to check the reproducibility of the code samples and to learn how the toolkit wraps deep learning (DL) models with promising DTI prediction performance into a comprehensive, easy-to-use DL library. Applying machine learning to drug discovery is an effective approach, as shown by this web-app. This README.md contains all the information needed to replicate the DeepPurpose tutorials.
The workhorse for experimenting with DeepPurpose is Google Colaboratory (Colab for short). We use Colab to install DeepPurpose and its dependencies for running and replicating the tutorials. Run each of the steps below in a separate Colab cell.
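```python
# Cell 1: install condacolab and Anaconda (the Colab kernel restarts afterwards)
!pip install -q condacolab
import condacolab
condacolab.install_anaconda()
```
```python
# Cell 2: after the restart, import condacolab again and verify the installation
import condacolab
condacolab.check()
```
```python
# Cell 3: confirm that conda is available
!conda --version
```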
```bash
%%bash
# Clone the DeepPurpose code repository
git clone https://github.com/kexinhuang12345/DeepPurpose.git
# Change directory to DeepPurpose
cd DeepPurpose
# Create the new conda environment and activate it
conda env create -f environment.yml
source activate DeepPurpose
# Install RDKit and Jupyter Notebook (-y answers the confirmation prompt, which a non-interactive cell cannot do)
conda install -y -c conda-forge rdkit
conda install -y -c conda-forge notebook
```
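After the installation cells finish, it can help to confirm that the packages are visible to the notebook's Python kernel before running any tutorial code. The sketch below is one way to do that using only the standard library. The helper name `check_importable` is hypothetical, and the names passed to it are assumed to be the import names of DeepPurpose and RDKit.

```python
import importlib.util


def check_importable(package_names):
    """Return a dict mapping each top-level package name to True if it can be located."""
    status = {}
    for name in package_names:
        # find_spec returns None when a top-level module cannot be found
        status[name] = importlib.util.find_spec(name) is not None
    return status


# Assumed import names; adjust if your environment differs
for name, ok in check_importable(["DeepPurpose", "rdkit"]).items():
    print(f"{name}: {'found' if ok else 'missing'}")
```

If either package is reported as missing, the kernel is probably not using the environment in which the package was installed.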
Save the following as `drug.py`:
```python
from DeepPurpose import oneliner  # Import the one-liner interface
from DeepPurpose.dataset import *  # Import the dataset loaders, including the antiviral drug library
oneliner.repurpose(*load_SARS_CoV2_Protease_3CL(), *load_antiviral_drugs(no_cid = True))
```
Run it with `python drug.py`, and voilà, you have retrieved a list of repurposing drugs from a proprietary library.
```
----output----
Drug Repurposing Result for SARS-CoV2 3CL Protease
+------+----------------------+------------------------+---------------+
| Rank |      Drug Name       |      Target Name       | Binding Score |
+------+----------------------+------------------------+---------------+
|  1   |      Sofosbuvir      | SARS-CoV2 3CL Protease |    190.25     |
|  2   |     Daclatasvir      | SARS-CoV2 3CL Protease |    214.58     |
|  3   |      Vicriviroc      | SARS-CoV2 3CL Protease |    315.70     |
|  4   |      Simeprevir      | SARS-CoV2 3CL Protease |    396.53     |
|  5   |      Etravirine      | SARS-CoV2 3CL Protease |    409.34     |
|  6   |      Amantadine      | SARS-CoV2 3CL Protease |    419.76     |
|  7   |      Letermovir      | SARS-CoV2 3CL Protease |    460.28     |
|  8   |     Rilpivirine      | SARS-CoV2 3CL Protease |    470.79     |
|  9   |      Darunavir       | SARS-CoV2 3CL Protease |    472.24     |
|  10  |      Lopinavir       | SARS-CoV2 3CL Protease |    473.01     |
|  11  |      Maraviroc       | SARS-CoV2 3CL Protease |    474.86     |
|  12  |    Fosamprenavir     | SARS-CoV2 3CL Protease |    487.45     |
|  13  |      Ritonavir       | SARS-CoV2 3CL Protease |    492.19     |
....
```
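The output is ordered by ascending binding score, so the drug with the lowest score is ranked first. A minimal sketch of that ordering in plain Python follows, using the first four rows of the output above as input. `rank_by_binding_score` is a hypothetical helper for illustration and is not part of DeepPurpose.

```python
def rank_by_binding_score(scores):
    """Sort drug scores so the lowest comes first and attach 1-based ranks."""
    ordered = sorted(scores.items(), key=lambda item: item[1])
    return [(rank, name, score) for rank, (name, score) in enumerate(ordered, start=1)]


# Values copied from the output above; the input order is deliberately shuffled
scores = {
    "Vicriviroc": 315.70,
    "Sofosbuvir": 190.25,
    "Simeprevir": 396.53,
    "Daclatasvir": 214.58,
}

for rank, name, score in rank_by_binding_score(scores):
    print(f"{rank:>2}  {name:<12} {score:.2f}")
```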
### Case Study 1(a): A Framework for Drug Target Interaction Prediction
In addition to DTI prediction, DeepPurpose also provides repurposing and virtual screening functions to rapidly generate predictions. The code for this replicated tutorial is available in a Jupyter notebook interfaced with Binder.
**Steps to walk through in this tutorial:**
```python
from DeepPurpose import DTI as models
from DeepPurpose.utils import *
from DeepPurpose.dataset import *

# Load Data, an array of SMILES for drug, an array of Amino Acid Sequence for Target and an array of binding values/0-1 label.
# e.g. ['Cc1ccc(CNS(=O)(=O)c2ccc(s2)S(N)(=O)=O)cc1', ...], ['MSHHWGYGKHNGPEHWHKDFPIAKGERQSPVDIDTH...', ...], [0.46, 0.49, ...]
# In this example, BindingDB with Kd binding score is used. SAVE_PATH is a placeholder for a local data directory.
X_drug, X_target, y = process_BindingDB(download_BindingDB(SAVE_PATH), y='Kd', binary=False, convert_to_log=True)

# Choose the drug and target encodings.
drug_encoding, target_encoding = 'MPNN', 'Transformer'
# Process the data with a cold-protein split.
train, val, test = data_process(X_drug, X_target, y, drug_encoding, target_encoding, split_method='cold_protein', frac=[0.7,0.1,0.2])

# Generate a configuration, initialize a new model, and train it.
config = generate_config(drug_encoding, target_encoding, transformer_n_layer_target = 8)
net = models.model_initialize(**config)
net.train(train, val, test)
# Or load a pretrained model instead (MODEL_PATH_DIR and MODEL_NAME are placeholders; pass one of them).
net = models.model_pretrained(MODEL_PATH_DIR or MODEL_NAME)

# Repurposing with the trained or pretrained model.
X_repurpose, drug_name, drug_cid = load_broad_repurposing_hub(SAVE_PATH)
target, target_name = load_SARS_CoV_Protease_3CL()
_ = models.repurpose(X_repurpose, target, net, drug_name, target_name)

# Virtual screening with the trained or pretrained model.
X_repurpose, drug_name, target, target_name = ['CCCCCCCOc1cccc(c1)C([O-])=O', ...], ['16007391', ...], ['MLARRKPVLPALTINPTIAEGPSPTSEGASEANLVDLQKKLEEL...', ...], ['P36896', 'P00374']
_ = models.virtual_screening(X_repurpose, target, net, drug_name, target_name)
```
**Figure: The Target SMILES used for Virtual Screening in the trained model**
![image](https://user-images.githubusercontent.com/29195354/130312060-24e68b4b-681e-417e-bdf6-5646202873a5.png)
**Link to the binder run for 1(a):** [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/ssiddhantsharma/deep-purpose-tutorial/HEAD?filepath=tutorial-notebooks%2FHackbio_Case_Study_1_(a)_A_Framework_for_Drug_Target_Interaction_Prediction.ipynb) <br>
**[Video tutorial of 1(a) by a member of the team, via Loom](https://www.loom.com/share/1564269d811d410c9fcdcfdb2f55967a?sharedAppSource=personal_library)**
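The `split_method='cold_protein'` setting with `frac=[0.7,0.1,0.2]` requests a split by protein, so that the proteins used for testing are not seen during training. The sketch below illustrates that general idea in plain Python: the unique targets are shuffled and partitioned by the given fractions, and every drug-target pair follows its target. It is an illustration under that assumption, not DeepPurpose's own implementation; `cold_protein_split` and the toy data are hypothetical.

```python
import random


def cold_protein_split(pairs, frac=(0.7, 0.1, 0.2), seed=1):
    """Split (drug, target, label) triples so no target appears in more than one subset."""
    targets = sorted({target for _, target, _ in pairs})
    random.Random(seed).shuffle(targets)
    n_train = int(round(frac[0] * len(targets)))
    n_val = int(round(frac[1] * len(targets)))
    train_targets = set(targets[:n_train])
    val_targets = set(targets[n_train:n_train + n_val])
    # Every remaining target falls into the test subset
    train = [p for p in pairs if p[1] in train_targets]
    val = [p for p in pairs if p[1] in val_targets]
    test = [p for p in pairs if p[1] not in train_targets and p[1] not in val_targets]
    return train, val, test


# Toy data: 10 hypothetical targets with 3 hypothetical drugs each
pairs = [(f"drug{d}", f"target{t}", 0.5) for t in range(10) for d in range(3)]
train, val, test = cold_protein_split(pairs)
print(len(train), len(val), len(test))
```

Splitting by target rather than by row is what makes the evaluation "cold": a random row-level split would let the same protein appear on both sides.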
### Case Study 1(b): A Framework for Drug Property Prediction, with fewer than 10 lines of code.
Many datasets take the form of high-throughput screening data, which contain only drugs and their activity scores. Such data can be formulated as a drug property prediction task. DeepPurpose also provides a repurpose function to predict over a large space of drugs. The code for this replicated tutorial is available in a Jupyter notebook interfaced with Binder.
**Link to the binder run for 1(b):** [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/ssiddhantsharma/deep-purpose-tutorial/HEAD?filepath=tutorial-notebooks%2FHackbio_Case_Study_1_(b)_A_Framework_for_Drug_Property_Prediction.ipynb) <br>
**[Video tutorial of 1(b) by a member of the team, via Loom](https://www.loom.com/share/b38b55e16e184b45a3ae0fde3e3a9df0)**
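To make the property-prediction formulation concrete: a screening table with one drug and one activity score per row maps directly onto two parallel arrays, the inputs and the labels. The sketch below shows that mapping in plain Python, with an optional threshold for turning scores into 0/1 labels. The records, the threshold, and the helper name `to_property_task` are all hypothetical.

```python
def to_property_task(records, threshold=None):
    """Turn (SMILES, activity) records into parallel input and label lists.

    If a threshold is given, scores at or above it become 1 and the rest 0.
    """
    X_drug = [smiles for smiles, _ in records]
    if threshold is None:
        y = [float(score) for _, score in records]
    else:
        y = [1 if score >= threshold else 0 for _, score in records]
    return X_drug, y


# Hypothetical screening records: (SMILES string, activity score)
records = [("CCO", 0.12), ("c1ccccc1", 0.55), ("CC(=O)O", 0.91)]

X_drug, y_regression = to_property_task(records)
_, y_binary = to_property_task(records, threshold=0.5)
print(X_drug, y_regression, y_binary)
```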
### Case Study 2(a): Antiviral Drugs Repurposing for SARS-CoV2 3CLPro, using One Line.
Given a new target sequence (e.g. SARS-CoV2 3CL Protease), retrieve a list of repurposing drugs from a curated drug library of 81 antiviral drugs. The Binding Score column reports Kd values. The code for this replicated tutorial is available in a Jupyter notebook interfaced with Binder.
```
----output----
Drug Repurposing Result for SARS-CoV2 3CL Protease
+------+----------------------+------------------------+---------------+
| Rank |      Drug Name       |      Target Name       | Binding Score |
+------+----------------------+------------------------+---------------+
|  1   |      Sofosbuvir      | SARS-CoV2 3CL Protease |    190.25     |
|  2   |     Daclatasvir      | SARS-CoV2 3CL Protease |    214.58     |
|  3   |      Vicriviroc      | SARS-CoV2 3CL Protease |    315.70     |
|  4   |      Simeprevir      | SARS-CoV2 3CL Protease |    396.53     |
|  5   |      Etravirine      | SARS-CoV2 3CL Protease |    409.34     |
|  6   |      Amantadine      | SARS-CoV2 3CL Protease |    419.76     |
|  7   |      Letermovir      | SARS-CoV2 3CL Protease |    460.28     |
|  8   |     Rilpivirine      | SARS-CoV2 3CL Protease |    470.79     |
|  9   |      Darunavir       | SARS-CoV2 3CL Protease |    472.24     |
|  10  |      Lopinavir       | SARS-CoV2 3CL Protease |    473.01     |
|  11  |      Maraviroc       | SARS-CoV2 3CL Protease |    474.86     |
|  12  |    Fosamprenavir     | SARS-CoV2 3CL Protease |    487.45     |
|  13  |      Ritonavir       | SARS-CoV2 3CL Protease |    492.19     |
....
```
**Link to the binder run for 2(a):** [![Binder](https://mybinder.org/badge_logo.svg)](https://mybinder.org/v2/gh/ssiddhantsharma/deep-purpose-tutorial/HEAD?filepath=tutorial-notebooks%2FHackbio_Case_Study_2_(a)_Antiviral_Drugs_Repurposing_for_SARS_CoV2_3CLPro_using_One_Line.ipynb) <br>
**[Video tutorial of 2(a) by a member of the team, via Loom](https://www.loom.com/share/7e3eac0a45144b9abb60bbea17383f27)**
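Kd values can span several orders of magnitude, so they are often compared on a logarithmic scale as pKd, the negative base-10 logarithm of Kd expressed in molar units. The sketch below performs that conversion under the assumption that the binding scores are in nanomolar. The output does not state its unit, so treat that as an assumption; `kd_nm_to_pkd` is a hypothetical helper.

```python
import math


def kd_nm_to_pkd(kd_nm):
    """Convert a dissociation constant in nanomolar to pKd = -log10(Kd in molar)."""
    if kd_nm <= 0:
        raise ValueError("Kd must be positive")
    return -math.log10(kd_nm * 1e-9)


# Assumes the binding scores are Kd values in nM (not stated in the output)
for name, kd in [("Sofosbuvir", 190.25), ("Ritonavir", 492.19)]:
    print(f"{name}: Kd = {kd} nM, pKd = {kd_nm_to_pkd(kd):.2f}")
```

On this scale the ordering flips: a lower Kd corresponds to a higher pKd.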
### Case Study 2(b): Repurposing using Customized training data, with One Line.
Given a new target sequence (e.g. SARS-CoV 3CL Pro), train on new data (the AID1706 bioassay) and then retrieve a list of repurposing drugs from a proprietary library (e.g. antiviral drugs). The code for this replicated tutorial is available in a Jupyter notebook interfaced with Binder.
```
----output----
Drug Repurposing Result for SARS-CoV2 3CL Protease
+------+----------------------+------------------------+---------------+
| Rank |      Drug Name       |      Target Name       | Binding Score |
+------+----------------------+------------------------+---------------+
|  1   |      Sofosbuvir      | SARS-CoV2 3CL Protease |    190.25     |
|  2   |     Daclatasvir      | SARS-CoV2 3CL Protease |    214.58     |
|  3   |      Vicriviroc      | SARS-CoV2 3CL Protease |    315.70     |
|  4   |      Simeprevir      | SARS-CoV2 3CL Protease |    396.53     |
|  5   |      Etravirine      | SARS-CoV2 3CL Protease |    409.34     |
|  6   |      Amantadine      | SARS-CoV2 3CL Protease |    419.76     |
|  7   |      Letermovir      | SARS-CoV2 3CL Protease |    460.28     |
|  8   |     Rilpivirine      | SARS-CoV2 3CL Protease |    470.79     |
|  9   |      Darunavir       | SARS-CoV2 3CL Protease |    472.24     |
|  10  |      Lopinavir       | SARS-CoV2 3CL Protease |    473.01     |
|  11  |      Maraviroc       | SARS-CoV2 3CL Protease |    474.86     |
|  12  |    Fosamprenavir     | SARS-CoV2 3CL Protease |    487.45     |
|  13  |      Ritonavir       | SARS-CoV2 3CL Protease |    492.19     |
....
```
Link to the binder run for 2(b):
Video-tutorial of 2(b) by a member of team through Loom-App
If you found this package interesting, please see their publication, "DeepPurpose: a deep learning library for drug-target interaction prediction" (Huang et al., Bioinformatics, 2020).
DeepPurpose could also be benchmarked against TorchDrug, a library released in August 2021, which may make a useful point of comparison. More about installing TorchDrug here. If you have any questions, please email siddhaantsharma.ss@gmail.com.