Predicting Bio-Activity

To make a structure-based prediction of the bio-activity of molecules, a list of features is first generated with a KNIME workflow. This list serves as input for either a Neural Network or a Random Forest predictor. In both scripts the input data is split into training and test data, with 70% of the data used to train the predictor and the remaining 30% held out for testing. The parameters of the predictors are tuned with GridSearchCV: the predictor is trained multiple times with different combinations of the available parameters, and the best-performing predictor is then used to predict the bio-activity.
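The split-and-search procedure described above can be sketched roughly as follows. This is a minimal illustration, not the code from the scripts: the synthetic data, the binary labels, the parameter grid, and the choice of three cross-validation folds are all assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the feature table (assumption: binary activity labels).
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

# Hold out 30% of the rows for testing, so 70% is used for training.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y
)

# Hypothetical parameter grid; the grid used by the actual scripts may differ.
param_grid = {"n_estimators": [10, 50], "max_depth": [3, None]}

# Every parameter combination is trained and scored with 3-fold cross-validation.
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=3)
search.fit(X_train, y_train)

# best_estimator_ is refit on the full training split with the best parameters.
y_pred = search.best_estimator_.predict(X_test)
print("best parameters:", search.best_params_)
print("test accuracy:", search.score(X_test, y_test))
```

The same pattern applies to a neural network predictor: only the estimator and the parameter grid change, while the split and the search stay the same.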

Feature Calculation

The KNIME workflow featureGeneration.knar receives a comma-separated csv file containing the SMILES and the predicted bio-activity of each molecule. It generates a list of features for the molecules and outputs a comma-separated file containing the activity, the SMILES structure, and the corresponding features of each molecule.
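As an illustration of the output layout described above, the following sketch reads such a file with Python's standard csv module. The header names (activity, smiles, feature_1, feature_2), the example values, and the integer encoding of the activity are hypothetical; the columns actually produced by the workflow may be named and typed differently.

```python
import csv
import io

# Hypothetical example of the workflow output; header names and values are assumptions.
example = io.StringIO(
    "activity,smiles,feature_1,feature_2\n"
    "1,CCO,46.07,1\n"
    "0,c1ccccc1,78.11,0\n"
)

labels, smiles, features = [], [], []
for row in csv.DictReader(example):
    # Assumption: the activity is encoded as an integer class label.
    labels.append(int(row.pop("activity")))
    smiles.append(row.pop("smiles"))
    # Every remaining column is treated as a numeric feature.
    features.append([float(value) for value in row.values()])

print(labels)
print(smiles)
print(features)
```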

Classification

In order to run the program one has to specify

-t Path of the input csv file generated by the KNIME workflow
-o Destination path of the resulting prediction csv
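The two options could be parsed along the following lines. This is only a sketch using argparse from the standard library; the function name build_parser and the attribute names training_csv and output_csv are hypothetical, and the actual scripts may handle their arguments differently.

```python
import argparse


def build_parser():
    """Build a parser for the -t and -o options described above."""
    parser = argparse.ArgumentParser(
        description="Train a predictor on a feature csv and write predictions."
    )
    parser.add_argument(
        "-t",
        dest="training_csv",  # hypothetical attribute name
        required=True,
        help="Path of the input csv file generated by the KNIME workflow",
    )
    parser.add_argument(
        "-o",
        dest="output_csv",  # hypothetical attribute name
        required=True,
        help="Destination path of the resulting prediction csv",
    )
    return parser


# Parse an explicit argument list here; a script would call parse_args() without arguments.
args = build_parser().parse_args(
    ["-t", "trainingData_Features.csv", "-o", "rfc_GridSearch_res.csv"]
)
print(args.training_csv, args.output_csv)
```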

Random Forest Classifier

randomForest_GridSearch.py -t trainingData_Features.csv -o rfc_GridSearch_res.csv

Neural Network Classifier

neuronalNetwork_GridSearch.py -t trainingData_Features.csv -o rfc_GridSearch_res.csv

Built With

KNIME - Analytics Platform (3.7)
RDKit - Software Package to read and analyse SMILES data (3.4.0v)
Python - Python programming language (3.6)
scikit-learn - Software Package for Machine Learning (v0.20.1)
keras - Open Source Deep Learning Library (2.24)
matplotlib - 2D Plotting Library (2.2.2)
pandas - Datastructures and Dataframes (v0.23.4)
numpy - Scientific computing with Python (v1.15.2)

Authors

Jennifer Bödker
Tobias Nietsch
