Building the Model – WEKA

I have obtained the molecular properties of both the actives and the inactives using PowerMV.

With these as the input, I am trying to build a model using Weka, a machine learning workbench.

  • One way to use Weka is to apply a learning method to a dataset and analyze its output to learn more about the data.
  • Another is to use learned models to generate predictions on new instances.
  • A third is to apply several different learners and compare their performance in order to choose one for prediction. The learning methods are called classifiers.

Using Random Forest as the classifier, I have developed many models. The Random Forest algorithm is based on decision trees: each tree is constructed independently, and each node is split using the best among a subset of predictors chosen at random at that node. It is often among the most accurate classifiers, although no single classifier gives the best results on every dataset.
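To make the idea above concrete, here is a minimal Python sketch. It is not Weka's implementation: each "tree" is reduced to a single split (a stump), the data are made-up toy values, and all names (`train_forest`, `n_candidates`, and so on) are my own assumptions for illustration. It only shows the two ingredients described above: every tree is built independently on a bootstrap sample, and the split is the best one among a randomly chosen subset of predictors.

```python
import random
from collections import Counter

# Made-up toy data: two descriptors per compound (not real PowerMV output).
ROWS = [(1.0, 10.0), (2.0, 11.0), (3.0, 12.0),
        (7.0, 20.0), (8.0, 21.0), (9.0, 22.0)]
LABELS = ["inactive"] * 3 + ["active"] * 3


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels, candidate_features):
    """Return the (feature, threshold) with the lowest weighted Gini
    impurity, searching only the given candidate features."""
    best, best_score = None, float("inf")
    for f in candidate_features:
        for t in sorted({r[f] for r in rows}):
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            if not left or not right:
                continue  # a split must put something on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def train_stump(rows, labels, n_candidates, rng):
    """Fit a one-split tree on a bootstrap sample, considering only a
    random subset of the predictors at the node."""
    idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
    srows = [rows[i] for i in idx]
    slabels = [labels[i] for i in idx]
    features = rng.sample(range(len(rows[0])), n_candidates)
    majority = Counter(slabels).most_common(1)[0][0]
    split = best_split(srows, slabels, features)
    if split is None:  # no usable split: always predict the majority class
        return (None, None, majority, majority)
    f, t = split
    left = [y for r, y in zip(srows, slabels) if r[f] <= t]
    right = [y for r, y in zip(srows, slabels) if r[f] > t]
    return (f, t,
            Counter(left).most_common(1)[0][0],
            Counter(right).most_common(1)[0][0])


def train_forest(rows, labels, n_trees, n_candidates, seed=0):
    """Build n_trees stumps independently of one another."""
    rng = random.Random(seed)
    return [train_stump(rows, labels, n_candidates, rng) for _ in range(n_trees)]


def predict_forest(forest, row):
    """Majority vote over all stumps."""
    votes = Counter()
    for f, t, left_label, right_label in forest:
        if f is None or row[f] <= t:
            votes[left_label] += 1
        else:
            votes[right_label] += 1
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    forest = train_forest(ROWS, LABELS, n_trees=25, n_candidates=1)
    print(predict_forest(forest, (0.5, 9.0)))
    print(predict_forest(forest, (10.0, 25.0)))
```

A real Random Forest grows full trees and repeats the random predictor selection at every node; the stump version keeps the sketch short.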

One of the issues that needs to be taken into consideration when using a machine learning technique on a highly imbalanced dataset is the cost of misclassification. Cost-sensitive classifiers can mitigate this issue by assigning a higher cost to the more serious kind of error, so that the model minimizes the total misclassification cost rather than the raw number of errors.
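As a rough illustration of what "cost-sensitive" means, the sketch below picks the class with the lowest expected misclassification cost instead of the most probable class. The cost values and probabilities are made up for the example, and the function name `min_expected_cost_class` is my own; this is an illustration of the idea, not Weka code.

```python
def min_expected_cost_class(probabilities, cost_matrix):
    """Pick the class with the lowest expected cost.

    probabilities[i]  : estimated probability that the true class is i
    cost_matrix[i][j] : cost of predicting class j when the true class is i
    Returns (best_class_index, list_of_expected_costs).
    """
    n = len(probabilities)
    expected = [
        sum(probabilities[i] * cost_matrix[i][j] for i in range(n))
        for j in range(n)
    ]
    best = min(range(n), key=lambda j: expected[j])
    return best, expected


# Class 0 = inactive, class 1 = active (hypothetical ordering).
# Assumed costs: missing an active (true 1, predicted 0) is five times
# worse than a false alarm (true 0, predicted 1).
COSTS = [[0, 1],
         [5, 0]]

if __name__ == "__main__":
    # A compound the model thinks is only 30% likely to be active.
    best, expected = min_expected_cost_class([0.7, 0.3], COSTS)
    print(best, expected)
```

With these assumed costs, predicting "inactive" has an expected cost of 0.3 × 5 = 1.5, while predicting "active" costs 0.7 × 1 = 0.7, so the compound is labelled active even though "inactive" is the more probable class. With equal costs the same compound would be labelled inactive.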

I am still using statistical measures to choose the most accurate model for my study.

Hope I'll finish it soon …
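The post does not say which statistical measures are used, so the sketch below simply assumes a few that are common for two-class problems (sensitivity, specificity, accuracy and the Matthews correlation coefficient), all computed from a confusion matrix. The counts in the example are invented, not results from my models.

```python
import math


def confusion_metrics(tp, fn, tn, fp):
    """Compute common two-class measures from confusion-matrix counts.

    tp: actives predicted active      fn: actives predicted inactive
    tn: inactives predicted inactive  fp: inactives predicted active
    """
    sensitivity = tp / (tp + fn)            # true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is conventionally set to 0 when the denominator is 0.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "mcc": mcc,
    }


if __name__ == "__main__":
    # Invented counts, for illustration only.
    for name, value in confusion_metrics(tp=40, fn=10, tn=45, fp=5).items():
        print(f"{name}: {value:.3f}")
```

Comparing models on sensitivity and specificity together, rather than on accuracy alone, matters when one class is much rarer than the other.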


Bioassay dataset

I have collected the bioassay dataset of the Confirmation Assay for Inhibitors of Leishmania mexicana Pyruvate Kinase (LmPK), AID 2559.

It consists of 150 compounds in total: 58 are actives, 67 are inactives and 25 are inconclusive.


“Anyone who stops learning is old, whether at twenty or eighty. Anyone who keeps learning stays young.”

It gives me immense pleasure to take you forward into My Dream, into My Vision, into My World and, more than that, into My Life – My Blog… Welcome!!!