I have obtained the molecular properties of both the actives and the inactives using PowerMV.
With these as input, I am trying to build a model using Weka, a machine learning toolkit. Weka can be used in three main ways:
- The first is to apply a learning method to a dataset and analyze its output to learn more about the data.
- The second is to use learned models to generate predictions on new instances.
- The third is to apply several different learners and compare their performance in order to choose one for prediction. The learning methods are called classifiers.
Using Random Forest as the classifier, I have developed many models. The Random Forest algorithm is based on decision trees, where each tree is constructed independently and each node is split using the best among a subset of predictors chosen at random at that node. It is often among the most accurate classifiers, and it produced the most precise results on all of my datasets.
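To make the node-splitting idea concrete, here is a minimal Python sketch of choosing the best split among a randomly chosen subset of predictors. It is not Weka's implementation: the function names (`gini`, `best_split`), the use of Gini impurity as the split criterion, and the tiny dataset are all assumptions made for illustration.

```python
import random


def gini(labels):
    """Gini impurity of a list of 0/1 class labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)


def best_split(rows, labels, n_candidates, rng):
    """Pick the best (feature, threshold) among a random subset of features.

    rows: list of feature lists; labels: list of 0/1 class labels.
    Returns (feature_index, threshold, weighted_gini), or None if no
    candidate feature has two distinct values to split on.
    """
    n_features = len(rows[0])
    # Only a random subset of the predictors is considered at this node.
    candidates = rng.sample(range(n_features), n_candidates)
    best = None
    for f in candidates:
        values = sorted(set(row[f] for row in rows))
        # Try a threshold halfway between each pair of adjacent values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [y for row, y in zip(rows, labels) if row[f] <= t]
            right = [y for row, y in zip(rows, labels) if row[f] > t]
            score = (len(left) * gini(left)
                     + len(right) * gini(right)) / len(labels)
            if best is None or score < best[2]:
                best = (f, t, score)
    return best


# Made-up data: feature 0 separates the classes, feature 1 does not.
rows = [[1.0, 5.0], [2.0, 3.0], [8.0, 4.0], [9.0, 6.0]]
labels = [0, 0, 1, 1]
print(best_split(rows, labels, n_candidates=2, rng=random.Random(0)))
```

A full forest would repeat this at every node of every tree, with each tree grown on its own bootstrap sample of the training data, and would combine the trees by majority vote.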
One issue that needs to be taken into consideration when using machine learning techniques on a highly imbalanced dataset is the cost of misclassification. Cost-sensitive classifiers can mitigate this issue and minimize misclassification errors.
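One common way to make a prediction cost-sensitive is to pick the class with the lowest expected misclassification cost instead of the most probable class. The sketch below illustrates that idea only. The function name `min_cost_class` and the cost values are hypothetical, and it assumes the underlying classifier outputs a probability of being active.

```python
def min_cost_class(p_active, cost_fn, cost_fp):
    """Return 1 (active) or 0 (inactive), whichever has lower expected cost.

    p_active: predicted probability that the compound is active.
    cost_fn:  cost of calling a true active "inactive" (false negative).
    cost_fp:  cost of calling a true inactive "active" (false positive).
    Correct predictions are assumed to cost nothing.
    """
    expected_cost_if_inactive = p_active * cost_fn
    expected_cost_if_active = (1.0 - p_active) * cost_fp
    return 1 if expected_cost_if_active < expected_cost_if_inactive else 0


# With equal costs, a compound with p_active = 0.2 is called inactive.
# If missing an active is assumed to be 10 times worse, it is called active.
print(min_cost_class(0.2, cost_fn=1, cost_fp=1))
print(min_cost_class(0.2, cost_fn=10, cost_fp=1))
```

Raising the false negative cost effectively lowers the probability threshold for calling a compound active, which is useful when actives are rare.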
Using statistical measures, I am still trying to choose the most accurate model for my study.
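As an illustration of the kind of statistical measures that can be compared across models, the sketch below computes a few standard ones from a confusion matrix. The choice of measures, the function name `summarize`, and the counts are assumptions for illustration, not results from my models.

```python
import math


def summarize(tp, fp, tn, fn):
    """Compute common measures from confusion-matrix counts.

    tp/fn: actives predicted active/inactive.
    tn/fp: inactives predicted inactive/active.
    Ratios with an empty denominator are reported as 0.0.
    """
    total = tp + fp + tn + fn
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2.0,
        "mcc": mcc,  # Matthews correlation coefficient
    }


# Made-up imbalanced case: 5 actives, 95 inactives, and a model that
# predicts "inactive" for everything.
for name, value in summarize(tp=0, fp=0, tn=95, fn=5).items():
    print(f"{name}: {value:.3f}")
```

On an imbalanced dataset, plain accuracy can look high even when no actives are found, so measures such as sensitivity, balanced accuracy, and MCC are worth checking alongside it.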
Hope I'll finish it soon …