Sequential feature selection for heart disease detection using random forest
DOI:
https://doi.org/10.24996/ijs.2022.63.9.26
Keywords:
Predictive analytics, predictive modeling, machine learning, automated diagnosis
Abstract
Heart disease identification is one of the most challenging tasks in medicine and requires highly experienced cardiologists. In developing nations such as Ethiopia, however, there are few cardiologists, which makes heart disease detection even more difficult. As an alternative to relying on a cardiologist, this study proposes a more effective model for heart disease detection that combines random forest with sequential feature selection (SFS). SFS improves the performance of the random forest model by removing unrelated features from the heart disease dataset that tend to mislead the classifier. Removing irrelevant and duplicate features from the training set with SFS therefore plays a significant role in improving the performance of the proposed model. The proposed feature selection approach is evaluated on a real-world clinical heart disease dataset collected from the University of California, Irvine (UCI) data repository. Empirical tests on the validation set show that the proposed model performs well compared with existing methods. Overall, the study proposes a state-of-the-art heart disease detection model, built with SFS and random forest, that achieves a classification accuracy of 98.53%.
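The combination of SFS and random forest described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn's `SequentialFeatureSelector` in forward mode, and the function name `fit_sfs_random_forest`, the number of features kept, the tree counts, and the 75/25 split are all hypothetical choices made for illustration. Synthetic data from `make_classification` stands in for the UCI heart disease dataset, so the accuracy it yields says nothing about the reported 98.53%.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SequentialFeatureSelector
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline


def fit_sfs_random_forest(X, y, n_features=4, seed=0):
    """Select features with forward SFS, then train a random forest on them.

    Returns the fitted pipeline, the boolean mask of kept features,
    and the accuracy on a held-out validation split.
    """
    # Hold out a validation set; the 75/25 split is an assumption.
    X_train, X_val, y_train, y_val = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=seed
    )

    # Forward SFS: start from no features and greedily add the one that
    # most improves the cross-validated score of a small random forest.
    selector = SequentialFeatureSelector(
        RandomForestClassifier(n_estimators=20, random_state=seed),
        n_features_to_select=n_features,
        direction="forward",
        cv=3,
    )

    # The final classifier sees only the columns the selector kept.
    model = make_pipeline(
        selector,
        RandomForestClassifier(n_estimators=100, random_state=seed),
    )
    model.fit(X_train, y_train)

    mask = model[0].get_support()          # True for each retained feature
    val_accuracy = model.score(X_val, y_val)
    return model, mask, val_accuracy


if __name__ == "__main__":
    # Synthetic placeholder data: some informative, some redundant,
    # and some pure-noise columns for the selector to discard.
    X, y = make_classification(
        n_samples=300, n_features=13, n_informative=5,
        n_redundant=3, random_state=0,
    )
    model, mask, acc = fit_sfs_random_forest(X, y)
    print("kept feature indices:", [i for i, keep in enumerate(mask) if keep])
    print("validation accuracy:", round(acc, 4))
```

Placing the selector inside the pipeline means feature selection is fitted on the training split only, so the validation score is not inflated by information leaking from the held-out rows.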