Elderly Healthcare System for Chronic Ailments using Machine Learning Techniques – a Review

World statistics show that aging correlates directly with a growing number of health problems and comorbid conditions. As healthcare communities accumulate massive amounts of data at an ever faster pace, it is essential to predict, assist with, and prevent diseases at the right time, especially for elders. Many researchers have likewise reported that elders suffer extensively from chronic health conditions. This work reviews literature studies on prediction systems for various chronic illnesses of elderly people. Most of the reviewed papers proposed machine learning (ML) prediction models, with or without other related intelligence techniques, for detecting chronic diseases in elderly patients at an early stage so that emergency situations can be avoided. Such models provide a promising approach to the analysis of either structured or unstructured datasets and to the discovery of substantial patterns. After defining a generic architecture for the prediction model, we reviewed papers in similar fields based on their suggested methodologies and associated outcomes. The study discusses the pros and cons of different prediction models that use traditional and modern ML techniques. The comparison table is designed based on the data retrieved from selected survey papers.


Major chronic ailments among elders
The most challenging burden on the healthcare system is to detect, monitor and manage chronic ailments [48] and social behavioral disorders [15] at an older age. As people grow older, it becomes essential to determine the frequency of occurrence and the patterns of chronic diseases, as well as the cases of multi-morbidity among them [17,19,20,21]. According to the BKPAI report, the major types of acute problems are fever, hypertension, diarrhea, asthma, gastritis, and arthritis. Chronic diseases are considered a major contributor to many other health problems. They include, for example, hypertension, diabetes, asthma, arthritis, Parkinson's disease, dementia and heart diseases. One of the major reasons for the increasing need for nursing care among elders is their loss of Activities of Daily Living (ADL), together with chronic diseases [18]. The BKPAI survey [6] indicates that, among the various chronic morbidities, 29 percent of the elderly suffer from arthritis. The prevalence of arthritis is highest among rural elderly women. It may rise with increasing age and with fewer years of education. Disability among elders depends greatly on arthritis, which can be immobilizing and comes in many forms. Owing to increasing urbanization and socio-economic differences, hypertension is known as a lifestyle disease. Previous surveys [22,27] showed that almost 21 percent of the elderly report hypertension. The prevalence of hypertension is higher among elderly women than among elderly men. The hallmark of diabetes is a high blood glucose level. Diabetes is a group of diseases that affect the body's ability to produce or use insulin correctly [24]. Ten percent of the elderly reported diabetes in a recent survey that was conducted using bivariate and multivariate analysis [23].
Diabetes is regarded as a lifestyle disease because it is more prevalent among urban elders and its rate rises with education and wealth. Studies have shown a strong relationship between comorbidity factors (such as hypertension, diabetes and cardiovascular disease) and obesity in elders [32]. Obesity is considered an important factor in chronic inflammation, owing to lack of physical activity, increased intra-abdominal fat, an unrestricted diet, etc. Surveys showed that around eight percent of the elderly have asthma. The prevalence of asthma is higher among elderly men living in urban areas than among elderly women and in rural areas. Many studies showed that stress can also worsen asthma [25,26]. Heart diseases include heart attack, stroke, cardiac arrest, high blood pressure, peripheral artery disease, cardiovascular conditions, and heart failure. Among chronic diseases, heart attack (acute myocardial infarction) is one of the most dangerous diseases faced by elderly patients [20]. A more recent study found that the risk of cardiovascular disease can be reduced by exposure to blue light, which also lowers the blood pressure of patients [28]. Emotional stress is considered one of the major causes of certain types of heart disease. Parkinson's disease (PD) is a neurodegenerative disorder that affects nerve cells in the part of the brain that controls muscle movement. Nearly one percent of elderly people are affected by PD, and its prevalence increases with age [29]. The symptoms have a greater impact on seniors and contribute to higher mortality. Alzheimer's disease is the most common form of dementia, affecting more than 5 million people worldwide. Other types of dementia, marked by memory loss and impaired cognitive function, also affect seniors badly [30,31].
The prevalence rates of Alzheimer's disease and dementia are higher among rural elderly women, as stated by the BKPAI survey reported in 2011 [6].

Generic architecture of ML prediction model
According to a survey available on the National Center for Biotechnology Information (NCBI) Bookshelf, older people are more likely to suffer from chronic diseases, such as cardiovascular disease (CVD), circulatory disease and cancers [19,42], than from acute problems. Cardiovascular morbidities and obesity [32,33,46] are associated with certain risk factors, such as aging, hypertension, smoking, diabetes and cancer. Tummala et al. [33] state that diagnosing cardiovascular morbidity is challenging. Given the existing social structure and culture, depression is another major factor in chronic diseases affecting the elderly population. For the chronic diseases discussed in section 3.1, hundreds of ML approaches, using different types of data and different assumptions, have been developed for senior healthcare systems; they cover disease prognosis, diagnosis, treatment suggestion and patient care [34,35]. Currently, the healthcare system is shifting from a traditional ML approach to a patient-centered approach based on modern ML techniques.

Priya and Jinny
Iraqi Journal of Science, 2021, Vol. 62, No. 9, pp: 3138-3151

The generic architecture of the ML prediction model is depicted in Figure-1. The diagram represents the workflow of the prediction system, which consists of four major components: data collection, data preprocessing, data implementation and data evaluation. Data collection is the extraction of different types of patient data from various heterogeneous sources. The data used by most ML classification algorithms must be clean, consistent and complete. Hence, data preprocessing is applied before the model is trained on the dataset, which leads to better prediction. Once the model is trained, it is evaluated on the test dataset using various evaluation methods. The results are then compared in order to choose the best ML technique.
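The four components described above can be sketched in a few lines of Python. The example is a minimal illustration under stated assumptions, not the architecture of any reviewed paper: the (age, systolic blood pressure) records, the mean imputation and the two toy classifiers (a majority-class baseline and a 1-nearest-neighbour rule) are hypothetical choices made only to show how the stages connect.

```python
from typing import Callable, Dict, List, Optional, Tuple

RawRecord = Tuple[List[Optional[float]], int]  # features may contain None
Record = Tuple[List[float], int]               # (feature vector, class label)
Model = Callable[[List[float]], int]


def collect() -> List[RawRecord]:
    # Stage 1, data collection: made-up (age, systolic BP) pairs.
    # Label 1 marks a chronic condition; None marks a missing reading.
    return [
        ([62.0, 118.0], 0), ([65.0, 122.0], 0), ([68.0, None], 0),
        ([71.0, 150.0], 1), ([74.0, 158.0], 1), ([66.0, 125.0], 0),
        ([79.0, 165.0], 1), ([64.0, 120.0], 0),
    ]


def preprocess(raw: List[RawRecord]) -> List[Record]:
    # Stage 2, preprocessing: replace each missing value by its column mean.
    # (A real study would compute the mean on the training split only.)
    n_cols = len(raw[0][0])
    means = []
    for c in range(n_cols):
        present = [x[c] for x, _ in raw if x[c] is not None]
        means.append(sum(present) / len(present))
    return [([means[c] if v is None else v for c, v in enumerate(x)], y)
            for x, y in raw]


def train_majority(train: List[Record]) -> Model:
    # Stage 3, candidate A: always predict the most frequent training label.
    labels = [y for _, y in train]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority


def train_1nn(train: List[Record]) -> Model:
    # Stage 3, candidate B: 1-nearest neighbour, squared Euclidean distance.
    def predict(x: List[float]) -> int:
        nearest = min(train,
                      key=lambda r: sum((a - b) ** 2 for a, b in zip(r[0], x)))
        return nearest[1]
    return predict


def accuracy(model: Model, test: List[Record]) -> float:
    # Stage 4, evaluation: fraction of test records classified correctly.
    return sum(model(x) == y for x, y in test) / len(test)


def run_pipeline() -> Dict[str, float]:
    data = preprocess(collect())
    train, test = data[:6], data[6:]
    trainers = {"majority": train_majority, "1-NN": train_1nn}
    return {name: accuracy(fit(train), test) for name, fit in trainers.items()}


scores = run_pipeline()
best = max(scores, key=scores.get)  # comparison step: keep the best technique
print(scores, "-> best:", best)
```

With only eight invented records the scores carry no clinical meaning; the point is the order of the stages and the final comparison step.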

Data collection
Many existing ML-based prediction models have shown their ability to detect key features, even in complex datasets. In the healthcare domain, patient data are mainly available in either structured or unstructured format, as defined in Figure-2.

Figure 2-Classification of Patient's Data
Structured data consist mainly of two parts, the patient's laboratory data and demographic information, both presented in tabular format. Unstructured data may include the elderly patient's narration of his/her illness and the doctor's diagnosis and interrogation records, in the form of images, textual information or clinical notes.

Structured Data
Data collection is an important process in every ML model. Based on the review, most prediction models use structured data as the input to their training set. For example, Chen et al. [13] proposed a prediction system that addresses the unique characteristics of regional chronic diseases, applying traditional ML mechanisms to structured data. It was an effective prediction model that used both structured and unstructured data drawn from real-time hospital data. However, its accuracy is limited by the diversity of the features provided by the clinical data. Ishita et al. [31] defined a predictive system in which data were acquired by interviewing nearly sixty elders with the help of the Geriatric Depression Scale (GDS). The model was trained using various socio-demographic parameters, such as age, gender, income status, education, employment and family type, along with other comorbid conditions. The researchers compared five different ML algorithms to choose the best model for predicting depression among seniors, which was found to be Bayes Net. However, the accuracy of the model was only satisfactory, and it can be improved further by considering more factors or by applying better algorithms with optimization. The study [12] was conducted to identify the factors, such as monthly income, diagnosis, depression, discomfort and perceived health status, that determine the Health-Related Quality of Life (HRQoL) of elders with chronic diseases. A prediction model [12,13] was designed with various traditional ML models, using 716 cases obtained from a Korean structured database system. The advantage of such a system is that it identifies the influencing factors that affect the HRQoL of seniors, which greatly helps in building a better prediction system with ML techniques. However, it failed to examine the impact of one variable on another, and the nature of the influencing factors also remains unknown.
Similarly, many intelligent electronic devices are used to capture data and to analyze chronic disease management as patients go about their daily routine. Wearable devices with cloud storage [50,51] and data analytics are also used to indicate falls or any one or two specific medical issues related to chronic disease [44]. For example, wearable devices can indicate the heart condition and initiate an emergency call when a heart attack is detected [36,37].

Another example is the Bluetooth emergency system [16], which was devised for the wireless detection of heart attack using a wrist watch. Further, the telecardiology system [16] is used for heart attack detection and benefits patients located in rural areas. Danielsen et al. [36] proposed a fall prevention tool, especially for elderly people, that combines wearables with ambient intelligence [43]. Hussain et al. [34] proposed an Internet of Things (IoT) enabled framework and studied its challenges in data acquisition. It uses Wireless Body Area Networks (WBAN), various health sensing devices and smart phones as health monitoring devices. The intelligent frameworks above are well designed to handle emergency cases by providing real-time medical services in smart cities. However, they are general models that do not provide proper support to the rural and semi-urban areas of the country. In recent years, IoT has become a flourishing technology, especially in the healthcare industry, where it offers smart intelligent services. Connectivity in IoT enables fast communication among a variety of objects that are linked to local and remote networks [16]. Thus, smart environments and wearable devices [37] are found to be more appealing in the health industry for patients of all ages. Recent market studies state that numerous wrist watches and bands are available in the market. A few fitness bands allow users to maintain a healthy lifestyle by measuring a person's day-to-day activities. Certain bands may monitor only specific physical fall incidents or health discomforts of an individual.
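The kind of rule a wearable band might apply before raising an alert can be illustrated as follows. This is only a sketch under assumptions: the reviewed papers do not specify their detection logic, and the threshold rule, the constant IMPACT_THRESHOLD_G, the heart-rate limit and all function names are hypothetical values chosen for illustration, not clinical recommendations.

```python
import math
from typing import List, Tuple

Sample = Tuple[float, float, float]  # accelerometer reading (x, y, z) in g

IMPACT_THRESHOLD_G = 2.5  # hypothetical impact threshold, not a clinical value


def magnitude(sample: Sample) -> float:
    """Length of the acceleration vector; about 1 g when the wearer is at rest."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)


def detect_impacts(samples: List[Sample],
                   threshold: float = IMPACT_THRESHOLD_G) -> List[int]:
    """Return the indices of samples whose magnitude exceeds the threshold."""
    return [i for i, s in enumerate(samples) if magnitude(s) > threshold]


def sustained_high_heart_rate(bpm: List[int], limit: int = 120,
                              run: int = 3) -> bool:
    """True when `run` consecutive readings exceed `limit` beats per minute."""
    streak = 0
    for value in bpm:
        streak = streak + 1 if value > limit else 0
        if streak >= run:
            return True
    return False


readings = [(0.0, 0.0, 1.0), (0.1, 0.0, 1.0), (2.0, 1.5, 1.5), (0.0, 0.0, 1.0)]
if detect_impacts(readings) or sustained_high_heart_rate([80, 125, 130, 90]):
    print("possible emergency: notify caregiver")  # placeholder for a real call
```

A fixed threshold is easy to run on a low-power device but produces false alarms; the learned models discussed in this review are one way to replace such hand-set rules.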

Unstructured Data
Many other promising approaches to extracting data in unstructured formats are applied in the healthcare field. The climate and living habits of a region may become prominent causes of regional chronic diseases [38]. For example, Chen et al. [13] proposed a model that predicts regional chronic diseases using modern ML techniques, such as Convolutional Neural Network based Multimodal Disease Risk Prediction (CNN-MDRP), for unstructured data such as images. Such models are well suited to handling incomplete hospital data, but their accuracy suffers because it depends heavily on the features of the hospital data. A recent study [16] discussed the development of a customized healthcare system that helps patients manage chronic diseases effectively. The system was developed using the open-source big data platform Hadoop together with text mining, a technique that helps to transform unstructured data into structured data. The biggest challenge here is handling an unstructured dataset, obtained from patient lab reports and vital signs, that includes grammar mistakes, spelling mistakes, short phrases, etc. Such a system helps to improve healthcare systems. It is very important to have proper verification and testing techniques to deal with the different critical domains. Pradeebha et al. and Bhuvaneswari et al. [26,39] addressed the feature extraction of unstructured data in the form of images, using various filtering techniques such as the Gabor Kernel Filter (GKF), Local Binary Pattern (LBP) and Histogram of Oriented Gradients (HOG). Bhuvaneswari et al. [39] described feature selection carried out by various methods, namely information gain, Correlation-based Feature Selection (CFS), Principal Component Analysis (PCA), Genetic Algorithm (GA) and fuzzy c-means clustering. Among these methods, CFS has the maximum accuracy.
The feature extraction work can be improved further with other variants of LBP in order to detect a greater number of lung diseases. Li et al. [27] discussed the handling of structured, semi-structured and unstructured health datasets for detecting hypertension patients. They used Natural Language Processing (NLP) and text-mining techniques to convert unstructured data into a structured format. Zia Uddin et al. [37] proposed a multimodal human activity disease prediction system that uses multiple wearable sensors. The sensor data are encoded to model time-sequential information with a Recurrent Neural Network (RNN) algorithm. However, an RNN is limited in its ability to process long-term information, a limitation that can be resolved with Long Short-Term Memory (LSTM). Xue et al. [40] also stated that LSTMs are used to perform many sequence learning tasks. Purushotham et al. [41] demonstrated that deep learning models perform better on different healthcare prediction tasks. Deep learning is best suited to large volumes of data and high levels of complexity. Their work provides benchmarking evaluation results for deep learning and ML models, using huge records of raw clinical time series data as input features to the prediction system. Recent research [45] showed that deep learning methods provide better performance in clinical intervention prediction tasks that use Intensive Care Unit (ICU) data sources. The researchers demonstrated the benefits of feature selection using the LSTM method.

However, the experiment was carried out only in a forward-facing manner to perform hourly prediction of both onset and weaning interventions.
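The conversion of unstructured text into a structured record, which several of the studies above perform with NLP and text mining, can be illustrated with a deliberately simple rule-based sketch. It is not the method of any reviewed paper: the regular expressions, the field names and the sample notes are assumptions made for illustration, and real clinical text (with the spelling and grammar mistakes mentioned above) needs far more robust processing.

```python
import re
from typing import Dict, Optional

# Hypothetical patterns for three fields that often appear in clinical notes.
AGE = re.compile(r"(\d{2,3})[\s-]*(?:years?|yrs?)", re.IGNORECASE)
BLOOD_PRESSURE = re.compile(r"(\d{2,3})\s*/\s*(\d{2,3})")
GLUCOSE = re.compile(r"glucose\D{0,15}(\d{2,3})", re.IGNORECASE)


def note_to_record(note: str) -> Dict[str, Optional[int]]:
    """Turn a free-text note into one row of a table; None = not found."""
    record: Dict[str, Optional[int]] = {
        "age": None, "systolic": None, "diastolic": None, "glucose": None,
    }
    age = AGE.search(note)
    if age:
        record["age"] = int(age.group(1))
    bp = BLOOD_PRESSURE.search(note)
    if bp:
        record["systolic"] = int(bp.group(1))
        record["diastolic"] = int(bp.group(2))
    glucose = GLUCOSE.search(note)
    if glucose:
        record["glucose"] = int(glucose.group(1))
    return record


note = "72-year-old woman, c/o breathlesness. BP 150/95 mmHg, fasting glucose: 132 mg/dL."
print(note_to_record(note))
```

Fields that cannot be found stay as None, so the resulting table has exactly the missing values that the preprocessing stage must later handle.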

Data preprocessing
Raw data collected from different sources tend to be incomplete, inconsistent and noisy, and often contain missing values. Analyzing such data may produce misleading results. Hence, the quality of the data and its representation are the foremost criteria to satisfy before the analysis is executed. Data preprocessing is considered an important step in data mining and ML models. It may include data cleaning, normalization, transformation, feature selection and extraction. The output of data preprocessing is the input to the training set of ML models. The major steps in data preprocessing are data cleaning, data integration, data transformation and data reduction. Data cleaning identifies outliers, smooths out noisy data, corrects inconsistencies and fills in missing values. Missing values can be filled either with a global constant label, such as "Unknown" or "Not Applicable", or with a measure of central tendency of the attribute, such as its mean or median. Noisy data can be handled by smoothing techniques, such as smoothing by bin means, bin medians or bin boundaries, and by regression. Most data smoothing techniques are also used for data discretization and data reduction. Normalization is of great use, especially for classification algorithms such as Artificial Neural Networks (ANN) and K-Nearest Neighbors (K-NN). Min-max normalization, z-score normalization and normalization by decimal scaling are some of the techniques used for data normalization. Data reduction or discretization uses techniques such as binning, histogram analysis, clustering and decision trees.
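The cleaning and normalization techniques named above are short enough to write out directly. The sketch below uses only the Python standard library; the function names are illustrative choices, the z-score uses the population standard deviation, and min-max normalization assumes that the attribute has at least two distinct values.

```python
import math
from typing import List, Optional


def fill_missing_with_mean(values: List[Optional[float]]) -> List[float]:
    """Replace each missing value (None) by the mean of the observed values."""
    present = [v for v in values if v is not None]
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]


def smooth_by_bin_means(values: List[float], bin_size: int) -> List[float]:
    """Sort, split into equal-frequency bins, replace each value by its bin mean."""
    ordered = sorted(values)
    smoothed: List[float] = []
    for start in range(0, len(ordered), bin_size):
        bin_values = ordered[start:start + bin_size]
        smoothed.extend([sum(bin_values) / len(bin_values)] * len(bin_values))
    return smoothed


def min_max(values: List[float], new_min: float = 0.0,
            new_max: float = 1.0) -> List[float]:
    """Rescale linearly so the smallest value maps to new_min, the largest to new_max."""
    lo, hi = min(values), max(values)
    return [new_min + (v - lo) * (new_max - new_min) / (hi - lo) for v in values]


def z_score(values: List[float]) -> List[float]:
    """Centre on the mean and divide by the population standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [(v - mean) / std for v in values]


def decimal_scaling(values: List[float]) -> List[float]:
    """Divide by the smallest power of ten that brings every |value| below 1."""
    max_abs = max(abs(v) for v in values)
    j = 0
    while max_abs / 10 ** j >= 1:
        j += 1
    return [v / 10 ** j for v in values]


print(smooth_by_bin_means([4, 8, 15, 21, 21, 24, 25, 28, 34], bin_size=3))
print(min_max([10.0, 20.0, 30.0]))
```

Distance-based classifiers such as K-NN benefit most from these rescalings, because an attribute measured in large units would otherwise dominate the distance.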

Data implementation and evaluation
In order to predict, prevent and manage chronic diseases, an efficient healthcare management system with minimal errors is essential. Several data mining methods, such as the J48 decision tree, bagging and boosting algorithms, were used in a chronic disease prediction model, and their performance was compared for three different age groups (including the elderly group). The decision tree is regarded as one of the most powerful classification techniques in chronic disease prediction [23]; nevertheless, the study concluded that AdaBoost performed better than the decision tree and bagging. Performance can be improved further by integrating ensemble methods with other traditional ML techniques, such as Naive Bayes (NB), Support Vector Machine (SVM) and other base learners. Perveen et al. [23] constructed a classification model with better performance to classify diabetic patients. However, this model does not necessarily yield better results for other disease datasets, such as heart attack, dementia or depression. Zhang et al. [18] described a predictive model that uses the C5.0 decision tree method to achieve higher accuracy with various health variables, such as personal habits, social activities and mental state. The limitation of this system is that it is restricted to continuous samples. It also fails to predict multiple classes of disability states. Another survey stated that, instead of individual data mining methods for chronic disease prediction, it is better to apply hybrid models to enhance the performance and accuracy of the prediction system [22]. An automated predictive model was developed to support earlier diagnosis of depression using traditional ML techniques. Here, multiple ML classifiers, such as BayesNet, Multilayer Perceptron (MLP), Logistic Regression (LR), decision tree and Sequential Minimal Optimization (SMO), were applied to train the system.
The results clearly revealed that SMO provides better accuracy and high precision values, while BayesNet provides a better Receiver Operating Characteristic (ROC) area and a lower Root Mean Square Error (RMSE) than the other ML classifiers. Thus, a comparative performance study helps in the early diagnosis of depression among senior citizens and in choosing the best-performing classifier [31]. The drawback of this model is its inaccurate feature selection for prediction. The influence of other equally important parameters may be considered for further improvement and wider acceptability of the system. In another study, the performance of the learning model was measured by evaluation parameters such as accuracy, sensitivity, precision, recall and F-score; the Stepwise Logistic Regression (SLR) analysis model provided the best performance, with an accuracy rate of 93% and an F-score of 49% [12,13]. This system failed to analyze the impact of each prediction variable considered in the ML techniques. Better measurement tools could also be considered in order to address disease-specific parameters and target specific population groups, and hence achieve more accurate results. In a further study, the training dataset was learnt with three of the most widely used ML algorithms, namely NB, K-NN and Classification and Regression Tree (CART). The performance of the predictive model was evaluated with a few measures, namely accuracy, precision, recall and F-score. CNN-MDRP achieved better accuracy than CNN-based Uni-modal Disease Risk Prediction (CNN-UDRP). The system achieved an accuracy of 94.8% with a faster convergence speed. It was observed that, for a well-trained model, measurements such as the Area Under Curve (AUC) and ROC values must be approximately equal to one [38]. The graph analysis showed that a higher recall corresponds to a lower risk in disease prediction.
That paper discussed in depth the handling of structured data with efficient traditional ML algorithms and of unstructured data with modern ML techniques, such as CNN-MDRP, to yield better accuracy. Although the paper deliberates on the prediction of complex diseases, it fails to predict when the disease is at a high-risk rate. In such cases, predicting chronic diseases at an early stage for specific age groups becomes a challenging task. Recently, an algorithm was proposed to provide proper guidance for recognizing and diagnosing heart failure at an early stage [20]. This plays a vital role in reducing the heart failure rate, as early diagnosis helps healthcare professionals to initiate appropriate treatment. The work clearly explained the evaluation algorithms for diagnosing patients with chronic heart failure, but it did not discuss earlier detection of heart disease for any specific age group. This can be achieved by applying an appropriate traditional ML algorithm with proper feature selection parameters. The most challenging task in IoT-enabled health monitoring systems is to handle easily the emergency situations that occur due to chronic problems. Handling them well also helps to maximize the other potential benefits of superior services. Hussain et al. [34] proposed an IoT-enabled smart environment and discussed the various challenges faced in data processing, management and emergency care. There are certain other examples of prediction models in which big data analytics can be merged well with ML algorithms, rather than with any other technique [28]. The limitation discussed in this case concerns the accuracy of the model. False negative results are among the most dangerous predictions, because the patient stays unnoticed until the disease reaches a critical stage. In such a model, it is better to use either transfer learning or deep learning algorithms, depending on the size of the datasets. Fernández-Caballero et al. [14] proposed a smart environment architecture for the detection of emotions in older people. The model uses the SVM technique together with a traditional ML algorithm for geometric feature extraction based on facial positions (58 facial points are considered). Other means, such as music performance and color variations in the environment, also help to regulate emotions. The study did not focus on any specific healthcare problem. Pradeebha et al. [26] employed an SVM with a Radial Basis Function kernel as the classifier and obtained an accuracy of 98%. Li et al. [27] demonstrated a classification model using the C4.5 and SVM methods. Such methods provide better accuracy owing to their ease of use and their ability to handle various categories of data, even in the presence of an overfitting issue. The big challenge is to choose appropriate learning algorithms for the management of chronic disease prediction in the elderly.
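The evaluation measures named in this section follow directly from the counts of true and false positives and negatives. The sketch below computes them and also shows, with hypothetical counts that are not taken from any reviewed study, how an accuracy near 93% can coexist with an F-score near 49% when only a small share of patients is ill. This is one possible reading of such a gap, offered as an illustration only. The pairwise AUC function is a simple quadratic-time formulation chosen for clarity.

```python
from typing import Dict, List


def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> Dict[str, float]:
    """Accuracy, precision, recall (sensitivity) and F-score from a confusion matrix."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "precision": precision,
        "recall": recall,
        "f_score": 2 * precision * recall / (precision + recall),
    }


def auc(positive_scores: List[float], negative_scores: List[float]) -> float:
    """Probability that a random ill patient scores above a random healthy one.

    Ties count as half a win. A value of 1.0 means perfect separation.
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical counts for 1000 patients, of whom only 67 are truly ill.
metrics = classification_metrics(tp=33, fp=35, fn=34, tn=898)
print({name: round(value, 3) for name, value in metrics.items()})
```

In this invented example the 34 false negatives are the patients who would stay unnoticed, which is why recall and F-score deserve as much attention as accuracy in disease prediction.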

Comparative study of chronic disease prediction for elderly people using traditional and modern ML techniques
Based on the data collected from a few selected survey papers, this study examined various factors of chronic disease prediction models developed with traditional and modern ML techniques. A variety of ML algorithms have been used for chronic disease prediction. The comparative study aims to give researchers the information they need to choose an appropriate ML algorithm and thereby improve the health statistics of older communities. It discusses the research objective, data collection, feature selection methods, ML methods and evaluation scores, along with the advantages and limitations of the various studies. The survey found that the majority of the elderly population suffers from one or more chronic diseases, with a substantial number of elders having physical and functional disabilities. Structured datasets are the most common input for the development of chronic disease prediction models. Certain studies [13] considered structured as well as unstructured datasets (Electronic Health Records (EHR), medical images and gene data), whereas a few studies [37,44] used sensor data from, for example, the Electrocardiogram (ECG), magnetometer, accelerometer and gyroscope. It is observed that very few studies use feature selection methods for further processing of prediction models. Certain researchers used modern ML algorithms, such as Random Forest (RF) [12], CNN [13] and LSTM [45], as feature selection methods. Word embedding [13] and the Synthetic Minority Oversampling Technique (SMOTE) [30] were also applied in choosing the best features of the prediction model.
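SMOTE, mentioned above, is generally described as a way of balancing classes: it creates synthetic minority-class samples by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below shows that core idea only. It is a simplified illustration, not the implementation used in [30]; the function name, the parameters k and seed, and the sample points are assumptions, and at least two minority samples are required.

```python
import random
from typing import List

Point = List[float]


def squared_distance(a: Point, b: Point) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def smote(minority: List[Point], n_new: int, k: int = 2, seed: int = 0) -> List[Point]:
    """Create n_new synthetic points on segments between minority neighbours."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    synthetic: List[Point] = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        # The k nearest minority-class neighbours of the chosen sample.
        neighbours = sorted(others, key=lambda p: squared_distance(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Hypothetical minority-class samples (for example, two scaled lab values).
ill_patients = [[1.0, 1.0], [2.0, 1.5], [1.5, 2.0], [3.0, 3.0]]
for point in smote(ill_patients, n_new=3):
    print([round(v, 2) for v in point])
```

Because every synthetic point lies between two real minority samples, the method adds no information from outside the observed range; it only rebalances the training set.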

Certain studies use statistical and ML methods to select the features for predicting most chronic diseases. In such systems, accuracy and performance reach only acceptable levels. The study clearly indicates that very few researchers trained their models with modern ML algorithms. Chen et al. [13] discussed the CNN-UDRP and CNN-MDRP approaches; of the two, CNN-MDRP performed better according to the performance measurements. In addition, an RNN algorithm [37] was used to train a model and obtained an accuracy of 99.69%. Another study [30] used the CART algorithm for training and achieved an accuracy of nearly 92%. The most common traditional ML algorithms used in the various studies are decision tree, SVM, NB and K-NN. One of the earlier studies [23] trained the model with ensemble methods, namely the bagging and AdaBoost algorithms, and attained an ROC score of 0.98. These traditional ML techniques were found to be better for structured data. Such methods yield better accuracy owing to their simplicity and their suitability for numerical as well as nominal attributes, even in the presence of an overfitting issue. Finally, the review specifies the advantages and disadvantages of every study in the survey, as listed in Table-1. The descriptive analysis determined that most modern ML techniques help to improve the performance of the health monitoring system. The great challenge is to use an efficient, smart, cutting-edge technology with correctly chosen traditional or modern ML algorithms for the prediction of one or more chronic diseases based on real-time and available health records. Moreover, healthcare prediction results are highly sensitive, because a prediction is not guaranteed to be a true positive or a true negative.
Furthermore, diverse smart environment techniques that use any wearable device are likely to incorporate traditional ML and deep learning mechanisms, such as RNN, CNN-MDRP and CNN-UDRP, for building more sophisticated prediction and feature extraction models. Also, to achieve better computational speed, GPUs are used along with smart cutting-edge technologies such as IoT and Cloud-of-Things (CoT). According to the National Sample Survey Organization (NSSO), there has been a steady rise in the elderly population. In rural areas, the health-related problems faced by older citizens are more critical than those of other age groups. Moreover, very few healthcare centres exist in the villages, and health professionals may not visit such centres regularly. At the same time, multiple technology-related challenges still exist in such areas. Elderly patients require more time, given the usually complicated nature of their histories, medications and co-morbidities, which also means that they require more testing. However, it is important for healthcare providers to recognize that older patients do not always follow what is in the textbooks; providers must be vigilant to ensure that they do not underestimate the potential burden of caring for the booming population of older individuals. Since innovation in the field of geriatrics is severely scarce, a strong, graded integration of technology is required. The main objective of any healthcare system is to promote a healthy environment that enhances the quality of life of older adults. In this regard, our work reviewed various research papers to compare different prediction models. It is found that conventional and modern ML techniques play eminent roles in the healthcare industry for the prediction of chronic diseases.
The different methods of data collection, data preprocessing, ML techniques and evaluation models, and the results obtained from the surveyed papers, are also discussed. Hence, the healthcare industry is urged to maintain secure cloud storage with an appropriate learning prediction model, which helps health professionals to diagnose, predict and control the occurrence of chronic problems. We also conclude that integrating smart healthcare devices with traditional and modern ML methods, such as deep learning models, could help to build more efficient and adequate prediction systems in the healthcare industry.

Compliance with Ethical Standards
Funding: The above study was not funded by any organization.
Conflict of Interest: The authors declare that they do not have any conflict of interest.