IAES Nawala: The use of data mining

Greetings, Nawala readers! May you always be in good health.

This is the IAES Newsletter from the Institute of Advanced Engineering and Science. Today we share news about the use of data mining. Data mining is the process of discovering interesting knowledge from large amounts of data, and its techniques can be used to improve operations and support better decisions. Panthong and Wongkanthiya (2023) used data mining techniques to analyze the health conditions of the elderly. Their results show four groups of elderly people with distinct characteristics, a finding that can be used to develop appropriate health interventions for each group.

Analysis of clustering and association using data mining technique for elderly health condition dataset

Rattanawadee Panthong, Thawin Wongkanthiya

An annual survey of elderly health conditions aimed to investigate the performance of elderly health care and to evaluate the health and health promotion of the elderly. The analysis relied mainly on data mining techniques to evaluate health conditions. This study analyzed the data with a clustering method and then took the data from each group to find association rules. The results showed that the elderly health condition data could be classified into four different groups: cluster 1 (25%) were elderly men with high blood pressure who smoked cigarettes; cluster 2 (25%) were elderly women with no congenital disease whose eyesight examination showed that they were long-sighted; cluster 3 (24%) were elderly women with no congenital disease but with insomnia and osteoarthritis; and cluster 4 (26%) were elderly women with high blood pressure and diabetes. Each group also yielded rules showing the correlation among its data, with a minimum confidence of 0.8 and a minimum support of no less than 0.5.
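To make the support and confidence thresholds mentioned in the abstract concrete, the following is a minimal Python sketch of how an association rule can be checked against them. The records, attribute names, and helper functions are hypothetical and serve only as an illustration; they are not the authors' data or code.

```python
# Hypothetical toy records: each set lists attributes observed for one person
# in a single cluster. These values are invented for illustration only.
records = [
    {"female", "hypertension", "diabetes"},
    {"female", "hypertension", "diabetes"},
    {"female", "hypertension"},
    {"female", "hypertension", "diabetes"},
    {"female", "insomnia"},
]

MIN_SUPPORT = 0.5      # threshold quoted in the abstract
MIN_CONFIDENCE = 0.8   # threshold quoted in the abstract


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    return sum(itemset <= record for record in records) / len(records)


def confidence(antecedent, consequent, records):
    """support(antecedent and consequent) divided by support(antecedent)."""
    base = support(antecedent, records)
    if base == 0:
        return 0.0
    return support(antecedent | consequent, records) / base


def rule_passes(antecedent, consequent, records):
    """True if the rule antecedent -> consequent meets both thresholds."""
    rule_support = support(antecedent | consequent, records)
    rule_confidence = confidence(antecedent, consequent, records)
    return rule_support >= MIN_SUPPORT and rule_confidence >= MIN_CONFIDENCE


if __name__ == "__main__":
    for lhs, rhs in [({"diabetes"}, {"hypertension"}),
                     ({"hypertension"}, {"diabetes"})]:
        print(sorted(lhs), "->", sorted(rhs),
              "kept" if rule_passes(lhs, rhs, records) else "rejected")
```

In this toy data the rule "diabetes -> hypertension" is kept, because every record with diabetes also has hypertension, while the reverse rule is rejected because its confidence falls below 0.8. The direction of a rule therefore matters even when both directions share the same support.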

In another study, Shanshool et al. (2023) show that data mining techniques can be used for early diagnosis and treatment of coronary artery disease (CAD). The methods compared are decision tree (DT), logistic regression (LR), random forest (RF), and Naïve Bayes (NB). The NB method achieved the best accuracy, 89.47%. This research can support accurate, fast, and efficient CAD diagnosis.

Comparison of various data mining methods for early diagnosis of human cardiology

Abeer Mohammed Shanshool, Enas Mohammed Hussien Saeed, Hasan Hadi Khaleel

Recent healthcare reports clearly indicate increasing mortality rates worldwide due to different diseases, which puts a significant burden on the healthcare sector. Coronary artery disease (CAD) is one of the main reasons for these rising death rates since it affects the heart directly. For early diagnosis and treatment of CADs, a swiftly growing technology called data mining has been used to collect and categorize necessary data from patients: age, blood sugar and pressure, type of chest pain, cholesterol, and so on. Therefore, this paper adopted four data mining methods, decision tree (DT), logistic regression (LR), random forest (RF), and Naïve Bayes (NB), to achieve the goal. The paper used the Cleveland dataset along with the Python programming language to compare the four data mining methods in terms of precision, accuracy, recall, and area under the curve. The results showed that the NB method had the best accuracy, 89.47%, compared with previous studies, which will help with accurate, fast, and inexpensive diagnosis of CADs.
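For readers unfamiliar with the four comparison metrics named in the abstract, here is a minimal Python sketch of how accuracy, precision, recall, and area under the ROC curve can be computed for a binary classifier. The labels, scores, 0.5 decision threshold, and function names are hypothetical; this is not the authors' code and does not use the Cleveland dataset.

```python
# Hypothetical true labels (1 = disease present) and classifier scores.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.7, 0.2]

# Assumed decision threshold of 0.5 to turn scores into predicted labels.
y_pred = [1 if s >= 0.5 else 0 for s in scores]


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == 1 and p == 1 for t, p in pairs)
    fp = sum(t == 0 and p == 1 for t, p in pairs)
    tn = sum(t == 0 and p == 0 for t, p in pairs)
    fn = sum(t == 1 and p == 0 for t, p in pairs)
    return tp, fp, tn, fn


def accuracy(y_true, y_pred):
    """Share of all predictions that are correct."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return (tp + tn) / (tp + fp + tn + fn)


def precision(y_true, y_pred):
    """Share of predicted positives that are truly positive."""
    tp, fp, _, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(y_true, y_pred):
    """Share of true positives that the classifier finds."""
    tp, _, _, fn = confusion_counts(y_true, y_pred)
    return tp / (tp + fn) if (tp + fn) else 0.0


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative.

    Ties count as half. Assumes both classes are present.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    print("accuracy ", accuracy(y_true, y_pred))
    print("precision", precision(y_true, y_pred))
    print("recall   ", recall(y_true, y_pred))
    print("auc      ", auc(y_true, scores))
```

Accuracy, precision, and recall depend on the chosen threshold, whereas the area under the curve is computed from the raw scores and summarizes ranking quality across all thresholds. That is one reason studies often report it alongside the other three.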

Data mining can also make the diagnosis of type 2 diabetes faster and more efficient. In the study below, a random forest model achieved 90.43% accuracy and was integrated into a web application. The study reported significant reductions in diagnosis time, cost, and difficulty.

Predictive machine learning applying cross industry standard process for data mining for the diagnosis of diabetes mellitus type 2

Victor Garcia-Rios, Marieta Marres-Salhuana, Fernando Sierra-Liñan, Michael Cabanillas-Carbonell

Currently, type 2 diabetes mellitus is one of the world’s most prevalent diseases and has claimed millions of people’s lives. The present research aims to assess the impact of the use of machine learning in the diagnostic process of type 2 diabetes mellitus and to offer a tool that facilitates the diagnosis of the disease quickly and easily. Different machine learning models were designed and compared, and random forest was the algorithm that generated the model with the best performance (90.43% accuracy), which was integrated into a web platform, working with the PIMA dataset, which was validated by specialists from the Peruvian League for the Fight against Diabetes organization. The result was a decrease of (A) 88.28% in the information collection time, (B) 99.99% in the diagnosis time, (C) 44.42% in the diagnosis cost, and (D) 100% in the level of difficulty, concluding that the application of machine learning can significantly optimize the diagnostic process of type 2 diabetes mellitus.

The articles above are only a small part of the research on the use of data mining. For more information, readers can visit the IAES International Journal of Artificial Intelligence (IJ-AI) page and read articles for free at https://ijai.iaescore.com/.