Clustering Analysis of European Health Status Pre-Covid-19
Chapter from the book:
Akoğul, S. & Tuna, E. (eds.) 2024. Academic Studies with Current Econometric and Statistical Applications.
Synopsis
Genetic factors, the living environment and consumption habits all play an important role in a long and healthy life. Where unhealthy products (alcohol, cigarettes, etc.) are consumed and the diet is unhealthy, diseases such as obesity and diabetes inevitably follow. The aim of this study is to perform a cluster analysis on individual health criteria in order to observe how prepared Europe was for a pandemic, using data from the period just before the Covid-19 pandemic, which was declared worldwide in March 2020. The data describe the health status of people living in Europe prior to the pandemic and are taken from the Global Health Report published by the World Health Organization (WHO) in November 2021. In assessing European health status, countries are analyzed according to four categories and clustered using the k-means method: 1) demographic characteristics; 2) alcohol and tobacco prevalence per capita; 3) the likelihood of dying from cardiovascular disease (CVD), cancer, diabetes and chronic respiratory disease (CRD); and 4) the prevalence of diabetes, tuberculosis, systolic blood pressure (SBP) and diastolic blood pressure (DBP), and obesity. The 38 countries with no missing observations in the report are included in the analysis. Based on the cluster dendrogram, the Calinski-Harabasz index and the elbow method, the countries are divided into 2 clusters, with 23 countries in cluster 1 and 15 countries in cluster 2. The Ward method with Euclidean distance is used for clustering. A bootstrap confusion matrix for the 2 clusters is then calculated with the k-means method: cluster membership is predicted correctly in 94.04% of cases for cluster 1 and in 91.7% of cases for cluster 2.
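To make the model-selection step more concrete, the following is a minimal sketch of how a number of clusters might be chosen with the Calinski-Harabasz index and then fitted with k-means. It is not the authors' code and does not use the WHO data: the 38 two-dimensional points are randomly generated, with group sizes of 23 and 15 chosen only to mirror the split reported above, and every function name (`kmeans`, `calinski_harabasz`, and so on) is a hypothetical helper written for illustration. The Ward dendrogram, the elbow plot and the bootstrap confusion matrix are not reproduced here.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))


def kmeans(points, k, rng, restarts=10, max_iter=100):
    """Lloyd's algorithm with random restarts.

    Returns (labels, centroids, inertia) for the restart with the
    lowest within-cluster sum of squares.
    """
    best = None
    for _ in range(restarts):
        centroids = rng.sample(points, k)
        labels = None
        for _ in range(max_iter):
            new_labels = [
                min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                for p in points
            ]
            if new_labels == labels:
                break  # assignments stopped changing
            labels = new_labels
            for j in range(k):
                members = [p for p, l in zip(points, labels) if l == j]
                if members:  # keep the old centroid if a cluster empties
                    centroids[j] = mean(members)
        inertia = sum(sq_dist(p, centroids[l]) for p, l in zip(points, labels))
        if best is None or inertia < best[2]:
            best = (labels, list(centroids), inertia)
    return best


def calinski_harabasz(points, labels, k):
    """Ratio of between- to within-cluster dispersion, each scaled by
    its degrees of freedom; larger values indicate better separation."""
    n = len(points)
    overall = mean(points)
    between = within = 0.0
    for j in range(k):
        members = [p for p, l in zip(points, labels) if l == j]
        if not members:
            continue
        c = mean(members)
        between += len(members) * sq_dist(c, overall)
        within += sum(sq_dist(p, c) for p in members)
    return (between / (k - 1)) / (within / (n - k))


rng = random.Random(0)

# ASSUMPTION: synthetic stand-in data, not the WHO indicators.
# 38 points in two well-separated groups of 23 and 15.
data = [(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(23)] + [
    (rng.gauss(10, 1), rng.gauss(10, 1)) for _ in range(15)
]

# Score each candidate k and keep the one with the highest index.
scores = {}
for k in range(2, 5):
    labels, _, _ = kmeans(data, k, rng)
    scores[k] = calinski_harabasz(data, labels, k)
best_k = max(scores, key=scores.get)

labels, centroids, _ = kmeans(data, best_k, rng)
sizes = sorted(labels.count(j) for j in range(best_k))
print("chosen k:", best_k, "cluster sizes:", sizes)
```

On real indicators measured in different units, the features would normally be standardized before computing Euclidean distances. Whether the study does so is not stated in the synopsis.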