Artificial intelligence-based model for automatic, real-time, and non-invasive estimation of blood potassium level in pediatric patients
Hamid Mokhtari Torshizi1, Negar Omidi2, Mohammad Rafie Khorgami3, Fattaneh Khalaj4, Mohsen Ahmadi5*
1PhD student of Biomedical Engineering and Physics Department, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran
2Associate Professor of Cardiology, Department of Cardiology, School of Medicine, Tehran Heart Center, Tehran University of Medical Sciences, Tehran, Iran
3Rajaie Heart Center and Department of Pediatric Cardiology, School of Medicine, Iran University of Medical Sciences, Tehran, Iran
4Digestive Disease Research Center, Digestive Disease Research Institute, Tehran University of Medical Sciences, Tehran, Iran
5Associate professor, Department of Biomedical Engineering, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
*Corresponding author: Mohsen Ahmadi, MD, Associate Professor, Department of Biomedical Engineering, School of Medicine, Shahid Beheshti University of Medical Sciences, Tehran, Iran.
A cross-correlation matrix of the signal features was used for dimensionality reduction: for each pair of features with a correlation greater than 0.7, one of the two was removed (Fig. 3).
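A minimal sketch of this kind of correlation filter is given here. The text does not state which feature of a correlated pair was dropped, or whether the sign of the correlation was taken into account, so the sketch assumes a greedy rule on the absolute Pearson correlation: a feature is kept only if it is not correlated above the threshold with any feature already kept. The function name `drop_correlated` and the toy data are hypothetical.

```python
import numpy as np

def drop_correlated(X, names, threshold=0.7):
    """Greedy filter: keep a feature only if |r| with every kept feature is <= threshold.

    Assumption: absolute Pearson correlation, and the earlier column of a pair wins.
    """
    corr = np.abs(np.corrcoef(X, rowvar=False))  # feature-by-feature correlation matrix
    kept = []
    for j in range(X.shape[1]):
        if all(corr[j, k] <= threshold for k in kept):
            kept.append(j)
    return X[:, kept], [names[j] for j in kept]

# Toy data (made up): "b" is a linear copy of "a"; "c" is only weakly related to "a".
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
X = np.column_stack([a, 2 * a + 1, [1, -1, 1, -1, 1, -1]])
X_reduced, kept_names = drop_correlated(X, ["a", "b", "c"])
print(kept_names)
```

With this rule the result depends on column order, so a different ordering of the features could retain a different member of each correlated pair.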
We used z-score standardization according to the following formula:
\[ z_i = \frac{x_i - \bar{x}}{S} \]
where
- $x_i$ is a data point ($x_1, x_2, \ldots, x_n$)
- $\bar{x}$ is the sample mean
- $S$ is the sample standard deviation
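As a small illustration, z-score standardization (subtract the sample mean, then divide by the sample standard deviation) can be applied column by column with NumPy. The helper name `zscore` and the numbers are made up for the example, and `ddof=1` is used on the assumption that S denotes the sample (n − 1) standard deviation.

```python
import numpy as np

def zscore(X):
    """Standardize each column: subtract its sample mean, divide by its sample std."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)  # ddof=1 gives the sample standard deviation
    return (X - mean) / std

# Two made-up features on very different scales
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])
Z = zscore(X)
print(Z)  # each column becomes [-1, 0, 1]
```

After standardization both columns are on the same scale, which matters for scale-sensitive models such as SVR.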
Regression is one of the most common forms of supervised learning in machine learning. We evaluated the linear relationship between the ECG features and the serum potassium level using the Pearson correlation coefficient, and the non-linear relationship using the Decision Tree and Random Forest algorithms. Python version 3 was used for cardiac signal processing.
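For reference, the Pearson correlation coefficient between one ECG feature x and the measured potassium level y over n samples is conventionally defined as follows. The symbols are illustrative notation rather than taken from the original text.

```latex
r_{xy} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}
              {\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}\,\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2}}
```

Values of r close to +1 or −1 indicate a strong linear relationship, while values near 0 indicate little linear relationship. A non-linear dependence can still be present in that case, which is why tree-based models are also considered.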
Support Vector Regression (SVR), Decision Tree, Random Forest, Linear Regression, and Polynomial Regression algorithms were used to test the linear and non-linear relationships between the ECG features and the serum potassium level. Based on the results, the Random Forest algorithm performed best in this study.
Random Forest is an ensemble learning method that combines several decision trees. Each tree is trained separately, and a random subset of features is sampled at each tree node. The final prediction is the average of the regression outputs of all the decision trees.
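The averaging step can be checked directly with scikit-learn's `RandomForestRegressor`. The sketch below is only illustrative: the data are synthetic stand-ins for ECG features, and the hyperparameters (`n_estimators=50`, `max_features="sqrt"`) are assumptions, not the settings used in the study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))  # synthetic stand-in for ECG features
y = 4.2 + 0.3 * X[:, 0] - 0.2 * X[:, 1] ** 2 + 0.05 * rng.normal(size=200)

forest = RandomForestRegressor(
    n_estimators=50,      # number of trees (illustrative)
    max_features="sqrt",  # random subset of features considered at each split
    random_state=0,
)
forest.fit(X, y)

# Predict with every individual tree, then average the results by hand
per_tree = np.stack([tree.predict(X) for tree in forest.estimators_])
manual_average = per_tree.mean(axis=0)
forest_prediction = forest.predict(X)

# The forest's prediction is the mean of its trees' predictions
print(np.allclose(manual_average, forest_prediction))
```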
In this study, we used 20% of the population as a test set. The estimated potassium level was calculated from the acquired ECG data using the corresponding patient-specific potassium prediction model developed during the training phase. To assess accuracy, we calculated the mean absolute error (MAE), that is, the mean of the absolute differences between the estimated and measured potassium levels of the patients.
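A hedged sketch of such an evaluation is shown here: an 80/20 train/test split, the five regression families named in the text, and both MAE and MSE computed on the held-out 20%. The data are synthetic, and all hyperparameters (including degree 2 for the polynomial model and the default SVR kernel) are assumptions, so the resulting ranking says nothing about the results reported in this study.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.svm import SVR
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(300, 5))  # synthetic, already standardized features
y = 4.2 + 0.3 * X[:, 0] - 0.2 * X[:, 1] ** 2 + 0.05 * rng.normal(size=300)

# Hold out 20% of the samples as the test set
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

models = {
    "SVR": SVR(),
    "Decision Tree": DecisionTreeRegressor(random_state=0),
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=0),
    "Linear Regression": LinearRegression(),
    "Polynomial Regression": make_pipeline(
        PolynomialFeatures(degree=2), LinearRegression()
    ),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    scores[name] = {
        "MAE": mean_absolute_error(y_test, y_pred),  # mean of |estimated - measured|
        "MSE": mean_squared_error(y_test, y_pred),   # mean of (estimated - measured)^2
    }

for name, s in scores.items():
    print(f"{name:22s} MAE={s['MAE']:.3f}  MSE={s['MSE']:.3f}")
```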
Figure 2: Histogram of serum potassium level.
Figure 3: Cross-correlation matrix.
Based on Figure 3, we removed one of each pair of parameters with a correlation greater than 0.7. The variables PR, Ps, PT, Twidth, QS, QR, QT, RS, RT, ST, and RtoT were used to train the regression models. Table 1 shows the performance of each regression algorithm in terms of the mean squared error (MSE). As indicated in the table, the polynomial method has the lowest accuracy, and the random forest method has the highest.
Table 1: Efficiency of regression methods based on MSE.
Because most of the patients studied had a potassium level between 4 and 4.5, we used scatter diagrams to get a clearer picture of how the regression methods behave. Figure 4 shows the scatter diagram for each method.
Figure 4: Scatter diagram of different methods.
Another important goal of this study was to determine the importance of each feature in estimating the potassium level. Figure 5 shows the importance of each feature, as computed by the random forest algorithm, as a percentage. Figure 6 shows how the decision tree algorithm arrives at its decisions.
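One common way to obtain such percentages is scikit-learn's impurity-based `feature_importances_` attribute, scaled to 100. The text does not say which importance measure was used, so this is an assumption. The feature names below are borrowed from the text, but the data are synthetic and constructed so that the second column drives the target.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

feature_names = ["PR", "QT", "ST"]  # names from the text; the values below are synthetic
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 3))
# By construction, the target depends almost entirely on column 1 ("QT")
y = 4.2 + 0.5 * X[:, 1] + 0.01 * rng.normal(size=300)

forest = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# Impurity-based importances sum to 1; scale them to percentages
importance_pct = 100 * forest.feature_importances_
ranking = sorted(
    zip(feature_names, importance_pct), key=lambda item: item[1], reverse=True
)
for name, pct in ranking:
    print(f"{name}: {pct:.1f}%")
```

Impurity-based importances can be distorted when features are strongly correlated, which is one reason a correlation filter before training is useful.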
Figure 5: Feature importance according to the random forest algorithm.
Figure 6: Assignment of serum potassium level based on the input features in the decision tree algorithm.