Prediction of intrapartum fetal hypoxia considering feature selection algorithms and machine learning models

Comert, Zafer; ŞENGÜR, ABDULKADİR; BUDAK, ÜMİT; KOCAMAZ, ADNAN

doi:10.1007/s13755-019-0079-z

Comert Z., ŞENGÜR A., BUDAK Ü., KOCAMAZ A. F.

HEALTH INFORMATION SCIENCE AND SYSTEMS, vol.7, no.1, 2019 (ESCI)

Publication Type: Article / Full Article
Volume: 7 Issue: 1
Publication Date: 2019
DOI: 10.1007/s13755-019-0079-z
Journal Name: HEALTH INFORMATION SCIENCE AND SYSTEMS
Journal Indexes: Emerging Sources Citation Index (ESCI)
Affiliated with İnönü University: Yes

Abstract

Introduction: Cardiotocography (CTG) consists of two biophysical signals, fetal heart rate (FHR) and uterine contraction (UC). Computerized systems are commonly used in CTG analysis to provide more objective and repeatable results.
Materials and Methods: Feature selection algorithms are important for computerized systems because they both reduce the dimension of the feature set and reveal the most relevant features without losing too much information. In this paper, three filter and two wrapper feature selection methods are evaluated together with four machine learning models, namely artificial neural network (ANN), k-nearest neighbor (kNN), decision tree (DT), and support vector machine (SVM), on a high-dimensional feature set obtained from the open-access CTU-UHB intrapartum CTG database. The signals are divided into two classes, normal and hypoxic, according to the umbilical artery pH value measured after delivery (hypoxic: pH < 7.20). A comprehensive diagnostic feature set is generated first, comprising features from the morphological, linear, nonlinear, time-frequency, and image-based time-frequency domains. Then, combinations of the feature selection algorithms and machine learning models are evaluated to identify the most effective features and to achieve high classification performance.
Results: The experimental results show that a lower-dimensional feature set comprising more relevant features can achieve better classification performance than the high-dimensional feature set. The most informative feature subset was generated by considering how frequently each feature was selected by the feature selection algorithms. The best results were obtained with the SVM model using only 12 selected features instead of the full feature set of 30 diagnostic indices. The achieved sensitivity and specificity were 77.40% and 93.86%, respectively.
Conclusion: Evaluating multiple feature selection algorithms together led to the best results.
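
The frequency-based selection described in the Results can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the five selection algorithms or the individual features, so the feature names used in the usage note, the function names, and the tie-breaking rule are all hypothetical. The sketch only shows the idea of keeping the features that are picked most often across several selectors, plus how sensitivity and specificity are computed from binary predictions.

```python
from collections import Counter


def select_by_frequency(selections, k):
    """Return the k features chosen most often across the selectors.

    selections: one iterable of feature names per feature selection
    algorithm (hypothetical input format, not taken from the paper).
    Ties are broken alphabetically so the result is deterministic;
    this tie-breaking rule is an assumption.
    """
    # Count each feature at most once per selector.
    counts = Counter(name for chosen in selections for name in set(chosen))
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [name for name, _ in ranked[:k]]


def sensitivity_specificity(y_true, y_pred, positive=1):
    """Compute (sensitivity, specificity) for binary labels.

    Assumes both classes occur in y_true; otherwise a division by
    zero is raised.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)
```

As a usage example, calling `select_by_frequency` with five lists of made-up feature names (one list per selector) and `k=3` returns the three names that appear in the most lists. In the setting of the abstract, `k` would be 12 and the selected columns would then be passed to the classifier (SVM in the reported best case), with the hypoxic class (pH < 7.20) treated as the positive label.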