Artificial Intelligence-Based Prediction of Covid-19 Severity on the Results of Protein Profiling

YAŞAR, ŞEYMA; ÇOLAK, CEMİL; YOLOĞLU, SAİM

doi:10.1016/j.cmpb.2021.105996

YAŞAR Ş., ÇOLAK C., YOLOĞLU S.

COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, vol. 202, 2021 (SCI-Expanded, Scopus)

Publication Type: Article / Full Article
Volume: 202
Publication Date: 2021
DOI: 10.1016/j.cmpb.2021.105996
Journal Name: COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE
Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Academic Search Premier, PASCAL, Applied Science & Technology Source, BIOSIS, Biotechnology Research Abstracts, Compendex, Computer & Applied Sciences, EMBASE, INSPEC, MEDLINE
İnönü University Affiliated: Yes

Abstract

Background: COVID-19 progresses slowly and negatively affects many people. However, most infected people develop mild to moderate symptoms and recover without hospitalization. Therefore, the development of early diagnosis and treatment strategies is essential. One such approach is proteomic technology based on the blood protein profiling technique. This study aims to classify three COVID-19 positive patient groups (mild, severe, and critical) and a control group based on blood protein profiling using deep learning (DL), random forest (RF), and gradient boosted trees (GBTs). Methods: The dataset, obtained from an open-source website, consists of 93 samples (60 COVID-19 patients, 33 controls) and 370 variables. It contains age, gender, and 368 proteins, which are used to model the relationship between disease severity and proteins using DL and machine learning approaches (RF, GBTs). An evolutionary algorithm tunes the hyperparameters of the models, and the predictions are assessed through the accuracy, sensitivity, specificity, precision, F1 score, classification error, and kappa performance metrics. Results: The accuracy of RF (96.21%) was higher than that of DL (94.73%). However, the ensemble classifier GBTs produced the highest accuracy (96.98%). TGB1BP2 in the cardiovascular II panel and MILR1 in the inflammation panel were the two most important proteins associated with disease severity. Conclusions: The proposed model (GBTs) achieved the best prediction of disease severity based on the proteins compared to the other algorithms. The results indicate that changes in blood proteins associated with the severity of COVID-19 may be used in monitoring and early diagnosis/treatment of the disease.
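
The performance metrics named in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `classification_metrics`, the macro-averaging over classes, and the toy confusion matrix are all assumptions made for illustration, since the abstract does not state how the per-class values were aggregated. The counts are hypothetical and are not the study's results.

```python
def classification_metrics(cm):
    """Compute summary metrics from a square confusion matrix.

    cm[i][j] is the number of samples whose true class is i and whose
    predicted class is j. Per-class values are macro-averaged, which is
    an assumption; the abstract does not specify the averaging scheme.
    """
    k = len(cm)
    total = sum(sum(row) for row in cm)
    row_sums = [sum(cm[i]) for i in range(k)]                        # true counts
    col_sums = [sum(cm[i][j] for i in range(k)) for j in range(k)]   # predicted counts

    def safe_div(a, b):
        return a / b if b else 0.0

    accuracy = safe_div(sum(cm[i][i] for i in range(k)), total)

    sens, spec, prec, f1 = [], [], [], []
    for c in range(k):
        tp = cm[c][c]
        fn = row_sums[c] - tp
        fp = col_sums[c] - tp
        tn = total - tp - fn - fp
        s = safe_div(tp, tp + fn)          # sensitivity (recall)
        p = safe_div(tp, tp + fp)          # precision
        sens.append(s)
        spec.append(safe_div(tn, tn + fp))  # specificity
        prec.append(p)
        f1.append(safe_div(2 * p * s, p + s))

    # Cohen's kappa: observed agreement corrected for chance agreement.
    expected = safe_div(sum(row_sums[c] * col_sums[c] for c in range(k)), total ** 2)
    kappa = safe_div(accuracy - expected, 1 - expected)

    def macro(values):
        return sum(values) / k

    return {
        "accuracy": accuracy,
        "classification_error": 1 - accuracy,
        "sensitivity": macro(sens),
        "specificity": macro(spec),
        "precision": macro(prec),
        "f1": macro(f1),
        "kappa": kappa,
    }


# Hypothetical 4-class matrix (control, mild, severe, critical); not study data.
toy_cm = [
    [30, 2, 1, 0],
    [1, 17, 2, 0],
    [0, 2, 18, 1],
    [0, 0, 1, 18],
]
for name, value in classification_metrics(toy_cm).items():
    print(f"{name}: {value:.4f}")
```

Classification error is simply one minus accuracy, while kappa discounts the agreement expected by chance, which matters when the groups are unequal in size, as with the 60 patients and 33 controls here.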