No 3 (2023)

Development and validation of a tool for statistical comparison of ROC-curves using the example of algorithms based on artificial intelligence technologies

T. M. Bobrovskaya, Y. S. Kirpichev, E. F. Savkina, S. F. Chetverikov, K. M. Arzamasov

4-15 15

Abstract

Background: Under the National Strategy for the Development of Artificial Intelligence, large-scale digitalization of healthcare is under way in the Russian Federation. It gives rise to a large number of practical and scientific tasks, which in turn require convenient tools. One such tool is the ROC analysis tool, which was developed and successfully applied within the project «Experiment on the use of innovative technologies in the field of computer vision for the analysis of medical images and further application in the healthcare system of the city of Moscow». However, a module for comparing ROC curves is urgently needed to solve a wider range of problems in analyzing the performance of technologies based on artificial intelligence.
Aim: to implement a module of the ROC analysis tool that compares areas under the curve using statistical methods and calculates the p-value, and to test it on real data.
Materials and methods: the tool is implemented in Python 3.9. The 95% confidence interval for ROC curves was calculated using bootstrapping and the DeLong method. Areas under the ROC curves were compared using a permutation test.
The tool was tested on the outputs of 6 algorithms on 2 data sets. Areas under the ROC curve were compared pairwise, and the results were compared with those obtained on the same data by the DeLong method (roc.test function, R 3.6.1).
Results: the p-values obtained with the permutation test were in most cases comparable to the roc.test results. However, in 4 out of 30 cases the p-values differed significantly, which changed the interpretation of the test.
Discussion: in our opinion, the differences between the results of the two methods stem from the properties of the methods themselves: the DeLong method is more conservative. In addition, because the permutation test relies on pseudorandom resampling, its results can vary between runs, which introduces uncertainty. The developed tool also compares only data of the same length, which limits its use, although it could be extended to data of different lengths.
Conclusion: the module for comparing ROC curves using statistical criteria with calculation of the p-value was successfully implemented and tested.
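
The permutation test named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`average_ranks`, `auc`, `permutation_test_auc`), the number of permutations, the fixed seed, and the choice to swap rank-transformed scores between the two algorithms case by case are all assumptions made for illustration. Both score vectors are assumed to refer to the same cases, in line with the equal-length limitation noted in the abstract.

```python
import numpy as np


def average_ranks(x):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    x = np.asarray(x, dtype=float)
    _, inverse, counts = np.unique(x, return_inverse=True, return_counts=True)
    last = np.cumsum(counts)              # highest rank inside each tie group
    mean_rank = last - (counts - 1) / 2   # mean rank of each tie group
    return mean_rank[inverse]


def auc(y_true, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic."""
    y_true = np.asarray(y_true, dtype=bool)
    ranks = average_ranks(scores)
    n_pos = int(y_true.sum())
    n_neg = len(y_true) - n_pos
    u = ranks[y_true].sum() - n_pos * (n_pos + 1) / 2
    return u / (n_pos * n_neg)


def permutation_test_auc(y_true, scores_a, scores_b, n_permutations=2000, seed=0):
    """Two-sided paired permutation test for AUC(a) - AUC(b).

    Assumption: scores are rank-transformed first so that the two
    algorithms are on a comparable scale before being swapped.
    """
    rng = np.random.default_rng(seed)     # fixed seed makes the p-value reproducible
    y_true = np.asarray(y_true, dtype=bool)
    a = average_ranks(scores_a)
    b = average_ranks(scores_b)
    observed = auc(y_true, a) - auc(y_true, b)
    count = 0
    for _ in range(n_permutations):
        swap = rng.random(len(y_true)) < 0.5   # swap the pair for about half the cases
        perm_a = np.where(swap, b, a)
        perm_b = np.where(swap, a, b)
        diff = auc(y_true, perm_a) - auc(y_true, perm_b)
        if abs(diff) >= abs(observed) - 1e-12:
            count += 1
    p_value = (count + 1) / (n_permutations + 1)
    return observed, p_value
```

Fixing the seed, as in this sketch, is one way to limit the run-to-run variability of p-values mentioned in the discussion; a different seed may give a slightly different p-value.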

Development of a machine learning model predicting the incidence of newly diagnosed HIV infection in the subjects of the Russian Federation

M. Yu. Kotlovskiy, E. B. Tsybikova, S. M. Lorsanov, P. A. Fadeev, S. O. Fadeeva, A. V. Gusev

16-29 24

Abstract

Aim: to develop a model predicting the incidence of newly diagnosed HIV infection in the subjects of the Russian Federation using machine learning methods.
Materials and methods: The initial data were obtained from Federal statistical observation Form No. 61 and Rosstat data on the average annual population of 85 subjects of the Russian Federation (2016-2022). Machine learning methods and their ensembles were compared for building a regression model that predicts the incidence of newly diagnosed HIV infection in the subjects of the Russian Federation.
Results: The model was built using the following methods: linear regression, decision tree, random forest, gradient boosting on decision trees (GBDT) and bagging. The interactive computing environment «Jupyter Notebook» (6.5.2) and the software libraries «Pandas» (1.5.3), «Scikit-learn» (1.0.2), «Statsmodels» (0.13.5) and CatBoost were used. Optimal hyperparameters were selected using the Optuna framework. The following quality metrics were used: root mean square error (RMSE); coefficient of determination (R2); mean absolute error (MAE); mean absolute percentage error (MAPE); median absolute error (MedAE).
Conclusions: Different machine learning methods and algorithms give different results in terms of model accuracy metrics. Linear regression showed the worst values of all quality metrics (MAPE 67%). The combination (bagging) of two ensemble methods, random forest and GBDT, performed best, as it had the best values on the largest number of quality metrics. It is therefore reasonable to test all available machine learning methods and algorithms and then select the best model from the results obtained.
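
The five quality metrics listed in this abstract can be computed as in the minimal sketch below. The function name `regression_metrics` and the sample values are hypothetical and are not taken from the article; the formulas are the standard definitions, with MAPE expressed in percent and assuming no true value equals zero.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return RMSE, R2, MAE, MAPE (in percent) and MedAE for a regression model."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    abs_err = np.abs(err)
    ss_res = np.sum(err ** 2)                         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total sum of squares
    return {
        "RMSE": float(np.sqrt(np.mean(err ** 2))),
        "R2": float(1.0 - ss_res / ss_tot),
        "MAE": float(abs_err.mean()),
        # Assumption: y_true contains no zeros, otherwise MAPE is undefined.
        "MAPE": float(np.mean(abs_err / np.abs(y_true)) * 100.0),
        "MedAE": float(np.median(abs_err)),
    }


# Illustrative values only (not data from the study).
metrics = regression_metrics([10, 20, 30, 40], [12, 18, 33, 40])
```

Lower is better for RMSE, MAE, MAPE and MedAE, while higher is better for R2, so models have to be ranked per metric with the direction in mind.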

Algorithm for UMLS metathesaurus concepts specificity estimation using example of analysis of the semantic model describing axial spondyloarthritis differential diagnostics

P. A. Astanin, L. V. Ronzhin, S. E. Rauzina

30-43 11

Abstract

Background. Early diagnosis of axial spondyloarthritis (axSpA) is a difficult task that calls for clinical decision support (CDS). A large body of unstructured data is currently available for developing CDS systems. Semantic analysis of such data is a complex problem, and unified tools for named entity recognition are required. The largest data source for biomedical text annotation is the Unified Medical Language System (UMLS) Metathesaurus. It includes more than 11 million atomic terms describing 4.6 million concepts. The main problem in using UMLS for medical text analysis is the presence of numerous unspecific (generic) terms that have no clinical value. Using them significantly worsens search results. Tools for automatic estimation of the degree of specificity therefore need to be developed.
Aim. To develop an algorithm for estimating the degree of specificity of UMLS Metathesaurus concepts (using axial spondyloarthritis as an example).
Methods. English-language clinical abstracts were used as the data source for automatic UMLS named entity recognition. They were retrieved with the free search engine PubMed and integrated into a single electronic corpus. Each of the 24,276 texts in the corpus was then labeled (assigned to one of the diagnoses in the differential list for axSpA) and used for mapping UMLS concepts. A total of 8260 UMLS concepts were recognized. Each term received an expert binary label of relative specificity.
Results. Rules for estimating the degree of specificity of concepts were developed by comparing 4 parameters: mean length of the hierarchical chain, total count of direct relationships, TF-IDF score, and count of hierarchical relationships with child UMLS concepts. These rules were integrated into an overall algorithm for estimating the degree of specificity of UMLS concepts. Its accuracy was 99.1% on the test sample for paired comparisons. However, its accuracy for exhaustive comparison of all extracted concepts was 74.2%, which is lower than needed to justify using the algorithm for automatic pruning of large term sets. Limitations of the developed algorithm have therefore been outlined.
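
Of the four parameters named above, the TF-IDF score is the easiest to illustrate. The sketch below is an assumption about one common TF-IDF variant (relative term frequency multiplied by the natural logarithm of inverse document frequency); the abstract does not state which variant the authors used. The function name `tf_idf` and the concept labels in the example are hypothetical placeholders, not real UMLS identifiers.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute TF-IDF per document.

    docs: list of documents, each a list of concept labels found in the text.
    Returns one dict per document mapping concept -> TF-IDF score.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))         # count each concept once per document
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            concept: (k / total) * math.log(n_docs / doc_freq[concept])
            for concept, k in counts.items()
        })
    return scores


# Hypothetical toy corpus: a concept present in every text scores 0 (generic),
# while a concept confined to one text scores higher (more specific).
corpus = [
    ["pain", "sacroiliitis", "pain"],
    ["pain", "psoriasis"],
    ["pain", "uveitis"],
]
result = tf_idf(corpus)
```

In this toy corpus the generic label that occurs in every text gets a score of zero, which is the behaviour that makes TF-IDF a plausible signal of specificity.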

Multilevel categorization of continuous variables in the tasks of explaining predictive estimates of machine learning models in clinical medicine

K. I. Shakhgeldyan, B. I. Geltser, N. S. Kuksin, I. G. Domzhalov

44-57 14

Abstract

Aim: To compare the quality of predictive models of in-hospital mortality (IHM) in patients with ST-segment elevation myocardial infarction (STEMI) after percutaneous coronary intervention (PCI), developed on the basis of predictors in continuous, dichotomous and multilevel categorical forms.
Materials and methods: This was a single-center retrospective study analyzing data from 4677 medical records of patients with STEMI after PCI who were treated at the Regional Vascular Center of Vladivostok. Two groups of patients were identified: the first consisted of 318 (6.8%) patients who died in hospital, the second of 4359 (93.2%) patients with a favorable treatment outcome. Predictive models of IHM with continuous variables were developed using multivariate logistic regression, random forest, and stochastic gradient boosting. Predictors were dichotomized using grid search for optimal cutoff points, centroid calculation, and Shapley additive explanations (SHAP). For multilevel categorization, it was proposed to combine the threshold values identified during dichotomization and to rank the cutoff thresholds using the weighting coefficients of multivariate logistic regression.
Results: A multistage analysis of indicators of the clinical and functional status of STEMI patients made it possible to identify and validate new predictors of IHM, categorize them, and develop prognostic models with continuous, dichotomous and multilevel categorical variables (AUC: 0.885-0.902). Models whose predictors were obtained by multilevel categorization were not inferior in accuracy to models with continuous variables and had higher quality metrics than models with dichotomous predictors. The advantage of models with multilevel categorization of predictors was the ability to explain and clinically interpret the results of IHM prediction.
Conclusions: Multilevel categorization of predictors is a promising tool for explaining predictive scores in clinical medicine.
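
The idea of finding a cutoff by grid search and then combining several cutoffs into a multilevel categorical predictor can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: the selection criterion (Youden's J), the assumption that higher values mean higher risk, the grid size, and the names `best_cutoff` and `categorize` are all hypothetical.

```python
import numpy as np


def best_cutoff(values, outcome, n_grid=50):
    """Grid search for the cutoff that maximizes Youden's J (an assumed criterion).

    Assumes higher predictor values indicate higher risk of the outcome.
    """
    values = np.asarray(values, dtype=float)
    outcome = np.asarray(outcome, dtype=bool)
    best_t, best_j = None, -np.inf
    for t in np.linspace(values.min(), values.max(), n_grid):
        pred = values >= t
        sensitivity = (pred & outcome).sum() / outcome.sum()
        specificity = (~pred & ~outcome).sum() / (~outcome).sum()
        j = sensitivity + specificity - 1.0
        if j > best_j:                    # keep the first cutoff reaching the maximum
            best_t, best_j = float(t), float(j)
    return best_t, best_j


def categorize(values, thresholds):
    """Map continuous values to ordinal levels 0..len(thresholds).

    A value equal to a threshold falls into the higher level.
    """
    return np.digitize(values, np.sort(np.asarray(thresholds, dtype=float)))


# Illustrative values only (not data from the study).
cutoff, j = best_cutoff([1, 2, 3, 4, 5, 6, 7, 8], [0, 0, 0, 0, 1, 1, 1, 1], n_grid=8)
levels = categorize([40, 50, 65, 70, 90], thresholds=[50, 70])
```

Thresholds found by different dichotomization methods could be passed together to `categorize` to obtain the levels of a multilevel predictor.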

Automated complex of multidisciplinary neural network support of medical decisions in the treatment of coronary heart disease

D. M. Zhuravlev, F. Yu. Kopylov, V. K. Chaadaev, S. V. Ardatov, K. V. Chaadaev

58-71 21

Abstract

The article covers methods and procedures for developing a neural network decision support system for choosing the tactics of surgical intervention on the coronary vessels. The system is designed to advise a wide range of practicing cardiologists and cardiac surgeons when deciding on the tactics of surgical intervention in patients with conditions associated with compromised coronary vessels. Based on a mathematical model that takes into account a number of factors and the outcomes of previously performed surgeries, the neural network system offers a choice between aortocoronary bypass surgery and percutaneous coronary intervention. The decision produced by the system can serve as an additional argument when a collegial decision is made in complex clinical cases. The right decision affects the patient’s recovery time after surgery, the quality of life after recovery, and the ability to continue working after treatment. The neural network decision support system for cardiac surgery is designed as a standard application for a personal computer with technical characteristics that allow a large amount of data to be processed. Any cardiologist or cardiac surgeon who is registered in the system and validated can obtain access to it. The developed complex is intended to provide healthcare institutions with a digital product and a domestic service based on a new technological structure.

Using cartographic scheme of territorial planning in health care

V. I. Klimko, I. A. Solomatnikov, O. V. Khodakova, O. V. Kungurtcev, F. I. Stradymov, E. V. Chernomyrdina

72-83 15

Abstract

This article discusses the experience of using data visualization tools in the preparation of territorial planning schemes for regional health systems. The article also covers trends in the development of visualization tools and the competencies necessary for decision makers to use these tools effectively in making managerial decisions.

Possible applications of digital technologies in the preventive examination of schoolchildren in Samara

G. Yu. Poretskova, A. A. Tyazheva, S. V. Plakhotnikova, S. S. Chaplygin, S. A. Napalkova, I. V. Barinov, N. I. Panova

84-92 26

Abstract

One of the key tasks of healthcare is to preserve the health of the child population, which is achieved through comprehensive health assessment. Timely identification of negative changes in children's health during periodic medical examinations is of great importance. This justifies the development of more accurate methods for extracting useful information from the data obtained during a comprehensive examination of children.
Aim: to improve the procedure for preventive examinations of school-age children using digital technologies and remote data transmission.
Methods: 128 children aged 14-17 years living in Samara were examined using digital technologies. The comparison group included 54 adolescents with a similar sex distribution and mean age.
Results: the use of information technology reduces the workload of medical personnel during medical examinations of schoolchildren. The analytical hardware and software complex identified physical development disorders in 47.8% of children, newly diagnosed bronchial conduction disorders in 3.8%, and arterial hypertension in 4.6%.
Conclusions: the use of digital technologies reduces the number of possible errors in recording anthropometric and physiometric data and increases the efficiency of medical decision-making during preventive examinations of children.

Medical Doctor and Information Technologies

ORIGINAL RESEARCH

PRACTICE EXPERIENCE
