Algorithm for UMLS metathesaurus concepts specificity estimation using example of analysis of the semantic model describing axial spondyloarthritis differential diagnostics
https://doi.org/10.25881/18110193_2023_3_30
Abstract
Background. Early diagnosis of axial spondyloarthritis (axSpA) is a difficult task that calls for clinical decision support (CDS). A large volume of unstructured data is currently available for developing CDS systems. Semantic analysis of such data is a complex problem that requires unified tools for named entity recognition. The largest data source for annotating biomedical text is the Unified Medical Language System (UMLS) Metathesaurus, which includes more than 11 million atomic terms describing 4.6 million concepts. The main problem with using UMLS for medical text analysis is the presence of numerous nonspecific (generic) terms that have no clinical value. Using them significantly worsens search results. Tools for automatically estimating the degree of specificity therefore need to be developed.
Aim. To develop an algorithm for estimating the degree of specificity of UMLS Metathesaurus concepts, using axial spondyloarthritis as an example.
Methods. Abstracts of English-language clinical articles served as the data source for automatic recognition of UMLS named entities. They were retrieved with the free PubMed search engine and combined into a single electronic corpus. Each of the 24,276 texts in the corpus was then labeled (assigned to one of the diagnoses in the axSpA differential list) and used for mapping UMLS concepts. A total of 8,260 UMLS concepts were recognized. Each term received a binary expert label of relative specificity.
Results. Rules for estimating the degree of concept specificity were developed by comparing four parameters: the mean length of the hierarchical chain, the total count of direct relationships, the TF-IDF score, and the count of hierarchical relationships with child UMLS concepts. These rules were integrated into an overall algorithm for estimating the degree of specificity of UMLS concepts. Its accuracy was 99.1% on the test sample for paired comparisons. Its accuracy for the overall comparison of all extracted concepts, however, was 74.2%, which is lower than needed to justify using the algorithm for automatically pruning large sets of terms. Some limitations of the developed algorithm have therefore been outlined.
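One of the four parameters, the TF-IDF score, can be illustrated with a minimal sketch. The abstract does not say how the score was computed, so the following assumes the textbook definition (relative term frequency times the natural logarithm of inverse document frequency) applied to concepts grouped by diagnosis label. The function name, the diagnosis labels, and the concept identifiers are hypothetical placeholders, not real UMLS CUIs and not the authors' implementation.

```python
import math
from collections import Counter


def concept_tf_idf(groups):
    """Compute TF-IDF for concept IDs grouped by diagnosis label.

    groups: dict mapping a diagnosis label to the list of concept IDs
    recognized in the texts carrying that label (hypothetical layout).
    Returns {label: {concept_id: score}}.
    """
    n_groups = len(groups)

    # Document frequency: in how many diagnosis groups each concept occurs.
    doc_freq = Counter()
    for concepts in groups.values():
        doc_freq.update(set(concepts))

    scores = {}
    for label, concepts in groups.items():
        counts = Counter(concepts)
        total = len(concepts)
        scores[label] = {
            # Relative frequency in the group times log inverse document frequency.
            concept: (n / total) * math.log(n_groups / doc_freq[concept])
            for concept, n in counts.items()
        }
    return scores


# Placeholder data: "C_PAIN" stands for a generic concept found everywhere,
# the other IDs for concepts tied to a single diagnosis.
example_groups = {
    "axSpA": ["C_SACROILIITIS", "C_SACROILIITIS", "C_PAIN"],
    "gout": ["C_TOPHUS", "C_PAIN"],
    "rheumatoid_arthritis": ["C_SYNOVITIS", "C_PAIN"],
}
example_scores = concept_tf_idf(example_groups)
```

Under this definition a concept that appears in every diagnosis group gets a score of zero, while a concept concentrated in one group scores higher, which is one reason such a score may help separate generic terms from specific ones.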
About the Authors
P. A. Astanin
Russian Federation
Moscow
L. V. Ronzhin
Russian Federation
Moscow
S. E. Rauzina
Russian Federation
PhD, Associate Professor
Moscow
For citations:
Astanin P.A., Ronzhin L.V., Rauzina S.E. Algorithm for UMLS metathesaurus concepts specificity estimation using example of analysis of the semantic model describing axial spondyloarthritis differential diagnostics. Medical Doctor and Information Technologies. 2023;(3):30-43. (In Russ.) https://doi.org/10.25881/18110193_2023_3_30