Medical Doctor and Information Technologies

Special aspects of dataset creation for artificial intelligence services in neuroimaging: the case of a dataset creation with CT images of the brain with signs of hemorrhages

https://doi.org/10.25881/18110193_2023_4_42

Abstract

Aim. To demonstrate the special aspects of dataset creation for neuroimaging using the example of preparing a dataset with computed tomographic images of the brain with and without signs of intracranial hemorrhage.

Methods. The dataset was created following the dataset preparation regulations developed by the Scientific and Practical Clinical Center for Diagnostics and Telemedicine Technologies, which define 4 stages: planning (choosing keywords for the initial selection of studies, defining inclusion and exclusion criteria, and identifying the source of medical information); selection (initial download of text information, namely brief patient histories and description protocols for 2020 from the Unified Radiological Information Service of the city of Moscow, anonymization of the received data, and keyword analysis); labeling and verification (filling out the accompanying table with clinical and technical data, study selection by two radiologists, and expert verification by a neuroradiologist); and publication (online publication of the dataset and state registration).
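The keyword analysis in the selection stage can be illustrated with a minimal Python sketch. It is not the authors' implementation: the keyword lists, the `Report` structure, and the `preselect` function are hypothetical, and the abstract does not publish the actual terms or matching rules. The sketch only shows how inclusion and exclusion keywords might split anonymized report texts into candidate groups before radiologists review them.

```python
from dataclasses import dataclass

# Hypothetical keyword lists; the article does not publish the actual terms.
INCLUDE_TERMS = ("hemorrhage", "hematoma")
EXCLUDE_TERMS = ("no signs of hemorrhage", "no hemorrhage")


@dataclass
class Report:
    study_id: str  # assumed to be an already anonymized identifier
    text: str      # free-text description protocol


def _contains(text: str, terms: tuple[str, ...]) -> bool:
    """Case-insensitive substring match against any of the terms."""
    lowered = text.lower()
    return any(term in lowered for term in terms)


def preselect(reports: list[Report]) -> tuple[list[Report], list[Report]]:
    """Split reports into candidate-positive and candidate-negative groups."""
    positive, negative = [], []
    for report in reports:
        mentioned = _contains(report.text, INCLUDE_TERMS)
        negated = _contains(report.text, EXCLUDE_TERMS)
        if mentioned and not negated:
            positive.append(report)
        else:
            negative.append(report)
    return positive, negative
```

A substring match of this kind is only a first-pass filter. In the workflow described above, the final decision on each study rests with the two radiologists and the verifying neuroradiologist.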

Results. While the dataset was being created, several special aspects rooted in neuroradiology were identified and formulated. They should be taken into account during the initial training, testing, and additional training of artificial intelligence services for diagnosing brain diseases: the use of specific terminology, the use of images with the least noise and the highest contrast, and the use of subtype ratios for the target pathology that match their ratios in the population. A dataset of computed tomography images with and without signs of intracranial hemorrhage was prepared. The final version of the dataset includes anonymized studies of 209 patients (109 with the pathology, 100 without the pathology): DICOM images and an accompanying text table with clinical features (gender, age, type(s) and number of hemorrhages, presence or absence of concomitant pathology) and technical parameters (slice thickness and reconstruction slice thickness).
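The requirement that subtype ratios match the population can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not the procedure used by the authors. The function `subtype_quotas` and the shares in `PLACEHOLDER_SHARES` are hypothetical; the shares are placeholders, not epidemiological values. The sketch also assumes one subtype per study, whereas the dataset table allows several hemorrhage types per patient. Only the total of 109 pathological studies comes from the abstract.

```python
from math import floor


def subtype_quotas(total: int, shares: dict[str, float]) -> dict[str, int]:
    """Allocate `total` studies across subtypes in proportion to `shares`.

    Uses the largest-remainder method so the quotas sum exactly to `total`.
    """
    if abs(sum(shares.values()) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    exact = {name: total * share for name, share in shares.items()}
    quotas = {name: floor(value) for name, value in exact.items()}
    leftover = total - sum(quotas.values())
    # Give the remaining studies to the subtypes with the largest fractional parts.
    by_remainder = sorted(
        exact, key=lambda name: exact[name] - quotas[name], reverse=True
    )
    for name in by_remainder[:leftover]:
        quotas[name] += 1
    return quotas


# Placeholder shares for illustration only; NOT epidemiological values.
PLACEHOLDER_SHARES = {"subtype_a": 0.5, "subtype_b": 0.3, "subtype_c": 0.2}
quotas = subtype_quotas(109, PLACEHOLDER_SHARES)
```

Quotas computed this way could serve as targets when radiologists select studies, so that no single hemorrhage subtype dominates the pathological group beyond its expected share.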

 Conclusion. The special aspects of preparing datasets for training and testing neuroradiological artificial intelligence services were demonstrated.

About the Authors

E. I. Kremneva
«Scientific and Practical Clinical Center for Diagnostics and Telemedicine Technologies Department of Health of the City of Moscow»
Russian Federation

 PhD

 Moscow



A. K. Smorchkova
«Scientific and Practical Clinical Center for Diagnostics and Telemedicine Technologies Department of Health of the City of Moscow»
Russian Federation

  Moscow



A. N. Khoruzhaya
«Scientific and Practical Clinical Center for Diagnostics and Telemedicine Technologies Department of Health of the City of Moscow»
Russian Federation

Moscow



D. S. Semenov
«Scientific and Practical Clinical Center for Diagnostics and Telemedicine Technologies Department of Health of the City of Moscow»
Russian Federation

PhD

 Moscow



A. V. Maltsev
«Scientific and Practical Clinical Center for Diagnostics and Telemedicine Technologies Department of Health of the City of Moscow»
Russian Federation

 PhD

 Moscow



D. E. Sharova
«Scientific and Practical Clinical Center for Diagnostics and Telemedicine Technologies Department of Health of the City of Moscow»
Russian Federation

Moscow



V. V. Zinchenko
«Scientific and Practical Clinical Center for Diagnostics and Telemedicine Technologies Department of Health of the City of Moscow»
Russian Federation

 Moscow



A. V. Vladzymyrskyy
«Scientific and Practical Clinical Center for Diagnostics and Telemedicine Technologies Department of Health of the City of Moscow»
Russian Federation

 DSc

Moscow




Review

For citations:


Kremneva E.I., Smorchkova A.K., Khoruzhaya A.N., Semenov D.S., Maltsev A.V., Sharova D.E., Zinchenko V.V., Vladzymyrskyy A.V. Special aspects of dataset creation for artificial intelligence services in neuroimaging: the case of a dataset creation with CT images of the brain with signs of hemorrhages. Medical Doctor and Information Technologies. 2023;(4):42-53. (In Russ.) https://doi.org/10.25881/18110193_2023_4_42


Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1811-0193 (Print)
ISSN 2413-5208 (Online)