Comparison of machine learning methods for predicting cervical cancer risk based on genetic predisposition


DOI: https://dx.doi.org/10.18565/epidem.2024.14.1.77-82

Vinokurov M.A., Mironov K.O., Domonova E.A., Romanyuk T.N., Popova A.A., Akimkin V.G.

Central Research Institute of Epidemiology, Russian Federal Service for Supervision of Consumer Rights Protection and Human Well-Being, Moscow, Russia
The etiological agent of cervical cancer (CC) is human papillomavirus (HPV). However, not all HPV-infected women develop CC, which suggests a genetic predisposition.
Objective. To compare machine learning methods and select the best-performing one for predicting the development of cervical cancer in HPV-infected women from data on genetic predisposition.
Materials and methods. DNA samples from 127 women with CC and 120 women without intraepithelial lesions were studied. Three single nucleotide polymorphisms were analyzed: rs55986091 (HLA-DQB1), rs2516448 (MICA) and rs9271898 (HLA-DQA1). Five methods were used to predict cervical cancer: logistic regression, random forests, Gradient Boosting Machine (GBM), XGBoost and a neural network.
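As an illustration of the simplest of the listed methods, the sketch below fits a logistic regression to additively coded genotypes. It is a minimal example on synthetic data, not the authors' pipeline: the additive 0/1/2 coding, the HPV 16/18/45 flag, the uniform genotype frequencies, the effect sizes and all function names are assumptions made for the demonstration. Only the SNP identifiers and the cohort size of 247 are taken from the abstract.

```python
import math
import random

SNPS = ["rs55986091", "rs2516448", "rs9271898"]  # SNP identifiers from the abstract


def encode(genotype_counts, hpv_high_risk):
    """Assumed additive coding: 0, 1 or 2 minor alleles per SNP, plus an HPV 16/18/45 flag."""
    return [float(genotype_counts[s]) for s in SNPS] + [1.0 if hpv_high_risk else 0.0]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(weights, bias, x):
    """Predicted probability of the positive class for one feature vector."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def fit_logistic(X, y, lr=0.1, epochs=500):
    """Plain batch gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, target in zip(X, y):
            err = predict_proba(weights, bias, x) - target
            grad_b += err
            for j in range(d):
                grad_w[j] += err * x[j]
        bias -= lr * grad_b / n
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
    return weights, bias


def make_synthetic_cohort(n, rng):
    """Synthetic cohort; the effect sizes below are invented for this demo."""
    true_w, true_b = [-0.8, 0.6, 0.5, 1.5], -1.0
    X, y = [], []
    for _ in range(n):
        counts = {s: rng.choice([0, 1, 2]) for s in SNPS}
        x = encode(counts, rng.random() < 0.5)
        X.append(x)
        y.append(1 if rng.random() < predict_proba(true_w, true_b, x) else 0)
    return X, y


rng = random.Random(0)
X_train, y_train = make_synthetic_cohort(247, rng)  # 127 + 120, matching the cohort size only
weights, bias = fit_logistic(X_train, y_train)

# Risk estimate for one hypothetical woman with a given genotype and HPV status.
example = encode({"rs55986091": 0, "rs2516448": 2, "rs9271898": 1}, True)
risk = predict_proba(weights, bias, example)
print(f"estimated risk: {risk:.2f}")
```

The tree-based methods and the neural network would consume the same encoded feature vectors; only the model behind `predict_proba` would change.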
Results. Predictors associated with the risk of developing CC included the presence of HPV types 16, 18 or 45 and three polymorphic variants: rs55986091, rs2516448 and rs9271898. When the machine learning methods were compared, the neural network and XGBoost produced more accurate predictions than the other methods.
Conclusion. Data on genetic predisposition, combined with machine learning models, can be used to calculate the individual risk of cervical cancer, identify risk groups and adjust the interval between screenings.
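A comparison of prediction methods such as the one reported in the Results is usually made on common metrics computed from each model's predicted risks. The sketch below shows one way to do this; the abstract does not state which metrics the authors used, so accuracy, sensitivity, specificity and AUC are assumptions, and the labels and the scores of `model_a` and `model_b` are made-up toy values, not study results.

```python
def confusion(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded 1 = case, 0 = control."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def auc(y_true, scores):
    """Probability that a random case outscores a random control (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def summarize(y_true, scores, threshold=0.5):
    """Threshold the predicted risks and collect the comparison metrics."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp, tn, fp, fn = confusion(y_true, y_pred)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "auc": auc(y_true, scores),
    }


# Toy held-out labels and predicted risks from two hypothetical models.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
model_a = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
model_b = [0.9, 0.7, 0.6, 0.55, 0.45, 0.3, 0.2, 0.6]

for name, scores in [("model_a", model_a), ("model_b", model_b)]:
    print(name, summarize(y_true, scores))
```

With only a few hundred samples, as in this study, such metrics are best estimated on held-out data or by cross-validation rather than on the training set.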


About the Authors


Mikhail A. Vinokurov, Junior Researcher, Laboratory of Molecular Methods for Studying Genetic Polymorphisms, Central Research Institute of Epidemiology, Russian Federal Service for Supervision of Consumer Rights Protection and Human Well-Being, Moscow, Russia; vinokurov@cmd.su; https://orcid.org/0000-0002-4101-0702
Konstantin O. Mironov, MD, Head, Laboratory of Molecular Methods for Studying Genetic Polymorphisms, Central Research Institute of Epidemiology, Russian Federal Service for Supervision of Consumer Rights Protection and Human Well-Being, Moscow, Russia; mironov@cmd.su; https://orcid.org/0000-0003-4481-2249
Elvira A. Domonova, Cand. Biol. Sci., Head, Scientific Group for the Development of New Methods for Diagnosis of Opportunistic and Human Papillomavirus Infections, Russian Federal Service for Surveillance of Consumer Rights Protection and Human Well-Being, Moscow, Russia; elvira.domonova@pcr.ms; http://orcid.org/0000-0001-8262-3938
Tatyana N. Romanyuk, Central Research Institute of Epidemiology, Russian Federal Service for Supervision of Consumer Rights Protection and Human Well-Being, Moscow, Russia; tatiana.romaniuk@pcr.ms; https://orcid.org/0009-0006-1952-907X
Anna A. Popova, Cand. Med. Sci., Senior Researcher, Central Research Institute of Epidemiology, Russian Federal Service for Surveillance of Consumer Rights Protection and Human Well-Being, Moscow, Russia; asya-med@mail.ru; https://orcid.org/0000-0001-9484-5917
Professor Vasily G. Akimkin, Academician of the Russian Academy of Sciences, MD, Director, Central Research Institute of Epidemiology, Russian Federal Service for Supervision of Consumer Rights Protection and Human Well-Being, Moscow, Russia; vgakimkin@yandex.ru; https://orcid.org/0000-0003-4228-9044

