PROJECT TITLE: Clinicopathologic and Genetic Profiling through Machine Learning and Natural Language Processing for Precision Lung Cancer Management
National Institutes of Health (NIH)
National Cancer Institute (NCI)
Saeed Hassanpour, PhD
OTHER PROJECT STAFF
Laura Tafe, MD, Associate Professor of Pathology, Department of Pathology and Laboratory Medicine at Dartmouth-Hitchcock; Gregory Tsongalis, PhD, Professor of Pathology and Director of Molecular Pathology and Clinical Genomics and Advanced Technologies at Dartmouth-Hitchcock, and Co-Director of the Translational Research Program and Pathology Shared Resource at Dartmouth’s Norris Cotton Cancer Center; Konstantin Dragnev, MD, Professor of Medicine and Associate Director for Clinical Research at Dartmouth’s Norris Cotton Cancer Center; and external collaborators from the University of Vermont Medical Center and Baylor College of Medicine
Lung cancer is the second-most common type of cancer and the leading cause of cancer death in both men and women. Non-small cell lung cancer (NSCLC) is the most common type, constituting 85% to 90% of all lung cancer cases. Current cancer research has shown that multiple somatic mutations affect patients’ sensitivity to the various drugs used for NSCLC treatment. These mutations are essential factors in determining the most effective, “personalized” treatment for each NSCLC patient; however, most NSCLC patients develop resistance to these targeted therapies in their first year of treatment, and many mechanisms of this resistance are still unknown. Designing and prescribing better targeted therapies for NSCLC patients requires a further understanding of the relationship between NSCLC tumors’ pathological and clinical findings, genetic profiles, and targeted therapy response or resistance. Currently, there is no computational method to connect observations and findings from pathology reports, medical records, somatic mutations, and targeted therapy resistance. This project provides a plan to build a novel computational method to identify statistically significant associations between the pathological findings of NSCLC tumors and the presence of clinically actionable somatic mutations. Furthermore, these associations, in combination with an innovative set of feature analyses of pathology reports and electronic medical records, will be leveraged to build and validate a machine-learning model to identify NSCLC patients with clinically actionable somatic mutations. Finally, the associated clinical, pathological, and genetic findings for NSCLC patients will be used in a new machine-learning framework to predict patients’ time-to-resistance to targeted therapies.
The data required to build and validate the proposed models will be obtained through a collaboration with the Department of Pathology’s Laboratory for Clinical Genomics and Advanced Technologies at Dartmouth-Hitchcock Medical Center. In addition to internal validation, the investigators have established a collaboration with the Department of Pathology at the University of Vermont Medical Center to apply and validate the developed models on an external data source. Upon successful implementation of this bioinformatics approach, the developed models will be able to reveal statistically significant links between clinical and pathological findings, clinically actionable somatic mutations, and targeted-therapy responses, leading to a better understanding of NSCLC tumor development and treatment. The proposed approach will provide an accurate, fast, and inexpensive preselection method for screening NSCLC patients with clinically actionable mutations for translational research and precision medicine. Furthermore, the proposed machine-learning method to identify NSCLC patients’ resistance to targeted therapies will help healthcare providers select the best treatment strategies for these patients, improve their health outcomes, and establish this precision medicine paradigm for other types of cancer.
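As a purely illustrative sketch of what identifying "statistically significant associations" between pathological findings and a somatic mutation could involve, the example below applies a two-sided Fisher's exact test to each finding and then a Benjamini-Hochberg correction for multiple testing. The project description does not specify which statistical tests will be used, so the choice of tests is an assumption; the field names (`finding_a`, `finding_b`, `mutation_x`) and the toy records are hypothetical placeholders, not project data or results.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def association_pvalues(records, findings, mutation):
    """Test each binary finding against a binary mutation flag."""
    pvals = {}
    for f in findings:
        a = sum(1 for r in records if r[f] and r[mutation])
        b = sum(1 for r in records if r[f] and not r[mutation])
        c = sum(1 for r in records if not r[f] and r[mutation])
        d = sum(1 for r in records if not r[f] and not r[mutation])
        pvals[f] = fisher_exact_two_sided(a, b, c, d)
    return pvals


# Hypothetical toy records (made up for illustration only).
records = [
    {"finding_a": 1, "finding_b": 0, "mutation_x": 1},
    {"finding_a": 1, "finding_b": 1, "mutation_x": 1},
    {"finding_a": 1, "finding_b": 0, "mutation_x": 1},
    {"finding_a": 0, "finding_b": 1, "mutation_x": 0},
    {"finding_a": 0, "finding_b": 0, "mutation_x": 0},
    {"finding_a": 0, "finding_b": 1, "mutation_x": 0},
    {"finding_a": 1, "finding_b": 0, "mutation_x": 0},
    {"finding_a": 0, "finding_b": 1, "mutation_x": 1},
]

findings = ["finding_a", "finding_b"]
raw = association_pvalues(records, findings, "mutation_x")
adjusted = benjamini_hochberg([raw[f] for f in findings])
for name, p_adj in zip(findings, adjusted):
    print(f"{name}: raw p = {raw[name]:.3f}, BH-adjusted p = {p_adj:.3f}")
```

In a real analysis the binary findings would first have to be extracted from free-text pathology reports, and the number of candidate findings would be far larger, which is why a multiple-testing correction is included in the sketch.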
PUBLIC HEALTH RELEVANCE
Resistance to targeted therapies severely limits the ability to treat non-small cell lung cancer (NSCLC) patients. This project proposes a novel computational approach to find statistically significant links between pathological and clinical findings, clinically actionable mutations, and targeted-therapy responses for NSCLC patients. The outcomes of this proposal can help healthcare providers identify the most effective strategy for NSCLC treatment, improve public health, and promote precision medicine.