15 Month Programs
The academic curriculum for the QBS Masters of Science degree in Health Data Science has been designed to provide students with the core skills of data science, including Big Data wrangling, database programming, high performance computing, data visualization, exploratory statistics, statistical modeling, and machine learning. Students will also gain valuable training in communicating results through verbal, visual, and written reports.
The QBS Masters of Science degree in Health Data Science
Students must satisfactorily complete the following required core courses: an Applied Machine Learning course (QBS 108), two terms of biostatistics (QBS 120 & 121), one term of epidemiology (QBS 130), a course on Algorithms for Data Science (QBS 177), and courses in Data Visualization and Data Wrangling (QBS 180 & 181). All students are also required to enroll in a Capstone course (QBS 185) and have the option to pursue a summer internship after their first year.
QBS 108: Applied Machine Learning
Course Director: Saeed Hassanpour
This course provides a comprehensive introduction to machine learning methods and techniques. Various machine learning concepts and topics, including natural language processing and deep learning, will be described and discussed. The emphasis of this course will be on providing the background and working knowledge of machine learning methodology required to apply these techniques to new or existing data science problems. Through multiple projects and assignments, this course will give students experience in applying machine learning techniques to solve complex real-world problems, such as those in the biomedical domain.
QBS 120: Foundations of Biostatistics I: Statistical Theory for the Quantitative Biomedical Sciences
Course Director: Robert Frost
This is a graduate-level course in statistics designed to teach the fundamental knowledge required to read and, with further study, contribute to the statistical methodology literature. An in-depth overview of statistical estimation and hypothesis testing will be provided, including the method of least squares, maximum likelihood methods, asymptotic methods, and correction for multiple comparisons. The basic elements of statistical design and sample size calculations will be introduced. Resampling strategies will be discussed in the context of the bootstrap and cross validation, as well as simulation as a tool for statistical research. The emphasis will be on theory used in modern applications in biomedical sciences, including genomics, epidemiology, and clinical and health services research. The statistical package R will be used for computational examples, problem sets and exams. The course will meet for two 1.5-hour sessions per week.
QBS 121: Foundations of Biostatistics II: Regression
Course Directors: Tor Tosteson and Todd MacKenzie
This course covers generalized regression theory and applications as practiced in biostatistics and the quantitative biomedical sciences. The basics of linear model theory are presented and extended to generalized linear models for binary, counted, and categorical data; regression models for censored survival data; and multivariate regression and mixed fixed and random effects regression models for longitudinal and repeated measures data. Special topics include measurement error in regression, instrumental variables, causal inference, propensity scores and inverse propensity weighted estimation, and methods for missing data. Current statistical methodologies for model selection and classification are introduced in the context of applications in genomics and the biomedical sciences. The course features computational examples using the statistical package R, with references as necessary to other statistical packages. The course meets 3 hours per week. Most course meetings will consist of presentations and demonstrations of analytic methods using datasets from QBS projects and R or other statistical software. The final meeting will feature presentations of class projects, each consisting of the explanation and application of a novel regression methodology in a QBS case study.
QBS 130: Foundations of Epidemiology I: Theory and Methods
Course Director: Diane Gilbert-Diamond
This is the first of a two-course sequence in graduate-level epidemiology (Foundations of Epidemiology I and II). The two courses are designed to teach the underlying theory of epidemiologic study designs and analysis and to prepare students to conduct epidemiological research. Designs for investigations seeking to understand the cause of human disease, disease progression, treatment and screening methods include clinical trials, cohort studies, case-cohort, case-case, nested case-control and case-control designs. Concepts of incidence rates, attributable rate and relative rate, induction and latent periods of disease occurrence, confounding, effect modification, misclassification, and causal inference will be covered in depth.
QBS 149: Mathematics and Probability for Statistics and Data Mining
Course Director: TBD
Optional if student tests out.
This course will cover the fundamental concepts and methods in mathematics and probability necessary to study statistical theory. Topics will include univariate and multivariate probability distributions with emphasis on the normal distribution, conditional distributions, mathematical expectation, convergence in probability and distribution, and the central limit theorem. Relevant concepts and methods from univariate and multivariate calculus will be introduced as necessary, along with related topics in linear and matrix algebra. Computational methods for statistics, including nonlinear optimization and Monte Carlo simulation, will be introduced. Special attention will be given to students' active learning by programming in a statistical software package. The course will meet for 3 hours per week.
QBS 177: Algorithms for Data Science
Course Directors: Jiang Gui & Eugene Demidenko
This course provides an introduction to algorithms used in data science with applications to biomedical and health data science. The goal of this course is to present an overview of many of the approaches used for big data, focusing on analytical methods and algorithms. The course assumes that students have some knowledge of R. Students will be provided with two large datasets. Lectures on data reduction, classification, and optimization will be accompanied by homework that students complete using these datasets.
QBS 180: Data Visualization and Statistical Graphics
Course Directors: Ramesh Yapalparvi & Eugene Demidenko
This course will teach best practices for visualizing data, including exploratory statistics and effective communication of statistical analysis. Students will become competent in engaging diverse audiences in the process of analytic thinking and decision making. Topics include principles of graphic design, perceptual psychology, dashboards, dimensionality reduction, statistical smoothing and 3D graphics. Students will become competent users of Tableau, R graphics and R-Shiny.
QBS 181: Data Wrangling
Course Director: Ramesh Yapalparvi
This course is a survey of methods for extracting and processing data. It will cover data architectures (ontologies, metadata, pipelines and open source resources), database theory, data warehouses, the electronic medical record, various file formats including audio and video, data security, and cloud resources. Students will gain skills working with Big Data using software such as SQL, Apache Hadoop and Python.
QBS 185: Health Data Science Capstone Experience
Course Director: Todd MacKenzie
Only open to Masters students who have declared Health Data Science as their concentration.
The capstone consists of projects completed by the students in which they bring together all aspects of data science: 1) conception of a problem to solve; 2) extraction, merging and construction of an analyzable data set (data wrangling) using big data; 3) exploratory statistics using principles of data visualization; 4) statistical analysis and/or machine learning of big data; 5) communication of the results in written, verbal and visual form.
Course deliverables are a final paper and faculty presentation.
Students will have the following choices for the capstone experience:
- Traditional Group Project Format: Residence Required at Dartmouth
- Individual Project with Dartmouth PI: Residence Required at Dartmouth
- External Experience w/ coursework: Residence NOT required at Dartmouth. Internships with a company or another academic institution are allowed.**
- Requires weekly deliverables and project updates with Capstone Course Director
- Final project paper and faculty presentation apply
Students will work on their capstone full-time (3 units) during the summer term, whether in residence at Dartmouth or at an external company/institution. The capstone is offered only during the summer term. Tuition & Fees apply.
** Students can be paid or unpaid for external experiences. QBS Tuition & Fees still apply.
Satisfactory completion of up to 9 approved graduate-level elective courses is required of all students. Below is the list of QBS electives, along with electives from other departments that are available if space allows.
QBS 100: Molecular Basis of Human Health and Disease
Course Directors: Kristine Giffin and Michael Whitfield
This course is designed to solidify key cellular, molecular, and genetic concepts in the biology of human health and disease. Students in this course will develop a fundamental understanding of the molecular pathogenesis of and genetic predisposition to disease, and will become familiar with the modern tools and technologies used to study molecular processes and disease in model systems and human populations. Topics include the basics of cell structure and function, DNA structure and function, normal and pathologic cellular processes, genetic and epigenetic mechanisms, and examples of major disease outcomes such as cancer.
QBS 122/PH 271: Biostatistics III: Modeling Complex Data
Course Directors: Todd MacKenzie & James O'Malley
The first component of the course introduces Bayesian statistical methods, which are featured due to their suitability for solving challenging problems and their ubiquity across modern statistical and artificial intelligence applications. In an extension of QBS 120, Bayesian methodology is carefully developed and compared to the classical (frequentist) approach. A variety of applications to which the Bayesian approach is naturally suited are considered (e.g., non-inferiority testing, missing outcome imputation, two-part models and selected topics in structural equation modeling). Bayesian computation via Markov chain Monte Carlo (MCMC) is also developed and illustrated. The remainder of the course follows QBS 121 by extending regression and other methods for analyzing data when standard statistical assumptions fail. There are two main areas of focus: analysis of statistically dependent data and analysis of social network data. The dependent data section encompasses clustered, multi-level, longitudinal and other forms of structured data and will focus on hierarchical (mixed-effect) modeling approaches from both a frequentist and a Bayesian perspective. The network analysis section includes representation, visualization, and summarization of networks; models of networks; and models of peer effects and social influence processes. Graph partitioning methods will be included if time permits.
QBS 123: Biostatistics Consulting Lab
Course Directors: Tor Tosteson and Todd MacKenzie
The goal of this course is to have students gain experience contributing to the statistical aspects of health sciences research. Students will be mentored by Biostatistics faculty members while interacting with investigators from the Geisel School of Medicine and Dartmouth-Hitchcock Medical Center who seek support from the Synergy Biostatistics Consulting Core (BCC). Course requirements will include participation in the bi-weekly BCC walk-in consulting clinics, shadowing BCC staff and faculty in other statistical collaborative meetings, and preparing statistical analyses, sample size calculations, reports, and analytic tables and figures. Student performance will be evaluated by review of student summaries of their consulting activities and by feedback surveys from BCC collaborators, faculty, and staff.
QBS 131: Foundations of Epidemiology II: Theory and Methods
Course Director: Megan Romano
Epidemiology is the science of studying and understanding the patterns of disease occurrence in human populations with the ultimate goal of preventing human disease. This graduate-level course is the second in a two-part sequence. Building on concepts covered in Foundations of Epidemiology I, it aims to develop an in-depth understanding of population characteristics and disease frequencies, epidemiological study designs, measures of excess risk associated with specific exposures, and inferring causality in exposure-disease relationships.
QBS 132: Molecular Biologic Markers in Human Health Studies
Course Director: Angeline Andrew
This course covers the use of human tissue samples in the context of translational research, including observational epidemiology studies and clinical trials. Lectures focus on study design, bio-specimen collection, biomarker types, kinetics and validation. Discussion will focus on examples of biomarker utilization, including identifying susceptible populations, exposure assessment, molecular-genetic characterization of disease phenotype, evaluating drug compliance, monitoring dose response, and testing molecularly targeted therapy. The computer laboratory-based component of this course provides students with “hands on” experience with modern analytic approaches to data generated from state-of-the-art molecular studies of human tissues, including many of the “omics” technologies (e.g. DNA methylation array data), and integrated analysis. Students will apply techniques for identifying and evaluating clusters and interactions. The course includes application of study design principles, statistical modeling, and bioinformatic approaches.
QBS 136: Applied Epidemiological Methods I
Course Director: Anne Hoen
Computer laboratory-based course designed to provide hands-on experience performing epidemiological data analyses relevant to the theoretical/conceptual material presented in Foundations of Epidemiology I. Students will complete laboratory exercises using epidemiological study data sets that guide them through descriptive data analyses, hypothesis testing within the context of a range of epidemiological study designs, causal inference methods, addressing confounding and effect modification, and power and sample size calculations. Analyses will be performed in the open-access programming language R. Course will meet once per week for 90 minutes. Note that this is a half-credit course designed to be taken at the same time as Foundations of Epidemiology I.
QBS 137: Applied Epidemiological Methods II
Course Director: Anne Hoen
Computer laboratory-based course designed to provide hands-on experience performing epidemiological data analyses relevant to the theoretical/conceptual material presented in Foundations of Epidemiology II. Students will complete laboratory exercises using epidemiological study data sets that guide them through descriptive data analyses, hypothesis testing within the context of a range of epidemiological study designs, causal inference methods, addressing confounding and effect modification, and power and sample size calculations. Analyses will be performed in the open-access programming language R. Course will meet once per week for 90 minutes. Note that this is a half-credit course designed to be taken at the same time as Foundations of Epidemiology II.
QBS 146: Foundations of Bioinformatics I
Course Directors: Chao Cheng & Michael Whitfield
The sequencing of the complete genomes of many organisms is transforming biology into an information science. This means the modern biologist must possess both molecular and computational skills to adequately mine this data for gaining biological insights and creating new hypotheses. Taught mainly from the primary literature, topics will include genome sequencing and annotation, genome variation, gene mapping, genetic association studies, gene expression, functional genomics, proteomics, single-cell genomics, and systems biology. The course will meet for 3 hours per week.
QBS 147: Genomics: From Data to Analysis
Course Director: Olga Zhaxybayeva
Massive amounts of genomic data pervade 21st century life science. Physicians now assess the risk and susceptibility of their patients to disease by sequencing the patient's genome. Scientists design possible vaccines and treatments based on the genomic sequences of viruses and bacterial pathogens. Better-yielding crop plants are assessed by sequencing their transcriptomes. Moreover, we can more fully explore the roots of humanity by comparing our genomes to those of our close ancestors (e.g., Neanderthals, Denisovans). In this course, students will address real-world problems using the tools of modern genomic analyses. Each week students will address a problem using different types of genomic data, and use the latest analytical technologies to develop answers. Topics will include pairwise genome comparisons, evolutionary patterns, gene expression profiles, genome-wide associations for disease discovery, non-coding RNAs, natural selection at the molecular level, and metagenomic analyses.
QBS 175: Foundations of Bioinformatics II
Course Director: TBD
Computation is vital for modern molecular biology, helping scientists to model, predict the behaviors of, and control the molecular machinery of the cell. This course will study algorithmic challenges in analyzing biomolecular sequences (what genes encode an organism, and how are genes related across organisms?), structures (what do the proteins constructed for these genes look like, and what does that tell us about their mechanisms?), and functions (what do these things do, and how do they interact with each other in doing it?). The course is application-driven, but focused on the underlying algorithms and information processing techniques, employing approaches from search, optimization, pattern recognition, and so forth. The course will meet for 3 hours per week.
QBS 176: Methods in Statistical Genetics and Genomics
Course Directors: Ivan Gorlov & Jinyoung Byun
This course will provide an introduction to statistical methods for the study of both simple and complex genetic traits. The emphasis of this course is on training in methods of statistical genetics, especially genetic epidemiology designed to identify genetic factors associated with human diseases. This course covers the key statistical and epidemiologic concepts and methods necessary for understanding genetic architecture of common human diseases.
QBS 194/270 (Winter): Biostatistics Journal Club
Course Director: Jiang Gui
This is a journal club course that discusses new findings and applications in biostatistics and data science. The goal of the course is to develop critical thinking in biostatistical methodology. Starting the second week of the term, students will present two related papers with an emphasis on biostatistical methods, and the rest of the class will submit a short written summary (1-2 pages) that covers the papers' motivation, approach, results, strengths and weaknesses. During class, the presenting student will give a 50-minute presentation on their papers, followed by a 40-minute class discussion. In addition to reading and summarizing their selected paper for the week, all students are expected to review the two presented papers prior to class in order to participate in the discussion.
QBS 195: Independent Study
Course Directors: Arranged
Independent study in QBS is structured to allow students to explore subject matter and enhance their knowledge in QBS-related fields. This independent study for QBS students will count as an elective credit and is offered during each academic term. The arrangement and a course outline are to be developed between the student and a QBS faculty member prior to the start of the term and approved by QBS administration. The student and faculty member will work together to structure the study program and set goals that are to be met by the end of the term. The course of study may include, but is not limited to, literature review, seminar attendance, online course material, small projects, and presentations related to the specific field being studied. This can also substitute for a journal club credit after the first year.
QBS 270 (Fall): Epidemiology Journal Club
Course Director: Jennifer Emond
In this applied course, students will learn how to critically evaluate epidemiological research within public health and the biomedical sciences. Each week we will review a series of peer-reviewed journal articles (approximately 4-6 articles each week) related to one theme. Themes in previous years have included evaluating the health consequences of combustible cigarettes and electronic (“e”) cigarettes, the health benefits and risks of hormone replacement therapy, and the health consequences of sugar-sweetened beverage intake. Articles central to each week’s theme will be selected by the instructor and supplemented with student selected articles. Students are expected to read and critically review each set of articles before class, prepare thought questions based on the readings, and participate in class discussions as we evaluate the body of evidence across studies. For each weekly theme, one set of students will present a summary of the week’s readings to the class. Students are also required to submit a brief summary of the week’s theme after the class discussion.
- To critically evaluate epidemiological research studies within public health and biomedical research.
- To effectively summarize the findings from such studies orally and in writing.
- To critically compare different epidemiological research designs that address similar research questions.
- To identify classical epidemiological research studies within public health and biomedical research.
Students will be evaluated on class attendance, completion of pre-class assignments, participation in class discussions, quality and comprehension of presentations, and completion and quality of weekly summaries.
QBS 270 (Spring): Bioinformatics Journal Club
Course Directors: TBD
The critical analysis and communication of experimental research in an oral format is an essential element of scientific training. Students in the QBS journal club will take turns selecting and presenting recently published journal papers related to their research interests. The presentation should include a brief discussion of the significance of the paper as well as a description of the methods used. While the presenter should be prepared to lead the discussion, members of the journal club are expected to come with questions about the paper. These questions can focus on methods, discussion, and interpretation of the results and their implications. This course will meet for a 1.5-hour discussion every week.
QBS 271: Epidemiology Graduate Seminar II: Current topics in Epidemiology
Course Director: TBD
Student-led graduate level seminar. Students will identify and present two influential epidemiological or biomedical research studies that used different epidemiologic study designs to address a research question. Students will be encouraged to discuss and critically analyze the motivation for the studies, the research design, key findings, study limitations and study implications, and present aims for a future study which will address gaps in the research or be a clear extension of the research to date.
CS 169: Applications of Data Science
Course Director: Andrew Campbell
In this seminar (it's not a course) you will hear from leading researchers at Dartmouth extracting new insights from data to advance their respective fields. You will also read and present research papers on key areas in data science. This seminar will also include a number of programming assignments that seek to reinforce concepts and computational methods widely used in data science. The programming assignments will use the pydata stack: the Python open data science stack. The seminar will also include a group project.
CS 174: Machine Learning and Statistical Data Analysis
Course Director: Lorenzo Torresani
This course provides an introduction to statistical modeling and machine learning. Topics include learning theory, supervised and unsupervised machine learning, statistical inference and prediction. A wide variety of algorithms will be presented, including K-nearest neighbors, naive Bayes, decision trees, support vector machines, logistic regression, K-means, mixtures of Gaussians, principal components analysis, and Expectation Maximization. The course will also discuss modern applications of machine learning such as image segmentation and categorization, speech recognition, and text processing.
CS 189: Health Informatics
Course Director: Inas Khayal
Our health is everywhere. It is affected by how, where, and with whom we live, work & play (i.e., biological, behavioral, social and environmental factors). The explosion of digitization of data, captured both outside 'in the wild' and within the healthcare delivery system, allows us to understand and address the many factors affecting the complexity of our health. Today, health & healthcare data is continuously being generated. Deriving information and knowledge to improve and maintain health requires health informatics. Computer science plays an active role, both as a profession and through its research efforts, in informing and developing all aspects of health informatics: data capture, data storage and data analytics. The goal of this course is twofold: first, to learn about the latest topics in health informatics and second, to design & develop a health informatics project.
PH 147: Advanced Methods in Health Services Research
Course Director: Tracy Onega
This course will develop student analytic competencies to the level necessary to conceptualize, plan, carry out, and effectively communicate small research projects in patient care, epidemiology, or health services. Lectures, demonstrations, and labs will be used to integrate and extend methods introduced in other TDI courses. The course will also cover new methods in epidemiology and health services. The students will use research datasets from the Medical Care Epidemiology Unit at TDI, including Medicare data, in classroom lab exercises and course assignments. Course topics focus on key aspects of observational research, including risk adjustment, multilevel analyses, instrumental variables, and small area analysis. Practical skill areas will include programming in STATA, studying datasets for completeness and quality, designing tables and figures, and data management techniques. Emphasis is on becoming independent in analytic workflow. The instructors will tutor students as they develop their own analytic projects.
PH 151: Environmental Health Science and Policy
Course Directors: Carolyn Murray & Robert McLellan
This course engages students in the exploration of major environmental and occupational health issues through application of the basic tools of environmental science including epidemiologic methods, toxicology and risk assessment. Participants will examine the relationship between environmental and occupational exposures and human disease with emphasis on the interface of science and policy, the role of regulatory agencies and environmental risk communication. Topics include air and water quality, hazardous waste, radiation, heavy metals, food safety, environmental pathogens, and clinical occupational medicine. Faculty use a variety of teaching tools including lectures, audiovisual media, case studies, guest experts, and assigned readings/exercises. As a culminating project, students will author an environmental policy white paper based on a synthesis of scientific evidence.
PH 154: Social and Behavioral Determinants of Health
Course Director: Samir Soneji
This course describes the evolution of the predominant illness patterns that dominate contemporary populations. It delves into explanations for individual and population health that focus primarily on social and behavioral determinants for health promotion and disease prevention. Finally, it examines local and global responses to burgeoning factors that will significantly impact population health in the coming decades.
PEMM 132: Clinical Management of Cancer
Course Directors: Todd Miller & Manabu Kurokawa
This course will expose non-clinical researchers to the clinical realities of managing cancer through classroom lectures, tumor board case review sessions, and observation of everyday oncology clinic experiences. Students will gain insight into the issues associated with the clinical management of diverse cancer subtypes; an understanding of the complexities involved in treating people/patients, not just the cancer, including consideration and management of the side effects of therapy; and exposure to translational and clinical approaches to cancer research. The format will be a one-hour lecture each week by a practicing clinician, attendance at 5 one-hour tumor board sessions, and 5 half-days of observation in oncology clinics.
ENGM 182: Data Analytics
Course Director: Geoffrey Parker
This course provides a hands-on introduction to the concepts, methods and processes of business analytics. Students learn how to obtain and draw business inferences from data by asking the right questions and using the appropriate tools. Topics include data preparation, statistical tools, data mining, visualization, and the overall process of using analytics to solve business problems. Students work with real-world business data and analytics software. Where possible, cases are used to motivate the topic being covered. Students acquire a working knowledge of the “R” language and environment for statistical computing and graphics. Prior experience with “R” is not necessary, but students should have a basic familiarity with statistics, probability, and be comfortable with basic data manipulation in Excel spreadsheets.
MATH 116: Topics in Applied Mathematics: Fundamentals in Numerical Analysis
Course Director: Anne Gelb
Many mathematical models arising in various applications cannot be solved analytically. This course teaches fundamentals of numerical analysis, including a brief overview of numerical linear algebra, root finding methods, interpolation and approximation, and methods for solving ordinary differential equations. The course will focus on how numerical algorithms are constructed and analyzed in terms of their accuracy, efficiency, and stability. Students will use MATLAB to demonstrate the validity and/or failure of various approaches in different situations.
MATH 126: Current problems in Applied Mathematics
Course Director: Feng Fu
Partial differential equations (PDEs) are essential for the modelling of physical phenomena appearing in a variety of fields from geophysics and fluid dynamics to geometry. In this course, we will study three major topics one should understand when modelling with PDEs. The topics are: (i) the theory (e.g. existence and uniqueness of solutions); (ii) when and how solutions can be found analytically; and (iii) classic numerical techniques (e.g. finite difference and finite element methods) and how to determine if the method is stable and convergent. In addition, we will discuss the limitations of existing solution techniques in the context of open research questions.
Requirements for the Masters of Science degree in Quantitative Biomedical Sciences with a Concentration in Health Data Science
Health Data Science students have access to interdisciplinary courses positioning individuals to have competitive advantages for careers in Big Data, healthcare and biomedicine that translate to academia and industry. Students complete 9 required courses, including a capstone that brings together data wrangling, exploratory data analysis, programming, statistical learning, epidemiology, data visualization and communication. In addition, up to 9 elective courses are required during the 5 quarters in residence. Students are encouraged to pursue an internship during the summer, which extends their last session from the fall quarter to the winter quarter.
- Satisfactory completion of the following courses:
  - Mathematics & Probability for Statistics & Data Mining (may test out)
  - Foundations of Biostatistics I
  - Foundations of Biostatistics II
  - Foundations of Epidemiology I
  - Algorithms for Data Science
- Satisfactory completion of up to 9 approved graduate-level elective courses
- Completion of the mandatory first-year ethics course required of all first-year graduate students