DISCOVERY Supercomputer Enables Big Data Analytics at Dartmouth

Hanover, NH— Dartmouth’s Institute for Quantitative Biomedical Sciences (iQBS), a program established to advance and support interdisciplinary education and research at Dartmouth College, has reached a major milestone in infrastructure development. A departmental resource, the DISCOVERY (Dartmouth Initiative for SuperCOmputing Ventures in Education and Research) supercomputer cluster, has exceeded 2200 cores, enabling researchers across disciplines to perform large-scale analyses of complex data sets.

DISCOVERY is a Linux cluster composed of quad-core Intel and Opteron nodes and octa- and dodeca-core AMD nodes. It has more than 8 terabytes (TB) of memory and 200 TB of storage space, and it hosts third-party applications. Theoretical performance measurements indicate that DISCOVERY can perform 19 billion floating-point operations per second (or 19 gigaFLOPS), a measure of computing power. DISCOVERY is available to researchers across the Dartmouth College and Dartmouth-Hitchcock Medical Center campuses.

Jason H. Moore, Ph.D., Third Century Professor, professor of genetics, and director of the iQBS, says, “DISCOVERY provides inexpensive, low-barrier access to high-performance computing resources to the entire Dartmouth community, including students, staff, and faculty.” Moore, who conceived of the DISCOVERY supercomputer cluster, adds, “DISCOVERY is an important part of the big data analytics ecosystem provided by iQBS.” Moore funded DISCOVERY with financial support from iQBS, the Neukom Institute for Computational Sciences at Dartmouth College, individual faculty contributions, and federal funding, including a Center of Biomedical Research Excellence (COBRE) grant (GM103534) and an IDeA Network of Biomedical Research Excellence (INBRE) grant (GM103506), both from the National Institute of General Medical Sciences (NIGMS) at the National Institutes of Health (NIH).

DISCOVERY has enabled many important studies in the biomedical sciences. Richard Cowper-Sal·lari, a graduate student in the Molecular and Cellular Biology (MCB) Program and a student of Dr. Moore, recently uncovered genetic associations with breast cancer. In a notable study published in the November 2012 issue of Nature Genetics, Cowper-Sal·lari performed genome-wide association studies that identified single-nucleotide polymorphisms (SNPs), or DNA sequence variations, associated with breast cancer. The study demonstrated that SNPs associated with increased risk for breast cancer were found throughout the genome in DNA binding sites for proteins called FOXA1 and ESR1 [1]. These results start to explain how SNPs located in non-coding DNA regions can affect disease outcomes.

In another influential paper, published in the May 2012 issue of Science, Cowper-Sal·lari used similar methods and the DISCOVERY supercomputer to identify thousands of non-coding DNA elements, termed variant enhancer loci (VELs), that increase risk for colon cancer. The VEL signature that Cowper-Sal·lari and colleagues discovered was predictive of colon cancer-associated gene expression in cell lines [2]. Like the Nature Genetics study, this work highlighted the significant role of non-coding DNA in determining disease risk.

DISCOVERY has been used across the Dartmouth campus. James Wright Professor Michael Casey, jointly appointed in the Department of Music and the Department of Computer Science, is developing new methods to decode neural patterns [3]. His project, “Mind-to-Music,” uses machine learning and large-scale data analysis techniques to make music directly from functional magnetic resonance imaging (fMRI) scans of subjects imagining sound. “Our goal is to create an original personalized audio-visual experience, neural cinema, that is based on memories and the creative imagination at play,” says Casey. He has also used DISCOVERY to analyze catalogs of music and build search engines that recommend music based on selected tracks. Results from Casey’s Mind-to-Music project will be published as scientific manuscripts and works of art.

DISCOVERY is one component of ongoing big data infrastructure development by Dartmouth and the iQBS. Moore established iQBS with the goal of enhancing quantitative research and its integration and synergy with more traditional experimental and observational approaches. To support these goals, Moore, through the Northeast Cyber-Infrastructure Consortium, helped establish a high-speed fiber network across areas of Delaware, Maine, New Hampshire, Rhode Island, and Vermont that previously lacked sufficient high-speed Internet connectivity. This network links researchers and institutions across northern New England to high-performance computing resources such as the New England High-Performance Computing Grid (NE-Grid), which Dr. Moore established with funding from the COBRE and INBRE grants.

The Geisel School of Medicine has been home to many firsts in medical education, research, and practice and is poised to continue innovating through the establishment of the iQBS, led by Jason Moore; the iQBS Center for Integrative Biomedical Sciences, with Founding Director Scott Williams; the Center for Genomic Medicine, with Founding Director Christopher Amos; and the Collaboratory for Healthcare and Biomedical Informatics, with Founding Director Amar Das. These initiatives and the infrastructure they support establish Dartmouth as a leader in academic supercomputing and big data analytics.

1. Richard Cowper-Sal·lari, Xiaoyang Zhang, Jason B. Wright, Swneke D. Bailey, Michael D. Cole, Jerome Eeckhoute, Jason H. Moore, and Mathieu Lupien. Breast cancer risk-associated SNPs modulate the affinity of chromatin for FOXA1 and alter gene expression. Nat Genet. 2012 Nov;44(11):1191-1198.
2. Batool Akhtar-Zaidi, Richard Cowper-Sal·lari, Olivia Corradin, Alina Saiakhova, Cynthia F. Bartels, Dheepa Balasubramanian, Lois Myeroff, James Lutterbaugh, Awad Jarrar, Matthew F. Kalady, Joseph Willis, Jason H. Moore, Paul J. Tesar, Thomas Laframboise, Sanford Markowitz, Mathieu Lupien, and Peter C. Scacheri. Epigenomic enhancer profiling defines a signature of colon cancer. Science. 2012 May 11;336(6082):736-739.
3. J. Thompson and M. Casey. Music information retrieval from neurological signals: Towards neural population codes for music. Society for Music Perception and Cognition, Toronto, Canada, 2013.

# # #

The Audrey and Theodor Geisel School of Medicine at Dartmouth is founded on the mission to improve the lives of the communities it serves by promoting innovation in education, research, and healthcare.

The iQBS, established in July 2010 to promote interdisciplinary education, research, and infrastructure in the quantitative biomedical sciences, is supported by funds from Dartmouth College, the Geisel School of Medicine, and a Center of Biomedical Research Excellence (COBRE) grant from the Institutional Development Award (IDeA) program of the National Institute of General Medical Sciences (grant P20 GM103534). The iQBS prepares students for successful careers in biomedical research and teaching by offering an interdisciplinary Ph.D. degree.