Pulmonomics: Genomics and Proteomics of Lung Disease

Basic Science Research

Mission Statement:

Directed by Avrum Spira, MD, MSc, the Bioinformatics Program at the Boston University Pulmonary Center consists of pulmonologists and bioinformaticians involved in translational research into lung biology and disease. The long-term goals of this program are to apply and stimulate the development of post-genomic technologies and computational tools for translational research into human disease, and to train physician-scientists and graduate students who can apply these tools in a clinical setting.


With the complete sequences of the human and other genomes recently elucidated, we have witnessed an explosion of information and high-throughput tools that are profoundly altering biomedical research and the culture of science. The reference genomes, combined with advances in the biotechnology sector, have produced exponential growth in the amount and types of data available for research. These developments are shifting the paradigm of biological research from traditional studies of single genes or pathways to large-scale studies that combine data mining of high-throughput datasets (e.g., microarray and next-generation sequencing experiments) for hypothesis generation with experimental work for validation. From this shift has emerged the discipline of bioinformatics, whose goal is to apply techniques from computer science, such as data manipulation and pattern discovery, to solve problems in molecular biology and ultimately to enable translational research into human disease.

The Bioinformatics Program strives to apply and develop computational tools that can be used to mine data from high-throughput translational research studies ongoing within the Pulmonary Center (see below for specific project details). This program combines expertise in designing and running genome-wide studies of gene and miRNA expression on clinical specimens with high-throughput data storage and analysis capabilities (Figure 1). The scientists affiliated with this Program combine the talents of molecular biologists with those of mathematicians, statisticians, epidemiologists, and computer scientists. In addition, our Program serves to train Bioinformatics graduate students and clinician-scientists in the application of computational tools to clinical studies. In conjunction with Dr. Charles DeLisi at the College of Engineering at BU, we have developed a Master's Program in Clinical Bioinformatics (MD track) whose goal is to train physicians who will be leaders in applying and stimulating the development of post-genomic technologies in clinical research and the practice of medicine. Our program is also closely affiliated with the Section of Computational Biomedicine and with the Translational Bioinformatics Core of the Clinical and Translational Science Institute (CTSI), a bioinformatics resource for the Boston University research community.

Figure 1: High-throughput platforms currently employed for the analysis of clinical specimens. Microarrays have been a remarkable resource for genome-wide studies of mRNA and miRNA expression, as well as of features such as SNPs and methylation. More recently, next-generation sequencing has emerged as a valuable tool for the discovery of novel large and small RNAs.

Selected Ongoing Research Projects:

The Airway Transcriptome—Developing Biomarkers for Lung Cancer and COPD:

As one of the central projects within our program, this study aims to profile gene expression changes occurring in intrathoracic (bronchial) and extrathoracic (buccal mucosa and nasal) airway epithelial cells in the setting of tobacco exposure, and to develop molecular biomarkers that can identify smokers who have, or are at risk of developing, lung cancer and COPD. This study has defined the genome-wide impact of smoking and smoking cessation on bronchial airway and nasal epithelium (Spira et al. PNAS, 2004; Beane et al. Genome Biology, 2007; Zhang et al. Physiol Genomics, 2010). Additionally, we have identified an airway signature that can serve as a highly sensitive and specific diagnostic in smokers with clinical suspicion of lung cancer (Spira et al. Nature Medicine, 2007; Beane et al. Cancer Prevention Research, 2008). A large multicenter clinical trial is underway to validate this early diagnostic biomarker for lung cancer. Recently, we have applied next-generation sequencing technology to sequence the airway transcriptome (RNA-seq), identifying novel transcript alterations associated with smoking and lung cancer that were not identified in our previous work using microarray technologies (Beane et al. Cancer Prev Res, 2011). In collaboration with Drs. Andrea Bild (University of Utah) and Stephen Lam (University of British Columbia), this project has also explored how the molecular “field of injury” in airway epithelium reflects the perturbation of oncogenic pathways within an individual and how these pathway alterations can be reversed with chemoprevention (Gustafson et al. Science Translational Medicine, 2010). Such studies open the door to personalized genomic approaches to chemoprophylaxis and therapy.
Additionally, we are extending this “field of injury” concept to the airway of smokers with COPD in order to develop novel approaches to the molecular classification of this heterogeneous disease and to identify intermediate biomarkers of therapeutic efficacy.

Mechanisms of regulation of airway epithelial gene expression:

Our program has explored genetic, epigenetic and post-transcriptional regulators of gene expression changes within airway epithelial cells. MicroRNAs are key post-transcriptional regulators of gene expression, and we have identified a subset of miRNAs that play a role in regulating the gene-expression response of the airway epithelium to tobacco smoke exposure (Schembri et al. PNAS, 2009). In collaboration with Dr. Douglas Bell at the NIEHS, we are linking gene expression profiles to SNPs in the promoter regions of these genes via computational modeling and a novel high-throughput SNP platform. We have also begun to link methylation of promoter regions to alterations in gene expression via whole-genome methylation arrays.

Developing Non-Invasive Biomarkers of Tobacco Exposure for the NIH Genes and Environment Initiative:

The program was awarded a U01 grant from the NIEHS to develop non-invasive biomarkers (using nasal and buccal epithelium) of host response to cigarette exposure. These biomarkers will ultimately serve as non-invasive measures of the biological response to tobacco exposure that can be applied to large-scale population studies as part of the NIH/NIEHS Genes and Environment Initiative. In collaboration with Dr. Steven Chillrud (Columbia University), we are beginning to extend this study to the development of non-invasive biomarkers of secondhand smoke exposure in the nasal epithelium of children. In collaboration with Drs. Nat Rothman and Qing Lin (National Cancer Institute), we have expanded our study to include other inhaled exposures (e.g., biomass combustion, formaldehyde) that may be contributing factors to lung disease.

Developing a Molecular Data Repository for the Lung Disease Research Community:

This project, funded through the National Heart, Lung and Blood Institute (NHLBI), will establish the Lung Genomics Research Consortium (LGRC), a collaboration between Drs. David Schwartz (National Jewish Health), Naftali Kaminski (University of Pittsburgh), John Quackenbush (Dana-Farber Cancer Institute), Marc Geraci (University of Colorado), Frank Sciurba (University of Pittsburgh), Ivana Yang (National Jewish Health) and Avrum Spira (Boston University). The goal of this project is to combine genetic, genomic and clinical information on more than 400 human lung tissue samples from patients with various lung diseases, including Interstitial Lung Disease (ILD), Chronic Obstructive Pulmonary Disease (COPD) and emphysema.

Immunopathology of the Nasal Mucosa in Sarcoidosis:

Together with Dr. Jeffrey Berman, we are studying gene expression profiles of nasal epithelium in the setting of sarcoidosis and their relationship to clinical outcome.

Gene expression profiles in mediastinal lymph nodes as predictors of lung cancer prognosis:

Together with Dr. Hasmeena Kathuria and Dr. Ben Daly, we are exploring whole-genome expression profiles in paraffin-embedded lymph nodes from subjects with stage I lung cancer and linking these profiles to survival after resection.


  • DNA microarrays
  • MicroRNA arrays
  • SNP arrays
  • Next Generation Sequencing: Illumina GAIIx and HiSeq
  • Construction of relational databases
  • Reverse Engineering of Biological Networks

Principal Investigators:

  • Avrum Spira, MD, MSc; Computational Biomedicine
  • Jerome Brody, MD; The Pulmonary Center
  • Marc Lenburg, PhD; Computational Biomedicine
  • Frank Schembri, MD; The Pulmonary Center
  • Gang Liu, PhD; Computational Biomedicine
  • Katrina Steiling, MD, MSc; Computational Biomedicine
  • Jennifer Beane, PhD; Computational Biomedicine
  • Paola Sebastiani, PhD; School of Public Health, Boston University
  • Dan Brooks, ScD; School of Public Health, Boston University
  • Marisa Ramirez, PhD; The Pulmonary Center
  • Hasmeena Kathuria, MD; The Pulmonary Center
  • Marty Joyce Brady, MD; The Pulmonary Center
  • George O’Connor, MD; The Pulmonary Center
  • Karen Schlauch, PhD; Department of Genetics and Genomics
  • Christina Anderlind, MD; Computational Biomedicine

Pulmonary Fellows:

  • Yannbor Lin, MD; The Pulmonary Center
  • Charles Dumont, MD; The Pulmonary Center

Graduate Students:

  • Catalina Perdomo (PhD candidate, Genetics and Genomics)
  • Adam Gower (PhD candidate, Bioinformatics)
  • Julie Zeskind (PhD candidate, Bioinformatics)
  • John Brothers (PhD candidate, Bioinformatics)
  • Josh Campbell (PhD candidate, Bioinformatics)
  • Joe Gerrein (PhD candidate, Bioinformatics)
  • Kahkeshan Hijazi (PhD candidate, Bioinformatics)
  • Rebecca Kusko (Graduate Student, Genetics and Genomics)
  • Carly Garrison (Graduate Student, Genetics and Genomics)

Study Coordinator:

  • Martine Dumas, MPH, RN

Research Technicians:

  • Sherry Zhang
  • Lingqi Luo

Research Assistants:

  • Ji Xiao
  • Emma Chu

Collaborators (outside of BU Medical Center):

  • James Collins, PhD; College of Engineering, BU
  • Douglas Bell, PhD; National Institute of Environmental Health Sciences, NC
  • Joseph Keane, MD; Trinity College, Ireland
  • Steven Chillrud, PhD; Columbia University
  • David Schwartz, MD; National Jewish Health
  • Ivana Yang, PhD; National Jewish Health
  • Naftali Kaminski, MD; University of Pittsburgh
  • Frank Sciurba, MD; University of Pittsburgh
  • John Quackenbush, PhD; Dana-Farber Cancer Institute
  • Marc Geraci, MD; University of Colorado
  • James Hogg, MD; University of British Columbia
  • Andrea Bild, PhD; University of Utah
  • Stephen Lam, MD; University of British Columbia
  • Pierre Massion, MD; Vanderbilt University
  • Rafael Guerrero, PhD; Johns Hopkins School of Medicine
  • David Sidransky, MD; Johns Hopkins School of Medicine
  • Shyam Biswal, PhD; Johns Hopkins School of Medicine
  • Steven Dubinett, MD; University of California, Los Angeles
  • David Elashoff, PhD; University of California, Los Angeles
  • Steven Belinsky, PhD; Lovelace Respiratory Research Institute

Selected Publications:

  • Beane J, Vick J, Schembri F, Anderlind C, Gower A, Campbell J, Luo L, Zhang XH, Xiao J, Alekseyev YO, Wang S, Levy S, Massion PP, Lenburg M, Spira A. Characterizing the impact of smoking and lung cancer on the airway transcriptome using RNA-Seq. Cancer Prev Res (Phila). 4(6): 803-17, 2011.
  • Gustafson AM, Soldi R, Anderlind C, Scholand MB, Qian J, Zhang X, Cooper K, Walker D, McWilliams A, Liu G, Szabo E, Brody J, Massion PP, Lenburg ME, Lam S, Bild AH, Spira A. Airway PI3K pathway activation is an early and reversible event in lung cancer development. Sci Transl Med. 2(26): 26ra25, 2010.
  • Zhang X, Sebastiani P, Liu G, Schembri F, Zhang X, Dumas YM, Langer EM, Alekseyev Y, O’Connor GT, Brooks DR, Lenburg ME, Spira A. Similarities and differences between smoking-related gene expression in nasal and bronchial epithelium. Physiol Genomics. 41(1): 1-8, 2010.
  • Schembri F, Sridhar S, Perdomo C, Gustafson AM, Zhang X, Ergun A, Lu J, Liu G, Zhang X, Bowers J, Vaziri C, Ott K, Sensinger K, Collins JJ, Brody JS, Getts R, Lenburg ME, Spira A. MicroRNAs as modulators of smoking-induced gene expression changes in human airway epithelium. Proc Natl Acad Sci USA. 106(7): 2319-24, 2009.
  • Beane J, Sebastiani P, Whitfield T, Steiling K, Lenburg M, Spira A. A Prediction Model for Lung Cancer Diagnosis that Integrates Clinical and Genomic Features. Cancer Prevention Research. 1: 56-64, 2008.
  • Sridhar S, Schembri F, Zeskind J, Shah V, Gustafson A, Steiling K, Liu G, Dumas Y, Zhang S, Brody J, Lenburg M, Spira A. Smoking-induced gene expression changes in the bronchial airway are reflected in nasal and buccal epithelium. BMC Genomics. 9:259, 2008.
  • Beane J, Sebastiani P, Liu G, Brody J, Lenburg M, Spira A. Reversible and Permanent Effects of Tobacco Smoke Exposure on Airway Epithelial Gene Expression. Genome Biology. 8: R201, 2007.
  • Millien G, Beane J, Lenburg M, Lu J, Spira A, Ramirez M. Characterization of the mid-foregut transcriptome identifies genes regulated during lung bud induction. Mechanisms of Development. 8:124-39, 2008.
  • Spira A, Beane J, Shah V, Steiling K, Liu G, Schembri F, Gilman S, Dumas Y, Calner P, Sebastiani P, Sridhar S, Beamis J, Lamb C, Keane J, Lenburg M, Brody J. Airway Epithelial Gene Expression in the Diagnostic Evaluation of Smokers with Suspect Lung Cancer. Nature Medicine. 13:361-6. 2007.
  • Zhang X, Liu G, Lenburg M, Spira A. Comparison of smoking-induced gene expression on Affymetrix exon and 3’ based expression arrays. Genome Informatics. 18:247-257, 2007.
  • Brody JS, Spira A. Inflammation, Lung Cancer and COPD. Proc Am Thorac Soc. 3:535-7. 2006.
  • Demeo D, Mariani T, Lange C, Srisuma S, Litonjua A, Celedón C, Lake S, Reilly J, Chapman H, Mecham B, Haley K, Sylvia J, Sparrow D, Spira A, Beane J, Pinto-Plata V, Speizer F, Shapiro S, Weiss S, Silverman E. The SERPINE2 gene is associated with Chronic Obstructive Pulmonary Disease. Am J Human Genetics. 78: 253-64, 2006.
  • Millien G, Spira A, Hinds A, Wang J, Williams M, Ramirez M. Alterations in gene expression in T1alpha null lung: a model of deficient alveolar sac development. BMC Developmental Biology. 6:35, 2006.
  • Shah V, Sridhar S, Beane J, Brody J, Spira A. SIEGE: Smoking Induced Epithelial Gene Expression Database. Nucleic Acids Res. 33: D573-9, 2005.
  • Spira A*, Beane J*, Pinto-Plata V*, Kadar A, Liu G, Shah V, Celli B, Brody JS. Gene Expression Profiling of Human Lung Tissue from Smokers with Severe Emphysema. Am J Respir Cell Mol Biol. 31: 601-610, 2004. *Contributed equally and should be considered co-first authors.
  • Spira A, Beane J, Shah V, Liu G, Schembri F, Yang X, Palma J, Brody J. Effects of Cigarette Smoke on the Human Airway Epithelial Cell Transcriptome. Proc Natl Acad Sci USA. 101:10143-8, 2004.
  • Spira A, Beane J, Schembri F, Liu G, Yang X, Ding C, Gilman S, Cantor C, Brody J. Noninvasive method for obtaining RNA from buccal mucosa epithelial cells for gene expression profiling. Biotechniques. 36:484-87, 2004.