Johnson Lab

Lab Objectives:

To develop high-quality, cutting-edge computational algorithms for applications in precision genomic medicine, and to use these methods to improve the way patients are treated in the clinic.

Approach to Science:

  • Develop clinically relevant computational methods and software for high-throughput data
  • Collaborate closely with biologists, clinicians, and other statistical and computational scientists
  • Apply methods in highly ‘translatable’ ways to impact the way patients are treated in the clinic


The development of personalized treatment regimens is an active area of current research in genomics. The focus of our research is to investigate core biological components that contribute to disease prognosis, development, and early detection, and to develop latent variable models that accurately determine optimal therapeutic regimens for individual patients. Because biological processes do not act in isolation but as parts of complex interactive systems, we are computationally evaluating interactions between these systems at multiple levels. At the sequence and cellular level, we have developed latent variable models for probabilistically determining gene expression profiles that are linked to individual response to treatment. In addition, we are experimentally perturbing subcomponents of larger biological systems or pathways and linking pathway activation status to genetic disease risk and drug sensitivity.

Our lab develops methods for analyzing a variety of genome-wide data types, currently focusing on data from next-generation sequencing (NGS) experiments. We are developing a comprehensive and coordinated set of statistical methods for NGS data analysis and data integration that directly address many important problems in epigenetics and translational medicine. There is a great need for biologically motivated and mathematically justified methods that can efficiently handle the massive data sets generated by high-throughput experiments. Our goal is to conduct cutting-edge research that makes an impact in genomics and translational science, while developing straightforward computational tools that help others conduct similar work. To ensure the methods being developed are appropriate and useful, we actively work to establish and maintain strong collaborations with applied scientists in genomics and translational research. We believe this research is timely, important, and relevant to current needs in these fields.

Post Doctorate

Arthur VanValkenburg

Graduate Students

  • Howard Fan, Bioinformatics Program
  • Ethel Nankya, Bioinformatics Program
  • Kiloni Quiles-Franco, Molecular and Translational Medicine
  • Lucas Schiffer, Bioinformatics Program
  • Xutao Wang, Biostatistics Program
  • Kristina Yamkovoy, Biostatistics Program 

Past Trainees

Recent Publications

  1. Shen Y, Rahman M, Piccolo SR, Gusenleitner D, El-Chaar NN, et al. ASSIGN: context-specific genomic profiling of multiple heterogeneous biological pathways. Bioinformatics. 2015 Jan 22; PubMed PMID: 25617415.
  2. Castro-Nallar E, Hasan NA, Cebula TA, Colwell RR, Robison RA, et al. Concordance and discordance of sequence survey methods for molecular epidemiology. PeerJ. 2015;3:e761. PubMed PMID: 25737810; PubMed Central PMCID: PMC4338773.
  3. Byrd AL, Perez-Rogers JF, Manimaran S, Castro-Nallar E, Toma I, McCaffrey T, Siegel M, Benson G, Crandall KA, Johnson WE. Clinical PathoScope: rapid alignment and filtration for accurate pathogen identification in clinical samples using unassembled sequencing data. BMC Bioinformatics. 2014 Aug 4;15:262. PubMed PMID: 25091138; PubMed Central PMCID: PMC4131054.
  4. Hong C, Manimaran S, Shen Y, Perez-Rogers JF, Byrd AL, Castro-Nallar E, Crandall KA, Johnson WE. PathoScope 2.0: a complete computational framework for strain identification in environmental or clinical sequencing samples. Microbiome. 2014;2:33. PubMed PMID: 25225611; PubMed Central PMCID: PMC4164323.
  5. Bild AH, Chang JT, Johnson WE, Piccolo SR. A field guide to genomics research. PLoS Biol. 2014 Jan;12(1):e1001744. PubMed PMID: 24409093; PubMed Central PMCID: PMC3883637.
  6. Hong C, Clement NL, Clement S, Hammoud SS, Carrell DT, Cairns BR, Snell Q, Clement MJ, Johnson WE. Probabilistic alignment leads to improved accuracy and read coverage for bisulfite sequencing data. BMC Bioinformatics. 2013 Nov 21;14:337. PubMed PMID: 24261665; PubMed Central PMCID: PMC3924334.
  7. Piccolo SR, Withers MR, Francis OE, Bild AH, Johnson WE. Multiplatform single-sample estimates of transcriptional activation. Proc Natl Acad Sci U S A. 2013 Oct 29;110(44):17778-83. PubMed PMID: 24128763; PubMed Central PMCID: PMC3816418.
  8. Francis OE, Bendall M, Manimaran S, Hong C, Clement NL, Castro-Nallar E, Snell Q, Schaalje GB, Clement MJ, Crandall KA, Johnson WE. Pathoscope: species identification and strain attribution with unassembled sequencing data. Genome Res. 2013 Oct;23(10):1721-9. PubMed PMID: 23843222; PubMed Central PMCID: PMC3787268.
  9. Piccolo SR, Sun Y, Campbell JD, Lenburg ME, Bild AH, Johnson WE. A single-sample microarray normalization method to facilitate personalized-medicine workflows. Genomics. 2012 Dec;100(6):337-44. PubMed PMID: 22959562; PubMed Central PMCID: PMC3508193.
  10. Clement NL, Snell Q, Clement MJ, Hollenhorst PC, Purwar J, Graves BJ, Cairns BR, Johnson WE. The GNUMAP algorithm: unbiased probabilistic mapping of oligonucleotides from next-generation sequencing. Bioinformatics. 2010 Jan 1;26(1):38-45. PubMed PMID: 19861355.