Researchers Develop New Method to Help with Analysis of Single Cell Data
CITE-seq (cellular indexing of transcriptomes and epitopes by sequencing) is an RNA sequencing-based method that simultaneously quantifies cell surface proteins and transcriptomic data in a single readout for each cell. The ability to measure both concurrently offers unprecedented insights into new cell types, disease states or other conditions.
Although CITE-seq solves the problem of detecting only a limited number of proteins alongside unbiased single-cell sequencing, one of its limitations is a high level of background noise that can hinder analysis.
To address this problem, researchers from Boston University Chobanian & Avedisian School of Medicine and College of Arts & Sciences have developed a novel tool that can identify and remove unwanted background noise arising from various sources.
“We created DecontPro, a statistical model that decontaminates two sources of contamination that were observed empirically in CITE-seq data,” explains corresponding author Joshua Campbell, PhD, associate professor of medicine at the School. “It can be used as an important quality assessment tool that will aid in the downstream analysis and help researchers to better understand the molecular cause of disease.”
The researchers examined several publicly available datasets that profiled different types of tissue with CITE-seq and found a novel type of artifact, which they called a “spongelet.” The spongelets contributed a large amount of background noise in several datasets. The researchers found that DecontPro can estimate and remove different sources of background noise, including contamination from spongelets, from ambient material that may be present in the cell suspension, or from non-specific binding of antibodies.
Masanao Yajima, PhD, professor of the practice in the department of mathematics and statistics, states, “DecontPro is a Bayesian hierarchical model. We carefully constructed it so that it can tease apart the signals from noise in single-cell datasets without being overly aggressive.”
These findings appear online in the journal Nucleic Acids Research.