Understanding the differences between CERES and Chronos DepMap data

In 21Q1 we introduced Chronos, a new method for inferring gene fitness effects from CRISPR knockout screens. The accompanying preprint described the algorithm in detail and compared it to a number of competitors across a range of global performance metrics. However, many DepMap users will be particularly interested in understanding how Chronos changes the Project Achilles dataset. This blog post explores the differences in a little more detail.

Distributional Properties

Genome-wide CRISPR screens in cell lines show a characteristic pattern for gene viability effects: a narrow peak near 0 and a long left tail of dependencies. Chronos shows the same pattern in a more pronounced form: the peak at 0 is higher and narrower, and the tail of dependencies is heavier. This is in agreement with our understanding of the underlying biology. There are many genes for which we think the true fitness effect is identically 0 (such as all unexpressed genes), but a dependency can have a wide range of effects, from mild growth inhibition to rapid cell death.


The distributions look similar within individual cell lines.

Many of the most important uses of DepMap data involve looking at the pattern of a gene's effects across cell lines, which we'll call the gene's profile. Distributions of gene effects for a single gene across cell lines of course look very different for different categories of genes. One consistent finding is that Chronos has a more pronounced negative skew for many gene profiles, indicating a greater degree of apparent selectivity. Here's the plot of each gene profile's skewness in Chronos vs CERES, with common essential genes highlighted.
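For readers who want to reproduce the skewness comparison on their own data, here is a minimal sketch of the statistic for a single gene profile. It uses the population (biased) skewness; the post does not say which estimator was used, so that choice is an assumption, and the two profiles are made-up values rather than DepMap data.

```python
def skewness(values):
    """Population skewness: third central moment divided by variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    if m2 == 0:
        return 0.0  # a constant profile has no skew
    return m3 / m2 ** 1.5


# Hypothetical gene effect scores across six cell lines (illustrative only).
selective_profile = [0.0, 0.1, -0.1, 0.05, -0.05, -2.0]  # one strongly dependent line
flat_profile = [-0.1, 0.0, 0.1, -0.1, 0.0, 0.1]          # symmetric around 0

print("selective:", skewness(selective_profile))
print("flat:", skewness(flat_profile))
```

A profile with one strongly dependent line gets a clearly negative skew, while a symmetric profile sits at 0. That is why negative skew is a convenient proxy for selectivity.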


Although the algorithms agree well on which genes have a negative skew, they show less agreement on which genes have a positive skew. About 70 genes are positively skewed in CERES and not Chronos; for the reverse, it's 225. The difference is explained by strong common essentials, many of which have a Gaussian distribution in CERES but are right-skewed in Chronos. An example helps illustrate why this is the case. Ras-related nuclear protein (RAN) is the most essential (across cell lines) of the genes positively skewed in Chronos but not CERES:


The left tail of Chronos gene effect scores for RAN is truncated near -3, while CERES has a long left tail extending to -4.5. This difference is at least partly due to the different cost functions for each algorithm. CERES models log fold change data and assumes each observed log fold change value is normally distributed about the true value. Chronos models the underlying read count data directly and assumes a negative binomial distribution, much like MAGeCK. Thus, CERES sees a difference between, for example, -8 and -12 log fold change as highly significant. Chronos accounts for the fact that this difference could arise from observing 1 vs 0 reads for the guide in question, which it can attribute to noise.
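The 1 vs 0 reads point can be made concrete with a small sketch. This is not the Chronos or CERES likelihood. It only compares how far apart two observations look on the log fold change scale with how differently a negative binomial count model scores them. The initial count, pseudocount, expected read count, and dispersion are all hypothetical values chosen for illustration.

```python
import math


def nb_log_pmf(k, mean, dispersion):
    """Log-probability of count k under a negative binomial with the given
    mean and dispersion (variance = mean + dispersion * mean ** 2)."""
    r = 1.0 / dispersion
    p = r / (r + mean)  # success probability in the standard parameterization
    return (math.lgamma(k + r) - math.lgamma(r) - math.lgamma(k + 1)
            + r * math.log(p) + k * math.log(1.0 - p))


def log_fold_change(final_count, initial_count, pseudocount):
    """log2 ratio of final to initial counts, with a pseudocount to avoid log(0)."""
    return math.log2((final_count + pseudocount) / (initial_count + pseudocount))


# Hypothetical guide: 500 reads before the screen, 1 or 0 reads at the end.
initial_count = 500
pseudocount = 0.1
lfc_gap = (log_fold_change(1, initial_count, pseudocount)
           - log_fold_change(0, initial_count, pseudocount))

# Hypothetical count model expecting about half a read for this guide.
expected_reads = 0.5
dispersion = 0.5
log_lik_gap = (nb_log_pmf(0, expected_reads, dispersion)
               - nb_log_pmf(1, expected_reads, dispersion))

print("gap in log fold change:", lfc_gap)
print("gap in negative binomial log-likelihood:", log_lik_gap)
```

With these made-up numbers the two observations sit about 3.5 log2 units apart, which a normal model with unit-scale noise would treat as a large difference. The count model's log-likelihoods for 0 and 1 reads differ by less than 1, so it regards the two outcomes as similarly plausible.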

Many of the opposite cases (CERES positively skewed, Chronos not) are driven by a single cell line where the gene has a very positive score. In fact, a single guide in a single replicate for that gene and cell line frequently accounts for the positive skew in CERES. For example, consider the profile of FXYD3:


The outlying line in CERES, T3M4, has one replicate that passed QC in Achilles. When we examine the log fold change scores for FXYD3 guides, we find a striking discrepancy:


This appears to be a case of clonal outgrowth most likely unrelated to FXYD3's function. Chronos uses a prefiltering step to help remove suspected cases of outgrowth.

Differences between Gene Effect Profiles

Unsurprisingly, since they're modeling the same underlying data, most gene profiles correlate pretty strongly between Chronos and CERES:

Most of the exceptions, where gene profiles do not agree between the two methods, are common essential genes. Here's the relationship between gene profile correlation between Chronos and CERES and the gene profile's mean effect in CERES. We can see a trend toward lower correlation for more essential genes.


We noted above that Chronos and CERES treat strong essentials differently at a statistical level, but there is another salient difference. In the underlying log fold change data, most genes are negatively correlated with their own copy number (meaning that increasing copies increases guide depletion), but log fold changes for guides targeting common essential genes are positively correlated with copy number. One likely contributor to this effect is incomplete knockout, in which increasing copies of an essential gene reduces the chance of the CRISPR system achieving a full knockout.

CERES removes the negative correlation of nonessential gene profiles with copy number, but at the cost of enhancing the positive correlation of common essential gene profiles. Chronos removes both trends. Below is a plot illustrating the difference. Each gene profile in CERES and Chronos has been correlated with its own copy number, then binned according to its average gene effect score. In both CERES and Chronos, genes centered at 0 have been decorrelated from their own copy number. However, genes with negative average scores (common essentials) are positively correlated with their own copy number in CERES.
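The correlate-then-bin procedure described above can be sketched as follows. The function names, the bin width of 0.5, and the two toy genes are assumptions for illustration; the actual analysis ran over the full gene effect and copy number matrices.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def bin_copy_number_correlations(gene_effect, copy_number, bin_width=0.5):
    """Correlate each gene's profile with its own copy number, then average
    the correlations within bins of mean gene effect.

    Both arguments map gene name -> list of values, one per cell line,
    with cell lines in the same order."""
    bins = {}
    for gene, profile in gene_effect.items():
        mean_effect = sum(profile) / len(profile)
        bin_start = math.floor(mean_effect / bin_width) * bin_width
        bins.setdefault(bin_start, []).append(pearson(profile, copy_number[gene]))
    return {start: sum(rs) / len(rs) for start, rs in sorted(bins.items())}


# Hypothetical data for two genes across four cell lines.
gene_effect = {
    "ESSENTIAL_GENE": [-1.2, -1.0, -0.8, -0.6],
    "NEUTRAL_GENE": [0.1, 0.0, -0.1, 0.0],
}
copy_number = {
    "ESSENTIAL_GENE": [1, 2, 3, 4],
    "NEUTRAL_GENE": [2, 2, 3, 1],
}

binned = bin_copy_number_correlations(gene_effect, copy_number)
for bin_start, mean_r in binned.items():
    print(bin_start, mean_r)
```

Each key in the result is the lower edge of a mean-gene-effect bin, and each value is the average copy number correlation of the genes falling in that bin.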


To highlight the influence of positive copy number correlation on agreement, we've reproduced the plot of Chronos-CERES gene profile correlation, marking in red the genes with greater than 0.2 positive correlation between their copy number and CERES profiles.


In the remaining analysis, we'll filter out common essentials (those with mean CERES gene effect < -0.7). Most of the remaining genes with low correlation between CERES and Chronos have low variance across cell lines in CERES, and conversely all the genes with high variance have high correlation. Thus, the lack of strong correlation in many cases can be explained by a lack of differential signal rather than any meaningful disagreement between the two algorithms. However, the genes with the lowest correlation tend to have above-median variance:


In cases of low correlation, which algorithm should we believe? We recently came up with a method for assessing confidence in gene effect profiles. Confidence is measured by combining various types of information, such as agreement with the Project Score CRISPR screens, agreement with RNAi, predictability, and guide consistency, yielding a final score between 0 and 1. Cases where we have high confidence in the Chronos profile but not the CERES profile probably indicate that Chronos is right and vice versa.


Overall, Chronos had slightly higher confidence scores (median 0.69 vs 0.66 for CERES), consistent with the higher overall data quality we found. For genes (excluding common essentials) with low (< 0.5) correlation between their CERES and Chronos profiles, we looked for cases where we had significantly higher confidence in one dataset than the other. We found only 17 genes where we had much higher confidence in the CERES profile than in the Chronos profile (CERES confidence > 0.75, Chronos confidence < 0.5), and 22 where we were much more confident in Chronos than CERES. None of the genes in either list is strongly selective, and most don't have cell lines with strong (less than -1) gene effect.
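The selection described above amounts to a simple filter, sketched below. The thresholds for the CERES-favoring list come from the text. Applying the mirror-image thresholds for the Chronos-favoring list is an assumption, as are the record layout, the function name, and the example genes.

```python
from dataclasses import dataclass


@dataclass
class GeneSummary:
    gene: str
    mean_ceres_effect: float
    profile_correlation: float  # Chronos-CERES gene profile correlation
    ceres_confidence: float
    chronos_confidence: float


def confident_disagreements(genes, essential_cutoff=-0.7, max_correlation=0.5,
                            high=0.75, low=0.5):
    """Split low-correlation, non-common-essential genes into those where
    confidence clearly favors CERES and those where it clearly favors Chronos."""
    favor_ceres, favor_chronos = [], []
    for g in genes:
        if g.mean_ceres_effect < essential_cutoff:
            continue  # common essential, excluded from this analysis
        if g.profile_correlation >= max_correlation:
            continue  # the two profiles agree well enough
        if g.ceres_confidence > high and g.chronos_confidence < low:
            favor_ceres.append(g.gene)
        elif g.chronos_confidence > high and g.ceres_confidence < low:
            favor_chronos.append(g.gene)  # mirrored thresholds: an assumption
    return favor_ceres, favor_chronos


# Hypothetical summaries, not real DepMap values.
genes = [
    GeneSummary("GENE_A", -0.1, 0.30, 0.80, 0.40),  # confidence favors CERES
    GeneSummary("GENE_B", -0.2, 0.42, 0.40, 0.83),  # confidence favors Chronos
    GeneSummary("GENE_C", -1.5, 0.20, 0.90, 0.30),  # common essential, skipped
    GeneSummary("GENE_D", 0.0, 0.90, 0.90, 0.30),   # high correlation, skipped
    GeneSummary("GENE_E", -0.1, 0.10, 0.30, 0.35),  # low confidence in both
]

favor_ceres, favor_chronos = confident_disagreements(genes)
print("favor CERES:", favor_ceres)
print("favor Chronos:", favor_chronos)
```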

A typical example, pyruvate dehydrogenase kinase 1 (PDK1), has a correlation of only 0.42 between the Chronos and CERES gene profiles. None of the DepMap datasets (including Sanger's Project Score and the RNAi datasets) show any appreciable effect from suppression or knockout of this gene in any breast lines. CERES shows the strongest maximum depletion for PDK1 at -0.93 in KPNYN (a neuroblastoma), but we only have confidence 0.4 in this profile. In Chronos, we have excellent (0.83) confidence in PDK1's profile, but it never scores below -0.3. An examination of the guide data below shows the source of the discrepancy. A single reagent, ATACAAGGAGAGCTTTGGGG, shows strong depletion in some lines while no other reagent does. CERES bases its gene effect estimates on this single guide. Chronos bases its estimate on the other three guides and dismisses the depleting reagent as an outlier.
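To check for this kind of single-guide discrepancy in your own gene of interest, a rough heuristic is to compare each guide's log fold change with the median of the other guides in the same cell line. The sketch below is not how either CERES or Chronos weighs guides. The function name, the threshold of 1.0, and the guide values are all hypothetical.

```python
import statistics


def flag_discordant_guides(guide_lfc, threshold=1.0):
    """Return guides whose log fold change differs from the median of the
    remaining guides by more than `threshold`."""
    flagged = []
    for guide, lfc in guide_lfc.items():
        others = [v for g, v in guide_lfc.items() if g != guide]
        if abs(lfc - statistics.median(others)) > threshold:
            flagged.append(guide)
    return flagged


# Hypothetical log fold changes for four guides in one cell line.
guide_lfc = {"guide_1": -1.8, "guide_2": -0.1, "guide_3": 0.05, "guide_4": -0.2}

flagged = flag_discordant_guides(guide_lfc)
print("discordant guides:", flagged)
print("mean of all guides:", statistics.mean(guide_lfc.values()))
print("median of all guides:", statistics.median(guide_lfc.values()))
```

In this toy case one depleting guide pulls the mean to about -0.5 while the median stays near -0.15. That is the same kind of gap that separates the two PDK1 profiles.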


In the most common case, we don't have much confidence in genes with low Chronos-CERES correlation in either dataset. Of the 50 genes with the lowest correlation between Chronos and CERES, 40 are in the bottom third of genes by confidence in both datasets. This suggests that the major driver of disagreement (after the common essential gene differences we described above) is murky screen results, probably stemming from low guide quality for the genes in question.


As outlined in our preprint, we believe that Chronos provides significant improvements over existing methods, including CERES, for analyzing large CRISPR screening datasets. Of course, every method needs to make certain assumptions about the data, and it's unlikely that one method will perform best in every case. For the most part, as shown here, the differences between DepMap data processed using Chronos vs CERES should have little impact on interpreting the data. Genes with more substantial changes in dependency profiles should be handled with care, as they are likely ones in which there is substantial guide-to-guide variability and potential off-target effects.

Correction: the first version of this post incorrectly described PDK1 as a putative breast cancer target based on literature. The breast cancer target is phosphoinositide-dependent protein kinase 1, with the official symbol PDPK1.
