Abstract
The epigenome modulates the activity of genes and supports the stability of the genome. It can also contain phenotypically relevant, heritable marks that may vary at the level of the organism and the population. Such non-genetic standing variation may be relevant to ecological and evolutionary processes. To identify loci susceptible to selection, it is common to profile large populations at the genome scale, yet methods for performing such scans for epigenetic diversity remain largely unexplored. Here, we develop a scalable, information-theoretic approach to assess epigenome diversity based on Jensen-Shannon divergence (JSD) and demonstrate its practicality by measuring cell type-specific methylation diversity in the model plant <jats:italic>Arabidopsis thaliana</jats:italic>. DNA methylation diversity tends to be higher in the CG than in the non-CG (CHG and CHH) sequence context, but the tissue or cell type affects diversity at non-CG sites. Less accessible, more heterochromatic states of chromatin exhibit increased diversity. Genes tend to carry more single-methylation polymorphisms when they harbor gene body-like chromatin signatures and flank transposable elements (TEs). In conclusion, the analysis of DNA methylation with JSD in <jats:italic>Arabidopsis</jats:italic> demonstrates that the genomic location of a gene dominates its methylation diversity, in particular its proximity to TEs, which are increasingly viewed as drivers of evolution. Importantly, the JSD-based approach implemented here can be applied to any population-level epigenomic data set to analyze variation in epigenetic marks among individuals, tissues, or cells of any organism, including the epigenetic heterogeneity of cells in healthy or diseased organs such as tumors.
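To illustrate the idea of a JSD-based diversity score, the following is a minimal sketch of how JSD could be computed at a single cytosine across a population of samples. It assumes that each sample contributes a pair of methylated and unmethylated read counts and that samples are weighted by their read coverage. The function names `entropy` and `jsd`, the input layout, and the weighting scheme are illustrative assumptions and are not taken from the implementation described here.

```python
import math


def entropy(p):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def jsd(counts):
    """Jensen-Shannon divergence at one site across samples.

    counts: list of (methylated, unmethylated) read counts, one pair per
    sample. Samples are weighted by coverage (an assumption of this sketch);
    samples without reads at the site are ignored.
    """
    covered = [(m, u) for m, u in counts if m + u > 0]
    if not covered:
        raise ValueError("no sample has coverage at this site")
    grand = sum(m + u for m, u in covered)

    # Coverage-weighted mixture of the per-sample methylation distributions.
    mixture = (
        sum(m for m, _ in covered) / grand,
        sum(u for _, u in covered) / grand,
    )

    # Coverage-weighted average of the per-sample entropies.
    within = sum(
        (m + u) / grand * entropy((m / (m + u), u / (m + u)))
        for m, u in covered
    )

    # JSD = entropy of the mixture minus the mean within-sample entropy.
    return entropy(mixture) - within


if __name__ == "__main__":
    # One fully methylated and one fully unmethylated sample.
    print(jsd([(10, 0), (0, 10)]))
    # Two samples with identical methylation levels.
    print(jsd([(5, 5), (5, 5)]))
```

Under these assumptions, the score for a binary methylation state lies between 0 bits (all samples share the same methylation level) and 1 bit (samples split evenly between fully methylated and fully unmethylated).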