Dissecting the roles of MBD2 isoforms in regulating NuRD complex function during cellular differentiation

The Nucleosome Remodelling and Deacetylation (NuRD) complex is a crucial regulator of cellular differentiation. Two members of the Methyl-CpG-binding domain (MBD) protein family, MBD2 and MBD3, are known to be integral but mutually exclusive subunits of the NuRD complex. Several MBD2 and MBD3 isoforms are present in mammalian cells, resulting in distinct MBD-NuRD complexes. Whether these different complexes serve distinct biochemical and/or functional activities during differentiation is not completely understood. Based on the essential role of MBD3 in lineage commitment, we systematically investigated a diverse set of MBD3 and MBD2 variants for their potential to rescue the differentiation block observed in mouse embryonic stem cells (ESCs) lacking MBD3. Our study reveals that while MBD3 is indeed crucial for ESC differentiation to neuronal cells, this function is independent of its MBD domain or binding to methylated DNA. While MBD3 isoforms are highly redundant, we find that MBD2 isoforms vary in their potential to rescue the absence of MBD3 during lineage commitment. Full-length MBD2a only partially rescues the differentiation block; MBD2b, which lacks the N-terminal GR-rich repeat, fully rescues the differentiation block in MBD3 KO ES cells; and cells expressing the testis-specific isoform MBD2t, which lacks the coiled-coil domain required for NuRD interactions, are not able to generate any differentiated cells. In the case of MBD2a, we further show that removing the m-CpG DNA binding capacity or the GR-rich repeat renders the protein fully redundant to MBD3, highlighting the requirements for these domains in diversifying NuRD complex function. In sum, our results highlight a partial redundancy of MBD2 and MBD3 during cellular differentiation and point to specific functions of distinct MBD2 isoforms and specific domains within the NuRD complex.

The Nucleosome Remodelling and Deacetylation (NuRD) complex is an abundant and highly conserved complex regulating cell fate transitions and differentiation in many different organisms and developmental contexts (2–4). The multi-protein complex combines two enzymatic activities: lysine deacetylation mediated by the Histone Deacetylase (HDAC) 1 and 2 proteins, and ATPase-dependent nucleosome remodeling by Chromodomain Helicase DNA binding protein (CHD) 3 or 4 (5–7). Additional complex partners are the histone chaperone proteins RBBP4 and 7, the zinc-finger proteins GATAD2a or GATAD2b, two MTA proteins (MTA1, MTA2, and/or MTA3) and CDK2AP1 (8). Additionally, the methyl-CpG binding protein family members MBD2 or MBD3 are essential but mutually exclusive NuRD complex members, therefore assembling distinct MBD2-NuRD or MBD3-NuRD complexes (4,9). Recent structural and biochemical data support the notion that the MBD2 and MBD3 proteins function as a link between the MTA:HDAC:RBBP core and the peripheral GATAD2:CHD:CDK2AP remodeling module (10–12). Absence of MBD2 or MBD3 therefore disrupts NuRD complex functionality. In addition, replacement of MBD2 or MBD3 with PWWP2A results in a distinct complex lacking the remodeling module, also called the NuDe complex (13–15). In vivo, MBD2 seems dispensable for normal mouse development, as MBD2 KO mice display only minor phenotypes and are viable and fertile (4). In contrast, MBD3 is required to exit pluripotency and is essential for early mammalian development, as reflected by the lethality of MBD3 KO mouse embryos (4,16–18).
MBD2 and MBD3 are closely related proteins that share almost 80% homology outside the MBD domain and arose by gene duplication from an ancestral MBD2/3 gene that is present in some metazoans (4,9,19). MBD2 and MBD3 contain an MBD and a coiled-coil domain (CC) separated by a disordered protein region, with the latter two being important for protein-protein interaction with the NuRD complex (20–22). Whereas the MBD domain of MBD2 shows high affinity for methylated DNA, the MBD domain of MBD3 lacks four conserved amino acids required for the recognition of methyl-CpG. In addition, MBD2 contains an N-terminal glycine-arginine (GR) rich stretch that has been implicated in increasing the affinity for methylated DNA and interactions with the NuRD complex (9,23). Differential inclusion of these domains results in various MBD2 and MBD3 isoforms, some with cell type or tissue-specific expression (16,24–26). Three MBD3 isoforms are present in mouse ESCs: the full-length MBD3a isoform, MBD3b with a truncated MBD domain and MBD3c lacking the MBD domain (16). MBD2 also has three isoforms: the full-length MBD2a, MBD2b lacking the N-terminal GR repeat and MBD2t lacking the C-terminal CC domain. Based on the presence of either MBD2 or MBD3 in the NuRD complex, MBD2-NuRD and MBD3-NuRD are thought to have distinct functional roles during early development. It is speculated that this is mainly due to the differential binding affinity of the MBD proteins for methylated DNA and the resulting recruitment of the NuRD complex to distinct genomic sites. The tissue-specific presence of MBD2 or MBD3 isoforms is expected to further increase the complexity of NuRD complex function. Still, little is known about the direct requirement of the individual MBD2 and MBD3 domains for NuRD complex activity during cellular differentiation. Furthermore, differential and overlapping expression levels of MBD2 and MBD3 isoforms in different cellular contexts complicate our current understanding of the roles of these different NuRD complexes, requiring further investigation.
Here, we took a systematic approach to dissect the functionality of different NuRD complex compositions during neuronal commitment and terminal differentiation by controlling the expression of MBD2 or MBD3 isoforms. To this end, we combined neuronal differentiation of engineered murine ES cells with FACS-based measurements of cell identity and transcriptional profiling. In our approach, successful lineage commitment is a direct measurement of a functional NuRD complex and of the role of specific isoforms. While MBD3 is a critical NuRD complex member allowing neuronal differentiation, we show that it functions independently of its MBD domain. Additionally, full-length MBD2 is able to partially compensate for MBD3 function. In the absence of the GR-stretch or of the binding affinity for methylated DNA, this ability is further elevated to fully compensate for the absence of MBD3, indicating that these properties prevent a complete redundancy to MBD3. In sum, our results, combining functional assays with gene expression analysis of a diverse set of MBD constructs, highlight a partial redundancy of MBD2 and MBD3 during cellular differentiation and point to a more structural than instructive function of the specific MBD family members.

Establishment of a functional readout for systematic interrogation of MBD2/MBD3-NuRD complexes
To investigate the distinct roles of MBD2 and MBD3 during lineage commitment, we employed a well-established in vitro differentiation system of ESCs towards homogeneous populations of neural progenitor cells (NPC) and terminal neurons (TN) (27) (Figure 1A). First, we explored published microarray expression data (28) for several surface proteins at consecutive differentiation stages (ESC, cell aggregate formation (CA) day 4, NPC day 8, and TN day 2 and day 4, respectively) and identified two neuronal surface proteins, CD24a (CD24) and CD56 (also known as NCAM1), as significantly upregulated at the NPC and TN stages, indicating successful neuronal lineage commitment (Figure 1B). We further established a FACS-based readout on NPCs to quantify the expression of those neuronal surface markers as a measure to score the differentiation potential of ESCs. In addition, we assessed successful lineage commitment by the total number of live cells at the progenitor and terminal stages of the differentiation protocol.
We first tested the suitability of this setup on individually derived MBD2 and MBD3 knock-out (KO) ESC lines that were generated in the same genetic background using CRISPR-Cas9. Targeting of MBD2 and MBD3 resulted in a complete loss of MBD2-NuRD or MBD3-NuRD, as both MBD2 isoforms (MBD2a and b) and all three MBD3 isoforms (MBD3a, b and c) were not detectable in the respective KO cell lines (Supp. Figure 1A-B). We next differentiated the KO cell lines together with wild type cells and measured CD24 and CD56 levels. Whereas uncommitted ESCs of all three tested lines (WT, MBD2 KO and MBD3 KO) expressed neither CD24 nor CD56, NPCs derived from WT cells expressed CD24 either alone or in combination with CD56 (CD24+ CD56+ double-positive), indicating successful neuronal commitment (Figure 1C).
As expected and in line with previous reports (16), MBD3 KO ESCs failed to differentiate towards NPCs, as indicated by a more than 10-fold reduction in the total number of live cells and of CD24+ CD56+ double-positive cells (Figure 1D). Additionally, we detected a significant decrease in the frequency of CD24+ cells (89% in WT vs. 61% in MBD3 KO) and a significant increase in uncommitted CD24- CD56- cells in MBD3 KO cultures compared to WT (12% in WT vs. 38% in MBD3 KO) (Figure 1E). Furthermore, MBD3 KO ESCs were not able to form terminal neurons (Figure 1F). Unlike MBD3 KO ESCs, MBD2 KO ESCs did not show any noticeable differentiation defects and, similar to WT ESCs, successfully differentiated towards both NPCs and terminal neurons (Figure 1C-F).
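As a minimal illustration of how quadrant frequencies and fold reductions of the kind reported above can be derived from per-cell marker intensities, consider the following Python sketch. The cutoffs, function names and input format are hypothetical and are not taken from the FACS analysis used in this study; real gates would be set on appropriate controls.

```python
QUADRANTS = ("CD24+CD56+", "CD24+CD56-", "CD24-CD56+", "CD24-CD56-")

def quadrant_fractions(cells, cd24_cutoff, cd56_cutoff):
    """Return the fraction of cells in each CD24/CD56 quadrant.

    cells: list of (cd24_intensity, cd56_intensity) pairs, one per live cell.
    The cutoffs are assumed, user-supplied positivity thresholds.
    """
    if not cells:
        raise ValueError("no cells to gate")
    counts = dict.fromkeys(QUADRANTS, 0)
    for cd24, cd56 in cells:
        label = ("CD24+" if cd24 > cd24_cutoff else "CD24-") + \
                ("CD56+" if cd56 > cd56_cutoff else "CD56-")
        counts[label] += 1
    total = len(cells)
    return {label: n / total for label, n in counts.items()}

def fold_reduction(reference_count, sample_count):
    """Fold reduction of a cell count relative to a reference (e.g. WT)."""
    if sample_count <= 0:
        raise ValueError("sample_count must be positive")
    return reference_count / sample_count

# Toy example with four cells and a cutoff of 1 for both markers.
toy_cells = [(5, 5), (5, 0), (0, 0), (5, 5)]
fractions = quadrant_fractions(toy_cells, cd24_cutoff=1, cd56_cutoff=1)
cd24_positive = fractions["CD24+CD56+"] + fractions["CD24+CD56-"]
```

In this sketch the overall CD24+ frequency is the sum of the two CD24+ quadrants, and the uncommitted population corresponds to the CD24-CD56- quadrant.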

The MBD domain of MBD3 is dispensable for neuronal differentiation
To systematically test the functional role of different MBD3 isoforms and mutant MBD3 proteins in regulating neuronal lineage commitment, we expressed MBD3 protein variants in MBD3 KO ESCs from a heterologous site and assessed their capacity to rescue the neuronal differentiation phenotype. The MBD3 proteins were expressed from a constitutive promoter, integrated at the same site in the mouse genome via recombinase-mediated cassette exchange (RMCE) (16,24). Previous reports suggest that all three isoforms are equally capable of promoting lineage commitment (16,29). [...] Figure 1C). The lower levels of MBD3a were not sufficient to fully rescue the differentiation block. The MBD3a low cell line showed a significant reduction in the numbers of live NPCs, CD24+ cells and CD24+ CD56+ double-positive cells (4-fold reduction) and a significantly reduced percentage of CD24+ cells compared to WT cultures (90% in WT vs. 78% in MBD3a low) (Figure 2B-C). Additionally, MBD3a low cells showed a reduced capacity to form terminal neurons (Figure 2D). In sum, these results indicate that MBD3 protein abundance, rather than its MBD domain composition, is crucial for ESC differentiation towards the neuronal lineage.

Full-length MBD2 partially rescues the differentiation block in MBD3 KO ESCs
To understand the role of MBD2-NuRD during lineage commitment we used the same expres- [...] We next wanted to test whether other MBD2 isoforms and variants besides MBD2a are able to partially rescue the differentiation block of MBD3 KO ESCs (Figure 3A). First, we introduced the shorter isoform MBD2b, which lacks the N-terminal stretch of MBD2a, including the repetitive G/R rich region (24). In contrast to MBD2a, this isoform led to a full rescue of the neuronal differenti- [...] Figure 2A-B). Strikingly, the repetitive G/R stretch and the mCpG binding preference of MBD2 seem to prevent a complete redundancy to MBD3.

Neuronal gene expression signatures can be restored upon MBD2 or MBD3 reintroduction in MBD3 KO neuronal progenitor cells
Having shown that MBD2 variants are capable of rescuing the loss of MBD3 function during neuronal differentiation to different extents, we next wanted to obtain a better insight into the [...] Figure 4F).

Discussion
Here we provide a systematic dissection of the different MBD2 and MBD3 isoforms and their protein domains during ESC differentiation. In contrast to other tissues where specific MBD isoforms are present, ESCs express all six MBD2/3 variants (MBD2a,b,t and MBD3a,b,c), which are incorporated into the NuRD complex in a mutually exclusive manner, ultimately forming distinct assemblies with different functionalities (16,25). NuRD plays an essential role during lineage commitment, regulating the exit from pluripotency and enabling proper lineage differentiation (4,16,34,35). Successful ESC lineage commitment therefore serves as a direct measurement of NuRD complex functionality.

By using a well-defined ESC differentiation model towards NPCs and terminal neurons, we showed that while MBD3 is critical for ESC lineage commitment, as previously described (16), this function is independent of its MBD domain and of binding to DNA, irrespective of the CpG methylation state. Surprisingly, we found that MBD2a can partially compensate for the loss of MBD3, leading to the generation of fully differentiated neurons, although at a low frequency. [...] and PRMT5 and influence mCpG-affinity and incorporation of MBD2 into the NuRD complex (9,23).
Taken together, the differences observed for the MBD2 isoforms point to a specialized role of these variants in regulating MBD2-NuRD function. Chromatin remodeling complexes often show protein subunit diversity that conveys a specialized function of particular sub-complexes (1,36). Several studies highlight that NuRD cellular function indeed depends on alternate usage of Mbd2/3, Chd3/4/5 and Mta1/2/3, as MBD2-NuRD but not MBD3-NuRD regulates the fetal hemoglobin switch in adult erythroid cells (37) and different CHD subunits regulate neuronal differentiation and migration with limited protein redundancy (38). The competition between the MBD2 isoforms and MBD3 proteins for other NuRD components results in different assemblies whose functional properties depend on the MBD variant levels present in the analyzed tissue. This can, for example, lead to the presence of incomplete NuRD complexes lacking the GATAD2:CHD:CDK2AP1 chromatin remodeling module, as in the case of MBD2t or of the newly identified component PWWP2A, which replaces MBD2/MBD3 in the NuRD complex to form the so-called NuDe complex (13–15,20). This can also result in differential localization of the NuRD complex to genomic sites based on DNA methylation readout by MBD2. While DNA methylation-dependent localization of MBD2 and localization of MBD3 to unmethylated, active regulatory sites have been reported by multiple groups, other NuRD complex members were predominantly found to localize to the latter, with little overlap with DNA-methylated sites (21,34,35,39–41). It remains to be investigated whether different MBD2 isoforms lead to the assembly of alternative NuRD (sub-)complexes with distinct genomic localization or display NuRD-independent functions. Taken together, our data highlight a more complex role of MBD2 isoforms and domains in ESC lineage commitment than previously anticipated.

Materials and Methods
Cell culture, cell line generation and neuronal differentiation of embryonic stem cells: Mouse embryonic stem cells (HA36CB1, 129×C57BL/6) were cultured as previously described (21). MBD protein expression constructs in pL1-CAGGS-bio-MCS-polyA-1L or pL1-CMV-bio-MCS-polyA-1L were generated previously (21). MBD variants lacking specific domains were generated by subcloning from the initial plasmids using Gibson assembly. Cell lines expressing MBD protein variants in MBD3 KO ES cells were obtained by RMCE as previously described (21). Briefly, RMCE constructs were co-transfected with a Cre recombinase expression plasmid [...] Fisher Scientific). Neuronal differentiation of embryonic stem cells was performed as previously described (27). Microscopy images were taken at 20x magnification using an Olympus CKX31 microscope and a Canon EOS 550D camera. Image contrast was increased for better visualization.
Flow cytometry: For CD24 and CD56 measurements in neuronal progenitors, single-cell suspensions were obtained from neuronal progenitors after 8 days of differentiation, as previously described (27).
Immunoblotting: Crude nuclear extracts were obtained as described in (43). Membranes were blocked with 5% milk or 5% BSA for detection with antibodies or Streptavidin-HRP, respectively. Primary [...]
Poly-A RNA-sequencing and differential gene expression analysis: Total RNA was isolated from NPCs using the RNeasy Plus mini kit (Qiagen [...] (45) and differential gene expression analysis was performed using the edgeR package, with significance set to p-value < 0.05 and log fold change > |1| (46). MA and MDS plots were generated with the plotMD() and plotMDS() functions in edgeR. Heatmaps representing gene expression changes for selected genes, or for all genes differentially expressed between WT and MBD3 KO cells, were generated with the gplots::heatmap.2() function using log2-transformed, normalized CPM counts (prior.count = 1).
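To make the thresholds and the transformation described above concrete, the following Python sketch illustrates the two calculations. It is not the edgeR code used in this study. It assumes that the significance criterion reads as p-value < 0.05 together with an absolute log fold change greater than 1, and that the prior count is scaled by relative library size before the log2 transformation, as we understand edgeR's cpm() with log = TRUE to do. All function names are hypothetical, and library-size normalization factors are omitted for brevity.

```python
import math

def log2_cpm(counts, prior_count=1.0):
    """Log2 counts-per-million with a library-size-scaled prior count.

    counts: list of rows (genes), each a list of raw counts per sample.
    Assumption: the prior is scaled by lib_size / mean(lib_size) and twice
    the scaled prior is added to the library size.
    """
    n_samples = len(counts[0])
    lib_sizes = [sum(row[j] for row in counts) for j in range(n_samples)]
    mean_lib = sum(lib_sizes) / n_samples
    scaled_prior = [prior_count * lib / mean_lib for lib in lib_sizes]
    adj_lib = [lib + 2.0 * p for lib, p in zip(lib_sizes, scaled_prior)]
    return [
        [math.log2((row[j] + scaled_prior[j]) / adj_lib[j] * 1e6)
         for j in range(n_samples)]
        for row in counts
    ]

def is_differential(p_value, log_fc, p_cutoff=0.05, lfc_cutoff=1.0):
    """Apply the assumed criterion: p < 0.05 and |log fold change| > 1."""
    return p_value < p_cutoff and abs(log_fc) > lfc_cutoff

# Toy matrix: two genes, two samples; the second sample is sequenced twice as deep.
toy_counts = [[10, 20], [90, 180]]
toy_log_cpm = log2_cpm(toy_counts, prior_count=1.0)
```

Because the prior count scales with library size in this sketch, a gene with the same relative abundance in two samples of different depth receives the same log2 CPM value, which keeps heatmap rows comparable across samples.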