Galactic ionising photon budget during the Epoch of Reionisation in the Cosmic Dawn II simulation

Cosmic Dawn ("CoDa") II yields the first statistically-meaningful determination of the relative contribution to reionization by galaxies of different halo mass, from a fully-coupled radiation-hydrodynamics simulation of the epoch of reionization large enough ($\sim$ 100 Mpc) to model global reionization while resolving the formation of all galactic halos above $\sim 10^8 M_\odot$. Cell transmission inside high-mass haloes is bi-modal -- ionized cells are transparent, while neutral cells absorb the photons their stars produce - and the halo escape fraction $f_{esc}$ reflects the balance of star formation rate ("SFR") between these modes. The latter is increasingly prevalent at higher halo mass, driving down $f_{esc}$ (we provide analytical fits to our results), whereas halo escape luminosity, proportional to $f_{esc} \times$SFR, increases with mass. Haloes with dark matter masses within $6.10^{8} M_\odot<M_h<3.10^{10} M_\odot$ produce $\sim 80$% of the escaping photons at z=7, when the Universe is 50% ionized, making them the main drivers of cosmic reionization. Less massive haloes, though more numerous, have low SFRs and contribute less than 10% of the photon budget then, despite their high $f_{esc}$. High mass haloes are too few and too opaque, contributing $<10$% despite their high SFRs. The dominant mass range is lower (higher) at higher (lower) redshift, as mass function and reionization advance together (e.g. at z$=8.5$, x$_{\rm HI}=0.9$, $M_h<5.10^9 M_\odot$ haloes contributed $\sim$80%). Galaxies with UV magnitudes $M_{AB1600}$ between $-12$ and $-19$ dominated reionization between z$=6$ and 8.


INTRODUCTION
Current observations are consistent with the hypothesis that the Universe was reionized when UV starlight from massive stars escaped from the early galaxies in which they formed, creating intergalactic H II regions that grew in size and number until they overlapped to fully-ionize the intergalactic medium (hereafter IGM) by z = 6. [For reviews and references, see, for instance, Dayal & Ferrara (2018) and Barkana & Loeb (2007)]. During the epoch of reionization ("EOR"), the globally-averaged ionized fraction was equiva-lent to the volume filling factor of these H II regions, which increased monotonically in an evolving patchwork of fullyionized and fully-neutral zones. How fast this volume filling factor grew was determined primarily by the average balance between the rate of release of ionizing photons by galaxies and the recombination rate of H atoms in their surrounding IGM. The release rate in a given patch depended upon the galaxy formation rate there, the star formation rates (hereafter "SFR") inside each galaxy, the spectra and luminosities of those stars, and the galactic escape fractions f esc of their ionizing photons. The recombination rate in the surrounding IGM depended upon its evolving inhomogeneous density field. All these ingredients varied in space and time in a complex way, not only because structure formation was inhomogeneous but also because reionization and the energy release associated with the star formation that drives it exerted hydrodynamical feedback, so the ingredients were not independent of each other.
To predict their coupled evolution in the context of ΛCDM cosmology, to test the latter against observations, we must model the gravitational and gas dynamics of dark and baryonic matter and the radiative transfer of ionizing radiation in and between galaxies as they form, in a representative volume large enough (∼ 100 Mpc) to capture the large-scale structure of inhomogeneous reionization, and with enough resolving power to form all the galaxies in that volume which contribute to reionization. As the dominant contributors are thought to be the "atomic-cooling haloes" (hereafter "ACHs") -those of virial temperatures above 10 4 K and masses above 10 8 M -this means we must be able to resolve the formation of all the millions of haloes above 10 8 M in that large volume 1 . In principle, to capture the full details of star formation within each galaxy, we would also have to resolve the interstellar medium of each galaxy down to the sub-parsec scale on which molecular clouds fragment into protostars which then collapse into stars. The latter is currently out-of-reach computationally, however, even in the highest-resolution simulations to-date of a single galaxy, so star formation and its local energy release must generally be treated as a "subgrid" process.
We have developed the Cosmic Dawn (hereafter "CoDa") Project, to simulate reionization and galaxy formation together, self-consistently, with fully-coupled, radiationhydrodynamics, on a large-enough scale and with sufficient mass resolution to satisfy these requirements (Ocvirk et al. 2016;Aubert et al. 2018;Dawoodbhoy et al. 2018;Ocvirk et al. 2018). CoDa I (91 Mpc box), described in Ocvirk et al. (2016) and Dawoodbhoy et al. (2018), and CoDa II (94.5 Mpc box), described in Ocvirk et al. (2018), both used the massively-parallel, hybrid CPU-GPU code RAMSES-CUDATON on a uniform grid of 4096 3 cells for the baryons and the radiation field, with 4096 3 N-body particles for the dark matter. CoDa I-AMR (91 Mpc), on the other hand, used another massively-parallel, hybrid CPU-GPU code EMMA (Aubert et al. 2015), with Adaptive Mesh Refinement ("AMR"), with 2048 3 particles on a grid of 2048 3 coarse cells from which AMR increased the resolution locally, by up to a factor of 8, to follow the increasing local overdensity, leading to 18 billion cells after refinement. All three CoDa simulations were in volumes large enough to model the inhomogeneity and globally-averaged time-history of reionization, while also serving to model reionization and galaxy 1 Although the first stars form in minihaloes (hereafter "MHs") (i.e. those with halo mass M< 10 8 M and virial temperatures T< 10 4 K) that can cool gas by H 2 molecular cooling (and some stars may even form in metal-cooling MHs, as shown in Wise et al. (2014)), the rising UV background during the EOR limits their contribution to the earliest stages of reionization. As a result, their relative contribution when compared to the more massive ACHs appears small ( ∼ 10 -20%), (Kimm et al. 2017;Ahn et al. 2012).
formation in the Local Universe, by adopting "constrained realizations" of the Gaussian random noise initial conditions which were derived from galaxy survey data so as to reproduce the familiar structures of the Local Universe, such as the MW, M31, and the Virgo cluster, when evolved forward to z= 0. We refer the reader to the papers cited above to describe our CoDa simulations and their relative differences in more detail. Our purpose here is to use the most recent of them, CoDa II, to find the ionizing luminosities of all the galaxies that formed in it during the EOR, to make the first statistically-meaningful determination of the relative contribution to reionization by galaxies of different halo mass, over the full range of masses that contribute significantly, in a fully coupled radiation hydro-dynamical numerical simulation.
The escape fraction f esc of galaxies is difficult to observe directly, by observing individual galaxies that are bright enough to detect, comparing their fluxes and spectral information at different wavelengths, above and below the H Lyman-limit, and interpreting this in terms of a model in which stars are assumed to have some initial mass function (hereafter "IMF") and a SFR, which determines their spectral energy distribution (hereafter "SED") over time. The radiation that emerges from the galaxy at wavelengths longward of the Lyman limit is then assumed to be a combination of this SED and the nebular emission which results from reprocessing the absorbed fraction of ionizing starlight by the interstellar gas, and may be partially attenuated by internal dust. Starlight emitted blueward of the H Lyman limit is attenuated by photoionizing H atoms in the ISM of the galaxy and possibly attenuated further by dust, reflecting the net absorbed fraction (1 − f esc ), but may also be attenuated by the bound-free opacity of foreground Lyman limit absorbers along the line of sight. Observations of galaxies at different redshifts face different challenges, as they involve different spectral regions depending on z, and the foreground opacity is also a strong function of increasing redshift. As a consequence, observational determinations of f esc are few and still uncertain. A review of this subject is well beyond the scope of this paper; the reader is referred to, e.g., Izotov et al. (2016), and the review by Dayal & Ferrara (2018) and references therein for a summary.
On the theory side, results are also rather limited. Some attempt to derive an empirical f esc , one-size-fits-all, by adjusting its value to make an assumed form and amplitude of the UV luminosity function of galaxies above redshift 6 release enough ionizing photons to finish reionization in time to satisfy various observational constraints, which we refer to as "one-zone" models. Values like f esc = 10 or 20% are sometimes reported, but this depends strongly on the underlying assumptions that led to it. Other attempts determine a global value for f esc in semi-analytical or semi-numerical models of reionization, that are similarly adjusted to match observational constraints, but use the model's own statistical determinations of the rate of formation of galactic halos from cosmological initial conditions and some assumption about the SFR per halo. A variety of galaxy formation simulations also exist which attempt to predict the f esc and SFR from their simulated galaxies, mostly without radiative transfer or only post-processed with radiative transfer, but some which even include fully-coupled radiation-hydrodynamics and, of these, still fewer which attempt to compute the reionization history of some volume which surrounds their galaxies, in the same simulation. Of the latter, those that are able to resolve the internal structure of individual galaxies and their ISM very well are not able to do this for all the galaxies at once, in the very large volumes required to model global reionization properly. A review of this subject, too, is well beyond the scope of this paper; the reader is again referred to Dayal & Ferrara (2018) for a summary and references, but we will describe some of these results in what follows, as we compare with our own.
The relative contribution of different mass halos to the total ionizing photon budget released into the IGM during the EOR depends, not only upon the values of f esc for each galaxy, but on their SFRs and the number density of galaxies of different masses at different redshifts, as well, which combine to produce the ionizing luminosity function of galaxies. In this work, we shall investigate this galactic ionizing photon budget. Since star formation efficiency typically rises with halo mass within the range of masses we represent [not only in CoDa II, but as generally expected, e.g. see Moster et al. (2013) or Legrand et al. (2019)], while the abundance of haloes decreases with halo mass [e.g. see the halo mass functions of Sheth et al. (2001) or Watson et al. (2013)], the mass range of contributing haloes may, in principle, be broad, with a maximum contribution from halos that that, at different redshifts, may be anywhere within the broad range 10 8 M < M halo < 10 12 M .
Previous work on the role of simulated galaxies in ionizing the IGM has sometimes been difficult to reconcile. However, it seems that with the recent advent of higher resolution and of fully-coupled simulations, a few elements of consensus have begun to emerge. Studies such as Anderson et al. (2017) and Yajima et al. (2011) find that reionization is driven by the more numerous, low-mass galaxies (M halo < 10 9.5 M ), which broadly agrees with the conclusions of Kimm & Cen (2014), who find that the photon budget is dominated by masses 10 8 M < M halo < 10 9 M before z=8, after which more massive haloes take over. Similarly, a more recent effort by Katz et al. (2018) seems to favour haloes within the range 10 9 M < M halo < 10 10 M during the EOR, and those of higher mass at z = 6.
Most of the previous simulations are in volumes which are not large enough (most are boxes smaller than 25 Mpc across) to fully represent the halo mass function above 10 10 M . They may therefore be missing some of the largest haloes and galaxies, the ones, in fact, that form the most stars. The contributions of these highest-mass haloes (> 10 10 M ) to the photon budget in these studies is, therefore, partially absent. This could have a further, profound effect, on the rate LyC photons are released from the lowestmass galaxies, the ones that are reionized and suppressed by external sources, as well as dramatically alter the geometry of ionized regions throughout the EOR. Moreover, when reionization is simulated in too small a box, the duration of reionization is too small compared with that found in a volume large enough to capture the globally-averaged history, which can affect the relative importance of halos of different mass as their relative abundances evolve with redshift (Iliev et al. 2006(Iliev et al. , 2014. To overcome these limitations, we will address the photon budget of galaxies during the EOR using the CoDa II simulation (Ocvirk et al. 2018), produced with the RAMSES-CUDATON code (Ocvirk et al. 2016), which couples RAMSES (Teyssier 2002), the code for baryonic hydrodynamics and dark matter N-body dynamics, to ATON (Aubert & Teyssier 2008), the code for radiative transfer of ionizing radiation and non-equilibrium ionization rate equations. The CoDa II simulation ran from z=150 to z=5.8 in a comoving cubic box 94.533 Mpc on a side, with a highenough mass resolution to form every galaxy in that volume with halo mass above 10 8 M . This is sufficient to satisfy the requirement for large enough volume to simulate the EOR and its inhomogeneity, with a statistically-meaningful halo mass function over the full mass range that may contribute significantly to reionization, with the fully-coupled radiation-hydrodynamics (including radiation transport at the full speed of light) necessary to study the release of ionizing starlight into the IGM by galaxies during the EOR and its transport between galaxies involving highly-supersonic ionization fronts. This combination of very large volume with complete sampling of the galactic sources within it represents a necessary compromise. The focus on large scales comes at a cost; we do not attempt to achieve the higher resolution inside galaxies that some other recent simulations do (e.g. Trebitsch et al. 2017;Rosdahl et al. 2018;Kimm et al. 2019), which may affect some of the internal halo physics that are important for our problem, such as the escape of ionizing photons. One of the goals of this study, however, is to demonstrate that, despite our relatively coarser spatial resolution internal to individual galaxies in CoDa II , the "global" halo quantities relevant for describing the radiative properties of high-redshift galaxies, such as their escape fraction and total escape luminosity in ionizing photons, are meaningful and well-captured, thereby validating the CoDa II -like approach, and paving the way towards even larger numerical simulations of the EOR with it in the future.
In this paper, we first, in Sect. 2, outline our numerical approach and computations. Then, in Sect. 3, we lay out our results in terms of escape fractions and photon budget. Finally, in Sect. 5, we summarize our findings and propose some directions in which to take our subsequent efforts.

Cosmic Dawn Simulations
CoDa I and CoDa II are the largest coupled radiation hydrodynamics cosmological grid-based simulations aimed at studying the EoR. In the simulation code RAMSES-CUDATON, the RAMSES hydrodynamics+N-body code (Teyssier 2002) and the ATON radiative transfer code (Levermore 1984;Aubert & Teyssier 2008) are coupled, forming a hybrid code, where hydrodynamics and gravitation are managed by the CPU, while the more computationally intensive radiative transfer, Hydrogen photo-chemistry and cooling are managed by the graphics processing unit (GPU). The resulting acceleration allows us to perform simulations using the full speed of light, thereby circumventing possible artefacts arising from the use of a reduced speed of light (see Gnedin (2016) Wu et al. (2019) for details on the impact of reduced or variable speed of light approximations). The large box size (94.53 cMpc box) and relatively high resolution for a simulation of cosmic reionization (4096 3 dark matter particles and 4096 3 cells) mean that CoDa II can represent the large-scale spatial variance in the reionisation process whilst self-consistently dealing with the physics of the haloes that interest us (ie : those within 10 8 M M halo 10 12 M ), and providing us a huge sample of galactic haloes: there are around 13 million dark matter halos at the end of the EoR in CoDa II.
CoDa II is compatible with a number of observational constraints related to the EoR, most notably the reionization history of the Universe, the Thomson optical depth measured from the cosmic microwave background, and the UV luminosity function of galaxies, as shown in Ocvirk et al. (2018) For further information relating to the code's design, setup, and runs, we refer the reader to Ocvirk et al. (2018).

Halo detection and boundaries
Dark matter haloes (haloes throughout the text) are detected using a Friends-of-Friends algorithm described in Roy et al. (2014), which produces a catalog of haloes with their positions and masses M halo . We can define a halo's virial radius, based on it's mass M halo , as r 200 , as in Ocvirk et al. (2016Ocvirk et al. ( , 2018: whereρ DM is the average cosmic dark matter density.

Halo escape fraction: ray-tracing, sub-grid and net
Using CoDa II's gas density and ionisation fields, we can compute the optical depths encountered by photons emitted in the halo along paths, from their injection to r 200 . For a given halo, and for a given halo cell, we use the heal pix python module heal py to sample the r 200 sphere with 768 evenly distributed end points (We pick this number so as to adequately resolve our largest haloes. Increasing the number of rays by a factor of two only yields a difference of the order of 10 −4 in f ray esc for the most error susceptible computation). We then compute the optical depth along the path from the source cell center to each end point as : where σ E = 2.493 × 10 −22 m 2 is the effective Hydrogen ionization cross-section of the photon group considered here in CoDa II, ρ HI is the neutral Hydrogen physical density of the cell, and dl is an element of length. Finally, the halo escape fraction f ray esc is obtained as the SFR-weighted average of the transmissions of all the emitting cells of the halo, i.e. for a halo containing N cells indexed by the integer i: The star formation rate SFR of a cell is obtained as the stellar mass formed in the last 10 Myr, M stars (< 10Myr) divided by a 10 Myr duration: In the rest of the paper, "star-forming" (halo or cell) always means that some stellar mass has been formed in the last 10 Myr, i.e. M stars (< 10Myr) > 0. The "ray" superscript is used to clarify at all times that f ray esc is obtained via ray-tracing. We will refrain from discussing the ISM (Inter-Stellar Medium) or CGM nature of the absorbing material. Since these are difficult to separate in CoDa II, we prefer the more general term "halo escape fraction" for f ray esc . Furthermore, since CoDa II does not resolve the ISM of our galaxies, CoDa II also uses a sub-grid escape fraction f sub esc = 0.42 (chosen in order to obtain an EoR ending around z=6), which accounts for the photons lost to the star's birth cloud and the ISM, i.e. only 0.42 of the ionising photons produced by a stellar particle is deposited in the cell containing it. We can now also define the net halo escape fraction f net esc as: This is the fraction of a halo's stellar population photon production which manages to reach r 200 . Since f sub esc is constant by construction in CoDa II, we focus mostly in the rest of the paper on the determination and behaviour of f ray esc .

Escape luminosity
To obtain the instantaneous amount of produced ionising photons within a given halo η , we sum the ionising photon production of all emitting star particles (i.e. younger than 10 Myr) within r 200 , i.e.
where the factor 4.32×10 46 is the stellar ionizing emissivity in ph/s/M , taken from Tab. 1 of Ocvirk et al. (2018). We then define the escape luminosity (L esc ) of a halo as the product of the intrinsic luminosity and the net halo escape fraction. This gives the number of ionising photons that reach r 200 per second.
Note that, with these definitions, L esc is, as expected, proportional to SFR and f ray esc . If we interpret r 200 as the limit between the halo and the IGM, L esc is effectively an estimate of the halo luminosity exiting the halo and entering the IGM. Similarly, we can define the cell escape luminosity, by considering η as the photon production within that cell and the average transmission Tr 200 of the paths from that cell to the r 200 sphere of its host halo. For masses 10 9.5 M , the average f ray esc decreases, reaching values of < 10% for M halo > 10 11 M . These smaller values are observed despite massive haloes appearing to heat their r 200 sphere surroundings with SN activity, bringing their neutral fraction down to 10 − 6 and below. Indeed, massive haloes tend to feature dense, neutral, and opaque cores which trap a large fraction of their ionising photon production as seen in Fig. 2. A more detailed investigation of the internal properties of these haloes is performed in Sec. 3.3.
Halo escape fractions are generally higher at low masses, saturating at 1 for all redshifts. The elongated vertical feature of the distribution of haloes at M halo ≈ 9 × 10 8 M , with f esc values lower than 10 −1 , and at z=10, 8, 7 (top left, top right, bottom left panels of Fig. 1) is populated by haloes in which a star has formed recently, and in which the haloes' expanding HII regions have not yet fully reached the halo boundary at distance r 200 . Fig. 3 shows the neutral fraction in a plane containing such a halo, illustrating this case.
Comparing the panels of Fig. 1 shows that the extended vertical distribution attributed to haloes in which stars have recently formed progressively disappears between z=10 and z=6, owing to the ionisation of the material of the lower mass haloes by the combination of local UV production, supernovae energy injection, and outside radiation affecting these haloes, thereby rendering them more and more UV transparent within r 200 .
There is an intrinsically high scatter in the escape fractions. It is due to the wide range spanned by the properties of the halo population such as the maximum central density, the gas density profile, and the individual accretion history of the halo. These different properties are well sampled thanks to CoDa II very large size and therefore abundant halo population.
Close examination of the density maps reveals discontinuities around 5.10 8 M and 10 9 M in the highly populated red/orange areas. These are due to resolution effects. In the case of our less massive haloes, the number of cells that Figure 2. X HI (colour) centred in a plane containing an example high mass halo (≈ 10 10 M ) at z = 6. Pink symbols denote active (younger than 10 Myr) stellar particles. Notice the central, neutral region (Several cells are close to fully neutral) that exhibits high star formation. The surrounding ionised region is super heated (T >> 10 5 K) and is generated by supernova explosions and accretion shocks. The circle represents r 200 . The gas in the halo center appears to be connected to several filament-like structures (with 10 −4.5 X HI 10 −3 ).
Figure 3. X HI map (coded by colour) centred on a plane containing an example halo that has just formed a stellar particle at z = 8. Ionising radiation hasn't yet reached the surfaces where its flux is sampled. Notice the X HI gradient generated by an incoming UV front in the lower left. The symbols are as in Fig .2. represent them can be small, therefore increases in r 200 can affect the resulting f ray esc . Fig. 4, shows the average escape fractions as a function of mass for five epochs, z=6, 7, 8, 10, 14.9. Again, we see high f ray esc values for low mass haloes, and a negative mass trend from ≈ 10 9 M onwards. There is a clear evolution with redshift for haloes M halo 10 10 M : the f ray esc average increases with decreasing redshift, reaching f ray esc ≈ 1 at z=6. Indeed, for a fixed halo mass, higher redshift haloes are denser than their lower redshift counterparts, leading to lower escape fractions. The average behavior of f ray esc for haloes 10 11 M is unclear as the number of such objects is small.

Comparison with the literature
The most prominent feature of Fig. 1 is a decrease of escape fraction with mass, in agreement with the literature investigating the escape fraction of ionizing radiation in high redshift galaxies from numerical simulations (Razoumov & Sommer-Larsen 2010;Yajima et al. 2011;Wise et al. 2014;Paardekooper et al. 2015;Katz et al. 2018;Kimm & Cen 2014), although the slope and extent of the decrease may vary, as can be expected given the range in resolution and modelling of these studies. For dark matter haloes of 10 11 M , the simulations of ? yield an escape fraction of 7%, also in rather good agreement with our results.
The other striking feature of our results is the evolution of halo escape fraction with time, in particular at masses below 10 10 M , showing that in CoDa II, such star forming haloes tend to be more opaque at higher redshifts. For a given halo mass, haloes tend to be denser at higher redshift, and are therefore more likely to yield higher optical depths. A similar evolution, with escape fractions decreasing at higher redshifts, is seen in Kimm & Cen (2014), although Razoumov & Sommer-Larsen (2010) report an opposite evolution, perhaps due to their modelling of dust, which is not accounted for in CoDa II (their dust UV optical depth scales linearly with cell density and metallicity, the latter is expected to increase on average over the course of their simulation).
3.3 What drives the decrease of escape fractions with mass?
The main trend in all escape fraction plots is a decrease with increasing mass. Since halo escape fraction is determined by density and neutral hydrogen fraction, we want to investigate these particular properties. We show the distributions of the gas properties of star forming cells in Fig. 7, and their contribution to the SFR and escape luminosity of their halo mass bins.
The top left panel of Fig. 5 shows the distribution of cell transmissions weighted by their SFR for three representative mass bins, normalised by the total SFR in each mass bin (so that the integral of each histogram is 1). The star forming cells of the lowest mass bin are concentrated around high transmission values, as observed in individual haloes and yielding the low mass plateaus of f ray esc values close to 1. There is a stark difference with the intermediate mass bin, where star-forming cells present a much wider spread of transmission values. The distribution is dominated by cells with intermediate opacity, and peaks at a transmission of ∼ 0.3. For the most massive mass bin, however, the dispersion in transmissions is even higher. Between a few times 0.1 and 10 −2 the distribution is almost flat, and it extends to extremely opaque cells with transmissions as low as 10 −6 and below. There are very few star forming cells with a transmission of 1. This implies that most of the absorption of UV photons within our most massive haloes occurs within the cells containing the sources. We remind the reader that this is the distribution of the transmission of halo cells, i.e. 1 value per cell, and not the distribution of the halo-averaged (which would yield 1 value per halo). This is why the distributions extend much lower than in Fig. 1, which show only halo-averaged values.
We recompute this histogram, weighing this time by the cells' escape luminosity L esc . The result is shown in the top right panel of Fig. 5. This allows us to quantify the contribution of cells to the total escape luminosity of haloes. Unsurprisingly, low mass haloes' escape luminosities originate from high transmission cells. For intermediate and high mass haloes, the situation is slightly more contrasted: while most of the escaping photons originate from high transmission cells, a small fraction (10-15%) are actually produced by moderately opaque regions with transmission below 0.1.
In order to gain deeper insight into the properties of UVbright and UV-dark cells, we further examine their physical properties.
The middle left panel of Fig. 5 shows the distribution of cells' gas density for our three representative mass bins, weighted by their SFR, and normalised by the total SFR in each mass bin. The mode of the distribution shifts to higher densities with increasing halo mass, and the density of star-forming cells in the high mass bin extends up to 10 H/cm 3 . This is in stark contrast with the right panel of Fig.  5, which shows the same distribution, but this time weighted by the cells' escape luminosity L esc . The distributions for the 2 most massive mass bins are sharply truncated at ∼ 0.3 H/cm 3 , therefore showing that in CoDa II, all stars forming at a density larger than this do not contribute to the escape luminosity of their host halo because their cell transmission is too low.
The bottom left panel of Fig. 5 shows the distribution of cells neutral hydrogen fractions for our three representative mass bins, weighted by their SFR, and normalised by the total SFR in each mass bin. For the lowest mass bin, the neutral fraction distribution has one strong peak at just over 10 −4 , which explains their high transmission. The value of ≈ 10 −4 reflects the ionisation equilibrium for a cell of CoDa II resolution at z=6, with an over-density of δ ≈ 50 that contains a single emitting stellar particle.
In the case of the 10 10−11 M mass bin, haloes' cells are split more or less evenly between ionised and neutral, with two peaks, one centred around 10 −3 and the other one around a few times 10 −1 in neutral fractions.
A possible origin for the binary nature of the distribution may be the rapidity with which dense star forming cells ionise and recombine. In this context, cells would jump very quickly from one peak to the other, depending on their star formation activity or lack thereof, and spend very little time between the peaks.
The star forming cells of the highest mass bin are neutral in majority. Though the distribution presents a small ionised peak as well, the high neutral fraction peak is hugely dominant (about 90% of the SFR takes place there) when compared to the ionized peak.
The bottom right panel of Fig. 5 shows the same distribution, but this time weighted by the cells' escape luminosity L esc . It shows the contribution of the cells in terms of escaping photons. For low mass haloes, the uni-modal distribution is almost unchanged with respect to the left panel. However, for the two higher mass bins, this different weighing has a strong impact on the relative modes of the distribution: indeed, the high neutral fraction mode of the distribution has completely vanished, so that only the high ionisation mode remains (X HI = 10 −3 and below), showing that for these high halo masses, escaping ionising photons originate predominantly from strongly ionised regions.
Another striking aspect of the bottom right panel Fig.  5 is the existence of a tail of the distribution, extending to cells with very high ionisation (neutral fractions as as low as 10 −4 − 10 −7 ) and therefore completely transparent. This tail is completely absent for the low mass bin. By examining maps of the physical properties of haloes, such as Fig. 2, we found that such high ionisation within high mass haloes is typical of shocked regions. These shocks can either be accretion shocks as seen in Ocvirk et al. (2008Ocvirk et al. ( , 2016, or shocks due to supernovae explosions, or a combination of both. Indeed, several studies have shown supernova feedback to play an important role in the escape of ionising photons (Trebitsch et al. 2017;Kimm & Cen 2014), and we interpret this very high ionisation tail in our distribution as another manifestation of this effect.
Examining Fig. 2 in the light of these results also allows us to locate the regions contributing to the escape luminosity of the halo. The cells of the low ionisation peak, at X HI ≥ 0.1, belong to the opaque neutral central core of the halo, from which no ionising photons escape. The high ionisation mode at X HI ∼ 10 −3 and below, consists of cells located along the accreting gas filaments, up to the virial radius. Finally, the star forming cells in the very strong ionisation tail of the  distribution, at X HI < 10 −5 − 10 −6 are located in regions heated by supernova feedback.

Escape luminosities
Using our determinations of halo escape fractions we can now compute the halo escape luminosities as in Eq. 7. We show the resulting L esc as a function of halo mass and their evolution with redshift in Fig. 6. The escape luminosity increases with halo mass, despite the decrease of f esc . Indeed, at z=6 for instance, f esc decreases roughly as M −0.34 halo ∼ M −1/3 halo , as shown in Appendix B, whereas halo SFR increases as M 5/3 halo , and therefore the product L esc ∝ M 4/3 halo increases with halo mass. At the low mass end, L esc flattens as we reach the quantization limit of star-formation in the simulation: between 10 8 − 10 9 M , star-forming haloes host only one emitting star particle of the same elementary mass M = 11732M . The evolution of L esc with redshift reflects that of the escape fraction f esc . The impact of star formation suppression by radiation feedback is not readily seen in the solid lines of the figure because they represent the average L esc of star-forming haloes only. However, the dot-dashed lines shows the average L esc for all haloes, where the suppression-driven decrease of SFR with redshift leads to lower L esc at low redshifts below 10 9 M .

RESULTS: PHOTON BUDGET
We now turn to investigating the contribution of haloes of different masses and different M AB1600 to cosmic reionization in CoDa II.

Photon budget versus mass
We sort the CoDa II dark matter halo population into 40 logarithmic mass bins between 10 8 M and 10 12 M . We define the total escape luminosity of a given mass bin as the sum of the escape luminosity of all haloes within that mass bin. This quantity depends on the total number of haloes in the CoDa II volume, and we wish to conduct our study using a quantity independent of simulation box size, to ease comparison with future semi-analytical models and simulations. Hence, we further define the escape emissivity, as the total escape luminosity of a given mass bin divided by the simulation volume.
The left panel of Fig. 7 shows this escape emissivity as a function of halo mass, at 5 epochs (full lines). This represents the cosmic ionizing photon budget for the CoDa II simulation, i.e. the distribution of the contributions of each mass bin to the total rate of photons reaching the IGM and driving reionization, for a 1 cMpc 3 volume. The contribution to the cosmic escape emissivity culminates around ∼ 10 9.5 M between z=6 and z=8 and decreases at lower and higher masses. This leads to a photon budget that is dominated by haloes of a few times 10 8 up to a few times 10 10 M between these redshifts.
In order to quantify the impact of the smaller escape fraction of massive haloes on the photon budget, we also show in the left panel of Fig. 7 the intrinsic photon production budget, i.e. proportional to the SFR of haloes (dotted line), as compared to the escaping photon budget. At z=6, for instance, intrinsic photon production peaks at 10 10.5 M . This is the result of two opposite trends: SFR increases with increasing halo mass, whereas the number abundance of haloes decreases, as dictated by the halo mass function. This competition yields 10 10.5 M haloes as the foremost contributors to the total cosmic star formation at z=6. However, their contribution to the escaping photon budget is strongly affected by their low escape fractions, to the point that, even though they dominate all other haloes in terms of SFR, they are out-shined in total escape emissivity by the 10 9 −10 10 M halo population.
To further compare the contribution of various mass bins, the right panel of Fig. 7 shows the cumulative version of the photon budget. It allows us to directly read from the plot that at z=6, for instance, haloes within 2.10 9 M M halo 10 10 M produce around 50% of all ionising photons. The shape of the cumulative photon budget is rather similar at all redshifts. However, the cumulative distributions shift towards lower masses as redshift increases, by about half a decade between z=6 and z=10. This is also seen as a shift of the peak of the distribution (left panel) between z=6 and z=8. This shift reflects the buildup and evolution of the halo mass function towards more abundant and more massive halo populations. At all redshifts but the highest, the cumulative distribution has its 10% and 90% levels separated by about 1.5 decades in mass, meaning this range is responsible for 80% of the ionizing photon budget. In CoDa II, the ionized volume fraction goes from 20% to 100% between z=6 and 8, and it is ∼ 50% at z=7. We therefore retain z=7 as the epoch most representative of "ongoing" reionization and detail the photon budget for this redshift: we read from the cumulative distribution that the photon budget at this epoch is dominated by galaxies with dark matter masses Figure 7. Left : escape emissivity as a function of halo mass for several CoDa II epochs. The displayed emissivity is the sum of all haloes' escape luminosities in the mass bin considered at that redshift, divided by the simulation volume. There are 10 mass bins per mass decade. Full lines correspond to the escape emissivity whereas dashed lines correspond to the intrinsic (unabsorbed) emissivities (i.e. derived directly from the star formation rate). Right : Cumulative version of left panel.
of 6.10 8 − 3.10 10 M , which produce 80% of the ionizing photons reaching the IGM. They are therefore the main drivers of cosmic reionization.

Photon budget by M AB1600
The luminosity function of CoDa II haloes was shown to be in good agreement with observations in Ocvirk et al. (2018). Here we use the halo magnitudes to recast our photon budget analysis into a M AB1600 scale instead of the halo mass scale. Fig. 8 shows the total ionising photon contribution as a function of M AB1600 (left), as well as the equivalent cumulative distribution (right). The shape of the distribution and its temporal evolution are similar to that obtained as a function of mass (Fig. 7). This is a direct consequence of the SFR -halo mass relation shown in Ocvirk et al. (2018), which drives a M AB1600 -halo mass relation.
Combined, these figures illustrate that the main contributors to reionisation lie within a magnitude range of around -12 to about -19. Which is in broad agreement with the semi-numerical results of Liu et al. (2016) for z>7. More specifically, the 80% escape luminosity range (reading the 10%-90% levels of the cumulative escape luminosity distribution function) is M AB1600 = [-13,-19] at z=6, and [-12,-17] at z=8. This shift with redshift is due to the buildup of the galaxy mass function.
This suggests that very deep surveys in cluster fields, such as Bouwens et al. (2017) and Atek et al. (2018), if reliable down to M AB1600 = -13, are indeed starting to see the bulk (>80%) of the population driving cosmic reionization at z=6. However, at z=6, cosmic reionization is already finished in CoDa II, and in the middle of reionization, i.e. at z=7, the galaxies seen by Bouwens et al. (2017) are brighter than M AB1600 = -17, and Fig. 8 shows that these can only account for ∼ 20% of the ionizing luminosity. In order to see the bulk of galaxies driving reionization when it is in full spin, surveys would need to achieve a similar M AB1600 =-12 depth at z=7.
Finally, early reionization at z=15 is driven by galaxies of a very narrow range of halo masses, 10 8 − 10 9 M , corresponding to magnitudes fainter than M AB1600 =-14, out of reach of current and future planned observatories at these redshifts. Katz et al. (2018) uses tracers and RAMSES-RT RHD simulations to study the contribution of haloes to the cosmic ionising luminosity. Although their technique differs in many respects from ours (we do not use such tracers, they use variable speed of light and adaptive mesh refinement), their results are in rather good agreement with ours. At z=8, they find that 70% of the ionising luminosity is produced by haloes of 10 9 M M halo 10 10 M (from their Fig. 6). We can read directly from Fig. 7 that at the same redshift, this mass range is responsible for 60% of the ionising photons in CoDa II. However, at z=6, Katz et al. (2018) find that high mass haloes of 10 10 M produce the majority of ionizing photons (60%), while we find that they account for only 40 % of the ionising luminosity, i.e. the largest contribution originates from haloes less massive than 10 10 M in CoDa II. Nevertheless, if we consider the large temporal fluctuations in their results, on the 10-20% level, their results are not severely inconsistent with ours. At higher redshifts, though (z>12), the Katz et al. (2018) values fluctuate too much for a meaningful comparison, due to the smaller size of their simulation domain, yielding a smaller sample of galaxies. Also, because of their smaller box size, their sample is devoid of haloes more massive than 10 11 M even at z=6, unlike in CoDa II, where such massive haloes are present already at z=10. However, their low number density and escape fractions prevent them from contributing significantly to cosmic reionization: their total escape luminosity is less than 10% at all redshifts, which is why our results are in fair agreement with Katz et al. (2018), even though they do not include these high mass haloes. Yajima et al. (2011) find that haloes below M halo 10 10 M contribute about ≈ 75 % of the ionizing luminosity at z=6 (summing the 2 lowest mass bins of the z=6 panel of their Fig. 12). Again, this is in reasonable agreement with our findings, although at this redshift the largest contribution (45%) comes from haloes in the mass range 10 9 − 10 9.5 M , while the contribution of this mass range is about 2 times smaller in CoDa II. This discrepancy could be due to the lack of radiative feedback on the SFR of low mass haloes in (Yajima et al. 2011): indeed, their study performs RT as post-processing whereas radiation and hydrodynamics are fully coupled in CoDa II, which mitigates star formation in low mass haloes, as shown in Ocvirk et al. (2018), Dawoodbhoy et al. (2018) and similarly in Wu et al. (2019); Ma et al. (2018), and therefore intrinsically reduces their contribution to cosmic reionization. Finally, we address the possible dependence of our results on spatial resolution. We show in appendix A that indeed, increasing spatial resolution in RAMSES-CUDATON by a factor of 4 may yield lower halo escape fractions by a factor of 2, while retaining a similar slope. Such a global rescaling over all masses, without changing the slope of the f esc -halo mass relation, should not dramatically affect our results on the photon budget and the predominant halo mass scale driving reionization, because it does not alter the relative contributions between halo mass bins.

CONCLUSION
We use the CoDa II fully coupled RHD simulation of the EoR to study the photon budget of galaxies during the EoR. To do so, we start out by investigating the escape fractions of CoDa II galaxies. We find that the halo escape fraction (i.e. the fraction of ionizing photons produced by the halo's stars reaching the virial radius) is a decreasing function of halo mass.
To gain insight into the evolution of the f ray esc with halo mass, we examine the properties of the halo cells as a function of their star formation rates and escape luminosity. We find that for intermediate and high mass haloes, the neutral fractions of star forming cells exhibit a strongly bi-modal distribution, with a neutral mode and an ionised mode at X HI ∼ 10 −3 . The neutral mode is completely opaque, meaning that escaping ionising photons originate from the star forming, ionised regions of the haloes. The halo escape fractions we obtained closely reflect the distribution of the star forming cells between the neutral opaque mode and the ionised, transparent mode. For instance, CoDa II high mass haloes (10 11 M ) have an average halo escape fraction of ∼ 10% because 90% of their young stars reside in central, dense, fully opaque regions, while the remaining 10% of their young stars reside in transparent regions allowing their photons to escape.
Moreover, we find a slow evolution of the halo escape fraction with redshift: haloes of a given mass are more opaque at higher redshift. This is due to the fact that for a fixed mass, haloes at higher redshifts tend to be more concentrated than their low redshift counterparts.
In Appendix B, we provide a functional form fit to our average halo escape fraction results, so as to allow its use in semi-analytical models of the EoR such as 21cmFAST (Mesinger et al. 2011) or Fialkov et al. (2013. We then use the halo escape fractions of our haloes to investigate the contributions of galaxies of various masses to the total ionising emissivity during the EoR. We show that CoDa II galaxies within 3 × 10 10 M M halo 6 × 10 8 M produce about 80% of all the ionising photons reaching the IGM at z=7, which is the middle of reionization in CoDa II (x HI = 50%). They can therefore be considered as the main drivers of cosmic reionization, although, at z=6 (8), the mass range accounting for 80% of the photon budget is slightly more (less) massive, by 0.25 dex.
The foremost mass range reionizing the Universe emerges as the result of a competition between the different processes exposed throughout this paper, and can be summarised as follows : the numerous low mass haloes are too inefficient at forming stars to contribute significantly despite their high halo escape fractions, whereas the high mass haloes are too few and have escape fractions that are too low to contribute significantly, despite their high star formation rate. As a consequence, the low mass end (below 5.10 9 M ) and the high mass end (above 5×10 10 M ) contribute respectively only less than 10% each to the total ionising photon budget between z=8 and z=6.
Our results are in reasonable agreement with the (not exhaustive) literature reviewed, despite a number of differences in numerical treatment, and assumptions on stellar populations and their feedback, which explain the deviations from our results.
Ideally, we would like to follow up on our study by pushing future CoDa II-like simulations to higher spatial resolutions, possibly using AMR to provide a better description of the ISM, and its processes and if possible rely less on a sub-grid escape fraction, as well as improved physics such as chemical enrichment, AGN mechanical and radiative feedbacks, and their possible contribution to reionization. The continued growth of supercomputers, thanks to hybrid nodes mixing many-core CPUs and GPUs, may allow us to get there in the near future, provided we overcome a number of technical hurdles related to code architecture and optimization.

ACKNOWLEDGEMENTS
This study was performed in the context of several French ANR (Agence Nationale de la Recherche) projects. PO acknowledges support from the French ANR funded project ORAGE (ANR-14-CE33-0016). ND and DA acknowledge funding from the French ANR for project ANR-12-JS05-0001 (EMMA). ITI was supported by the Science and Technology Facilities Council (grant numbers ST/F002858/1 and ST/I000976/1) and the Southeast Physics Network (SEPNet). JS acknowledges support from l'Oréal-UNESCO "Pour les femmes et la Science" and the "CNES (Centre National d'études spatiales)" postdoctoral fellowship programs. KA was supported by NRF (Grant No. NRF-2016R1D1A1B04935414). GY acknowledges financial support by the MINECO/FEDER under project grant AYA2015-63810-P and MICIU/FEDER under project grant PGC2018-094975-C21. PRS was supported in part by U.S. NSF grant AST-1009799, NASA grant NNX11AE09G, NASA/JPL grant RSA Nos. 1492788 and 1515294, and supercomputer resources from NSF XSEDE grant TG-AST090005 and the Texas Advanced Computing Center(TACC) at the University of Texas at Austin. The CoDa II simulation was performed at Oak Ridge National Laboratory/Oak Ridge Leadership Computing Facility on the Titan supercomputer (INCITE2016 award AST031). Processing was performed on the Eos and Rhea clusters. Auxiliary simulations were performed at pôle HPC de l'Université de Strasbourg (méso-centre). The simulations used for the resolution study were performed on CSCS/Piz Daint (Swiss National Supercomputing Centre), as part of the "SALT: Shining a light through the dark ages" PRACE allocation obtained via the 16th call for PRACE Project Access (project id pr37). A series of test simulations for the initial conditions of CoDa II were performed at LRZ Munich within the project pr74no. This work made use of v2.1 of the Binary Population and Spectral Synthesis (BPASS) models as last described in Eldridge et al. (2017).

APPENDIX A: RESOLUTION STUDY OF f ray esc
We have shown that the main features of our f ray esc values are that they decrease with increasing halo mass above a certain mass scale, with haloes of 10 9 M having high f ray esc values, and increase slightly with the progress of the EoR. Since this first trend is driven by the presence of dense, neutral and UV opaque cells in haloes, one may expect that increasing the number of resolution elements (and thus within the same volume and same total mass, increasing the maximum possible density) could affect the exact relation between f ray esc and halo mass, as well as the mass scale at which the break between trends occurs. In order to investigate this, and to test of the robustness of our previously presented results, we proceed to study the resolution convergence of f ray esc . We ran a series of high resolution simulations using RAMSES-CUDATON : we used a 4 cMpc.h −1 sided box, with 1024 3 resolution elements (ie : 4 times the spatial resolution of CoDa II, i.e. 64 times better mass resolution). We will call it high-res from now on. We performed our previous analysis on the high-res simulation in order to test the convergence of f ray esc in CoDa II. Although these boxes were run with the same code, the physical setup was different to that of CoDa II in the following ways: the initial conditions are necessarily different because it uses a smaller box. Moreover, because of the small box size of the high-res sim, it contains much fewer star forming haloes of a given mass than CoDa II, yielding noisy measurement, making comparison challenging.
The left panel of Fig. A1 shows f ray esc as a function of halo mass for the high-res simulation. This high resolution case also presents high f ray esc values for low mass haloes, as well as decreasing f esc with halo mass. Moreover, f ray esc increases on average with time for low mass haloes, as in CoDa II. The right panel presents a direct comparison of these averages with the fitting formula for CoDa II at the same redshifts where appropriate (10 z 6, the domain of validity for our fitting formula given in B1).
However, as anticipated, there are differences. In the high-res box, the average f ray esc is lower for all masses than in CoDa II, and the slope with mass is slightly more pronounced. However, there is a large scatter around the average of the high-res dataset. Indeed, there are only a few hundred star forming haloes of all masses in the high-res box at z=5.7. Reassuringly, the difference between the two boxes is akin to a global re-normalisation of the average f ray esc . Therefore, while numerical resolution may change the absolute L esc of a given halo mass bin, it is not likely not change the relative balance between mass bins in the photon budget, which is our main result.

APPENDIX B: ESCAPE FRACTION FITS
In this section, we propose simple functional forms for the both the average of f ray esc and the SFR-weighted average of f ray esc as a function of halo mass and of redshift. These could be useful as a f ray esc model for projects in which the determination of f ray esc is impossible or difficult : either when working with simulated data that wasn't produced with fully coupled radiation-hydrodynamics simulation codes, performing radiative transfer in post-processing, or with simulations with lower spatial resolution than CoDa II, and also for semianalytical models of the EoR (Mesinger et al. 2011;Fialkov et al. 2013).
In all cases, we caution the reader that the functions presented here represent the halo escape fraction, i.e. in order to obtain the net halo escape fraction of haloes, f net esc , one needs to multiply f ray esc by f sub esc (0.42), as show in Eq. 5.

B1 Average f ray esc
Based on the trends presented by the average values of f ray esc as a function of mass and of redshift between z=6 and z=10 shown in Fig. 4, we opted for a power law of halo mass, which becomes steeper with decreasing redshift, saturating at 1, and using z=7 and 10 11 M as pivot in redshift and mass. We propose the following functional form (Eq. B1): where β = −0.34, and z 7 = z/7. Where f ray esc,11 and M 11 are picked as the f ray esc , and M halo values that represent the point were average curves from z=6 through to z=10 seem to merge. z 7 is chosen to be 7.
The slope variation parametered by β was determined by adjusting the value by hand, since we want to represent the average curves and not the whole dataset (in which case we would perform a fit).
Due to noisy appearance of the data for M halo > 10 11 M , the time behavior of f ray esc within this mass range is unclear, which is why we chose to simply model the evolution of f ray esc with mass as the extension of the f ray esc (M halo , z = 8) fit, in order to produce indicative values despite the high amount of noise in the average curves for these masses.
The average curve corresponding to z=15 behaves somewhat differently. It's mass slope seems to be very close to that of the z=10 average curve, but it is offset in escape fraction by ∼ 0.25 dex. Hence, we cannot use the same simple fitting routine and formula for this redshift. Instead, we tentatively provide the following fit (Eq. B2) for f ray esc for 15≥z>10.
Where gamma was adjusted by hand to γ = 1.64 in to order to achieve plausible values.
thick lines), where they are also compared to the mass bin averages of f ray esc (full, thin lines).
B2 SFR weighted average f ray esc We also provide fits for the SFR-weighted average values of f ray esc as a function of mass and of redshift between z=6 and z=15. As can be seen in Fig. B2, the SFR-weighted average of f ray esc has a slightly different behaviour. To fit it, we opted for a double power law of halo mass, with a knee separating both laws, that shifts with redshift. We propose the following functional form (Eq. B3): f esc = f knee esc (z)f knee (M halo , z); if M halo ≥ M knee (z) , f max esc (z)f max (M halo , z); if M halo < M knee (z) , = min f 0,max esc z β 6 , 1.0 , f knee esc (z) = min f 0,knee esc z γ 6 , 1.0 , ∆(z) = log 10 (f knee esc (z))−log 10 (f max esc (z)) log 10 (M knee (z))−log 10 (10 8 ) , M knee (z) = 3 × 10 9 z f max esc (z) is the maximum value below M knee , it is f 0,max esc at z=6, and evolves as z β . f knee esc (z) is the where both power laws join at M knee (z), it is f 0,knee esc at z=6, and evolves as z γ . ∆(z) is the slope of the power law below M knee (z), it is defined so as to reach f max esc (z) when M=10 8 M . M knee (z) gives the mass where the power laws join, it is 3 × 10 9 M at z=6, and evolves with z as z ζ . Figure B2. SFR-weighted average escape fractions (thin lines) of star-forming haloes and corresponding fits (thick lines) at z=6, 7, 8, 10, 15. f max (M halo , z) accounts for the mass slope for M halo < M knee (z) given by ∆(z). f knee (M halo , z) accounts for the mass slope for M halo ≥ M knee (z) given by δ.
Since we want to reproduce the behavior with mass and with redshift of the SFR weighted averages of f ray esc , and not model the full distribution of points, we again adopt the simple approach of adjusting the fit by hand (as opposed to computing the fit of the model to the data sample).
The adjusted values are summarized in Eq. B5 below.
The aforementioned fits are presented in Fig. B2 (full, thick lines), where they are also compared to the SFRweighted average f ray esc measure in CoDa II (full, thin lines).
APPENDIX C: COMPUTING f ray esc Fig. C1 shows a simplified, explanatory drawing of the computation process for f ray esc for an individual star forming halo cell. Figure C1. Explanatory drawing of the computation of f ray esc . For each cell containing an emitting particle, we trace 768 rays from it's centre to a sphere of radius r 200 centred on the halo centre given by the fof halo finder. By averaging this result for every emitting cell and weighting by SFR, we obtain the halo value f ray esc at r 200 .