Abstract

The variability and trend of Arctic sea ice since the mid-1970s are well documented and linked to rising temperatures. However, much less is known for the first half of the 20th century, when the Arctic also underwent a period of strong warming. For studying this period in atmospheric models, gridded sea ice data are needed as boundary conditions. Current data sets (e.g., HadISST) provide a historical climatology, but may not be suitable when interannual-to-decadal variability is important, as they are interpolated and relaxed towards a (historical) climatology to fill in gaps, particularly in winter. Regional historical sea ice information exhibits considerable variability on interannual-to-decadal scales, but is only available for summer and not in gridded form. Combining the advantages of both types of information could constrain model simulations in a more realistic way. Here we discuss the feasibility of reconstructing year-round gridded Arctic sea ice from 1900 to 1953 from historical information and a coupled climate model. We decompose sea ice variability into centennial (due to climate forcings), decadal (coupled processes in the ocean-sea ice system) and interannual time scales (atmospheric circulation). The three time scales are represented by a historical climatology from HadISST (centennial), a closest analogue approach using the coupled control run of the CCSM-3.0 model (decadal), and a statistical reconstruction based on high-pass filtered data (interannual variability), respectively. Results show that differences in the model climatology, the length of the control run, and inconsistent historical data strongly limit the quality of the product. However, with more realistic and longer simulations becoming available in the future as well as with improved historical data, useful reconstructions are possible. We suggest that hybrid approaches, using both statistical reconstruction methods and numerical models, may find wider applications in the future.


Introduction
Arctic sea ice is a key variable in the climate system. It responds to changes in both atmospheric and oceanic circulation and affects the amount of coupling between them through the surface energy balance. As a consequence, Arctic sea ice is involved in feedback processes, some of which act to amplify global warming trends in the Arctic. According to climate models, future Arctic warming is about 3 to 4 times the global average (Holland and Bitz, 2003) and Arctic sea ice might change abruptly (Holland et al., 2006).
Since about the 1970s, Arctic sea ice has been decreasing rapidly, in line with a strong warming of Arctic surface air temperatures (Serreze et al., 2007; Stroeve et al., 2008). In order to better understand the ongoing and future warming, it is helpful to look at the past (see also Goosse et al., 2007), in particular the early 20th century Arctic warming. Measurements suggest that annual mean surface air temperature increased by about 1.8 °C north of the polar circle between 1920 and 1950 (Polyakov et al., 2003a). The underlying mechanisms are not fully understood (Overland and Wang, 2005), but regional feedbacks involving sea ice could have played a role. Climate models could help to address the triggers and feedbacks operating during this warming period, but one limiting factor is the availability of sea ice information for forcing atmospheric models (GCMs).
The standard sea ice boundary condition used in GCM simulations is the Hadley Centre HadISST data set (Rayner et al., 2003). In its early part (for the Arctic), this data set is mainly based on the compilations by Walsh (1978) and Walsh and Chapman (2001), denoted W&C hereafter. Because of the sparseness of observations prior to 1953, HadISST sea ice is strongly interpolated and relies on climatologies in many processing steps (see Rayner et al., 2003). Regional historical sea ice series are also available (Polyakov et al., 2003b; Johannessen et al., 2004), denoted J&P hereafter. These series suggest more variability prior to around 1950, including a marked decrease during the early 20th century in some areas. Although the quality of some of the historical information remains to be confirmed, it would be beneficial for model studies to combine the advantages of both types of data, i.e., to produce a historical data set that is spatially complete, covers the whole year, and has a realistic amount of variability. Kauker et al. (2008) model Arctic sea ice during the 20th century by constraining their model with reconstructed atmospheric forcing fields. This approach yields promising results, but is not suitable for applications where independence of atmospheric data is desired.
The aim of the present study is to explore an approach for reconstructing Arctic sea ice cover from 1900 to 1953 based on statistical reconstruction in combination with a coupled climate model. The goal is to stimulate discussions on the methodology of such a hybrid approach, which undoubtedly must be further improved in the future.

Historical sea ice information is taken from J&P and W&C. J&P give the sea ice extent for different Siberian marginal seas for each year in August. The estimates are based on ship observations in the early decades. From 1929 on, regular aircraft observations were carried out. However, regular (ship and aircraft) observations of the Kara and Chukchi seas started only in 1932. Gaps that occurred during World War II were interpolated using regression models that were calibrated with atmospheric data. Note that the J&P data might have large errors that cannot easily be quantified. In fact, the correlation between similar data from various sources is often quite low (see also Alekseev et al., 2008). Clearly, more work remains to be done on the quality assurance of the historical data. W&C provide mid-month values (April to August) of sea ice concentration in gridded form compiled from various sources. We kept only grid cells that are based on observations (interpolated data or grid cells that were filled with climatology are not used). As our reference sea ice data set we used HadISST. This means that our 1900-1953 reconstructions should be consistent with post-1953 HadISST sea ice as well as with historical sea-surface temperatures (SSTs) from HadISST. In particular, we used the historical sea ice climatology from HadISST.
The reconstruction approach is partly based on climate model data. For this purpose we used data from the CCSM-3.0 coupled control run b30.009 in T85 resolution, which is known to have a relatively good representation of sea ice (Parkinson et al., 2006; Wang et al., 2007). For this run constant 1990 forcings were used (a pre-industrial control simulation would also be an option, although 1900-1953 is not exactly pre-industrial; the 1990 run, however, is certainly better validated). In order to account for possible spin-up effects in the ocean and sea ice components, only the last 350 years of the simulation were used. Both observational and model data were interpolated to a common grid (a T85 Gaussian grid). Because the land mask differed slightly between the data sets, only the smallest common sea area is used for the ice reconstructions.

The general approach
When statistically reconstructing atmospheric fields from atmospheric measurements, it is often assumed that the relation between the predictor variables and the field is the same over a range of variability time scales (e.g., Brönnimann and Luterbacher, 2004, termed BL hereafter). For sea ice, such an assumption cannot be made because not only atmospheric processes play a role, but also oceanic processes and sea ice dynamics (operating on different time scales). Therefore we distinguish between three time scales of variability in an additive framework. Climate model data are used in several of the steps and hence our approach is a hybrid.

Centennial variability
The response of the coupled ocean-atmosphere-sea ice system to varying external forcings on a centennial scale is accounted for by using a historical climatology from HadISST sea ice data, 1900-1953.

Decadal variability
Oceanic processes, coupling between ocean and sea ice as well as land surface processes might lead to systematic variability on decadal or longer time scales. In fact, some of the J&P series show substantial decadal variability. Statistical models could possibly be used to extract this information in the same way as for the interannual scale (discussed below). However, based on our experience, 60-100 degrees of freedom (or around 1000 years of data) would be required. The observational record of gridded sea ice data is far too short, and even climate model simulations are too short, so another approach has to be used.
We use a simple analogue approach, i.e., we select the 54-year period in the control run that fits best, based on low-pass filtered data, with all available historical information (highest average of correlation coefficients, giving equal weight to W&C and J&P).
The separation between decadal and interannual-to-multiannual time scales is obtained with a Gaussian filter (σ=3 years, corresponding to a cut-off period of 16 years). Note that the filtering was performed for each calendar month separately. The filter width is somewhat arbitrarily chosen and is based on our previous work on ocean-atmosphere coupled variability related to El Niño/Southern Oscillation (Brönnimann et al., 2007). It should be noted, however, that the time scales of oceanic and atmospheric variability overlap and hence no filter will exactly discriminate between the two.
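The scale separation described above can be illustrated with a minimal sketch. It assumes annual values arranged as a (years × 12) array; the function names, the truncation of the kernel at 4σ, and the reflective padding at the ends of the series are illustrative choices that are not specified in the text.

```python
import numpy as np

def gaussian_lowpass(series, sigma=3.0):
    """Low-pass filter an annual series with a Gaussian kernel (sigma in years)."""
    series = np.asarray(series, dtype=float)
    half_width = int(np.ceil(4 * sigma))            # assumption: truncate kernel at 4 sigma
    offsets = np.arange(-half_width, half_width + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()                          # unit gain for a constant series
    # Assumption: reflect the series at both ends so the output keeps its length.
    padded = np.pad(series, half_width, mode="reflect")
    return np.convolve(padded, kernel, mode="valid")

def split_time_scales(monthly, sigma=3.0):
    """Split a (n_years, 12) array into low-pass and high-pass parts.

    Each calendar month (column) is filtered separately.
    """
    monthly = np.asarray(monthly, dtype=float)
    low = np.column_stack(
        [gaussian_lowpass(monthly[:, m], sigma) for m in range(12)]
    )
    high = monthly - low                            # residual = high-pass filtered data
    return low, high
```

Because the high-pass part is defined as the residual, the two components add up to the original series by construction, which matches the additive framework. Values near the ends of the series depend on the padding choice and should be treated with caution.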

The "closest analogue" approach has the advantage that it requires only the definition of a measure of similarity, but provides information to any desired level of detail. In this case, it provides year-round gridded data even if the historical data are summer-only, single time series. The assumption is that the model reproduces the most important processes which affect both summer and winter sea ice on decadal scales. There is an even more fundamental assumption behind the analogue approach, namely that the climatologies from the model and from HadISST match. As this is clearly not the case (see also Wang et al., 2007), a regridding procedure is performed at a later step (see Sect. 4.1).
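A minimal sketch of the closest analogue search is given below. It assumes that the low-pass filtered model data have already been sampled at the same locations as the historical series, and it gives equal weight to every individual series, which simplifies the equal weighting of the W&C and J&P data sets described above. The function name and the handling of missing values are illustrative.

```python
import numpy as np

def closest_analogue(model_series, hist_series, window=54):
    """Find the window of the control run that best matches the historical data.

    model_series: (n_model_years, n_series) low-pass filtered model data
    hist_series:  (window, n_series) low-pass filtered historical data, NaN = missing
    Returns (start index of the best window, mean correlation of that window).
    """
    model_series = np.asarray(model_series, dtype=float)
    hist_series = np.asarray(hist_series, dtype=float)
    best_start, best_score = -1, -np.inf
    for start in range(model_series.shape[0] - window + 1):
        segment = model_series[start:start + window]
        corrs = []
        for k in range(hist_series.shape[1]):
            valid = ~np.isnan(hist_series[:, k])
            if valid.sum() < 3:                     # too few points for a correlation
                continue
            x, y = segment[valid, k], hist_series[valid, k]
            if x.std() == 0 or y.std() == 0:        # correlation undefined
                continue
            corrs.append(np.corrcoef(x, y)[0, 1])
        if corrs and np.mean(corrs) > best_score:
            best_start, best_score = start, float(np.mean(corrs))
    return best_start, best_score
```

With a 350-year control run and a 54-year window there are only 297 candidate periods, most of which overlap strongly, so the best score can remain far from 1.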

Interannual variability
Atmospheric processes play an important role on interannual-to-multiannual time scales, which is represented in the residual of the filtering procedure (hereafter termed high-pass filtered data). This variability is reconstructed using the same approach (principal component regression) as described in BL and references therein. In addition to the historical sea ice series, SST data north of 45° N are also used as predictors, as these data contribute additional information without producing additional dependence (SSTs must be prescribed anyway) when forcing an atmospheric model. This would be a problem when using land surface air temperature, sea-level pressure or wind stress, and therefore such information is not used.
In the following we give an overview of the procedure, but focus particularly on those points that differ from BL. The full procedure is shown in Fig. S1 (http://www.clim-past-discuss.net/4/955/2008/cpd-4-955-2008-supplement.pdf). The most important difference from BL is that the reconstructions are calibrated in a climate model (the high-pass filtered control run) and not in an observational gridded data set, which would be too short. Hence, all observed data (W&C, J&P, SSTs) are also sampled in the model. The statistical transfer functions calibrated in the climate model data are then applied to the historical data, which carries the assumption that the relation between predictors and predictand is the same in model and reality. Here we encounter the same problem as for the decadal scale, namely that the climatologies of model and reality do not match (see Sect. 4.1 for a preliminary solution). Further tests are necessary to assess the validity of this assumption with respect to SST predictors given the a posteriori regridding of the sea ice.
Data from the same calendar month were used for calibration (not a three-month moving window as in BL). As in BL, the procedure was optimised based on validation experiments.
Of the 350 remaining years in the control run, 8 years were omitted on either side due to possible filtering artefacts. Two periods (the first and last 67 years) were retained only for validation, while the remainder (200 years) was used for calibration. The final model was based on this central portion only and not on the full calibration period as in BL.
The gridded data were not standardised, but the aggregated series (J&P) were standardised. Because historical sea ice data are only available in summer, lagged predictor variables were used to reconstruct winter sea ice. For SSTs, we considered lags of 0, 1, 2, 3, and 4 months, with a linearly decreasing weight (i.e., 5/15, 4/15, 3/15, 2/15, 1/15). Hence, we assume that SSTs in May, for instance, have a large effect on sea ice in May, but also affect, to a lesser extent, sea ice in June and so on. For J&P, which are only available for August, we consider a 23-month moving window with equal weight such that each month is predicted based on the neighbouring two Augusts (August is predicted only by August data of the same year). W&C data, which are given from April to August, are lagged such that each month's sea ice is predicted by the W&C sea ice information for the same month and the previous month, except for sea ice from September through April, which is predicted by W&C sea ice information for August and April. Note that the exact choice of the weights is not crucial as the subsequent principal component (PC) analysis extracts the relevant information.
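The bookkeeping in this paragraph is easy to check with a short sketch. The function names are hypothetical; the numbers (350 years, 8 trimmed on either side, 67 validation years at each end, lags 0 to 4) are those given above.

```python
from fractions import Fraction

def linear_lag_weights(max_lag=4):
    """Linearly decreasing weights for lags 0..max_lag that sum to one."""
    raw = [max_lag + 1 - lag for lag in range(max_lag + 1)]   # 5, 4, 3, 2, 1
    total = sum(raw)                                          # 15
    return [Fraction(r, total) for r in raw]

def split_control_run(n_years=350, trim=8, n_validation=67):
    """Return year-index ranges (validation_1, calibration, validation_2)."""
    first, last = trim, n_years - trim        # usable years after trimming both ends
    val_1 = range(first, first + n_validation)
    val_2 = range(last - n_validation, last)
    calibration = range(val_1.stop, val_2.start)
    return val_1, calibration, val_2
```

Trimming 8 years on either side leaves 334 usable years; removing the two 67-year validation periods leaves the 200-year calibration period stated in the text.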
In the case of the SSTs, after including the lags, the number of variables was extremely high and was reduced by an area-weighted PC analysis, retaining 90% of the variance. The PC analysis was performed in the model calibration period and then expanded into the two validation periods (in the model data) and the historical data. After this step, equal weight was attributed to W&C, J&P, and SSTs. This was achieved by dividing each of the three data sets (including all lags; note that W&C data were also area weighted) by the standard deviation over the corresponding data set in the calibration period and by the square root of the number of variables of each data set. The weights were then applied to the historical data and to the data in the validation periods.
The actual reconstruction then is again a PC regression, whereby the amount of variance retained on both the predictor and the predictand side is optimised. Again, the PC analysis was performed in the model calibration period and then expanded into the two validation periods (in the model data) and the historical data.
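The two preprocessing steps (variance-based truncation of the PC analysis and equal weighting of the predictor blocks) might look as follows. This is a sketch under stated assumptions: the function names are hypothetical, area weights are applied as square roots to the anomalies, and the "standard deviation over the data set" is taken over all entries of the block in the calibration period.

```python
import numpy as np

def fit_pca(calib, weights=None, var_fraction=0.90):
    """Fit a PC analysis on calibration data (n_times, n_vars).

    Returns the calibration mean and the leading patterns that together
    retain at least var_fraction of the (weighted) variance.
    """
    calib = np.asarray(calib, dtype=float)
    if weights is None:
        weights = np.ones(calib.shape[1])
    mean = calib.mean(axis=0)
    anomalies = (calib - mean) * np.sqrt(weights)   # assumption: sqrt of area weights
    _, s, vt = np.linalg.svd(anomalies, full_matrices=False)
    explained = np.cumsum(s ** 2) / np.sum(s ** 2)
    n_keep = min(int(np.searchsorted(explained, var_fraction)) + 1, len(s))
    return mean, vt[:n_keep]

def project(data, mean, patterns, weights=None):
    """Expand fitted patterns into another period (validation or historical)."""
    data = np.asarray(data, dtype=float)
    if weights is None:
        weights = np.ones(data.shape[1])
    return ((data - mean) * np.sqrt(weights)) @ patterns.T

def block_scale(calib_block):
    """Factor that gives a predictor block unit total weight.

    Divides by the calibration-period standard deviation of the block and
    by the square root of its number of variables.
    """
    calib_block = np.asarray(calib_block, dtype=float)
    return 1.0 / (calib_block.std() * np.sqrt(calib_block.shape[1]))
```

The mean, patterns and scale factors are all derived from the calibration period only and then reused unchanged for the validation periods and the historical data, as described above.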
The optimisation was based on the reduction of error statistic (RE; see Cook et al., 1994) obtained from validation experiments. We used a modified version, termed RE*, which was defined as the variance-weighted average of RE over the individual PC time series. RE lies between −∞ and 1; however, negative values do not carry much information and therefore were projected onto the interval (−1,0) using a logit transformation. These steps were necessary to condense the information into one number that is comparable across different models. The optimised statistical model was then applied to the historical data in order to obtain the values of the PC time series of the predictands. The sea ice field is then obtained as a linear combination of the reconstructed values of the PC time series and the PC scores derived in the calibration period.
The historical data have gaps, which results in a different set of available variables for each month (which must be replicated in the model data). Hence, the entire procedure described above was performed for each time step individually, starting with subsampling the model data (all years) to match the available historical information (Fig. S1, http://www.clim-past-discuss.net/4/955/2008/cpd-4-955-2008-supplement.pdf). Each time step was thus optimised individually.
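For reference, the standard RE statistic and a variance-weighted average over PC series can be sketched as below. The function names are hypothetical, the reference forecast is assumed to be the calibration-period mean (zero for anomalies), and the projection of negative RE values onto (−1,0) is omitted because its exact form is not given in the text.

```python
import numpy as np

def reduction_of_error(observed, reconstructed, calibration_mean=0.0):
    """RE = 1 - SSE(reconstruction) / SSE(calibration-mean reference)."""
    observed = np.asarray(observed, dtype=float)
    reconstructed = np.asarray(reconstructed, dtype=float)
    sse = np.sum((observed - reconstructed) ** 2)
    sse_ref = np.sum((observed - calibration_mean) ** 2)
    return 1.0 - sse / sse_ref

def re_star(obs_pcs, rec_pcs, pc_variance):
    """Variance-weighted average of RE over the PC time series (columns)."""
    obs_pcs = np.asarray(obs_pcs, dtype=float)
    rec_pcs = np.asarray(rec_pcs, dtype=float)
    weights = np.asarray(pc_variance, dtype=float)
    re = np.array([
        reduction_of_error(obs_pcs[:, k], rec_pcs[:, k])
        for k in range(obs_pcs.shape[1])
    ])
    return float(np.sum(weights * re) / np.sum(weights))
```

An RE of 1 indicates a perfect reconstruction, and an RE of 0 indicates no improvement over using the calibration mean.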

Merging and validating
In a final step the three components were merged (though the results indicated that a regridding was necessary, Sect. 4.1). Values below 0% or above 100% can occur, which must be set to 0 and 100%, respectively. Also, missing values are possible because of the different land-sea masks. These are set to HadISST climatology.
Figure 2 shows the climatologies from HadISST (1900-1953) and from the CCSM-3.0 T85 control run (with 1990 forcings) for the months of March and September (corresponding to maximum and minimum sea ice coverage, approximately). There are clear differences in the North Atlantic sector, particularly in spring. In CCSM-3.0 the ice-free areas extend too far northeast. Also, the climatological sea ice edge in CCSM-3.0 is sharper than in HadISST. (Note that the CCSM-3.0 sea ice data were regridded from polar coordinates to the common T85 grid, which could have introduced artefacts. However, this cannot explain the too sharp ice edge.) As a consequence, the reconstructed anomalies do not match the HadISST ice edge but produce floating ice in the ocean or holes in the ice when added to the climatology.
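The merging step described at the start of this section (summing the three components, limiting the result to 0-100%, and filling cells without data with the climatology) can be sketched as follows. The function name is hypothetical and missing values are assumed to be encoded as NaN.

```python
import numpy as np

def merge_components(climatology, decadal, interannual):
    """Merge the three components into one sea ice concentration field (percent)."""
    climatology = np.asarray(climatology, dtype=float)
    merged = climatology + np.asarray(decadal, dtype=float) \
        + np.asarray(interannual, dtype=float)
    merged = np.clip(merged, 0.0, 100.0)            # NaN values pass through unchanged
    # Cells missing in any anomaly component fall back to the climatology.
    return np.where(np.isnan(merged), climatology, merged)
```

For example, a cell with a climatology of 95%, a decadal anomaly of +10% and no interannual anomaly is set to 100%.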

Climatologies
Solving this problem is key to any such reconstruction procedure and hence we briefly discuss several possibilities. The options include, but are not limited to:
1. Reconstruct latitudes of isolines of sea ice concentrations rather than a gridded ice field. A deviation in latitude could then be added to a climatological latitude. In practice, such an approach is very difficult because of multiple ice edges (folds), because of islands (producing non-linear "jumps" in the ice edge positions) and because the direction of ice edge movement is difficult to define.
2. Reconstruct Arctic sea ice hierarchically on different spatial scales, starting with a very coarse grid (or even just the total ice area or extent). Regional reconstructions (referenced to the climatological ice edge of each data set) could then be nested into the coarse reconstructions in order to "redistribute" the ice correctly. This approach is promising for the interannual component but a different solution must be found for the decadal component.
3. Define a re-gridding procedure that maps the isolines of the CCSM-3.0 climatology onto those of the HadISST climatology, and apply the re-gridding to the merged interannual and decadal components.
We chose (3) and defined re-gridding functions for each calendar month based on the two climatologies. For each longitude, the latitude of several sea ice coverage isolines was analysed. The northernmost ice edge was chosen in cases with multiple ice edges. Based on these isolines, hypothetical 0- and 100-percent isolines were extrapolated (mostly based only on the 10- and 20-percent and on the 80- and 90-percent isolines, respectively). These isolines were corrected manually for artefacts (year round). The longitude stretching in both cases has a latitudinally dependent amplitude, peaking near 80° N in the former case and near 60° N in the latter case. Note that the longitude stretching was applied first.
Figure 1 shows the re-gridding applied to the climatology fields. Although they are now closer to HadISST than the raw climatologies, there are still differences in the European sector. More sophisticated techniques such as contour matching might help to obtain better re-gridding functions. However, the underlying problem (a physical misrepresentation of the processes in this area) cannot be solved. Using such "a posteriori" corrections therefore must be limited to correcting small displacements of sharp gradients in a system that is in principle physically well captured in the model. This needs further improvements on the model side.
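The latitudinal part of such a re-gridding can be illustrated for a single longitude. The sketch below is a strong simplification and not the procedure used here: it ignores the longitude stretching and the manual corrections, and assumes that the isoline latitudes of both climatologies are given in increasing order, with the domain boundaries included as fixed anchor points. The function name is hypothetical.

```python
import numpy as np

def remap_meridian(lats, model_profile, model_iso_lats, target_iso_lats):
    """Shift a model profile along one meridian so that the model isolines
    land on the target (e.g. HadISST) isolines.

    lats:            increasing grid latitudes
    model_profile:   model field on lats (e.g. an anomaly or a climatology)
    model_iso_lats:  latitudes of selected isolines in the model climatology
    target_iso_lats: latitudes of the same isolines in the target climatology
    """
    lats = np.asarray(lats, dtype=float)
    # For each output latitude, find the model latitude that is moved there
    # (piecewise linear between the isoline anchor points).
    source_lats = np.interp(lats, target_iso_lats, model_iso_lats)
    # Sample the model profile at those source latitudes.
    return np.interp(source_lats, lats, model_profile)
```

A piecewise linear mapping of this kind only displaces features; it cannot correct a model ice edge whose shape differs from the observed one, which is the limitation discussed above.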

Decadal component
The selected closest analogue period was model years 273-326. Although the choice was relatively robust when attributing different weights to different data series, the selected period does not reproduce the historical data well. This is shown in Fig. 2 for the Polyakov et al. (2003) series. Overall, the correlations are slightly below 0.6, which is not satisfactory. Two problems might contribute. First, the control run is still relatively short compared to the length of the period, so that the closest analogue is not necessarily close to the observations. Second, the differences in the climatology addressed above also affect the closest analogue. For instance, Fig. 2 shows that the Kara Sea is mostly ice free in August in CCSM-3.0 and therefore ice extent varies little, while the observations show a large variability (the other Arctic marginal seas show a much better agreement in their variance). The latter problem could be solved by re-gridding CCSM-3.0 for the entire procedure (before searching for the closest analogue). But this would also imply a re-gridding of SST fields. In our approach, since the J&P series are standardised, the assumption is that despite the difference in variance, the decadal signal is captured correctly. In summary, the reconstruction of the decadal component could, and should, be improved and constitutes the weakest part of the approach.

Interannual variability
The interannual-to-multiannual variability of sea ice coverage can, at least theoretically, be well reconstructed from the information available in the historical period. This is shown in terms of the RE* value for the optimal model for the period 1930-1950 (Fig. 3). Two periods with different sets of predictors can be clearly distinguished. In fact, the 20 years shown are representative of the entire reconstruction period as no other major change in the availability of the predictor data occurs. Note that we performed sensitivity experiments using subsets of the predictors, which are indicated with different colours in the figure.
The results clearly show that the skill of the reconstruction has a strong seasonal cycle. The cycle is particularly large if only ice information is used, even though lags were incorporated. This indicates that the memory in the sea ice system is relatively short. The dependence of winter sea ice cover on last summer's ice is too small to yield skilful reconstructions. Incorporating SSTs, however, strongly increases the skill in winter. RE* values are mostly above 0.5. Because W&C is not available during the War period, the RE* values for this period represent the skill of only using J&P or J&P plus SSTs. These sensitivity experiments in general show that W&C is clearly needed and is the single most important predictor data set.
In summer, the skill is excellent (which is not surprising as all sea ice data are summer data). Note that the skill very slightly decreases when adding new (partly redundant) data sets: W&C alone is better than W&C plus J&P, which again is better than all three data sets. However, the loss of skill in summer (caused by the difficulty of the PC optimisation in extracting only the signal out of too redundant data) is very minor relative to the gain of skill in winter, and hence we used the full model.

One factor that is still neglected in the current set-up is the contribution of errors in the historical data. These could be modelled by adding appropriate noise to the data (see e.g., Ewen et al., 2007). However, Polyakov et al. (2003b) state that the error is very difficult to quantify. Hence, even though interannual variability can theoretically be well reconstructed, there are strong limitations in its practical application.

Merged fields
Examples of reconstructed fields are shown for March (Fig. 4) and September (Fig. 5) for selected years (1912, 1928, 1937, and 1953, representing warm and cold conditions in different parts of the reconstruction period), together with the HadISST fields and the sum of interannual and decadal variability (before regridding). In March (Fig. 4), the reconstructed signal appears in the form of narrow bands in the North Atlantic and Barents Sea region as well as in the Bering Sea. Here, it is particularly important to have a regridding procedure that matches the largest anomaly with the climatological ice edge position. In fact, upon closer inspection the difference between HadISST and the reconstructions as well as the interannual differences mirror the reconstructed decadal and interannual components relatively well. This shows that the approach, validated in the model, could in principle also work in the historical period and produces a larger regional variability than that found in HadISST. It does not mean, however, that the fields are realistic year by year.
1937 is known to have been a warm year in the Arctic (Polyakov et al., 2003a). However, HadISST shows no sign of sea ice decrease in this year. Interestingly, reconstructed interannual and low-frequency anomalies of sea ice cover are also positive in the North Atlantic sector both in March and September. Further validations are necessary to decide whether this result is correct or whether problems in either the historical data or the low-frequency variability lead to this result. Data from Vinje (2001) suggest that sea ice was generally reduced in the Nordic Seas in 1937, though not in all areas and seasons.
In September (Fig. 5), the patterns of variability are quite different from those in March. The largest signals are found in the Greenland and East Siberian Seas, and often the spatial pattern of the anomalies shows a dipole between the two or between the East Siberian Sea and the Beaufort Sea. Again, the differences from year to year as well as between HadISST and CCSM-3.0 mirror the reconstructed anomalies well (i.e., the re-gridding puts the anomalies approximately in the right place), and the reconstructions exhibit more spatial variability than HadISST.
Figure 6 shows time series of seasonally averaged Arctic sea ice concentrations (area-averaged concentration poleward of 50° N) from HadISST and from the reconstructions. HadISST shows an unrealistically low amount of variability from 1940 to 1953 in all seasons and low variability in winter at all times prior to 1953. The reconstructions show more variability prior to 1953 on decadal and interannual time scales, which in our opinion is more realistic and is in agreement with the regional historical data (note that the larger spatial variability found in the reconstructions averages out in this plot). However, this alone does not necessarily mean that the reconstructions are better than HadISST, as becomes apparent from their interannual correlation. In view of the fact that HadISST does incorporate most of the available data (but might smooth out some of the variability), correlation coefficients of only 0.1-0.4 in the period 1900-1940 are sobering. Therefore, our reconstructions offer no alternative to the HadISST data except possibly in the 1940-1953 period (which might be a particularly interesting period because of the strong El Niño signal). While we make our reconstructions available for comparability purposes, we currently do not recommend using the data for analysis.

Conclusions
The purpose of this paper is to discuss a hybrid approach for obtaining gridded sea ice data for the first half of the 20th century that can be used to force atmospheric models. The approach distinguishes different variability time scales, which are reconstructed separately, and uses both statistical techniques and a climate model simulation. While the approach failed in the sense that the final data product is not superior to the available HadISST data, it has brought to light some fundamental difficulties in reconstructing sea ice, summarized in Table 1, and also points to possible solutions.
Our approach shows the following advantages: year-round reconstructions (e.g., selecting analogue period based on summer-only data).
3. Reconstructing interannual variability yields promising skill scores. Using SSTs in addition to sea ice and using leads and lags gives reasonable reconstruction skill even in winter.
4. The final product is consistent with HadISST sea ice and SST by construction.
5. The final product has realistic regional and temporal variability.
In summary, although the described approach fails in the current situation, it could be successful if better historical data, physically more realistic models, and longer control simulations become available.

Table 1. Difficulties in reconstructing sea ice and possible solutions.

Problem: Partitioning time scales creates two problems: the separation of scales (filter length) and the assumption of additivity (no interactions between scales).
Possible solution: Filter width and type need to be validated in future work.

Problem: Decadal variability cannot be statistically calibrated and at the same time our chosen analogue approach is not satisfactory.
Possible solution: Both statistical approaches and analogue approaches could work, but 1000-year model integrations would be necessary in both cases.

Problem: Combining statistical reconstructions with a climate model simulation is problematic if the model climatology does not accurately represent the observed climatology (which is the case for almost all global models).
Possible solution: Regridding approaches or contour matching techniques can be used to match climatologies, but only if the differences are small. Regional models could be better suited.

Problem: Calibrating statistical transfer functions in a model and applying them to observations is problematic. In addition, stationarity is a fundamental assumption in all reconstructions, including the one described here.
Possible solution: The applicability of transfer functions must be tested by applying the technique to a number of simulations from many different models.

Problem: Current historical sea ice data sets are still of limited value for reconstructions as they are too sparse, summer only, and sometimes contradictory.
Possible solution: More efforts need to be put into data rescue and quality assurance of the historical sea ice data. Uncertainties and errors of the historical data need to be accounted for in the reconstruction procedure by adding noise.