Abstract
In this paper, we demonstrate how a large corpus, consisting of about 10 000 articles describing Swiss alpine landscapes and activities and dating back to 1864, can be used to explore the use of language in space. In a first step, we link landscape descriptions to geospatial footprints, which requires new methods for disambiguating toponyms that refer to natural features. Secondly, we identify natural features used to describe landscapes, and compare and discuss them in the light of previous work based on controlled participant experiments in laboratory settings and on more exploratory ethnographic studies. Finally, we use natural features in combination with geospatial footprints to investigate variations in landscape descriptions across space. Our contributions are threefold. Firstly, we show how a corpus composed of detailed descriptions of natural landscapes can be georeferenced and mapped using density surfaces and an adaptive grid linking footprints to articles. Secondly, we identify 95 natural features in the corpus, forming a vocabulary of terms that reflects known basic levels and their relationships to other, more specific landscape features. Thirdly, we explore the use of natural features in broader spatial and temporal contexts than is possible in typical ethnographic work, by examining when and where particular terms are used within Switzerland with respect to our corpus. On the one hand, this enables us to characterize individual regions and, on the other hand, to measure similarity between regions on the basis of their associated natural features. Our methods could be adapted to other types of corpora, for instance, those referring to fine-granularity entities in urban landscapes. Our results are potential building blocks for attaching place-related descriptions to automatically generated sensor data such as photographs or satellite images.