Abstract
This article introduces a quantitative, data-driven method for identifying clusters of data points in longitudinal data. We illustrate the method with examples from first-language acquisition research. First, we discuss several shortcomings of current practices in identifying and handling stages in studies of language acquisition. Second, we explain and exemplify our method, which we refer to as variability-based neighbour clustering, using mean length of utterance (MLU) values and lexical growth in two different corpora. Third, we discuss the method's advantages and briefly point to further applications in both language acquisition and diachronic linguistics.