In recent decades, considerable progress has been made in climate model development. Following the massive increase in computational power, models have become more sophisticated. At the same time, simple conceptual models have also advanced. In this study, we validate and compare three hydrological models of different complexity to investigate whether their performance varies accordingly. For this purpose, we use runoff measurements as well as soil moisture measurements from several sites across Switzerland; because the soil moisture data are not used in calibration, they allow a truly independent validation. The models are calibrated in similar ways with the same runoff data. Our results show that the more complex models, HBV and PREVAH, outperform the simple water balance model (SWBM) for runoff but not for soil moisture. Furthermore, PREVAH, the most sophisticated of the three models, shows added value compared to HBV only for soil moisture. Focusing on extreme events, we find that the SWBM generally performs better during drought conditions and agrees less well with observations during wet extremes. For the more complex models we find the opposite behavior, probably because they were primarily developed for the prediction of runoff extremes. As expected given their complexity, HBV and PREVAH are more prone to over-fitting. All models tend to perform better at lower altitudes than at (pre-)alpine sites. The results vary considerably across the investigated sites. In contrast, the different metrics we consider to estimate the agreement between models and observations lead to similar conclusions, indicating that the performance of the considered models is similar at different time scales as well as for anomalies and long-term means. We conclude that added complexity does not necessarily lead to improved performance of hydrological models, and that performance can vary greatly depending on the hydrological variable considered (e.g. runoff vs. soil moisture) or the hydrological conditions (floods vs. droughts).