Abstract
This article provides a toolkit of the key statistical steps for analysing data from studies that investigate the influence of teaching quality on student outcomes. It also helps researchers make informed decisions about which statistical checks to run and which analytical model to use. We elaborate on two issues: measures of reliability at different levels of the model and for different teaching quality assessments, and how decisions about the statistical model influence the estimated effects. We use data from a teaching quality video study (N = 958 students in 41 classes) to address these issues and to demonstrate the necessary steps in evaluating and analysing data from such studies. Results show that inferences can differ depending on the statistical model applied. These findings imply that practitioners should be cautious in interpreting the results from a single modelling approach and should consider running multiple models to check the consistency of the results.