This study provides the first comparison of 2 methods proposed to increase the structure of selection interviews: frame-of-reference (FOR) rater training for interviewers and providing interviewers with descriptively anchored rating scales. In contrast to descriptively anchored rating scales, evidence for the efficacy of FOR training for interviewers is still missing even though its effects have been established in other domains. To evaluate the effectiveness of the 2 methods, we used a 2 × 2 design in which both methods were manipulated independently. Participants observed and rated different interviewees’ performance in a set of videotaped interviews. We found that both methods led to substantial, and comparable, improvements in both rating accuracy and interrater reliability in comparison to a control condition in which neither method was used. Furthermore, even though both methods have the same aim (i.e., enhancing the evaluation process by providing a common evaluative standard for raters), combining both methods led to further improvements in rating accuracy beyond the effects of the individual methods. Practical implications for selection interviews are discussed.