Abstract
SPICE Ireland, a pragmatically annotated corpus of spoken Irish English, makes it possible to explore and analyze directives in English or, as in our contribution, to train systems that detect them automatically. In this study, we evaluate automatic classification and compare directives in Irish English with directives in British English, using lexical signals in the data sets. To do so, we apply and evaluate two machine learning approaches: document classification with logistic regression, and deep learning with fastText. Both approaches reach similar, satisfactory performance on the task of classifying previously unseen sentences as directive or non-directive: up to 90.5% accuracy and a Kappa of up to 74.2%. The reported features deliver a large inventory of indicators for speech acts, such as "please" for imperatives or "what" for interrogatives, but also less obvious indicators, such as "wait" and "you know". The results suggest that Irish English contains significantly more directives than British English, except in formal contexts, although this finding may be affected by the strong bias of our automatic classification. Our error analysis shows that implicit directives are missed more often, indicating that contextual, social, situational, or prosodic knowledge is vital for a minority of the instances. Our evaluations indicate that classification performance is similar on Irish and British data.
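For readers less familiar with the two evaluation measures reported above, the following is a minimal sketch of how accuracy and Kappa can be computed for a binary directive/non-directive classifier. It assumes that Kappa refers to Cohen's kappa, which the abstract does not state explicitly. The function names and the ten toy labels are hypothetical illustrations and are not taken from SPICE Ireland or from our experiments.

```python
from collections import Counter


def accuracy(gold, pred):
    """Share of items whose predicted label equals the gold label."""
    if len(gold) != len(pred) or not gold:
        raise ValueError("gold and pred must be non-empty and of equal length")
    return sum(g == p for g, p in zip(gold, pred)) / len(gold)


def cohen_kappa(gold, pred):
    """Cohen's kappa: agreement corrected for chance agreement.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement
    (the accuracy) and p_e is the agreement expected by chance, estimated
    from the label distributions of gold and pred.
    """
    n = len(gold)
    p_o = accuracy(gold, pred)
    gold_counts = Counter(gold)
    pred_counts = Counter(pred)
    # Chance agreement: sum over labels of P(gold = label) * P(pred = label).
    p_e = sum(gold_counts[label] * pred_counts[label] for label in gold_counts) / (n * n)
    if p_e == 1.0:
        # Kappa is undefined when only one label occurs on both sides;
        # returning 0.0 here is a choice made for this sketch.
        return 0.0
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy labels: "dir" = directive, "non" = non-directive.
gold = ["dir", "dir", "non", "non", "non", "non", "non", "non", "dir", "non"]
pred = ["dir", "non", "non", "non", "non", "non", "non", "non", "dir", "dir"]

acc = accuracy(gold, pred)
kappa = cohen_kappa(gold, pred)
print(f"accuracy = {acc:.3f}, kappa = {kappa:.3f}")
```

Because non-directives dominate the toy labels, a large share of the agreement is expected by chance, so Kappa comes out well below accuracy. This is the same reason the two figures differ in any evaluation with an imbalanced class distribution.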