Abstract
In the social sciences, coding is the categorisation of qualitative or quantitative data to facilitate further analysis. It is usually a manual process that requires considerable time and effort to produce codes with high validity and interrater reliability. Although automated methods for quantitative data analysis are widely used in the social sciences, there have been only a few attempts to code the data collected in qualitative studies automatically or semi-automatically. To address this problem, we propose an approach for the automated coding of social behaviours and environments based on verbatim transcriptions of everyday conversations. To evaluate the approach, we analysed transcripts from three datasets containing recordings of everyday conversations from: (1) healthy young adults (German transcriptions), (2) healthy elderly adults (German transcriptions), and (3) healthy young adults (English transcriptions). The results show that social behaviours and environments can be coded automatically from verbatim transcripts of the recorded conversations. This could reduce the time and effort researchers need to assign accurate codes to transcribed conversations.