Abstract
Correct identification of different human epithelial materials, such as those of skin, saliva and vaginal origin, is relevant in forensic casework because it provides crucial information for crime reconstruction. However, the overlap in human cell type composition between these three epithelial materials makes them difficult to differentiate and identify using previously proposed human cell biomarkers, whereas their microbiota composition differs largely. Using validated 16S rRNA gene massively parallel sequencing data from the Human Microbiome Project for 1636 skin, oral and vaginal samples, 50 taxonomy-independent deep learning networks were trained to classify these three tissues. Validation testing was performed on de novo generated high-throughput 16S rRNA gene sequencing data, obtained with the Ion Torrent™ Personal Genome Machine, from 110 test samples: 56 hand skin, 31 saliva and 23 vaginal secretion specimens. Body-site classification accuracy for these test samples was very high, as indicated by AUC values of 0.99 for skin, 0.99 for oral, and 1 for vaginal secretion. Misclassifications were limited to 3 (5%) skin samples. Additional forensic validation testing was performed on mock casework samples by de novo high-throughput sequencing of 19 freshly prepared samples and 22 samples aged for 1 to 7.6 years. All 19 fresh and 20 (91%) of the 22 aged mock casework samples were correctly classified by tissue type. Moreover, comparing the microbiome results with outcomes from previous human mRNA-based tissue identification testing of the same 16 aged mock casework samples reveals that our microbiome approach performs better in 12 (75%), similarly in 2 (12.5%), and worse in 2 (12.5%) of the samples. Our results demonstrate that this new microbiome approach allows accurate tissue-type classification of three human epithelial materials of skin, oral and vaginal origin, which is highly relevant for future forensic investigations.