Automatic Sleep Classification with Machine Learning


Malafeev, Alexander. Automatic Sleep Classification with Machine Learning. 2018, University of Zurich, Faculty of Science.

Abstract

Sleep is ubiquitous in nature. Humans spend a third of their lives sleeping. And yet, despite all the recent advances in the field, we still don’t know the purpose of sleep. However, sleep disorders are detrimental to health and quality of life, and insomnia is one of the most common sleep disorders. Excessive daytime sleepiness or insufficient sleep decreases cognitive performance and may cause accidents. These facts suggest that understanding sleep and its regulation is very important. In the first chapter I summarized the most recent hypotheses on the purpose of sleep; then I described the gold standard for assessing sleep, polysomnography (PSG), as well as sleep stages and, in particular, the electroencephalogram (EEG). Furthermore, an overview of machine learning tools is provided, as they are used to classify sleep stages and detect microsleep episodes. In the second chapter we implemented and tested 14 simple artifact detection methods and conducted a thorough analysis of their performance on two datasets: one comprised sleep recordings of healthy young subjects, the other data recorded in patients with hypersomnia and narcolepsy. We found that clean average EEG power density spectra can be obtained using very simple methods. The best artifact detection performance was obtained by thresholding the slope, the power in the high-frequency range (25-90 Hz or 45-90 Hz) and the residual errors of an autoregressive model fitted to the EEG. It is not surprising that the power in the high-frequency range was a good predictor of an artifact, as muscle artifacts are characterized by elevated power in this range. Most methods showed good sensitivity. However, since we had chosen a fixed false positive rate (FPR) of 10%, we excluded on average 16.3% of the epochs, whereas experts excluded on average only 7% of the epochs. Our approach seemed reasonable as it leaves enough data for subsequent analyses. The main chapter (third chapter) describes the automatic sleep scoring algorithms we developed.
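The high-frequency power thresholding mentioned for the second chapter can be illustrated with a minimal sketch. It is not the thesis implementation: the 20-s epoch length, the 256 Hz sampling rate, the simple periodogram estimate, the threshold of five times the median power, and the function names are all assumptions made for illustration. Only the 25-90 Hz band is taken from the abstract.

```python
import numpy as np

def band_power(epoch, fs, f_lo, f_hi):
    """Mean periodogram power of one epoch within [f_lo, f_hi] Hz."""
    freqs = np.fft.rfftfreq(epoch.size, d=1.0 / fs)
    psd = np.abs(np.fft.rfft(epoch - epoch.mean())) ** 2 / (fs * epoch.size)
    mask = (freqs >= f_lo) & (freqs <= f_hi)
    return psd[mask].mean()

def flag_artifacts(eeg, fs, epoch_s=20, f_lo=25.0, f_hi=90.0, factor=5.0):
    """Flag epochs whose high-frequency power exceeds factor * median power.

    epoch_s and factor are illustrative assumptions, not values from the thesis.
    """
    n = int(epoch_s * fs)
    n_epochs = eeg.size // n
    epochs = eeg[: n_epochs * n].reshape(n_epochs, n)
    power = np.array([band_power(e, fs, f_lo, f_hi) for e in epochs])
    return power > factor * np.median(power)

# Synthetic demo: 10 epochs of white noise, with a 60 Hz burst added to epoch 3
# as a crude stand-in for a high-frequency (e.g. muscle) artifact.
rng = np.random.default_rng(0)
fs = 256  # assumed sampling rate in Hz
eeg = rng.normal(0.0, 1.0, size=fs * 20 * 10)
t = np.arange(fs * 20) / fs
eeg[3 * fs * 20 : 4 * fs * 20] += 5.0 * np.sin(2 * np.pi * 60.0 * t)

flags = flag_artifacts(eeg, fs)
print(np.flatnonzero(flags))  # -> [3]
```

In practice the threshold would be tuned against expert annotations, for example to reach a chosen false positive rate, rather than fixed as a multiple of the median.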
Scoring rules are complex and to some degree subjective. Although the human brain has superb image recognition abilities, sleep scoring is a difficult task. Thus, it is not possible to simply program an algorithm that implements the scoring rules for sleep. Such a problem can, however, be addressed with modern machine learning methods. These algorithms learn from examples that have already been analyzed by an expert. With these techniques we don’t even need to know how to score sleep stages ourselves; we just need examples scored by experts. We developed several algorithms ranging from basic machine learning tools to deep artificial neural networks. First, we engineered 20 features derived from EEG, EOG and EMG data. This process reduces the dimensionality of our data and makes classification easier. We employed a random forest (RF) classifier in conjunction with a Hidden Markov Model (HMM) or a moving median filter (MF) to smooth the data. Alternatively, we applied artificial neural networks (ANNs), Long Short-Term Memory (LSTM) networks designed to handle time series, to classify the data. We used our engineered features as input for these networks. Finally, we employed deep convolutional neural networks (CNNs) in combination with LSTM networks. Such algorithms (CNN-LSTM networks) work with raw data and do not require engineered features. We used the F1 score, a performance measure applicable to multi-class data that takes both precision and sensitivity (recall) into account, to evaluate the quality of the automatic scoring. We achieved a sleep stage classification quality close to that of a human expert in recordings of healthy subjects, with F1 scores above 0.8 for all stages except stage 1. Stage 1 is difficult for a human scorer to score as well. F1 scores for stage 1 were slightly above 0.4 for most of our methods, similar to the interscorer performance.
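To make the smoothing and evaluation steps concrete, the following sketch applies a moving median filter to a sequence of predicted stage labels and computes the F1 score per stage as the harmonic mean of precision and recall. The integer stage coding, the window length of 3 epochs, the toy label sequences and the function names are assumptions for illustration; the abstract does not specify these details.

```python
import numpy as np

def median_smooth(labels, window=3):
    """Moving median over integer stage codes (odd window, edge-padded).

    With an odd window the median is always one of the labels in the window.
    """
    labels = np.asarray(labels)
    half = window // 2
    padded = np.pad(labels, half, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, window)
    return np.median(windows, axis=1).astype(labels.dtype)

def f1_per_class(y_true, y_pred, classes):
    """F1 = 2*TP / (2*TP + FP + FN), computed separately for each class."""
    scores = {}
    for c in classes:
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        denom = 2 * tp + fp + fn
        scores[c] = float(2 * tp / denom) if denom else 0.0
    return scores

# Hypothetical stage codes; the toy prediction contains isolated mis-scored epochs.
y_true = np.array([0, 0, 0, 0, 0, 2, 2, 2, 2, 2, 3, 3])
y_pred = np.array([0, 0, 2, 0, 0, 2, 2, 0, 2, 2, 3, 3])

before = f1_per_class(y_true, y_pred, classes=[0, 2, 3])
smoothed = median_smooth(y_pred, window=3)
after = f1_per_class(y_true, smoothed, classes=[0, 2, 3])

print(before)  # -> {0: 0.8, 2: 0.8, 3: 1.0}
print(after)   # -> {0: 1.0, 2: 1.0, 3: 1.0}
```

In this toy case the filter removes every isolated error. On real hypnograms it can also erase genuinely short stage bouts, so the window length is a trade-off.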
Our methods performed slightly worse on the patient data than on the data of healthy subjects when they were trained only on the data of healthy subjects. However, the ANNs performed better than the RF in this case. Performance on the patient data improved when patient data were included in the training. We demonstrated that methods which incorporate the temporal structure of the data generally perform better. Furthermore, the methods relying on raw data performed slightly better than the feature-based methods. We think that we could not exploit the full potential of ANNs due to the scarcity of training data. Using these algorithms, we can score sleep fully automatically and analyze large amounts of data very quickly. Our CNN-LSTM network produced good results using just a single EEG channel. This result was unexpected, as we had assumed that reliable detection of REM sleep would require EOG and EMG data. Such networks would allow on-line scoring of data recorded with portable devices. The fourth chapter is dedicated to the automatic detection of microsleep episodes (MSE). MSE are very short sleep fragments lasting 3 to 15 s. They often occur in sleep-deprived people, in individuals who had insufficient sleep, under boring or monotonous conditions, and in patients with excessive daytime sleepiness. We engineered features and applied basic machine learning methods (support vector machine, random forest) to detect MSE. In a preliminary step we demonstrated that the methods work and reached very good specificity (0.99) and good sensitivity (0.74). Future improvements of MSE detection algorithms should incorporate the temporal structure of the data, for example by using LSTM neural networks. In summary, our preliminary analysis provides proof of concept that automatic detection of MSE based on sleep EEG data is feasible. Altogether, we demonstrated that machine learning approaches perform well in detecting sleep stages and MSE.
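The specificity and sensitivity reported for MSE detection are the standard rates computed from a binary confusion matrix. A minimal sketch follows; the per-sample binary labelling, the toy data and the function name are assumptions for illustration, not the evaluation procedure of the thesis.

```python
import numpy as np

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = MSE, 0 = no MSE)."""
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = np.sum(y_true & y_pred)    # MSE samples correctly detected
    tn = np.sum(~y_true & ~y_pred)  # non-MSE samples correctly rejected
    fp = np.sum(~y_true & y_pred)   # false alarms
    fn = np.sum(y_true & ~y_pred)   # missed MSE samples
    return float(tp / (tp + fn)), float(tn / (tn + fp))

# Toy example: 16 non-MSE samples and 4 MSE samples,
# with one false alarm and one missed MSE sample.
y_true = np.array([0] * 16 + [1] * 4)
y_pred = y_true.copy()
y_pred[0] = 1    # false alarm
y_pred[16] = 0   # miss

sens, spec = sensitivity_specificity(y_true, y_pred)
print(sens, spec)  # -> 0.75 0.9375
```

Reporting the two rates separately is useful when one class is much rarer than the other, because overall accuracy would then be dominated by the majority class.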
The final chapter provides an outlook on further improvements and future steps to be taken.

Additional indexing

Item Type: Dissertation (monographical)
Referees: Achermann Peter, Martin Kevan A C, König Thomas
Communities & Collections:
    04 Faculty of Medicine > Institute of Pharmacology and Toxicology
    07 Faculty of Science > Institute of Pharmacology and Toxicology
    UZH Dissertations
Dewey Decimal Classification:
    570 Life sciences; biology
    610 Medicine & health
Language: English
Date: 2018
Deposited On: 27 Mar 2019 14:12
Last Modified: 25 Aug 2020 14:42
Number of Pages: 188
OA Status: Green
Related URLs: https://www.recherche-portal.ch/primo-explore/fulldisplay?docid=ebi01_prod011365474&context=L&vid=ZAD&search_scope=default_scope&tab=default_tab&lang=de_DE (Library Catalogue)

Download

Content: Published Version
Filetype: PDF
Size: 20MB