TY - JOUR
T1 - Optimizing automated sleep stage scoring of 5-second mini-epochs
T2 - a transfer learning study
AU - Follin, Louise Frøstrup
AU - Christensen, Julie Anja Engelhard
AU - Vevelstad, Janita
AU - Juvodden, Hilde T
AU - Viste, Rannveig
AU - Hansen, Berit Hjelde
AU - Perslev, Mathias
AU - Kaufmann, Tobias
AU - Zahid, Alexander Neergaard
AU - Knudsen-Heier, Stine
N1 - © The Author(s) 2025. Published by Oxford University Press on behalf of Sleep Research Society.
PY - 2025/12/12
Y1 - 2025/12/12
AB - STUDY OBJECTIVES: Conventional sleep staging relies on 30-second epochs, potentially concealing transient sleep stage intrusion and reducing precision. Building on our previous study of mini-epochs, we investigated whether U-Sleep, an existing automatic deep learning-based sleep staging model with high performance in epochs, could be optimized to similar performance level in 5-second mini-epoch scoring, thereby enabling more detailed sleep characterization. METHODS: We created a dataset of 48,000 human-scored 5-second mini-epochs from 100 PSGs. We compared mini-epochs to human-scored epochs before U-Sleep was optimized using transfer learning and evaluated on a test set. Model performance was assessed using F1-scores, confusion matrices, stage distributions and transition rates comparing scorings of the original U-Sleep before, and the optimized U-Sleep after transfer learning to human-scored mini-epochs. RESULTS: Compared to human-scored epochs, human-scored mini-epochs captured significantly more transitions (1.70/minute vs. 0.21/minute, p<0.001), and significantly more wake (8.4% versus 5.4%), N1 (7.2% versus 5.4%), and N2 (51.8% versus 40.9%), less N3 (15.4% versus 25.2%) and REM sleep (16.7% versus 23.0%) (all p<0.001). Optimizing U-Sleep improved its performance significantly from F1=0.74 to F1=0.81 (p<0.05) and gave increased transition rates in the test set (original U-Sleep: 1.06/minute, optimized U-Sleep: 1.34/minute, human-scored mini-epochs: 1.70/minute). Stage distributions did not differ between optimized U-Sleep's scorings and human-scored mini-epochs. CONCLUSION: After optimization, U-Sleep performance in mini-epochs matched the high performance levels previously reported in both human and automated 30-second epoch scoring. This demonstrates the feasibility of precise, automated high-resolution sleep staging. Future work should include external validation and application to full-night recordings.
U2 - 10.1093/sleep/zsaf393
DO - 10.1093/sleep/zsaf393
M3 - Journal article
C2 - 41384756
SN - 1550-9109
JO - Sleep
JF - Sleep
ER -