Artificial intelligence in sleep medicine: an American Academy of Sleep Medicine position statement
Sleep medicine is well positioned to benefit from advances that use big data to create artificially intelligent computer programs. One obvious initial application in the sleep disorders center is the assisted (or enhanced) scoring of sleep and associated events during polysomnography (PSG). This position statement outlines the potential opportunities and limitations of integrating artificial intelligence (AI) into the practice of sleep medicine. Additionally, although the most apparent and immediate application of AI in our field is the assisted scoring of PSG, we propose potential clinical use cases that transcend the sleep laboratory and are expected to deepen our understanding of sleep disorders, improve patient-centered sleep care, augment day-to-day clinical operations, and increase our knowledge of the role of sleep in health at a population level.
Goldstein CA, Berry RB, Kent DT, et al. Artificial intelligence in sleep medicine: an American Academy of Sleep Medicine position statement. J Clin Sleep Med. 2020;16(4):605–607.
The American Academy of Sleep Medicine (AASM) is the leading professional society dedicated to the promotion of sleep health. Because its mission is “advancing sleep care and enhancing sleep health to improve lives,” the AASM endeavors to advance sleep health policy that improves the health and well-being of patients and the public.
An unprecedented volume of medical data, set against a backdrop of computational advances, offers the potential to care for patients more efficiently and with greater precision. Artificial intelligence (AI) refers to the ability of computer systems to perform tasks historically executed only by humans. Machine learning (ML) algorithms use experience and data to adjust parameters to improve performance on different tasks, such as classification, without direct programming. Because ML programs have come to dominate AI, the terms are often used interchangeably.
The use of polysomnography (PSG), scored by a sleep technologist, as the cornerstone of objective testing in sleep medicine has resulted in the collection and storage of a massive amount of labeled physiological data. AI methods offer the opportunity to automate and enhance the scoring of sleep and associated events while extracting additional insights from PSG data. If appropriately used, AI may streamline clinical operations, introduce greater precision into the evaluation and treatment of sleep disorders, and therefore improve outcomes for patients with sleep disorders.
It is the position of the AASM that the electrophysiological data acquired during PSG are well suited for analysis using AI. While AI applications that score sleep and associated events are expected to improve sleep laboratory efficiency and yield greater clinical insights, the goal of AI integration should be to augment, not replace, expert evaluation of sleep data. Guidance addressing logistical, security, ethical, and legal considerations will need to be provided to sleep facilities to enable clinical implementation of AI.
Until recently, assisted sleep staging and scoring of associated events relied on hand-designed, rule-based programs, which are vulnerable to human error and bias. Testing on small samples, and limitations in computational power, presented barriers to generalizability and implementation in routine clinical practice. The current capabilities of AI have made augmented and assisted PSG scoring a reality. ML algorithms have been trained on datasets containing thousands of subjects and demonstrate sleep staging accuracy similar to the interrater reliability among human scorers, with reported κ values up to 0.80 [1–3].
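As a minimal sketch of how such an agreement figure is obtained, the following computes a κ value between two epoch-by-epoch stagings, assuming that κ here denotes Cohen's kappa (observed agreement corrected for the agreement expected by chance). The function name `cohen_kappa` and the ten-epoch hypnograms are hypothetical illustrations, not data from the cited studies.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of epochs given the same stage by both raters.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each rater's marginal stage frequencies.
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    expected = sum(count_a[stage] * count_b[stage] for stage in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical stagings of the same ten epochs (W, N1, N2, N3, R).
scorer = ["W", "W", "N1", "N2", "N2", "N2", "N3", "N3", "R", "R"]
algorithm = ["W", "W", "N2", "N2", "N2", "N2", "N3", "N2", "R", "R"]

kappa = cohen_kappa(scorer, algorithm)
print(f"kappa = {kappa:.2f}")
```

In this toy example the two stagings agree on 8 of 10 epochs, but κ is lower than 0.80 because some of that agreement would be expected by chance alone.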
AI analysis of PSG could reduce the time that technical staff devote to hand scoring, leading to more rapid diagnosis and treatment of patients with sleep disorders, and could also reveal greater value in PSG data; however, integration of ML into clinical practice should be approached cautiously. Incorporation of AI into the sleep laboratory will undoubtedly present logistical challenges such as the need for increased computer support as well as training mechanisms to assist providers and health systems with software integration into current care pathways. Data storage methods must adhere to stringent security measures that comply with the Health Insurance Portability and Accountability Act (HIPAA). Well-defined methods will be needed to govern access to data within these repositories to protect patient privacy.
Ethical and legal dilemmas may also arise. Because ML programs learn from the data provided, health care inequities that already exist, such as sex disparities in the evaluation of sleep-disordered breathing, could be amplified. Ultimately, the clinician, not the computer software, is responsible for patient diagnosis; therefore, AI tools should augment, but not replace, the clinical judgment of the sleep medicine provider.
The rapid advances of AI in medicine in general pose important challenges to regulation and implementation, and the best practices for AI integration into health care are expected to rapidly evolve alongside the technology. However, for sleep disorders centers that choose to explore the use of AI-based tools, the following are some initial considerations:
Transparency and disclosure: Manufacturers are expected to clearly delineate the intended population and the goal of any AI program used in the evaluation of patients. The following elements of the dataset used to develop the AI program should be clearly disclosed:
Demographic information and disease characteristics of the population within the dataset, and how that may limit generalizability in some patient groups
Specification of any techniques to identify and address artifacts
Sampling rates and filters used for digital acquisition and analysis
A data dictionary that includes all annotations used and generated, as well as terms used to describe the population
Testing on novel data: AI programs intended for clinical use should be tested for performance on adequately sized, independent, standardized test sets that were not used for algorithm development. Testing datasets should be representative of the patient population and conditions for which the software program is intended [eg, an AI-based tool intended for the detection of obstructive sleep apnea (OSA) in children should not be tested on an adult dataset]; however, the testing dataset also must be diverse enough to generalize to a heterogeneous clinical population. AI programs must demonstrate performance comparable to the agreement among expert scorers. Performance metrics derived from testing must be made publicly available.
Laboratory integration: Consistent with procedures that surround the adoption of any new technology in the sleep laboratory, manufacturers should assist sleep disorders centers in evaluating AI-based software performance to determine real-life, laboratory-specific utility. Once confidence in the reliability of the application exists, clinical use without sleep technologist review may be considered. In accordance with current clinical practice, full physician review is expected after algorithm-based scoring and, if indicated, manual sleep technologist rescoring of part or the entirety of the PSG recording may be necessary to provide accurate diagnosis and optimal patient care.
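As a rough illustration of the "testing on novel data" consideration above, the sketch below keeps test recordings independent of development data by splitting at the subject level, then compares algorithm-versus-scorer agreement with inter-scorer agreement. The subject-level split, the function names, the 20% test fraction, and the 0.05 margin are all assumptions made for illustration; the statement does not specify how "comparable" performance should be quantified.

```python
import random


def split_by_subject(subject_ids, test_fraction=0.2, seed=0):
    """Split unique subject IDs into disjoint development and test sets.

    Splitting by subject (rather than by epoch) is an assumed way to keep
    recordings from the same person out of both sets.
    """
    unique_ids = sorted(set(subject_ids))
    rng = random.Random(seed)
    rng.shuffle(unique_ids)
    n_test = max(1, round(len(unique_ids) * test_fraction))
    test_ids = set(unique_ids[:n_test])
    dev_ids = set(unique_ids[n_test:])
    return dev_ids, test_ids


def meets_expert_benchmark(algorithm_kappa, interscorer_kappa, margin=0.05):
    """True if algorithm-vs-scorer kappa is within `margin` of inter-scorer kappa.

    The margin is a hypothetical tolerance, not a value from the statement.
    """
    return algorithm_kappa >= interscorer_kappa - margin


# Hypothetical cohort of ten subjects, each contributing several recordings.
recording_subjects = [f"S{i:02d}" for i in range(10) for _ in range(3)]
dev_ids, test_ids = split_by_subject(recording_subjects)

print(sorted(test_ids))
print(meets_expert_benchmark(algorithm_kappa=0.78, interscorer_kappa=0.80))
```

A laboratory applying this idea would also need to confirm that the held-out subjects resemble its own patient population, which a random split alone does not guarantee.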
The future applications of AI in clinical sleep medicine are expected to transcend PSG scoring. Currently, the insight a PSG provides into an individual’s health remains relatively limited, as hours of physiological signals are reduced to summary metrics and indices that at times fail to correlate with meaningful clinical outcomes for patients with sleep disorders [4]. Already, ML techniques have been developed that appear to allow accurate assisted identification of narcolepsy [5]. Combining more in-depth, comprehensive analysis of the PSG with clinical, genomic (and other -omics), behavioral, and other forms of data may help to improve the diagnosis and classification of disease, identify sub- and endophenotypes, and ultimately, develop precision interventions to prevent complications and improve the treatment of patients who have sleep disorders.
In addition to the possible diagnosis of narcolepsy from PSG, clinical use cases for this growing body of research include improved prediction of cardiovascular risk based on identification of specific OSA phenotypes [4], delineation of pathophysiological mechanisms underlying OSA to guide patient-specific treatment selection [6], and use of electroencephalogram characteristics to predict conversion from idiopathic REM sleep behavior disorder to neurodegenerative disease [7].
Additionally, the sleep field is uniquely positioned to use AI to distill patient-generated data and prompt interventions in real time. For example, treatment of OSA results in the passive acquisition and storage of vast amounts of data from positive airway pressure (PAP) devices. AI, in conjunction with ubiquitous mobile devices, can identify PAP adherence and mask fit difficulties, triggering self-management interventions that may empower patients to optimize treatment adherence. Furthermore, because PAP users often have comorbid chronic conditions such as heart disease, signatures in the currently underutilized, continuously collected respiratory signal may reveal changes that indicate impending heart failure or other disease state exacerbations, triggering a provider-initiated intervention to prevent clinical decompensation [8]. In the same way, AI evaluation of sleep patterns estimated by appropriately validated accelerometer-based wearable or other consumer sleep technology devices could identify opportunities for real-time behavioral modifications or identify an imminent health event. However, before patient-generated data are used in this way, device and algorithm output must be verified for accuracy and reliability.
AI is well suited for the analysis of the massive quantity of physiological signals acquired during PSG, and the initial application to algorithm-based scoring of sleep and associated events is expected to enhance sleep laboratory efficiency and improve patient care. Additionally, AI will allow clinicians to extract greater insights from the PSG than the routinely used summary metrics provide, ultimately improving disease subtyping and phenotyping and informing tailored selection of treatments based on individual patient characteristics. Nevertheless, pitfalls also exist, and logistical, security, ethical, and legal dilemmas should be considered in the implementation of AI in the clinical sleep laboratory to support high quality, patient-centered care.
This position statement was developed for the board of directors by the AASM Artificial Intelligence in Sleep Medicine Committee. It is published as an advisory that is to be used for educational and informational purposes only.
4. Sleep apnea heterogeneity, phenotypes, and cardiovascular risk: implications for trial design and precision sleep medicine. Am J Respir Crit Care Med. 2019;200(4):412–413. https://doi.org/10.1164/rccm.201903-0545ED
7. Algorithmic complexity of EEG for prognosis of neurodegeneration in idiopathic rapid eye movement behavior disorder (RBD). Ann Biomed Eng. 2019;47(1):282–296. https://doi.org/10.1007/s10439-018-02112-0
8. The respiratory signature: a novel concept to leverage continuous positive airway pressure therapy as an early warning system for exacerbations of common diseases such as heart failure. J Clin Sleep Med. 2019;15(6):923–927. https://doi.org/10.5664/jcsm.7852