Special Articles

Evaluating consumer and clinical sleep technologies: an American Academy of Sleep Medicine update



The previous American Academy of Sleep Medicine (AASM) Consumer and Clinical Technology Committee and the subsequent Technology Innovation Committee have assessed a wide range of consumer and clinical sleep technologies. These assessments are available for AASM members on the AASM website (#SleepTechnology, found in the clinical resources section).1 Sleep device/application (app) assessments include product claimed capabilities, narrative summaries, Food and Drug Administration (FDA) status, sensors, mechanisms, data outputs, raw data if available, application programming interfaces (APIs) if accessible, similar devices/apps, and if there are peer-reviewed validation studies or clinical trials. Assessments are intended to assist members in general product understanding and do not represent product endorsement. An updated list of commonly used sleep (device/app) technology terms, in order from simple to more complex terms, is provided in Table 1.

Table 1 Commonly used sleep device/app technology terms.

Wearable: Devices that are worn to provide physical data or feedback
Nearable: Nearby contactless devices that provide physiologic or environmental data or feedback
Sensor: A device that measures a physical input and converts it into understandable data
Photoplethysmography (PPG): PPG sensors use a light source and a photodetector to measure blood flow changes, which provide signals that may use AI/ML/DL algorithms to deliver data outputs such as sleep stages
Sleep score or quality: Often a product-specific computation of “sleep quality” derived from questionnaires and/or sensor data
Sleep stages: Device/app reporting of sleep stages such as “light sleep” or “deep sleep” that may vary in type of data acquisition, derivations, and definitions between devices/apps; staging may be derived from proprietary AI/ML/DL algorithms such as using PPG heart rate variability (HRV) rather than standard polysomnographic EEG scoring rules
Patient generated health data (PGHD): Health care related data that are generated by patients and collected to address a health concern or issue
mHealth (mobile health): The use of mobile phones or other wireless technologies to monitor and exchange health information
Software as a medical device (SaMD): Software intended for medical uses that performs those uses without being part of a hardware medical device
Mobile medical app (MMA): A mobile app whose functionality meets the definition of a medical device
Clinical decision support (CDS) software: SaMD software risk categorization established by the International Medical Device Regulators Forum to determine if a software treats, diagnoses, or drives or informs clinical management
Artificial intelligence/machine learning (AI/ML) as an SaMD: SaMD that may have “locked” AI/ML algorithms or “adaptive learning” algorithms that may be assessed using an FDA Pre-Cert total product life cycle approach
Remote data monitoring: Monitoring of data remotely
Remote patient monitoring: A subset of remote data monitoring that is used clinically
Application programming interface (API): A software interface that allows two or more applications to exchange information such as with an electronic health record
Algorithm: A sequence of statistical processing steps to solve a problem or complete a task
Artificial intelligence (AI): The broad use of computer algorithms to simulate human tasks and thinking
Machine learning (ML): A subset of AI that uses data training sets to make predictions and decisions without explicit programming
Deep learning (DL): A subset of ML that uses multilayered artificial neural networks to learn progressively finer patterns in data

Over the past few years, some consumer sleep technologies used for self-tracking and/or self-help have evolved into devices/apps with potentially meaningful clinical diagnostic, therapeutic, and/or long-term data tracking uses. See Figure 1 and the following descriptions of types of sleep device/app technologies.

Figure 1: Sleep device/app types.

Consumer grade technologies generally promote sleep self-awareness and/or may provide suggestions for improving sleep. Definitions of metrics like “sleep quality” or “sleep scores” may vary between devices/apps. Consumer devices/apps do not require a prescription. They may or may not be registered for FDA wellness and sports use.2 Popular sleep/wake tracking smartwatches are an example of consumer long-term sleep-wake self-tracking.

Clinical grade technologies require a prescription, are often FDA cleared or approved, and typically have some peer-reviewed validation studies. Continuous positive airway pressure (CPAP) data tracking by providers is an example of long-term remote patient monitoring (RPM), which offers enhanced care by providing clinical data monitoring between visits. By providing interim care, some RPM may be reimbursable.3–5 Clinical grade technologies often utilize product-specific, proprietary artificial intelligence (AI), machine learning (ML), or deep learning (DL) algorithms.

Hybrid and/or transitional technologies represent an array of technologies. A hybrid technology may represent a technology for which one of its sensors is FDA cleared or approved for a specific metric, but whose overall claimed data output has not been validated or FDA cleared or approved. A transitional technology may be one that is in the process of applying for FDA clearance or approval using preliminary (often internal, unpublished) studies. Hybrid and transitional technologies frequently utilize product-specific, proprietary AI/ML/DL algorithms.

Previous statements developed by the AASM Consumer and Clinical Technology Committee recognized potential benefits, limitations, and risks of sleep-related technology disruptions.6,7 In the context of the fast pace of sleep technology development, sleep medicine professionals look forward to technological advancements while considering these developments thoughtfully and scientifically before embracing them. This noted, the previous rapid pace of sleep technology development has been propelled even faster by the unexpected 2020 expansion of consumer and provider interest in self-monitoring physiological data, telemedicine, remote testing, remote data monitoring, and novel device sensor and AI/ML/DL integrations catalyzed by the COVID-19 pandemic.8–11 Sleep technology sensors, such as pulse oximeters or EEG sensors, have been added to previous consumer grade rings, watches, eye or head bands, and other wearables or nearables. Proprietary AI/ML/DL algorithms, such as those using heart rate variability (HRV), have assisted in the progression from consumer to transitional and clinical grade sleep technologies.

Thus, the need for clarification in how to evaluate rapidly evolving and diverse consumer and clinical sleep technology types and uses has become even more timely for sleep providers.12 Specifically, confirming validation of the capabilities claimed in product marketing can be a challenge for busy clinicians who may be seeking peer-reviewed, randomized controlled studies for each device/app. However, such traditional validation studies may take too long to complete and publish for real-time assessments of device/app performance or accuracy. For rapidly emerging technologies, novel validation processes may help lead to faster integration into clinical care applications. Examples may include outcomes-based or AI/ML/DL assisted validation. With these challenges in mind, we propose the following items for clinicians to consider when evaluating the wide range of consumer, hybrid or transitional, and clinical sleep-related technologies:

  • Awareness of FDA terms

  • Defining sleep term definitions across devices/apps

  • Defining populations

  • Data integrity

  • Applications of new sensors, new sensor applications, or other novel technologies

  • Awareness of proprietary AI/ML/DL algorithms

  • Defining validation methods for claimed capabilities


Evaluation of sleep devices/apps is enhanced by an understanding of FDA terms. Unless the product has a specific exemption, FDA classification is generally based on the device/app safety risk, the intended use, and the indication(s) for use.13,14 Premarket notification, or 510(k) FDA clearance for marketing, allows the FDA to determine if the product is equivalent to a “predicate” device/app already placed in Class I (low risk), Class II (moderate or higher risk than Class I), or Class III (high risk) type category.15 510(k) clearance is often required for Class II devices/apps and does not require clinical trials. Clearance requires that the device/app have either (1) the same intended use AND the same technological characteristics as the predicate device, OR (2) the same intended use AND different technology that is safe and effective. The FDA 510(k) gives marketing clearance (but not “approval”) to such Class I or II devices. Premarket approval (PMA), typically used for Class III devices and more in-depth than 510(k), requires demonstration that the device is safe and effective. The PMA process generally requires human clinical trials supported with lab testing.16 FDA approval requires successful PMA or a specific exemption before a device can be marketed. FDA Granted is a new term indicating that the device/app uses the De Novo pathway before it can be marketed. The De Novo pathway is offered for Class I or II devices with low to moderate safety risk when there is no similar predicate device.17

General Wellness devices/apps do not require FDA 510(k) clearance or PMA approval, and they may or may not be FDA registered or require enforcement discretion.18 These Wellness and Sports devices/apps: (1) Have low safety risks to the user or others, (2) Are intended for wellness purposes, (3) Do not directly purport to diagnose or treat a disease or medical condition, and (4) Do not require FDA clearance or approval. Of special note, some popular health/sleep watches may be wellness type or may have FDA 510(k) clearance as a Class I or II medical device. The FDA also has provided guidance on mobile medical apps (MMA).19 The FDA does not have a policy for the storage or platforms for these apps. However, some mobile apps may fall under the FDA definition of software as a medical device (SaMD). The FDA provides guidance for SaMD, which is defined as software intended for medical purposes that performs those purposes without being part of a hardware medical device.20 Further, through the FDA Digital Health Center of Excellence, the FDA is providing guidance and advanced digital pathways for other SaMD type apps such as the International Medical Device Regulators Forum risk categories for clinical decision support (CDS) software, the FDA patient-generated health data throughout the total product life cycle (TPLC) device/app approach, and the AI/ML as SaMD approach.21 Like the other FDA designations, these device/app pathways are guided by safety risk, intended use, and indications for use. Devices@FDA provides one place to find official information about FDA 510(k) cleared and PMA approved medical devices/apps. This includes summaries of currently marketed medical devices/apps.


The use of multiple or nonspecific definitions for sleep-related terms may be encountered when reviewing sleep devices/apps. For example, sleep quality, sleep scores, light and deep sleep, and other terms may have variable definitions across devices/apps. Ideally, terminology should be specifically defined and reproducible across comparable consumer or clinical output metrics. Some commonly used sleep device/app and technology terms are listed in Table 1.

In addition, some platforms claim to integrate multiple consumer and/or clinical data outputs.22,23 While integrated platforms may allow easier data and report access, terms across devices/apps may carry different meanings, which may cause confusion particularly if using an integrated platform with multiple data sources and sleep term definitions. For example, definitions, derivations, and outputs of “light sleep” may vary across devices/apps.

Additionally, integration of consumer or clinical device/app data directly into the patient’s electronic health record (EHR), a legal medical document, also raises questions about using consistent sleep terms, device/app data validation, who reviews and owns these vast newly available datasets, and accepted standards for uses of such data within the EHR. Health care disciplines have come to expect standardized definitions when products are used for research or clinical purposes.24,25 More relevant to the sleep field, the Consumer Technology Association (CTA) and the National Sleep Foundation (NSF) have produced three white papers specific to defining device/app terminologies for researchers, clinicians, industry, and consumers. Developed jointly by CTA and the NSF, Definitions and Characteristics for Wearable Sleep Monitors (2016) and Methodology of Measurements for Features in Sleep Tracking Consumer Technology Devices and Applications (2017) provide a foundation for defining sleep terms for use across different sleep apps and devices.26,27 These groups discuss standardizing sleep metrics in terms of events, processes, and patterns in Performance Criteria and Testing Protocols for Features in Sleep Tracking Consumer Technology Devices and Applications (2019).28 This working group is an example of how clinicians, researchers, developers, and industry can benefit from collaboration.29


Validated and/or FDA cleared/approved devices and apps may not necessarily be validated for all age ranges, or for populations with sleep disorders and/or medical comorbidities, or for patients taking certain medications or having implantable devices. Sleep “best practices,” accepted standards of care, quality measures, and clinical guidelines include indications and contraindications for testing of specified populations.30–39 For example, home sleep apnea testing (HSAT) is generally indicated for patients who have been prescreened for uncomplicated obstructive sleep apnea (OSA) testing (such as patients without significant cardiopulmonary comorbidities). A sleep device/app should indicate if its claimed use is for a particular consumer or patient population, as well as cite any exclusionary populations. For technologies that will be used to diagnose sleep disorders, direct treatment decisions, or drive personal or population health, clear definitions of the claimed uses for specified populations are indicated. Population demographics of the accessible datasets should accompany these devices/platforms to inform generalizability. These minimum, general requirements are much like the FDA’s digital health action and innovation plans.40,41


Considerations for the integrity of patient generated health data (PGHD) include who may access device/app data, privacy, ownership, storage, security, raw data review, and API accessibility or appropriate platform integration (such as with an EHR or with other entities). Raw data review and interpretation may be meaningful features to some sleep providers, as in the case of reviewing the raw data of apnea testing devices. Who may view, own, or share the data may be other important features to users. Approximately 45% of smartphone devices have health or fitness apps,42 and the security of PGHD and personal health information has received recent attention.43 Health care entities appear to be a favored target for hackers.44 In a 2020 security report on global mHealth apps, a leading app security firm reported that 71% of health care and medical apps had a vulnerability, and 91% had weak encryption.45,46 APIs are reported to be highly vulnerable to hacking, including EHR access.47 Of current relevance, the 21st Century Cures Act has directed that patients are able to access their EHR data as of April 2021, often using APIs.48 Also relevant, the Office of the National Coordinator for Health Information Technology (ONC) supports using standardized APIs such as the Fast Healthcare Interoperability Resources (FHIR) and endorses eradicating information blocking.49,50 The American Medical Association has provided guidelines about protecting PGHD.51 Further, the bidirectional information flow from and to a medical device is a consideration. While there are no current reports of hacking sleep medical devices, the FDA has warned about the ability to hack medical devices such as pacemakers, insulin pumps, and other devices.52–54
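To make the idea of a standardized API payload more concrete, the following is a minimal sketch of how one night of device-reported sleep might be shaped as a FHIR-style Observation before it is exchanged with an EHR. The field names follow the general layout of the FHIR Observation resource, but the helper name `sleep_observation`, the patient reference, the device label, and all values are hypothetical, and no terminology code is supplied because none is given in this paper. It is an illustration only, not a validated or complete FHIR implementation.

```python
import json
from datetime import datetime


def sleep_observation(patient_ref, start_iso, end_iso, device_label):
    """Build a FHIR-style Observation dict for one device-reported sleep period.

    All arguments are illustrative; nothing here talks to a real EHR.
    """
    start = datetime.fromisoformat(start_iso)
    end = datetime.fromisoformat(end_iso)
    if end <= start:
        raise ValueError("sleep period must end after it starts")
    hours = round((end - start).total_seconds() / 3600, 2)
    return {
        "resourceType": "Observation",
        "status": "final",
        # Free-text label only: a terminology code is deliberately not guessed.
        "code": {"text": "Device-reported sleep duration"},
        "subject": {"reference": patient_ref},
        "effectivePeriod": {"start": start_iso, "end": end_iso},
        "valueQuantity": {
            "value": hours,
            "unit": "h",
            "system": "http://unitsofmeasure.org",
            "code": "h",
        },
        # Recording the source device and algorithm version supports later review.
        "device": {"display": device_label},
        "note": [{"text": "Patient generated; not clinically validated."}],
    }


# Hypothetical example values.
obs = sleep_observation(
    "Patient/example-123",
    "2021-04-05T23:00:00+00:00",
    "2021-04-06T05:30:00+00:00",
    "Hypothetical wrist tracker, algorithm v1.0",
)
print(json.dumps(obs, indent=2))
```

Keeping the device label and a provenance note inside the record is one way to preserve the distinction between consumer-generated and clinically validated data once the value sits in the EHR.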


Wearable sleep sensors may collect data using finger probes, nasal/oral/mask sources, rings, watches/wrist bands, torso bands, skin patches, eyewear, forehead or headbands or caps, smart garments, shoe inserts, leg bands, or other worn devices.55,56 Nearables often are located on a bedside stand or under the mattress. Types of nearable sleep tracking include ballistocardiography vibration (for respiratory and heart rate, stroke volume), sound, light, temperature, humidity, and movement sensors.57–60 Like wearables, nearables often report user snoring, sleep times, staging, and “quality.”61

Sleep clinicians and researchers are familiar with triaxial accelerometers for actigraphy or polysomnographic (PSG) sensors such as EEG, EOG, ECG, EMG, nasal/oral airflow sensors, torso belts, pulse oximeter, microphone, and camera. Newer consumer and clinical sleep technologies may utilize combinations of these sensors, other sensors, or novel applications of sensor data that may increase the accuracy or performance of the output data such as sleep staging.62,63 Examples of other sensors include skin temperature, radar/radiofrequency, sound, environmental sensors, ingestibles, and others.63–65

Further, using AI/ML/DL, some sensor data has been transformed and extended for other uses and applications. For example, like a pulse oximeter, a photoplethysmography (PPG) device uses a light source (through vascular tissue with pulsatile blood-volume flow) and an oppositely positioned photodetector to measure changes in light intensity. The light source wavelength(s) and its application type and location are often product specific. The PPG waveform consists of an AC component (pulsatile wave) superimposed on a DC component (steady, with slow changes with respiration). Because raw PPG waveforms are sensitive to motion and other artifact, they are amplified, filtered, and processed to derive outputs such as heart rate variability (HRV) or peripheral arterial tone (PAT).66,67 Using datasets and these signals, AI/ML/DL algorithms are used to provide users with familiar data such as apnea-hypopnea index (AHI) or sleep stages.68–76 Note that pulse oximeters may be affected by position on the hemoglobin-oxygen desaturation curve, poor pulsatile flow, hypoxia, motion, skin tone, and other settings.77–79 Similarly, products utilizing PPG may have variable accuracy in different conditions and populations. Additionally, a product algorithm is specific for a sensor at a specified location, type of light source, and its type of application and data acquisition. For example, data derived from one fingertip sensor brand cannot be generalized to another device’s fingertip or wrist application.
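As a rough illustration of the processing chain described above, the sketch below builds a synthetic PPG-like signal, separates a slowly varying baseline (DC) from the pulsatile part (AC) with a moving average, detects pulse peaks, and derives a mean heart rate and one common HRV summary (RMSSD). Everything here is an assumption made for illustration: the sampling rate, the 2-second baseline window, the peak rule, and the synthetic amplitudes are hypothetical and do not represent any product's algorithm, which would typically be proprietary and considerably more robust to artifact.

```python
import math


def moving_average(x, window):
    """Centered moving average; the window shrinks near the edges."""
    half = window // 2
    out = []
    for i in range(len(x)):
        lo, hi = max(0, i - half), min(len(x), i + half + 1)
        out.append(sum(x[lo:hi]) / (hi - lo))
    return out


def split_ac_dc(ppg, fs, baseline_s=2.0):
    """Estimate the DC baseline and return (ac, dc), where ac = ppg - dc."""
    dc = moving_average(ppg, max(1, int(baseline_s * fs)))
    ac = [p - d for p, d in zip(ppg, dc)]
    return ac, dc


def pulse_peaks(ac, fs, min_gap_s=0.4):
    """Indices of positive local maxima at least min_gap_s apart."""
    min_gap = int(min_gap_s * fs)
    peaks = []
    for i in range(1, len(ac) - 1):
        is_max = ac[i] > 0 and ac[i] > ac[i - 1] and ac[i] >= ac[i + 1]
        if is_max and (not peaks or i - peaks[-1] >= min_gap):
            peaks.append(i)
    return peaks


def hr_and_rmssd(peaks, fs):
    """Mean heart rate (beats/min) and RMSSD (ms) from peak indices."""
    ibi_ms = [(b - a) * 1000.0 / fs for a, b in zip(peaks, peaks[1:])]
    mean_hr = 60000.0 / (sum(ibi_ms) / len(ibi_ms))
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    return mean_hr, rmssd


# Synthetic 30 s signal: baseline + slow respiratory drift + 1 Hz pulse.
fs = 50  # samples per second (assumed)
t = [n / fs for n in range(30 * fs)]
ppg = [
    1.0
    + 0.02 * math.sin(2 * math.pi * 0.25 * s)  # respiration-like drift
    + 0.05 * math.sin(2 * math.pi * 1.0 * s)   # pulsatile component
    for s in t
]

ac, dc = split_ac_dc(ppg, fs)
peaks = pulse_peaks(ac, fs)
mean_hr, rmssd = hr_and_rmssd(peaks, fs)
print(f"beats found: {len(peaks)}, mean HR: {mean_hr:.1f} bpm, RMSSD: {rmssd:.1f} ms")
```

Because the synthetic pulse is perfectly regular, any nonzero RMSSD in this toy run comes from the sampling rate and the filtering choices rather than from physiology. This is one small way to see why outputs derived from one sensor and processing pipeline may not generalize to another.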

Many newer clinical, hybrid, and evolving transitional sleep technologies incorporate triaxial accelerometer and PPG sensors and utilize proprietary AI/ML/DL algorithms to report sleep-related data.62 However, each device generally utilizes different sensor designs, data collection and analysis methods, and AI/ML/DL data interpretation. Thus, sensor collection, location, analysis, and outputs such as PPG may differ across devices. As such, assessing validation or performance of claimed uses may be proprietary, product specific, and not generalizable across other devices. Proprietary sensors and associated algorithmic outputs are device/app specific.


Sensors allow the collection of vast amounts of physiologic and environmental data, and AI-based analytics are well suited to present digestible data outputs and summaries to patients, providers, and researchers. To display user-friendly data output, sleep device/app software may use AI/ML/DL to incorporate consumer or patient-entered data and data from one or more sensors.80–82 Based on ongoing user entered and/or physiological data collection, some AI/ML/DL software may advise or make patient-centered, focused recommendations to a consumer or to a patient. For example, CPAP or insomnia software may provide coaching based on collected data. Using learning datasets, AI/ML/DL algorithms learn from data to improve performance on specific tasks (for example, automated sleep staging). Such computerized algorithms are often proprietary, not easily summarized, and may be referred to as a “black box.”83 However, even while remaining proprietary, disclosure of certain aspects of algorithm development and testing can improve transparency and, therefore, increase confidence in the clinical use of such tools.84,85

When AI/ML/DL analysis is applied to a new technology to track sleep, information regarding both the training and testing dataset should be disclosed including the size, demographics, and clinical features of participants from which the data is derived. Examples of shared sleep datasets are available.86
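One way to picture such a disclosure is as a small structured record that can be checked for gaps. The sketch below is hypothetical: the class name `DatasetDisclosure`, its fields, and the example values are assumptions chosen to mirror the items named above (size, demographics, clinical features), and are not a published reporting standard.

```python
from dataclasses import dataclass, fields
from typing import Dict, List, Optional, Tuple


@dataclass
class DatasetDisclosure:
    """Minimal, illustrative summary of a training or testing dataset."""
    role: str                                      # "training" or "testing"
    n_participants: Optional[int] = None           # dataset size
    age_range_years: Optional[Tuple[int, int]] = None
    sex_counts: Optional[Dict[str, int]] = None
    clinical_features: Optional[List[str]] = None  # e.g. diagnoses, exclusions


def undisclosed(d: DatasetDisclosure) -> List[str]:
    """Names of items that were left unreported."""
    return [f.name for f in fields(d) if getattr(d, f.name) is None]


# Invented example values for illustration only.
training = DatasetDisclosure(
    role="training",
    n_participants=500,
    age_range_years=(18, 65),
    sex_counts={"female": 260, "male": 240},
    clinical_features=["no known sleep disorder"],
)
testing = DatasetDisclosure(role="testing", n_participants=80)

for d in (training, testing):
    print(d.role, "missing:", undisclosed(d))
```

In this toy run the testing record reports only its size, so the check lists its age range, sex counts, and clinical features as missing.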

As the first clinical application of AI/ML/DL in sleep medicine is likely to be the automated scoring of sleep and associated events, a certification program could provide a framework to guide sleep laboratories in the use of ML-based scoring software. In addition to disclosure of minimum characteristics of training and testing sets and reporting of performance statistics on a novel, independent testing set, such a program will also require manufacturers to assist sleep laboratories with demonstration of local performance in their own facility.


Providers and researchers commonly expect clinical (diagnostic or treatment) validation studies to be peer-reviewed, gold standard comparisons, randomized clinical trials, and/or outcomes-based or population health-based strategies. Clinicians generally use the requirement of a prescription or some type of FDA verbiage to guide comfort in using a product. This noted, users may encounter difficulties when searching for support or validation of product marketing statements about claimed uses.87 Additionally, sleep stage validation for consumer or clinical sleep devices against gold standards may be an indirect comparison of different sensors and interpretive derivations. For example, the former may use proprietary AI/ML/DL processed heart rate signals compared against polysomnographic sensor data using EEG, EOG, EMG.25

Determining the accuracy or performance of sleep device/app claimed uses can be challenging for a variety of reasons:

  • A descriptive name for a sensor may vary in accuracy across specific devices. For example, the accuracy of pulse oximeters may vary across products.

  • Sensor or device position and/or environment may affect data collection.

  • Validation for the claimed uses may not specify inclusion/exclusion for the intended population use (such as for certain age ranges, healthy vs users with sleep or medical comorbidities, users with pacemakers or taking certain medications, other conditions).

  • Cited articles may be exclusively performed and/or funded by the product company and/or inventor.

  • Cited articles may refer to general sleep principles but lack specific validation for that particular product claim. For example, the referencing of general light wavelength and circadian studies does not verify that a particular light-related device/app has authenticated the claimed use for that specific light-related device/app.

  • Cited validation articles may be for one sensor, but not for the integration with other sensors or the associated proprietary AI/ML/DL algorithms for that product claim. Validation or FDA clearance of one sensor does not necessarily ensure fidelity of the entire technology, which requires sound methods of data collection for all incorporated sensors as well as data transmission and analysis methods. Some devices/apps utilizing proprietary “black box” algorithms also may incorporate patient-reported data and demographics with data from multiple sensors (such as photoplethysmography, accelerometer, electrocardiogram, pulse oximetry).

  • Software and/or proprietary AI/ML/DL algorithms may be adjusted and/or updated, with or without specific notice to users.25 Additionally, even when validation studies are available, the testing may have been performed on an outdated software version or dataset.

For developers who design products intended for medical and research purposes, traditional “gold standard” comparison validation studies may not be applicable or may be difficult to achieve in a brisk marketplace. This noted, transparent validation efforts and guidelines for choosing sleep technologies clinically or for research are needed.83 For example, validation may not have to be limited to traditional “gold standard” comparisons, but rather could include clinical certification based on comparative outcome studies, AI/ML/DL validation using shared datasets, and/or other novel validation approaches. Menghini et al recently described a standardized framework for testing the performance of sleep trackers.88
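For readers less familiar with what a “gold standard” comparison involves in practice, the sketch below computes epoch-by-epoch agreement between reference (PSG-scored) and device-reported sleep/wake labels: accuracy, sensitivity and specificity for sleep, and Cohen's kappa. It is a generic illustration under stated assumptions, not the framework cited above. The function names and the ten toy epochs are invented, and real comparisons would also need to handle epoch alignment, multiple stages, and per-participant summaries.

```python
from collections import Counter


def epoch_by_epoch(reference, device, positive="sleep"):
    """Accuracy, sensitivity, and specificity for detecting `positive` epochs."""
    if len(reference) != len(device) or not reference:
        raise ValueError("need two equal-length, non-empty epoch sequences")
    pairs = list(zip(reference, device))
    tp = sum(r == positive and d == positive for r, d in pairs)
    tn = sum(r != positive and d != positive for r, d in pairs)
    fn = sum(r == positive and d != positive for r, d in pairs)
    fp = sum(r != positive and d == positive for r, d in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    }


def cohens_kappa(reference, device):
    """Chance-corrected agreement between two label sequences."""
    n = len(reference)
    observed = sum(r == d for r, d in zip(reference, device)) / n
    ref_n, dev_n = Counter(reference), Counter(device)
    expected = sum(ref_n[k] * dev_n[k] for k in ref_n) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when chance agreement is total
    return (observed - expected) / (1.0 - expected)


# Ten invented 30-second epochs: reference scoring vs. device output.
psg = ["wake", "wake", "sleep", "sleep", "sleep",
       "sleep", "sleep", "sleep", "wake", "sleep"]
dev = ["wake", "sleep", "sleep", "sleep", "sleep",
       "sleep", "sleep", "sleep", "sleep", "sleep"]

stats = epoch_by_epoch(psg, dev)
kappa = cohens_kappa(psg, dev)
print(stats, round(kappa, 3))
```

In this toy example the device agrees with the reference on 8 of 10 epochs (accuracy 0.80) and detects every sleep epoch (sensitivity 1.00), yet identifies only 1 of 3 wake epochs (specificity about 0.33), giving a kappa of about 0.41. A single headline accuracy figure can therefore hide the kind of detail a clinician may want to see in a validation report.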


Like all fields of medicine, sleep medicine is in an era of rapid technologic disruption that has been further accelerated by recent care delivery changes, telehealth expansion, and remote data monitoring accompanying the COVID-19 pandemic. Also catalyzed by the pandemic, consumers have sought devices/apps that allow sleep self-help measures and data tracking. The influx of new sensors and mobile applications, new modes of remote signal collection, and new methods of data management are generating massive information caches. Analytics with AI/ML/DL could realize the promise of personalized medicine, precision medicine, and population health.

There are a broad range of consumer, hybrid and transitional, and clinical sleep technologies, and providers look to deliver a consistent message about this range of sleep technologies to patients. Consumer sleep technologies encourage patients to think about their sleep and its impact on their health. Such discussions open conversations about the patients’ sleep concerns, questions, or symptoms.

While PSG is considered the gold standard for defining sleep stage metrics and diagnosing many sleep disorders, PSG is limited because its data are collected outside of the home and provide only a snapshot in time. Home sleep apnea tests also provide snapshot data, do not typically include arousals, and may have variable accuracy of sleep metrics across devices. Actigraphy is also limited: data acquisition is expensive and time-consuming, and data are generally collected for only 2 weeks. Reliable consumer sleep technology could provide popular, inexpensive, 24/7 sleep/wake data collection over long periods of time.89 New pathways for diagnostic testing, clinical treatments, and/or chronic management could emerge from such long-term data collection and analysis.

Sleep providers are familiar with the improvement in CPAP compliance with remote data monitoring and patient engagement software.90–92 Providers look forward to adding enhanced remote testing, treatment, and both consumer and clinical data monitoring sleep technology tools to provide real-time and improved between-visit care, personalized care, interactive data alerts, and novel care guidelines based on personalized or population health big data analytics. AI/ML/DL analytics of ongoing patient reported data and physiologic data over time offer new opportunities to monitor individual and group patient symptoms and physiology in the home continuously, indefinitely, and in real time. Changes from baseline data trends could prove invaluable in predicting individual and group disease onset and/or exacerbations.9,10,93,94

However, before using sleep devices/apps clinically, providers seek to gain comfort in understanding how to assess the accuracy, performance, and intended uses of the product marketing claims. As more devices/apps utilize proprietary AI/ML/DL algorithms, user confidence is further challenged. Wu et al studied the 130 medical AI devices cleared by the FDA between 2015 and 2020 and found the FDA AI process less rigorous than the FDA pharmaceutical process.95 As an example, the authors used a chest X-ray detection of pneumothorax algorithm and found that it worked well for the original site cohort but was 10% less accurate for two different sites and less accurate for Black patients. They note that many datasets are retrospective, come from only one or a few sites, or do not include all representative populations in clinical settings. Interested clinicians can search the FDA web database for devices/apps that have obtained 510(k) clearance or premarket approval. However, similar FDA web searches specific to AI/ML/DL are not easy to perform, and Benjamens et al have created an open access database of strictly AI/ML-based medical technologies that have been approved by the FDA.96

As described in this paper (and Table 2), a practical checklist for clinicians when evaluating sleep product claimed uses includes: awareness of FDA terms, familiarity with product sleep term definitions, use with particular populations, data integrity considerations, recognizing sensor types and applications, awareness of proprietary AI/ML/DL algorithms, and clarification of validation methods used for the product claimed uses. At present, evaluating product claimed uses requires time-consuming verification and/or familiarity for each unique device/app. Until there are consistent performance standards for these elements across devices/apps, providers may continue to feel unsure about clinical use of the many and diverse sleep technologies. Recent guides have been proposed for developers to use to document the performance of product claimed uses, and consumers, clinicians, and researchers look forward to this consistency and transparency. To provide user confidence, increased communications and collaboration between industry, government, insurers, clinicians, researchers, and consumers could help to create a standard framework for reporting of product performance.

Table 2 A guide to evaluating sleep technologies.

  • Awareness of FDA terms

  • Defining sleep term definitions across devices/apps

  • Defining populations

  • Data integrity

  • Applications of new sensors, new sensor applications, or other novel technologies

  • Awareness of proprietary AI/ML/DL algorithms

  • Defining validation methods for claimed capabilities


The authors of this paper compose the 2019–2020 AASM Clinical and Consumer Sleep Technology Committee and the 2020–2021 AASM Technology Innovation Committee. Dr. Kirsch is a past president of the AASM and served on the board of directors during the writing of this paper. Dr. Ramar is the 2020–2021 President of the AASM. Dr. Deak is employed by eviCore Healthcare. eviCore Healthcare played no role in the development of this work. Dr. Goldstein is 5% inventor of a circadian estimation mobile application that is licensed to Arcascope, LLC; sleep estimation capabilities of the application are associated with open source code. Dr. Chiang received research grants from Belun Technology Company (Hong Kong) for conducting validation studies at University Hospitals Cleveland Medical Center. Dr. O’Hearn reports serving as an advisor/consultant to 4D Medical, Inc and Inogen, Inc. These relationships do not represent a conflict of interest with this work. The other authors report no conflicts of interest.