Abstract
Increasing evidence supports reduced accuracy of noninvasive assessment tools, such as pulse oximetry, temperature probes, and AI skin diagnosis benchmarks, in patients with darker skin tones. The FDA is exploring potential strategies for device regulation to improve performance across diverse skin tones by including skin tone criteria. However, there is no consensus about how prospective studies should perform skin tone assessment in order to take this bias into account. There are several tools available to conduct skin tone assessments including administered visual scales (e.g., Fitzpatrick Skin Type, Pantone, Monk Skin Tone) and color measurement tools (e.g., reflectance colorimeters, reflectance spectrophotometers, cameras), although none are consistently used or validated across multiple medical domains. Accurate and consistent skin tone measurement depends on many factors including standardized environments, lighting, body parts assessed, patient conditions, and choice of skin tone assessment tool(s). As race and ethnicity are inadequate proxies for skin tone, these considerations can be helpful in standardizing the effect of skin tone on studies such as AI dermatology diagnoses, pulse oximetry, and temporal thermometers. Skin tone bias in medical devices is likely due to systemic factors that lead to inadequate validation across diverse skin tones. There is an opportunity for researchers to use skin tone assessment methods with standardized considerations in prospective studies of noninvasive tools that may be affected by skin tone. We propose considerations that researchers must take in order to improve device robustness to skin tone bias.
Introduction
Technological advances have made non-invasive medical devices (e.g., pulse oximetry, heart rate monitors, artificial intelligence-based diagnostics) irreplaceable aspects of modern patient care. However, evidence shows many of these devices are susceptible to skin tone bias, which can worsen disparities in outcomes1. Modalities relying on transcutaneous measurements may produce bias due to skin tone variation and device validation on non-diverse populations, limiting device performance and overall generalizability1,2.
Recent evidence shows racial bias in pulse oximetry. A retrospective analysis revealed a nearly threefold higher frequency of undetected hypoxemia in Black patients (17%; 95% CI 12.2–23.3) than in white patients (6.2%; 95% CI 5.4–7.1)3. Although results are mixed4,5, several studies show overestimation of arterial oxygen saturation by 0.17–10% in darkly pigmented subjects, especially at lower SpO2 values6,7,8. Furthermore, darker skin tone is associated with a larger bias of undetected hypoxemia9.
Artificial intelligence (AI) image classification is an emerging non-invasive tool that aims to improve diagnostic accuracy in medicine10, including classification of skin lesions10,11,12. However, a growing literature recognizes bias in the performance of these models, which perform worse on individuals with darker skin tones13. Daneshjou et al. showed that three state-of-the-art algorithms performed worse on darker skin types (FST V, VI) than on lighter skin types (FST I, II) (ModelDerm: 0.55 vs 0.64; DeepDerm: 0.50 vs 0.61; HAM10000: 0.57 vs 0.72)14. Groh et al. further showed that models are most accurate for the skin types they were trained on, although some studies report no differences in model performance by skin tone15. The effect of skin tone on model performance is largely underreported: only 10% (7/70) of deep learning algorithms include information about skin tone16 and few report performance by skin tone categories17. Further, there is no gold standard for skin tone labeling, and commonly used practices like estimated Fitzpatrick Skin Type are limited by uncertainty18 and lack of inclusiveness. (ref. 11; ref. 12; ref. 10; ref. 19)
To improve generalizability of assessments and algorithm fairness, it is critical that patient skin tone variation be taken into account in validation studies of noninvasive technologies. There are initiatives to modify device monitoring regulation criteria, such as those released by the FDA in November 202320. In this review, we (1) briefly review the background and methods for skin tone measurement within health care and (2) provide detailed study considerations for measuring skin tone in prospective trials.
Results
Part I. Review of skin tone assessment
Defining skin tone
Color is the perception of light-based characterizations, such as hue, lightness, and saturation. Physiologically, the inherent color of the skin, defined as “skin tone”, is the result of light-absorbing compounds called chromophores. The most abundant chromophores in humans are melanin (pheomelanin and eumelanin), carotene, oxygenated hemoglobin, and reduced hemoglobin21. In general, the two major contributors to skin tone are melanin, which produces a brown tint, and hemoglobin, which creates red and purple-blue hues22. Common methods for discriminating skin tone for the purpose of validating noninvasive assessment tools are administered visual scales and color measurement tools23 (Table 1). Skin tone can also be extracted from camera images through a variety of techniques24. For the purposes of this paper, automated skin color extraction, modeling, and labeling will not be discussed23,25,26.
Administered visual scales
Administered visual scales, such as Von Luschan, Monk, and Pantone, utilize numbered colored tiles that are matched to a person’s skin tone (Table 1). Fitzpatrick Skin Type (FST) was originally developed to assess tanning and burning propensity; however, many use FST as a proxy for skin tone27 despite evidence that FST correlates poorly with objective measurements of skin color27,28. Although widely available and inexpensive to administer, visual scales can be limited by subjectivity. Furthermore, visual scales can be affected by complex human perception of color, which is influenced by light, anatomic site, the context of the object, or a person’s unique experiences with similar objects29,30,31,32.
Color measurement tools
Color measurements are objective measurements achieved through reflectance spectrophotometry (Konica Minolta CM700D, Variable Spectro) and colorimetry (Delfin SkinColorCatch) (Table 1). In current work, color measurement tools are used in response to the limitations of visual scales, providing objective measurements that increase precision in color quantification33,34,35. Color measurements offer greater color precision, but the tools are expensive and sensitive to environmental influences25,26.
Cameras and color spaces
Several differing color models provide a framework to systematically or mathematically describe color output. One of the most common, the RGB model (red, green, blue), was developed to mimic the primary colors perceived by the eye32. It encodes color in an additive fashion where a combination of all three colors results in white. Other color models include HSL (hue, saturation, lightness), CIELAB (lightness, green-red gradient, blue-yellow gradient), and CIECAM02 (brightness, lightness, chroma, saturation, hue, and “colorfulness”)32,36. Most reproduction of color on printed work is exported in a CMYK (cyan, magenta, yellow, black) color space. Color spaces are a specific organization of colors that are mapped to values in color models in a standardized way. The standardized RGB color space (sRGB) is the most commonly used space for representing digital images on displays26,37.
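The relationship between a color space and a color model can be made concrete with a conversion. The following is a minimal sketch, assuming 8-bit sRGB-encoded input and a D65 reference white, of the standard sRGB to CIELAB transformation; the function name `srgb_to_lab` is our own illustrative choice, and the sketch is not a substitute for a validated color-management pipeline.

```python
def srgb_to_lab(r, g, b):
    """Convert one 8-bit sRGB color to CIELAB (L*, a*, b*).

    Assumptions: input is sRGB-encoded (0-255 per channel) and the
    reference white is D65 with the 2-degree standard observer.
    """

    def linearize(channel):
        # Undo the sRGB transfer function to obtain linear light.
        c = channel / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    rl, gl, bl = linearize(r), linearize(g), linearize(b)

    # Linear sRGB -> CIE XYZ (D65), standard matrix coefficients.
    x = 0.4124564 * rl + 0.3575761 * gl + 0.1804375 * bl
    y = 0.2126729 * rl + 0.7151522 * gl + 0.0721750 * bl
    z = 0.0193339 * rl + 0.1191920 * gl + 0.9503041 * bl

    # D65 reference white.
    xn, yn, zn = 0.95047, 1.0, 1.08883

    def f(t):
        # CIELAB companding: cube root with a linear segment near zero.
        delta = 6.0 / 29.0
        if t > delta ** 3:
            return t ** (1.0 / 3.0)
        return t / (3.0 * delta ** 2) + 4.0 / 29.0

    fx, fy, fz = f(x / xn), f(y / yn), f(z / zn)
    lightness = 116.0 * fy - 16.0   # L*: 0 (black) to 100 (white)
    a_star = 500.0 * (fx - fy)      # a*: green (-) to red (+)
    b_star = 200.0 * (fy - fz)      # b*: blue (-) to yellow (+)
    return lightness, a_star, b_star
```

Under these assumptions, pure white maps to approximately L* = 100 with a* and b* near zero, and pure black maps to L* = 0. Images that are not sRGB-encoded (for example, RAW captures or images tagged with a different profile) would need a different first step.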
Part II. Considerations for study design
Body part assessed
A summary of recommendations for skin tone measurement can be found in Table 2. Unaltered skin tone represents a combination of genetic factors and environmental influence based on constitutive (baseline skin color) and facultative (skin color altered by sun exposure) grouping. Constitutive skin color is best characterized in sun-protected areas more likely to represent unaltered baseline pigmentation38. However, one’s perceived skin tone may also be influenced by exogenous factors including artificial tanner, makeup, or tattoo pigmentation. Depending on the technology being validated, the inclusion of at least one constitutive skin site may be important given its decreased variability across seasons and increased correlation with skin phototype39. The upper volar arm has been proposed as a reliable measurement of constitutive skin given its low seasonal variability and ease in access40. Otherwise, body part selection may be predetermined based on the application (e.g., using the finger/earlobe in pulse oximetry).
Underlying conditions that can affect skin tone
Several conditions can affect the relative concentration and distribution of chromophores and alter skin tone assessment. Therefore, study designs incorporating skin tone measurement should consider pigmentary disorders (e.g., vitiligo or melasma) and medical conditions (e.g., anemia and jaundice) that influence skin tone. Perfusion-related changes in skin (e.g., flushing, blanching) can also affect skin tone assessment34. To minimize these effects, it is recommended that skin measurements occur in a pressure-free state and at rest, and to collect as much information about factors that influence skin tone at the time of measurement as feasible.
Ambient lighting
The impact of lighting on the perception of color is critical and commonly overlooked in study design. Ambient lighting varies in properties such as brightness and color temperature. It can influence color perception depending on time of day and location, and may skew skin tone perception, making skin appear lighter or darker41. Ambient lighting should be both sufficient and standardized to increase the accuracy and precision of skin tone assessment. To prevent variability in daylight conditions, artificial lighting with a color temperature similar to natural light (5000–6500 K) could be helpful42. A controlled illumination source, combined with an ambient light-blocking feature, can significantly enhance light isolation and improve the signal-to-noise ratio43.
Location
Considerations for skin tone assessment depend in part on the location of the patient population under study.
For example, an outpatient clinic may be a single location where ambient lighting and temperature may be more easily standardized. Patients are often mobile, making it more feasible to incorporate skin tone measurements on less accessible sun-protected body parts (e.g., lower back). This may be difficult when collecting remote photos from a patient’s home, where lighting may not be standardized and the number of body parts available for measurement may be limited. Additionally, longitudinal study design may need to account for patients’ variable sun exposure. Previous studies have addressed this by including non-sun-exposed body parts, advising participants to avoid sun exposure, and/or requiring daily sunscreen use44.
In contrast, an inpatient population complicates efforts to achieve a fixed environment for data collection and synchronization of measurements. Consistency can potentially be improved by understanding the workflow of standard care and adjusting each patient’s room to approximate a standardized environment. Although dependent on study design, study procedures may need to take into consideration patient health status, iatrogenic complications, patient mobility, and other external factors that could hinder the timing essential for completing measurements. For instance, in the context of pulse oximetry, short timepoints for data collection may be needed to minimize potential discrepancies between arterial blood gas (ABG)-pulse oximetry measurements and skin tone readings45.
Dataset balance—skin tone and race
Clinical research of all kinds must incorporate racial and ethnic diversity to ensure results are generalizable, especially when evaluating devices for clinical use46,47. Underrepresentation of minority groups may lead to a higher risk of adverse reactions or reduced efficacy. In fact, the NIH has an issued policy and guideline requiring all phase III clinical trials to ensure analysis by sex/gender, race, and/or ethnicity48. However, multiple studies have demonstrated large variations in skin tone within racial and ethnic subgroups49, and skin tone may directly influence bias of technologies beyond race. This highlights the potential need to balance datasets specifically by skin tone in addition to race. Ensuring dataset balance by skin tone may pose several challenges. Since skin tone varies within a person and across time, one may consider balancing only by constitutive body site or averaging skin tone from multiple body sites. Both skin tone and racial/ethnic dataset variation will enhance the generalizability of research results. However, investigators may consider balancing by either skin tone or race with a minimum threshold of other parameters based on their research question. Initial explorations by the Food and Drug Administration are soliciting input on methods to integrate skin tone measures into clinical device studies, but a standard has not been established50.
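One way to operationalize a minimum-threshold approach to skin tone balance is sketched below. This is an illustration only: the function name `skin_tone_balance`, the use of a categorical scale, and the example threshold are assumptions, since no regulatory or consensus threshold has been established.

```python
from collections import Counter


def skin_tone_balance(categories, all_levels, min_fraction):
    """Summarize enrollment per skin tone category.

    categories   : one categorical skin tone label per participant
                   (e.g., a visual scale level at a constitutive site)
    all_levels   : every level of the scale, so empty levels are reported
    min_fraction : hypothetical minimum share of the cohort per level
    """
    counts = Counter(categories)
    total = len(categories)
    report = {}
    for level in all_levels:
        n = counts.get(level, 0)
        fraction = n / total if total else 0.0
        report[level] = {
            "n": n,
            "fraction": fraction,
            "meets_minimum": fraction >= min_fraction,
        }
    return report


# Example with made-up labels on a 10-level scale and a 10% threshold.
example = skin_tone_balance([1, 1, 2, 3, 3, 3, 10], range(1, 11), 0.10)
underfilled = [level for level, row in example.items() if not row["meets_minimum"]]
```

In this made-up example, `underfilled` lists the levels with too few participants, which could guide targeted recruitment. Whether balance should be assessed per level, per grouped range, or on a continuous measurement is a design decision that depends on the research question.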
Considerations for administered visual scales
Visual scales are low-cost tools that can distinguish skin tones with relatively high reliability, require minimal training, are widely available, and can be utilized in various forms of analyses (retrospective, prospective, and post-hoc). Limitations of these scales are that they are influenced by user perception (e.g., color blindness) and environmental conditions (e.g., ambient lighting). There are also several visual scales available which can make comparison of skin tone data across studies difficult.
Few studies have assessed the relative utility of visual scales. FST was designed as a questionnaire to determine tanning and burning propensity, but when used as a proxy for skin tone is only weakly associated with a visual color scale (p < 0.0001)51. There have been attempts to create RGB-defined FST visual scales, although these have not been widely adopted52. The Von Luschan scale has been shown to be highly correlated with narrow band spectrophotometry with one study showing correlation of VLS and Melanin + erythema index to be 0.90 (p < 0.001)53. The scales with more levels (Pantone, Taylor Hyperpigmentation) offer greater granularity and shade range for skin tone assessment, but may be more challenging to reliably administer. For the Pantone scale, one study investigating vascularized allotransplantation matching found inter-rater skin tone assessment to be fair (k = 0.454) and intra-rater reliability to be substantial (k = 0.725)54. A newer, 10-point Monk scale shows high reliability for crowdsourced annotators (ICC 0.86–0.94), but has not been tested in a medical context55. Price may also be a consideration when choosing a scale. While the Monk scale is freely available and free to use and the FST questionnaire is easily accessible online, the Pantone scale is only available for purchase.
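Inter-rater agreement statistics such as kappa can be computed directly from paired ratings. The sketch below implements unweighted Cohen's kappa for two raters; the function name is our own, and it is an assumption that unweighted kappa is appropriate, since the cited studies do not specify here which variant was used. For ordinal scales with many levels, a weighted kappa or an intraclass correlation may be more suitable.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labeling the same subjects.

    rater_a, rater_b : equal-length sequences of categorical labels
                       (e.g., visual scale levels assigned to each subject)
    """
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("Both raters must label the same non-empty set of subjects.")

    n = len(rater_a)

    # Observed agreement: share of subjects given the same label by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: from each rater's marginal label frequencies.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used one identical category for every subject.
        return 1.0
    return (observed - expected) / (1.0 - expected)
```

Kappa is 1 for perfect agreement and 0 when agreement is no better than chance. Because unweighted kappa treats a one-level disagreement the same as a five-level disagreement, it can understate agreement on finely graded scales.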
Considerations for color measurements tools
Colorimetric and spectrophotometric devices are used in a wide range of study designs to assess skin tone by quantifying melanin, erythema, and overall skin pigmentation. Common reflectance colorimetric and spectrophotometric devices are composed of an illuminator, standard observer, and a tristimulus measurement system. The illuminator of the instrument applies a fixed light source to a desired surface and specific wavelengths are then isolated to obtain color details without influence of outside lighting conditions when pressed gently to the skin. Colorimetric and spectrophotometric devices are easily-operated, non-invasive methods of measurement to achieve objective skin tone measurement. In varying clinical settings, handheld colorimetric and spectrophotometric devices (e.g., Delfin SkinColorCatch, Konica Minolta, and Variable Spectro) may be easier to use to assess body parts while also maintaining patient comfort. Although the devices demonstrate moderate to high interobserver reliability, devices are potentially high-cost, and most devices have not had large-scale published validation. A few studies have attempted to compare the utility of objective color measurement tools, but were limited in scope33,56,57. Consequently, a particular type of colorimetric/spectrophotometric device has not been proven to be superior33.
Considerations for cameras
Cameras are widely used, and available in the pocket of almost everyone a patient interacts with across the medical practice. There are also large datasets of images that have been acquired to train AI algorithms and other applications11,58. However, it can be challenging to extract skin tone information from a photograph alone. Camera type, export compression level, and lighting are critical to consider59. One of the most important modifiable factors is the white balance, which affects the relationship between red, green, and blue pixel values60. The use of cross-polarizing filters can help reduce specular reflection61 and improve skin tone evaluation, especially in darker toned individuals43. Reference color charts or color calibration cards (e.g., X-Rite, Douglas, DSC Labs, QPcard, Macbeth) can be used within the frame of the image or before/after in identical conditions to improve reproducibility across devices, but will not be available for retrospectively captured images59,60. After acquisition, proper image processing is necessary to maintain color accuracy across media. Although popular image storage formats like JPEG increase computational efficiency, image compression can lead to artifacts and loss of image parameter information. Therefore, RAW image format may be helpful for maintaining color consistency, although large file size may limit its utility, and many photography devices (e.g., phones) cannot acquire in RAW format32.
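The effect of white balance on the relationship between red, green, and blue values can be illustrated with a neutral reference patch. The sketch below is a simplified example, assuming that a gray patch from a calibration card is visible in the frame and that its mean RGB value has already been measured; the function names are hypothetical. For brevity it scales encoded 8-bit values directly, whereas a careful pipeline would apply gains in linear light and use the full chart.

```python
def white_balance_gains(patch_rgb):
    """Per-channel gains that make a neutral reference patch equal in R, G, B.

    patch_rgb : mean (R, G, B) of a patch known to be neutral gray,
                as measured in the captured image.
    """
    r, g, b = patch_rgb
    if min(r, g, b) <= 0:
        raise ValueError("Patch channels must be positive to compute gains.")
    target = (r + g + b) / 3.0  # keep overall patch brightness unchanged
    return (target / r, target / g, target / b)


def apply_gains(pixel, gains):
    """Apply channel gains to one 8-bit pixel, clipping to the 0-255 range."""
    return tuple(min(255, max(0, round(c * k))) for c, k in zip(pixel, gains))


# Example: a gray patch photographed under warm light reads reddish.
gains = white_balance_gains((200, 180, 160))
corrected_patch = apply_gains((200, 180, 160), gains)  # -> (180, 180, 180)
```

The same gains would then be applied to skin pixels from the same image, so that a warm or cool cast from the light source is not mistaken for a difference in skin tone.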
Scale/device reliability
The process of evaluating device bias against skin tone measurement is nascent. The portability and low cost benefits of visual scales will need to be balanced against the potential increased accuracy of color measurement technologies that also include continuous measurement compared to categorical bins. Having at least two raters use visual scales and conducting triplicate readings for color measurement tools will increase color precision. When possible, comparison between multiple color measurement instruments will be valuable to the specific study and field.
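Triplicate readings from a color measurement tool can be summarized by their per-channel mean and spread. The following is a minimal sketch, assuming each reading is reported as a CIELAB triple; the function name and the choice of sample standard deviation as the precision measure are our own assumptions.

```python
from statistics import mean, stdev


def summarize_readings(readings):
    """Summarize repeated color readings taken at one body site.

    readings : list of (L*, a*, b*) tuples from repeated measurements
               (e.g., three consecutive readings at the same site)
    Returns the per-channel mean and sample standard deviation.
    """
    if len(readings) < 2:
        raise ValueError("At least two readings are needed to estimate spread.")
    channels = list(zip(*readings))  # regroup values by channel
    return {
        "mean": tuple(mean(ch) for ch in channels),
        "sd": tuple(stdev(ch) for ch in channels),
    }


# Example with made-up triplicate readings.
summary = summarize_readings([(60.0, 10.0, 20.0), (62.0, 10.0, 20.0), (61.0, 10.0, 20.0)])
```

A large standard deviation in any channel may indicate probe movement, pressure on the skin, or stray ambient light, and could prompt a repeat measurement.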
Discussion
Consideration of skin tone in device validation studies across medicine is important to reduce the bias against patients with darker skin tones that exists in pulse oximetry, AI diagnosis, and many other areas of medicine. These biases will worsen existing healthcare disparities unless they are measured and addressed directly. In many cases, race and ethnicity play specific roles in equity-focused medicine, and biased outcomes arise due to these socio-demographic factors62. A considerable amount of research focuses on race and ethnicity, but for technologies that rely on light for measurement, bias may be specifically related to skin tone. This highlights the need for increased awareness of the limitations of current medical devices, which are associated with systematic error and pronounced inaccuracies among patients with darker skin tones.
In this review, we highlight common tools used for skin tone measurement and discuss pertinent study design considerations for accurate skin tone assessment. There is no current gold standard tool as each possesses relative pros and cons, and validation is largely absent. Visual scales are more readily available for prospective or post-hoc analysis, but may be influenced by user perception, while color measurement tools offer objective, sensitive measurements but can be expensive with variable reliability. In addition to tool choice, investigators should consider how patient level factors may affect skin tone validity, including selection of a body site, consideration for medical conditions affecting skin tone, and minimization of perfusion-related color changes. Furthermore, creation of a standardized environment with consistent lighting and camera settings will promote improved color consistency. This paper is a narrative review and therefore results are limited by the non-systematic approach. Further, the study is not powered to directly compare the utility of skin tone assessment modalities or quantify the potential effect of study design parameters on skin tone accuracy.
When prospectively evaluating devices that may be influenced by skin tone, incorporation of skin tone measurement will play an important role in considering these potential biases. The current review offers researchers a tool to aid in development of skin tone assessment protocols. We encourage researchers to continue to focus on validating devices against a diverse and representative dataset, and when possible, to make public the skin tone measurements for future use and calibration.
In conclusion, increasing evidence shows bias and increased error in noninvasive tools across medicine in patients with darker skin tones. We provide guidance and considerations for conducting skin tone assessments using administered scales (e.g., Fitzpatrick, Pantone, Monk) and color measurement tools (colorimeters, spectrophotometers), and we encourage device validation to include at least one color measurement tool. As investigators increasingly consider skin tone as a variable in future work, we will be able to reduce skin tone biases in medical devices.
Data availability
All analyzed data are included in the published article.
References
Charpignon, M.-L. et al. Critical bias in critical care devices. Crit. Care Clin. 39, 795–813 (2023).
Kadambi, A. Achieving fairness in medical devices. Science 372, 30–31 (2021).
Sjoding, M. W., Dickson, R. P., Iwashyna, T. J., Gay, S. E. & Valley, T. S. Racial bias in pulse oximetry measurement. N. Engl. J. Med. 383, 2477–2478 (2020).
Adler, J. N., Hughes, L. A., Vivilecchia, R. & Camargo, C. A. Jr Effect of skin pigmentation on pulse oximetry accuracy in the emergency department. Acad. Emerg. Med. 5, 965–970 (1998).
Bothma, P. A. et al. Accuracy of pulse oximetry in pigmented patients. S. Afr. Med. J. 86, 594–596 (1996).
Feiner, J. R., Severinghaus, J. W. & Bickler, P. E. Dark skin decreases the accuracy of pulse oximeters at low oxygen saturation: the effects of oximeter probe type and gender. Anesth. Analg. 105, S18–S23 (2007).
Bickler, P. E., Feiner, J. R. & Severinghaus, J. W. Effects of skin pigmentation on pulse oximeter accuracy at low saturation. Anesthesiology 102, 715–719 (2005).
Martin, D. et al. Effect of skin tone on the accuracy of the estimation of arterial oxygen saturation by pulse oximetry: a systematic review. Br. J. Anaesth. https://doi.org/10.1016/j.bja.2024.01.023 (2024).
Ebmeier, S. J. et al. A two centre observational study of simultaneous pulse oximetry and arterial oxygen saturation recordings in intensive care unit patients. Anaesth. Intensive Care 46, 297–303 (2018).
Topol, E. J. High-performance medicine: the convergence of human and artificial intelligence. Nat. Med. 25, 44–56 (2019).
Combalia, M. et al. Validation of artificial intelligence prediction models for skin cancer diagnosis using dermoscopy images: the 2019 International Skin Imaging Collaboration Grand Challenge. Lancet Digit. Health 4, e330–e339 (2022).
Marchetti, M. A. et al. Prospective validation of dermoscopy-based open-source artificial intelligence for melanoma diagnosis (PROVE-AI study). NPJ Digit. Med. 6, 127 (2023).
Krishnapriya, K. S., Albiero, V., Vangara, K., King, M. C. & Bowyer, K. W. Issues related to face recognition accuracy varying based on race and skin tone. IEEE Trans. Technol. Soc. 1, 8–20 (2020).
Daneshjou, R. et al. Disparities in dermatology AI performance on a diverse, curated clinical image set. Sci. Adv. 8, eabq6147 (2022).
Kinyanjui, N. M. et al. Fairness of classifiers across skin tones in dermatology. In International Conference on Medical Image Computing and Computer-Assisted Intervention 320–329 (Springer International Publishing, 2020).
Daneshjou, R., Smith, M. P., Sun, M. D., Rotemberg, V. & Zou, J. Lack of transparency and potential bias in artificial intelligence data sets and algorithms: a scoping review. JAMA Dermatol 157, 1362–1369 (2021).
Steele, L. et al. Determining the clinical applicability of machine learning models through assessment of reporting across skin phototypes and rarer skin cancer types: a systematic review. J. Eur. Acad. Dermatol. Venereol. 37, 657–665 (2023).
Groh, M., Harris, C., Daneshjou, R., Badri, O. & Koochek, A. Towards transparency in dermatology image datasets with skin tone annotations by experts, crowds, and an algorithm. Proc. ACM Hum. Comput. Interact. 6, 1–26 (2022).
Heldreth, C. M. et al. Which skin tone measures are the most inclusive? An investigation of skin tone measures for artificial intelligence. ACM J. Responsib. Comput. https://doi.org/10.1145/3632120 (2023).
Discussion paper: Approach for improving the performance evaluation of pulse oximeter devices taking into consideration skin pigmentation, race, and ethnicity. https://www.fda.gov/media/173905.
Tseng, S.-H., Bargo, P., Durkin, A. & Kollias, N. Chromophore concentrations, absorption and scattering properties of human skin in-vivo. Opt. Express 17, 14599–14617 (2009).
Everett, J. S., Budescu, M. & Sommers, M. S. Making Sense of Skin Color in Clinical Care. Clin. Nurs. Res. https://doi.org/10.1177/1054773812446510 (2012).
Taylor, S., Westerhof, W., Im, S. & Lim, J. Noninvasive techniques for the evaluation of skin color. J. Am. Acad. Dermatol. 54, S282–S290 (2006).
Nanni, L., Loreggia, A., Lumini, A. & Dorizza, A. A standardized approach for skin detection: analysis of the literature and case studies. J. Imaging Sci. Technol. 9, 35 (2023).
Krishnapriya, K. S., Pangelinan, G., King, M. C. & Bowyer, K. W. Analysis of manual and automated skin tone assignments. In Proc. IEEE/CVF Winter Conference on Applications of Computer Vision Workshops (WACVW) 429–438 (IEEE, 2022). https://doi.org/10.1109/wacvw54805.2022.00049.
Kakumanu, P., Makrogiannis, S. & Bourbakis, N. A survey of skin-color modeling and detection methods. Pattern Recognit. 40, 1106–1122 (2007).
Ware, O. R., Dawson, J. E., Shinohara, M. M. & Taylor, S. C. Racial limitations of Fitzpatrick skin type. Cutis 105, 77–80 (2020).
Leenutaphong, V. Relationship between skin color and cutaneous response to ultraviolet radiation in Thai. Photodermatol. Photoimmunol. Photomed. 11, 198–203 (1995).
Gornitsky, J., Saleh, E., Bouhadana, G. & Borsuk, D. E. Validating a novel device to improve skin color matching for face transplants. Plast. Reconstr. Surg. Glob. Open 10, e4649 (2022).
Eilers, S. et al. Accuracy of self-report in assessing Fitzpatrick skin phototypes I through VI. JAMA Dermatol 149, 1289–1294 (2013).
Nakashima, Y., Wada, K., Yamakawa, M. & Nagata, C. Validity of self-reported skin color by using skin color evaluation scale. Ski. Res. Technol. 28, 827–832 (2022).
Yélamos, O. et al. Understanding Color. in Photography in Clinical Medicine (ed Pasquali, P.) 99–111 (Springer International Publishing, 2020). https://doi.org/10.1007/978-3-030-24544-3_8.
Langeveld, M., van de Lande, L. S., O’ Sullivan, E., van der Lei, B. & van Dongen, J. A. Skin measurement devices to assess skin quality: A systematic review on reliability and validity. Ski. Res. Technol. 28, 212–224 (2022).
Clarys, P., Alewaeters, K., Lambrecht, R. & Barel, A. O. Skin color measurements: comparison between three instruments: the Chromameter(R), the DermaSpectrometer(R) and the Mexameter(R). Ski. Res. Technol. 6, 230–238 (2000).
Ly, B. C. K., Dyer, E. B., Feig, J. L., Chien, A. L. & Del Bino, S. Research techniques made simple: cutaneous colorimetry: a reliable technique for objective skin color measurement. J. Investig. Dermatol. 140, 3–12.e1 (2020).
Moroney, N., Fairchild, M., Hunt, R. & Li, C. The CIECAM02 color appearance model. In CIC 10, 23–27 (2002).
Logvinenko, A. D. An object-color space. J. Vis. 9, 5.1–23 (2009).
Del Bino, S., Duval, C. & Bernerd, F. Clinical and biological characterization of skin pigmentation diversity and its consequences on UV impact. Int. J. Mol. Sci. 19, 2668 (2018).
Choe, Y. B., Jang, S. J., Jo, S. J., Ahn, K. J. & Youn, J. I. The difference between the constitutive and facultative skin color does not reflect skin phototype in Asian skin. Ski. Res. Technol. 12, 68–72 (2006).
Pershing, L. K. et al. Reflectance spectrophotometer: the dermatologists’ sphygmomanometer for skin phototyping? J. Investig. Dermatol. 128, 1633–1640 (2008).
Emery, K. J. & Webster, M. A. Individual differences and their implications for color perception. Curr. Opin. Behav. Sci. 30, 28–33 (2019).
Finnane, A. et al. Proposed technical guidelines for the acquisition of clinical images of skin-related conditions. JAMA Dermatol 153, 453–457 (2017).
Oh, Y., Markova, A., Noor, S. J. & Rotemberg, V. Standardized clinical photography considerations in patients across skin tones. Br. J. Dermatol. 186, 352–354 (2022).
Sommers, M. S. et al. Are the Fitzpatrick skin phototypes valid for cancer risk assessment in a racially and ethnically diverse sample of women? Ethn. Dis. 29, 505–512 (2019).
Hao, S. et al. Utility of skin tone on pulse oximetry in critically ill patients: a prospective cohort study. bioRxiv https://doi.org/10.1101/2024.02.24.24303291 (2024).
Oyer, R. A. et al. Increasing racial and ethnic diversity in cancer clinical trials: an American Society of Clinical Oncology and Association of Community Cancer Centers joint research statement. J. Clin. Oncol. 40, 2163–2171 (2022).
Bøttern, J., Stage, T. B. & Dunvald, A.-C. D. Sex, racial, and ethnic diversity in clinical trials. Clin. Transl. Sci. 16, 937–945 (2023).
National Institutes of Health. NOT-OD-18-014: Amendment: NIH Policy and Guidelines on the Inclusion of Women and Minorities as Subjects in Clinical Research. U.S. Department of Health and Human Services, https://grants.nih.gov/grants/guide/notice-files/NOT-OD-18-014.html (2017).
Del Bino, S. & Bernerd, F. Variations in skin colour and the biological consequences of ultraviolet radiation exposure. Br. J. Dermatol. 169, 33–40 (2013).
Discussion Paper: Approach for Improving the Performance Evaluation of Pulse Oximeter Devices Taking Into Consideration Skin Pigmentation, Race and Ethnicity. https://www.fda.gov/media/173905/.
He, S. Y. et al. Self-reported pigmentary phenotypes and race are significant but incomplete predictors of Fitzpatrick skin phototype in an ethnically diverse population. J. Am. Acad. Dermatol. 71, 731–737 (2014).
Jo, H. C. & Kim, D. Y. Correlation between light absorbance and skin color using fabricated skin phantoms with different colors. Lasers Med. Sci. 35, 919–926 (2020).
Treesirichod, A., Chansakulporn, S. & Wattanapan, P. Correlation between skin color evaluation by skin color scale chart and narrowband reflectance spectrophotometer. Indian J. Dermatol. 59, 339–342 (2014).
Hoffman, A. F. et al. Establishing a clinically applicable methodology for skin color matching in vascularized composite allotransplantation. Plast. Reconstr. Surg. Glob. Open 8, e2655 (2020).
Schumann, C. et al. Consensus and subjectivity of skin tone annotation for ML fairness. Advances in Neural Information Processing Systems 36, (2024).
Matias, A. R., Ferreira, M., Costa, P. & Neto, P. Skin colour, skin redness and melanin biometric measurements: comparison study between Antera® 3D, Mexameter® and Colorimeter®. Skin Res. Technol. 21, 346–362 (2015).
Baquié, M. & Kasraee, B. Discrimination between cutaneous pigmentation and erythema: comparison of the skin colorimeters Dermacatch and Mexameter. Skin Res. Technol. 20, 218–227 (2014).
Groh, M. et al. Evaluating deep neural networks trained on clinical images in dermatology with the Fitzpatrick 17k dataset. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 1820–1828 (2021).
International Color Consortium. Improving Color Image Quality in Medical Photography. https://www.color.org/whitepapers/ICC_White_Paper46-Medical_Photography_Guidelines.pdf (2017).
Penczek, J., Boynton, P. A. & Splett, J. D. Color error in the digital camera image capture process. J. Digit. Imaging 27, 182–191 (2014).
McFall, K. Photography of dermatological conditions using polarized light. J. Audiov. Media Med. 19, 5–9 (1996).
Obermeyer, Z., Powers, B., Vogeli, C. & Mullainathan, S. Dissecting racial bias in an algorithm used to manage the health of populations. Science 366, 447–453 (2019).
Fitzpatrick, T. B. Soleil et peau. J. Med. Esthet. 2, 33–34 (1975).
Fitzpatrick, T. B. The validity and practicality of sun-reactive skin types I through VI. Arch. Dermatol. 124, 869–871 (1988).
Sachdeva, S. Fitzpatrick skin typing: applications in dermatology. Indian J. Dermatol. Venereol. Leprol. 75, 93–96 (2009).
von Luschan, E. & von Luschan, F. Anthropologische Messungen an 95 Engländern (S. S. ‘Durham Castle’; Brit. Association 1905). (Behrend, 1914).
Tool used to classify skin color in racial studies conducted in Nazi Germany. United States Holocaust Memorial Museum, https://collections.ushmm.org/search/catalog/irn564926#rights-restrictions (2023).
Monk, E. The Monk Skin Tone Scale. https://doi.org/10.31235/osf.io/pdf4c (2023).
Porgali, B., Albiero, V., Ryda, J., Ferrer, C. C. & Hazirbas, C. The Casual Conversations v2 dataset. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition 10–17 (2023).
Iranmanesh, B. et al. Brief overview of PANTONE SkinTone Guide chart in CIEL*a*b color space. Presented at The 7th International Color & Coating Congress, Tehran, Iran, Amirkabir University of Technology (2017).
Kadyrova, A., Ansari-Asl, M. & Benito, E. M. V. Evaluation of the Human Visual System in Cosmetics Foundation Colour Selection. Imaging Science and Technology 2020, 60–64. https://doi.org/10.2352/issn.2694-118X.2020.LIM-22 (2020).
CR-400 Chroma Meter. Konica Minolta Sensing https://sensing.konicaminolta.us/us/products/cr-400-chroma-meter-colorimeter/ (2017).
SkinColorCatch. Delfin Technologies https://delfintech.com/products/skincolorcatch/ (2019).
Antera 3D: skin analysis as it should be. Miravex Limited https://miravex.com/ (2020).
Linming, F. et al. Comparison of two skin imaging analysis instruments: The VISIA® from Canfield vs the ANTERA 3D® CS from Miravex. Skin Res. Technol. 24, 3–8 (2017).
Bauer, H. Skin-Colorimeter Flex CL 440. Courage + Khazaka Electronic, Köln https://www.courage-khazaka.de/en/scientific-products/skin-colorimeter-flex-cl-440 (2018).
CM-700d Spectrophotometer. Konica Minolta Sensing https://sensing.konicaminolta.us/us/products/cm-700d-spectrophotometer/ (2017).
Spectro 1. Variable https://www.variableinc.com/spectro-1-shop.html.
Bauer, H. Mexameter® MX 18. Courage + Khazaka Electronic, Köln https://www.courage-khazaka.de/en/faq?view=article&id=169:mexameter-d-2&catid=16:alle-produkte (2018).
Johnston, A., Pasquali, P. & Alberich-Carrasco, R. Equipment and Materials for Medical Photography. in Photography in Clinical Medicine (ed Pasquali, P.) 167–189 (Springer International Publishing, 2020). https://doi.org/10.1007/978-3-030-24544-3_11.
Benvenuto-Andrade, C. et al. Differences between polarized light dermoscopy and immersion contact dermoscopy for the evaluation of skin lesions. Arch. Dermatol. 143, 329–338 (2007).
Cerminara, S. E. et al. Diagnostic performance of augmented intelligence with 2D and 3D total body photography and convolutional neural networks in a high-risk population for melanoma under real-world conditions: A new era of skin cancer screening? Eur. J. Cancer 190, 112954 (2023).
Ji-Xu, A., Dinnes, J. & Matin, R. N. Total body photography for the diagnosis of cutaneous melanoma in adults: a systematic review and meta-analysis. Br. J. Dermatol. 185, 302–312 (2021).
Acknowledgements
This work was supported by the Prevent Cancer Foundation, the Lauder Diversity Fund, and NIH/NCI Cancer Center Support Grant P30 CA008748. A.I.W. is supported by the Duke CTSI through the National Center for Advancing Translational Sciences (NCATS) of the National Institutes of Health under UL1TR002553 and by REACH Equity through the National Institute on Minority Health and Health Disparities (NIMHD) of the National Institutes of Health under U54MD012530. J.W.G. declares support from an RSNA Health Disparities grant (#EIHD2204), the Lacuna Fund (#67), the Gordon and Betty Moore Foundation, an NIH (NIBIB) MIDRC grant under contracts 75N92020C00008 and 75N92020C00021, and NHLBI Award Number R01HL167811.
Author information
Contributions
Conceptualization: V.R., A.I.W., J.W.G. Data curation: V.W., K.D. Funding acquisition: V.R., A.I.W. Methodology: V.R., A.I.W. Supervision: V.R., A.I.W. Writing—original draft: V.W., K.D. Writing—review & editing: V.R., A.I.W., J.W.G.
Ethics declarations
Competing interests
The authors disclose no competing non-financial interests but the following competing financial interests: A.I.W. holds equity and management roles in Ataia Medical. V.R. is a consultant for Inhabit Brands, Inc. (not relevant), receives research funding from Lutris Pharma, and receives in-kind research support from Kaggle and AWS through the Open Data Program. J.W.G. is a 2022 Robert Wood Johnson Foundation Harold Amos Medical Faculty Development Program scholar. All other authors have no financial or non-financial competing interests to disclose.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Weir, V.R., Dempsey, K., Gichoya, J.W. et al. A survey of skin tone assessment in prospective research. npj Digit. Med. 7, 191 (2024). https://doi.org/10.1038/s41746-024-01176-8