Machine Learning May Aid in Diagnosing Type 2 Diabetes

Christopher Epps, PharmD Candidate

Investigators from Monash University determined the prevalence of undiagnosed type 2 diabetes utilizing machine learning to analyze modifiable markers not included in screening guidelines.

Diabetes is one of the most prevalent noncommunicable diseases in the world and is projected to affect 552 million people by 2030. Most attempts to curb its growth focus on lifestyle interventions to slow or stop the onset of type 2 diabetes (T2D).

Screening and diagnostic tools currently in practice do not include modifiable factors such as nutritional intake and anthropometric data, even though anthropometric markers and dietary intake are indicators of an individual’s relative risk of developing T2D. Studies have shown that the available screening tools, which examine a limited set of known predictors, underdiagnose early hyperglycemia. Including modifiable factors may be invaluable in more aggressive approaches to preventing the onset of T2D.

Investigators from Monash University in Australia determined the prevalence of undiagnosed T2D to be 5.26% when utilizing machine learning to analyze modifiable markers not included in current screening guidelines. This equates to up to 29 million people worldwide with undiagnosed T2D by the year 2030.
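The 29 million figure appears to follow from applying the 5.26% prevalence to the 552 million people projected to have diabetes by 2030. The article does not spell out that calculation, so the sketch below is only a back-of-the-envelope check under that assumption, and the function name is hypothetical.

```python
def estimate_undiagnosed(projected_cases: int, prevalence: float) -> float:
    """Return the number of people implied by a prevalence applied to a base count."""
    return projected_cases * prevalence


# Figures quoted in the article; treating 552 million as the base is an assumption.
PROJECTED_CASES_2030 = 552_000_000
UNDIAGNOSED_PREVALENCE = 0.0526  # 5.26%

estimate = estimate_undiagnosed(PROJECTED_CASES_2030, UNDIAGNOSED_PREVALENCE)
print(f"About {estimate / 1e6:.1f} million people")
```

Under that assumption, the result rounds to roughly 29 million, consistent with the figure reported above.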

The research team compiled 16,429 medical files and stratified them by whether undiagnosed T2D was confirmed. They identified patients who lacked a current diagnosis but had a positive glycemic response to 1 of 3 tests. Three machine learning algorithms then analyzed this group against 114 potential nutritional markers, along with 13 behavioral and 12 socioeconomic variables.
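The case definition described above can be sketched as a simple labeling rule. The article does not name the 3 tests or their thresholds, so this sketch assumes hemoglobin A1c, fasting plasma glucose, and a 2-hour oral glucose tolerance test with the standard American Diabetes Association diagnostic cutoffs, and it reads "1 of 3" as at least one positive result. The `Record` class and its field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed cutoffs (standard ADA diagnostic criteria), not taken from the article.
HBA1C_CUTOFF_PCT = 6.5
FPG_CUTOFF_MG_DL = 126.0
OGTT_2H_CUTOFF_MG_DL = 200.0


@dataclass
class Record:
    """Hypothetical patient record; a missing test result is stored as None."""
    has_diabetes_diagnosis: bool
    hba1c_pct: Optional[float] = None
    fpg_mg_dl: Optional[float] = None
    ogtt_2h_mg_dl: Optional[float] = None


def is_undiagnosed_t2d(record: Record) -> bool:
    """Flag records with no existing diagnosis and at least one positive test."""
    if record.has_diabetes_diagnosis:
        return False
    results = [
        (record.hba1c_pct, HBA1C_CUTOFF_PCT),
        (record.fpg_mg_dl, FPG_CUTOFF_MG_DL),
        (record.ogtt_2h_mg_dl, OGTT_2H_CUTOFF_MG_DL),
    ]
    # A test counts as positive only if it was performed and meets its cutoff.
    return any(value is not None and value >= cutoff for value, cutoff in results)


records = [
    Record(has_diabetes_diagnosis=False, hba1c_pct=6.7),
    Record(has_diabetes_diagnosis=True, hba1c_pct=9.0),
    Record(has_diabetes_diagnosis=False, hba1c_pct=5.4, fpg_mg_dl=95.0),
]
labels = [is_undiagnosed_t2d(r) for r in records]
```

Records flagged this way would form the positive class that the machine learning algorithms compare against the nutritional, behavioral, and socioeconomic variables.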

Investigators found significant anthropometric markers that included upper leg length, age at heaviest weight, waist circumference, and arm circumference. Significant dietary markers found in the study included number of meals not prepared at home, number of ready-to-eat foods, daily fat consumption, and amount of water consumed daily.

Across the data, ultra-processed food consumption was an emerging risk factor for the development of type 1, type 2, and gestational diabetes. Conversely, caffeine intake was associated with a reduced risk of disease development. Ninety countries have established food-based dietary guidelines, and this study’s findings could help create disease-specific nutritional guidance.

This analysis revealed that current diagnostic guidelines miss important data that could lead to an earlier diagnosis. Health care providers can use diet-related, anthropometric, and nutrient-based variables to enhance current prediction models and diagnose T2D sooner.

Christopher Epps is a 2022 PharmD candidate at the University of Connecticut in Storrs.


De Silva K, Lim S, Mousa A, et al. Nutritional markers of undiagnosed type 2 diabetes in adults: Findings of a machine learning analysis with external validation and benchmarking. PLoS One. 2021;16(5):e0250832. DOI:10.1371/journal.pone.0250832