AI won’t replace you, but it might make you better

3 minute read


Artificial intelligence may not outperform dermatologists on its own, but it can boost their sensitivity and specificity, study finds.


Should you get a second opinion from your artificial intelligence colleague?

In the spirit of doing more with less in an underfunded healthcare system, a small team of German researchers concluded that while AI performs comparably to dermatologists at diagnosing melanoma, the two working together lead to better patient outcomes than either does alone.

Despite the complete lack of validation or regulation of AI tools in Australia, their convenience has led to increased use in clinical practice.

Since many of the potential issues stem from a lack of human oversight, using these systems to confirm or query a diagnosis you’ve already made could improve the performance of both human and AI.

In a systematic review and meta-analysis of 11 prospective studies published between 2002 and 2024 – including more than 2500 patients and 50 dermatologists – dermatologists and AI performed similarly when used alone, with comparable sensitivity (78.6% vs 80.9%) and specificity (75.2% vs 75.6%). Balanced accuracy was also similar (77.4% vs 78.3%).

However, when dermatologist and AI were combined, sensitivity and specificity increased to 91.9% and 83.7% respectively, highlighting the value of AI as a support tool.

Histopathology was used as the reference standard for diagnosing melanoma in all included studies, while the criteria for confirming benign lesions varied, including histopathology, clinical follow-up or expert consensus.

Study populations were highly heterogeneous, with melanoma cases ranging from 26 to 653 and non-malignant lesions from 88 to 4495.

Overall, AI and dermatologists showed similar diagnostic performance. In direct head-to-head comparisons within the same clinical setting, however, AI demonstrated higher specificity, suggesting a potential to reduce false positives and unnecessary biopsies, although in some cases it may detect slightly fewer melanomas than clinicians.

“This observation may be explained by the fact that dermatologists tend to act cautiously and are more likely to recommend biopsy in cases of diagnostic uncertainty,” the authors wrote.

While only one of the studies directly compared AI-assisted dermatologists with dermatologists working alone, some studies compared human diagnosis to two different AI systems, and one stratified dermatologist performance by level of experience.

However, nearly all studies applied a binary classification (malignant vs non-malignant), which introduced index test bias: pooling could have obscured differences in performance across specific differential diagnoses, so sensitivity and specificity may have been over- or underestimated, the authors explained.

Performance varied widely across studies. Dermatologist sensitivity ranged from 41.8% to 96.6% and specificity from 29.3% to 97.0%, while AI sensitivity ranged from 16.4% to 100% and specificity from 37.4% to 98.3%.

While AI is promising, the review included studies that tested its use in controlled conditions rather than real-world clinical settings, and many relied solely on pre-selected suspicious lesions.

“The evidence base remains small, and study designs are heterogeneous, with a high risk of bias in patient selection and index test domains,” the authors wrote.

JAMA Dermatology, 25 March 26
