Filip Christiansen, Emir Konuk, Adithya Raju Ganeshan, Robert Welch, Joana Palés Huix, Artur Czekierdowski, Francesco Paolo Giuseppe Leone, Lucia Anna Haak, Robert Fruscio, Adrius Gaurilcikas, Dorella Franchi, Daniela Fischerova, Elisa Mor, Luca Savelli, Maria Àngela Pascual, Marek Jerzy Kudla, Stefano Guerriero, Francesca Buonomo, Karina Liuba, Nina Montik, Juan Luis Alcázar, Ekaterini Domali, Nelinda Catherine P Pangilinan, Chiara Carella, Maria Munaretto, Petra Saskova, Debora Verri, Chiara Visenzi, Pawel Herman, Kevin Smith, Elisabeth Epstein.
Nat Med. 2025 Jan;31(1):189-196.
DOI: 10.1038/s41591-024-03329-4. Epub 2025 Jan 2.
Ovarian lesions are common and often detected incidentally. Transvaginal ultrasound is the main technique used to differentiate between benign and malignant ovarian lesions due to its wide availability and high diagnostic accuracy when performed by experienced examiners. Biopsy, on the other hand, is contraindicated, as it may cause the spread of malignant tumours, worsening the prognosis.
However, diagnostic accuracy and interobserver agreement are significantly lower among less experienced examiners, which can lead to delayed or incorrect cancer diagnoses, as well as unnecessary treatments. Moreover, there is a substantial shortage of experts in this field.
In this context, artificial intelligence (AI)-based diagnostic support has emerged as a potential solution, and convolutional neural network (CNN) architectures have already shown promising results in classifying ovarian lesions. The challenge lies in the variability of clinical environments: factors such as patient populations, imaging devices and acquisition protocols can differ substantially between centres. In addition, the collection of datasets that are sufficiently large and diverse to capture the full variability of clinical data and be universally representative is limited by both legal and financial constraints.
A large-scale multicentre study capable of validating generalisability could therefore provide essential evidence to build confidence in AI-driven diagnostic support systems for clinical use. With this objective in mind, a group of researchers including Dr Maria Àngela Pascual, Consultant and Director of R&D at the Gynaecological Imaging Diagnostics Department at Dexeus Mujer, conducted a large retrospective international multicentre study to develop and validate transformer-based neural network models. The dataset comprised 17,119 ultrasound images from 3,652 patients across 20 centres in 8 countries, acquired using 21 different ultrasound systems from 9 manufacturers.
According to the results, the models demonstrated robust performance across all centres, ultrasound systems, histological diagnoses and patient age groups, significantly outperforming both expert and non-expert examiners in all metrics evaluated. Furthermore, AI-driven diagnostic support reduced referrals to experts by 63% while significantly surpassing the diagnostic performance of current practice.
According to the authors, these findings demonstrate that transformer-based models exhibit strong generalisation and diagnostic accuracy superior to that of human experts. This could help address the shortage of expert sonographers and improve diagnostic accuracy, particularly in cases that are challenging for examiners to classify.