A Two-Stage Deep Learning Approach Using YOLO11 and U-Net for Detection and Segmentation of Caries Under Crowns in Panoramic Radiographs

Çebi Gül, Buse; Çebi, Can; Altınsoy, Burçin; Dedeoğlu, Numan

doi:10.58600/eurjther2981

EUROPEAN JOURNAL OF THERAPEUTICS, vol.0, no.0, 2026 (ESCI, TR Dizin)

Publication Type: Article / Full Article
Volume: 0 Issue: 0
Publication Date: 2026
DOI Number: 10.58600/eurjther2981
Journal Name: EUROPEAN JOURNAL OF THERAPEUTICS
Journal Indexes: Emerging Sources Citation Index (ESCI), CINAHL, TR DİZİN (ULAKBİM)
Open Archive Collection: AVESİS Open Access Collection
Affiliated with İnönü University: Yes

Abstract

Objective: This study aimed to develop and evaluate deep learning models for the automatic detection and segmentation of caries under crowns on panoramic radiographs.

Methods: In this retrospective study, 1742 panoramic radiographs were screened, and 257 high-quality regions of interest (ROIs) containing crowns with radiographically detectable caries were extracted. A single panoramic radiograph could yield multiple ROIs when more than one qualifying crown was present. A two-stage coarse-to-fine strategy was implemented. In the first stage, the YOLO11l (Large) model was used for automatic detection and localization of crowns, achieving 0.977 mAP@50 and 0.857 mAP@50-95 on the validation set. In the second stage, three deep learning architectures were compared for caries segmentation: YOLO11-seg variants, U-Net with a ResNet34 backbone, and U-Net++ with a ResNet34 backbone. Expert annotations were created by three dentists with over 10 years of experience and showed near-perfect interobserver agreement.

Results: In ROI detection, the YOLO11l model achieved an mAP@50 of 97.7%. In the segmentation stage, YOLO11s-seg achieved the highest instance-level detection performance, with an instance F1 score of 74.3%, whereas in pixel-based evaluation U-Net++ (71.2% Dice) scored 9.8 percentage points higher than YOLO11s-seg (61.4% pixel F1) (p<0.01). The Dice coefficient and the F1 score are mathematically equivalent when calculated at the pixel level. The final U-Net++ model achieved a Dice score of 68.6% on the independent test set.

Conclusion: U-Net++ (ResNet34 backbone) achieved significantly better pixel-based segmentation performance than standard U-Net and YOLO11-seg, demonstrating the effectiveness of nested skip connections for caries detection. An inverse relationship between model size and performance was observed among the YOLO11 variants, underscoring the importance of matching model capacity to dataset size. While these deep learning architectures show considerable potential for enhancing diagnostic accuracy, they should be implemented as adjunctive decision-support tools integrated with clinical expertise rather than as standalone diagnostic systems.
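
The abstract compares a Dice score for U-Net++ directly with a pixel F1 score for YOLO11s-seg, which relies on the two metrics being equivalent at the pixel level. With TP, FP, and FN counted over pixels, Dice = 2·TP / (2·TP + FP + FN), which is the same expression obtained by expanding F1 = 2·precision·recall / (precision + recall). The following is a minimal illustrative sketch of that identity, not the authors' evaluation code. The function names, the toy masks, and the convention of scoring two empty masks as 1.0 are assumptions made for this example.

```python
from itertools import chain


def _flatten(mask):
    """Flatten a 2D binary mask (rows of 0/1) into a list of booleans."""
    return [bool(v) for v in chain.from_iterable(mask)]


def dice(pred, target):
    """Dice coefficient: 2 * |P ∩ T| / (|P| + |T|)."""
    p, t = _flatten(pred), _flatten(target)
    overlap = sum(a and b for a, b in zip(p, t))
    total = sum(p) + sum(t)
    # Assumed convention: two empty masks count as a perfect match.
    return 2 * overlap / total if total else 1.0


def pixel_f1(pred, target):
    """Pixel-level F1 computed from precision and recall."""
    p, t = _flatten(pred), _flatten(target)
    tp = sum(a and b for a, b in zip(p, t))
    fp = sum(a and not b for a, b in zip(p, t))
    fn = sum(b and not a for a, b in zip(p, t))
    if tp == 0:
        # Same empty-mask convention as dice(); otherwise no overlap gives 0.
        return 1.0 if fp == 0 and fn == 0 else 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy masks (hypothetical): 2 true positives, 1 false positive, 1 false negative.
predicted_mask = [[1, 1, 0],
                  [0, 1, 0]]
reference_mask = [[1, 0, 0],
                  [0, 1, 1]]

dice_value = dice(predicted_mask, reference_mask)
f1_value = pixel_f1(predicted_mask, reference_mask)
print(dice_value, f1_value)
```

Both functions agree on any pair of same-shaped binary masks, up to floating-point rounding. The identity does not extend to the instance F1 score, which counts matched objects rather than pixels, so the 74.3% instance F1 reported above is not comparable with the Dice values.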