
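1. Department of Ultrasound Medicine, The First Hospital of Qinhuangdao, Qinhuangdao 066001, Hebei Province, China
2. Qinhuangdao Health School, Qinhuangdao 066001, Hebei Province, China
HE Yuqing (ORCID: 0009-0002-2840-4798), master's degree, attending physician.
WU Zizheng (ORCID: 0000-0001-8253-0352), doctoral degree, associate chief physician, E-mail: wwzzh890415@163.com.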
Received: 2025-08-15
Revised: 2026-01-05
Published in print: 2026-02-28
Citation: HE Y Q, WU Z Z, GUO S, et al. Diagnostic performance of artificial intelligence ChatGPT-4V in differentiating benign and malignant breast lesions on ultrasound[J]. Oncoradiology, 2026, 35(1): 57-63. DOI: 10.19732/j.cnki.2096-6210.2026.01.008.
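Objective
To evaluate the diagnostic performance of ChatGPT-4V in classifying breast ultrasound lesions as benign or malignant, to compare it with that of junior and senior physicians, and to explore its feasibility as a diagnostic aid.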
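Methods
Patients with breast lesions seen at The First Hospital of Qinhuangdao between January 2024 and June 2025 were retrospectively enrolled. With pathological findings as the gold standard, ChatGPT-4V, two junior physicians (3-5 years of experience), and two senior physicians (>10 years of experience) independently read the ultrasound images under blinded conditions. Sensitivity, specificity, accuracy, and the area under the receiver operating characteristic (ROC) curve (AUC) were recorded. Accuracy in identifying shape, margin, echo pattern, posterior features, and calcification was assessed against the Breast Imaging Reporting and Data System (BI-RADS). The McNemar test was used to compare accuracy, the DeLong test to compare AUCs, and decision curve analysis to assess net benefit.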
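As a minimal sketch of how the three summary measures named above follow from paired readings, the Python below tallies a 2x2 table against pathology. The label coding (1 = malignant, 0 = benign), the function names, and the toy data are illustrative assumptions and are not taken from the study.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # malignant on pathology, read as malignant
    fn: int  # malignant on pathology, read as benign
    fp: int  # benign on pathology, read as malignant
    tn: int  # benign on pathology, read as benign


def confusion(truth, pred):
    """Tally a 2x2 table from paired labels (1 = malignant, 0 = benign)."""
    pairs = list(zip(truth, pred))
    return Confusion(
        tp=sum(1 for t, p in pairs if t == 1 and p == 1),
        fn=sum(1 for t, p in pairs if t == 1 and p == 0),
        fp=sum(1 for t, p in pairs if t == 0 and p == 1),
        tn=sum(1 for t, p in pairs if t == 0 and p == 0),
    )


def sensitivity(c):
    """Share of pathology-proven malignant lesions read as malignant."""
    return c.tp / (c.tp + c.fn)


def specificity(c):
    """Share of pathology-proven benign lesions read as benign."""
    return c.tn / (c.tn + c.fp)


def accuracy(c):
    """Share of all lesions classified correctly."""
    return (c.tp + c.tn) / (c.tp + c.fn + c.fp + c.tn)


# Hypothetical toy data: pathology labels and one reader's calls.
toy_truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
toy_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
toy_table = confusion(toy_truth, toy_pred)
```

Calling `sensitivity(toy_table)`, `specificity(toy_table)`, and `accuracy(toy_table)` then gives the per-reader figures that the comparison tests operate on.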
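Results
The diagnostic performance of ChatGPT-4V was close to that of junior physicians (accuracy, P>0.05) but lower than that of senior physicians (P<0.05); decision curve analysis showed a net benefit close to that of junior physicians at low thresholds. Compared with junior physicians, ChatGPT-4V was significantly less accurate in identifying echo pattern (P=0.012) and posterior features (P=0.018), while the difference in calcification recognition was not statistically significant (P=1.000). Compared with senior physicians, ChatGPT-4V was significantly less accurate for all ultrasound features: shape, margin, echo pattern, posterior features, and calcification (P<0.05). ChatGPT-4V misclassified 24 cases (16.0%); malignant lesions misread as benign mostly had smooth margins, and benign lesions misread as malignant mostly had irregular shapes.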
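Conclusion
ChatGPT-4V performs close to the level of junior physicians and is suited to assisting screening in primary care, but its recognition of complex features needs improvement; further optimization may raise its clinical value.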
Objective
To evaluate the diagnostic performance and feature recognition accuracy of ChatGPT-4V in classifying benign and malignant breast lesions on ultrasound, compared with junior and senior physicians, and to explore its feasibility as an auxiliary diagnostic tool.
Methods
A retrospective study included patients with breast lesions from The First Hospital of Qinhuangdao between January 2024 and June 2025. With pathological examination results as the gold standard, ChatGPT-4V, two junior physicians (3-5 years of experience), and two senior physicians (>10 years of experience) independently interpreted ultrasound images in a blinded manner. Sensitivity, specificity, accuracy, and area under the receiver operating characteristic curve (AUC) were recorded. The accuracy of identifying ultrasound features, including shape, margin, echo pattern, posterior acoustic features, and calcifications, was evaluated against the criteria of the Breast Imaging Reporting and Data System (BI-RADS). The McNemar test was used to compare diagnostic accuracy among different interpreters, the DeLong method was employed to compare AUC values, and decision curve analysis (DCA) was performed to assess the net benefit across varying threshold probabilities.
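The paired accuracy comparison can be illustrated with an exact McNemar test, which uses only the cases where two interpreters disagree on correctness. This is a hedged sketch: the abstract does not say whether the exact or the chi-square form was used, and the function names and example inputs are hypothetical.

```python
from math import comb


def discordant_counts(correct_a, correct_b):
    """Count cases where exactly one of two readers was correct.

    correct_a and correct_b are per-case booleans for the same lesions,
    in the same order. Returns (b, c): b = only reader A correct,
    c = only reader B correct.
    """
    pairs = list(zip(correct_a, correct_b))
    b = sum(1 for a, o in pairs if a and not o)
    c = sum(1 for a, o in pairs if not a and o)
    return b, c


def mcnemar_exact_p(b, c):
    """Two-sided exact McNemar p-value from the discordant counts.

    Under the null hypothesis each discordant case favours either reader
    with probability 0.5, so the smaller count is Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0  # no disagreement, so no evidence of a difference
    tail = sum(comb(n, k) for k in range(min(b, c) + 1)) / 2 ** n
    return min(1.0, 2 * tail)
```

For example, `mcnemar_exact_p(*discordant_counts(model_correct, reader_correct))` would compare a model with one reader, where both argument lists are assumed to be aligned per lesion.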
Results
ChatGPT-4V's diagnostic performance was comparable to that of junior physicians (accuracy, P>0.05) but inferior to that of senior physicians (P<0.05). DCA showed similar net benefit to junior physicians at low thresholds. Compared with junior physicians, ChatGPT-4V had significantly lower accuracy in identifying echo pattern (P=0.012) and posterior features (P=0.018), with no statistically significant difference in calcification recognition (P=1.000). Compared with senior physicians, ChatGPT-4V showed significantly lower accuracy in recognizing all ultrasound features, including shape, margin, echo pattern, posterior features, and calcification (P<0.05). ChatGPT-4V misdiagnosed 24 cases (16.0%), with malignant-to-benign errors often linked to smooth margins and benign-to-malignant errors to irregular shapes.
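To make the net-benefit comparison concrete, the sketch below applies the standard decision curve formula, net benefit = TP/n - (FP/n) x pt/(1 - pt), to a reader who gives a binary benign/malignant call. The counts in the example are invented for illustration and are not the study's data; the function names are likewise assumptions.

```python
def net_benefit(tp, fp, n, threshold):
    """Net benefit of a binary call at threshold probability pt.

    tp and fp are the true- and false-positive counts among n lesions.
    False positives are weighted by the odds pt / (1 - pt).
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    return tp / n - (fp / n) * threshold / (1 - threshold)


def net_benefit_treat_all(n_malignant, n, threshold):
    """Reference strategy: call every lesion malignant."""
    return net_benefit(n_malignant, n - n_malignant, n, threshold)


# Hypothetical counts: 100 lesions, 40 malignant,
# a reader with 30 true positives and 10 false positives.
for pt in (0.1, 0.3, 0.5):
    reader = net_benefit(30, 10, 100, pt)
    treat_all = net_benefit_treat_all(40, 100, pt)
    print(f"pt={pt:.1f}  reader={reader:.3f}  treat-all={treat_all:.3f}")
```

Plotting these values over a range of thresholds for each interpreter yields the decision curves being compared; at low thresholds the penalty on false positives is small, which is where curves for different readers tend to lie close together.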
Conclusion
ChatGPT-4V demonstrates diagnostic performance close to that of junior physicians, making it a potential auxiliary tool for breast ultrasound screening in primary care. However, its limitations in complex feature recognition require improvement through targeted optimization to enhance clinical utility.