To explore the inter-observer consistency in the Breast Imaging Reporting and Data System (BI-RADS) classification of breast density and the reliability of the breast density classification recorded in the original reports.
Methods:
This retrospective study was conducted on 774 women who underwent mammography screening in Nanfang Hospital from January to May 2018. The chi-square test was used to analyze differences in mammographic density among screened women in different age groups. The Kappa test was used to analyze the level of consistency between observers and between each observer and the gold standard.
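The agreement statistic named above can be sketched as follows. This is a minimal illustration that assumes the unweighted Cohen's kappa for a pair of raters; the abstract does not state which kappa variant or weighting was used. The function name `cohen_kappa` and the toy ratings are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two raters over the same cases.

    Assumption: each rater assigns one categorical label per case.
    """
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(ratings_a)
    # Observed agreement: fraction of cases where both raters agree.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Chance agreement: product of the raters' marginal proportions per category.
    count_a = Counter(ratings_a)
    count_b = Counter(ratings_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        # Both raters used a single identical category for every case.
        return 1.0
    return (observed - expected) / (1.0 - expected)


# Hypothetical toy ratings using BI-RADS density categories a-d (not study data).
rater_1 = ["a", "b", "c", "c", "d", "c"]
rater_2 = ["a", "b", "c", "d", "d", "c"]
kappa_example = cohen_kappa(rater_1, rater_2)
```

Kappa corrects raw agreement for the agreement expected by chance, which is why a rater can have a high accuracy rate yet only a moderate kappa when one category dominates.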
Results:
Using the majority classification among the radiologists' evaluations as the gold standard, of the 774 cases, 13 were fatty, 112 had scattered areas of fibroglandular density, 526 were heterogeneously dense, and 123 were extremely dense. There was a statistically significant difference in mammographic density between women aged <60 years and those aged ≥60 years. The accuracy rates of the junior (R1), intermediate (R2), and senior (R3) radiologists and of the original report classification were 81.14% (628/774), 87.86% (680/774), 90.96% (704/774), and 67.70% (524/774), respectively. The agreement between R1 and the gold standard was moderate (Kappa=0.602); the agreement of R2 and R3 with the gold standard was good (Kappa=0.766 and 0.817, respectively); and the consistency between the original report and the gold standard was moderate (Kappa=0.430). The overall agreement between the observers was moderate (Kappa=0.671), and the pairwise consistency ranged from fair to moderate (Kappa=0.396-0.604, P<0.001).
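As a quick arithmetic check, the reported accuracy rates can be reproduced from the counts given in the Results. The sketch below uses only figures stated in the abstract; the variable and function names are hypothetical.

```python
# Counts of classifications matching the gold standard, as reported in the Results.
reported_correct = {"R1": 628, "R2": 680, "R3": 704, "original report": 524}

# Gold-standard distribution across the four BI-RADS density categories.
gold_standard_counts = {
    "fatty": 13,
    "scattered": 112,
    "heterogeneously dense": 526,
    "extremely dense": 123,
}

total_cases = sum(gold_standard_counts.values())


def accuracy_percent(correct, total):
    """Accuracy as a percentage, rounded to two decimals."""
    return round(100.0 * correct / total, 2)


accuracy = {
    reader: accuracy_percent(correct, total_cases)
    for reader, correct in reported_correct.items()
}

for reader, value in accuracy.items():
    print(f"{reader}: {value:.2f}%")
```

The four category counts sum to the 774 screened cases, and each percentage is the matching count divided by that total.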
Conclusion:
The inter-observer consistency in the BI-RADS classification of mammographic density is moderate, and the reliability of the breast density classification in the original reports is limited.