CheXbert: Combining automatic labelers and expert annotations for accurate radiology report labeling using BERT

Akshay Smit*, Saahil Jain*, Pranav Rajpurkar*, Anuj Pareek, Andrew Y. Ng, Matthew P. Lungren

*Corresponding author of this work
115 Citations (Scopus)

Abstract

The extraction of labels from radiology text reports enables large-scale training of medical imaging models. Existing approaches to report labeling typically rely either on sophisticated feature engineering based on medical domain knowledge or manual annotations by experts. In this work, we introduce a BERT-based approach to medical image report labeling that exploits both the scale of available rule-based systems and the quality of expert annotations. We demonstrate superior performance of a biomedically pretrained BERT model first trained on annotations of a rule-based labeler and then fine-tuned on a small set of expert annotations augmented with automated backtranslation. We find that our final model, CheXbert, is able to outperform the previous best rule-based labeler with statistical significance, setting a new SOTA for report labeling on one of the largest datasets of chest x-rays.
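The two-stage recipe described in the abstract (training on rule-based labeler output at scale, then fine-tuning on a small expert-annotated set augmented with backtranslation) can be sketched roughly as follows. This is a minimal illustration, not the authors' released code; the encoder checkpoint, the 14-condition/4-class label layout, and all training details are assumptions made for the sketch.

import torch
from torch import nn
from transformers import AutoModel

MODEL_NAME = "dmis-lab/biobert-v1.1"   # assumed biomedically pretrained BERT variant
NUM_CONDITIONS = 14                    # assumed CheXpert-style observation set
NUM_CLASSES = 4                        # e.g. blank / positive / negative / uncertain

class ReportLabeler(nn.Module):
    """BERT encoder with one linear classification head per radiological observation."""
    def __init__(self):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(MODEL_NAME)
        hidden = self.encoder.config.hidden_size
        self.heads = nn.ModuleList(
            [nn.Linear(hidden, NUM_CLASSES) for _ in range(NUM_CONDITIONS)]
        )

    def forward(self, input_ids, attention_mask):
        # Use the [CLS] token representation as a summary vector for the report.
        cls = self.encoder(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state[:, 0]
        return [head(cls) for head in self.heads]   # one logit vector per condition

def train(model, loader, epochs, lr=2e-5):
    """Generic training loop reused for both stages."""
    opt = torch.optim.AdamW(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for input_ids, attention_mask, labels in loader:  # labels: (batch, NUM_CONDITIONS)
            opt.zero_grad()
            logits = model(input_ids, attention_mask)
            loss = sum(loss_fn(logits[i], labels[:, i]) for i in range(NUM_CONDITIONS))
            loss.backward()
            opt.step()

# Stage 1: train on reports labeled automatically by a rule-based labeler (scale).
# Stage 2: fine-tune on expert annotations augmented with backtranslation (quality).
# model = ReportLabeler()
# train(model, rule_based_loader, epochs=...)            # hypothetical DataLoaders
# train(model, expert_backtranslated_loader, epochs=...)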

Original language: English
Journal: EMNLP 2020 - 2020 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference
Pages (from-to): 1500-1519
Number of pages: 20
Status: Published - 2020
Published externally: Yes
Event: 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020 - Virtual, Online
Duration: 16 Nov 2020 - 20 Nov 2020

Conference

Conference: 2020 Conference on Empirical Methods in Natural Language Processing, EMNLP 2020
City: Virtual, Online
Period: 16/11/2020 - 20/11/2020
Sponsor: Amazon Science, Apple, Baidu, Bloomberg Engineering, et al., Google Research
