TY - JOUR
T1 - Should artificial intelligence have lower acceptable error rates than humans?
AU - Lenskjold, Anders
AU - Nybing, Janus Uhd
AU - Trampedach, Charlotte
AU - Galsgaard, Astrid
AU - Brejnebøl, Mathias Willadsen
AU - Raaschou, Henriette
AU - Rose, Martin Høyer
AU - Boesen, Mikael
N1 - © 2023 The Authors. Published by the British Institute of Radiology.
PY - 2023
Y1 - 2023
N2 - The first patient was misclassified in the diagnostic conclusion according to a local clinical expert opinion in a new clinical implementation of a knee osteoarthritis artificial intelligence (AI) algorithm at Bispebjerg-Frederiksberg University Hospital, Copenhagen, Denmark. In preparation for the evaluation of the AI algorithm, the implementation team collaborated with internal and external partners to plan workflows, and the algorithm was externally validated. After the misclassification, the team was left wondering: what is an acceptable error rate for a low-risk AI diagnostic algorithm? A survey among employees at the Department of Radiology showed significantly lower acceptable error rates for AI (6.8 %) than humans (11.3 %). A general mistrust of AI could cause the discrepancy in acceptable errors. AI may have the disadvantage of limited social capital and likeability compared to human co-workers, and therefore, less potential for forgiveness. Future AI development and implementation require further investigation of the fear of AI's unknown errors to enhance the trustworthiness of perceiving AI as a co-worker. Benchmark tools, transparency, and explainability are also needed to evaluate AI algorithms in clinical implementations to ensure acceptable performance.
AB - The first patient was misclassified in the diagnostic conclusion according to a local clinical expert opinion in a new clinical implementation of a knee osteoarthritis artificial intelligence (AI) algorithm at Bispebjerg-Frederiksberg University Hospital, Copenhagen, Denmark. In preparation for the evaluation of the AI algorithm, the implementation team collaborated with internal and external partners to plan workflows, and the algorithm was externally validated. After the misclassification, the team was left wondering: what is an acceptable error rate for a low-risk AI diagnostic algorithm? A survey among employees at the Department of Radiology showed significantly lower acceptable error rates for AI (6.8 %) than humans (11.3 %). A general mistrust of AI could cause the discrepancy in acceptable errors. AI may have the disadvantage of limited social capital and likeability compared to human co-workers, and therefore, less potential for forgiveness. Future AI development and implementation require further investigation of the fear of AI's unknown errors to enhance the trustworthiness of perceiving AI as a co-worker. Benchmark tools, transparency, and explainability are also needed to evaluate AI algorithms in clinical implementations to ensure acceptable performance.
U2 - 10.1259/bjro.20220053
DO - 10.1259/bjro.20220053
M3 - Journal article
C2 - 37389001
SN - 2513-9878
VL - 5
SP - 20220053
JO - BJR Open
JF - BJR Open
IS - 1
ER -