A chat about actinic keratosis: Examining capabilities and user experience of ChatGPT as a digital health technology in dermato‐oncology

Abstract

Background: The potential applications of artificial intelligence (AI) in
dermatology are evolving rapidly. Chatbots are an emerging trend in
healthcare that rely on large language models (LLMs) to generate answers
to prompts from users. However, the factuality and user experience (UX) of
such chatbots remain to be evaluated in the context of dermato‐oncology.
Objectives: To examine the potential of Chat Generative Pretrained
Transformer (ChatGPT) as a reliable source of information in the context of
actinic keratosis (AK) and to evaluate clinicians' attitudes and UX with regard
to the chatbot.
Methods: A set of 38 clinical questions was compiled and entered as natural
language queries in separate, individual conversation threads in ChatGPT
(OpenAI, default GPT-3.5). Questions pertained to patient education, diagnosis,
and treatment. ChatGPT's responses were presented to a panel of 7
dermatologists, who rated the factual accuracy, currency of information, and
completeness of each response. Attitudes towards ChatGPT were explored
qualitatively and quantitatively using a validated user experience questionnaire
(UEQ).
Results: ChatGPT answered 12 questions (31.6%) with an accurate, current,
and complete response. ChatGPT performed best for questions on patient
education, including pathogenesis of AK and potential risk factors, but
struggled with diagnosis and treatment. Major deficits were seen in grading
AK and in providing up‐to‐date treatment guidance, and the chatbot asserted
incorrect information with unwarranted confidence. Further, responses were
considered verbose, with an average word count of 198 (SD 55), and overly
alarming regarding the risk of malignant transformation. Based on UEQ responses, the expert
panel considered ChatGPT an attractive and efficient tool, scoring highest for
speed of information retrieval, but deemed the chatbot inaccurate and verbose,
scoring lowest for clarity.
Conclusions: While dermatologists rated ChatGPT highly for UX, the underlying
LLMs that enable such chatbots require further development to
guarantee the accuracy and concision required in a clinical setting.
Original language: Danish
Journal: JEADV Clinical Practice
Pages (from-to): 1-8
ISSN: 2768-6566
Status: Published - 2024
