- Impact of artificial intelligence in managing musculoskeletal pathologies in physiatry: a qualitative observational study evaluating the potential use of ChatGPT versus Copilot for patient information and clinical advice on low back pain
-
Christophe Ah-Yan, Ève Boissonnault, Mathieu Boudier-Revéret, Christopher Mares
-
Received October 8, 2024 Accepted October 25, 2024 Published online November 29, 2024
-
DOI: https://doi.org/10.12701/jyms.2024.01151
[Epub ahead of print]
-
-
Abstract
PDFSupplementary Material
- Background
The self-management of low back pain (LBP) through patient information interventions offers significant benefits in terms of cost, reduced work absenteeism, and overall healthcare utilization. Using a large language model (LLM), such as ChatGPT (OpenAI) or Copilot (Microsoft), could potentially enhance these outcomes further. Thus, it is important to evaluate the LLMs ChatGPT and Copilot in providing medical advice for LBP and assessing the impact of clinical context on the quality of responses.
Methods This was a qualitative comparative observational study. It was conducted within the Department of Physical Medicine and Rehabilitation, University of Montreal in Montreal, QC, Canada. ChatGPT and Copilot were used to answer 27 common questions related to LBP, with and without a specific clinical context. The responses were evaluated by physiatrists for validity, safety, and usefulness using a 4-point Likert scale (4, most favorable).
Results Both ChatGPT and Copilot demonstrated good performance across all measures. Validity scores were 3.33 for ChatGPT and 3.18 for Copilot, safety scores were 3.19 for ChatGPT and 3.13 for Copilot, and usefulness scores were 3.60 for ChatGPT and 3.57 for Copilot. The inclusion of clinical context did not significantly change the results.
Conclusion LLMs, such as ChatGPT and Copilot, can provide reliable medical advice on LBP, irrespective of the detailed clinical context, supporting their potential to aid in patient self-management.
|