Background Large language models (LLMs) are increasingly used in medical and dental education to enhance clinical reasoning, patient communication, and academic learning. This study evaluates the effectiveness of four advanced LLMs, namely ChatGPT-4 (OpenAI), Claude 3.5 Sonnet (Anthropic), Microsoft Copilot, and Grok 3 (xAI), in conveying fluoride-related dental knowledge.
Methods A cross-sectional comparative study was conducted using a mixed-methods approach. Each LLM answered 50 multiple-choice questions (MCQs) and 10 open-ended questions on fluoride chemistry, clinical applications, and safety concerns. Two blinded experts rated the open-ended responses on accuracy, depth, clarity, and evidence. Interrater reliability was assessed using Cohen’s kappa and Spearman’s correlation, and statistical analyses were performed using analysis of variance, Kruskal-Wallis, and post-hoc tests.
Results All models showed high MCQ accuracy (88%–94%). Claude 3.5 Sonnet achieved the highest scores in open-ended responses, especially for clarity (p=0.009). Minor differences in accuracy, depth, and evidence were not statistically significant. Overall, all LLMs performed strongly, with high interrater agreement supporting result reliability.
Conclusion Advanced LLMs show strong potential as supportive tools in dental education and patient communication on fluoride use. Claude 3.5 Sonnet demonstrated superior linguistic clarity, enhancing its educational value. Continued evaluation and clinical oversight are crucial for their safe and effective integration into dentistry.
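The interrater reliability step named in the Methods can be illustrated with a minimal sketch of unweighted Cohen's kappa. The abstract does not say whether a weighted or unweighted variant was used, so the unweighted form here is an assumption, and the function name and sample scores are hypothetical, not the study's data.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters scoring the same items.

    Illustrative only: the study's exact variant and data are not given.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same non-empty set of items")

    n = len(rater_a)

    # Observed agreement: share of items where both raters gave the same score.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: product of each rater's marginal proportions,
    # summed over every category.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)

    if expected == 1.0:
        # Both raters used a single identical category; agreement is trivial.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical 1-5 clarity scores from two experts for ten responses.
expert_1 = [5, 4, 4, 3, 5, 4, 2, 5, 3, 4]
expert_2 = [5, 4, 3, 3, 5, 4, 2, 4, 3, 4]
print(round(cohens_kappa(expert_1, expert_2), 3))
```

Kappa corrects raw agreement for the agreement expected by chance, so identical ratings yield 1 and chance-level agreement yields 0.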
Background Workplace-based assessments, such as the mini-clinical evaluation exercise (mini-CEX), are increasingly used to evaluate clinical competence in authentic healthcare settings. This study aimed to map and evaluate the global research landscape of mini-CEX in nursing and dental education through bibliometric analysis.
Methods A literature search was conducted in the Web of Science Core Collection on July 1, 2025, using the terms “mini-CEX,” “mini clinical evaluation exercise,” “nursing,” “nurse,” “dental,” and “dentistry.” Eligible articles were studies published in English that involved learners or educators in nursing or dental education. Data such as publication metrics, authorship, affiliations, keyword co-occurrence, journal impact, and Sustainable Development Goal (SDG) alignment were extracted and analyzed.
Results Thirty-seven articles were included. They received 229 citations, with an h-index of nine and an average of 6.19 citations per article. Most were indexed in the Science Citation Index Expanded or the Social Sciences Citation Index (67.6%), and 42.9% were published in Quartile 1 journals. The majority aligned with SDG 04 (Quality Education). Nursing-focused studies outnumbered dental studies. Authorship networks were fragmented, with limited cross-institutional collaboration. BMC Medical Education was the leading journal, and 2022 saw the highest number of publications. From 2020 onwards, both publication and citation counts increased significantly (p<0.01). Iran and China contributed the most articles. Keyword analysis revealed five clusters: “skills,” “mini-CEX,” “clinical competence,” “competence,” and “impact.”
Conclusion Research on mini-CEX in nursing and dental education is expanding, yet enhanced interprofessional collaboration is needed to maximize its global scholarly and practical impact.
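The citation metrics reported in the Results (h-index and average citations per article) can be sketched as follows. The per-article counts below are hypothetical, since the abstract gives only the totals; the final line simply checks the reported average from those totals (229 citations across 37 articles).

```python
def h_index(citations):
    """Largest h such that h articles each have at least h citations."""
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def mean_citations(total_citations, article_count):
    """Average citations per article, rounded to two decimals."""
    return round(total_citations / article_count, 2)


# Hypothetical per-article citation counts, for illustration only.
sample_counts = [10, 8, 5, 4, 3]
print(h_index(sample_counts))

# Totals taken from the abstract: 229 citations across 37 articles.
print(mean_citations(229, 37))
```

For the sample list, four articles have at least four citations each but not five with at least five, which is how the h-index is read off the ranked counts.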